Tuesday, September 30, 2008

First Look at Comcast's Neutral Throttling Scheme

Last week Comcast submitted to the FCC its detailed proposal for congestion throttling of broadband subscribers, one that does not discriminate by content. Reports about it made me curious, so I read through the document to see the details, or at least the details they disclosed. While this is a US regulatory process it could very well impact Canadian ISPs, especially in light of the throttling of P2P and other applications by Rogers, Bell Canada (both Sympatico and the independent ISPs on Bell's network) and others.

The questions I asked myself were: how does it work, will it meet its objectives, and who is impacted, and how? My hypothesis on a first skim of the document was that it would not do quite what they claim. After reading it more carefully, I am now more sure of it. It isn't that it's a bad plan, just that it is deficient in certain important ways. Let me take you through my thinking on this, and my conclusions.

For purposes of argument I will construct a simple cable broadband scenario with lots of easy-to-manipulate numbers. It doesn't fully match reality or Comcast's figures, though it should prove sufficient for a quick analysis. You'll see that it's easy enough to alter the numbers for a real case study while using the same methodology. Here are the attributes of the scenario I am constructing:
  • 200 Mbps downstream capacity serving 275 subscribers (275 is Comcast's average number). Comcast's scheme targets both upstream and downstream data; although I am only describing downstream, upstream is similar. Also note that cable access is shared, unlike DSL. However the scenario does correspond well to the shared link between the DSLAM and the IP core, so in that way my analysis could also apply to DSL. In other words, Bell Canada could, if the CRTC nails them, implement this scheme (with suitable interfaces).
  • Each subscriber's downstream capacity is 8 Mbps, as determined by modem, policies and DOCSIS level.
  • From the above, bandwidth is oversubscribed at 11:1 (275 × 8 Mbps against 200 Mbps). This is perfectly reasonable and good engineering practice, as we'll see.
  • At 'busy hour' there are 200 active subscribers (~75%), with an average long-term (15-minute window) utilization of 0.85 Mbps (my invention, but probably not an unreasonable choice). The pseudo-normal curve of subscriber counts will show a peak near 0.85 Mbps, with a declining number of subscribers at lower (toward 0 bps) and higher (toward 8 Mbps) speeds.
Here we have a long-term utilization of 85% (200 × 0.85 = 170 Mbps out of 200 Mbps), which Comcast correctly notes is within the range where congestion can appear often enough to impact subscribers. When congestion does occur it will impact all subscribers regardless of their individual traffic profiles, since every packet has an equal probability of being delayed (buffered) or lost (protocol or application timer expiry, or buffer overflow); all subscribers start with the same 'best effort' service level.
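For anyone who wants to play along at home, here is the scenario reduced to a few lines of Python. Every number is my illustrative figure from the list above, not Comcast's data.

    # Back-of-envelope check of the scenario above; all figures are
    # the illustrative ones from this post, not Comcast's actual data.
    LINK_CAPACITY_MBPS = 200.0     # shared downstream capacity
    SUBSCRIBERS_TOTAL = 275        # provisioned on the shared segment
    SUBSCRIBER_CAP_MBPS = 8.0      # per-modem downstream cap
    ACTIVE_AT_BUSY_HOUR = 200      # ~75% of subscribers active
    AVG_RATE_MBPS = 0.85           # 15-minute average per active subscriber

    oversub = SUBSCRIBERS_TOTAL * SUBSCRIBER_CAP_MBPS / LINK_CAPACITY_MBPS
    load = ACTIVE_AT_BUSY_HOUR * AVG_RATE_MBPS
    print(f"Oversubscription: {oversub:.0f}:1")             # 11:1
    print(f"Busy-hour load: {load:.0f} Mbps")               # 170 Mbps
    print(f"Utilization: {load / LINK_CAPACITY_MBPS:.0%}")  # 85%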

As an aside, at windows shorter than 15 minutes the probability of congestion (>80% utilization) increases. This phenomenon is a consequence of the statistical nature of communications by multiple independent transmitters. Therefore at very short windows, say 1 second, congestion is common but is usually invisible except as some (variable) latency as packets are buffered until the link is free. As congestion increases, so does packet loss (and retries).
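To make the windowing effect concrete, here is a toy simulation. I assume a crude on/off model where each active subscriber either bursts at the full 8 Mbps or sits idle in any given second, with the on-probability chosen to preserve the 0.85 Mbps long-term average; real traffic is burstier still, so if anything this understates the effect.

    # Toy on/off traffic model: the same 85% 15-minute average hides
    # many individual seconds where offered load exceeds 80% of capacity.
    import random

    random.seed(1)
    SECONDS = 15 * 60      # one 15-minute window
    p_on = 0.85 / 8.0      # per-second on-probability, preserves the average

    samples = []
    for _ in range(SECONDS):
        active = sum(random.random() < p_on for _ in range(200))
        samples.append(active * 8.0)   # offered load this second, in Mbps
                                       # (unconstrained by the 200 Mbps link)
    window_avg = sum(samples) / SECONDS / 200.0
    congested = sum(s > 0.8 * 200.0 for s in samples) / SECONDS
    print(f"15-minute average utilization: {window_avg:.0%}")  # ~85%
    print(f"Seconds above the 80% mark: {congested:.0%}")      # half or more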

Now let's say there is one subscriber downloading at the maximum rate (8 Mbps) in the measured 15-minute window. Comcast will throttle that one subscriber to, perhaps, 1 Mbps. That will drop the load from 170 Mbps to 163 Mbps, or 81.5% of capacity. Congestion will be reduced, but not to below their objective of under 80%. If that subscriber is only using 6 Mbps, which is still above Comcast's threshold of 70% of the maximum 8 Mbps (5.6 Mbps), the effect is even smaller: only 5 Mbps is reclaimed, leaving the load at 165 Mbps, or 82.5%. However this should not be disregarded, since congestion becomes severe because of a 'knee in the curve' that is typical of packet networks. That is, if the knee is near 80% of capacity, the impact of a 1% increase from 84% to 85% can be much greater than that of a similar 1% increase from 80% to 81%. Therefore in the instance of one heavy usage subscriber being throttled, the improvement for the other 199 subscribers can be significant.
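The arithmetic, using the same toy numbers:

    # One heavy user throttled from 8 Mbps down to 1 Mbps.
    load = 170.0                                # busy-hour load, Mbps
    print(f"{(load - (8.0 - 1.0)) / 200:.1%}")  # reclaims 7 Mbps -> 81.5%

    # If the heavy user is at 6 Mbps (still above the 5.6 Mbps trigger):
    print(f"{(load - (6.0 - 1.0)) / 200:.1%}")  # reclaims 5 Mbps -> 82.5%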

To get below 80% they will need to throttle 2 such heavy usage subscribers, which is 0.73% of the total of 275 subscribers (both active and inactive during the period). By Comcast's own figures, in the Colorado throttling trial about 0.36% of the more than 6,000 subscribers were throttled at one time or another. However those throttling events were not necessarily concurrent, and Comcast did not say how many subscribers were throttled concurrently at peak. This implies that my choice of 2 concurrently throttled high usage subscribers would be uncommon.
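A short loop makes the same point, again with my toy numbers:

    # How many 8 Mbps heavy users must be throttled to 1 Mbps before
    # utilization falls below the 80% objective?
    load, capacity = 170.0, 200.0
    throttled = 0
    while load / capacity >= 0.80:
        load -= 8.0 - 1.0                # each throttle reclaims 7 Mbps
        throttled += 1
    print(throttled, f"-> {load / capacity:.0%}")       # 2 -> 78%
    print(f"{throttled / 275:.2%} of 275 subscribers")  # 0.73%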

Let's now look at the case where there is no subscriber using more than 70% of their 8 Mbps capacity in the 15-minute window. We are still at 85% of the 200 Mbps capacity, yet there is no one to throttle according to their policy. In other words, we have 200 fairly average subscribers, or we may have one or more heavy usage subscribers who game the policy by deliberately throttling their own traffic to below 5.6 Mbps (70% of 8 Mbps) during busy hours. Comcast can then only respond by changing the policy or by reducing the number of subscribers per shared access segment to fewer than 275, or I suppose they could ignore the problem and allow the congestion to affect all 200 active subscribers.
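To illustrate the blind spot, here is one arbitrary traffic split (my invention, not measured data) where every subscriber stays under the trigger yet the link is still congested:

    # 20 'gamers' pinned just under the 5.6 Mbps trigger plus 180 light
    # users: the link is congested, but the policy finds no one to throttle.
    rates = [5.5] * 20 + [1.0 / 3.0] * 180   # Mbps, all below 5.6
    load = sum(rates)
    print(f"Load: {load:.0f} Mbps ({load / 200:.0%})")  # 170 Mbps (85%)
    print("Throttling candidates:", sum(r > 5.6 for r in rates))  # 0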

I don't have the numbers at hand, though it is known that the average user's traffic rate and traffic total are increasing, due in part to richer web content and the popularity of video sites (Hulu, YouTube, etc.). So even if a shared link is now at 70% utilization at busy hour, Comcast need only wait a few months until it reaches 80%. They will still need to upgrade the network, at some cost, regardless of high usage subscribers such as BitTorrent users.
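How few months? That depends entirely on the growth rate, which I can only guess at; the 5% per month below is an assumption for illustration, not a measured figure.

    import math

    monthly_growth = 0.05   # assumed compound traffic growth; pure guesswork
    months = math.log(0.80 / 0.70) / math.log(1 + monthly_growth)
    print(f"70% -> 80% utilization in {months:.1f} months")  # ~2.7 months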

My conclusion is that their throttling scheme, while application agnostic, does not solve any problem. All they are doing is delaying, for a short period, the need for network upgrades. Otherwise their only alternative is to use the technology to put in place increasingly severe policies, which over time would reduce the bandwidth delivered to subscribers to much lower levels, even (horrors!) to below that of their DSL competitors.

Update: 22 out of 6,016 users is 0.36%, not 3% as I originally stated. Now corrected.
Update (Oct 1): Yikes! Another error. 8 Mbps - 1 Mbps = 7 Mbps, not 5! Fixed throughout subsequent calculations. Conclusions remain valid.
