Wednesday, October 1, 2008

Measurement Windows and Comcast Throttling

In my preceding post on Comcast's neutral throttling I left something out that may be of interest, though perhaps not a major factor in the impacts of their new system. This has to do with how their measurement windows are implemented.

Comcast says they are using 15 minute intervals (windows) to measure traffic usage of individual subscribers and of shared connections. They do this to determine whether there is congestion of the shared connection (80% threshold) and to identify heavy usage subscribers (70% threshold).
This is an indirect measurement of congestion since the percentage of capacity says nothing about whether traffic is delayed or lost. This is like counting cars and trucks at the onramps of a freeway and, without looking at the freeway or the offramps, deciding how well traffic is flowing to its various destinations. It can work quite well most of the time if you have done ample analysis beforehand in a controlled environment so that there is a reliable correlation between a threshold level (percentage of capacity) and congestion performance.
What I found interesting is that how the windows are implemented is not mentioned in Comcast's FCC submission. There are two basic methods to consider. However, first let's calculate the thresholds for a 15 minute window.
  • If the shared connection is 100 Mbps the 80% threshold is 0.80 x 15 x 60 x 100 = 72,000 Mb, which is 72 Gb or 9 GB.
  • For an 8 Mbps subscriber the 70% threshold is 0.70 x 15 x 60 x 8 = 5,040 Mb, which is 5.04 Gb or 630 MB.
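The same arithmetic can be written as a small Python sketch. The function name and its parameters are my own, purely for illustration, and it assumes decimal units (1 Mb = 1,000,000 bits, 1 MB = 8,000,000 bits).

```python
def window_threshold_bits(link_mbps, threshold_pct, window_min=15):
    """Bits carried in one window when usage sits exactly at the threshold.

    Integer arithmetic is used so the results are exact.
    """
    seconds = window_min * 60
    return threshold_pct * seconds * link_mbps * 1_000_000 // 100


def to_megabytes(bits):
    """Convert bits to decimal megabytes (8 bits per byte)."""
    return bits // 8 // 1_000_000


shared_bits = window_threshold_bits(100, 80)    # 100 Mbps shared link, 80%
subscriber_bits = window_threshold_bits(8, 70)  # 8 Mbps subscriber, 70%

print(shared_bits / 1e9, "Gb =", to_megabytes(shared_bits), "MB")
print(subscriber_bits / 1e9, "Gb =", to_megabytes(subscriber_bits), "MB")
```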
Now let's take a look at the two windowing methods:
  1. Non-overlapping windows: To illustrate with an example, let's start the clock at 10:00. At 10:15 we look at the traffic for the preceding 15 minutes and make the determinations of congestion and high usage subscribers. We do the same thing at 10:30, 10:45, and so on.
  2. Sliding window: Again, we'll start the clock at 10:00, noting that there is not a 15 minute interval until 10:15, at which time we look at the traffic and make our determinations as in the first case. But this time we won't wait until 10:30. Instead we will use a sliding window with a 1 minute interval. This means we do the next measurements and determinations at 10:16 using the data from the 15 minute window 10:01 to 10:16. We do it again at 10:17 for the window 10:02 to 10:17, and so on every minute.
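To make the difference between the two methods concrete, here is a minimal sketch that lists the windows each one evaluates. Times are minutes after the clock starts at 10:00, and the helper names are hypothetical. Non-overlapping windows are simply the case where the step equals the window length.

```python
def evaluation_windows(horizon_min, window_min=15, step_min=15):
    """Return (start, end) pairs, in minutes after the clock starts,
    for every window evaluated up to and including horizon_min."""
    return [(end - window_min, end)
            for end in range(window_min, horizon_min + 1, step_min)]


def clock(minute, start_hour=10):
    """Format minutes after the start hour as H:MM."""
    return f"{start_hour + minute // 60}:{minute % 60:02d}"


non_overlapping = evaluation_windows(45)           # step = window length
sliding = evaluation_windows(18, step_min=1)       # 1 minute step

for name, windows in (("non-overlapping", non_overlapping),
                      ("sliding", sliding)):
    spans = ", ".join(f"{clock(s)} to {clock(e)}" for s, e in windows)
    print(f"{name}: {spans}")
```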
The outcomes are different and so are useful to understand. The sliding window provides a superior solution (for Comcast) since there would be earlier discovery of congestion and heavy usage. For example, suppose a subscriber streams at the maximum 8 Mbps beginning at 10:07. With non-overlapping windows the 10:00 to 10:15 window contains only 8 minutes of that traffic, short of the 10.5 minutes at full rate needed to reach the threshold (5.04 Gb = 8 Mbps x 10.5 min x 60 s/min), so the subscriber would not be caught until 10:30. That gives a 23 minute period during which other subscribers may suffer degraded performance. With a sliding window the heavy usage would be caught at 10:18, after only 11 minutes, the first one-minute check after the 10.5 minute mark. With non-overlapping windows the time to catch that subscriber can range from 11 to 25 minutes depending on when in the window the streaming starts, so it is almost always worse than with sliding windows. There is a similar impact on congestion detection although the results are less predictable due to the variable load of other subscribers.
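The detection times can be checked with a short simulation. This is only a sketch under simplifying assumptions of my own: the subscriber sends nothing before the start minute, then sends continuously at the full 8 Mbps, and the first evaluation happens at 10:15 as in the example. The function name is hypothetical.

```python
def first_flag_minute(start_min, step_min, window_min=15, threshold_pct=70):
    """Minute (after 10:00) of the first evaluation that flags a subscriber
    who begins sending at full rate at start_min and never stops."""
    end = window_min                      # first full window ends here
    while True:
        # Minutes of full-rate traffic that fall inside [end - window, end].
        busy = max(0, end - max(start_min, end - window_min))
        # busy / window >= threshold_pct / 100, kept in integers.
        if busy * 100 >= threshold_pct * window_min:
            return end
        end += step_min


start = 7  # streaming begins at 10:07
fixed_delay = first_flag_minute(start, step_min=15) - start
sliding_delay = first_flag_minute(start, step_min=1) - start
print("non-overlapping:", fixed_delay, "minutes")
print("sliding:", sliding_delay, "minutes")

# Delay for every whole-minute start time within the first window.
fixed_delays = [first_flag_minute(s, 15) - s for s in range(15)]
print("non-overlapping range:", min(fixed_delays), "to", max(fixed_delays))
```

Note that in this sketch a start before 10:04 takes longer than 11 minutes even with the sliding window, but only because no 15 minute window exists until 10:15.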

A disadvantage of sliding windows is that they can be more expensive, because the measurements and determinations for congestion and heavy usage are done more frequently. This can be managed by picking a longer interval than the 1 minute choice in my example, at the cost of somewhat later detection. There is also the possibility of technical constraints on the interval (frequency) due to the equipment being used.
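The trade-off can be roughed out as follows. This is a back-of-the-envelope sketch, not a model of Comcast's equipment: it assumes a subscriber sending continuously at full rate, so the threshold is crossed after 70% of the window, and the next evaluation then arrives less than one step later. The function names are my own.

```python
def evaluations_per_hour(step_min):
    """How many times per hour each subscriber or link is evaluated."""
    return 60 // step_min


def detection_delay_bounds(step_min, window_min=15, threshold_pct=70):
    """(earliest, upper bound) detection delay in minutes for a subscriber
    sending continuously at full rate. The upper bound is not reached."""
    needed = threshold_pct * window_min / 100   # minutes at full rate
    return needed, needed + step_min


for step in (1, 5, 15):
    low, high = detection_delay_bounds(step)
    print(f"step {step:2d} min: {evaluations_per_hour(step):2d} checks/hour, "
          f"delay from {low} to just under {high} minutes")
```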

As can be seen, Comcast's choice of windowing method affects all subscribers, though whether each difference is a pro or a con depends on your viewpoint.
