[isf-wifidog] Traffic shaping, part 2: The issues
Benoit Grégoire
bock at step.polymtl.ca
Fri Nov 2 22:11:56 EDT 2007
There are different types of problems that can be solved through traffic
shaping, and they must not be confused with one another, because the
solution for each is different. However, solving them all at the same time
is difficult in practice, because the solutions interact.
But first, here are practical definitions of the two main performance
characteristics of a network:
Bandwidth: The amount of data the network can transfer in a given unit of
time. This is what determines how long it takes to transfer a large file.
Latency: The amount of time required for a small packet of data to make a
round trip. This is what determines how long it takes before something
happens when you click something on the Internet (for example, when you click
on your friend's head in Counterstrike...). Note that except for large file
transfers, high latency is the primary cause of perceived slowdown.
Of course, networks have other performance characteristics (jitter, packet
loss, etc.) but we'll ignore them for now.
Issue 1: Buffers in DSL/Cable/WiMax modems are way too large for good
multi-user performance.
Problem caused: Maxing out upload will lead to higher latency and lower
download bandwidth, and vice versa. This is of course much more likely to
happen if you have many users.
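To make the latency cost of a large buffer concrete, here is a minimal
sketch of the arithmetic. The buffer size and uplink speed are hypothetical
figures chosen for illustration, not measurements of any real modem.

```python
def queue_delay_ms(buffer_bytes: int, link_bps: float) -> float:
    """Time (ms) a new packet waits behind a full buffer draining at link_bps."""
    return buffer_bytes * 8 / link_bps * 1000


# Assumed figures: a 256 KiB modem buffer in front of a 512 kbps uplink.
delay = queue_delay_ms(256 * 1024, 512_000)
print(f"{delay:.0f} ms of added latency")  # about 4096 ms
```

Once the upload is maxed out and the buffer fills, every packet (including
the TCP ACKs that downloads depend on) waits that long before leaving.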
Historical reason for the problem's existence: The ISPs designed their
infrastructure to optimize performance for single users. While large buffers
can cause latency problems for even a single user (say you are downloading
large files and playing a first person shooter game at the same time), they
also yield the highest peak bandwidth. In other words, the ISP will fare
better when the connection is benchmarked on bandwidth.
Typical solution(s): Limiting total incoming and outgoing bandwidth to
slightly less (90-95%) than the available bandwidth, and prioritizing TCP
ACK packets.
Solution in the wifidog context: The same solution is applicable.
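As a rough sketch of that rule, the shaper ceiling can be derived from a
measured link speed. The function name and the 1800/384 kbps figures are
assumptions for illustration; wifidog does not currently expose anything
like this.

```python
def shaping_ceiling_kbps(measured_kbps: int, percent: int = 90) -> int:
    """Rate to shape to so that the queue builds in the gateway, not the modem."""
    if not 0 < percent < 100:
        raise ValueError("percent must be strictly between 0 and 100")
    return measured_kbps * percent // 100


# Hypothetical measured speeds for one hotspot.
down_ceiling = shaping_ceiling_kbps(1800, 90)  # 1620 kbps
up_ceiling = shaping_ceiling_kbps(384, 95)     # 364 kbps
```

Integer percentages are used only to keep the arithmetic exact; the point
is that the ceiling must come from a measurement, not from the plan's
advertised speed.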
Challenges:
1- It is impractical to know in advance how much bandwidth is available. Not
only can every hotspot have different Internet plans, but even if one knows
the maximum bandwidth of the ISP's plan, that bandwidth may not be available.
For instance, if you subscribe to a 5Mbps DSL plan but have long/poor
phone lines, your modem might connect at only 1.8Mbps. So that bandwidth has
to be measured by wifidog.
2- It is not always practical for the wifidog gateway to be the very first
thing plugged into the modem. If you are plugged into a LAN that is served by
a DSL modem, and someone uses bandwidth somewhere else on the LAN, your
measurement above will no longer be valid, and your shaping may actually
make things worse if you are not very careful.
Issue 2: Users not getting their fair share of bandwidth.
Problem caused: When someone downloads large files using modern P2P
applications, in addition to triggering Issue 1, he will cause additional
problems. Say 3 users are on the network, two downloading a mail
attachment and one downloading a file over P2P. Typically, they will NOT
get 1/3 of the available bandwidth each.
Historical reason for the problem's existence: In the beginning, shapers
did "fair queuing" between IP/port pairs. So in theory, if the P2P client in
the scenario above opened 100 connections, that user would get ~98% of the
bandwidth. Modern shapers and default kernel configs aren't nearly that bad,
but a frequent problem is that the oldest open connection will keep hogging
most of the bandwidth.
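The ~98% figure follows directly from splitting bandwidth evenly per
connection instead of per user. A minimal sketch, with hypothetical user
names and one connection assumed for each mail user:

```python
from fractions import Fraction


def per_flow_shares(flows_per_user: dict[str, int]) -> dict[str, Fraction]:
    """Share each user gets when every connection receives an equal slice."""
    total = sum(flows_per_user.values())
    return {user: Fraction(n, total) for user, n in flows_per_user.items()}


shares = per_flow_shares({"p2p": 100, "mail_a": 1, "mail_b": 1})
print(float(shares["p2p"]))     # 100/102, roughly 0.98
print(float(shares["mail_a"]))  # 1/102, roughly 0.01
```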
Typical solution(s): Various misguided "solutions" are frequently applied to
this problem:
-Static user classes: Say you have a 3Mbps uplink, and you only allow users
to use up to 300Kbps. This will fix the problem (assuming you have no more
than 10 concurrent users), at the cost of making the connection suck for
everyone, 100% of the time.
-Connection aging: Make each connection fast at first, and then slow it
down. Yes, people really do this. The rationale is (presumably) that web
browsing will be fast (lots of small, short connections) and the user will
only look at the download speed of a large file at the beginning. Besides
being stupid, this will actually give a BIG advantage to our P2P user
compared to our two mail users: the P2P client will snub peers that look
like they are slowing down, and will open brand new, fast connections to new
peers.
-Trying to block/throttle P2P users. Above and beyond the fact that this
is ethically questionable for various reasons, it is an arms race that
network administrators are unlikely to win. See examples in my last email.
It's also extremely shortsighted since it's very expensive both
computationally and in manpower, and has to be revisited over and over.
Solution in the wifidog context: ESFQ (Enhanced Stochastic Fair Queuing),
which would allow each wireless client to get no more than its share of
bandwidth, but allow the entire amount of bandwidth to be used.
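The allocation goal described here (nobody exceeds a fair share while
others are waiting, yet no bandwidth is left idle) is what is usually
called max-min fairness. The sketch below only illustrates that target
outcome with made-up demands; it does not model how ESFQ itself schedules
packets in the kernel.

```python
def max_min_fair(demands: dict[str, float], capacity: float) -> dict[str, float]:
    """Split capacity so small demands are met in full and the rest is shared."""
    alloc: dict[str, float] = {}
    remaining = capacity
    pending = sorted(demands.items(), key=lambda kv: kv[1])
    while pending:
        share = remaining / len(pending)
        user, demand = pending[0]
        if demand <= share:
            # This user wants less than an equal share: satisfy it fully
            # and redistribute the leftover among the others.
            alloc[user] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Everyone left wants at least an equal share: split evenly.
            for user, _ in pending:
                alloc[user] = share
            break
    return alloc


# Hypothetical demands in kbps on a 3000 kbps link.
alloc = max_min_fair({"mail_a": 200, "mail_b": 300, "p2p": 10_000}, 3000)
print(alloc)  # {'mail_a': 200, 'mail_b': 300, 'p2p': 2500.0}
```

Compare this with static 300Kbps classes: the mail users are just as well
served, but the P2P user gets the 2500 kbps that would otherwise go unused.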
Challenges:
1- Issue 1 must be solved for ESFQ to have any chance of working at all.
2- It is not possible to instantly throttle downstream bandwidth. The lag
time in doing so can cause problems for ESFQ.
Issue 3: Applications that would need priority, such as VOIP
Problem caused: Depending on the user, some applications should have
priority over others for network latency. VOIP > web browsing. SSH > FTP.
World of Warcraft > Bittorrent > Everything else.
Historical reason for the problem's existence: Ever since IPv4 was
standardised, there has been a QoS flag that you are supposed to set when an
application needs priority. Sadly, human nature being what it is, if
users notice that an application goes faster when they set the flag, they
will start to set it for every application (not caring that it may slow down
their neighbor). Once the neighbor notices, he will very rationally set the
flag as well to defend himself, leaving the whole network ... right back
where it started. So in practice no one obeys the QoS flag anyway.
Typical solution(s): Trying to discriminate the type of service from the port
range (or more sophisticated packet analysis), make a value judgement about
which service is more important than another, and give priority according
to that grid. The problems are once again the questionable ethics of it, and
the simple fact that not only may what is a priority for one user not be for
another, but if ISPs started to give priority to everything VOIP, you
can be sure that P2P apps would offer an option to transfer data over VOIP
protocols.
Solution in the wifidog context: Actually obey the QOS flag, but only up to a
part (say 10%) of the slice the user would get in the solution to Issue 2.
In other words, pass ACKs first, QOS traffic second (up to 10% of the user's
slice), and pass the rest after.
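The three-tier ordering can be sketched as a per-packet classifier. The
Packet fields, the per-interval byte accounting and the function name are
all assumptions made for illustration; a real gateway would do this with
kernel queuing disciplines rather than in Python.

```python
from dataclasses import dataclass

ACK, QOS, BULK = 0, 1, 2  # lower value is served first


@dataclass
class Packet:
    size: int              # bytes
    is_ack: bool = False   # bare TCP ACK
    qos_flag: bool = False # sender asked for priority


def classify(pkt: Packet, qos_bytes_used: int, slice_bytes: int,
             qos_percent: int = 10) -> int:
    """Pick a priority band for one packet of one user.

    slice_bytes is the user's fair share for the current accounting
    interval (from Issue 2); qos_bytes_used is how much of it has already
    gone out with priority.
    """
    if pkt.is_ack:
        return ACK
    qos_budget = slice_bytes * qos_percent // 100
    if pkt.qos_flag and qos_bytes_used + pkt.size <= qos_budget:
        return QOS
    return BULK  # unflagged traffic, or flagged traffic over the budget
```

Because the priority budget is carved out of the user's own slice, setting
the flag on everything only hurts that user's remaining flagged traffic,
not the neighbors.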
Challenges:
1- Issue 1 and 2 must be solved.
2- If your VOIP handset doesn't set the QOS flag, it doesn't help you (although
you'll probably still get decent performance from the solution to Issue 2).
Issue 4: Chronic bandwidth abuse over a long period/reducing bandwidth cost.
Bandwidth takes real money and real resources to create. Whether you run a
free or for pay network, you may decide that there is a maximum amount of
network resources that your users should be allowed to use per
day/month/hotspot.
Typical solution(s):
-Bandwidth capping: Not allowing the user to use more than 1Mbps
-Data transfer capping: Not allowing the user to transfer more than 40GB per
month.
Solutions in the wifidog context:
-Dynamic abuse control. Allow defining criteria for maximum data transfer per
unit of time, at a hotspot, over the entire network, etc.
-Opening hours support. For free networks designed to be used in public
places, closing access when the public place is closed can drastically reduce
monthly bandwidth consumption.
-Supporting the "password of the day" model. Allows drastically reducing the
bandwidth leeched by a hotspot's neighbors by forcing them to physically
visit the place to get access.
Note that technically, none of the above require any kind of traffic shaping.
Traffic shaping is involved if you want to implement policies that are a
little less drastic than cutting off access once a user/machine reaches the
threshold. Let's say that your quota is 5GB per hotspot per month. Instead
of cutting off the user once he reaches 5GB, at 4GB you would progressively
slow down the user in such a way that he would never reach 5GB, or you would
slow down the user to a low maximum bandwidth (say 128Kbps), or make him pass
after every other user.
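One possible reading of "progressively slow down" is a linear ramp between
the 4GB and 5GB marks, sketched below. The function name, the linear shape
and the 1000 kbps full rate are assumptions; other curves would serve
equally well.

```python
GB = 10**9


def allowed_kbps(used_bytes: int, full_kbps: float,
                 soft_bytes: int = 4 * GB, hard_bytes: int = 5 * GB) -> float:
    """Per-user rate limit that shrinks as usage approaches the quota."""
    if used_bytes <= soft_bytes:
        return full_kbps
    if used_bytes >= hard_bytes:
        return 0.0
    # Fraction of the throttling zone still unused, between 0 and 1.
    remaining = (hard_bytes - used_bytes) / (hard_bytes - soft_bytes)
    return full_kbps * remaining


print(allowed_kbps(3 * GB, 1000))            # 1000 (below the soft limit)
print(allowed_kbps(4 * GB + GB // 2, 1000))  # 500.0 (halfway through the zone)
```

Since the allowed rate is proportional to the quota that is left, usage
creeps toward 5GB ever more slowly instead of hitting a hard wall, provided
the limit is recomputed often enough.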
Challenges:
1-The wifidog protocol needs to be redesigned to allow the auth server to
specify the maximum bandwidth for each user individually, and update that
number periodically.
2-The user could open a new account/spoof the MAC address. There are ways to
make that very inconvenient, but that's another arms race (and another
feature list altogether).
Issue 5: Selling the user monthly access with fixed bandwidth (say 512Kbps).
Typical solution(s): Client side user classes
Solutions in the wifidog context: server side token architecture and per user
bandwidth specification. Basically, if we have per user bandwidth
specifications in the gateway and protocol, selling fixed slices is just a
degenerate case of the general problem.
Ok, that's long enough for one night.