Friday, January 12, 2007

Rate Control Using Throttles

In my article "Traffic Management", I talked about the need for rate control in real-time systems. I mentioned that there were efficient algorithms for implementing sophisticated rate control mechanisms. Such algorithms can be used for traffic policing on the consumer side and for traffic shaping on the producer side. They are frequently simple enough that they are implemented in off-the-shelf chip sets for networking technologies like Frame Relay and Asynchronous Transfer Mode (ATM).

They are also simple enough to be implemented efficiently in firmware or software and incorporated into applications ranging from embedded systems to enterprise Service Oriented Architectures. Several commercial telecommunications products contain proprietary rate control software that I developed. Digital Aggregates provides clean-room implementations of several rate control algorithms based on public standards. These are available in C++ in the open source (LPGL) Desperado library, and in Java in the open source (Apache 2.0) Buckaroo library.

Historically - in this case meaning in my personal history - rate control has been cast in terms of the emission of ATM cells. Much of the rate control software that I have developed has been based on the virtual scheduler described in the ATM Forum Traffic Management 4.0 specification. This is because I spent several years at Bell Laboratories immersed in the development of ATM devices, including an ATM network interface card that supported Voice and Telephony Over ATM (VTOA) with traffic shaping, and which to this day is still arguably the most complex single board ever produced by that organization, and an ATM network switch with a sophisticated Connection Admission Control (CAC) algorithm. Alas, ATM became the Sony Betamax of network technologies, along with all that implies. However, replace the word cell with buffer, packet, datagram, log message, event, or foo, and the same code works equally well in non-ATM applications. (You can replace cell with byte too, but calling this software for every single byte you want to transmit is not cost effective. Stay tuned for future developments.)

The basic abstraction for the Desperado and Buckaroo rate control classes is the Throttle. Throttles are objects that maintain internal state about the event stream they are rate controlling. Before emitting an event (whatever the words emit and event mean in context), the application calls the Throttle to ask if the event is admissible. If the Throttle says that it is, the application emits the event and commits the Throttle state. If it is not, the application does not emit the event and instead rolls back the Throttle state. What happens next depends on what kind of Throttle you have.

Throttles may be time-based or event-based. Time-based throttles base admission decisions on temporal factors such as the inter-arrival time between successive events. They are initialized with a specific traffic contract that formally describes the expected (traffic policing) or desired (traffic shaping) emission behavior of the event stream that they are rate controlling. Event-based throttles base admission decisions on something other than time, such as the number of events already admitted. Throttles may be compounded together to form composite Throttles that exhibit behavior that may be quite complex. Think of it as adding simple periodic waveforms together to get speech or music.

When using an event-based Throttle and the event is not admissible, the event is not emitted, and the application may try again later at some indeterminate time. The Geometric Throttle admits events following a geometric progression: it admits the first event, the second event, the fourth event, the eighth event, up to and including the thirtieth event, after which no more events are admitted until the Throttle is reset. This simple Throttle is very useful in embedded systems for controlling error messages to a log file or a console when a persistent failure has occurred. The application only emits the error message if it is admissible, and it resets the Throttle once the error clears, placing the Throttle back in its initial state. Several error messages will be emitted initially for a particular failure, then with ever decreasing frequency, until finally no more error messages appear. This is a good strategy whether the log is being watched in real-time or being written to a file, because it produces the occasional reminder that the error still exists without resulting in a fire hose of error messages.

When using a time-based Throttle and the event is not admissible, the event is not emitted, and the Throttle gives the application a delay interval, in units like microseconds, that indicates how far in the future the event will become admissible. The application can delay emitting the event until this interval has passed, at which point it asks again if the event is admissible to update the Throttle state, commits the state, and emits the event. Time-based throttles are ideal for shaping outgoing data streams to make a message producer a good citizen that plays well with others, and for policing incoming data streams to prevent a misbehaving message producer from fire hosing a message consumer.

Below is a Java code snippet (with some distracting details removed) that shows a trivial application initializing and then using a Throttle to control the rate at which events are emitted by doing a simple Thread sleep. Without going into too much explanation, this gives you the flavor of how a Throttle might be used.

int pcr = 10; // Peak Cell Rate
int cdvt = 250; // Cell Delay Variation Tolerance
int scr = 1; // Sustained Cell Rate
int mbs = 10; // Maximum Burst Size
 

Throttle th =
new CellRateThrottle(pcr, cdvt, scr, mbs);
 

while (true) {
long delay = th.admissible();
while (delay > 0) {
th.rollback();
Thread.sleep(delay);
delay = th.admissible();
}
th.commit();
emit(event());
}

The Generic Cell Rate Algorithm (GCRA) is a time-based Throttle which is based on the virtual scheduler. Its traffic contract consists of just two parameters, the increment, which is the expected or desired inter-arrival time between events, and the limit, which is the total variation from the inter-arrival time that the event stream may exhibit before being in violation of the traffic contract. Telecommunications folk will recognize this slack in the traffic contract as an accommodation for jitter, small variations in the inter-arrival time of events. It is impossible for firmware or even hardware to nail the emission of events exactly on the dot time-wise. And when aggregating multiple event streams into a single event stream, one event has to be emitted before another, regardless of their respective traffic contracts. Jitter happens. As someone once said, "perfection is the enemy of good enough", and rate control mechanisms have to take this into account. If you find it impossible to emit event streams in conformance with their individual traffic contracts, either your traffic contracts are unrealistic, or you have a network engineering problem that you need to solve.

However, there may be legitimate reasons to commit the Throttle state and emit an event even if the event is not admissible. It may well be better to emit an event at a time that violates its traffic contract and let it take its chances being dropped due to congestion by a downstream node in the network, than to guarantee that it is dropped due to, for example, a full buffer in the consumer, before it is ever emitted into the network. (It also may well be that this is not the case; your mileage may vary.) If the application commits the Throttle for an event which is not admissible, the Throttle becomes alarmed. The Throttle remains in the alarmed state until the event stream once again conforms to its traffic contract.

The Cell Rate Throttle is a composite Throttle made up of two GCRA Throttles. It implements a complex traffic contract based on parameters that may sound a little more approachable than increment and limit. The Peak Cell Rate (PCR) describes the maximum expected or desired rate of event emission in cells per second. The Cell Delay Variation Tolerance (CDVT) is the limit for jitter in the PCR, typically in microseconds. The Sustained Cell Rate (SCR) is the long term average rate of event emission in cells per second. The Maximum Burst Size (MBS) is the number of cells that maybe emitted at PCR (which is presumably greater than SCR) without violating the traffic contract. The math behind the GCRA and the virtual scheduler calculates the equivalent burst size for any emission rate between PCR and SCR. The Cell Rate Throttle allows an event stream to burst for short periods without violating its long term average rate. This may sound like rocket science, but once again, the implementation of this mechanism is actually very simple and efficient.

Each time-based Throttle implementation has a time granularity known as a tick. A tick may be a millisecond, microsecond, nanosecond, or even some arbitrary unit that may not even be representable as a whole number and is perhaps based on the underlying CPU clock. The inverse of this granularity, that is, the number of ticks per second, is known as the Throttle's frequency. A time-based Throttle may be queried as to its frequency, and each time-based Throttle provides a method that returns the current time in units appropriate to its frequency. The current time may be equivalent to wall clock time relative to some epoch, or merely a monotonically increasing counter with a consistent interval of change suitable for measuring arbitrary time durations. This flexibility in regards to how time is measured makes Throttles easily adaptable to many operating system platforms and hardware targets.

A few more minor details: Throttles may be reset at any time to return them to their initial state, causing them to forget any past sins of the event stream they are rate controlling. The first event for any Throttle is guaranteed to be admissible. Throttles are guaranteed not to alarm as long as their event streams conform to their traffic contracts.

This has been a brief introduction to Throttles. I believe that rate control is crucial for any real-time system that is to be robust, scalable, and supportable, and most especially one in which event streams must be aggregated. Using Throttles and other rate control mechanisms for traffic shaping and traffic policing make producers and consumers well behaved, and allows you to make useful assumptions about traffic, load, and quality of service when engineering a network, configuring a large distributed application, or troubleshooting a customer system. In the future I will discuss traffic contracts in more detail, and the theoretical underpinnings of the Generic Cell Rate Algorithm and the virtual scheduler.

Sources

Buckaroo, Digital Aggregates Corp., 2007

Desperado, Digital Aggregates Corp., 2007

Chip Overclock, "Traffic Management", December 2006

N. Giroux, et al., Traffic Management 4.0, tm-0056.000, ATM Forum, April 1996, pp. 62-67

J. L. Sloan, "ATM Traffic Management", Digital Aggregates Corp., August 2005

No comments: