Monday, January 22, 2007

Separate But Equal

The separation of control messaging and data streams is an old pattern that keeps popping up, to be rediscovered and declared a new invention in new problem domains.

The first time I ran across it was when I was writing device drivers for real-time applications on PDP-11s (and later, LSI-11s) starting back in 1976. I had written drivers for things like A/D converters, parallel and serial I/O boards, real-time clocks, and punch-card readers. Those devices all followed a common pattern: you had a memory-mapped status register and a memory-mapped data register, and it was the responsibility of the driver to handle the movement of data to and from the device and memory.

It wasn't until I started writing drivers for direct-access storage devices like hard disks and floppy drives that the young Mr. Overclock first encountered direct memory access (DMA). DMA was great: you just programmed in some parameters in some status registers, and the hardware itself took care of moving data to and from memory. The only interrupt you received was when the entire I/O transfer was complete.

I know y'all laughed when I said I wrote device drivers for cards readers. But see, I've written a lot of device drivers for a lot of platforms since then, and I can tell you this: those card readers are still the most challenging devices I've ever worked with, before or since. You set the read bit in the status register, the blowers and motors on the reader spun up, and a punch card flew through the read head. You got a fast stream of interrupts, one for every one of eighty columns, and maybe or maybe not another interrupt for end-of-card, depending on the timing. You either serviced every one of those interrupts, or you lost the data. There was no backing up and trying again. If you're goal was, as mine was, to run the card reader at its full rated speed, this took some clever programming, not to mention some careful planning of interrupt priorities for the other devices in the system. (It wasn't until years later that I wrote drivers to handle analog trunks and stations, in Java of all things, for a telephony product, that I encountered devices with these kind of no-compromise hard-real-time constraints.)

So I was in a good position to really appreciate the whole idea of DMA, where I just handled the control and something smarter, or at least a whole lot faster, then my code took care of the data.

Through my wanderings I somehow ended up at the National Center for Atmospheric Research, a national lab in Boulder Colorado. NCAR had a floor full of supercomputers (mostly Cray Research machines at the time). It was there I learned that when you have a lot of supercomputers, you have a lot of data. Petabytes of data. Having a lot of MIPS and FLOPS lying around isn't useful unless you have a lot of MBPS available too. This is where the NCAR Mass Storage System (MSS) comes in. The NCAR MSS is hierarchical storage system that manages a huge disk farm, several robotic tape silos, and (at the time) a vast backing store of offline tape cartridges.

The NCAR MSS Group is credited with inventing the idea of the separation of control and data as it applies to mass storage. Control messaging was carried among a distributed farm of servers and adapters and their client supercomputers over LAN technologies that most of us would recognize, typically Fiber-Distributed Data Interface (FDDI) or some form of Ethernet. The ancillary equipment handled the I/O setup, including any necessary staging of data and mounting of tapes, and it wasn't until the gigabytes were ready to fly that the supercomputers were re-engaged. Then those gigabytes streamed over specialized data paths using (again, at the time) more exotic and specialized I/O-centric technologies like High Performance Parallel Interface (HIPPI) and Fibre Channel (FC).

There were a lot of good reasons for this separation. The LAN technologies at the time tended to be broadcast media. The I/O technologies were switched, point-to-point channels. The LAN technologies had small physical layer data segments which were not optimal for streaming a lot of data really quickly, but was good for moving small payloads between widely distributed endpoints. The I/O technologies had high initial connection setup latencies, but were very high performance once the data stream began. The LAN technologies were cheap. The I/O technologies were expensive. The separation of control and data was a physical-layer artifact. (I've been told that technologies like Gigabit Ethernet have muddied the waters in this respect since I worked at NCAR.)

The NCAR MSS was using the available technologies in the way in which they were intended, and reaping the performance and economic benefits. It wasn't a great leap to notice the common pattern between this architecture at NCAR, where it was writ large, and my earlier work developing tiny little device drivers in assembly language.

It wasn't until I joined Bell Laboratories and entered the domain of telecommunications that I discovered that the telephony folks had also faced, and solved, this same problem. Digital telephony trunks were bundles of sixty-four kilobit per second bearer channels controlled by a single signaling channel of the same bandwidth. Protocols like Integrated Services Digital Network (ISDN) controlled the setup and release of telephone calls over the signaling channel, called the Delta-channel or more commonly the D-channel, while the eight-bit voice samples of each telephone call were transferred over one of many Bearer- or B-channels.

In ISDN, the D-channel and its associated B-channels are typically carried over the same physical path, such as bundles of copper wires in a single T-1 trunk, or a single Synchronous Optical Network (SONET) optical fiber, although I have also seen cases where they were split off into separate physical paths as well. But generally, the separation of control and data in ISDN is a transport-layer artifact.

There were very good reasons for this as well. Separating control (signaling) and data (bearer) made economic sense since the messaging bandwidth of a single D-channel was adequate to control tens of B-channels. The data streams from the D-channel and its B-channels were routed to very different endpoints, the D-channel messages to a control element that would be a conventional processor, the B-channel voice samples to a digital signal processor (DSP) or a time division multiplexed (TDM) circuit switch or backplane. And that fact that the asynchronous signaling messages were carried out-of-band, instead of in-band along with the synchronous voice traffic, decoupled these two streams so that the voice streams could be controlled more efficiently, and it reduced jitter in the voice streams as well.

So it is no surprise that when Voice over IP (VoIP) came along to replace traditional TDM-based telephony, the IP world adopted the same pattern. The Session Initiation Protocol (SIP) is a commonly used application-layer protocol for setting up and releasing VOIP calls. But stations and trunks that speak SIP pass their application data (for example, voice samples) using some other application-layer protocol specialized for that purpose, typically the Real-Time Transport (RTP) protocol.

In SIP, the separation of control and data is mostly an application-layer artifact. The SIP and RTP streams use the same network protocol, IP, and the same physical layer, such as Ethernet, although the transport-layer TCP or UDP packet streams may be handled quite differently in intermediate routers because of their very different quality of service requirements.

The separation of control and data is a common architectural pattern that seems to be reinvented in every problem domain in which I find myself working. In a recent project implementing business process automation of high-end communications capabilities, the control messages were carried over an enterprise service bus, but the data (which from the application's point of view was both the SIP traffic and the RTP traffic) was carried over the LAN.

What other examples of this architectural pattern can we find if we look hard enough? And where should we be applying it that we are not?

No comments: