Understanding How It Works Really Does Matter
Famed English mathematician and physicist Oliver Heaviside once said to his critics, “Should I refuse a good dinner because I do not understand the process of digestion?” He was defending his use of mathematical concepts that were not clearly defined, but the real lesson life repeatedly teaches us is that we would be better off knowing how something works before we use it. The outlook for our economy, for example, mirrors the anguished tone of Edvard Munch’s painting “The Scream” because Wall Street investors did not fully understand the welter of credit default swaps that failed to protect their debt holdings against default. Thus, a housing crisis begot a financial crisis that begot a credit crisis that begot a recession that has eviscerated the net wealth of people across all incomes.
Misunderstanding the relationships of the parts will eventually lead to the circuitous collapse of any closed system. Outside plant (OSP) is no different. How many times have we engineered, installed, or repaired OSP facilities for a customer without fully understanding what the electrons and photons in the cables are doing with the voice or data they represent? A look inside OSP circuits reveals a 61-year-old mathematical theory of information that can teach us how to understand the efficiency of our communications infrastructure.
The Origin of Information Theory
Information Theory originated from the study of electrical communication in the 1800s. The most famous encoding scheme by Samuel F. B. Morse was devised in 1838. He cleverly assigned the letter E as a single dot. By doing that, he used the shortest possible symbol with the fewest bits to represent the most frequently occurring letter (13% of all appearing letters) in English text and gained tremendous communicating efficiency.
In 1874, two years before the telephone was patented, Thomas Edison went further by developing a quadruplex telegraph system that could send two messages simultaneously in each direction. Using two intensities of electric current and two directions of current, he had four different states of current flow with which to convey a message.
In 1924, mathematician Harry Nyquist published his first famous paper, “Certain Factors Affecting Telegraph Speed,” in which he developed a formula for bit rate based on the number of electrical current values. His formula showed that by going from the “on-off” telegraph of Morse code fame to a four-current value system, we could double the bit rate. This is, of course, what Edison accomplished with his quadruplex telegraph.
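To put numbers on the doubling, here is a minimal sketch of Nyquist's 1924 relation, W = K log m, where m is the number of distinguishable current values. The function name is hypothetical, and the constant K is assumed to be chosen so that the result comes out in bits per signal element.

```python
import math

def relative_speed(current_values: int) -> float:
    """Nyquist's W = K log m, with K assumed so that W is in bits per signal element."""
    if current_values < 2:
        raise ValueError("need at least two distinguishable current values")
    return math.log2(current_values)

on_off = relative_speed(2)      # Morse-style on/off keying
four_level = relative_speed(4)  # two intensities x two directions of current

print(four_level / on_off)      # 2.0
```

Because the relation is logarithmic, each further doubling of the bit rate requires squaring the number of current values, which is why adding levels quickly runs into the limits of what a receiver can tell apart.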
The point is clear: how one encodes a message into electrical signals matters profoundly in the efficiency of the transmission of that message. This is the heart of information theory.
In 1928 Nyquist published a second paper, “Certain Topics in Telegraph Transmission Theory.” An important point from this paper became what is known as Nyquist’s Theorem. He showed that if 2N different electrical current values per second are sent, the sinusoidal components of the signal with frequencies greater than N are redundant and are not needed in the transmission of the message. In other words, as long as the original signal is sampled at a rate that is twice the highest frequency of the original signal, then the encoded message will be able to be deciphered.
Nyquist also showed that the bit rate is proportional to the width of the band of frequencies used, now commonly referred to as the bandwidth of a signal. For example, it is well known that the audible spectrum for the human ear is from about 20 Hz to 20 kHz. However, the human voice band falls in the 20 Hz - 4 kHz range, with 90% of all speech falling into the 300 Hz - 3.4 kHz range. Since the highest frequency of the voice band is about 4,000 Hz, Nyquist showed that sampling the original voice signal at 8,000 times per second would result in an accurate representation of the original signal. This is a good thing. If voice signals required 20 kHz of bandwidth on every POTS circuit, the capital expenditures to build such an OSP network would burgeon beyond proportion.
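The 8,000-samples-per-second limit can be seen in a small experiment. The sketch below samples two test tones (3 kHz and 5 kHz, both chosen only for illustration) at 8,000 times per second. The tone above the 4 kHz limit produces exactly the same samples as the tone below it, so the receiver cannot tell them apart.

```python
import math

SAMPLE_RATE_HZ = 8000  # twice the roughly 4 kHz top of the voice band

def sample_tone(freq_hz: float, count: int = 16) -> list[float]:
    """Return `count` samples of a cosine tone taken at SAMPLE_RATE_HZ."""
    return [math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE_HZ) for n in range(count)]

in_band = sample_tone(3000)   # below the 4 kHz limit
too_high = sample_tone(5000)  # above the 4 kHz limit

# The 5 kHz tone is indistinguishable from the 3 kHz tone once sampled.
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(in_band, too_high)))  # True
```

This is why frequencies above half the sampling rate are filtered out before sampling: they add nothing the samples can represent and would only masquerade as in-band signal.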
The Father of Information Theory
During World War II, communication theory advanced beyond its fundamental base of coded letters and ciphers and evolved into schemes of detecting aircraft headings from noisy radar data. In 1948, Bell Labs mathematician Claude E. Shannon wrote his famous paper “A Mathematical Theory of Communication,” which formed the basis of Information Theory. Shannon’s work ingeniously went beyond interpretation of noisy signals and focused on determining the best signal to send along a known noisy channel in order to optimally convey that message to a receiver. He defined for us what information really is and identified shortcuts in communicating it more effectively. Shannon tells us there is a best way of encoding a signal from a source using the least number of bits.
Unlike previous researchers, Shannon formed a theory with the components of a communication system, and mathematically modeled their interaction so that it was intelligible. Until Shannon’s work, no one was able to predict how much information a receiver could capture from an information source. A significant result from Shannon’s theory is the formula shown in Figure 1 where R is the amount of information collected by receiver y, H is the amount of information sent by a source x and Hy(x) is the information lost in transit between source x and receiver y.
Figure 1. R = H(x) - Hy(x)
If x and y are not correlated with each other, Hy(x) = H(x) and R = 0. This means two objects that are independent of each other can share no information between them. Conversely, when x and y are the same object (i.e., x = y), Hy(x) = 0 and no transmission loss occurs. In general, the higher the correlation between the source and receiver, the more information transmitted.
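The two extreme cases can be checked numerically. The following is a minimal sketch, assuming the source and receiver are described by a small table of joint probabilities; the function names and the two example tables are illustrative, not taken from Shannon's paper.

```python
import math

def entropy(probs):
    """Entropy in bits of a list of probabilities (zero entries are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_received(joint):
    """R = H(x) - Hy(x) for a table joint[x][y] = P(x, y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    h_x = entropy(px)
    # Hy(x) = H(x, y) - H(y): the uncertainty about x that remains once y is known
    h_x_given_y = entropy([p for row in joint for p in row]) - entropy(py)
    return h_x - h_x_given_y

independent = [[0.25, 0.25], [0.25, 0.25]]  # y tells us nothing about x
identical = [[0.5, 0.0], [0.0, 0.5]]        # y always equals x

print(information_received(independent))  # 0.0
print(information_received(identical))    # 1.0
```

Any real channel falls between these extremes: the more often y disagrees with x, the larger Hy(x) becomes and the less information gets through.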
Shannon defined the critical relationships between four basic components of a communication system by using mathematical models to describe the elements’ behavior within the system:
Component 1. The power of the message source.
Component 2. The bandwidth of the channel.
Component 3. The noise of the channel.
Component 4. The receiver’s ability to decode the message.
Four Components Behind Successful Communications Systems
Component 1. The power of the message source.
Transmitting a message with a certain reliability requires the source to have a minimum transmitting power. The power required to send a message depends on the noise in the channel. Ideally, the rate C in bits per second that can be sent over a channel of bandwidth W, with signal power P and noise power N, is expressed by:
C = W log2(1 + P/N)
This formula has been a bulwark for communication theorists for decades. Present-day error-correcting codes, such as turbo codes, now squeeze out nearly all of the throughput the formula allows for a given source power, though no code can exceed it.
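As a rough illustration of the capacity formula, the sketch below evaluates C = W log2(1 + P/N) for a hypothetical voice-grade channel. The 3,100 Hz bandwidth (300 Hz to 3.4 kHz) and the signal-to-noise ratio of 1,000 (30 dB) are assumed values chosen for the example, not measurements.

```python
import math

def shannon_capacity(bandwidth_hz: float, signal_power: float, noise_power: float) -> float:
    """C = W log2(1 + P/N), in bits per second."""
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)

# Assumed example: 3,100 Hz of bandwidth and P/N = 1000 (30 dB)
capacity = shannon_capacity(3100, 1000.0, 1.0)
print(round(capacity))  # a little under 31,000 bits per second

# Doubling the bandwidth doubles the capacity; doubling the power adds far less.
print(round(shannon_capacity(6200, 1000.0, 1.0)))
print(round(shannon_capacity(3100, 2000.0, 1.0)))
```

The comparison at the end shows why bandwidth is the more valuable resource: capacity grows in direct proportion to W but only logarithmically with signal power.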
Component 2. The bandwidth of the channel.
The bit is the basic unit of measure for the transmission of information over a communication channel, and is sometimes referred to as a binary digit. In his 1948 paper, Shannon credits Bell Labs statistician John Tukey for coining the word bit in 1947.
Think of bits as the on/off pulses in the wire or any other waveguide. Bits can be represented by electrical current in a copper cable, by laser diode pulses in a fiber optic cable, or by radio waves in free space.
Analysis of bits centers on how much information, on average, each bit can carry. The number of bits that can be sent over perfect and imperfect communication channels per unit time, loosely called the bandwidth, continues to receive academic focus from communication engineers and theorists today.
Think of bit rate not as a speed but as a flow rate. What's the difference? Use the analogy of a river. The speed of a raft moving along the surface of a river toward a bridge up ahead is like the speed at which a signal travels down the cable. Communication theory focuses instead on the flow rate: how much water passes under the bridge per unit time. That volume is the equivalent of the bits per second a channel carries, which is what we loosely call bandwidth.
Component 3. The noise of the channel.
Shannon's formulas told us computationally how noise affects the signal. Whether through a hiss or static on the radio or pixelation on the HDTV set, noise is an inevitable fact of communication that we must accept. Noise is the communication equivalent of the reaction to every physical action, as embodied by Newton's 3rd Law. Noise is what is responsible for reducing a channel's bit rate, in bits per second, to less than its information capacity, also in bits per second.
To borrow from the earlier river analogy, the noise of the channel is like the rough shapes along a river bed that cause the water to become shallow and slow. Slower water reduces the volume throughput (i.e., the flow rate, or usable bit rate) passing under the bridge.
Shannon describes ways to communicate through noisy channels with the smallest error rate possible. The most obvious solution to fighting noise is with repetition. But repeating the symbol a certain number of times will reduce the amount of information one can send and is also very costly. One could increase the transmitting power or make the receiver less noisy, but there must first be a way to measure the amount of information. And there is. It's called entropy, and it shouldn't be associated with the concept of entropy in physics. The entropy of communication theory is measured in bits, unlike in physics where it represents the uncertainty of which state a physical system lies in. In communication theory, entropy is the average uncertainty of the next symbol to be sent by a source, and it increases as the number of possible messages increases.
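Entropy as described here is straightforward to compute. The sketch below is a minimal illustration, assuming each symbol is drawn independently with a known probability; the three example sources are made up for the purpose.

```python
import math

def entropy_bits(probabilities):
    """Average uncertainty, in bits per symbol, of a source with these symbol probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy_bits([0.5, 0.5]))  # 1.0: a fair coin flip
print(entropy_bits([0.25] * 4))  # 2.0: four equally likely symbols
print(entropy_bits([0.9, 0.1]))  # under half a bit: a predictable source carries less news
```

The first two lines show entropy growing as the number of equally likely messages grows. The third shows that a lopsided source, whose next symbol is easy to guess, carries much less information per symbol.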
What Shannon's work says, surprisingly, is that by properly encoding a message, it can be sent with acceptable loss even over a noisy channel. If we can identify the type of noise in the channel and its magnitude, then we can calculate how many characters can be sent over that channel per second without significant error.
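The cost of fighting noise with plain repetition can be put in numbers. The sketch below assumes each transmitted bit is flipped independently with a fixed probability and that the receiver takes a majority vote over the repeated copies. The 10% flip probability is only an example value.

```python
from math import comb

def repetition_error_rate(flip_prob: float, repeats: int) -> float:
    """Probability that a majority vote over `repeats` copies of a bit decodes wrongly."""
    if repeats % 2 == 0:
        raise ValueError("use an odd number of repeats so the vote cannot tie")
    majority = repeats // 2 + 1
    # The vote fails when more than half of the copies are flipped.
    return sum(
        comb(repeats, k) * flip_prob**k * (1 - flip_prob) ** (repeats - k)
        for k in range(majority, repeats + 1)
    )

for n in (1, 3, 5):
    # columns: copies sent, fraction of the channel carrying new information, residual error rate
    print(n, 1 / n, repetition_error_rate(0.1, n))
```

Sending every bit three times cuts the error rate from 10% to about 2.8%, but two thirds of the channel is then spent on repeats. Better codes reach a low error rate while giving up far less of the channel, which is the surprise in Shannon's result.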
Component 4. The receiver's ability to decode the message.
Shannon transformed the standard definition of information. Similar to compressed telegraph messages of the past where certain words like "a" or "the" are left out to reduce transmission costs, Shannon defined information as symbols that only contain unpredictable news.
Why send a symbol that the recipient already knows or can guess? The predictable and redundant symbols can be left out since they are not really news, and their omission does not reduce the clarity of the message. For example, "just info esentil to undrstndn mst b tranmitd."
When the sender and recipient are uncertain of news, a message is needed. Uncertainty is a key commodity brokered in communication theory. To help reduce the unpredictability in transmitting messages, conditional probabilities are assigned to each letter by digram-probability charts (tables of how often each pair of adjacent letters occurs) that help make fair predictions of what to expect.
For example, a text message that begins with "T" has a 37% chance that the next letter is "h." Such knowledge can help transmission equipment reconstruct bit errors encountered when transmitting messages.
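Conditional probabilities of this kind are estimated by counting adjacent letter pairs in a body of text. The sketch below is a toy illustration: the function name and the short sample sentence are made up, and a sample this small will not reproduce the 37% figure, which comes from statistics over large amounts of English text.

```python
from collections import Counter

def next_letter_probabilities(text: str, first: str) -> dict[str, float]:
    """Estimate P(next letter | first letter) from adjacent letter pairs within words."""
    chars = text.lower()
    # Count the letter that follows each occurrence of `first`, ignoring spaces and punctuation.
    followers = Counter(
        b for a, b in zip(chars, chars[1:]) if a == first and b.isalpha()
    )
    total = sum(followers.values())
    return {letter: count / total for letter, count in followers.items()} if total else {}

sample = "the theory of the telegraph that they tested"
print(next_letter_probabilities(sample, "t"))  # {'h': 0.625, 'e': 0.375}
```

A receiver holding such a table can weigh a doubtful symbol against what is likely to follow the previous one, which is how this kind of statistical knowledge helps correct errors.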
Just like actuarial tables help insurance companies predict what percentage of a large group of males will file a claim in a given year, statistical models from information theory allow us to take shortcuts in communicating more effectively by reducing uncertainty or entropy.
Real-Life Applications Make Theories Come To Life
All electronic communication involves some sort of encoding of messages. Efficient encoding is what enables us to speak at 100 words/minute but transmit at 1,000 words/minute over a POTS line. A fundamental tenet of Shannon's Information Theory is that signals can be quantized into a measurable form.
The human voice, for example, is a continuous sound wave that can assume any pressure within a range of pressures. These pressures are quantized so that each sample assumes one of 256 assigned values (8 bits) before the speech is digitized into bit streams of 0's and 1's. Thanks to the work of Nyquist, we know that sampling at 8,000 times per second with 8 bits per sample will yield a noiseless DS0 channel of 64,000 bits per second (i.e., a perfect POTS line). What Information Theory enables us to do, however, is compute how to succeed in transmitting data when noise is present in the channel.
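The DS0 arithmetic can be sketched in a few lines. The quantizer below uses evenly spaced steps over a pressure range normalized to -1.0 through 1.0, which is a simplifying assumption: telephone networks actually use companded (mu-law or A-law) steps that are finer for quiet sounds. The function name is hypothetical.

```python
BITS_PER_SAMPLE = 8
SAMPLES_PER_SECOND = 8000

def quantize(pressure: float, bits: int = BITS_PER_SAMPLE) -> int:
    """Map a pressure in [-1.0, 1.0] to one of 2**bits integer codes (uniform steps)."""
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, pressure))
    return min(levels - 1, int((clipped + 1.0) / 2.0 * levels))

print(2 ** BITS_PER_SAMPLE)                          # 256 possible values per sample
print(BITS_PER_SAMPLE * SAMPLES_PER_SECOND)          # 64000 bits per second for one DS0
print(quantize(-1.0), quantize(0.0), quantize(1.0))  # 0 128 255
```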
For example, say a channel has noise present that accounts for an error rate of 10% (i.e., 10% of the data bits arrive as a 1 instead of a 0 or vice versa). What if we sample at 16,000 times per second to compensate for the error? 8 bits sampled at 16,000 times per second gives us 128,000 bits per second. Shannon tells us that such a channel could reliably carry only about 68,000 bits per second, due to the noise present. In other words, Information Theory tells us that to achieve the desired accuracy of transmission with acceptable loss (one error every billion bits), we need to devote nearly half of the channel's 128,000 bits per second to error correction.
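The figure for the noisy channel can be reproduced with a short calculation. The sketch assumes the simplest noise model, in which each bit is flipped independently with the stated probability (a binary symmetric channel). Under that assumption the usable fraction of the raw bit rate is 1 minus the entropy of the error probability.

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a yes/no event that occurs with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def reliable_rate(raw_bits_per_second: float, flip_prob: float) -> float:
    """Highest rate of error-free information over a channel that flips bits at random."""
    return raw_bits_per_second * (1 - binary_entropy(flip_prob))

usable = reliable_rate(128_000, 0.10)
print(round(usable))                      # just under 68,000 bits per second
print(round(100 * (1 - usable / 128_000)))  # percent of the raw rate given up to noise
```

A 10% error rate therefore costs roughly 47% of the raw bit rate, not 10%, because the receiver does not know in advance which bits are the wrong ones.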
In most cases, using something without knowing how it works at some fundamental level is a fatuous enterprise replete with risk. Daily experience is rife with examples. Just think of the vulnerability of driving while totally oblivious to what is under the hood when the car breaks down with kids in the backseat.
As with all animals, we communicate because we want to feel connected with our surroundings and culture. Signals are transmitted to be heard or seen, and delivering that signal is what gives OSP its importance in our culture. Inside a multiplexer is the very success story behind our efficiency in OSP. And there is a tremendous need for that efficiency.
There is no reason OSP practitioners should supinely allow the inner workings of DLC carrier systems to pass them by. Planners plan, OSP engineers design, Construction builds, DLC turns up, Repair fixes, and Installation installs. But Information Theory is the lighthouse for these various forces to navigate the choppy telecommunication seas. Like any beacon, it can tell us only where to finish, not how to get there. The pursuit of tomorrow's technological advances in transmission efficiency will be achieved through understanding the path navigated by the giants before us.
Sources: Pierce, J. R. An Introduction to Information Theory. Symbols, Signals and Noise. Dover Publications Inc., 2nd edition, 1980. Shannon, C.E. A Mathematical Theory of Communication, Reprinted with corrections from The Bell System Technical Journal, Vol.27, pp.379-423, October, 1948.
About the Author
Brian Lackovic is an OSP Engineering Area Manager with AT&T. He has more than 9 years of experience in OSP engineering and budget. For more information, visit www.att.com.


Information Theory
The writer does a very good job of explaining the mathematical formulas that are the foundation of data communications today while making the material less complicated and very easy to understand.
So can I assume that bridge-tap on a copper cable is like a “tributary” using the river analogy and introduces noise to the signal?
Excellent question! Bridged tap most certainly adds noise to the channel. The electrical pulses will "echo" along the bridged segment of the loop and possibly severely affect a circuit. That is why ILECs focus on keeping such detailed loop length records of their OSP facilities. With the location of the bridged tap identified (i.e., the source of the noise) in the channel, we can then use Shannon's work and calculate what type of circuit can qualify on that pair before Engineering needs to issue a work order to Construction to alter the loop for higher bandwidth services.
noise??
Is it really noise that is killing the bandwidth on a pair with bridge tap? Or is it electrical capacitance (pF) or resonant length? There are a ton of factors that affect bandwidth on a pair.
Bell Labs
Your article was both interesting and informative. In the age of text messaging, I found component 4 (The receiver's ability to decode the message) particularly fascinating. I was unaware that conditional probabilities were assigned to each letter by digram-probability charts. Reducing error rate events makes perfect sense.
I also want to thank you for reminding us of the incredible contributions made by Shannon and the rest of the Bell Labs folks.