Digital encoding, as opposed to analog, was born of the necessity of encrypting two-way radio in WWII. A story of heroic engineering at Bell Labs.
Efforts to create a secure voice system had existed since the 1920s. Some progress had been made, but as with the A-3, no device was able to offer complete security. In the early 1940's however, the situation began to improve.
See SIGSALY
The cypher used in SIGSALY was a one-time pad. Shannon ended up writing a proof that the one-time pad is unbreakable. Part of the reason Shannon's initial publications on cryptography and information theory were so complete is because he'd been involved in analyzing the most ground-breaking secret communications system of the day -- a system that would remain a tightly guarded military secret for another thirty years.
Claude Shannon was just a youngster in Ralph Miller's eyes, and given too much credit as an individual for work that was actually created by a very high performing team.
See Bell Labs Heros
A 1983 review of this remarkable system for the Institute of Electrical and Electronic Engineers (IEEE)5 attributes no fewer than eight "firsts" to SIGSALY. They are as follows:
1. The first realization of enciphered telephony
2. The first quantized speech transmission
3. The first transmission of speech by Pulse Code Modulation (PCM)
4. The first use of companded PCM
5. The first examples of multilevel Frequency Shift Keying (FSK)
6. The first useful realization of speech bandwidth compression
7. The first use of FSK - FDM (Frequency Shift Keying-Frequency Division Multiplex) as a viable transmission method over a fading medium
8. The first use of a multilevel "eye pattern" to adjust the sampling intervals (a new, and important, instrumentation technique)
One of the vocoder channels defines the pitch of person's voice, and this channel was very sensitive, requiring about thirty levels. The use of many more channels would negate the possibility of using the frequency-shift telegraph system, which had been successfully used over the short wave radio system, the only means available over the Atlantic. The suggestion then arose of subtracting out the nearest lower level, multiplying the error signal by six, and treating it as another channel; thus, we obtained thirty six levels with two channels. It was called a Vernier channel, a rather obvious comparison to other such arrangements for scaling. It then became apparent that this was a very general process, an n-ary coding arrangement. If we had used a factor of two instead of six, it would have become a binary system. This then was the way PCM was independently invented at Bell Labs.
See Project X
.
I remember reading a thin book in the math library back in school studying electrical engineering. It was titled Information Theory and was probably by Shannon.
The book defined the bit in communication terms, not storage or logic, but analogous for sure. I remember formulas in terms of log base two that seemed pretty obvious. The one part I though original was considering the degree that a decision space was reduced by decoding a message.
A high point in my one visit to London was visiting Churchill's war time bunker and seeing the room full of equipment that decoded his messages.
From the footnotes of the Digital Revolution cited above: "Even today, the legalistic and somewhat arcane language used in patent applications is difficult to comprehend. Yet it is interesting to read the patent applications of the late 1930s and early 1940s on this topic. The BTL inventors were establishing a completely new way to transmit voice signals, and they were also inventing the nomenclature at the same time."
Ralph's own language from the Project X cited above is just as difficult to understand. Not sure if I can make it any more clear. Ten vocoder channels each used a base 6 quantization of segment of the analog voice wave. In private conversations, Ralph told of the generals rejecting the initial prototype because the voice pitch, then encoded with only 6 levels, was unacceptable. There was some panic about how to deal with the need for "about 30 levels". Adding 50% more channels just for pitch was prohibitively expensive and prohibitively large in size. Ralph himself had the insight to multiply two channels and thereby encode 36 levels which provided recognizable vocal pitch. They ended up with a n-ary encoding, not just binary.