
the information

4 july 2015

I greatly admired James Gleick's Chaos when it appeared in 1987 – everybody did; its ideas were pervasive – but then I lost track of Gleick's work. While perusing my public library's limited selection of nonfiction e-books recently, I stumbled across The Information, a long, labyrinthine book, sometimes florid, sometimes crystalline. It took Gleick a long time to write, and it took me a long time to read. It was worth the effort to behold such a synthetic imagination unifying so vast a collection of vital ideas.

The Information sprawls around quite a bit, repeats itself, and can go from rigorous to mystical midsentence – like many of the mathematical theorists who are its heroes. Among the more prominent of the many heroes are Charles Babbage, the prescient 19th-century deviser of calculating machines, and Claude Shannon, who imagined the digital universe that lets you read this review.

Gleick ranges back and forth across the century and more between Babbage and Shannon, but as the laws of thermodynamics would seem to want it, his narrative drive is ultimately in the forward direction. We move from a world where we didn't know much, and had rather approximate ways of representing what we knew, to the 21st century where it seems that our every waking thought is digitally archived. (Ooh, that's a nice sentence; I'd better go post it on Facebook.)

Babbage, who thought in terms of machines, was mostly a visionary; Shannon, largely a theorist, lived to see massive material networks built on his theories. In between and around their work, we get to see others who contributed to the informational imaginary: Ada Lovelace, Samuel Morse and other inventors of telegraphy, Norbert Wiener, Watson & Crick, Turing, Gödel, von Neumann, Einstein (inevitably), Dawkins, Gregory Chaitin – and Jorge Luis Borges.

"Information" means many things: coded instructions, non-redundant encrypted messages, randomness, complexity (of a technical kind), and of course a more seat-of-the-pants sense of knowledge about the world, though that sense is sometimes ignored and often deliberately excluded from information theory; it tends to get in the way. But in terms of how humans make use of information, the knowledge part is paramount.
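Shannon's technical sense of the word can be made concrete. Here is a minimal sketch – my own illustration, not anything from Gleick's book – that measures a message's information content as the entropy of its character frequencies, in bits per symbol:

```python
from collections import Counter
from math import log2

def entropy_bits(message: str) -> float:
    """Shannon entropy of a message's character distribution, in bits per symbol."""
    total = len(message)
    probs = [count / total for count in Counter(message).values()]
    return sum(-p * log2(p) for p in probs)

print(entropy_bits("abababab"))  # 1.0 — two symbols, equally likely
print(entropy_bits("abcdefgh"))  # 3.0 — eight distinct symbols
```

A run of one repeated character comes out at zero bits per symbol: pure redundancy, no surprise. That is the sense in which randomness carries the most information and knowledge about the world carries none in particular – exactly the tension the paragraph above describes.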

I was continually amazed, while reading The Information, at a theme that Gleick brings out vividly: before electric telegraphy, you just didn't know what people outside of your daily ambit were doing. The weather, more than a horizon away, was a mystery.

The very idea of a "weather report" was new. It required some approximation of instant knowledge of a distant place. The telegraph enabled people to think of weather as a widespread and interconnected affair, rather than an assortment of local surprises. (147)
Even something as simple as "let's meet somewhere far from here on a given day": think of how hard that would be to arrange for just two people from two distant places, let alone for something like an Olympic Games. I often wonder how people managed life outside their immediate circle before the telegraph: letters and calendars are fine, of course, but the logistics of synods and festivals and other committee business of the early modern world must have been terribly labor-intensive and highly approximate.

The most evocative parts of Gleick's book concern the mapping of descriptive information onto coded databases – whether in real life, as in the way that DNA serves as a blueprint for an entire organism, or in imagination, as Borges's Library of Babel contains the sum of the universe's possibilities. Algorithms now seem to be able to generate anything. As data storage becomes cheaper and smaller, our knowledge of everything grows in definition and completeness: which also entails our being able to make less and less use of it, unless we design new algorithms to extract meaning from information.

When I was a kid, the Macmillan Baseball Encyclopedia – impossible without computers even in 1969 – listed every player who'd ever played major-league baseball, with their games played, times at bat, runs, hits, extra-base hits, batting average, and slugging percentage (or the corresponding stats for pitchers: starts, complete games, earned-run average, and the like). Each player's stat line consisted of ten or twelve entries for each season he'd played. It was staggering. It was the best book ever. It was updated every few years. I still have a couple of its editions, and I still consult them from time to time.

Not all that much, though, because now there's Baseball-Reference.com. The main stat table for a player there – let's take Adrian Beltre, he's my current favorite – contains 26 basic statistical categories for each year of his career. A bit fuller than a Macmillan entry, but quite similar. A secondary table of value stats derived from various analytic formulas has another 19 entries per year. You can then click on fielding stats to add another 49, and on batting to add another 143. But that's not all; you can click on another tab to see the 26 basic batting categories in 190 different situations for each of the 18 years Beltre has played: at least 88,920 basic entries for batting alone; and if you would like to get more "granular," you can see another 36 columns for each of the 2,486 games Beltre has played as of this morning, another 89,496 entries. And B-Ref is updated daily.
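Those totals are straightforward products, and a tally this large is easiest to trust after checking it. A quick back-of-the-envelope script, using only the figures quoted above:

```python
# Figures quoted above from Adrian Beltre's Baseball-Reference pages.
basic_categories = 26   # basic batting stats per situation
situations = 190        # situational splits per season
seasons = 18            # seasons played

situational_entries = basic_categories * situations * seasons
print(situational_entries)  # 88920

game_log_columns = 36   # columns in the per-game log
games_played = 2486     # career games as of this morning
print(game_log_columns * games_played)  # 89496
```

Nearly 180,000 cells of batting data for a single player – and that's before fielding, pitching against, and whatever the sensors record next.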

Of course, it may not really matter how many times Beltre struck out with two outs and a runner on third base (only) as a member of the 2009 Seattle Mariners (three times, as a matter of fact). The thing is, we know it; it's recorded, it's accessible quickly and cheaply, and we might as well know it as not. And in a few years cameras and motion sensors will track every move Beltre makes on the field to examine the efficiency of his play – I think they're actually doing this already, just not publishing the results online.

The result is a bit like the Library of Babel. The records of baseball now extend far beyond any human's ability to know or care about them; there are facts in Adrian Beltre's B-Ref pages that will never be consciously processed by a human being.

I have sometimes dreamed of a database that would record human activity completely: every trip ever taken in an automobile, every time somebody ever turned a lightbulb on or off, every phone call ever made – well, I guess the NSA's got that covered – every watching of every episode of every TV show ever. But quite apart from its uselessness, where would you store that information? You'd need one of those Borgesian maps drawn at 1:1 scale. Or perhaps the setup that Steven Wright has for his collection of billions of seashells. (He keeps it on all the beaches of the world.)

Gleick, James. The Information: A History, a Theory, a Flood. New York: Pantheon, 2011. Z 665 .G547. Kindle edition.