There was a time that when you wanted a quick lunch, you told the cook behind the counter exactly what you wanted. ...'Easy on the onions, and why don't you slice one of those pickles real fine and put it in there, with just a dab of mayonnaise?'...Then we industrialized the process, and the people behind the counter at McDonald's don't even have to know about food or money: They just hit a code for the order on the register.
J. Storrs Hall uses that example in Beyond A.I. to introduce what he calls 'formalist float.' We formalize information for efficiency, either communication or logistics, and in the process we lose customized detail. That's the price we pay for the systems we build. We attempt to codify justice into laws, education into curricula, and information into ones and zeros.
In each of these examples, there's a sizable gap between what the system decrees and what the individual wants or needs, or is attempting to communicate.
I've been thinking about this formalist float as I write about computers struggling with human language. The 'real' world, with all of its complexity, cannot be rendered accurately in symbols. That's why we use the symbols in the first place: to generalize. In a sense, each word we use is as imprecise as that key the McDonald's worker punches for the quarter pounder. There's something I'm thinking, and it's as unique to me as the sandwich with the pickle and the dab of mayonnaise. But you and I don't share a word for that exact thought. So I come as close as I can with our formal vocabulary, and then we use gestures, voice tone, context, and shared memories to narrow the gap between the formal and the individual case. It's that cultural negotiation that the computer cannot understand.
We need words to be inexact, because if they were too precise we'd each have a unique vocabulary of several billion words, all of them unintelligible to everyone else. (Maybe that's what animals have.) I'd have a unique word for the sip of coffee I just took at 6:59 on this fifth of July, which was flavored with the anxiety that I'd better get out on my bike before the day heats up. (That would be as useless to me as to everyone else. A word has to be used at least twice to have any purpose.)
If you think about it, each word is a lingua franca, a fragment of a clumsy common language we struggle with. Imagine I say that I'm 'weary.' I'm thinking one thing, and you might have a very different idea. Maybe I carried a load a long way in the sun. I may have a troubled child. I may have argued with my editor or spent fruitless hours trying to balance my checkbook. You certainly have different ideas, based on your own experience, about what 'weary' means. In addition to all those different meanings, the word might also send other signals to you. Maybe where you come from, it has a slightly rarefied feel, and you're wondering whether I'm signaling my sophistication. In any case, neither of us knows what the other is thinking. But that single word 'weary' extends a tiny bridge between us.
Now, with that bridge in place, the word shared, we dig deeper to see if we can agree on its meaning. You study my expression and my tone of voice. That communicates a lot. Someone who has won the Boston Marathon might look contentedly weary. Another, in a divorce hearing, looks anything but. I may slack my jaw in an exaggerated way, illustrating the word with a gesture, as if to say, 'Know what I mean?' In this tiny negotiation, we're bridging the formalist float. And closing that gap is the challenge for computers like IBM's Watson, the one I'm writing about.
As computers struggle to bridge the formalist float, millions of humans are making it even more difficult for them. We're distancing ourselves from formal structures. With shorthand and abbreviations in text messages, many of us are creating our own patois. Humans have done this forever. It's how Spanish, Portuguese, Italian and French all grew out of Latin. But technology is speeding it up. The meaning of a single emoticon--;>)--evolves day by day, tribe by tribe.
Verbally, we're making it even harder. I hear conversations all the time in which people bypass the formal vocabulary altogether and rely entirely on sounds, gestures and tone. 'So I'm like uuuun, and she's like hhhmmm?' Characters in Jane Austen's novels would find words for these feelings, perhaps 'befuddled' and 'huffy.' Computers could look those words up and have at least an inkling of what we're talking about. They'll never bridge the formalist float entirely--our complexity cannot be reduced to ones and zeros. But eliminating words from our discourse makes their job even tougher.
Why computers can't figure out words
TheodoreOmtzigt said:
I love this article as it gave me a new appreciation of the difficulty of making Web 3.0 a reality. I just came out of a huge deep dive into the intelligence community's SNA research and tool chains, where I had made up my mind that automatic text comprehension had become better and more quantitative than a human being could ever be. Armed with your interpretation of the inherent ambiguity of human language, I can now better place the semantic web as we have it: clustering a million documents and quantifying the relationships between hundreds of thousands of subjects mentioned in these documents is an act of information layering that can sidestep the ambiguity trap, as repetition can resolve the ambiguity. Extracting the intent, poetry, or savagery of a single subject is much harder.
Stephen Baker is the author of The Numerati and a journalist with 20 years of experience at BusinessWeek.