As regular readers know, I am a strong advocate for transparency in any system where people interact with machines. In fact, such transparency is a core HCIR value, since communication depends on the clarity with which a message traverses the noisy channel of human-computer interaction.
So I was a bit taken aback by a recent blog post in which Stephen Arnold seemed to attack the notion that an effective search engine could be transparent. But a more careful reading led me to believe that he’s reacting, perhaps a bit too cynically, to the increased currency that the word “transparency” has in marketing literature.
Let me try to cut through the marketing hype. Transparency is more than a buzzword to sell software. It is a core value that imposes significant constraints on how a system can act. If a system is not bound by transparency, then it is free to respond to user inputs arbitrarily, unconstrained by any requirement to offer users insight into the basis for its response. In contrast, a transparent system must produce user-consumable explanations of its output. A transparent system can’t get away with saying “if I told you, I’d have to kill you.”
In fact, a transparent system might have to reject a possible response to a user because it can’t present an explanation for the response that the user will understand. For this reason, some machine learning purists reject transparency as overly constraining, and prefer approaches that simply optimize an objective function that, in all likelihood, is completely opaque to the user–and possibly even to the system developer.
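To make that constraint concrete, here is a minimal sketch, purely my own illustration and not a description of Endeca’s or anyone else’s implementation. The names (Result, explain, transparent_search) and the term-overlap “explanation” are hypothetical stand-ins; the point is simply that a transparent system attaches a human-readable rationale to each result and drops results it cannot explain.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Result:
    doc_id: str
    score: float
    matched_terms: List[str] = field(default_factory=list)
    explanation: Optional[str] = None  # user-consumable rationale

def explain(query: str, result: Result) -> Optional[str]:
    """Return a human-readable reason for this result, or None if we can't produce one."""
    overlap = [t for t in query.lower().split() if t in result.matched_terms]
    if not overlap:
        return None
    return "Matched your terms: " + ", ".join(overlap)

def transparent_search(query: str, candidates: List[Result]) -> List[Result]:
    results = []
    for r in candidates:
        rationale = explain(query, r)
        if rationale is None:
            # An opaque system would return this result anyway; a transparent
            # one rejects it rather than say "if I told you, I'd have to kill you."
            continue
        r.explanation = rationale
        results.append(r)
    return sorted(results, key=lambda r: r.score, reverse=True)
```

The interesting design choice is the rejection branch: the system gives up a possibly relevant result because it cannot justify it, which is exactly the trade-off that makes some machine learning purists balk.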
Why is transparency so important in systems that support information seeking, i.e., search and information retrieval systems? Because any system that requires people to interact non-trivially with machines is fraught with communication challenges. Best-effort attempts to extrapolate user intent from a query–often a query composed of only a couple of words–are beyond AI-hard; they’re ESP-hard. While all systems have to accept that they’ll misread users’ intentions a significant fraction of the time, transparent systems at least offer users the opportunity to work with the system to get back on track.
To be clear, implementing transparency isn’t simple. It’s like Mark Twain said: short letters are often harder to write than long ones. In a related vein, the world’s greatest minds aren’t always the world’s greatest communicators. And what holds true for people holds even more so for machines (or the people who program them): it’s hard to develop algorithms that deliver useful results and provide human-consumable explanations for them.
I understand Arnold’s frustration with vendors. And I won’t claim that Endeca always gets it right, though I think (and have been told) that we do better than many in communicating how our technology works. But there is no question in my mind that information seeking support systems have to become more transparent if we want them to work in the real world.