Last week, I had the great opportunity to chat with Seth Grimes about some of the work he is doing, his upcoming conferences and what he feels are the elements of any organization’s text-analytics strategy. Seth consults on analytics strategy via Alta Plana Corporation, which he founded in 1997. He is the contributing editor at TechWeb’s InformationWeek and founding chair of the Text Analytics Summit and Sentiment Analysis Symposium conferences. He’s also one of the most respected thought leaders in the text analytics space. I had the chance to chat with Seth a little more about his upcoming plans, the text analytics symposium and the future of text analytics.
Jen Roberts (JR): Seth, thanks for joining me this morning. I really appreciate the time. Can you provide a little background on your work and what you are working on currently?
Seth Grimes (SG): Thanks, Jennifer. The aim of my work is to advance the cause of smarter business decision-making through analytics. I consult to and advise user organizations and solution providers and also follow the research community, creating value for each, I hope, and I write for publication whenever I can find, or make, time.
Right now, I’m most actively working on the two conferences you mention in the intro, the Text Analytics Summit and the Sentiment Analysis Symposium, and readers will want to check out my report, “Text/Content Analytics 2011: User Perspectives on Solutions and Providers”, which has just come out. It covers uptake, plans, and perceptions of text and content analytics technologies and solutions and their business-focused applications. Folks can download the report for free from altaplana.com/TA2011. It presents findings from a survey of current and prospective users as well as a qualitative look at the market.
JR: How about defining text (and content) analytics in just a couple of sentences?
SG: My definition: “Text analytics” describes software and transformational steps that discover business value in “unstructured” text. (Analytics in general is a process, not just algorithms and software.) The aim is to improve automated text processing, whether for search, classification, data and opinion extraction, business intelligence, or other purposes.
Text analytics draws on data mining and visualization and also on natural-language processing (NLP). Supplement NLP with technologies that recognize patterns and extract information from images, audio, video, and composites and you have content analytics.
Of course, others may use the terms a bit differently. There’s a lot of settling-out to do although I doubt we’ll ever arrive at a set of agreed definitions for anything in the IT realm.
JR: It’s one thing to think ‘wouldn’t it be great to get like minds together to discuss text analytics’, another matter altogether to put together a conference. What was the genesis behind the Text Analytics Summit? How has the summit evolved over time?
SG: We’ve had 8 summits so far, and it has been a privilege to have been involved from the start, from the first Boston summit in 2005 to the summit planned for this coming November 10-11 in San Jose, California. Admittedly at the start we were vendor-heavy, but given the huge acceleration in market adoption – for business (as opposed to government and research), spurred largely by customer-experience, marketing, and social analysis applications – we’ve had great end-user participation in recent years.
How did the summit get started? I had a phone call from Matt Muldoon, who was doing market research for conference owner FC Business Intelligence. I did some projects bringing text into BI applications in ’96-’97, and my first article on text mining came out in 2002, I think, and I had gained visibility as one of the few analysts in the space. Matt’s research showed demand for a business-focused conference, and he asked me to chair, help put together the program, recruit speakers, moderate panels, and so on. It has been a lot of fun.
JR: Sentiment analysis gets a lot of scrutiny. What is your definition of sentiment analysis?
SG: Sentiment analysis finds, and distills the business value from, attitudes, emotions, mood, and opinions in social, online, and enterprise sources. It draws on text and also on other forms of content and numbers. As for other forms of content, there’s emotion in speech, for example, indicated by volume, pace, intonation, and other non-textual clues, and analysis of images and video might be similarly revealing. That said, text is the Number 1 sentiment-analysis source, driven by demand to know what’s being said on social platforms about products, brands, topics such as the economy, and public figures.
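Editor's note: to make the idea of text-based sentiment analysis concrete, here is a minimal sketch of one simple approach, lexicon-based scoring. It is an illustration only, not a method Seth described. The word list, the negation rule, and the function names `sentiment_score` and `label` are all assumptions made for this example; production tools rely on much richer linguistic and statistical models.

```python
import re

# Hypothetical toy lexicon (assumption): word -> polarity.
# Real systems use far larger, domain-tuned resources.
LEXICON = {
    "love": 1, "great": 1, "good": 1,
    "hate": -1, "terrible": -1, "slow": -1,
}
# Hypothetical negators (assumption): flip the polarity of the next word.
NEGATORS = {"not", "never", "no"}


def sentiment_score(text: str) -> int:
    """Sum word polarities, flipping a word that directly follows a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = LEXICON.get(token, 0)
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def label(text: str) -> str:
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in (
        "I love this phone, the camera is great",
        "The service was not good and the app is slow",
        "It arrived on Tuesday",
    ):
        print(label(post), "|", post)
```

A word list like this misses sarcasm, context, and domain-specific vocabulary, which is part of why sentiment analysis draws the scrutiny mentioned in the question.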
JR: You have an upcoming Sentiment Analysis Symposium. How does this differ from the Text Analytics Summit? It seems there would be some overlap. What can attendees expect?
SG: The summit is broader, the symposium more focused. The summit covers sentiment analysis although obviously in less depth than the symposium, and it covers many topics and applications that don’t involve sentiment. The symposium in turn gets into topics such as crowd-sourced sentiment analysis that the summit doesn’t touch.
Both should be very accessible to business users while still appealing to technologists. At both, expect an interesting mix of talks and panels and networking opportunities, including with exhibiting vendors. You can learn more by checking out the online agendas.
Audience-wise, pre-summit workshops target newer users, and the afternoon before the symposium, we similarly have an in-depth Practical Sentiment Analysis tutorial, but the symposium also adds innovative vendors and the research community to the mix via a series of lightning talks and an optional pre-summit research session.
JR: With the emergence of social media data, there is increasing talk about managing and analyzing Big Data. What are your thoughts on this emerging topic?
SG: Big Data is an interesting term, but essentially it’s marketing coinage, a way to suggest that an organization’s existing data systems aren’t up to handling modern data and processing demands. There’s truth there: Database systems designed for transaction processing won’t handle high-velocity, high-volume Web click-streams, telecom call-detail records, and sensor data, nor all the “unstructured” text out there, nor representations of the social graph.
If you associate Big Data exclusively with key-value stores, machine-generated data (for instance, from Web servers and sensors), and the like, you miss out on information captured in online and social conversations, valuable business information that explains the causes of the behaviors revealed in click-stream and transaction analyses, that can help you (via sentiment analysis) boost customer satisfaction, that reveals customer purchase intent and can predict stock prices.
So I see Big Data becoming passé, with attention (rightly) moving to complex data and integrated analytics.
JR: How do you see the field of text analytics evolving? Have there been any surprises in the past couple of years?
SG: Of course technical capabilities continue to improve, moving into real-time processing, into more languages and business domains and functions, offering wider information-extraction capabilities including for subjective data (opinions and emotions), temporal events, and facts and relationships.
But it’s funny, I can’t think of any surprises, well, not on the technology front. On the business front, I’m pretty amazed that neither Microsoft nor Oracle has any text-analytics market presence at this point. SAP and IBM have it, and HP is buying it with the Autonomy acquisition.
IBM’s Watson Jeopardy-playing system, which relies on natural-language processing to feed its knowledge base and parse questions, marked an inflection point, pointing the way toward wide enterprise adoption. Watson demonstrated key directions: Question answering, integrated analysis of text and other data forms, data-based reasoning, even language generation. Look for those capabilities and others to develop and hit commercial markets in coming years.
JR: What are the critical elements of a text analytics strategy? And how much does this approach change between analyzing social media and analyzing private text data?
SG: I see three critical elements in any organization’s text-analytics strategy:
- You have to identify information with significant business value in text sources.
- You have to figure out what techniques and tools will best help you get at that information and transform it to create business insight.
- You have to understand how you’re going to use what you’ve learned to improve and optimize business processes, to better reach business goals, whatever they may be.
In practice, an organization is going to step through that list backwards. For instance, you might say, “We need to improve customer experience in order to win new customers, boost satisfaction, and decrease churn” (all of which, of course, contribute to competitiveness and profitability). Given the customer (as opposed to market) focus, you might decide that analysis of surveys, contact-center notes, and other customer interactions, linked to transactional records and customer profiles, is the way to go. As for information you should be looking for, you might try to extract the basic Who, What, When, Where, How of each individual interaction, classified by product, touch point, customer profile, and experience, in order to look for satisfaction and dissatisfaction causes and remedies.
What you shouldn’t do is start with “People are talking about our products on Twitter and Facebook. How can we get them to say nice things?”, which is not sensible, not strategic.
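Editor's note: the customer-experience example above can be sketched in a few lines of code. This is a hedged illustration, not Seth's method or any vendor's product. The `Interaction` record, the complaint cue words, and the sample notes are all hypothetical; a real project would extract these fields with text-analytics tooling and link them to transactional records and customer profiles.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Interaction:
    """One customer interaction (hypothetical schema for illustration)."""
    customer_id: str   # who
    product: str       # what
    date: str          # when, as an ISO date string
    touch_point: str   # where (survey, contact center, ...)
    note: str          # free-text description of what happened


# Hypothetical cue words (assumption) that suggest dissatisfaction.
COMPLAINT_CUES = ("refund", "broken", "cancel", "waited", "rude")


def is_dissatisfied(interaction: Interaction) -> bool:
    """Flag a note that contains any complaint cue (simple substring match)."""
    note = interaction.note.lower()
    return any(cue in note for cue in COMPLAINT_CUES)


def dissatisfaction_by_segment(interactions) -> Counter:
    """Count flagged interactions per (product, touch point) pair."""
    counts = Counter()
    for item in interactions:
        if is_dissatisfied(item):
            counts[(item.product, item.touch_point)] += 1
    return counts


# Made-up sample data, for illustration only.
SAMPLE = [
    Interaction("c1", "router", "2011-09-01", "contact center",
                "Customer waited 40 minutes and asked for a refund."),
    Interaction("c2", "router", "2011-09-02", "contact center",
                "Device arrived broken."),
    Interaction("c3", "router", "2011-09-03", "survey",
                "Setup was easy, very happy."),
    Interaction("c4", "modem", "2011-09-03", "survey",
                "Wants to cancel after repeated outages."),
]

if __name__ == "__main__":
    for segment, count in dissatisfaction_by_segment(SAMPLE).most_common():
        print(segment, count)
```

Grouping by product and touch point mirrors the classification described in the answer: the segments with the most flagged interactions are where to start looking for causes and remedies.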
JR: For people who want to read more about your thinking on text analytics and sentiment analysis, where should they go?
SG: Simplest is to just point folks to my Twitter account, @sethgrimes, where they can find a link to a site with a whole bunch of links. I’d also point people to my Vimeo account where I have speaker videos posted from last April’s New York Sentiment Analysis Symposium and last October’s Smart Content conference, which focused on content analytics.
By the way, readers who’d like to attend the upcoming November 9 sentiment symposium can use the registration code FOAF100 for a $100 discount, and you can get $200 off the November 10-11 text summit with the code SETH33. Thanks again for the opportunity to talk about all this stuff!
JR: Thanks for your time, Seth, and good luck with your upcoming events!