Blog comments. Reviews. Call centre transcripts. Social media content. The list goes on. These days, textual data comes in many forms and from many sources. But what to do with it all? Many companies have terabytes of the stuff but have yet to understand how to exploit it properly.
Text analytics technology remains young, and many companies still lack a basic understanding of it. Most of the early adopters are web and mobile start-ups in technology-centric industries, and they tend to be based in English-speaking countries. The question is whether the analytics industry can push on to new territories and sectors, and whether privacy and security concerns, including those around the use of cloud-based services, will prove to be long-term obstacles. The potential is vast, but so, maybe, are the challenges.
Data Driven Business recently caught up with several European text analytics experts to discuss these issues in more depth.
INNOVATIONS: What are the main obstacles to growth, and what innovations are we likely to see in the next few years?
Federico Cesconi, CEO and founder, ISV World: I think we’ll see two things. The first is more solutions based in distributed computing environments, like Hadoop or Cloudera. The second is more multilingual applications. The American vendors have focused a lot on English applications, which makes sense, because you have a market of 300 million consumers. But to reach Europe’s 300 million consumers, we’ll need to speak 20 different languages. We started from the beginning with four languages, so we have a lot of know-how around them. It’s going to be very important. If you want to properly serve a market, you need a high level of proficiency in the language you want to support.
In the past, companies like IBM SPSS translated everything into English, and then mined in English. But that’s very expensive, and you immediately lose something in the translation. That’s where the vendors offering specific language solutions can make a big difference. The sentiment of what people say in different languages is totally different.
Themos Kalafatis, Independent Consultant: As analytics becomes more mainstream, the tools are becoming multilingual. There are now solutions for German and French, for example. However, there is still a lack of tools for other languages, including some Eastern European ones, and this creates problems. If an analyst wants to use pre-processing techniques such as stemming in those languages, it’s not possible. There is a lot of room for improvement here. But will someone invest in solutions for a country of four or five million people? I’m not sure about it.
Rosaria Silipo, Independent Consultant: A lot of text mining providers are putting their services on the cloud, because of the need for speed. But I’m not sure it’s necessary, or a good idea, for all businesses. I think you can combine text analysis on the cloud for some businesses with text analysis off the cloud for others. The cloud is seen by many people as having privacy issues, whether this is true or not. For example, banks are still very reluctant to put their data on the cloud. Until it is perceived by everybody as fully secure, text mining providers need to offer different options, even at the price of speed.
Rene van Erk, CEO, CustVox: I think we’re going to see much richer information available. Today, if I need information on a company and its subsidiaries, then maybe Dun & Bradstreet has a good picture of the overall company. But it doesn’t give you any more detailed information, such as country-by-country data. Text analytics will allow me to read all the relevant financial statements, so I can tell if one of the subsidiaries is not doing well, for example. If you just look at the overall numbers, you can’t capture that sort of information. You need to look at the text as well. Richer data can greatly improve decision-making. If I want to know whether to lend money to a company, for example, I need to know about its financial health in all those subsidiaries, not just the overall business.
I think Salesforce.com may end up replacing Dun & Bradstreet, because it has so much data available in its own cloud. It’s not for nothing that they paid $1.5m for the Data.com domain a few years ago. They can’t use individual details about individual transactions. But they can consolidate it and use it. The amount of data that they can have access to is of a completely different scale. I think the whole area of business information may evaporate in its traditional form, to be replaced by companies that know how to deal with big data. If the technologies continue to advance at the rate they have been advancing, there is a potential for industries worth billions to be replaced by companies using these new technologies.
If you’d like to find out more about potential growth markets for text analytics, what customers want from their TA solutions, and how the industry can increase customer adoption, download the full report here.
These issues will also be covered in depth at our upcoming text analytics summits in London and Boston.