Why Unstructured Data Matters To Brands Right Now

Unstructured data includes everything from audio, including voices and other kinds of sounds, and video, and images – and that’s not easy for a search engine to find and present, notes IBM Watson CMO Jordan Bitterman.

When IBM Chief Digital Officer Bob Lord was recently asked what the most transformative technology is right now, he immediately pointed to unstructured data.

“Using the data to get creative insight is at the top of the list,” Lord said following his presentation at last week’s Ad Age Next conference. “What artificial intelligence will allow you to do is to get at that ‘dark data,’ that unknown data: unstructured data. Right now, as marketers, we have access to ‘known data,’ structured data, which makes up roughly 20 percent of the world’s data.”

AI will allow marketers to get insight on imagery, videos, tweets, text, and voice at scale, Lord said. It can surface consumer behavioral patterns that otherwise remain elusive through regular channels, such as search.

“If you combine the structured and the unstructured data together, you now have insight in ways you never had before,” Lord said, “which will help to improve our targeting, our creative, and ultimately, the value chain that we create for that consumer.”

All good points.

But what exactly is unstructured data? And why is it important for brands to understand it at this point in time?

IBM Chief Digital Officer Bob Lord at Ad Age Next 2017

For that, Lord suggested we check in with Jordan Bitterman, CMO for IBM’s Watson Content & IoT Platform, which oversees The Weather Company, Watson Advertising, and Watson Media.

In a conversation, Bitterman offered a simple analogy: think of the difference between structured and unstructured data as the difference between scripted and unscripted moments. A scripted moment has been written down, saved, or otherwise collected in an easily sharable, repeatable, widely accessible format. An unscripted one is more nebulous, harder to grasp, original, and doesn’t fit a preconceived design or arrangement.

“Scripted versus unscripted is a good way to begin to understand what makes unstructured data important,” Bitterman said.

GeoMarketing: What is unstructured data and why is it important right now from a marketing perspective?

Jordan Bitterman: There are a number of ways of explaining this. Right now, if you were to do a search on Google (or another prominent search engine), you would only be able to access a piece of the world’s data, since most of it is not searchable by their indexes.

Obviously, consumers don’t need most of that data because structured data does a fine job for us as we conduct our online searches. But when you start getting down to what businesses actually want to accomplish, it goes so far beyond what a search engine is able to crawl.

Search engines crawl structured data. It’s all organized, it’s all cleansed; it’s all zeros and ones. That makes structured data easy for search engines to find. Whereas, unstructured data includes everything from audio, including voices and other kinds of sounds, and video, and images – and that’s not easy for a search engine to find and present.

What’s the scope of unstructured data? And how does it impact the development of AI for marketing purposes?

Depending on the statistics you want to cite, 80 percent, 85 percent of the world’s data is not structured. It’s not organized; therefore, unstructured data can’t be indexed by search engines.

While Watson and a lot of other AI tools are not directly focused on consumer uses, they are focused on business uses. It’s where businesses, enterprises, want to be able to utilize all sorts of data, and that’s where this stuff gets really interesting.

Jordan Bitterman, CMO for IBM’s Watson Content & IoT Platform, which includes The Weather Company, Watson Advertising, and Watson Media.

What are some use cases that showcase the utility and function of unstructured data?

One example could involve a hotel chain that wants to be able to take all its structured data to inform its insights about the travel industry. It might want insights about its business so it can plan for the same hotel usage patterns for the next year on the same date.

That’s all structured. You have that already, so you can bring that into a tool for predictive analysis. But, you could also bring in all sorts of other tools that don’t exist right now.

For instance, consider an image site such as Pinterest or Flickr or Instagram. A marketer wants to be able to identify what images seem to be getting uploaded the most at any given time. It gives them the insight to say, “Alright, it looks like Europe is popping and that’s an area of the world we need to focus on. It looks like the country of Honduras is popping. We gotta focus on that.” You can do that from images in that example.

That becomes a data set that previously you wouldn’t have access to, and with unstructured data, now you do.

So unstructured helps marketers grasp the rise of interests across less tangible pieces of information – as compared to something more concrete like keywords – in real-time. Is that correct?

Absolutely. And, because audio is involved, if you were, say, a hotel chain, you’d also be able to listen to everything that comes off all your call centers and structure it into data and insights. What is it that people are excited about, what are they upset about, where’s the friction in the system, where is it not? Knowing all of that helps them too.

In thinking about the limitations of search engines in providing answers from unstructured data, and the ability of platform companies such as Facebook, Pinterest, or Amazon’s Alexa to generate content from unstructured data, what role does Watson play in connecting marketers and consumers?

It’s the job of whoever is building a solution that needs to be usable. Google might want to incorporate that into a more robust search product.

Facebook may want to use it because when you take a picture of you and three or four friends, the second you upload it to Facebook, it automatically tells you who’s in that photo. So, that helps that. Those are both consumer uses, but as a business use, it’s up to IBM and Watson to help marketers make sense of unstructured data and provide insights, understanding, and clarity about what that information is.

Let’s just go back to the hotel analogy for a second.

For us, it starts by working with that hotel chain to determine how they want to utilize that data. It could be an insight engine on year-over-year sales, it could be on a CRM tool, there’s a million different ways it could be used. So, it’s incumbent on us, with our client partners, to determine which uses best affect the business.

Why is unstructured data so important in this moment right now? Is it because the search engines simply can’t keep up with the amount of data that has been, and continues to be, created by artificial intelligence applications?

About 15 years ago, search engines and browsers were able to index something like 20,000 web pages. And now there’s over 1 billion webpages on earth.

Obviously, traditional text search has done such a great job bringing that all together. What we need to do is the same thing with voice, but it goes so far beyond 1.2 billion pages because it goes into like an infinite number of potential experiences.

Is the use of unstructured data primarily around search?

No, it’s not just search: it’s navigation, it’s business purposes, it’s customer service. All of that needs to be able to understand context in the spoken word.

We’re not doing it for search; we’re doing it for other means. My guess is that search engines are applying spoken word AI in so many different ways into various search engines that they will be able to keep up. But, yes, they’re behind where they want to be, but they’re certainly not behind where customer uses are.

How do IBM and Watson Advertising view the possibilities for its businesses?

For us, the way we’re looking at it is that, not that Google is straight B2C, or Amazon is straight B2C, but the way that Google Home and Alexa work is that they’re trying to help you with easy household skills. I say, “easy,” it’s not that it’s simple to program, or it’s not super cool, but they’re pretty much rote. It’s like, “Alexa, play Pandora,” or something. And, those are things that, you know, it’s taking that tool to a next logical step.

What we’re doing is we’re applying insights to the B2B world. So, while Alexa has something like 2,000 or 3,000 different skills, right now we have 38 skills. That doesn’t mean that Alexa’s better than our skills, it just means that our skills are things like natural language processing, language translation, personality insights, behavior sentiment. That’s all stuff we then can apply, not to building a search engine because that’s not our business, but to helping a hotel chain figure out who their next best customer is, or helping an airline decide whether they want to put a flight in the air or ground it due to some sort of circumstance that’s going on somewhere.

That’s how we’re applying it. And speech is, in my opinion, every bit as important in that B2B context as it is in a B2C context.

Are there any other examples of how voice activation and unstructured data impact consumers’ connections with brands?

The example I would use there would be in customer service.

Right now, when you make a phone call to your credit card company or your phone company, you get on the phone with them and you’re asking a question, you’re talking to a live voice. Or, perhaps before you got to that live voice, you get a series of prompts: you hit 2, you hit 4, you hit 1, you hit #.

That’s all the current/traditional way of doing things. But in the future, what happens is, we can speak all of our responses. We’re seeing a little bit of this right now when you call the phone company.

There’s going to be a sequence of questions that a digital customer service representative will ask you. You’ll start answering those questions, and the subsequent questions will be chosen because it’s learning: it’s asking you questions that it thinks you want to get into, rather than just going through the script of, “Ask question 1. Ask question 2. Ask question 3.”

Now, it’ll ask you question 1. Then, it might ask you question 6. Then, it might ask you question 9. And in addition to that, it’s also measuring your sentiment. So, it’s going to be able to tell whether you’re frustrated, whether you’re happy. So, it knows kind of who you are in that moment, like, what it’s dealing with. Hopefully as well as, if not better than, the way a real customer representative would.

That’s how the unstructured data plays, so that it really gets to understand the intent, not just going through a list of programmatic questions and responses.
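To make the idea of an unscripted question sequence concrete, here is a minimal Python sketch of a digital representative that picks its next question from what the caller just said and keeps a running sentiment score. It is a toy illustration only, not how Watson or any IBM product works: the keyword lists, the question bank, and the names `Conversation`, `next_question`, and `sentiment` are all hypothetical, and a real system would rely on speech recognition and trained language models rather than word matching.

```python
from dataclasses import dataclass, field

# Hypothetical keyword lists; a real system would use a trained sentiment model.
NEGATIVE = {"frustrated", "angry", "upset", "annoyed", "terrible"}
POSITIVE = {"thanks", "great", "happy", "perfect", "good"}

# Hypothetical question bank: number -> (question text, topic keywords).
QUESTIONS = {
    1: ("What can I help you with today?", set()),
    2: ("Would you like to hear your balance?", {"balance"}),
    6: ("Which charge looks wrong to you?", {"charge", "billing", "bill"}),
    9: ("Should I block the card and send a replacement?", {"lost", "stolen", "fraud"}),
}


def tokenize(text):
    """Lowercase the reply and strip basic punctuation from each word."""
    return {word.strip(".,!?").lower() for word in text.split()}


def sentiment(text):
    """Crude score: positive keyword count minus negative keyword count."""
    words = tokenize(text)
    return len(words & POSITIVE) - len(words & NEGATIVE)


@dataclass
class Conversation:
    asked: list = field(default_factory=lambda: [1])  # question 1 opens the call
    mood: int = 0  # running sentiment across the caller's replies

    def next_question(self, reply):
        """Pick the unasked question whose topics best match the reply."""
        words = tokenize(reply)
        self.mood += sentiment(reply)
        best, best_overlap = None, 0
        for number, (_, topics) in QUESTIONS.items():
            if number in self.asked:
                continue
            overlap = len(words & topics)
            if overlap > best_overlap:
                best, best_overlap = number, overlap
        if best is not None:
            self.asked.append(best)
        return best  # None means no relevant question is left


call = Conversation()
call.next_question("I'm frustrated, there is a charge on my bill I don't recognize.")
call.next_question("Also my card was stolen.")
print(call.asked, call.mood)
```

In this sketch the order of questions comes from the caller's words rather than from a fixed script, and the running `mood` value is what a fuller system might use to soften its tone or hand the call to a person.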

About The Author
David Kaplan @davidakaplan

A New York City-based journalist for over 20 years, David Kaplan is managing editor of A former editor and reporter at AdExchanger, paidContent, Adweek and MediaPost.