Big Data is big, but still not trustworthy: study

By Joe McKendrick | October 17, 2012, 7:34 PM PDT

It’s a value proposition too good for organizations to pass up — taking the massive amounts of data they are accumulating inside and outside their walls, and mining for diamonds of information amidst many endless veins of coal. Unfortunately, the diamonds may be tiny and very difficult to find.

Companies are making progress with organizing and sifting through their internal data, such as call center interactions, sales transactions, and product levels. However, they’re not ready for external information. Ultimately, what makes Big Data big is external web information, especially social media, and most organizations are leery about the validity or trustworthiness of such data.

These are the conclusions of a report just released by IBM and the Saïd Business School at the University of Oxford, based on a global survey of 1,144 business and IT professionals. Three-quarters (76%) of the respondents are currently engaged in Big Data development efforts, mostly in the early stages.

Most Big Data initiatives currently being deployed by organizations are aimed at improving the customer experience. Yet, despite the strong focus on the customer, fewer than half of the organizations engaged in active Big Data initiatives are currently collecting and analyzing external sources of data, such as social media. That’s because business leaders see uncertainty as inherent in certain types of data, such as the weather, the economy, or the sentiment and truthfulness of what people express on social networks. In the survey, respondents questioned their ability to trust comments, reviews, tweets and other forms of freely offered opinions online.

As the report states: “Sentiment and truthfulness in humans; GPS sensors bouncing among the skyscrapers of Manhattan; weather conditions; economic factors; and the future. When dealing with these types of data, no amount of data cleansing can correct for it. Yet despite uncertainty, the data still contains valuable information. The need to acknowledge and embrace this uncertainty is a hallmark of Big Data.”

The second major challenge is a growing skills gap when it comes to finding people who know how to manage and mine this data, the survey finds. Only 25% of the survey respondents say they have the required capabilities to analyze highly unstructured data – a major inhibitor to getting the most value from Big Data. Big Data requires the capability to analyze semi-structured and unstructured data, including a variety of data types that may be entirely new for many organizations. Analyzing unstructured data (data that does not fit in traditional databases, such as text, sensor data, geospatial data, audio, images and video) as well as streaming data remains a major challenge for most organizations.

Survey respondents report a range of business opportunities and benefits as a result of their Big Data projects. Nearly two-thirds (63%) of the survey respondents report that using information, including Big Data, and analytics is “creating a competitive advantage” for their organizations. This is a 70% relative increase (26 percentage points) from the 37% who cited a competitive advantage in a 2010 IBM study.

More than half of the survey respondents reported internal data — which is typically structured data — as the primary source of Big Data within their organizations. In more than half of the active Big Data efforts, respondents reported using advanced capabilities designed to analyze text in its natural state, such as the transcripts of call center conversations. These analytics include the ability to interpret and understand the nuances of language, such as sentiment, slang and intentions. Such data can help companies, such as a bank or telecom provider, understand the current mood of a customer and gain insights that can be used immediately to drive customer management strategies.

In addition to customer-centric outcomes, which about half (49%) of the respondents identified as a top priority, early applications of Big Data are addressing other functional objectives. Nearly one-fifth (18%) cited optimizing operations as a primary objective. Other Big Data applications are focused on risk and financial management (15%), enabling new business models (14%) and employee collaboration (4%).

(Thumbnail graphic: Joe McKendrick.)

About Joe McKendrick

Joe McKendrick is a contributing editor for SmartPlanet.

Contributing Editor

Joe McKendrick is an independent analyst who tracks the impact of information technology on management and markets. He is the author of the SOA Manifesto and has written for Forbes, ZDNet and Database Trends & Applications. He holds a degree from Temple University. He is based in Pennsylvania.

Joe McKendrick is an independent consultant and editor. Joe has performed project work for the following companies in the IT marketspace: IBM, Systinet/HP, Teradata. He has performed project work for the following organizations in partnership with Unisphere Research (Unisphere Media): IBM, Oracle Corp., International Oracle Users Group, Oracle Applications Users Group, Professional Association for SQL Server, International DB2 Users Group, International Sybase Users Group.

He writes for SmartPlanet and is not an employee of CBS.

2 Comments

IBM and US IC: Severely Retarded
IBM and the US secret intelligence community share the same intellectual deficiencies -- they just cannot come to grips with two facts: a) 80% or more of the data that is online is Deep Web and NOT subject to search discovery in its present state of infantile development; and b) 80% of relevant information is not digital, not in English, and not even published in analog form. Learn more at http://www.phibetaiota.net.
Posted by Robert David STEELE Vivas
18th Oct
It's the other forms of carbon in "big data" that are the big problem.
What the "big data" advocates don't talk about is how bad much of the data they are sorting through really is. For every carat of diamonds that might be hidden out there, there are also tons of crap contaminating much of what they are hoping to discover. Hardly a week goes by where I don't experience some aspect of this: someone (or some computer) coming to a conclusion about me that is astoundingly incorrect.

Simple examples of this might be the amount of mail I get with my name or address misspelled. Clearly, someone entering data, possibly a decade or more ago, mistyped it into a database that then got sold, and then got combined, and then got resold, etc.

More complex examples come courtesy of debt collectors looking for deadbeats who are 3 or 4 levels of contact removed from me, and yet some data aggregation has somehow connected them with my phone number or address, which has never had anything to do with the individuals involved.

I shudder to think what will start happening when organizations start having more confidence in this technology and begin using it for more than junk mail and cold calls. The results they get will be limited, and the collateral damage to innocent citizens could be considerable.
Posted by JohnMcGrew@...
18th Oct