UPDATED 08:41 EDT / JULY 17 2013

We Take On the Next Big Data Mystery @ MIT: Quality vs. Quantity [LIVE Broadcast]

Quality vs. Quantity – this is the looming question when it comes to, well, almost anything.  The same question applies to the busy Big Data industry, now that Big Data is becoming an expected standard.  Now that we can collect and analyze more data than we ever imagined, how do we determine the quality of that data? How would you define data quality in Big Data, and what are the criteria?

First off, data is considered high quality if “they are fit for their intended uses in operations, decision-making and planning,” a definition easily applied to the emerging applications of Big Data.

Big Data should indeed undergo data cleaning, or data cleansing, in which corrupt or inaccurate records in a record set, table, or database are detected and then corrected by replacing, modifying, or deleting them.  That’s easier said than done, especially when you’re dealing with huge amounts of data.  One of the best ways to go about this is to cluster the data to find records with similar features, ultimately making the data more useful for data scientists.
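As a rough illustration of the detect-and-correct steps described above, here is a minimal Python sketch. The sample records, field names, validity range, and the choice of the median as a replacement value are all assumptions made for this example; they are not drawn from any tool or method discussed at the symposium.

```python
from statistics import median

# Hypothetical readings; None and -999.0 stand in for corrupt values.
records = [
    {"id": 1, "city": " boston ", "temp_f": 71.5},
    {"id": 2, "city": "Cambridge", "temp_f": None},
    {"id": 3, "city": "", "temp_f": 69.0},
    {"id": 4, "city": "Cambridge", "temp_f": -999.0},
    {"id": 5, "city": "Boston", "temp_f": 73.0},
]


def is_valid_temp(value):
    """Detect step: a reading is usable if present and in an assumed range."""
    return value is not None and -100.0 <= value <= 150.0


def clean(rows):
    """Return a cleaned copy of rows using replace, modify, and delete."""
    # Median of the valid readings serves as the assumed replacement value.
    fallback = median(r["temp_f"] for r in rows if is_valid_temp(r["temp_f"]))
    cleaned = []
    for r in rows:
        # Modify: normalize whitespace and capitalization in the city name.
        city = r["city"].strip().title()
        if not city:
            # Delete: drop records whose city cannot be recovered.
            continue
        # Replace: substitute the fallback for missing or out-of-range values.
        temp = r["temp_f"] if is_valid_temp(r["temp_f"]) else fallback
        cleaned.append({"id": r["id"], "city": city, "temp_f": temp})
    return cleaned


cleaned = clean(records)
for row in cleaned:
    print(row)
```

At real Big Data scale the same three operations would typically run inside a distributed processing framework rather than a single in-memory loop, but the logic stays the same: decide what counts as invalid, then choose per field whether to replace, modify, or delete.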

Wikibon Principal Research Contributor Jeff Kelly explains in a recent post, titled “Big Data Adds Complexity, Nuance to the Data Quality Equation,” how Big Data quality can be overlooked in some industries, while in others it can mean saving lives.

“Big Data evangelists maintain that the sheer volume of data in Big Data scenarios mitigate the effects of occasional poor data quality,” Kelly writes. “If you’re exploring petabytes of data to identify historical trends, a few data input errors will barely register as a blip on a dashboard or report. Is it even worth the time and effort, then, to apply data quality measures in such a scenario? Probably not.

“But that doesn’t mean data quality isn’t important to Big Data. This is particularly true in real-time transactional scenarios. Big Data applications that recommend medicines and doses for critically ill patients, for one, better be relying on good data. Same goes for Big Data operational applications that support commercial aviation, the power grid and other Industrial Internet use cases,” Kelly says.

Watch today’s LIVE broadcast from MIT’s IQ Symposium

The MIT Chief Data Officer and Information Quality (CDOIQ) Symposium kicks off today in Cambridge, Massachusetts and runs through the 19th.  The symposium will focus on conveying the importance of good data to the success of Big Data via sessions such as How To Avoid The Most Common Big Data Problems, A Practical Approach To Data Governance, IQ and Compliance, The Role Of IQ In Performance Excellence, Human Factors In Information Quality, IQ Issues In Public Sector, Government, Healthcare, Finance, and The Latest Information Quality Research From MIT.

SiliconANGLE’s premier video production, theCUBE, will be at the event, extracting the signal from the noise, and you can watch our coverage at SiliconANGLE.tv or tune in for updates here on SiliconANGLE, Wikibon, and on Twitter – @SiliconANGLE, @CDOIQ, and @Wikibon.

Joining Kristin Feledy on this morning’s Live NewsDesk Show is SiliconANGLE Senior Managing Editor Kristen Nicole, discussing some of the topics we’ll be investigating at the MIT event.  In the video below, Kristen provides her Breaking Analysis on how data practitioners should go about selecting good-quality data for analysis.

“This is an ongoing debate in the industry right now and we’ve reached a point… where we collected all these data, we’ve created ways to analyze it and now there’s a lot of data that’s here and we have to determine if that data is worth our time, if it’s not how we can determine the best data out of all these information that we are now able to collect and analyze.

“For businesses looking to make those determinations, there are certainly some emerging standards that are coming into the industry now, and that’s one of the topics we’ll be looking at closely throughout the rest of this year particularly for the MIT event that kicks off today.  Because this is going to be an increasing importance as more companies look to use data in their everyday practices, decision making, things of that nature,” Kristen explained.

