UPDATED 07:00 EDT / MARCH 19 2018

BIG DATA

Io-Tahoe brings machine learning to data discovery and cataloging

Another company has joined the smart data catalog party.

Io-Tahoe LLC, a unit of British utility giant Centrica PLC, today is introducing a machine learning-driven data discovery product that it says can find and classify data across a wide range of platforms, from traditional databases to semistructured data lakes.

At the center of the software is a data catalog that uses a set of 14 machine learning algorithms to create, maintain and search business rules, define policies and support governance-based workflows. The software automatically enriches metadata and enables business users to create and manage policies and rules.

Data catalogs are similar to card catalogs in a library. They tell where data elements can be found and may also include other information, such as ownership, intended use and governance policies.

Io-Tahoe said its machine learning technology can look beyond metadata to the underlying source data for deep visibility into complex data sets. The company has filed for two patents on its relationship discovery technology, which examines the primary and foreign key relationships in relational tables and plots them on a map.

“We look at data only, so if you have a field like a transaction ID that goes across multiple databases, we’re able to find it,” said Chief Executive Oksana Sokolovsky. The software can also be used for impact analysis to help organizations detect changes in data. “A week or a year later, we can look at the databases and see how the landscape has changed over time, as well as if data elements have been introduced without the company’s knowledge,” Sokolovsky said. 
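To make the idea of finding a shared field "from data only" concrete, here is a minimal Python sketch of one textbook approach: value containment, also known as inclusion dependency. It is not Io-Tahoe's patented method, which the company has not described in detail. The table names, column names and the `find_candidate_links` helper are hypothetical and exist only for illustration. A column is flagged as a likely foreign key when all of its values appear in a unique column of another table, regardless of what the columns are called.

```python
from itertools import permutations


def profile_columns(tables):
    """Map (table, column) -> (set of non-null values, whether values are unique)."""
    profiles = {}
    for table_name, rows in tables.items():
        if not rows:
            continue
        for column in rows[0]:
            values = [row[column] for row in rows if row.get(column) is not None]
            distinct = set(values)
            profiles[(table_name, column)] = (distinct, len(values) == len(distinct))
    return profiles


def find_candidate_links(tables, min_containment=1.0):
    """Return (child, parent, containment) triples that look like foreign keys.

    A child column qualifies when at least `min_containment` of its distinct
    values appear in a parent column of another table whose values are unique.
    Column names are never compared; only the data is.
    """
    profiles = profile_columns(tables)
    links = []
    for child, parent in permutations(profiles, 2):
        if child[0] == parent[0]:
            continue  # only look across tables
        child_values, _ = profiles[child]
        parent_values, parent_is_unique = profiles[parent]
        if not child_values or not parent_is_unique:
            continue
        containment = len(child_values & parent_values) / len(child_values)
        if containment >= min_containment:
            links.append((child, parent, containment))
    return links


# Hypothetical sample data: the same transaction ID lives under two different names.
tables = {
    "payments": [
        {"txn_id": 101, "amount": 25.0},
        {"txn_id": 102, "amount": 40.0},
        {"txn_id": 103, "amount": 15.5},
    ],
    "refunds": [
        {"reference": 101, "reason": "duplicate"},
        {"reference": 103, "reason": "damaged"},
        {"reference": 101, "reason": "late"},
    ],
}

links = find_candidate_links(tables)
for child, parent, containment in links:
    print(f"{child[0]}.{child[1]} -> {parent[0]}.{parent[1]} ({containment:.0%} contained)")
```

On this sample, the only link reported is `refunds.reference` pointing at `payments.txn_id`, even though the two columns share no name. Running the same profiling on snapshots taken at different times and comparing the results is one simple way to approximate the kind of change detection Sokolovsky describes, though a real product would need sampling, type checks and fuzzier matching to cope with large, messy data.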

The technology grew out of Sokolovsky’s experience as a top information technology executive at Wall Street investment and health care firms. “I spent 20 years dealing with large enterprises and relied a lot on data discovery,” she said. “Much of it was on spreadsheets, which resulted in inaccuracies.”

She founded Rokitt Inc. in 2014 to sell Rokitt Astra, a tool for finding hidden relationships within relational databases. Rokitt was acquired by Centrica last year and renamed Io-Tahoe. Rokitt Astra was primarily used by technical organizations for tasks like migrating between relational databases or inferring structure from messy data lakes.

With the addition of the data catalog, Io-Tahoe is now targeting nontechnical business users. “Data catalogs allow us to work with the data owners, who can create business rules, search for rules that exist and ultimately enhance the description of data elements so others can get the benefits of those rules,” she said. The technology currently works only on structured and semistructured data, but support for unstructured data is in the works.

The market for data discovery and cataloging tools has been hot of late, in part due to the impending imposition of the General Data Protection Regulation in Europe. Research firm MarketsandMarkets Research Private Ltd. estimates that the data discovery market will grow from $4.33 billion in 2016 to $10.66 billion in 2021. Waterline Data Inc. recently introduced a data discovery platform targeted specifically at GDPR compliance. One month earlier, Podium Data Inc. migrated its data catalog to the cloud.

Pricing wasn’t disclosed.

Image: Io-Tahoe
