Fitting the tools to the job

You don't necessarily need exotic technologies to crunch sizeable data sets.

By Lance Harris, freelancer
Johannesburg, 05 Aug 2013

Amid the hype about big data appliances, in-memory computing and NoSQL databases, many data-intensive South African businesses are still doing just fine using industry-standard databases and analytics tools that have been around for years.

One example is the specialist credit bureau, Inoxico, which has built a real-time 'data beneficiation' business that draws on millions of records from multiple sources using a standard Microsoft SQL Server database and commodity hardware. "We haven't encountered any limitations with this technology as yet," says André Stürmer, CEO of Inoxico. "I don't foresee us having to move to something more exotic for a year or two."

Inoxico - which provides risk assessment and credit data about organisations to its commercial clients - runs a database with information on ten million directors and companies. This data is enriched by data from other public and private sources, such as the deeds office, address databases, bank codes, credit judgments and statutory information.

The data from all these sources can be aggregated, sliced, diced, correlated and manipulated, and viewed in visual representations to provide companies with a more recent and complete picture of the fraud and credit risks attached to organisations with which they do business.

Niche applications

What really sets the solution apart are the proprietary algorithms Inoxico uses to mine the data, rather than the underlying infrastructure, Stürmer says. But, he says, the company may need more specialised solutions as it expands into the rest of Africa, where similar commercial information isn't as readily available in structured databases. "Those applications are pretty niche, but we're looking at tools that make it easier to mine unstructured data sets," adds Stürmer.

"We're seeing a trend among larger retailers, banks, utilities and telcos towards buying appliance hardware to manage big data," says Craig Stephens, principal solution manager for Information Management at SAS. "But it can be expensive to deal with big data if you go down that route." Commodity hardware and open source solutions such as Hadoop do the job for most applications, he adds.

One reason many organisations go down this road is that their big data projects are driven by the CTO rather than by business users, he says. The key to success is to start out with a clear idea of the business value the organisation hopes to get from big data analytics, rather than with the technologies it will use.
