There is a mythic trope that whales have a specific death ground, a graveyard to which the old of the pod quietly drift off when the time comes. There, the bones of thousands upon thousands of wise old whales can be found, their remains imbued with the collective story of whale-kind through all time.
That's silly, perhaps, just a story told to children before they drift off to gentle slumber. But the truth is that most businesses very likely have a whale carcass or two quietly beached in their databases.
There's no real problem with having a gentle giant of the deep - albeit a dead one - hanging about the office. The pong probably won't hit unless the company is about to kick off a project that involves some type of enterprise data. And that's when the fleshy flotsam will permeate the project and stink up timelines and budgets.
Data rot
Mervyn Mooi is director of Knowledge Integration Dynamics.
It's a very unflattering way to talk about whales, particularly since many are endangered species, which data in the modern enterprise clearly is not, by any stretch of the imagination. But it does quite potently highlight the possible extent of the problem when rotten data mucks up a project.
The whale corpses are data such as invoices, customer details, products, and other corporate 'paperwork'. Systems are so highly automated these days that this stuff gets collected without anyone really being aware of it. It comes in, goes through the process, gets all the stamps, ticks, and nods then into the big black data bin it goes, never to be seen or heard from again. Until someone realises profits are down and begins to wonder why.
That's when the dredgers begin to run across the dead whales. Out comes the bad data, the missing fields exposed, empty entries made conspicuous by their abundance, and the business intelligence dashboards and other systems deliver a pile of goop that nobody can make sense of, let alone base a decision on.
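One way to go looking for the carcasses before the dredgers do is to count how often each field is missing or empty. The following is only a minimal sketch in Python: the invoice records, the field names and the `profile_missing` helper are all invented for illustration, and a real profiling exercise would run against the company's own tables and its own definition of "empty".

```python
from collections import Counter

# Hypothetical invoice records; field names and values are invented for illustration.
records = [
    {"invoice_id": "INV-001", "customer": "Acme", "amount": "120.50"},
    {"invoice_id": "INV-002", "customer": "", "amount": None},
    {"invoice_id": "INV-003", "customer": "   ", "amount": "80.00"},
    {"invoice_id": "INV-004", "amount": "15.00"},  # customer field absent entirely
]


def is_empty(value):
    """Treat None and blank or whitespace-only strings as empty."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def profile_missing(rows, fields):
    """Return, per field, the fraction of rows where it is absent or empty."""
    if not rows:
        return {field: 0.0 for field in fields}
    missing = Counter()
    for row in rows:
        for field in fields:
            # row.get() returns None for an absent key, which counts as empty.
            if is_empty(row.get(field)):
                missing[field] += 1
    return {field: missing[field] / len(rows) for field in fields}


report = profile_missing(records, ["invoice_id", "customer", "amount"])
for field, share in report.items():
    print(f"{field}: {share:.0%} missing")
```

On this made-up sample the customer field is empty or absent in three of the four records, which is exactly the sort of finding that is better made before a project's timeline and budget are signed off than after.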
High maintenance
Data issues often arise only after the project timelines and budgets have already been agreed on. While the immediate problem is obvious - timelines must change and costs escalate, sometimes dramatically - data issues are also typically ongoing and cannot be fixed once and then left alone.
Data quality must be a sustained process, so the cost must be absorbed by the standard operational or the change management budget before it is charged back to the data custodians or creators, which are typically a specific business division, unit or company.
The question also has to be asked: why was the company running that project in the first place? Is there some business imperative that must be met within a specific timeframe? Is it a new product coming to market hopefully before the competition launches theirs? If so, then halting the entire process to fix up the data, and clean out the whale skeletons from the data closet, can be more than just a heavy expense. It comes at a time when the company can least afford it. It costs the business right now, but then it keeps on costing it every week and month and year thereafter.
Dead whales are no joke. In November 1970, a dead giant sperm whale was found on a beach in Oregon, in the US. The Highway Division figured that a giant dead whale needed a giant amount of explosives and proceeded to stuff the deceased cetacean chock-full of dynamite. The plunger was depressed and big chunks of blubber rained down on onlookers and vehicles - the majority of the carcass, however, remained.
Thirty-nine years later a second, smaller whale washed up not far from that very site. The parks board officials responsible for its removal ruled out dynamite, and had to choose between towing it offshore for a burial at sea, or digging a hole in the sand to bury it. They chose the latter.
Companies won't be able to bury their whale of a data problem, though, if they're trying to use the data for some business project or other. It will have to be fixed, made serviceable and set running again. The best time to do that is before the business comes to rely on the data. Failing that, as with most companies that don't fix it in advance, the cost should at least be built into projects and operations.
Data cleansing must be firmly entrenched in the organisational processes as a formal, ongoing operation, whether before or during projects. It is an inevitable task, and the best time to start is now.