Just as your laptop over time becomes clogged up with old files, downloads, plug-ins, unread emails and other stuff you will likely never need or look at again, enterprises today bob along uncertainly on a sea of data detritus.
And, just as with the PC, the situation leads to issues affecting performance, data governance and even productivity because data can't easily be found or recovered.
In the case of the enterprise, what are the sources of this unhappy situation? We could start by thinking of backups, archives, file shares, content management systems, applications no longer in use, object stores, data volumes that have been kept on a just-in-case basis, cloud services old and new, personal content encouraged by ‘bring your own device’ schemes, systems management software and… well, so many more.
The scale of the problem is larger than you think
The challenge is ubiquitous: companies in full control of their data are as common as sightings of Tasmanian tigers. Even well-run data centers can still be repositories of lots of junk. Why? Because IT always has enough on its plate and rarely has time to do a spring clean.
It's also because CIOs are ‘too scared to scrub’. IT leaders often fear that deleting data might come back to bite them. They worry that a file might be needed to answer a regulatory probe, might hold the key to something business-critical, or might prove valuable to a discovery tool that inspects log analytics and other digital exhaust fumes.
CIOs hoped that the cloud would relieve them of this mess. But instead, they now have multiple clouds, exacerbating the issue. So, here we are today, running expensive storage subsystems, incurring regulatory and data security risks, seeing performance lag, lacking integration with other systems, overseeing dispirited IT teams that have to chase their admin tail to get things done, and living with flagging service levels that disappoint the business.
Finding a defragmentation solution
Shine a light on this ‘dark data’, however, and the picture begins to look a lot brighter. Benefits include better insights, lower costs, less time squandered in ‘keeping the lights on’, a brisker customer and employee experience, increased brand trust, and the ability to move faster and trust the cloud.
These are all possible, and this is the aim of mass data defragmentation: to unravel the spaghetti strands that bind the data center and allow for far greater consolidation, visibility and accountability.
Reaching this state is not easy, of course. It requires tools to converge platforms, a file system that supports deduplication, indexing and search, very high levels of scalability, and intuitive reporting and visualisation.
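To make the deduplication requirement concrete, here is a minimal sketch of one common approach: grouping files by a hash of their content so that identical copies scattered across backups, shares and archives can be spotted. The file names and contents are invented for illustration, and whole-file hashing is an assumption made for simplicity; real platforms may deduplicate at block level and work very differently.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files: dict[str, bytes]) -> dict[str, list[str]]:
    """Group file names by the SHA-256 digest of their content.

    Only groups holding more than one file (true duplicates) are returned.
    """
    by_digest: dict[str, list[str]] = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    return {d: sorted(names) for d, names in by_digest.items() if len(names) > 1}


def reclaimable_bytes(files: dict[str, bytes], duplicates: dict[str, list[str]]) -> int:
    """Bytes freed if one copy of each duplicate group is kept."""
    return sum(len(files[names[0]]) * (len(names) - 1) for names in duplicates.values())


# Hypothetical sample data: three copies of one report plus one unique file.
sample_files = {
    "backup/report.docx": b"quarterly report",
    "share/report_copy.docx": b"quarterly report",
    "archive/old_report.docx": b"quarterly report",
    "share/notes.txt": b"meeting notes",
}

duplicates = find_duplicates(sample_files)
wasted = reclaimable_bytes(sample_files, duplicates)
```

In this toy data set the three report copies collapse into one group, and keeping a single copy frees the space taken by the other two.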
Manage that, though, and there are two key advantages. First, we save on data center storage costs, since dark data can represent anywhere from 50 to 90% of all the data held. Second, the remaining data becomes much easier to wrangle, interrogate and extract value from.
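As a rough illustration of the first advantage, the sketch below works through the arithmetic for the 50 to 90% range. The estate size and the cost per terabyte are purely hypothetical figures, and the calculation assumes every byte of dark data could actually be removed, so it should be read as an upper bound rather than a forecast.

```python
def storage_savings(total_tb: float, cost_per_tb: float, dark_fraction: float) -> float:
    """Storage cost avoided if the dark share of the estate were removed."""
    if not 0.0 <= dark_fraction <= 1.0:
        raise ValueError("dark_fraction must be between 0 and 1")
    return total_tb * dark_fraction * cost_per_tb


# Hypothetical estate: 1,000 TB at a notional 20 currency units per TB per month.
TOTAL_TB = 1000
COST_PER_TB = 20.0

low_estimate = storage_savings(TOTAL_TB, COST_PER_TB, 0.5)   # 50% dark data
high_estimate = storage_savings(TOTAL_TB, COST_PER_TB, 0.9)  # 90% dark data

print(f"Potential monthly saving: {low_estimate:,.0f} to {high_estimate:,.0f}")
```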
All too often, companies wait for a crisis to strike, such as a regulatory query or a legal change like Australia's data breach reporting requirements, before acting to rationalise data. While this attitude can be understood to some degree, given the amount of disruption the past decade saw from new market entrants with a ‘data-driven’ approach to IT and operations, you've got to think it's actually crazy.
There is trouble ahead, though, with regulations governing data privacy being strengthened around the globe. It's far better to bake smarter data management into everyday operations now and start the process of translating your data detritus into data riches, rather than fall foul of those rules and spend the next decade playing catch-up.