The trouble with data hoarding — and how to fix it
Article written by StorageCraft APAC head of sales Marina Brook
There’s a US reality TV show called Hoarders that features people who compulsively acquire stuff and are unwilling or unable to discard it. Eventually, they get buried.
A lot of companies are behaving the same way. They’ve become data hoarders. And, like those unfortunate folks on the TV show, they’ll soon find themselves flooded with data and struggling with the cost of managing it—if they aren’t already.
Companies are brought to this point by a belief, now widespread in business, that the worst thing they can do with a piece of data is throw it away. It’s an understandable impulse: they want to keep everything because they never know which innocent piece of data might suddenly become hugely important, today or tomorrow.
Data accumulation has become a problem for companies large and small. Not long ago, 500 terabytes of data was a concern only for a Fortune 500 company. Today, it is a problem for plenty of small and midsize companies as well.
For example, one healthcare customer stores medical images, such as CT scans and mammograms, for multiple radiology offices. When the practice recently upgraded to high-resolution 3D mammography, the storage requirement for each image jumped from a few gigabytes to tens of gigabytes. In a short space of time, the organisation went from managing a few hundred terabytes of data to managing storage at petabyte scale.
Solutions that worked perfectly well just five years ago no longer work. In those days, companies would simply drop data into their existing storage infrastructure. But when data is growing at 120% per year, that’s not an option: they end up either swapping out their entire storage infrastructure every year or two, or bolting on disparate storage silos to absorb the growth. Either option is bad for business in terms of cost, complexity, cohesion and continuity.
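The arithmetic behind that claim is easy to check. The short Python sketch below is a toy calculation only: the 500 TB starting point is borrowed from the earlier example, and a flat 120% annual growth rate is an assumption for illustration.

```python
# Toy projection of compound data growth at 120% per year.
# Starting size (500 TB) and the flat growth rate are assumptions
# for illustration, not measurements.
start_tb = 500
annual_growth = 1.20  # each year adds 120% of the current footprint

size_tb = start_tb
for year in range(1, 6):
    size_tb *= (1 + annual_growth)  # previous size plus 120% more
    print(f"Year {year}: {size_tb:,.0f} TB")
```

At that rate the footprint more than doubles every year, reaching roughly 25 PB — about 50 times the starting size — within five years, which is why a storage system sized for today’s data cannot simply be bought once and forgotten.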
Today, companies need to figure out quickly which information is critical to the business and which can safely be deprioritised. They need to understand which data should be pushed to the cloud, so it is always available, and which data can be stored locally.
The answer to this challenge lies in self-organising storage that applies intelligence, and specifically machine learning, to the management of information. In this scenario, real-time analytics performed by the storage system itself decide the optimal placement for data and the optimal protection for each element of a dataset. This is the only way to keep up with explosive data growth that has reached a scale humans can no longer manage effectively.
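As a rough illustration of the kind of decision such a system automates, here is a toy placement rule in Python. It uses a hand-written recency and frequency heuristic rather than real machine learning, and the tier names, thresholds and file names are invented for the example:

```python
from dataclasses import dataclass

@dataclass
class FileStats:
    name: str
    days_since_access: int     # days since the file was last read
    accesses_last_30d: int     # read count over the past month

def place(f: FileStats) -> str:
    """Toy tiering rule: hot data stays on fast local storage, warm
    data moves to cloud object storage, cold data to cloud archive.
    All thresholds are arbitrary assumptions for illustration."""
    if f.days_since_access <= 7 or f.accesses_last_30d >= 10:
        return "local-ssd"
    if f.days_since_access <= 90:
        return "cloud-standard"
    return "cloud-archive"

for f in (FileStats("scan_001.dcm", 2, 15),
          FileStats("scan_2014.dcm", 400, 0)):
    print(f.name, "->", place(f))
```

A genuinely self-organising system would learn thresholds like these from observed access patterns rather than having them hard-coded, and would revisit its decisions as those patterns change.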
A future in which machine-learning algorithms examine the content of data and establish relationships between datasets has already begun. As a result, organisations can organise data in its proper context and group datasets with similar context together, making it easier to manage, and to make sense of, mountains of information.
To tackle an organisation’s growing data problem, my advice is to start with digestible chunks. Don’t pretend you know what the world of technology and data storage will bring next year, and don’t go out and buy a storage system that you hope will still be sufficient five years from now. The digital world moves far too fast, and is far too fluid, for that kind of guesswork.
Instead, start small with a strategy that scales over time. Take sensible steps that you can test and prove as you go. You’ll be light on your feet and ready to respond as business changes, as the type of data being stored changes, as the technologies in the market change, as regulations change, as your company changes.
So, go ahead and hoard that data. But be smart about it. Be nimble and flexible. Go with a system that’s scalable and requires a relatively small investment now, so you can develop with agility and accuracy in the years ahead.