In recent years, the data storage part of the IT energy efficiency equation has been getting a ton of attention. From deduplication drama to a white-hot flash startup scene, where our data’s stored and how it’s managed have become big priorities.
The good news is that when it comes to architecting an efficient storage infrastructure and management platform, the tech’s all there. But you can’t achieve it without some guidance on dealing with digital waste.
Luckily, Johns Hopkins University computer scientists Ragib Hasan and Randal Burns have put together a paper, The Life and Death of Unwanted Bits: Toward Proactive Waste Data Management in Digital Ecosystems (PDF), with a plan that echoes the tricks we use to deal with waste in meatspace: reduce, reuse, recycle, recover and dispose.
Their strategy for dealing with resource-hogging digital waste, which they describe as data that ranges from aged, barely-used files to completely useless data, is concisely summarized in the pyramid below:
Here are some suggestions from Hasan and Burns:
Reduce: At the top of the pyramid, the most preferred option is to cut back on the amount of waste data that flows into a computer to begin with. This can be done, the Johns Hopkins researchers say, by encouraging software makers to design their programs to leave fewer unneeded files behind after a program is installed. To coax the software makers to comply, computers could be set up to “punish” programs that do excessive data dumping; such programs would be forced to run more slowly.
Reuse: Software makers also could break their complex strings of code into smaller modules that could serve double-duty. If two programs are found to utilize identical modules, one might be eliminated in a process called “data deduplication.” This is the second-best option in the waste-management pyramid, the researchers said.
Recycle: Just as discarded plastic can be refashioned into new soda bottles, some files could be repurposed. For example, when old software is about to be removed, the computer could look for useful pieces of the program that could be put to work in other applications.
Recover: Even when waste data can’t be reused or recycled, these digital leftovers might yield information worth studying after private identification details are removed. In their paper, the researchers suggest that “obsolete data can also be mined to gather patterns about historical trends.”
Dispose: Sitting at the bottom of the pyramid, this is the least desirable option, the researchers say, and the messiest, when you consider the energy used to completely eliminate old files or the real-world pollution created when one destroys an old hard drive or other form of storage media. However, the scientists say, one solution could be a “digital landfill.” This could be accomplished with a “semi-volatile storage device” that would provide a temporary home to data that is designed to automatically fade away over time, freeing up space for the next tenants.
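To make the "digital landfill" idea a little more concrete, here's a rough Python sketch of data that fades away on its own. It's only a toy, not the researchers' design: the paper talks about a semi-volatile storage device, while this is an in-memory stand-in with a fixed time-to-live. The class name DigitalLandfill, its methods and the ttl_seconds parameter are all made up for illustration.

```python
import time


class DigitalLandfill:
    """Toy in-memory stand-in for a semi-volatile store (hypothetical API).

    Anything dumped here disappears once it is older than ttl_seconds.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock   # injectable so the decay can be tested without waiting
        self._items = {}      # name -> (data, time it was dumped)

    def dump(self, name, data):
        """Park waste data in the landfill and start its decay timer."""
        self._items[name] = (data, self._clock())

    def sweep(self):
        """Drop everything past its time-to-live; return the names that faded."""
        now = self._clock()
        expired = [name for name, (_, dumped_at) in self._items.items()
                   if now - dumped_at >= self.ttl_seconds]
        for name in expired:
            del self._items[name]
        return expired

    def retrieve(self, name):
        """Return the data if it has not faded yet, otherwise None."""
        self.sweep()
        entry = self._items.get(name)
        return entry[0] if entry is not None else None
```

You'd dump a stale file with landfill.dump("old.log", data) and, for as long as the timer hasn't run out, still be able to pull it back with landfill.retrieve("old.log"). After that, the space is simply free again and nobody had to run an explicit delete.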
The idea of penalizing software that heaps crappy data onto your system is an interesting one, but good luck getting developers on board. You’ll also see that data deduplication factors into the plan as do shades of business intelligence (to glean useful data from old stores) and tiered storage systems.
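If you've never seen deduplication up close, the core trick is small enough to sketch in a few lines of Python. This is a simplified illustration and not anything taken from the paper or a particular product: it fingerprints whole files with SHA-256 and keeps a single copy per fingerprint, whereas real systems typically work on blocks or chunks. The function name deduplicate and the sample file names are hypothetical.

```python
import hashlib


def deduplicate(files):
    """Store each distinct piece of content once (illustrative sketch).

    files maps a name to its bytes. Returns (store, index), where store
    maps a SHA-256 digest to the content and index maps each name to
    the digest of its content.
    """
    store = {}   # digest -> content, kept only once
    index = {}   # name -> digest, so every file can still be rebuilt
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)   # an identical copy adds nothing new
        index[name] = digest
    return store, index


# Hypothetical example: two apps ship the same module.
files = {
    "app_a/util.lib": b"shared module bytes",
    "app_b/util.lib": b"shared module bytes",
    "app_b/main.bin": b"something unique",
}
store, index = deduplicate(files)
```

Three files go in, but only two pieces of content are stored, and both copies of util.lib point at the same digest.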
Like I said, the tech is there. For efficient storage, it’s just a matter of putting it to work. The “think green” strategy outlined by Ragib Hasan and Randal Burns should help not only in prioritizing your IT storage budget, but also in driving storage energy efficiency.