
Data Deduplication: Two Methods

January 21, 2010

Data deduplication sounds like an excellent candidate for an HGTV reality show: companies drowning in a sea of redundant data receive a visit from two perky IT people who descend on their files like vultures on roadkill, promising to cure the company’s duplicate-data ills with a weekend and a few thousand dollars’ worth of software.

I agree that it doesn’t sound like the best premise for a new show, but if you’re looking for a new money-saving topic to discuss at the conference table next week, toss out the concept of data deduplication. Yes, it’s a mouthful to say but that mouthful might save you a handful — a handful of dollars, that is.

The data deduplication process involves removing copies of files and replacing those duplicates with pointers back to the original copy. Removing multiple copies frees up valuable storage space, makes backups smaller and faster, and reduces network traffic for over-the-network backups. Add the three together and you have significant money savings.
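As a rough illustration of the idea, the sketch below finds files with identical content by hashing them and records a pointer from each duplicate back to the first copy seen. It is a minimal file-level example, not how any particular product works; the function names `file_digest` and `find_duplicates` are hypothetical, and the use of SHA-256 as the content fingerprint is an assumption.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[Path, Path]:
    """Map each duplicate file under root to the first file seen with the same content."""
    originals: dict[str, Path] = {}   # digest -> first file with that content
    pointers: dict[Path, Path] = {}   # duplicate file -> original file
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = file_digest(path)
        if digest in originals:
            # Same content already seen: keep only a pointer to the original.
            pointers[path] = originals[digest]
        else:
            originals[digest] = path
    return pointers
```

A real system would then replace each duplicate with its pointer (a catalog entry or a link, for example) so that only one copy occupies storage. This sketch only reports the mapping and leaves the files untouched.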

Usually, the term “deduplication” refers to enterprise storage systems that house huge amounts of data, harboring perhaps tens of thousands of duplicated files. The sheer number of files and possible copies of those files makes the task seem overwhelming, but fortunately, there is hope in the form of sophisticated software designed for this purpose.

A Tale of Two Methods

There are two types of data deduplication: source and target. Source-based deduplication takes place as the backup software processes the files, before they are transferred to media. This means the deduplication software replaces your current backup software and strategy with one that examines file contents on the fly. As you might expect, source deduplication speeds aren’t stellar (though still better than tape), but the savings come from lower network bandwidth consumption, because fewer files are transferred, and from reduced space on backup media.
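To make the source-side flow concrete, here is a minimal sketch in which the client fingerprints each file before transfer and sends the content only when the backup target has not seen it. The `BackupTarget` class and `source_side_backup` function are hypothetical stand-ins invented for illustration, and the in-memory dictionaries merely simulate backup media and a network link.

```python
import hashlib


class BackupTarget:
    """Hypothetical stand-in for backup media: stores each unique content once."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}    # digest -> stored content
        self.catalog: dict[str, str] = {}    # file name -> digest (the pointer)

    def has(self, digest: str) -> bool:
        return digest in self.blobs

    def store(self, digest: str, data: bytes) -> None:
        self.blobs[digest] = data

    def record(self, name: str, digest: str) -> None:
        self.catalog[name] = digest


def source_side_backup(files: dict[str, bytes], target: BackupTarget) -> int:
    """Back up files, transferring content only if the target lacks it.

    Returns the number of content bytes actually sent.
    """
    sent = 0
    for name, data in files.items():
        # Hashing happens at the source, before anything crosses the network.
        digest = hashlib.sha256(data).hexdigest()
        if not target.has(digest):
            target.store(digest, data)
            sent += len(data)
        # Every file gets a catalog entry, whether or not its content was sent.
        target.record(name, digest)
    return sent


if __name__ == "__main__":
    files = {
        "report.doc": b"x" * 1000,
        "report-copy.doc": b"x" * 1000,   # duplicate content
        "notes.txt": b"y" * 500,
    }
    target = BackupTarget()
    print(source_side_backup(files, target), "bytes sent for", len(files), "files")
```

The hashing work on the client is the cost the article alludes to when it says source deduplication speeds are not stellar. The payoff is that duplicate content never crosses the network.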

Read the rest at ServerWatch.
