As organizations and their data increasingly move to the cloud and to large-scale distributed databases, one challenge that has emerged is how to actually back up and recover all that data. With a distributed cloud-scale database, such as MongoDB or Cassandra, data is made highly available across multiple nodes, which is great for data resiliency in the cloud but makes backup somewhat challenging.
Tarun Thakur, co-founder and CEO of startup Datos IO, is aiming to solve the cloud distributed database challenge with his company’s RecoverX platform. The primary innovation of RecoverX is what Thakur referred to as Consistent Orchestrated Distributed Recovery (CODR), which can back up an organization’s data from a distributed cloud database. The initial generally available iteration of RecoverX supports Apache Cassandra (v2.0, v2.1), DataStax DSE (v4.5, v4.6, v4.7, v4.8) and MongoDB (v3.0, v3.2).
Datos IO first exited stealth mode in September 2015, announcing that it had raised $12.5 million in a Series A round of funding that included the participation of Lightspeed Venture Partners and True Ventures.
Both MongoDB and Cassandra include elements that enable high availability and even data redundancy, though Thakur noted that neither by default has features for true point-in-time backup.
“With RecoverX we enable a true point-in-time backup that is cluster consistent,” Thakur said.
With a multi-node database, where data is distributed widely, getting a consistent point-in-time image of the database requires a technique known in the industry as distributed consensus. Specifically, Datos IO is using the Raft distributed consensus model, which was originally developed by Salesforce lead software engineer Diego Ongaro and is widely used in cloud systems today, including Google Kubernetes.
“MongoDB and Cassandra enables a masterless distributed data architecture,” Thakur explained. “As such, the data versioning has to be cluster level.”
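To make the idea of cluster-level agreement a little more concrete, here is a minimal Python sketch of one rule at the heart of Raft: a log entry counts as committed once it is stored on a majority of nodes. This is an illustration only, not a description of how RecoverX is built, which the article does not detail. The function name `majority_index` and the sample numbers are hypothetical, and real Raft adds leader election and a term check that this sketch leaves out.

```python
def majority_index(match_index):
    """Return the highest log index that a majority of nodes have stored.

    match_index holds, for each node, the last log index known to be
    replicated on that node (hypothetical sample data below).
    """
    ordered = sorted(match_index, reverse=True)
    majority = len(ordered) // 2 + 1
    # The majority-th largest value is present on at least `majority` nodes.
    return ordered[majority - 1]


# Five nodes that have replicated the log up to different positions.
cluster = [7, 5, 5, 3, 2]
print(majority_index(cluster))
```

In this example three of the five nodes hold every entry up to index 5, so that is the newest position the cluster as a whole can agree on.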
That cluster level of data versioning for backup also applies to de-duplication of data, so as not to have multiple versions of the same data in a backup. To that end, RecoverX enables semantic de-duplication, which takes the data from multiple nodes of a distributed cloud-scale database and creates a single golden backup. Thakur said that the RecoverX semantic de-duplication is able to reduce backup storage space requirements by approximately 70 percent. He added that once an initial backup is done, RecoverX continues to enable storage savings by only doing incremental backups of data that has changed in a cluster.
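The two ideas in that description, collapsing replicas into one copy and then backing up only what changed, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the RecoverX implementation: each node is modeled as a dictionary mapping a row key to a `(timestamp, value)` pair, the newest timestamp wins, and the names `golden_copy` and `incremental` are hypothetical.

```python
def golden_copy(replicas):
    """Merge per-node row dumps into one copy, keeping the newest version of each key."""
    newest = {}
    for rows in replicas:
        for key, (timestamp, value) in rows.items():
            if key not in newest or timestamp > newest[key][0]:
                newest[key] = (timestamp, value)
    return newest


def incremental(previous, current):
    """Return only the rows that are new or changed since the previous backup."""
    return {k: v for k, v in current.items() if previous.get(k) != v}


# Hypothetical data: three nodes holding overlapping replicas of three rows.
node_a = {"user:1": (10, "alice"), "user:2": (12, "bob")}
node_b = {"user:1": (10, "alice"), "user:2": (15, "bobby")}
node_c = {"user:2": (15, "bobby"), "user:3": (11, "carol")}

full = golden_copy([node_a, node_b, node_c])

# Later, one row changes on the cluster; only that row needs backing up.
later = dict(full)
later["user:3"] = (20, "caroline")
delta = incremental(full, later)
```

Here six replica rows across three nodes collapse into three rows in `full`, and `delta` contains only the single row that changed after the first backup.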
From a storage technology perspective, Thakur explained that RecoverX enables software-defined storage, whereby the organization chooses where it wants the data to be stored. Options include cloud storage platforms such as Amazon S3 and Google Cloud, as well as traditional storage systems based on NFS (Network File System).
Sean Michael Kerner is a senior editor at Datamation and InternetNews.com. Follow him on Twitter @TechJournalist