On the surface, Big Data as a Service seems like a natural evolution. Yet big questions surround its nascent growth.
On-demand services are one of the big technological revolutions of the 21st century, attributable to the Internet revolution that made remote servers as close as the ones in your data center. It started with software as a service, followed by platform and infrastructure as a service, with a few stray ideas like storage as a service. The next big on-demand service may very well be Big Data as a Service.
Big Data as a Service, as defined in a lengthy research piece in Service Technology Magazine by Varun Sharma, an enterprise solutions architect, is a recommended framework “to enable information availability to consumers via reports, discovery services, etc., through the reuse of data services, promoting best practices in data management using data modeling and metadata services.”
Sharma traces the migration of data stores from mainframes to the present day, where the underlying platform for the data is no longer relevant and enterprises are moving away from application-based models toward data-driven ones. Big Data draws on sources ranging from internal metrics and sales figures to Twitter feedback, making it both internally and externally generated.
Big Data means lots of data to process, and when it gets into the petabytes, it doesn’t make sense to move it around for processing. Should you really pull several petabytes of data into your organization to process it on a Hadoop cluster? Or take all your internal data and send it up to the cloud?
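The scale problem is easy to quantify with a back-of-the-envelope calculation. The link speed and assumption of full utilization below are illustrative, not drawn from any vendor's figures:

```python
# Back-of-the-envelope: how long would it take to move one petabyte
# over a dedicated 1 Gbps link? (Illustrative assumptions: decimal
# units, full link utilization, no protocol overhead.)
petabyte_bits = 1e15 * 8          # 1 PB expressed in bits
link_bps = 1e9                    # 1 Gbps link
seconds = petabyte_bits / link_bps
days = seconds / 86400
print(f"{days:.0f} days per petabyte")  # roughly 93 days
```

At three months per petabyte, the math itself argues for processing data where it already lives.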
No, said Nick Heudecker, research director for information management at Gartner. “A hybrid deployment makes sense,” he said. “Doing some processing in the cloud makes sense, and doing some on premises makes sense. If you have data coming in from cloud services, you can deploy a collection management infrastructure in the cloud, do analytics on it and move it through on premises services. You don’t want to move everything to the cloud.”
He’s not sold on the concept of BDaaS being a pure cloud play the way SalesForce is a pure cloud play or Softlayer is a pure on-demand platform provider. “Big Data as a Service is nonsense from start to finish. There are too many things to do and integrate in a pure cloud play. You’re talking enterprise data warehousing, Hadoop, RDBMS, event processing, NoSQL, in-memory databases and a variety of other things. If all of that is encompassed in Big Data, how can you realistically get that as a service?” he said.
John Myers, research director for business intelligence at Enterprise Management Associates, said the definition of Big Data is evolving, as is the definition of BDaaS. But he adds that it is built on Platform as a Service. “What we’re seeing is people want to move faster. They want to be more nimble and the PaaS argument makes a lot of sense for them,” he said.
“We’ve done research over the last few years leading with the question ‘How big is your Hadoop?’ Now we ask how they handle their existing data structure. We found people using a wide range of technologies. No one has a platform with all your data online like Salesforce, because it’s almost as difficult to manage Hadoop in the cloud as it is to install it in your own office,” he said.
BDaaS is ideal for faster deployment because people can provision the resources they need, then deprovision them after the work is done, and not be saddled with a lot of hardware, Myers notes. “You can go to [Microsoft] Azure and set up a platform based on Hortonworks and literally say give me a 100 node cluster and they build it. Now you have this platform as a service available to you,” he said.
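The provision-then-deprovision pattern Myers describes can be sketched as a context manager. `ClusterClient` here is a hypothetical stand-in for a real cloud SDK, not an actual API:

```python
# Sketch of the ephemeral-cluster pattern: provision for the job,
# then tear down so you are not saddled with idle hardware.
# ClusterClient is hypothetical, standing in for a real cloud SDK.
from contextlib import contextmanager

class ClusterClient:
    def __init__(self):
        self.active = []                 # clusters currently billed

    def create(self, name, nodes):
        self.active.append(name)         # e.g. "give me a 100 node cluster"
        return name

    def delete(self, name):
        self.active.remove(name)

@contextmanager
def ephemeral_cluster(client, name, nodes):
    cluster = client.create(name, nodes)
    try:
        yield cluster
    finally:
        client.delete(cluster)           # deprovision even if the job fails

client = ClusterClient()
with ephemeral_cluster(client, "adhoc-analytics", nodes=100) as c:
    pass                                 # run the batch job here
print(client.active)                     # [] -- nothing left running
```

The `finally` block is the point: the cluster is torn down whether the job succeeds or fails, so nothing keeps billing after the work is done.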
Services provider CSC is a proponent of BDaaS and it also advocates the hybrid model. “It’s a world where the end state is to get as much data as you possibly can,” said Jim Kaskade, vice president and general manager of Big Data and analytics at CSC.
“There’s so much in data to go after and how you store and analyze it. I think people just want to get it all in one place they can control. So they will take external and internal data all in one place where you can get an analytic and query capability quickly. Eventually you will get to a federated model where it doesn’t matter where you store it. That’s the holy grail of the future,” he said.
CSC launched its BDPaaS services at the end of July, using Amazon, CSC Cloud Solutions, RedHat OpenStack and VMware VSphere private clouds to integrate client data centers with major cloud services providers. CSC BDPaaS offers batch analytics, fine-grained and interactive analytics, and real-time streaming analytics. It promises insights from data in less than 30 days, even in the most complex hybrid environments.
HP is offering its own BDaaS, called HAVEn As a Service, a cloud-based way for enterprises to subscribe to several of its Big Data analytics products on an as-needed basis. HAVEn is HP’s brand for its Hadoop, Autonomy, Vertica and other Big Data products for processing and analyzing data.
EMC has also talked up BDaaS in a white paper in which it promotes its own products, such as Greenplum and Pivotal. Its services are built on four platforms: cloud infrastructure, data fabric, data platform as a service and analytics software as a service.
Sharma adds another element to BDaaS: governance. Data in the cloud must be secured. “Data governance is a must-have, and no longer merely a good-to-have,” he wrote. Ignoring data security, data quality and data access can cost an organization dearly in money, efficiency and reputation.
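At its simplest, the data-access part of governance is a policy check before any data is served. The roles and datasets below are illustrative, not any product's model:

```python
# Minimal sketch of a data-access governance check. The policy table,
# roles and dataset names are all illustrative.
POLICY = {
    "sales_figures": {"analyst", "cfo"},
    "twitter_feedback": {"analyst", "marketing"},
}

def read_dataset(user_role, dataset):
    """Serve a dataset only to roles the policy allows."""
    if user_role not in POLICY.get(dataset, set()):
        raise PermissionError(f"{user_role} may not read {dataset}")
    return f"contents of {dataset}"      # stand-in for the real data

print(read_dataset("analyst", "sales_figures"))
```

Real governance layers add auditing, quality checks and lineage on top, but the gatekeeping idea is the same.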
He also advocates breaking the operational tiers for data flow into logical groups to allow agility via loose coupling and abstraction. Finally, he says not to focus solely on the volume, variety and complexity of data. “Consider the whole cycle from the acquisition of data to the extraction of information, and consider the hygiene factors along this path,” he wrote.
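The loose coupling Sharma advocates can be sketched as a pipeline of independent tiers, each knowing only a shared stage interface. The stage names and records here are illustrative:

```python
# Sketch of loosely coupled operational tiers: each stage only sees
# the stage interface, so any tier can be swapped independently.
# Stage names and sample records are illustrative.

def acquire(_):
    """Acquisition tier: pull raw records from internal and external sources."""
    return [{"src": "twitter", "text": " GREAT product "},
            {"src": "sales", "text": "order #42"}]

def cleanse(records):
    """Hygiene tier: normalize data along the path, as Sharma urges."""
    return [{**r, "text": r["text"].strip().lower()} for r in records]

def extract(records):
    """Extraction tier: derive the information consumers actually want."""
    return [r for r in records if r["src"] == "twitter"]

def pipeline(stages):
    data = None
    for stage in stages:
        data = stage(data)               # abstraction: stages compose freely
    return data

result = pipeline([acquire, cleanse, extract])
print(result)  # [{'src': 'twitter', 'text': 'great product'}]
```

Because each tier is just a function over records, replacing the acquisition source or the extraction logic never forces changes in the others.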
At the end of the day, it’s all about storing, analyzing and querying more data from more sources. But it’s about more than the volume of data you store; it’s the speed at which you can acquire that data and act on it. The whole paradigm comes into play: it’s not simply storing a lot of data and analyzing it later.
Kaskade believes the future of BDaaS is something that doesn’t involve a human looking at the data. We already have that in the financial services sector, he notes, where banks will alert a customer if there is a surprisingly large charge. That type of instant action will roll out everywhere. “Now a whole set of industries are applying complex event processing as well. They are acting on millions of streams coming in doing analytics in real time,” he said.
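The banking alert Kaskade cites can be sketched as a tiny stream processor that flags a charge far above a customer's running average. The threshold and structure are illustrative, not any vendor's system:

```python
# Minimal sketch of real-time alerting on a charge stream: flag a
# charge well above the customer's running average. The threshold
# factor and warmup are illustrative choices.
from collections import defaultdict

class ChargeMonitor:
    def __init__(self, factor=5.0, warmup=3):
        self.factor = factor      # alert if charge > factor * average
        self.warmup = warmup      # need a few charges before alerting
        self.history = defaultdict(list)

    def process(self, customer, amount):
        """Return True if this charge is surprisingly large."""
        past = self.history[customer]
        alert = (len(past) >= self.warmup and
                 amount > self.factor * (sum(past) / len(past)))
        past.append(amount)       # no human in the loop: state updates itself
        return alert

mon = ChargeMonitor()
stream = [("alice", 20), ("alice", 25), ("alice", 30), ("alice", 2000)]
alerts = [amt for cust, amt in stream if mon.process(cust, amt)]
print(alerts)  # [2000]
```

Production complex event processing engines do this across millions of concurrent streams, but the per-event logic is this simple: update state, test a condition, act immediately.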
As for what companies should deploy first, he said don’t build a big sandbox of technology, because you might not need it. “What everybody needs to do is use case first and the questions that drive use case. That will dictate what you will buy. You might just need a SaaS app. Don’t feel like you need to make a really big investment. Solve one problem and show your C suite what you can do before going for a bigger bite,” he said.