Big Data has become big business. According to IDC, “Worldwide revenues for big data and business analytics (BDA) will reach $150.8 billion in 2017, an increase of 12.4 percent over 2016.”
And the market for big data solutions is likely to get even bigger. The same IDC report added, “Commercial purchases of BDA-related hardware, software, and services are expected to maintain a compound annual growth rate (CAGR) of 11.9 percent through 2020 when revenues will be more than $210 billion.”
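As a quick sanity check on those figures, compound annual growth can be applied directly to the 2017 number. The sketch below is illustrative only: the function name `project_revenue` is hypothetical, and it assumes the 11.9 percent rate applies evenly to each of the three years from 2017 to 2020.

```python
def project_revenue(base, cagr, years):
    """Project a value forward at a constant compound annual growth rate."""
    return base * (1 + cagr) ** years

# Figures quoted from the IDC report: $150.8 billion in 2017, 11.9 percent CAGR.
revenue_2020 = project_revenue(150.8, 0.119, 3)
print(f"Projected 2020 revenue: ${revenue_2020:.1f} billion")
```

The result lands at roughly $211 billion, which is consistent with the report's "more than $210 billion."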
For enterprises, managing and analyzing big data – and developing a big data strategy – has become a critical part of how business is done. According to the Big Data Executive Survey 2016 from New Vantage Partners, a majority (62.5 percent) of the Fortune 1000 companies surveyed had at least one instance of big data in production, and only 5.4 percent said they had no big data initiatives planned or underway.
Clearly, big data is a key concern for business and IT leaders.
One of the first people to use the term “big data” was John Mashey, who began discussing it in the late 1990s when he worked for SGI.
The authoritative definition of big data came from Doug Laney, who was an analyst at META Group, which has since become part of Gartner. In 2001, Laney published a paper called “3D Data Management: Controlling Data Volume, Velocity, and Variety.” His three Vs — volume, velocity and variety — have since become the industry-standard way to define big data.
Some vendors have attempted to add a fourth V to the original three. They may talk about variability, which refers to data flows that speed up or slow down at different times; veracity, which refers to how trustworthy, accurate and consistent the data is; or value, the amount of money organizations can make from the data. However, none of these fourth Vs have caught on widely, and most people still use the three Vs to describe big data.
Big data experts often note that big data comprises “the three Vs”: volume, velocity and variety.
In order to deal with the massive volume, velocity and variety of their big data, enterprises deploy a wide variety of technologies, including the following:
In order to have enough room to store their big data, enterprises need a lot of physical or cloud-based storage hardware. Often, they choose virtualized storage solutions that offer excellent scalability.
Organizations also need software that can store their big data. They might choose a data warehouse, a data lake, a NoSQL database and/or a distributed storage solution, such as Hadoop.
Vendors also offer a variety of tools that can help organizations move, integrate, clean and otherwise prepare their data for analytics. These tools fit into a variety of categories, including data integration, data virtualization, data preparation, ETL, data quality and data governance. Many companies use Big Data virtualization to help with management.
For most organizations, the goal of big data initiatives is to generate valuable insights that the company can then use to become more efficient, better serve customers or become more competitive. Big data analytics tools include data mining, business intelligence, predictive analytics, machine learning, cognitive computing, artificial intelligence, search and data modeling solutions. A related technology, in-memory data fabric, can speed up big data analytics tasks.
Large repositories of data can be an attractive target for hackers, so enterprises need to make sure that they are appropriately securing their big data. Popular big data security technologies include encryption and access management solutions.
On the flip side, some organizations run big data analytics on their security and log data in order to detect, prevent and mitigate attacks. Software with these capabilities is often referred to as security intelligence or security information and event management (SIEM) solutions.
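To make the idea of analytics on log data concrete, here is a minimal sketch that counts failed logins per source address and flags any source that crosses a threshold. The log format, the `LOGIN_FAILED` event name, the sample addresses and the `flag_suspicious_sources` function are all assumptions made for illustration; real SIEM products work on far richer data and rules.

```python
from collections import Counter

# Hypothetical log lines in the form "<timestamp> <event> <source_ip>".
SAMPLE_LOG = [
    "2020-01-01T10:00:01Z LOGIN_FAILED 203.0.113.7",
    "2020-01-01T10:00:02Z LOGIN_FAILED 203.0.113.7",
    "2020-01-01T10:00:03Z LOGIN_OK 198.51.100.4",
    "2020-01-01T10:00:04Z LOGIN_FAILED 203.0.113.7",
    "2020-01-01T10:00:09Z LOGIN_FAILED 198.51.100.4",
]

def flag_suspicious_sources(lines, threshold=3):
    """Return source IPs with at least `threshold` failed logins."""
    failures = Counter()
    for line in lines:
        parts = line.split()
        # Ignore lines that do not match the assumed three-field format.
        if len(parts) == 3 and parts[1] == "LOGIN_FAILED":
            failures[parts[2]] += 1
    return sorted(ip for ip, count in failures.items() if count >= threshold)

print(flag_suspicious_sources(SAMPLE_LOG))
```

A production system would also consider time windows, user accounts and known-good sources rather than a single global count.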
Many vendors offer cloud-based solutions for storing, managing, analyzing or securing big data. The main advantages of a cloud big data tool are the affordability and easy scalability that the cloud offers. However, some organizations have security or compliance concerns that prevent them from using cloud-based big data tools.
For most organizations, the primary purpose in launching a big data initiative is to analyze that data in order to improve business outcomes. In the New Vantage Partners survey, the number one business driver of big data projects was “greater business insights,” which was selected by 37 percent of respondents.
Organizations generate those insights with analytics software. Vendors use a lot of different terms, such as data mining, business intelligence, cognitive computing, machine learning and predictive analytics, to describe their big data analytics solutions. In general, however, these solutions can be separated into four broad categories:
While big data offers tremendous business opportunities, it also poses some challenges, including the following:
Dealing with data growth. According to the IDC’s Digital Universe report, the amount of digital information stored on the world’s systems is growing by 40 percent each year. For enterprises, simply storing that ever-growing amount of information can be a difficult — and costly — proposition. Analyzing those vast quantities of data poses additional challenges because as data stores grow, analytics processes take longer and require more computing power.
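To see what 40 percent annual growth implies, the growth rate can be turned into a doubling time. The symbols below are introduced only for this illustration: N_0 is the amount of data stored today and t is the number of years elapsed. The calculation also assumes the growth rate stays constant.

```latex
N(t) = N_0 \,(1 + 0.40)^{t}
\qquad\Longrightarrow\qquad
t_{\text{double}} = \frac{\ln 2}{\ln 1.40} \approx 2.06 \text{ years}
```

Under that assumption, stored data doubles about every two years and grows more than fivefold in five years, since 1.40^5 is roughly 5.38.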
Generating insights in a timely manner. Many organizations are looking to analyze and respond to their big data in real time. That requires specialized hardware and software with advanced capabilities. In the past, business analysts may have generated business intelligence (BI) reports on a weekly or monthly basis, but now many organizations are pressuring their analysts to create those same reports — and more — several times per day.
Recruiting and retaining big data talent. Big data experts and data scientists are some of the most highly sought employees in the extremely competitive IT talent market. According to the 2017 Robert Half Technology Salary Guide, the average big data engineer earns between $135,000 and $196,000, while data scientists make $116,000 to $163,500 and business intelligence analysts bring home $118,000 to $138,750. Many organizations find it difficult to hire the big data experts that they need. To fill in the gaps, they often look for self-service big data analytics tools that promise to let business users meet their own needs.
Integrating disparate data sources. Most organizations have data that comes from a wide variety of different enterprise applications and both internal and external sources. Before they can perform analytics on those different data sets, they need a way to bring all that data together. Several vendors offer big data integration tools that can help, but integration remains difficult for many organizations.
Validating data. Big data analytics can only yield valuable insights if it is based on accurate data. Unfortunately, many organizations find that the data they have on their various systems is not consistent. Before they can analyze that data effectively, they need to have a process — and technology — for cleaning and standardizing that data.
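A cleaning and standardizing step of the kind described above might look like the following minimal sketch. The record fields, the `COUNTRY_ALIASES` table and the `standardize` and `clean` functions are hypothetical, and the email check is deliberately crude; the point is only to show normalization followed by validation and de-duplication.

```python
# Hypothetical alias table mapping spelling variants to one canonical code.
COUNTRY_ALIASES = {"USA": "US", "U.S.": "US", "UNITED STATES": "US"}

def standardize(record):
    """Return a copy of the record with normalized name, email and country."""
    country = record["country"].strip().upper()
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "country": COUNTRY_ALIASES.get(country, country),
    }

def clean(records):
    """Standardize records, drop malformed emails and remove duplicates."""
    seen_emails = set()
    cleaned = []
    for record in records:
        standardized = standardize(record)
        email = standardized["email"]
        if "@" not in email or email in seen_emails:
            continue  # skip malformed or duplicate entries
        seen_emails.add(email)
        cleaned.append(standardized)
    return cleaned

raw_records = [
    {"name": "ada  lovelace", "email": " Ada@Example.com ", "country": "usa"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "country": "US"},
    {"name": "Grace Hopper", "email": "grace.example.com", "country": "United States"},
]
print(clean(raw_records))
```

Here the first two records collapse into one after standardization, and the third is rejected because its email address is malformed.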
Securing big data. Big data repositories can be particularly attractive to advanced persistent threats (APTs), the nation-states and competitors with the resources to launch a sophisticated, hard-to-detect cyberattack. Organizations need to make sure that they are protecting their big data stores with appropriate security measures, including encryption and access control.
The task of securing big data is complicated by the volume, velocity and variety of the data. Any big data store is likely to include some sensitive information, such as customer credit card numbers, usernames, passwords, email addresses, etc. Enterprises often address this issue with encryption technology, but standard encryption techniques can slow down data retrieval or make big data analytics difficult or even impossible.
To get around this problem, organizations have a couple of options. First, attribute-based encryption encrypts only the sensitive data. So, for example, it could encrypt credit card numbers and names in a database while leaving the customers’ ages and genders in the clear. This allows business users to conduct analysis on anonymized data while restricting access to personal information.
Another option is fully homomorphic encryption. This technique allows users to conduct analysis on encrypted data. The analysis generates encrypted results, which only the holder of the corresponding decryption key can decrypt. This option secures the data even as it is being analyzed.
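The idea of computing on encrypted data can be illustrated with a much simpler scheme. The sketch below uses the Paillier cryptosystem, which is only additively homomorphic (it supports sums of encrypted values, not arbitrary computation), so it is a stand-in for the concept rather than fully homomorphic encryption. The function names are hypothetical, and the primes are toy-sized and completely insecure. It requires Python 3.8 or later for the modular inverse form of `pow`.

```python
import math
import secrets

def generate_keys(p, q):
    """Build a Paillier key pair from two distinct primes (toy sizes here)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this shortcut relies on g = n + 1
    return n, (lam, mu)   # public modulus, private key

def encrypt(n, message):
    """Encrypt an integer 0 <= message < n using generator g = n + 1."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(n, private_key, ciphertext):
    """Recover the plaintext using the private key (lam, mu)."""
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts yields an encryption of the plaintext sum."""
    return (c1 * c2) % (n * n)

n, private_key = generate_keys(1009, 1013)
encrypted_total = add_encrypted(n, encrypt(n, 1200), encrypt(n, 345))
print(decrypt(n, private_key, encrypted_total))
```

The party that computes `encrypted_total` never sees 1200, 345 or their sum; only the holder of the private key can read the result.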
Organizations should also make sure that any big data solutions they deploy have role-based access control with an audit trail. This protects against insider threats and gives organizations a way to see who may have accessed data. In addition, organizations should use real-time monitoring, and intrusion detection and prevention systems to help thwart attacks against their big data systems.
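Role-based access control with an audit trail can be sketched in a few lines. Everything named here (the roles, the permission strings, the `AccessController` class) is hypothetical and chosen only for illustration; real big data platforms provide their own mechanisms for this.

```python
from datetime import datetime, timezone

# Hypothetical mapping of roles to the permissions they grant.
ROLE_PERMISSIONS = {
    "analyst": {"read:aggregates"},
    "data_engineer": {"read:aggregates", "read:raw", "write:raw"},
}

class AccessController:
    """Checks permissions by role and records every decision."""

    def __init__(self, user_roles):
        self.user_roles = user_roles  # e.g. {"alice": "analyst"}
        self.audit_log = []

    def check(self, user, permission):
        role = self.user_roles.get(user)
        allowed = permission in ROLE_PERMISSIONS.get(role, set())
        # Log denied attempts as well as granted ones, so that reviewers
        # can see who tried to access what and when.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "permission": permission,
            "allowed": allowed,
        })
        return allowed

controller = AccessController({"alice": "analyst", "bob": "data_engineer"})
print(controller.check("alice", "read:raw"))
print(controller.check("bob", "read:raw"))
```

Recording denials alongside grants is the design choice that makes the log useful against insider threats: an unusual pattern of refused requests is often the first visible sign of misuse.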
Another issue that complicates big data security is the prevalence of open source solutions. Many organizations have cobbled together their own big data solutions using freely available software. However, open source software doesn’t always have the same level of built-in security as proprietary solutions. NoSQL databases, in particular, often come under criticism for a lack of protection against attacks. Organizations that rely heavily on open source software will need to be especially vigilant to make sure that they are using appropriate levels of data protection for their big data.
Some of the best and most popular big data solutions are available under an open source license. In fact, the open source Apache Hadoop project dominates the big data space, and analysts at Forrester have gone so far as to call Hadoop a “must-have for large enterprises.”
The best-known open source big data solutions include the following:
The list of big data vendors seems to be endless. Frankly, if you name any technology company, it’s quite likely that it has a product with a “big data” label. Additionally, new big data startups are constantly being created. The vendors below are some of the best known big data companies in each category: