
New Open Source Projects: Big Data, DevOps

September 21, 2015

On a regular basis we take a look at what has been happening in the open source community and make a list of noteworthy new projects that have debuted in the last year or two. As always, the open source community is bursting with new projects. For this list, we’ve focused on two growing categories, Big Data and DevOps.

Analytics and Big Data

1. DataTorrent RTS

DataTorrent has been around for a while, but it first open sourced its Core RTS technology in June of this year. It claims to be “the industry’s only open source enterprise-grade unified stream and batch platform.” It comes in community, standard and enterprise versions. Operating System: Linux

2. Genie

Created by Netflix, Genie allows IT administrators to manage Hadoop jobs running on cloud computing services. Netflix uses it to run many thousands of Hadoop jobs every day. Operating System: Windows, Linux, OS X

3. Pinot

Developed by LinkedIn and open sourced in June 2015, Pinot is a distributed OLAP datastore that enables real-time scalable analytics. Key features include column orientation, pluggable indexing technology, data ingestion from Kafka and Hadoop, and support for a SQL-like language. Operating System: Windows, Linux, OS X

4. Pivotal GemFire, Greenplum Database and HAWQ

Pivotal, a spinoff of EMC and VMware, announced earlier this year that it would be open sourcing key pieces of its big data suite, including GemFire, Greenplum Database and HAWQ. GemFire helps organizations build data-driven applications, Greenplum is an analytics-enabled database, and HAWQ is a SQL analytics engine for Hadoop. Operating System: Windows, Linux, OS X

5. Lipstick

This Netflix project provides an easy-to-understand graphical representation of Hadoop Pig jobs. It updates as the job executes so that administrators and developers no longer need to sift through log data. Operating System: Windows, Linux, OS X

6. Samza

Originally developed by LinkedIn, this Apache top-level project does distributed stream processing. It relies on the Kafka messaging system as well as Hadoop YARN distributed processing technology. Operating System: Windows, Linux, OS X

Machine Learning

7. FeatureFu

LinkedIn first released this project earlier this month. According to the company it is “a new open source toolkit designed to enable creative and agile feature engineering for most machine learning tasks such as statistical modeling (classification, clustering, and regression) and rule-based decision engines.” Operating System: Linux

DevOps Tools

8. DebOps

DebOps describes itself as “your Debian-based data center in a box.” It’s a set of Ansible playbooks designed to make it easier to set up and manage a data center. Operating System: Linux

9. Hygieia

Created by financial services heavyweight Capital One, Hygieia (pronounced “hi-gee-ya”) is a DevOps dashboard that allows users to see the status of their entire delivery pipeline at a glance. The code, screenshots and a video explaining the tool are available on Capital One’s GitHub pages. Operating System: OS Independent

Photo courtesy of Shutterstock.
