While many businesses keep clinging to existing Windows, NetWare, and OS/390 solutions, more and more are turning to Linux clusters for high-performance computing (HPC), high availability (HA), and Web farm applications. To get the most out of Linux clusters, however, you need to know the ins and outs of installation and maintenance, as well as the best software and hardware configurations for specific types of clustering implementations.
So what exactly is a cluster, anyway? “More than one machine, working cooperatively on one or more tasks,” says Sean Dague of the IBM Linux Technology Center.
Linux clusters are “like 1,000,000 ants vs. one elephant,” Dague illustrated, speaking during the Linux Boot Camp at the recent PCExpo/TechNYXpo conference in New York City.
Yet despite advantages ranging from speed to cost effectiveness, Linux clusters can be a tough solution to sell, according to some audience members at the boot camp in Manhattan. One IT consultant attending the show said that, after years of trying, he is just now starting to convince some of his customers of the benefits.
“My customers have been willing to adopt Linux — but not Linux clusters.”
Over the past month, though, one of his customers, a large insurance firm, has decided to replace its previous Novell NetWare implementation with Linux clusters, instead of Microsoft Windows clusters as originally planned. “The firm could have saved $65,000 in consulting costs if they’d followed my first advice,” the consultant said.
Closer than Windows to Unix and NetWare
Linux also bears a much closer resemblance to legacy OSes such as Unix and NetWare than Microsoft .NET does, some observers say.
One of the IT consultant’s insurance firm customers actually tried out Windows clusters for several months before taking the consultant’s advice and moving to Linux instead. The flat file database formerly used with NetWare is working much better under a Linux architecture than on Windows.
“Lawyers at the company like to be able to share files without dealing with Windows’ proprietary things such as UNC,” added the consultant.
Reliability and performance of Linux
Other frequently cited advantages of Linux clusters include reliability, modularity, and fast performance.
The IT consultant told a story about the CEO of another client company that has moved to Linux. When the CEO was about to deliver a presentation at an industry conference, he asked for statistics on the availability of the company’s Linux systems. The company chief was astounded to get the answer: “364 days a year.”
“Jobs that [would otherwise take us] about a month to complete can now be run in about 10 to 12 hours,” says Alex Bogdan, a principal developer at Electro-Optical Sciences (EOS). EOS is now operating a Red Hat Linux-based cancer diagnosis application called MelaFind on PC-based eCluster systems at IBM’s “Deep Computing on demand” facility in Poughkeepsie, NY.
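To put Bogdan's figures in perspective, the implied speedup can be estimated with a few lines of arithmetic. This is only a rough sketch: it assumes "about a month" means roughly 30 days of round-the-clock computing, which the article does not state.

```python
# Assumption (not from the article): "about a month" is taken to mean
# roughly 30 days of continuous, round-the-clock computation.
HOURS_PER_MONTH = 30 * 24  # 720 hours


def speedup(serial_hours, cluster_hours):
    """Return how many times faster the cluster run is than the original run."""
    return serial_hours / cluster_hours


low = speedup(HOURS_PER_MONTH, 12)   # slower end of the quoted 10-12 hour range
high = speedup(HOURS_PER_MONTH, 10)  # faster end of the quoted 10-12 hour range
print(f"Estimated speedup: {low:.0f}x to {high:.0f}x")  # Estimated speedup: 60x to 72x
```

Under that assumption, the cluster finishes the job roughly 60 to 72 times faster.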
The cost benefit of clusters
The cost effectiveness of open source software is a no-brainer. “No one ever needs a cluster, but it is often the most cost effective way to get the job done,” claims Dague. That holds especially when the cluster runs Linux.
Clusters can be run on inexpensive PC hardware, as well as on Linux distributions operating on top of z/OS- or OS/390-driven mainframes.
But keep ‘Hidden Costs’ in mind, too
On the downside, however, clusters can be accompanied by some hidden costs, including:
If your organization faces one or more of these barriers, outsourced hosting represents an alternative.
Tips and Tricks: Installation
Linux clustering is also a relatively new phenomenon, a fact that might help explain lingering reluctance by some businesses to get started.
Moreover, in some ways, Linux is still sort of a land unto itself. Methods of initial installation, for example, can vary from one Linux distribution to another.
For “lights out” installation, Dague recommends administrators use the following:
“New machines must be brought from bare [metal to] members of the cluster with minimal effort,” says Dague. “CDs don’t cut it.”
What to do about ‘version skew’
Software maintenance is another big issue. Sometimes, mass updates are performed on some nodes, while other nodes are down. Certain administrators apply hot fixes only to individual nodes that are in particular need of maintenance.
“Before long, though, it becomes unclear what [software] is on any given node,” says Dague. Unless you keep careful documentation, this situation leads to “version skew.”
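One way to see version skew before it causes trouble is to compare what each node reports as installed. The sketch below is a minimal illustration, not a tool Dague recommended: the node names, package names, and version strings are hypothetical, and it assumes you have already gathered a per-node package inventory by some other means.

```python
def find_version_skew(inventory):
    """Return {package: {node: version}} for packages that differ across nodes.

    `inventory` maps a node name to a {package name: version string} dict.
    A package that is missing from a node is reported with version None.
    """
    # Collect every package name seen on any node.
    packages = set()
    for installed in inventory.values():
        packages.update(installed)

    skew = {}
    for package in sorted(packages):
        versions = {node: installed.get(package)
                    for node, installed in inventory.items()}
        # More than one distinct value means the nodes disagree.
        if len(set(versions.values())) > 1:
            skew[package] = versions
    return skew


# Hypothetical inventory for a three-node cluster.
inventory = {
    "node01": {"kernel": "2.4.18", "openssh": "3.1p1"},
    "node02": {"kernel": "2.4.18", "openssh": "3.4p1"},
    "node03": {"kernel": "2.4.9", "openssh": "3.4p1"},
}

for package, versions in find_version_skew(inventory).items():
    print(package, versions)
```

Running a check like this after every mass update or hot fix gives a written record of which nodes have drifted.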
Dague offers three options for staving off version skew:
HA clusters – Software needed for switching and failure detection
Software and hardware configurations vary according to type of cluster. High Availability clusters are used when systems need to be “always on,” observes Dague. “In the real world, things break. Disks wear out. Memory goes bad. CPUs overheat.”
As with other OSes, hot spare systems are required, along with HA software for detecting failures and for switching over to the backup node.
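The failure detection and switchover described above can be sketched as a simple heartbeat monitor. This is an illustrative toy rather than any particular HA package: the class name, node names, and timeout are assumptions, and times are passed in explicitly so the logic is easy to follow.

```python
class FailoverMonitor:
    """Tracks heartbeats from the active node and promotes the hot spare on timeout."""

    def __init__(self, primary, spare, timeout_seconds, started_at=0):
        self.active = primary
        self.spare = spare
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = started_at

    def heartbeat(self, now):
        """Record that the active node reported in at time `now` (in seconds)."""
        self.last_heartbeat = now

    def check(self, now):
        """Switch to the spare if the active node has been silent too long.

        Returns the name of the node that should be serving requests.
        """
        silent_for = now - self.last_heartbeat
        if self.spare is not None and silent_for > self.timeout_seconds:
            # Failure detected: promote the spare. No further spare remains.
            self.active, self.spare = self.spare, None
            self.last_heartbeat = now  # give the new active node a fresh window
        return self.active


# Hypothetical two-node setup with a 5-second heartbeat timeout.
monitor = FailoverMonitor("node-a", "node-b", timeout_seconds=5)
monitor.heartbeat(3)
print(monitor.check(6))  # node-a (silent for 3 seconds, within the timeout)
print(monitor.check(9))  # node-b (silent for 6 seconds, so the spare takes over)
```

Real HA software also has to move IP addresses and shared storage to the backup node, which this sketch leaves out.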
Web farm clusters – Load balancing is key
Web farm implementations come into play when extra CPU and network capacity is needed for handling large volumes of data. In this type of clustering implementation, content is distributed over multiple machines, with load balancing at the front end. Typically, Apache is used as the Web server.
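Front-end load balancing can follow many policies. The sketch below shows round-robin, one of the simplest, purely as an illustration: the article does not say which policy a given product uses, and the backend names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend servers in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(backends)

    def next_backend(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


# Hypothetical Web farm with three Apache machines behind the balancer.
balancer = RoundRobinBalancer(["web1", "web2", "web3"])
assignments = [balancer.next_backend() for _ in range(5)]
print(assignments)  # ['web1', 'web2', 'web3', 'web1', 'web2']
```

Round-robin ignores how busy each machine is; production balancers typically also weigh current load and skip machines that have failed.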
Administrators can consult a couple of Web sites for Linux-based load balancing software:
Too small for clusters?
Several boot camp attendees contended that their networks are too small to benefit from Linux clusters.
However, regardless of their size, companies with HPC needs – such as those in scientific and technical fields – are seeing advantages.
In HPC solutions, many computers are tied together through a high-speed network. Applications are then rewritten into “parallelizable chunks.”
“Running a simulation, for example, on one node could actually take many years,” Dague maintained during the Linux boot camp.
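The idea of "parallelizable chunks" can be illustrated on a single machine. In the sketch below, a thread pool stands in for cluster nodes simply to show the decomposition: the work is split into independent pieces, each piece is computed separately, and the partial results are combined. The function names and the sum-of-squares workload are assumptions chosen for illustration; on a real cluster each chunk would be sent to a different node through message-passing software, and Python threads by themselves would not speed up this kind of computation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(start, stop, chunk_count):
    """Split the half-open range [start, stop) into at most chunk_count contiguous pieces."""
    base, extra = divmod(stop - start, chunk_count)
    chunks = []
    lower = start
    for index in range(chunk_count):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if index < extra else 0)
        if size == 0:
            continue
        chunks.append((lower, lower + size))
        lower += size
    return chunks


def sum_of_squares(bounds):
    """Compute one chunk's partial result; needs no data from any other chunk."""
    lower, upper = bounds
    return sum(n * n for n in range(lower, upper))


def chunked_sum_of_squares(stop, workers=4):
    """Farm the chunks out to workers, then combine the partial results."""
    chunks = split_into_chunks(0, stop, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


print(chunked_sum_of_squares(1000))  # 332833500
```

The key property is that no chunk depends on another chunk's output, which is what lets the pieces run on separate nodes at the same time.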
EOS’s Bogdan says it took his company only a couple of hours to port 10 or 12 miniclusters from a homegrown Linux system to IBM’s supercomputing architecture in Poughkeepsie, and to optimize the existing MelaFind application for IBM’s Parallel Virtual Machine (PVM).
HPC clusters – Messaging and dispatch software
Dague suggests the following open source software for HPC messaging and dispatch:
“HPC in a box”
Message passing
Dispatch
So whether you’re interested in HPC, HA, or Web farm applications, Linux clusters could hold a lot of promise for your organization. However, unless you plan to outsource the whole ball of wax, you need to get familiar with open source solutions for installation, maintenance, and specific types of clustering applications.
See All Articles by Columnist Jacqueline Emigh