For many organizations, particularly smaller ones, fault tolerance extends only as far as a nightly tape backup. The reason often cited for stopping there is a lack of available funds, but perhaps more prevalent is a lack of full appreciation of the impact a downed server brings. Tape backups provide an insurance policy against one thing: data loss. They do not protect against downtime and, quite often, constitute the slowest part of a system recovery process.
Organizations must understand the distinction between fault tolerance and data protection. The data is obviously of value, and so is the hardware, but the cost of downtime is harder to determine. Backups protect the data, and the hardware is protected by virtue of its being kept in a safe location, but preventing downtime is more complex. Even those working in environments where clustering and fail-over systems are used must consider downtime.
Many of the same organizations that rely on tape backups alone become very willing to implement fault-tolerant measures after a damaging event has occurred. As with most things, the benefit of hindsight is great. The principle of fault tolerance is that an ounce of prevention is worth a pound of cure, and it should be viewed as an investment like any other aspect of business. This is even truer now that more companies find themselves unable to function without a server, and the price of hardware that can be used to provide fault tolerance continues to fall.
I remember when teaching technical training courses some years ago, the downside of disk mirroring was cited as the fact that it consumes 50 percent of the available disk space. In today's market, disk space is one of the cheapest commodities we have. So should we all mirror our drives? In the absence of a RAID 5 array, I would say yes, why not? Heck, for the sake of a few hundred bucks you could even consider implementing disk duplexing, but more about that in part two.
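The capacity trade-off between mirroring and RAID 5 is simple arithmetic. The sketch below (a hypothetical illustration, not from the article; the function name and parameters are my own) compares usable space for the two approaches:

```python
# Hypothetical illustration: usable capacity under RAID 1 (mirroring)
# versus RAID 5 for drives of equal size.

def usable_capacity_gb(drive_size_gb, drive_count, raid_level):
    """Return usable capacity in GB for a simple RAID 1 or RAID 5 array."""
    if raid_level == 1:
        if drive_count != 2:
            raise ValueError("basic RAID 1 mirrors exactly two drives")
        return drive_size_gb                       # the mirror copy is pure overhead
    if raid_level == 5:
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return drive_size_gb * (drive_count - 1)   # one drive's worth of parity
    raise ValueError("only RAID 1 and RAID 5 are modeled here")

# Mirroring two 500 GB drives yields 500 GB usable (50 percent overhead);
# four 500 GB drives in RAID 5 yield 1500 GB usable (25 percent overhead).
print(usable_capacity_gb(500, 2, 1))   # 500
print(usable_capacity_gb(500, 4, 5))   # 1500
```

The 50 percent overhead of mirroring shrinks to one drive's worth of capacity under RAID 5, which is why larger arrays tend to favor parity schemes.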
For each fault-tolerant step you consider, you must look at a number of factors. Possibly the biggest consideration is how likely a given component is to fail. I recently attended a seminar given by Intel, where we discussed a feature called Fault Resilient Booting (FRB), whereby if one processor fails, the system disables the failed processor and reboots. Someone had to ask the question, so I did: how often does a processor fail? (In my 13 years on the job I have never, to my knowledge, had a failed processor.) The answer was 'very, very seldom,' though of course what else would you expect someone from Intel to say? Unless you are looking to create a supremely fault-tolerant system, features that protect against 'very, very seldom' occurrences must be weighed against those that protect a more susceptible component. But that raises another question: what is a susceptible component?
Some years ago, while working for a major financial institution, I arrived at work one Monday morning (have you ever noticed how these things always happen on a Monday?) to find that three drives in the RAID array of one of the servers had gone down. The cause? It wasn't a faulty batch of drives; it was a faulty backplane. This server was the full meal deal, 'biggie sized': dual power supplies, adapter teaming, RAID 5 with a hot spare, and a vastly oversized UPS. None of that could prevent the system falling foul of a $90 component. The fact is that no matter how many fault-tolerant measures are in place, there is always an unknown factor. In other words, the Holy Grail of reliability, 100 percent uptime, is not attainable. But increased availability can be achieved.
In part two of this article, we will look in more detail at some of the options available for fault tolerance on server-based systems and evaluate their effectiveness relative to the investment.
Drew Bird (MCT, MCNI) is a freelance instructor and technical writer. He has been working in the IT
industry for 12 years and currently lives in Kelowna, B.C., Canada.
Datamation is the leading industry resource for B2B data professionals and technology buyers. Datamation's focus is on providing insight into the latest trends and innovation in AI, data security, big data, and more, along with in-depth product recommendations and comparisons. More than 1.7M users gain insight and guidance from Datamation every year.