In IT, hard drives and hardware fail – this is a fact. Anything that contains moving parts will, given time, cease moving.
Aging hardware comes with a slew of direct and indirect costs that eat into your budget and productivity without you even realizing it. According to a recent study, more than a third of small businesses are using PCs that are at least four years old. These older computers are more susceptible to crashes and security breaches than newer models, which can result in lost data, lost money, and employees twiddling their thumbs for long periods of time.
But why do hard drives and hardware fail so often, and when should you upgrade to avoid anything like this happening to you?
Why Do Hard Drives and Hardware Fail?
There are numerous factors that affect the lifetime of hardware and can cause failures.
It’s no surprise that solid state drives (SSDs) are progressively outperforming traditional hard disk drives in terms of dependability and endurance. While SSD failures aren’t unheard of, they are far less common than traditional hard drive failures. Because an SSD has no moving parts, the storage medium itself isn’t prone to mechanical failure, but the drive can still fail in other ways. The most common reasons why SSDs fail include operating system errors, bad blocks, file system issues, NAND flash memory failures, and yes, even environmental factors such as heat, dust, and water damage (although those are far less common).
We don’t always buy SSDs for everything, though. For backups, bulk storage, and large archives, traditional hard disk drives (HDDs) are still very common.
HDDs are electromechanical devices that are vulnerable to wear and tear under heavy use and are extremely sensitive to their surroundings – even something as seemingly benign as smoke in the air can permanently harm a disk. Hard drives like to live in a cozy, climate-controlled paradise, but life isn’t always like that. Real-world conditions are harsh and unpredictable, resulting in considerably shorter lifespans than the Mean Time Between Failures (MTBF) estimates suggest.
A major factor to take into consideration is the bathtub curve. When hardware is brand new, its failure rate is relatively high as early defects surface – but after that it runs in a fairly steady condition for many years. As it ages, however, bearings on moving parts begin to wear down, years of operation take their toll on circuits, and minor manufacturing flaws begin to compound.
The bathtub curve describes how an asset’s failure rate changes over its life cycle. Each stage of the curve calls for different strategies to extend the asset’s usable life. For example, although failures during the normal (middle) period are largely random, breakdowns near the wear-out period become more predictable. Certain predictive maintenance strategies can detect breakdowns before they happen based on an asset’s age and performance.
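To make the shape of the curve concrete, here is a minimal Python sketch. It assumes the common textbook approach of modeling the bathtub curve as the sum of a falling early-life failure rate, a constant random failure rate, and a rising wear-out rate, using Weibull hazard functions. The function names and every parameter value are illustrative assumptions, not measurements of any real drive.

```python
def weibull_hazard(t_years: float, shape: float, scale: float) -> float:
    """Weibull hazard rate h(t) = (k / lam) * (t / lam) ** (k - 1), for t > 0."""
    return (shape / scale) * (t_years / scale) ** (shape - 1)


def bathtub_hazard(t_years: float) -> float:
    """Illustrative failure rate (failures per device-year) at a given age.

    All numbers below are made up to produce a bathtub shape.
    """
    # shape < 1: early-life failures that become rarer as the device ages
    early = weibull_hazard(t_years, shape=0.5, scale=20.0)
    # constant background rate of random failures during normal life
    random_failures = 0.01
    # shape > 1: wear-out failures that become more common with age
    wear_out = weibull_hazard(t_years, shape=5.0, scale=8.0)
    return early + random_failures + wear_out


if __name__ == "__main__":
    for age in (0.1, 1, 3, 5, 8):
        print(f"age {age:>4} years: {bathtub_hazard(age):.3f} failures per device-year")
```

With these made-up parameters, the rate starts high, bottoms out in the middle years, and climbs again as wear-out takes over, which is the pattern a refresh cycle tries to stay ahead of.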
Other factors that affect hard drive and hardware lifetime include movement and vibration, firmware corruption, heat, water damage, power problems, dust, and human error.
When Should They Be Upgraded?
Similar to a lot of things in the IT world, there are a number of elements one must evaluate when answering this question: heading off impending hardware failures, maintaining acceptable levels of performance, and staying compatible with current versions of software.
Heading off Impending Failures
For SSDs, one thing to watch is write endurance. SSDs are rated for a certain volume of writes over the warranty period, which is normally five years. So, if someone buys a read-intensive drive that’s rated for one full drive write per day for five years, and they use it roughly that much, the drive can be expected to reach end of life in about five years.
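The endurance arithmetic above can be sketched in a few lines of Python. This is a rough estimate only: it reads "write once a day" as one full drive write per day (often abbreviated DWPD on spec sheets), and the 1.92 TB capacity and the function names are hypothetical values chosen for illustration.

```python
def rated_endurance_tb(capacity_tb: float, dwpd: float, warranty_years: float = 5) -> float:
    """Total terabytes the drive is rated to absorb over its warranty period."""
    return capacity_tb * dwpd * 365 * warranty_years


def years_to_rated_endurance(capacity_tb: float, dwpd: float,
                             daily_writes_tb: float, warranty_years: float = 5) -> float:
    """Years until the rated endurance is used up at a steady daily write volume."""
    total_tb = rated_endurance_tb(capacity_tb, dwpd, warranty_years)
    return total_tb / (daily_writes_tb * 365)


if __name__ == "__main__":
    capacity = 1.92  # hypothetical drive size in TB
    # Writing the full drive once a day uses up the rating in the warranty period.
    print(years_to_rated_endurance(capacity, dwpd=1, daily_writes_tb=1.92))
    # Writing half as much per day stretches the same rating over twice as long.
    print(years_to_rated_endurance(capacity, dwpd=1, daily_writes_tb=0.96))
```

Comparing the estimate with the drive’s age and actual write volume gives a rough idea of how much rated life is left; it does not predict the exact moment a drive will fail.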
Other factors to look out for include files that can’t be written or read, bad block errors, frequent crashes/slow machine boots, the drive becoming “read-only,” freezing or crashing of active applications, and the SSD running unreasonably slow. If you’re beginning to see any of these things, it’s time to take action and avoid any detrimental data loss.
HDDs are a bit different. Manufacturers’ drive specifications state that the MTBF of drives is typically 1,000,000 hours (or more), which equates to more than 100 years. That means our drive will last as long as we need it and will never die, correct? Well, not quite, because while it may be 1,000,000 hours, you must also consider the number of drives you have.
To illustrate, our CTO Brent walks through some simple scale-up math:
First, how many hours are in a year? 365 days x 24 hours = 8,760 hours (ignoring leap years).
That means that each drive runs for 8,760 hours a year. So, if I have 10 drives, together they run for 87,600 hours in a given year: 1,000,000 divided by 87,600 equals 11.4 years. That means that, according to MTBF alone, I can expect about one failure every 11 years.
What if I have 100 drives? Now we’re up to 876,000 hours in a year, which means we would have a failure roughly every 1.1 years, so over a five-year life cycle we would expect to replace four or five drives (about 4.4 on average). What if I have 1,000 drives? That adds up to 8,760,000 hours, which means I have a failure every 0.11 years, or roughly every 42 days, and over a five-year life cycle, I could have about 44 drives fail.
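The same scale-up math can be expressed as a short Python sketch. It assumes, as the walkthrough does, that every drive runs around the clock and that MTBF is the only input. The function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def years_between_failures(drive_count: int, mtbf_hours: float = 1_000_000) -> float:
    """Average years between failures across a fleet of drives running 24/7."""
    fleet_hours_per_year = drive_count * HOURS_PER_YEAR
    return mtbf_hours / fleet_hours_per_year


def expected_failures(drive_count: int, years: float, mtbf_hours: float = 1_000_000) -> float:
    """Expected number of drive failures over a period, based on MTBF alone."""
    return drive_count * HOURS_PER_YEAR * years / mtbf_hours


if __name__ == "__main__":
    for count in (10, 100, 1_000):
        gap = years_between_failures(count)
        failures = expected_failures(count, years=5)
        print(f"{count:>5} drives: one failure every {gap:.2f} years, "
              f"about {failures:.1f} failures in five years")
```

Plugging in your own drive count and the MTBF from the manufacturer’s spec sheet gives a rough number of spares to budget for over a refresh cycle. Real-world failure rates are often higher than MTBF suggests, so treat the result as a floor rather than a forecast.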
The general principle is to upgrade hardware only when the expense of not upgrading outweighs the expense of upgrading. However, there is more to consider than just the expense.
Maintaining an Acceptable Level of Performance
New hardware should allow you to work more quickly and efficiently. Additionally, if you require hardware upgrades in order to run new software applications to increase productivity, the best option is to upgrade. The same goes for a faulty PC that crashes frequently or otherwise prevents you from getting work done. In each of these circumstances, delaying the upgrade will cost you more than proceeding with it.
Other signs that it may be time to upgrade include frequent crashes, network slowdowns or decreased server speed, servers that are out of warranty, skyrocketing cooling and power costs, and growing security concerns. While this is not an all-inclusive list, if you’re beginning to see one or more of these issues come up, it’s time to start considering upgrades.
Staying Compatible with Supported Versions of Operating Systems, Hypervisors and Other Mission-Critical Software
A major reason we recommend upgrading old equipment, aside from the likelihood of failures, is compatibility with new versions of software. Older servers may not be compatible with new, stable, and secure versions of Windows, a hypervisor, a database, and so on. Windows Server 2012 R2 is going EOL next year, and many servers running it today may not be compatible with Windows Server 2019 or higher. Additionally, as a general rule of thumb, when new program versions come out, they typically require more resources and performance, not less. That older server may have kept up with VMware vSphere 5, but the demands of vSphere 7 may be beyond its spec. It’s important to validate against hardware compatibility lists and make a refresh cycle a regular part of your budget to ensure you can run secure, supported versions of your mission-critical applications and software.
Benefits of Upgrading
There are a lot of elements to look at when considering these types of upgrades, and it can be overwhelming – but we can’t forget about all the benefits. Upgrading your hardware will almost always improve productivity and performance via increased storage, higher speeds, enhanced communications, reduced downtime, higher security, and software compatibility. It’s time we accept that old drives and hardware bring businesses down.
A major reason to keep your IT equipment up to date is the increased security that comes with new, supported equipment. More modern hardware is better protected against known threats through security support from the manufacturer, and attackers may also be less familiar with it. As a result, it is less likely to be exposed to security risks over an extended length of time.
Overall, upgrading your hardware and hard drives can dramatically improve the employee experience and morale. Technology should be nearly invisible to your staff: any slowness, outages, or issues will directly impact productivity and create widespread frustration. Upgrading the hardware your employees use all day, every day removes the delays that get in the way of their work, so they can accomplish more with less frustration.
Examine both your current and future computing demands and business goals. You must pay close attention to the age, usage, and environments of your drives and hardware in order to avoid complete failures, security risks, and unproductive work. It’s difficult to know when to pull the trigger on these upgrades, and with the supply chain issues we are seeing today, the pressure is on.
Consider what may need to be updated in the near future. If you’re not sure where to begin, our team of experts can help you assess, test, and evaluate all aspects of your IT system and environment.