Preventive Maintenance Tips to Extend the Life of Your IT Infrastructure

According to a recent Forrester report, computer equipment purchases are expected to continue to decline through the second half of 2009 due to the economy. However, not purchasing new IT equipment means it’s even more important than ever to properly care for and maintain your existing systems.
Emerson Network Power offers these preventive maintenance tips to IT and facilities managers to help maintain reliability and availability while waiting out this economic storm. While some of it is a bit technical, it can give you the basis for a discussion with your own IT department or outsourced provider to make sure they are doing what they should to keep your power and cooling system – and therefore your network – running well.
An Ounce of Prevention
Preventive maintenance tips to extend the life of your IT infrastructure and supported systems
By Jeff Powers, Henry Hu and Jeff Donato
Service Product Managers
Emerson Network Power, Liebert Services

Today’s economy is forcing businesses to look at all scenarios to cut costs, including delaying the purchase of new equipment. That means preventive maintenance on existing equipment is more important than ever.
Focusing on preventive maintenance can help businesses minimize the need to repair or replace important components that could cost hundreds of thousands of dollars if not properly maintained. Neglecting a preventive maintenance program greatly increases the chance that business operations will be disrupted if the power or cooling equipment fails, thus exposing the business to loss of revenue, reducing work productivity, affecting customer satisfaction and loyalty, etc. That’s not to mention the costs incurred for repairs and replacements.
One way businesses can minimize unit-related failures is to institute a comprehensive preventive maintenance program implemented by original equipment manufacturer (OEM) trained and certified technicians. When correctly implemented, preventive maintenance visits ensure maximum reliability of data center equipment by providing systematic inspections, detection and correction of incipient failures, either before they occur or before they develop into major defects that could translate into costly downtime. Typical preventive maintenance programs include inspections, tests, measurements, adjustments, parts replacement, and housekeeping practices.
Here are a few preventive maintenance tips for small and mid-sized businesses, focusing on the power and cooling infrastructure:

Begin with the UPS
To keep running through power outages, utility spikes and other unforeseeable power issues, critical systems are dependent on the reliability of the UPS (uninterruptible power supply) system. Therefore, keeping these systems in working condition is crucial.
While the UPS systems are designed to offer the utmost reliability and performance at an affordable price, they are not failure-proof. Factors such as application, installation, design, real world operating conditions and maintenance practices can impact the reliability and performance of the UPS systems.
Remember, a system is only as reliable as its shortest-lived component. Some manufacturers, including Emerson Network Power, are addressing this issue by reducing the number of parts that need to be replaced, thus decreasing the chance of a failure. Even so, failures still occur, so being proactive with maintenance can greatly reduce your chances of downtime.
Do Not Overlook Batteries
Battery maintenance begins with installation of your system. Batteries must be fully charged, battery room conditions verified and baseline ohmic readings recorded for proper trend analysis throughout the life of the battery. If this information is not properly gathered and documented, determining bad batteries could prove to be difficult.
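The trend analysis described above can be sketched as a comparison of each cell's latest ohmic reading against the baseline recorded at installation. This is a minimal Python sketch, not a vendor procedure: the function name, the cell labels, the sample readings and the 25 percent alert threshold are all assumptions made for illustration, so substitute the limits your battery manufacturer specifies.

```python
def flag_cells(baseline, latest, max_rise_pct=25.0):
    """Return cells whose ohmic reading rose more than max_rise_pct
    above the baseline recorded at installation.

    baseline, latest: dicts mapping a cell label to an ohmic reading
    (same unit in both, e.g. micro-ohms). max_rise_pct is an assumed
    threshold, not a value taken from the article or from a standard.
    """
    flagged = {}
    for cell, base in baseline.items():
        if cell not in latest:
            continue  # no new reading for this cell, nothing to compare
        rise_pct = (latest[cell] - base) / base * 100.0
        if rise_pct > max_rise_pct:
            flagged[cell] = round(rise_pct, 1)
    return flagged


# Hypothetical readings in micro-ohms.
baseline = {"cell-1": 3200, "cell-2": 3150, "cell-3": 3300}
latest = {"cell-1": 3350, "cell-2": 4100, "cell-3": 3400}
print(flag_cells(baseline, latest))  # {'cell-2': 30.2}
```

Without the installation baseline there is nothing to compare against, which is why a missing baseline makes bad cells hard to identify later.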
Observe Best Practices … as a Start
For best practices for battery maintenance, refer to the manufacturer’s recommendations, IEEE 1188 for Valve Regulated Lead Acid (VRLA) batteries and IEEE 450 for Vented Lead Acid (VLA or flooded) batteries. However, best practices do not always equate to common practices. Governed by real-world factors, many facility managers are forced to weigh the cost of performing the recommended IEEE schedule against the criticality of the application.
High ambient temperature and frequent discharges are most commonly responsible for reducing useful life across all types of batteries. (Dryout is the most common cause of VRLA battery failure.) Battery aging accelerates dramatically as ambient temperature increases. This is true of batteries in service and in storage. Even under specified temperatures, batteries are designed to provide a limited number of discharge cycles during their expected life. While that number may be adequate in some applications, there are instances where a battery can wear out prematurely.
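To get a rough feel for how quickly heat shortens battery life, the sketch below applies a commonly cited rule of thumb: expected life is roughly halved for each fixed step in temperature above a reference ambient. The article itself gives no figures, so the 25 °C reference, the 10 °C halving step and the function name are assumptions; published values vary by manufacturer and battery type, and the battery data sheet should take precedence.

```python
def estimated_life_years(rated_life_years, ambient_c,
                         reference_c=25.0, halving_step_c=10.0):
    """Rough estimate of battery life at a sustained ambient temperature.

    Assumes life halves for every halving_step_c degrees above
    reference_c. Both defaults are illustrative assumptions, not values
    from the article or from a specific data sheet.
    """
    if ambient_c <= reference_c:
        return rated_life_years  # no credit is given for cooler rooms
    excess_c = ambient_c - reference_c
    return rated_life_years * 0.5 ** (excess_c / halving_step_c)


# A battery rated for 5 years, kept in a 35 °C room.
print(estimated_life_years(5.0, 35.0))  # 2.5
```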
Enhance Maintenance Through Monitoring
Once a battery is put into service, it’s important to proactively monitor battery performance trends to help detect battery failure. A battery monitoring system provides a continuous watch of the battery to assess its true state of health. Instead of waiting for an inevitable failure or replacing batteries prematurely to prevent problems, battery monitors allow organizations to optimize the use of their batteries.
While there are many battery services available, the best solution to maximizing battery performance is to utilize an integrated battery monitoring service that combines state-of-the-art battery monitoring technology with proactive maintenance and service response. This type of proactive solution integrates onsite and remote preventive maintenance activities with expert predictive analysis to identify problems before they occur.
Replace Batteries Safely
If a power outage occurs, even a single bad cell in a string could compromise your entire backup system and leave you without protection. In addition to implementing proper maintenance practices and monitoring batteries, safely replacing failing batteries will help keep IT systems running to specifications and minimize the risk of costly downtime to business operations. IEEE standards recommend replacing a battery when its capacity falls to 80 percent of its rated capacity.
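The 80 percent replacement guideline can be expressed as a simple check on the result of a capacity test. This is a sketch under stated assumptions: the function name and the sample figures are hypothetical, and it takes the measured capacity as given rather than modelling the test procedure or the temperature corrections a formal capacity test would apply.

```python
def needs_replacement(measured_capacity_ah, rated_capacity_ah,
                      threshold_pct=80.0):
    """Return (replace, capacity_pct) for one battery.

    capacity_pct is measured capacity as a percentage of rated capacity.
    threshold_pct defaults to the 80 percent figure cited in the text.
    """
    capacity_pct = measured_capacity_ah * 100.0 / rated_capacity_ah
    return capacity_pct <= threshold_pct, round(capacity_pct, 1)


# Hypothetical test result: 78 Ah delivered by a battery rated at 100 Ah.
print(needs_replacement(78.0, 100.0))  # (True, 78.0)
```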
Keep the Air Moving
Clogged air filters reduce the airflow through cooling systems and increase the load on the blower drive system. This may result in reduced system cooling performance, higher operating costs, reduced component life of the blower drive systems, and higher operating temperatures of the equipment in the data center. These higher operating temperatures may be very localized and hard to detect within a data center due to irregular air flow patterns. Similarly, substitution of non-OEM filters may also change performance of the system. Lower-cost substitute filters may collapse if they have not been designed for the air flow of the system. Always use OEM replacement filters when servicing your equipment.
Monitor the Moving Parts
The blower belts, bearings, motors and wheels need to be inspected regularly. Wear or damage to any of these components may result in the reduction or loss of airflow or reduced cooling performance. Again, all of these components are specifically selected for the performance requirements of the system, and only OEM replacements should be used. Belts, specifically, are literally machined to match as a set of 2 or 3 belts so that belt tension is uniform – maximizing belt life.
Inspect Humidifiers Regularly
Humidifiers are connected with valves and hoses that can leak, and their drains can become clogged over time. These components should be inspected regularly. Steam generating humidifiers are vulnerable to water deposits that can diminish humidifier performance. Once these deposits form, more power is required to achieve the same level of humidification, and the assembly begins to deteriorate toward failure.
Infrared humidifier bulbs may burn out, diminishing performance. Drains and pans should be inspected for deposits and clogs. When handling humidifier bulbs, it is important not to touch the glass bulb with your fingers – the residue from skin oils creates a ‘hot spot’ on the bulb that reduces the life of the bulb.
Check the Oil
Inspect oil level in the compressor and check for leaks. Compressors running with too much or too little oil will see diminished service life. Always use the same type of oil supplied with the compressor from the OEM.
Keep Things Clean
Dirt and debris can cause problems in several areas:
• CONDENSATE DRAINS AND PUMPS: Confirm proper pump function and verify drains are not clogged. Obviously, the combination of a clogged drain and a failed level sensor results in pan overflow.
• REHEAT: Inspect and clean reheat elements – inspect and tighten support hardware.
• FACILITY FLUID AND PIPING: For units connected to facility water or glycol, perform the maintenance needed to assure the quality of the fluid. Contaminants in these fluids can lead to diminished performance.
• EVAPORATOR COILS: Evaporator coils should be checked periodically to verify they are clean and free of debris. Dirty coils are less efficient at removing heat.
• CONDENSERS AND DRY COOLERS: Condenser coils should be checked periodically to verify they are clean and free of debris. As with evaporator coils, dirty condenser coils are less efficient at rejecting heat.
In Conclusion
Call a Professional
Most preventive maintenance measures should be left to qualified and trained personnel. Business owners, facility and IT managers can provide preventive support such as replacing air filters when dirty, ensuring environmental specifications are met and maintained, and monitoring the UPS for alarms, but most tasks are best left to a professional.
When choosing a service provider, seek out a group that offers a comprehensive portfolio of services that can be customized to satisfy customer requirements. In addition, preventive maintenance service should include at least the following to minimize your time to recovery should you experience a downtime event:
• 24 x 7 emergency services
• Parts replacement and availability in the shortest possible time
• End-user training seminars detailing best practices and service tips.
The service provider should also provide access to highly trained technicians who engage in ongoing industry training.