Troubleshoot Hardware Faults Using the Oracle ILOM Web Interface
Cooling Issue Description Action Prevention
Description (continued): the front or back vents are blocked, the airflow through the server is disrupted and the cooling system fails to function properly, causing the server internal temperature to rise.

Action (continued): installed components or cables that can block the flow of air through the server.

Prevention (continued): fans, air baffles, and dividers are properly installed. Never operate the server without the top cover installed.

Cooling Issue: Cooling Areas Compromised

Description: The air baffle, component filler panels, and server top cover maintain and direct the flow of cool air through the server. These server components must be in place for the server to function as a sealed system. If these components are not installed correctly, the airflow inside the server can become chaotic and non-directional, which can cause server components to overheat and fail.

Action: Inspect the server interior to ensure that the air baffle is properly installed. Ensure that all external-facing slots (storage drive, PCIe) are occupied with either a component or a component filler panel. Ensure that the server top cover is in place and sits flat and snug on top of the server.

Prevention: When servicing the server, ensure that the air baffle is installed correctly and that the server has no unoccupied external-facing slots. Never operate the server without the top cover installed.
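
The inspection points in this row can be written down as a simple checklist. The following Python sketch is illustrative only: the ChassisInspection record, its field names, and the slot labels are hypothetical, are filled in by hand after a visual inspection, and do not correspond to any Oracle ILOM interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ChassisInspection:
    """Hypothetical record of a manual, visual chassis inspection."""
    air_baffle_installed: bool
    top_cover_seated: bool
    # Slot label -> True if the slot holds a component or a filler panel.
    slots_filled: Dict[str, bool] = field(default_factory=dict)


def airflow_findings(inspection: ChassisInspection) -> List[str]:
    """Return the items that must be corrected before operating the server."""
    findings = []
    if not inspection.air_baffle_installed:
        findings.append("Air baffle is missing or not properly installed")
    for slot, filled in sorted(inspection.slots_filled.items()):
        if not filled:
            findings.append(f"Slot {slot} needs a component or a filler panel")
    if not inspection.top_cover_seated:
        findings.append("Top cover is not in place or does not sit flat")
    return findings


# Example: one empty PCIe slot (labels are made up), everything else in order.
example = ChassisInspection(
    air_baffle_installed=True,
    top_cover_seated=True,
    slots_filled={"PCIe 1": True, "PCIe 2": False, "Drive 0": True},
)
for finding in airflow_findings(example):
    print(finding)
```

An empty result means that the three conditions described above (air baffle installed, no unoccupied external-facing slots, top cover in place) were all recorded as satisfied.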

Cooling Issue: Hardware Component Failure

Description:
■ Components, such as power supplies and fan modules, are an integral part of the server cooling system. When one of these components fails, the server internal temperature can rise. This rise in temperature can cause other components to enter an over-temperature state. Additionally, some components, such as processors, might overheat when they are failing, which can also generate an over-temperature event.
■ To reduce the risk related to component failure, power supplies and fan modules are installed in pairs to provide redundancy. Redundancy ensures that if one component in the pair fails, the other functioning component can continue to maintain the subsystem. For example, power supplies serve a dual function: they provide both power and airflow. If one power supply fails, the other functioning power supply can maintain both the power and the cooling subsystems.

Action: Investigate the cause of the over-temperature event, and replace failed components immediately. For hardware troubleshooting information, see “Troubleshooting Server Hardware Faults” on page 25.

Prevention: Component redundancy is provided to allow for component failure in critical subsystems, such as the cooling subsystem. However, once a component in a redundant system fails, the redundancy no longer exists, and the risk for server shutdown and component failures increases. Therefore, it is important to maintain redundant systems and replace failed components immediately.
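
The redundancy behavior described in this row can be modeled as a small state function. The following Python sketch is a conceptual illustration only: the RedundancyState names and the pair_state function are hypothetical and are not part of Oracle ILOM or of the server firmware.

```python
from enum import Enum


class RedundancyState(Enum):
    REDUNDANT = "both units working; one failure can be tolerated"
    DEGRADED = "one unit working; redundancy no longer exists"
    FAILED = "no unit working; the subsystem is down"


def pair_state(unit_a_ok: bool, unit_b_ok: bool) -> RedundancyState:
    """Classify a redundant pair (for example, two power supplies)."""
    working = int(unit_a_ok) + int(unit_b_ok)
    if working == 2:
        return RedundancyState.REDUNDANT
    if working == 1:
        return RedundancyState.DEGRADED
    return RedundancyState.FAILED


def needs_immediate_replacement(state: RedundancyState) -> bool:
    """Any state other than REDUNDANT calls for replacing the failed unit."""
    return state is not RedundancyState.REDUNDANT


# Example: one power supply in the pair has failed.
state = pair_state(unit_a_ok=True, unit_b_ok=False)
print(state.name, needs_immediate_replacement(state))
```

In the DEGRADED state the subsystem is still maintained by the remaining unit, but a second failure would take it down, which is why the failed component should be replaced immediately.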
36 Oracle Server X8-2L Service Manual • January 2021