Nvidia DGX A100 Service Manual

Chapter 12. DIMM Replacement
12.1. DIMM Replacement Overview
This is a high-level overview of the procedure to replace a dual inline memory module (DIMM) on the
DGX A100 system.
1. Use the nvsm health commands to identify the failed DIMM.
2. Get a replacement DIMM from NVIDIA Enterprise Support.
3. Shut down the system.
4. Label all motherboard tray cables and unplug them.
5. Remove the motherboard tray and place it on a solid, flat surface.
6. Remove the motherboard tray lid.
7. Use the reference diagram on the lid of the motherboard tray to identify the failed DIMM.
8. Replace the bad DIMM with the new one.
9. Close the lid on the motherboard tray.
10. Insert the motherboard tray into the system.
11. Plug in all cables using the labels as a reference.
12. Power on the system.
13. Verify that all DIMMs are now healthy with nvsm.
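The checks in steps 1 and 13 can be scripted. The following is a minimal sketch, not part of the NVIDIA procedure: no_memory_alerts is a hypothetical helper name, and it assumes that alert entries appear in the nvsm output as lines such as alert0, as in the example output in section 12.2.

```shell
# Hypothetical helper: exit 0 when the text on stdin lists no alert entries.
# Assumption: each alert appears on its own line as "alert<N>",
# optionally indented, as in the example output in section 12.2.
no_memory_alerts() {
  ! grep -Eq '^[[:space:]]*alert[0-9]+[[:space:]]*$'
}
```

One possible use is to pipe the memory alerts command through the helper:

$ sudo nvsm show /systems/localhost/memory/alerts | no_memory_alerts && echo "No memory alerts listed"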
12.2. Identifying the Failed DIMM
1. From the console, run the following nvsm command to identify memory alerts.
$ sudo nvsm show /systems/localhost/memory/alerts
Alerts appear under the Targets section. For example:
Targets:
alert0
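When several alerts are listed, the names under Targets can be pulled out of the output with a short filter. This is a sketch under stated assumptions rather than an NVIDIA-provided tool: list_alert_targets is a hypothetical helper name, and it assumes the layout shown above (a Targets: line followed by one alert name per line, optionally indented).

```shell
# Hypothetical helper: print the alert names listed under "Targets:".
# Assumption: the output has a "Targets:" line followed by one
# "alert<N>" entry per line; any other non-blank line ends the section.
list_alert_targets() {
  awk '
    /^[ \t]*Targets:/ { in_targets = 1; next }
    in_targets && /^[ \t]*alert[0-9]+[ \t]*$/ {
      gsub(/[ \t]/, "")   # strip indentation and trailing blanks
      print
      next
    }
    in_targets && NF { in_targets = 0 }
  '
}
```

For example:

$ sudo nvsm show /systems/localhost/memory/alerts | list_alert_targets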
2. Get specic information about the memory alert.
The following example obtains information for alert0.