Nvidia DGX H100 - Identifying the Nvme Manufacturer and Model

NVIDIA DGX H100 Service Manual
Identifying the Failed NVMe from the Console
To identify the failed data drive, you can use the nvsm command:
sudo nvsm show health
View the command output and look for drive alerts to identify the failed drive.
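When the health output is long, it can help to narrow it down to lines that mention drives. The following is a minimal sketch, not part of nvsm: drive_alert_lines is a hypothetical helper, and the search pattern is an assumption, because the exact wording of drive alerts is not shown in this section. Adjust the pattern to match what your system prints.

```shell
# drive_alert_lines is a hypothetical helper (not an nvsm feature).
# It reads text on stdin and keeps only lines that mention "nvme" or
# "drive", ignoring case. The pattern is an assumption about how drive
# alerts are worded.
drive_alert_lines() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -iE 'nvme|drive' || true
}

# Usage (run on the DGX system):
#   sudo nvsm show health | drive_alert_lines
```

If the filter returns nothing, review the full output rather than assuming that all drives are healthy.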
Alternatively, you can use the BMC web user interface to access the Sensor screen, the IPMI event
log, and the System log to identify issues with the U.2 drives.
6.3. Identifying the NVMe Manufacturer and Model
Use the nvsm command to display the drive information:
sudo nvsm show /systems/localhost/storage/drives/nvmeXn1
Replace X in the preceding command with the number that corresponds to the Linux device
name for the failed drive.
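The substitution can be sketched as a small shell function that builds the target path from a Linux device name. nvsm_drive_path is a hypothetical helper introduced here for illustration; it assumes device names of the form nvmeXn1 with a one- or two-digit X, as in the command above.

```shell
# nvsm_drive_path is a hypothetical helper (not part of nvsm).
# It prints the nvsm target path for a device name such as nvme5n1
# and rejects names that do not look like nvmeXn1.
nvsm_drive_path() {
    dev="$1"
    case "$dev" in
        nvme[0-9]n1|nvme[0-9][0-9]n1) ;;
        *)
            echo "unexpected device name: $dev" >&2
            return 1
            ;;
    esac
    printf '/systems/localhost/storage/drives/%s\n' "$dev"
}

# Usage (run on the DGX system):
#   sudo nvsm show "$(nvsm_drive_path nvme5n1)"
```

Checking the name first avoids passing a mistyped path to nvsm.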
Example Output
/systems/localhost/storage/drives/nvme5n1
Properties:
PhysicalLocation_Info = SlotU.2_Slot3
BlockSizeBytes = 512
SerialNumber = 22L0A01WT2N8
Model = KCM6DRUL3T84
Revision = 0107
Manufacturer = KIOXIA Corporation
Status_State = Enabled
Status_Health = OK
Name = nvme5n1
MediaType = SSD
EncryptionStatus = Unlocked
CapacityBytes = 3840755982336
Id = nvme5n1
Targets:
Verbs:
cd
set
show
Refer to the Manufacturer and Model fields in the output. Request a replacement NVMe from
NVIDIA Enterprise Support, specifying this information.
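To copy these two values without reading through the full listing, the output can be filtered with awk. This is a minimal sketch that assumes the "Key = Value" layout shown in the example output; drive_field is a hypothetical helper, not an nvsm feature.

```shell
# drive_field is a hypothetical helper (not part of nvsm).
# It reads nvsm output on stdin and prints the value of the first
# "Key = Value" property whose key matches the first argument exactly.
drive_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)      # drop leading indentation
            sub(/[ \t\r]+$/, "", line)    # drop trailing whitespace
            n = index(line, " = ")        # position of the separator
            if (n > 0 && substr(line, 1, n - 1) == key) {
                print substr(line, n + 3)
                exit
            }
        }
    '
}

# Usage (run on the DGX system):
#   out=$(sudo nvsm show /systems/localhost/storage/drives/nvme5n1)
#   printf '%s\n' "$out" | drive_field Manufacturer
#   printf '%s\n' "$out" | drive_field Model
```

Applied to the example output above, the two calls would print KIOXIA Corporation and KCM6DRUL3T84.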
38 Chapter 6. U.2 NVMe Cache Drive Replacement
