
Nvidia DGX-1 User Manual

Maintaining and Servicing the NVIDIA DGX-1
www.nvidia.com
NVIDIA DGX-1 DU-08033-001_v13.1
The output should be a list of ib_ and mlx driver components (mlx4_, mlx5_, and mlx_compat).
Example:
ib_ucm 20480 0
ib_ipoib 131072 0
ib_cm 45056 3 rdma_cm,ib_ucm,ib_ipoib
ib_uverbs 73728 2 ib_ucm,rdma_ucm
ib_umad 24576 0
mlx5_ib 192512 0
mlx4_ib 192512 0
ib_sa 36864 5 rdma_cm,ib_cm,mlx4_ib,rdma_ucm,ib_ipoib
ib_mad 57344 4 ib_cm,ib_sa,mlx4_ib,ib_umad
ib_core 143360 13 rdma_cm,ib_cm,ib_sa,iw_cm,nv_peer_mem,mlx4_ib,mlx5_ib,ib_mad,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
ib_addr 20480 3 rdma_cm,ib_core,rdma_ucm
ib_netlink 16384 3 rdma_cm,iw_cm,ib_addr
mlx4_core 344064 2 mlx4_en,mlx4_ib
mlx5_core 524288 1 mlx5_ib
mlx_compat 16384 18 rdma_cm,ib_cm,ib_sa,iw_cm,mlx4_en,mlx4_ib,mlx5_ib,ib_mad,ib_ucm,ib_netlink,ib_addr,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib
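The module check above can also be scripted. The following is a minimal sketch and not part of the NVIDIA procedure: missing_modules is a hypothetical helper that reads lsmod-style output on standard input and prints every module named in its arguments that does not appear in that output. Which modules to require is an assumption; adjust the list to the card that was installed.

```shell
# Hypothetical helper (not from the NVIDIA manual).
# Reads lsmod-style text on stdin; prints each requested module that is absent.
missing_modules() {
    awk -v want="$*" '
        { seen[$1] = 1 }                  # column 1 of lsmod is the module name
        END {
            n = split(want, w, " ")       # requested names, in the order given
            for (i = 1; i <= n; i++)
                if (!(w[i] in seen)) print w[i]
        }'
}

# Example use on a live system (the module list here is an assumption):
#   lsmod | missing_modules ib_core mlx5_core mlx5_ib
```

No output means every requested module is loaded.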
3. Verify that the OFED software was installed correctly.
$ modinfo mlx5_core | grep -i version | head -1
Example output:
Version : 3.4-1.0.0
DGX-1 OS release 1.0 should have OFED software 3.2.
DGX-1 OS release 2.0 should have OFED software 3.4.
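The comparison against the expected OFED release can be expressed as a small check. The sketch below is an illustration rather than an NVIDIA tool, and the function name ofed_matches is hypothetical. It takes the version line printed by modinfo and an expected major.minor prefix (3.2 or 3.4, per the releases listed above).

```shell
# Hypothetical helper (not from the NVIDIA manual).
# $1 = version line from modinfo, $2 = expected major.minor prefix, e.g. 3.4
# Succeeds only if the reported version belongs to that release.
ofed_matches() {
    # The version is the last whitespace-separated field, e.g. "3.4-1.0.0"
    ver=$(printf '%s\n' "$1" | awk '{print $NF}')
    case "$ver" in
        "$2"|"$2"-*|"$2".*) return 0 ;;   # exact, or followed by "-" or "."
        *) return 1 ;;
    esac
}

# Example use on a live system:
#   line=$(modinfo mlx5_core | grep -i version | head -1)
#   ofed_matches "$line" 3.4 && echo "OFED 3.4 detected"
```

Matching on the prefix plus a separator keeps a release such as 3.40 from being accepted as 3.4.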
4. Restart the InfiniBand services so that the new card is recognized.
a) Restart the InfiniBand service.
$ sudo service openibd restart
b) Restart the Subnet Manager (OpenSM) service.
$ sudo service opensmd restart
c) Verify that both services have started.
$ service openibd status
openibd start/running
$ service opensmd status
OpenSM is running...
d) If the services do not start, check the following:
- The drivers are loaded as described in step 3.
- The associated cables are connected to the InfiniBand ports.
- The state reported by ibstat (refer to step 7).
- Whether errors are reported in /var/log/syslog.
If these checks do not indicate a problem and the services still do not start, contact NVIDIA Enterprise Support and obtain an RMA for the card.
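The status and log checks in this step can be sketched as two small shell helpers. Both names, service_running and ib_log_errors, are hypothetical, and neither comes from the NVIDIA documentation. The first treats status text as healthy when it mentions "running", as in the two example outputs above; the wording it rejects for a stopped service is an assumption. The second filters log text for lines that mention an InfiniBand component together with "error" or "fail"; that keyword list is also an assumption, not an official set of messages.

```shell
# Hypothetical helpers (not from the NVIDIA manual).

# Reads service status text on stdin; succeeds if it reports a running service.
# The "not running" and "stop" patterns are assumed wordings for a stopped one.
service_running() {
    status=$(cat)
    case "$status" in
        *"not running"*|*stop*) return 1 ;;
        *running*) return 0 ;;
        *) return 1 ;;
    esac
}

# Reads log text on stdin; prints lines that mention an InfiniBand component
# and also contain "error" or "fail". The keywords are assumptions.
ib_log_errors() {
    grep -iE 'mlx[45]|openibd|opensm|infiniband' | grep -iE 'error|fail'
}

# Example use on a live system:
#   service openibd status | service_running || echo "openibd is not running"
#   ib_log_errors < /var/log/syslog
```

These helpers only narrow down where to look; the checks listed in this step still apply.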
5. Verify the firmware version.
$ cat /sys/class/infiniband/mlx5*/fw_ver
Example output:
