Nvidia DGX Station A100 User Manual

Getting Started with DGX Station A100
DGX Station A100 DU-10189-001 _v5.0.2|11
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)
In this example, Docker selected the first two GPUs to run the container. To choose
specific GPUs instead, pass their UUIDs in the device parameter of the --gpus option:
lab@ro-dvt-058-80gb:~$ docker run --gpus '"device=GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5,GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf"' --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 1: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)
In this example, the two GPUs that were not used earlier are now assigned to the
container.
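The quoting in the device option is easy to get wrong, because the value passed to --gpus must itself contain double quotes. As a minimal sketch, the hypothetical bash helper below (gpus_arg is not part of Docker or of this manual) joins a list of GPU UUIDs into that value. The docker command is shown only as a comment, and the UUIDs are the ones from the example above.

```shell
#!/usr/bin/env bash
# gpus_arg: hypothetical helper that builds the value for docker's --gpus option.
# It joins its arguments with commas and wraps the result as "device=...",
# including the literal double quotes used in the manual's example.
gpus_arg() {
    local IFS=,
    printf '"device=%s"' "$*"
}

# Example use (requires Docker and the NVIDIA container runtime):
# docker run --gpus "$(gpus_arg GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5 \
#     GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)" --rm -it ubuntu nvidia-smi -L
```

The UUIDs of the installed GPUs can be listed with nvidia-smi -L, as in the output shown above.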
2.3.2. Running on Bare Metal
To run applications on the four high-performance GPUs, set the CUDA_VISIBLE_DEVICES
environment variable before you run the application.
Note: This method does not use containers.
CUDA orders the GPUs by performance, so GPU 0 will be the highest performing GPU, and the
last GPU will be the slowest GPU.
Important: If the CUDA_DEVICE_ORDER variable is set to PCI_BUS_ID, this ordering will be
overridden.
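As a minimal sketch of the step described above, the following lines restrict CUDA applications started from the current shell to the four high-performance GPUs. It assumes the default performance-based ordering, so that indices 0 to 3 are the A100 GPUs and index 4 is the DGX Display GPU. The application name in the comment is hypothetical.

```shell
# Make sure the default performance-based ordering is in effect;
# CUDA_DEVICE_ORDER=PCI_BUS_ID would change which index maps to which GPU.
unset CUDA_DEVICE_ORDER

# Assumption: indices 0-3 are the four A100 GPUs, index 4 is the DGX Display GPU.
export CUDA_VISIBLE_DEVICES=0,1,2,3

# Confirm the setting before launching the application.
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"

# Launch the application from this shell (hypothetical name):
# ./my_cuda_app
```

To limit the setting to a single run, the variable can instead be placed in front of the command on the same line.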
The following example builds and runs p2pBandwidthLatencyTest, an application that comes
with the CUDA samples. In the output, GPU 0 is the fastest GPU in a DGX Station A100, and
GPU 4 (the DGX Display GPU) is the slowest:
lab@ro-dvt-058-80gb:~$ sudo apt install cuda-samples-11-2
lab@ro-dvt-058-80gb:~$ cd /usr/local/cuda-11.2/samples/1_Utilities/p2pBandwidthLatencyTest
lab@ro-dvt-058-80gb:/usr/local/cuda-11.2/samples/1_Utilities/p2pBandwidthLatencyTest$ sudo make
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../common/inc -m64 --threads
0 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37
-gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52
-gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61
-gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75
-gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86
-gencode arch=compute_86,code=compute_86 -o p2pBandwidthLatencyTest.o -c
p2pBandwidthLatencyTest.cu
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda/bin/nvcc -ccbin g++ -m64
-gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37
-gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52
-gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61
-gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75
-gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86
-gencode arch=compute_86,code=compute_86 -o p2pBandwidthLatencyTest
p2pBandwidthLatencyTest.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
mkdir -p ../../bin/x86_64/linux/release
cp p2pBandwidthLatencyTest ../../bin/x86_64/linux/release
lab@ro-dvt-058-80gb:/usr/local/cuda-11.2/samples/1_Utilities/p2pBandwidthLatencyTest$ cd /usr/local/cuda-11.2/samples/bin/x86_64/linux/release
