NVIDIA DGX A100 DU-09821-001_v01
Chapter 5. Additional Features and Instructions
This chapter describes specific features of the DGX A100 server to consider during setup and
operation.
5.1. Managing the DGX Crash Dump Feature
The DGX OS includes a script, /usr/sbin/dgx-kdump-config, to manage the crash dump feature.
5.1.1. Using the Script
This section provides information about how to use the script to manage DGX crash dumps.
‣ To enable only dmesg crash dumps, enter the following command:
$ /usr/sbin/dgx-kdump-config enable-dmesg-dump
This option reserves memory for the crash kernel.
‣ To enable both dmesg and vmcore crash dumps, enter the following command:
$ /usr/sbin/dgx-kdump-config enable-vmcore-dump
This option reserves memory for the crash kernel.
‣ To disable crash dumps, enter the following command:
$ /usr/sbin/dgx-kdump-config disable
This option disables the use of kdump and ensures that no memory is reserved for the crash kernel.
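After running one of these options, it can be useful to check whether memory is actually reserved for the crash kernel. The following is a minimal sketch. It assumes that dgx-kdump-config relies on the standard Linux kdump mechanism (a crashkernel= parameter on the kernel command line), which this guide does not confirm. The function names crashkernel_param and report_kdump_state are hypothetical helpers and are not part of the DGX OS.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the crashkernel= parameter found in
# the kernel command line string passed as $1. Returns 1 if it is absent.
crashkernel_param() {
    # Intentionally unquoted so the string is split into words.
    for word in $1; do
        case "$word" in
            crashkernel=*)
                printf '%s\n' "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Hypothetical helper: report the reservation state of the running system.
# Assumption: the reservation is expressed through crashkernel=, as on
# standard Linux kdump setups.
report_kdump_state() {
    cmdline=$(cat /proc/cmdline 2>/dev/null) || cmdline=""
    if value=$(crashkernel_param "$cmdline"); then
        echo "crashkernel parameter present: $value"
    else
        echo "no crashkernel= parameter on the kernel command line"
    fi
    # On kernels with kexec support, this file reads 1 when a crash kernel is loaded.
    if [ -r /sys/kernel/kexec_crash_loaded ]; then
        echo "crash kernel loaded: $(cat /sys/kernel/kexec_crash_loaded)"
    fi
}

report_kdump_state
```

Because crashkernel= is a boot-time parameter, a change to the reservation on such a setup would normally become visible only after the next reboot.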
5.1.2. Connecting to Serial Over LAN to View the Console
While dumping vmcore, the BMC screen console goes blank approximately 11 minutes after
the crash dump is started. To view the console output during the crash dump, connect to
serial over LAN as follows:
$ ipmitool -I lanplus -H <bmc-ip-address> -U <BMC-USERNAME> -P <BMC-PASSWORD> sol activate
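Because a vmcore dump can run for many minutes, it may help to keep a copy of the console output while watching it. The following sketch wraps the command above. It uses ipmitool's -E option, which reads the password from the IPMI_PASSWORD environment variable instead of the command line. The function name sol_capture, the IPMITOOL override (useful for a dry run), and the log file name are illustrative assumptions and are not part of the DGX OS.

```shell
#!/bin/sh
# Sketch: open a serial-over-LAN session and save a copy of the console output.
# Arguments: BMC address, BMC user name, log file path.
sol_capture() {
    host=$1
    user=$2
    logfile=$3
    # -E tells ipmitool to read the password from IPMI_PASSWORD, so it does
    # not appear in the process list or shell history.
    # IPMITOOL is a hypothetical override for testing; it defaults to ipmitool.
    "${IPMITOOL:-ipmitool}" -I lanplus -H "$host" -U "$user" -E sol activate \
        | tee "$logfile"
}

# Example, with the same placeholders as the command above:
#   export IPMI_PASSWORD='<BMC-PASSWORD>'
#   sol_capture <bmc-ip-address> <BMC-USERNAME> crashdump-console.log
```

To end the session from the client side, type the ipmitool escape sequence ~. at the start of a line.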