Linux gpu clock monitor

Top 7 Linux GPU Monitoring and Diagnostic Command-Line Tools

A video card is an expansion board that generates the image shown on a computer monitor. Its main chip, the graphics processing unit (GPU), renders 3D images and graphics for Linux gaming and other workloads. Let us look at the top 7 Linux GPU monitoring and diagnostic command-line tools for troubleshooting.

The following tools work for GPU monitoring and diagnostics on Linux, and some also run on other operating systems such as FreeBSD. Most Linux and FreeBSD users these days run Nvidia, Intel, or AMD GPUs.

Linux GPU Monitoring and Diagnostic Command-Line Tools

We can use the following tools to monitor, diagnose, and inspect our Linux or *BSD based systems.

Finding information about GPU on Linux

To get the GPU info simply run:
sudo lshw -C display -short
lspci -v | more
The output includes the detected display adapters.

Want to find out video card GPU memory RAM size on Linux? Try:
sudo lspci
sudo lshw -C display
glxinfo | egrep -i 'device|memory'
grep -i --color memory /var/log/Xorg.0.log
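Because lspci lists every PCI device, it can help to filter its output down to graphics adapters. The following is a minimal sketch, not part of any tool above: gpu_lines is a hypothetical helper name, and the sample lines are made up to resemble lspci output. It assumes the device class names "VGA compatible controller", "3D controller", and "Display controller".

```shell
#!/bin/sh
# gpu_lines is a hypothetical helper: it keeps only the lspci lines whose
# device class is a graphics adapter and drops every other PCI device.
gpu_lines() {
    grep -iE 'VGA compatible controller|3D controller|Display controller'
}

# Demonstration on made-up sample text shaped like lspci output.
printf '%s\n' \
    '00:02.0 VGA compatible controller: Example Vendor Integrated Graphics' \
    '00:1f.3 Audio device: Example Vendor Audio Controller' \
    '01:00.0 3D controller: Example Vendor Discrete GPU' | gpu_lines
```

On a real system the same filter would be used as: lspci | gpu_lines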

1. glmark2 – Stress-testing GPU performance on Linux

glmark2 is an OpenGL 2.0 and ES 2.0 benchmark command-line utility. We can install it as follows:
$ sudo apt install glmark2
Now run it as follows:
$ glmark2
The benchmark then starts and stress-tests your GPU on Linux:

Linux glmark2 test screen

2. glxgears – Simple Linux GPU performance testing tool

It displays a set of rotating gears and prints the frame rate at regular intervals. It has become quite popular as a basic benchmarking tool for Linux and Unix-like systems such as FreeBSD. Install and run it as follows:
$ sudo apt install mesa-utils
$ glxgears

glxgears measures the frame rate and prints it to the terminal every five seconds.

3. gpustat – A simple tool to get Nvidia GPU stats on Linux and FreeBSD Unix

It is written in Python and is a good fit for CLI users, especially ML/AI developers. Install it with pip as follows:
$ pip install gpustat
$ pip3 install gpustat
Run it as follows:
$ gpustat
$ gpustat -cp
The -c and -p options add the name and PID of each process running on the Nvidia GPU to the output.

See help:
$ gpustat -h

4. intel_gpu_top – Displaying a top-like summary of Intel GPU usage on Linux

First, install the tool:
$ sudo apt install intel-gpu-tools
## CentOS/RHEL/Fedora Linux users: try the dnf command ##
$ sudo dnf install intel-gpu-tools
Fedora, RHEL, and CentOS Linux users can also run the tool from a container using the podman command:
$ podman run --rm --privileged registry.freedesktop.org/drm/igt-gpu-tools/igt:master
The tool gathers data using perf performance counters (PMU) exposed by i915 and other platform drivers like RAPL (power) and Uncore IMC (memory bandwidth). Run it as follows on a Linux system:
$ sudo intel_gpu_top

5. nvidia-smi – NVIDIA System Management Interface program

nvidia-smi provides monitoring and management capabilities for NVIDIA’s Tesla, Quadro, GRID, and GeForce devices from the Fermi and later architecture families. GeForce Titan series devices are supported for most functions, with very limited information provided for the rest of the GeForce brand. NVSMI is a cross-platform tool that runs on all Linux and FreeBSD systems supported by the standard NVIDIA driver. Install it as follows once the Nvidia driver is installed on Ubuntu Linux:
$ sudo apt install nvidia-smi
Open the terminal and then run:
$ nvidia-smi -q -g 0 -d UTILIZATION -l 1
$ sudo nvidia-smi
$ nvidia-smi --help
Here is what we see:

6. nvtop – NVIDIA GPU top

nvtop is another fancy but very useful tool for NVIDIA GPUs. It is an ncurses-based GPU status viewer, similar to the htop or top commands. We can install it as follows using the apt/apt-get command on Debian or Ubuntu Linux:
$ sudo apt install nvtop
## RUN the tool ##
$ nvtop

The following keys are available while nvtop is on screen:

  • Up – Select (highlight) the previous process.
  • Down – Select (highlight) the next process.
  • Left / Right – Scroll in the process row.
  • + – Sort increasingly.
  • - – Sort decreasingly.
  • F1 – Select a signal to send to the highlighted process.
  • F2 – Select the field for sorting. The current sort field is highlighted inside the header bar.
  • F3 , q , Esc – Exit nvtop and return to your shell

7. radeontop – Tool to show AMD GPU utilization on Linux

View your AMD GPU utilization, both for the total activity percent and individual blocks on Linux. Install it as follows:
$ sudo apt install radeontop
$ sudo radeontop
It works with R600 and newer GPUs; even Southern Islands should work fine. It works with both the open source AMD drivers and the closed-source AMD Catalyst drivers.

Conclusion

You learned about the various Linux GPU commands and tools for monitoring and diagnostic purposes on Linux and BSD-based systems. Let me know if I missed your favorite tool in the comment section below.

NVIDIA/Tips and tricks

Fixing terminal resolution

Transitioning from nouveau may cause your startup terminal to display at a lower resolution.

For systemd-boot, set console-mode in esp/EFI/loader/loader.conf . See systemd-boot#Loader configuration for details.

For rEFInd, add to esp/EFI/refind/refind.conf and /etc/refind.d/refind.conf (latter file is optional but recommended):

A small caveat is that this will hide the kernel parameters from being shown during boot.

Using TV-out

X with a TV (DFP) as the only display

The X server falls back to CRT-0 if no monitor is automatically detected. This can be a problem when using a DVI connected TV as the main display, and X is started while the TV is turned off or otherwise disconnected.

To force NVIDIA to use DFP, store a copy of the EDID somewhere in the filesystem so that X can parse the file instead of reading EDID from the TV/DFP.

To acquire the EDID, start nvidia-settings. It shows information in a tree format; ignore the rest of the settings for now and select the GPU (the corresponding entry should be titled "GPU-0" or similar). Click the DFP section (again, DFP-0 or similar), click the Acquire EDID button, and store the file somewhere, for example /etc/X11/dfp0.edid .

If no mouse and keyboard are attached to the front-end, the EDID can be acquired using only the command line. Run an X server with enough verbosity to print out the EDID block:

After the X Server has finished initializing, close it and your log file will probably be in /var/log/Xorg.0.log . Extract the EDID block using nvidia-xconfig:

Edit xorg.conf by adding to the Device section:

The ConnectedMonitor option forces the driver to recognize the DFP as if it were connected. The CustomEDID option provides EDID data for the device, meaning that it will start up just as if the TV/DFP had been connected during the X startup process.

This way, one can automatically start a display manager at boot time and still have a working and properly configured X screen by the time the TV gets powered on.

If the above changes did not work, you can try removing Option "ConnectedMonitor" "DFP" from the Device section of xorg.conf and adding the following lines:

The NoDFPNativeResolutionCheck prevents NVIDIA driver from disabling all the modes that do not fit in the native resolution.

Headless (no monitor) resolution

In headless mode, resolution falls back to 640×480, which is used by VNC or Steam Link. To start in a higher resolution e.g. 1920×1080, specify a Virtual entry under the Screen subsection in xorg.conf :

Check the power source

The NVIDIA X.org driver can also be used to detect the GPU’s current source of power. To see the current power source, check the 'GPUPowerSource' read-only parameter (0 = AC, 1 = battery):
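For scripts, the numeric value can be mapped to a readable label. This is a minimal sketch: power_source_name is a hypothetical helper name, and the mapping is only the 0/1 meaning stated above, with any other value reported as unknown.

```shell
#!/bin/sh
# power_source_name is a hypothetical helper that decodes the
# GPUPowerSource value: 0 means AC, 1 means battery.
power_source_name() {
    case "$1" in
        0) echo "AC" ;;
        1) echo "battery" ;;
        *) echo "unknown" ;;
    esac
}

# Demonstration with literal values instead of a live query.
power_source_name 0
power_source_name 1
```

Assuming an X session and the terse output flag of nvidia-settings, it could be fed a live value with: power_source_name "$(nvidia-settings -q GPUPowerSource -t)"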

Listening to ACPI events

NVIDIA drivers automatically try to connect to the acpid daemon and listen to ACPI events such as battery power, docking, some hotkeys, etc. If connection fails, X.org will output the following warning:

While completely harmless, you may get rid of this message by disabling the ConnectToAcpid option in your /etc/X11/xorg.conf.d/20-nvidia.conf :

If you are on a laptop, it might be a good idea to install and enable the acpid daemon instead.

Displaying GPU temperature in the shell

There are three methods to query the GPU temperature. nvidia-settings requires that you are using X; nvidia-smi and nvclock do not. Also note that nvclock currently does not work with newer NVIDIA cards such as GeForce 200 series cards, or with embedded GPUs such as the Zotac IONITX’s 8800GS.

nvidia-settings

To display the GPU temp in the shell, use nvidia-settings as follows:

This will output something similar to the following:

The GPU temperature of this board is 41 C.

In order to get just the temperature for use in utilities such as rrdtool or conky:

nvidia-smi

Use nvidia-smi which can read temps directly from the GPU without the need to use X at all, e.g. when running Wayland or on a headless server. To display the GPU temperature in the shell, use nvidia-smi as follows:

This should output something similar to the following:

Only for temperature:

In order to get just the temperature for use in utilities such as rrdtool or conky:
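On machines with several GPUs, a monitoring utility often wants a single number. The sketch below assumes one temperature per line on standard input, which is the shape produced by nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits. The helper name max_temp is hypothetical and the sample values are made up.

```shell
#!/bin/sh
# max_temp is a hypothetical helper: it reads one temperature per line
# (degrees C) from stdin and prints the highest value seen.
max_temp() {
    awk '{ t = $1 + 0; if (NR == 1 || t > max) max = t }
         END { if (NR > 0) print max }'
}

# Demonstration with made-up readings for two GPUs.
printf '41\n57\n' | max_temp
```

With a working driver, the live equivalent would be: nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | max_temp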

nvclock

Use nvclock, which is available from the AUR.

There can be significant differences between the temperatures reported by nvclock and nvidia-settings/nv-control. According to this post by the author (thunderbird) of nvclock, the nvclock values should be more accurate.

Overclocking and cooling

Enabling overclocking

Overclocking is controlled via Coolbits option in the Device section, which enables various unsupported features:

The Coolbits value is the sum of its component bits in the binary numeral system. The component bits are:

  • 1 (bit 0) - Enables overclocking of older (pre-Fermi) cores on the Clock Frequencies page in nvidia-settings.
  • 2 (bit 1) - When this bit is set, the driver will "attempt to initialize SLI when using GPUs with different amounts of video memory".
  • 4 (bit 2) - Enables manual configuration of GPU fan speed on the Thermal Monitor page in nvidia-settings.
  • 8 (bit 3) - Enables overclocking on the PowerMizer page in nvidia-settings. Available since version 337.12 for the Fermi architecture and newer.[1]
  • 16 (bit 4) - Enables overvoltage using nvidia-settings CLI options. Available since version 346.16 for the Fermi architecture and newer.[2]

To enable multiple features, add the Coolbits values together. For example, to enable overclocking and overvoltage of Fermi cores, set Option "Coolbits" "24" .
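The arithmetic can be checked in the shell. This is a small sketch with a hypothetical helper name, coolbits_sum. It combines the values with a bitwise OR, which gives the same result as adding distinct bits and avoids double-counting a value listed twice.

```shell
#!/bin/sh
# coolbits_sum is a hypothetical helper, not part of any NVIDIA tool.
# It combines Coolbits feature values into the single number for xorg.conf.
coolbits_sum() {
    total=0
    for bit in "$@"; do
        # Bitwise OR: a feature listed twice is still counted once.
        total=$(( total | bit ))
    done
    echo "$total"
}

# 8 (PowerMizer overclocking) + 16 (overvoltage)
coolbits_sum 8 16
# 4 (fan control) + 8 + 16
coolbits_sum 4 8 16
```

The two calls print 24 and 28, the values that would go into the Coolbits option.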

The documentation of Coolbits can be found in /usr/share/doc/nvidia/html/xconfigoptions.html and here.

Setting static 2D/3D clocks

Set the following string in the Device section to enable PowerMizer at its maximum performance level (VSync will not work without this line):

Allow change to highest performance mode

The factual accuracy of this article or section is disputed.

Since changing the performance mode and overclocking the memory rate has little to no effect in nvidia-settings, try this:

  • Set Coolbits to 24 or 28, remove the PowerMizer RegistryDwords, and restart X.
  • Find out the maximum clock and memory rates (these can be LOWER than what your graphics card reports after booting!):
  • Set the rates for GPU 0:

After setting the rates, the maximum performance mode works in nvidia-settings and you can overclock the graphics clock and memory transfer rate.

Saving overclocking settings

Typically, clock and voltage offsets entered in the nvidia-settings interface are not saved and are lost after a reboot. Fortunately, there are tools that offer an interface for overclocking under the proprietary driver, save the user’s overclocking preferences, and apply them automatically on boot. Some of them are:

  • gwe (AUR) - graphical, applies settings on desktop session start
  • nvclock (AUR) and systemd-nvclock-unit (AUR) - graphical, applies settings on system boot
  • nvoc (AUR) - text based, profiles are configuration files in /etc/nvoc.d/ , applies settings on desktop session start

Custom TDP Limit

Modern Nvidia graphics cards throttle frequency to stay in their TDP and temperature limits. To increase performance it is possible to change the TDP limit, which will result in higher temperatures and higher power consumption.

For example, to set the power limit to 160.30W:

To set the power limit on boot (without driver persistence):

Set fan speed at login

You can adjust the fan speed on your graphics card with nvidia-settings’ console interface. First ensure that your Xorg configuration has bit 2 enabled in the Coolbits option.

Place the following line in your xinitrc file to adjust the fan when you launch Xorg. Replace n with the fan speed percentage you want to set.

You can also configure a second GPU by incrementing the GPU and fan number.

If you use a login manager such as GDM or SDDM, you can create a desktop entry file to process this setting. Create ~/.config/autostart/nvidia-fan-speed.desktop and place this text inside it. Again, change n to the speed percentage you want.

To make it possible to adjust the fan speed of more than one graphics card, run:

Kernel module parameters

Some options can be set as kernel module parameters. A full list can be obtained by running modinfo nvidia or by looking at nv-reg.h . See Gentoo:NVidia/nvidia-drivers#Kernel module parameters as well.

For example, enabling the following will turn on kernel mode setting (see above) and enable the PAT feature [5], which affects how memory is allocated. PAT was first introduced in Pentium III [6] and is supported by most newer CPUs (see wikipedia:Page attribute table#Processors). If your system can support this feature, it should improve performance.

On some notebooks, you must include this option to enable any tweaking of NVIDIA settings; otherwise the driver responds with "Setting applications clocks is not supported" and similar messages.

Preserve video memory after suspend

By default the NVIDIA Linux drivers save and restore only essential video memory allocations on system suspend and resume. Quoting NVIDIA ([7], also available with the nvidia-utils package in /usr/share/doc/nvidia/html/powermanagement.html): The resulting loss of video memory contents is partially compensated for by the user-space NVIDIA drivers, and by some applications, but can lead to failures such as rendering corruption and application crashes upon exit from power management cycles.

The still experimental system enables saving all video memory (given enough space on disk or main RAM). The interface is the /proc/driver/nvidia/suspend file, used as follows: write "suspend" (or "hibernate") to /proc/driver/nvidia/suspend immediately before writing to the usual Linux /sys/power/state file, then write "resume" to /proc/driver/nvidia/suspend immediately after waking up, or after an unsuccessful attempt to suspend or hibernate.
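The ordering of those writes can be sketched as a shell function. This is an illustration only, not the script shipped by NVIDIA: the function name and the two variables are hypothetical, and the sketch assumes suspend-to-RAM, which is requested by writing "mem" to /sys/power/state. The paths are held in variables so the sequence can be tried against ordinary files; on a real system it would need root.

```shell
#!/bin/sh
# Hypothetical variable names; the defaults are the real interface paths.
NVIDIA_SUSPEND_FILE="${NVIDIA_SUSPEND_FILE:-/proc/driver/nvidia/suspend}"
POWER_STATE_FILE="${POWER_STATE_FILE:-/sys/power/state}"

# nvidia_suspend_cycle is a hypothetical helper showing the write order.
nvidia_suspend_cycle() {
    # 1. Tell the driver to save video memory before the system sleeps.
    echo suspend > "$NVIDIA_SUSPEND_FILE" || return 1
    # 2. Ask the kernel to suspend; this write returns after wake-up or failure.
    echo mem > "$POWER_STATE_FILE"
    # 3. Always notify the driver afterwards, even if suspending failed.
    echo resume > "$NVIDIA_SUSPEND_FILE"
}
```

The function is defined but deliberately not called here, since calling it with the default paths would suspend the machine.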

The NVIDIA drivers rely on a user defined file system for storage. The chosen file system needs to support unnamed temporary files (ext4 works) and have sufficient capacity for storing the video memory allocations (e.g., at least (sum of the memory capacities of all NVIDIA GPUs) * 1.2 ). Use the command nvidia-smi -q -d MEMORY to list the memory capacities of all GPUs in the system.
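The capacity rule can be turned into a quick calculation. The sketch below uses a hypothetical helper, required_mib, which reads one memory size in MiB per line and applies the 1.2 factor, rounding up. Integer arithmetic is used to avoid floating-point rounding; the sample sizes are made up.

```shell
#!/bin/sh
# required_mib is a hypothetical helper: it sums per-GPU memory sizes
# (MiB, one per line on stdin) and prints ceil(sum * 1.2).
required_mib() {
    awk '{ sum += $1 }
         END { print int((sum * 12 + 9) / 10) }'
}

# Demonstration: an 8192 MiB GPU plus a 4096 MiB GPU.
printf '8192\n4096\n' | required_mib
```

For the sample input this prints 14746. On a real system, assuming the driver's nvidia-smi supports the query interface, the sizes could come from: nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | required_mib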

To choose the file system used for storing video memory during system sleep (and change the default video memory save/restore strategy to save and restore all video memory allocations), it is necessary to pass two options to the "nvidia" kernel module. For example, write the following line to /etc/modprobe.d/nvidia-power-management.conf and reboot:

Feel free to replace "/tmp-nvidia" in the previous line with a path within your desired file system.

The interaction with /proc/driver/nvidia/suspend is handled by the simple Unix shell script at /usr/bin/nvidia-sleep.sh, which is itself called by a tool like systemd. The Arch Linux nvidia-utils package ships with the following relevant systemd services (which essentially just call nvidia-sleep.sh): nvidia-suspend , nvidia-hibernate , nvidia-resume . Contrary to NVIDIA’s instructions, it is currently not necessary to enable nvidia-resume (and it is in fact probably not a good idea to enable it), because the /usr/lib/systemd/system-sleep/nvidia script does the same thing as the service (but slightly earlier), and it is enabled by default (systemd calls it after waking up from a suspend). Do enable nvidia-suspend and/or nvidia-hibernate .

Driver persistence

Nvidia has a daemon that can be optionally run at boot. In a standard single-GPU X desktop environment the persistence daemon is not needed and can actually create issues [8]. See the Driver Persistence section of the Nvidia documentation for more details.
