Monitoring NVIDIA graphics card parameters

I made some sensors to track the parameters of the NVIDIA graphics card that I use with Jellyfin on OpenMediaVault. Each sensor runs nvidia-smi on the OpenMediaVault host over SSH. I'm sharing them here in case they are useful to someone.

  - platform: command_line
    name: 'OMV_HA graphics card Temp'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits'"
    unit_of_measurement: '°C'
    scan_interval: 30
    
  - platform: command_line
    name: 'OMV_HA graphics card Load'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits'"
    unit_of_measurement: '%'
    scan_interval: 29
    
  - platform: command_line
    name: 'OMV_HA graphics card used RAM'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits'"
    unit_of_measurement: 'MiB'
    scan_interval: 28
    
  - platform: command_line
    name: 'OMV_HA graphics card free RAM'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits'"
    unit_of_measurement: 'MiB'
    scan_interval: 27
    
  - platform: command_line
    name: 'OMV_HA graphics driver version'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=driver_version --format=csv,noheader'"
    scan_interval: 26
    
  - platform: command_line
    name: 'OMV_HA graphics card GPU FAN'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=fan.speed --format=csv,noheader,nounits'"
    unit_of_measurement: '%'
    scan_interval: 25
    
  - platform: command_line
    name: 'OMV_HA graphics card GPU clock'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=clocks.current.graphics --format=csv,noheader,nounits'"
    unit_of_measurement: 'MHz'
    scan_interval: 24
    
  - platform: command_line
    name: 'OMV_HA graphics card GPU clock MAX'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=clocks.max.graphics --format=csv,noheader,nounits'"
    unit_of_measurement: 'MHz'
    scan_interval: 23
    
  - platform: command_line
    name: 'OMV_HA graphics card used process'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-compute-apps=name,used_memory --format=csv'"
    scan_interval: 22
    
  - platform: command_line
    name: 'OMV_HA graphics card perf'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --query-gpu=pstate --format=csv,noheader'"
    scan_interval: 300
    
  - platform: command_line
    name: 'OMV_HA graphics card name'
    command: "ssh -i /config/id_rsa -o StrictHostKeyChecking=no username@IPAddress -t 'nvidia-smi --list-gpus'"
    scan_interval: 300

For how to create and copy the SSH keys, see the thread "How to monitor Proxmox CPU temp".
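
For readers who do not have that thread at hand, here is a minimal sketch of the usual key setup, not necessarily the exact steps it describes. It assumes ssh-keygen and ssh-copy-id are available wherever the Home Assistant /config directory is writable (for example in an SSH add-on). The function name setup_gpu_ssh_key and the target username@IPAddress are placeholders introduced only for illustration.

```shell
# Sketch: create a passphrase-less key pair and install it on the GPU host.
# The default key path matches the /config/id_rsa path used by the sensors above.
setup_gpu_ssh_key() {
  target="$1"                      # user@host of the machine with the GPU
  key_path="${2:-/config/id_rsa}"  # private key file used by the sensors

  if [ -z "$target" ]; then
    echo "usage: setup_gpu_ssh_key user@host [key_path]" >&2
    return 1
  fi

  # Generate the key only if it does not exist yet (-N "" means no passphrase,
  # which is what lets the sensor command run unattended).
  [ -f "$key_path" ] || ssh-keygen -t rsa -b 4096 -N "" -f "$key_path" || return 1

  # Append the public key to the remote user's authorized_keys
  # (this step asks for the remote password once).
  ssh-copy-id -i "$key_path.pub" "$target" || return 1

  # Check that key-based login works and that nvidia-smi is reachable.
  ssh -i "$key_path" "$target" 'nvidia-smi --list-gpus'
}

# Example call (placeholder target, not run here):
# setup_gpu_ssh_key username@IPAddress
```

If the last command prints the GPU name without asking for a password, the command_line sensors should be able to log in the same way.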

Thank you for sharing! I had been looking for something like this for months!

However, I took a different approach. Instead of running a separate query for each value at staggered intervals, I pass several fields to a single nvidia-smi --query-gpu call, which returns them as comma-separated columns without headers or units. A single command-line sensor then stores those values as a JSON string.

- sensor:
      name: 'GPU Data'
      command: "ssh -o UserKnownHostsFile=/config/.ssh/known_hosts username@IPAddress -i /config/.ssh/id_rsa 'nvidia-smi --query-gpu=power.draw,temperature.gpu,utilization.gpu,utilization.memory --format=csv,noheader,nounits'"
      scan_interval: 30
      command_timeout: 10
      value_template: >-
        {% set lines = value.split('\n') %}
        {% set values = lines[0].split(',') %}
        {{
          {
            "gpu_power_draw": values[0]         | trim,
            "gpu_temperature": values[1]        | trim,
            "gpu_utilization": values[2]        | trim,
            "gpu_memory_utilization": values[3] | trim,
          } | to_json
        }}
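
To check the field order outside Home Assistant, the same split-and-trim step can be sketched in shell. This is only a debugging aid under stated assumptions: gpu_csv_to_json is a hypothetical helper name, and the sample line is made up rather than real nvidia-smi output.

```shell
# Convert one line of `nvidia-smi --format=csv,noheader,nounits` output
# (power.draw, temperature.gpu, utilization.gpu, utilization.memory)
# into a JSON object with the same keys the value_template uses.
gpu_csv_to_json() {
  awk -F',' '{
    # Trim surrounding whitespace from each field, like the `trim` filter.
    for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t\r]+$/, "", $i)
    printf "{\"gpu_power_draw\": \"%s\", \"gpu_temperature\": \"%s\", \"gpu_utilization\": \"%s\", \"gpu_memory_utilization\": \"%s\"}\n", $1, $2, $3, $4
    exit  # only the first GPU line, like lines[0] in the template
  }'
}

# Hypothetical sample line; real values come from nvidia-smi on the GPU host.
printf '45.20, 52, 13, 4\n' | gpu_csv_to_json
```

On the real host, the nvidia-smi command from the sensor can be piped into the function instead of the sample line. As in the template, all values are kept as strings.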

Then I created a template sensor for each JSON value:

    - name: "GPU Power"
      unique_id: gpu_power_draw
      unit_of_measurement: W
      state_class: measurement
      state: >-
        {{ ((states('sensor.gpu_data') | from_json).gpu_power_draw) | round(2) }}

    - name: "GPU Temperature"
      unique_id: gpu_temperature
      unit_of_measurement: °C
      state_class: measurement
      state: >-
        {{ ((states('sensor.gpu_data') | from_json).gpu_temperature) | round(2) }}

    - name: "GPU Utilization"
      unique_id: gpu_utilization
      unit_of_measurement: "%"
      state_class: measurement
      state: >-
        {{ ((states('sensor.gpu_data') | from_json).gpu_utilization) | round(2) }}

    - name: "GPU Memory Utilization"
      unique_id: gpu_memory_utilization
      unit_of_measurement: "%"
      state_class: measurement
      state: >-
        {{ ((states('sensor.gpu_data') | from_json).gpu_memory_utilization) | round(2) }}