Custom Component - ESXi Stats

wxt9861 · September 20, 2019, 11:31pm

Yeah, there’s no point in adding any documentation right now; the examples will have to change, and there are minor changes to services as well.

I guess I am still not understanding the problem. Is it a problem with templating? It looks like the sensor is correctly displaying the names.

daphatty · September 20, 2019, 11:36pm

Yes, the issue is the format of the example templates. If the target VM contains a dash in the hostname then your example template:

${states['sensor.esxi_stats_vms'].attributes.<VM_NAME_HERE>.uptime_hours}

will fail to retrieve the value.

However, by removing the trailing period after attributes and encasing <VM_NAME_HERE> in [''],

${states['sensor.esxi_stats_vms'].attributes['<VM_NAME_HERE>'].uptime_hours}

the template will return the values successfully.
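For anyone wondering why the brackets matter, here is a minimal sketch in plain JavaScript. The `states` object below is a made-up stand-in for what a button-card template sees, and the VM names and values are hypothetical.

```javascript
// Hypothetical stand-in for the `states` object available in a template.
// The entity id matches the thread; the attribute values are invented.
const states = {
  'sensor.esxi_stats_vms': {
    attributes: {
      'w10srv-01': { uptime_hours: 42 },
      plainvm: { uptime_hours: 7 },
    },
  },
};

const attrs = states['sensor.esxi_stats_vms'].attributes;

// Dot notation only works when the key is a valid identifier.
const plainUptime = attrs.plainvm.uptime_hours;

// Bracket notation takes the key as a string, so a dash is fine.
const dashedUptime = attrs['w10srv-01'].uptime_hours;

// Written with dots, the dash in `w10srv-01` is read as a minus operator,
// so the expression never looks up the 'w10srv-01' entry at all.
console.log(plainUptime, dashedUptime);
```

Bracket notation also works for names without a dash, so using it everywhere keeps the templates uniform.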

Schmidij_ch · September 23, 2019, 1:12pm

Yes, with this it works!

      - custom_fields:
          cpu: |
            [[[
              return `<ha-icon
                icon="mdi:server"
                style="width: 12px; height: 12px; color: deepskyblue;">
                </ha-icon><span>CPU: <span style="color: var(--text-color-sensor);">
                ${states['sensor.esxi_stats_vms'].attributes['w10srv-01'].cpu_count}</span></span>`
            ]]]
          ram: |
            [[[
              return `<ha-icon
                icon="mdi:memory"
                style="width: 12px; height: 12px; color: deepskyblue;">
                </ha-icon><span>Mem: <span style="color: var(--text-color-sensor);">
                ${states['sensor.esxi_stats_vms'].attributes['w10srv-01'].memory_allocated_mb} MB</span></span>`
            ]]]
          state: |
            [[[
              return `<ha-icon
                icon="mdi:harddisk"
                style="width: 12px; height: 12px; color: deepskyblue;">
                </ha-icon><span>State: <span style="color: var(--text-color-sensor);">
                ${states['sensor.esxi_stats_vms'].attributes['w10srv-01'].state}</span></span>`
            ]]]
          uptime: |
            [[[
              return `<ha-icon
                icon="mdi:arrow-up"
                style="width: 12px; height: 12px; color: deepskyblue;">
                </ha-icon><span><span style="color: var(--text-color-sensor);">
                ${states['sensor.esxi_stats_vms'].attributes['w10srv-01'].uptime_hours}</span></span>`
            ]]]
        entity: sensor.esxi_stats_vms
        icon: 'mdi:server'
        name: w10srv-01
        styles:
          icon:
            - color: |
                [[[
                    if ( states['sensor.esxi_stats_vms'].attributes['w10srv-01'].state == "running")
                    return "green";
                    if ( states['sensor.esxi_stats_vms'].attributes['w10srv-01'].state == "off" )
                    return "red";
                ]]]
        template: esxi_stats_vm
        type: 'custom:button-card'

wxt9861 · September 23, 2019, 2:43pm

0.5.0b1 beta is available for testing. There are breaking changes, so if you’re going to install it, please remove the integration configuration, run an upgrade, reboot and re-add the integration.

This includes the following changes:

Support for monitoring multiple hosts/vCenters
I removed support for configuration via YAML, please use the Integration UI
All of the changes from the first beta version (see Custom Component - ESXi Stats)

daphatty · September 24, 2019, 6:44am

WOOT!

I’m currently out of town but will endeavor to get this tested upon my return tomorrow night. Can’t wait to try out the new functionality!

kryt1kal · September 24, 2019, 1:09pm

Is there any way we can get allocated storage space as an attribute?

Great work on 0.5.0b1 btw!

wxt9861 · September 24, 2019, 1:42pm

I looked at it, but need to spend more time to figure out how it is being calculated. It isn’t a field I can just pull, so likely need to perform some calculation. Is this a case where you are using a thin provisioned VMDK?

kryt1kal · September 24, 2019, 1:59pm

Gotcha. I probably am using thin provision. Not a huge deal if it can’t be pulled.

wxt9861 · September 25, 2019, 4:15pm

Looking at this more, I don’t know how to make this work accurately. Right now, used_space_gb covers this for thick provisioned VMDKs.

used_space_gb shows how much storage a VM consumes across all datastores. If you’re using thick provisioned VMDKs, this is just as good as allocated storage.

This includes storage used by:

all VMDKs
snapshots
ram
other small files

The difference comes when you have thin VMDKs that are not fully used. For example a 120gb thin VMDK that is only half utilized would report 60gb used and the other 60gb would be reported as uncommitted.
Let’s assume VM is also configured with 12gb of ram.
Total used space = 60 (used by VMDK) + 12 (used by ram) = 72gb

This formula would give us “allocated” or “total a VM could consume” if there are no snapshots.

used_space + uncommitted_space - ram = allocated
72         + 60                - 12  = 120

But adding a snapshot to the mix produces odd results: both used and uncommitted go up, and I don’t fully understand how they are calculated.
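The no-snapshot formula above can be written out as a small JavaScript sketch. The function and parameter names are made up for illustration and are not the component’s attribute names, and it only holds when the VM has no snapshots.

```javascript
// Sketch of the no-snapshot formula from the post above.
// Names are illustrative, not the component's real attributes.
function allocatedGb(usedSpaceGb, uncommittedSpaceGb, ramGb) {
  return usedSpaceGb + uncommittedSpaceGb - ramGb;
}

// Worked example from the post: a 120 GB thin VMDK that is half used,
// on a VM configured with 12 GB of RAM.
const usedGb = 60 + 12; // space used by the VMDK plus space used by RAM
const allocated = allocatedGb(usedGb, 60, 12);

console.log(allocated);
```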

daphatty · September 26, 2019, 3:12am

Having some strange behavior with this latest beta. The logs seem to indicate that the component is able to reach my second esxi server, but the VMs are never parsed and the entities are never created. I have a theory as to why.

My new ESXi server has a VM in an ‘invalid’ state. If you look at the logs below, the scanning process stops when the component tries to enumerate the invalid VM. I’ll try to resolve the validity issue and get back to you with my findings.

2019-09-25 19:47:33 DEBUG (MainThread) [custom_components.esxi_stats] Getting stats for licenses
2019-09-25 19:47:33 DEBUG (MainThread) [custom_components.esxi_stats.esxi] {'name': 'Evaluation Mode', 'status': 'expired', 'product': 'VMware ESX Server', 'expiration_days': 0}
2019-09-25 19:47:33 DEBUG (MainThread) [custom_components.esxi_stats] Getting stats for vm: /vmfs/volumes/5571bb32-b4dd1c74-85d7-b8aeed739b46/grafana/grafana.vmx
2019-09-25 19:47:33 DEBUG (MainThread) [custom_components.esxi_stats.esxi] Logged out - session 5271949a-de9d-271e-51a8-0b43e67dd5f4
2019-09-25 19:47:33 ERROR (MainThread) [custom_components.esxi_stats] 'NoneType' object has no attribute 'committed'
2019-09-25 19:47:33 DEBUG (MainThread) [custom_components.esxi_stats.esxi] Logged out - session 52814787-9761-2488-4c1<redacted>
2019-09-25 20:10:31 WARNING (MainThread) [homeassistant.config_entries] Config entry for esxi_stats not ready yet. Retrying in 80 seconds.

EDIT: Confirmed. Having a VM in an “Invalid” state breaks the component’s scanning process. As soon as I unregistered the Invalid VM, the component successfully scanned the new ESXi instance at the next interval. Is it possible to add logic that can account for and work around the Invalid state?
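To illustrate the kind of workaround being asked for, here is a rough sketch of skipping a VM that reports no storage data instead of aborting the whole scan. The component itself is Python, so this JavaScript only shows the logic. The object shape, the function name and the assumption that an invalid VM has an empty storage summary are all guesses based on the 'NoneType' error in the log.

```javascript
// Illustrative only: the data shape below is an assumption for this sketch,
// not the component's real data model.
const BYTES_PER_GB = 1024 ** 3;

function collectVmStats(vms) {
  const stats = {};
  const skipped = [];
  for (const vm of vms) {
    // Assumption: an invalid VM reports no storage summary at all.
    if (vm.storage == null) {
      skipped.push(vm.name);
      continue; // keep scanning the remaining VMs
    }
    stats[vm.name] = {
      used_space_gb:
        Math.round((vm.storage.committed / BYTES_PER_GB) * 100) / 100,
    };
  }
  return { stats, skipped };
}

// Hypothetical input: one invalid VM and one healthy VM.
const result = collectVmStats([
  { name: 'broken-vm', storage: null },
  { name: 'healthy-vm', storage: { committed: 2 * BYTES_PER_GB } },
]);

console.log(result.skipped, result.stats);
```

Returning the skipped names separately would let the component log a warning for them while still creating entities for the healthy VMs.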

daphatty · September 26, 2019, 3:53am

Another observation - The license entities are a little too similar. Maybe add an attribute to the license sensor that stores the Hostname of its associated ESXi server?

wxt9861 · September 26, 2019, 1:14pm

Yes, I’ll fix those.

In your screenshot, one of the eval licenses was expired - was that reported correctly? Just a thought, maybe make the license sensor a binary sensor. That would make some sense I think. It is either ok or it has a problem.

daphatty · September 26, 2019, 5:15pm

I actually reset the licenses on both of my esxi boxes while troubleshooting the Invalid VM issue, just to make sure they weren’t causing any issues.

daphatty · September 28, 2019, 12:34am

Quick question - Does the ESXi component report values below 1% for CPU usage? I’m seeing weird Lovelace card behavior for under-utilized VMs where, in most cases, the Lovelace card will not render if CPU% is <1. On a rare occasion, the Lovelace card will report a value of RUNNING in lieu of a CPU% (as shown below).

At this point, I’m merely trying to understand what is being passed to home assistant by the component. Next step is to test the CPU% value with a few other graphing style cards to determine if the errant behavior follows the data or is specific to the card itself.

Rare render of a card tied to a VM with <1% CPU usage. In most cases, no card is rendered.

Expected outcome. This particular card (flex-horseshoe) can be configured to render tenths of a percent.

wxt9861 · September 28, 2019, 1:05am

The component calculates cpu % based on other metrics and rounds the value to a whole number. It’ll be a number nonetheless, unless there is a problem with the host reporting stats or the VM is not running; in those cases it’ll say n/a. I can make it round to 1 or 2 decimal places.
Edit: I think, in order to prevent 0, it has to be at least 2 decimal places. I just checked a VM that’s not doing anything and 1 decimal place will still show 0.
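A quick JavaScript sketch of the rounding effect described above. The idle CPU value is hypothetical, and the helper is just for illustration, not how the component does its rounding.

```javascript
// Round a value to a given number of decimal places.
function roundTo(value, places) {
  const factor = 10 ** places;
  return Math.round(value * factor) / factor;
}

// Hypothetical CPU usage of a VM that is barely doing anything.
const idleCpuPercent = 0.04;

const wholeNumber = roundTo(idleCpuPercent, 0); // 0
const oneDecimal = roundTo(idleCpuPercent, 1); // still 0
const twoDecimals = roundTo(idleCpuPercent, 2); // 0.04

console.log(wholeNumber, oneDecimal, twoDecimals);
```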

I am guessing you configured the card to report running or is it doing it on its own?
I also wonder if the lovelace card does not handle 0 correctly?

daphatty · September 28, 2019, 1:13am

I’ve not changed the default behavior of the ESXi component, mainly because I want to test the features as they are initially intended to be used. It is certainly possible that the card doesn’t handle 0 values very well which is why I wanted to test against other cards before reporting an actual issue to anyone. That said, I believe flex horseshoe supports two decimal places and will add that to my list of things to test once I have the time. Thanks for clarifying such a minute detail.

wxt9861 · September 28, 2019, 1:24am

It’s no problem. Now that I think about it, seeing 0 might be a little misleading, too. Is it working? Is it reporting properly? Is the vm dead? Even seeing 0.01 might be more reassuring, I suppose.

The nice thing with the UI options menu is that the default can be set to 0 or 2 decimal places while still giving the user an option to change it to whatever they like.

btw, I’m really liking your cards!

daphatty · September 28, 2019, 3:20am

Thanks. Here’s a WIP of my individual server view. I’m about to install b2 so this is likely going to change a bit.

EDIT: The blank space is actually a card that won’t load.

daphatty · September 28, 2019, 10:06am

Finally had time to do some thorough troubleshooting and I think I found the culprit behind my card loading problem. For whatever reason, if I configure a card to pull the value of a VM’s cpu_use_% attribute, the card either takes a very long time to load (flex-horseshoe-card), doesn’t load at all (<1% CPU usage on flex-horseshoe-card) or pulls the default attribute of the VM instead (bar-card).

However, if cpu_use_% becomes the primary attribute of the entity_id, either as a unique sensor entity or via the Virtual Machine State Attribute Option, the values load correctly and the cards render immediately every time.

Another observation - Changing the Virtual Machine State Attribute requires a reboot of Home Assistant but does not prompt the user informing them of this requirement. Wasn’t sure whether or not this was intentional but figured it was worth mentioning. These test results were obtained against 0.5.0b2.

wxt9861 · October 1, 2019, 12:47pm

Not sure what to attribute this behavior to, but I changed cpu_use_% to be 2 decimal places, so let’s see how that works.

Changing the Virtual Machine State Attribute requires a reboot

Yes, I added this to the documentation. Support for options via GUI was just released in HA recently, so I am not 100% sure whether I’m just doing something wrong or that’s how it works. I’ll see if I can make it work without a reboot in later releases.