I’m trying to use Gemini AI to send a notification to my phone if a person is detected by my video doorbell. I’ve got the automation sending images and descriptions to my phone, but I can’t work out how to ONLY send a notification if the AI description (in image_description.response_text) does NOT include the phrase “no people are visible”.
Can anyone tell me how I would need to modify the code below to add this condition?
alias: send_camera_snapshot
sequence:
  - variables:
      file_time: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
  - data:
      filename: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    target:
      entity_id: camera.{{ camera_id }}
    action: camera.snapshot
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      use_memory: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JQM1Y8JRXFCDFA9X5FHTVEZC
      message: >-
        Describe the image in a single sentence. If you see a person, describe
        them. If you see multiple people, give a count of the number of people
        and describe them. Try to determine if they are arriving or leaving and
        state which if you can.
      image_file: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    response_variable: image_description
  - data:
      message: "{{ camera_id }} alarm - {{image_description.response_text}}"
      title: Alarm triggered
      data:
        image: /local/{{ camera_id }}_snapshot_{{ file_time }}.jpg
        channel: Motion
        ttl: 0
        priority: high
        importance: high
        visibility: public
        actions:
          - action: URI
            title: Open Image
            uri: >-
              https://mydomain/local/{{ camera_id
              }}_snapshot_{{file_time }}.jpg
    action: notify.mobile_app_ian_galaxy_s24_ultra
mode: single
Thanks. I’ve tried to incorporate this but am struggling with the syntax. I am getting this error:
Message malformed: Unable to determine action @ data['sequence'][3]
This is the code:
alias: send_camera_snapshot
sequence:
  - variables:
      file_time: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
  - data:
      filename: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    target:
      entity_id: camera.{{ camera_id }}
    action: camera.snapshot
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      use_memory: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JQM1Y8JRXFCDFA9X5FHTVEZC
      message: >-
        Describe the image in a single sentence. If you see a person, describe
        them. If you see multiple people, give a count of the number of people
        and describe them. Try to determine if they are arriving or leaving and
        state which if you can.
      image_file: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    response_variable: image_description
  - data:
      message: "{{ camera_id }} alarm - {{image_description.response_text}}"
      title: Alarm triggered
      data:
        image: /local/{{ camera_id }}_snapshot_{{ file_time }}.jpg
        channel: Motion
        ttl: 0
        priority: high
        importance: high
        visibility: public
        actions:
          - action: URI
            title: Open Image
            uri: >-
              https://mydomain/local/{{ camera_id
              }}_snapshot_{{file_time }}.jpg
  - choose:
      - conditions:
          - condition: template
            value_template: {{ not "no people are visible" in image_description }}
        sequence:
          - action: notify.mobile_app_ian_galaxy_s24_ultra
mode: single
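The error points at `sequence[3]`, which (counting from zero) is the `- data:` step holding the notification payload. With `action: notify.mobile_app_ian_galaxy_s24_ultra` moved into the `choose`, that step no longer names an action, so it cannot be validated. Below is a minimal sketch of one way to restructure it, assuming the payload should sit with the notify action inside the `choose` branch. The payload is abbreviated, and the template is quoted so YAML reads it as a string:

```yaml
# Sketch only: replaces the separate "- data:" and "- choose:" steps.
- choose:
    - conditions:
        - condition: template
          # Quoted so YAML treats the template as a string.
          value_template: "{{ 'no people are visible' not in image_description.response_text }}"
      sequence:
        # The action and its data belong to the same step.
        - action: notify.mobile_app_ian_galaxy_s24_ultra
          data:
            message: "{{ camera_id }} alarm - {{ image_description.response_text }}"
            title: Alarm triggered
```

A plain `- condition: template` step placed before the notify step would do the same job with less nesting, since a failed condition stops the rest of the sequence.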
OK, I’ve rearranged it as below, but I still get this error:
Message malformed: Unable to determine action @ data['sequence'][3]
alias: send_camera_snapshot
sequence:
  - variables:
      file_time: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
  - data:
      filename: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    target:
      entity_id: camera.{{ camera_id }}
    action: camera.snapshot
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      use_memory: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JQM1Y8JRXFCDFA9X5FHTVEZC
      message: >-
        Describe the image in a single sentence. If you see a person, describe
        them. If you see multiple people, give a count of the number of people
        and describe them. Try to determine if they are arriving or leaving and
        state which if you can.
      image_file: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    response_variable: image_description
  - data:
      message: "{{ camera_id }} alarm - {{image_description.response_text}}"
      title: Alarm triggered
      data:
        image: /local/{{ camera_id }}_snapshot_{{ file_time }}.jpg
        channel: Motion
        ttl: 0
        priority: high
        importance: high
        visibility: public
        actions:
          - action: URI
            title: Open Image
            uri: >-
              https://mydomain/local/{{ camera_id
              }}_snapshot_{{file_time }}.jpg
  - condition: template
    value_template: {{ not "no people are visible" in image_description }}
  - action: notify.mobile_app_ian_galaxy_s24_ultra
mode: single
alias: send_camera_snapshot
sequence:
  - variables:
      file_time: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
  - data:
      filename: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    target:
      entity_id: camera.{{ camera_id }}
    action: camera.snapshot
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      use_memory: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JQM1Y8JRXFCDFA9X5FHTVEZC
      message: >-
        Describe the image in a single sentence. If you see a person, describe
        them. If you see multiple people, give a count of the number of people
        and describe them. Try to determine if they are arriving or leaving and
        state which if you can.
      image_file: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    response_variable: image_description
  - condition: template
    value_template: "{{ not 'no people are visible' in image_description }}"
  - action: notify.mobile_app_ian_galaxy_s24_ultra
    data:
      message: "{{ camera_id }} alarm - {{image_description.response_text}}"
      title: Alarm triggered
      data:
        image: /local/{{ camera_id }}_snapshot_{{ file_time }}.jpg
        channel: Motion
        ttl: 0
        priority: high
        importance: high
        visibility: public
        actions:
          - action: URI
            title: Open Image
            uri: >-
              https://mydomain/local/{{ camera_id
              }}_snapshot_{{file_time }}.jpg
mode: single
Message malformed: template value should be a string for dictionary value @ data['sequence'][3]['value_template']
This is the code:
alias: send_camera_snapshot
sequence:
  - variables:
      file_time: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
  - data:
      filename: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    target:
      entity_id: camera.{{ camera_id }}
    action: camera.snapshot
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      use_memory: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JQM1Y8JRXFCDFA9X5FHTVEZC
      message: >-
        Describe the image in a single sentence. If you see a person, describe
        them. If you see multiple people, give a count of the number of people
        and describe them. Try to determine if they are arriving or leaving and
        state which if you can.
      image_file: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    response_variable: image_description
  - condition: template
    value_template: {{ not "no people are visible" in image_description }}
  - action: notify.mobile_app_ian_galaxy_s24_ultra
    data:
      message: "{{ camera_id }} alarm - {{image_description.response_text}}"
      title: Alarm triggered
      data:
        image: /local/{{ camera_id }}_snapshot_{{ file_time }}.jpg
        channel: Motion
        ttl: 0
        priority: high
        importance: high
        visibility: public
        actions:
          - action: URI
            title: Open Image
            uri: >-
              https://mydomain/local/{{ camera_id
              }}_snapshot_{{file_time }}.jpg
mode: single
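This error usually means YAML did not read the template as a string. An unquoted value that starts with `{` is parsed as a mapping, so `{{ ... }}` arrives as a dictionary rather than text. Below is a sketch of two equivalent ways to keep it a string, with the expression itself left exactly as posted. The two steps are alternatives, not meant to be used together:

```yaml
# Option 1: wrap the template in double quotes and use single quotes inside.
- condition: template
  value_template: "{{ not 'no people are visible' in image_description }}"

# Option 2: use a block scalar, which needs no outer quotes.
- condition: template
  value_template: >-
    {{ not "no people are visible" in image_description }}
```

Either form should get past this particular validation message. Whether the expression then matches the text is a separate question.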
Thanks for the update. It passed the validation, but the notifications are still being sent even when image_description contains the text “no people are visible”.
Any idea why that might be?
alias: send_camera_snapshot
sequence:
  - variables:
      file_time: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
  - data:
      filename: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    target:
      entity_id: camera.{{ camera_id }}
    action: camera.snapshot
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      use_memory: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JQM1Y8JRXFCDFA9X5FHTVEZC
      message: >-
        Describe the image in a single sentence. If you see a person, describe
        them. If you see multiple people, give a count of the number of people
        and describe them. Try to determine if they are arriving or leaving and
        state which if you can.
      image_file: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    response_variable: image_description
  - condition: template
    value_template: "{{ not 'no people' in image_description }}"
  - action: notify.mobile_app_ian_galaxy_s24_ultra
    data:
      message: "{{ camera_id }} alarm - {{image_description.response_text}}"
      title: Alarm triggered
      data:
        image: /local/{{ camera_id }}_snapshot_{{ file_time }}.jpg
        channel: Motion
        ttl: 0
        priority: high
        importance: high
        visibility: public
        actions:
          - action: URI
            title: Open Image
            uri: >-
              https://mydomain/local/{{ camera_id
              }}_snapshot_{{file_time }}.jpg
mode: single
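A likely reason, offered as a guess from the YAML alone: `image_description` is the whole response dictionary, and in a Jinja template `in` applied to a dictionary checks its keys, not the text stored inside it. The phrase is never a key, so the test is always true and the notification always goes out. Below is a sketch that tests the field the notification message already reads, `image_description.response_text`, lower-cased so that case does not matter:

```yaml
- condition: template
  value_template: >-
    {{ 'no people' not in (image_description.response_text | default('') | lower) }}
```

With the `default('')` fallback, a missing `response_text` lets the notification through rather than raising a template error. That seems the safer failure for a doorbell, but it is a design choice.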
2025-04-01 12:33:11.702 INFO (MainThread) [custom_components.llmvision.providers] Response data: {'candidates': [{'content': {'parts': [{'text': "Here is a description of the image: Eye-level view of a residential street scene captured by a Reolink video doorbell, showing a row of brick houses on a sunny day; several cars are parked along the street, and a dark-colored Audi is parked near the camera's location. There are no people visible in the image."}], 'role': 'model'}
Message malformed: invalid template (TemplateSyntaxError: expected token 'end of print statement', got 'content') for dictionary value @ data['sequence'][3]['value_template']
alias: send_camera_snapshot
sequence:
  - variables:
      file_time: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
  - data:
      filename: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    target:
      entity_id: camera.{{ camera_id }}
    action: camera.snapshot
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      use_memory: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JQM1Y8JRXFCDFA9X5FHTVEZC
      message: >-
        Describe the image in a single sentence. If you see a person, describe
        them. If you see multiple people, give a count of the number of people
        and describe them. Try to determine if they are arriving or leaving and
        state which if you can.
      image_file: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    response_variable: image_description
  - condition: template
    value_template: "{{ not 'no people' in image_description.candidates[0]content.parts[0]text }}"
  - action: notify.mobile_app_ian_galaxy_s24_ultra
    data:
      message: "{{ camera_id }} alarm - {{image_description.response_text}}"
      title: Alarm triggered
      data:
        image: /local/{{ camera_id }}_snapshot_{{ file_time }}.jpg
        channel: Motion
        ttl: 0
        priority: high
        importance: high
        visibility: public
        actions:
          - action: URI
            title: Open Image
            uri: >-
              https://mydomain/local/{{ camera_id
              }}_snapshot_{{file_time }}.jpg
mode: single
Thanks. That passed validation but I get this error in the log when it’s executed:
2025-04-01 14:06:41.122 INFO (MainThread) [custom_components.llmvision.providers] Response data: {'candidates': [{'content': {'parts': [{'text': 'Here is a description of the image: A wide shot of a residential street on a sunny day, captured by a Reolink Video Doorbell WiFi, shows a row of brick houses, several parked cars, and a person in dark clothing walking away from the camera.'}], 'role': 'model'}, 'finishReason': 'STOP', 'avgLogprobs': -0.1503448486328125}], 'usageMetadata': {'promptTokenCount': 314, 'candidatesTokenCount': 54, 'totalTokenCount': 368, 'promptTokensDetails': [{'modality': 'IMAGE', 'tokenCount': 258}, {'modality': 'TEXT', 'tokenCount': 56}], 'candidatesTokensDetails': [{'modality': 'TEXT', 'tokenCount': 54}]}, 'modelVersion': 'gemini-1.5-flash-latest'}
2025-04-01 14:05:36.608 WARNING (MainThread) [homeassistant.helpers.script] Error in 'condition' evaluation:
In 'template' condition: UndefinedError: 'dict object' has no attribute 'candidates'
alias: send_camera_snapshot
sequence:
  - variables:
      file_time: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
  - data:
      filename: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    target:
      entity_id: camera.{{ camera_id }}
    action: camera.snapshot
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      use_memory: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JQM1Y8JRXFCDFA9X5FHTVEZC
      message: >-
        Describe the image in a single sentence. If you see a person, describe
        them. If you see multiple people, give a count of the number of people
        and describe them. Try to determine if they are arriving or leaving and
        state which if you can.
      image_file: /config/www/{{ camera_id }}_snapshot_{{ file_time }}.jpg
    response_variable: image_description
  - condition: template
    value_template: >-
      {{ not 'no people' in
      image_description.candidates[0].content.parts[0].text }}
  - action: notify.mobile_app_ian_galaxy_s24_ultra
    data:
      message: "{{ camera_id }} alarm - {{image_description.response_text}}"
      title: Alarm triggered
      data:
        image: /local/{{ camera_id }}_snapshot_{{ file_time }}.jpg
        channel: Motion
        ttl: 0
        priority: high
        importance: high
        visibility: public
        actions:
          - action: URI
            title: Open Image
            uri: >-
              https://mydomain/local/{{ camera_id
              }}_snapshot_{{file_time }}.jpg
mode: single
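The `candidates[0]...` structure in the log line is what the integration logs from the provider. The `UndefinedError` suggests the response variable handed back to the script is shaped differently. One way to check is a temporary step that writes the variable to the Home Assistant log with the built-in `system_log.write` action, placed just before the condition. This is only a sketch, and the `logger` name below is made up for illustration:

```yaml
# Temporary debugging step: place it directly after llmvision.image_analyzer.
- action: system_log.write
  data:
    level: warning
    # Hypothetical logger name, chosen only to make the entry easy to find.
    logger: script.send_camera_snapshot
    message: "image_description = {{ image_description }}"
```

Whatever keys appear in that log entry are the ones the condition can reference. The notification message in this script already reads `image_description.response_text`, so that field is the first one worth testing against.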