There is already an OpenAI integration for Home Assistant, and since LocalAI follows the OpenAI spec, it should already be possible to integrate it. The only change required would be for the OpenAI integration to let the user specify a “base URL” in its options, so requests can be pointed somewhere else.
So, I had a quick hack on this last week and got LocalAI to work with Home Assistant.
I basically forked the "Open"AI integration and hacked up the code to add the ability to set a custom API endpoint.
I’ve found developing and testing with Home Assistant to be … painful … to say the least, so there are a bunch of problems with it at present and the main branch is currently broken - but I promise you - it did work for a while last week.
Very, very early days and I have limited time at the moment, but please feel free to submit PRs / fix bugs, or if you think it’s so dreadful it needs a rewrite (probably the case), please do so and I’ll use yours!
Yep! (I also have text-gen-webui running here; I just found LocalAI to be a bit better packaged.)
Once I get some more time to work on it and get it going properly (hopefully this weekend), you’ll be able to provide any API endpoint you want, and as long as it returns an OpenAI-compatible response it should work.
I’m already doing this with a few other apps, just haven’t had time to hack on it any further yet.
Have you tested it out?
I’m about to give it a spin!
EDIT: I can’t seem to get that one working
EDIT2:
I figured out the issue.
Home Assistant does not let you connect to a non-encrypted endpoint if you’re using HTTPS for the Home Assistant instance itself.
I threw NPM (Nginx Proxy Manager) in front of it as a proxy and it works correctly now.
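In case it helps anyone else, here is a rough docker-compose sketch of that kind of setup, with Nginx Proxy Manager terminating TLS and forwarding to LocalAI over plain HTTP on its default port 8080. The image names, ports and paths below are illustrative, so check them against the NPM and LocalAI docs before using them:

version: "3"
services:
  npm:
    image: jc21/nginx-proxy-manager:latest   # reverse proxy that terminates HTTPS
    restart: unless-stopped
    ports:
      - "80:80"     # HTTP, used for redirects / Let's Encrypt challenges
      - "443:443"   # HTTPS, this is what Home Assistant talks to
      - "81:81"     # NPM admin UI
    volumes:
      - ./npm/data:/data
      - ./npm/letsencrypt:/etc/letsencrypt
  localai:
    image: quay.io/go-skynet/local-ai:latest  # assumed image name, check the LocalAI docs
    restart: unless-stopped
    environment:
      - MODELS_PATH=/models                   # where LocalAI looks for model files
    volumes:
      - ./models:/models

In the NPM admin UI you then create a proxy host that points an HTTPS hostname at http://localai:8080, and use that HTTPS URL as the API endpoint in the Home Assistant integration.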
Have you been able to increase the token size when using the custom OpenAI add-on? I’ve tried increasing all the “Truncate the prompt up to this length” settings within the text-generation-webui interface, but I still get an error stating:
Sorry, I had a problem talking to Custom OpenAI compatible server: This model maximum context length is 2048 tokens.
Wait for the model to download. Usually a model with more parameters is better; q4 is usually the sweet spot, but 13B q3 should perform better in terms of “quality” than 7B q4/q5. With Windows running, I can fit thebloke__vicuna-7b-v1.5-ggml__vicuna-7b-v1.5.ggmlv3.q5_k_m.bin (even q8, but there is very little difference in terms of anything) and thebloke__vicuna-13b-v1.5-ggml__vicuna-13b-v1.5.ggmlv3.q3_k_s.bin on my RTX 3060 with 12GB VRAM.
Change the model parameters in thebloke__vicuna-13b-v1.5-ggml__vicuna-13b-v1.5.ggmlv3.q3_k_s.bin.yaml:
Pay attention to the context_size parameter, which lets you customize the allowed context size of your model.
Also pay attention to gpu_layers: the more layers you offload to the GPU, the faster it will go, but the more GPU VRAM will be used.
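As a rough sketch, the relevant part of that YAML looks something like the following. The field names are the ones used by the LocalAI model config format as far as I know, but the values are only examples for a 12GB card, so tune them to your hardware:

name: vicuna-13b
parameters:
  model: thebloke__vicuna-13b-v1.5-ggml__vicuna-13b-v1.5.ggmlv3.q3_k_s.bin
  temperature: 0.7
context_size: 4096   # allowed context length for this model; raise it if you hit the 2048-token error
gpu_layers: 35       # layers offloaded to the GPU: more is faster, but uses more VRAM
f16: true            # use 16-bit floats where the backend supports it
template:
  chat: vicuna-chat  # points at a vicuna-chat.tmpl file next to the model

The name field is what the API (and therefore the Home Assistant integration) uses as the model name.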
This smart home is controlled by Home Assistant.
An overview of the areas and the devices in this smart home:
{%- for area in areas() %}
{%- set area_info = namespace(printed=false) %}
{%- for device in area_devices(area) -%}
{%- if not device_attr(device, "disabled_by") and not device_attr(device, "entry_type") and device_attr(device, "name") %}
{%- if not area_info.printed %}
{{ area_name(area) }}:
{%- set area_info.printed = true %}
{%- endif %}
{%- for entity in device_entities(device) %}
{%- if not is_state(entity,'unavailable') and not is_state(entity,'unknown') and not is_hidden_entity(entity) %}
- {{ state_attr(entity, 'friendly_name') }} is {{ states(entity) }}
{%- endif %}
{%- endfor %}
{%- endif %}
{%- endfor %}
{%- endfor %}
Answer the user's questions about the world truthfully.
The next step would be configuring functions so the LLM can actually call your Home Assistant services and control your home.
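The general shape of that would be an OpenAI-style function definition that the conversation agent exposes to the model. Purely as an illustration (the function name and parameters below are made up, not taken from any existing integration), it could look like this:

- name: call_ha_service               # hypothetical function name
  description: Call a Home Assistant service, e.g. to turn a light on or off.
  parameters:                         # JSON Schema, as the OpenAI function-calling API expects
    type: object
    properties:
      domain:
        type: string
        description: Service domain, e.g. "light" or "switch"
      service:
        type: string
        description: Service to call, e.g. "turn_on" or "turn_off"
      entity_id:
        type: string
        description: Target entity, e.g. "light.kitchen"
    required:
      - domain
      - service
      - entity_id

The integration would then take the function call the model returns and map it onto an actual service call (hass.services.async_call) on the Home Assistant side.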
So I’m using your model, selected llama.cpp as the loader, and increased n_ctx to 6144, then loaded the model, but I’m still receiving the same error in Home Assistant.
But maybe it should be something more generic than LocalAI, so you could use privateGPT or anything else.
Like maybe just use the already implemented Wyoming protocol integration.
So I’ve got this set up using LocalAI now and am no longer getting the error about context length.
I’ve set my vicuna-chat.tmpl to this:
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
USER: {prompt}
ASSISTANT:
And I’ve set the template within home assistant to this:
This smart home is controlled by Home Assistant.
An overview of the areas and the devices in this smart home:
{%- for area in areas() %}
{%- set area_info = namespace(printed=false) %}
{%- for device in area_devices(area) -%}
{%- if not device_attr(device, "disabled_by") and not device_attr(device, "entry_type") and device_attr(device, "name") %}
{%- if not area_info.printed %}
{{ area_name(area) }}:
{%- set area_info.printed = true %}
{%- endif %}
{%- for entity in device_entities(device) %}
{%- if not is_state(entity,'unavailable') and not is_state(entity,'unknown') and not is_hidden_entity(entity) %}
- {{ state_attr(entity, 'friendly_name') }} is {{ states(entity) }}
{%- endif %}
{%- endfor %}
{%- endif %}
{%- endfor %}
{%- endfor %}
Answer the user's questions about the world truthfully.
but I’m receiving this response whenever I try to interact with the LLM within Home Assistant: