Very little actually - it’s more about choosing a model with the right capabilities now.
I look for:
- Tool user < I find that if the summarizer doesn’t understand tools, it hallucinates the heck out of stuff.
- Something with a USABLE context window: MINIMUM 8K, but reasonably one that we can potentially push to 32K (I’m riding 16K on local now, successfully summarizing without livestate or tools data - an ai_task) < How much context we get before the summarizer fails out.
- "Reasoning" (chain-of-thought reasoner, test-time compute capabilities) < Impacts your AI’s ability to chain tools.
So on the frontline that’s currently gpt5.1-mini, and on the back end the current one is oss:20b. If you can run one of the DeepSeek tool users, one of the new Gemmas with tool use, or a Grok past 2.x, mostly it just works. This comes from trying to slice the dune (the grounding), not pushing the ball… Then most models kinda just fall in line if they have the base capabilities, and it comes down to your preferences. That’s why I said above this would work with a custom distilled model with additional training if you REALLY want to go there - but totally not required.
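To make that checklist concrete, here is a minimal sketch of how you could screen candidate models against those three criteria. Everything in it is an assumption for illustration: `ModelSpec`, `meets_baseline`, `rank_candidates`, and the placeholder model names are hypothetical, and the capability values are whatever you measure in your own testing, not numbers from any vendor.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_CONTEXT = 8_000      # hard floor from the list above
TARGET_CONTEXT = 32_000  # the "can we push it this far" goal


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical record of what you observed about a model."""
    name: str
    tool_use: bool        # does it understand tool calls?
    usable_context: int   # tokens that work in practice, not the advertised max
    reasoning: bool       # chain-of-thought / test-time compute


def meets_baseline(spec: ModelSpec) -> bool:
    """True when the model has all three base capabilities."""
    return (
        spec.tool_use
        and spec.reasoning
        and spec.usable_context >= MIN_CONTEXT
    )


def rank_candidates(specs: Iterable[ModelSpec]) -> List[ModelSpec]:
    """Keep models that pass the baseline, best usable context first.

    Context beyond TARGET_CONTEXT earns no extra credit, so past that
    point the choice comes down to preference.
    """
    passing = [s for s in specs if meets_baseline(s)]
    return sorted(
        passing,
        key=lambda s: min(s.usable_context, TARGET_CONTEXT),
        reverse=True,
    )


if __name__ == "__main__":
    # Placeholder entries, not real benchmark data.
    candidates = [
        ModelSpec("local-a", tool_use=True, usable_context=16_000, reasoning=True),
        ModelSpec("local-b", tool_use=False, usable_context=128_000, reasoning=True),
        ModelSpec("cloud-c", tool_use=True, usable_context=64_000, reasoning=True),
    ]
    for spec in rank_candidates(candidates):
        print(spec.name)
```

The point of capping at `TARGET_CONTEXT` is the same as in the post: once the base capabilities are there, extra headroom stops being the deciding factor.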
The speed issue is mostly tabled by presummarizing context - so the frontline has VERY little to actually ingest. Even with a non-streaming, high-end cloud voice like OAI’s voice models, I get responses in 5-12 seconds.
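Roughly, the presummarizing idea looks like the sketch below: the back-end model condenses the raw context off the hot path, and the frontline only ever sees the cached summary plus the user's request. This is an illustration under assumptions, not the actual setup: `SummaryCache`, `refresh`, `frontline_prompt`, and `keep_first_lines` are hypothetical names, and the stand-in summarizer just keeps the first few lines where a real one would call the back-end model.

```python
from typing import Callable


class SummaryCache:
    """Holds the latest pre-built summary for the frontline model."""

    def __init__(self, summarize: Callable[[str], str]) -> None:
        self._summarize = summarize
        self._summary = ""

    def refresh(self, raw_context: str) -> None:
        """Slow step: run on a timer or trigger, never during a voice request."""
        self._summary = self._summarize(raw_context)

    def frontline_prompt(self, user_text: str) -> str:
        """Fast step: build the small prompt the frontline actually ingests."""
        return f"Context summary:\n{self._summary}\n\nUser: {user_text}"


def keep_first_lines(raw_context: str, limit: int = 3) -> str:
    """Stand-in summarizer (assumption): a real one would call the back-end LLM."""
    return "\n".join(raw_context.splitlines()[:limit])


if __name__ == "__main__":
    cache = SummaryCache(keep_first_lines)
    raw = "\n".join(f"sensor_{i}: ok" for i in range(100))
    cache.refresh(raw)                       # background work
    prompt = cache.frontline_prompt("turn on the lights")
    print(len(raw), "chars raw vs", len(prompt), "chars sent to the frontline")
```

The design choice is simply to move the expensive read of the full context out of the request path, so response time depends on the small prompt rather than on how much raw state exists.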
When I turn my attention to pulling voice STT and TTS local as well, I expect a 10x improvement or more on response time. Her bottleneck is currently not cognition or capability - it’s squarely ‘how fast that speech is.’ In time - I’m currently working on nailing the function.
(Which, by the way, I now have confirmation of: two cabinet systems with green health sensors, the test default prompt, real essence capsules, and a prompt lighting up.)
What I’m focusing on, basically, is making the BEST index-using tool user I can for frontline Friday - then putting all the data back in flexible storage with a badass index and giving her a group of SMEs to do her heavy lifts. (And run as much of that offline as possible; when frontline can move - soon - we do that too.)
Oh, and Fez’s Music Assistant scripts and a helper search script give Friday the music library - it IS a party after all… Hmm, I’m due a project update there…


