TIL: There is an open source "Alexa replacement" project

cm0002@libretechni.ca · 6 hours ago

brucethemoose@lemmy.world · edit-2 4 hours ago

I mean, there are many. TTS and self-hosted automation are huge in the local LLM scene.

We even have open-source “omni” models now that can ingest and output speech tokens directly. That means they pick up more semantic understanding from tone and such, they ‘choose’ the tone to reply with, and their output is streamable word-by-word. They support all sorts of tool calling.

…But they aren’t easy to run. It’s still in the realm of homelabs with at least an RTX 3060 + hacky Python projects.

If you’re mad, you can self-host LongCat Omni:

https://huggingface.co/meituan-longcat/LongCat-Flash-Omni

And blow Alexa out of the water with an MIT-licensed model from, I kid you not, a Chinese food delivery company.

EDIT

For the curious, see:

Audio-text-to-text (and sometimes TTS): https://huggingface.co/models?pipeline_tag=audio-text-to-text&num_parameters=min%3A6B&sort=modified

TTS: https://huggingface.co/models?pipeline_tag=text-to-speech&num_parameters=min%3A6B&sort=modified

“Anything-to-anything,” generally image/video/audio/text -> text/speech: https://huggingface.co/models?pipeline_tag=any-to-any&num_parameters=min%3A6B&sort=modified

The filters are set to models bigger than 6B parameters to exclude toy/test models.
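
If you want to tweak those searches (different size floor, different task), here is a minimal sketch that rebuilds the same URLs. The query parameter names (`pipeline_tag`, `num_parameters`, `sort`) are taken straight from the links above; the helper name `hub_search_url` is made up for illustration, and this only builds a browser URL, it does not call any Hugging Face API.

```python
from urllib.parse import urlencode

BASE = "https://huggingface.co/models"


def hub_search_url(pipeline_tag, min_params="6B", sort="modified"):
    """Build a Hugging Face model-search URL like the ones linked above.

    Hypothetical helper: the parameter names mirror the query strings
    in the thread, nothing more.
    """
    query = urlencode({
        "pipeline_tag": pipeline_tag,
        "num_parameters": f"min:{min_params}",  # ':' gets percent-encoded as %3A
        "sort": sort,
    })
    return f"{BASE}?{query}"


# The three task filters mentioned in the thread.
for tag in ("audio-text-to-text", "text-to-speech", "any-to-any"):
    print(hub_search_url(tag))
```

Changing `min_params` is the obvious knob if your hardware can't handle the bigger models.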

fonix232@fedia.io · 3 hours ago

I do wish there was a smaller LongCat model available. My current AI node has a hard 16GB VRAM limit (yay AMD UMA limitations), so 27B can’t really fit. An 8B dynamically loaded model would fit, and run much better.
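
For a rough sense of the numbers, here is a back-of-the-envelope sketch of the weight footprint at common quantization widths. It assumes the 27B and 8B figures are plain weight counts, uses 1 GB = 1e9 bytes, and counts weights only (no KV cache, activations, or runtime overhead), so real usage will be higher. The function name `weight_gb` is just for illustration.

```python
def weight_gb(params_billions, bits_per_weight):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width.
    """
    return params_billions * bits_per_weight / 8


VRAM_LIMIT_GB = 16  # the hard limit mentioned above

for params in (27, 8):
    for bits in (16, 8, 4):
        size = weight_gb(params, bits)
        # Weights-only check; cache and overhead are not included.
        verdict = "fits" if size < VRAM_LIMIT_GB else "does not fit"
        print(f"{params}B @ {bits}-bit: ~{size:.1f} GB of weights, "
              f"{verdict} in {VRAM_LIMIT_GB} GB")
```

Under those assumptions, 27B at 4-bit is about 13.5 GB of weights, which technically squeezes under 16 GB but leaves very little room for anything else, while 8B at 4-bit is about 4 GB.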
