need help understanding if this setup is even feasible.
The LLM “engine” is mostly detached from the UI.
kobold.cpp is actually pretty great, and its UI still works with TabbyAPI (the server you run for exllama) and with the llama.cpp server.
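To make the engine/UI split concrete, here's a minimal sketch (mine, not from the original post) showing that one OpenAI-compatible client can talk to all three backends interchangeably. The ports are the defaults as I recall them, and TabbyAPI typically also expects an API key; adjust both to your setup:

```python
# Minimal sketch, assuming default ports: kobold.cpp, TabbyAPI, and the
# llama.cpp server all expose an OpenAI-compatible /v1/completions route,
# so the same client code works against whichever engine is running.
import requests

BACKENDS = {
    "koboldcpp": "http://localhost:5001/v1",  # kobold.cpp default port
    "tabbyapi": "http://localhost:5000/v1",   # TabbyAPI (exllama) default port
    "llamacpp": "http://localhost:8080/v1",   # llama.cpp server default port
}

def complete(base_url: str, prompt: str, api_key: str = "") -> str:
    """Send a plain text-completion request to an OpenAI-compatible endpoint."""
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    resp = requests.post(
        f"{base_url}/completions",
        json={"prompt": prompt, "max_tokens": 128, "temperature": 0.7},
        headers=headers,  # TabbyAPI usually wants a key; the others may not
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]

print(complete(BACKENDS["llamacpp"], "The quick brown fox"))
```

Because the API surface is the same, swapping engines under a UI like mikupad or Open WebUI is just a matter of changing the base URL.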
I personally love this for writing and testing though:
https://github.com/lmg-anon/mikupad
And Open WebUI for more general usage.
There’s a big backlog of poorly documented knowledge too, heh; just ask if you’re wondering how to cram a specific model in. But the gist of the optimal engine rules is: