You can get offline versions of LLMs.
And gpt-oss is an open-weight model from OpenAI that you can run offline.
Indeed https://huggingface.co/openai-community
First thing that came to mind: GPT4All
I’ve been toying with Qwen3.
On my Steam Deck.
The 8-billion-parameter model runs stably.
It’s open source too!
Alpaca is a neat little flatpak that containerizes everything and makes running local models so easy that I can literally do it without a mouse or keyboard.
Oh my god I feel so stupid. I’ve been going back and forth on whether it was worth de-atomizing my Steam Deck to spin up Alpaca in Docker. I forgot they have a Flatpak.
Bazzite also has podman, though not specifically docker, in the core OS.
So… I have spun up one local LLM in Alpaca, told it what hardware, OS, and environment it is running on, told it to generate a context prompt to inform itself of all that… and it’s now helping me figure out how/if it is possible to set up a podman container/environment… for LLMs that either Alpaca does not yet support, or I am too stupid to figure out.
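For anyone who wants to skip the "ask the model to write its own context prompt" step, here is a rough Python sketch of building one by hand. The function name `build_context_prompt` and the wording of the prompt are made up for illustration; this is not something Alpaca provides, just a plain string you could paste in as a system prompt.

```python
import platform


def build_context_prompt(device: str, extra_notes: list[str]) -> str:
    """Assemble a plain-text context prompt describing the host machine.

    `device` and `extra_notes` are whatever you want the model to know;
    the OS and CPU details are read from the standard library.
    """
    lines = [
        "You are running locally on the following machine.",
        f"Device: {device}",
        f"OS: {platform.system()} {platform.release()}",
        f"CPU architecture: {platform.machine()}",
    ]
    # One line per extra fact (distro, container runtime, install method, ...).
    lines += [f"Note: {note}" for note in extra_notes]
    lines.append("Take these constraints into account in every answer.")
    return "\n".join(lines)


prompt = build_context_prompt(
    "Steam Deck",
    [
        "Bazzite (atomic, image-based OS)",
        "Podman is available, Docker is not installed",
        "Alpaca is installed as a Flatpak",
    ],
)
print(prompt)
```

The notes are free text on purpose, since the useful details (atomic OS, podman instead of Docker) are exactly the ones the standard library cannot detect.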
Alpaca even has tools. You can give an LLM the ability to search the web for something and find some info or whatnot.
ROCm on a Deck seems to kind of sort of work via … basically you spoof your GPU ID in the podman environment, and then… you would either have to do the ole allocate-more-RAM-to-the-GPU thing, or attempt to edit the LLM’s config and such, to try and run in a much-lower-than-expected VRAM situation.
(WIP)
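A rough sketch of what the GPU ID spoofing could look like, written as a Python helper that only builds the podman command line and does not run it. The assumptions here are mine, not tested on a Deck: the image `docker.io/ollama/ollama:rocm`, the override value `10.3.0`, and the helper name `rocm_podman_command` are all placeholders. `HSA_OVERRIDE_GFX_VERSION` is the ROCm environment variable usually used for this kind of spoofing, and `/dev/kfd` plus `/dev/dri` are the devices ROCm containers normally need.

```python
import shlex


def rocm_podman_command(image: str, gfx_override: str, name: str = "llm-rocm") -> list[str]:
    """Build (but do not execute) a podman invocation for a ROCm container.

    gfx_override is passed to ROCm through HSA_OVERRIDE_GFX_VERSION so the
    runtime treats the GPU as a supported target. The right value for a
    given GPU is an assumption you need to verify yourself.
    """
    return [
        "podman", "run", "--rm", "-it",
        "--name", name,
        "--device", "/dev/kfd",   # ROCm compute interface
        "--device", "/dev/dri",   # GPU render nodes
        "--env", f"HSA_OVERRIDE_GFX_VERSION={gfx_override}",
        image,
    ]


# Hypothetical values: swap in whatever image and override you actually need.
cmd = rocm_podman_command("docker.io/ollama/ollama:rocm", "10.3.0")
print(shlex.join(cmd))
```

Printing the command instead of running it makes it easy to eyeball and tweak first. On an SELinux system you may also need extra flags for device access, which this sketch leaves out.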
Presumably you could tell it to do a lot of things, but that seems like a bad idea lol. Anyway, yeah, I was able to just tell it ‘go online and look up Bazzite, familiarize yourself with pertinent details, reformulate context prompt.’
I mean, most people have a local LLM in their pocket right now.
Unless I am missing something:
Most people do not have a local LLM in their pocket right now.
Most people have a client app that talks to a remote LLM, which ‘lives’ in an ecologically and economically dubious mega-datacenter, in their pocket right now.
Plenty of the AI functions on phones are on-device. I know the iPhone can do several text-based processing tasks (summarizing, translating) offline, and Apple has an API for third-party developers to use on-device models. And the Pixels have Gemini Nano on-device for certain offline functions.
My phone does speech-to-text flawlessly offline. It’s a crazy useful little LLM tool.
Oh!
Well, I didn’t know that.
I’m too poor to be able to afford such fancy phones.
Gemini Nano, Apple Intelligence on-device, etc.
https://ollama.org/