I’ve read about this method in the GitHub issues, but to me it seemed impractical to have different models just to change the context size, and that was the point where I started looking for alternatives.
It was multiple models, mainly 32-70B
There are many projects out there optimizing the speed significantly. Ollama is unbeaten in convenience, though.
Yeah, but there are many open issues on GitHub related to these settings not working right. I’m using the API, and just couldn’t get it to work. I used a request to generate a JSON file, and it never generated one longer than about 500 lines. With the same model on vLLM, it worked instantly and generated about 2000 lines.
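For anyone following along, here is a minimal sketch of what such a request could look like. It assumes the settings in question are the per-request `options` fields `num_ctx` (context window) and `num_predict` (maximum output tokens) on Ollama's `/api/generate` endpoint. The model name, prompt, and numeric values are placeholders, not what was actually used.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, num_ctx=8192, num_predict=4096):
    """Build a non-streaming /api/generate payload with per-request options.

    num_ctx and num_predict are illustrative values (assumptions), not
    recommendations for any particular model.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,      # return one JSON object instead of a stream
        "format": "json",     # ask the model to emit valid JSON
        "options": {
            "num_ctx": num_ctx,          # context window in tokens
            "num_predict": num_predict,  # upper bound on generated tokens
        },
    }


def generate(payload, url=OLLAMA_URL):
    """Send the payload to a running Ollama server and return the text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Hypothetical model name and prompt, for illustration only.
payload = build_request("some-model:32b", "Generate the JSON file.")
```

Calling `generate(payload)` needs a running Ollama server. If the output still stops early with options like these, that would match the behaviour described in the open issues.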
Nothing beats a salami sandwich, after all
Yo I think we Path of Exile gamers made it pretty clear he is not one of us
Take a look at NVIDIA Project Digits. It’s supposed to release in May for 3k USD and will be kind of the only sensible way to host LLMs then:
How is Apple pretty bad?
I discovered it just a few days ago and now use it on all my machines
For anyone trying this, make sure you do not have “- TS_USERSPACE=false” in your yaml from previous experimentation. After removing this, it works for me too.
In the documentation they say to add sysctl entries, which is possible in Docker Compose like so:
tailscale:
  sysctls:
    - net.ipv4.ip_forward=1
    - net.ipv6.conf.all.forwarding=1
But it does not seem to make a difference for me. Does anyone know why these would not be required in this specific setup?
How could you solve the problem of storage expansion? I assume there exists some kind of Thunderbolt JBOD thing or similar
Yeah show me a phone with 48GB RAM. It’s a big factor to consider. Actually, some people are recommending a Mac Studio cause you can get it with 128GB RAM and more and it’s shared with the AI/GPU accelerator. Very energy efficient, but sucks as soon as you want to do literally anything other than inference
I’m used to this from the whole “build your own gaming PC/NAS” rabbit hole. Now it’s just some extra GPUs and I might be able to have a two-in-one build (which will of course offset any costs for more 3090s /s)
I’d be interested (and surprised) too
Yeah it’s a pretty cool project and I’ll definitely use it. However nothing can beat a straight connection from monitor to gpu, so I’ll probably use passthrough for the gpu when gaming
No :(
I have a separate gaming PC and am considering just using that hardware for my NAS and creating a VM for gaming
Really don’t miss Reddit
You can only resign from being part of the church, which many young people do once they see this on their first paycheck.
I wanted to set this up for a while now. Guess it’s time