• 1 Post
  • 133 Comments
Joined 2 years ago
Cake day: December 31st, 2023

  • Oh definitely. I think it’s Anthropic who have stated in multiple interviews that they break even on most of their models, it’s just that they keep spending exponentially more to train the next model. They and OpenAI seem to be stuck in an arms race where switching to purely serving existing models to their existing clients just won’t work. I do wonder how accurate that assessment is on their part.




  • GC enables webpage bloat, in the sense that these bloated designs would be unfeasible to code with manual memory management. I’m not saying they are caused by GC, but that now extra discipline is needed to resist taking the “easy path”. This is the point I’m trying to make with regard to making LLMs code for us; they’ve added incentive to be sloppy because the “black box” result is the same, only more trivially obtained. I’m worried about the knock-on effects because I feel like I’ve seen this cycle happen numerous times. And for some reason some places going “all-in on AI” are now either backing off from that approach or shipping buggier software. If you’re not getting worse code from using LLMs, great. Good for you. Having tried again and again to work with these tools myself, I don’t see how to overall gain any actual effectiveness with/from them - shuffle around the effort, sure, but trying to arrive at the same place as without them, only faster and/or with less effort? I just don’t see it happen in my attempts. Invariably I come out feeling like I’ve been overpromised and simultaneously lost time trying to wrangle hard truths and intentional code out of something designed for the exact opposite. Or that I’ve burnt what used to be my hourly salary in data center costs to save me a few minutes of doldrums.

    It’s funny, I get the impression that you’re doing the exact same thing just with the opposite conclusion to mine. I can’t tell if we just have different priorities when it comes to programming, or some other fundamental miscomprehension of what the other is writing. If there is a conclusion I’m already at and guilty of retrofitting into this conversation, it’s that we are collectively, as a species, taking yet another step towards ballooning our energy consumption out of greed and laziness and I would at least like to be certain it’s partly enabling meaningful progress towards emancipation of the common person, not further proprietary capture of the tools of labor. This is too close to “factory farming so that everyone can eat (dubiously nutritious) pork chops every day for cheap without doing any farm work themselves” for me to just focus on individual luxury or productivity. I don’t understand how the externalities make up for less manual writing of boilerplate, especially when you need to make the thing double-check its boilerplate because it can’t reliably one-shot it.

    I want to write more but I’m not certain how relevant it would be to the current discussion, so I’ll just wait to see if you’re still interested in continuing this exchange.


  • I want to agree, but for example GC has enabled webpages that take 3 gigs of RAM to do the same tasks we could do with 200 megs fifteen years ago. We don’t automatically build more interesting things once the gritty details and boilerplate are automated, and this stochastic automation gives even more room for “bad practices” to creep in and rob us of the gains it is supposed to bring.


  • Sorry, I misspoke (miswrote?). I meant growing the code through a genetic-algorithm-like process. Though, fundamentally, I don’t think there’s that much difference between applying a selection process on randomized bytes and having an LLM churn on a codebase.
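
    To make “a selection process on randomized bytes” concrete, here is a minimal sketch of what such a loop could look like. Everything in it is an assumption made for illustration: the target string, the fitness function, the population size and the function names are all hypothetical, and a real program-growing setup would need a far richer fitness signal than byte matching.

```python
import random

# Hypothetical goal: evolve random bytes until they spell this target.
TARGET = b"hello"


def fitness(candidate: bytes) -> int:
    """Count the positions where the candidate already matches the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(candidate: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + bytes([rng.randrange(256)]) + candidate[i + 1:]


def evolve(seed: int = 0, population_size: int = 50,
           max_generations: int = 20000) -> tuple[bytes, int]:
    """Return the best candidate found and the generation it was found in."""
    rng = random.Random(seed)
    population = [
        bytes(rng.randrange(256) for _ in TARGET)
        for _ in range(population_size)
    ]
    for generation in range(max_generations):
        # Selection: rank by fitness and keep the better half unchanged.
        population.sort(key=fitness, reverse=True)
        if population[0] == TARGET:
            return population[0], generation
        survivors = population[: population_size // 2]
        # Variation: refill the population with mutated copies of survivors.
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    return population[0], max_generations


if __name__ == "__main__":
    best, generations = evolve()
    print(best, generations)
```

    The point of the sketch is only the shape of the loop: generate, score, keep the best, perturb, repeat. Most candidates are discarded, which is where the inefficiency argument comes from.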

    I feel like you’re only considering the time it takes to reach a particular solution when considering what is inefficient - in which case I would agree it’s probably a wash. However, I don’t think an LLM is less energy-hungry than my own body, and I learn by doing, effectively reducing the cost of future coding iterations. I guess if I could run the LLM and surrounding hardware entirely off of solar power I wouldn’t mind nearly as much - though there’s still that part of banging my head against a problem that I believe is crucial for my own growth. I think that, over time and problems/projects, this compounds in a way that letting the LLM figure out the gritty details just won’t.

    I think I agree with your last paragraph, though I do wish the LLM was capable of needing less massaging the more it runs. I hope we’ll be able to figure out how to achieve effectively infinite context length so that it doesn’t have to “forget” all of the previous tasks I’ve had it work on.


  • I really dislike the idea of making the whole program a genetic algorithm - that approach is nice when you don’t have a straightforward approach to employ/enact, but otherwise it feels both overkill and horrendously inefficient.

    The next step for my own harness (whenever I get back to working on it) is definitely to look at leveraging structured outputs to help these smaller models iterate towards a longer term goal.


  • I’ve been pleasantly surprised by Qwen3.6-27b on a Radeon 6700xt (12GB of VRAM) with 32GB of system RAM for it to offload onto (especially when pushing the context window up past 50k). Definitely more of a “compose prompt and hit send -> do something else -> check back after a while to view results” experience than an engaged back-and-forth, but at least compared to previous models I’ve tried running over the past year or two the results are palatable and sometimes even meaningfully useful.

    Given the speed I get, I’ve mostly found it useful for doing overviews of a codebase with some sort of improvement plan suggested at the end. Tool calls work, but I’m still not comfortable letting it code outright (plus, I think I can still code faster than it for now).


  • I like vampire stories that use them to explore addiction dynamics. The one that stood out to me was the “Joe Pitt Casebooks” - very gritty, set in Manhattan with not a Victorian accent in sight. Standouts for me included a monastic sect that spend their time fasting and meditating in the belief they can train away their need for blood, and sunlight not burning but causing essentially instant cancerous tumor growth. Also, the “monsters living among us” is more of the Epstein variety than every single vampire being a literal monster.

    There is emo but it’s more of a “I can’t stop fucking up my life” kind of emo than “woe is me sob sob cry” emo.

    I suspect this won’t be enough of a departure from what you complain about to be palatable.


  • No idea how easy this will be to follow if you’re forced to rely on text-to-speech and/or other assistive technologies, but here goes:

    • to tell nginx the product is physically on /wp/, you probably want a root /wp directive
    • to tell nginx the browser can point to domain.tld/post or domain.tld/english/post, you probably want two location blocks (one for each url) that each contain a rewrite directive that massages the url requested by the browser into pointing to the correct post or page location.
    • for this to be in a file on its own, and assuming your nginx setup is pretty standard, you probably want to have the entire server block be in a file that lives in the sites-available directory and symbolically linked (“symlinked”) into the sites-enabled directory.

    For the rewrites, here is the link to the relevant documentation page: https://nginx.org/en/docs/http/ngx_http_rewrite_module.html . You will need to understand the basics of how to write a Regular Expression, or get someone to write it for you. If you can’t find a human that’s available and willing to help, maybe a back-and-forth with an L.L.M. can get you to what you need (I don’t like suggesting L.L.M.s but being sighted myself I don’t really know if they’re better or worse than recommending you just work at learning how to do this on your own, given the current state of the web).
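
    As a rough illustration of the Regular Expression part, here is a small Python sketch of what a rewrite pattern with one capture group does to a requested path. It assumes the goal is for domain.tld/english/post to end up at the same place as domain.tld/post, which may not match your actual layout. The pattern and the function name are made up for this example, and the nginx line in the comment is only my guess at an equivalent directive, so check it against the documentation linked above.

```python
import re

# Hypothetical pattern: matches paths that start with /english/ and captures
# everything after that prefix in group 1.
ENGLISH_PREFIX = re.compile(r"^/english/(.*)$")


def strip_english_prefix(path: str) -> str:
    """Rewrite /english/<rest> to /<rest>; leave every other path alone.

    Assumed nginx equivalent (untested against your setup):
        rewrite ^/english/(.*)$ /$1 last;
    """
    return ENGLISH_PREFIX.sub(r"/\1", path)


if __name__ == "__main__":
    for requested in ("/english/post", "/post", "/english/"):
        print(requested, "->", strip_english_prefix(requested))
```

    The parentheses in the pattern are the capture group, and the \1 (or $1 in nginx) in the replacement pastes the captured text back in. Trying patterns out like this before putting them in the nginx config makes mistakes much easier to spot.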


  • I’m surprised that you’re talking about models being CUDA-specific or AMD-specific. I’ve had a bunch of models running on my AMD-only PC, using ollama, lemonade, and lm-studio, through either ROCm or Vulkan. None of these models were billed as AMD-specific. I had to do some config tweaking for ollama to use my graphics card but that’s more because I have a weird in-between-generations card that also predates the LLM hype (6700XT).

    However, I did generally need to look for the GGUF format versions of things - usually accounts like unsloth have them uploaded on huggingface barely a day or two after the original version gets posted.
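
    If you are unsure whether a file you downloaded really is GGUF, a quick check is possible because GGUF files begin with the four ASCII bytes GGUF followed by a little-endian 32-bit version number. The sketch below only reads that header; the function name is made up for this example and it does not validate the rest of the file.

```python
import struct
from typing import Optional


def read_gguf_version(path: str) -> Optional[int]:
    """Return the GGUF version number, or None if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    # The first four bytes are the magic, the next four are the version.
    if len(header) < 8 or header[:4] != b"GGUF":
        return None
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

    A return value of None means the file is in some other format (for example safetensors), in which case looking for a GGUF upload of the same model is usually the next step.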






  • Learned helplessness is an insidious foe, and one that market forces have tended to side with over the past 20 years (probably for far longer than that, but as I was a mere child back then I wouldn’t claim it with as much certainty).

    It’s an “easy way” for those like you and me who have more or less already built up the know-how over countless small steps, but if you’ve never known “life” outside of these corporate surveillance playgrounds I imagine it seems very scary and deserted.



  • It’s been a while since I set up my runner, and I have it on my personal desktop (which is wayyyyyy beefier than the VPS I host my forgejo instance on), but I’m pretty sure I was able to specify that only my user account can trigger actions to be run on this runner. What I’m getting at is that there is a decent amount of granularity for forgejo action permissions; you should be able to find a balance that suits you between “no actions at all” and “anyone can run any code they desire on your server”.