Father, Hacker (Information Security Professional), Open Source Software Developer, Inventor, and 3D printing enthusiast

  • 1 Post
  • 214 Comments
Joined 3 years ago
Cake day: June 23rd, 2023



  • The real problem here is that Xitter isn’t supposed to be a porn site (even though it has hosted loads of porn since before Musk bought it). They’ve deeply integrated a porn generator into their very publicly accessible “short text posts” website. Anyone can ask it to generate porn inside any post and it’ll happily do so.

    It’s like showing up at Walmart and seeing everyone naked (and many of them fucking) all over the store. That’s not why you’re there (though: why TF are you still using that shithole of a site‽).

    The solution is simple: Everyone everywhere needs to classify Xitter as a porn site. It’ll get blocked by businesses and schools and the world will be a better place.




  • No, a .safetensors file is not a database. You can’t query a .safetensors file and there’s nothing like ACID compliance (it’s read-only).

    Imagine a JSON file where the keys are tensor names and the values are enormous arrays of floating point numbers (you can poke at one yourself with the sketch below). It’s basically gibberish until you run an inference process and start feeding random numbers through it (over and over again, whittling it all down until you get a result that matches the prompt to a specified degree).

    How do the “turbo” models work to get a great result after one step? I have no idea. That’s like black magic to me haha.
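    Here’s a minimal sketch using the safetensors Python package, if you want to see that for yourself (the filename is just a placeholder for whatever model you’ve downloaded):

    ```python
    from safetensors import safe_open  # pip install safetensors numpy

    # Open a model file and list what's inside: just named arrays of floats.
    # There's nothing here you could "query" like a database.
    with safe_open("model.safetensors", framework="np") as f:
        for name in f.keys():
            tensor = f.get_tensor(name)
            print(name, tensor.shape, tensor.dtype)
            print(tensor.flatten()[:5])  # e.g. [0.01645 0.67235 ...]
    ```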


  • Riskable@programming.dev to politics@lemmy.world · Who Controls AI Exactly?

    > Or, with AI image gen, it knows that when someone asks it for an image of a hand holding a pencil, it looks at all the artwork in its training database and says, “this collection of pixels is probably what they want”.

    This is incorrect. Generative image models don’t contain databases of artwork. If they did, they would be the most amazing fucking compression technology, ever.

    As an example, the FLUX.1-dev model is 23.8 GB:

    https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main

    It’s a general-use model that can generate basically anything you want. It’s not perfect and it’s not the latest & greatest AI image generation model, but it’s a great example because anyone can download it and run it locally on their own PC (and get vastly superior results to ChatGPT’s DALL-E model).

    If you examine the data inside the model, you’ll see a bunch of metadata headers and then an enormous array of arrays of floating point values. Stuff like [0.01645, 0.67235, ...]. That is what a generative image AI model uses to make images. There’s no database to speak of.
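    You can check that layout yourself: a .safetensors file is literally an 8-byte little-endian length, then a JSON header describing each tensor, then the raw float data. A quick sketch (the filename is the one from the FLUX.1-dev repo linked above):

    ```python
    import json
    import struct

    # .safetensors layout: [8-byte header length][JSON header][raw tensor bytes]
    with open("flux1-dev.safetensors", "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len))

    # Each entry maps a tensor name to its dtype, shape, and byte offsets --
    # metadata pointing at big arrays of floats, not database records.
    for name, info in list(header.items())[:5]:
        print(name, info)
    ```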

    When training an image model, you need to download millions upon millions of public images from the Internet and run them through their paces against an actual database like ImageNet. ImageNet contains lots of metadata about millions of images, such as their URLs, bounding boxes around parts of each image, and keywords associated with those bounding boxes.

    The training is mostly a linear process, so the images never get loaded into a database. They just get read, along with their metadata, into a GPU that performs some machine learning math to adjust some arrays of floating point values. Those values ultimately end up in the model file.

    It’s actually a lot more complicated than that (there’s pretraining steps and classifiers and verification/safety stuff and more) but that’s the gist of it.

    I see soooo many people who think image AI generation is literally pulling pixels out of existing images but that’s not how it works at all. It’s not even remotely how it works.

    When an image model is being trained, any given image might modify one of those floating point values by like ±0.01. That’s it. That’s all it does when it trains on a specific image.
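    To make that concrete, here’s a toy sketch (a stand-in linear layer, not a real diffusion model) showing what a single training step actually does to the weights:

    ```python
    import torch
    from torch import nn

    # Toy stand-in for an image model: one training step just nudges the
    # weights slightly. No image gets stored anywhere.
    model = nn.Linear(3 * 64 * 64, 512)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    before = model.weight[0, 0].item()

    # Fake "training batch" (real training streams batches off disk, then discards them)
    images = torch.randn(16, 3 * 64 * 64)
    targets = torch.randn(16, 512)

    loss = nn.functional.mse_loss(model(images), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()

    after = model.weight[0, 0].item()
    print(f"one weight went from {before:.5f} to {after:.5f}")  # a tiny nudge
    ```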

    I often rant about where this process goes wrong and how it can result in images that look way too much like specific images in the training data, but that’s a flaw, not a feature. It’s something every image model has to deal with, and it will improve over time.

    At the heart of every AI image generation is a random number generator. Sometimes you’ll get something similar to an original work. Especially if you generate thousands and thousands of images. That doesn’t mean the model itself was engineered to do that. Also: A lot of that kind of problem happens in the inference step but that’s a really complicated topic…
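    You can see the random number generator’s role directly with the diffusers library (a sketch, assuming you’ve downloaded FLUX.1-dev and have a GPU with enough VRAM): the same seed reproduces the same image, and every new seed rolls the dice again.

    ```python
    import torch
    from diffusers import FluxPipeline  # pip install diffusers transformers accelerate

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")

    # Same seed -> same image. Change the seed and you roll the dice again.
    gen = torch.Generator("cuda").manual_seed(42)
    image = pipe("a hand holding a pencil", generator=gen).images[0]
    image.save("hand_42.png")
    ```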







  • The mistakes it makes depend on the model and the language. The GPT-5 models can make horrific mistakes, though, randomly removing huge swaths of code for no reason. Every time it happens I’m like, “what the actual fuck?” Undoing the last change and trying again usually fixes it though 🤷

    They all make horrific security mistakes quite often. Though that’s probably because they’re trained on human code that is *also* chock full of security mistakes (I’m a former security consultant, so I’m super biased on that front haha).



  • You want to see someone using, say, VS Code to write something with, say, Claude Code?

    There’s probably a thousand videos of that.

    More interesting: I watched someone who was super cheap trying to use multiple AIs to code a project because he kept running out of free credits. Every now and again he’d switch accounts and use up those free credits.

    That was an amazing dance, let me tell ya! Glorious!

    I asked him which one he’d pay for if he had unlimited money and he said Claude Code. He has the $20/month plan but only uses it in special situations because he’ll run out of credits too fast. $20 really doesn’t get you much with Anthropic 🤷

    That inspired me to try out all the code assist AIs and their respective plugins/CLI tools. He’s right: Claude Code was the best by a HUGE margin.

    Gemini 3.0 is supposed to be nearly as good but I haven’t tried it yet so I dunno.

    Now that I’ve said all that: I am severely disappointed in this article because it doesn’t say which AI models were used. In fact, the study authors don’t even know what AI models were used. So it’s 430 pull requests of random origin, made at some point in 2025.

    For all we know, half of those could’ve been made with the GPT-5 mini model that everyone gets for free when they install the Copilot extension in VS Code.



  • Good games are orthogonal to AI usage. It’s possible to have a great game that was written with AI using AI-generated assets. Just as much as it’s possible to have a shitty one.

    If AI makes creating games easier, we’re likely to see 1000 shitty games for every good one. But at the same time we’re also likely to see successful games made by people who had great ideas but never had the capital or skills to bring them to life before.

    I can’t predict the future of AI but it’s easy to imagine a state where everyone has the power to make a game for basically no cost. Good or bad, that’s where we’re heading.

    If making great games doesn’t require a shitton of capital, the ones who are most likely to suffer are the rich AAA game studios. Basically, the capitalists. Because when capital isn’t necessary to get something done anymore, capital becomes less useful.

    Effort builds skill but it does not build quality. You could put in a ton of effort and still fail or just make something terrible. What breeds success is iteration (and luck). Because AI makes iteration faster and easier, it’s likely we’re going to see a lot of great things created using it.




  • “Finish high school, get a job, get married, have kids, go to church. Those are all in your control,” -Ben Shapiro

    …all within the comfort of your parents’ home (if they even have one big enough for you, your wife, and your kids). Because that’s all they can afford.

    Also: WTF does going to church have to do with anything‽ That’s not going to land you a good job (blessed are the poor)! There are no marketable skills to be learned from going to church either (unless you want to socialize with pedophiles and other sex offenders in order to better understand Trump’s social circle; see: “pastor arrested”).