Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn’t ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

  • Sterile_Technique@lemmy.world · +5 · edited · 1 hour ago

    Chipmunks, 5-year-olds, salt/pepper shakers, and paint thinner also all make terrible doctors.

    Follow me for more studies on ‘shit you already know because it’s self-evident immediately upon observation’.

  • BeigeAgenda@lemmy.ca · +18 · 2 hours ago

    Anyone who has knowledge about a specific subject says the same: LLMs are constantly incorrect and hallucinate.

    Everyone else thinks it looks right.

    • IratePirate@feddit.org · +9 · edited · 2 hours ago

      A talk on LLMs I was listening to recently put it this way:

      If we hear the words of a five-year-old, we assume the knowledge of a five-year-old behind those words, and treat the content with due suspicion.

      We’re not adapted to something with the “mind” of a five-year-old speaking to us in the words of a fifty-year-old, and thus are more likely to assume competence just based on language.

    • zewm@lemmy.world · +3 · 2 hours ago

      It is insane to me how anyone can trust LLMs when their information is incorrect 90% of the time.

    • rudyharrelson@lemmy.radio · +54 / −2 · edited · 4 hours ago

      People always say this on stories about “obvious” findings, but it’s important to have verifiable studies to cite in arguments for policy, law, etc. It’s kinda sad that it’s needed, but formal investigations are a big step up from just saying, “I’m pretty sure this technology is bullshit.”

      I don’t need a formal study to tell me that drinking 12 cans of soda a day is bad for my health. But a study that’s been replicated by multiple independent groups makes it way easier to argue to a committee.

      • Knot@lemmy.zip · +14 · 3 hours ago

        I get that this thread started from a joke, but I think it’s also important to note that no matter how obvious some things may seem to some people, the exact opposite will seem obvious to many others. Without evidence, like the study, both groups are really just stating their opinions.

        It’s also why the formal investigations are required. And whenever policies and laws are made based on verifiable studies rather than people’s hunches, it’s not sad, it’s a good thing!

      • irate944@piefed.social · +21 · 4 hours ago

        Yeah you’re right, I was just making a joke.

        But it does create some silly situations, like you said.

          • IratePirate@feddit.org · +4 · 2 hours ago

            A critical, yet respectful and understanding exchange between two individuals on the interwebz? Boy, maybe not all is lost…

      • BillyClark@piefed.social · +4 · 2 hours ago

        “it’s important to have verifiable studies to cite in arguments for policy, law, etc.”

        It’s also important to have for its own merit. Sometimes, people have strong intuitions about “obvious” things, and they’re completely wrong. Without science studying things, it’s “obvious” that the sun goes around the Earth, for example.

        “I don’t need a formal study to tell me that drinking 12 cans of soda a day is bad for my health.”

        Without those studies, you cannot know whether it’s bad for your health. You can assume it’s bad for your health. You can believe it’s bad for your health. But you cannot know. These aren’t bad assumptions or harmful beliefs, by the way. But the thing is, you simply cannot know without testing.

      • Telorand@reddthat.com · +7 · 3 hours ago

        The thing that frustrates me about these studies is that they all continue to come to the same conclusions. AI has already been studied in mental health settings, and it’s always performed horribly (except for very specific uses with professional oversight and intervention).

        I agree that the studies are necessary to inform policy, but at what point are lawmakers going to actually lay down the law and say, “AI clearly doesn’t belong here until you can prove otherwise”? It feels like they’re hemming and hawing in the vain hope that it will live up to the hype.

      • Eager Eagle@lemmy.world · +3 · edited · 4 hours ago

        Also, it’s useful to know how, when, or why something happens. I can make a useless chatbot that is “right” most times if it only tells people to seek medical help.
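
        For illustration, here’s a minimal sketch of that point (the data and function names are entirely hypothetical): a bot that always answers “seek medical care” scores well on a naive accuracy check over a mostly-serious test set, while telling you nothing about how or why it got there.

        ```python
        # Hypothetical sketch: a degenerate "chatbot" that looks accurate
        # on a naive benchmark, yet is clinically useless.

        def useless_triage_bot(message: str) -> str:
            # Ignores the input entirely and always escalates.
            return "Please seek medical care."

        # Toy evaluation set: (user message, is escalation the right advice?)
        cases = [
            ("sudden worst-ever headache and stiff neck", True),
            ("crushing chest pain radiating to my arm", True),
            ("slurred speech and a drooping face", True),
            ("coughing up blood for a week", True),
            ("mild seasonal sniffles for two days", False),
        ]

        correct = sum(
            (useless_triage_bot(msg) == "Please seek medical care.") == should_escalate
            for msg, should_escalate in cases
        )
        print(f"'accuracy': {correct}/{len(cases)}")  # 4/5, and still useless
        ```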

    • hansolo@lemmy.today · +1 · 2 hours ago

      I’m going to start telling people I’m getting a Master’s degree in showing how AI is bullshit. Then I point out some AI slop and mumble about crushing student loan debt.

  • homes@piefed.world · +6 / −1 · edited · 3 hours ago

    This is a major problem with studies like this: they start from the assumption that AI doctors would be competent, rather than from a position of asking why AI should ever be involved with something so critical, and demanding a mountain of evidence that it is worthwhile before investing a penny or a second in it.

    “ChatGPT doesn’t require a wage,” and, before you know it, billions of people are out of work and everything costs 10000x your annual wage (when you were lucky enough to still have one).

    How long until the workers revolt? How long have you gone without food?

  • NuXCOM_90Percent@lemmy.zip · +4 / −1 · 3 hours ago

    How much of that is the chatbot itself versus humans just being horrible at self-reporting symptoms?

    That is why “bedside manner” is so important. A good doctor connects the dots and asks follow-up questions for clarification, or just looks at a person and assumes they are wrong. Obviously there are some BIG problems with that (ask any black woman, for example), but… humans are horrible at reporting symptoms.

    Which gets back to how “AI” is actually an incredible tool (especially in this case when it is mostly a human language interface to a search engine) but you still need domain experts in the loop to understand what questions to ask and whether the resulting answer makes any sense at all.

    Yet, instead, people do the equivalent of just raw dogging whatever the first response on stack overflow is.

    • [deleted]@piefed.world · +2 / −2 · edited · 3 hours ago

      Rawdogging the first response from stack overflow to try and fix a coding issue isn’t going to kill someone.

      • NuXCOM_90Percent@lemmy.zip · +2 / −1 · 3 hours ago

        It is if your software goes anywhere near infrastructure or safety.

        Which is literally what musk and the oligarchs were arguing as a way to “fix” Air Traffic Control. And that is far from the first time tech charlatans have wanted to “disrupt” an industry.

        • [deleted]@piefed.world · +1 / −1 · 2 hours ago

          Someone who uses stack overflow to solve a problem will be doing testing to confirm it worked as part of an overall development workflow.

          Using an LLM as a doctor is like vibe coding, where there is no testing or quality control.

          • NuXCOM_90Percent@lemmy.zip · +1 / −1 · edited · 2 hours ago

            So… they wouldn’t be raw dogging stack overflow? Because raw dogging the code you get from a rando off stack overflow is a bad idea?

            Because you can just as easily use generative AI as a component in test driven development. But the people pushing to “make coders more efficient” are looking at firing people. And they continue to not want to add the guard rails that would mean they fire 1 engineer instead of 5.
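
            For what it’s worth, a minimal sketch of what that could look like, where human-written tests gate whatever the model produces (everything here is hypothetical, and the actual model call is stubbed out as a string):

            ```python
            # Hypothetical sketch: generated code is only accepted if it passes
            # tests a human wrote first, i.e. the guard rails stay in the loop.

            def human_written_tests(candidate) -> bool:
                """Tests written *before* asking the model for an implementation."""
                return candidate(2, 3) == 5 and candidate(-1, 1) == 0 and candidate(0, 0) == 0

            def accept_generated_code(source: str):
                """Run the model's output in an isolated namespace; keep it only if the tests pass."""
                namespace = {}
                exec(source, namespace)  # a real pipeline would sandbox this far more carefully
                candidate = namespace["add"]
                if not human_written_tests(candidate):
                    raise ValueError("Generated code rejected: tests failed.")
                return candidate

            # Stand-in for whatever the LLM returned.
            generated = "def add(a, b):\n    return a + b\n"
            add = accept_generated_code(generated)
            print(add(2, 3))  # 5
            ```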

  • cecilkorik@piefed.ca · +2 · 3 hours ago

    It’s great at software development though /s

    Remember that when AI-written software soon runs all the devices doctors use daily.

  • HubertManne@piefed.social · +4 · 3 hours ago

    It’s not ready to take on any role. It should not be doing anything but assisting. So yeah, you can talk to a chatbot instead of filling out that checklist, and the output might be useful to the doc when he then talks with you.

  • GnuLinuxDude@lemmy.ml · +4 · 3 hours ago

    If you want to read an article that’s optimistic about AI and healthcare, but that falls apart if you start asking too many questions, try this one:

    https://text.npr.org/2026/01/30/nx-s1-5693219/

    Because it’s clear that people are starting to use it and many times the successful outcome is it just tells you to see a doctor. And doctors are beginning to use it, but they should have the professional expertise to understand and evaluate the output. And we already know that LLMs can spout bullshit.

    For the purposes of using and relying on it, I don’t see how it is very different from gambling. You keep pulling the lever, oh excuse me I mean prompting, until you get the outcome you want.

  • Imgonnatrythis@sh.itjust.works · +2 / −2 · 1 hour ago

    This makes sense. However, doctors aren’t perfect either, and one thing properly trained AI should excel at is helping doctors make rare diagnoses or determine additional testing for some diagnoses. I don’t think it’s quite there yet, but it’s probably close to being a tool a well-trained doc could use as an adjunct to traditional inquiry. Certainly not something end users should be fiddling with with any sort of trust, though. Much of doctor decision-making happens based on experience, and experience biases towards common diagnoses, which usually works out because, well, statistics, but it does lead to misdiagnosis of rare disorders. An AI should be more objective about these.

    • XLE@piefed.social (OP) · +1 / −1 · 1 hour ago

      Even if AI works correctly, I don’t see responsible use of it happening, though. I’ve already seen nightmarish vertical video footage of doctors checking ChatGPT for answers…