• tourist@lemmy.world
    1 year ago

    The participants judged GPT-4 to be human a shocking 54 percent of the time.

    ELIZA, which was pre-programmed with responses and didn’t have an LLM to power it, was judged to be human just 22 percent of the time

    Okay, 22% is ridiculously high for ELIZA. I feel like any half-sober adult could clock it as a bot by the third response, if not immediately.

    Try talking to the thing: https://web.njit.edu/~ronkowit/eliza.html

    I refuse to believe that the 22% didn’t misunderstand the task or something.