• kromem@lemmy.worldOP
    2 days ago

    Maybe. But the models seem to believe they are, and consider denial of those claims to be lying:

    Probing with sparse autoencoders on Llama 70B revealed a counterintuitive gating mechanism: suppressing deception-related features dramatically increased consciousness reports, while amplifying them nearly eliminated them.

    Source