• LovableSidekick@lemmy.world · 9 days ago

      Another realization might be that the humans whose output ChatGPT was trained on were probably already 40% wrong about everything. But let’s not think about that either. AI Bad!

      • Shanmugha@lemmy.world · 9 days ago

        I’ll take the bait. Let’s think:

        • there are three humans who are 98% right about what they say, and where they know they might be wrong, they indicate it

        • now there is an llm (fuck capitalization, I hate how much they are shoved everywhere) trained on their output

        • now the llm is asked about the topic and computes an answer string

        By construction, that answer string can contain all the probably-wrong things without the proper indicators (“might”, “under such and such circumstances”, etc.)

        If you want to say a 40%-wrong llm means 40%-wrong sources, prove me wrong (rough sketch of the hedge-dropping point below).
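
        A minimal simulation of that point, under the idealized setup above: the 98% figure is the one given, while the 50% hedge-survival rate is a number made up purely for illustration.

        ```python
        import random

        random.seed(0)

        N = 100_000
        source_flat_wrong = 0   # source asserts flatly AND the claim is false
        model_flat_wrong = 0    # model asserts flatly AND the claim is false

        for _ in range(N):
            claim_is_true = random.random() < 0.98   # sources are 98% right (given above)
            source_hedges = not claim_is_true        # idealized: they flag exactly the shaky 2%

            # The source only asserts flatly when the claim is solid, so in this
            # idealized setup it is never flat-out wrong.
            if not source_hedges and not claim_is_true:
                source_flat_wrong += 1

            # The model reproduces the claim, but the hedge words ("might",
            # "under such and such circumstances") only survive generation some
            # of the time -- 50% here, an assumed figure for illustration only.
            hedge_survives = source_hedges and random.random() < 0.5
            if not hedge_survives and not claim_is_true:
                model_flat_wrong += 1

        print(f"source flat-out wrong: {source_flat_wrong / N:.2%}")   # ~0%
        print(f"model  flat-out wrong: {model_flat_wrong / N:.2%}")    # ~1%
        ```

        The sources are never flat-out wrong here because they flag everything they are unsure of; the model ends up flatly asserting roughly half of those shaky claims, so its confident-error rate is worse than its sources’ even though it saw nothing but their output.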

      • starman2112@sh.itjust.works · 8 days ago

        This is a salient point that’s well worth discussing. We should not be training large language models on any supposedly factual information that people put out. It’s easy enough to call out a bad research study and have it retracted, but you can’t just explain to an AI that that study was wrong; you have to completely retrain it every time. Exacerbating this issue is the way people tend to view large language models as somehow objective describers of reality, because they’re synthetic and emotionless. In truth, an AI holds exactly the same biases as the people who put together the data it was trained on.