• mindbleach@sh.itjust.works · 1 year ago

    My guy. We can be pretty confident that expensive training using human-labeled data did not include child pornography. Nobody just slipped in the sort of images that are illegal to even look at.

    • ⚡⚡⚡@feddit.de · 1 year ago

      What do you mean, “human-labeled”? They train it on as much data as possible, and humans don’t validate each byte of it.

      How do you build an image-generating AI? You don’t google “dog”, download images, and label each one “black dog”. Instead, you scrape the WWW and other sources, download pretty much everything you can, train the AI on it, and give it instructions not to generate certain stuff.
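      To make that concrete, here’s a rough sketch of how web-scale image/caption datasets (LAION-style, built from alt text in crawled pages) get assembled. Everything in it is illustrative, not any lab’s actual pipeline; the keyword blocklist stands in for whatever crude automated filters get applied, and no human ever reviews individual pairs.

      ```python
      # Illustrative sketch only: collect (image URL, caption) pairs from raw
      # HTML the way web-scale datasets are typically assembled - keep every
      # <img> that has alt text, then run a crude automated keyword filter.
      # The blocklist and the demo page are made-up placeholders.
      import re
      from html.parser import HTMLParser

      BLOCKLIST = {"example", "banned", "terms"}  # placeholder, not a real filter

      class ImgAltCollector(HTMLParser):
          def __init__(self):
              super().__init__()
              self.pairs = []  # (image URL, caption) tuples found in the page

          def handle_starttag(self, tag, attrs):
              if tag != "img":
                  return
              a = dict(attrs)
              src, alt = a.get("src"), (a.get("alt") or "").strip()
              if src and alt:
                  self.pairs.append((src, alt))

      def keep(caption: str) -> bool:
          """Crude keyword check - the only 'validation' most pairs ever get."""
          words = set(re.findall(r"\w+", caption.lower()))
          return not (words & BLOCKLIST)

      def collect(html_pages):
          """Yield (url, caption) training pairs from an iterable of HTML strings."""
          for page in html_pages:
              parser = ImgAltCollector()
              parser.feed(page)
              for src, alt in parser.pairs:
                  if keep(alt):
                      yield src, alt

      if __name__ == "__main__":
          demo = ['<html><body><img src="/a.jpg" alt="black dog on a beach"></body></html>']
          print(list(collect(demo)))  # -> [('/a.jpg', 'black dog on a beach')]
      ```

      The point being: the “labeling” is just whatever alt text happened to sit next to the image, plus automated filtering after the fact.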

      I’m 100% sure that ChatGPT was also trained on text including CP fantasies, and I’m 100% sure the image generators were also trained on images you’d classify as “should not exist”…