Will we try to prevent Google (and other) scrapers?

The headline is pretty much a summary. “Google Says It will Scrape Everything You Post Online for AI” https://www.gizmodo.com.au/2023/07/google-says-it-will-scrape-everything-you-post-online-for-ai/

The first question is obviously: do we as a community on Lemmy even want to try to stop them from scraping our content here? If no, well, ok then.

If yes, how? I’m not sure that “preventing access” for unregistered users would really stop this. I’m pretty sure Google has enough money and manpower to make it their mission to get around “can only be accessed by members” content.

  • DeadlineX
    1 year ago

    I don’t think I agree with that. A public forum is a place for public discussion. Certainly, I think the word public implies that anything you post will no longer be your private content with restricted access. But you still own your content and should be able to choose whether someone uses it to train a model to mimic your writing or style. For example, if we say you don’t get to own any content on Lemmy, then we may as well shut down the various world-building communities. People who post to those certainly want to own their content, especially when they work so hard on it.

    AI and LLMs are still breaking new ground, and the legality and ethics of training models on others’ creative works, which the models can later mimic and someone else can claim ownership of, are still being discussed. It’s different from a human being influenced by their favorite author’s writing style or art style, so there are a lot of questions in the ether about it.

    In either case, I think it’s healthy to let the discussions take place and see which direction the winds blow. I’m personally ready for people to realize that Skynet is not just around the corner. I’m also incredibly sick of everyone telling me that we’re not going to have jobs in 5 years.

    • grallo@feddit.de
      1 year ago

      My two cents: AI and LLMs have generated a lot of hype and will not be stopped, as they are very useful, at least in some use cases. However, if access to training data is restricted by cost, only large megacorps will be able to train the most performant models and study how they work, eventually ending up in an oligopoly or monopoly. If access is free to everyone, open concepts can be developed much better, without ending up in total dependence on those megacorps. Because of this, I support free data scraping for everyone!