Google has struck a deal with Reddit that will allow the search engine maker to train its AI models on Reddit’s vast catalog of user-generated content, the two companies announced. Under the arrangement, Google will get access to Reddit’s Data API, which will help the company “better understand” content from the site.

The deal also provides Google with a valuable source of content it can use to train its AI models. “Google will now have efficient and structured access to fresher information, as well as enhanced signals that will help us better understand Reddit content and display, train on, and otherwise use it in the most accurate and relevant ways,” the company said in a statement.

  • pingveno@lemmy.ml
    9 months ago

    The argument isn’t just around content, it’s around hosting. If Google is sitting there scarfing down Reddit’s data, that costs Reddit in server time. That can get extremely expensive. So yeah, if Google is going to train an AI that Google will profit off of, it should pay Reddit for server time.

    • catloaf
      9 months ago

      More than server time, there's bandwidth: on big Internet connections, uploads are priced by the byte. When someone requests a lot of data, Reddit has to pay its provider to send it.