• FlowVoid@lemmy.world
      4 months ago

      You can prove it through discovery, i.e. force the AI developers to reveal all the songs they used to train their AI.

          • cheese_greater@lemmy.world

            Huh? That would be the point of not keeping logs of the inputs, the outputs, or any process in between.

            • FlowVoid@lemmy.world

              If you don’t keep logs, and someone has evidence you did something wrong, then there won’t be any opposing evidence that you were in the right. So the jury will start out siding against you, and you won’t have any way to win them back.

              In fact if a judge thinks you didn’t keep logs because you were afraid they would incriminate you, then they will tell the jury to consider the lack of logs as further evidence against you.

          • GBU_28

            Well, I’m not defending this, but that isn’t how crimes are prosecuted, thankfully. The prosecution is obligated to prove its case with evidence.

            Like, they have to prove (beyond a reasonable doubt) that you were at the crime scene and committed the act.

            Edit: Indeed, if someone DOES have some sort of evidence that you committed the crime and you offer no rebuttal, then you’re hosed.

            • FlowVoid@lemmy.world

              This isn’t a prosecution, and nobody is alleging a crime. This is a civil lawsuit.

              In a civil lawsuit, the standard of evidence is much different. You do not have to “prove” things beyond a reasonable doubt like in a criminal trial. The jury is instructed to weigh the evidence like a balance, and whichever side has the best evidence wins. Even if it’s only a small difference that only slightly favors one side, they win.

              That’s why it’s so important to have evidence that counters whatever the other side claims. You are bound to lose if your opponents are the only ones offering evidence on their side of the balance.

              • GBU_28

                Agree. I believe I acknowledged that in my last sentence.

                My point is that a frivolous claim is a thing, and someone bringing a claim must meet a basic level of evidence to even proceed. Indeed, as you say, “judgement” is held to a lower final standard in a civil suit.

                • FlowVoid@lemmy.world

                  I don’t think this is frivolous. If you publish a song that includes part of my song, that’s good evidence that you copied my song.

  • Chozo@fedia.io

    “The basic point is that [the AI companies’] model requires a vast corpus of sound recordings in order to output synthetic music files that are convincing imitations of human music,” the suits alleged. “Because of their sheer popularity and exposure, the Copyrighted Recordings had to be included within Suno’s training data for Suno’s model to be successful at creating the desired human-sounding outputs.”

    Nope, there’s plenty of other ways for an AI to have created similar notes. Say you have Song A written by Steve. Steve grew up listening to a lot of John, who wrote songs B through Z. Steve spent his childhood listening to and being influenced by John, so when Steve eventually grows up to write Song A, it’s incredibly possible for it to contain elements from songs B through Z. So if an AI trains off of Steve it’s going to consequently pick up whatever habits Steve learned from John.

    Just like how you picked up some habits from your parents, which they picked up from their parents… etc. You could develop a habit that started with an ancestor you’ve never met; who are you copying?

    • FlowVoid@lemmy.world

      Of course there are other ways to create similar notes.

      But now the AI developers will have to testify under oath that they did not use Johnny B Goode, and identify the soundalike song they used that is not among the millions of other IPs held by the RIAA.

      • Chozo@fedia.io

        I feel that this logic follows a common misconception of generative AI. Its output isn’t made from the training data. It will take inspiration from it, but it doesn’t just mix-and-match samples from the training materials. GenAI uses metadata that it builds based on that training data, but the data, itself, isn’t directly referenced during generation.

        The way AI generates content isn’t like when Vanilla Ice sampled Under Pressure; it would be more like if Vanilla Ice had talent and could actually write music, and had accidentally written the same bass line without ever hearing Queen. While unlikely, it’s still possible, and I’m sure we’ve all experienced a similar situation; i.e. you open a comment thread to post a joke based on the headline and see the top comment is already the exact same joke you were going to make… You didn’t copy the other user, and they didn’t copy you, but you both likely share a similar experience that triggers the same associations.

        For the same reasons that two different writers can accidentally tell the same story, or two different comedians can write the same joke, two different musicians can write the same melodies if they have shared inspirations. In all of those instances, both parties can create entirely original material of their own accord, even if it isn’t meaningfully distinct from the other’s. The way generative AI works isn’t significantly different, which is why this is such a legally murky situation. If generative AI were more rudimentary and were actually sampling the training data, it would be an open-and-shut copyright infringement case. But because the material the AI produces is an original creation of its own, we get into this situation where we have to argue over where to draw the line between “inspiration” and “replication”.
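
A toy way to see the parameters-versus-samples distinction (purely hypothetical code, nothing like a real music model): the training melodies are boiled down to a handful of learned values, the corpus itself is then discarded, and generation reads only the learned values.

```python
import random

# Toy "training corpus": note sequences standing in for real songs.
training_melodies = [
    [0, 2, 4, 5, 7],
    [0, 2, 3, 5, 7],
]

# "Training": reduce the corpus to a small set of learned parameters,
# here just the set of pitch intervals seen between adjacent notes.
intervals = [b - a for m in training_melodies for a, b in zip(m, m[1:])]
learned_steps = sorted(set(intervals))

del training_melodies  # the corpus is gone; generation cannot consult it


def generate(length=5, start=0, seed=None):
    """Produce a new melody using only the learned parameters."""
    rng = random.Random(seed)
    notes = [start]
    for _ in range(length - 1):
        notes.append(notes[-1] + rng.choice(learned_steps))
    return notes
```

Nothing in `generate` ever touches the original melodies, only what was distilled from them; whether that distinction holds up for real models is exactly what this thread is arguing about.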

        • FlowVoid@lemmy.world

          I think a common misconception of these lawsuits is that the AI output is an issue. It isn’t. It doesn’t matter what the generative AI generates. The AI developers, not the AIs, are the problem.

          Let’s go back to your Vanilla Ice example. Suppose Vanilla Ice is found to have downloaded a massive collection of mp3s from The Pirate Bay. He is sued by the RIAA, just like Napster users were sued years ago.

          In court, he explains that what he did is legal because his music doesn’t sample from his mp3 collection at all. And he loses, because the RIAA doesn’t care what he did after he pirated mp3s. Pirating them, by itself, is illegal.

          And that’s what’s going on here. The RIAA isn’t arguing that the AI output is illegal. They are arguing that the AI output is basically a snitch: it’s telling the RIAA that the developers must have pirated a bunch of mp3s.

          In other words, artists like Vanilla Ice have to pay for their mp3s like everyone else. And so do software developers.

          • Chozo@fedia.io

            Piracy isn’t the issue; I’m not sure if we’re referencing different things here.

            How the developers came to possess the training material isn’t being called into question - it’s whether or not they’re allowed to train an AI with it, and whether doing so constitutes copyright infringement. And currently, the way in which generative AI works does not cross those legal boundaries, as written.

            The argument the RIAA wants to make is that using copyrighted material for the purposes of training software extends beyond the protections of fair use. I believe their argument is that - even if acquired otherwise legally - acquiring music for the explicit purpose of making new music would be considered a commercial use of the material. Basically like the difference between buying an album to listen to with your headphones or buying an album to play for a packed concert hall, suggesting that the commercial intent behind acquiring the music is what makes it illegal.

            • FlowVoid@lemmy.world

              This is the basis for the RIAA’s claims, and it sure sounds like piracy:

              On information and belief, similar to other generative AI audio models, Suno trains its AI model to produce audio output by generally taking the following steps: a. Suno first copies massive numbers of sound recordings, including by “scraping” (i.e., copying or downloading) them from digital sources. This vast collection of information forms the input, or “corpus,” upon which the Suno AI model is trained.

              There is no evidence the AI devs bought any music, for any use. Quite the opposite:

              Antonio Rodriguez, a partner at the venture capital firm Matrix Partners, explained that his firm invested in the company with full knowledge that Suno might get sued by copyright owners, which he understood as “the risk we had to underwrite when we invested in the company.” Rodriguez pulled the curtain back further when he added that “honestly, if we had deals with labels when this company got started, I probably wouldn’t have invested in it. I think they needed to make this product without the constraints.” By “constraints,” Rodriguez was, of course, referring to the need to adhere to ordinary copyright rules and seek permission from rightsholders to copy and use their works.

              • Chozo@fedia.io

                I don’t think that’s the basis of their argument.

                The RIAA alleges that the generators used the record labels’ songs to illegally train the models since they didn’t have the rights holders’ permission to use the recordings. But whether the companies needed that permission is unclear. AI companies have argued that the use of training data is a case of fair use, meaning they are allowed to use the recordings with impunity.

                Emphasis mine. Their concern is that the music was used for commercial purposes, not how the music came into their possession. Web scraping is already legal; that’s never been a piracy issue.

                • FlowVoid@lemmy.world

                  Courts have found that scraping data from a public website is legal, because data is not protected by copyright. But copying protected works without permission is generally illegal; it doesn’t matter if you use a scraper.

                  If the defendants in this case admit using RIAA works, then they will probably try to argue fair use. At that point their product will become relevant, including its commercial nature. This will weigh against them, because their songs directly compete against RIAA songs. In fact, that’s why artists who include samples in their work usually obtain permission first.

        • Natanael@slrpnk.net

          The problem here is that we don’t have real AI.

          We have fancier generative machine learning, and despite the claims it does not in fact generalize that well from most inputs; a lot of recurring samples end up actually embedded in the model and can thus be replicated (there are papers on this, such as work on sample recovery attacks).

          These models heavily embed genre tropes and replicate existing biases and patterns much too strongly to truly claim nothing is being copied; the copying is more of a remix situation than accidental recreation.

          Elements of the originals are there, and many features can often be attributed to the original authors (especially since the models often learn to mimic the style of individual authors, which means the model embeds information about the features of copyrighted works from individual authors and how to replicate them).

          While it’s not a 1:1 replication in most instances, it frequently gets close enough that a human doing it would be sued.

          This photographer lost in court for recreating the features of another work too closely

          https://www.copyrightuser.org/educate/episode-1-case-file-1/
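
The embedded-sample point can be illustrated with a deliberately overfit toy model (a hypothetical sketch, not one of the actual attacks from those papers): when every learned context has exactly one continuation, prompting the model with a fragment of the training text reproduces the rest verbatim.

```python
from collections import defaultdict

# A single training "song" the toy model has effectively memorized.
lyric = "never gonna give you up"

# "Training": record which character follows each 4-character context.
ORDER = 4
following = defaultdict(list)
for i in range(len(lyric) - ORDER):
    following[lyric[i:i + ORDER]].append(lyric[i + ORDER])


def generate(prompt, length):
    """Continue the prompt using only the learned transition table."""
    out = prompt
    while len(out) < length and out[-ORDER:] in following:
        # Every context here has exactly one continuation: memorization.
        out += following[out[-ORDER:]][0]
    return out
```

Here `generate("neve", 23)` hands back the entire training lyric character for character, which is the toy analogue of recovering embedded samples from a model that generalized poorly.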

    • dependencyinjection@discuss.tchncs.de

      IMO music copyright has gone too far. Have a look at the chord progression of Radiohead’s “Creep”: I, III, IV, and iv.

      There have been a few lawsuits over using the same chord progression, but with music theory there are only so many permutations before you end up arriving at the same logical places. The same is quite true for melodies overlaid on this particular progression.

        • dependencyinjection@discuss.tchncs.de

          Tell that to Radiohead and the people that previously sued Radiohead over Creep and had writing credits added for it.

          As I mentioned, it was more than the progression; it was the melody too. From my understanding of music theory, which isn’t my field of expertise, the melodies one would put over this particular progression are further limited.

          Edit: Here you can find mention of the songs used in lawsuits

          Edit: Here is someone discussing this who has more musical knowledge than I

          • undergroundoverground@lemmy.world

            There might have been a copyright issue there. However, the legal cases involving only

            There have been a few lawsuits over using the same chord progression

            have failed. The most famous of these would be Ed Sheeran’s. It’s other stuff, combined with the chord progression, that crosses the line. Take “Eye of the Tiger”: you could steal those chords, but you couldn’t steal the “duh… duh duh duh” bit played over the chords, if that makes sense.

            Oh for sure, with the melody too, it would be well over the line. I just didn’t want any aspiring songwriters to get spooked, so I thought to clarify.

    • jumjummy@lemmy.world

      You’re equating a human listening, learning, and getting inspired by other sources with what an LLM does as part of the model building. I don’t think the two are the same. Look at copyright law as it sits right now with AI not being able to hold copyrights.

      We can’t treat AI with the same legal protections as humans.

    • Dkarma@lemmy.world

      Listening / training does not equal copying. Period.

      This isn’t even a copyright issue.

      • jumjummy@lemmy.world

        It absolutely is. An AI model being trained on sources is not “listening” or “viewing”. You can’t apply human legal paradigms to AI models.