First, they restricted code search without logging in, so I've been using Sourcegraph. But now I can't even view discussions or wikis without logging in.

It was a nice run

  • Omega_Haxors@lemmy.ml · 10 months ago

    The writing was on the wall when they trained a generative AI on everyone's code, of course without asking anyone for permission.

    • Elise@beehaw.org · 10 months ago

      It's an interesting debate, isn't it? Does AI transform something free into something that's not? Or does it simply study the code?

      • chebra@mstdn.io · 10 months ago

        @xilliah It’s not free though. It came with licenses. And LLMs don’t have the capability to “study”, they are just a glorified random word generator.

      • Omega_Haxors@lemmy.ml · 10 months ago (edited)

        There’s no debate. LLMs are plagiarism with extra steps. They take data (usually illegally) wholesale and then launder it.

        A lot of people have been doing research into the ethics of these systems and that’s more or less what they found. The reason why they’re black boxes is precisely the reason we all suspected; they were made that way because if they weren’t we’d all see them for what they are.

        • AnonStoleMyPants@sopuli.xyz · 10 months ago

          The reason they’re black boxes is because that’s how LLMs work. Nothing new here, neural networks have been basically black boxes for a long time.

          • Kaldo@kbin.social · 10 months ago (edited)

            Sure, but nothing theoretically stops them from documenting every single data source fed into training and then crediting it later.

            For some reason they didn’t want to do that of course.

        • count_duckula@discuss.tchncs.de · 10 months ago (edited)

          The reason they are black boxes is that they are function approximators with billions of parameters, and theory has not caught up with practical results. This is why you tune hyperparameters (learning rate, number of layers, number of neurons in a layer, etc.) and run multiple iterations of training to get an approximation of the distribution of the inputs.

          Training is also sensitive to the order of inputs to the network: a network trained on the same training set but in a different order might converge to an entirely different function. This is why you train on the same inputs in random order over multiple epochs, to hopefully average out such variations. They are black boxes simply because you can't yet prove theoretically which function they have approximated or converged to given the input.
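          The per-epoch reshuffling described above can be sketched in a few lines. This is a hypothetical toy example (a single-weight model fit with stochastic gradient descent, not any particular framework's API), just to show why reshuffling each epoch keeps the result from depending on one fixed presentation order:

```python
import random

def sgd_fit(data, epochs=200, lr=0.1, seed=0):
    """Fit a single weight w so that w*x approximates y, via SGD.

    The training data is reshuffled at the start of every epoch, so the
    final w does not hinge on any one fixed ordering of the examples.
    """
    rng = random.Random(seed)  # seeded for reproducibility of the demo
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)                # new example order each epoch
        for x, y in data:
            grad = 2 * (w * x - y) * x   # d/dw of the squared error (w*x - y)**2
            w -= lr * grad
    return w

# Points lying exactly on y = 2x: w should converge near 2.0
# regardless of which ordering the shuffles happen to produce.
w = sgd_fit([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

          On noisier, non-convex problems (like a real neural network) different orderings can settle into genuinely different solutions, which is exactly the sensitivity the comment describes.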

            • Turun@feddit.de · 10 months ago

            I doubt they have a factual basis for their opinion, considering

                they were made that way because if they weren't we'd all see them for what they are

            is just plain wrong. Researchers would love to have a non-black-box AI (i.e. a white-box AI), but that's unfortunately impossible with the current architecture.

              • Elise@beehaw.org · 10 months ago (edited)

              Their use of language also feels more emotional, and if anything it makes me more skeptical.