Google Researchers’ Attack Prompts ChatGPT to Reveal Its Training Data

btp@kbin.social · edit-2 10 months ago

Google Researchers’ Attack Prompts ChatGPT to Reveal Its Training Data

JackGreenEarth · 10 months ago

CNN, Goodreads, WordPress blogs, fandom wikis, Terms of Service agreements, Stack Overflow source code, Wikipedia pages, news blogs, random internet comments

Those are all publicly available data sites. It’s not telling you anything you couldn’t know yourself already without it.

stolid_agnostic@lemmy.ml · 10 months ago

I think the point is that it doesn’t matter how you got it, you still have an ethical responsibility to protect PII/PHI.

BraveSirZaphod@kbin.social · 10 months ago

Being publicly accessible does not void copyright. You can read a CNN article for free, but if you start printing them out and selling a CNN print edition, you’ll have very expensive lawyers talking to you very quickly.

Deceptichum@kbin.social · 10 months ago

Fuck copyright.

BraveSirZaphod@kbin.social · 10 months ago

That’s very nice for you, but unfortunately I don’t think a judge will find that to be a particularly persuasive argument.

Deceptichum@kbin.social · 10 months ago

Oh no, that poor judge.

JackGreenEarth · 10 months ago

If you make money from it, that’s fair that you are violating copyright. If you print it out and freely distribute the printed copies, it shouldn’t make any difference to CNN’s finances though.

BraveSirZaphod@kbin.social · 10 months ago

You’re still violating copyright though. That a copyright holder may allow you to violate its copyright because they don’t care enough to prosecute doesn’t mean that you have the inherent right to do so (though doing so without seeking profit makes it more likely that you can successfully argue fair use).

Even in this contrived example, CNN could plausibly argue that, by providing their news for free, those receiving the prints are less likely to watch the live channel, which reduces viewership, makes ad space less valuable, and ultimately harms their revenue by reducing how much they can change for commercials.

Fair use is a real thing, but it’s an argument that you have to affirmatively make and has specific criteria that a judge must accept. Non-commercial use doesn’t guarantee fair use, though it does help your case.