OpenAI Says It’s "Over" If It Can’t Steal All Your Copyrighted Work

DeadNinja@lemmy.world · 4 months ago

OpenAI Says It’s "Over" If It Can’t Steal All Your Copyrighted Work

nothacking@discuss.tchncs.de · 4 months ago

But when China steals all their (arguably not copywrite-able) work…

turnip@sh.itjust.works · 4 months ago

Sam Altman hasn’t complained surprisingly, he just said there’s competition and it will be harder for OpenAI to compete with open source. I think their small lead is essentially gone, and their plan is now to suckle Microsoft’s teet.

HiddenLayer555@lemmy.ml · 4 months ago

it will be harder for OpenAI to compete with open source

Can we revoke the word open from their name? Please?

Phoenixz@lemmy.ca · 4 months ago

This is a tough one

Open-ai is full of shit and should die but then again, so should copyright law as it currently is

PropaGandalf@lemmy.world · 4 months ago

yes, screw them both. let altman scrape all the copyright material and choke on it

meathappening@lemmy.ml · edit-2 4 months ago

That’s fair, but OpenAI isn’t fighting to reform copyright law for everyone. OpenAI wants you to be subject to the same restrictions you currently face, and them to be exempt. This isn’t really an “enemy of my enemy” situation.

Melvin_Ferd@lemmy.world · edit-2 4 months ago

Is anyone trying to make stronger copyright laws? Wouldn’t be rich people that control media would it?

Onno (VK6FLAB)@lemmy.radio · 4 months ago

“Your proposal is acceptable.”

TachyonTele@lemm.ee · 4 months ago

Darn

misunderstooddemon @reddthat.com · 4 months ago

Too bad, so sad

AfricanExpansionist@lemmy.ml · 4 months ago

Obligatory: I’m anti-AI, mostly anti-technology

That said, I can’t say that I mind LLMs using copyrighted materials that it accesses legally/appropriately (lots of copyrighted content may be freely available to some extent, like news articles or song lyrics)

I’m open to arguments correcting me. I’d prefer to have another reason to be against this technology, not arguing on the side of frauds like Sam Altman. Here’s my take:

All content created by humans follows consumption of other content. If I read lots of Vonnegut, I should be able to churn out prose that roughly (or precisely) includes his idiosyncrasies as a writer. We read more than one author; we read dozens or hundreds over our lifetimes. Likewise musicians, film directors, etc etc.

If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?

kibiz0r@midwest.social · 4 months ago

If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?

That is the trillion-dollar question, isn’t it?

I’ve got two thoughts to frame the question, but I won’t give an answer.

Laws are just social constructs, to help people get along with each other. They’re not supposed to be grand universal moral frameworks, or coherent/consistent philosophies. They’re always full of contradictions. So… does it even matter if it’s “meaningfully” different or not, if it’s socially useful to treat it as different (or not)?
We’ve seen with digital locks, gig work, algorithmic market manipulation, and playing either side of Section 230 when convenient… that the ethos of big tech is pretty much “define what’s illegal, so I can colonize the precise border of illegality, to a fractal level of granularity”. I’m not super stoked to come with an objective quantitative framework for them to follow, cuz I know they’ll just flow around it like water and continue to find ways to do antisocial shit in ways that technically follow the rules.

catloaf@lemm.ee · 4 months ago

In your example, you could also be sued for ripping off his style.

Bassman1805@lemmy.world · 4 months ago

You can sue for anything in the USA. But it is pretty much impossible to successfully sue for “ripping off someone’s style”. Where do you even begin to define a writing style?

iAmTheTot@sh.itjust.works · 4 months ago

“style”, in terms of composition, is actually a component in proving plagiarism.

catloaf@lemm.ee · 4 months ago

There are lots of ways to characterize writing style. Go read Finnegans Wake and tell me James Joyce doesn’t have a characteristic style.

MrQuallzin@lemmy.world · edit-2 4 months ago

Edited for clarity: If that were the case then Weird AL would be screwed.

Original: In that case Weird AL would be screwed

ryedaft@sh.itjust.works · 4 months ago

No because what he does is already a settled part of the law.

MrQuallzin@lemmy.world · 4 months ago

That’s the point. It’s established law so OP wouldn’t be sued

Pennomi@lemmy.world · 4 months ago

Right. The problem is not the fact it consumes the information, the problem is if the user uses it to violate copyright. It’s just a tool after all.

Like, I’m capable of violating copyright in infinitely many ways, but I usually don’t.

SoulWager@lemmy.ml · edit-2 4 months ago

The problem is that the user usually can’t tell if the AI output is infringing someone’s copyright or not unless they’ve seen all the training data.

A_norny_mousse@feddit.org · 4 months ago

Except the reason Altman is so upset has nothing to do with this very valid discussion.

As I commented elsewhere:

Fuck Sam Altmann, the fartsniffer who convinced himself & a few other dumb people that his company really has the leverage to make such demands.

He doesn’t care about democracy, he’s just scared because a chinese company offers what his company offers, but for a fraction of the price/resources.

He’s scared for his government money and basically begging for one more handout “to save democracy”.

Yes, I’ve been listening to Ed Zitron.

ricecake@sh.itjust.works · 4 months ago

Yup. Violating IP licenses is a great reason to prevent it. According to current law, if they get Alice license for the book they should be able to use it how they want.
I’m not permitted to pirate a book just because I only intend to read it and then give it back. AI shouldn’t be able to either if people can’t.

Beyond that, we need to accept that might need to come up with new rules for new technology. There’s a lot of people, notably artists, who object to art they put on their website being used for training. Under current law if you make it publicly available, people can download it and use it on their computer as long as they don’t distribute it. That current law allows something we don’t want doesn’t mean we need to find a way to interpret current law as not allowing it, it just means we need new laws that say “fair use for people is not the same as fair use for AI training”.

droplet6585@lemmy.ml · edit-2 4 months ago

and learns how to copy its various characteristics

Because you are a human. Not an immortal corporation.

I am tired of people trying to have iNtElLeCtUaL dIsCuSsIoN about/with entities that would feed you feet first into a wood chipper if it thought it could profit from it.

fartsparkles@lemmy.world · 4 months ago

If this passes, piracy websites can rebrand as AI training material websites and we can all run a crappy model locally to train on pirated material.

A_norny_mousse@feddit.org · 4 months ago

You are a glass half full sort of person!

underisk@lemmy.ml · 4 months ago

That would work if you were rich and friends with government officials. I don’t like your chances otherwise.

moreeni@lemm.ee · 4 months ago

Another win for piracy community

Knock_Knock_Lemmy_In@lemmy.world · 4 months ago

Fuck it. I’m training my home AI will the world’s TV, Movies and Books.

SplashJackson@lemmy.ca · 4 months ago

OpenAI can open their asses and go fuck themselves!

surph_ninja@lemmy.world · edit-2 4 months ago

This is why they killed that former employee.

Sazruk@lemmy.wtf · 4 months ago

Say his name y’all

Suchir Balaji

surph_ninja@lemmy.world · 4 months ago

Sorry, wasn’t trying to be a dick. Just couldn’t think of it at the time.

Sazruk@lemmy.wtf · 4 months ago

No worries wasn’t trying to call you out just making a point

schnurrito@discuss.tchncs.de · 4 months ago

If It Can’t Steal All Your Copyrighted Work

https://commons.wikimedia.org/wiki/File:Copying_Is_Not_Theft.webm

Niquarl@lemmy.ml · 4 months ago

Of course it is if you copy to monetise which is what they do.

droplet6585@lemmy.ml · 4 months ago

They monetize it, erase authorship and bastardize the work.

Like if copyright was to protect against anything, it would be this.

Creosm@lemmy.world · 4 months ago

Oh it’s “over”? Fine for me

VeryInterestingTable@lemm.ee · 4 months ago

Ho no, what will we do without degenerate generative AIs?!

rumba@lemmy.zip · 4 months ago

Okay, I can work with this. Hey Altman you can train on anything that’s public domain, now go take those fuck ton of billions and fight the copyright laws to make public domain make sense again.

meathappening@lemmy.ml · 4 months ago

This is the correct answer. Never forget that US copyright law originally allowed for a 14 year (renewable for 14 more years) term. Now copyright holders are able to:

reach consumers more quickly and easily using the internet
market on more fronts (merch didn’t exist in 1710)
form other business types to better hold/manage IP

So much in the modern world exists to enable copyright holders, but terms are longer than ever. It’s insane.

melpomenesclevage@lemmy.dbzer0.com · edit-2 4 months ago

Removed by mod

rumba@lemmy.zip · 4 months ago

Counter counterpoint: I don’t know, I think making an exception for tech companies probably gives a minor advantage to consumers at least.

You can still go to copilot and ask it for some pretty fucking off the wall python and bash, it’ll save you a good 20 minutes of writing something and it’ll already be documented and generally best practice.

Sure the tech companies are the one walking away with billions of dollars and it presumably hurts the content creators and copyright holders.

The problem is, feeding AI is not significantly different than feeding Google back in the day. You remember back when you could see cached versions of web pages. And hell their book scanning initiative to this day is super fucking useful.

If you look at how we teach and train artists. And then how those artists do their work. All digital art and most painting these days has reference art all over the place. AI is taking random noise and slowly making things look more like the reference art that’s not wholly different than what people are doing.

We’re training AI on every book that people can get their hands on, But that’s how we train people too.

I say that training an AI is not that different than training people, and the entire content of all the copyright they look at in their lives doesn’t get a chunk of the money when they write a book or paint something that looks like the style of Van Gogh. They’re even allowed to generate content for private companies or for sale.

What is different, is that the AI is very good at this and has machine levels of retention and abilities. And companies are poised to get rich off of the computational work. So I’m actually perfectly down with AI’s being trained on copyrighted materials as long as they can’t recite it directly and in whole, But I feel the models that are created using these techniques should also be in the public domain.

melpomenesclevage@lemmy.dbzer0.com · edit-2 4 months ago

Removed by mod

ef9357@lemmy.world · 4 months ago

Good, go away.

Melvin_Ferd@lemmy.world · 4 months ago

Let them. Copyright is bullshit. What’s the issue. He’s right

kn0wmad1c@programming.dev · 4 months ago