• Embargo@lemm.ee · ↑29 ↓3 · edited · 1 day ago

    Oh no! How will I generate a picture of Sam Altman blowing himself now!?

  • latenightnoir@lemmy.world · ↑10 ↓3 · edited · 1 day ago

    Sad to see you leave (not really, tho’), love to watch you go!

    Edit: I bet that if any AI-developing company stopped being so damned shady and just ASKED FOR PERMISSION, it would receive a huge amount of data from all over. Plenty of people would like to see AGI become a real thing, just not if it’s being developed by greedy and unscrupulous shitheads. As it stands, I think the only ones actually doing it for the R&D, and not as eye candy to glitz away people’s money for aesthetically believable nonsense, are a handful of start-up-likes with (not in a condescending way) kids who’ve yet to have their dreams and idealism trampled.

    • daniskarma@lemmy.dbzer0.com · ↑3 · 1 day ago

      In Spain we trained an AI on a mix of publicly available AI-training resources and public-sector material (legislation, congress sessions, etc.), and it turned out quite good. Obviously not top of the line, but very good overall.

      It was a public project, not a private company.

    • HakFoo@lemmy.sdf.org · ↑4 · 1 day ago

      But what data would it be?

      Part of the “gobble all the data” perspective is that you need a broad corpus for the model to be meaningfully useful. Not many people are going to give you an $892 billion market cap when your model is a genius about only a handful of narrow subjects that you could get deep volunteer support on.

      OTOH, maybe there’s a sane business in narrow, siloed AI products (cheap, efficient, and with more bounded expectations): the reinvention of the “expert system” with clear guardrails, the image generator that only does seaside background landscapes but can’t generate a cat to save its life, the LLM that’s a prettified version of a knowledge-base search and NOTHING MORE.

      • latenightnoir@lemmy.world · ↑2 ↓1 · edited · 1 day ago

        You’ve highlighted exactly why I also fundamentally disagree with the current trend of all things AI being for-profit. This should be 100% non-profit and driven purely by scientific goals, in which case using copyrighted data wouldn’t even be an issue in the first place… It’d be like literally giving someone access to a public library.

        Edit: but to focus on this specific instance, where we have to deal with the here and now, I could see them receiving, say, 60-75% of the data they have now, hassle-free. At the very least, and uniformly distributed. Again, AI development isn’t what irks most people; it’s calling plagiarism generators and search-engine fuck-ups AI and selling them back to the people who generated the databases used for those abhorrences - or, worse, working toward replacing those people entirely with LLMs!

        Train the AI to be factually correct instead and sell it as an easy-to-use knowledge base? Aces! Train the AI to write better code and sell it as an on-board stackoverflow Jr.? Amazing! Even having it as a mini-assistant on your phone, so that you have someone to pester you to get the damned laundry out of the washing machine before it starts to stink, is a neat thing - but that would require less advertising and shoving it down our throats, and more accepting the fact that you can still do that with five taps and a couple of alarm entries.

        Edit 2: oh, and another thing which would require a buttload of humility but would alleviate a lot of tension: get it to cite and link to its sources every time! Have it be transformative enough to give you the gist without shifting into plagiarism, then send you to the source for the details!

  • CrazyLikeGollum@lemmy.world · ↑4 · 2 hours ago

    What’s wrong with the sentiment expressed in the headline? AI training is not and should not be considered fair use. Also, copyright laws are broken in the west, more so in the east.

    We need a global reform of copyright, where copyrights can (and must) be shared among all creators credited on a work. The copyright must be held by actual people, not corporations (or any other collective entity), and it should end after 30 years or when all rights holders die, whichever happens first. The term should start at the date of initial publication. The copyright should be nontransferable, but it should be licensable to another entity with the majority consent of all rights holders. At the expiration of the copyright, the work in question should immediately enter the public domain.

    And fair use should be treated similarly to how it is in the west, where it’s decided on a case-by-case basis, but context and profit motive matter.

  • TheBrideWoreCrimson@sopuli.xyz · ↑4 · 21 hours ago

    My main takeaway is that some contrived notion of “national security” has now become an acceptable justification for business decisions in the US.

  • shaggyb@lemmy.world · ↑15 ↓2 · 1 day ago

    “How am I supposed to make any money if I can’t steal all of my products to sell back to the world that produced them?”

    Yeah, fuck that. The whole industry deserves to die.

  • geography082@lemm.ee · ↑2 · 10 minutes ago

    Fuck these psychos. They should pay for the copyrighted works they stole with the billions they’ve already made. Governments should protect people, MDF

  • Shanmugha@lemmy.world · ↑2 · 18 hours ago

    National security, my ass. More like his window to show off more dumb “achievements” while getting richer depends on it, and nothing else.

  • JHD@lemmy.world · ↑1 ↓3 · 1 day ago

    Strange that no one mentioned OpenAI making money off copyrighted works.

  • SlopppyEngineer@lemmy.world · ↑3 · 1 day ago

    He’s afraid of losing his little empire.

    OpenAI also had no clue how to recreate the happy little accident that gave them GPT-3. That’s mostly because their whole approach was taking a simple model and brute-forcing it with more data, more power, more nodes, and then even more data and power until it produced results.

    As expected, this isn’t sustainable. It’s past the point of diminishing returns. But Sam here has no idea how to fix that with better models, so he goes back to the one thing he knows: more data. Just one more terabyte, bro - ignore the copyright!

    And now he’s blaming the Chinese for forcing him to use even more data.

  • Greyfoxsolid@lemmy.world · ↑5 ↓6 · 5 hours ago

    Sorry to say, but he’s right. For AI to truly flourish in the West, it needs access to all previously human-made information and media.