this post was submitted on 02 Oct 2023
120 points (100.0% liked)

Technology


[–] Moobythegoldensock@lemm.ee 26 points 1 year ago (5 children)

At the crux of the authors' lawsuit is the argument that OpenAI is ruthlessly mining their material to create "derivative works" that will "replace the very writings it copied."

The authors shoot down OpenAI's excuse that "substantial similarity is a mandatory feature of all copyright-infringement claims," calling it "flat wrong."

Goodbye Star Wars, Avatar, Tarantino’s entire filmography, every slasher film since 1974…

[–] taanegl 19 points 1 year ago (2 children)

Uh, yeah, a massive corporation sucking up all intellectual property to milk it is not the own you think it is.

[–] Umbrias 14 points 1 year ago* (last edited 1 year ago) (1 children)

But this is literally people trying to strengthen copyright and its scope. The corporation is, out of pure convenience, using copyright as it currently exists, with the current freedoms applied to artists.

The fix to the issues with AI displacing a market for artists isn't yet stronger copyright.

[–] taanegl 3 points 1 year ago* (last edited 1 year ago) (1 children)

Listen, it's pretty simple. Copyright was made to protect creators on initial introduction to market. In modern times, it's good if an artist has one lifetime, i.e. their lifetime of royalties, so that they can at least make a little something - because for the small artist, that little something means food on their plate.

But a company, sitting on a Smaug's hill worth of intellectual property, "forever less a day"? Now that's bonkers.

But you, scraping my artwork to resell for pennies on the dollar via some stock material portal? Can I maybe crawl up your colon with sharp objects and kindling to set up a fire? Pretty please? Oh pretty please!

Also, if your AI copies my writing style, I will personally find you, rip open your skull AND EAT YOUR BRAINS WITH A SPOON!!!! Got it, devboy?

You won't be Mr. Hotshot with a pointy object and a fire up your ass, as well as less than half a brain... even though I just took a couple of bites.

Chew on that one.

EDIT: the creative writer is doomed, I tells ya! DOOOOOOMED!

[–] Umbrias 9 points 1 year ago (2 children)

This is remarkably aggressive and assumptive. It also addresses none of my beliefs substantively so not much to really chew on there.

You let me know if you ever want to chat about the issue, but right now it looks like you just want to vent. Feel free to do that but I'm not going to just be an object of your anger.

[–] winky88@startrek.website 2 points 1 year ago* (last edited 1 year ago) (2 children)

Bleeding hearts rarely do their cause justice (referring to the person you replied to)

[–] Even_Adder@lemmy.dbzer0.com 10 points 1 year ago* (last edited 1 year ago)

AI training isn't only for mega-corporations. We can already train our own open source models, so we shouldn't applaud someone trying to erode our rights and let people put up barriers that will keep out all but the ultra-wealthy. We need to be careful not to weaken fair use and hand corporations a monopoly on a public technology by making it prohibitively expensive for regular people to keep developing our own models. Mega corporations already have their own datasets, and the money to buy more. They can also make users sign predatory ToS allowing them exclusive access to user data, effectively selling our own data back to us. Regular people, who could have had access to a corporate-independent tool for creativity, education, entertainment, and social mobility, would instead be left worse off with fewer rights than where they started.

[–] HawlSera@lemm.ee 13 points 1 year ago* (last edited 1 year ago)

This actually reminds me of a sci-fi series I read where, in the future, they use an AI to scan any new work to see what intellectual property the big corporations own that may have been used as an influence, in order to halt the production of any new media not tied to a pre-existing IP, including 100% of independent and fan-made works.

Which is one of the contributing factors towards the apocalypse. So 500 years later, after the apocalypse has been reversed and human colonies are enjoying post-scarcity, one of the biggest fads is rediscovering the 20th century, now that all the copyrights have expired and people can datamine the ruins of Earth to find all the media that couldn't be properly preserved heading into Armageddon thanks to copyright trolling.

It's referred to in universe as "Twencen"

The series is called FreeRIDErs, if anyone is curious. Unfortunately, the series may never have a conclusion (untimely death of a co-creator), but most of its story arcs were finished, so there's still a good chunk of meat to chew through, and I highly recommend it.

[–] anachronist@midwest.social 7 points 1 year ago (4 children)

OpenAI is trying to argue that the whole work has to be similar to infringe, but that's never been true. You can write a novel and infringe on page 302, and that's a copyright infringement. OpenAI is trying to change the meaning of copyright, because otherwise the output of their model is oozing with various infringements.

[–] FaceDeer@kbin.social 21 points 1 year ago

Did anyone expect them to go "oh, okay, that makes sense after all"?

[–] ZILtoid1991@kbin.social 18 points 1 year ago (2 children)

seethe

Very concerning word use from you.

The issue art faces isn't that there's not enough throughput, but rather that there's not enough time, both to make it and to enjoy it.

[–] mkhoury@lemmy.ca 10 points 1 year ago

That's always been the case, though, imo. People had to make time for art. They had to go to galleries, see plays and listen to music. To me it's about the fair promotion of art, and the ability for the art enjoyer to find art that they themselves enjoy rather than what some business model requires of them, and the ability for art creators to find a niche and to be able to work on their art as much as they would want to.

[–] sadreality@kbin.social 5 points 1 year ago

Headline is stupid.

Millennial journalism has fucking got to stop with these clown word choices...

[–] archomrade@midwest.social 16 points 1 year ago (1 children)

Copyright is already just a band-aid for what is really an issue of resource allocation.

If writers and artists weren't at risk of losing their means of living, we wouldn't need to concern ourselves with the threat of an advanced tool supplanting them. Never mind how the tool is created, it is clearly very valuable (otherwise it would not represent such a large threat to writers) and should be made as broadly available (and jointly-owned and controlled) as possible. By expanding copyright like this, all we're doing is gatekeeping the creation of AI models to the largest of tech companies, and making them prohibitively expensive to train for smaller applications.

If LLMs are truly the start of a "fourth industrial revolution" as some have claimed, then we need to consider the possibility that our economic arrangement is ill-suited for the kind of productivity it is said AI will bring. Private ownership (over creative works, and over AI models, and over data) is getting in the way of what could be a beautiful technological advancement that benefits everyone.

Instead, we're left squabbling over who gets to own what and how.

[–] Franzia@lemmy.blahaj.zone 4 points 1 year ago (1 children)

"fourth industrial revolution" as some have claimed

The people claiming this are often the shareholders themselves.

prohibitively expensive to train for smaller applications.

There is so much work out there for free, with no copyright. The biggest cost in training is most likely the hardware, and I see no added value in having AI train on Stephen King ☠️

Copyright is already just a band-aid for what is really an issue of resource allocation.

God damn right, but I want our government to put a band-aid on capitalists just stealing whatever the fuck they want, "move fast and break things" style. It's yet another test of my confidence in the state. Every issue is a litmus test for how our society deals with the problems that arise.

[–] archomrade@midwest.social 3 points 1 year ago (1 children)

There is so much work out there for free, with no copyright

There's actually a lot less than you'd think (since copyright lasts for so long), but even less now that any online and digitized sources are being locked down and charged for by the domain owners. But even if it were abundant, it would likely not satisfy the true concern here. If there were enough data to produce an LLM of similar quality without using copyrighted data, it would still threaten the security of those writers. What's to say a user couldn't provide a sample of Stephen King's writing to the LLM and have it still produce derivative work without having trained it on copyrighted data? If the user had paid for that work, are they allowed to use the LLM in the same way? If they aren't, who is really at fault, the user or the owner of the LLM?

The law can't address the complaints of these writers because interpreting it that way is simply too restrictive and sets an impossible standard. The best way to address the complaint is to simply reform copyright law (or regulate LLMs through some other mechanism). Frankly, I do not buy that LLMs are a competing product to the copyrighted works.

The biggest cost in training is most likely the hardware

That's right for large models like the ones owned by OpenAI and Google, but with the amount of data needed to effectively train and fine-tune these models, if that data suddenly became scarce and expensive it could easily overtake hardware cost. To say nothing of small consumer models that are run on consumer hardware.

capitalists just stealing whatever the fuck they want “move fast and break things”

I understand this sentiment, but keep in mind that copyright ownership is just another form of capital.

[–] autotldr@lemmings.world 9 points 1 year ago

This is the best summary I could come up with:


ChatGPT creator OpenAI has been on the receiving end of two high profile lawsuits by authors who are absolutely livid that the AI startup used their writing to train its large language models, which they say amounts to flouting copyright laws without any form of compensation.

One of the lawsuits, led by comedian and memoirist Sarah Silverman, is playing out in a California federal court, where the plaintiffs recently delivered a scolding on ChatGPT's underlying technology.

At the crux of the authors' lawsuit is the argument that OpenAI is ruthlessly mining their material to create "derivative works" that will "replace the very writings it copied."

The authors shoot down OpenAI's excuse that "substantial similarity is a mandatory feature of all copyright-infringement claims," calling it "flat wrong."

It can brag that it's a leader in a booming AI industry, but in doing so it's also painted a bigger target on its back, making enemies of practically every creative pursuit.

High profile literary luminaries behind that suit include George R. R. Martin, Jonathan Franzen, David Baldacci, and legal thriller maestro John Grisham.


The original article contains 369 words, the summary contains 180 words. Saved 51%. I'm a bot and I'm open source!

[–] Madison_rogue@kbin.social 6 points 1 year ago* (last edited 1 year ago)

Here’s current guidance from US Congress regarding AI copyright infringement.

Page 3 includes guidance on fair use.

https://crsreports.congress.gov/product/pdf/LSB/LSB10922

[–] beejjorgensen@lemmy.sdf.org 4 points 1 year ago

"substantial similarity is a mandatory feature of all copyright-infringement claims"

Is that not a requirement? Time for me to start suing people!

[–] JokeDeity@lemm.ee 3 points 1 year ago (1 children)

Wah. Waaaah. Cry more, rich people.

[–] Franzia@lemmy.blahaj.zone 10 points 1 year ago (1 children)

Writers are rich because they've made artwork and sold it. I personally hold that in higher regard than CEOs.

[–] floofloof@lemmy.ca 4 points 1 year ago

And while these ones may not be badly off, most writers are far from rich.
