this post was submitted on 01 Feb 2024
330 points (100.0% liked)

Memes

1358 readers
34 users here now

Rules:

  1. Be civil and nice.
  2. Try not to repost excessively; as a rule of thumb, wait at least 2 months if you have to.

founded 5 years ago
very upsetting (lemmy.ml)
submitted 9 months ago* (last edited 9 months ago) by cypherpunks@lemmy.ml to c/memes@lemmy.ml
 

caption: a screenshot of the text:

Tech companies argued in comments on the website that the way their models ingested creative content was innovative and legal. The venture capital firm Andreessen Horowitz, which has several investments in A.I. start-ups, warned in its comments that any slowdown for A.I. companies in consuming content “would upset at least a decade’s worth of investment-backed expectations that were premised on the current understanding of the scope of copyright protection in this country.”

underneath the screenshot is the "Oh no! Anyway" meme, featuring two pictures of Jeremy Clarkson saying "Oh no!" and "Anyway"

screenshot (copied from this mastodon post) is of a paragraph of the NYT article "The Sleepy Copyright Office in the Middle of a High-Stakes Clash Over A.I."

top 15 comments
[–] vzq@lemmy.blahaj.zone 53 points 9 months ago* (last edited 9 months ago) (2 children)

We need copyright reform. Life of author plus 70 for everything is just nuts.

This is not an AI problem. This is a companies-literally-owning-our-culture problem.

[–] MustrumR@kbin.social 22 points 9 months ago

Going one step deeper, at the source, it's oligarchy and companies owning the law and in consequence also its enforcement.

[–] OmnipotentEntity 7 points 9 months ago (1 children)

If copyright reform means nothing more than granting tech companies unlimited power to hoover up whatever they want and put it in their models, it's not going to be the egalitarian sort of copyright reform that we need. Instead, we will just get a carve-out for this one use, which is ridiculous.

There are small creators who do need at least some sort of copyright protection, because ultimately people should be paid for the work they do. Artists who work on commission are directly in the firing line of generative AI, both for commissions and in their day jobs. This will harm them more than it will harm any particular company. I don't think models would suffer if they could only include works in the public domain, even if the public domain started in 2003, but that's not the kind of copyright protection that Amazon, Google, Facebook, etc. want, and it's not what they're going to ask for.

[–] Rivalarrival@lemmy.today 1 points 9 months ago (1 children)

Copyright protects against creating and distributing copies. Copyright does not protect against reading and understanding a work.

What LLMs and other models are doing is analogous to reading a book and writing a book report. They are not regurgitating a copy of the book to users. They are not creating or distributing a copy.

The purpose of copyright law is to promote the progress of Science and the Useful Arts. The purpose is to expand the depth and breadth of human knowledge and technology. "Fair Use" is not an exception: "Fair Use" is the purpose. "Copyright" is the exception.

If technology is fundamentally incompatible with copyright law, that technology has the right-of-way, and copyright must yield.

[–] OmnipotentEntity 2 points 9 months ago

What LLMs and other models are doing is analogous to reading a book and writing a book report.

It is purported to be analogous to that. But given that in actuality it can also reproduce nearly entire articles word for word from a short prompt, it's clear that the analogy you are attempting to draw is flawed. Inside the LLM, encoded in the weights and biases of the network, is that article and many others; they have been copied into the network, encoded, and can be referenced.

The Pile is 825 GiB of text. GPT-4 is reportedly about 400 billion parameters, and at 2 bytes each that's 800 GB (about 745 GiB) of weights. There's certainly enough redundancy in whatever corpus they're using to just memorize most of it and still have sufficient network capacity left over to actually make some sense of it.
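The back-of-the-envelope comparison above can be checked in a few lines. Note that the 400-billion-parameter figure for GPT-4 is the commenter's assumption, not a published spec:

```python
# Rough capacity comparison: model weights vs. The Pile.
# The 400B-parameter count is an assumption from the comment above,
# not an official figure.
PILE_BYTES = 825 * 1024**3      # The Pile: 825 GiB of text
params = 400e9                  # assumed parameter count
bytes_per_param = 2             # fp16/bf16 stores 2 bytes per parameter

model_bytes = params * bytes_per_param

print(f"model weights: {model_bytes / 1024**3:.0f} GiB")  # ~745 GiB
print(f"The Pile:      {PILE_BYTES / 1024**3:.0f} GiB")   # 825 GiB
print(f"ratio:         {model_bytes / PILE_BYTES:.2f}")   # ~0.90
```

So under these assumptions the weights alone are roughly 90% the size of the corpus, which is the redundancy argument the comment is making.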

[–] far_university1990@feddit.de 13 points 9 months ago (2 children)

Either this kills large AI models (at least commercial ones), or it kills some copyright BS in some way. Whatever happens, society wins.

The second option could also hurt small creators, though.

[–] LarmyOfLone@lemm.ee 6 points 9 months ago (1 children)

I fear this is a giant power grab. What this will lead to is IP holders, those that own the content AI needs to train on, dictating prices. So all the social media content you kindly gave Reddit, Facebook, Twitter, the pictures, all that stuff, means you won't be able to have any free AI software.

No free / open source AI software means a massive power imbalance, because then only those who can afford to buy this training data can build models, and they are forced to maximize profits (and naturally inclined to anyway).

Basically they will own the "means of generation" while we won't.

[–] far_university1990@feddit.de 3 points 9 months ago (1 children)

Current large models would all be sued to death: no licenses with IP owners exist yet, so this would kill all existing commercial large models. Unless every IP owner is identified and licenses are granted retroactively, but that sounds unlikely.

Hundreds of IP-owning companies and billions of individual IP owners setting prices will probably behave like streaming: price increases and endless fragmentation. You would need a license from every IP owner, so the paperwork would be enormous. Licenses might change or expire, the same problem as streaming, but every time a license expires you would need to retrain the entire model (or you infringe, because the model keeps using the data).

And in the EU there is the right to be forgotten, which would mean being excluded from models (because in this case it is not transformative enough; IANAL, but it sounds like it counts as storing), so every time someone wants to be excluded, you retrain the entire model.

I do not see how it would be possible to create large models like this with any amount of money, time, or electricity. Maybe some smaller models. Maybe models more specific to one task.

Also, piracy exists and does not care about copyright; people will just train anyway and maybe even open source the result (torrents). They might get caught, they might not; it might become a dark market, I don't know. It will exist though, like deepfakes.

[–] LarmyOfLone@lemm.ee 2 points 9 months ago

Yeah, those are the myriad complications this will cause. People are worried about AI, and I am too, but we need smart regulation, not IP laws that only increase the power of the ultra-rich. Because if AI continues to exist, that approach will severely distort the market and limit it to a few specific, powerful entities. And that is almost certainly going to be worse than leaving it completely unregulated.

[–] Honytawk@lemmy.zip 5 points 9 months ago

I know plenty of small creators who urge me to pirate their content.

Because all they want is people to enjoy their content, and piracy helps spread their art.

So even small creators are against copyright.

[–] peak_dunning_krueger@feddit.de 10 points 9 months ago

I mean, I won't deny the small bit of skill it took to construct a plausible-sounding explanation for why the public should support your investment, because it's "not illegal (yet)".

[–] ShortN0te@lemmy.ml 7 points 9 months ago

That's the point of money: if you have enough, you can simply sue or bribe your way out of losing it.

[–] Floshie@lemmy.blahaj.zone 1 points 9 months ago (2 children)

Can someone rephrase this for me? I've read it twice and I really don't get it.

[–] dylanmorgan@slrpnk.net 7 points 9 months ago

As Robert Evans put it: “If we can’t steal every book ever written, we’ll go broke!”

[–] MajorHavoc@programming.dev 4 points 9 months ago

A scammer made unreasonable promises to investors and is now warning everyone that their victims/investors are going to lose money when the process of making fair laws takes the amount of time it always takes.