this post was submitted on 02 Jul 2023
324 points (100.0% liked)

Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ


cross-posted from: https://lemmy.intai.tech/post/43759

cross-posted from: https://lemmy.world/post/949452

OpenAI's ChatGPT and Sam Altman are in massive trouble. OpenAI is being sued in the US for illegally using content from the internet to train its LLM (large language model).

top 38 comments
[–] RatzChatsubo@vlemmy.net 72 points 1 year ago

So we can sue robots, but when I ask if we can tax them and reduce human working hours, I'm the crazy one?

[–] Geograph6@lemmy.dbzer0.com 56 points 1 year ago (3 children)

People talk about OpenAI as if it's some utopian saviour that's going to revolutionise society, when in reality it's a large corporation flooding the internet with terrible low-quality content using machine learning models that have existed for years. And the fields it is "automating" are creative ones that specifically require a human touch, like art and writing. Large language models and image generation aren't going to improve anything. They're not "AI" and they never will be. Hopefully when AI does exist and does start automating everything we'll have a better economic system though :D

[–] fiasco@possumpat.io 10 points 1 year ago (7 children)

The thing that amazes me the most about AI discourse is that we all learned in Theory of Computation that general AI is impossible. My best guess is that the people with CS degrees who believe in AI slept through all their classes.

[–] leonardo_arachoo@lemm.ee 32 points 1 year ago* (last edited 1 year ago) (1 children)

> we all learned in Theory of Computation that general AI is impossible.

I strongly suspect it is you who has misunderstood your CS courses. Can you provide some concrete evidence for why general AI is impossible?

[–] fiasco@possumpat.io 4 points 1 year ago (2 children)

Evidence, not really, but that's kind of meaningless here, since we're talking theory of computation. It's a direct consequence of the undecidability of the halting problem. Mathematical analysis of loops cannot be done in general, because loops don't take on any particular value; if they did, the halting problem would be decidable. Given that writing a computer program requires an exact specification, which cannot be provided for the general analysis of computer programs, general AI trips and falls at the very first hurdle: being able to write other computer programs. That should be a simple task compared to the other things people expect of it.
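
For anyone who hasn't seen it, the undecidability result being invoked is usually shown by diagonalization. A minimal Rust-flavored sketch, assuming a hypothetical `halts` oracle (which is exactly the thing that cannot exist):

```rust
// Suppose, for contradiction, a total halting oracle could be written:
fn halts(_program: &str, _input: &str) -> bool {
    unimplemented!("no such total function exists")
}

// Then this "diagonal" program would be constructible:
fn paradox(program: &str) {
    if halts(program, program) {
        loop {} // the oracle said "halts", so loop forever
    }
    // the oracle said "loops forever", so halt immediately
}
```

Running `paradox` on its own source contradicts whatever `halts` predicts about it, so no total `halts` can be written.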

Yes, there's more complexity here (what about compiler optimization, or Rust's borrow checker?), which I don't care to get into at the moment; suffice it to say, those only operate under certain special conditions. To posit general AI, you need to think bigger than basic-block instruction reordering.

This stuff should all be obvious, but here we are.

[–] leonardo_arachoo@lemm.ee 14 points 1 year ago* (last edited 1 year ago) (1 children)

Given that humans can write computer programs, how can you argue that the undecidability of the halting problem stops intelligent agents from being able to write computer programs?

I don't understand what you mean about the borrow checker in Rust or block instruction reordering. These are certainly not attempts at AI or AGI.

What exactly does AGI mean to you?

> This stuff should all be obvious, but here we are.

This is not necessary. Please don't reply if you can't resist the temptation to call people who disagree with you stupid.

[–] fiasco@possumpat.io 2 points 1 year ago (1 children)

This is proof of one thing: that our brains are nothing like digital computers as laid out by Turing and Church.

What I mean about compilers is that compiler optimizations are only valid if a particular bit of code rewriting does exactly the same thing, under all conditions, as what the human wrote. This is chiefly only possible if the code in question doesn't include any branches (ifs, loops, function calls). A section of code with no branches is called a basic block. Rust is special because it harshly constrains the kinds of programs you can write: another consequence of the halting problem is that, in general, you can't track pointer aliasing outside a basic block, but Rust's constraints do make this possible. It just foists the intellectual load onto the programmer. This is also why Rust is far and away my favorite language; I respect the boldness of the play, and the benefits far outweigh the drawbacks.
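
To make the aliasing point concrete, here is a minimal sketch of the kind of program the borrow checker refuses outright rather than trying to analyze (it intentionally fails to compile):

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    let first = &v[0]; // shared (immutable) borrow of `v`
    v.push(4);         // error[E0502]: cannot borrow `v` as mutable
                       // because it is also borrowed as immutable
    println!("{first}");
}
```

Instead of answering the undecidable question "could these two references alias across this mutation?", the language outlaws the pattern and makes the programmer restructure the code.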

To me, general AI means a computer program having at least the same capabilities as a human. You can go further down this rabbit hole and read about the question that spawned the halting problem, the Entscheidungsproblem (decision problem), to see that AI is even more impossible than I let on.

[–] leonardo_arachoo@lemm.ee 6 points 1 year ago* (last edited 1 year ago) (1 children)

Here are two groups of claims I disagree with that I think you must agree with:

1 - brains do things that a computer program can never do. It is impossible for a computer to ever simulate the computation* done by a brain. Humans solve the halting problem by doing something a computer could never do.

2 - It is necessary to solve the halting problem to write computer programs. Humans can only write computer programs because they solve the halting problem first.

*perhaps you will prefer a different word here

I would say that:

  • it doesn't require solving any halting problems to write computer programs
  • there is no general solution to the halting problem that works on human brains but not on computers.
  • computers can in principle simulate brains with enough accuracy to simulate any computation happening on a brain. However, there would be far cheaper ways to do any computation.

Which of my statements do you disagree with?

[–] fiasco@possumpat.io 2 points 1 year ago (1 children)

I suppose I disagree with the formulation of the argument. The Entscheidungsproblem and the halting problem are limitations on formal analysis. It isn't relevant to talk about either of them in terms of "solving them"; that's why we use the term undecidable. The halting problem asks, in modern terms:

> Given a computer program and a set of inputs to it, can you write a second computer program that decides whether the input program halts (i.e., finishes running)?

The answer to that question is no. In limited terms, this tells you something fundamental about the capabilities of Turing machines and the lambda calculus; in general terms, it tells you something deeply important about formal analysis. This all started with the question:

> Can you create a formal process for deciding whether a proposition, given an axiomatic system in first-order logic, is always true?

The answer to this question is also no. Digital computers were devised as a means of specifying a formal process for solving logic problems, so the undecidability of the Entscheidungsproblem was proven through the undecidability of the halting problem. This is why there are still open logic problems despite the invention of digital computers, and despite how many flops a modern supercomputer can pull off.

We don't use formal process for most of the things we do. And when we do try to use formal process for ourselves, it turns into a nightmare called civil and criminal law. The inadequacies of those formal processes are why we have a massive judicial system, and why the whole thing has devolved into a circus. Importantly, the inherent informality of law in practice is why we have so many lawyers, and why they can get away with charging so much.

As for whether being able to write a computer program that effectively analyzes computer programs is necessary for writing a computer program that effectively writes computer programs, consider: even the loosey-goosey horseshit called "deep learning" is based on error functions. If you can't compute how far away you are from your target, then you've got nothing.
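
The "error function" point is easy to make concrete. A deliberately tiny sketch with made-up numbers: fit a single weight w so that w * x approximates y by gradient descent on the squared error. The whole method presupposes a computable measure of how far off the current guess is:

```rust
fn main() {
    let (x, y) = (2.0_f64, 10.0_f64); // one training pair: want w * x == y
    let mut w = 0.0_f64;              // the single "network weight"
    let lr = 0.05;                    // learning rate

    for _ in 0..100 {
        let err = w * x - y;      // signed error of the current guess
        let grad = 2.0 * err * x; // d/dw of the error (w * x - y)^2
        w -= lr * grad;           // step downhill on the error surface
    }

    println!("w = {w:.4} (target {})", y / x); // converges to 5.0
}
```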

[–] leonardo_arachoo@lemm.ee 3 points 1 year ago* (last edited 1 year ago)

Well, I'll make the halting problem for this conversation decidable by concluding :). It was interesting to talk, but I was not convinced.

I think some amazing things are coming out of deep learning and some day our abilities will generally be surpassed. Hopefully you are right, because I think we will all die shortly afterwards.

Feel free to have the final word.

[–] irmoz@reddthat.com 2 points 1 year ago

From what I've heard, the biggest hurdle for AI right now is the fact that computers only work with binary. They are incapable of actually "reading" the things they write; all they're actually aware of is the binary digits they manipulate, which represent the words they're reading and writing. It could analyse War and Peace over and over, and even if you asked it who wrote it, it wouldn't actually know.

[–] qfe0@lemmy.dbzer0.com 15 points 1 year ago

The existence of natural intelligence is the proof that artificial intelligence is possible.

[–] IllNess@infosec.pub 9 points 1 year ago (1 children)

It's all buzzword exaggerations. It's marketing.

Remember when "hoverboard" meant something that actually hovers, instead of some motorized bullshit on two wheels? Yeah, same bullshit.

[–] gerbal 1 points 1 year ago

Like "hoverboards", current "AI" does demonstrate that a passable facsimile of intelligence is possible within specific domains. But it is being marketed as if it were the thing it approximates.

The main difference is that a sufficiently advanced simulacrum of intelligence has the same utility to capital as genuine intelligence.

[–] argv_minus_one 9 points 1 year ago (1 children)

We can simulate all manner of physics using a computer, but we can't simulate a brain using a computer? I'm having a real hard time believing that. Brains aren't magic.

[–] fiasco@possumpat.io 2 points 1 year ago (1 children)

Computer numerical simulation is a different kind of shell game from AI. The only reason it's done is that most differential equations aren't solvable in the ordinary sense, so instead they're discretized and approximated: Zeno's paradox for the modern world. Since the discretization doesn't work out exactly, the results are then hacked to look right. This is also why they always want more flops: the belief that if you just discretize finely enough, you'll eventually reach the infinite (or the infinitesimal).

This also should not fill you with hope for general AI.
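
What "discretized and approximated" looks like in practice, as a minimal sketch: forward-Euler integration of dy/dt = -y, whose exact solution is y(t) = e^(-t). Shrinking the step size buys accuracy at the cost of more floating-point work, which is the "more flops" treadmill described above:

```rust
// Forward-Euler discretization of dy/dt = -y with y(0) = 1.
fn euler(dt: f64, t_end: f64) -> f64 {
    let mut y = 1.0;
    let steps = (t_end / dt) as usize;
    for _ in 0..steps {
        y += dt * (-y); // one discrete step of y' = -y
    }
    y
}

fn main() {
    let exact = (-1.0_f64).exp(); // e^-1, the true value of y(1)
    for dt in [0.1, 0.01, 0.001] {
        let approx = euler(dt, 1.0);
        println!("dt = {dt}: y(1) = {approx:.6}, error = {:.2e}",
                 (approx - exact).abs());
    }
}
```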

[–] argv_minus_one 1 points 1 year ago

The same argument could be made for sound, and yet digital computers have no problem approximating it to sufficient precision as to make it indistinguishable from the original.

Discretization works fine and is not the problem. The problem is that the “AI” everyone's so hyped up about is nothing more than a language model. It has no understanding of what it's talking about because it has not been taught or allowed to experience anything other than language. Humans use language to express ideas; language-model AIs have no ideas to express because they have no life experience from which to form any ideas.

That doesn't mean AGI is impossible. It is likely infeasible on present-day hardware, but that's not the same thing as being impossible.

[–] coolin 1 points 1 year ago* (last edited 1 year ago)

I mean, on its face this claim is absurd. The human brain is literally a computer (a neural network), and given enough compute we can simulate each and every neuron and glial cell to make a brain-in-the-computer AGI. We know a biological neuron can be simulated with roughly 1,000 artificial ones, and we know we can also perform human tasks such as NLP (Broca's and Wernicke's areas) or hippocampal memory functions (the Tolman-Eichenbaum Machine) with currently attainable compute resources, at far less than that 1000:1 ratio.

[–] gerbal 1 points 1 year ago

As someone who didn't study CS, why is general AI impossible?

[–] kurwa 1 points 1 year ago

I think the non-creative fields are the ones that are really going to be automated. Things like data entry, which could have been automated in the past, can now be automated at a potentially faster pace. Turning natural human language into commands for a system is something LLaMA can do.

[–] argv_minus_one 1 points 1 year ago

We won't. If and when AI does become capable of replacing most human labor, there's gonna be a lot of hungry people from then on.

[–] Uriel238@lemmy.fmhy.ml 37 points 1 year ago

If this lawsuit is ruled in favor of the plaintiff, it might lead to lawsuits against those who have collected and used private data more maliciously, from advertisement-targeting services to ALPR services that reveal to law enforcement your driving habits.

[–] UntouchedWagons@lemmy.ca 22 points 1 year ago (2 children)

Piracy isn't stealing and neither is this.

[–] gigglehurtz@lemmy.dbzer0.com 7 points 1 year ago (1 children)

Piracy also isn't copyright infringement, but that's what this is. Under the law, which sucks. And if it applies to us, it should apply to them.

[–] xSinStarx@lemmy.fmhy.ml 7 points 1 year ago (1 children)

Seeders are committing copyright infringement, by definition. Piracy actively encourages that behavior. Whether that is unethical or not can be debated though (FBI, I swear I have nyaa idea how qBT ended up on my machine, must have been a virus from a 4chan hacker).

[–] argv_minus_one 1 points 1 year ago

> FBI, I swear I have nyaa idea how qBT ended up on my machine

I know why I have Transmission installed on mine: to download LibreOffice and Debian disk images.

[–] Technoguyfication@lemmy.ml 18 points 1 year ago (3 children)

It’s wild to see people in the piracy community of all places have an issue with someone benefiting from data they got online for free.

[–] OtakuAltair@vlemmy.net 24 points 1 year ago* (last edited 1 year ago) (1 children)

Key difference is that they're making (a lot of) money off of the stolen work, and in a way that's only possible for the already filthy rich.

Wouldn't mind it personally if it were FOSS though, like their name suggests.

[–] whoisearth@lemmy.ca 13 points 1 year ago

FWIW, even if it were FOSS I'd still care. For me it's more about intent. If your business model or livelihood relies on stealing from people, there's a problem. That's as true on a business level as it is on an individual one.

Doesn't mean I have an answer, as sometimes it's extremely complex. The easy analogy is how we pirate TV shows and movies: Netflix originally proved this could be mitigated by providing the material cheaply and easily. People don't want to steal (on average).

[–] briongloid@aussie.zone 20 points 1 year ago

Many of us are sharing without reward and have strong ethical beliefs regarding for-profit distribution of material versus non-profit sharing.

[–] pipows@lemmy.pt 20 points 1 year ago

They're using people's content without authorization, and for a company with an "open" information ideology or something like that, they are closed source and using it to make money. I don't think that should be illegal, but it is certainly a dick move.

I don't see how this is any different from humans copying or being inspired by something. While I hate seeing companies profit off of the commons while giving nothing of value back, how do you prove that an AI model is using your work in any meaningful or substantial way? What would make me really mad is if this dumb shit leads to even harsher copyright laws. We need less copyright, not more.

[–] argv_minus_one 2 points 1 year ago

Huh. I always thought someone would get sued for using software source code generated by an AI, but I admit I wasn't expecting the AI company itself to get sued.

Makes sense, though. AI as we know it is a copyright infringement machine.

[–] Treemaster099@pawb.social 1 points 1 year ago* (last edited 1 year ago) (1 children)

Good. Technology always makes strides before the law can catch up. The issue is that multimillion-dollar companies use these gaps in the law to get away with legally gray and morally black actions, all in the name of profit.

Edit: This video is the best way to educate yourself on why ai art and writing is bad when it steals from people like most ai programs currently do. I know it's long, but it's broken up into chapters if you can't watch the whole thing.

[–] PlebsicleMcGee@feddit.uk 1 points 1 year ago (1 children)

Totally agree. I don't care that my data was used for training, but I do care that it's used for profit in a way that only a company with big budget lawyers can manage

[–] CoderKat@lemm.ee 1 points 1 year ago* (last edited 1 year ago) (1 children)

But if we're drawing the line at "did it for profit", how much technological advancement will happen? I suspect most advancement is profit-driven. Obviously people should be paid for any work they actually put in, but we're talking about content on the internet that you willingly create for fun; the fact that it's used by someone else for profit is a side issue.

And quite frankly, there's no way to pay you for this. No company is gonna pay you to use your social media comments to train their AI and even if they did, your share would likely be pennies at best. The only people who would get paid would be companies like reddit and Twitter, which would just write into their terms of service that they're allowed to do that (and I mean, they already use your data for targeting ads and it's of course visible to anyone on the internet).

So it's really a choice between helping train AI (which could be viewed as a net benefit for society, depending on how you view those AIs) vs simply not helping train them.

Also, if we're requiring payment, frankly only the super big AI companies can afford to pay anything at all. Training an AI is already so expensive that it's hard enough for small players to enter this business without having to pay for training data too (and at insane prices, if Twitter and Reddit are any indication).

Hundreds of projects on GitHub are supported by donations; innovation happens even without profit incentives. It may slow down the pace of AI development, but I am willing to wait another decade for AIs if it protects user data and lets regulation catch up.
