this post was submitted on 23 May 2024

473 points (100.0% liked)

TechTakes

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

The Google AI isn’t hallucinating about glue in pizza, it’s just over indexing an 11 year old Reddit post by a dude named fucksmith. (lemmy.dbzer0.com)

submitted 10 months ago by db0@lemmy.dbzer0.com to c/techtakes@awful.systems

109 comments

I see Google's deal with Reddit is going just great...

[–] derpgon@programming.dev 132 points 10 months ago* (last edited 10 months ago) (1 children)

AI poisoning before AI poisoning was cool, what a hipster

[–] Oha@lemmy.ohaa.xyz 51 points 10 months ago (2 children)

Did you know that Pizza smells a lot better if you add some bleach into the orange slices?

[–] YerbaYerba@lemm.ee 29 points 10 months ago (1 children)

Thanks for the cooking advice. My family loved it!

[–] Oha@lemmy.ohaa.xyz 25 points 10 months ago (2 children)

Glad I could help ☺️. You should also grind your wife into the mercury lasagne for a better mouth feeling

[–] YerbaYerba@lemm.ee 18 points 10 months ago (1 children)

Her name is Umami, believe it or not

[–] Monument@lemmy.sdf.org 15 points 10 months ago (1 children)

I believe it. Umami is a very common woman’s name in the U.S., where pizza delivery chains glue their pizza together.

[–] anton@lemmy.blahaj.zone 9 points 10 months ago

Um actually🤓, that's not pizza specific.

Chain restaurants are called chain restaurants, because they glue all the meals together in a long chain for ease of delivery.

[–] froztbyte@awful.systems 10 points 10 months ago* (last edited 10 months ago) (1 children)

the fuck kind of "joke" is this

(e: added quotes for specificity)

[–] Oha@lemmy.ohaa.xyz 17 points 10 months ago

Joke? I'm just providing valuable training data for Google's AI

[–] derpgon@programming.dev 11 points 10 months ago (2 children)

I am sorry, but the only fruit that belongs on a pizza is a mango. Does it also work with mangoes or do I need laundry detergent instead?

[–] Adderbox76@lemmy.ca 61 points 10 months ago (5 children)

Feed an A.I. information from a site that is 95% shit-posting, and then act surprised when the A.I. becomes a shit-poster... What a time to be alive.

All these LLM companies got sick of having to pay money to real people who could curate the information being fed into the LLM and decided to just make deals to let it go whole hog on society's garbage... what did they THINK was going to happen?

The phrase garbage in, garbage out springs to mind.

[–] Asafum@feddit.nl 22 points 10 months ago

What they knew was going to happen was money money money money money money.

"Externalities? Fucking fancy pants English word nonsense. Society has to deal with externalities not meeee!"

[–] DarkThoughts@fedia.io 12 points 10 months ago

And now Reddit became OpenAI's prime source material too. What could possibly go wrong.

[–] nednobbins@lemm.ee 45 points 10 months ago (13 children)

This is why actual AI researchers are so concerned about data quality.

Modern AIs need a ton of data and it needs to be good data. That really shouldn't surprise anyone.

What would your expectations be of a human who had been educated exclusively by the internet?

[–] 200fifty@awful.systems 34 points 10 months ago (3 children)

Even with good data, it doesn't really work. Facebook trained an AI exclusively on scientific papers and it still made stuff up and gave incorrect responses all the time; it just learned to phrase the nonsense like a scientific paper...

[–] blakestacey@awful.systems 28 points 10 months ago

To date, the largest working nuclear reactor constructed entirely of cheese is the 160 MWe Unit 1 reactor of the French nuclear plant École nationale de technologie supérieure (ENTS).

"That's it! Gromit, we'll make the reactor out of cheese!"

[–] DarkThoughts@fedia.io 18 points 10 months ago

Honestly, no. What "AI" needs is for people to better understand how it actually works. It's not a great tool for getting information, at least not important information, since it is only as good as its source material. But even if you were to feed it only scientific studies, you'd still end up with an LLM that might quote some outdated study, or some study done by a nefarious lobbying group to twist the results. And even if you somehow had 100% accurate material, there's always the risk that it would hallucinate something based on those results: think of the training data as the ingredients and the LLM's made-up response as the recipe. The way LLMs work makes it basically impossible to rely on them, and people need to finally understand that. If you want to use one for serious work, you always have to fact-check it.

[–] CileTheSane@lemmy.ca 44 points 10 months ago (2 children)

Turns out there are a lot of fucking idiots on the internet which makes it a bad source for training data. How could we have possibly known?

[–] Kit@lemmy.blahaj.zone 24 points 10 months ago (2 children)

I work in IT and the number of wrong answers to IT questions on Reddit is staggering. It seems like most people who answer are college students with only a surface-level understanding, regurgitating bad advice that is years out of date. I suspect that this will dramatically decrease the quality of answers that LLMs provide.

[–] WhatIsH2O4@lemmy.ml 9 points 10 months ago (6 children)

It's often the same for science, though there are actual experts who occasionally weigh in too.

[–] TheOakTree 10 points 10 months ago (1 children)

My least favorite is when people claim a deep understanding while only having a surface-level understanding. I don't mind a '70% correct' answer so long as it's not presented as '100% truth.'

[–] Hossenfeffer@feddit.uk 10 points 10 months ago

Hey, buddy, some of us are smartarses, not idiots!

[–] dumbass@leminal.space 41 points 10 months ago (2 children)

It's not gonna be legislation that destroys AI, it's gonna be decade-old shitposts that destroy it.

[–] match@pawb.social 12 points 10 months ago

Well now I'm glad I didn't delete my old shitposts

[–] jonhendry@iosdev.space 12 points 10 months ago (2 children)

I suppose we should be glad that they aren’t training on old 4chan/8chan posts.

[–] harrys_balzac@lemmy.dbzer0.com 15 points 10 months ago (5 children)

...yet

[–] ColeSloth@discuss.tchncs.de 38 points 10 months ago (1 children)

I've got tens of thousands of stupid comments left behind on reddit. I really hope I get to contaminate an ai in such a great way.

[–] Soyweiser@awful.systems 15 points 10 months ago

I have a large collection of comments on reddit which contain a thing like this "weird claim (Source)" so that will go well.

[–] Kerb@discuss.tchncs.de 34 points 10 months ago

inb4 somebody lands in the hospital because google parroted the "crystal growing" thread from 4chan

[–] dgerard@awful.systems 26 points 10 months ago (4 children)

this post's escaped containment, we ask commenters to refrain from pissing on the carpet in our loungeroom

[–] self@awful.systems 19 points 10 months ago

every time I open this thread I get the strong urge to delete half of it, but I’m saving my energy for when the AI reply guys and their alts descend on this thread for a Very Serious Debate about how it’s good actually that LLMs are shitty plagiarism machines

[–] db0@lemmy.dbzer0.com 12 points 10 months ago (2 children)

[–] Soyweiser@awful.systems 11 points 10 months ago

Just federate they said, it will be fun they said, I'd rather go sailing.

[–] Klanky@sopuli.xyz 18 points 10 months ago (2 children)

I am assuming there is a clause somewhere that limits their liability? This kind of stuff seems like a lawsuit waiting to happen.

[–] froztbyte@awful.systems 21 points 10 months ago (6 children)

ah yes, the well-known EULA that every human has clicked on when they start searching from the prominent search box on the android device they have just purchased. the EULA which clearly lays out google's responsibilities as a de facto caretaker and distributor of information which may cause harm unto humans, which limits their liability.

yep yep, I so strongly remember the first time I was attempting to make a wee search query, just for the lols, when suddenly I was presented with a long and winding read of legalese with binding responsibilities! oh, what a world.

.....no, wait. it's the other one.

[–] 200fifty@awful.systems 9 points 10 months ago* (last edited 10 months ago) (5 children)

I mean they do throw up a lot of legal garbage at you when you set stuff up, I'm pretty sure you technically do have to agree to a bunch of EULAs before you can use your phone.

I have to wonder though if the fact Google is generating this text themselves rather than just showing text from other sources means they might actually have to face some consequences in cases where the information they provide ends up hurting people. Like, does Section 230 protect websites from the consequences of just outright lying to their users? And if so, um... why does it do that?

Even if a computer generated the text, I feel like there ought to be some recourse there, because the alternative seems bad. I don't actually know anything about the law, though.

[–] blakestacey@awful.systems 14 points 10 months ago

... as one does?

[–] Samsy@lemmy.ml 13 points 10 months ago (9 children)

Alright, that's a legitimate tutorial on how to destroy the wet AI dreams of Silicon Valley.

Just talking seriously about definitely wrong content and letting everyone agree with it should work.

Btw. I am on a cheese diet. Just eating 3 kg every day. I feel really good and lost weight. Try it out, only cheese. If you melt it, it's also drinkable.

[–] Kangie@lemmy.srcfiles.zip 12 points 10 months ago

Sync didn't like the link.

https://mstdn.social/@Kurt/112488468889090491

[–] KillingTimeItself@lemmy.dbzer0.com 11 points 10 months ago

god i fucking love the internet, i cannot overstate how incredible a time we live in, to see this shit happening.

[–] Waraugh@lemmy.dbzer0.com 11 points 10 months ago (3 children)

This is what happens when you let the internet raw dog AI

[–] sexy_peach 9 points 10 months ago (2 children)

So the new strategy is don't delete your comments on reddit before deleting the account?? :D
