this post was submitted on 12 Sep 2023

152 points (100.0% liked)

Today's Large Language Models are Essentially BS Machines (quandyfactory.com)

submitted 1 year ago by Veraticus@lib.lgbt to c/technology

[–] scrubbles@poptalk.scrubbles.tech 79 points 1 year ago (5 children)

And everyone in tech who has worked on ML before collectively says "yeah that's what we've been trying to tell you". Don't get me wrong, LLMs are a huge leap, but god did it show how greedy corporations are, just immediately jumping to "how quick can we lay people off?". The tech is not to that spec. Yet. It will get there, but goddamn do we need to be demanding some regulations now

[–] Dark_Arc@social.packetloss.gg 39 points 1 year ago (1 children)

The tech is not to that spec. Yet.

I'm not sure it will. At least, not this tech, not this approach to the problem. From my understanding there's fundamentally no comprehension; it's not bugged, broken, or incomplete, it's just not there... it's missing from the design.

[–] communist 17 points 1 year ago (3 children)

We don't know that for sure yet, we saw a lot of emergent intelligent properties appear as we scaled up, and we're nowhere near done scaling LLM's, I'm not saying it will be solved, just that we don't know one way or the other yet.

[–] Veraticus@lib.lgbt 12 points 1 year ago (22 children)

LLMs are fundamentally different from human consciousness. It isn't a problem of scale, but kind.

They are like your phone's autocomplete, but very very good. But there's no level of "very good" for autocomplete that makes it a human, or will give it sentience, or allow it to understand the words it is suggesting. It simply returns the next most-likely word in a response.

If we want computerized intelligence, LLMs are a dead end. They might be a good way for that intelligence to speak pretty sentences to us, but they will never be that themselves.
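
For anyone who wants to see what "returns the next most-likely word" looks like mechanically, here is a minimal sketch, assuming a toy bigram count model. Real LLMs use neural networks over tokens and sample from a probability distribution, so this only illustrates the autocomplete analogy, not how any actual model is built. The corpus and the names `train_bigrams` and `next_word` are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def next_word(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus, invented for this illustration.
corpus = "the fish swim in the water and the fish eat the worm"
model = train_bigrams(corpus)
print(next_word(model, "the"))
```

The suggestion is purely a lookup of counts: nothing in the table represents what a fish or a worm is, which is the point the autocomplete comparison is making.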

[–] communist 5 points 1 year ago* (last edited 1 year ago) (1 children)

You're guessing, you don't actually know that for sure, it seems intuitively correct, but we simply do not know enough about cognition to make that assumption.

Perhaps our ability to reason exclusively comes from our ability to predict, and by scaling up the ability to predict, we become more and more able to reason.

These are guesses; all we have now are guesses. You can say "it doesn't reason" and "it's just autocorrect" all you want, but if that were the case, why did scaling it up eventually enable it to perform basic math? Why did scaling it up significantly improve its ability to problem-solve (GPT-3 vs GPT-4)? There are so many unknowns in this field. Just saying "nah, can't be, it works differently from us" doesn't mean it can't do the same things as us given enough scale.

[–] Veraticus@lib.lgbt 8 points 1 year ago (2 children)

I'm not guessing. When I say it's a difference of kind, I really did mean that. There is no cognition here; and we know enough about cognition to say that LLMs are not performing anything like it.

Believing LLMs will eventually perform cognition with enough hardware is like saying, "if we throw enough hardware at a calculator, it will eventually become alive." Even if you throw all the hardware in the world at it, there is no emergent property of a calculator that would create sentience. So too LLMs, which really are just calculators that can speak English. But just like calculators they have no conception of what English is and they do not think in any way, and never will.

[–] communist 5 points 1 year ago* (last edited 1 year ago) (1 children)

I’m not guessing. When I say it’s a difference of kind, I really did mean that. There is no cognition here; and we know enough about cognition to say that LLMs are not performing anything like it.

We do not know that, and I challenge you to find a source for it. In fact, I've seen sources showing the opposite: they seem to reason in tokens. For example, LLMs perform significantly better at tasks when asked to give a step-by-step reasoned explanation. This indicates that they are doing a form of reasoning, and that their reasoning is limited by what I have no better term for than laziness.

https://blog.research.google/2022/05/language-models-perform-reasoning-via.html
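
To make "asked to give a step by step reasoned explanation" concrete, here is a rough sketch of how such a prompt differs from a plain one. It only builds the prompt text and does not call any model; the function names and the worked example are hypothetical, invented for illustration rather than taken from the linked post.

```python
def standard_prompt(question):
    """Ask for the answer directly."""
    return f"Q: {question}\nA:"


def chain_of_thought_prompt(question, worked_examples):
    """Prepend worked examples whose answers spell out intermediate steps."""
    parts = [
        f"Q: {q}\nA: {steps} The answer is {answer}."
        for q, steps, answer in worked_examples
    ]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Hypothetical worked example, invented for this sketch.
examples = [
    (
        "I have 3 apples and buy 2 bags of 4 apples. How many apples do I have?",
        "2 bags of 4 apples is 8 apples. 3 + 8 = 11.",
        11,
    ),
]
question = "A pond has 6 bluegill and 3 more arrive each day. How many after 2 days?"
print(chain_of_thought_prompt(question, examples))
```

The only difference between the two prompts is that the second shows the intermediate steps before the answer, which nudges the model to write out its own steps for the new question.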

[–] Veraticus@lib.lgbt 5 points 1 year ago* (last edited 1 year ago) (2 children)

It is your responsibility to prove your assertion that if we just throw enough hardware at LLMs they will suddenly become alive in any recognizable sense, not mine to prove you wrong.

You are anthropomorphizing LLMs. They do not reason and they are not lazy. The paper discusses a way to improve their predictive output, not a way to actually make them reason.

But don't take my word for it. Go talk to ChatGPT. Ask it anything like this:

"If an LLM is provided enough processing power, would it eventually be conscious?"

"Are LLM neural networks like a human brain?"

"Do LLMs have thoughts?"

"Are LLMs similar in any way to human consciousness?"

Just always make sure to check the output of LLMs. Since they are complicated autosuggestion engines, they will sometimes confidently spout bullshit, so must be examined for correctness. (As my initial post discussed.)

[–] communist 4 points 1 year ago* (last edited 1 year ago)

You're assuming I'm saying something that I'm not, and then arguing with that instead of my actual claim.

I'm saying we don't know for sure what they will be able to do when they're scaled up. That's the end of my assertion. I don't have to prove that they will suddenly come alive; I'm not claiming they will. I'm just claiming we don't know what will happen when they're scaled, and they seem to have emergent properties as they scale up. Nobody has devised a way of predicting what emergent properties happen when; nobody has made any progress whatsoever on knowing what scaling up accomplishes.

Can they reason? Yes, but poorly right now, will that get better? Who knows.

The end of my claim is that we don't know what'll happen when they scale up, and that you can't just write it off like you are.

If you want proof that they reason, see the research article I linked. If they can do that in their rudimentary form that we've created with very little time, we can't write off the possibility that they will scale.

Whether or not they reason LIKE HUMANS is irrelevant if they can do the job.

And I'm not anthropomorphizing them without reason; there aren't terms for this already. What would you call this behavior of answering questions significantly better when asked to fully explain reasoning? I would say it is taking the easiest option that still meets the qualifications of what it is requested to do, following the path of least resistance. I don't have a better word for this than laziness.

https://www.downtoearth.org.in/news/science-technology/artificial-intelligence-gpt-4-shows-sparks-of-common-sense-human-like-reasoning-finds-microsoft-89429

Furthermore predictive power is just another way of achieving reasoning, better predictive power IS better reasoning, because you can't predict well without reasoning.

[–] Slotos@feddit.nl 2 points 1 year ago (1 children)

It’s your job to prove your assertion that we know enough about cognition to make reasonable comparisons.

[–] Veraticus@lib.lgbt 4 points 1 year ago (4 children)

May as well ask me to prove that we know enough about calculators to say they won't develop sentience while I'm at it.

[–] Zormat@lemmy.blahaj.zone 4 points 1 year ago (4 children)

So for context, I am an applied mathematician, and I primarily work in neural computation. I have an essentially cursory knowledge of LLMs, their architecture, and the mathematics of how they work.

I hear this argument, that LLMs are glorified autocomplete and merely statistical inference machines and are therefore completely divorced from anything resembling human thought.

I feel the need to point out that not only is there no compelling evidence that any neural computation humans do is anything different from what a statistical inference machine does, there's actually quite a bit of evidence that that is exactly what real, biological neural networks do.

Now, admittedly, real neurons and real neural networks are way more sophisticated than any deep learning network module, real neural networks are extremely recurrent and extremely nonlinear, with some neural circuits devoted to simply changing how other neural circuits process signals without actually processing said signals on their own. And in the case of humans, several orders of magnitude larger than even the largest LLM.

All that said, it boils down to an insanely powerful statistical machine.

There are questions of motivation and input: we all want to stay alive (ish), avoid pain, and have constant feedback from sensory organs while a LLM just produces what it was supposed to. But in an abstraction the ideas of wants and needs and rewards aren't substantively different from prompts.

Anyway. I agree that modern AI is a poor substitute for real human intelligence, but the fundamental reason is a matter of complexity, not method.

Some reading:

Large scale neural recordings call for new insights to link brain and behavior

A unifying perspective on neural manifolds and circuits for cognition

A comparison of neuronal population dynamics measured with calcium imaging and electrophysiology

[–] emptiestplace@lemmy.ml 3 points 1 year ago (1 children)

I am picking up a hint of the autocompletion you describe, in your writing.

[–] Veraticus@lib.lgbt 3 points 1 year ago

I think I write well :) I am not an LLM though.

[–] Dark_Arc@social.packetloss.gg 9 points 1 year ago* (last edited 1 year ago) (1 children)

I don't believe in scaling as a way to discover understanding. Doing that is just praying that the machine comes alive... these machines weren't programmed to come alive in that way. That's my fundamental argument, the design of LLMs ignores understanding of the content... it doesn't matter how much content it's been scaled up to.

If I teach a real AI about fishing, it should be able to reason about fishing and it shouldn't need to have read a supplementary knowledge of mankind to do it.

What the LLMs seem to be moving towards is more of a search and summary engine (for existing content). That's a very similar and potentially quite useful thing, but it's not the same thing as understanding.

It's the difference between the kid that doesn't know much but is really good at figuring it out based on what they know vs the kid that's read all the text books front to back and can't come up with anything original to save their life but can quickly regurgitate and summarize anything they've ever read.

[–] communist 5 points 1 year ago* (last edited 1 year ago) (1 children)

If I teach a real AI about fishing, it should be able to reason about fishing and it shouldn’t need to have read a supplementary knowledge of mankind to do it.

This is a faulty assumption.

In order for you to learn about fishing, you had to learn a shitload about the world. Babies don't come out of the womb able to do such tasks, there is a shitload of prerequisite knowledge in order to fish, it's unfair to expect an ai to do this without prerequisite knowledge.

Furthermore, LLM's have been shown to do many things that aren't in their training data, so the notion that it's a stochastic parrot is also false.

[–] Dark_Arc@social.packetloss.gg 4 points 1 year ago* (last edited 1 year ago) (1 children)

Furthermore, LLM’s have been shown to do many things that aren’t in their training data, so the notion that it’s a stochastic parrot is also false.

And (from what I've seen) they get things wrong with extreme regularity, increasingly so as things diverge from the training data. I wouldn't say they're a "stochastic parrot" but they don't seem to be much better when things need to be correct... and again, based on my (admittedly limited) understanding of their design, I don't anticipate this technology (at least without some kind of augmented approach that can reason about the substance) overcoming that.

In order for you to learn about fishing, you had to learn a shitload about the world. Babies don’t come out of the womb able to do such tasks, there is a shitload of prerequisite knowledge in order to fish, it’s unfair to expect an ai to do this without prerequisite knowledge.

That's missing the forest for the trees. Of course an AI isn't going to go fishing. However, I should be able to assert some facts about fishing and it should be able to reason based on those assertions. E.g., a child can work off of facts presented about fishing: "fish are hard to catch in muddy water" -> "the water is muddy, does that impact my chances of catching a bluegill?" -> "yes, it does, bluegill are fish, and fish don't like muddy water".
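
The chain of deduction in the bluegill example can be sketched as a tiny rule lookup. This is only an illustration of going from asserted facts to a conclusion, assuming hand-written facts; it says nothing about how an LLM arrives at an answer. All names here are hypothetical.

```python
# Hand-written facts for illustration; all names are hypothetical.
IS_A = {"bluegill": "fish"}
HARD_TO_CATCH_IN = {"fish": {"muddy water"}}


def harder_to_catch(animal, condition):
    """Walk up the is-a chain and check whether any category is affected."""
    kind = animal
    seen = set()
    while kind is not None and kind not in seen:
        seen.add(kind)
        if condition in HARD_TO_CATCH_IN.get(kind, set()):
            return True
        kind = IS_A.get(kind)
    return False


print(harder_to_catch("bluegill", "muddy water"))
```

Here the conclusion follows from exactly two asserted facts (bluegill are fish, fish are hard to catch in muddy water) and nothing else.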

There are also "teachings" brought about by how these are programmed that make the flaws less obvious, e.g., if I try to repeat the experiment in the post here Google's Bard outright refuses to continue because it doesn't have information about Ryan McGee. I've also seen Bard get notably better as it's been scaled up, early on I tried asking it about RuneScape and it spewed absolute nonsense. Now... It's reasonable-ish.

I was able to reproduce a nonsense response (once again) by asking about RuneScape. I asked how to get 99 firemaking, and it invented a mechanic that doesn't exist "Using a bonfire in the Charred Stump: The Charred Stump is a bonfire located in the Wilderness. It gives 150% Firemaking experience, but it is also dangerous because you can be attacked by other players." This is a novel (if not creative) invention of Bard likely derived from advice for training Prayer (which does have something in the Wilderness which gives 350% experience).

[–] communist 4 points 1 year ago* (last edited 1 year ago)

And (from what I’ve seen) they get things wrong with extreme regularity, increasingly so as things diverge from the training data. I wouldn’t say they’re a “stochastic parrot” but they don’t seem to be much better when things need to be correct… and again, based on my (admittedly limited) understanding of their design, I don’t anticipate this technology (at least without some kind of augmented approach that can reason about the substance) overcoming that.

Keep in mind, you're talking about a rudimentary, introductory version of this, my argument is that we don't know what will happen when they've scaled up, we know for certain hallucinations become less frequent as the model size increases (see the statistics on gpt3 vs 4 on hallucinations), perhaps this only occurs because they haven't met a critical size yet? We don't know.

There's so much we don't know.

That’s missing the forest for the trees. Of course an AI isn’t going to go fishing. However, I should be able to assert some facts about fishing and it should be able to reason based on those assertions. E.g., a child can work off of facts presented about fishing: “fish are hard to catch in muddy water” -> “the water is muddy, does that impact my chances of catching a bluegill?” -> “yes, it does, bluegill are fish, and fish don’t like muddy water”.

https://blog.research.google/2022/05/language-models-perform-reasoning-via.html

they do this already, albeit imperfectly, but again, this is like, a baby LLM.

and just to prove it:

https://chat.openai.com/share/54455afb-3eb8-4b7f-8fcc-e144a48b6798

[–] BotCheese 7 points 1 year ago (2 children)

And we're nowhere near done scaling LLM's

I think we might be. I remember hearing OpenAI was training on so much literary data that they didn't and couldn't find enough for testing the model. Though I may be misremembering.

[–] newde@feddit.nl 5 points 1 year ago (1 children)

No, that's definitely the case. However, Microsoft is now working on making LLMs more dependent on several high-quality sources. For example, encyclopedias will be more important sources than random reddit posts.

[–] HobbitFoot@thelemmy.club 2 points 1 year ago (1 children)

Microsoft is also using LinkedIn to help as well, getting users to correct articles generated by AI.

[–] Zaktor@sopuli.xyz 2 points 1 year ago

Cunningham's Law may be very helpful in this respect.

"the best way to get the right answer on the internet is not to ask a question; it's to post the wrong answer."

[–] Veraticus@lib.lgbt 21 points 1 year ago* (last edited 1 year ago) (3 children)

I was mostly posting this because the last time LLMs came up, people kept on going on and on about how much their thoughts are like ours and how they know so much information. But as this article makes clear, they have no thoughts and know no information.

In many ways they are simply a mathematical party trick; formulas trained on so much language, they can produce language themselves. But there is no “there” there.

[–] lily33@lemm.ee 11 points 1 year ago* (last edited 1 year ago) (1 children)

have no thoughts

True

know no information

False. There's plenty of information stored in the models, and plenty of papers that delve into how it's stored, or how to extract or modify it.

I guess you can nitpick over the word "know", and what it means, but as someone else pointed out, we don't actually know what that means in humans anyway. But LLMs do use the information stored in context; they don't simply regurgitate it verbatim. For example (from this article):

If you ask an LLM what's near the Eiffel Tower, it'll list locations in Paris. If you edit its stored information to think the Eiffel Tower is in Rome, it'll actually start suggesting sights in Rome instead.

[–] Veraticus@lib.lgbt 6 points 1 year ago (1 children)

They only use words in context, which is their problem. It doesn't know what the words mean or what the context means; it's glorified autocomplete.

I guess it depends on what you mean by "information." Since all of the words it uses are meaningless to it (it doesn't understand anything of what it either is asked or says), I would say it has no information and knows nothing. At least, nothing more than a calculator knows when it returns 7 + 8 = 15. It doesn't know what those numbers mean or what it represents; it's simply returning the result of a computation.

So too LLMs responding to language.

[–] lily33@lemm.ee 4 points 1 year ago* (last edited 1 year ago) (1 children)

Why is that a problem?

For example, I've used it to learn the basics of Galois theory, and it worked pretty well.

1. The information is stored in the model, so it can tell me the basics.
2. The interactive nature of talking to an LLM actually helped me learn better than just reading.
3. And I know enough general math that I can tell the rare occasions (and they indeed were rare) when it makes things up.
4. Asking it questions can be better than searching Google, because Google needs exact keywords to find the answer, and the LLM can be more flexible (of course, neither will answer if the answer isn't in the index/training data).

So what if it doesn't understand Galois theory - it could teach it to me well enough. Frankly if it did actually understand it, I'd be worried about slavery.

[–] Veraticus@lib.lgbt 2 points 1 year ago (1 children)

Basically the problem is point 3.

You obviously know some of what it's telling you is inaccurate already. There is the possibility it's all bullshit. Granted a lot of it probably isn't, but it will tell you the bullshit with the exact same level of confidence as actual facts... because it doesn't know Galois theory and it isn't teaching it to you, it's simply stringing sentences together in response to your queries.

If a human were doing this we would rightly proclaim the human a bad teacher that didn't know their subject, and that you should go somewhere else to get your knowledge. That same critique should apply to the LLM as well.

That said it definitely can be a useful tool. I just would never fully trust knowledge I gained from an LLM. All of it needs to be reviewed for correctness by a human.

[–] lily33@lemm.ee 4 points 1 year ago (1 children)

That same critique should apply to the LLM as well.

No, it shouldn't. Instead, you should compare it to the alternatives you have on hand.

The fact is,

Using an LLM was a better experience for me than reading a textbook.
And it was also a better experience for me than watching recorded video lectures.

So, if I have to learn something, I have enough background to spot hallucinations, and I don't have a teacher (having graduated college, that's always true), I would consider using it, because it's better than the alternatives.

I just would never fully trust knowledge I gained from an LLM

There are plenty of cases where you shouldn't fully trust knowledge you gained from a human, too.

And there are, actually, cases where you can trust the knowledge gained from an LLM. Not because it sounds confident, but because you know how it behaves.

[–] sincle354 10 points 1 year ago

Sadly we don't even know what "knowing" is, considering human memory changes every time it is accessed. We might just need language and language only. Right now they're testing if generating verbalized trains of thought helps (it might?). The question might change to: Does the sum total of human language have enough consistency to produce behavior we might call consciousness? Can we brute force the Chinese room with enough data?

[–] pbjamm 7 points 1 year ago

They are the perfect embodiment of the internet.

They know everything, but understand nothing

[–] MasterBuilder@lemmy.one 14 points 1 year ago (3 children)

I've been unemployed for 7 months. Every online job I see that's been posted for at least 6 hours has over 200 applications. I'm a senior Dev with 30 years experience, and I can't find work.

I'd say generative AI is an existential threat as bad as offshoring was for steel in the early 80s. I'm now left with the prospect of spending the last 20 years of my work life at or near minimum wage.

After all, I can't afford to spend $250,000 on a new bachelor's degree, and a community college degree might get me to $25/hr, and still costs thousands. This is causing impoverishment on a massive scale.

Ignore this threat at your peril.

[–] seang96@spgrn.com 17 points 1 year ago* (last edited 1 year ago) (2 children)

Your issue sounds more like a capitalism issue. FANG companies lay off thousands of employees to cut costs and prepare for changes in the economy. AI didn't make them lay off all those employees, just corporate greed. Until AI can gather requirements, produce code with at least 80% accuracy, and compile the software itself, it isn't a threat.

Edit: fix autocorrect

[–] scrubbles@poptalk.scrubbles.tech 2 points 1 year ago (1 children)

and 100% accuracy. Only a fool would trust something coming out of AI and slap it right into production right now.

[–] seang96@spgrn.com 2 points 1 year ago

I agree, though I was following the 80/20 rule. If the software's essentially free and does 80% of your business needs, businesses would be happy. Either way, AI is nowhere near that, since it currently requires someone with the knowledge to get it anywhere close to a complete project.

[–] scrubbles@poptalk.scrubbles.tech 12 points 1 year ago

I'm a senior dev too, and at first I thought the same, but really it's a market downturn. Companies are just afraid to hire right now. I'd look into generative AI, try to understand how it works. That's how I've been spending my time, and yeah, it's intuitive the way they do it but the more you understand how it works the more you realize that it's not ready to take our jobs. Yet. Again maybe someday, but there is a lot of work that needs to be done to get something semi up and running, and the models that Google uses are not going to be usable for every company. (Take a look at all the specialized models already).

Our job never goes away, but it does constantly evolve. This is just another point where we have to learn new skills, and that may be that we all need to be model tuners some day. At the end of the day the user still needs to correctly describe what they want to have happen on the screen, and there are currently no ways to take what they describe into a full piece of software.

[–] HelixTitan 9 points 1 year ago (1 children)

Hard to believe a senior dev can't find work. Those positions are the most needed. Also 25 an hour is 50k a year. Nowhere in the US are senior devs paid that little. I suppose you may not be US based, but your cost for college seems to imply US, albeit at an expensive school.

[–] p03locke@lemmy.dbzer0.com 13 points 1 year ago (1 children)

And everyone in tech who has worked on ML before collectively says “yeah that’s what we’ve been trying to tell you”.

Everybody in tech who had even a passing understanding of the technology was collectively saying that. We understand the limits of technology and can feel out the bounds easily. But too many of these dumbasses with dollar signs in their eyes are all "to the moon!", tripping and failing on implementing the tech in unreasonable ways.

It was never a factoid machine, like some people wanted to believe. It was always about creatively writing something, and only one with so much attention.

[–] interolivary 10 points 1 year ago

It was never a factoid machine

Funny tidbit about the word "factoid": its original meaning was "an item of unreliable information that is reported and repeated so often that it becomes accepted as fact", but the modern usage is "a brief or trivial item of news or information".

This means that the modern usage of "factoid" is in itself a factoid, and that in the old sense LLMs sort of are factoid machines.

Note that I'm not saying the modern use is wrong. Languages evolve, and words taking on new meanings doesn't mean the new meanings are "wrong" (and surprisingly words changing to mean the opposite of what they used to mean isn't all that uncommon either.)

[–] biddy@feddit.nl 7 points 1 year ago

I disagree, a lot of white collar work is simply writing bullshit.