this post was submitted on 28 Aug 2024

AI


Artificial intelligence (AI) is intelligence demonstrated by machines, unlike the natural intelligence displayed by humans and animals, which involves consciousness and emotionality. The distinction between the former and the latter categories is often revealed by the acronym chosen.

 

I wanted to extract some crime statistics broken down by type of crime and by different populations, all of course normalized by population size. I got a nice set of tables summarizing the data for each year I requested.

When I shared these summaries, I was told this is entirely unreliable due to hallucinations. So my question to you is: how common a problem is this?

I compared results from ChatGPT-4, Copilot, and Grok, and the results are the same (Gemini says the data is unavailable, btw :)

So, are LLMs reliable for research like that?

top 10 comments
[–] jet@hackertalks.com 1 points 1 month ago* (last edited 1 month ago) (1 children)

LLMs are totally unreliable for research. They are just probable token generators.

Especially if you're looking for new data that nobody has talked about before, you're just going to get convincing hallucinations, like talking to a slightly drunk professor at a loud bar who can't ever admit they don't know something.

Example: ask an LLM this: "What open source software developer died in the September 11th attacks?"

It will give you names, and when you try to verify those names, you'll find out those people didn't die. It's just generating probable tokens.
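
To make "probable token generator" concrete, here's a toy sketch. The vocabulary and probabilities below are made up for illustration; a real model conditions on the full context, but the mechanism is the same:

```python
import random

# A language model repeatedly samples the next token from a probability
# distribution over continuations. Note there is no fact-checking step
# anywhere in this loop: a fluent, plausible-looking name wins even if
# it's false.
next_token_probs = {
    "developer": {"who": 0.5, "named": 0.3, "from": 0.2},
    "named": {"Alice": 0.4, "Bob": 0.35, "Carol": 0.25},
}

def sample_next(token: str) -> str:
    """Sample a continuation weighted by its probability."""
    dist = next_token_probs[token]
    return random.choices(list(dist), weights=list(dist.values()))[0]

print(sample_next("developer"))  # e.g. "named"
print(sample_next("named"))      # a plausible-sounding name, true or not
```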

[–] ViaFedi@lemmy.ml 0 points 1 month ago (1 children)

Solutions exist where you give the LLM a bunch of files, e.g. PDFs, which it then bases its knowledge on exclusively

[–] jet@hackertalks.com 1 points 1 month ago (1 children)

It's still a probable token generator, you're just training it on your local data. Hallucinations will absolutely happen.

[–] slacktoid@lemmy.ml 0 points 1 month ago* (last edited 1 month ago)

This isn't training, it's called a RAG workflow, as there is no training step per se.
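
Roughly, a RAG workflow retrieves the chunks of your files most relevant to the question and pastes them into the prompt; nothing is trained. A minimal sketch (TF-IDF stands in here for an embedding model, and `ask_llm` is a placeholder for whatever chat API you use):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# 1. Index: split your PDFs/files into text chunks ahead of time.
chunks = [
    "Table 3: burglary rate per 100k residents, 2019 ...",
    "Table 4: assault rate per 100k residents, 2019 ...",
]
vectorizer = TfidfVectorizer()
chunk_vectors = vectorizer.fit_transform(chunks)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q_vec = vectorizer.transform([question])
    scores = cosine_similarity(q_vec, chunk_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [chunks[i] for i in top]

def answer(question: str) -> str:
    # 2. Generate: the model is told to answer only from the retrieved text.
    context = "\n".join(retrieve(question))
    prompt = f"Answer using ONLY this context:\n{context}\n\nQ: {question}"
    return ask_llm(prompt)  # placeholder: your chat completion call goes here
```

The LLM at the end still generates tokens the same way, so this constrains hallucinations rather than eliminating them.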

[–] xia@lemmy.sdf.org 1 points 1 month ago

How reliable is autocorrect?

[–] rickdg@lemmy.ml 1 points 1 month ago (1 children)

Treat it like an eager impressionable intern with a confident stride.

[–] jeffhykin@lemm.ee 1 points 1 month ago (1 children)

Who is also 15 yrs old and has brain damage

[–] rickdg@lemmy.ml 1 points 4 weeks ago

Proof that you can do anything if somebody piles billions of dollars on you.

[–] PerogiBoi@lemmy.ca 1 points 1 month ago

They aren’t. They’re a party trick.

[–] queermunist@lemmy.ml 0 points 1 month ago

It's a scam.