Never thought I'd die fighting alongside a League of Legends fan.
Aye. That I could do.
Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.
This is not debate club. Unless it’s amusing debate.
For actually-good tech, you want our NotAwfulTech community
Never thought I'd die fighting alongside a League of Legends fan.
Aye. That I could do.
Stack overflow now with the sponsored crypto blogspam Joining forces: How Web2 and Web3 developers can build together
I really love the byline here. "Kindest view of one another". Seething rage at the bullshittery these "web3" fuckheads keep producing certainly isn't kind for sure.
Dude discovers that one LLM model is not entirely shit at chess, spends time and tokens proving that other models are actually also not shit at chess.
The irony? He's comparing it against Stockfish, a computer chess engine. Computers playing chess at a superhuman level is a solved problem. LLMs have now slightly approached that level.
For one, gpt-3.5-turbo-instruct rarely suggests illegal moves,
Writeup https://dynomight.net/more-chess/
HN discussion https://news.ycombinator.com/item?id=42206817
Particularly hilarious at how thoroughly they're missing the point. The fact that it suggests illegal moves at all means that no matter how good it's openings are the scaling laws and emergent behaviors haven't magicked up an internal model of the game of Chess or even the state of the chess board it's working with. I feel like playing games is a particularly powerful example of this because the game rules provide a very clear structure to model and it's very obvious when that model doesn't exist.
I remember when several months (a year ago?) when the news got out that gpt-3.5-turbo-papillion-grumpalumpgus could play chess around ~1600 elo. I was skeptical the apparent skill wasn't just a hacked-on patch to stop folks from clowning on their models on xitter. Like if an LLM had just read the instructions of chess and started playing like a competent player, that would be genuinely impressive. But if what happened is they generated 10^12 synthetic games of chess played by stonk fish and used that to train the model- that ain't an emergent ability, that's just brute forcing chess. The fact that larger, open-source models that perform better on other benchmarks, still flail at chess is just a glaring red flag that something funky was going on w/ gpt-3.5-turbo-instruct to drive home the "eMeRgEnCe" narrative. I'd bet decent odds if you played with modified rules, (knights move a one space longer L shape, you cannot move a pawn 2 moves after it last moved, etc), gpt-3.5 would fuckin suck.
Edit: the author asks "why skill go down tho" on later models. Like isn't it obvious? At that moment of time, chess skills weren't a priority so the trillions of synthetic games weren't included in the training? Like this isn't that big of a mystery...? It's not like other NN haven't been trained to play chess...
@gerikson @BlueMonday1984 the only analysis of computer chess anybody needs https://youtu.be/DpXy041BIlA?si=a1vU3zmOWs8UqlSQ
So we have this new tech that makes stuff up and also is a bit racist at times? Lets use it to monitor employees, of course it also trains to replace your job.
This is fucked even without the hallucinating Clippy in the backend.
If your desktop is idle for more than 30-60 seconds (no "meaningful" mouse & keyboard movement), you get a red flag
People getting ~~flogged~~ flagged for being lazy for a few seconds reminds me of something …
caption: """AI is itself significantly accelerating AI progress"""
wow I wonder how you came to that conclusion when the answers are written like a Fallout 4 dialogue tree