I think “just writing better code” is a lot harder than you think. You actually have to do research first, you know? Our universities and companies do research too. But I guarantee that using R1 techniques on more compute would follow the scaling law too. It’s not either/or.
artificialfish
Nah, o1 has been out how long? They are already on o3 in the office.
It’s completely normal for someone to copy their work and publish it a year later.
It probably cost them less because they probably just distilled o1 XD. Or they might have gotten insider knowledge (but honestly, how hard could CoT fine-tuning possibly be?)
Yes but also it’s open source soooo
https://huggingface.co/mradermacher/DeepSeek-R1-Distill-Llama-70B-Uncensored-i1-GGUF
Meta? The one that released Llama 3.3? The one that actually publishes its work? What are you talking about?