
Assuming this is confirmed, what's the impact on training?

Inference is definitely an issue for LLMs right now. But if training were suddenly possible for lone hackers (or maybe smaller companies), it would open up a lot of new possibilities as well.



In theory it should make training a lot easier too, particularly on CPUs. But I think you'll still need reasonably expensive compute to get a model anywhere close to the current big models, and you really can't ignore data. Data quality and quantity are both huge ingredients in model quality, at least as big as architecture. It's still non-trivial to get a large, good-quality dataset, and that's certainly out of reach for lone hackers and most small companies.



