Hacker News | armanified's comments

True, but most would ignore an LM if it weren't an LLM.

This title might have triggered something in those bots; most of them have sneaky AI SaaS links in their bio.

Honestly, I never expected this post to become so popular. It was just the outcome of a weekend practice session.


OMG! Why didn't I think of this first :P

OMG! You just gave me the next idea.

Pretty neat! I'll definitely take a deeper look into this.

Uppercase letters were intentionally ignored.

My initial idea was to train a navigation decision model with 25M parameters for a Raspberry Pi, which, in testing, was getting about 60% of tool calls correct. IMO, it seems like around 20M parameters would be a good size for following some narrow & basic language instructions.
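To get a feel for what a 20-25M parameter budget means, here is a rough parameter-count sketch for a small GPT-style decoder-only transformer. The config values (vocab size, width, depth) are hypothetical, chosen only to land near that range; they are not the actual model's settings.

```python
# Rough parameter count for a small GPT-style decoder-only transformer.
# All config values below are hypothetical illustrations.

def transformer_params(vocab, d_model, n_layers, d_ff, ctx):
    emb = vocab * d_model + ctx * d_model          # token + positional embeddings
    attn = 4 * d_model * d_model + 4 * d_model     # Q, K, V, output projections (+ biases)
    mlp = 2 * d_model * d_ff + d_model + d_ff      # two MLP linear layers (+ biases)
    norms = 2 * 2 * d_model                        # two LayerNorms (scale + shift)
    per_layer = attn + mlp + norms
    return emb + n_layers * per_layer + 2 * d_model  # + final LayerNorm

# Hypothetical config that lands near 20M parameters
n = transformer_params(vocab=16000, d_model=384, n_layers=8, d_ff=1536, ctx=512)
print(f"{n / 1e6:.1f}M parameters")  # about 20.5M
```

Note how the embedding table alone eats roughly 6M of the budget here, which is part of why capacity at this scale is so tight.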

Ok. This makes me wonder about a broader question. Is there a scientific approach showing a pyramid of cognitive functions, and how many parameters are (minimally) required for each layer in this pyramid?

It mostly doesn't; at 9M parameters it has very limited capacity. The whole idea of this project is to demonstrate how Language Models work.
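In the same demonstrative spirit, here is a minimal character-level bigram "language model". This is not the project's architecture, just an illustration of the core idea that even a tiny model learns next-token statistics from text.

```python
# A minimal character-level bigram language model: count which character
# follows which, then sample from those counts to generate text.
from collections import Counter, defaultdict
import random

def train_bigram(text):
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, start, length, seed=0):
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        options = counts.get(out[-1])
        if not options:
            break  # dead end: no observed successor
        chars, weights = zip(*options.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

model = train_bigram("the quick brown fox jumps over the lazy dog ")
print(generate(model, "t", 20))
```

Scaling this idea up (longer context, learned representations instead of raw counts) is essentially what the neural model does with its 9M parameters.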

I haven't compared it with anything yet. Thanks for the suggestion; I'll look into these.

I intentionally removed all optimizations to keep it vanilla.
