I think what the parent was trying to communicate (and what I'm thinking as well) is doubt about your premise in point 1 ("the model must be thinking beyond the next token").
Rephrase "The model is good at picking the correct article for the word it wants to output next" to "After having picked a specific article, the model is good at picking a follow-up noun that matches the chosen article". Nothing about the second statement seems like an unlikely feat for a model that only predicts one word at a time without any thinking ahead about specific words.
>I climbed up the pear tree and picked a pear. I climbed up the apple tree and picked
The argument made in the article (IMO an extremely convincing one) is that the model wouldn't be able to pick the word 'an' unless it had already, in some sense, settled on 'apple' as the following word. Otherwise, why not pick 'a'?
Rephrase "The model is good at picking the correct article for the word it wants to output next" to "After having picked a specific article, the model is good at picking a follow-up noun that matches the chosen article". Nothing about the second statement seems like an unlikely feat for a model that only predicts one word at a time without any thinking ahead about specific words.