
A quick Google search turns up terms such as "sparse attention", a family of techniques used to avoid the quadratic runtime of standard attention.

I don't know whether Anthropic has revealed such details, since AI research is getting more and more secretive, but the architectural tricks definitely exist.



Then you need to dig a little deeper. No one applies sparse attention at inference time to a model that wasn't trained with it. It has to be used at training time as well, because otherwise task performance degrades too much.
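
To make the thread concrete, here is a minimal sketch of one common sparse-attention pattern, sliding-window (local) attention, in PyTorch. The function name, shapes, and window size are my own illustrative choices, not anything Anthropic has published. For clarity the sketch still materializes the full (n, n) score matrix and only masks it; a production kernel would compute just the banded entries to actually get the O(n * window) cost.

    import torch
    import torch.nn.functional as F

    def sliding_window_attention(q, k, v, window):
        # Each query position i attends only to keys j with i - window < j <= i,
        # so a banded kernel does O(n * window) work instead of O(n^2).
        # This sketch builds the dense score matrix and masks it for readability;
        # real implementations compute only the entries inside the band.
        n, d = q.shape[-2], q.shape[-1]
        scores = q @ k.transpose(-2, -1) / d ** 0.5
        idx = torch.arange(n)
        # True where a position is outside the causal local window.
        outside = (idx[None, :] > idx[:, None]) | (idx[:, None] - idx[None, :] >= window)
        scores = scores.masked_fill(outside, float('-inf'))
        return F.softmax(scores, dim=-1) @ v

    q = k = v = torch.randn(1, 8, 64)           # (batch, seq_len, head_dim)
    out = sliding_window_attention(q, k, v, 4)  # out.shape == (1, 8, 64)

The same mask would be applied during training, which is the point made above: the model learns to route information through the restricted pattern, so performance doesn't collapse when the pattern is enforced at inference.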





