Hacker News
blackeyeblitzar on Jan 28, 2025 | on: The Illustrated DeepSeek-R1
How does such a distillation work in theory? They don’t have weights from OpenAI’s models, and can only call their APIs, right? So how can they actually build off of it?
moralestapia on Jan 28, 2025
Like RLHF, but with GPT-4 providing the feedback instead of humans.
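A minimal sketch of what that could look like, assuming the teacher is only reachable through an API and is used purely as a judge. Everything here is hypothetical: `teacher_judge` is a stand-in for a real API call (replaced by a toy rule so the snippet runs offline), and `toy_student_candidates` stands in for sampling two answers from the student. The output is a list of preference pairs of the kind a reward model or preference-based fine-tuning step would consume; this is not a description of what any lab actually did.

```python
from typing import Callable, Dict, List, Tuple


def teacher_judge(prompt: str, a: str, b: str) -> int:
    """Hypothetical stand-in for an API call to the teacher model.

    Returns 0 if candidate `a` is preferred, 1 if `b` is preferred.
    Toy rule (an assumption, only so the sketch runs offline):
    prefer the longer answer.
    """
    return 0 if len(a) >= len(b) else 1


def toy_student_candidates(prompt: str) -> Tuple[str, str]:
    """Hypothetical stand-in for sampling two answers from the student."""
    return (prompt + "?", prompt + " because of X and Y.")


def collect_preferences(
    prompts: List[str],
    candidates: Callable[[str], Tuple[str, str]],
    judge: Callable[[str, str, str], int],
) -> List[Dict[str, str]]:
    """Build (prompt, chosen, rejected) records labeled by the teacher."""
    pairs = []
    for prompt in prompts:
        a, b = candidates(prompt)
        winner = judge(prompt, a, b)
        chosen, rejected = (a, b) if winner == 0 else (b, a)
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


pairs = collect_preferences(
    ["Why is the sky blue"], toy_student_candidates, teacher_judge
)
print(pairs)
```

No teacher weights are needed at any point: the only thing taken from the teacher is its judgment on text, which is exactly what an API exposes.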
KarraAI on Jan 28, 2025
How do you ensure the student model learns robust generalizations rather than just surface-level mimicry?
moralestapia on Jan 28, 2025
No idea, as I don't work on that, but my guess would be that the higher the 'n' (presumably the number of teacher-labeled samples), the more closely model A (the student) approaches model B (the teacher).
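A toy illustration of that guess, under two loud assumptions: that 'n' means the number of outputs sampled from the teacher, and that the "student" is nothing more than a frequency table fitted to those samples. The distribution `TEACHER_PROBS` and all function names are made up for the example. It only shows the student's output distribution converging to the teacher's as n grows; it says nothing about whether a real model generalizes or merely mimics.

```python
import random
from collections import Counter

# Hypothetical teacher: a fixed distribution over four possible outputs.
TEACHER_PROBS = {"yes": 0.5, "no": 0.3, "maybe": 0.15, "unsure": 0.05}


def sample_teacher(n, rng):
    """Draw n outputs from the teacher (stand-in for n API calls)."""
    outputs = list(TEACHER_PROBS)
    weights = [TEACHER_PROBS[o] for o in outputs]
    return rng.choices(outputs, weights=weights, k=n)


def fit_student(samples):
    """'Train' the student by matching the empirical output frequencies."""
    counts = Counter(samples)
    total = len(samples)
    return {o: counts[o] / total for o in TEACHER_PROBS}


def total_variation(p, q):
    """Total variation distance between two distributions on the same keys."""
    return 0.5 * sum(abs(p[o] - q[o]) for o in p)


rng = random.Random(0)  # fixed seed so runs are repeatable
distances = {
    n: total_variation(TEACHER_PROBS, fit_student(sample_teacher(n, rng)))
    for n in (10, 1000, 100000)
}
for n, d in distances.items():
    print(f"n={n:>6}  distance to teacher={d:.4f}")
```

The distance shrinks as n grows, which matches the intuition above, but only on the outputs the teacher was actually asked to produce.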