
Note that the benchmarks used for comparison are basically measuring the model’s ability to understand financial content. In other words, reading comprehension for English, just in a specific domain. It shouldn’t really be surprising that a strong generalist model performs well here.

On the other hand, GPT-4 actually did worse than their finetuned model on the NER task (identifying and labelling entities in the text). I assume the finetuned model was better at producing the specific labels they were targeting.
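To make the label-schema point concrete, here is a minimal sketch (entirely hypothetical labels and data, not the paper's actual schema or scoring code) of strict entity-level F1, where an entity only counts if both the span and the label match. A generalist model can find the right spans but still score zero if it emits generic tags instead of the target label set:

```python
# Hypothetical illustration: strict entity-level F1 scoring for NER.
# An entity counts as correct only if span AND label agree exactly.

def entity_f1(gold, pred):
    """Compute strict-match F1 over (start, end, label) tuples."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Gold annotations use a domain-specific schema (made-up labels).
gold = [(0, 4, "TICKER"), (10, 17, "FIN_METRIC")]

# A generalist model finds the same spans but uses generic labels.
pred_generic = [(0, 4, "ORG"), (10, 17, "MISC")]

# A finetuned model emits the target schema.
pred_tuned = [(0, 4, "TICKER"), (10, 17, "FIN_METRIC")]

print(entity_f1(gold, pred_generic))  # 0.0 - right spans, wrong labels
print(entity_f1(gold, pred_tuned))    # 1.0
```

Under this kind of scoring, finetuning on the exact label inventory matters as much as understanding the text.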


