
To be fair, it also seems like it performs worse. Question: "Who is Scott Alexander?"

chatgpt-turbo:

>As an AI language model, I cannot determine the specific Scott Alexander in question without further contextual information. There may be several people with this name, including writers, politicians, entrepreneurs, athletes, and more.

chatgpt website:

>Scott Alexander is a pseudonym used by a prominent American psychiatrist and blogger who is best known for his writing on the blog "Slate Star Codex." He has also written articles for other publications, including The New York Times, The Atlantic, and The Wall Street Journal.

>Scott Alexander's writing covers a wide range of topics, including medicine, psychiatry, economics, politics, and culture. He is known for his thoughtful, analytical approach to complex issues and for his ability to explain difficult concepts in a clear and accessible way.

>In 2020, Scott Alexander announced that he was deleting his blog after he was doxxed by a reporter for The New York Times who planned to write an article about him. However, he later started a new blog called "Astral Codex Ten" where he continues to write.



Is that necessarily worse performance, though?

One of the main pitfalls/criticisms of ChatGPT has been that it confidently plows forward and gives an answer regardless of whether it's right or wrong.

Here, it seems like it's being more circumspect, which could be a step in the right direction. At least that's one possible explanation for not answering.

On Wikipedia, if I type "Scott Alexander" and hit enter, it takes me directly to the page for a baseball player. So it's not clear that the blogger is the right answer.

I do think there's a better response than either of these, though. It could list the most famous Scott Alexanders and briefly say what each is known for, then ask if you mean one of those.


With enough tries it gives wrong answers to the exact same question too, so I don't see an improvement in that direction.


Perhaps a transitory issue. I just tried it with the API, `gpt-3.5-turbo`. I got:

> Scott Alexander is the pen name of American psychiatrist and blogger, Scott Alexander Siskind. He is known for writing his blog, "Slate Star Codex", which covers a wide range of topics including science, medicine, politics, and culture. He has been praised for his clear and concise writing style and thoughtful analysis of various issues. In addition to his work as a blogger, Scott Alexander has also published a book titled "Unsong", which is a fantasy novel set in an alternate universe where the Bible is a magical text.


Can we really draw any conclusions about LLMs based on one sample? Maybe you've tried multiple times and with different semi-famous people, but in general I see people comparing ML models in this fashion.


Not really; I did try multiple attempts with multiple people, and chatgpt had more issues. I just shared one of them. If someone tests this in a more systematic fashion, that'd be great.


Did you add the default ChatGPT system prompt at the beginning when using the API?


I'm doing it via the openai library, the way shown in their docs.

>completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Who is Scott Alexander?"}])
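
Roughly the full script, if it helps (the import, key handling, and print are my additions for completeness, not from the docs):

    import os
    import openai

    # The library needs an API key; here it is read from the environment.
    openai.api_key = os.environ["OPENAI_API_KEY"]

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Who is Scott Alexander?"}],
    )

    # The model's reply is in the first choice's message content.
    print(completion.choices[0].message.content)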


Adding ChatGPT's initial prompt as a message with `system` role may make a difference (didn't try): https://platform.openai.com/docs/guides/chat/instructing-cha...

Also, we don't know ChatGPT's parameters (temperature, etc.).
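
A rough sketch of what that could look like. Note the system prompt text below is only an approximation (OpenAI hasn't published ChatGPT's exact prompt), and temperature=0.7 is just a guess at a ChatGPT-like setting:

    import openai

    # Approximation of ChatGPT's default system prompt; the real one isn't documented.
    system_prompt = (
        "You are ChatGPT, a large language model trained by OpenAI. "
        "Answer as concisely as possible."
    )

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0.7,  # guess; ChatGPT's actual sampling parameters aren't public
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "Who is Scott Alexander?"},
        ],
    )

    print(completion.choices[0].message.content)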



