Yeah, I guess "creative writing" in this case is shorthand for saying they aren't that good at conforming to user-specified constraints. They can generate "creative" texts, but they can't necessarily constrain or iterate on that output effectively in a conversational setting.
... but it's also the perfect choice for creative writing ...?
Isn't this a contradiction? How can a model be good at creative writing if it's no good at conversation?