Copyright is very hard to get right when you have 5TB of data. If we start pulling the trigger on copyright enforcement now, it will just create a moat for OpenAI and basically kill open models.
Side note: I can't imagine OpenAI would ever actually be able to patent GPT-x, given that they neither invented the model (Google did) nor created the training data.
Looks like they're adding quite a few "multimodal" features in GPT-5. The emphasis is on audio: artificial speech production, audio-to-text, and voice recognition, likely building on Whisper. Translation for text/speech also seems to be on the roadmap.