OpenAI API from A to Z · Lesson 6
Multimodality: vision, Whisper, TTS
gpt-4o with images (URL vs base64), transcription via Whisper, speech synthesis with TTS and audio streaming.
gpt-4o with images (URL vs base64), transcription via Whisper, speech synthesis with TTS and audio streaming.