Whisper Prompting: Better Transcription Output
Use the optional Whisper API prompt parameter to control output style and the correct spelling of names, brands, and terms. Explore how GPT can generate fictitious prompts for Whisper.
Record or download audio that includes several unusual names or product names. Transcribe without a prompt, then with a glossary prompt. Compare spelling accuracy between the two results.
Task grader
Copy and adapt to your context. Text in angle brackets should be replaced.
from openai import OpenAI
client = OpenAI()
def transcribe(path: str, prompt: str = "") -> str:
with open(path, "rb") as f:
return client.audio.transcriptions.create(
file=f, model="whisper-1", prompt=prompt
).text
def build_glossary_prompt(*terms) -> str:
joined = ", ".join(terms)
return (
f"Glossary of proper nouns and product names: {joined}. "
"Please ensure these terms are spelled correctly in the transcript."
)
# Usage
no_prompt = transcribe("audio.wav")
with_glossary = transcribe(
"audio.wav",
prompt=build_glossary_prompt("QuirkQuid", "GPT-4o", "Aimee", "Shawn"),
)
print("Without prompt:", no_prompt[:200])
print("With glossary: ", with_glossary[:200])