Prompt Evaluations: Anthropic's Official Course · Lesson 8
Custom LLM Judge: Multi-Metric Scoring
Writing a custom llm_eval() function scoring on multiple metrics (conciseness, accuracy, tone 1-5). get_assert() for PromptFoo. Prefill <json> for reliable JSON output. Comparing basic vs better vs best summarization prompt.