Direct Preference Optimization: Learning from Preferences — OpenAI Fine-tuning: From Data to DPO