Gemini Cookbook: Official Quickstarts
Google's official Gemini quickstarts: function calling, audio, File API, caching, code execution, embeddings, and Live API. Real Python code from the Google team.
Getting Started with Google GenAI SDK
Install the new Google GenAI SDK, initialize a client, send text and multimodal prompts, set system instructions, count tokens, and configure generation parameters.
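As a rough illustration of the steps listed above, the sketch below sends one text prompt with a system instruction and a few generation parameters. It assumes the `google-genai` package is installed, that an API key is available in a `GEMINI_API_KEY` environment variable, and that the model name `gemini-2.0-flash` is available to your account. The helper names `build_config` and `ask` are hypothetical and are not taken from the quickstart itself.

```python
import os


def build_config(system_instruction, temperature=0.2, max_output_tokens=256):
    """Return generation settings as a plain dict (hypothetical helper).

    The SDK's generate_content call accepts its config either as a
    types.GenerateContentConfig object or as a dict with the same keys.
    """
    return {
        "system_instruction": system_instruction,
        "temperature": temperature,
        "max_output_tokens": max_output_tokens,
    }


def ask(prompt, model="gemini-2.0-flash"):
    """Send a single text prompt and return the model's text reply.

    Assumes `pip install google-genai` and a GEMINI_API_KEY environment
    variable; the import is kept local so this module loads without the SDK.
    """
    from google import genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=model,
        contents=prompt,
        config=build_config("You are a concise assistant."),
    )
    return response.text
```

Calling `ask("Explain tokens in one sentence.")` performs a network request, so it needs a valid key; `build_config` can be inspected offline.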
Function Calling
Connect Python functions as model tools, explore automatic and manual calling modes, inspect chat history with FunctionCall/FunctionResponse, and understand FunctionDeclaration schemas.
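The manual calling mode mentioned above roughly amounts to: the model names a function and its arguments, your code runs it, and the result goes back to the model. The sketch below shows only that dispatch step in plain Python. The dict shapes are simplified stand-ins for the SDK's FunctionCall and FunctionResponse types, and `get_weather`, `TOOLS`, and `dispatch` are hypothetical names used for illustration.

```python
def get_weather(city: str) -> dict:
    """Hypothetical tool: returns a canned forecast instead of calling a service."""
    return {"city": city, "forecast": "sunny"}


# Registry mapping tool names (as the model would emit them) to Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch(function_call: dict) -> dict:
    """Run the tool named in a function-call request and wrap its result.

    `function_call` is assumed to look like {"name": ..., "args": {...}},
    mirroring the name/args fields of a function call from the model.
    """
    name = function_call["name"]
    if name not in TOOLS:
        raise KeyError(f"Model requested an unknown tool: {name}")
    result = TOOLS[name](**function_call["args"])
    # The wrapped result plays the role of a function response sent back to the model.
    return {"name": name, "response": result}


reply = dispatch({"name": "get_weather", "args": {"city": "Paris"}})
```

In automatic mode the SDK performs this loop for you; handling it manually is mainly useful when calls need validation, logging, or human approval before they run.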
Working with Audio
Upload audio files via the File API, work with inline audio, request transcripts with timestamps, and analyze YouTube videos.
File API
Upload and manage files via the Gemini File API: images, text, code, GCS objects, multi-file batch uploads, and direct HTTPS URLs.
Context Caching
Cache large documents for repeated use: create CachedContent, manage TTL, use the cache in generate_content and chat, and monitor token counts.
Code Execution
Enable the Python code execution tool, work with files via File I/O, and use code execution in chat and multimodal scenarios.
Embeddings
Generate text and multimodal embeddings, control dimensionality, compute semantic similarity, and build a simple RAG system with task_type.
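The semantic similarity and retrieval steps described above can be sketched without any API call. The example below ranks documents by cosine similarity to a query. The three-dimensional vectors are made-up toy values standing in for real embeddings, which would come from the embeddings endpoint (typically with a document-style task_type for the corpus and a query-style task_type for the question). The names `cosine_similarity` and `rank` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as unrelated to everything.
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors for illustration only; real embeddings have far more dimensions.
docs = {
    "cats": [1.0, 0.0, 0.0],
    "dogs": [0.8, 0.6, 0.0],
    "cars": [0.0, 0.0, 1.0],
}
query = [0.9, 0.1, 0.0]

ranking = rank(query, docs)
```

In a simple RAG setup, the top-ranked documents would then be pasted into the prompt as context for the generation call.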
Live API
Real-time voice and multimodal sessions over WebSocket: text-to-text, text-to-audio, async AudioLoop, session management, and stream handling.