Model Guide

Gemini

Google's Gemini - best practices and techniques

Overview

Gemini is Google's multimodal AI model, designed from the ground up to be natively multimodal - understanding and reasoning across text, code, images, audio, and video. It's integrated across Google's products including Search, Workspace, and Gemini apps.

Gemini 1.5 Pro offers large context windows (up to 2M tokens) and strong performance on reasoning tasks. Available via gemini.google.com and Google AI Studio.

Strengths

Multimodal

Native support for images, audio, and video input

Long Context

Can process very long documents and codebases

Google Integration

Deep integration with Search and Workspace

Code Generation

Strong coding capabilities with good explanations

Gemini-Specific Tips

01
Leverage multimodal input
Include images, screenshots, or diagrams when relevant
02
Use system instructions
Set persistent behavior via system prompts
03
Be specific about output
Gemini works well with explicit format requirements
04
Use grounding options
Enable Google Search grounding for current information

Example Prompt

Effective Gemini Prompt

You are a UX designer reviewing a mobile app interface. Context: - This is a banking app for millennials - Target audience: 25-40 years old Task: Analyze this screenshot and provide: 1. First impression (1 sentence) 2. 3 strengths 3. 3 areas for improvement 4. Overall score (1-10) [Insert screenshot of app UI] Format as: ## First Impression [Your response] ## Strengths - [Item 1] ... ## Improvements - [Item 1] ... ## Score: X/10

This prompt leverages Gemini's multimodal strengths by including an image, provides clear context about the target audience, and specifies a structured output format. Gemini excels at visual analysis combined with contextual reasoning.

← Previous: Claude Back to Learn Hub