Model Guide

Gemini

Google's Gemini - best practices and techniques

Overview

Gemini is Google's multimodal AI model, designed from the ground up to be natively multimodal - understanding and reasoning across text, code, images, audio, and video. It's integrated across Google's products including Search, Workspace, and Gemini apps.

Gemini 1.5 Pro offers large context windows (up to 2M tokens) and strong performance on reasoning tasks. Available via gemini.google.com and Google AI Studio.

Strengths

Multimodal

Native support for images, audio, and video input

Long Context

Can process very long documents and codebases

Google Integration

Deep integration with Search and Workspace

Code Generation

Strong coding capabilities with good explanations

Gemini-Specific Tips

  • 01
    Leverage multimodal input

    Include images, screenshots, or diagrams when relevant

  • 02
    Use system instructions

    Set persistent behavior via system prompts

  • 03
    Be specific about output

    Gemini works well with explicit format requirements

  • 04
    Use grounding options

    Enable Google Search grounding for current information

Example Prompt

Effective Gemini Prompt
You are a UX designer reviewing a mobile app interface. Context: - This is a banking app for millennials - Target audience: 25-40 years old Task: Analyze this screenshot and provide: 1. First impression (1 sentence) 2. 3 strengths 3. 3 areas for improvement 4. Overall score (1-10) [Insert screenshot of app UI] Format as: ## First Impression [Your response] ## Strengths - [Item 1] ... ## Improvements - [Item 1] ... ## Score: X/10

This prompt leverages Gemini's multimodal strengths by including an image, provides clear context about the target audience, and specifies a structured output format. Gemini excels at visual analysis combined with contextual reasoning.