Image-to-Image (Image Editing)

Overview

Nano Banana Pro supports image editing: upload an image, then modify, enhance, or recreate it with a text description.

How It Works

Image-to-image functionality works through the following steps:

1. Upload original image: Convert the local image to base64 encoding
2. Provide editing description: Describe the modifications you want in text
3. AI processing: The model interprets the image and the text together and generates a new image
4. Return result: The API returns the processed image as base64-encoded data
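
The encode and decode steps above can be sketched in a few lines. This is a minimal illustration, not the service's reference implementation: the helper names `build_edit_payload` and `decode_result_image` are hypothetical, and the payload shape simply mirrors the request body used in the complete Python example on this page.

```python
import base64


def build_edit_payload(image_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Steps 1-2: base64-encode the image and pair it with the text prompt."""
    image_b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {"mime_type": mime_type, "data": image_b64}},
            ]
        }]
    }


def decode_result_image(image_b64: str) -> bytes:
    """Step 4: turn the base64 string from the response back into raw bytes."""
    return base64.b64decode(image_b64)


# Hypothetical usage with placeholder bytes instead of a real image file
payload = build_edit_payload(b"abc", "image/png", "Add a red rose to the image")
print(payload["contents"][0]["parts"][1]["inline_data"]["data"])
```

Step 3 happens on the server, so the client only builds the request and decodes the response.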

Supported Image Formats

JPEG (.jpg, .jpeg)
PNG (.png)
WebP (.webp)
GIF (.gif)
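
The complete example on this page infers the MIME type from the file extension and falls back to `image/jpeg`. As an optional alternative, the four formats above can be told apart by their leading bytes, which also works for renamed or extension-less files. The sketch below is an assumption-level helper (the names `sniff_image_mime` and `sniff_image_file` are hypothetical, not part of the API); it relies only on the well-known file signatures of each format.

```python
from typing import Optional


def sniff_image_mime(header: bytes) -> Optional[str]:
    """Return the MIME type for JPEG, PNG, WebP, or GIF data, else None.

    `header` should contain at least the first 12 bytes of the file.
    """
    if header.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # WebP is a RIFF container: "RIFF" + 4 size bytes + "WEBP"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    if header.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    return None


def sniff_image_file(path: str) -> Optional[str]:
    """Read the first 12 bytes of a file and classify them."""
    with open(path, "rb") as f:
        return sniff_image_mime(f.read(12))
```

A caller could reject the upload when the function returns `None` instead of silently sending the file as JPEG.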

Complete Python Example

Features

✅ Support multiple image formats
✅ Automatic detection of image MIME type
✅ Support custom aspect ratios
✅ Comprehensive error handling
✅ Automatic timestamp addition to output files

Complete Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
Gemini Image Editor - Python Version
Upload local image + text description to generate new image, with custom aspect ratio support
"""

import requests
import base64
import os
import datetime
import mimetypes
from typing import Optional, Tuple

class GeminiImageEditor:
    """Gemini Image Editor"""

    # Supported aspect ratios
    SUPPORTED_ASPECT_RATIOS = [
        "21:9", "16:9", "4:3", "3:2", "1:1",
        "9:16", "3:4", "2:3", "5:4", "4:5"
    ]

    def __init__(self, api_key: str,
                 api_url: str = "https://api.ai.soraliststudio.com/v1beta/models/gemini-3-pro-image-preview:generateContent"):
        """
        Initialize image editor

        Parameters:
            api_key: API key
            api_url: API URL (endpoint that follows the Google native Gemini API format)
        """
        self.api_key = api_key
        self.api_url = api_url
        self.headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}"
        }

    def edit_image(self, image_path: str, prompt: str,
                   aspect_ratio: Optional[str] = "1:1",
                   output_dir: str = ".") -> Tuple[bool, str]:
        """
        Edit image and generate new image

        Parameters:
            image_path: Input image path
            prompt: Editing description (prompt)
            aspect_ratio: Aspect ratio, such as "16:9", "1:1" etc. (default 1:1)
            output_dir: Save directory (default current directory)

        Returns:
            (success, result message)
        """
        print("🚀 Starting image editing...")
        print(f"📁 Input image: {image_path}")
        print(f"📝 Edit description: {prompt}")
        print(f"📐 Aspect ratio: {aspect_ratio}")

        # Check if image file exists
        if not os.path.exists(image_path):
            return False, f"Image file does not exist: {image_path}"

        # Validate aspect ratio
        if aspect_ratio and aspect_ratio not in self.SUPPORTED_ASPECT_RATIOS:
            return False, f"Unsupported aspect ratio {aspect_ratio}. Supported: {', '.join(self.SUPPORTED_ASPECT_RATIOS)}"

        # Read and encode image
        try:
            with open(image_path, 'rb') as f:
                image_data = f.read()
            image_base64 = base64.b64encode(image_data).decode('utf-8')

            # Detect image type
            mime_type, _ = mimetypes.guess_type(image_path)
            if not mime_type or not mime_type.startswith('image/'):
                mime_type = 'image/jpeg'  # Default
            print(f"🎨 Image type: {mime_type}")

        except Exception as e:
            return False, f"Failed to read image: {str(e)}"

        # Generate output filename
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        output_file = os.path.join(output_dir, f"gemini_edited_{timestamp}.png")

        try:
            # Build request data (Google native format)
            payload = {
                "contents": [{
                    "parts": [
                        {"text": prompt},
                        {
                            "inline_data": {
                                "mime_type": mime_type,
                                "data": image_base64
                            }
                        }
                    ]
                }]
            }

            # Add aspect ratio configuration
            if aspect_ratio:
                payload["generationConfig"] = {
                    "responseModalities": ["IMAGE"],
                    "imageConfig": {
                        "aspectRatio": aspect_ratio
                    }
                }

            print("📡 Sending request to Gemini API...")

            # Send request
            response = requests.post(
                self.api_url,
                headers=self.headers,
                json=payload,
                timeout=120
            )

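            if response.status_code != 200:
                # Include the start of the response body to make failures easier to diagnose
                return False, f"API request failed, status code: {response.status_code}, body: {response.text[:200]}"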

            # Parse response
            result = response.json()

            # Extract image data
            if "candidates" not in result or len(result["candidates"]) == 0:
                return False, "No image data found"

            candidate = result["candidates"][0]
            if "content" not in candidate or "parts" not in candidate["content"]:
                return False, "Response format error"

            parts = candidate["content"]["parts"]
            output_image_data = None

            for part in parts:
                if "inlineData" in part and "data" in part["inlineData"]:
                    output_image_data = part["inlineData"]["data"]
                    break

            if not output_image_data:
                return False, "No image data found"

            # Decode and save image
            print("💾 Saving image...")
            decoded_data = base64.b64decode(output_image_data)

            # Create the output directory if needed (an empty path means the current directory)
            os.makedirs(output_dir or ".", exist_ok=True)

            with open(output_file, 'wb') as f:
                f.write(decoded_data)

            file_size = len(decoded_data) / 1024  # KB
            print(f"✅ Image saved: {output_file}")
            print(f"📊 File size: {file_size:.2f} KB")

            return True, f"Successfully saved image: {output_file}"

        except requests.exceptions.Timeout:
            return False, "Request timeout (120 seconds)"
        except requests.exceptions.ConnectionError:
            return False, "Network connection error"
        except Exception as e:
            return False, f"Error: {str(e)}"

def main():
    """Main function - Usage example"""

    # ========== Configuration ==========
    # 1. Set your API key
    API_KEY = "sk-YOUR_API_KEY"

    # 2. Input image path
    INPUT_IMAGE = "./dog.png"  # Replace with your image path

    # 3. Input editing description (prompt)
    PROMPT = "Generate image: add a handsome cat next to the dog, keep the original image structure unchanged"

    # 4. Select aspect ratio (optional)
    # Supported: 21:9, 16:9, 4:3, 3:2, 1:1, 9:16, 3:4, 2:3, 5:4, 4:5
    ASPECT_RATIO = "1:1"  # Square
    # ASPECT_RATIO = "16:9"  # Landscape
    # ASPECT_RATIO = "9:16"  # Portrait

    # 5. Set save directory (optional)
    OUTPUT_DIR = "."  # Current directory
    # ============================

    print("="*60)
    print("Gemini Image Editor")
    print("="*60)
    print(f"⏰ Start time: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)

    # Create editor instance
    editor = GeminiImageEditor(api_key=API_KEY)

    # Execute image editing
    success, message = editor.edit_image(
        image_path=INPUT_IMAGE,
        prompt=PROMPT,
        aspect_ratio=ASPECT_RATIO,
        output_dir=OUTPUT_DIR
    )

    # Display result
    print("\n" + "="*60)
    if success:
        print("🎉 Edit successful!")
        print(message)
    else:
        print("❌ Edit failed!")
        print(message)
    print("="*60)
    print(f"⏰ End time: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)

if __name__ == "__main__":
    main()
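
The response parsing inside `edit_image` can also be read in isolation. The sketch below is inferred from the parsing logic in the code above, not from a published response schema: the function name `extract_image_bytes` is hypothetical, the `mimeType` field in the sample is an assumption, and accepting a snake_case `inline_data` key is a defensive guess rather than documented behavior. Unlike the code above, it scans every candidate instead of only the first.

```python
import base64
from typing import Optional


def extract_image_bytes(result: dict) -> Optional[bytes]:
    """Return the first inline image found in a generateContent-style response."""
    for candidate in result.get("candidates", []):
        parts = candidate.get("content", {}).get("parts", [])
        for part in parts:
            # The example above reads "inlineData"; "inline_data" is accepted as a fallback.
            inline = part.get("inlineData") or part.get("inline_data")
            if inline and "data" in inline:
                return base64.b64decode(inline["data"])
    return None


# Hand-written sample response (shape assumed from the parsing code above)
sample = {"candidates": [{"content": {"parts": [
    {"text": "Here is your edited image"},
    {"inlineData": {"mimeType": "image/png", "data": "YWJj"}},
]}}]}

image_bytes = extract_image_bytes(sample)
print(len(image_bytes) if image_bytes else "No image data found")
```

Text parts are skipped, so a response that mixes commentary with an image still yields the image bytes.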

Usage Instructions

1. Install Dependencies

pip install requests

2. Prepare Image

Place the image to be edited in an accessible path, for example:

./my_photo.jpg
./images/original.png

3. Configure Parameters

# API key
API_KEY = "sk-YOUR_API_KEY"

# Input image path
INPUT_IMAGE = "./dog.png"

# Edit description
PROMPT = "Add a handsome cat next to the dog"

# Aspect ratio (optional)
ASPECT_RATIO = "1:1"

4. Run Script

Save the complete code above as image_editor.py, then run:

python3 image_editor.py

Editing Examples

Example 1: Add Elements

PROMPT = "Add a butterfly flying in the flowers to the image"

Example 2: Change Style

PROMPT = "Convert this photo to oil painting style, keeping the original composition"

Example 3: Modify Background

PROMPT = "Change the background to a beach at sunset, keeping the person unchanged"

Example 4: Enhance Details

PROMPT = "Enhance image details, improve clarity, keep overall composition unchanged"

Output Example

============================================================
Gemini Image Editor
============================================================
⏰ Start time: 2026-01-29 14:30:52
============================================================
🚀 Starting image editing...
📁 Input image: ./dog.png
📝 Edit description: Generate image: add a handsome cat next to the dog, keep the original image structure unchanged
📐 Aspect ratio: 1:1
🎨 Image type: image/png
📡 Sending request to Gemini API...
💾 Saving image...
✅ Image saved: gemini_edited_20260129_143052.png
📊 File size: 856.43 KB

============================================================
🎉 Edit successful!
Successfully saved image: gemini_edited_20260129_143052.png
============================================================
⏰ End time: 2026-01-29 14:32:15
============================================================

cURL Example

If you prefer using the command line:

# 1. Convert image to base64 on a single line (line breaks would make the JSON string invalid)
BASE64_IMAGE=$(base64 < your_image.jpg | tr -d '\n')

# 2. Send request
curl -X POST "https://api.ai.soraliststudio.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"contents\": [{
      \"parts\": [
        {\"text\": \"Add a red rose to the image\"},
        {
          \"inline_data\": {
            \"mime_type\": \"image/jpeg\",
            \"data\": \"$BASE64_IMAGE\"
          }
        }
      ]
    }],
    \"generationConfig\": {
      \"responseModalities\": [\"IMAGE\"],
      \"imageConfig\": {
        \"aspectRatio\": \"1:1\"
      }
    }
  }"

Notes

Image Size Limit

Recommended image size: < 4MB
Larger images may cause:
- Longer upload time
- Increased processing time
- Possible timeout
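
A size check before encoding can catch oversized files early. This is a minimal sketch: the 4 MB threshold comes from the recommendation above, while the helper names `size_within_limit` and `check_image_file` are hypothetical and the limit is treated as advisory rather than as a hard API constraint.

```python
import os
from typing import Tuple

MAX_IMAGE_BYTES = 4 * 1024 * 1024  # 4 MB, the recommended limit stated above


def size_within_limit(size_bytes: int, limit: int = MAX_IMAGE_BYTES) -> bool:
    """True when the image is strictly smaller than the recommended limit."""
    return size_bytes < limit


def check_image_file(path: str, limit: int = MAX_IMAGE_BYTES) -> Tuple[bool, str]:
    """Return (ok, message) in the same style as edit_image's return value."""
    size = os.path.getsize(path)
    size_mb = size / (1024 * 1024)
    if not size_within_limit(size, limit):
        return False, f"Image is {size_mb:.2f} MB; consider resizing or compressing it first"
    return True, f"Image is {size_mb:.2f} MB"
```

Calling `check_image_file` before `edit_image` lets the script fail fast instead of waiting for a slow upload or a timeout.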

Network Requirements

Images are transferred as base64 text, which is about one third larger than the original binary data, so:

Upload and download data volume is large
Use a stable, high-speed network connection
Consider using metered high-speed bandwidth
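
The transfer overhead of base64 can be estimated before sending anything. Standard padded base64 emits 4 characters for every 3 input bytes, rounded up. The helper name `base64_encoded_size` below is hypothetical; the formula itself follows from the encoding.

```python
import base64


def base64_encoded_size(raw_bytes: int) -> int:
    """Length in characters of the padded base64 text for raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


# Cross-check the formula against the real encoder
sample = b"x" * 1000
assert base64_encoded_size(len(sample)) == len(base64.b64encode(sample))

four_mb = 4 * 1024 * 1024
print(base64_encoded_size(four_mb))  # 5592408 characters for a 4 MB image
```

The same ratio applies to the returned image, so both directions carry roughly 33% more data than the binary files.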

Timeout Settings

Default timeout: 120 seconds
Complex image processing may take longer
Adjust the timeout to suit your workload, for example:

response = requests.post(
    self.api_url,
    headers=self.headers,
    json=payload,
    timeout=300  # Increase to 5 minutes
)
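
Besides raising the timeout, a request that times out can be retried with a growing delay. The sketch below is a generic pattern under stated assumptions, not something the API requires: `call_with_retries` is a hypothetical helper, it uses the built-in `TimeoutError` by default so that it runs without third-party packages, and the attempt count and delays are arbitrary.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T],
                      retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
                      attempts: int = 3,
                      base_delay: float = 1.0) -> T:
    """Call func(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))  # 1x, 2x, 4x, ...
    raise RuntimeError("unreachable")
```

With `requests`, one would pass `retry_on=(requests.exceptions.Timeout,)` and wrap the `requests.post(...)` call in a lambda. Note that `edit_image` catches timeouts itself and returns `(False, message)`, so the wrapper belongs around the HTTP call rather than around `edit_image`.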

Best Practices

Clear editing description: Provide clear, specific modification requirements
Maintain original quality: Use high-quality input images
Appropriate aspect ratio: Choose suitable aspect ratio based on original image
Test and iterate: Try different prompts to get the best results
Error handling: Always check return results and handle possible errors
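
For the aspect ratio advice above, one possible approach is to pick the supported ratio closest to the original image's proportions. This is a sketch under assumptions: `closest_aspect_ratio` is a hypothetical helper, the list is copied from `SUPPORTED_ASPECT_RATIOS` in the complete example, and the width and height must be obtained separately (for instance with an image library, not shown here).

```python
import math

SUPPORTED_ASPECT_RATIOS = [
    "21:9", "16:9", "4:3", "3:2", "1:1",
    "9:16", "3:4", "2:3", "5:4", "4:5",
]


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported ratio nearest to width:height.

    Distances are compared in log space so that landscape and portrait
    deviations of the same factor are treated equally.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(ratio: str) -> float:
        w, h = ratio.split(":")
        return abs(math.log(int(w) / int(h)) - target)

    return min(SUPPORTED_ASPECT_RATIOS, key=distance)


print(closest_aspect_ratio(1920, 1080))  # 16:9
print(closest_aspect_ratio(1080, 1350))  # 4:5
```

The result can be passed as the `aspect_ratio` argument of `edit_image` so the output keeps roughly the same framing as the input.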
