元渊 API

Image-to-Image (Image Editing)

Upload images and edit them through text descriptions using Nano Banana Pro

Overview

Nano Banana Pro supports image editing: you upload an image and describe in text how to modify, enhance, or recreate it. The examples below call it through the gemini-3-pro-image-preview model endpoint.

How It Works

Image-to-image editing takes four steps:

  1. Encode the original image: Read the local file and convert it to base64
  2. Provide an editing description: Describe the modifications you want in text
  3. AI processing: The model interprets the image and the text together and generates a new image
  4. Return the result: The API returns the processed image as base64, which you decode and save
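
The encoding in step 1 and the decoding in step 4 can be sketched with the standard library alone. This is a minimal illustration rather than part of the complete script; the helper names encode_image and decode_image are made up for this example.

```python
import base64

def encode_image(raw: bytes) -> str:
    """Step 1: turn raw image bytes into a base64 string that fits in JSON."""
    return base64.b64encode(raw).decode("ascii")

def decode_image(encoded: str) -> bytes:
    """Step 4: turn the base64 string from the response back into bytes."""
    return base64.b64decode(encoded)

# Stand-in bytes (the 8-byte PNG signature) instead of a real file.
sample = b"\x89PNG\r\n\x1a\n"
encoded = encode_image(sample)
assert decode_image(encoded) == sample
```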

Supported Image Formats

  • JPEG (.jpg, .jpeg)
  • PNG (.png)
  • WebP (.webp)
  • GIF (.gif)
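
One way to check a file against this list before sending it is an explicit extension table. The sketch below is an assumption about how such a check could look, not the service's own validation: the name mime_type_for is hypothetical, and unlike the complete script (which falls back to image/jpeg) it rejects unknown extensions.

```python
import os

# Extensions from the list above, mapped to their standard MIME types.
SUPPORTED_MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".webp": "image/webp",
    ".gif": "image/gif",
}

def mime_type_for(path: str) -> str:
    """Return the MIME type for a supported image path, or raise ValueError."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return SUPPORTED_MIME_TYPES[ext]
    except KeyError:
        raise ValueError(f"Unsupported image format: {ext or '(no extension)'}") from None
```

A lookup like this only inspects the file name; it does not verify that the file contents match the extension.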

Complete Python Example

Features

  • ✅ Supports multiple image formats
  • ✅ Detects the image MIME type automatically
  • ✅ Supports custom aspect ratios
  • ✅ Handles missing files, unsupported aspect ratios, timeouts, connection errors, and malformed responses
  • ✅ Adds a timestamp to output file names automatically

Complete Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
Gemini Image Editor - Python Version
Upload local image + text description to generate new image, with custom aspect ratio support
"""

import requests
import base64
import os
import datetime
import mimetypes
from typing import Optional, Tuple

class GeminiImageEditor:
    """Gemini Image Editor"""

    # Supported aspect ratios
    SUPPORTED_ASPECT_RATIOS = [
        "21:9", "16:9", "4:3", "3:2", "1:1",
        "9:16", "3:4", "2:3", "5:4", "4:5"
    ]

    def __init__(self, api_key: str,
                 api_url: str = "https://api.ai.soraliststudio.com/v1beta/models/gemini-3-pro-image-preview:generateContent"):
        """
        Initialize image editor

        Parameters:
            api_key: API key
            api_url: API URL (the endpoint accepts the Google-native Gemini request format)
        """
        self.api_key = api_key
        self.api_url = api_url
        self.headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}"
        }

    def edit_image(self, image_path: str, prompt: str,
                   aspect_ratio: Optional[str] = "1:1",
                   output_dir: str = ".") -> Tuple[bool, str]:
        """
        Edit image and generate new image

        Parameters:
            image_path: Input image path
            prompt: Editing description (prompt)
            aspect_ratio: Aspect ratio, such as "16:9", "1:1" etc. (default 1:1)
            output_dir: Save directory (default current directory)

        Returns:
            (success, result message)
        """
        print("🚀 Starting image editing...")
        print(f"📁 Input image: {image_path}")
        print(f"📝 Edit description: {prompt}")
        print(f"📐 Aspect ratio: {aspect_ratio}")

        # Check if image file exists
        if not os.path.exists(image_path):
            return False, f"Image file does not exist: {image_path}"

        # Validate aspect ratio
        if aspect_ratio and aspect_ratio not in self.SUPPORTED_ASPECT_RATIOS:
            return False, f"Unsupported aspect ratio {aspect_ratio}. Supported: {', '.join(self.SUPPORTED_ASPECT_RATIOS)}"

        # Read and encode image
        try:
            with open(image_path, 'rb') as f:
                image_data = f.read()
            image_base64 = base64.b64encode(image_data).decode('utf-8')

            # Detect image type
            mime_type, _ = mimetypes.guess_type(image_path)
            if not mime_type or not mime_type.startswith('image/'):
                mime_type = 'image/jpeg'  # Default
            print(f"🎨 Image type: {mime_type}")

        except Exception as e:
            return False, f"Failed to read image: {str(e)}"

        # Generate output filename
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        output_file = os.path.join(output_dir, f"gemini_edited_{timestamp}.png")

        try:
            # Build request data (Google native format)
            payload = {
                "contents": [{
                    "parts": [
                        {"text": prompt},
                        {
                            "inline_data": {
                                "mime_type": mime_type,
                                "data": image_base64
                            }
                        }
                    ]
                }]
            }

            # Add aspect ratio configuration
            if aspect_ratio:
                payload["generationConfig"] = {
                    "responseModalities": ["IMAGE"],
                    "imageConfig": {
                        "aspectRatio": aspect_ratio
                    }
                }

            print("📡 Sending request to Gemini API...")

            # Send request
            response = requests.post(
                self.api_url,
                headers=self.headers,
                json=payload,
                timeout=120
            )

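            if response.status_code != 200:
                # Include the start of the response body to make failures easier to diagnose
                return False, f"API request failed, status code: {response.status_code}, response: {response.text[:200]}"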

            # Parse response
            result = response.json()

            # Extract image data
            if "candidates" not in result or len(result["candidates"]) == 0:
                return False, "No image data found"

            candidate = result["candidates"][0]
            if "content" not in candidate or "parts" not in candidate["content"]:
                return False, "Response format error"

            parts = candidate["content"]["parts"]
            output_image_data = None

            for part in parts:
                if "inlineData" in part and "data" in part["inlineData"]:
                    output_image_data = part["inlineData"]["data"]
                    break

            if not output_image_data:
                return False, "No image data found"

            # Decode and save image
            print("💾 Saving image...")
            decoded_data = base64.b64decode(output_image_data)

            os.makedirs(os.path.dirname(output_file) or ".", exist_ok=True)

            with open(output_file, 'wb') as f:
                f.write(decoded_data)

            file_size = len(decoded_data) / 1024  # KB
            print(f"✅ Image saved: {output_file}")
            print(f"📊 File size: {file_size:.2f} KB")

            return True, f"Successfully saved image: {output_file}"

        except requests.exceptions.Timeout:
            return False, "Request timeout (120 seconds)"
        except requests.exceptions.ConnectionError:
            return False, "Network connection error"
        except Exception as e:
            return False, f"Error: {str(e)}"

def main():
    """Main function - Usage example"""

    # ========== Configuration ==========
    # 1. Set your API key
    API_KEY = "sk-YOUR_API_KEY"

    # 2. Input image path
    INPUT_IMAGE = "./dog.png"  # Replace with your image path

    # 3. Input editing description (prompt)
    PROMPT = "Generate image: add a handsome cat next to the dog, keep the original image structure unchanged"

    # 4. Select aspect ratio (optional)
    # Supported: 21:9, 16:9, 4:3, 3:2, 1:1, 9:16, 3:4, 2:3, 5:4, 4:5
    ASPECT_RATIO = "1:1"  # Square
    # ASPECT_RATIO = "16:9"  # Landscape
    # ASPECT_RATIO = "9:16"  # Portrait

    # 5. Set save directory (optional)
    OUTPUT_DIR = "."  # Current directory
    # ============================

    print("="*60)
    print("Gemini Image Editor")
    print("="*60)
    print(f"⏰ Start time: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)

    # Create editor instance
    editor = GeminiImageEditor(api_key=API_KEY)

    # Execute image editing
    success, message = editor.edit_image(
        image_path=INPUT_IMAGE,
        prompt=PROMPT,
        aspect_ratio=ASPECT_RATIO,
        output_dir=OUTPUT_DIR
    )

    # Display result
    print("\n" + "="*60)
    if success:
        print("🎉 Edit successful!")
        print(message)
    else:
        print("❌ Edit failed!")
        print(message)
    print("="*60)
    print(f"⏰ End time: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)

if __name__ == "__main__":
    main()
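
The response-parsing part of edit_image can also be pulled into a small function that is easy to test without a network call. The sketch below assumes the response shape the script already relies on (candidates, then content, then parts, then inlineData.data); the function name extract_image_bytes and the sample response are illustrative only.

```python
import base64
from typing import Optional

def extract_image_bytes(result: dict) -> Optional[bytes]:
    """Return the first inline image in a generateContent-style response, or None."""
    candidates = result.get("candidates") or []
    if not candidates:
        return None
    parts = (candidates[0].get("content") or {}).get("parts") or []
    for part in parts:
        inline = part.get("inlineData") or {}
        if "data" in inline:
            return base64.b64decode(inline["data"])
    return None

# Hand-written sample response, not real API output.
sample = {
    "candidates": [{
        "content": {"parts": [
            {"text": "Here is your image."},
            {"inlineData": {"mimeType": "image/png",
                            "data": base64.b64encode(b"fake").decode("ascii")}},
        ]}
    }]
}
assert extract_image_bytes(sample) == b"fake"
assert extract_image_bytes({"candidates": []}) is None
```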

Usage Instructions

1. Install Dependencies

pip install requests

2. Prepare Image

Place the image to be edited in an accessible path, for example:

./my_photo.jpg
./images/original.png

3. Configure Parameters

# API key
API_KEY = "sk-YOUR_API_KEY"

# Input image path
INPUT_IMAGE = "./dog.png"

# Edit description
PROMPT = "Add a handsome cat next to the dog"

# Aspect ratio (optional)
ASPECT_RATIO = "1:1"
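
Rather than hardcoding the key in the script, it can be read from the environment. This is a minimal sketch; the variable name IMAGE_EDITOR_API_KEY and the helper load_api_key are assumptions made for this example, not names the service defines.

```python
import os

def load_api_key(env_var: str = "IMAGE_EDITOR_API_KEY") -> str:
    """Read the API key from an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

With the variable exported in the shell, API_KEY = load_api_key() would replace the hardcoded string in main().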

4. Run Script

Save the complete code above as image_editor.py (any file name works), then run:

python3 image_editor.py

Editing Examples

Example 1: Add Elements

PROMPT = "Add a butterfly flying in the flowers to the image"

Example 2: Change Style

PROMPT = "Convert this photo to oil painting style, keeping the original composition"

Example 3: Modify Background

PROMPT = "Change the background to a beach at sunset, keeping the person unchanged"

Example 4: Enhance Details

PROMPT = "Enhance image details, improve clarity, keep overall composition unchanged"

Output Example

============================================================
Gemini Image Editor
============================================================
⏰ Start time: 2026-01-29 14:30:52
============================================================
🚀 Starting image editing...
📁 Input image: ./dog.png
📝 Edit description: Generate image: add a handsome cat next to the dog, keep the original image structure unchanged
📐 Aspect ratio: 1:1
🎨 Image type: image/png
📡 Sending request to Gemini API...
💾 Saving image...
✅ Image saved: ./gemini_edited_20260129_143052.png
📊 File size: 856.43 KB

============================================================
🎉 Edit successful!
Successfully saved image: ./gemini_edited_20260129_143052.png
============================================================
⏰ End time: 2026-01-29 14:32:15
============================================================

cURL Example

If you prefer using the command line:

# 1. Convert the image to base64 on a single line
#    (GNU base64 wraps its output at 76 columns; newlines inside the JSON string would break the request)
BASE64_IMAGE=$(base64 < your_image.jpg | tr -d '\n')

# 2. Send the request; the JSON body is read from stdin (-d @-) so that large
#    images do not exceed the command-line length limit
curl -X POST "https://api.ai.soraliststudio.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "contents": [{
    "parts": [
      {"text": "Add a red rose to the image"},
      {
        "inline_data": {
          "mime_type": "image/jpeg",
          "data": "$BASE64_IMAGE"
        }
      }
    ]
  }],
  "generationConfig": {
    "responseModalities": ["IMAGE"],
    "imageConfig": {
      "aspectRatio": "1:1"
    }
  }
}
EOF

Notes

Image Size Limit

  • Recommended input image size: under 4 MB
  • Larger images may cause:
    • Longer upload times
    • Longer processing times
    • Request timeouts
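
A size check before encoding can catch oversized files early. The sketch below treats the 4 MB recommendation as 4 × 1024 × 1024 bytes, which is an assumption; the function names are hypothetical.

```python
import os

MAX_IMAGE_BYTES = 4 * 1024 * 1024  # assumed reading of the 4 MB recommendation

def is_within_size_limit(size_bytes: int, limit: int = MAX_IMAGE_BYTES) -> bool:
    """True if an image of size_bytes is under the recommended limit."""
    return size_bytes < limit

def check_image_file(path: str) -> bool:
    """Check a file on disk against the recommended limit."""
    return is_within_size_limit(os.path.getsize(path))
```

Files over the limit could be resized or recompressed before upload instead of being rejected outright.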

Network Requirements

Because images travel as base64 text inside the JSON body:

  • Uploads and downloads are large: base64 output is about one third bigger than the raw file
  • Use a stable, high-speed network connection
  • Consider using metered high-speed bandwidth
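
Base64 encodes every 3 bytes as 4 characters, so the size of the encoded image can be estimated before sending. A small sketch of that arithmetic (the function name base64_length is made up for this example; the JSON wrapper and the prompt add a little more on top):

```python
import base64

def base64_length(n_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of n_bytes bytes."""
    return 4 * ((n_bytes + 2) // 3)

raw_size = 3 * 1024 * 1024  # a 3 MiB image
print(base64_length(raw_size))  # 4194304 characters, i.e. 4 MiB

# Cross-check the formula against the real encoder.
assert base64_length(10) == len(base64.b64encode(b"x" * 10))
```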

Timeout Settings

  • Default timeout in the example script: 120 seconds
  • Complex edits or large images may take longer
  • Adjust the timeout to your workload, for example:
response = requests.post(
    self.api_url,
    headers=self.headers,
    json=payload,
    timeout=300  # Increase to 5 minutes
)
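
Besides raising the timeout, a timed-out request can be retried with a growing delay. The helper below is a generic sketch, not part of the script above: call_with_retry and its parameters are hypothetical, and it uses only the standard library so it can be tried without network access.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(); on a listed exception, wait and try again with a doubled delay."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

With requests, func could be a lambda wrapping the requests.post call and retry_on could be (requests.exceptions.Timeout,). Each retry sends a new generation request, so a small number of attempts is usually enough.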

Best Practices

  1. Clear editing description: State the specific change you want and what should stay unchanged
  2. High-quality input: Start from a high-quality source image
  3. Appropriate aspect ratio: Choose an aspect ratio close to the original image's
  4. Test and iterate: Try different prompts to find the best result
  5. Error handling: Always check the returned result and handle possible errors
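
For the aspect ratio choice, one possible approach is to pick the supported ratio closest to the original image's proportions. The sketch below is an assumption about how that could be done: closest_aspect_ratio is a hypothetical helper, the ratio list is copied from the complete script, and the width and height would have to come from an image library or other metadata.

```python
import math

# Same values as SUPPORTED_ASPECT_RATIOS in the complete script.
SUPPORTED_ASPECT_RATIOS = [
    "21:9", "16:9", "4:3", "3:2", "1:1",
    "9:16", "3:4", "2:3", "5:4", "4:5",
]

def closest_aspect_ratio(width: int, height: int) -> str:
    """Pick the supported ratio nearest to width/height on a log scale."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(ratio: str) -> float:
        w, h = ratio.split(":")
        return abs(math.log(int(w) / int(h)) - target)

    return min(SUPPORTED_ASPECT_RATIOS, key=distance)
```

Comparing on a log scale treats landscape and portrait symmetrically, so a 1080 × 1920 image maps to 9:16 just as a 1920 × 1080 image maps to 16:9.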
