GeminiKit

A comprehensive Swift SDK for Google's Gemini AI models with native async/await support

Build Intelligent Apps

Everything you need to integrate Google's Gemini AI into your Swift applications

🎨 Image Generation: Imagen 3.0 with multiple aspect ratios and batch support
🎬 Video Generation: Veo 2.0 text-to-video and image-to-video creation
🗣️ Text-to-Speech: Natural voices with multi-speaker dialogue support
🧠 Thinking Mode: Step-by-step reasoning with transparent thought process
🌐 Web Grounding: Real-time Google Search for current information
💻 Code Execution: Generate and run code for computational tasks
🛠️ Function Calling: Build AI agents with parallel function execution
🎥 Multimodal: Process images, videos, audio, and documents
⚡ Streaming: Real-time SSE responses with platform optimization
🔤 Embeddings: High-quality vectors for search and similarity
🔄 OpenAI Compatible: Drop-in replacement for existing OpenAI code (sketched after the examples below)
🖥️ CLI Tools: Command-line interface for all features
💬 Chat Sessions: Stateful conversations with history management
📊 Structured Output: JSON mode and schema-enforced responses
🔒 Type Safety: Strongly typed Swift API with full autocomplete
📁 File Management: Upload and manage files for processing (sketched after the examples below)
🛡️ Safety Controls: Configurable content filtering and harm categories (sketched after the examples below)
📦 Zero Dependencies: Pure Swift with no external packages (package setup sketched just below)
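GeminiKit is distributed as a Swift package. A minimal Package.swift sketch for pulling it in, assuming Swift Package Manager; the repository URL and version below are placeholders, substitute the actual ones:

// swift-tools-version:5.9
import PackageDescription

let package = Package(
  name: "MyApp",
  dependencies: [
    // Placeholder URL and version: point these at the actual GeminiKit repository
    .package(url: "https://github.com/<owner>/GeminiKit.git", from: "1.0.0")
  ],
  targets: [
    .executableTarget(name: "MyApp", dependencies: ["GeminiKit"])
  ]
)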

// Initialize GeminiKit
import GeminiKit

let gemini = GeminiKit(
  apiKey: "your-api-key"
)

// Text Generation
let response = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  prompt: "Explain quantum computing"
)
print(response.text ?? "")

// Streaming Responses
for try await chunk in gemini.streamContent(
  model: .gemini_2_5_pro,
  prompt: "Write a story"
) {
  print(chunk.text ?? "")
}

// Image Generation
let images = try await gemini.generateImages(
  model: .imagen_3_0_generate,
  prompt: "Japanese garden sunset",
  aspectRatio: .landscape_16_9
)

// Multimodal Analysis
let result = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  messages: [
    .user("What's in this image?",
      .imageData(imageData))
  ]
)

// Chat Sessions
let chat = gemini.startChat(
  model: .gemini_2_5_flash
)
let msg = try await chat.sendMessage(
  "Hello! How are you?"
)

// Function Calling
let result = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  prompt: "Get the weather",
  tools: [weatherTool],
  toolConfig: .auto
)

// Thinking Mode
for try await chunk in gemini.streamContent(
  model: .gemini_2_5_flash_thinking,
  prompt: "Design a system"
) {
  print(chunk.text ?? "") // includes reasoning steps
}

// Text-to-Speech
let audio = try await gemini.generateSpeech(
  model: .gemini_2_5_flash_tts,
  text: "Hello world!",
  voice: .aoede
)

// Embeddings
let embedding = try await gemini.embedContent(
  model: .textEmbedding004,
  content: "Semantic search text"
)
// Returns vector array

// Web Grounding
let result = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  prompt: "Latest AI news",
  tools: [.googleSearch()]
)

// Code Execution
let result = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  prompt: "Calculate fibonacci",
  tools: [.codeExecution()]
)

// JSON Mode
let result = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  prompt: "List 3 colors",
  generationConfig: .init(
    responseMIMEType: "application/json"
  )
)

// Video Generation (Paid)
let videoOp = try await gemini.generateVideos(
  model: .veo_2_0_generate,
  prompt: "Sunset timelapse",
  duration: .seconds_8,
  aspectRatio: .landscape_16_9
)

// Video Analysis
let analysis = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  messages: [
    .user("Analyze this video",
      .videoFile("video.mp4"))
  ]
)
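The OpenAI-compatible mode has no example above. A minimal sketch, assuming a chat-completions style entry point; `createChatCompletion`, `ChatMessage`, and the response shape are illustrative guesses modeled on the OpenAI chat API, not confirmed GeminiKit names:

// OpenAI-Compatible Mode (hypothetical names)
let completion = try await gemini.createChatCompletion(
  model: "gemini-2.5-flash",
  messages: [
    ChatMessage(role: "user", content: "Hello!")
  ]
)
print(completion.choices.first?.message.content ?? "")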
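File management also lacks an example above. A sketch of an upload-then-reference flow; `uploadFile(url:mimeType:)` and `.fileReference` are assumed names following the patterns in the snippets above:

// File Management (hypothetical names)
let file = try await gemini.uploadFile(
  url: URL(fileURLWithPath: "report.pdf"),
  mimeType: "application/pdf"
)
let summary = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  messages: [
    .user("Summarize this document",
      .fileReference(file))
  ]
)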
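Finally, a sketch of configuring safety controls; the `safetySettings` parameter and the `SafetySetting` type are assumptions, with category and threshold values modeled on the Gemini REST API's harm categories and block thresholds:

// Safety Controls (hypothetical names)
let safe = try await gemini.generateContent(
  model: .gemini_2_5_flash,
  prompt: "Tell me a story",
  safetySettings: [
    SafetySetting(
      category: .harassment,       // maps to HARM_CATEGORY_HARASSMENT
      threshold: .blockOnlyHigh    // maps to BLOCK_ONLY_HIGH
    )
  ]
)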

Multi-Platform Support

Build once, deploy everywhere

📱 iOS · 💻 macOS · 📺 tvOS · 🥽 visionOS · 🐧 Linux