gemini-text
Generate text content using Google Gemini models via scripts/. Use for text generation, multimodal prompts with images, thinking mode for complex reasoning, JSON-formatted outputs, and Google Search grounding for real-time information. Triggers on "generate with gemini", "use gemini for text", "AI text generation", "multimodal prompt", "gemini thinking mode", "grounded response".
What this skill does
# Gemini Text Generation

Generate content using Google's Gemini API through executable scripts with advanced capabilities including system instructions, thinking mode, JSON output, and Google Search grounding.

## When to Use This Skill

Use this skill when you need to:

- Generate any type of text content (blogs, emails, code, stories)
- Process images with text descriptions or analysis
- Perform complex reasoning requiring step-by-step thinking
- Get structured JSON outputs for data processing
- Access real-time information via Google Search
- Apply specific personas or behavior patterns
- Combine text generation with other Gemini skills (images, TTS, embeddings)

## Available Scripts

### scripts/generate.js

**Purpose**: Full-featured text generation with all Gemini capabilities

**When to use**:

- Any text generation task
- Multimodal prompts (text + image)
- Complex reasoning requiring thinking mode
- Structured JSON output requirements
- Real-time information needs (grounding)
- Custom system instructions/personas

**Key parameters**:

| Parameter | Description | Example |
|-----------|-------------|---------|
| `prompt` | Text prompt (required) | `"Explain quantum computing"` |
| `--model`, `-m` | Model to use | `gemini-3-flash-preview` |
| `--system`, `-s` | System instruction | `"You are a helpful assistant"` |
| `--thinking`, `-t` | Enable thinking mode | Flag |
| `--json`, `-j` | Force JSON output | Flag |
| `--grounding`, `-g` | Enable Google Search | Flag |
| `--image`, `-i` | Image for multimodal | `photo.png` |
| `--temperature` | Sampling temperature, 0.0-2.0 | `0.7` for creative |
| `--max-tokens` | Output limit | `1000` |

**Output**: Generated text string, optionally with grounding sources

## Workflows

### Workflow 1: Basic Text Generation

```bash
node scripts/generate.js "Explain quantum computing in simple terms"
```

- Best for: Simple content creation, explanations, summaries
- Model: `gemini-3-flash-preview` (default, fast)

### Workflow 2: With System Instruction (Persona)

```bash
node scripts/generate.js "How do I read a file in Python?" --system "You are a helpful coding assistant"
```

- Best for: Domain-specific tasks, expert personas, consistent tone
- Use when: You need specific behavioral constraints

### Workflow 3: Complex Reasoning (Thinking Mode)

```bash
node scripts/generate.js "Analyze the ethical implications of AI in healthcare" --thinking
```

- Best for: Complex analysis, step-by-step reasoning, multi-step problems
- Use when: Task requires careful consideration and logical progression

### Workflow 4: Structured JSON Output

```bash
node scripts/generate.js "Generate a user profile object with name, email, and preferences" --json
```

- Best for: Data extraction, structured data generation, API responses
- Output: Valid JSON ready for parsing
- Note: Prompt must clearly request JSON structure

### Workflow 5: Real-Time Information (Grounding)

```bash
node scripts/generate.js "Who won the latest Super Bowl?" --grounding
```

- Best for: Current events, news, factual information after training cutoff
- Output: Response + grounding sources with citations
- Use when: Accuracy of current information is critical

### Workflow 6: Multimodal (Image Analysis)

```bash
node scripts/generate.js "Describe what's in this image in detail" --image photo.png
```

- Best for: Image captioning, visual analysis, image-based Q&A
- Requires: Image file in PNG or JPEG format
- Combines well with: gemini-files for file upload

### Workflow 7: Content Creation Pipeline (Batch + Text + TTS)

```bash
# 1. Create batch requests (gemini-batch skill)
# 2. Generate content
node scripts/generate.js "Create a 500-word blog post about sustainable energy"
# 3. Convert to audio (gemini-tts skill)
```

- Best for: High-volume content production, podcasts, audiobooks

## Parameters Reference

### Model Selection

| Model | Speed | Intelligence | Context | Best For |
|-------|-------|--------------|---------|----------|
| `gemini-3-flash-preview` | Fast | High | 1M | General use, agentic tasks (default) |
| `gemini-3-pro-preview` | Medium | Highest | 1M | Complex reasoning, research |
| `gemini-2.5-flash` | Fast | Medium | 1M | Stable, reliable generation |
| `gemini-2.5-pro` | Slow | High | 1M | Code, math, STEM tasks |

### Temperature Settings

| Value | Creativity | Best For |
|-------|-----------|----------|
| 0.0-0.3 | Low | Code, facts, formal writing |
| 0.4-0.7 | Medium | Balanced output |
| 0.8-1.0 | High | Creative writing, brainstorming |
| 1.0-2.0 | Very High | Highly creative, varied outputs |

### Thinking Budget

| Value | Description |
|-------|-------------|
| 0 | Disabled (default behavior) |
| 512-1024 | Standard reasoning |
| 2048+ | Deep analysis (slower, more tokens) |

## Output Interpretation

### Standard Text Output

- Plain text response ready for use
- Check for truncation if `--max-tokens` was set
- May include markdown formatting

### JSON Output

- Valid JSON object (use `--json` flag)
- Parse in Node.js with `JSON.parse(output)`, or in Python with `import json; data = json.loads(output)`
- Verify structure matches your requirements
- Handle potential parsing errors

### Grounded Response

When `--grounding` is used, the script prints:

1. Main response text
2. "--- Grounding Sources ---" section
3. List of sources with titles and URLs

### Thinking Mode Output

- May include reasoning steps before final answer
- Longer response times due to thinking process
- Better for tasks requiring careful analysis

## Common Issues

### "google-genai not installed"

```bash
npm install @google/genai@latest dotenv@latest
```

### "API key not set"

Set environment variable:

```bash
export GOOGLE_API_KEY="your-key-here"
# or
export GEMINI_API_KEY="your-key-here"
```

### "Model not available"

- Check model name spelling
- Verify API access for selected model
- Try `gemini-3-flash-preview` (most available)

### JSON parse errors

- Ensure prompt explicitly requests JSON structure
- Check output for JSON formatting
- Consider using system instruction: "You always respond with valid JSON"

### Image file not found

- Verify image path is correct
- Use absolute paths if relative paths fail
- Supported formats: PNG, JPEG

### Response truncated

- Increase `--max-tokens` value
- Break task into smaller requests
- Use pro models with higher token limits

## Best Practices

### Performance Optimization

- Use flash models for speed, pro for quality
- Lower temperature (0.0-0.3) for deterministic outputs
- Set appropriate `--max-tokens` to control costs
- Use thinking mode only for complex tasks

### Prompt Engineering

- Be specific and clear in your prompts
- Use system instructions for consistent behavior
- Include examples in prompts for better results
- For JSON: specify exact structure in prompt

### Error Handling

- Wrap script calls in try/catch blocks (try/except when calling from Python)
- Validate JSON output before using it
- Handle network timeouts with retries
- Check API quota limits for batch operations

### Cost Management

- Use flash models when possible (lower cost)
- Limit `--max-tokens` for simple queries
- Cache results for repeated queries
- Use batch API for high-volume tasks

## Related Skills

- **gemini-image**: Generate images from text
- **gemini-tts**: Convert text to speech
- **gemini-embeddings**: Create vector embeddings for semantic search
- **gemini-files**: Upload files for multimodal processing
- **gemini-batch**: Process multiple requests efficiently

## Quick Reference

```bash
# Basic
node scripts/generate.js "Your prompt"
# Persona
node scripts/generate.js "Prompt" --system "You are X"
# Thinking
node scripts/generate.js "Complex task" --thinking
# JSON
node scripts/generate.js "Generate JSON" --json
# Search
node scripts/generate.js "Current event" --grounding
# Multimodal
node scripts/generate.js "Describe this" --image photo.png
```

## Reference

- See `references/models.md` for detailed model information
- Get API key: https://aistudio.google.com/apikey
- Documentation: https://ai.google.dev/gemini-api
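
The output formats and error-handling advice described above (split off the grounding sources, validate JSON before using it, retry failed calls) could be handled by a caller roughly as follows. This is a minimal sketch, not part of the skill: the helper names `splitGroundedOutput`, `safeParseJson`, and `retrySync` are hypothetical, and it assumes each grounding source is printed on its own line after the `--- Grounding Sources ---` marker, which the skill does not specify in detail.

```javascript
// Hypothetical helpers for post-processing the text printed by scripts/generate.js.
// Nothing here calls the Gemini API; the functions only work on strings.

// Marker quoted in the "Grounded Response" section.
const SOURCES_MARKER = "--- Grounding Sources ---";

// Split grounded output into the main answer and the raw source lines.
// Assumption: one source per line after the marker; the per-line format
// is not specified, so lines are returned unparsed.
function splitGroundedOutput(output) {
  const idx = output.indexOf(SOURCES_MARKER);
  if (idx === -1) {
    return { text: output.trim(), sources: [] };
  }
  const text = output.slice(0, idx).trim();
  const sources = output
    .slice(idx + SOURCES_MARKER.length)
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return { text, sources };
}

// Parse --json output without throwing, so the caller can decide what to do
// when the model returns something that is not valid JSON.
function safeParseJson(output) {
  try {
    return { ok: true, data: JSON.parse(output.trim()) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// Call fn up to `attempts` times and rethrow the last error if all fail.
// Assumption: immediate retry is acceptable; no backoff is applied.
function retrySync(fn, attempts = 3) {
  let lastError;
  for (let i = 0; i < attempts; i += 1) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Example wiring (commented out because it needs the script and an API key):
// const { execFileSync } = require("node:child_process");
// const raw = retrySync(() =>
//   execFileSync("node", ["scripts/generate.js", "Current event", "--grounding"], {
//     encoding: "utf8",
//     timeout: 60000, // assumed limit in milliseconds
//   })
// );
// const { text, sources } = splitGroundedOutput(raw);
```

Returning a result object from `safeParseJson` instead of throwing keeps the "handle potential parsing errors" step explicit at the call site; a caller might re-prompt with a stricter system instruction when `ok` is `false`.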