GPT Image Edit

doany-ai/skills
82.2k installs
Summary

This is the image edit endpoint of OpenAI's GPT Image 2, wrapped with the prompting patterns that actually work. Use it when you need to preserve a face or brand mark while swapping out multilingual text, moving headlines around with layout precision, or composing subjects across up to 10 reference images. The documented anti-patterns are worth reading: lead with what stays unchanged, quote in-image text character-for-character instead of paraphrasing, and use numbered refs when you pass multiple images. Routes through the local RunComfy CLI. For batch consistency across up to 20 images, the skill doc steers you to Nano Banana Edit; for photorealistic portraits, to Nano Banana Pro; and for single-shot, source-fidelity-first local edits, to Flux Kontext.

Install to Claude Code

npx -y skills add doany-ai/skills --skill gpt-image-edit --agent claude-code

Installs into .claude/skills of the current project.

Files
SKILL.md

GPT Image Edit — Pro Pack on RunComfy

OpenAI GPT Image 2 — /edit endpoint (ChatGPT Images 2.0 image-to-image) on the RunComfy Model API. Strongest in its class at preserving identity through targeted edits and rewriting embedded text in any script (Latin, kana, CJK, Cyrillic, Arabic).

npx skills add agentspace-so/runcomfy-skills --skill gpt-image-edit -g

When to pick this model (vs siblings)

| You want | Use |
| --- | --- |
| Edit multilingual / embedded text in image | GPT Image Edit |
| Identity preservation through translated headline variants | GPT Image Edit |
| Layout-precise edit (move headline, swap CTA, etc.) | GPT Image Edit |
| Up to 10 reference images | GPT Image Edit |
| Batch up to 20 images consistently | Nano Banana Edit |
| Single-shot precise local edit, source-fidelity-first | Flux Kontext |
| Generate from scratch with GPT Image 2 | sibling gpt-image-2 skill |
| Batch SKU galleries with stable identity | Nano Banana Edit |

Prerequisites

  1. RunComfy CLI — npm i -g @runcomfy/cli
  2. RunComfy account — runcomfy login opens a browser device-code flow.
  3. CI / containers — set RUNCOMFY_TOKEN=<token> instead of runcomfy login.

Endpoints + input schema

openai/gpt-image-2/edit

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | yes | — | Edit instruction. Lead with preservation, end with the change. |
| images | string[] | yes | — | Up to 10 publicly-fetchable HTTPS URLs. First is primary; rest are auxiliary. |
| size | enum | no | auto | auto (preserve input), 1024_1024 (1:1), 1024_1536 (2:3 portrait), 1536_1024 (3:2 landscape). |

size=auto preserves the input ratio — strongly recommended unless the edit explicitly changes framing.
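
The schema above is small enough to check locally before calling the CLI. The following is a minimal Python sketch, not part of the skill: `build_edit_input` is a hypothetical helper, and the limits it enforces are only those stated in the table (required prompt, 1 to 10 HTTPS image URLs, four allowed size values).

```python
import json

# Values taken from the schema table above.
ALLOWED_SIZES = {"auto", "1024_1024", "1024_1536", "1536_1024"}
MAX_IMAGES = 10


def build_edit_input(prompt, images, size="auto"):
    """Return the JSON string for --input, or raise ValueError.

    Hypothetical helper: it mirrors the documented schema only and
    does not contact the API.
    """
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")
    images = list(images)
    if not 1 <= len(images) <= MAX_IMAGES:
        raise ValueError(f"images must contain 1 to {MAX_IMAGES} URLs")
    for url in images:
        if not url.startswith("https://"):
            raise ValueError(f"not an HTTPS URL: {url}")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    payload = {"prompt": prompt, "images": images, "size": size}
    # ensure_ascii=False keeps kana, Cyrillic, Arabic, etc. readable.
    return json.dumps(payload, ensure_ascii=False)
```

Rejecting an unsupported size locally avoids a round trip that the document says ends in a 422.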

How to invoke

Single-ref preservation edit:

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the person'\''s face, pose, and brand mark unchanged. Replace the background with a soft warm-grey studio sweep and a gentle floor shadow.",
    "images": ["https://.../portrait.jpg"]
  }' \
  --output-dir <absolute/path>

Multilingual text rewrite (preserve everything except the headline):

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the photograph, layout, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads \"今日のおすすめ\" in bold Japanese text (kanji and hiragana), same position and font weight as before.",
    "images": ["https://.../poster-en.jpg"]
  }' \
  --output-dir <absolute/path>

Multi-ref composition:

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Compose subject from image 1 into the room from image 2. Match the lighting and color palette of image 2. Keep image 1 subject identity (face, pose, clothing) unchanged.",
    "images": ["https://.../subject.jpg", "https://.../room.jpg"]
  }' \
  --output-dir <absolute/path>
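
When the same calls are made from a script, passing the arguments as a list sidesteps the shell quoting seen above (for example the `'\''` escape). This is a rough Python sketch under stated assumptions: `build_command` and `run_edit` are hypothetical names, the only CLI flags used are `--input` and `--output-dir` as shown in this document, and `run_edit` needs the `runcomfy` CLI installed and authenticated.

```python
import json
import subprocess

MODEL = "openai/gpt-image-2/edit"


def build_command(payload, output_dir):
    """Build the argv list for `runcomfy run` (hypothetical helper)."""
    return [
        "runcomfy", "run", MODEL,
        # json.dumps handles quotes and apostrophes inside the prompt.
        "--input", json.dumps(payload, ensure_ascii=False),
        "--output-dir", output_dir,
    ]


def run_edit(payload, output_dir):
    """Run the CLI without a shell and return its exit code."""
    return subprocess.run(build_command(payload, output_dir)).returncode
```

Calling `run_edit({"prompt": "...", "images": ["https://.../portrait.jpg"]}, "<absolute/path>")` would then correspond to the single-ref example, with the placeholders filled in by the caller.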

Prompting — what actually works

Lead with preservation goals. Always: "Keep [face / pose / clothing / brand / framing] unchanged." Then state the change. The model honors what's stated up front.

Multilingual text — quote the characters, name the script. "the headline reads \"コーヒー\" in bold Japanese kana", "the label says \"АРОМА\" in Cyrillic, white on black", "the right-margin caption reads \"تخفيض\" in Arabic right-to-left". Don't paraphrase — quote.

Directional language for spatial edits. Concrete spatial scopes work: "move the headline from top-right to bottom-center", "remove the leftmost object only", "replace the watermark in the bottom-right corner".

Multi-ref numbering. When passing multiple images, refer to them by number: "subject from image 1, lighting from image 2, color palette from image 3". The model routes cues correctly.

Use size: "auto" to preserve input ratio. Only override when the edit explicitly changes framing (e.g. cropping a 16:9 to 1:1).

Anti-patterns:

  • Long compound edit instructions ("change A and B and C and D") → drift increases per added scope.
  • Missing preservation goals → model subtly rewrites the face / brand / framing.
  • Paraphrasing in-image text instead of quoting it → text comes out different.
  • Asking for size outside the 3 fixed values + auto → 422.
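
The ordering rule (preservation first, then the change) and the quote-the-text rule can be captured in a small prompt builder. This is an illustrative Python sketch only: `join_items`, `build_prompt`, and `headline_change` are hypothetical helpers, and the wording templates are copied from the examples in this document rather than from any official prompt format.

```python
def join_items(items):
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    items = list(items)
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def build_prompt(preserve, change):
    """Put the preservation goals first, then the single change."""
    if not preserve:
        raise ValueError("state at least one thing to keep unchanged")
    return f"Keep the {join_items(preserve)} unchanged. {change.strip()}"


def headline_change(text, script, style="bold"):
    """Quote the new headline verbatim and name its script."""
    return (
        "Replace only the in-image headline. "
        f'The new headline reads "{text}" in {style} {script}, '
        "same position and font weight as before."
    )


prompt = build_prompt(
    ["photograph", "layout", "brand mark"],
    headline_change("コーヒー", "Japanese kana"),
)
```

Keeping one change per call also follows the advice against long compound instructions: a second edit would be a second prompt, not an extra clause.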

Where it shines

| Use case | Why GPT Image Edit |
| --- | --- |
| Multilingual ad localization | One source asset → many language variants of the same headline |
| Brand-safe headline / CTA swaps | Layout precision + preservation language hold the rest stable |
| Multi-ref composition (subject from one, scene from another) | Numbered refs route cues correctly |
| Layout-precise repositioning | Directional language ("top-right to bottom-center") honored |
| Identity preservation across signage edits | Strongest in class for face / brand preservation through targeted edits |

Sample prompts (verified to produce strong results)

Background swap with full preservation (page example):

Turn the background into a bright minimal white-to-soft-gray studio
sweep with gentle floor shadow; add a large headline in-image that
reads "OPEN STUDIO" in a bold clean sans-serif, high contrast, centered;
keep the main person or product, pose, and face identity unchanged

Multilingual variant:

Keep the photograph, layout, lighting, and brand mark exactly as in the
input. Replace only the in-image headline.
The new headline reads "コーヒー" in bold Japanese kana, same position
and font weight as before.

Multi-ref composition:

Compose subject from image 1 into the kitchen from image 2.
Match the warm window light and color palette of image 2.
Keep subject identity (face, pose, clothing) from image 1 unchanged.

Limitations

  • size: 3 fixed values + auto — anything else 422s.
  • images: up to 10 — first is primary, rest are auxiliary cues.
  • Long compound prompts drift — split into multiple passes when needed.
  • For batch consistency across many SKU images, Nano Banana Edit (up to 20) is better.
  • Photorealism on portraits — Nano Banana Pro wins head-to-head.

Exit codes

| code | meaning |
| --- | --- |
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |

Full reference: docs.runcomfy.com/cli/troubleshooting.
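
A wrapper script can act on these codes, retrying only the one the table marks as retryable. The sketch below is a Python illustration under assumptions: `run_with_retry` is a hypothetical helper, `run_once` stands for any callable that runs the CLI and returns its exit code, and the attempt count and backoff delays are arbitrary choices, not values from the RunComfy docs.

```python
import time

# Meanings copied from the exit code table above.
EXIT_MEANINGS = {
    0: "success",
    64: "bad CLI args",
    65: "bad input JSON / schema mismatch",
    69: "upstream 5xx",
    75: "retryable: timeout / 429",
    77: "not signed in or token rejected",
}
RETRYABLE = {75}


def run_with_retry(run_once, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call run_once() until it returns a non-retryable exit code.

    Returns (exit_code, meaning). Delays double after each retryable
    failure: base_delay, 2 * base_delay, and so on.
    """
    code = None
    for attempt in range(max_attempts):
        code = run_once()
        if code not in RETRYABLE:
            break
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return code, EXIT_MEANINGS.get(code, "unknown exit code")
```

Codes 64, 65, and 77 point at the caller's arguments, input, or credentials, so retrying them unchanged would not be expected to help.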

How it works

The skill invokes runcomfy run openai/gpt-image-2/edit with a JSON body matching the schema. The CLI POSTs to https://model-api.runcomfy.net/v1/models/openai/gpt-image-2/edit, polls the request, fetches the result, and downloads any result URL hosted on *.runcomfy.net or *.runcomfy.com into --output-dir. Ctrl-C cancels the remote request before exit.

Security & Privacy

  • Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600 (owner-only read/write). Set RUNCOMFY_TOKEN env var to bypass the file entirely in CI / containers.
  • Input boundary: the user prompt is passed as a JSON string to the CLI via --input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
  • Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
  • Outbound endpoints: only model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated outputs). No telemetry, no callbacks.
  • Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
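
The download whitelist and size cap described above can be illustrated with a short check. This is a Python sketch of the idea, not the CLI's actual implementation: `is_allowed_download` and `within_size_cap` are hypothetical names, and requiring HTTPS for downloads is an added assumption.

```python
from urllib.parse import urlparse

# Host suffixes and cap taken from the bullets above.
ALLOWED_SUFFIXES = (".runcomfy.net", ".runcomfy.com")
MAX_DOWNLOAD_BYTES = 2 * 1024 ** 3  # 2 GiB


def is_allowed_download(url):
    """True if the URL is HTTPS and its host is a subdomain of an allowed domain."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    # The leading dot in each suffix stops look-alikes such as
    # "evil-runcomfy.net" from matching.
    return parts.scheme == "https" and host.endswith(ALLOWED_SUFFIXES)


def within_size_cap(num_bytes):
    """True if a download of this size stays at or under the 2 GiB cap."""
    return num_bytes <= MAX_DOWNLOAD_BYTES
```

Matching on the parsed hostname rather than the raw URL string means a path or query containing "runcomfy.net" does not pass the check.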
Categories
Generative Media
First Seen: Jun 3, 2026