Image Generation AI Pipeline Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Build an automated, brand-aware image generation pipeline that generates, validates, and uploads images to Cloudflare R2 via CLI — supporting blog thumbnails, newsletter banners, social cards, and product mockups.
Architecture: CLI script (generate-image.sh) orchestrates a Node.js pipeline: loads brand style guide (JSON) -> generates prompt via LLM -> calls image generation API (OpenRouter-first, fal.ai fallback) -> validates via vision model -> uploads to R2. All API calls routed through OpenRouter where supported, per CEO policy.
Tech Stack: Node.js (ESM), OpenAI SDK (OpenRouter-compatible), @aws-sdk/client-s3 (R2 upload), sharp (image processing), Commander.js (CLI)
1. Research Results Summary
1.1 Image Generation Service Comparison (February 2026)
| Service | Representative Models | Price/Image (1MP) | API Friendliness | Quality | Speed | Korean Support | OpenRouter Available |
|---|---|---|---|---|---|---|---|
| Fal.ai | Flux 2 Pro, Flux 2 Dev, Flux Schnell, SDXL | $0.003~$0.05 | Excellent (REST) | High | Fast | No | N/A (direct only) |
| Replicate | Flux 2 Pro, SDXL, Stable Diffusion | $0.003~$0.055 | Excellent (REST) | High | Medium | No | N/A (direct only) |
| Together AI | Flux Schnell (free), Flux 1.1 Pro, Imagen 4 | $0.0027~$0.06 | Excellent (REST) | High | Very Fast (315ms schnell) | No | N/A (direct only) |
| OpenAI | GPT Image 1.5, GPT Image 1 Mini, DALL-E 3 | $0.005~$0.19 | Good (official SDK) | Very High | Medium | Yes (prompt) | Yes (via chat API) |
| Black Forest Labs | Flux 2 Pro/Max/Flex/Klein | $0.014~$0.28 | Good (REST) | Very High | Varies | No | N/A (direct) |
| Stability AI | Stable Diffusion 3.5, Stable Image Core | $0.003~$0.08 | Good (REST) | Good | Fast | No | N/A (direct only) |
| OpenRouter | Flux 2 Pro/Flex, Gemini Flash Image, Riverflow | $0.003~$0.07/MP | Excellent (OpenAI-compat) | High | Varies | Model-dependent | Yes (native) |
| Google (via OR) | Imagen 4 Fast/Standard/Ultra, Gemini Flash Image | $0.02~$0.06 | Good (via OpenRouter) | Very High | Fast | Yes (prompt) | Yes |
| Midjourney | Midjourney v6+ | ~$0.04 (subscription) | Poor (no official API) | Best | Medium | Limited | No |
| Nano Banana (Google) | Gemini 2.5 Flash Image, Gemini 3 Pro Image | $0.039~$0.134 | Good (via OpenRouter) | Very High | Medium | Yes | Yes |
1.2 OpenRouter Image Generation Support (Confirmed)
OpenRouter does support image generation via the chat completions endpoint with modalities: ["image"] or modalities: ["image", "text"].
Available Image Models on OpenRouter (Feb 2026):
| Model ID | Provider | Price/MP | Notes |
|---|---|---|---|
| black-forest-labs/flux.2-pro | BFL | ~$0.05 | Professional quality |
| black-forest-labs/flux.2-flex | BFL | ~$0.06 | Multi-reference editing |
| google/gemini-2.5-flash-image | Google | ~$0.039 | Nano Banana, contextual |
| google/gemini-3-pro-image-preview | Google | ~$0.134 | Highest Google quality |
| sourceful/riverflow-v2-fast | Sourceful | TBD | Fast generation |
| sourceful/riverflow-v2-pro | Sourceful | TBD | Pro quality |
API Format:
const response = await openai.chat.completions.create({
model: 'black-forest-labs/flux.2-pro',
modalities: ['image'],
messages: [{ role: 'user', content: prompt }],
});
// Image returned as base64 data URL in response
Key Constraint: OpenRouter does NOT yet support all image models (e.g., no Imagen 4 direct, no SDXL, no Stability). For maximum model coverage, fal.ai is the best direct alternative.
Verdict: OpenRouter first for models it supports (Flux 2, Gemini Image), fal.ai fallback for budget models (Flux Schnell $0.003, SDXL $0.003) and specialized needs.
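The verdict above implies a simple routing rule; a minimal sketch follows (a hypothetical `pickProvider` helper, not one of the plan's tasks — the actual selection logic lives in Task 7's `MODEL_MAP`):

```typescript
// Hypothetical routing sketch: prefer OpenRouter for models it hosts,
// otherwise fall through to fal.ai direct.
type Provider = 'openrouter' | 'fal';

// Model IDs known to be on OpenRouter per the table in Section 1.2.
const OPENROUTER_MODELS = new Set<string>([
  'black-forest-labs/flux.2-pro',
  'black-forest-labs/flux.2-flex',
  'google/gemini-2.5-flash-image',
  'google/gemini-3-pro-image-preview',
]);

function pickProvider(model: string): Provider {
  return OPENROUTER_MODELS.has(model) ? 'openrouter' : 'fal';
}

console.log(pickProvider('black-forest-labs/flux.2-pro')); // "openrouter"
console.log(pickProvider('fal-ai/flux/schnell'));          // "fal"
```

This keeps the policy decision in one place, so adding or removing OpenRouter-hosted models is a one-line change.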
1.3 Use-Case Optimal Model Matrix
| Use Case | Dimensions | Recommended Model (OpenRouter) | Fallback (fal.ai) | Est. Cost/Image |
|---|---|---|---|---|
| Blog Hero/Thumbnail | 1200x628 | flux.2-pro | flux-2-pro | $0.05 |
| Newsletter Banner | 600x200 | gemini-2.5-flash-image | flux-schnell | $0.003~$0.039 |
| Instagram Card News | 1080x1080 | flux.2-pro | flux-2-dev | $0.025~$0.05 |
| Reels Thumbnail | 1080x1920 | flux.2-pro | flux-2-pro | $0.05 |
| Product Mockup/Cover (HQ) | 1536x1024 | gemini-3-pro-image-preview | flux-2-pro | $0.05~$0.134 |
2. Recommended Stack
Primary: OpenRouter (Policy-Compliant)
- Models: Flux 2 Pro (general), Gemini 2.5 Flash Image (budget), Gemini 3 Pro Image (premium)
- API: OpenAI SDK with `baseURL: https://openrouter.ai/api/v1`
- Aligns with: CEO's OpenRouter-only policy (`openrouter-policy.md`)
Fallback: Fal.ai (Budget/Specialized)
- Models: Flux Schnell ($0.003), SDXL ($0.003), Flux 2 Dev ($0.025)
- Use when: OpenRouter doesn't support needed model, or budget-critical bulk generation
- Requires: FAL_KEY env var (CEO approval needed)
Vision QA: OpenRouter (Claude/Gemini)
- Model: `anthropic/claude-sonnet-4-6` or `google/gemini-2.5-flash`
- Cost: ~$0.003/image check (text tokens only for analysis)
Storage: Cloudflare R2 (Already Configured)
- Bucket: `business-builder`
- Public URL: `https://pub-8433d9d9c43b94f189da8f35ea1926ed.r2.dev`
- Keys: Already in `.env` (R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, etc.)
3. Pipeline Design
3.1 Architecture Diagram
[CLI Command]
│
├─ --brand richbukae --type hero --topic "AI Marketing"
│
▼
[1. Load Brand Style Guide] ← brands/{brand}.json
│
▼
[2. Generate Prompt (LLM)] ← OpenRouter: gemini-2.5-flash
│ Input: style guide + topic + image type + dimensions
│ Output: detailed image generation prompt
│
▼
[3. Generate Image] ← OpenRouter: flux.2-pro (or fallback fal.ai)
│ Input: prompt + dimensions
│ Output: base64 PNG
│
▼
[4. Vision QA Check] ← OpenRouter: claude-sonnet-4-6
│ Input: generated image + brand style guide
│ Checks: brand color match, composition, text readability, quality
│ Output: PASS/FAIL + score + reasons
│
├─ PASS → [5. Process & Upload]
│ │ sharp: resize, optimize
│ │ R2: upload with metadata
│ │ Output: public URL
│ ▼
│ [Done: Return URL + metadata]
│
└─ FAIL → [Retry: Regenerate with feedback, max 3 attempts]
│ Append failure reasons to prompt
▼
[Back to Step 3]
3.2 CLI Interface Design
# Basic usage
./scripts/generate-image.sh --brand richbukae --type hero --topic "AI Marketing Automation"
# With model override
./scripts/generate-image.sh --brand apppro --type instagram --topic "SaaS Development" --model flux.2-pro
# Budget mode (fal.ai schnell)
./scripts/generate-image.sh --brand richbukae --type newsletter --topic "Weekly Tips" --budget
# Batch mode (from CSV/JSON)
./scripts/generate-image.sh --batch images-to-generate.json
# Dry run (prompt only, no generation)
./scripts/generate-image.sh --brand apppro --type hero --topic "Cloud Native" --dry-run
Options:
| Flag | Description | Default |
|---|---|---|
| --brand | Brand name (loads brands/{name}.json) | Required |
| --type | Image type: hero, newsletter, instagram, reels, mockup | Required |
| --topic | Subject/topic for the image | Required |
| --model | Override model: flux.2-pro, gemini-flash, schnell | Auto (by type) |
| --budget | Use cheapest model (fal.ai schnell) | false |
| --no-qa | Skip vision QA check | false |
| --dry-run | Generate prompt only, no API call | false |
| --batch | Path to batch JSON file | - |
| --output | Local save path (also uploads to R2) | - |
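The plan does not pin down the `--batch` file format, so treat the following as an assumption: a JSON array of per-image jobs mirroring the CLI flags, e.g.

```json
[
  { "brand": "richbukae", "type": "hero", "topic": "AI Marketing Automation" },
  { "brand": "apppro", "type": "instagram", "topic": "SaaS Development", "model": "flux.2-pro" }
]
```

Each entry would be processed through the same load → prompt → generate → QA → upload flow as a single CLI invocation.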
3.3 Vision QA Check Criteria
The vision model evaluates generated images against:
{
"checks": [
{
"name": "brand_colors",
"description": "Primary/secondary brand colors visible in image",
"weight": 0.25
},
{
"name": "composition",
"description": "Good visual composition, proper framing, no artifacts",
"weight": 0.25
},
{
"name": "text_readability",
"description": "Any text in image is readable (if applicable)",
"weight": 0.15
},
{
"name": "brand_mood",
"description": "Image mood matches brand guide (luxury, tech, etc.)",
"weight": 0.20
},
{
"name": "technical_quality",
"description": "No distortions, artifacts, or low-resolution areas",
"weight": 0.15
}
],
"pass_threshold": 0.7
}
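Assuming the weights above, the weighted score and pass threshold combine as in this sketch (illustrative only; in Task 5 the vision model itself reports the aggregate score):

```typescript
// Weighted-average QA scoring sketch, using the weights from the criteria JSON.
interface CheckResult { score: number; weight: number; }

function weightedScore(checks: CheckResult[]): number {
  const total = checks.reduce((sum, c) => sum + c.score * c.weight, 0);
  const weightSum = checks.reduce((sum, c) => sum + c.weight, 0);
  return total / weightSum; // weights already sum to 1.0, but normalize defensively
}

const result = weightedScore([
  { score: 0.9,  weight: 0.25 }, // brand_colors
  { score: 0.8,  weight: 0.25 }, // composition
  { score: 0.9,  weight: 0.15 }, // text_readability
  { score: 0.8,  weight: 0.20 }, // brand_mood
  { score: 0.85, weight: 0.15 }, // technical_quality
]);
console.log(result.toFixed(4));              // "0.8475"
console.log(result >= 0.7 ? 'PASS' : 'FAIL'); // "PASS"
```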
4. Brand Style Guide Template
4.1 JSON Schema
File location: projects/image-pipeline/brands/{brand-name}.json
{
"brand": "richbukae",
"display_name": "RichBukae",
"tagline": "AI-Powered Wealth Building",
"visual_identity": {
"primary_color": "#1a1a2e",
"secondary_color": "#c9a84c",
"accent_color": "#e8d5a3",
"background_preference": "dark",
"style_keywords": ["luxury", "minimal", "elegant", "dark-navy", "gold-accent"],
"mood": "premium, sophisticated, trustworthy"
},
"typography": {
"heading_style": "bold, sans-serif, modern",
"avoid": ["comic", "handwritten", "playful"]
},
"imagery": {
"preferred_subjects": ["abstract tech", "wealth symbols", "modern office", "data visualization"],
"avoid_subjects": ["cartoon", "clipart", "stock-photo-generic"],
"human_presence": "minimal, silhouettes preferred"
},
"dimensions": {
"hero": { "width": 1200, "height": 628 },
"newsletter": { "width": 600, "height": 200 },
"instagram": { "width": 1080, "height": 1080 },
"reels": { "width": 1080, "height": 1920 },
"mockup": { "width": 1536, "height": 1024 }
}
}
4.2 Example: apppro.kr Brand
{
"brand": "apppro",
"display_name": "AppPro",
"tagline": "AI Solutions for Business",
"visual_identity": {
"primary_color": "#2563eb",
"secondary_color": "#1e40af",
"accent_color": "#60a5fa",
"background_preference": "light",
"style_keywords": ["tech", "modern", "blue", "clean", "professional"],
"mood": "innovative, trustworthy, cutting-edge"
},
"typography": {
"heading_style": "bold, geometric sans-serif",
"avoid": ["decorative", "serif", "ornamental"]
},
"imagery": {
"preferred_subjects": ["SaaS dashboards", "code", "cloud architecture", "AI neural networks"],
"avoid_subjects": ["nature", "food", "lifestyle"],
"human_presence": "optional, diverse professionals"
},
"dimensions": {
"hero": { "width": 1200, "height": 628 },
"newsletter": { "width": 600, "height": 200 },
"instagram": { "width": 1080, "height": 1080 },
"reels": { "width": 1080, "height": 1920 },
"mockup": { "width": 1536, "height": 1024 }
}
}
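Since brand guides are hand-edited JSON, a lightweight structural check can catch typos before any paid API call. A sketch with plain type guards (a hypothetical `assertBrandConfig` helper, no extra dependencies — `loadBrand` in Task 2 could call it after parsing):

```typescript
// Minimal structural check for a parsed brand config (illustrative only).
function assertBrandConfig(obj: any, name: string): void {
  const missing: string[] = [];
  if (typeof obj?.brand !== 'string') missing.push('brand');
  if (typeof obj?.visual_identity?.primary_color !== 'string') missing.push('visual_identity.primary_color');
  if (!Array.isArray(obj?.visual_identity?.style_keywords)) missing.push('visual_identity.style_keywords');
  if (typeof obj?.dimensions !== 'object' || obj?.dimensions === null) missing.push('dimensions');
  if (missing.length > 0) {
    throw new Error(`Invalid brand config "${name}": missing/invalid ${missing.join(', ')}`);
  }
}

assertBrandConfig({
  brand: 'apppro',
  visual_identity: { primary_color: '#2563eb', style_keywords: ['tech'] },
  dimensions: { hero: { width: 1200, height: 628 } },
}, 'apppro'); // passes silently
```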
5. Skill Packaging Plan
File: .claude/skills/image-generation.md
The skill file will provide:
- Usage instructions for PL agents to invoke the pipeline
- Brand guide management (add/edit/list brands)
- Model selection guide (when to use which model)
- Cost tracking (log each generation with model + cost)
- R2 upload conventions (path: `images/{brand}/{type}/{date}-{hash}.png`)
R2 Upload Path Convention
images/
richbukae/
hero/2026-02-25-a1b2c3.png
instagram/2026-02-25-d4e5f6.png
apppro/
hero/2026-02-25-g7h8i9.png
newsletter/2026-02-25-j0k1l2.png
6. Cost Simulation
Scenario A: Low-Cost Stack (Fal.ai Flux Schnell + Gemini Flash QA)
- Image generation: $0.003/image
- Prompt generation: $0.001/image (Gemini Flash)
- Vision QA: $0.003/image (Gemini Flash vision)
- Total: ~$0.007/image
Scenario B: Mid-Range Stack (OpenRouter Flux 2 Pro + Claude QA)
- Image generation: $0.05/image
- Prompt generation: $0.002/image (Gemini Flash)
- Vision QA: $0.005/image (Claude Sonnet)
- Total: ~$0.057/image
Scenario C: Premium Stack (OpenRouter Gemini 3 Pro Image + Claude QA)
- Image generation: $0.134/image
- Prompt generation: $0.002/image
- Vision QA: $0.005/image
- Total: ~$0.141/image
Monthly Cost Projections
| Volume | Low-Cost (A) | Mid-Range (B) | Premium (C) |
|---|---|---|---|
| 100 images/mo | $0.70 | $5.70 | $14.10 |
| 300 images/mo | $2.10 | $17.10 | $42.30 |
| 1,000 images/mo | $7.00 | $57.00 | $141.00 |
Note: The table shows base generation costs. Assuming a ~20% QA retry rate (failed images regenerated), budget for roughly base x 1.2.
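Folding the retry multiplier into a small helper makes these projections reproducible (a sketch; per-image prices come from the scenario tables above):

```typescript
// Monthly cost including QA-failure retries.
// perImage: base cost per image; retryRate: fraction of images regenerated.
function monthlyCost(perImage: number, volume: number, retryRate = 0.2): number {
  return perImage * volume * (1 + retryRate);
}

// Mid-Range stack (Scenario B) at 300 images/month, with retries:
console.log(monthlyCost(0.057, 300).toFixed(2)); // "20.52"
// Low-Cost stack (Scenario A) at 100 images/month:
console.log(monthlyCost(0.007, 100).toFixed(2)); // "0.84"
```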
Recommended Monthly Budget
- MVP phase (100-300 images): $5~$20/mo with Mid-Range stack
- Scale phase (300-1,000 images): $20~$70/mo, mix Low-Cost for bulk + Mid-Range for hero images
7. Implementation Tasks (TDD, Bite-Sized)
Prerequisites
- `OPENROUTER_API_KEY` must be set in `.env` (CEO approval — currently `sk-or-...` placeholder)
- Optionally `FAL_KEY` for budget fallback (requires CEO approval)
- R2 credentials already available in `.env`
Task 1: Project Scaffolding
Files:
- Create: `projects/image-pipeline/package.json`
- Create: `projects/image-pipeline/tsconfig.json`
- Create: `projects/image-pipeline/.env` (symlink or copy from root)
Step 1: Create project directory and package.json
mkdir -p projects/image-pipeline
cd projects/image-pipeline
{
"name": "image-pipeline",
"version": "0.1.0",
"type": "module",
"scripts": {
"generate": "tsx src/cli.ts",
"test": "vitest run",
"test:watch": "vitest"
},
"dependencies": {
"openai": "^4.80.0",
"@aws-sdk/client-s3": "^3.750.0",
"sharp": "^0.33.0",
"commander": "^13.0.0"
},
"devDependencies": {
"tsx": "^4.19.0",
"typescript": "^5.7.0",
"vitest": "^3.0.0",
"@types/node": "^22.0.0"
}
}
Step 2: Create tsconfig.json
{
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"moduleResolution": "bundler",
"outDir": "dist",
"rootDir": "src",
"strict": true,
"esModuleInterop": true,
"resolveJsonModule": true,
"declaration": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist"]
}
Step 3: Install dependencies
npm install
Step 4: Commit
git add projects/image-pipeline/
git commit -m "feat(image-pipeline): scaffold project with deps"
Task 2: Brand Style Guide Loader
Files:
- Create: `projects/image-pipeline/brands/richbukae.json`
- Create: `projects/image-pipeline/brands/apppro.json`
- Create: `projects/image-pipeline/src/brand-loader.ts`
- Test: `projects/image-pipeline/src/brand-loader.test.ts`
Step 1: Write brand JSON files
Create brands/richbukae.json and brands/apppro.json using the templates from Section 4 above.
Step 2: Write failing test for brand loader
// src/brand-loader.test.ts
import { describe, it, expect } from 'vitest';
import { loadBrand, type BrandConfig } from './brand-loader.js';
describe('loadBrand', () => {
it('loads richbukae brand config', () => {
const brand = loadBrand('richbukae');
expect(brand.brand).toBe('richbukae');
expect(brand.visual_identity.primary_color).toBe('#1a1a2e');
expect(brand.dimensions.hero.width).toBe(1200);
});
it('loads apppro brand config', () => {
const brand = loadBrand('apppro');
expect(brand.brand).toBe('apppro');
expect(brand.visual_identity.primary_color).toBe('#2563eb');
});
it('throws on unknown brand', () => {
expect(() => loadBrand('nonexistent')).toThrow('Brand not found');
});
});
Step 3: Run test to verify it fails
npx vitest run src/brand-loader.test.ts
Expected: FAIL — loadBrand not defined
Step 4: Implement brand loader
// src/brand-loader.ts
import { readFileSync } from 'node:fs';
import { join, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
const __dirname = dirname(fileURLToPath(import.meta.url));
const BRANDS_DIR = join(__dirname, '..', 'brands');
export interface BrandDimensions {
width: number;
height: number;
}
export interface BrandConfig {
brand: string;
display_name: string;
tagline: string;
visual_identity: {
primary_color: string;
secondary_color: string;
accent_color: string;
background_preference: string;
style_keywords: string[];
mood: string;
};
typography: {
heading_style: string;
avoid: string[];
};
imagery: {
preferred_subjects: string[];
avoid_subjects: string[];
human_presence: string;
};
dimensions: Record<string, BrandDimensions>;
}
export function loadBrand(name: string): BrandConfig {
const path = join(BRANDS_DIR, `${name}.json`);
try {
const raw = readFileSync(path, 'utf-8');
return JSON.parse(raw) as BrandConfig;
} catch {
throw new Error(`Brand not found: ${name}`);
}
}
Step 5: Run test to verify pass
npx vitest run src/brand-loader.test.ts
Expected: PASS
Step 6: Commit
git add projects/image-pipeline/brands/ projects/image-pipeline/src/brand-loader.*
git commit -m "feat(image-pipeline): add brand style guide loader with tests"
Task 3: Prompt Generator (LLM-based)
Files:
- Create: `projects/image-pipeline/src/prompt-generator.ts`
- Test: `projects/image-pipeline/src/prompt-generator.test.ts`
Step 1: Write failing test
// src/prompt-generator.test.ts
import { describe, it, expect, vi } from 'vitest';
import { generateImagePrompt } from './prompt-generator.js';
import { loadBrand } from './brand-loader.js';
// Mock OpenAI
vi.mock('openai', () => ({
default: class {
chat = {
completions: {
create: vi.fn().mockResolvedValue({
choices: [{ message: { content: 'A luxurious dark navy background with golden accent lines...' } }]
})
}
};
}
}));
describe('generateImagePrompt', () => {
it('generates a prompt incorporating brand style', async () => {
const brand = loadBrand('richbukae');
const result = await generateImagePrompt({
brand,
imageType: 'hero',
topic: 'AI Marketing',
});
expect(result).toBeTruthy();
expect(typeof result).toBe('string');
expect(result.length).toBeGreaterThan(50);
});
});
Step 2: Run test to verify failure
npx vitest run src/prompt-generator.test.ts
Expected: FAIL — module not found
Step 3: Implement prompt generator
// src/prompt-generator.ts
import OpenAI from 'openai';
import type { BrandConfig } from './brand-loader.js';
const client = new OpenAI({
baseURL: process.env.OPENROUTER_BASE_URL || 'https://openrouter.ai/api/v1',
apiKey: process.env.OPENROUTER_API_KEY,
});
interface PromptOptions {
brand: BrandConfig;
imageType: string;
topic: string;
}
export async function generateImagePrompt(opts: PromptOptions): Promise<string> {
const { brand, imageType, topic } = opts;
const dims = brand.dimensions[imageType];
const systemPrompt = `You are an expert image prompt engineer. Generate a detailed,
high-quality text-to-image prompt based on the brand guidelines and topic provided.
The prompt should be specific about colors, composition, mood, and style.
Output ONLY the image generation prompt, nothing else.`;
const userPrompt = `Brand: ${brand.display_name}
Style: ${brand.visual_identity.style_keywords.join(', ')}
Mood: ${brand.visual_identity.mood}
Colors: Primary ${brand.visual_identity.primary_color}, Secondary ${brand.visual_identity.secondary_color}, Accent ${brand.visual_identity.accent_color}
Background: ${brand.visual_identity.background_preference}
Preferred subjects: ${brand.imagery.preferred_subjects.join(', ')}
Avoid: ${brand.imagery.avoid_subjects.join(', ')}
Image type: ${imageType} (${dims?.width || 1024}x${dims?.height || 1024})
Topic: ${topic}
Generate a detailed image prompt.`;
const response = await client.chat.completions.create({
model: 'google/gemini-2.5-flash', // matches the prompt-generation model in the architecture diagram
messages: [
{ role: 'system', content: systemPrompt },
{ role: 'user', content: userPrompt },
],
max_tokens: 500,
});
const content = response.choices[0]?.message?.content;
if (!content) throw new Error('No prompt generated');
return content.trim();
}
Step 4: Run test
npx vitest run src/prompt-generator.test.ts
Expected: PASS
Step 5: Commit
git add projects/image-pipeline/src/prompt-generator.*
git commit -m "feat(image-pipeline): add LLM-based prompt generator"
Task 4: Image Generator (OpenRouter + Fal.ai fallback)
Files:
- Create: `projects/image-pipeline/src/image-generator.ts`
- Test: `projects/image-pipeline/src/image-generator.test.ts`
Step 1: Write failing test
// src/image-generator.test.ts
import { describe, it, expect, vi } from 'vitest';
import { generateImage, type GenerateImageOptions } from './image-generator.js';
// Mock fetch for fal.ai and OpenAI for OpenRouter
vi.mock('openai', () => ({
default: class {
chat = {
completions: {
create: vi.fn().mockResolvedValue({
choices: [{
message: {
content: [{ type: 'image_url', image_url: { url: 'data:image/png;base64,iVBOR...' } }]
}
}]
})
}
};
}
}));
describe('generateImage', () => {
it('returns base64 image data via openrouter', async () => {
const result = await generateImage({
prompt: 'A dark navy background with golden lines',
width: 1024,
height: 1024,
provider: 'openrouter',
model: 'black-forest-labs/flux.2-pro',
});
expect(result.base64).toBeTruthy();
expect(result.provider).toBe('openrouter');
});
});
Step 2: Run test to verify failure
npx vitest run src/image-generator.test.ts
Expected: FAIL
Step 3: Implement image generator
// src/image-generator.ts
import OpenAI from 'openai';
export interface GenerateImageOptions {
prompt: string;
width: number;
height: number;
provider: 'openrouter' | 'fal';
model: string;
}
export interface GenerateImageResult {
base64: string;
provider: string;
model: string;
}
export async function generateImage(opts: GenerateImageOptions): Promise<GenerateImageResult> {
if (opts.provider === 'openrouter') {
return generateViaOpenRouter(opts);
}
return generateViaFal(opts);
}
async function generateViaOpenRouter(opts: GenerateImageOptions): Promise<GenerateImageResult> {
const client = new OpenAI({
baseURL: process.env.OPENROUTER_BASE_URL || 'https://openrouter.ai/api/v1',
apiKey: process.env.OPENROUTER_API_KEY,
});
const response = await client.chat.completions.create({
model: opts.model,
// @ts-expect-error OpenRouter extension
modalities: ['image'],
messages: [{ role: 'user', content: opts.prompt }],
});
const content = response.choices[0]?.message?.content;
let base64 = '';
if (typeof content === 'string') {
// Some models return base64 directly
base64 = content;
} else if (Array.isArray(content)) {
// Multi-part response with image_url
const imgPart = content.find((p: any) => p.type === 'image_url');
if (imgPart && 'image_url' in imgPart) {
base64 = (imgPart as any).image_url.url.replace(/^data:image\/\w+;base64,/, '');
}
}
if (!base64) throw new Error('No image in OpenRouter response');
return { base64, provider: 'openrouter', model: opts.model };
}
async function generateViaFal(opts: GenerateImageOptions): Promise<GenerateImageResult> {
const FAL_KEY = process.env.FAL_KEY;
if (!FAL_KEY) throw new Error('FAL_KEY not set');
const modelMap: Record<string, string> = {
'schnell': 'fal-ai/flux/schnell',
'flux-2-dev': 'fal-ai/flux-2/dev',
'flux-2-pro': 'fal-ai/flux-2-pro',
'sdxl': 'fal-ai/fast-sdxl',
};
const falModel = modelMap[opts.model] || opts.model;
const response = await fetch(`https://fal.run/${falModel}`, {
method: 'POST',
headers: {
'Authorization': `Key ${FAL_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
prompt: opts.prompt,
image_size: { width: opts.width, height: opts.height },
num_images: 1,
}),
});
if (!response.ok) throw new Error(`Fal.ai error: ${response.status}`);
const data = await response.json() as any;
const imageUrl = data.images?.[0]?.url;
if (!imageUrl) throw new Error('No image in fal.ai response');
// Download and convert to base64
const imgResponse = await fetch(imageUrl);
const buffer = Buffer.from(await imgResponse.arrayBuffer());
const base64 = buffer.toString('base64');
return { base64, provider: 'fal', model: opts.model };
}
Step 4: Run test
npx vitest run src/image-generator.test.ts
Expected: PASS
Step 5: Commit
git add projects/image-pipeline/src/image-generator.*
git commit -m "feat(image-pipeline): add image generator with OpenRouter + fal.ai"
Task 5: Vision QA Checker
Files:
- Create: `projects/image-pipeline/src/qa-checker.ts`
- Test: `projects/image-pipeline/src/qa-checker.test.ts`
Step 1: Write failing test
// src/qa-checker.test.ts
import { describe, it, expect, vi } from 'vitest';
import { checkImageQuality, type QAResult } from './qa-checker.js';
vi.mock('openai', () => ({
default: class {
chat = {
completions: {
create: vi.fn().mockResolvedValue({
choices: [{
message: {
content: JSON.stringify({
passed: true,
score: 0.85,
checks: {
brand_colors: { score: 0.9, note: 'Gold and navy visible' },
composition: { score: 0.8, note: 'Good layout' },
text_readability: { score: 0.9, note: 'N/A' },
brand_mood: { score: 0.8, note: 'Luxury feel achieved' },
technical_quality: { score: 0.85, note: 'Clean render' },
},
})
}
}]
})
}
};
}
}));
describe('checkImageQuality', () => {
it('returns QA result with score', async () => {
const result = await checkImageQuality({
imageBase64: 'iVBORw0KGgoAAAANS...',
brandName: 'richbukae',
imageType: 'hero',
});
expect(result.passed).toBe(true);
expect(result.score).toBeGreaterThan(0.7);
expect(result.checks).toBeDefined();
});
});
Step 2: Run test, verify failure
npx vitest run src/qa-checker.test.ts
Step 3: Implement QA checker
// src/qa-checker.ts
import OpenAI from 'openai';
import { loadBrand } from './brand-loader.js';
const client = new OpenAI({
baseURL: process.env.OPENROUTER_BASE_URL || 'https://openrouter.ai/api/v1',
apiKey: process.env.OPENROUTER_API_KEY,
});
export interface QACheckDetail {
score: number;
note: string;
}
export interface QAResult {
passed: boolean;
score: number;
checks: Record<string, QACheckDetail>;
}
interface QAOptions {
imageBase64: string;
brandName: string;
imageType: string;
}
export async function checkImageQuality(opts: QAOptions): Promise<QAResult> {
const brand = loadBrand(opts.brandName);
const prompt = `You are an image quality auditor. Evaluate this AI-generated image against the brand guidelines.
Brand: ${brand.display_name}
Style: ${brand.visual_identity.style_keywords.join(', ')}
Primary Color: ${brand.visual_identity.primary_color}
Secondary Color: ${brand.visual_identity.secondary_color}
Mood: ${brand.visual_identity.mood}
Image Type: ${opts.imageType}
Score each criterion from 0.0 to 1.0:
1. brand_colors — Are the brand's primary/secondary colors visible?
2. composition — Good visual layout, no artifacts?
3. text_readability — If text present, is it readable?
4. brand_mood — Does the mood match the brand?
5. technical_quality — No distortions or low-resolution areas?
Respond in JSON only:
{
"passed": true/false (true if weighted average >= 0.7),
"score": 0.0-1.0,
"checks": {
"brand_colors": {"score": 0.0-1.0, "note": "..."},
"composition": {"score": 0.0-1.0, "note": "..."},
"text_readability": {"score": 0.0-1.0, "note": "..."},
"brand_mood": {"score": 0.0-1.0, "note": "..."},
"technical_quality": {"score": 0.0-1.0, "note": "..."}
}
}`;
const response = await client.chat.completions.create({
model: 'anthropic/claude-sonnet-4-6',
messages: [{
role: 'user',
content: [
{ type: 'text', text: prompt },
{ type: 'image_url', image_url: { url: `data:image/png;base64,${opts.imageBase64}` } },
],
}],
max_tokens: 500,
});
const content = response.choices[0]?.message?.content;
if (!content || typeof content !== 'string') throw new Error('No QA response');
// Vision models sometimes wrap JSON in markdown fences; strip before parsing
const jsonText = content.replace(/^```(?:json)?\s*|\s*```$/g, '').trim();
return JSON.parse(jsonText) as QAResult;
}
Step 4: Run test
npx vitest run src/qa-checker.test.ts
Expected: PASS
Step 5: Commit
git add projects/image-pipeline/src/qa-checker.*
git commit -m "feat(image-pipeline): add vision QA checker with brand validation"
Task 6: R2 Uploader
Files:
- Create: `projects/image-pipeline/src/r2-uploader.ts`
- Test: `projects/image-pipeline/src/r2-uploader.test.ts`
Step 1: Write failing test
// src/r2-uploader.test.ts
import { describe, it, expect, vi } from 'vitest';
import { buildR2Key, type UploadResult } from './r2-uploader.js';
describe('buildR2Key', () => {
it('generates correct R2 path', () => {
const key = buildR2Key('richbukae', 'hero', 'abc123');
expect(key).toMatch(/^images\/richbukae\/hero\/\d{4}-\d{2}-\d{2}-abc123\.png$/);
});
});
Step 2: Run test, verify failure
npx vitest run src/r2-uploader.test.ts
Step 3: Implement R2 uploader
// src/r2-uploader.ts
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { createHash } from 'node:crypto';
const s3 = new S3Client({
region: 'auto',
endpoint: process.env.CLOUDFLARE_R2_ENDPOINT,
credentials: {
accessKeyId: process.env.R2_ACCESS_KEY_ID || '',
secretAccessKey: process.env.R2_SECRET_ACCESS_KEY || '',
},
});
export interface UploadResult {
key: string;
publicUrl: string;
size: number;
}
export function buildR2Key(brand: string, imageType: string, hash: string): string {
const date = new Date().toISOString().slice(0, 10);
return `images/${brand}/${imageType}/${date}-${hash}.png`;
}
export async function uploadToR2(base64: string, brand: string, imageType: string): Promise<UploadResult> {
const buffer = Buffer.from(base64, 'base64');
const hash = createHash('md5').update(buffer).digest('hex').slice(0, 8);
const key = buildR2Key(brand, imageType, hash);
await s3.send(new PutObjectCommand({
Bucket: process.env.R2_BUCKET_NAME || 'business-builder',
Key: key,
Body: buffer,
ContentType: 'image/png',
CacheControl: 'public, max-age=31536000',
}));
const publicUrl = `${process.env.R2_PUBLIC_URL}/${key}`;
return { key, publicUrl, size: buffer.length };
}
Step 4: Run test
npx vitest run src/r2-uploader.test.ts
Expected: PASS (unit test for buildR2Key only; upload tested in integration)
Step 5: Commit
git add projects/image-pipeline/src/r2-uploader.*
git commit -m "feat(image-pipeline): add R2 uploader with path conventions"
Task 7: CLI Entry Point
Files:
- Create: `projects/image-pipeline/src/cli.ts`
- Create: `scripts/generate-image.sh` (thin wrapper)
Step 1: Implement CLI
// src/cli.ts
import { Command } from 'commander';
import { loadBrand } from './brand-loader.js';
import { generateImagePrompt } from './prompt-generator.js';
import { generateImage } from './image-generator.js';
import { checkImageQuality } from './qa-checker.js';
import { uploadToR2 } from './r2-uploader.js';
import { writeFileSync } from 'node:fs';
const program = new Command();
const MODEL_MAP: Record<string, { provider: 'openrouter' | 'fal'; model: string }> = {
'flux.2-pro': { provider: 'openrouter', model: 'black-forest-labs/flux.2-pro' },
'gemini-flash': { provider: 'openrouter', model: 'google/gemini-2.5-flash-image' },
'gemini-pro': { provider: 'openrouter', model: 'google/gemini-3-pro-image-preview' },
'schnell': { provider: 'fal', model: 'schnell' },
'sdxl': { provider: 'fal', model: 'sdxl' },
};
const TYPE_DEFAULTS: Record<string, string> = {
hero: 'flux.2-pro',
newsletter: 'gemini-flash',
instagram: 'flux.2-pro',
reels: 'flux.2-pro',
mockup: 'gemini-pro',
};
program
.name('generate-image')
.description('Brand-aware AI image generation pipeline')
.requiredOption('--brand <name>', 'Brand name (loads brands/{name}.json)')
.requiredOption('--type <type>', 'Image type: hero, newsletter, instagram, reels, mockup')
.requiredOption('--topic <topic>', 'Image subject/topic')
.option('--model <model>', 'Override model: flux.2-pro, gemini-flash, schnell, sdxl')
.option('--budget', 'Use cheapest model (fal.ai schnell)', false)
.option('--no-qa', 'Skip vision QA check')
.option('--dry-run', 'Generate prompt only', false)
.option('--output <path>', 'Save image locally')
.action(async (opts) => {
try {
// 1. Load brand
console.log(`[1/5] Loading brand: ${opts.brand}`);
const brand = loadBrand(opts.brand);
const dims = brand.dimensions[opts.type] || { width: 1024, height: 1024 };
// 2. Generate prompt
console.log(`[2/5] Generating prompt for: ${opts.topic}`);
const prompt = await generateImagePrompt({ brand, imageType: opts.type, topic: opts.topic });
console.log(` Prompt: ${prompt.slice(0, 120)}...`);
if (opts.dryRun) {
console.log('\n--- DRY RUN: Full Prompt ---');
console.log(prompt);
return;
}
// 3. Select model
const modelKey = opts.budget ? 'schnell' : (opts.model || TYPE_DEFAULTS[opts.type] || 'flux.2-pro');
const modelConfig = MODEL_MAP[modelKey];
if (!modelConfig) throw new Error(`Unknown model: ${modelKey}`);
// 4. Generate image (with retry; QA feedback is appended to the prompt on failure,
// per the pipeline design in Section 3.1)
const MAX_RETRIES = 3;
let imageResult;
let qaResult;
let currentPrompt = prompt;
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
console.log(`[3/5] Generating image (attempt ${attempt}/${MAX_RETRIES}) via ${modelConfig.provider}:${modelConfig.model}`);
imageResult = await generateImage({
prompt: currentPrompt,
width: dims.width,
height: dims.height,
...modelConfig,
});
if (!opts.qa) {
console.log(`[4/5] QA skipped`);
break;
}
// 5. QA Check
console.log(`[4/5] Running vision QA check...`);
qaResult = await checkImageQuality({
imageBase64: imageResult.base64,
brandName: opts.brand,
imageType: opts.type,
});
console.log(` QA Score: ${qaResult.score} (${qaResult.passed ? 'PASS' : 'FAIL'})`);
if (qaResult.passed) break;
if (attempt < MAX_RETRIES) {
// Feed failure notes back into the next attempt
const feedback = Object.entries(qaResult.checks)
.map(([name, c]) => `${name}: ${c.note}`)
.join('; ');
currentPrompt = `${prompt}\n\nAvoid these issues from the previous attempt: ${feedback}`;
console.log(` Retrying with QA feedback...`);
}
}
if (!imageResult) throw new Error('Image generation failed');
// 6. Upload to R2
console.log(`[5/5] Uploading to R2...`);
const upload = await uploadToR2(imageResult.base64, opts.brand, opts.type);
console.log(` URL: ${upload.publicUrl}`);
console.log(` Size: ${(upload.size / 1024).toFixed(1)} KB`);
// Optional local save
if (opts.output) {
writeFileSync(opts.output, Buffer.from(imageResult.base64, 'base64'));
console.log(` Saved locally: ${opts.output}`);
}
console.log('\nDone!');
} catch (err) {
console.error('Error:', (err as Error).message);
process.exit(1);
}
});
program.parse();
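The model-selection precedence in step 3 (`--budget` overrides everything, then an explicit `--model`, then a per-type default, then `flux.2-pro`) is easy to get wrong, so it is worth testing in isolation. A standalone sketch of that logic; the `TYPE_DEFAULTS` entries here are illustrative placeholders, not the real map from the pipeline source:

```typescript
// Standalone sketch of the CLI's model-selection precedence (step 3).
// TYPE_DEFAULTS values are illustrative; the real map lives in the pipeline source.
const TYPE_DEFAULTS: Record<string, string> = {
  hero: 'flux.2-pro',
  instagram: 'gemini-flash',
};

function selectModel(opts: { budget?: boolean; model?: string; type: string }): string {
  if (opts.budget) return 'schnell';                            // --budget always wins
  return opts.model || TYPE_DEFAULTS[opts.type] || 'flux.2-pro'; // explicit > per-type > global default
}

console.log(selectModel({ budget: true, type: 'hero' }));  // schnell
console.log(selectModel({ model: 'sdxl', type: 'hero' })); // sdxl
console.log(selectModel({ type: 'instagram' }));           // gemini-flash
console.log(selectModel({ type: 'reels' }));               // flux.2-pro (no per-type default)
```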
**Step 2: Create shell wrapper**

```bash
#!/usr/bin/env bash
# scripts/generate-image.sh — Thin wrapper for image pipeline CLI
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
PROJECT_DIR="$SCRIPT_DIR/../projects/image-pipeline"

# Load env
if [ -f "$SCRIPT_DIR/../.env" ]; then
  set -a; source "$SCRIPT_DIR/../.env"; set +a
fi

cd "$PROJECT_DIR"
npx tsx src/cli.ts "$@"
```
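The wrapper expects a `.env` file at the repo root. Based on the keys named in the blockers table (section 8), it would contain at minimum the following; the values shown are placeholders, and any additional R2 variables come from the existing, already-configured R2 setup:

```shell
# .env (repo root): placeholder values, never commit real keys
OPENROUTER_API_KEY=sk-or-...   # required; CEO must issue (see Blockers, section 8)
FAL_KEY=...                    # optional; only for the fal.ai budget fallback
```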
**Step 3: Make executable and test help**

```bash
chmod +x scripts/generate-image.sh
./scripts/generate-image.sh --help
```

Expected: Help text with all options displayed
**Step 4: Commit**

```bash
git add projects/image-pipeline/src/cli.ts scripts/generate-image.sh
git commit -m "feat(image-pipeline): add CLI entry point with shell wrapper"
```
**Task 8: Integration Test (End-to-End with Dry Run)**

Files:
- Create: `projects/image-pipeline/src/integration.test.ts`
**Step 1: Write integration test**

```typescript
// src/integration.test.ts
import { describe, it, expect } from 'vitest';
import { loadBrand } from './brand-loader.js';
import { buildR2Key } from './r2-uploader.js';

describe('Integration: brand -> R2 path', () => {
  it('richbukae hero generates valid path', () => {
    const brand = loadBrand('richbukae');
    expect(brand.dimensions.hero).toEqual({ width: 1200, height: 628 });
    const key = buildR2Key('richbukae', 'hero', 'test123');
    expect(key).toContain('images/richbukae/hero/');
  });

  it('apppro instagram generates valid path', () => {
    const brand = loadBrand('apppro');
    expect(brand.dimensions.instagram).toEqual({ width: 1080, height: 1080 });
    const key = buildR2Key('apppro', 'instagram', 'test456');
    expect(key).toContain('images/apppro/instagram/');
  });
});
```
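The tests above only assert a key prefix. For reference, a hypothetical sketch of the key scheme they imply; the real `buildR2Key` lives in `src/r2-uploader.ts`, and the `${id}.png` suffix here is an assumption:

```typescript
// Hypothetical sketch of the R2 key scheme the integration tests assert.
// Real implementation: src/r2-uploader.ts. Filename suffix/extension are assumed.
function buildR2KeySketch(brand: string, type: string, id: string): string {
  return `images/${brand}/${type}/${id}.png`;
}

console.log(buildR2KeySketch('richbukae', 'hero', 'test123')); // images/richbukae/hero/test123.png
```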
**Step 2: Run all tests**

```bash
cd projects/image-pipeline && npx vitest run
```

Expected: ALL PASS
**Step 3: Commit**

```bash
git add projects/image-pipeline/src/integration.test.ts
git commit -m "test(image-pipeline): add integration tests"
```
**Task 9: Skill File + Documentation**

Files:
- Create: `.claude/skills/image-generation.md`
**Step 1: Write skill file**

````markdown
---
name: image-generation
description: AI image generation pipeline - per-brand automated image generation, QA, and R2 upload
---

# Image Generation Pipeline

## Quick Start

```bash
./scripts/generate-image.sh --brand richbukae --type hero --topic "AI Marketing"
```

## Options

| Flag | Description | Default |
|------|-------------|---------|
| --brand | Brand name | Required |
| --type | hero/newsletter/instagram/reels/mockup | Required |
| --topic | Image subject | Required |
| --model | flux.2-pro/gemini-flash/schnell/sdxl | Auto |
| --budget | Use cheapest model | false |
| --no-qa | Skip QA check | false |
| --dry-run | Prompt only | false |

## Brands

Brand configs: `projects/image-pipeline/brands/{name}.json`
Available: richbukae, apppro

## Cost Reference

| Model | Provider | Cost/Image |
|-------|----------|------------|
| Flux Schnell | fal.ai | $0.003 |
| Gemini Flash Image | OpenRouter | $0.039 |
| Flux 2 Pro | OpenRouter | $0.05 |
| Gemini 3 Pro Image | OpenRouter | $0.134 |

## Adding a New Brand

1. Create `projects/image-pipeline/brands/{name}.json` (copy existing template)
2. Update colors, keywords, mood, dimensions
3. Test: `./scripts/generate-image.sh --brand {name} --type hero --topic "test" --dry-run`
````
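For reference, a minimal illustrative brand config. Field names are inferred from this plan (dimensions from the integration tests, richbukae colors from the blockers table); the `keywords` and `mood` values are placeholders, and the actual schema is whatever `brand-loader.ts` expects:

```json
{
  "name": "richbukae",
  "colors": { "primary": "#1a1a2e", "accent": "#c9a84c" },
  "keywords": ["finance", "wealth", "AI"],
  "mood": "premium, modern, trustworthy",
  "dimensions": {
    "hero": { "width": 1200, "height": 628 },
    "instagram": { "width": 1080, "height": 1080 }
  }
}
```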
**Step 2: Commit**
```bash
git add .claude/skills/image-generation.md
git commit -m "docs(image-pipeline): add image-generation skill file"
```
8. Blockers & CEO Action Required
| # | Item | Detail |
|---|---|---|
| 1 | OPENROUTER_API_KEY | CEO must issue key for OpenRouter (currently sk-or-... placeholder in policy) |
| 2 | FAL_KEY (optional) | Only needed if budget fal.ai fallback is desired |
| 3 | Brand Color Confirmation | Verify richbukae (#1a1a2e + #c9a84c) and apppro (#2563eb) brand colors are correct |
9. VP Review Points
- OpenRouter-first policy compliance — Plan routes all API calls through OpenRouter per CEO policy. Fal.ai is fallback only.
- Cost control — Mid-range stack at $0.057/image, ~$17/mo for 300 images. Within reasonable budget.
- No production deployment — This is a CLI tool, no Vercel/external deployment needed.
- Brand guide as JSON — Extensible to new brands/projects without code changes.
- Vision QA — Automated quality gate prevents low-quality images from reaching R2. Uses existing Claude Sonnet via OpenRouter.
- R2 already configured — No new infrastructure needed. Uses existing bucket + credentials.