2026-02-25

Image Generation AI Pipeline Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Build an automated, brand-aware image generation pipeline that generates, validates, and uploads images to Cloudflare R2 via CLI — supporting blog thumbnails, newsletter banners, social cards, and product mockups.

Architecture: CLI script (generate-image.sh) orchestrates a Node.js pipeline: loads brand style guide (JSON) → generates prompt via LLM → calls image generation API (OpenRouter-first, fal.ai fallback) → validates via vision model → uploads to R2. All API calls routed through OpenRouter where supported, per CEO policy.

Tech Stack: Node.js (ESM), OpenAI SDK (OpenRouter-compatible), @aws-sdk/client-s3 (R2 upload), sharp (image processing), Commander.js (CLI)


1. Research Results Summary

1.1 Image Generation Service Comparison (February 2026)

| Service | Representative Models | Price/Image (1MP) | API Friendliness | Quality | Speed | Korean Support | OpenRouter Available |
|---|---|---|---|---|---|---|---|
| Fal.ai | Flux 2 Pro, Flux 2 Dev, Flux Schnell, SDXL | $0.003~$0.05 | Excellent (REST) | High | Fast | No | N/A (direct only) |
| Replicate | Flux 2 Pro, SDXL, Stable Diffusion | $0.003~$0.055 | Excellent (REST) | High | Medium | No | N/A (direct only) |
| Together AI | Flux Schnell (free), Flux 1.1 Pro, Imagen 4 | $0.0027~$0.06 | Excellent (REST) | High | Very Fast (315ms schnell) | No | N/A (direct only) |
| OpenAI | GPT Image 1.5, GPT Image 1 Mini, DALL-E 3 | $0.005~$0.19 | Good (official SDK) | Very High | Medium | Yes (prompt) | Yes (via chat API) |
| Black Forest Labs | Flux 2 Pro/Max/Flex/Klein | $0.014~$0.28 | Good (REST) | Very High | Varies | No | N/A (direct) |
| Stability AI | Stable Diffusion 3.5, Stable Image Core | $0.003~$0.08 | Good (REST) | Good | Fast | No | N/A (direct only) |
| OpenRouter | Flux 2 Pro/Flex, Gemini Flash Image, Riverflow | $0.003~$0.07/MP | Excellent (OpenAI-compat) | High | Varies | Model-dependent | Yes (native) |
| Google (via OR) | Imagen 4 Fast/Standard/Ultra, Gemini Flash Image | $0.02~$0.06 | Good (via OpenRouter) | Very High | Fast | Yes (prompt) | Yes |
| Midjourney | Midjourney v6+ | ~$0.04 (subscription) | Poor (no official API) | Best | Medium | Limited | No |
| Nano Banana (Google) | Gemini 2.5 Flash Image, Gemini 3 Pro Image | $0.039~$0.134 | Good (via OpenRouter) | Very High | Medium | Yes | Yes |

1.2 OpenRouter Image Generation Support (Confirmed)

OpenRouter does support image generation via the chat completions endpoint with modalities: ["image"] or modalities: ["image", "text"].

Available Image Models on OpenRouter (Feb 2026):

| Model ID | Provider | Price/MP | Notes |
|---|---|---|---|
| black-forest-labs/flux.2-pro | BFL | ~$0.05 | Professional quality |
| black-forest-labs/flux.2-flex | BFL | ~$0.06 | Multi-reference editing |
| google/gemini-2.5-flash-image | Google | ~$0.039 | Nano Banana, contextual |
| google/gemini-3-pro-image-preview | Google | ~$0.134 | Highest Google quality |
| sourceful/riverflow-v2-fast | Sourceful | TBD | Fast generation |
| sourceful/riverflow-v2-pro | Sourceful | TBD | Pro quality |

API Format:

const response = await openai.chat.completions.create({
  model: 'black-forest-labs/flux.2-pro',
  modalities: ['image'],
  messages: [{ role: 'user', content: prompt }],
});
// Image returned as base64 data URL in response

Key Constraint: OpenRouter does NOT yet support all image models (e.g., no Imagen 4 direct, no SDXL, no Stability). For maximum model coverage, fal.ai is the best direct alternative.

Verdict: OpenRouter first for models it supports (Flux 2, Gemini Image), fal.ai fallback for budget models (Flux Schnell $0.003, SDXL $0.003) and specialized needs.

1.3 Use-Case Optimal Model Matrix

| Use Case | Dimensions | Recommended Model (OpenRouter) | Fallback (fal.ai) | Est. Cost/Image |
|---|---|---|---|---|
| Blog Hero/Thumbnail | 1200x628 | flux.2-pro | flux-2-pro | $0.05 |
| Newsletter Banner | 600x200 | gemini-2.5-flash-image | flux-schnell | $0.003~$0.039 |
| Instagram Card News | 1080x1080 | flux.2-pro | flux-2-dev | $0.025~$0.05 |
| Reels Thumbnail | 1080x1920 | flux.2-pro | flux-2-pro | $0.05 |
| Product Mockup/Cover (HQ) | 1536x1024 | gemini-3-pro-image-preview | flux-2-pro | $0.05~$0.134 |

2. Recommended Stack

Primary: OpenRouter (Policy-Compliant)

  • Models: Flux 2 Pro (general), Gemini 2.5 Flash Image (budget), Gemini 3 Pro Image (premium)
  • API: OpenAI SDK with baseURL: https://openrouter.ai/api/v1
  • Aligns with: CEO's OpenRouter-only policy (openrouter-policy.md)

Fallback: Fal.ai (Budget/Specialized)

  • Models: Flux Schnell ($0.003), SDXL ($0.003), Flux 2 Dev ($0.025)
  • Use when: OpenRouter doesn't support needed model, or budget-critical bulk generation
  • Requires: FAL_KEY env var (CEO approval needed)

Vision QA: OpenRouter (Claude/Gemini)

  • Model: anthropic/claude-sonnet-4-6 or google/gemini-2.5-flash
  • Cost: ~$0.003/image check (text tokens only for analysis)

Storage: Cloudflare R2 (Already Configured)

  • Bucket: business-builder
  • Public URL: https://pub-8433d9d9c43b94f189da8f35ea1926ed.r2.dev
  • Keys: Already in .env (R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, etc.)

3. Pipeline Design

3.1 Architecture Diagram

[CLI Command]
  │
  ├─ --brand richbukae --type hero --topic "AI Marketing"
  │
  ▼
[1. Load Brand Style Guide]  ← brands/{brand}.json
  │
  ▼
[2. Generate Prompt (LLM)]   ← OpenRouter: gemini-2.5-flash
  │  Input: style guide + topic + image type + dimensions
  │  Output: detailed image generation prompt
  │
  ▼
[3. Generate Image]           ← OpenRouter: flux.2-pro (or fallback fal.ai)
  │  Input: prompt + dimensions
  │  Output: base64 PNG
  │
  ▼
[4. Vision QA Check]          ← OpenRouter: claude-sonnet-4-6
  │  Input: generated image + brand style guide
  │  Checks: brand color match, composition, text readability, quality
  │  Output: PASS/FAIL + score + reasons
  │
  ├─ PASS → [5. Process & Upload]
  │         │  sharp: resize, optimize
  │         │  R2: upload with metadata
  │         │  Output: public URL
  │         ▼
  │         [Done: Return URL + metadata]
  │
  └─ FAIL → [Retry: Regenerate with feedback, max 3 attempts]
            │  Append failure reasons to prompt
            ▼
            [Back to Step 3]
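
The FAIL branch's "append failure reasons to prompt" step can be sketched as follows. The `QAFailure` shape and `buildRetryPrompt` name are illustrative placeholders, not part of a settled API:

```typescript
// Sketch of the retry-feedback step in the FAIL branch above.
// QAFailure and buildRetryPrompt are illustrative names (assumption).
interface QAFailure {
  name: string; // failed check, e.g. "brand_colors"
  note: string; // the vision model's reason for the failure
}

// Append QA failure reasons so the next generation attempt can correct them.
export function buildRetryPrompt(basePrompt: string, failures: QAFailure[]): string {
  if (failures.length === 0) return basePrompt;
  const fixes = failures.map((f) => `- ${f.name}: ${f.note}`).join('\n');
  return `${basePrompt}\n\nThe previous attempt failed QA. Fix these issues:\n${fixes}`;
}
```

Keeping the base prompt intact and only appending corrections preserves the brand constraints across retries.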

3.2 CLI Interface Design

# Basic usage
./scripts/generate-image.sh --brand richbukae --type hero --topic "AI Marketing Automation"

# With model override
./scripts/generate-image.sh --brand apppro --type instagram --topic "SaaS Development" --model flux.2-pro

# Budget mode (fal.ai schnell)
./scripts/generate-image.sh --brand richbukae --type newsletter --topic "Weekly Tips" --budget

# Batch mode (from CSV/JSON)
./scripts/generate-image.sh --batch images-to-generate.json

# Dry run (prompt only, no generation)
./scripts/generate-image.sh --brand apppro --type hero --topic "Cloud Native" --dry-run
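
The `--batch` file format is not pinned down in this plan; one plausible shape, mirroring the CLI flags per entry (an assumption, not a final schema):

```json
[
  { "brand": "richbukae", "type": "hero", "topic": "AI Marketing Automation" },
  { "brand": "apppro", "type": "instagram", "topic": "SaaS Development", "model": "flux.2-pro" }
]
```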

Options:

| Flag | Description | Default |
|---|---|---|
| --brand | Brand name (loads brands/{name}.json) | Required |
| --type | Image type: hero, newsletter, instagram, reels, mockup | Required |
| --topic | Subject/topic for the image | Required |
| --model | Override model: flux.2-pro, gemini-flash, schnell | Auto (by type) |
| --budget | Use cheapest model (fal.ai schnell) | false |
| --no-qa | Skip vision QA check | false |
| --dry-run | Generate prompt only, no API call | false |
| --batch | Path to batch JSON file | - |
| --output | Local save path (also uploads to R2) | - |

3.3 Vision QA Check Criteria

The vision model evaluates generated images against:

{
  "checks": [
    {
      "name": "brand_colors",
      "description": "Primary/secondary brand colors visible in image",
      "weight": 0.25
    },
    {
      "name": "composition",
      "description": "Good visual composition, proper framing, no artifacts",
      "weight": 0.25
    },
    {
      "name": "text_readability",
      "description": "Any text in image is readable (if applicable)",
      "weight": 0.15
    },
    {
      "name": "brand_mood",
      "description": "Image mood matches brand guide (luxury, tech, etc.)",
      "weight": 0.20
    },
    {
      "name": "technical_quality",
      "description": "No distortions, artifacts, or low-resolution areas",
      "weight": 0.15
    }
  ],
  "pass_threshold": 0.7
}
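
The weights above sum to 1.0, so the overall score is a straightforward weighted average. A minimal sketch of the scoring math (helper names are illustrative):

```typescript
// Sketch: compute the weighted QA score from per-check results.
// Weights and threshold come from the JSON criteria above; names are illustrative.
const WEIGHTS: Record<string, number> = {
  brand_colors: 0.25,
  composition: 0.25,
  text_readability: 0.15,
  brand_mood: 0.2,
  technical_quality: 0.15,
};

const PASS_THRESHOLD = 0.7;

export function weightedScore(checks: Record<string, { score: number }>): number {
  let total = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    total += (checks[name]?.score ?? 0) * weight; // a missing check counts as 0
  }
  return total;
}

export const passes = (checks: Record<string, { score: number }>): boolean =>
  weightedScore(checks) >= PASS_THRESHOLD;
```

Computing the score locally (rather than trusting the vision model's self-reported `passed` field) keeps the threshold enforcement deterministic.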

4. Brand Style Guide Template

4.1 JSON Schema

File location: projects/image-pipeline/brands/{brand-name}.json

{
  "brand": "richbukae",
  "display_name": "RichBukae",
  "tagline": "AI-Powered Wealth Building",
  "visual_identity": {
    "primary_color": "#1a1a2e",
    "secondary_color": "#c9a84c",
    "accent_color": "#e8d5a3",
    "background_preference": "dark",
    "style_keywords": ["luxury", "minimal", "elegant", "dark-navy", "gold-accent"],
    "mood": "premium, sophisticated, trustworthy"
  },
  "typography": {
    "heading_style": "bold, sans-serif, modern",
    "avoid": ["comic", "handwritten", "playful"]
  },
  "imagery": {
    "preferred_subjects": ["abstract tech", "wealth symbols", "modern office", "data visualization"],
    "avoid_subjects": ["cartoon", "clipart", "stock-photo-generic"],
    "human_presence": "minimal, silhouettes preferred"
  },
  "dimensions": {
    "hero": { "width": 1200, "height": 628 },
    "newsletter": { "width": 600, "height": 200 },
    "instagram": { "width": 1080, "height": 1080 },
    "reels": { "width": 1080, "height": 1920 },
    "mockup": { "width": 1536, "height": 1024 }
  }
}

4.2 Example: apppro.kr Brand

{
  "brand": "apppro",
  "display_name": "AppPro",
  "tagline": "AI Solutions for Business",
  "visual_identity": {
    "primary_color": "#2563eb",
    "secondary_color": "#1e40af",
    "accent_color": "#60a5fa",
    "background_preference": "light",
    "style_keywords": ["tech", "modern", "blue", "clean", "professional"],
    "mood": "innovative, trustworthy, cutting-edge"
  },
  "typography": {
    "heading_style": "bold, geometric sans-serif",
    "avoid": ["decorative", "serif", "ornamental"]
  },
  "imagery": {
    "preferred_subjects": ["SaaS dashboards", "code", "cloud architecture", "AI neural networks"],
    "avoid_subjects": ["nature", "food", "lifestyle"],
    "human_presence": "optional, diverse professionals"
  },
  "dimensions": {
    "hero": { "width": 1200, "height": 628 },
    "newsletter": { "width": 600, "height": 200 },
    "instagram": { "width": 1080, "height": 1080 },
    "reels": { "width": 1080, "height": 1920 },
    "mockup": { "width": 1536, "height": 1024 }
  }
}

5. Skill Packaging Plan

File: .claude/skills/image-generation.md

The skill file will provide:

  1. Usage instructions for PL agents to invoke the pipeline
  2. Brand guide management (add/edit/list brands)
  3. Model selection guide (when to use which model)
  4. Cost tracking (log each generation with model + cost)
  5. R2 upload conventions (path: images/{brand}/{type}/{date}-{hash}.png)

R2 Upload Path Convention

images/
  richbukae/
    hero/2026-02-25-a1b2c3.png
    instagram/2026-02-25-d4e5f6.png
  apppro/
    hero/2026-02-25-g7h8i9.png
    newsletter/2026-02-25-j0k1l2.png

6. Cost Simulation

Scenario A: Low-Cost Stack (Fal.ai Flux Schnell + Gemini Flash QA)

  • Image generation: $0.003/image
  • Prompt generation: $0.001/image (Gemini Flash)
  • Vision QA: $0.003/image (Gemini Flash vision)
  • Total: ~$0.007/image

Scenario B: Mid-Range Stack (OpenRouter Flux 2 Pro + Claude QA)

  • Image generation: $0.05/image
  • Prompt generation: $0.002/image (Gemini Flash)
  • Vision QA: $0.005/image (Claude Sonnet)
  • Total: ~$0.057/image

Scenario C: Premium Stack (OpenRouter Gemini 3 Pro Image + Claude QA)

  • Image generation: $0.134/image
  • Prompt generation: $0.002/image
  • Vision QA: $0.005/image
  • Total: ~$0.141/image

Monthly Cost Projections

| Volume | Low-Cost (A) | Mid-Range (B) | Premium (C) |
|---|---|---|---|
| 100 images/mo | $0.70 | $5.70 | $14.10 |
| 300 images/mo | $2.10 | $17.10 | $42.30 |
| 1,000 images/mo | $7.00 | $57.00 | $141.00 |

Note: Costs assume ~20% retry rate (QA failures regenerated). Actual costs = base x 1.2.
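A sketch of the cost math behind these projections, using the per-image totals from Scenarios A-C and the 20% retry multiplier (function name is illustrative):

```typescript
// Per-image costs in USD from Scenarios A-C above.
const STACKS = {
  low: 0.007,     // Scenario A: fal.ai Flux Schnell + Gemini Flash QA
  mid: 0.057,     // Scenario B: OpenRouter Flux 2 Pro + Claude QA
  premium: 0.141, // Scenario C: OpenRouter Gemini 3 Pro Image + Claude QA
} as const;

const RETRY_RATE = 0.2; // ~20% of images regenerated after a QA failure

// Retry-adjusted monthly cost (the table values above are pre-retry base costs).
export function monthlyCost(stack: keyof typeof STACKS, imagesPerMonth: number): number {
  return imagesPerMonth * STACKS[stack] * (1 + RETRY_RATE);
}
```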

Recommended Monthly Budget

  • MVP phase (100-300 images): $5~$20/mo with Mid-Range stack
  • Scale phase (300-1,000 images): $20~$70/mo, mix Low-Cost for bulk + Mid-Range for hero images

7. Implementation Tasks (TDD, Bite-Sized)

Prerequisites

  • OPENROUTER_API_KEY must be set in .env (CEO approval — currently sk-or-... placeholder)
  • Optionally FAL_KEY for budget fallback (requires CEO approval)
  • R2 credentials already available in .env

Task 1: Project Scaffolding

Files:

  • Create: projects/image-pipeline/package.json
  • Create: projects/image-pipeline/tsconfig.json
  • Create: projects/image-pipeline/.env (symlink or copy from root)

Step 1: Create project directory and package.json

mkdir -p projects/image-pipeline
cd projects/image-pipeline

Then create package.json with:

{
  "name": "image-pipeline",
  "version": "0.1.0",
  "type": "module",
  "scripts": {
    "generate": "tsx src/cli.ts",
    "test": "vitest run",
    "test:watch": "vitest"
  },
  "dependencies": {
    "openai": "^4.80.0",
    "@aws-sdk/client-s3": "^3.750.0",
    "sharp": "^0.33.0",
    "commander": "^13.0.0"
  },
  "devDependencies": {
    "tsx": "^4.19.0",
    "typescript": "^5.7.0",
    "vitest": "^3.0.0",
    "@types/node": "^22.0.0"
  }
}

Step 2: Create tsconfig.json

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "outDir": "dist",
    "rootDir": "src",
    "strict": true,
    "esModuleInterop": true,
    "resolveJsonModule": true,
    "declaration": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}

Step 3: Install dependencies

npm install

Step 4: Commit

git add projects/image-pipeline/
git commit -m "feat(image-pipeline): scaffold project with deps"

Task 2: Brand Style Guide Loader

Files:

  • Create: projects/image-pipeline/brands/richbukae.json
  • Create: projects/image-pipeline/brands/apppro.json
  • Create: projects/image-pipeline/src/brand-loader.ts
  • Test: projects/image-pipeline/src/brand-loader.test.ts

Step 1: Write brand JSON files

Create brands/richbukae.json and brands/apppro.json using the templates from Section 4 above.

Step 2: Write failing test for brand loader

// src/brand-loader.test.ts
import { describe, it, expect } from 'vitest';
import { loadBrand, type BrandConfig } from './brand-loader.js';

describe('loadBrand', () => {
  it('loads richbukae brand config', () => {
    const brand = loadBrand('richbukae');
    expect(brand.brand).toBe('richbukae');
    expect(brand.visual_identity.primary_color).toBe('#1a1a2e');
    expect(brand.dimensions.hero.width).toBe(1200);
  });

  it('loads apppro brand config', () => {
    const brand = loadBrand('apppro');
    expect(brand.brand).toBe('apppro');
    expect(brand.visual_identity.primary_color).toBe('#2563eb');
  });

  it('throws on unknown brand', () => {
    expect(() => loadBrand('nonexistent')).toThrow('Brand not found');
  });
});

Step 3: Run test to verify it fails

npx vitest run src/brand-loader.test.ts

Expected: FAIL — loadBrand not defined

Step 4: Implement brand loader

// src/brand-loader.ts
import { readFileSync } from 'node:fs';
import { join, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';

const __dirname = dirname(fileURLToPath(import.meta.url));
const BRANDS_DIR = join(__dirname, '..', 'brands');

export interface BrandDimensions {
  width: number;
  height: number;
}

export interface BrandConfig {
  brand: string;
  display_name: string;
  tagline: string;
  visual_identity: {
    primary_color: string;
    secondary_color: string;
    accent_color: string;
    background_preference: string;
    style_keywords: string[];
    mood: string;
  };
  typography: {
    heading_style: string;
    avoid: string[];
  };
  imagery: {
    preferred_subjects: string[];
    avoid_subjects: string[];
    human_presence: string;
  };
  dimensions: Record<string, BrandDimensions>;
}

export function loadBrand(name: string): BrandConfig {
  const path = join(BRANDS_DIR, `${name}.json`);
  try {
    const raw = readFileSync(path, 'utf-8');
    return JSON.parse(raw) as BrandConfig;
  } catch {
    throw new Error(`Brand not found: ${name}`);
  }
}

Step 5: Run test to verify pass

npx vitest run src/brand-loader.test.ts

Expected: PASS

Step 6: Commit

git add projects/image-pipeline/brands/ projects/image-pipeline/src/brand-loader.*
git commit -m "feat(image-pipeline): add brand style guide loader with tests"

Task 3: Prompt Generator (LLM-based)

Files:

  • Create: projects/image-pipeline/src/prompt-generator.ts
  • Test: projects/image-pipeline/src/prompt-generator.test.ts

Step 1: Write failing test

// src/prompt-generator.test.ts
import { describe, it, expect, vi } from 'vitest';
import { generateImagePrompt } from './prompt-generator.js';
import { loadBrand } from './brand-loader.js';

// Mock OpenAI
vi.mock('openai', () => ({
  default: class {
    chat = {
      completions: {
        create: vi.fn().mockResolvedValue({
          choices: [{ message: { content: 'A luxurious dark navy background with golden accent lines...' } }]
        })
      }
    };
  }
}));

describe('generateImagePrompt', () => {
  it('generates a prompt incorporating brand style', async () => {
    const brand = loadBrand('richbukae');
    const result = await generateImagePrompt({
      brand,
      imageType: 'hero',
      topic: 'AI Marketing',
    });

    expect(result).toBeTruthy();
    expect(typeof result).toBe('string');
    expect(result.length).toBeGreaterThan(50);
  });
});

Step 2: Run test to verify failure

npx vitest run src/prompt-generator.test.ts

Expected: FAIL — module not found

Step 3: Implement prompt generator

// src/prompt-generator.ts
import OpenAI from 'openai';
import type { BrandConfig } from './brand-loader.js';

const client = new OpenAI({
  baseURL: process.env.OPENROUTER_BASE_URL || 'https://openrouter.ai/api/v1',
  apiKey: process.env.OPENROUTER_API_KEY,
});

interface PromptOptions {
  brand: BrandConfig;
  imageType: string;
  topic: string;
}

export async function generateImagePrompt(opts: PromptOptions): Promise<string> {
  const { brand, imageType, topic } = opts;
  const dims = brand.dimensions[imageType];

  const systemPrompt = `You are an expert image prompt engineer. Generate a detailed,
high-quality text-to-image prompt based on the brand guidelines and topic provided.
The prompt should be specific about colors, composition, mood, and style.
Output ONLY the image generation prompt, nothing else.`;

  const userPrompt = `Brand: ${brand.display_name}
Style: ${brand.visual_identity.style_keywords.join(', ')}
Mood: ${brand.visual_identity.mood}
Colors: Primary ${brand.visual_identity.primary_color}, Secondary ${brand.visual_identity.secondary_color}, Accent ${brand.visual_identity.accent_color}
Background: ${brand.visual_identity.background_preference}
Preferred subjects: ${brand.imagery.preferred_subjects.join(', ')}
Avoid: ${brand.imagery.avoid_subjects.join(', ')}
Image type: ${imageType} (${dims?.width || 1024}x${dims?.height || 1024})
Topic: ${topic}

Generate a detailed image prompt.`;

  const response = await client.chat.completions.create({
    model: 'google/gemini-2.5-flash',
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userPrompt },
    ],
    max_tokens: 500,
  });

  const content = response.choices[0]?.message?.content;
  if (!content) throw new Error('No prompt generated');
  return content.trim();
}

Step 4: Run test

npx vitest run src/prompt-generator.test.ts

Expected: PASS

Step 5: Commit

git add projects/image-pipeline/src/prompt-generator.*
git commit -m "feat(image-pipeline): add LLM-based prompt generator"

Task 4: Image Generator (OpenRouter + Fal.ai fallback)

Files:

  • Create: projects/image-pipeline/src/image-generator.ts
  • Test: projects/image-pipeline/src/image-generator.test.ts

Step 1: Write failing test

// src/image-generator.test.ts
import { describe, it, expect, vi } from 'vitest';
import { generateImage, type GenerateImageOptions } from './image-generator.js';

// Mock fetch for fal.ai and OpenAI for OpenRouter
vi.mock('openai', () => ({
  default: class {
    chat = {
      completions: {
        create: vi.fn().mockResolvedValue({
          choices: [{
            message: {
              content: [{ type: 'image_url', image_url: { url: 'data:image/png;base64,iVBOR...' } }]
            }
          }]
        })
      }
    };
  }
}));

describe('generateImage', () => {
  it('returns base64 image data via openrouter', async () => {
    const result = await generateImage({
      prompt: 'A dark navy background with golden lines',
      width: 1024,
      height: 1024,
      provider: 'openrouter',
      model: 'black-forest-labs/flux.2-pro',
    });

    expect(result.base64).toBeTruthy();
    expect(result.provider).toBe('openrouter');
  });
});

Step 2: Run test to verify failure

npx vitest run src/image-generator.test.ts

Expected: FAIL

Step 3: Implement image generator

// src/image-generator.ts
import OpenAI from 'openai';

export interface GenerateImageOptions {
  prompt: string;
  width: number;
  height: number;
  provider: 'openrouter' | 'fal';
  model: string;
}

export interface GenerateImageResult {
  base64: string;
  provider: string;
  model: string;
}

export async function generateImage(opts: GenerateImageOptions): Promise<GenerateImageResult> {
  if (opts.provider === 'openrouter') {
    return generateViaOpenRouter(opts);
  }
  return generateViaFal(opts);
}

async function generateViaOpenRouter(opts: GenerateImageOptions): Promise<GenerateImageResult> {
  const client = new OpenAI({
    baseURL: process.env.OPENROUTER_BASE_URL || 'https://openrouter.ai/api/v1',
    apiKey: process.env.OPENROUTER_API_KEY,
  });

  const response = await client.chat.completions.create({
    model: opts.model,
    // @ts-expect-error OpenRouter extension
    modalities: ['image'],
    messages: [{ role: 'user', content: opts.prompt }],
  });

  const content = response.choices[0]?.message?.content;
  let base64 = '';

  if (typeof content === 'string') {
    // Some models return base64 directly
    base64 = content;
  } else if (Array.isArray(content)) {
    // Multi-part response with image_url
    const imgPart = content.find((p: any) => p.type === 'image_url');
    if (imgPart && 'image_url' in imgPart) {
      base64 = (imgPart as any).image_url.url.replace(/^data:image\/\w+;base64,/, '');
    }
  }

  if (!base64) throw new Error('No image in OpenRouter response');

  return { base64, provider: 'openrouter', model: opts.model };
}

async function generateViaFal(opts: GenerateImageOptions): Promise<GenerateImageResult> {
  const FAL_KEY = process.env.FAL_KEY;
  if (!FAL_KEY) throw new Error('FAL_KEY not set');

  const modelMap: Record<string, string> = {
    'schnell': 'fal-ai/flux/schnell',
    'flux-2-dev': 'fal-ai/flux-2/dev',
    'flux-2-pro': 'fal-ai/flux-2-pro',
    'sdxl': 'fal-ai/fast-sdxl',
  };

  const falModel = modelMap[opts.model] || opts.model;

  const response = await fetch(`https://fal.run/${falModel}`, {
    method: 'POST',
    headers: {
      'Authorization': `Key ${FAL_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      prompt: opts.prompt,
      image_size: { width: opts.width, height: opts.height },
      num_images: 1,
    }),
  });

  if (!response.ok) throw new Error(`Fal.ai error: ${response.status}`);

  const data = await response.json() as any;
  const imageUrl = data.images?.[0]?.url;
  if (!imageUrl) throw new Error('No image in fal.ai response');

  // Download and convert to base64
  const imgResponse = await fetch(imageUrl);
  const buffer = Buffer.from(await imgResponse.arrayBuffer());
  const base64 = buffer.toString('base64');

  return { base64, provider: 'fal', model: opts.model };
}

Step 4: Run test

npx vitest run src/image-generator.test.ts

Expected: PASS

Step 5: Commit

git add projects/image-pipeline/src/image-generator.*
git commit -m "feat(image-pipeline): add image generator with OpenRouter + fal.ai"

Task 5: Vision QA Checker

Files:

  • Create: projects/image-pipeline/src/qa-checker.ts
  • Test: projects/image-pipeline/src/qa-checker.test.ts

Step 1: Write failing test

// src/qa-checker.test.ts
import { describe, it, expect, vi } from 'vitest';
import { checkImageQuality, type QAResult } from './qa-checker.js';

vi.mock('openai', () => ({
  default: class {
    chat = {
      completions: {
        create: vi.fn().mockResolvedValue({
          choices: [{
            message: {
              content: JSON.stringify({
                passed: true,
                score: 0.85,
                checks: {
                  brand_colors: { score: 0.9, note: 'Gold and navy visible' },
                  composition: { score: 0.8, note: 'Good layout' },
                  text_readability: { score: 0.9, note: 'N/A' },
                  brand_mood: { score: 0.8, note: 'Luxury feel achieved' },
                  technical_quality: { score: 0.85, note: 'Clean render' },
                },
              })
            }
          }]
        })
      }
    };
  }
}));

describe('checkImageQuality', () => {
  it('returns QA result with score', async () => {
    const result = await checkImageQuality({
      imageBase64: 'iVBORw0KGgoAAAANS...',
      brandName: 'richbukae',
      imageType: 'hero',
    });

    expect(result.passed).toBe(true);
    expect(result.score).toBeGreaterThan(0.7);
    expect(result.checks).toBeDefined();
  });
});

Step 2: Run test, verify failure

npx vitest run src/qa-checker.test.ts

Step 3: Implement QA checker

// src/qa-checker.ts
import OpenAI from 'openai';
import { loadBrand } from './brand-loader.js';

const client = new OpenAI({
  baseURL: process.env.OPENROUTER_BASE_URL || 'https://openrouter.ai/api/v1',
  apiKey: process.env.OPENROUTER_API_KEY,
});

export interface QACheckDetail {
  score: number;
  note: string;
}

export interface QAResult {
  passed: boolean;
  score: number;
  checks: Record<string, QACheckDetail>;
}

interface QAOptions {
  imageBase64: string;
  brandName: string;
  imageType: string;
}

export async function checkImageQuality(opts: QAOptions): Promise<QAResult> {
  const brand = loadBrand(opts.brandName);

  const prompt = `You are an image quality auditor. Evaluate this AI-generated image against the brand guidelines.

Brand: ${brand.display_name}
Style: ${brand.visual_identity.style_keywords.join(', ')}
Primary Color: ${brand.visual_identity.primary_color}
Secondary Color: ${brand.visual_identity.secondary_color}
Mood: ${brand.visual_identity.mood}
Image Type: ${opts.imageType}

Score each criterion from 0.0 to 1.0:
1. brand_colors — Are the brand's primary/secondary colors visible?
2. composition — Good visual layout, no artifacts?
3. text_readability — If text present, is it readable?
4. brand_mood — Does the mood match the brand?
5. technical_quality — No distortions or low-resolution areas?

Respond in JSON only:
{
  "passed": true/false (true if weighted average >= 0.7),
  "score": 0.0-1.0,
  "checks": {
    "brand_colors": {"score": 0.0-1.0, "note": "..."},
    "composition": {"score": 0.0-1.0, "note": "..."},
    "text_readability": {"score": 0.0-1.0, "note": "..."},
    "brand_mood": {"score": 0.0-1.0, "note": "..."},
    "technical_quality": {"score": 0.0-1.0, "note": "..."}
  }
}`;

  const response = await client.chat.completions.create({
    model: 'anthropic/claude-sonnet-4-6',
    messages: [{
      role: 'user',
      content: [
        { type: 'text', text: prompt },
        { type: 'image_url', image_url: { url: `data:image/png;base64,${opts.imageBase64}` } },
      ],
    }],
    max_tokens: 500,
  });

  const content = response.choices[0]?.message?.content;
  if (!content || typeof content !== 'string') throw new Error('No QA response');

  return JSON.parse(content) as QAResult;
}

Step 4: Run test

npx vitest run src/qa-checker.test.ts

Expected: PASS

Step 5: Commit

git add projects/image-pipeline/src/qa-checker.*
git commit -m "feat(image-pipeline): add vision QA checker with brand validation"

Task 6: R2 Uploader

Files:

  • Create: projects/image-pipeline/src/r2-uploader.ts
  • Test: projects/image-pipeline/src/r2-uploader.test.ts

Step 1: Write failing test

// src/r2-uploader.test.ts
import { describe, it, expect, vi } from 'vitest';
import { buildR2Key, type UploadResult } from './r2-uploader.js';

describe('buildR2Key', () => {
  it('generates correct R2 path', () => {
    const key = buildR2Key('richbukae', 'hero', 'abc123');
    expect(key).toMatch(/^images\/richbukae\/hero\/\d{4}-\d{2}-\d{2}-abc123\.png$/);
  });
});

Step 2: Run test, verify failure

npx vitest run src/r2-uploader.test.ts

Step 3: Implement R2 uploader

// src/r2-uploader.ts
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { createHash } from 'node:crypto';

const s3 = new S3Client({
  region: 'auto',
  endpoint: process.env.CLOUDFLARE_R2_ENDPOINT,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID || '',
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY || '',
  },
});

export interface UploadResult {
  key: string;
  publicUrl: string;
  size: number;
}

export function buildR2Key(brand: string, imageType: string, hash: string): string {
  const date = new Date().toISOString().slice(0, 10);
  return `images/${brand}/${imageType}/${date}-${hash}.png`;
}

export async function uploadToR2(base64: string, brand: string, imageType: string): Promise<UploadResult> {
  const buffer = Buffer.from(base64, 'base64');
  const hash = createHash('md5').update(buffer).digest('hex').slice(0, 8);
  const key = buildR2Key(brand, imageType, hash);

  await s3.send(new PutObjectCommand({
    Bucket: process.env.R2_BUCKET_NAME || 'business-builder',
    Key: key,
    Body: buffer,
    ContentType: 'image/png',
    CacheControl: 'public, max-age=31536000',
  }));

  const publicUrl = `${process.env.R2_PUBLIC_URL}/${key}`;
  return { key, publicUrl, size: buffer.length };
}

Step 4: Run test

npx vitest run src/r2-uploader.test.ts

Expected: PASS (unit test for buildR2Key only; upload tested in integration)

Step 5: Commit

git add projects/image-pipeline/src/r2-uploader.*
git commit -m "feat(image-pipeline): add R2 uploader with path conventions"

Task 7: CLI Entry Point

Files:

  • Create: projects/image-pipeline/src/cli.ts
  • Create: scripts/generate-image.sh (thin wrapper)

Step 1: Implement CLI

// src/cli.ts
import { Command } from 'commander';
import { loadBrand } from './brand-loader.js';
import { generateImagePrompt } from './prompt-generator.js';
import { generateImage } from './image-generator.js';
import { checkImageQuality } from './qa-checker.js';
import { uploadToR2 } from './r2-uploader.js';
import { writeFileSync } from 'node:fs';

const program = new Command();

const MODEL_MAP: Record<string, { provider: 'openrouter' | 'fal'; model: string }> = {
  'flux.2-pro': { provider: 'openrouter', model: 'black-forest-labs/flux.2-pro' },
  'gemini-flash': { provider: 'openrouter', model: 'google/gemini-2.5-flash-image' },
  'gemini-pro': { provider: 'openrouter', model: 'google/gemini-3-pro-image-preview' },
  'schnell': { provider: 'fal', model: 'schnell' },
  'sdxl': { provider: 'fal', model: 'sdxl' },
};

const TYPE_DEFAULTS: Record<string, string> = {
  hero: 'flux.2-pro',
  newsletter: 'gemini-flash',
  instagram: 'flux.2-pro',
  reels: 'flux.2-pro',
  mockup: 'gemini-pro',
};

program
  .name('generate-image')
  .description('Brand-aware AI image generation pipeline')
  .requiredOption('--brand <name>', 'Brand name (loads brands/{name}.json)')
  .requiredOption('--type <type>', 'Image type: hero, newsletter, instagram, reels, mockup')
  .requiredOption('--topic <topic>', 'Image subject/topic')
  .option('--model <model>', 'Override model: flux.2-pro, gemini-flash, schnell, sdxl')
  .option('--budget', 'Use cheapest model (fal.ai schnell)', false)
  .option('--no-qa', 'Skip vision QA check')
  .option('--dry-run', 'Generate prompt only', false)
  .option('--output <path>', 'Save image locally')
  .action(async (opts) => {
    try {
      // 1. Load brand
      console.log(`[1/5] Loading brand: ${opts.brand}`);
      const brand = loadBrand(opts.brand);
      const dims = brand.dimensions[opts.type] || { width: 1024, height: 1024 };

      // 2. Generate prompt
      console.log(`[2/5] Generating prompt for: ${opts.topic}`);
      const prompt = await generateImagePrompt({ brand, imageType: opts.type, topic: opts.topic });
      console.log(`  Prompt: ${prompt.slice(0, 120)}...`);

      if (opts.dryRun) {
        console.log('\n--- DRY RUN: Full Prompt ---');
        console.log(prompt);
        return;
      }

      // 3. Select model
      const modelKey = opts.budget ? 'schnell' : (opts.model || TYPE_DEFAULTS[opts.type] || 'flux.2-pro');
      const modelConfig = MODEL_MAP[modelKey];
      if (!modelConfig) throw new Error(`Unknown model: ${modelKey}`);

      // 4. Generate image (with retry)
      const MAX_RETRIES = 3;
      let imageResult;
      let qaResult;

      for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
        console.log(`[3/5] Generating image (attempt ${attempt}/${MAX_RETRIES}) via ${modelConfig.provider}:${modelConfig.model}`);
        imageResult = await generateImage({
          prompt,
          width: dims.width,
          height: dims.height,
          ...modelConfig,
        });

        if (!opts.qa) {
          console.log(`[4/5] QA skipped`);
          break;
        }

        // 5. QA Check
        console.log(`[4/5] Running vision QA check...`);
        qaResult = await checkImageQuality({
          imageBase64: imageResult.base64,
          brandName: opts.brand,
          imageType: opts.type,
        });

        console.log(`  QA Score: ${qaResult.score} (${qaResult.passed ? 'PASS' : 'FAIL'})`);
        if (qaResult.passed) break;
        if (attempt < MAX_RETRIES) console.log(`  Retrying with QA feedback...`);
      }

      if (!imageResult) throw new Error('Image generation failed');

      // 6. Upload to R2
      console.log(`[5/5] Uploading to R2...`);
      const upload = await uploadToR2(imageResult.base64, opts.brand, opts.type);
      console.log(`  URL: ${upload.publicUrl}`);
      console.log(`  Size: ${(upload.size / 1024).toFixed(1)} KB`);

      // Optional local save
      if (opts.output) {
        writeFileSync(opts.output, Buffer.from(imageResult.base64, 'base64'));
        console.log(`  Saved locally: ${opts.output}`);
      }

      console.log('\nDone!');
    } catch (err) {
      console.error('Error:', (err as Error).message);
      process.exit(1);
    }
  });

program.parse();
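The model-selection precedence buried in the action handler (`--budget` wins, then an explicit `--model`, then the per-type default, then the global `flux.2-pro` fallback) could be factored into a pure helper to make it unit-testable. A sketch, not part of the plan's file layout; the function name is hypothetical:

```typescript
// Sketch: pure model-selection helper mirroring the CLI's precedence rules.
// TYPE_DEFAULTS duplicated here so the sketch is self-contained.
const TYPE_DEFAULTS: Record<string, string> = {
  hero: 'flux.2-pro',
  newsletter: 'gemini-flash',
  instagram: 'flux.2-pro',
  reels: 'flux.2-pro',
  mockup: 'gemini-pro',
};

export function selectModelKey(opts: {
  budget?: boolean;
  model?: string;
  type: string;
}): string {
  if (opts.budget) return 'schnell'; // --budget overrides everything
  return opts.model ?? TYPE_DEFAULTS[opts.type] ?? 'flux.2-pro';
}
```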

Step 2: Create shell wrapper

#!/usr/bin/env bash
# scripts/generate-image.sh — Thin wrapper for image pipeline CLI
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
PROJECT_DIR="$SCRIPT_DIR/../projects/image-pipeline"

# Load env
if [ -f "$SCRIPT_DIR/../.env" ]; then
  set -a; source "$SCRIPT_DIR/../.env"; set +a
fi

cd "$PROJECT_DIR"
npx tsx src/cli.ts "$@"
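The wrapper sources `.env` from the repo root before running the CLI. A sketch of the variables involved: only `OPENROUTER_API_KEY` and `FAL_KEY` are named in this plan; the R2 variable names below are illustrative placeholders, not confirmed names from `r2-uploader`.

```bash
# .env (repo root) — sketch; R2 variable names are placeholders
OPENROUTER_API_KEY=sk-or-...   # required: all LLM/image calls route through OpenRouter
FAL_KEY=...                    # optional: fal.ai budget fallback (schnell, sdxl)
R2_ENDPOINT=...                # placeholder: Cloudflare R2 S3-compatible endpoint
R2_ACCESS_KEY_ID=...           # placeholder
R2_SECRET_ACCESS_KEY=...       # placeholder
R2_BUCKET=...                  # placeholder
```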

Step 3: Make executable and test help

chmod +x scripts/generate-image.sh
./scripts/generate-image.sh --help

Expected: Help text with all options displayed

Step 4: Commit

git add projects/image-pipeline/src/cli.ts scripts/generate-image.sh
git commit -m "feat(image-pipeline): add CLI entry point with shell wrapper"

Task 8: Integration Test (End-to-End with Dry Run)

Files:

  • Create: projects/image-pipeline/src/integration.test.ts

Step 1: Write integration test

// src/integration.test.ts
import { describe, it, expect } from 'vitest';
import { loadBrand } from './brand-loader.js';
import { buildR2Key } from './r2-uploader.js';

describe('Integration: brand -> R2 path', () => {
  it('richbukae hero generates valid path', () => {
    const brand = loadBrand('richbukae');
    expect(brand.dimensions.hero).toEqual({ width: 1200, height: 628 });
    const key = buildR2Key('richbukae', 'hero', 'test123');
    expect(key).toContain('images/richbukae/hero/');
  });

  it('apppro instagram generates valid path', () => {
    const brand = loadBrand('apppro');
    expect(brand.dimensions.instagram).toEqual({ width: 1080, height: 1080 });
    const key = buildR2Key('apppro', 'instagram', 'test456');
    expect(key).toContain('images/apppro/instagram/');
  });
});
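These assertions pin down the key prefix without pinning the timestamp or extension. For reference, a `buildR2Key` consistent with them might look like the following sketch; the real implementation belongs to `r2-uploader` from the earlier task, and the date-prefixed filename is an assumption:

```typescript
// Sketch only: a key builder consistent with the integration assertions above.
// The date-based filename scheme is an assumption.
export function buildR2Key(brand: string, type: string, id: string): string {
  const date = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  return `images/${brand}/${type}/${date}-${id}.png`;
}
```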

Step 2: Run all tests

cd projects/image-pipeline && npx vitest run

Expected: ALL PASS

Step 3: Commit

git add projects/image-pipeline/src/integration.test.ts
git commit -m "test(image-pipeline): add integration tests"

Task 9: Skill File + Documentation

Files:

  • Create: .claude/skills/image-generation.md

Step 1: Write skill file

---
name: image-generation
description: AI image generation pipeline - per-brand automated image generation, QA, and R2 upload
---

# Image Generation Pipeline

## Quick Start

\`\`\`bash
./scripts/generate-image.sh --brand richbukae --type hero --topic "AI Marketing"
\`\`\`

## Options

| Flag | Description | Default |
|------|-------------|---------|
| --brand | Brand name | Required |
| --type | hero/newsletter/instagram/reels/mockup | Required |
| --topic | Image subject | Required |
| --model | flux.2-pro/gemini-flash/gemini-pro/schnell/sdxl | Auto |
| --budget | Use cheapest model | false |
| --no-qa | Skip QA check | false |
| --dry-run | Prompt only | false |

## Brands

Brand configs: `projects/image-pipeline/brands/{name}.json`

Available: richbukae, apppro

## Cost Reference

| Model | Provider | Cost/Image |
|-------|----------|-----------|
| Flux Schnell | fal.ai | $0.003 |
| Gemini Flash Image | OpenRouter | $0.039 |
| Flux 2 Pro | OpenRouter | $0.05 |
| Gemini 3 Pro Image | OpenRouter | $0.134 |

## Adding a New Brand

1. Create `projects/image-pipeline/brands/{name}.json` (copy existing template)
2. Update colors, keywords, mood, dimensions
3. Test: `./scripts/generate-image.sh --brand {name} --type hero --topic "test" --dry-run`
Step 2: Commit

git add .claude/skills/image-generation.md
git commit -m "docs(image-pipeline): add image-generation skill file"

8. Blockers & CEO Action Required

| # | Item | Detail |
|---|------|--------|
| 1 | OPENROUTER_API_KEY | CEO must issue key for OpenRouter (currently `sk-or-...` placeholder in policy) |
| 2 | FAL_KEY (optional) | Only needed if the budget fal.ai fallback is desired |
| 3 | Brand color confirmation | Verify richbukae (#1a1a2e + #c9a84c) and apppro (#2563eb) brand colors are correct |
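For reference, a richbukae brand config consistent with the colors above and the dimensions asserted in the integration tests might look like the sketch below. Field names follow the skill file's "colors, keywords, mood, dimensions"; values other than the colors and the hero/instagram dimensions are illustrative:

```json
{
  "name": "richbukae",
  "colors": { "primary": "#1a1a2e", "accent": "#c9a84c" },
  "keywords": ["finance", "investing", "ai"],
  "mood": "premium, trustworthy, modern",
  "dimensions": {
    "hero": { "width": 1200, "height": 628 },
    "newsletter": { "width": 1200, "height": 400 },
    "instagram": { "width": 1080, "height": 1080 },
    "reels": { "width": 1080, "height": 1920 },
    "mockup": { "width": 1024, "height": 1024 }
  }
}
```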

9. VP Review Points

  1. OpenRouter-first policy compliance — Plan routes all API calls through OpenRouter per CEO policy. Fal.ai is fallback only.
  2. Cost control — Mid-range stack at $0.057/image, ~$17/mo for 300 images. Within reasonable budget.
  3. No production deployment — This is a CLI tool, no Vercel/external deployment needed.
  4. Brand guide as JSON — Extensible to new brands/projects without code changes.
  5. Vision QA — Automated quality gate prevents low-quality images from reaching R2. Uses existing Claude Sonnet via OpenRouter.
  6. R2 already configured — No new infrastructure needed. Uses existing bucket + credentials.