Build Your Own Pipeline
Conceptual guide for designing file discovery and plan generation workflows.
This guide distills the key architectural patterns from PlanToCode into a conceptual blueprint. Whether you want to build a similar system or understand why certain design decisions were made, this document covers the foundational patterns you can reuse or adapt.
Pipeline architecture map
Overview of the multi-stage pipeline from task input to plan output.
Key Architectural Patterns
Job Queue Pattern
All LLM-backed operations run as background jobs with status tracking, cancellation support, and retry logic. Jobs are persisted to SQLite so state survives app restarts.
Benefits
- Decouples UI responsiveness from LLM latency
- Enables cancellation mid-stream
- Provides audit trail of all operations
- Supports retry with exponential backoff
Pitfalls to Avoid
- Job status management adds complexity
- Stale jobs need careful handling on restart
- Stream accumulation can consume memory for large responses
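To make the pattern concrete, here is a minimal sketch of a persisted job record with status tracking and cancellation, assuming the better-sqlite3 package; the table layout, column names, and status values are illustrative rather than PlanToCode's actual schema.

```typescript
// Minimal job-queue persistence sketch (assumed schema, better-sqlite3).
import Database from "better-sqlite3";

type JobStatus = "queued" | "running" | "completed" | "failed" | "cancelled";

interface Job {
  id: string;
  kind: string;      // e.g. "file_discovery" or "plan_generation"
  payload: string;   // JSON-encoded prompt inputs
  status: JobStatus;
  attempts: number;
  result?: string;   // accumulated streamed response
}

const db = new Database("jobs.db");
db.exec(`CREATE TABLE IF NOT EXISTS jobs (
  id TEXT PRIMARY KEY, kind TEXT, payload TEXT,
  status TEXT, attempts INTEGER DEFAULT 0, result TEXT)`);

// Enqueue: persist first, then notify listeners, so state survives restarts.
function enqueue(job: Job): void {
  db.prepare(
    "INSERT INTO jobs (id, kind, payload, status, attempts) VALUES (?, ?, ?, 'queued', 0)"
  ).run(job.id, job.kind, job.payload);
}

// Cancellation is just a status transition; processors check it between chunks.
function cancel(id: string): void {
  db.prepare(
    "UPDATE jobs SET status = 'cancelled' WHERE id = ? AND status IN ('queued','running')"
  ).run(id);
}
```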
Workflow Orchestrator Pattern
Multi-stage workflows are coordinated by an orchestrator that schedules stages sequentially, passes intermediate data between them, and handles failures at any stage.
Components
- Definition loader reads workflow JSON specs
- Stage scheduler dispatches stages in order
- Payload builder constructs inputs from prior outputs
- Event emitter publishes progress for UI updates
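A minimal sketch of the scheduler loop is shown below; the StageDefinition shape and event names are assumptions standing in for stages loaded from the JSON specs.

```typescript
// Sequential stage scheduler sketch (assumed shapes, not PlanToCode's actual types).
interface StageDefinition {
  name: string;
  buildPayload: (prior: Record<string, unknown>) => unknown; // payload builder
  run: (payload: unknown) => Promise<unknown>;               // dispatches a job
}

type ProgressEvent = { stage: string; status: "started" | "completed" | "failed" };

async function runWorkflow(
  stages: StageDefinition[],
  emit: (event: ProgressEvent) => void
): Promise<Record<string, unknown>> {
  const outputs: Record<string, unknown> = {};
  for (const stage of stages) {
    emit({ stage: stage.name, status: "started" });
    try {
      // Each stage sees the accumulated outputs of all prior stages.
      outputs[stage.name] = await stage.run(stage.buildPayload(outputs));
      emit({ stage: stage.name, status: "completed" });
    } catch (err) {
      emit({ stage: stage.name, status: "failed" });
      throw err; // a failure at any stage halts the workflow
    }
  }
  return outputs;
}
```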
Repository Pattern
All persistence goes through typed repositories that abstract SQLite operations. This provides a clean API, enables testing, and centralizes database access.
Benefits
- Parameterized, typed queries prevent SQL injection
- Repositories can be mocked for testing
- Centralized query optimization
- Consistent error handling
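A sketch of one such repository, again assuming better-sqlite3; the Session shape and table name are illustrative.

```typescript
// Typed repository sketch over better-sqlite3 (assumed Session shape and table).
import Database from "better-sqlite3";

interface Session {
  id: string;
  taskDescription: string;
  createdAt: string;
}

class SessionRepository {
  constructor(private db: Database.Database) {}

  // Parameterized statements keep user input out of the SQL text itself.
  insert(session: Session): void {
    this.db
      .prepare("INSERT INTO sessions (id, task_description, created_at) VALUES (?, ?, ?)")
      .run(session.id, session.taskDescription, session.createdAt);
  }

  findById(id: string): Session | undefined {
    const row = this.db
      .prepare("SELECT id, task_description, created_at FROM sessions WHERE id = ?")
      .get(id) as { id: string; task_description: string; created_at: string } | undefined;
    if (!row) return undefined;
    return { id: row.id, taskDescription: row.task_description, createdAt: row.created_at };
  }
}
```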
Pipeline Steps
1. Define your task model
Start by defining what constitutes a task in your system. PlanToCode uses sessions with task descriptions, file selections, and model preferences.
Store task metadata in a dedicated table with versioning for history tracking.
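One possible shape for the task model and its versioned history table is sketched below; all field and table names are assumptions, not PlanToCode's actual schema.

```typescript
// Illustrative task model plus a versioned metadata table (assumed names).
interface TaskSession {
  id: string;
  description: string;       // the task in the user's words
  selectedFiles: string[];   // explicit file selections
  model: string;             // preferred model for this session
  version: number;           // bumped on every edit for history tracking
}

const createTables = `
CREATE TABLE IF NOT EXISTS sessions (
  id TEXT PRIMARY KEY,
  description TEXT NOT NULL,
  model TEXT NOT NULL
);
-- Each edit appends a row here instead of overwriting the session.
CREATE TABLE IF NOT EXISTS session_versions (
  session_id TEXT REFERENCES sessions(id),
  version INTEGER NOT NULL,
  description TEXT NOT NULL,
  selected_files TEXT NOT NULL, -- JSON array of paths
  created_at TEXT DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (session_id, version)
);`;
```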
2. Build the job queue
Create a job queue that persists jobs to storage, emits status events, and supports cancellation. Jobs should track prompts, responses, tokens, and cost.
Use a semaphore-based concurrency limiter to control parallel LLM requests.
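A minimal semaphore-based limiter might look like the following sketch; the concurrency limit of three is an arbitrary example.

```typescript
// Minimal semaphore sketch for capping concurrent LLM requests.
class Semaphore {
  private waiters: Array<() => void> = [];
  constructor(private available: number) {}

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) next();         // hand the slot directly to a waiter
    else this.available++;
  }
}

const llmSlots = new Semaphore(3); // e.g. at most 3 requests in flight

async function withSlot<T>(work: () => Promise<T>): Promise<T> {
  await llmSlots.acquire();
  try {
    return await work();
  } finally {
    llmSlots.release();
  }
}
```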
3. Implement processors
Each job type needs a processor that builds prompts, calls the LLM, and parses responses. Use streaming for long outputs.
Processors should be stateless and receive all context through job parameters.
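A sketch of a streaming processor under these constraints; `callModelStream` and the parameter shape are hypothetical placeholders for your own client.

```typescript
// Stateless processor sketch: all context arrives via job parameters, and the
// streamed response is accumulated chunk by chunk.
interface PlanJobParams {
  prompt: string;
  model: string;
  signal: AbortSignal; // set when the user cancels the job
}

async function processPlanJob(
  params: PlanJobParams,
  callModelStream: (prompt: string, model: string) => AsyncIterable<string>,
  onChunk: (text: string) => void
): Promise<string> {
  let accumulated = "";
  for await (const chunk of callModelStream(params.prompt, params.model)) {
    if (params.signal.aborted) {
      throw new Error("Job cancelled"); // check cancellation between chunks
    }
    accumulated += chunk;
    onChunk(chunk); // stream progress to the UI via events
  }
  return accumulated; // persisted as the job's result
}
```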
4. Create the workflow orchestrator
For multi-stage workflows, build an orchestrator that schedules stages, manages intermediate data, and handles failures.
Store workflow definitions as JSON for easy modification without code changes.
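An illustrative workflow definition, shown here as a typed object literal but stored as JSON on disk; the stage ids and job kinds are invented for the example.

```typescript
// Assumed shape of a workflow definition spec (not PlanToCode's actual format).
const planWorkflow = {
  name: "generate_plan",
  stages: [
    { id: "discover_files", jobKind: "file_discovery",  inputs: ["task_description"] },
    { id: "rank_files",     jobKind: "file_ranking",    inputs: ["discover_files"] },
    { id: "write_plan",     jobKind: "plan_generation", inputs: ["task_description", "rank_files"] }
  ]
} as const;
```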
5. Add the routing layer
Route LLM requests through a server proxy that normalizes payloads, manages API keys, and tracks usage.
Keep provider credentials on the server; never embed them in desktop clients.
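A minimal proxy route sketch, assuming Express and Node's built-in fetch; the provider URL, header names, and environment variable are placeholders.

```typescript
// Server-side proxy sketch: normalize the payload, attach credentials, track usage.
import express from "express";

const app = express();
app.use(express.json());

app.post("/api/llm", async (req, res) => {
  // Normalize the client payload into the provider's request shape.
  const { model, messages } = req.body;

  const upstream = await fetch("https://api.example-provider.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // The key lives only in the server environment, never in the desktop client.
      Authorization: `Bearer ${process.env.PROVIDER_API_KEY}`,
    },
    body: JSON.stringify({ model, messages }),
  });

  const data = await upstream.json();
  console.log("tokens used:", data.usage); // usage tracking hook
  res.json(data);
});
```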
Architecture Decisions
Should you use a local database or server-side storage?
Use local SQLite for job state and artifacts. This enables offline operation and fast queries. Sync to server only for billing and cross-device state.
Streaming vs non-streaming responses?
Use streaming for plan generation and any output shown progressively. Use non-streaming for short transformations like text improvement.
How to handle LLM provider failures?
Implement automatic retry with exponential backoff. Consider a fallback provider like OpenRouter for resilience.
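A small retry helper sketch; the delay and attempt defaults are illustrative.

```typescript
// Retry with exponential backoff: 500ms, 1000ms, 2000ms, ... between attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError; // all attempts exhausted; caller may fall back to another provider
}
```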
Where should file content be loaded?
Load file content in the processor just before building the prompt. This ensures fresh content and avoids storing large blobs in job records.
What to Customize vs Reuse
Customize
- Prompt templates for your specific use case
- File discovery patterns for your project types
- Output format (XML, JSON, Markdown)
- Model selection per task type
Reuse
- Job queue architecture with status tracking
- Workflow orchestrator pattern
- Repository pattern for persistence
- Streaming response handling
- Provider routing and normalization
Common Pitfalls to Avoid
Embedding API keys in the client
Route all LLM requests through a server proxy that manages credentials securely.
Not persisting job state
Store every job with full prompt and response for audit and recovery.
Blocking UI on LLM calls
Use background jobs with event-driven UI updates for responsive interfaces.
Ignoring token limits
Estimate tokens before sending and chunk large inputs to stay within context windows.
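A rough sketch of estimation and chunking; the four-characters-per-token ratio is a heuristic assumption, not a provider-accurate tokenizer.

```typescript
// Heuristic token estimate and a simple chunker that keeps files whole where possible.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function chunkByTokenBudget(
  files: { path: string; content: string }[],
  budget: number
): { path: string; content: string }[][] {
  const chunks: { path: string; content: string }[][] = [];
  let current: { path: string; content: string }[] = [];
  let used = 0;
  for (const file of files) {
    const cost = estimateTokens(file.content);
    if (used + cost > budget && current.length > 0) {
      chunks.push(current); // start a new chunk once the budget is exceeded
      current = [];
      used = 0;
    }
    current.push(file);
    used += cost;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```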
No cancellation support
Check cancellation flags between streaming chunks and propagate to server.
Artifacts to Persist
- Full prompt sent to the LLM (for debugging and audit)
- Complete response including streaming accumulation
- Token counts from provider response
- Computed cost based on model pricing
- System prompt template identifier for versioning
- Workflow intermediate data for multi-stage flows
Implementation Notes
- Use SQLite with WAL mode for concurrent read/write access
- Implement graceful shutdown that marks running jobs as failed
- Add health checks for external dependencies before job processing
- Log all LLM errors with full context for debugging
- Consider caching file content with short TTL to avoid redundant reads
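A sketch tying a few of these notes together, assuming better-sqlite3: enabling WAL at startup, failing stale jobs left over from a crash, and marking running jobs as failed on shutdown.

```typescript
// Startup/shutdown hygiene sketch (assumed jobs table from the earlier examples).
import Database from "better-sqlite3";

const db = new Database("jobs.db");
db.pragma("journal_mode = WAL"); // readers no longer block the writer

// Stale-job cleanup on startup: anything still "running" did not survive the restart.
db.prepare("UPDATE jobs SET status = 'failed' WHERE status = 'running'").run();

// Graceful shutdown: fail in-flight jobs rather than leaving them dangling.
process.on("SIGTERM", () => {
  db.prepare("UPDATE jobs SET status = 'failed' WHERE status = 'running'").run();
  db.close();
  process.exit(0);
});
```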