Model Configuration
Task-level model lists, selector controls, and token guardrails in the desktop client.
PlanToCode treats model selection as a task-level decision. Each workflow ships with a default model and an allowed list, and the desktop client exposes these options through a selector toggle backed by token guardrails that block prompts exceeding the active model's context window. Configuration is fetched from /api/config/desktop-runtime-config at startup and can be overridden per project in SQLite.
Model selector toggle
How the model selector shows allowed models with token guardrails.

Per-Task Allowed Models and Defaults
Each task type defines a default model, a list of allowed alternatives, token limits, and optional features like vision support. The desktop client reads these settings at runtime to populate the model selector.
| Task Type | Default Model | Max Output (tokens) |
|---|---|---|
| implementation_plan | openai/gpt-5.2-2025-12-11 | 23,000 |
| implementation_plan_merge | openai/gpt-5.2-2025-12-11 | 35,000 |
| task_refinement | anthropic/claude-opus-4-5-20251101 | 16,384 |
| text_improvement | anthropic/claude-opus-4-5-20251101 | 4,096 |
| voice_transcription | openai/gpt-4o-transcribe | 4,096 |
| regex_file_filter | anthropic/claude-sonnet-4-5-20250929 | 35,000 |
| file_relevance_assessment | openai/gpt-5-mini | 24,000 |
| extended_path_finder | openai/gpt-5-mini | 8,192 |
| web_search_prompts_generation | openai/gpt-5.2-2025-12-11 | 30,000 |
| video_analysis | google/gemini-2.5-pro | 50,000 |
Allowed alternatives are specified per task. For example, implementation_plan allows switching between GPT, Gemini, Claude, and DeepSeek models.
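For reference, the per-task settings described above could be modeled along the following lines. This is a sketch: the optional fields are assumptions rather than the actual schema, and only model, allowedModels, maxTokens, and temperature appear in the runtime config shown later on this page.

// Illustrative shape of a per-task model configuration (optional fields assumed)
interface TaskModelConfig {
  model: string;             // default model, e.g. "openai/gpt-5.2-2025-12-11"
  allowedModels: string[];   // alternatives the selector may offer
  maxTokens: number;         // maximum output tokens for the task
  temperature?: number;      // optional sampling temperature
  supportsVision?: boolean;  // optional feature flag, e.g. for video_analysis
}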
Token Guardrails (Context Window Checks)
Before sending any request, the system validates that the prompt plus planned output tokens fit within the model's advertised context window. Violations prevent the request from being sent.
// Token guardrail validation
interface TokenGuardrail {
  model: string;
  context_window: number;
  max_output: number;
}

interface ValidationResult {
  valid: boolean;
  error?: string;
  overage?: number;
}

function validateRequest(
  prompt_tokens: number,
  requested_output: number,
  guardrail: TokenGuardrail
): ValidationResult {
  const total = prompt_tokens + requested_output;
  if (total > guardrail.context_window) {
    return {
      valid: false,
      error: `Request requires ${total} tokens but model supports ${guardrail.context_window}`,
      overage: total - guardrail.context_window
    };
  }
  if (requested_output > guardrail.max_output) {
    return {
      valid: false,
      error: `Requested ${requested_output} output tokens but model max is ${guardrail.max_output}`
    };
  }
  return { valid: true };
}
Context Window
Prompt + max_output must fit within model context limit
Output Budget
Requested output tokens cannot exceed model max_output
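As a worked example, the following call checks an illustrative 190,000-token prompt against limits matching the implementation_plan figures elsewhere on this page. The prompt size is invented for demonstration; only the context window and max output mirror the documented values.

// Example check against illustrative implementation_plan limits
const guardrail: TokenGuardrail = {
  model: "openai/gpt-5.2-2025-12-11",
  context_window: 200000,
  max_output: 23000
};

// 190,000 prompt tokens is a made-up figure, not a real measurement
const result = validateRequest(190000, 23000, guardrail);
if (!result.valid) {
  // total 213,000 exceeds 200,000, so overage is 13,000
  console.warn(result.error); // surfaced to the user before any request is sent
}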
Runtime Config from /api/config/desktop-runtime-config
The desktop client fetches runtime configuration at startup from the server. This includes task model configs, provider information with model details, and concurrency limits.
// DesktopRuntimeAIConfig response structure
{
  "tasks": {
    "implementation_plan": {
      "model": "openai/gpt-5.2-2025-12-11",
      "allowedModels": [
        "openai/gpt-5.2-2025-12-11",
        "google/gemini-3-pro-preview",
        "google/gemini-2.5-pro",
        "anthropic/claude-opus-4-5-20251101",
        "deepseek/deepseek-r1-0528"
      ],
      "maxTokens": 23000,
      "temperature": 0.7,
      "copyButtons": [...]
    }
    // ... other task configs
  },
  "providers": [
    {
      "provider": { "code": "openai", "name": "OpenAI" },
      "models": [
        {
          "id": "openai/gpt-5.2-2025-12-11",
          "name": "GPT-5.2",
          "contextWindow": 200000,
          "priceInputPerMillion": "2.50",
          "priceOutputPerMillion": "10.00"
        }
      ]
    }
  ],
  "maxConcurrentJobs": 20
}
Config Lifecycle
- Fetched once at app startup
- Cached in React context for component access
- Auto-refreshed every 30 seconds via background sync
- Merged with project-level overrides
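A minimal sketch of that startup fetch, assuming a plain fetch call against the endpoint and the response shape shown above; the TypeScript type and helper name are illustrative, not the client's actual code.

// Illustrative startup fetch of the desktop runtime config
interface DesktopRuntimeAIConfig {
  tasks: Record<
    string,
    { model: string; allowedModels: string[]; maxTokens: number; temperature?: number }
  >;
  providers: Array<{
    provider: { code: string; name: string };
    models: Array<{ id: string; name: string; contextWindow: number }>;
  }>;
  maxConcurrentJobs: number;
}

async function loadRuntimeConfig(baseUrl: string): Promise<DesktopRuntimeAIConfig> {
  const res = await fetch(`${baseUrl}/api/config/desktop-runtime-config`);
  if (!res.ok) {
    throw new Error(`Runtime config fetch failed: ${res.status}`);
  }
  return (await res.json()) as DesktopRuntimeAIConfig;
}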
Project-Level Overrides in SQLite
Teams can override server defaults at the project level. These overrides are stored in the key_value_store table using a structured key pattern and merged with the runtime config when tasks are executed.
-- Project task settings use key_value_store with structured keys
-- Key pattern: project_task_settings:{project_hash}:{task_type}:{field}

-- Example: Override model for implementation_plan in a specific project
INSERT INTO key_value_store (key, value, updated_at) VALUES (
  'project_task_settings:abc123hash:implementation_plan:model',
  'anthropic/claude-opus-4-5-20251101',
  strftime('%s', 'now')
);

-- Example: Override temperature
INSERT INTO key_value_store (key, value, updated_at) VALUES (
  'project_task_settings:abc123hash:implementation_plan:temperature',
  '0.5',
  strftime('%s', 'now')
);

-- Retrieve all settings for a project
SELECT key, value FROM key_value_store
WHERE key LIKE 'project_task_settings:abc123hash:%';
Merge Behavior
Project overrides take precedence over server defaults. Settings are retrieved with the get_all_project_task_settings method, which queries all keys matching the project hash prefix.
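As an illustrative sketch (not the actual implementation), applying the retrieved rows on top of the server defaults might look like this:

// Illustrative merge of project overrides onto server defaults
type TaskSettings = { model: string; maxTokens: number; temperature?: number };

function mergeTaskSettings(
  serverDefaults: TaskSettings,
  projectOverrides: Record<string, string> // rows keyed by field, e.g. { model: "...", temperature: "0.5" }
): TaskSettings {
  const merged: TaskSettings = { ...serverDefaults };
  if (projectOverrides.model) {
    merged.model = projectOverrides.model;
  }
  if (projectOverrides.temperature) {
    merged.temperature = Number(projectOverrides.temperature);
  }
  return merged;
}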
Selector Toggle in the Client
The Implementation Plans panel renders allowed models through a ModelSelectorToggle component. The toggle displays each allowed model, tracks the active selection, and checks whether the estimated prompt plus planned output tokens fit within the model's context window before allowing a switch.
Load Allowed Models
Component reads task config from context, filters to allowed models
Estimate Tokens
Call token estimation command with current prompt and selected model
Apply Guardrails
Disable models that cannot fit the prompt, show overage in tooltip
Allow Selection
User can switch between enabled models, selection persists to session
Overage Warning
If a model cannot support the total token requirement, the toggle disables the button and surfaces a tooltip with the computed overage, keeping reviewers within safe limits before sending work to an agent.
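A condensed sketch of the guardrail check each model button could perform before the toggle enables it; the helper and type names here are assumptions, not the ModelSelectorToggle internals.

// Illustrative per-model guardrail check for the selector buttons
interface SelectableModel {
  id: string;
  contextWindow: number;
}

function modelButtonState(
  model: SelectableModel,
  estimatedPromptTokens: number,
  plannedOutputTokens: number
): { disabled: boolean; tooltip?: string } {
  const total = estimatedPromptTokens + plannedOutputTokens;
  if (total > model.contextWindow) {
    const overage = total - model.contextWindow;
    return {
      disabled: true,
      tooltip: `Prompt exceeds the ${model.id} context window by ${overage} tokens`,
    };
  }
  return { disabled: false };
}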
Prompt Estimation
Token counts are calculated through the token estimation command. The panel submits the session ID, task description, relevant files, and selected model so the backend can return system, user, and total token values.
// Token estimation request/response
interface TokenEstimationRequest {
  session_id: string;
  task_description: string;
  selected_files: string[];
  model: string;
}

interface TokenEstimationResponse {
  system_tokens: number;
  user_tokens: number;
  total_tokens: number;
  model_context_window: number;
  model_max_output: number;
  remaining_capacity: number;
  estimated_cost: number;
}
Estimation Sources
- tiktoken for GPT models
- Anthropic tokenizer for Claude
- Character heuristics for others
Display in UI
- Token count badge on model selector
- Cost estimate in tooltip
- Progress bar for context usage
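One way the panel might turn the estimation response into that progress bar, sketched here under the assumption that the model's max output is reserved on top of the estimated prompt tokens:

// Illustrative derivation of context usage from a TokenEstimationResponse
function contextUsage(estimate: TokenEstimationResponse): { usedPercent: number; fits: boolean } {
  // Reserve the model's max output budget on top of the estimated prompt
  const planned = estimate.total_tokens + estimate.model_max_output;
  return {
    usedPercent: Math.min(100, Math.round((planned / estimate.model_context_window) * 100)),
    fits: planned <= estimate.model_context_window,
  };
}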
Extended Configuration Options
Beyond model selection, task configs can specify additional parameters that affect generation behavior.
Temperature
Controls randomness in generation. Lower values (0.1-0.3) for deterministic tasks like code generation, higher values (0.7-0.9) for creative tasks.
Top-P (Nucleus Sampling)
Alternative to temperature. Limits sampling to tokens comprising the top P probability mass. Typically set to 0.9-0.95.
Stop Sequences
Strings that terminate generation when encountered. Used to stop at specific markers like </plan> or [END].
System Prompt
Task-specific system prompts that set context and constraints. Can be customized per project.
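Taken together, these parameters could extend a task config roughly as follows; the field names are illustrative rather than the actual schema.

// Illustrative shape of the extended generation parameters (field names assumed)
interface ExtendedTaskConfig {
  temperature?: number;      // e.g. 0.2 for code generation, 0.8 for creative tasks
  topP?: number;             // nucleus sampling, typically 0.9-0.95
  stopSequences?: string[];  // e.g. ["</plan>", "[END]"]
  systemPrompt?: string;     // task-specific system prompt, customizable per project
}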
See how routing uses these configs
Provider routing shows how model configs determine where requests are sent and how usage is tracked.