Duplicate file prevention

Stop AI from creating duplicate files

AI coding tools frequently create duplicate files because they lack context about existing code structure. PlanToCode solves this with intelligent file discovery that maps your entire codebase before generating any code.

The Duplicate File Problem: Real Examples

Duplicate files are one of the most common and frustrating issues developers face when using AI coding assistants. When AI tools like Cursor, GitHub Copilot, or other code generation systems lack proper context about your existing codebase, they create new files instead of modifying existing ones. This leads to code fragmentation, merge conflicts, and hours of manual cleanup work.

Case Study: Cursor Issue #47028

A developer reported on the Cursor forum that when asking the AI to “update the authentication service,” Cursor created a new file src/services/auth-service-new.ts instead of modifying the existing src/services/authService.ts. This happened because the AI didn’t properly scan for existing implementations with similar naming patterns.

Impact: The developer spent 3 hours manually merging the duplicate code, resolving import conflicts across 15 files, and removing the duplicate. The project ended up with broken references in production because some imports still pointed to the old file path.

View Cursor forum discussion

Case Study: Cursor Issue #31402

Another documented case involved a React project where a developer asked to “add dark mode support.” Instead of modifying the existing components/ThemeProvider.tsx, Cursor created components/DarkModeProvider.tsx with overlapping functionality. The codebase ended up with two competing theme systems running simultaneously.

Impact: The duplicate theme providers caused state management conflicts, increased bundle size by 45KB, and created user experience bugs where theme preferences weren’t persisting correctly. The cleanup required a full refactoring sprint.

View Cursor forum discussion

Common Duplicate File Scenarios

• Creating utils-new.ts when helpers.ts exists with similar functions
• Generating apiClient2.ts instead of updating api/client.ts
• Making ButtonComponent.tsx when Button.tsx already exists
• Creating test-helper-updated.js instead of modifying testHelpers.js
• Duplicating configuration files like config-new.json or settings-v2.yaml

Why AI Tools Create Duplicate Files

Understanding the technical reasons behind duplicate file creation helps explain why this problem is so persistent across AI coding tools. It's not a simple bug; it's a fundamental architectural limitation of how most AI assistants interact with codebases.

1. Limited Context Window

Most AI coding assistants operate with a limited context window that can only “see” a small portion of your codebase at any given time. When you ask to create or modify a feature, the AI might only have access to the currently open files or a narrow slice of your project structure.

Technical Details: Even with large context windows (128K+ tokens), AI models still struggle with full-project awareness. A typical medium-sized project with 500 files could require 2-5 million tokens to read in full (assuming roughly 4,000 to 10,000 tokens per file), many times what a 128K window can hold. This forces AI tools to make educated guesses about file locations rather than having complete knowledge.
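The arithmetic behind that estimate can be checked with a short sketch. The per-file token counts (4,000 and 10,000) are assumptions chosen to reproduce the 2-5 million range quoted above; real projects vary widely.

```python
# Rough context-budget arithmetic. Per-file token counts are illustrative assumptions.
FILES = 500
CONTEXT_WINDOW = 128_000  # tokens


def total_tokens(files: int, tokens_per_file: int) -> int:
    """Tokens needed to read every file once."""
    return files * tokens_per_file


def windows_needed(total: int, window: int = CONTEXT_WINDOW) -> int:
    """How many full context windows the project would fill (ceiling division)."""
    return -(-total // window)


low = total_tokens(FILES, 4_000)    # 2,000,000 tokens
high = total_tokens(FILES, 10_000)  # 5,000,000 tokens

print(low, high)
print(windows_needed(low), windows_needed(high))  # 16 and 40 windows of 128K
```

Even at the low end, the project is about 16 times larger than a single 128K window.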

2. Incomplete File Discovery

When AI tools do attempt file discovery, they often use shallow methods like searching currently open files, recently accessed files, or basic pattern matching. These approaches miss files that aren’t actively open or have non-standard naming conventions.

Example: If your authentication service is named authService.ts but the AI searches for "auth-service*" or a case-sensitive "Auth*", or only looks in specific directories, the search comes back empty. The AI then concludes the file doesn’t exist and creates a duplicate.
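A shallow search of this kind can be sketched as follows. This is a minimal illustration, assuming a search that matches file names case-sensitively inside a single directory; the file list and both helper names are made up for the example and do not describe any particular tool.

```python
from fnmatch import fnmatchcase

# Hypothetical project files, for illustration only.
files = [
    "src/services/authService.ts",
    "src/middleware/AuthGuard.ts",
    "lib/authentication/index.ts",
]


def shallow_search(paths, pattern, directory):
    """Case-sensitive match on the file name, restricted to one directory."""
    hits = []
    for path in paths:
        folder, _, name = path.rpartition("/")
        if folder == directory and fnmatchcase(name, pattern):
            hits.append(path)
    return hits


def broad_search(paths, pattern):
    """Case-insensitive match on the file name, across every directory."""
    lowered = pattern.lower()
    return [p for p in paths if fnmatchcase(p.rpartition("/")[2].lower(), lowered)]


print(shallow_search(files, "Auth*", "src/services"))  # []
print(broad_search(files, "Auth*"))
```

The shallow search returns nothing even though authService.ts sits in the directory it inspected. The broad search finds both auth files, although it still misses lib/authentication/index.ts, because it only looks at file names and not at directory names.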

3. Naming Convention Mismatches

Different projects use different naming conventions: camelCase, PascalCase, kebab-case, snake_case, or custom patterns. AI tools often struggle to recognize that user-service.ts, UserService.ts, and user_service.ts are all potential matches for a "user service" file.

Real Impact: In polyglot projects mixing multiple languages (TypeScript, Python, Go), naming conventions vary by language ecosystem. An AI trained primarily on JavaScript patterns might fail to recognize equivalent Python modules, leading to cross-language duplicates.
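One way to make these variants comparable is to reduce each file name to a list of lowercase words. The sketch below is a hypothetical helper, not a description of how any specific tool works; name_key is an invented name.

```python
import re
from pathlib import PurePosixPath


def name_key(path: str) -> tuple:
    """Reduce a file name to lowercase words, ignoring extension and naming style."""
    stem = PurePosixPath(path).stem
    # Break camelCase / PascalCase boundaries: "UserService" -> "User Service".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", stem)
    # Split on spaces, underscores, hyphens, and dots.
    return tuple(word.lower() for word in re.split(r"[\s_\-.]+", spaced) if word)


for candidate in ["user-service.ts", "src/UserService.ts", "user_service.py"]:
    print(candidate, name_key(candidate))
```

All three paths reduce to ("user", "service"), so a lookup keyed this way treats them as candidates for the same "user service" file, including the Python module in a mixed-language project.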

4. No Pre-execution Validation

Most AI coding tools execute changes immediately without a review step. They generate code and apply it directly to your filesystem. By the time you realize a duplicate was created, the damage is already done. There's no opportunity to catch the mistake before execution.

Workflow Problem: Traditional AI assistants follow a "generate → apply" pattern. Without a "generate → review → apply" workflow, developers have no chance to verify file paths, check for duplicates, or validate the AI’s understanding of the codebase structure before changes are written to disk.
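The difference between the two patterns comes down to a gate between generating changes and writing them. The following is a minimal sketch of such a gate; the Change type, the apply_reviewed function, and the approve callback are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class Change:
    action: str   # "create" or "modify"
    path: str     # path relative to the project root
    content: str  # full file content after the change


def apply_reviewed(root: Path, plan: List[Change],
                   approve: Callable[[List[Change]], bool]) -> bool:
    """Write the plan to disk only if the reviewer approves it."""
    if not approve(plan):
        return False  # rejected: nothing touches the filesystem
    for change in plan:
        target = root / change.path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(change.content, encoding="utf-8")
    return True


def reject_new_files(plan: List[Change]) -> bool:
    """Example reviewer: refuse any plan that creates a file."""
    return all(change.action != "create" for change in plan)
```

Calling apply_reviewed(root, plan, reject_new_files) leaves the project untouched whenever the plan contains a "create" entry, which gives a human (or a stricter automated check) the chance to look at the path first.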

5. Conflict Avoidance Bias

AI models often appear to take a safety-first approach: when uncertain whether a file exists or what its exact path is, they default to creating a new file rather than risking overwriting existing code. This "better safe than sorry" bias leads to duplicate file proliferation.

Training Incentives: One plausible explanation is that models are penalized more heavily during training for destructive actions (overwriting important code) than for conservative ones (creating unnecessary duplicates). An asymmetric penalty of this kind would encourage duplicate creation as the "safer" option.

How PlanToCode Prevents Duplicate Files

PlanToCode fundamentally changes the workflow with a planning-first approach. Instead of immediately generating and executing code, PlanToCode uses a comprehensive file discovery system that maps your entire codebase structure before proposing any changes. This architectural difference targets the root causes of duplicate file creation described above.

Comprehensive File Discovery

PlanToCode runs a 4-stage file discovery workflow before generating any implementation plan. This workflow uses git integration, regex filtering, AI-powered relevance assessment, and relationship analysis to build a complete map of your codebase.

Discovery Process:

Stage 1: Validate git repository and root folder
Stage 2: Generate task-specific regex patterns
Stage 3: AI relevance assessment of file contents
Stage 4: Extended path discovery via relationships

This deep discovery means PlanToCode knows about authService.ts, auth-helpers.ts, and authentication/ directories before suggesting any changes. With that context in hand, it has no reason to guess at a new file name when an existing file already covers the task.
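To give a feel for what "path discovery via relationships" (Stage 4) could involve, here is a small sketch that follows relative imports outward from a set of seed files. It is an assumption about one possible approach, not PlanToCode's actual implementation; the function name, the regex, and the in-memory sources dictionary are invented for the example.

```python
import posixpath
import re

# Matches the module path in statements like: import { x } from './y'
IMPORT_RE = re.compile(r"""from\s+['"](\.{1,2}/[^'"]+)['"]""")


def expand_by_imports(seeds, sources, extensions=(".ts", ".tsx")):
    """Breadth-first walk over relative imports, starting from the seed files.

    `sources` maps project-relative paths to file contents.
    """
    found, queue = set(seeds), list(seeds)
    while queue:
        current = queue.pop(0)
        base = posixpath.dirname(current)
        for spec in IMPORT_RE.findall(sources.get(current, "")):
            resolved = posixpath.normpath(posixpath.join(base, spec))
            for ext in extensions:
                candidate = resolved + ext
                if candidate in sources and candidate not in found:
                    found.add(candidate)
                    queue.append(candidate)
    return sorted(found)


sources = {
    "src/services/authService.ts": "import { hash } from './auth-helpers';\n"
                                   "import { cfg } from '../config/auth';",
    "src/services/auth-helpers.ts": "import { log } from '../util/log';",
    "src/config/auth.ts": "",
    "src/util/log.ts": "",
    "src/unrelated.ts": "",
}
print(expand_by_imports(["src/services/authService.ts"], sources))
```

Starting from authService.ts alone, the walk picks up the helper, the config file, and the logger, and leaves src/unrelated.ts out.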

Review Before Execution

Unlike tools that immediately apply changes, PlanToCode generates a detailed implementation plan that you review in the Monaco editor before any code touches your filesystem. You see exactly which files will be created, modified, or deleted.

Plan Contents Include:

• Complete list of files to be modified
• New files to be created with full paths
• Specific changes with before/after context
• Token count estimates per operation
• Dependencies and import updates needed

This review step lets you catch duplicates before execution. If you see the plan wants to create auth-new.ts, you can reject it and refine the discovery scope.
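Part of this review can be automated. The sketch below assumes a plan is available as a list of operations with an action and a path (a hypothetical shape, not PlanToCode's plan format) and flags any proposed new file whose path matches an existing one once letter case is ignored.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Operation:
    action: str  # "create", "modify", or "delete"
    path: str


def colliding_creates(plan: List[Operation], existing: List[str]) -> Dict[str, str]:
    """Map each proposed new path to an existing path that is identical
    apart from letter case (exact matches are included)."""
    by_folded = {p.casefold(): p for p in existing}
    return {
        op.path: by_folded[op.path.casefold()]
        for op in plan
        if op.action == "create" and op.path.casefold() in by_folded
    }


plan = [
    Operation("create", "src/services/AuthService.ts"),
    Operation("modify", "src/app.ts"),
    Operation("create", "src/services/jwt.ts"),
]
existing = ["src/services/authService.ts", "src/app.ts"]
print(colliding_creates(plan, existing))
```

Here AuthService.ts is flagged against the existing authService.ts, while jwt.ts passes as a genuinely new path. A check like this only catches near-identical paths; names such as auth-new.ts still need a human eye.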

Intelligent Pattern Matching

PlanToCode's regex generation stage creates patterns that account for multiple naming conventions, case variations, and common ways of organizing files. It understands that a request to “update the user service” should match userService.ts, user-service.ts, UserService.ts, or services/user/.

Advanced Matching: The system uses AI to generate context-aware regex patterns rather than relying on simple string matching. For a task like "add JWT validation," it generates patterns covering names such as auth*, jwt*, token*, and middleware/auth*.
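For comparison, here is what a hand-written pattern covering those naming variants might look like. This is an illustrative sketch only: the text above describes patterns produced by AI, whereas this helper (variant_pattern, an invented name) builds one mechanically from a list of words.

```python
import re
from itertools import permutations


def variant_pattern(words):
    """Case-insensitive regex matching the words in any order, optionally
    pluralized, joined by nothing or by one of '-', '_', '.', '/'."""
    alternatives = [
        r"[-_./]?".join(re.escape(word) + "s?" for word in order)
        for order in permutations(words)
    ]
    return re.compile("|".join(alternatives), re.IGNORECASE)


pattern = variant_pattern(["user", "service"])
for path in ["userService.ts", "user-service.ts", "UserService.ts",
             "services/user/", "userProfile.ts"]:
    print(path, bool(pattern.search(path)))
```

The first four paths match and userProfile.ts does not. Trying every word order is what lets the directory form services/user/ match alongside the file-name forms.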

Git-Aware File Tracking

The file discovery workflow integrates directly with git to respect .gitignore rules and to list both tracked and untracked files. This git integration ensures PlanToCode sees your actual working tree, including recently created files that might not be committed yet.

Command Used: git ls-files --cached --others --exclude-standard captures all tracked files plus untracked files that aren’t ignored, giving PlanToCode a complete view of your codebase state including work-in-progress files.
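The same command can be called from a script. The sketch below wraps it with Python's subprocess module. The -z flag is an addition not present in the command quoted above; it makes git separate paths with NUL bytes so unusual file names parse safely. Both function names are hypothetical.

```python
import subprocess
from typing import List


def parse_ls_files(raw: bytes) -> List[str]:
    """Split NUL-separated `git ls-files -z` output into sorted, unique paths."""
    return sorted({
        entry.decode("utf-8", "surrogateescape")
        for entry in raw.split(b"\0")
        if entry
    })


def list_working_tree(repo_root: str) -> List[str]:
    """Tracked files plus untracked files that are not ignored."""
    result = subprocess.run(
        ["git", "ls-files", "--cached", "--others", "--exclude-standard", "-z"],
        cwd=repo_root,
        capture_output=True,
        check=True,  # raise if git fails, e.g. when repo_root is not a repository
    )
    return parse_ls_files(result.stdout)
```

Calling list_working_tree(".") inside a repository returns every path an assistant should know about before proposing a new file, including files created a minute ago and not yet committed.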

Before & After: AI Without Planning vs. With PlanToCode

Without PlanToCode

1. User: "Add JWT validation to authentication". The AI has limited context and sees only the currently open files.

2. The AI searches and doesn’t find the existing auth files. It misses src/services/authService.ts due to a naming/path mismatch.

3. The AI immediately creates jwtValidation.ts. There is no review step; changes are applied directly to the filesystem.

4. Result: duplicate file created. The project now has both authService.ts and jwtValidation.ts with overlapping functionality.

Manual cleanup required:

• Merge duplicate code manually
• Update all import references
• Fix broken tests and dependencies
• Time wasted: 2-4 hours

With PlanToCode

1. User: "Add JWT validation to authentication". The file discovery workflow starts automatically.

2. The 4-stage discovery maps the entire codebase. It finds authService.ts, auth-helpers.ts, and related config files.

3. PlanToCode generates an implementation plan for review. The plan shows that it will modify the existing authService.ts, with no duplicates.

4. You review and approve the plan. You see the exact changes before any code touches the filesystem.

5. Result: clean, targeted modifications. JWT validation is added to the existing authService.ts, and no duplicates are created.

Benefits achieved:

• Zero duplicate files created
• Clean modification to existing code
• All imports remain valid
• Time saved: 2-4 hours

Getting Started: Stop Creating Duplicates Today

Step 1: Install PlanToCode Desktop

Download the PlanToCode desktop application for your platform. The file discovery workflow and implementation planning features are built directly into the desktop client.

Step 2: Configure Your Project Root

Open PlanToCode and select your project's root directory. PlanToCode will validate git repository status and establish the base directory for all file operations. Configure any custom exclusion patterns for directories you want to skip (node_modules, dist, build, etc.).

Tip: The default exclusion patterns already cover common directories like node_modules, .git, and build artifacts. You only need to customize if your project has unusual directory structures.
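Directory exclusion of this kind is straightforward to reason about. The sketch below is an assumption about how such a filter could behave, not PlanToCode's configuration format; the default set and the function names are illustrative.

```python
from pathlib import PurePosixPath

# Assumed defaults, for illustration only.
DEFAULT_EXCLUDES = frozenset({"node_modules", ".git", "dist", "build"})


def is_excluded(path: str, excluded_dirs=DEFAULT_EXCLUDES) -> bool:
    """True when any directory component of the path is in the exclusion set."""
    directories = PurePosixPath(path).parts[:-1]  # drop the file name itself
    return any(part in excluded_dirs for part in directories)


def filter_paths(paths, excluded_dirs=DEFAULT_EXCLUDES):
    """Keep only the paths that sit outside every excluded directory."""
    return [p for p in paths if not is_excluded(p, excluded_dirs)]


paths = ["src/app.ts", "node_modules/x/index.js",
         "packages/web/dist/main.js", "build.ts", "vendor/lib.go"]
print(filter_paths(paths))
print(filter_paths(paths, DEFAULT_EXCLUDES | {"vendor"}))
```

Matching on directory components rather than substrings means a file called build.ts survives, while anything under a nested dist/ folder is skipped. Adding "vendor" to the set shows how a project-specific exclusion could be layered on top of the defaults.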

Step 3: Describe Your Task

Enter a natural language description of what you want to accomplish. For example: "Add JWT validation to the authentication service" or "Implement dark mode support in the theme provider." Be as specific as possible about the functionality you want.

Good Task Descriptions:

• "Add Redis caching to the user profile API endpoint"
• "Implement WebSocket connection management in the chat service"
• "Add input validation to all form components"
• "Update database migration to add user roles table"

Step 4: Review the File Discovery

PlanToCode will run the 4-stage file discovery workflow in the background. You'll see real-time progress updates as it discovers relevant files. The workflow typically completes in 30-90 seconds depending on codebase size.

Once complete, review the list of discovered files. You'll see which files PlanToCode identified as relevant to your task. This is your first checkpoint to ensure the system has proper context about existing files.

Step 5: Review the Implementation Plan

PlanToCode generates a detailed implementation plan based on the discovered files. Open the plan in the Monaco editor and carefully review:

• Which files will be modified (look for existing file paths)
• Which files will be created (verify these are genuinely new files needed)
• The specific code changes proposed for each file
• Import statements and dependency updates

Checkpoint: If you see any file creation that looks like a duplicate (e.g., auth-new.ts or UserService2.tsx), stop here. Refine your task description or manually adjust the file list before proceeding.
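A quick heuristic can help spot such names in a long plan. The sketch below flags file names ending in a duplicate-style suffix such as -new, -updated, -v2, or a trailing number. The suffix list and the function name are assumptions for illustration, and a heuristic like this will also flag legitimate names (md5.ts, oauth2.ts), so treat a hit as a prompt to look closer, not as proof.

```python
import re
from pathlib import PurePosixPath

# A separator followed by a "duplicate-style" word, or trailing digits.
SUFFIX_RE = re.compile(r"(?:[-_.](?:new|updated|copy|v\d+)|\d+)$", re.IGNORECASE)


def suspicious_base(path: str):
    """Return the file stem with a duplicate-style suffix removed,
    or None when the name carries no such suffix."""
    stem = PurePosixPath(path).stem
    base = SUFFIX_RE.sub("", stem)
    return base if base and base != stem else None


for proposed in ["auth-new.ts", "UserService2.tsx", "settings-v2.yaml",
                 "src/services/authService.ts"]:
    print(proposed, suspicious_base(proposed))
```

The first three names are flagged with the bases auth, UserService, and settings, which are the names to search for among existing files. authService.ts is not flagged.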

Step 6: Execute with Confidence

Once you've reviewed and approved the plan, copy the implementation instructions to your preferred AI coding tool (Cursor, Copilot, Claude, etc.) or execute directly via the integrated terminal. Because PlanToCode has already done the heavy lifting of file discovery and planning, execution becomes a straightforward process of applying well-defined changes.

Frequently Asked Questions

Does PlanToCode work with Cursor and GitHub Copilot?

Yes. PlanToCode is designed as a planning layer that works alongside your existing AI coding tools. You use PlanToCode to discover files and generate implementation plans, then execute those plans using Cursor, GitHub Copilot, Claude Code, or any other AI assistant. The file discovery and planning prevent duplicates regardless of which tool executes the code.

How long does the file discovery workflow take?

File discovery typically completes in 30-90 seconds for medium-sized projects (500-2000 files). Very large monorepos with 10,000+ files may take 2-3 minutes. The workflow runs in the background, so you can continue working while it executes, and progress updates appear in real time.

What if I have a huge codebase? Will discovery time out?

PlanToCode includes intelligent timeout management and caching mechanisms. For extremely large codebases, you can configure custom timeout values and use exclusion patterns to skip irrelevant directories (vendor code, generated files, etc.). The system also caches discovery results per session, so subsequent plans in the same session reuse the cached file context.
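Session-level caching can be pictured with a small sketch. This is a guess at the general idea, not PlanToCode's implementation: the cache key (project root plus normalized task text), the class name, and the discover callback are all assumptions.

```python
from typing import Callable, Dict, List, Tuple


class DiscoveryCache:
    """In-memory cache that lives for one session.

    Keyed by (project root, normalized task description); the choice of key
    is an assumption made for this sketch.
    """

    def __init__(self, discover: Callable[[str, str], List[str]]):
        self._discover = discover
        self._results: Dict[Tuple[str, str], List[str]] = {}

    def get(self, root: str, task: str) -> List[str]:
        key = (root, task.strip().lower())
        if key not in self._results:
            # Cache miss: run the (slow) discovery once and remember the result.
            self._results[key] = self._discover(root, task)
        return self._results[key]


calls = []


def fake_discover(root: str, task: str) -> List[str]:
    """Stand-in for a real discovery run; records each invocation."""
    calls.append(task)
    return ["src/services/authService.ts"]


cache = DiscoveryCache(fake_discover)
cache.get("/repo", "Add JWT validation")
cache.get("/repo", "  add jwt validation ")
print(len(calls))  # 1: the second request reused the cached result
```

Because the cache lives in memory, it disappears when the session ends, which matches the per-session behavior described above.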

Can I still create genuinely new files when needed?

Absolutely. PlanToCode's file discovery doesn’t prevent creating new files; it prevents creating duplicate files. When your task genuinely requires a new file (like adding a completely new feature module), PlanToCode will propose creating it in the implementation plan. The difference is you'll see the proposal and can verify it's truly new functionality rather than an accidental duplicate.

Does this work for non-JavaScript projects?

Yes. PlanToCode's file discovery is language-agnostic. It works with Python, Go, Rust, Java, TypeScript, JavaScript, Ruby, PHP, C++, and any other text-based codebase. The regex generation and AI relevance assessment adapt to the specific languages and frameworks in your project based on the task description and discovered file extensions.

What happens if the AI still proposes a duplicate in the plan?

This is rare because the file discovery provides comprehensive context, but if it happens, you'll catch it during the review step. Simply reject the plan, refine your task description (be more specific about which existing files to modify), or manually adjust the file selection. The key advantage is catching duplicates before execution rather than after the damage is done.

Is there a cost for running file discovery?

File discovery does use AI for the relevance assessment stage (Stage 3), which incurs small API costs. However, the cost is minimal (typically $0.01-0.05 per discovery run) and the system provides cost estimates before execution. The investment is worthwhile compared to the 2-4 hours of manual cleanup time saved by preventing duplicates.

Can I use PlanToCode for refactoring existing duplicates?

Yes. If you already have duplicate files in your codebase, you can use PlanToCode to plan their consolidation. Describe the task as "Merge duplicate authentication services into authService.ts" or similar. The file discovery will find all related files, and the implementation plan will show you exactly how to consolidate them cleanly.

Stop Creating Duplicate Files Today

File discovery before execution. Review before application. Zero duplicates. This is how AI-assisted development should work: intelligent, preventive, clean.
