Liflex/ProjectEvolve

🧪 ProjectEvolve

Autonomous AI-Powered Research and Project Improvement System

Russian documentation → README_RU.md



💡 Philosophy

"Give an AI agent a real project and let it experiment autonomously." - Inspired by karpathy/autoresearch

Differences from the Original

In the original karpathy/autoresearch, an AI agent researches neural network training (nanochat), modifying only train.py and optimizing a single metric, val_bpb.

ProjectEvolve extends this idea to any project:

  • Any programming language (Python, JavaScript, Go, Rust, ...)
  • Any task type (backend, frontend, DevOps, documentation, ...)
  • Any files and directories (full freedom of action)
  • Cross-platform (Windows, Linux, macOS)
  • Knowledge persistence across runs

What it inherits: the agent works autonomously, iteratively improves the project, keeps successful changes, and discards failures.


📖 Overview

ProjectEvolve is a universal tool for running an AI agent on any project. The agent autonomously analyzes code, proposes improvements, makes changes, and learns from previous experiments.

What Does ProjectEvolve Do?

  1. Analyzes: studies the project structure, code, and documentation
  2. Proposes: generates improvement ideas
  3. Implements: makes changes to code, structure, and docs
  4. Tests: checks that nothing breaks
  5. Accumulates: the next iteration sees the previous results
  6. Repeats: the cycle continues autonomously
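
The six steps above can be sketched as a simple loop. This is only an illustration under stated assumptions, not ProjectEvolve's actual code: `run_agent` and `run_tests` are hypothetical callables standing in for the Claude CLI call and the project's test command.

```python
from typing import Callable


def evolve(
    iterations: int,
    run_agent: Callable[[str], str],  # hypothetical: receives context, returns a change summary
    run_tests: Callable[[], bool],    # hypothetical: True when the project still works
) -> list[str]:
    """Run the analyze / propose / implement / test / accumulate cycle."""
    context: list[str] = []  # knowledge carried from one iteration to the next
    for number in range(1, iterations + 1):
        # Steps 1-3: the agent sees everything accumulated so far.
        summary = run_agent("\n".join(context))
        # Step 4: keep the change only if the tests still pass.
        verdict = "kept" if run_tests() else "discarded"
        # Step 5: record the outcome so the next iteration can learn from it.
        context.append(f"Experiment {number}: {summary} ({verdict})")
    return context
```

Each call to `run_agent` receives the full history, which is how a later experiment can avoid repeating an idea that was already discarded.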

🎯 Why ProjectEvolve?

  • 🔄 Autonomous experiments: the AI independently analyzes, proposes, and implements improvements
  • 📚 Knowledge accumulation: each iteration sees previous results, building project knowledge
  • ⚡ Universality: works with Python, JavaScript, Go, Rust, and any other technology
  • 🎨 Flexible setup: a simple questionnaire adapts the agent to your project
  • 🌐 Cross-platform: Windows, Linux, macOS
  • 🔧 Zero maintenance: the agent handles everything

💡 How Does It Work?

┌─────────────────┐      ┌──────────────┐      ┌─────────────┐
│  Your Project   │─────▶│ ProjectEvolve│─────▶│  AI Agent   │
│  (any language) │      │  (script)    │      │  (Claude)   │
└─────────────────┘      └──────────────┘      └─────────────┘
                                │                      │
                                ▼                      ▼
                         ┌──────────────┐      ┌─────────────┐
                         │ Configuration│      │ Experiment  │
                         │ .autoresearch│      │ #1, #2, #3  │
                         └──────────────┘      └─────────────┘
                                │                      │
                                ▼                      ▼
                         ┌──────────────┐      ┌─────────────┐
                         │  Improvements│◀─────│  Context    │
                         │  code/docs   │      │  accumulates│
                         └──────────────┘      └─────────────┘

🎨 Features

✨ What can ProjectEvolve do?

  • 🔍 Analyze: studies project structure, code, documentation
  • 💡 Propose: generates improvement ideas
  • 🔨 Implement: makes changes to code, structure, documentation
  • 🧪 Quality Loop: built-in self-testing with quantitative metrics
  • 📊 Evaluate: automatic scoring (0.0-1.0) with pass/fail decisions
  • 📝 Document: updates README, creates new documentation
  • 🔄 Iterate: each iteration learns from previous ones

🌐 Cross-platform Support

| Platform | Support | Launch command         |
|----------|---------|------------------------|
| Windows  | ✅ Full | autoresearch.bat       |
| Linux    | ✅ Full | python autoresearch.py |
| macOS    | ✅ Full | python autoresearch.py |

🔧 Technologies

  • Python 3.10+
  • Claude CLI (Anthropic)
  • Git (optional)

🔄 Quality Loop

ProjectEvolve includes a built-in self-testing system inspired by quality gates:

How the Quality Loop Works

┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│   Generate   │─────▶│    Apply     │─────▶│   Evaluate   │
│   Idea       │      │   Changes    │      │   (Score)    │
└──────────────┘      └──────────────┘      └──────┬───────┘
                                                   │
                                                   ▼
                                            ┌──────────────┐
                                            │   Decision   │
                                            │  KEEP/DISCARD│
                                            └──────────────┘
                                                   │
                              ┌────────────────────┘
                              │ (if kept)
                              ▼
                      ┌──────────────┐
                      │   Next Iter. │
                      └──────────────┘

Quality Gate Features

  • Universal: works with Python, JavaScript, Go, Rust, Ruby, Java, or any other language
  • Auto-detect: automatically finds test commands (npm test, pytest, cargo test, etc.)
  • Quantitative: scores 0.0-1.0 with pass/fail decisions
  • Two-phase: Phase A (base quality, 70% threshold) → Phase B (strict quality, 85% threshold)
  • Automatic: runs tests after each experiment and decides whether to keep or discard changes
Running Quality Loop

# Standalone quality check
python F:/IdeaProjects/autoresearch/utils/quality_loop.py --project /path/to/project

# Custom thresholds
python utils/quality_loop.py --project . --threshold-a 0.7 --threshold-b 0.85

# JSON output for parsing
python utils/quality_loop.py --project . --json

Quality Configuration

Configuration file .autoresearch/quality.yml is created automatically:

metrics:
  tests:
    enabled: true
    command: ""  # Auto-detect: npm test, pytest, cargo test, etc.
  build:
    enabled: false
    command: ""  # Auto-detect: npm run build, cargo build, etc.

thresholds:
  a:
    min_score: 0.7  # Phase A threshold
    required_checks: ["tests"]
  b:
    min_score: 0.85  # Phase B threshold
    required_checks: ["tests", "build"]
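
An empty `command` means the test command is auto-detected. One plausible way to do that is to look for well-known marker files in the project root, as sketched below. The marker-to-command mapping and the function name `detect_test_command` are assumptions for illustration; the real detection rules may differ.

```python
from pathlib import Path
from typing import Optional

# Assumed mapping from marker file to test command (illustrative only).
MARKERS = [
    ("package.json", "npm test"),
    ("Cargo.toml", "cargo test"),
    ("pyproject.toml", "pytest"),
    ("pytest.ini", "pytest"),
]


def detect_test_command(project: Path, configured: str = "") -> Optional[str]:
    """Prefer the configured command; otherwise guess from marker files."""
    if configured:
        return configured
    for marker, command in MARKERS:
        if (project / marker).is_file():
            return command
    return None  # nothing recognizable: the caller must decide what to do
```

Returning `None` rather than a default keeps the "no tests found" case explicit, so a missing test suite is not silently scored as a pass.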

Decision Logic

Keep changes if:

  • ✅ Score ≥ baseline + 0.05 (improvement)
  • ✅ All required checks pass
  • ✅ No critical failures

Discard changes if:

  • ❌ Score decreased
  • ❌ Critical tests fail
  • ❌ Violates project constraints

Manual review if:

  • ⚠️ Score ~ baseline (minimal change)
  • ⚠️ Some non-critical tests fail
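
The three outcomes can be expressed as one small function. This is a sketch of the rules as listed, not the project's implementation: the name `decide` and its parameters are hypothetical, and constraint violations are folded into the `critical_failure` flag for brevity.

```python
def decide(
    score: float,
    baseline: float,
    required_checks_pass: bool,
    critical_failure: bool = False,
    margin: float = 0.05,
) -> str:
    """Return 'keep', 'discard', or 'review' for one experiment."""
    # Discard: the score dropped or something critical broke.
    if critical_failure or score < baseline:
        return "discard"
    # Keep: a clear improvement with every required check passing.
    if required_checks_pass and score >= baseline + margin:
        return "keep"
    # Everything else (roughly unchanged score, non-critical failures)
    # goes to manual review.
    return "review"
```

For example, `decide(0.80, 0.70, True)` returns `"keep"`, while `decide(0.72, 0.70, True)` returns `"review"` because the gain is below the 0.05 margin.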

⚠️ Important: Claude Code Permissions

ProjectEvolve requires Claude Code to run with appropriate permissions.

Permissions Mode

  • ✅ "bypass permissions on": recommended. No approvals are needed, so the agent runs with full autonomy
  • ⚠️ Other modes (auto/manual): may require permission approvals during execution
  • ❌ Risk: the agent may hang while waiting for the user to approve tool usage

Required Tools

The ProjectEvolve agent needs these tools to be approved (if not in bypass mode):

Core tools:

  • Edit: modify files
  • Read: read file contents
  • Write: create new files
  • Glob: find files by pattern
  • Grep: search file contents

Optional tools:

  • Bash: execute shell commands (for the Quality Loop)
  • Agent: spawn sub-agents

Recommended Settings

Option 1: Bypass Mode (Recommended)

{
  "permissionMode": "bypass"
}

Option 2: Auto-Approve Safe Tools

{
  "permissionMode": "auto",
  "autoApproveSafeTools": true,
  "alwaysAllowTools": ["Edit", "Read", "Write", "Glob", "Grep"]
}

If the agent hangs during experiment execution:

  1. Check if a permission prompt is waiting for approval
  2. Approve the required tool (Edit, Read, Write, etc.)
  3. Or switch to bypass mode for full autonomy

Note: All scripts have a 30-minute timeout per experiment. If Claude CLI hangs (e.g., on permission prompts), the experiment will time out and continue to the next iteration. Check logs for timeout errors.
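
A per-experiment timeout of this kind can be built on `subprocess.run` and its `timeout` argument. The sketch below is an assumption about the general approach rather than the code in autoresearch.py. The helper name `run_with_timeout` is hypothetical, and the command to run is supplied by the caller.

```python
import subprocess

EXPERIMENT_TIMEOUT_SECONDS = 30 * 60  # the 30-minute limit mentioned above


def run_with_timeout(command: list[str], timeout_seconds: float = EXPERIMENT_TIMEOUT_SECONDS) -> str:
    """Run one experiment command and report 'ok', 'failed', or 'timeout'."""
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"  # e.g. the CLI was stuck on a permission prompt
    return "ok" if result.returncode == 0 else "failed"
```

Returning a status instead of raising lets the outer loop log the timeout and move on to the next iteration.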


🚀 Quick Start

Step 1️⃣: Environment Setup by AI Agent

For the AI agent: read INSTALL.md and configure the system:

# AI agent sets up the environment
# (install Python, Node.js, Claude CLI, create directories)

The AI agent will:

  • ✅ Detect the OS (Windows/Linux/macOS)
  • ✅ Install missing dependencies
  • ✅ Create required directories
  • ✅ Verify the installation

See INSTALL.md: it contains the cross-platform setup instructions for the AI agent.

Step 2️⃣: Run ProjectEvolve

After environment setup, run the script:

# Basic run (10 iterations, 5 min interval)
python F:/IdeaProjects/autoresearch/autoresearch.py --project /path/to/project

# With parameters
python F:/IdeaProjects/autoresearch/autoresearch.py --project . --iter 50 --timeout 2

# Windows (via bat-file)
F:/IdeaProjects/autoresearch/autoresearch.bat . 50 2

Parameters:

  • --project: path to your project
  • --iter: number of iterations (default: 10)
  • --timeout: interval between iterations in minutes (default: 5)
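
These three parameters map naturally onto `argparse`. The parser below is a sketch of the long form only, using the defaults listed above. It is not taken from autoresearch.py, and the current-directory default for `--project` is an assumption based on the usage examples.

```python
import argparse


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse the long-form options with the documented defaults."""
    parser = argparse.ArgumentParser(prog="autoresearch.py")
    parser.add_argument("--project", default=".", help="path to your project")
    parser.add_argument("--iter", type=int, default=10, help="number of iterations")
    parser.add_argument("--timeout", type=int, default=5,
                        help="interval between iterations in minutes")
    return parser.parse_args(argv)
```

Calling `parse_args(["--project", ".", "--iter", "50", "--timeout", "2"])` yields a namespace with `iter == 50` and `timeout == 2`.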

📂 Project Structure

autoresearch/
├── autoresearch.py          # Main script
├── autoresearch.bat         # Windows launcher
├── INSTALL.md               # Installation guide (for AI)
├── README.md                # This file (English main)
├── README_RU.md             # Russian version (full)
├── QUICKSTART.md            # Quick guide
├── config/
│   ├── default_prompt.md    # Agent prompt template
│   └── quality.yml          # Quality gate configuration
├── utils/
│   ├── cli_setup.py         # Interactive setup
│   └── quality_loop.py      # Quality loop implementation
└── .gitignore               # Git ignore

What's Created in Your Project

your-project/
├── .autoresearch/
│   ├── .autoresearch.json           # Project configuration
│   ├── quality.yml                  # Quality gate configuration (auto-created)
│   ├── experiments/
│   │   ├── prompt_1.md
│   │   ├── output_1.md
│   │   ├── accumulation_context.md  # Accumulated context
│   │   ├── last_experiment.md       # Last experiment
│   │   ├── changes_log.md           # Changes log
│   │   └── summary.json             # Final summary
│   └── logs/
│       └── autoresearch.log         # Run logs

🔧 Configuration

First Run

========================================================================
   ProjectEvolve - First Time Setup
========================================================================

Project: /path/to/your-project

Project name: My Awesome App
Short description: Web app for task management

Project goals (one per line):
  > Improve performance
  > Add tests
  > Update documentation
  > [Enter]

Constraints (optional):
  > Don't change API
  > [Enter]

✓ Configuration saved!

Configuration File (.autoresearch.json)

{
  "name": "My Awesome App",
  "description": "Web app for task management",
  "goals": [
    "Improve performance",
    "Add tests",
    "Update documentation"
  ],
  "constraints": [
    "Don't change API"
  ],
  "tech_stack": ["Python", "FastAPI", "PostgreSQL"],
  "focus_areas": ["performance", "testing", "documentation"]
}
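
To show how a file like this could feed the agent, the sketch below turns the JSON into a short plain-text brief. How ProjectEvolve actually builds its prompt from the configuration is not documented here, so `render_project_brief` and its output layout are purely hypothetical.

```python
import json


def render_project_brief(config_text: str) -> str:
    """Turn the JSON configuration into a plain-text brief for a prompt."""
    config = json.loads(config_text)
    lines = [f"Project: {config['name']}"]
    if config.get("description"):
        lines.append(f"Description: {config['description']}")
    # Only emit a section when it has at least one entry.
    for title, key in (("Goals", "goals"), ("Constraints", "constraints")):
        items = config.get(key, [])
        if items:
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)
```

Optional keys are read with `config.get`, so a configuration without constraints still renders cleanly.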

📊 Usage Examples

Example 1: Quick Test (Short Form)

# Short form: 3 experiments, 1 minute interval
python F:/IdeaProjects/autoresearch/autoresearch.py . 3 1

# Long form: same as above
python F:/IdeaProjects/autoresearch/autoresearch.py --project . --iter 3 --timeout 1
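
The short form packs the same settings into positional arguments. One way to interpret it is sketched below, under the assumption that a leading non-numeric argument is the project path. The function `expand_short_form` is hypothetical, and a project directory whose name consists only of digits would be misread by this rule.

```python
def expand_short_form(args: list[str]) -> dict:
    """Map '[project] [iterations] [interval]' to named settings."""
    settings = {"project": ".", "iter": 10, "timeout": 5}  # documented defaults
    rest = list(args)
    # A first argument that is not a number is taken as the project path.
    if rest and not rest[0].isdigit():
        settings["project"] = rest.pop(0)
    if rest:
        settings["iter"] = int(rest.pop(0))
    if rest:
        settings["timeout"] = int(rest.pop(0))
    return settings
```

Under this reading, `expand_short_form([".", "3", "1"])` gives the same settings as the long form `--project . --iter 3 --timeout 1`.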

Example 2: Long Research Session

# 50 experiments, 10 minutes interval
python F:/IdeaProjects/autoresearch/autoresearch.py --project . --iter 50 --timeout 10

Example 3: Configure Only

# Initial configuration
python F:/IdeaProjects/autoresearch/autoresearch.py --project /path/to/project --configure

# Later: run
python F:/IdeaProjects/autoresearch/autoresearch.py --project /path/to/project --iter 10

Example 4: Continue From Specific Experiment

# Continue from Experiment 25 (after previous session ended at 24)
python F:/IdeaProjects/autoresearch/autoresearch.py --project . --iter 10 --start-from 25

# This will run Experiments 25-34 (10 experiments starting from 25)
# The agent will still see accumulated context from all previous experiments

Example 5: Auto-Detect Next Experiment Number

# Auto-detects next experiment number (if output_1.md exists, starts from 2)
python F:/IdeaProjects/autoresearch/autoresearch.py . 10 1

# Or without --project parameter (uses current directory)
python F:/IdeaProjects/autoresearch/autoresearch.py 10 1
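
Auto-detection of the next experiment number can be done by scanning the experiments directory for existing `output_N.md` files. The sketch below assumes that naming scheme (taken from the directory listing above). The function name is hypothetical and the script's real logic may differ.

```python
import re
from pathlib import Path


def next_experiment_number(experiments_dir: Path) -> int:
    """Return one more than the highest N found in output_N.md files."""
    numbers = [
        int(match.group(1))
        for path in experiments_dir.glob("output_*.md")
        if (match := re.fullmatch(r"output_(\d+)\.md", path.name))
    ]
    # With no previous outputs, numbering starts at 1.
    return max(numbers, default=0) + 1
```

Taking the maximum rather than counting files keeps the numbering correct even when earlier outputs were deleted.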

🛠️ Troubleshooting

Claude CLI not found

npm install -g @anthropic-ai/claude-code

Python not found

Install Python 3.10+ and add to PATH.

Experiments hang

Increase interval between iterations (--timeout).

Wrong context

python F:/IdeaProjects/autoresearch/autoresearch.py --project . --reconfigure

📚 Documentation

  • 📖 INSTALL.md: installation guide (for the AI agent)
  • ⚡ QUICKSTART.md: quick guide
  • 🇷🇺 README_RU.md: Russian version

🤝 Contributing

Contributions welcome! Create issues and pull requests.

Ideas for Improvement

  • 🌐 Web UI for experiment monitoring
  • 📊 Progress visualization
  • 🔔 Completion notifications
  • 📈 Metrics and analytics
  • 🔄 CI/CD integration

📄 License

MIT License: free to use in any project.


⭐ Stars

If you find this project useful, please give it a star on GitHub!

Made with ❤️ for autonomous project research
