samay2504/Image-Hue

🎨 Colorful Image Colorization

Production-ready implementation of "Colorful Image Colorization" (Zhang, Isola, Efros - ECCV 2016) with interactive demos, memory-safe training, and comprehensive tooling.


✨ Features

  • 📊 Paper-Accurate Implementation: Classification-based colorization with 313-bin quantized ab space, class rebalancing, and annealed-mean decoding (T=0.38)
  • 🎬 Interactive UIs: Beautiful Streamlit and Gradio frontends with blend animations, temperature sliders, and method comparison
  • 🚀 Production-Ready: Docker support, Redis caching, memory safeguards (FP16, gradient checkpointing, tiling)
  • 💾 Memory-Safe: Automatic OOM handling for RTX 3060 6GB and similar GPUs
  • 🔄 Multiple Methods: Classification (paper), L2 regression baseline, OpenCV color transfer
  • 🧪 Fully Tested: Unit tests, integration tests, CI/CD with GitHub Actions
  • 📦 Cross-Platform: Works on Linux, Windows (WSL2), and macOS with local venv or Docker

🚀 Quick Start

Option 1: Local Installation (Recommended for Development)

Windows (PowerShell)

# Clone repository
git clone https://github.com/yourusername/colorization.git
cd colorization

# Run setup script
.\scripts\setup_local.ps1

# Activate environment
.\.venv\Scripts\Activate.ps1

# Install PyTorch with CUDA 13.0
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu130

# Verify GPU
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

Linux / macOS / WSL2

# Clone repository
git clone https://github.com/yourusername/colorization.git
cd colorization

# Run setup script
chmod +x scripts/*.sh
./scripts/setup_local.sh

# Activate environment
source .venv/bin/activate

# Install PyTorch with CUDA 13.0
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu130

# Verify GPU
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

Option 2: Docker (Recommended for Production)

# Build and run with Docker Compose
cd docker
docker-compose up --build

# Access UIs
# Streamlit: http://localhost:8501
# Gradio: http://localhost:7860

Prerequisites for Docker GPU support:

  • Docker Desktop (Windows) or Docker Engine (Linux)
  • NVIDIA Container Toolkit (installation guide)
  • NVIDIA GPU drivers

🎨 Running the UIs

Streamlit (Recommended)

# Windows
.\scripts\run_streamlit.ps1

# Linux/macOS
./scripts/run_streamlit.sh

# Or directly
python -m streamlit run src/ui/streamlit_app.py --server.port 8501

Gradio

# Windows
.\scripts\run_gradio.ps1

# Linux/macOS
./scripts/run_gradio.sh

# Or directly
python src/ui/gradio_app.py --port 7860

Access the UIs at http://localhost:8501 (Streamlit) or http://localhost:7860 (Gradio).

πŸ‹οΈ Training

Quick Training (Small Dataset)

# Prepare your dataset
# Structure: data/train/<images>, data/val/<images>

# Compute color statistics (optional but recommended)
python -m src.data.dataset data/train --output data/color_stats.npz

# Train with quick config
python -m src.train \
    --config configs/quicktrain.yaml \
    --train_dir data/train \
    --val_dir data/val

Full Training (Paper Settings)

# For ImageNet-scale training
python -m src.train \
    --config configs/fulltrain.yaml \
    --train_dir /path/to/imagenet/train \
    --val_dir /path/to/imagenet/val

Training Configuration:

Edit configs/quicktrain.yaml or configs/fulltrain.yaml:

model:
  model_type: "mobile"  # Options: paper, mobile, l2
  num_classes: 313
  base_channels: 32

num_epochs: 50
batch_size: 16
learning_rate: 1e-4
use_amp: true  # FP16 mixed precision

Memory Management:

  • Model automatically scales batch size if OOM occurs
  • Use model_type: mobile for 6GB GPUs
  • Use model_type: paper for 12GB+ GPUs
  • FP16 mixed precision reduces memory by ~40%
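
The automatic batch-size fallback can be sketched as a simple retry loop. This is a minimal illustration with a hypothetical `train_step`, not the project's actual safeguard, which may differ in detail:

```python
def run_with_oom_fallback(train_step, batch_size, min_batch_size=1):
    """Run `train_step(batch_size)`, halving the batch size and
    retrying whenever an out-of-memory error is raised."""
    while batch_size >= min_batch_size:
        try:
            return train_step(batch_size), batch_size
        except RuntimeError as err:
            if "out of memory" not in str(err):
                raise  # a real bug, not an OOM
            batch_size //= 2  # halve and retry
    raise RuntimeError("could not fit even the minimum batch size")

# Toy step that only fits batches of 8 or fewer:
def toy_step(bs):
    if bs > 8:
        raise RuntimeError("CUDA out of memory")
    return f"trained with batch size {bs}"
```

Starting from batch size 16, the loop falls back to 8 on the first simulated OOM. PyTorch surfaces real OOMs as a `RuntimeError` whose message contains "out of memory", which is what the check keys on.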

πŸ–ΌοΈ Inference

Command-Line

# Single image
python -m src.infer \
    input.jpg \
    --output colorized.jpg \
    --model checkpoints/best_model.pth \
    --method classification \
    --temperature 0.38

# Batch processing
python -m src.infer \
    input_folder/ \
    --output output_folder/ \
    --model checkpoints/best_model.pth \
    --method classification

Python API

from src.infer import ColorizationInference
from PIL import Image

# Initialize engine
engine = ColorizationInference(
    model_path="checkpoints/best_model.pth",
    device="cuda",  # or "cpu"
    use_cache=True,
    redis_url="redis://localhost:6379"
)

# Colorize image
img = Image.open("grayscale.jpg")
result = engine.colorize_image(
    img,
    method="classification",
    temperature=0.38
)

# Save result
result_img = Image.fromarray((result * 255).astype('uint8'))
result_img.save("colorized.jpg")

# Create blend animation
frames = engine.create_blend_animation(
    img,
    num_frames=30
)
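
Under the hood, a blend animation amounts to linear interpolation between the grayscale input and the colorized output. A minimal sketch, assuming float RGB arrays in [0, 1] (not the engine's actual implementation):

```python
import numpy as np

def blend_frames(gray_rgb, color_rgb, num_frames=30):
    """Interpolate grayscale -> color: frame_i = (1 - a_i)*gray + a_i*color,
    with a_i evenly spaced in [0, 1]."""
    alphas = np.linspace(0.0, 1.0, num_frames)
    return [(1 - a) * gray_rgb + a * color_rgb for a in alphas]
```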

📊 Methods

1. Classification-Based (Paper Method) ⭐ Recommended

result = engine.colorize_image(img, method="classification", temperature=0.38)
  • Description: Treats colorization as per-pixel classification over 313 quantized ab bins
  • Features: Class rebalancing, soft-encoding, annealed-mean decoding
  • Temperature:
    • Lower (0.1-0.3): More vibrant, saturated colors
    • Default (0.38): Balanced (paper recommendation)
    • Higher (0.5-1.0): More conservative, desaturated

2. L2 Regression Baseline

result = engine.colorize_image(img, method="l2")
  • Direct regression to ab values
  • Simpler but tends to produce desaturated results
  • Useful for comparison

3. OpenCV Color Transfer

result = engine.colorize_image(img, method="opencv")
  • Simple baseline using traditional CV techniques
  • No neural network required
  • Limited quality

🎬 UI Features

Streamlit UI

  • 📤 Image Upload: Drag & drop or browse
  • 🎯 Method Selector: Switch between classification, L2, OpenCV
  • 🌡️ Temperature Slider: Adjust color vibrancy (0.01-1.0)
  • 🎬 Blend Animation: Smooth grayscale→color transition
  • 📊 Side-by-Side Comparison: Visual before/after
  • 💾 Download Results: Save colorized images
  • 📈 System Monitor: GPU memory usage, cache stats

Gradio UI

  • All Streamlit features
  • 🖼️ Animation Gallery: View all blend frames
  • 🎛️ Real-time Blend Slider: Interactive color mixing
  • 🔗 Shareable Links: Public demo URLs (with --share)

🐳 Docker Usage

Build and Run

# Build image
docker build -t colorize-app:latest -f docker/Dockerfile .

# Run Streamlit
docker run --gpus all -p 8501:8501 colorize-app:latest

# Run Gradio
docker run --gpus all -p 7860:7860 colorize-app:latest \
    python src/ui/gradio_app.py --port 7860

# Or use Docker Compose (recommended)
docker-compose up

Docker Compose Services

# docker-compose.yml includes:
- redis: Caching service (port 6379)
- app-streamlit: Streamlit UI (port 8501)
- app-gradio: Gradio UI (port 7860)

🧪 Testing

# Run all tests
pytest src/tests/ -v

# Run with coverage
pytest src/tests/ --cov=src --cov-report=html

# Run specific test file
pytest src/tests/test_ops.py -v

# Run integration tests only
pytest src/tests/test_integration.py -v

📦 Project Structure

colorization/
├── src/
│   ├── models/
│   │   ├── model.py          # PaperNet, MobileLiteVariant, L2RegressionNet
│   │   └── ops.py            # Quantization, encoding, color conversion
│   ├── data/
│   │   ├── dataset.py        # Dataset loaders
│   │   └── transforms.py     # Data augmentation
│   ├── ui/
│   │   ├── streamlit_app.py  # Streamlit interface
│   │   └── gradio_app.py     # Gradio interface
│   ├── cache/
│   │   └── redis_client.py   # Redis caching with disk fallback
│   ├── utils/
│   │   ├── memory.py         # Memory management, tiling
│   │   └── logger.py         # TensorBoard and file logging
│   ├── tests/                # Unit and integration tests
│   ├── train.py              # Training script
│   └── infer.py              # Inference engine
├── configs/
│   ├── quicktrain.yaml       # Quick training config
│   └── fulltrain.yaml        # Full paper training config
├── docker/
│   ├── Dockerfile            # CUDA 13.0 Docker image
│   └── docker-compose.yml    # Multi-service setup
├── scripts/
│   ├── setup_local.sh        # Linux/macOS setup
│   ├── setup_local.ps1       # Windows setup
│   ├── run_streamlit.sh      # Launch Streamlit
│   ├── run_gradio.sh         # Launch Gradio
│   └── verify_system.sh      # System verification
├── .github/workflows/
│   └── ci.yml                # GitHub Actions CI
├── requirements.txt          # Python dependencies
├── requirements-dev.txt      # Development dependencies
└── README.md                 # This file

🔧 Configuration

Model Configuration

# configs/quicktrain.yaml
model:
  model_type: "mobile"    # paper | mobile | l2
  num_classes: 313        # Number of ab bins
  base_channels: 32       # Channel multiplier (mobile/l2 only)

Training Hyperparameters (from Paper)

  • Optimizer: Adam (β₁=0.9, β₂=0.99)
  • Weight Decay: 1e-3
  • Learning Rate Schedule:
    • Initial: 3e-5
    • → 1e-5 at 200k iterations
    • → 3e-6 at 375k iterations
  • Batch Size: 32 (adjust for your GPU)
  • Image Size: 256×256
  • Quantization: Grid size 10 → 313 in-gamut bins
  • Soft-encoding: K=5 neighbors, σ=5
  • Class Rebalancing: σ=5, λ=0.5
  • Annealed-mean Temperature: T=0.38
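
The learning-rate schedule above is piecewise constant over iterations and can be written directly (thresholds taken from the list; this is a sketch, not the training script's own scheduler):

```python
def paper_lr(iteration: int) -> float:
    """3e-5 until 200k iterations, then 1e-5 until 375k, then 3e-6."""
    if iteration < 200_000:
        return 3e-5
    if iteration < 375_000:
        return 1e-5
    return 3e-6
```

In PyTorch such a step function maps naturally onto `torch.optim.lr_scheduler.LambdaLR`.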

💾 Memory Requirements

Training

Model Type      GPU Memory   Batch Size   Training Speed
Mobile (32ch)   4-6 GB       16           Fast
Mobile (64ch)   8-10 GB      16           Fast
Paper (full)    10-14 GB     16           Moderate
Paper (full)    14-20 GB     32           Moderate

Memory Saving Techniques:

  • ✅ FP16 mixed precision (use_amp: true)
  • ✅ Gradient checkpointing (automatic for large models)
  • ✅ Auto batch size reduction on OOM
  • ✅ Mobile variant (fewer parameters)

Inference

Image Size   GPU Memory   Tile Size
256×256      <1 GB        Not needed
512×512      1-2 GB       Not needed
1024×1024    3-4 GB       512 recommended
2048×2048    8+ GB        512 required

Use tiling for large images:

result = engine.colorize_image(img, tile_size=512)
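
Tiling bounds peak memory by colorizing fixed-size crops independently. A simplified non-overlapping sketch with NumPy, where `process` stands in for the model call (a production tiler would typically overlap and blend tiles to hide seams):

```python
import numpy as np

def colorize_in_tiles(img, process, tile=512):
    """Apply `process` to each tile of an H x W x C image; edge tiles
    are simply clipped to the image bounds."""
    out = np.empty_like(img)
    h, w = img.shape[:2]
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            out[y:y + tile, x:x + tile] = process(img[y:y + tile, x:x + tile])
    return out
```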

πŸ” Troubleshooting

CUDA Out of Memory

# Solution 1: Use mobile variant
model:
  model_type: "mobile"
  base_channels: 16  # Reduce from 32

# Solution 2: Reduce batch size
batch_size: 8  # Down from 16

# Solution 3: Enable gradient checkpointing
# (automatically enabled for paper model)

# Solution 4: Use CPU
device: "cpu"

PyTorch Not Detecting GPU

# Verify CUDA installation
nvidia-smi

# Reinstall PyTorch with correct CUDA version
pip3 uninstall torch torchvision
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu130

Redis Connection Issues

# Start Redis locally
# Linux/macOS
redis-server

# Windows (with WSL2)
sudo service redis-server start

# Or disable caching
# In Python:
engine = ColorizationInference(use_cache=False)
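
The Redis-with-disk-fallback idea behind src/cache/redis_client.py can be sketched roughly like this (hypothetical `FallbackCache` class, not the project's actual API; a real implementation would catch the backend's specific connection errors and serialize consistently on both paths):

```python
import hashlib
import json
import pathlib

class FallbackCache:
    """Prefer a remote backend (e.g. a Redis client) and fall back to
    local disk when it is unavailable or not configured."""

    def __init__(self, backend=None, cache_dir=".cache"):
        self.backend = backend
        self.dir = pathlib.Path(cache_dir)
        self.dir.mkdir(exist_ok=True)

    def _path(self, key):
        # Hash the key so any string is a safe filename
        return self.dir / hashlib.sha256(key.encode()).hexdigest()

    def get(self, key):
        if self.backend is not None:
            try:
                return self.backend.get(key)
            except Exception:  # e.g. connection refused -> use disk
                pass
        p = self._path(key)
        return json.loads(p.read_text()) if p.exists() else None

    def set(self, key, value):
        if self.backend is not None:
            try:
                self.backend.set(key, value)
                return
            except Exception:
                pass
        self._path(key).write_text(json.dumps(value))
```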

Import Errors

# Ensure you're in the project root directory
cd /path/to/colorization

# Install in development mode
pip install -e .

# Or add to PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:$(pwd)"  # Linux/macOS
$env:PYTHONPATH += ";$(pwd)"  # Windows PowerShell

📚 Paper Implementation Details

Quantization (Section 3.2)

  • ab space: Grid size 10 → covers [-110, 110] range
  • In-gamut filtering: Only 313 of 441 bins represent valid RGB colors
  • Soft-encoding: Gaussian kernel (σ=5) on K=5 nearest neighbors
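
The soft-encoding step can be sketched with NumPy (hypothetical `soft_encode` helper; `centers` is the Q×2 array of ab bin centers):

```python
import numpy as np

def soft_encode(ab, centers, k=5, sigma=5.0):
    """Soft-encode one ab pair onto its K nearest bin centers with a
    Gaussian kernel, returning a length-Q target that sums to 1."""
    d2 = ((centers - ab) ** 2).sum(axis=1)       # squared distances to bins
    nearest = np.argsort(d2)[:k]                  # indices of K nearest bins
    w = np.exp(-d2[nearest] / (2 * sigma ** 2))   # Gaussian weights
    z = np.zeros(len(centers))
    z[nearest] = w / w.sum()                      # normalized soft target
    return z
```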

Class Rebalancing (Section 3.3, Equation 2)

1. Compute empirical distribution p from training set
2. Smooth with Gaussian: p_smooth = GaussianFilter(p, σ=5)
3. Mix with uniform: p̃ = (1-λ)·p_smooth + λ·(1/Q)  [λ=0.5]
4. Compute weights: w = p̃^(-1)
5. Normalize: w = w / E[w]  (mean=1)
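
Steps 3-5 in NumPy, assuming the empirical distribution has already been smoothed (hypothetical helper; the weights are normalized so the expected weight under the mixed distribution is 1):

```python
import numpy as np

def rebalancing_weights(p_smooth, lam=0.5):
    """Per-bin loss weights from a smoothed color distribution."""
    q = len(p_smooth)
    p_mix = (1 - lam) * p_smooth + lam / q   # step 3: mix with uniform
    w = 1.0 / p_mix                           # step 4: inverse frequency
    return w / (p_mix * w).sum()              # step 5: E[w] = 1
```

Rare colors get larger weights, which counteracts the dominance of desaturated backgrounds in natural images.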

Annealed-Mean Decoding (Section 3.4, Equation 5)

1. Apply temperature: Z̃ = Z / T
2. Compute softmax: P(·) = softmax(Z̃)
3. Expected value: âb = Σ P(q) · ab(q)

Lower T → more diverse/vibrant colors
Higher T → more conservative/muted colors
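
The three decoding steps, sketched with NumPy (hypothetical helper; `logits` is the Q-vector of unnormalized scores for one pixel and `bin_centers` the Q×2 ab centers):

```python
import numpy as np

def annealed_mean(logits, bin_centers, T=0.38):
    """Softmax of logits/T, then the expectation over ab bin centers."""
    z = logits / T            # step 1: apply temperature
    z -= z.max()              # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum()   # step 2: softmax
    return p @ bin_centers    # step 3: expected ab value
```

As T shrinks, the softmax concentrates on the most likely bin, pushing the expectation toward that bin's (usually more saturated) color.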

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Guidelines:

  • Follow PEP 8 style
  • Add tests for new features
  • Update documentation
  • Run black and flake8 before committing

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

This implementation is based on:

Colorful Image Colorization
Richard Zhang, Phillip Isola, Alexei A. Efros
European Conference on Computer Vision (ECCV), 2016
Paper | Project Page

Original PyTorch implementation: richzhang/colorization

📧 Contact

For questions or issues, please open an issue on the GitHub repository.

🎯 Citation

If you use this code in your research, please cite:

@inproceedings{zhang2016colorful,
  title={Colorful image colorization},
  author={Zhang, Richard and Isola, Phillip and Efros, Alexei A},
  booktitle={European conference on computer vision},
  pages={649--666},
  year={2016},
  organization={Springer}
}

Made with ❤️ for reproducible deep learning research
