Crystallizer

A Map -> Reduce powerhouse, disguised as an insight summarization tool.

Crystallizer is a programmable, LLM-powered general purpose data traversal and transformation tool.

Its default use-case will be as insight extraction and cohesion across N parts of long documents (think books).

However it can be programmed to do a large number of open-ended tasks, owing to it's templated system and task prompt design.

Installation

📋 Complete Installation Guide - Choose your preferred tool

Quick Links by Tool:

🚀 UV Installation - Fastest setup
📦 Poetry Installation - Best for development
🐍 Pip Installation - Universal compatibility

Quick Start (pip)

Requirements: Python 3.11+

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Test installation
python crystallizer.py --help

Usage

python crystallizer.py \
  --system-prompt system_prompt.j2 \
  --haystack-path ./chat_logs \
  --connection ollama-local \
  --task-label gluon_design \
  --output-dir ./crystals

Configuration

Configure each LLM connection in config.json:

{
  "inference_service_connections": {
    "ollama-local": {
      "api_type": "ollama",
      "base_url": "http://localhost:11434",
      "default_model": "qwen2.5-coder:32b",
      "default_ctx_len": 18000
    },
    "openai-main": {
      "api_type": "openai",
      "base_url": "https://api.openai.com/v1",
      "api_key": "sk-...",
      "default_model": "gpt-4o-mini",
      "default_ctx_len": 128000
    }
  }
}

Features

Token-Aware Windowing: Automatically chunks large documents to fit LLM context limits
Multi-Provider Support: Works with Ollama (local) and OpenAI (cloud) backends
Template-Driven Prompts: Jinja2 templates for custom system prompts
Hierarchical Processing: 3-segment micro-windowing with merge strategies
Professional Logging: Semantic progress tracking with contextual semaphores
Batch Processing: Handle single files or entire directories

License

GNU AGPLv3

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
backends/providers		backends/providers
config		config
prompts		prompts
.gitignore		.gitignore
INSTALLATION.md		INSTALLATION.md
LICENSE		LICENSE
README.md		README.md
crystallizer.py		crystallizer.py
pyproject_toml.txt		pyproject_toml.txt
requirements.txt		requirements.txt
utilities.py		utilities.py
uv_toml.txt		uv_toml.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Crystallizer

Installation

Quick Links by Tool:

Quick Start (pip)

Usage

Configuration

Features

License

About

Uh oh!

Releases

Packages

Languages

License

Node0/crystallizer

Folders and files

Latest commit

History

Repository files navigation

Crystallizer

Installation

Quick Links by Tool:

Quick Start (pip)

Usage

Configuration

Features

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages