Complete guide to backing up and restoring vector database collections with Weave CLI.
Weave CLI's backup/restore feature allows you to:
- Prevent Data Loss: Create snapshots of collections before infrastructure changes
- Fast Recovery: Restore collections in minutes, not hours
- Cross-VDB Migration: Export from one VDB type, restore to another
- Space Efficient: 65-95% compression with gzip
- Complete Preservation: Embeddings, metadata, images, and content
✅ Works with all 15+ supported VDBs
✅ Portable format: .weavebak files (JSON + optional gzip)
✅ Batch processing: Handles large collections efficiently
✅ Progress tracking: Real-time status during backup/restore
✅ Validation: Built-in integrity checks
✅ Collection renaming: Restore to different collection names
# Simple backup (compressed by default)
weave backup create MyCollection --output backup.weavebak
# Result: backup.weavebak.gz (auto-adds .gz extension)

weave backup validate backup.weavebak.gz

# Restore to original name
weave backup restore backup.weavebak.gz
# Restore to different name
weave backup restore backup.weavebak.gz --collection NewName

weave backup list /path/to/backups/

Export a collection to a portable .weavebak file.
weave backup create <collection> --output <file> [flags]

Examples:
# Basic backup (compressed)
weave backup create AuctionImages --output backup.weavebak
# Creates: backup.weavebak.gz
# Uncompressed backup
weave backup create AuctionImages --output backup.weavebak --no-compress
# Custom batch size (default: 100)
weave backup create AuctionImages --output backup.weavebak --batch-size 500
# Specific VDB
weave backup create AuctionImages --vdb milvus-local --output backup.weavebak
# Quiet mode (no progress)
weave backup create AuctionImages --output backup.weavebak --quiet

Flags:
- --output, -o (required): Output file path
- --compress: Enable gzip compression (default: true)
- --no-compress: Disable compression
- --batch-size: Documents per batch (default: 100)
- --quiet, -q: Suppress progress output
- --vdb: Select specific vector database
What's Backed Up:
- ✅ Document IDs
- ✅ Embeddings (vectors)
- ✅ Text content
- ✅ Metadata (all fields)
- ✅ Images (base64 data)
- ✅ Image URLs
- ✅ Image thumbnails
- ✅ Collection schema
- ✅ Embedding model name
- ✅ Vector dimensions
Restore a collection from a .weavebak file.
weave backup restore <backup-file> [flags]

Examples:
# Restore to original collection name
weave backup restore backup.weavebak.gz
# Restore to different name
weave backup restore backup.weavebak.gz --collection NewName
# Overwrite existing collection
weave backup restore backup.weavebak.gz --overwrite
# Restore to specific VDB
weave backup restore backup.weavebak.gz --vdb weaviate-cloud
# Quiet mode
weave backup restore backup.weavebak.gz --quiet

Flags:
- --collection, -c: Target collection name (default: name from backup)
- --overwrite: Delete existing collection before restore
- --quiet, -q: Suppress progress output
- --vdb: Select target vector database
Behavior:
- ✅ Auto-detects compressed vs uncompressed backups
- ✅ Creates collection if it doesn't exist
- ✅ Validates backup format before restore
- ✅ Batch inserts for performance
- ✅ Shows real-time progress
⚠️ Fails if collection exists (use --overwrite to replace)
Check backup file integrity.
weave backup validate <backup-file> [flags]

Examples:
# Validate backup
weave backup validate backup.weavebak.gz
# JSON output
weave backup validate backup.weavebak.gz --json

Flags:
- --json: Output results in JSON format
Checks Performed:
- ✅ File exists and is readable
- ✅ Valid JSON format
- ✅ Correct backup version
- ✅ Collection metadata present
- ✅ All documents have required fields (ID, embedding)
- ✅ Embedding dimensions match metadata
- ✅ No duplicate document IDs
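These checks can be approximated offline with a short script that walks the documented .weavebak JSON layout (field names taken from the backup format section below; this is an illustrative sketch, not the CLI's actual validation code):

```python
def validate_backup(backup: dict) -> list:
    """Return a list of problems found in a parsed .weavebak structure."""
    errors = []
    if backup.get("version") != "1.0.0":
        errors.append("unexpected backup version: %r" % backup.get("version"))
    meta = backup.get("metadata")
    if not meta:
        errors.append("collection metadata missing")
        return errors
    dims = meta.get("vector_dimensions")
    seen = set()
    for i, doc in enumerate(backup.get("documents", [])):
        doc_id = doc.get("id")
        if not doc_id:
            errors.append("document %d: missing id" % i)
        elif doc_id in seen:
            errors.append("duplicate document id: %s" % doc_id)
        else:
            seen.add(doc_id)
        emb = doc.get("embedding")
        if emb is None:
            errors.append("document %s: missing embedding" % (doc_id or i))
        elif dims and len(emb) != dims:
            errors.append("document %s: dimension %d does not match metadata (%d)"
                          % (doc_id or i, len(emb), dims))
    return errors
```

The same pass collects every problem rather than stopping at the first, which mirrors how a validation report is more useful than a bare pass/fail.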
Exit Codes:
- 0: Backup is valid
- 1: Backup has errors
Use in CI/CD:
#!/bin/bash
if weave backup validate backup.weavebak.gz; then
echo "✅ Backup valid, proceeding with deployment"
else
echo "❌ Backup validation failed"
exit 1
fi

List all backup files in a directory.
weave backup list [directory] [flags]

Examples:
# List backups in current directory
weave backup list
# List in specific directory
weave backup list /backups/
# JSON output
weave backup list /backups/ --json

Flags:
- --json: Output in JSON format
Output:
Found 6 backup file(s)
FILENAME COLLECTION DOCS SIZE COMPRESSED CREATED
------------------------------------------------------------------------------------------------------------------------
auctionsimages-test.weavebak.gz AuctionsImages 301 27.50 KB Yes 2026-03-05
auctionsimages-301.weavebak AuctionsImages 301 495.18 KB No 2026-03-05
weavedocs-compressed.weavebak.gz WeaveDocs 79 115.38 KB Yes 2026-03-05
Weave CLI supports uploading backups to and downloading backups from remote storage backends, enabling cloud-based disaster recovery and automated backup workflows.
- AWS S3: Amazon's object storage service
- MinIO: Self-hosted S3-compatible storage
✅ Automatic Upload/Download: Seamlessly integrated with backup create/restore
✅ Environment Variable Support: Use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
✅ Path Prefixes: Organize backups within buckets using prefixes (e.g., backups/)
✅ SSL/TLS Support: Configurable for both S3 (always SSL) and MinIO (optional)
✅ Remote-Only Mode: Upload without keeping local copy (--remote-only)
✅ Flexible Cleanup: Control whether downloaded files are kept or deleted
- AWS account with S3 access
- IAM user with S3 permissions:
  - s3:PutObject (for uploads)
  - s3:GetObject (for downloads)
  - s3:ListBucket (optional, for listing)
  - s3:DeleteObject (optional, for cleanup)
- Access key ID and secret access key
# Upload backup to S3 (keeps local copy)
weave backup create MyCollection --output backup.weavebak \
--remote-storage s3 \
--s3-bucket my-backups \
--s3-region us-east-1 \
--s3-access-key AKIAIOSFODNN7EXAMPLE \
--s3-secret-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# Upload to S3 with path prefix
weave backup create MyCollection --output backup.weavebak \
--remote-storage s3 \
--s3-bucket my-backups \
--s3-region us-west-2 \
--s3-prefix backups/production
# Upload only (delete local file after)
weave backup create MyCollection --output backup.weavebak \
--remote-storage s3 \
--s3-bucket my-backups \
--remote-only

# Download and restore from S3
weave backup restore backup.weavebak.gz \
--remote-storage s3 \
--s3-bucket my-backups \
--s3-region us-east-1
# Download and restore with path prefix
weave backup restore backup.weavebak.gz \
--remote-storage s3 \
--s3-bucket my-backups \
--s3-region us-east-1 \
--s3-prefix backups/production
# Keep downloaded file for inspection
weave backup restore backup.weavebak.gz \
--remote-storage s3 \
--s3-bucket my-backups \
--keep-local

Instead of passing credentials via flags, use environment variables:
# Set AWS credentials
export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
# Create backup (credentials read from env)
weave backup create MyCollection --output backup.weavebak \
--remote-storage s3 \
--s3-bucket my-backups \
--s3-region us-east-1
# Restore backup (credentials read from env)
weave backup restore backup.weavebak.gz \
--remote-storage s3 \
--s3-bucket my-backups

Best Practice: Use environment variables for credentials in production to avoid exposing secrets in command history or scripts.
MinIO is a high-performance, S3-compatible object storage system that you can self-host.
- MinIO server running (local or remote)
- MinIO access key and secret key
- Endpoint URL (e.g., localhost:9000)
# Run MinIO locally
docker run -d \
-p 9000:9000 \
-p 9001:9001 \
--name minio \
-e MINIO_ROOT_USER=minioadmin \
-e MINIO_ROOT_PASSWORD=minioadmin \
-v /data/minio:/data \
minio/minio server /data --console-address ":9001"
# Create bucket (via MinIO console at http://localhost:9001)
# Or use mc client:
mc alias set local http://localhost:9000 minioadmin minioadmin
mc mb local/weave-backups

# Upload backup to MinIO (local, no SSL)
weave backup create MyCollection --output backup.weavebak \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint localhost:9000 \
--s3-access-key minioadmin \
--s3-secret-key minioadmin \
--s3-no-ssl
# Upload to MinIO with path prefix
weave backup create MyCollection --output backup.weavebak \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint minio.company.com:9000 \
--s3-prefix backups/$(date +%Y%m%d) \
--s3-no-ssl
# Upload only (delete local copy)
weave backup create MyCollection --output backup.weavebak \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint localhost:9000 \
--s3-no-ssl \
--remote-only

# Download and restore from MinIO
weave backup restore backup.weavebak.gz \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint localhost:9000 \
--s3-access-key minioadmin \
--s3-secret-key minioadmin \
--s3-no-ssl
# Download with path prefix
weave backup restore backup.weavebak.gz \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint localhost:9000 \
--s3-prefix backups/20260310 \
--s3-no-ssl
# Keep downloaded file
weave backup restore backup.weavebak.gz \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint localhost:9000 \
--s3-no-ssl \
--keep-local

If your MinIO server has SSL/TLS enabled:
# Upload with SSL (omit --s3-no-ssl flag)
weave backup create MyCollection --output backup.weavebak \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint minio.company.com:9000 \
--s3-access-key minioadmin \
--s3-secret-key minioadmin

Backup flags:
- --remote-storage: Storage type (s3 or minio)
- --s3-bucket: Bucket name (required for remote storage)
- --s3-region: AWS region (default: us-east-1, S3 only)
- --s3-endpoint: MinIO endpoint (e.g., localhost:9000, MinIO only)
- --s3-access-key: Access key ID (or use AWS_ACCESS_KEY_ID env var)
- --s3-secret-key: Secret access key (or use AWS_SECRET_ACCESS_KEY env var)
- --s3-prefix: Path prefix within bucket (e.g., backups/)
- --s3-no-ssl: Disable SSL/TLS (MinIO only, default: SSL enabled)
- --remote-only: Upload to remote storage only (skip local file creation)
- --remote-keep-local: Keep local file after upload (default: true)
Restore flags:
- --remote-storage: Storage type (s3 or minio)
- --s3-bucket: Bucket name (required for remote storage)
- --s3-region: AWS region (default: us-east-1, S3 only)
- --s3-endpoint: MinIO endpoint (e.g., localhost:9000, MinIO only)
- --s3-access-key: Access key ID (or use AWS_ACCESS_KEY_ID env var)
- --s3-secret-key: Secret access key (or use AWS_SECRET_ACCESS_KEY env var)
- --s3-prefix: Path prefix within bucket (e.g., backups/)
- --s3-no-ssl: Disable SSL/TLS (MinIO only, default: SSL enabled)
- --keep-local: Keep downloaded file after restore (default: false)
#!/bin/bash
# /usr/local/bin/weave-backup-s3.sh
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
DATE=$(date +%Y%m%d)
COLLECTIONS=("ProductionDocs" "ProductionImages" "ProductionUsers")
for collection in "${COLLECTIONS[@]}"; do
weave backup create "$collection" \
--output "${collection}-${DATE}.weavebak" \
--remote-storage s3 \
--s3-bucket company-backups \
--s3-region us-east-1 \
--s3-prefix "weave-backups/${DATE}" \
--remote-only \
--quiet
if [ $? -eq 0 ]; then
echo "✅ Backed up $collection to S3"
else
echo "❌ Failed to backup $collection"
fi
done

#!/bin/bash
# Restore all collections from specific date
DATE="20260310"
COLLECTIONS=("ProductionDocs" "ProductionImages" "ProductionUsers")
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
for collection in "${COLLECTIONS[@]}"; do
weave backup restore "${collection}-${DATE}.weavebak.gz" \
--remote-storage s3 \
--s3-bucket company-backups \
--s3-prefix "weave-backups/${DATE}" \
--overwrite
if [ $? -eq 0 ]; then
echo "✅ Restored $collection from S3"
else
echo "❌ Failed to restore $collection"
fi
done

#!/bin/bash
# Backup to local MinIO server
DATE=$(date +%Y%m%d)
weave backup create MyCollection \
--output "mycollection-${DATE}.weavebak" \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint localhost:9000 \
--s3-access-key minioadmin \
--s3-secret-key minioadmin \
--s3-prefix "backups/${DATE}" \
--s3-no-ssl \
--remote-only \
--quiet
echo "✅ Backup uploaded to MinIO: backups/${DATE}/mycollection-${DATE}.weavebak.gz"

Issue: "Access Denied" when uploading
# Verify credentials
aws s3 ls s3://my-backups --profile myprofile
# Check IAM permissions (need s3:PutObject)
aws iam get-user-policy --user-name myuser --policy-name S3Access

Solution: Ensure the IAM user has s3:PutObject permission for the bucket.
Issue: "Bucket not found"
# List available buckets
aws s3 ls
# Create bucket if needed
aws s3 mb s3://my-backups --region us-east-1

Issue: "Invalid region"
# Specify correct region
weave backup create MyCollection --output backup.weavebak \
--remote-storage s3 \
--s3-bucket my-backups \
--s3-region us-west-2  # Match the bucket's region

Issue: "Connection refused" to MinIO
# Check if MinIO is running
docker ps | grep minio
# Check endpoint is accessible
curl http://localhost:9000/minio/health/live
# Start MinIO if needed
docker start minio

Issue: "SSL error" with MinIO
# For local MinIO without SSL, use --s3-no-ssl
weave backup create MyCollection --output backup.weavebak \
--remote-storage minio \
--s3-bucket weave-backups \
--s3-endpoint localhost:9000 \
--s3-no-ssl  # Disable SSL for local development

Issue: "Access Denied" with MinIO
# Verify credentials with mc client
mc alias set local http://localhost:9000 minioadmin minioadmin
mc ls local/weave-backups
# Check bucket policy
mc policy get local/weave-backups

Issue: Downloaded file not cleaned up after restore
Check if --keep-local flag was used:
# Default behavior: downloaded file is deleted automatically after restore
weave backup restore backup.weavebak.gz \
--remote-storage s3 \
--s3-bucket my-backups
# Keep file for debugging
weave backup restore backup.weavebak.gz \
--remote-storage s3 \
--s3-bucket my-backups \
--keep-local
# Check: ls /tmp/backup.weavebak.gz

Issue: Large backups timing out
For large collections, increase batch size and monitor progress:
# Larger batch size for faster backup
weave backup create LargeCollection --output backup.weavebak \
--batch-size 500 \
--remote-storage s3 \
--s3-bucket my-backups
# Upload may take time, but progress is shown

Issue: Environment variables not working
# Verify environment variables are set
echo $AWS_ACCESS_KEY_ID
echo $AWS_SECRET_ACCESS_KEY
# Export if not set
export AWS_ACCESS_KEY_ID="your-key"
export AWS_SECRET_ACCESS_KEY="your-secret"
# Re-run command
weave backup create MyCollection --output backup.weavebak \
--remote-storage s3 \
--s3-bucket my-backups

Backup before Docker/Kubernetes updates:
# Backup all important collections
weave backup create ProductionDocs --output prod-docs-$(date +%Y%m%d).weavebak
weave backup create ProductionImages --output prod-images-$(date +%Y%m%d).weavebak
# Validate backups
weave backup validate prod-docs-*.weavebak.gz
weave backup validate prod-images-*.weavebak.gz
# Now safe to update infrastructure
docker-compose down
docker-compose pull
docker-compose up -d
# Restore if needed
# weave backup restore prod-docs-20260305.weavebak.gz --overwrite

Automate daily backups:
#!/bin/bash
# /usr/local/bin/weave-backup.sh
BACKUP_DIR="/backups/daily"
DATE=$(date +%Y%m%d)
RETENTION_DAYS=30
# Create backups
weave backup create ProductionDocs \
--output "$BACKUP_DIR/docs-$DATE.weavebak" \
--quiet
weave backup create ProductionImages \
--output "$BACKUP_DIR/images-$DATE.weavebak" \
--quiet
# Validate
weave backup validate "$BACKUP_DIR/docs-$DATE.weavebak.gz" --json > "$BACKUP_DIR/docs-$DATE.validation.json"
weave backup validate "$BACKUP_DIR/images-$DATE.weavebak.gz" --json > "$BACKUP_DIR/images-$DATE.validation.json"
# Cleanup old backups
find "$BACKUP_DIR" -name "*.weavebak.gz" -mtime +$RETENTION_DAYS -delete
echo "✅ Backup completed: $DATE"

Crontab entry:
# Daily backup at 2 AM
0 2 * * * /usr/local/bin/weave-backup.sh >> /var/log/weave-backup.log 2>&1

Complete recovery workflow:
# 1. List available backups
weave backup list /backups/daily/
# 2. Choose most recent valid backup
weave backup validate /backups/daily/docs-20260305.weavebak.gz
# 3. Restore collection
weave backup restore /backups/daily/docs-20260305.weavebak.gz --overwrite
# 4. Verify document count
weave cols list | grep ProductionDocs

Migrate from Weaviate Cloud to Milvus Local:
# 1. Backup from Weaviate Cloud
weave backup create MyCollection \
--vdb weaviate-cloud \
--output migration.weavebak
# 2. Validate backup
weave backup validate migration.weavebak.gz
# 3. Restore to Milvus Local
weave backup restore migration.weavebak.gz \
--vdb milvus-local \
--collection MyCollection
# 4. Verify
weave cols list --vdb milvus-local | grep MyCollection

Supported Migrations:
✅ Any VDB → Any VDB (all 15+ VDB types supported)
Clone production data for testing:
# Backup production
weave backup create ProdCollection \
--vdb weaviate-cloud \
--output prod-snapshot.weavebak
# Restore to local dev environment
weave backup restore prod-snapshot.weavebak.gz \
--vdb weaviate-local \
--collection DevCollection
# Now safe to test without affecting production

weave backup create MyCollection --output backup.weavebak
weave backup validate backup.weavebak.gz

# Good: includes collection, date, and purpose
weave backup create Users --output users-pre-migration-20260305.weavebak
# Bad: generic filename
weave backup create Users --output backup.weavebak

# Backup to network storage
weave backup create MyCollection --output /mnt/nas/backups/mycollection.weavebak
# Or upload to S3
weave backup create MyCollection --output backup.weavebak
aws s3 cp backup.weavebak.gz s3://my-backups/$(date +%Y%m%d)/

# Monthly restore test
weave backup restore latest.weavebak.gz --collection TestRestore
weave docs list --collection TestRestore --limit 10
weave cols delete TestRestore

# Track backup sizes over time
weave backup list /backups/ --json | jq '.[] | {collection, docs, size_mb: (.backup_size_bytes / 1024 / 1024)}'

# Compression is enabled by default
weave backup create LargeCollection --output backup.weavebak
# Result: 65-95% size reduction
# Only disable for debugging
weave backup create LargeCollection --output backup.weavebak --no-compress

Issue: "Collection not found"
# Verify collection exists
weave cols list | grep MyCollection
# Check selected VDB
weave backup create MyCollection --output backup.weavebak --vdb weaviate-cloud

Issue: "Permission denied" writing backup file
# Check directory permissions
ls -la /path/to/backups/
# Use absolute path
weave backup create MyCollection --output $PWD/backup.weavebak

Issue: "Collection already exists"
# Option 1: Use --overwrite
weave backup restore backup.weavebak.gz --overwrite
# Option 2: Restore to different name
weave backup restore backup.weavebak.gz --collection NewName
# Option 3: Delete manually first
weave cols delete MyCollection
weave backup restore backup.weavebak.gz

Issue: "Invalid vectorizer" on restore
This occurs when the backup metadata contains an invalid embedding model name. Check with validation:
weave backup validate backup.weavebak.gz

Workaround: Edit the backup file (advanced):
# Uncompress if needed
gunzip backup.weavebak.gz
# Edit metadata.embedding_model field
nano backup.weavebak
# Recompress
gzip backup.weavebak

Issue: "Missing embeddings"
This is a warning, not an error. Collections without embeddings can still be backed up for metadata/content preservation.
Issue: "Embedding dimension mismatch"
# Check backup metadata
weave backup validate backup.weavebak.gz --json | jq '.vector_dimensions'
# This indicates data inconsistency in the source collection

Latest profiling results with default batch size (100):
| Collection | Documents | Vector Dims | Backup Time | Throughput | File Size (Compressed) |
|---|---|---|---|---|---|
| DemoDocs | 38 | 1024 | 0.34s | 112 docs/sec | 254 KB |
| WeaveDocs | 79 | 1536 | 0.59s | 134 docs/sec | 552 KB |
| AuctionsImages | 301 | 1536 | 1.64s | 184 docs/sec | 27 KB |
Batch Size Impact (tested with 301 documents):
| Batch Size | Backup Time | Throughput | vs Batch=100 |
|---|---|---|---|
| 50 | 2.69s | 112 docs/sec | -39% ❌ |
| 100 (default) | 1.64s | 184 docs/sec | baseline |
| 200 | 0.80s | 376 docs/sec | +104% ✅ |
Key Finding: Batch size 200 is 2x faster than batch size 100!
| Documents | Restore Time | Throughput | Notes |
|---|---|---|---|
| 2 | <1s | N/A | Instant |
| 79 | Failed* | N/A | Metadata issue |
| 301 | 16.5s | 18 docs/sec | Full restore with images |
| 2,636 | ~120s | ~22 docs/sec | Projected |
*Failed due to invalid embedding model in source collection, not backup/restore bug
The --batch-size flag controls how many documents are fetched per VDB query. This is the single biggest performance knob.
Recommendations:
- Small collections (<100 docs): Use default (100)
- Medium collections (100-1000 docs): Use 200 for 2x speedup
- Large collections (1000+ docs): Try 200-500 for best performance
- Memory constrained: Use 50-100
Examples:
# Fast backup for large collections (2x faster)
weave backup create MyCollection --output backup.weavebak --batch-size 200
# Extra fast for very large collections (test first)
weave backup create MyCollection --output backup.weavebak --batch-size 500
# Memory-efficient backup
weave backup create MyCollection --output backup.weavebak --batch-size 50

Performance Impact (301 docs, real-world test):
- Batch 50: 2.69s (112 docs/sec)
- Batch 100: 1.64s (184 docs/sec)
- Batch 200: 0.80s (376 docs/sec) ← 2x faster!
Compression is enabled by default and highly recommended:
# Compressed (default, recommended)
weave backup create MyCollection --output backup.weavebak --compress
# Uncompressed (faster but larger files)
weave backup create MyCollection --output backup.weavebak --no-compress

Trade-offs:
- Compression saves 65-95% disk space
- Compression overhead: ~10% slower (0.16s for 301 docs)
- Recommendation: Always use compression unless disk space is unlimited
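The wide 65-95% range depends on what dominates the payload: repeated text and metadata compress extremely well, while high-entropy embedding floats do not. A quick illustration with Python's standard gzip module (synthetic data, so the exact ratio will differ from real collections):

```python
import gzip
import json
import random

# Fabricate a small collection-like payload: repetitive text content
# compresses well, random embedding values compress poorly.
random.seed(0)
docs = [{"id": "doc-%d" % i,
         "content": "lorem ipsum dolor sit amet " * 20,
         "embedding": [random.random() for _ in range(64)]}
        for i in range(50)]
raw = json.dumps({"version": "1.0.0", "documents": docs}).encode()
packed = gzip.compress(raw)
print("raw: %d bytes, gzip: %d bytes, saved %.0f%%"
      % (len(raw), len(packed), 100 * (1 - len(packed) / len(raw))))
```

Skewing the mix toward longer text pushes the ratio toward the high end of the range; embedding-heavy collections land near the low end.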
Backup During Off-Peak Hours:
- Reduces load on VDB
- Faster network I/O
- Better for production systems
# Use cron for scheduled backups
0 2 * * * /usr/local/bin/weave backup create MyCollection --output /backups/daily.weavebak --batch-size 200 --quiet

# Default shows progress (recommended for interactive use)
weave backup create MyCollection --output backup.weavebak
# Quiet mode for scripts and cron jobs
weave backup create MyCollection --output backup.weavebak --quiet

Primary Bottleneck: VDB query latency
- Each batch requires a separate VDB query
- Larger batches = fewer queries = faster backups
- Solution: Use larger batch sizes (200+)
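The arithmetic behind this is simple: the number of VDB queries is ceil(documents / batch_size), so doubling the batch size roughly halves the round trips. For the 301-document test collection:

```python
import math

docs = 301
for batch in (50, 100, 200):
    queries = math.ceil(docs / batch)
    print("batch=%d: %d queries" % (batch, queries))
```

This yields 7, 4, and 2 queries respectively, which tracks the measured speedups in the batch-size table above.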
Secondary Bottlenecks:
- Startup overhead (~3-3.5s): Config loading, VDB connection
- Single-threaded processing: Batches fetched sequentially
- JSON serialization: CPU-bound for large documents
Planned Optimizations (v0.12.0):
- Parallel batch processing (goroutines) → 2-3x improvement
- Connection pooling → 10-20% improvement
- Streaming JSON/compression → 5-10% improvement
- Target: 500+ docs/sec (current best: 376 docs/sec with batch=200)
Portable JSON format with optional gzip compression:
{
"version": "1.0.0",
"metadata": {
"collection": "MyCollection",
"vdb_type": "weaviate-cloud",
"embedding_model": "text-embedding-3-small",
"vector_dimensions": 1536,
"created_at": "2026-03-05T07:00:47-08:00",
"weave_version": "0.10.3",
"total_documents": 301,
"backup_size_bytes": 27648
},
"documents": [
{
"id": "doc-123",
"content": "Document text content",
"text": "Extracted text",
"embedding": [0.123, 0.456, ...],
"metadata": {
"source": "file.pdf",
"page": 1
},
"image": "base64-encoded-data",
"image_url": "https://...",
"image_thumbnail": "base64-thumbnail",
"url": "https://source.com/doc",
"image_metadata": {
"width": 800,
"height": 600
}
}
]
}

- Extension: .weavebak.gz (auto-added when --compress is used)
- Algorithm: gzip (standard)
- Compression Ratio: 65-95% depending on data
- Auto-Detection: Restore automatically detects compression
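Auto-detection is possible because every gzip stream begins with the magic bytes 0x1f 0x8b. A reader for the format can sketch the same logic (illustrative only, not the CLI's implementation):

```python
import gzip
import json
import pathlib

def load_backup(path):
    """Load a .weavebak file, transparently handling gzip compression."""
    data = pathlib.Path(path).read_bytes()
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        data = gzip.decompress(data)
    return json.loads(data)
```

Because detection is content-based rather than extension-based, a compressed file renamed without .gz still restores correctly.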
Q: Can I backup multiple collections at once?
Not directly, but use a script:
for collection in $(weave cols list --json | jq -r '.[].name'); do
weave backup create "$collection" --output "backups/$collection.weavebak"
done

Q: Are backups portable between VDB types?
Yes! That's a key feature. Export from any VDB, restore to any other.
Q: What happens if restore fails mid-way?
The collection is left in a partial state. Use --overwrite to retry:
weave backup restore backup.weavebak.gz --overwrite

Q: Can I edit backup files?
Advanced users can edit the JSON, but:
- Uncompress first: gunzip backup.weavebak.gz
- Edit with care (JSON must remain valid)
- Re-validate: weave backup validate backup.weavebak
- Recompress: gzip backup.weavebak
Q: How do I backup an entire VDB instance?
Backup each collection individually (see script above).
Q: What's the maximum collection size supported?
No hard limit. Tested up to 301 documents, designed for 2,636+. Batch processing handles large datasets efficiently.
- weave cols list - List collections
- weave cols delete - Delete collections
- weave docs list - List documents in a collection
- weave health check - Verify VDB connectivity
- User Guide - Complete CLI reference
- Architecture - How backup/restore works internally
- Weave Stack Guide - Stack backup integration
- Test Guide - Testing backup/restore
Last Updated: March 10, 2026
Weave CLI Version: v0.11.3