hambosto/sweetbyte


SweetByte

A resilient, secure, and efficient file encryption tool.

SweetByte is a high-security file encryption tool designed for robustness and performance. It safeguards your files using a multi-layered cryptographic pipeline, ensures data integrity with error correction codes, and provides a seamless user experience with both interactive and command-line interfaces.

🤔 Why SweetByte?

SweetByte was built with three core principles in mind:

  • Security First: Security is not just a feature; it's the foundation. By layering best-in-class cryptographic primitives like AES-256, XChaCha20, and Argon2id, SweetByte provides defense-in-depth against a wide range of threats.
  • Extreme Resilience: Data corruption can render encrypted files useless. SweetByte tackles this head-on by integrating Reed-Solomon error correction, giving your files a fighting chance to survive bit rot, transmission errors, or physical media degradation.
  • User-Centric Design: Powerful security tools should be accessible. With both a guided interactive mode for ease of use and a powerful CLI for automation, SweetByte caters to all workflows without compromising on functionality.

✨ Core Features

  • Dual-Algorithm Encryption: Chains AES-256-GCM and XChaCha20-Poly1305 for a layered defense, combining the AES standard with the modern, high-performance ChaCha20 stream cipher.
  • Strong Key Derivation: Uses Argon2id, a variant of Argon2 (the winner of the Password Hashing Competition), to protect against brute-force attacks on your password.
  • Resilient File Format: Integrates Reed-Solomon error correction codes, which add redundancy to the data. This allows the file to be successfully decrypted even if it suffers from partial corruption.
  • Tamper-Proof & Extensible File Header: Each encrypted file includes a secure header that is both authenticated and flexible. It uses an HMAC-SHA256 to prevent tampering and a Tag-Length-Value (TLV) format to allow for future extension.
  • Efficient Streaming: Processes files in concurrent chunks, ensuring low memory usage and high throughput, even for very large files.
  • Dual-Mode Operation:
    • Interactive Mode: A user-friendly, wizard-style interface that guides you through every step.
    • Command-Line (CLI) Mode: A powerful and scriptable interface for automation and power users.
  • Secure Deletion: Offers an option to wipe source files after an operation by overwriting them with random data. This makes recovery much harder, although it cannot be guaranteed on every storage device (see Security Considerations).

⚙️ How It Works: The Encryption Pipeline

SweetByte processes data through a sophisticated pipeline to ensure confidentiality, integrity, and resilience.

graph TD
    A[Data] --> B[Zlib]
    B --> C[PKCS7]
    C --> D[AES-GCM]
    D --> E[XChaCha20]
    E --> F[Reed-Solomon]
    F --> G[Output]

Encryption Flow

When encrypting a file, the data passes through the following stages:

  1. Zlib Compression: The raw data is compressed to reduce its size.
  2. PKCS7 Padding: The compressed data is padded to a multiple of the block size. AES-GCM does not itself require padded input; the padding is a fixed step of SweetByte's pipeline.
  3. AES-256-GCM Encryption: The padded data is encrypted with AES, the industry standard.
  4. XChaCha20-Poly1305 Encryption: The AES-encrypted ciphertext is then encrypted again with XChaCha20, adding a second, distinct layer of security.
  5. Reed-Solomon Encoding: The final ciphertext is encoded with error correction data, making it resilient to corruption.

This multi-stage process results in a final file that is not only encrypted but also compressed and fortified against data rot.

Decryption Flow

Decryption reverses the encryption pipeline, unwrapping each layer in the opposite order: Reed-Solomon decoding, XChaCha20-Poly1305 decryption, AES-256-GCM decryption, PKCS7 unpadding, and Zlib decompression.

🏛️ Architecture

SweetByte is designed with a modular, layered architecture that separates concerns and promotes code reuse. The high-level structure can be visualized as follows:

graph TD
        A[CLI Mode]
        B[Interactive Mode]

        C[Processor]
        E[Stream]
        D[Task Pool]


        F[Cipher]
        G[Derive]
        H[Header]
        I[Compression]
        J[Encoding]
        K[Padding]

        L[File]
        M[UI]
        N[Config]
        P[Utils]
        Q[Types]

    A --> C
    B --> C

    C --> E
    E --> D

    D --> F
    D --> G
    D --> H
    D --> I
    D --> J
    D --> K

    C --> L
    B --> M
    E --> N
    D --> P
    C --> Q
  • User Interfaces: The cli and interactive packages provide two distinct ways for users to interact with the application. Both interfaces are built on top of the processor package.
  • Core Logic: The processor, stream, and task pool packages form the core of the application. The processor package orchestrates the high-level workflow, the stream package handles concurrent, chunk-based file processing, and the task pool package manages concurrent task execution.
  • Cryptographic & Data Processing: This layer contains the packages that implement the cryptographic and data processing primitives. These packages are responsible for encryption, key derivation, header serialization, compression, error correction, and padding. They are primarily consumed by the task pool package.
  • Utilities & Support: This layer provides a set of utility and support packages that are used throughout the application. These packages handle file management (file), UI components (ui), configuration, and other miscellaneous tasks. The types package contains common data structures used throughout the application.

📦 File Format

Encrypted files (.swx) have a custom binary structure designed for security and resilience.

Overall Structure

An encrypted file consists of a resilient, variable-size header followed by a series of variable-length data chunks.

[ Secure Header (variable size) ] [ Chunk 1 ] [ Chunk 2 ] ... [ Chunk N ]

Secure Header

The header is designed for extreme resilience to withstand data corruption. Instead of a simple, fixed structure, it's a multi-layered, self-verifying format where every component—including the metadata about component sizes—is protected by Reed-Solomon error correction codes. This ensures that the header can be reconstructed even if it is partially damaged.

The header is composed of three main parts, read sequentially:

[ Lengths Header (16 bytes) ] [ Encoded Length Prefixes (variable) ] [ Encoded Data Sections (variable) ]

1. Lengths Header (16 bytes)

This is the only fixed-size part of the header. It acts as a bootstrap, providing the exact size of the encoded length prefix for each of the four main sections (Magic, Salt, Header Data, and MAC).
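
As a rough illustration, the 16-byte block can be read as four 32-bit values, one per section. The field width, the big-endian byte order, and the field order below are assumptions inferred from the description (16 bytes for four sections), not a confirmed specification; `prefixLengths` and `parseLengthsHeader` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const lengthsHeaderSize = 16

// prefixLengths holds the size of each encoded length prefix.
// Assumed layout: four big-endian uint32 values in section order.
type prefixLengths struct {
	Magic, Salt, HeaderData, MAC uint32
}

// parseLengthsHeader decodes the fixed-size bootstrap block.
func parseLengthsHeader(b []byte) (prefixLengths, error) {
	if len(b) != lengthsHeaderSize {
		return prefixLengths{}, errors.New("lengths header must be 16 bytes")
	}
	return prefixLengths{
		Magic:      binary.BigEndian.Uint32(b[0:4]),
		Salt:       binary.BigEndian.Uint32(b[4:8]),
		HeaderData: binary.BigEndian.Uint32(b[8:12]),
		MAC:        binary.BigEndian.Uint32(b[12:16]),
	}, nil
}

func main() {
	// Example values are made up for the demonstration.
	raw := make([]byte, lengthsHeaderSize)
	for i, v := range []uint32{10, 20, 30, 40} {
		binary.BigEndian.PutUint32(raw[i*4:], v)
	}
	lengths, err := parseLengthsHeader(raw)
	fmt.Printf("%+v %v\n", lengths, err)
}
```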

2. Encoded Length Prefixes (Variable Size)

Following the lengths header are four variable-size blocks. Each block is a Reed-Solomon encoded value that, when decoded, reveals the size of the corresponding encoded data section. This adds another layer of protection for the file's structural metadata.

3. Encoded Data Sections (Variable Size)

This is the core of the header, containing the actual metadata. Each section is individually encoded with Reed-Solomon, making it independently recoverable.

| Section | Raw Size | Description |
| --- | --- | --- |
| Magic Bytes | 4 bytes | 0xCAFEBABE - A constant value that identifies the file as a SweetByte encrypted file. |
| Salt | 32 bytes | A unique, random value used for the Argon2id key derivation function. This ensures that even with the same password, the derived encryption key is unique. |
| Header Data | 14 bytes | A block containing serialized file metadata. See details below. |
| MAC | 32 bytes | An HMAC-SHA256 that provides integrity and authenticity for the raw, decoded header sections (Magic Bytes + Salt + Header Data). |

Header Authentication

To prevent tampering, the MAC is computed over the raw, decoded Magic Bytes, Salt, and Header Data sections. During decryption, the header sections are first decoded (and corrected if necessary), and then the MAC is verified. If verification fails, the process is aborted. This check uses a constant-time comparison to protect against timing attacks, ensuring that the header's metadata is authentic and has not been manipulated.
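
A minimal sketch of this check using Go's standard library follows. The function names are hypothetical, and the source does not say which key feeds the HMAC, so the key here is simply a parameter (presumably material derived from the password).

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// computeMAC returns HMAC-SHA256 over Magic Bytes || Salt || Header Data.
func computeMAC(key, magic, salt, headerData []byte) []byte {
	h := hmac.New(sha256.New, key)
	h.Write(magic)
	h.Write(salt)
	h.Write(headerData)
	return h.Sum(nil)
}

// verifyMAC recomputes the MAC and compares it in constant time.
func verifyMAC(key, magic, salt, headerData, mac []byte) bool {
	return hmac.Equal(computeMAC(key, magic, salt, headerData), mac)
}

func main() {
	key := []byte("demo key, not a real derived key")
	magic := []byte{0xCA, 0xFE, 0xBA, 0xBE}
	salt := make([]byte, 32)
	headerData := make([]byte, 14)

	mac := computeMAC(key, magic, salt, headerData)
	fmt.Println(verifyMAC(key, magic, salt, headerData, mac)) // prints true

	headerData[0] ^= 0x01 // simulate tampering with the metadata
	fmt.Println(verifyMAC(key, magic, salt, headerData, mac)) // prints false
}
```

`hmac.Equal` performs the constant-time comparison, so the time taken does not reveal how many leading bytes of a forged MAC were correct.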

Header Data

The Header Data block is a 14-byte structure containing the core metadata for the file. It is created by serializing the internal Header struct before being encoded and authenticated. It has the following layout:

| Field | Size (bytes) | Description |
| --- | --- | --- |
| Version | 2 | A 16-bit unsigned integer representing the file format version (currently 0x0001). |
| Flags | 4 | A 32-bit unsigned integer bitfield of flags indicating processing options (e.g., FlagProtected). |
| OriginalSize | 8 | A 64-bit unsigned integer representing the original, uncompressed size of the file content. |

This layered approach provides extreme resilience and security for the file's critical metadata, protecting it against both accidental corruption and malicious tampering.

Cryptographic Parameters

SweetByte uses strong, modern cryptographic parameters for key derivation and encryption.

  • Argon2id Parameters:
    • Time Cost: 3
    • Memory Cost: 64 KB
    • Parallelism: 4
  • Reed-Solomon Parameters:
    • Data Shards: 4
    • Parity Shards: 10 (Provides high redundancy)
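
To get a feel for what 4 data shards plus 10 parity shards cost in storage, the arithmetic can be worked through as below. The sketch assumes the input is split evenly across the data shards (padding the last one) and that all 14 shards are stored; any per-shard metadata SweetByte may add is ignored, and the function names are hypothetical.

```go
package main

import "fmt"

const (
	dataShards   = 4
	parityShards = 10
)

// shardSize is the size of one shard when n bytes are split evenly
// across the data shards, rounding up.
func shardSize(n int) int {
	return (n + dataShards - 1) / dataShards
}

// encodedSize is the total size of all data and parity shards.
func encodedSize(n int) int {
	return shardSize(n) * (dataShards + parityShards)
}

func main() {
	n := 1 << 20 // 1 MiB of ciphertext
	overhead := float64(encodedSize(n)) / float64(n)
	fmt.Printf("shard=%d encoded=%d overhead=%.1fx\n", shardSize(n), encodedSize(n), overhead)
	// prints shard=262144 encoded=3670016 overhead=3.5x
}
```

Under these assumptions the encoded output is about 3.5 times the size of its input, which is the price of the high redundancy.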

Data Chunks

Following the header, the file contains the encrypted data, split into chunks. Each chunk is prefixed with a 4-byte length header, which is essential for the streaming-based decryption process.

[ Chunk Size (4 bytes) ] [ Encrypted & Encoded Data (...) ]

🚀 Usage

Installation

To install SweetByte, use the go install command:

go install github.com/hambosto/sweetbyte@latest

Interactive Mode

For a guided experience, run SweetByte without any subcommand. This is the default mode.

sweetbyte

You can also explicitly run interactive mode:

sweetbyte interactive

The interactive prompt will guide you through selecting an operation (encrypt/decrypt), choosing a file, and handling the source file after the operation is complete.

Command-Line (CLI) Mode

For scripting and automation, use the encrypt and decrypt commands.

To Encrypt a File:

# Basic encryption (will prompt for password)
sweetbyte encrypt -i my_document.txt -o my_document.swx

# Provide a password and delete the original file after encryption
# (a password passed with -p may be recorded in your shell history)
sweetbyte encrypt -i my_document.txt -p "my-secret-password" --delete-source

To Decrypt a File:

# Basic decryption (will prompt for password)
sweetbyte decrypt -i my_document.swx -o my_document.txt

# Provide a password and delete the encrypted source file
# (a password passed with -p may be recorded in your shell history)
sweetbyte decrypt -i my_document.swx -p "my-secret-password" --delete-source

🏗️ Building from Source

SweetByte is built with Go 1.25.4 and uses Go modules for dependency management. To build from source, follow these steps:

Prerequisites

  • Go 1.25.4 or higher
  • Git

Build Process

To build the project from source, clone the repository and use the go build command:

git clone https://github.com/hambosto/sweetbyte.git
cd sweetbyte
go build .

This will create a binary named sweetbyte in the current directory.

Cross-Compilation

You can also cross-compile for different platforms:

# Build for Windows
GOOS=windows GOARCH=amd64 go build -o sweetbyte.exe .

# Build for macOS
GOOS=darwin GOARCH=amd64 go build -o sweetbyte-darwin .

# Build for Linux (ARM64)
GOOS=linux GOARCH=arm64 go build -o sweetbyte-linux-arm64 .

Running Tests

To run the project's tests:

go test ./...

Using Nix (Optional)

If you have Nix installed with flakes enabled, you can use the provided flake.nix:

# Build using Nix
nix build

# Enter development shell
nix develop

# Run directly with Nix
nix run

🏛️ Internal Packages Overview

SweetByte is built with a modular architecture, with each package handling a specific responsibility.

| Package | Description |
| --- | --- |
| cipher | Implements the AES-GCM and XChaCha20-Poly1305 encryption algorithms. The main Cipher struct manages both ciphers for layered encryption. The cipher/algorithm subpackage contains the implementations, built on Go's crypto packages, with nonce generation and authenticated encryption. |
| cli | Contains the command-line interface logic, built with the Cobra library. It provides the encrypt and decrypt commands with their flags, and manages password prompts and file operations for command-line mode. |
| compression | Handles Zlib compression and decompression with configurable compression levels (NoCompression, BestSpeed, DefaultCompression, BestCompression). It runs before encryption in the pipeline to reduce file sizes. |
| config | Stores application-wide constants and configuration parameters, including the app name, version, file extension, and the exclusion patterns applied during file discovery. |
| derive | Handles key derivation using Argon2id and secure salt generation. It implements the key derivation function with the configured parameters (Time=3, Memory=64KB, Threads=4) and provides utilities for generating cryptographically secure random bytes. |
| encoding | Manages Reed-Solomon error correction encoding and decoding, using 4 data shards and 10 parity shards (14 in total). The Shards subcomponent handles splitting data into shards, combining them, and extracting data from potentially corrupted shards. |
| file | Provides utilities for finding, managing, and securely deleting files. It includes functions for validating file paths, checking file existence, creating directory structures, and discovering eligible files by walking directories and applying file type and exclusion patterns. |
| header | Manages the serialization, deserialization, and verification of the secure file header. It handles the multi-layered header format with Reed-Solomon protection and HMAC authentication with constant-time comparison. Its Serializer and Deserializer components marshal and unmarshal headers with Reed-Solomon error correction. |
| interactive | Implements the interactive mode workflow. It guides users through the encryption/decryption process with prompts built on the huh library, handles file selection, and manages user preferences. |
| padding | Implements PKCS7 padding and unpadding with a configurable block size. |
| processor | Contains the high-level logic for the main encrypt/decrypt file operations. It coordinates the other internal packages to run the complete workflow, from opening the file through header operations to completion. |
| stream | Manages concurrent, chunk-based file processing with a worker pool. It includes subpackages for buffering (buffer), chunking (chunk), concurrent execution (concurrent), and processing (processing), and uses runtime.NumCPU() workers. The ChunkReader reads files in chunks for encryption or decryption, the ChunkWriter writes the processed chunks to the output, and the SequentialBuffer ensures chunks are written in the correct sequence. |
| types | Defines common types, enums, and data structures used throughout the application. This includes processing modes (encrypt/decrypt), processing types (Encryption/Decryption), and task-related structures (Task, TaskResult) used for concurrent operations. |
| ui | Provides UI components such as interactive prompts, progress bars, and banners. It includes subpackages for progress bars (bar) using the progressbar library with configurable themes, display functions (display) that show file information and results in tables using lipgloss, prompts (prompt) for interactive input using the huh library, and terminal utilities (term) for clearing the screen and printing banners. |
| utils | Contains miscellaneous helper functions: byte operations with safe casting, formatting (including human-readable byte sizes), and general-purpose helpers. The bytes subpackage converts values to bytes and back using big-endian encoding. |

🛡️ Security Considerations

SweetByte is designed with a strong focus on security. However, it's important to be aware of the following considerations:

  • Password Strength: The security of your encrypted files depends heavily on the strength of your password. Use a long, complex, and unique password to protect against brute-force attacks.
  • Secure Environment: Run SweetByte in a secure environment. If your system is compromised with malware, your password could be stolen, and your encrypted files could be decrypted.
  • Source File Deletion: The --delete-source option is provided for convenience. However, file deletion is a complex problem that depends on the underlying hardware and operating system. While SweetByte attempts to securely remove source files after encryption/decryption, it cannot guarantee that the file is unrecoverable.
  • Side-Channel Attacks: While SweetByte uses modern, secure ciphers, it is not immune to side-channel attacks. Defending against them is beyond the scope of this tool; such attacks typically require local or physical access to the machine.

🤝 Contributing

Contributions are welcome! If you'd like to contribute, please feel free to fork the repository and submit a pull request. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate and run the quality checks before submitting your contribution.

📜 License

This project is licensed under the MIT License.
