
EmbedAI (PrivateGPT)


Create a private Q&A chatbot over your documents without relying on the internet. Leverage the power of local LLMs for complete privacy and security - none of your data ever leaves your machine.

[Screenshot: EmbedAI web interface]

Inspired by imartinez/privateGPT

Features

  • 100% Private - All processing happens locally. Your documents never leave your machine
  • No Internet Required - Works completely offline after initial setup
  • Multiple Document Formats - Support for PDF, TXT, DOC, DOCX, and more
  • Local LLM Support - Uses GPT4All for local language model inference
  • Interactive UI - Clean web interface for document upload and chat
  • Document Ingestion - Automatic chunking and embedding of your documents
  • Source Citations - See which parts of your documents informed each answer

Quick Start

Prerequisites

  • Python 3.8 or later
  • Node.js v18.12.1 or later
  • Minimum 16GB RAM

Installation

  1. Clone the repository

    git clone https://github.com/SamurAIGPT/EmbedAI.git
    cd EmbedAI
  2. Start the client

    cd client
    npm install
    npm run dev
  3. Start the server (in a new terminal)

    cd server
    pip install -r requirements.txt
    python privateGPT.py
  4. Open the app

    Navigate to http://localhost:3000 and click "Download Model" to get the required LLM.
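
The web UI is the intended interface, but the backend is a plain HTTP server, so it can also be scripted. The port and route below are assumptions (Flask's default port and a guessed endpoint name), so confirm them in server/privateGPT.py before relying on this sketch:

    import requests

    SERVER_URL = "http://localhost:5000"  # assumed Flask default; check privateGPT.py

    # Hypothetical question endpoint; the real route name may differ
    resp = requests.post(
        f"{SERVER_URL}/get_answer",
        json={"query": "What do my documents say about pricing?"},
        timeout=300,  # local CPU inference can take minutes
    )
    resp.raise_for_status()
    print(resp.json())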

Architecture

EmbedAI/
├── client/          # Next.js frontend
│   ├── components/  # React components
│   └── pages/       # App pages
└── server/          # Python backend
    ├── privateGPT.py    # Main server
    └── ingest.py        # Document processing

How It Works

  1. Upload Documents - Drop your files into the web interface
  2. Automatic Processing - Documents are chunked and embedded locally (see the sketch after this list)
  3. Ask Questions - Chat naturally with your document corpus
  4. Get Answers - Receive responses with source citations
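
Conceptually this is a standard retrieval-augmented generation pipeline. Below is a minimal sketch of the chunk-and-embed step, assuming a LangChain-style stack; the splitter settings, embedding model, and file name are illustrative, not the project's exact defaults:

    from langchain.document_loaders import TextLoader
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.vectorstores import Chroma

    # Load one document and split it into overlapping chunks
    docs = TextLoader("my_notes.txt").load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)

    # Embed every chunk locally and persist the vectors on disk
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    db = Chroma.from_documents(chunks, embeddings, persist_directory="db")
    db.persist()

At question time, the query is embedded the same way, the nearest chunks are retrieved, and they are passed to the local LLM as context (see Configuration below for the query-side knobs).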

Supported File Types

Format      Extension
PDF         .pdf
Text        .txt
Word        .doc, .docx
Markdown    .md
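
At ingestion time, each extension is dispatched to a matching document loader. Here is a sketch of what that dispatch table could look like with LangChain loaders; the specific loader classes are assumptions, so check server/ingest.py for the ones actually used:

    from langchain.document_loaders import (
        PDFMinerLoader,
        TextLoader,
        UnstructuredMarkdownLoader,
        UnstructuredWordDocumentLoader,
    )

    # Illustrative mapping from file extension to loader class
    LOADER_MAPPING = {
        ".pdf": PDFMinerLoader,
        ".txt": TextLoader,
        ".doc": UnstructuredWordDocumentLoader,
        ".docx": UnstructuredWordDocumentLoader,
        ".md": UnstructuredMarkdownLoader,
    }

    def load_document(path: str):
        ext = "." + path.rsplit(".", 1)[-1].lower()
        loader_cls = LOADER_MAPPING[ext]  # KeyError for unsupported types
        return loader_cls(path).load()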

Configuration

The default model is GPT4All. You can configure the following (a sketch follows the list):

  • Model selection
  • Chunk size for document processing
  • Number of source documents to retrieve
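
The repository does not document a single config file, so treat this as a sketch of where those knobs usually sit in a LangChain/GPT4All pipeline; the model path and values are placeholders, and chunk size itself is applied at ingestion time (see How It Works above):

    from langchain.chains import RetrievalQA
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.llms import GPT4All
    from langchain.vectorstores import Chroma

    MODEL_PATH = "./models/ggml-gpt4all-j-v1.3-groovy.bin"  # model selection (placeholder path)
    TARGET_SOURCE_CHUNKS = 4                                # number of source documents to retrieve

    # Reopen the vector store persisted during ingestion
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    db = Chroma(persist_directory="db", embedding_function=embeddings)

    qa = RetrievalQA.from_chain_type(
        llm=GPT4All(model=MODEL_PATH),
        chain_type="stuff",  # stuff retrieved chunks directly into the prompt
        retriever=db.as_retriever(search_kwargs={"k": TARGET_SOURCE_CHUNKS}),
        return_source_documents=True,  # keeps the retrieved chunks, for source citations
    )
    result = qa({"query": "Summarize the uploaded documents"})
    print(result["result"])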

Troubleshooting

Model download fails?

  • Ensure you have a stable internet connection for the initial model download
  • Check that you have enough disk space (~4GB for the model)

Slow responses?

  • Local LLM inference requires significant CPU/RAM
  • Consider using a machine with at least 16GB RAM

Context window errors?

  • Try reducing the document chunk size
  • Split large documents into smaller files

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


License

MIT License - see LICENSE for details.
