GitHub - zebbern/knowledge-assistant: A lightweight, no-cost, chat agent that answers questions from custom knowledge files. No vector databases, no embeddings, no complex setup

Knowledge-Assistant

A lightweight, free, file-based chat agent that answers questions from your custom knowledge files. No vector databases, no embeddings, no complex setup - just drop in your markdown files and start chatting.

Required environment variables: OPENROUTER_API_KEY, NEXT_PUBLIC_SITE_URL

Why Lightweight?

Unlike heavyweight RAG solutions that require:

Vector databases (Pinecone, Weaviate, Chroma)
Embedding models and preprocessing
Complex chunking strategies
Database hosting and maintenance

Knowledge-Assistant takes a simpler approach:

Just files - Drop .md or .txt files in a folder
No preprocessing - Files are read at request time
No database - Context goes directly to the LLM
Free models - Uses OpenRouter's free tier (Llama, Mimo, DeepSeek)

Perfect for: documentation sites, GitHub repos, personal knowledge bases, and projects where simplicity beats complexity.

Key Features

Multiple Free Models - Choose from Llama 3.3, Mimo, DeepSeek R1, Devstral, GLM
Temperature Control - Adjust creativity (0 = precise, 2 = creative)
Custom System Prompts - Add your own AI personality and instructions
Streaming Responses - Real-time token-by-token output
Markdown Rendering - Full markdown with syntax highlighting
Mermaid Diagrams - Live diagram rendering in chat
Code Blocks - Syntax-highlighted with copy functionality
Persistent Sessions - Conversations stored in localStorage
Export Chat - Download conversations as markdown
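
The export feature can be thought of as a pure function from stored messages to a markdown string. The sketch below is only an illustration: the `ChatMessage` shape mirrors the role/content pairs shown in the API request format, and `exportChatAsMarkdown` is a hypothetical name, not the project's actual function.

```typescript
// Hypothetical message shape, matching the role/content pairs used by the API.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Render a conversation as a markdown document, one heading per turn.
function exportChatAsMarkdown(messages: ChatMessage[], title = "Chat Export"): string {
  const sections = messages.map((m) => {
    const speaker = m.role === "user" ? "User" : "Assistant";
    return `## ${speaker}\n\n${m.content.trim()}`;
  });
  return [`# ${title}`, ...sections].join("\n\n") + "\n";
}
```

In a browser, the returned string could be wrapped in a `Blob` and offered as a download.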

Knowledge-Assistant loads .md and .txt files from a content/ directory and passes them to the model as context. This keeps answers grounded in your own domain material without the complexity of vector databases or embeddings.

flowchart LR
    subgraph Client
        A[Chat Interface]
    end

    subgraph Server
        B[Next.js API]
        C[Knowledge Loader]
        D[OpenRouter API]
    end

    subgraph Knowledge
        E[content/*.md]
    end

    A -->|User Message| B
    B --> C
    C -->|Read Files| E
    C -->|Context + Message| D
    D -->|Streaming Response| B
    B -->|SSE Stream| A

Quick Start

Prerequisites

Node.js 18+
OpenRouter API key (free to create)

Installation

cd knowledge-assistance
npm install
cp .env.example .env.local

Edit .env.local with your API key:

OPENROUTER_API_KEY=your_api_key_here
NEXT_PUBLIC_SITE_URL=http://localhost:3000
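
A missing key is easier to diagnose at startup than in the middle of a request. The following is a minimal sketch of such a check, assuming the two variable names above; `readEnv` is a hypothetical helper and is not part of the project.

```typescript
// Variable names come from the README; the helper itself is a hypothetical sketch.
const REQUIRED_ENV = ["OPENROUTER_API_KEY", "NEXT_PUBLIC_SITE_URL"] as const;

type EnvConfig = Record<(typeof REQUIRED_ENV)[number], string>;

// Return the required variables, or throw an error that lists every missing one.
function readEnv(env: Record<string, string | undefined>): EnvConfig {
  const missing = REQUIRED_ENV.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    OPENROUTER_API_KEY: env["OPENROUTER_API_KEY"] as string,
    NEXT_PUBLIC_SITE_URL: env["NEXT_PUBLIC_SITE_URL"] as string,
  };
}
```

On the server this would typically be called as `readEnv(process.env)`.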

Running

npm run dev       # Development
npm run build     # Production build
npm start         # Start production server

Access the application at http://localhost:3000

Architecture

graph TB
    subgraph Frontend
        UI[React Chat UI]
        MD[Markdown Renderer]
        MM[Mermaid Component]
        CB[Code Block Component]
    end

    subgraph API_Layer[API Layer]
        STREAM[Stream Endpoint]
        CHAT[Chat Endpoint]
    end

    subgraph Knowledge_System[Knowledge System]
        LOADER[File Loader]
        CONTENT[Content Files]
    end

    subgraph External
        OR[OpenRouter API]
    end

    UI --> STREAM
    STREAM --> LOADER
    LOADER --> CONTENT
    STREAM --> OR
    OR --> UI
    UI --> MD
    MD --> MM
    MD --> CB

Knowledge System

Add knowledge by placing markdown files in the content/ directory:

content/
  knowledge.md           # General domain knowledge
  n8n-workflow-guide.md  # n8n workflow JSON reference
  n8n-ai-nodes.md        # AI/LangChain node examples
  n8n-patterns.md        # Common workflow patterns
  mermaid-syntax.md      # Mermaid diagram reference

The AI reads all .md and .txt files at request time and uses them as context for responses.
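
Reading files at request time can be sketched roughly as follows. This is not the project's loader (the data-flow diagram refers to it as `getKnowledge()`); the function name `readKnowledgeFiles`, the `--- file ---` separator, and the alphabetical ordering are all assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Read every .md and .txt file directly inside a directory and join them
// into a single context string. Hypothetical sketch, not the project's code.
function readKnowledgeFiles(contentDir: string): string {
  if (!fs.existsSync(contentDir)) return "";
  return fs
    .readdirSync(contentDir, { withFileTypes: true })
    .filter((entry) => entry.isFile())
    .map((entry) => entry.name)
    .filter((name) => [".md", ".txt"].includes(path.extname(name).toLowerCase()))
    .sort() // fixed order so the prompt is the same from one request to the next
    .map((name) => {
      const body = fs.readFileSync(path.join(contentDir, name), "utf8").trim();
      return `--- ${name} ---\n${body}`;
    })
    .join("\n\n");
}
```

The synchronous calls keep the sketch short; inside a request handler the `node:fs/promises` equivalents would avoid blocking. Because the files are re-read on every request, edits to content/ take effect without a rebuild, at the cost of sending the whole knowledge base to the model each time.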

Knowledge File Structure

Each knowledge file should be focused on a specific topic:

# Topic Title

Brief overview of the topic.

## Section 1

Detailed information with examples.

## Section 2

Code examples in fenced blocks.

Project Structure

knowledge-assistance/
  app/
    api/
      chat/
        route.ts          # Non-streaming endpoint
        stream/
          route.ts        # Streaming endpoint
    layout.tsx
    page.tsx
    globals.css
  components/
    chat.tsx              # Main chat interface
    mermaid.tsx           # Mermaid and code block rendering
    ui/                   # Shadcn UI components
  content/                # Knowledge files
  lib/
    utils.ts

API Endpoints

POST /api/chat/stream

Streaming chat endpoint using Server-Sent Events.

Request:

{
  "message": "User message",
  "messages": [
    { "role": "user", "content": "Previous message" },
    { "role": "assistant", "content": "Previous response" }
  ]
}

Response: SSE stream with chunked content
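
On the client side, an SSE response arrives as arbitrary text chunks that have to be reassembled into events. The sketch below handles only the standard `data:` field and makes no assumption about what the project puts inside each event; `parseSseBuffer` is a hypothetical helper.

```typescript
// Split a text buffer into complete SSE events plus a trailing partial event.
// Only `data:` fields are handled; comment lines and other fields are ignored.
function parseSseBuffer(buffer: string): { data: string[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  const rest = blocks.pop() ?? ""; // the last piece may be an incomplete event
  const data = blocks
    .map((block) =>
      block
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop one optional space
        .join("\n"),
    )
    .filter((payload) => payload.length > 0);
  return { data, rest };
}
```

A reader loop would decode each chunk from `response.body` with a `TextDecoder`, append it to `rest`, and call `parseSseBuffer` again, rendering each entry in `data` as it appears.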

POST /api/chat

Non-streaming chat endpoint.

Request: Same as streaming endpoint

Response:

{
  "response": "Complete AI response"
}
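
A minimal client for the non-streaming endpoint might look like the sketch below. The request and response shapes are taken from the examples above; the type names, `buildChatRequest`, and `askOnce` are hypothetical, and error handling is reduced to a status check.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };
type ChatRequest = { message: string; messages: ChatMessage[] };
type ChatResponse = { response: string };

// Build the fetch options; both chat endpoints accept the same request body.
function buildChatRequest(message: string, history: ChatMessage[] = []) {
  const body: ChatRequest = { message, messages: history };
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Send one message to the non-streaming endpoint and return the reply text.
async function askOnce(
  baseUrl: string,
  message: string,
  history: ChatMessage[] = [],
): Promise<string> {
  const res = await fetch(new URL("/api/chat", baseUrl), buildChatRequest(message, history));
  if (!res.ok) throw new Error(`Chat request failed with status ${res.status}`);
  const json = (await res.json()) as ChatResponse;
  return json.response;
}
```

With the dev server running, `askOnce("http://localhost:3000", "What is in the knowledge base?")` would resolve to the complete reply.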

Configuration

Model Selection

The default model can be changed in app/api/chat/stream/route.ts:

model: "xiaomi/mimo-v2-flash:free", // or any OpenRouter model

Token Limits

Adjust response length:

max_tokens: 4096,
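
To show how these settings fit together, here is a rough sketch of assembling a request body for OpenRouter's OpenAI-compatible chat completions API. The `model` and `max_tokens` defaults come from the snippets above; the function name, the system prompt wording, and the 0.7 default temperature are assumptions and may differ from what route.ts actually does.

```typescript
type Role = "system" | "user" | "assistant";
type Message = { role: Role; content: string };

interface CompletionOptions {
  model?: string;
  maxTokens?: number;
  temperature?: number;
}

// Combine knowledge, prior turns, and the new message into one request body.
function buildCompletionBody(
  knowledge: string,
  history: Message[],
  userMessage: string,
  options: CompletionOptions = {},
) {
  const {
    model = "xiaomi/mimo-v2-flash:free",
    maxTokens = 4096,
    temperature = 0.7, // assumed default, not taken from the project
  } = options;
  const messages: Message[] = [
    // Assumed prompt wording: the knowledge files travel in the system message.
    { role: "system", content: `Answer using the knowledge below.\n\n${knowledge}` },
    ...history,
    { role: "user", content: userMessage },
  ];
  return {
    model,
    max_tokens: maxTokens,
    temperature: Math.min(2, Math.max(0, temperature)), // clamp to the 0-2 range
    stream: true,
    messages,
  };
}
```

The body would be sent as JSON in a POST to https://openrouter.ai/api/v1/chat/completions with an `Authorization: Bearer` header carrying OPENROUTER_API_KEY.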

Data Flow

sequenceDiagram
    participant User
    participant Chat UI
    participant API Route
    participant Knowledge Loader
    participant OpenRouter

    User->>Chat UI: Send message
    Chat UI->>API Route: POST /api/chat/stream
    API Route->>Knowledge Loader: getKnowledge()
    Knowledge Loader->>Knowledge Loader: Read content/*.md
    Knowledge Loader-->>API Route: Knowledge context
    API Route->>OpenRouter: Stream request
    loop Streaming
        OpenRouter-->>API Route: Token chunk
        API Route-->>Chat UI: SSE event
        Chat UI-->>User: Render token
    end
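
The streaming loop in the diagram amounts to re-emitting each upstream chunk as an SSE event. Below is a minimal sketch of that step under the assumption that chunks are forwarded as plain text; the project may wrap them differently (for example in JSON), and both function names are hypothetical.

```typescript
// Encode one text chunk as a Server-Sent Events message.
// Each line of the chunk becomes its own `data:` field, as the SSE format requires.
function toSseEvent(chunk: string): string {
  const lines = chunk.split(/\r\n|\r|\n/);
  return lines.map((line) => `data: ${line}`).join("\n") + "\n\n";
}

// Wrap an async source of token chunks in a ReadableStream of SSE bytes.
function sseStreamFrom(chunks: AsyncIterable<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      try {
        for await (const chunk of chunks) {
          controller.enqueue(encoder.encode(toSseEvent(chunk)));
        }
        controller.close();
      } catch (err) {
        controller.error(err);
      }
    },
  });
}
```

A route handler could return this stream in a `Response` with the `Content-Type: text/event-stream` header.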
