A lightweight, free, file-based chat agent that answers questions from your custom knowledge files. No vector databases, no embeddings, no complex setup - just drop in your markdown files and start chatting.
Environment variables needed: OPENROUTER_API_KEY, NEXT_PUBLIC_SITE_URL.
Unlike heavyweight RAG solutions that require:
- Vector databases (Pinecone, Weaviate, Chroma)
- Embedding models and preprocessing
- Complex chunking strategies
- Database hosting and maintenance
Knowledge-Assistant takes a simpler approach:
- Just files - Drop .md or .txt files in a folder
- No preprocessing - Files are read at request time
- No database - Context goes directly to the LLM
- Free models - Uses OpenRouter's free tier (Llama, Mimo, DeepSeek)
Perfect for: documentation sites, GitHub repos, personal knowledge bases, and projects where simplicity beats complexity.
- Multiple Free Models - Choose from Llama 3.3, Mimo, DeepSeek R1, Devstral, GLM
- Temperature Control - Adjust creativity (0 = precise, 2 = creative)
- Custom System Prompts - Add your own AI personality and instructions
- Streaming Responses - Real-time token-by-token output
- Markdown Rendering - Full markdown with syntax highlighting
- Mermaid Diagrams - Live diagram rendering in chat
- Code Blocks - Syntax-highlighted with copy functionality
- Persistent Sessions - Conversations stored in localStorage (see the sketch after this list)
- Export Chat - Download conversations as markdown
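Since conversations persist in localStorage, here is a minimal sketch of what that persistence can look like; the storage key and message shape are assumptions for illustration, not the repo's actual code:

```ts
// Sketch of localStorage session persistence. The key name and message
// shape are assumptions, not the repo's actual implementation.
type ChatMessage = { role: "user" | "assistant"; content: string };

const STORAGE_KEY = "knowledge-assistant-session"; // hypothetical key

function saveSession(messages: ChatMessage[]): void {
  localStorage.setItem(STORAGE_KEY, JSON.stringify(messages));
}

function loadSession(): ChatMessage[] {
  const raw = localStorage.getItem(STORAGE_KEY);
  return raw ? (JSON.parse(raw) as ChatMessage[]) : [];
}
```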
Knowledge AI loads markdown files from a content/ directory and uses them as context for AI responses. This approach provides accurate, domain-specific answers without the complexity of vector databases or embeddings.
```mermaid
flowchart LR
subgraph Client
A[Chat Interface]
end
subgraph Server
B[Next.js API]
C[Knowledge Loader]
D[OpenRouter API]
end
subgraph Knowledge
E[content/*.md]
end
A -->|User Message| B
B --> C
C -->|Read Files| E
C -->|Context + Message| D
D -->|Streaming Response| B
B -->|SSE Stream| A
```
Prerequisites:
- Node.js 18+
- OpenRouter API key (get one free)
Install:

```bash
cd knowledge-assistance
npm install
cp .env.example .env.local
```

Edit .env.local with your API key:

```bash
OPENROUTER_API_KEY=your_api_key_here
NEXT_PUBLIC_SITE_URL=http://localhost:3000
```

Then run:

```bash
npm run dev    # Development
npm run build  # Production build
npm start      # Start production server
```

Access the application at http://localhost:3000.
```mermaid
graph TB
subgraph Frontend
UI[React Chat UI]
MD[Markdown Renderer]
MM[Mermaid Component]
CB[Code Block Component]
end
subgraph API_Layer[API Layer]
STREAM[Stream Endpoint]
CHAT[Chat Endpoint]
end
subgraph Knowledge_System[Knowledge System]
LOADER[File Loader]
CONTENT[Content Files]
end
subgraph External
OR[OpenRouter API]
end
UI --> STREAM
STREAM --> LOADER
LOADER --> CONTENT
STREAM --> OR
OR --> UI
UI --> MD
MD --> MM
MD --> CB
```
Add knowledge by placing markdown files in the content/ directory:
```
content/
  knowledge.md            # General domain knowledge
  n8n-workflow-guide.md   # n8n workflow JSON reference
  n8n-ai-nodes.md         # AI/LangChain node examples
  n8n-patterns.md         # Common workflow patterns
  mermaid-syntax.md       # Mermaid diagram reference
```
The AI reads all .md and .txt files at request time and uses them as context for responses.
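The sequence diagram at the end of this document names a getKnowledge() helper for this step. A minimal sketch of such a request-time loader, with the implementation details assumed:

```ts
// Minimal sketch of a request-time knowledge loader. getKnowledge() is
// named in the sequence diagram below; everything else here is assumed.
import { promises as fs } from "fs";
import path from "path";

export async function getKnowledge(): Promise<string> {
  const dir = path.join(process.cwd(), "content");
  const files = (await fs.readdir(dir)).filter(
    (f) => f.endsWith(".md") || f.endsWith(".txt")
  );
  const parts = await Promise.all(
    files.map(async (f) => {
      const text = await fs.readFile(path.join(dir, f), "utf-8");
      return `--- ${f} ---\n${text}`; // label each file so the model can attribute answers
    })
  );
  return parts.join("\n\n");
}
```

Because files are read fresh on every request, edits to content/ take effect immediately, with no rebuild or re-indexing step.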
Each knowledge file should be focused on a specific topic:
```markdown
# Topic Title

Brief overview of the topic.

## Section 1

Detailed information with examples.

## Section 2

Code examples in fenced blocks.
```

Project structure:

```
knowledge-assistance/
  app/
    api/
      chat/
        route.ts          # Non-streaming endpoint
        stream/
          route.ts        # Streaming endpoint
    layout.tsx
    page.tsx
    globals.css
  components/
    chat.tsx              # Main chat interface
    mermaid.tsx           # Mermaid and code block rendering
    ui/                   # Shadcn UI components
  content/                # Knowledge files
  lib/
    utils.ts
```
POST /api/chat/stream - Streaming chat endpoint using Server-Sent Events.
Request:

```json
{
  "message": "User message",
  "messages": [
    { "role": "user", "content": "Previous message" },
    { "role": "assistant", "content": "Previous response" }
  ]
}
```

Response: SSE stream with chunked content.
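Consuming the stream from the browser can look like the sketch below. It assumes tokens arrive as plain text chunks readable with the Fetch streams API; adjust the parsing if the endpoint wraps tokens in SSE data: frames:

```ts
// Sketch of a client for the streaming endpoint. Assumes raw text chunks;
// adapt the parsing if tokens arrive wrapped in SSE "data:" frames.
async function streamChat(
  message: string,
  onToken: (chunk: string) => void
): Promise<void> {
  const res = await fetch("/api/chat/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message, messages: [] }),
  });
  if (!res.body) throw new Error("No response body");

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    onToken(decoder.decode(value, { stream: true })); // render token by token
  }
}
```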
POST /api/chat - Non-streaming chat endpoint.
Request: Same as streaming endpoint
Response:

```json
{
  "response": "Complete AI response"
}
```
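A usage sketch for the non-streaming endpoint, matching the request and response shapes documented above:

```ts
// Usage sketch for POST /api/chat, matching the documented shapes.
async function askOnce(message: string): Promise<string> {
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message, messages: [] }),
  });
  const data = (await res.json()) as { response: string };
  return data.response; // complete AI response, no streaming
}
```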
model: "xiaomi/mimo-v2-flash:free", // or any OpenRouter modelAdjust response length:
```ts
max_tokens: 4096,
```

```mermaid
sequenceDiagram
participant User
participant Chat UI
participant API Route
participant Knowledge Loader
participant OpenRouter
User->>Chat UI: Send message
Chat UI->>API Route: POST /api/chat/stream
API Route->>Knowledge Loader: getKnowledge()
Knowledge Loader->>Knowledge Loader: Read content/*.md
Knowledge Loader-->>API Route: Knowledge context
API Route->>OpenRouter: Stream request
loop Streaming
OpenRouter-->>API Route: Token chunk
API Route-->>Chat UI: SSE event
Chat UI-->>User: Render token
end
```
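A condensed sketch of a stream route implementing the flow above. The system-prompt wiring, the getKnowledge import path, and forwarding OpenRouter's SSE body unchanged are assumptions beyond what this document specifies:

```ts
// app/api/chat/stream/route.ts - condensed sketch of the flow above.
// Prompt wiring and SSE passthrough are assumptions, not the exact code.
import { getKnowledge } from "@/lib/knowledge"; // hypothetical module path

export async function POST(req: Request): Promise<Response> {
  const { message, messages } = await req.json();
  const knowledge = await getKnowledge(); // read content/*.md at request time

  const upstream = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "HTTP-Referer": process.env.NEXT_PUBLIC_SITE_URL ?? "", // OpenRouter attribution
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "xiaomi/mimo-v2-flash:free",
      stream: true,
      max_tokens: 4096,
      messages: [
        { role: "system", content: `Answer using this knowledge:\n\n${knowledge}` },
        ...messages,
        { role: "user", content: message },
      ],
    }),
  });

  // Forward OpenRouter's SSE stream straight through to the client.
  return new Response(upstream.body, {
    headers: { "Content-Type": "text/event-stream" },
  });
}
```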