Professional tax document processing system for Ecuadorian businesses. Automatically extracts, analyzes, and organizes data from SRI Forms 103 (Retenciones) and 104 (IVA). License: MIT Python FastAPI Next.js PostgreSQL

CapBraco/tax_form_processor

🥖 Pan Tributario

Automated Tax Form Processing for Ecuador

Transforming 3-4 hours of manual work into 10 seconds

Live Demo · Report Bug · Request Feature


🎯 About

Pan Tributario is a full-stack SaaS application that automates the extraction and processing of Ecuadorian tax forms (SRI Forms 103 & 104). Built to solve a real problem faced by accountants and businesses, it reduces manual data entry from 3-4 hours per client to just 10 seconds.

The Problem

Ecuadorian accountants manually extract financial data from PDF tax forms:

  • Form 103 (Retenciones): 10+ fields per document
  • Form 104 (IVA): 130+ fields across 7 sections
  • Time required: 3-4 hours per client
  • Monthly workload: 150-200 hours for 50 clients
  • Error rate: ~5% due to manual entry

The Solution

Automated PDF processing with:

  • ⚡ Up to 1,440x faster processing (3-4 hours → 10 seconds)
  • ✅ Eliminates manual-entry errors through automated extraction
  • 📊 Professional reports (Excel & PDF exports)
  • 👥 Multi-client management with yearly summaries
  • 🔐 Complete data isolation between users
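
The numbers in the two lists above follow from simple arithmetic, sketched here in Python. The constants are the estimates quoted in this README, not measurements, and the function names are made up for illustration.

```python
# Estimates quoted in this README; nothing here is measured by the code.
MANUAL_HOURS_PER_CLIENT = (3, 4)  # low and high estimate
CLIENTS_PER_MONTH = 50
AUTOMATED_SECONDS = 10


def monthly_hours(hours_per_client: int, clients: int = CLIENTS_PER_MONTH) -> int:
    """Manual hours per month for a given per-client estimate."""
    return hours_per_client * clients


def speedup(hours_per_client: int, automated_seconds: int = AUTOMATED_SECONDS) -> float:
    """How many times faster the automated run is than manual entry."""
    return hours_per_client * 3600 / automated_seconds


low, high = MANUAL_HOURS_PER_CLIENT
print(monthly_hours(low), monthly_hours(high))  # 150 200
print(speedup(low), speedup(high))              # 1080.0 1440.0
```

The 1,440x figure is therefore the upper end of the range: it assumes the 4-hour manual estimate.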

✨ Features

Core Functionality

  • 📤 Drag-and-drop PDF upload with bulk processing
  • 🔍 Automatic form detection (103 vs 104)
  • 📊 Intelligent field extraction using regex patterns
  • 💾 Client management organized by company and year
  • 📈 Yearly accumulations for tax reporting
  • 📄 Professional exports (Excel with formulas, branded PDFs)

User Experience

  • 🎨 Dark mode with proper contrast ratios
  • 📱 Mobile responsive design
  • 🚀 Real-time processing with progress indicators
  • 🎯 Guest mode (5 free documents, no signup)
  • ♾️ Unlimited access for registered users

Security & Authentication

  • 🔐 Google OAuth 2.0 integration
  • 🔒 Bcrypt password hashing
  • 🤖 reCAPTCHA v3 bot protection
  • 🛡️ Multi-tenant architecture with complete user isolation
  • 🍪 Secure session management
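
A minimal sketch of what per-user isolation can look like at the query level. It uses Python's built-in sqlite3 with an in-memory database purely so the example runs anywhere; the project itself uses PostgreSQL with SQLAlchemy, and the table layout and function names below are assumptions, not the actual schema.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway database with documents owned by two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, filename TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO documents (user_id, filename) VALUES (?, ?)",
        [(1, "a_103.pdf"), (1, "a_104.pdf"), (2, "b_103.pdf")],
    )
    return conn


def list_documents(conn: sqlite3.Connection, user_id: int) -> list:
    """Return only the filenames owned by user_id."""
    rows = conn.execute(
        "SELECT filename FROM documents WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [row[0] for row in rows]


def get_document(conn: sqlite3.Connection, user_id: int, document_id: int) -> Optional[str]:
    """Look up one document, scoped to its owner; other users get None."""
    row = conn.execute(
        "SELECT filename FROM documents WHERE id = ? AND user_id = ?",
        (document_id, user_id),
    ).fetchone()
    return row[0] if row else None


conn = make_demo_db()
print(list_documents(conn, 1))   # ['a_103.pdf', 'a_104.pdf']
print(get_document(conn, 2, 1))  # None: document 1 belongs to user 1
```

The point of the sketch is that the owner filter is part of every query, including lookups by ID, so guessing another user's document ID returns nothing.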

Technical Features

  • ⚡ Async processing for optimal performance
  • 💾 10 database tables with proper relationships
  • 🔄 35+ API endpoints (RESTful)
  • 📊 Analytics dashboard for admins
  • 🌐 CDN integration (Cloudflare) for a 70%+ cache hit rate

🛠️ Tech Stack

Frontend

Next.js, React, TypeScript, Tailwind CSS

Backend

FastAPI, Python, PostgreSQL, SQLAlchemy

DevOps & Tools

Docker, Railway, Cloudflare


📸 Screenshots

Landing Page

Clients Interface

Form 103 Display

Yearly Summary

Light Mode


🚀 Getting Started

Prerequisites

  • Node.js 18+ and npm
  • Python 3.11+
  • PostgreSQL 15+
  • Docker (optional)

Installation

1. Clone the repository

git clone https://github.com/CapBraco/tax_form_processor.git
cd tax_form_processor

2. Backend Setup

cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
cp .env.example .env
# Edit .env with your database credentials

# Run migrations
alembic upgrade head

# Start server
uvicorn main:app --reload

Backend will run at http://localhost:8000

3. Frontend Setup

cd frontend

# Install dependencies
npm install

# Create .env.local file
cp .env.example .env.local
# Edit .env.local with your backend URL

# Start development server
npm run dev

Frontend will run at http://localhost:3000

Environment Variables

Backend (.env)

DATABASE_URL=postgresql://user:password@localhost:5432/taxforms
SECRET_KEY=your-secret-key-min-32-characters
FRONTEND_URL=http://localhost:3000
GOOGLE_CLIENT_ID=your-google-oauth-client-id
GOOGLE_CLIENT_SECRET=your-google-oauth-secret
RECAPTCHA_SECRET_KEY=your-recaptcha-secret
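
One way to fail fast on a misconfigured backend is to validate these variables at startup. The sketch below is an assumption about how that could be done with the standard library only: load_settings is a hypothetical helper, not part of this codebase, and the 32-character minimum is inferred from the SECRET_KEY placeholder above.

```python
import os

# Variable names taken from the backend .env listing above.
REQUIRED = (
    "DATABASE_URL",
    "SECRET_KEY",
    "FRONTEND_URL",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
    "RECAPTCHA_SECRET_KEY",
)
MIN_SECRET_KEY_LENGTH = 32  # assumed from the placeholder text


def load_settings(env=None) -> dict:
    """Return the required settings, raising if any is missing or too weak."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    if len(env["SECRET_KEY"]) < MIN_SECRET_KEY_LENGTH:
        raise RuntimeError("SECRET_KEY must be at least 32 characters long")
    return {name: env[name] for name in REQUIRED}


# Example with an explicit mapping instead of the real process environment.
example_env = {name: "x" * 40 for name in REQUIRED}
print(sorted(load_settings(example_env)))
```

Passing a mapping instead of reading os.environ directly keeps the check easy to test.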

Frontend (.env.local)

NEXT_PUBLIC_API_URL=http://localhost:8000
NEXT_PUBLIC_SITE_URL=http://localhost:3000
NEXT_PUBLIC_RECAPTCHA_SITE_KEY=your-recaptcha-site-key

Docker Setup (Alternative)

# Build and run with docker-compose
docker-compose up -d

# Access the application
# Frontend: http://localhost:3000
# Backend: http://localhost:8000

🏗️ Architecture

System Architecture

┌──────────────────────────────────────────────────────────┐
│                    CLIENT LAYER                          │
│  (Next.js Frontend - React Components & TypeScript)      │
└────────────────────┬─────────────────────────────────────┘
                     │
                     │ HTTPS/REST API
                     ▼
┌──────────────────────────────────────────────────────────┐
│                    API LAYER                             │
│    (FastAPI - Async Python with Pydantic validation)     │
│                                                          │
│  ┌──────────┐  ┌───────────┐  ┌──────────┐               │
│  │   Auth   │  │  Upload   │  │ Clients  │               │
│  │  Routes  │  │  Routes   │  │  Routes  │               │
│  └──────────┘  └───────────┘  └──────────┘               │
└────────────────────┬─────────────────────────────────────┘
                     │
                     │ SQLAlchemy ORM
                     ▼
┌──────────────────────────────────────────────────────────┐
│                  DATABASE LAYER                          │
│         (PostgreSQL - 10 Tables, JSONB fields)           │
│                                                          │
│  users • documents • form_103_data • form_104_data       │
│  guest_sessions • temporary_files • analytics            │
└──────────────────────────────────────────────────────────┘

Data Flow: PDF Upload & Processing

1. User uploads PDF → Frontend validates file
2. API receives file → Saves to temp storage
3. pdfplumber extracts text → Regex patterns parse fields
4. Data saved to database → Async SQLAlchemy commit
5. Frontend polls status → Real-time updates
6. Success response → Display extracted data
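
Step 3 can be sketched roughly as follows. The sketch starts from text that has already been extracted from the PDF, so it needs no third-party packages. The marker phrases, the line layout ("label, field code, amount") and the function names are all assumptions made for illustration; the real patterns in the backend may differ.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumed marker phrases for telling the two forms apart.
FORM_MARKERS = {
    "103": re.compile(r"RETENCIONES EN LA FUENTE", re.IGNORECASE),
    "104": re.compile(r"IMPUESTO AL VALOR AGREGADO", re.IGNORECASE),
}

# Hypothetical line layout: "<label> <3-4 digit field code> <amount>".
FIELD_LINE = re.compile(
    r"^(?P<label>.+?)[ \t]+(?P<code>\d{3,4})[ \t]+(?P<amount>\d+(?:\.\d{2})?)$",
    re.MULTILINE,
)


def detect_form_type(text: str) -> Optional[str]:
    """Return "103" or "104" if a known marker phrase appears, else None."""
    for form_type, marker in FORM_MARKERS.items():
        if marker.search(text):
            return form_type
    return None


def extract_fields(text: str) -> dict:
    """Map each field code found in the text to its amount as a Decimal."""
    return {m["code"]: Decimal(m["amount"]) for m in FIELD_LINE.finditer(text)}


sample = (
    "DECLARACION DE RETENCIONES EN LA FUENTE DEL IMPUESTO A LA RENTA\n"
    "TOTAL DE RETENCION 499 125.50\n"
    "TOTAL PAGADO 999 125.50\n"
)
print(detect_form_type(sample))  # 103
print(extract_fields(sample))
```

Amounts are parsed as Decimal rather than float so that tax values are not subject to binary rounding. Thousands separators and negative values are not handled in this sketch.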

Database Schema

Key Tables:

  • users - Authentication and profiles
  • documents - Metadata for uploaded PDFs
  • form_103_data - Extracted data from Form 103
  • form_104_data - Extracted data from Form 104 (130 fields)
  • guest_sessions - Temporary sessions with document limits

📚 API Documentation

Interactive API documentation available at:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

Key Endpoints

Authentication

POST   /api/auth/register          # Create new account
POST   /api/auth/login             # Login with credentials
POST   /api/auth/google            # Google OAuth login
POST   /api/auth/logout            # Logout
GET    /api/auth/me                # Get current user

Documents

POST   /api/upload/bulk            # Upload multiple PDFs
GET    /api/documents              # List user documents
GET    /api/documents/{id}         # Get document details
DELETE /api/documents/{id}         # Delete document

Forms

GET    /api/forms-data/103/{id}    # Get Form 103 data
GET    /api/forms-data/104/{id}    # Get Form 104 data

Clients

GET    /api/clientes               # List user clients
GET    /api/clientes/{name}        # Get client details
POST   /api/clientes/export        # Export to Excel/PDF
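
When calling these endpoints from a script, path parameters such as the client name need percent-encoding. The sketch below only builds URLs and sends nothing. The helper names are hypothetical, and the base URL is the local development address used earlier in this README.

```python
from urllib.parse import quote

API_BASE = "http://localhost:8000"  # local backend address from Getting Started


def form_data_url(form_type: str, document_id: int) -> str:
    """URL for GET /api/forms-data/{103|104}/{id}."""
    if form_type not in ("103", "104"):
        raise ValueError("form_type must be '103' or '104'")
    return f"{API_BASE}/api/forms-data/{form_type}/{document_id}"


def client_url(name: str) -> str:
    """URL for GET /api/clientes/{name}, with the name percent-encoded."""
    return f"{API_BASE}/api/clientes/{quote(name, safe='')}"


print(form_data_url("104", 42))  # http://localhost:8000/api/forms-data/104/42
print(client_url("ACME S.A."))   # http://localhost:8000/api/clientes/ACME%20S.A.
```

Authenticated endpoints will also need the session cookie or credentials obtained from the auth routes; that part is left out here.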

🚀 Deployment

Railway Deployment (Current)

  1. Backend Service

    • Build: Dockerfile in backend/
    • Start: uvicorn main:app --host 0.0.0.0 --port $PORT
    • Environment: Production variables from Railway
  2. Frontend Service

    • Build: npm run build
    • Start: npm run start
    • Environment: Production variables from Railway
  3. PostgreSQL Database

    • Managed by Railway
    • Automatic backups
    • Connection string in DATABASE_URL

Custom Domain Setup

  1. Point DNS to Railway (CNAME)
  2. Configure Cloudflare for CDN
  3. Enable SSL/TLS (Full strict mode)
  4. Set up page rules for caching

🗺️ Roadmap

Q1 2026

  • Form 101 support (Impuesto a la Renta)
  • Form 106 support (ATS)
  • Email notifications
  • Bulk document deletion

Q2 2026

  • Mobile app (React Native)
  • API for third-party integrations
  • Advanced analytics dashboard
  • Scheduled reports (weekly/monthly)

Q3 2026

  • Multi-language support (English)
  • WhatsApp notifications
  • Collaborative features (team accounts)

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Please read CONTRIBUTING.md for details on our code of conduct.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


👤 Contact

Bryan A Paucar - @devbraco

Project Link: https://github.com/CapBraco/tax_form_processor

Live Demo: https://tax.capbraco.com


🙏 Acknowledgments

  • Inspired by real-world accounting workflows in Ecuador
  • Thanks to my father for inspiring me to build this app
  • Built with modern web technologies and best practices
  • Special thanks to the FastAPI and Next.js communities

Made with ❤️ by Bryan A Paucar

⭐ Star this repo if you find it helpful!
