Technical Documentation

API, architecture, and developer resources

anonymize.today - Architecture

Classification: PUBLIC

Overview

anonymize.today is built on a modern, scalable architecture designed for performance, security, and reliability. This document provides a high-level overview of the system architecture, technology stack, and key components.


System Components

Frontend

  • Framework: Next.js 14 (App Router)
  • Language: TypeScript
  • UI Library: React 18
  • Styling: Tailwind CSS
  • Components: Radix UI, shadcn/ui
  • State Management: React Query, Zustand
  • Authentication: NextAuth.js

Backend Services

  • Language: Python 3.12
  • Framework: Microsoft Presidio
  • API Framework: FastAPI
  • NLP Library: spaCy
  • Services:
    • Analyzer (Port 8011)
    • Anonymizer (Port 8012)
    • Image Redactor (Port 8013)
    • Structured Data (Port 8014)

Database

  • Database: PostgreSQL 16
  • ORM: Prisma
  • Connection: Local connections only, for security

Infrastructure

  • Web Server: Nginx (reverse proxy)
  • SSL/TLS: Let's Encrypt certificates
  • Process Management: Systemd
  • Operating System: Ubuntu 24.04 LTS

Technology Stack

Frontend Technologies

  • Next.js 14.2.28
  • React 18.2.0
  • TypeScript 5.2.2
  • Tailwind CSS 3.3.3
  • Prisma 6.7.0

Backend Technologies

  • Python 3.12+
  • Microsoft Presidio
  • FastAPI
  • spaCy (NLP)
  • PostgreSQL 16

Infrastructure

  • Nginx
  • Systemd
  • UFW (Firewall)
  • Fail2Ban

System Architecture

┌─────────────────┐
│   Internet      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Nginx (443)   │  ← SSL/TLS Termination
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
    ▼         ▼
┌─────────┐ ┌──────────────────┐
│Frontend │ │  Backend Services│
│(Next.js)│ │  (Presidio)      │
│Port 3000│ │  Ports 8011-8014 │
└────┬────┘ └────────┬─────────┘
     │               │
     └───────┬───────┘
             │
             ▼
      ┌──────────────┐
      │  PostgreSQL  │
      │  Port 5432   │
      └──────────────┘

Data Flow

Analysis Flow

  1. User submits text via web interface
  2. Frontend sends request to Analyzer service
  3. Analyzer processes text with Presidio NLP
  4. Results returned to frontend
  5. Detected entities displayed to user
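
The five steps above can be sketched in Python. This is an illustrative sketch rather than the service's actual client code: the loopback URL, the /analyze path, and the JSON field names are assumptions modeled on the REST interface of the open-source Presidio analyzer, and the sample response is invented for the example. Only port 8011 comes from this document.

```python
import json

# Assumed endpoint; only the port number is taken from this document.
ANALYZER_URL = "http://127.0.0.1:8011/analyze"


def build_analyze_request(text: str, language: str = "en") -> bytes:
    """Serialize the JSON body the frontend would send (steps 1-2)."""
    return json.dumps({"text": text, "language": language}).encode("utf-8")


def summarize_entities(text: str, results: list[dict]) -> list[tuple[str, str, float]]:
    """Turn analyzer results into (entity_type, matched_text, score) rows (steps 4-5)."""
    rows = []
    for item in sorted(results, key=lambda r: r["start"]):
        matched = text[item["start"]:item["end"]]
        rows.append((item["entity_type"], matched, item["score"]))
    return rows


text = "Contact Jane Doe at jane@example.com"
body = build_analyze_request(text)  # a real client would POST this to ANALYZER_URL

# Hypothetical analyzer response for the text above (step 3 runs server-side).
sample_results = [
    {"entity_type": "EMAIL_ADDRESS", "start": 20, "end": 36, "score": 1.0},
    {"entity_type": "PERSON", "start": 8, "end": 16, "score": 0.85},
]
rows = summarize_entities(text, sample_results)
```

Each row pairs an entity type with the exact substring it covers, which is the information the interface needs in order to highlight detected entities.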

Anonymization Flow

  1. User selects anonymization operators
  2. Frontend sends text and operators to Anonymizer service
  3. Anonymizer applies operators to detected PII
  4. Anonymized text returned to user
  5. Token cost deducted from user balance
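
A rough sketch of steps 3 to 5 follows. The operator names "replace" and "mask" are loosely modeled on Presidio's operators, but the implementations here are plain-Python stand-ins, and the pricing rule in deduct_tokens is entirely hypothetical, since this document does not state how token cost is calculated.

```python
def apply_operators(text: str, entities: list[dict], operators: dict[str, dict]) -> str:
    """Apply one operator per detected entity, working right to left so that
    earlier offsets stay valid while the text changes length."""
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        default = operators.get("DEFAULT", {"type": "replace"})
        op = operators.get(ent["entity_type"], default)
        original = text[ent["start"]:ent["end"]]
        if op["type"] == "mask":
            replacement = op.get("masking_char", "*") * len(original)
        else:  # treat everything else as "replace"
            replacement = op.get("new_value", f"<{ent['entity_type']}>")
        text = text[:ent["start"]] + replacement + text[ent["end"]:]
    return text


def deduct_tokens(balance: int, text: str, cost_per_1000_chars: int = 1) -> int:
    """Hypothetical pricing rule: one token per started block of 1000 characters."""
    cost = cost_per_1000_chars * -(-len(text) // 1000)  # ceiling division
    if cost > balance:
        raise ValueError("insufficient token balance")
    return balance - cost


text = "Contact Jane Doe at jane@example.com"
entities = [
    {"entity_type": "PERSON", "start": 8, "end": 16},
    {"entity_type": "EMAIL_ADDRESS", "start": 20, "end": 36},
]
operators = {
    "PERSON": {"type": "replace", "new_value": "<PERSON>"},
    "EMAIL_ADDRESS": {"type": "mask", "masking_char": "*"},
}
anonymized = apply_operators(text, entities, operators)
remaining = deduct_tokens(10, text)
```

Processing entities from the end of the text backwards avoids recomputing offsets after each substitution.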

Integration Points

External Services

  • PayPal: Payment processing and subscriptions
  • Stripe: Payment processing and subscriptions
  • Microsoft 365: Email service for transactional emails
  • Google reCAPTCHA: Bot protection
  • GeoIP Service: Location detection for session tracking

API Access

  • RESTful API for programmatic access
  • JWT-based authentication
  • Rate limiting and error handling
  • Comprehensive API documentation
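
This document does not say which rate-limiting algorithm the API uses. As one common possibility, a token bucket can be sketched as follows; the class name, capacity, and refill rate are illustrative assumptions.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a web handler would typically answer with HTTP 429 here
```

In practice one bucket would be kept per API token or client address.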

Scalability Considerations

Performance Optimizations

  • Lazy Loading: Language models loaded on-demand
  • Caching: Model caching for improved performance
  • Efficient Processing: Optimized NLP processing
  • Batch Processing: Efficient bulk operations
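
Lazy loading combined with caching can be illustrated with Python's functools.lru_cache. The loader below is a stand-in that returns a dictionary instead of a real language model; the function names and the cache size are assumptions made for the example.

```python
from functools import lru_cache

load_count = {"en": 0, "de": 0}  # counts how often the expensive load really runs


def _load_model_from_disk(language: str) -> dict:
    """Stand-in for an expensive load such as spacy.load(...)."""
    load_count[language] = load_count.get(language, 0) + 1
    return {"language": language, "pipeline": f"{language}_model"}


@lru_cache(maxsize=4)  # keep at most four models in memory
def get_model(language: str) -> dict:
    """Load a model on first use and reuse the cached instance afterwards."""
    return _load_model_from_disk(language)


first = get_model("en")
second = get_model("en")  # served from the cache, no second load
```

Nothing is loaded for a language until a request actually needs it, and repeated requests for the same language share one instance.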

Resource Management

  • Memory Limits: Service-specific memory limits
  • CPU Optimization: Efficient resource usage
  • Database Optimization: Indexed queries and efficient schemas

Security Architecture

Security Layers

  1. Network Security: Firewall, DDoS protection
  2. Transport Security: TLS 1.3 encryption
  3. Application Security: Authentication, authorization, input validation
  4. Data Security: Encryption at rest, secure storage
  5. Monitoring: Logging, audit trails, intrusion detection

Authentication Flow

  1. User authenticates via NextAuth.js
  2. JWT token generated and stored in HTTP-only cookie
  3. Token validated on each request
  4. Session management with device tracking
  5. Automatic revocation on security events
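
Steps 2 and 3 can be illustrated with a generic HS256-signed JWT built from the Python standard library. This is a sketch of the general technique only: NextAuth.js may encode or encrypt its session tokens differently, and the cookie name, claim set, and function names here are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from http.cookies import SimpleCookie


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(data: str) -> bytes:
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))


def _sign(signing_input: str, secret: bytes) -> str:
    return _b64url(hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest())


def issue_token(user_id: str, secret: bytes, ttl_seconds: int = 3600, now=None) -> str:
    """Create a signed token carrying a subject and an expiry time."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": user_id, "exp": now + ttl_seconds}).encode())
    return f"{header}.{payload}.{_sign(header + '.' + payload, secret)}"


def validate_token(token: str, secret: bytes, now=None):
    """Return the claims if signature and expiry check out, otherwise None."""
    now = int(time.time()) if now is None else now
    try:
        header, payload, signature = token.split(".")
        if not hmac.compare_digest(signature, _sign(f"{header}.{payload}", secret)):
            return None
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, TypeError):
        return None
    if not isinstance(claims, dict):
        return None
    return claims if claims.get("exp", 0) > now else None


def session_cookie(token: str) -> str:
    """Build a Set-Cookie value that scripts in the browser cannot read."""
    cookie = SimpleCookie()
    cookie["session"] = token  # hypothetical cookie name
    cookie["session"]["httponly"] = True
    cookie["session"]["secure"] = True
    cookie["session"]["samesite"] = "Lax"
    return cookie["session"].OutputString()
```

Revocation on security events (step 5) cannot be done with the token alone; it would need server-side state such as a session table, which this sketch leaves out.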

Deployment Architecture

Service Deployment

  • Frontend: Deployed on high-speed volume
  • Backend Services: Deployed on high-speed volume
  • Database: Deployed on high-speed volume
  • Nginx: Reverse proxy and SSL termination

High Availability

  • Service Monitoring: Systemd service management
  • Automatic Restart: Service auto-restart on failure
  • Health Checks: Regular health check endpoints
  • Backup System: Automated backup and recovery
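
A health check across the backend services might look like the sketch below. The port numbers come from this document; the /health path, the loopback address, and the function names are assumptions. Restarting a failed service is left to systemd, as described above.

```python
import urllib.error
import urllib.request

SERVICES = {
    "analyzer": 8011,
    "anonymizer": 8012,
    "image-redactor": 8013,
    "structured-data": 8014,
}


def http_probe(port: int, timeout: float = 2.0) -> bool:
    """Return True if the (assumed) /health endpoint answers with HTTP 200."""
    url = f"http://127.0.0.1:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def check_all(probe=http_probe) -> dict[str, bool]:
    """Probe every service; the probe is injectable so it can be faked in tests."""
    return {name: probe(port) for name, port in SERVICES.items()}
```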

Performance Characteristics

Response Times

  • Analysis: < 1 second for typical text
  • Anonymization: < 1 second for typical text
  • API Calls: < 500ms average
  • Page Load: < 2 seconds

Throughput

  • Concurrent Users: Supports multiple concurrent users
  • Batch Processing: Efficient bulk processing
  • API Rate Limits: Appropriate rate limiting

Additional Resources


Note: Detailed architecture diagrams are available in the diagrams/ directory.

Server Infrastructure Report

Classification: PUBLIC
Purpose: Transparent documentation of server infrastructure for GDPR and ISO27001 compliance


Executive Summary

anonymize.today operates on enterprise-grade cloud infrastructure provided by Hetzner Online GmbH, a leading European hosting provider with ISO27001:2022 certification. Our infrastructure is located in Nuremberg, Germany, ensuring GDPR compliance through EU data residency and comprehensive security measures.

This document provides transparent information about our infrastructure, security measures, and compliance status without revealing sensitive security details.


Infrastructure Provider: Hetzner Online GmbH

Company Overview

Hetzner Online GmbH is a German hosting provider founded in 1997, specializing in dedicated servers, cloud hosting, and web hosting services. The company operates multiple data centers across Germany and Finland, serving customers worldwide with a focus on performance, reliability, and data protection.

Website: https://www.hetzner.com/

ISO27001:2022 Certification

Hetzner Online GmbH is ISO27001:2022 certified, demonstrating its commitment to information security management. The certification covers:

  • Systematic Security Management: Comprehensive Information Security Management System (ISMS)
  • Risk Management: Regular risk assessments and mitigation strategies
  • Continuous Improvement: Ongoing security monitoring and improvement processes
  • Compliance: Adherence to international information security standards
  • Audit Trail: Regular third-party audits and certifications

Benefits for anonymize.today:

  • ✅ Infrastructure-level security controls
  • ✅ Certified data center operations
  • ✅ Regular security audits
  • ✅ Compliance with international standards
  • ✅ Enhanced trust and reliability

Data Privacy and GDPR Compliance

Hetzner's GDPR Compliance:

  • EU-Based Infrastructure: All data centers located in EU/EEA (Germany, Finland)
  • GDPR-Compliant Operations: Full compliance with General Data Protection Regulation
  • Data Processing Agreements: Standard DPA available for customers
  • Data Residency: Data stored exclusively in EU locations
  • Privacy Policy: Comprehensive privacy policy aligned with GDPR requirements

Location Benefits:

  • ✅ EU data residency (GDPR compliance)
  • ✅ German data protection laws apply
  • ✅ No data transfers outside EU/EEA
  • ✅ Strong privacy regulations

Infrastructure Advantages

Hetzner Cloud Infrastructure provides:

  1. High Performance
    • Modern AMD EPYC processors
    • NVMe SSD storage
    • High-speed network connections
    • Low latency
  2. Reliability
    • 99.9% uptime SLA
    • Redundant network infrastructure
    • Automated failover capabilities
    • Regular maintenance windows
  3. Scalability
    • Flexible resource allocation
    • Easy scaling of compute and storage
    • Pay-as-you-go pricing
    • No vendor lock-in
  4. Security
    • ISO27001:2022 certified data centers
    • Physical security controls
    • Network security
    • DDoS protection
    • Firewall services
  5. Transparency
    • Clear pricing
    • Detailed documentation
    • Status page for service updates
    • Open communication

Infrastructure Details

Server Location

  • Data Center: Nuremberg, Germany
  • Country: Germany (EU/EEA)
  • Region: Europe
  • GDPR Compliance: ✅ Full compliance through EU data residency

Server Type

  • Infrastructure: Hetzner Cloud Server
  • Operating System: Ubuntu 24.04 LTS
  • Architecture: x86_64
  • CPU: AMD EPYC processors
  • Memory: 15 GB RAM
  • Storage: High-speed NVMe volumes

Network Configuration

  • Public IP: Assigned by Hetzner
  • Network: Hetzner's high-speed network infrastructure
  • Bandwidth: High-speed connection with low latency
  • DDoS Protection: Hetzner's network-level DDoS protection

Storage Infrastructure

Hetzner Cloud Volumes

anonymize.today uses Hetzner Cloud Volumes for persistent, high-performance storage:

Main Volume (High-Speed)

  • Purpose: Application data, services, and database
  • Type: High-speed NVMe volume
  • Performance: Optimized for high I/O operations
  • Redundancy: Hetzner's volume redundancy and backup systems
  • Location: Nuremberg data center

Contents:

  • Frontend application files
  • Backend services (Presidio)
  • PostgreSQL database
  • Configuration files
  • SSL certificates

Backup Volume

  • Purpose: Automated backup storage
  • Type: High-speed NVMe volume
  • Performance: Optimized for backup operations
  • Redundancy: Separate from main volume for disaster recovery
  • Location: Nuremberg data center

Contents:

  • Full system backups
  • Incremental backups
  • Database backups
  • Configuration backups

Volume Benefits

Hetzner Cloud Volumes provide:

  • High Performance: NVMe SSD technology for fast I/O
  • Persistent Storage: Data persists across server restarts
  • Scalability: Easy to resize as needed
  • Redundancy: Built-in redundancy and protection
  • Snapshot Support: Point-in-time snapshots available
  • Backup Integration: Integrated with Hetzner's backup systems

Security Infrastructure

Hetzner Cloud Firewall

anonymize.today utilizes Hetzner's Cloud Firewall service for network-level security:

Features:

  • Network-Level Protection: Firewall rules applied at network level
  • Stateful Firewall: Tracks connection state
  • Rule-Based Access Control: Granular control over network traffic
  • DDoS Protection: Network-level DDoS mitigation
  • Traffic Filtering: Inbound and outbound traffic filtering

Configuration:

  • Only necessary ports open (HTTP, HTTPS, SSH on custom port)
  • Default deny policy for all other traffic
  • Rate limiting for connection attempts
  • Geographic filtering capabilities

Benefits:

  • ✅ Additional security layer in front of the server's own firewall
  • ✅ Protection against network-level attacks
  • ✅ Reduced attack surface
  • ✅ Centralized firewall management
  • ✅ Integration with Hetzner's security infrastructure

Application-Level Security

In addition to Hetzner's firewall, anonymize.today implements:

  • UFW: Additional host-level firewall rules on the server itself
  • Fail2Ban: Intrusion prevention for SSH and web services
  • Security Headers: HTTP security headers (HSTS, CSP, etc.)
  • TLS Encryption: TLS 1.3 for all connections
  • Access Control: Role-based and plan-based access control

Backup and Disaster Recovery

Automated Backup System

anonymize.today implements a comprehensive automated backup system:

Backup Schedule

  • Full Backups: Weekly (Sunday 02:00)
  • Incremental Backups: Twice daily (02:00 and 14:00)
  • Database Backups: With each backup operation
  • Verification: Automated backup verification (Sunday 04:00)
  • Retention: 30-day retention policy
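
The schedule and retention rule above can be expressed as a short sketch. It assumes that the Sunday 02:00 run is a full backup taking the place of that slot's incremental run, which this document does not state explicitly; the function names are illustrative.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)


def backup_kind(run_time: datetime) -> str:
    """Sunday 02:00 is the weekly full backup; every other run is incremental."""
    is_sunday = run_time.weekday() == 6  # Monday is 0, Sunday is 6
    return "full" if is_sunday and run_time.hour == 2 else "incremental"


def expired(backups: list[datetime], now: datetime) -> list[datetime]:
    """Backups older than the retention window, oldest first."""
    return sorted(b for b in backups if now - b > RETENTION)


now = datetime(2025, 3, 1, 3, 0)
backups = [datetime(2025, 2, 20, 14, 0), datetime(2025, 1, 5, 2, 0)]
to_delete = expired(backups, now)  # only the January backup is past 30 days
```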

Backup Components

All critical components are backed up:

  1. Application Files
    • Frontend source code
    • Backend services
    • Configuration files
  2. Database
    • Complete PostgreSQL database dumps
    • Transaction logs
    • User data and metadata
  3. Configuration
    • System configuration
    • Service configurations
    • SSL certificates
    • Security settings
  4. Metadata
    • Backup manifests
    • Checksums for integrity verification
    • Backup timestamps

Backup Storage

  • Location: Separate backup volume (Hetzner Cloud Volume)
  • Redundancy: Separate from production data
  • Encryption: Backups stored in encrypted form
  • Verification: Automated integrity checks
  • Retention: 30-day retention with automated cleanup
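
Checksum-based integrity verification, as mentioned for backup manifests, can be sketched as follows. The manifest file name, its JSON layout, and the choice of SHA-256 are assumptions; this document does not specify the actual format.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical file name


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(backup_dir: Path) -> Path:
    """Record a checksum for every file in the backup directory."""
    entries = {
        p.name: sha256_of(p)
        for p in sorted(backup_dir.iterdir())
        if p.is_file() and p.name != MANIFEST_NAME
    }
    manifest = backup_dir / MANIFEST_NAME
    manifest.write_text(json.dumps(entries, indent=2))
    return manifest


def verify_manifest(backup_dir: Path) -> list[str]:
    """Return the names of files that are missing or whose checksum changed."""
    entries = json.loads((backup_dir / MANIFEST_NAME).read_text())
    return [
        name
        for name, expected in entries.items()
        if not (backup_dir / name).is_file() or sha256_of(backup_dir / name) != expected
    ]
```

An empty result from verify_manifest means every file listed in the manifest is present and unchanged.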

Recovery Capabilities

  • Full System Restoration: Complete system recovery from backups
  • Component-Level Restoration: Selective restoration of individual components
  • Point-in-Time Recovery: Recovery to specific backup points
  • Disaster Recovery: RTO of 4 hours, RPO of 12 hours

Data Redundancy

Hetzner Infrastructure:

  • Volume Redundancy: Built-in redundancy in Hetzner Cloud Volumes
  • Network Redundancy: Redundant network paths
  • Power Redundancy: Uninterruptible power supply (UPS) and backup generators
  • Cooling Redundancy: Redundant cooling systems

Application-Level:

  • Database Replication: PostgreSQL replication capabilities
  • Backup Redundancy: Multiple backup copies
  • Geographic Distribution: Backup storage in the same data center, with replication options

Security Hardening

Server Hardening Measures

anonymize.today implements comprehensive server hardening aligned with ISO27001:2022 and industry best practices:

Network Security

  • Firewall Configuration:
    • Hetzner Cloud Firewall (network-level)
    • UFW firewall (host-level)
    • Only necessary ports open
    • Default deny policy
  • SSH Hardening:
    • Custom SSH port (not default port 22)
    • Key-based authentication only
    • Password authentication disabled
    • Fail2Ban protection

System Security

  • Automatic Security Updates: Enabled for critical security patches
  • Security Monitoring: Intrusion detection and monitoring
  • Access Control: Least privilege access principles
  • Audit Logging: Comprehensive audit trails

Application Security

  • TLS Encryption: TLS 1.3 for all connections
  • Security Headers: HSTS, CSP, X-Frame-Options, etc.
  • Input Validation: Comprehensive input validation and sanitization
  • Rate Limiting: Protection against abuse and DoS attacks

Security Tools (Available)

The following security tools are available and can be activated:

  • AIDE: File integrity monitoring
  • rkhunter: Rootkit detection
  • ClamAV: Antivirus scanning
  • OSSEC: Intrusion detection system
  • auditd: Comprehensive audit logging

These tools provide additional layers of security monitoring and threat detection.


Monitoring and Availability

Uptime and Reliability

  • Target Uptime: 99.9% (Hetzner SLA)
  • Monitoring: Continuous service monitoring
  • Alerting: Automated alerts for service issues
  • Status Page: Public status page for service transparency

Resource Monitoring

  • CPU Usage: Monitored and optimized
  • Memory Usage: Tracked with automatic optimization
  • Disk Space: Automated monitoring and alerts
  • Network: Bandwidth and latency monitoring

Log Management

  • Centralized Logging: All services log to centralized location
  • Log Retention: Per compliance requirements
  • Log Analysis: Automated log analysis for security events
  • Audit Trails: Complete audit trails for compliance

Compliance and Certifications

ISO27001:2022 Compliance

Hetzner's Certification:

  • ✅ ISO27001:2022 certified data centers
  • ✅ Certified Information Security Management System (ISMS)
  • ✅ Regular third-party audits
  • ✅ Continuous improvement processes

anonymize.today Implementation:

  • ✅ 86% ISO27001:2022 implementation
  • ✅ Comprehensive security policies
  • ✅ Risk assessment and management
  • ✅ Incident response procedures
  • ✅ Access control policies
  • ✅ Audit logging and monitoring

GDPR Compliance

Data Residency:

  • ✅ All data stored in EU (Germany)
  • ✅ No data transfers outside EU/EEA
  • ✅ GDPR-compliant data processing
  • ✅ Data subject rights implemented

Data Protection:

  • ✅ Encryption at rest (AES-256-GCM)
  • ✅ Encryption in transit (TLS 1.3)
  • ✅ Access controls
  • ✅ Data minimization
  • ✅ Retention policies

Privacy:

  • ✅ Privacy by design
  • ✅ Data processing agreements
  • ✅ Privacy policy
  • ✅ User data export (GDPR Article 20)

Additional Compliance

  • OWASP Top 10: Protection against common vulnerabilities
  • CIS Controls: Implementation of Center for Internet Security controls
  • Industry Best Practices: Following industry security standards

Data Protection Measures

Encryption

Data in Transit:

  • TLS 1.3 encryption for all connections
  • HTTPS-only access
  • Secure certificate management (Let's Encrypt)
  • HSTS (HTTP Strict Transport Security)

Data at Rest:

  • AES-256-GCM encryption for sensitive data
  • Encrypted database fields
  • Encrypted backup storage
  • Secure key management

Access Control

  • Authentication: Two-factor authentication (2FA) support
  • Authorization: Role-based access control
  • Session Management: Secure session handling
  • API Security: Secure API token management

Data Minimization

  • No Text Storage: User text processed in real-time, not stored
  • Metadata Only: Only necessary metadata retained
  • Retention Policies: Data retained per compliance requirements
  • Deletion: Secure data deletion upon account closure
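
The "metadata only" principle can be illustrated with a record that describes a request without containing any of its text. Which fields are actually retained is not stated in this document, so the field names below are purely hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def usage_record(user_id: str, text: str, entities: list[dict]) -> dict:
    """Build a record with counts and sizes only; the text itself is never copied."""
    return {
        "user_id": user_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "char_count": len(text),
        "entity_counts": dict(Counter(e["entity_type"] for e in entities)),
    }


record = usage_record(
    "u1",
    "Jane Doe",
    [{"entity_type": "PERSON", "start": 0, "end": 8}],
)
```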

Disaster Recovery

Recovery Objectives

  • RTO (Recovery Time Objective): 4 hours
  • RPO (Recovery Point Objective): 12 hours
  • MTPD (Maximum Tolerable Period of Disruption): 24 hours
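
The 12-hour RPO lines up with the twice-daily backup schedule (02:00 and 14:00) stated in this document. The arithmetic can be checked with a short sketch; it considers only the gap between scheduled runs and ignores how long a backup takes to complete.

```python
from datetime import time, timedelta

BACKUP_TIMES = [time(2, 0), time(14, 0)]  # schedule stated in this document
RPO = timedelta(hours=12)


def worst_case_gap(times: list[time]) -> timedelta:
    """Longest interval between consecutive daily backups, wrapping past midnight."""
    minutes = sorted(t.hour * 60 + t.minute for t in times)
    gaps = [later - earlier for earlier, later in zip(minutes, minutes[1:])]
    gaps.append(24 * 60 - minutes[-1] + minutes[0])  # last run to first run next day
    return timedelta(minutes=max(gaps))


gap = worst_case_gap(BACKUP_TIMES)
meets_rpo = gap <= RPO
```

With runs at 02:00 and 14:00 both intervals are 12 hours, so a failure just before a scheduled run loses at most 12 hours of changes.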

Backup Strategy

  • Automated Backups: 2x daily incremental, weekly full
  • Backup Verification: Automated integrity checks
  • Backup Retention: 30-day retention policy
  • Backup Location: Separate volume for redundancy

Recovery Procedures

  • Full System Recovery: Complete restoration from backups
  • Component Recovery: Selective component restoration
  • Point-in-Time Recovery: Recovery to specific backup points
  • Testing: Regular recovery testing and validation

Physical Security

Data Center Security (Hetzner)

Hetzner's Nuremberg data center provides:

  • Physical Access Control: Restricted access with authentication
  • Video Surveillance: 24/7 video monitoring
  • Security Personnel: On-site security staff
  • Fire Suppression: Advanced fire detection and suppression
  • Environmental Controls: Climate control and monitoring
  • Power Redundancy: UPS and backup generators
  • Network Redundancy: Multiple network providers

Network Security

Hetzner Network Infrastructure

  • DDoS Protection: Network-level DDoS mitigation
  • Traffic Filtering: Advanced traffic filtering
  • Network Monitoring: Continuous network monitoring
  • Redundancy: Redundant network paths
  • Performance: High-speed, low-latency connections

Application Network Security

  • Firewall Rules: Granular firewall rules
  • Rate Limiting: Protection against abuse
  • Intrusion Detection: Network-level intrusion detection
  • Traffic Analysis: Monitoring and analysis of network traffic

Transparency and Compliance

Public Documentation

This document is part of anonymize.today's commitment to transparency and compliance:

  • GDPR Transparency: Public information about data processing
  • ISO27001 Transparency: Information about security measures
  • Infrastructure Transparency: Details about hosting and infrastructure
  • Security Transparency: Public security overview (without sensitive details)

Regular Updates

  • Documentation Updates: Regular updates to reflect current infrastructure
  • Compliance Status: Current compliance status
  • Security Improvements: Public security enhancements
  • Incident Reporting: Transparent incident reporting (when applicable)

Contact and Support

For questions about infrastructure or compliance:


Infrastructure Provider: Hetzner Online GmbH
Data Center Location: Nuremberg, Germany (EU)


Note: This document provides public information about infrastructure and compliance. For detailed technical documentation, see internal documentation (available to authorized personnel only).