AI Services

AppArt Agent uses Google Gemini for document analysis, classification, and image generation.

Overview

| Service | Model | Purpose |
|---|---|---|
| Document Analysis | gemini-2.5-flash | Native PDF processing with thinking/reasoning |
| Image Generation | gemini-2.5-flash-image | Photo redesign and visualization |

Architecture

flowchart TB
    subgraph AILayer["AI Service Layer"]
        subgraph Services["Services"]
            DA["DocumentAnalyzer<br/>classify() | analyze() | synthesize()"]
            DP["DocumentProcessor<br/>process() | bulk_process()"]
            IG["ImageGenerator<br/>generate() | redesign()"]
        end
        SDK["Google Generative AI SDK"]
    end

    subgraph Auth["Authentication"]
        APIKey["API Key<br/>(Local Dev)"]
        VertexAI["Vertex AI<br/>(Production)"]
        Impersonation["Service Account<br/>Impersonation"]
    end

    subgraph External["External API"]
        Gemini["Google Gemini<br/>gemini-2.5-flash<br/>gemini-2.5-flash-image"]
    end

    DA --> SDK
    DP --> SDK
    IG --> SDK
    SDK --> Auth
    APIKey --> Gemini
    VertexAI --> Gemini
    Impersonation --> VertexAI

Document Analyzer

The DocumentAnalyzer class provides text and vision-based document analysis.

Initialization

from app.services.ai.document_analyzer import DocumentAnalyzer

analyzer = DocumentAnalyzer()

The analyzer automatically configures:

  • Vertex AI mode (production): Uses a service account or Application Default Credentials (ADC) on GCP
  • API Key mode (development): Uses GOOGLE_CLOUD_API_KEY for direct API access

Document Classification

Automatically identifies document types using native PDF input:

classification = await analyzer.classify_document(
    pdf_bytes=pdf_file_bytes,  # Native PDF sent directly to Gemini
    filename="pv_ag_2024.pdf"
)

# Returns:
{
    "document_type": "pv_ag",
    "confidence": 0.95,
    "reasoning": "Document contains assembly meeting minutes header..."
}

Supported Document Types (5 categories):

| Type | Description |
|---|---|
| pv_ag | Procès-verbal d'Assemblée Générale (general assembly meeting minutes) |
| diags | Diagnostic documents (DPE, amiante, plomb, electric, gas, etc.) |
| taxe_fonciere | Taxe Foncière (property tax) |
| charges | Copropriété charges (co-ownership fees) |
| other | Other property documents (rules, contracts, insurance, etc.) |
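
Since the classifier returns a type label and a confidence score, a caller may want to validate both before routing a document to a type-specific prompt. The following is a minimal sketch only: `normalize_classification` and the 0.6 threshold are hypothetical and not part of AppArt Agent.

```python
SUPPORTED_TYPES = {"pv_ag", "diags", "taxe_fonciere", "charges", "other"}


def normalize_classification(result: dict, min_confidence: float = 0.6) -> dict:
    """Return a copy of `result` whose document_type is safe to route on.

    Hypothetical helper: unknown types and low-confidence answers fall
    back to "other". The 0.6 threshold is an illustrative assumption.
    """
    doc_type = result.get("document_type")
    confidence = float(result.get("confidence", 0.0))
    if doc_type not in SUPPORTED_TYPES or confidence < min_confidence:
        doc_type = "other"
    return {**result, "document_type": doc_type, "confidence": confidence}
```

Falling back to `other` keeps the pipeline moving when the model returns an unexpected label, at the cost of a less specific analysis prompt.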

Document Analysis

Analyzes document content with type-specific prompts. PDFs are sent natively to Gemini with optional extracted text for context. Thinking/reasoning is enabled with an 8192-token budget for complex analysis:

analysis = await analyzer.analyze_document(
    document_type="pv_ag",
    pdf_bytes=pdf_file_bytes,      # Native PDF sent directly
    extracted_text=pdf_text_content  # Text extracted from PDF for context
)

# Returns structured analysis based on document type

Multi-Document Synthesis

Aggregates the analyses of multiple documents: it extracts cross-document themes, computes the lot's tantièmes (co-ownership shares), and produces buyer action items:

synthesis = await analyzer.synthesize_documents(
    analyses=[
        {"document_type": "pv_ag", "analysis": {...}},
        {"document_type": "diags", "analysis": {...}},
        {"document_type": "charges", "analysis": {...}},
        {"document_type": "other", "analysis": {...}}
    ]
)

# Returns:
{
    "summary": "Comprehensive property analysis...",
    "total_annual_cost": 4500.0,
    "total_one_time_cost": 25000.0,
    "risk_level": "medium",
    "key_findings": [...],
    "recommendations": [...],
    "annual_cost_breakdown": {"charges_courantes": {"amount": 2400, "source": "charges"}},
    "one_time_cost_breakdown": [{"description": "Roof repair", "amount": 15000, ...}],
    "cross_document_themes": [{"theme": "Aging building", "documents_involved": [...]}],
    "buyer_action_items": [{"priority": 1, "action": "Negotiate price", "urgency": "high"}],
    "risk_factors": ["Lead presence detected", ...],
    "tantiemes_info": {"lot_tantiemes": 150, "total_tantiemes": 10000, "share_percentage": 1.5},
    "confidence_score": 85,
    "confidence_reasoning": "Based on 5 analyzed documents..."
}
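
The `share_percentage` in `tantiemes_info` follows directly from the two tantièmes counts. The sketch below reproduces that arithmetic and shows how a share could be applied to a cost, assuming the cost is quoted for the whole building. Both function names are hypothetical, and whether the service apportions costs this way is not stated here.

```python
def share_percentage(lot_tantiemes: int, total_tantiemes: int) -> float:
    """Ownership share of one lot, as a percentage of the whole building."""
    if total_tantiemes <= 0:
        raise ValueError("total_tantiemes must be positive")
    return 100 * lot_tantiemes / total_tantiemes


def lot_share_of_cost(building_cost: float, lot_tantiemes: int, total_tantiemes: int) -> float:
    """Portion of a building-wide cost attributable to one lot (assumed rule)."""
    if total_tantiemes <= 0:
        raise ValueError("total_tantiemes must be positive")
    return building_cost * lot_tantiemes / total_tantiemes


print(share_percentage(150, 10000))          # 1.5
print(lot_share_of_cost(15000, 150, 10000))  # 225.0
```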

Document Processor

Orchestrates the full document processing pipeline.

Bulk Processing

from app.services.ai.document_processor import DocumentProcessor

processor = DocumentProcessor(db_session)

result = await processor.process_bulk_upload(
    property_id=1,
    documents=[
        {"id": 1, "file_path": "...", "filename": "pv_ag.pdf"},
        {"id": 2, "file_path": "...", "filename": "dpe.pdf"}
    ]
)

Processing Pipeline

flowchart TD
    A[1. Document Upload] --> B[2. PDF Preparation<br/>Text extraction + metadata]
    B --> C[3. Classification<br/>Native PDF to Gemini]
    C --> D[4. Parallel Analysis<br/>Type-specific prompts<br/>with thinking enabled]
    D --> E[5. Result Aggregation]
    E --> F[6. Cross-Document Synthesis<br/>Themes, tantiemes, action items]
    F --> G[7. Database Update<br/>Preserve user overrides]
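
Step 7 keeps user edits when the synthesis is regenerated. One way to picture this is an overlay of the stored overrides on top of the fresh result. The sketch below is illustrative only: `merge_synthesis` is a hypothetical helper, and the real persistence logic may work differently.

```python
def merge_synthesis(new_synthesis: dict, user_overrides: dict) -> dict:
    """Overlay user-edited fields on a freshly generated synthesis.

    Hypothetical helper: nested dicts are merged key by key; any other
    value in `user_overrides` replaces the generated one.
    """
    merged = dict(new_synthesis)  # shallow copy so the input is not mutated
    for key, value in user_overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_synthesis(merged[key], value)
        else:
            merged[key] = value
    return merged
```

With this approach, a user-corrected `lot_tantiemes` survives re-synthesis while the generated `total_tantiemes` and every untouched field are refreshed.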

Key Processing Features

  • Native PDF Input: PDF bytes are sent directly to Gemini instead of converting to images
  • Text Extraction: Text is extracted from PDFs using PyMuPDF for additional context
  • Thinking/Reasoning: 8192-token thinking budget enabled for complex document analysis
  • Async Processing: All Gemini API calls wrapped in asyncio.to_thread() for non-blocking execution
  • Parallel Analysis: Multiple documents processed concurrently via asyncio.gather()
  • User Override Preservation: Synthesis regeneration preserves user-defined overrides (tantiemes, cost adjustments)
  • Automatic Re-synthesis: Synthesis is automatically regenerated after document uploads or deletions
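
The async and parallel features above combine two standard-library calls: `asyncio.to_thread()` moves a blocking call off the event loop, and `asyncio.gather()` awaits several of them together. The sketch below shows the pattern with a stub in place of the Gemini call. `blocking_analyze` and `analyze_all` are hypothetical names, not the project's API.

```python
import asyncio
import time


def blocking_analyze(filename: str) -> dict:
    """Stand-in for a synchronous Gemini SDK call (hypothetical stub)."""
    time.sleep(0.1)  # simulate network latency
    return {"filename": filename, "status": "analyzed"}


async def analyze_all(filenames: list[str]) -> list[dict]:
    """Run each blocking call in a worker thread and await them together."""
    tasks = [asyncio.to_thread(blocking_analyze, name) for name in filenames]
    # gather() returns results in the same order as the inputs
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(analyze_all(["pv_ag.pdf", "dpe.pdf"]))
    print([r["filename"] for r in results])
```

Because the calls run in separate threads, total latency is close to that of the slowest document rather than the sum of all of them.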

Image Generator

Handles AI-powered image generation for photo redesigns.

Photo Redesign

from app.services.ai.image_generator import ImageGenerator

generator = ImageGenerator()

result = await generator.redesign_photo(
    original_image=image_bytes,
    style="modern_norwegian",  # one of the supported styles listed below
    preferences={
        "color_scheme": "neutral",
        "furniture_style": "minimalist",
        "lighting": "bright"
    }
)

# Returns generated image bytes

Supported Styles

| Style | Description |
|---|---|
| modern_norwegian | Modern Norwegian: clean lines, light wood |
| minimalist_scandinavian | Minimalist Scandinavian: functional minimalism |
| cozy_hygge | Cozy Hygge: warm textiles, candles |
| fancy_dark_modern | Fancy Dark Modern: dark tones, luxury accents |
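
A style key and a preferences dict eventually have to become text for the image model. The sketch below shows one plausible way to validate the style and assemble a prompt. `build_redesign_prompt` and the prompt wording are assumptions for illustration; the actual prompts used by `ImageGenerator` are not shown in this page.

```python
from typing import Optional

SUPPORTED_STYLES = {
    "modern_norwegian": "clean lines, light wood",
    "minimalist_scandinavian": "functional minimalism",
    "cozy_hygge": "warm textiles, candles",
    "fancy_dark_modern": "dark tones, luxury accents",
}


def build_redesign_prompt(style: str, preferences: Optional[dict] = None) -> str:
    """Compose a text prompt for a redesign request (illustrative wording only)."""
    if style not in SUPPORTED_STYLES:
        raise ValueError(f"Unsupported style: {style!r}")
    label = style.replace("_", " ")
    parts = [f"Redesign this room in a {label} style ({SUPPORTED_STYLES[style]})."]
    # Sort preferences so the same input always yields the same prompt
    for key, value in sorted((preferences or {}).items()):
        parts.append(f"{key.replace('_', ' ')}: {value}.")
    return " ".join(parts)
```

Rejecting unknown styles early avoids spending an image-generation call on a request the service does not support.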

Prompt Management

Prompts are versioned and stored in app/prompts/:

prompts/
└── v1/
    ├── analyze_diagnostic.md
    ├── analyze_photo.md
    ├── analyze_pvag.md
    ├── analyze_tax_charges.md
    ├── dp_classify_document.md
    ├── dp_process_charges.md
    ├── dp_process_diagnostic.md
    ├── dp_process_other.md
    ├── dp_process_pv_ag.md
    ├── dp_process_tax.md
    ├── dp_synthesize_results.md
    ├── generate_property_report.md
    ├── system_document_analyzer.md
    ├── system_document_classifier.md
    └── system_synthesis.md

Loading Prompts

from app.prompts import get_prompt, get_system_prompt

# Get analysis prompt
prompt = get_prompt("analyze_pvag", version="v1")

# Get system prompt
system = get_system_prompt("document_classifier", version="v1")
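
The directory layout suggests that a prompt name and version map to a Markdown file, with system prompts carrying a `system_` filename prefix. The sketch below shows how such a loader could resolve paths. `load_prompt` and `load_system_prompt` are hypothetical stand-ins, and the real `get_prompt` and `get_system_prompt` may behave differently.

```python
from pathlib import Path


def load_prompt(name: str, version: str = "v1", base_dir: Path = Path("app/prompts")) -> str:
    """Read <base_dir>/<version>/<name>.md and return its text.

    Hypothetical stand-in for get_prompt(); the real loader may differ.
    """
    path = base_dir / version / f"{name}.md"
    if not path.is_file():
        raise FileNotFoundError(f"No prompt {name!r} for version {version!r} at {path}")
    return path.read_text(encoding="utf-8")


def load_system_prompt(name: str, version: str = "v1", base_dir: Path = Path("app/prompts")) -> str:
    """System prompts live in the same directory with a `system_` prefix (assumed)."""
    return load_prompt(f"system_{name}", version=version, base_dir=base_dir)
```

Keeping the version in the path means a new prompt set can be added as `v2/` without touching callers that pin `version="v1"`.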

Configuration

Environment Variables

# Model selection
GEMINI_LLM_MODEL=gemini-2.5-flash           # Text analysis
GEMINI_IMAGE_MODEL=gemini-2.5-flash-image   # Image generation

# Authentication — choose one:
# Option A: Vertex AI (production / recommended)
GEMINI_USE_VERTEXAI=true
GOOGLE_CLOUD_PROJECT=your_project           # Required for Vertex AI
GOOGLE_CLOUD_LOCATION=us-central1           # Vertex AI region

# Option B: REST API key (quick development)
GEMINI_USE_VERTEXAI=false
GOOGLE_CLOUD_API_KEY=your_api_key           # Direct API

# Storage signing (production)
GCS_SIGNING_SERVICE_ACCOUNT=sa@project.iam.gserviceaccount.com  # Explicit SA for signing
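
Because the two authentication options need different variables, it can help to validate them once at startup. The sketch below reads the variables listed above into a settings object and fails fast when the chosen mode is incomplete. `GeminiSettings`, `load_settings`, and the fallback defaults are assumptions for illustration, not the application's actual configuration code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GeminiSettings:
    llm_model: str
    image_model: str
    use_vertexai: bool
    project: Optional[str] = None
    location: Optional[str] = None
    api_key: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> GeminiSettings:
    """Build settings from environment variables and fail fast on gaps."""
    # Defaulting to API-key mode when the flag is unset is an assumption
    use_vertexai = env.get("GEMINI_USE_VERTEXAI", "false").strip().lower() == "true"
    settings = GeminiSettings(
        llm_model=env.get("GEMINI_LLM_MODEL", "gemini-2.5-flash"),
        image_model=env.get("GEMINI_IMAGE_MODEL", "gemini-2.5-flash-image"),
        use_vertexai=use_vertexai,
        project=env.get("GOOGLE_CLOUD_PROJECT"),
        location=env.get("GOOGLE_CLOUD_LOCATION"),
        api_key=env.get("GOOGLE_CLOUD_API_KEY"),
    )
    if use_vertexai and not settings.project:
        raise ValueError("GOOGLE_CLOUD_PROJECT is required when GEMINI_USE_VERTEXAI=true")
    if not use_vertexai and not settings.api_key:
        raise ValueError("GOOGLE_CLOUD_API_KEY is required when GEMINI_USE_VERTEXAI=false")
    return settings
```

Passing a plain dict as `env` makes the validation easy to exercise in tests without modifying the process environment.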

Authentication Methods

| Method | Use Case | Configuration |
|---|---|---|
| Vertex AI + Service Account | Production (Cloud Run) | Automatic via attached SA |
| Vertex AI + Impersonation | Local development (production parity) | See below |
| API Key | Local development (quick start) | GOOGLE_CLOUD_API_KEY |

For testing with the same Vertex AI setup as production, use service account impersonation:

# 1. Grant impersonation permission (one-time)
gcloud iam service-accounts add-iam-policy-binding \
  appart-backend@YOUR_PROJECT.iam.gserviceaccount.com \
  --member="user:YOUR_EMAIL@gmail.com" \
  --role="roles/iam.serviceAccountTokenCreator" \
  --project=YOUR_PROJECT

# 2. Login with impersonation
gcloud auth application-default login \
  --impersonate-service-account=appart-backend@YOUR_PROJECT.iam.gserviceaccount.com

# 3. Configure environment
GEMINI_USE_VERTEXAI=true
GOOGLE_CLOUD_PROJECT=YOUR_PROJECT
GOOGLE_CLOUD_LOCATION=europe-west1

# 4. Start with GCS (uses ADC automatically)
./dev.sh start-gcs

This ensures you test with:

  • Same Vertex AI models as production
  • Same IAM permissions as the deployed service
  • No API key management required

Token Limits

| Operation | Max Output Tokens | Thinking Budget |
|---|---|---|
| Classification | 1,024 | Disabled |
| Document Analysis | 4,096 | 8,192 |
| Synthesis | 8,192 | 8,192 |
| Image Generation | N/A | N/A |
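
These limits can be kept in one lookup table so that each call site asks for its operation by name instead of hard-coding numbers. The sketch below is illustrative: `TOKEN_LIMITS`, `generation_limits`, and the operation keys are hypothetical names, and how the values are passed to the SDK is not covered here.

```python
from typing import Optional

# thinking_budget of None means thinking is disabled for that operation
TOKEN_LIMITS: dict = {
    "classification": {"max_output_tokens": 1024, "thinking_budget": None},
    "document_analysis": {"max_output_tokens": 4096, "thinking_budget": 8192},
    "synthesis": {"max_output_tokens": 8192, "thinking_budget": 8192},
}


def generation_limits(operation: str) -> dict:
    """Return a copy of the limits for one operation, or raise on unknown names."""
    try:
        return dict(TOKEN_LIMITS[operation])
    except KeyError:
        raise ValueError(f"Unknown operation: {operation!r}") from None


def thinking_enabled(operation: str) -> bool:
    """True when the operation has a thinking budget configured."""
    budget: Optional[int] = generation_limits(operation)["thinking_budget"]
    return budget is not None
```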

Error Handling

The AI services handle common errors gracefully:

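import asyncio
import logging

logger = logging.getLogger(__name__)

# RateLimitError and InvalidResponseError must be imported from wherever
# the project defines them; their module path is not shown here.
async def analyze_with_retry(analyzer, max_attempts=3, base_delay=1.0, **kwargs):
    for attempt in range(max_attempts):
        try:
            return await analyzer.analyze_document(**kwargs)
        except RateLimitError:
            # Exponential backoff: wait 1s, 2s, 4s, ... before retrying
            await asyncio.sleep(base_delay * 2 ** attempt)
        except InvalidResponseError:
            # Log and return a partial result
            logger.error("Invalid AI response")
            return {"error": "Analysis incomplete"}
    return {"error": "Rate limit exceeded"}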

Cost Optimization

  1. Use appropriate models: gemini-2.5-flash for text, gemini-2.5-flash-image for images
  2. Batch requests: Process multiple pages in a single API call
  3. Cache results: Store analysis results in database
  4. Skip unchanged: Use file hashes to avoid re-processing
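
Items 3 and 4 can be combined by keying stored results on a content hash, so that re-uploading an identical file never triggers a second API call. The sketch below uses an in-memory dict in place of the database. `file_digest` and `AnalysisCache` are hypothetical names, and the choice of SHA-256 is an assumption.

```python
import hashlib
from typing import Callable


def file_digest(data: bytes) -> str:
    """Content hash used as the cache key."""
    return hashlib.sha256(data).hexdigest()


class AnalysisCache:
    """In-memory stand-in for the database table that stores analyses."""

    def __init__(self) -> None:
        self._by_hash: dict = {}

    def get_or_analyze(self, data: bytes, analyze: Callable[[bytes], dict]) -> dict:
        """Return the stored result for identical bytes, else analyze and store."""
        key = file_digest(data)
        if key not in self._by_hash:
            self._by_hash[key] = analyze(data)
        return self._by_hash[key]
```

Hashing the bytes rather than the filename means a renamed copy of the same PDF is still recognized as unchanged.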

Testing AI Services

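# tests/test_ai_services.py
from pathlib import Path

import pytest
from app.services.ai.document_analyzer import DocumentAnalyzer

@pytest.mark.asyncio
async def test_document_classification():
    analyzer = DocumentAnalyzer()
    # Placeholder fixture path; point it at a real sample PDF in your test suite
    test_pdf_bytes = Path("tests/fixtures/test_pv_ag.pdf").read_bytes()
    result = await analyzer.classify_document(
        pdf_bytes=test_pdf_bytes,
        filename="test_pv_ag.pdf"
    )
    # Must match the five supported document types
    assert result["document_type"] in ["pv_ag", "diags", "taxe_fonciere", "charges", "other"]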