AI Services¶
AppArt Agent uses Google Gemini for document analysis, classification, and image generation.
Overview¶
| Service | Model | Purpose |
|---|---|---|
| Document Analysis | gemini-2.5-flash | Native PDF processing with thinking/reasoning |
| Image Generation | gemini-2.5-flash-image | Photo redesign and visualization |
Architecture¶
```mermaid
flowchart TB
    subgraph AILayer["AI Service Layer"]
        subgraph Services["Services"]
            DA["DocumentAnalyzer<br/>classify() | analyze() | synthesize()"]
            DP["DocumentProcessor<br/>process() | bulk_process()"]
            IG["ImageGenerator<br/>generate() | redesign()"]
        end
        SDK["Google Generative AI SDK"]
    end
    subgraph Auth["Authentication"]
        APIKey["API Key<br/>(Local Dev)"]
        VertexAI["Vertex AI<br/>(Production)"]
        Impersonation["Service Account<br/>Impersonation"]
    end
    subgraph External["External API"]
        Gemini["Google Gemini<br/>gemini-2.5-flash<br/>gemini-2.5-flash-image"]
    end
    DA --> SDK
    DP --> SDK
    IG --> SDK
    SDK --> Auth
    APIKey --> Gemini
    VertexAI --> Gemini
    Impersonation --> VertexAI
```
Document Analyzer¶
The `DocumentAnalyzer` class provides text- and vision-based document analysis.
Initialization¶
The analyzer automatically configures:
- **Vertex AI mode** (production): uses a service account or ADC on GCP
- **API Key mode** (development): uses `GOOGLE_CLOUD_API_KEY` for direct API access
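The selection between the two modes can be sketched as follows; `resolve_gemini_auth` is a hypothetical helper that mirrors the documented precedence, not the actual implementation:

```python
import os

def resolve_gemini_auth() -> dict:
    """Pick the auth mode based on environment, mirroring the rules above."""
    if os.environ.get("GEMINI_USE_VERTEXAI", "false").lower() == "true":
        # Vertex AI mode: credentials come from the attached service
        # account or Application Default Credentials (ADC)
        return {
            "mode": "vertexai",
            "project": os.environ["GOOGLE_CLOUD_PROJECT"],
            "location": os.environ.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        }
    # API key mode: direct REST access for local development
    return {"mode": "api_key", "api_key": os.environ["GOOGLE_CLOUD_API_KEY"]}
```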
Document Classification¶
Automatically identifies document types using native PDF input:
```python
classification = await analyzer.classify_document(
    pdf_bytes=pdf_file_bytes,  # Native PDF sent directly to Gemini
    filename="pv_ag_2024.pdf"
)

# Returns:
# {
#     "document_type": "pv_ag",
#     "confidence": 0.95,
#     "reasoning": "Document contains assembly meeting minutes header..."
# }
```
Supported Document Types (5 categories):
| Type | Description |
|---|---|
| `pv_ag` | Procès-verbal d'Assemblée Générale (meeting minutes) |
| `diags` | Diagnostic documents (DPE, amiante, plomb, electric, gas, etc.) |
| `taxe_fonciere` | Taxe Foncière (property tax) |
| `charges` | Copropriété charges |
| `other` | Other property documents (rules, contracts, insurance, etc.) |
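Each classified type is then routed to a type-specific processing prompt (the `dp_process_*` files under Prompt Management). A plausible routing table, assumed here rather than taken from the code, looks like:

```python
# Hypothetical mapping from classified document type to the
# type-specific processing prompt (names match the prompt files on disk)
TYPE_TO_PROMPT = {
    "pv_ag": "dp_process_pv_ag",
    "diags": "dp_process_diagnostic",
    "taxe_fonciere": "dp_process_tax",
    "charges": "dp_process_charges",
    "other": "dp_process_other",
}

def prompt_for(document_type: str) -> str:
    # Fall back to the generic "other" prompt for unknown types
    return TYPE_TO_PROMPT.get(document_type, TYPE_TO_PROMPT["other"])
```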
Document Analysis¶
Analyzes document content with type-specific prompts. PDFs are sent natively to Gemini with optional extracted text for context. Thinking/reasoning is enabled with an 8192-token budget for complex analysis:
```python
analysis = await analyzer.analyze_document(
    document_type="pv_ag",
    pdf_bytes=pdf_file_bytes,         # Native PDF sent directly
    extracted_text=pdf_text_content   # Text extracted from the PDF for context
)
# Returns structured analysis based on document type
```
Multi-Document Synthesis¶
Aggregates analysis from multiple documents with cross-document theme extraction, tantiemes calculation, and buyer action items:
```python
synthesis = await analyzer.synthesize_documents(
    analyses=[
        {"document_type": "pv_ag", "analysis": {...}},
        {"document_type": "diags", "analysis": {...}},
        {"document_type": "charges", "analysis": {...}},
        {"document_type": "other", "analysis": {...}}
    ]
)

# Returns:
# {
#     "summary": "Comprehensive property analysis...",
#     "total_annual_cost": 4500.0,
#     "total_one_time_cost": 25000.0,
#     "risk_level": "medium",
#     "key_findings": [...],
#     "recommendations": [...],
#     "annual_cost_breakdown": {"charges_courantes": {"amount": 2400, "source": "charges"}},
#     "one_time_cost_breakdown": [{"description": "Roof repair", "amount": 15000, ...}],
#     "cross_document_themes": [{"theme": "Aging building", "documents_involved": [...]}],
#     "buyer_action_items": [{"priority": 1, "action": "Negotiate price", "urgency": "high"}],
#     "risk_factors": ["Lead presence detected", ...],
#     "tantiemes_info": {"lot_tantiemes": 150, "total_tantiemes": 10000, "share_percentage": 1.5},
#     "confidence_score": 85,
#     "confidence_reasoning": "Based on 5 analyzed documents..."
# }
```
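The `share_percentage` in `tantiemes_info` follows directly from the lot's share of the co-ownership. A hypothetical helper reproducing the arithmetic:

```python
def tantiemes_share(lot_tantiemes: int, total_tantiemes: int) -> float:
    """Lot's share of co-ownership charges, as a percentage."""
    return round(lot_tantiemes / total_tantiemes * 100, 2)

# 150 of 10,000 tantiemes gives the 1.5% share shown in tantiemes_info
```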
Document Processor¶
Orchestrates the full document processing pipeline.
Bulk Processing¶
```python
from app.services.ai.document_processor import DocumentProcessor

processor = DocumentProcessor(db_session)

result = await processor.process_bulk_upload(
    property_id=1,
    documents=[
        {"id": 1, "file_path": "...", "filename": "pv_ag.pdf"},
        {"id": 2, "file_path": "...", "filename": "dpe.pdf"}
    ]
)
```
Processing Pipeline¶
```mermaid
flowchart TD
    A[1. Document Upload] --> B[2. PDF Preparation<br/>Text extraction + metadata]
    B --> C[3. Classification<br/>Native PDF to Gemini]
    C --> D[4. Parallel Analysis<br/>Type-specific prompts<br/>with thinking enabled]
    D --> E[5. Result Aggregation]
    E --> F[6. Cross-Document Synthesis<br/>Themes, tantiemes, action items]
    F --> G[7. Database Update<br/>Preserve user overrides]
```
Key Processing Features¶
- **Native PDF Input**: PDF bytes are sent directly to Gemini instead of being converted to images
- **Text Extraction**: Text is extracted from PDFs using PyMuPDF for additional context
- **Thinking/Reasoning**: An 8192-token thinking budget is enabled for complex document analysis
- **Async Processing**: All Gemini API calls are wrapped in `asyncio.to_thread()` for non-blocking execution
- **Parallel Analysis**: Multiple documents are processed concurrently via `asyncio.gather()`
- **User Override Preservation**: Synthesis regeneration preserves user-defined overrides (tantiemes, cost adjustments)
- **Automatic Re-synthesis**: Synthesis is automatically regenerated after document uploads or deletions
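The async and parallel features combine as sketched below; `call_gemini_sync` is a stand-in for the blocking SDK call, not the real client code:

```python
import asyncio

def call_gemini_sync(prompt: str) -> str:
    """Placeholder for a blocking Gemini SDK call."""
    return f"analysis:{prompt}"

async def analyze_concurrently(prompts: list[str]) -> list[str]:
    # Each blocking call runs in a worker thread via asyncio.to_thread(),
    # and all documents are analyzed in parallel with asyncio.gather()
    tasks = [asyncio.to_thread(call_gemini_sync, p) for p in prompts]
    return await asyncio.gather(*tasks)
```

`asyncio.gather()` preserves input order, so results line up with the submitted documents.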
Image Generator¶
Handles AI-powered image generation for photo redesigns.
Photo Redesign¶
```python
from app.services.ai.image_generator import ImageGenerator

generator = ImageGenerator()

result = await generator.redesign_photo(
    original_image=image_bytes,
    style="modern_norwegian",  # One of the supported styles below
    preferences={
        "color_scheme": "neutral",
        "furniture_style": "minimalist",
        "lighting": "bright"
    }
)
# Returns generated image bytes
```
Supported Styles¶
| Style | Description |
|---|---|
| `modern_norwegian` | Modern Norwegian: clean lines, light wood |
| `minimalist_scandinavian` | Minimalist Scandinavian: functional minimalism |
| `cozy_hygge` | Cozy Hygge: warm textiles, candles |
| `fancy_dark_modern` | Fancy Dark Modern: dark tones, luxury accents |
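Since only these four styles are supported, a small guard (a hypothetical helper, not part of the documented API) can reject typos before an API call is wasted:

```python
SUPPORTED_STYLES = {
    "modern_norwegian",
    "minimalist_scandinavian",
    "cozy_hygge",
    "fancy_dark_modern",
}

def validate_style(style: str) -> str:
    """Raise early instead of sending an unknown style to the model."""
    if style not in SUPPORTED_STYLES:
        raise ValueError(f"Unsupported style: {style!r}")
    return style
```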
Prompt Management¶
Prompts are versioned and stored in `app/prompts/`:

```
prompts/
└── v1/
    ├── analyze_diagnostic.md
    ├── analyze_photo.md
    ├── analyze_pvag.md
    ├── analyze_tax_charges.md
    ├── dp_classify_document.md
    ├── dp_process_charges.md
    ├── dp_process_diagnostic.md
    ├── dp_process_other.md
    ├── dp_process_pv_ag.md
    ├── dp_process_tax.md
    ├── dp_synthesize_results.md
    ├── generate_property_report.md
    ├── system_document_analyzer.md
    ├── system_document_classifier.md
    └── system_synthesis.md
```
Loading Prompts¶
```python
from app.prompts import get_prompt, get_system_prompt

# Get analysis prompt
prompt = get_prompt("analyze_pvag", version="v1")

# Get system prompt
system = get_system_prompt("document_classifier", version="v1")
```
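Given the directory layout above, a loader like `get_prompt` plausibly resolves `app/prompts/<version>/<name>.md`. A minimal sketch (the real implementation may differ):

```python
from pathlib import Path

def load_prompt(name: str, version: str = "v1",
                base_dir: Path = Path("app/prompts")) -> str:
    """Read a versioned prompt file, e.g. app/prompts/v1/analyze_pvag.md."""
    path = base_dir / version / f"{name}.md"
    if not path.is_file():
        raise FileNotFoundError(f"No prompt {name!r} for version {version!r}")
    return path.read_text(encoding="utf-8")
```

Versioning prompts on disk this way lets prompt changes ship through code review like any other change.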
Configuration¶
Environment Variables¶
```bash
# Model selection
GEMINI_LLM_MODEL=gemini-2.5-flash           # Text analysis
GEMINI_IMAGE_MODEL=gemini-2.5-flash-image   # Image generation

# Authentication — choose one:

# Option A: Vertex AI (production / recommended)
GEMINI_USE_VERTEXAI=true
GOOGLE_CLOUD_PROJECT=your_project           # Required for Vertex AI
GOOGLE_CLOUD_LOCATION=us-central1           # Vertex AI region

# Option B: REST API key (quick development)
GEMINI_USE_VERTEXAI=false
GOOGLE_CLOUD_API_KEY=your_api_key           # Direct API

# Storage signing (production)
GCS_SIGNING_SERVICE_ACCOUNT=sa@project.iam.gserviceaccount.com  # Explicit SA for signing
```
Authentication Methods¶
| Method | Use Case | Configuration |
|---|---|---|
| Vertex AI + Service Account | Production (Cloud Run) | Automatic via attached SA |
| Vertex AI + Impersonation | Local development (production parity) | See below |
| API Key | Local development (quick start) | GOOGLE_CLOUD_API_KEY |
Local Development with Vertex AI (Recommended)¶
For testing with the same Vertex AI setup as production, use service account impersonation:
```bash
# 1. Grant impersonation permission (one-time)
gcloud iam service-accounts add-iam-policy-binding \
  appart-backend@YOUR_PROJECT.iam.gserviceaccount.com \
  --member="user:YOUR_EMAIL@gmail.com" \
  --role="roles/iam.serviceAccountTokenCreator" \
  --project=YOUR_PROJECT

# 2. Login with impersonation
gcloud auth application-default login \
  --impersonate-service-account=appart-backend@YOUR_PROJECT.iam.gserviceaccount.com

# 3. Configure environment
GEMINI_USE_VERTEXAI=true
GOOGLE_CLOUD_PROJECT=YOUR_PROJECT
GOOGLE_CLOUD_LOCATION=europe-west1

# 4. Start with GCS (uses ADC automatically)
./dev.sh start-gcs
```
This ensures you test with:
- Same Vertex AI models as production
- Same IAM permissions as the deployed service
- No API key management required
Token Limits¶
| Operation | Max Output Tokens | Thinking Budget |
|---|---|---|
| Classification | 1,024 | Disabled |
| Document Analysis | 4,096 | 8,192 |
| Synthesis | 8,192 | 8,192 |
| Image Generation | N/A | N/A |
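These limits could be centralized in a single config mapping; this is an illustrative structure, not the application's actual settings object:

```python
# Per-operation generation settings mirroring the table above;
# thinking_budget=0 disables thinking for classification
GENERATION_LIMITS = {
    "classification": {"max_output_tokens": 1024, "thinking_budget": 0},
    "analysis": {"max_output_tokens": 4096, "thinking_budget": 8192},
    "synthesis": {"max_output_tokens": 8192, "thinking_budget": 8192},
}
```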
Error Handling¶
The AI services handle common errors gracefully:
```python
try:
    result = await analyzer.analyze_document(...)
except RateLimitError:
    # Implement exponential backoff, then retry
    await asyncio.sleep(delay)
except InvalidResponseError:
    # Log and return partial result
    logger.error("Invalid AI response")
    return {"error": "Analysis incomplete"}
```
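The exponential backoff branch can be fleshed out as below; `RateLimitError` here is a local stand-in for whatever rate-limit exception the SDK actually raises:

```python
import asyncio
import random

class RateLimitError(Exception):
    """Stand-in for the SDK's rate-limit exception."""

async def with_backoff(make_call, max_retries: int = 3, base_delay: float = 1.0):
    """Retry an async call with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return await make_call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # Out of retries; surface the error
            # base, 2x base, 4x base, ... with jitter to avoid thundering herds
            delay = base_delay * (2 ** attempt + random.random())
            await asyncio.sleep(delay)
```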
Cost Optimization¶
- **Use appropriate models**: `gemini-2.5-flash` for text, `gemini-2.5-flash-image` for images
- **Batch requests**: Process multiple pages in a single API call
- **Cache results**: Store analysis results in the database
- **Skip unchanged**: Use file hashes to avoid re-processing
Testing AI Services¶
```python
# tests/test_ai_services.py
import pytest
from app.services.ai.document_analyzer import DocumentAnalyzer

@pytest.mark.asyncio
async def test_document_classification():
    analyzer = DocumentAnalyzer()
    result = await analyzer.classify_document(
        pdf_bytes=test_pdf_bytes,  # Fixture providing sample PDF bytes
        filename="test_pv_ag.pdf"
    )
    assert result["document_type"] in ["pv_ag", "diags", "taxe_fonciere", "charges", "other"]
```