Phase 1: OB-GYN Data Foundation

Overview

Duration: ~1.5 weeks (50-60 hours) Goal: Build unified schema and import OB-GYN Milestones + Procedures data. Admin can verify it works. Demo: Admin uploads Excel file, data appears in database, verification view confirms success.


Timeline

Week 1                              Week 2
├───────────────────────────────────┼─────────────────┐
│ Day 1-2    │ Day 3-4   │ Day 5   │ Day 1-2  │ Day 3│
│ Schema     │ Tables +  │ Excel   │ Importer │ Test │
│ Design     │ Migration │ Parser  │ + Admin  │ Demo │
├────────────┼───────────┼─────────┼──────────┼──────┤
│    8h      │    10h    │   8h    │    12h   │  8h  │
└────────────┴───────────┴─────────┴──────────┴──────┘

                                            DEMO 1

User Stories

IDStoryAcceptance Criteria
US-1.1As an admin, I can upload OB-GYN Excel files and have the data importedExcel upload endpoint works, data appears in evaluation_responses table
US-1.2As an admin, I can view imported data to verify correctnessVerification endpoint returns record counts and sample data

Epic 1.1: Database Schema Design

Estimated Time: 18 hours

Task IDTaskDescriptionHoursDependencies
1.1.1Schema Design DocumentDesign unified schema supporting competencies, milestones, procedures4-6hNone
1.1.2Departments TableCreate departments table (id, name, code)2h1.1.1
1.1.3Form Definitions TableCreate form_definitions table with dynamic question storage (JSON)3h1.1.1, 1.1.2
1.1.4Evaluation Responses TableCreate evaluation_responses table with normalized format3h1.1.1, 1.1.3
1.1.5Alembic MigrationsWrite and test all migrations4h1.1.2-1.1.4

Schema Design

-- Departments
CREATE TABLE departments (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,        -- "OB-GYN", "Emergency Medicine"
    code VARCHAR(20) NOT NULL UNIQUE,  -- "obgyn", "em"
    created_at TIMESTAMP DEFAULT NOW()
);

-- Form Definitions (dynamic, not hardcoded)
CREATE TABLE form_definitions (
    id SERIAL PRIMARY KEY,
    department_id INTEGER REFERENCES departments(id),
    name VARCHAR(200) NOT NULL,        -- "Milestones", "Procedures", "PC1"
    form_type VARCHAR(50),             -- "milestone", "procedure", "competency"
    questions JSONB,                   -- Auto-extracted question metadata
    created_at TIMESTAMP DEFAULT NOW()
);

-- Evaluation Responses (unified format)
CREATE TABLE evaluation_responses (
    id SERIAL PRIMARY KEY,
    department_id INTEGER REFERENCES departments(id),
    form_definition_id INTEGER REFERENCES form_definitions(id),
    external_evaluation_id VARCHAR(100),  -- UUID from source
    evaluator_name VARCHAR(200),          -- From source data
    learner_name VARCHAR(200),            -- From source data
    learner_grad_year INTEGER,
    evaluation_date TIMESTAMP,
    question_key TEXT,                    -- Question identifier
    answer_raw NUMERIC,                   -- Numeric value
    answer_formatted TEXT,                -- Text representation
    created_at TIMESTAMP DEFAULT NOW()
);

-- Index for common queries
CREATE INDEX idx_eval_responses_dept ON evaluation_responses(department_id);
CREATE INDEX idx_eval_responses_learner ON evaluation_responses(learner_name);
CREATE INDEX idx_eval_responses_form ON evaluation_responses(form_definition_id);

Note: users and learner_mappings tables are created in Phase 2 when RBAC is implemented.


Epic 1.2: Data Import Pipeline

Estimated Time: 20 hours

Task IDTaskDescriptionHoursDependencies
1.2.1Excel ParserParse OB-GYN “All Data Normalized” sheet format using pandas4-6hNone
1.2.2Form Definition ExtractorAuto-discover unique forms and questions from Excel data6-8h1.2.1
1.2.3Excel ImporterLoad parsed data into evaluation_responses table4-6h1.1.4, 1.2.1, 1.2.2

Excel Format (All Data Normalized)

Columns:
- Evaluation Id (UUID)
- Evaluation Start (timestamp)
- Evaluation End (timestamp)
- Evaluation Last Changed (timestamp)
- Teacher (evaluator name)
- Learner (learner name)
- Learner Grad Year (integer)
- Form (form name, e.g., "PROF-3: Accountability/Conscientiousness")
- Question (question text)
- Formatted Answer (e.g., "Yes", "Competent")
- Raw Answer (numeric, e.g., 1.0, 4.0)

Epic 1.3: Admin Verification

Estimated Time: 12 hours

Task IDTaskDescriptionHoursDependenciesStatus
1.3.1Excel Upload EndpointPOST /upload/obgyn - accepts files, triggers import pipeline4h1.2.3✅ Done
1.3.2Data Verification EndpointPOST /import/obgyn — removed (path traversal risk, upload covers use case)Removed
1.3.3Integration TestingPR reviewed + merged to main on both remotes4h1.3.1, 1.3.4✅ Done
1.3.4Admin PageGET /admin/obgyn - stats dashboard + upload UI + category breakdown4h1.3.1✅ Done

API Endpoints

POST /upload/obgyn
  - Request: multipart/form-data with .xlsx files
  - Response: {
      "files_directory": "/tmp/...",
      "total_records_parsed": 2617,
      "total_forms_found": 88,
      "forms_created": 2,
      "forms_skipped": 86,
      "evaluations_inserted": 2613,
      "skipped_no_form": 0,
      "skipped_duplicate": 4,
      "files_uploaded": ["file1.xlsx"]
    }

GET /admin/obgyn
  - Server-rendered admin page with:
    - Stats cards (evaluations, learners, evaluators, forms)
    - Date range
    - Category breakdown (Milestones, Procedures, Teaching, etc.) with form + eval counts
    - Top 5 forms by evaluation count
    - Excel upload UI (drag-and-drop .xlsx files)

Epic 1.4: Future/Budget Items

Task IDTaskDescriptionPriority
1.4.1Dedicated Admin SectionNew /admin routes with navigation UI✅ Done (1.3.4)
1.4.2Additional Form TypesImport Teaching, M&M, Journal Club, etc.✅ Done (all 10 categories imported)
1.4.3Import HistoryTrack all imports with timestamps, record countsFuture

Deliverables

DeliverableDescriptionFormat
Database Schemadepartments, form_definitions, evaluation_responses tablesAlembic migrations
Import PipelineExcel parser, form extractor, importerPython modules
Admin EndpointsUpload and verification APIFlask routes
Verification ReportConfirm OB-GYN data imported correctlyJSON endpoint

Demo Script

  1. Open admin page: Navigate to /admin/obgyn — shows “No OB-GYN department found” if empty
  2. Upload Excel files: Drag .xlsx files into the upload section, click Upload
  3. Show import results: JSON response appears inline with counts
  4. Refresh page: Stats cards populate (2,613 evals, 144 learners, 88 forms, 10 categories)
  5. Show category breakdown: Milestones, Procedures, Teaching, etc. with form + eval counts
  6. Upload again: Re-upload same files — all records skipped as duplicates (idempotent)

Definition of Done

  • All database tables created with migrations (depts, forms, evaluations)
  • OB-GYN Milestones Excel imports successfully
  • OB-GYN Procedures Excel imports successfully
  • Upload endpoint (POST /upload/obgyn) works with multipart .xlsx files
  • Admin page (/admin/obgyn) with stats dashboard, category breakdown, and upload UI
  • Form categories auto-parsed from source filenames (10 categories across 88 forms)
  • POST /import/obgyn removed (security review — path traversal, upload covers use case)
  • secure_filename() applied to upload endpoint
  • Code reviewed and merged to main (both github + gitlab)