Phase 1: OB-GYN Data Foundation
Overview
Duration: ~1.5 weeks (50-60 hours)
Goal: Build unified schema and import OB-GYN Milestones + Procedures data. Admin can verify it works.
Demo: Admin uploads Excel file, data appears in database, verification view confirms success.
Timeline
Week 1 Week 2
├───────────────────────────────────┼─────────────────┐
│ Day 1-2 │ Day 3-4 │ Day 5 │ Day 1-2 │ Day 3│
│ Schema │ Tables + │ Excel │ Importer │ Test │
│ Design │ Migration │ Parser │ + Admin │ Demo │
├────────────┼───────────┼─────────┼──────────┼──────┤
│ 8h │ 10h │ 8h │ 12h │ 8h │
└────────────┴───────────┴─────────┴──────────┴──────┘
▲
DEMO 1
User Stories
| ID | Story | Acceptance Criteria |
|---|
| US-1.1 | As an admin, I can upload OB-GYN Excel files and have the data imported | Excel upload endpoint works, data appears in evaluation_responses table |
| US-1.2 | As an admin, I can view imported data to verify correctness | Verification endpoint returns record counts and sample data |
Epic 1.1: Database Schema Design
Estimated Time: 18 hours
| Task ID | Task | Description | Hours | Dependencies |
|---|
| 1.1.1 | Schema Design Document | Design unified schema supporting competencies, milestones, procedures | 4-6h | None |
| 1.1.2 | Departments Table | Create departments table (id, name, code) | 2h | 1.1.1 |
| 1.1.3 | Form Definitions Table | Create form_definitions table with dynamic question storage (JSON) | 3h | 1.1.1, 1.1.2 |
| 1.1.4 | Evaluation Responses Table | Create evaluation_responses table with normalized format | 3h | 1.1.1, 1.1.3 |
| 1.1.5 | Alembic Migrations | Write and test all migrations | 4h | 1.1.2-1.1.4 |
Schema Design
-- Departments
CREATE TABLE departments (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL, -- "OB-GYN", "Emergency Medicine"
code VARCHAR(20) NOT NULL UNIQUE, -- "obgyn", "em"
created_at TIMESTAMP DEFAULT NOW()
);
-- Form Definitions (dynamic, not hardcoded)
CREATE TABLE form_definitions (
id SERIAL PRIMARY KEY,
department_id INTEGER REFERENCES departments(id),
name VARCHAR(200) NOT NULL, -- "Milestones", "Procedures", "PC1"
form_type VARCHAR(50), -- "milestone", "procedure", "competency"
questions JSONB, -- Auto-extracted question metadata
created_at TIMESTAMP DEFAULT NOW()
);
-- Evaluation Responses (unified format)
CREATE TABLE evaluation_responses (
id SERIAL PRIMARY KEY,
department_id INTEGER REFERENCES departments(id),
form_definition_id INTEGER REFERENCES form_definitions(id),
external_evaluation_id VARCHAR(100), -- UUID from source
evaluator_name VARCHAR(200), -- From source data
learner_name VARCHAR(200), -- From source data
learner_grad_year INTEGER,
evaluation_date TIMESTAMP,
question_key TEXT, -- Question identifier
answer_raw NUMERIC, -- Numeric value
answer_formatted TEXT, -- Text representation
created_at TIMESTAMP DEFAULT NOW()
);
-- Index for common queries
CREATE INDEX idx_eval_responses_dept ON evaluation_responses(department_id);
CREATE INDEX idx_eval_responses_learner ON evaluation_responses(learner_name);
CREATE INDEX idx_eval_responses_form ON evaluation_responses(form_definition_id);
Note: users and learner_mappings tables are created in Phase 2 when RBAC is implemented.
Epic 1.2: Data Import Pipeline
Estimated Time: 20 hours
| Task ID | Task | Description | Hours | Dependencies |
|---|
| 1.2.1 | Excel Parser | Parse OB-GYN “All Data Normalized” sheet format using pandas | 4-6h | None |
| 1.2.2 | Form Definition Extractor | Auto-discover unique forms and questions from Excel data | 6-8h | 1.2.1 |
| 1.2.3 | Excel Importer | Load parsed data into evaluation_responses table | 4-6h | 1.1.4, 1.2.1, 1.2.2 |
Columns:
- Evaluation Id (UUID)
- Evaluation Start (timestamp)
- Evaluation End (timestamp)
- Evaluation Last Changed (timestamp)
- Teacher (evaluator name)
- Learner (learner name)
- Learner Grad Year (integer)
- Form (form name, e.g., "PROF-3: Accountability/Conscientiousness")
- Question (question text)
- Formatted Answer (e.g., "Yes", "Competent")
- Raw Answer (numeric, e.g., 1.0, 4.0)
Epic 1.3: Admin Verification
Estimated Time: 12 hours
| Task ID | Task | Description | Hours | Dependencies | Status |
|---|
| 1.3.1 | Excel Upload Endpoint | POST /upload/obgyn - accepts files, triggers import pipeline | 4h | 1.2.3 | ✅ Done |
| 1.3.2 | Data Verification Endpoint | POST /import/obgyn — removed (path traversal risk, upload covers use case) | — | — | Removed |
| 1.3.3 | Integration Testing | PR reviewed + merged to main on both remotes | 4h | 1.3.1, 1.3.4 | ✅ Done |
| 1.3.4 | Admin Page | GET /admin/obgyn - stats dashboard + upload UI + category breakdown | 4h | 1.3.1 | ✅ Done |
API Endpoints
POST /upload/obgyn
- Request: multipart/form-data with .xlsx files
- Response: {
"files_directory": "/tmp/...",
"total_records_parsed": 2617,
"total_forms_found": 88,
"forms_created": 2,
"forms_skipped": 86,
"evaluations_inserted": 2613,
"skipped_no_form": 0,
"skipped_duplicate": 4,
"files_uploaded": ["file1.xlsx"]
}
GET /admin/obgyn
- Server-rendered admin page with:
- Stats cards (evaluations, learners, evaluators, forms)
- Date range
- Category breakdown (Milestones, Procedures, Teaching, etc.) with form + eval counts
- Top 5 forms by evaluation count
- Excel upload UI (drag-and-drop .xlsx files)
Epic 1.4: Future/Budget Items
| Task ID | Task | Description | Priority |
|---|
| 1.4.1 | Dedicated Admin Section | New /admin routes with navigation UI | ✅ Done (1.3.4) |
| 1.4.2 | Additional Form Types | Import Teaching, M&M, Journal Club, etc. | ✅ Done (all 10 categories imported) |
| 1.4.3 | Import History | Track all imports with timestamps, record counts | Future |
Deliverables
| Deliverable | Description | Format |
|---|
| Database Schema | departments, form_definitions, evaluation_responses tables | Alembic migrations |
| Import Pipeline | Excel parser, form extractor, importer | Python modules |
| Admin Endpoints | Upload and verification API | Flask routes |
| Verification Report | Confirm OB-GYN data imported correctly | JSON endpoint |
Demo Script
- Open admin page: Navigate to
/admin/obgyn — shows “No OB-GYN department found” if empty
- Upload Excel files: Drag .xlsx files into the upload section, click Upload
- Show import results: JSON response appears inline with counts
- Refresh page: Stats cards populate (2,613 evals, 144 learners, 88 forms, 10 categories)
- Show category breakdown: Milestones, Procedures, Teaching, etc. with form + eval counts
- Upload again: Re-upload same files — all records skipped as duplicates (idempotent)
Definition of Done