# Medical Coding ML System - Technical Design Document
## 1. Executive Summary
### System Overview
A cloud-native, ML-powered medical coding prediction system that leverages AWS infrastructure and the existing Snowflake data platform to predict both DRG and ICD-10 codes from clinical data. The system supports real-time inference (<1s latency) as well as batch processing for high-volume coding operations.
### Key Capabilities
- **Dual Prediction Models**: DRG-LLaMA for DRG codes (54.6% top-1 accuracy target), Clinical BERT variants for ICD-10
- **Hybrid Processing**: Real-time API endpoints and batch processing pipelines
- **HIPAA Compliant**: End-to-end encryption, audit logging, BAA-covered services
- **Scalable Architecture**: Auto-scaling from 100 to 100K+ predictions/day
- **Epic-Ready**: Designed for future FHIR API and Clarity database integration
### Technology Stack
- **ML Platform**: AWS SageMaker (training, deployment, monitoring)
- **Data Platform**: Snowflake (existing instance leveraged)
- **Data Lake**: AWS S3 with lifecycle policies
- **Orchestration**: AWS Step Functions + Airflow
- **Security**: AWS KMS, PrivateLink, CloudTrail
---
## 2. System Architecture
### 2.1 High-Level Architecture
```mermaid
graph TB
subgraph "External Data Sources"
FHIR[Epic FHIR API]
Clarity[Epic Clarity Database]
Manual[Manual Upload/API<br/>CSV/JSON/HL7]
end
subgraph "Ingestion Layer - AWS"
Gateway[AWS Transfer Family /<br/>API Gateway / EventBridge]
S3Raw[S3 Raw Data Lake<br/>Encrypted]
S3Paths["/raw/fhir/<br/>/raw/clarity/<br/>/raw/clinical-notes/<br/>/raw/structured/"]
end
subgraph "Data Processing Layer"
subgraph "Snowflake + Snowpipe"
Raw[Raw Layer]
Staging[Staging Layer]
Analytics[Analytics Layer]
Feature[Feature Store<br/>Snowflake + SageMaker]
Raw --> Staging
Staging --> Analytics
Analytics --> Feature
end
end
subgraph "ML Platform Layer"
subgraph "AWS SageMaker"
Training[Training Jobs<br/>• DRG-LLaMA<br/>• Clinical BERT]
Registry[Model Registry<br/>• Versioning<br/>• A/B Testing<br/>• Staging]
Inference[Inference Endpoints<br/>• Real-time<br/>• Batch Transform<br/>• Multi-Model]
end
end
subgraph "Application Layer"
API["API Gateway + Lambda Functions<br/>• /predict/real-time<br/>• /predict/batch<br/>• /status/job/[id]<br/>• /metrics/performance"]
end
FHIR --> Gateway
Clarity --> Gateway
Manual --> Gateway
Gateway --> S3Raw
S3Raw --> S3Paths
S3Paths --> Raw
Feature --> Training
Training --> Registry
Registry --> Inference
Inference --> API
classDef aws fill:#FF9900,stroke:#232F3E,stroke-width:2px,color:#fff
classDef snowflake fill:#29B5E8,stroke:#0C5B99,stroke-width:2px,color:#fff
classDef epic fill:#CC0000,stroke:#660000,stroke-width:2px,color:#fff
classDef ml fill:#04AA6D,stroke:#028A0F,stroke-width:2px,color:#fff
class FHIR,Clarity epic
class Gateway,S3Raw,S3Paths,API aws
class Raw,Staging,Analytics,Feature snowflake
class Training,Registry,Inference ml
```
### 2.2 AWS Account Structure
```mermaid
graph TD
Org[AWS Organization Root]
Mgmt[Management Account<br/>• AWS Organizations<br/>• CloudTrail Organization Trail<br/>• Cost Management]
Security[Security Account<br/>• Security Hub<br/>• GuardDuty Master<br/>• AWS Config Aggregator]
Prod[Production Account<br/>• Production Workloads<br/>• PHI Data Processing<br/>• Model Endpoints]
Dev[Development Account<br/>• Development/Testing<br/>• Synthetic Data Only]
Data[Data Account<br/>• S3 Data Lake<br/>• Snowflake External Stages<br/>• Backup/Archive]
Org --> Mgmt
Org --> Security
Org --> Prod
Org --> Dev
Org --> Data
Security -.->|monitors| Prod
Security -.->|monitors| Dev
Security -.->|monitors| Data
classDef management fill:#FFA500,stroke:#FF8C00,stroke-width:2px
classDef security fill:#DC143C,stroke:#8B0000,stroke-width:2px
classDef prod fill:#228B22,stroke:#006400,stroke-width:2px
classDef dev fill:#4169E1,stroke:#0000CD,stroke-width:2px
classDef data fill:#9370DB,stroke:#4B0082,stroke-width:2px
class Mgmt management
class Security security
class Prod prod
class Dev dev
class Data data
```
---
## 3. Core Components
### 3.1 Data Ingestion Pipeline
#### Component Definition
```yaml
Epic FHIR Connector:
  Type: Lambda Function + EventBridge
  Runtime: Python 3.11
  Memory: 3008 MB
  Timeout: 15 minutes
  Triggers:
    - Scheduled: Rate(15 minutes)
    - On-demand: API Gateway
  Functions:
    - Bulk export initiation
    - Incremental data sync
    - FHIR resource extraction
  Output: S3 Raw Layer (NDJSON format)

Clarity Database Connector:
  Type: AWS Glue Job
  Worker Type: G.2X
  Workers: 2-10 (auto-scaling)
  Schedule: Daily at 2 AM UTC
  Tables:
    - CLARITY_DX (diagnoses)
    - CLARITY_PRC (procedures)
    - HNO_INFO (clinical notes)
    - PATIENT (demographics)
  Output: S3 Raw Layer (Parquet format)

Manual Upload Handler:
  Type: S3 Event + Lambda
  Supported Formats: CSV, JSON, HL7, PDF
  Validation: JSON Schema / HL7 Parser
  Processing:
    - Format detection
    - Schema validation
    - PHI detection (Macie)
    - Quarantine invalid files
```
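The manual upload handler's first step is format detection. A minimal sketch of content sniffing for the four supported formats is shown below; the function name and heuristics are illustrative, not the production implementation (which would also route `unknown` files to the quarantine prefix):

```python
import json

def detect_format(body: bytes) -> str:
    """Best-effort content sniffing for manually uploaded files.

    Returns one of: "json", "hl7", "pdf", "csv", "unknown".
    Heuristics are illustrative; production code would pair this with
    schema validation and quarantine routing.
    """
    head = body.lstrip()[:8]
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"MSH|"):  # HL7 v2 messages begin with an MSH segment
        return "hl7"
    try:
        json.loads(body)
        return "json"
    except (ValueError, UnicodeDecodeError):
        pass
    # Fall back to CSV if the first line splits into comma-separated cells
    first_line = body.split(b"\n", 1)[0]
    if first_line.count(b",") >= 1:
        return "csv"
    return "unknown"
```

Anything returning `unknown` (or failing the downstream schema check) would be moved to the quarantine bucket rather than the raw layer.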
### 3.2 Data Processing & Feature Engineering
#### Snowflake Architecture
```mermaid
graph LR
subgraph "External Sources"
FHIR[FHIR Data<br/>NDJSON]
Clarity[Clarity Data<br/>Parquet]
Manual[Manual Uploads<br/>CSV/JSON]
end
subgraph "Snowflake Database: MEDICAL_CODING_ML"
subgraph "RAW Schema"
Raw1[CLINICAL_DATA]
Raw2[FHIR_RESOURCES]
Raw3[CLARITY_EXTRACTS]
end
subgraph "STAGING Schema"
Stage1[CLEANED_ENCOUNTERS]
Stage2[VALIDATED_DIAGNOSES]
Stage3[PROCESSED_NOTES]
end
subgraph "FEATURES Schema"
Feat1[PATIENT_ENCOUNTERS]
Feat2[FEATURE_VECTORS]
Feat3[TEXT_EMBEDDINGS]
end
subgraph "ANALYTICS Schema"
Ana1[PREDICTION_RESULTS]
Ana2[MODEL_METRICS]
Ana3[CODING_ANALYTICS]
end
end
FHIR --> Raw2
Clarity --> Raw3
Manual --> Raw1
Raw1 --> Stage1
Raw2 --> Stage2
Raw3 --> Stage3
Stage1 --> Feat1
Stage2 --> Feat1
Stage3 --> Feat3
Feat1 --> Feat2
Feat3 --> Feat2
Feat2 --> Ana1
Ana1 --> Ana2
Ana1 --> Ana3
classDef external fill:#FFE0B2,stroke:#FF6F00,stroke-width:2px
classDef raw fill:#FFCDD2,stroke:#D32F2F,stroke-width:2px
classDef staging fill:#C5CAE9,stroke:#303F9F,stroke-width:2px
classDef features fill:#C8E6C9,stroke:#388E3C,stroke-width:2px
classDef analytics fill:#E1BEE7,stroke:#7B1FA2,stroke-width:2px
class FHIR,Clarity,Manual external
class Raw1,Raw2,Raw3 raw
class Stage1,Stage2,Stage3 staging
class Feat1,Feat2,Feat3 features
class Ana1,Ana2,Ana3 analytics
```
```sql
-- Database Structure
CREATE DATABASE IF NOT EXISTS MEDICAL_CODING_ML;
-- Schemas
CREATE SCHEMA IF NOT EXISTS RAW; -- Raw ingested data
CREATE SCHEMA IF NOT EXISTS STAGING; -- Cleaned, validated data
CREATE SCHEMA IF NOT EXISTS FEATURES; -- ML-ready features
CREATE SCHEMA IF NOT EXISTS ANALYTICS; -- Aggregated metrics
-- Key Tables
CREATE TABLE FEATURES.PATIENT_ENCOUNTERS (
    encounter_id             VARCHAR PRIMARY KEY,
    patient_id               VARCHAR,
    admission_date           TIMESTAMP,
    discharge_date           TIMESTAMP,
    principal_diagnosis      VARCHAR,
    secondary_diagnoses      ARRAY,
    procedures               ARRAY,
    clinical_notes_processed VARIANT,
    lab_results              VARIANT,
    vital_signs              VARIANT,
    features_vector          ARRAY,
    created_at               TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
);

-- Snowpipe for Continuous Ingestion
CREATE PIPE medical_coding_pipe
  AUTO_INGEST = TRUE
  AS
  COPY INTO RAW.CLINICAL_DATA
  FROM @medical_coding_stage
  FILE_FORMAT = (TYPE = 'PARQUET');
```
#### Feature Engineering Pipeline
```python
# Core Feature Categories
feature_pipeline = {
    "demographic_features": [
        "age", "gender", "race", "ethnicity",
    ],
    "clinical_features": [
        "chief_complaint_embedding",
        "diagnosis_count",
        "procedure_count",
        "comorbidity_index",
        "severity_scores",
    ],
    "temporal_features": [
        "length_of_stay",
        "icu_days",
        "readmission_risk",
        "seasonal_patterns",
    ],
    "text_features": [
        "clinical_note_embeddings",
        "discharge_summary_entities",
        "radiology_report_findings",
    ],
    "lab_features": [
        "abnormal_lab_count",
        "critical_values",
        "trend_indicators",
    ],
}
```
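As a concrete illustration of the temporal category, the sketch below derives those features from a single encounter record. Field names, the readmission-risk weights, and the season bucketing are all illustrative assumptions, not the production feature definitions:

```python
from datetime import datetime

def temporal_features(encounter: dict) -> dict:
    """Derive the temporal features listed above from one encounter record.

    Expects ISO-8601 admission/discharge timestamps. The risk formula is a
    toy placeholder standing in for the real readmission model.
    """
    admit = datetime.fromisoformat(encounter["admission_date"])
    discharge = datetime.fromisoformat(encounter["discharge_date"])
    los_days = (discharge - admit).total_seconds() / 86400.0
    return {
        "length_of_stay": round(los_days, 2),
        "icu_days": encounter.get("icu_days", 0),
        # Toy proxy: long stays and prior admissions raise readmission risk
        "readmission_risk": min(1.0, 0.05 * los_days
                                + 0.1 * encounter.get("prior_admissions", 0)),
        # Season bucket 0-3 as a simple seasonal-pattern feature
        "seasonal_patterns": (admit.month % 12) // 3,
    }
```

In production, logic like this would live in Snowpark UDFs writing into `FEATURES.FEATURE_VECTORS`.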
### 3.3 ML Model Architecture
```mermaid
graph LR
subgraph "Training Pipeline"
Data[Training Data<br/>from Snowflake]
Prep[Data Preprocessing<br/>• Tokenization<br/>• Feature Engineering]
Train[Model Training<br/>• DRG-LLaMA<br/>• Clinical BERT]
Eval[Model Evaluation<br/>• Accuracy Metrics<br/>• F1 Scores]
Reg[Model Registry<br/>• Version Control<br/>• Metadata]
end
subgraph "Deployment Pipeline"
Stage[Staging<br/>Endpoint]
AB[A/B Testing<br/>• Traffic Split<br/>• Performance Compare]
Prod[Production<br/>Endpoint]
Monitor[Model Monitor<br/>• Drift Detection<br/>• Performance Tracking]
end
subgraph "Inference Modes"
RT[Real-time<br/>• <500ms latency<br/>• Single predictions]
Batch[Batch Transform<br/>• Overnight processing<br/>• 10K records/batch]
end
Data --> Prep
Prep --> Train
Train --> Eval
Eval --> Reg
Reg --> Stage
Stage --> AB
AB --> Prod
Prod --> Monitor
Monitor -.->|Retrain Trigger| Data
Prod --> RT
Prod --> Batch
classDef training fill:#E8EAF6,stroke:#3F51B5,stroke-width:2px
classDef deploy fill:#E0F2F1,stroke:#00796B,stroke-width:2px
classDef inference fill:#FFF8E1,stroke:#F57F17,stroke-width:2px
class Data,Prep,Train,Eval,Reg training
class Stage,AB,Prod,Monitor deploy
class RT,Batch inference
```
#### DRG Prediction Model
```yaml
Model: DRG-LLaMA
Architecture:
  Base Model: LLaMA-13B
  Fine-tuning Dataset: 2M+ hospital admissions
  Context Window: 1024 tokens
Training Infrastructure:
  Instance: ml.p4d.24xlarge
  GPUs: 8x A100
  Training Time: ~48 hours
  Optimization:
    - Mixed Precision Training (FP16)
    - Gradient Checkpointing
    - DeepSpeed ZeRO-3
    - Learning Rate: 1e-5 with cosine schedule
Performance Targets:
  - Top-1 Accuracy: 54.6%
  - Top-5 Accuracy: 86.5%
  - Inference Latency: <500ms
```
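The 1024-token context window means discharge summaries must be truncated before inference. A minimal sketch of prompt construction is below; the instruction wording is hypothetical, and a whitespace split stands in for the real LLaMA tokenizer (the production path must count tokens with the model's own tokenizer):

```python
def build_drg_prompt(discharge_summary: str, max_tokens: int = 1024) -> str:
    """Truncate a discharge summary to the context window and wrap it in an
    instruction-style prompt.

    Whitespace tokenization is a stand-in for the LLaMA tokenizer; the real
    pipeline must tokenize with the model's vocabulary so the 1024-token
    budget is counted correctly.
    """
    instruction = "Predict the MS-DRG for the following discharge summary:\n"
    # Reserve part of the budget for the instruction itself
    budget = max_tokens - len(instruction.split())
    tokens = discharge_summary.split()
    if len(tokens) > budget:
        tokens = tokens[:budget]  # keep the head of the note; the tail is dropped
    return instruction + " ".join(tokens)
```

Head-truncation is shown for simplicity; whether the head or tail of a note carries more DRG signal is an evaluation question, not settled here.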
#### ICD-10 Prediction Model
```yaml
Model: Hierarchical Clinical BERT
Architecture:
  Base Model: Bio_ClinicalBERT
  Task Head: Hierarchical Attention Network
  Label Space: Top 1000 ICD-10 codes (expandable)
Training Infrastructure:
  Instance: ml.g5.12xlarge
  Training Time: ~24 hours
Multi-Label Strategy:
  - Label-wise attention mechanism
  - Hierarchical loss function
  - Focal loss for class imbalance
Performance Targets:
  - Micro-F1: 0.54
  - Macro-F1: 0.48
  - Top-10 Recall: 0.75
```
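The focal loss named above addresses the long-tail imbalance of ICD-10 codes by down-weighting examples the model already classifies confidently. A framework-free sketch over per-code sigmoid probabilities (the training code would use the equivalent tensor form in PyTorch):

```python
import math

def focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Multi-label focal loss over per-code sigmoid probabilities.

    The (1 - p_t)^gamma factor shrinks the loss on easy examples so rare
    codes dominate the gradient; alpha rebalances positives vs negatives.
    Default gamma/alpha follow the common focal-loss settings, not a value
    fixed by this design.
    """
    eps = 1e-9
    total = 0.0
    for p, y in zip(probs, labels):
        p_t = p if y == 1 else 1.0 - p          # prob assigned to the true label
        a_t = alpha if y == 1 else 1.0 - alpha
        total += -a_t * (1.0 - p_t) ** gamma * math.log(p_t + eps)
    return total / len(probs)
```

With gamma = 0 and alpha = 0.5 this reduces to (scaled) binary cross-entropy, which is the useful sanity check when tuning.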
### 3.4 Inference Architecture
#### Real-time Inference
```python
# SageMaker Multi-Model Endpoint Configuration
# Note: auto-scaling is not part of the CreateEndpointConfig API; the
# "AutoScaling" block below is shorthand for a scaling policy registered
# separately via Application Auto Scaling.
endpoint_config = {
    "EndpointName": "medical-coding-realtime",
    "ProductionVariants": [
        {
            "VariantName": "drg-model",
            "ModelName": "drg-llama-v1",
            "InstanceType": "ml.g5.2xlarge",
            "InitialInstanceCount": 2,
            "AutoScaling": {
                "MinCapacity": 2,
                "MaxCapacity": 10,
                "TargetValue": 100,  # requests per second
                "ScaleInCooldown": 300,
                "ScaleOutCooldown": 60,
            },
        },
        {
            "VariantName": "icd10-model",
            "ModelName": "clinical-bert-icd10-v1",
            "InstanceType": "ml.g5.xlarge",
            "InitialInstanceCount": 2,
        },
    ],
}
```
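Because both variants live on one endpoint, callers select a model via `TargetVariant`. A sketch of the client side follows; the helper name and payload shape are assumptions, but `TargetVariant` is a real `invoke_endpoint` parameter, and the returned kwargs would be passed straight to `boto3.client("sagemaker-runtime").invoke_endpoint(**kwargs)`:

```python
import json

def invoke_request(model: str, features: dict) -> dict:
    """Build the kwargs for sagemaker-runtime invoke_endpoint, routing the
    request to the DRG or ICD-10 variant of the shared endpoint.

    The payload shape ({"instances": [...]}) is an assumed serving contract.
    """
    variants = {"drg": "drg-model", "icd10": "icd10-model"}
    if model not in variants:
        raise ValueError(f"unknown model {model!r}")
    return {
        "EndpointName": "medical-coding-realtime",
        "TargetVariant": variants[model],  # pin the production variant
        "ContentType": "application/json",
        "Body": json.dumps({"instances": [features]}),
    }
```

Omitting `TargetVariant` would instead let SageMaker split traffic by variant weight, which is how the A/B testing phase is expected to run.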
#### Batch Processing
```python
# Step Functions State Machine Definition (Python dict form).
# Resource ARNs are illustrative placeholders. Snowflake is not a native
# Step Functions service integration, so ExtractFeatures would in practice
# run its query through a Lambda function or Snowflake external function.
batch_pipeline = {
    "StartAt": "ValidateInput",
    "States": {
        "ValidateInput": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:validate-batch-input",
            "Next": "ExtractFeatures",
        },
        "ExtractFeatures": {
            "Type": "Task",
            "Resource": "arn:aws:states:::snowflake:query",
            "Next": "ParallelPrediction",
        },
        "ParallelPrediction": {
            "Type": "Parallel",
            "Branches": [
                {
                    "StartAt": "PredictDRG",
                    "States": {
                        "PredictDRG": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::sagemaker:transform",
                            "End": True,
                        }
                    },
                },
                {
                    "StartAt": "PredictICD10",
                    "States": {
                        "PredictICD10": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::sagemaker:transform",
                            "End": True,
                        }
                    },
                },
            ],
            "Next": "PostProcess",
        },
        "PostProcess": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:postprocess-predictions",
            "End": True,
        },
    },
}
```
---
## 4. Data Flow Specifications
### 4.1 Real-time Data Flow
```mermaid
sequenceDiagram
participant Client
participant API Gateway
participant Lambda
participant Feature Store
participant SageMaker
participant DynamoDB
Client->>API Gateway: POST /predict/real-time
API Gateway->>Lambda: Invoke Preprocessor
Lambda->>Feature Store: Fetch Features
Feature Store-->>Lambda: Return Features
Lambda->>SageMaker: Invoke Endpoint
SageMaker-->>Lambda: Predictions
Lambda->>DynamoDB: Store Results
Lambda-->>API Gateway: Response
API Gateway-->>Client: JSON Response
```
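The Lambda preprocessor in the sequence above is essentially glue logic. A sketch is below, with the three collaborators injected as duck-typed objects so the handler stays unit-testable; in the deployed Lambda they would be the Feature Store, SageMaker runtime, and DynamoDB clients. All names here are illustrative:

```python
import json
import time

def handler(event, feature_store, endpoint, results_table):
    """Glue logic for the real-time flow: fetch features, invoke the
    endpoint, persist the result, and shape the API Gateway response.
    """
    body = json.loads(event["body"])
    features = feature_store.get(body["encounter_id"])  # fetch features
    prediction = endpoint.predict(features)             # invoke endpoint
    record = {
        "encounter_id": body["encounter_id"],
        "prediction": prediction,
        "ts": int(time.time()),
    }
    results_table.put(record)                           # store results
    return {"statusCode": 200, "body": json.dumps(record)}
```

Dependency injection like this also makes the p50 < 200ms budget easier to attribute: each collaborator can be timed independently in X-Ray subsegments.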
### 4.2 Batch Processing Flow
```mermaid
graph LR
subgraph "Stage 1: Data Extraction [0-10 min]"
Source[Snowflake/S3<br/>Source Data]
Extract[Extract Pipeline<br/>• Parquet format<br/>• Date/facility partitions]
end
subgraph "Stage 2: Feature Engineering [10-30 min]"
Struct[Snowpark UDFs<br/>Structured Features]
NLP[SageMaker Processing<br/>NLP Features]
Vectors[Feature Vectors<br/>in S3]
end
subgraph "Stage 3: Model Inference [30-90 min]"
Transform[Batch Transform Jobs<br/>• Parallel processing<br/>• 10K records/batch<br/>• GPU optimization]
DRGBatch[DRG Model<br/>Predictions]
ICDBatch[ICD-10 Model<br/>Predictions]
end
subgraph "Stage 4: Post-processing [90-100 min]"
Confidence[Confidence Scoring]
Validation[Hierarchical<br/>Code Validation]
Aggregate[Result<br/>Aggregation]
end
subgraph "Stage 5: Result Storage [100-110 min]"
Snow[Write to<br/>Snowflake]
Dash[Update<br/>Dashboards]
Notify[Trigger<br/>Notifications]
end
Source --> Extract
Extract --> Struct
Extract --> NLP
Struct --> Vectors
NLP --> Vectors
Vectors --> Transform
Transform --> DRGBatch
Transform --> ICDBatch
DRGBatch --> Confidence
ICDBatch --> Confidence
Confidence --> Validation
Validation --> Aggregate
Aggregate --> Snow
Snow --> Dash
Snow --> Notify
classDef extraction fill:#E8F5E9,stroke:#4CAF50,stroke-width:2px
classDef feature fill:#E3F2FD,stroke:#2196F3,stroke-width:2px
classDef inference fill:#FFF3E0,stroke:#FF9800,stroke-width:2px
classDef processing fill:#F3E5F5,stroke:#9C27B0,stroke-width:2px
classDef storage fill:#FFEBEE,stroke:#F44336,stroke-width:2px
class Source,Extract extraction
class Struct,NLP,Vectors feature
class Transform,DRGBatch,ICDBatch inference
class Confidence,Validation,Aggregate processing
class Snow,Dash,Notify storage
```
---
## 5. Security & Compliance
```mermaid
graph TB
subgraph "Data Security Layers"
subgraph "Data at Rest"
S3KMS[S3 with SSE-KMS<br/>Customer Managed Keys]
SnowEnc[Snowflake Tri-Secret<br/>Secure Encryption]
EBSEnc[EBS Encrypted<br/>Volumes]
end
subgraph "Data in Transit"
TLS[TLS 1.2+ All Connections]
PLink[PrivateLink for<br/>Snowflake]
VPCEnd[VPC Endpoints<br/>for AWS Services]
CertPin[Certificate Pinning<br/>for Epic APIs]
end
subgraph "Access Control"
IAM[IAM Roles &<br/>Policies]
MFA[Multi-Factor<br/>Authentication]
RBAC[Role-Based<br/>Access Control]
Secrets[AWS Secrets<br/>Manager]
end
subgraph "Audit & Compliance"
Trail[CloudTrail<br/>Logging]
Config[AWS Config<br/>Rules]
Hub[Security Hub<br/>Monitoring]
Macie[Amazon Macie<br/>PHI Detection]
end
end
S3KMS --> IAM
SnowEnc --> IAM
EBSEnc --> IAM
TLS --> RBAC
PLink --> RBAC
VPCEnd --> RBAC
CertPin --> RBAC
IAM --> Trail
MFA --> Trail
RBAC --> Trail
Secrets --> Trail
Trail --> Config
Config --> Hub
Hub --> Macie
classDef encryption fill:#FFF3E0,stroke:#F57C00,stroke-width:2px
classDef transit fill:#E8F5E9,stroke:#2E7D32,stroke-width:2px
classDef access fill:#E3F2FD,stroke:#1565C0,stroke-width:2px
classDef audit fill:#FCE4EC,stroke:#C2185B,stroke-width:2px
class S3KMS,SnowEnc,EBSEnc encryption
class TLS,PLink,VPCEnd,CertPin transit
class IAM,MFA,RBAC,Secrets access
class Trail,Config,Hub,Macie audit
```
### 5.1 Encryption Strategy
```yaml
Data at Rest:
  S3:
    - Server-side encryption: SSE-KMS
    - KMS Key: Customer Managed (CMK)
    - Rotation: Annual
  Snowflake:
    - Tri-Secret Secure encryption
    - Customer-managed keys in AWS KMS
  EBS Volumes:
    - Encrypted by default
    - KMS key per environment

Data in Transit:
  - TLS 1.2+ for all connections
  - PrivateLink for Snowflake connectivity
  - VPC Endpoints for AWS services
  - Certificate pinning for Epic APIs
```
### 5.2 Access Control
```yaml
IAM Roles:
  MLEngineerRole:
    - SageMaker full access
    - S3 read/write to ML buckets
    - Snowflake external stage access
  DataScientistRole:
    - SageMaker training/tuning
    - S3 read-only to production data
    - CloudWatch metrics access
  ApplicationRole:
    - SageMaker endpoint invoke
    - DynamoDB read/write
    - S3 read to model artifacts
  AuditorRole:
    - CloudTrail read-only
    - S3 audit logs access
    - Compliance report generation
```
### 5.3 Audit & Monitoring
```python
# CloudWatch Metrics Configuration
custom_metrics = {
    "Model Performance": [
        "prediction_accuracy",
        "inference_latency",
        "endpoint_availability",
    ],
    "Data Quality": [
        "missing_field_rate",
        "schema_validation_failures",
        "data_drift_score",
    ],
    "Security": [
        "unauthorized_access_attempts",
        "phi_exposure_events",
        "encryption_failures",
    ],
    "Business": [
        "daily_prediction_volume",
        "code_distribution",
        "cost_per_prediction",
    ],
}

# Alerting Thresholds
alerts = {
    "Critical": {
        "accuracy_drop": "< 50%",
        "endpoint_failure": "availability < 99%",
        "data_breach": "any PHI exposure",
    },
    "Warning": {
        "accuracy_degradation": "< 52%",
        "high_latency": "> 1000ms p99",
        "cost_spike": "> 150% daily average",
    },
}
```
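The threshold strings above can be evaluated mechanically against current metric values. A sketch follows; metric key names are assumptions, and the thresholds mirror the `alerts` dict:

```python
def evaluate_alerts(metrics: dict) -> list:
    """Map current metric values onto the Critical/Warning thresholds above.

    Returns (severity, alert_name) pairs for every threshold breached.
    Missing metrics are treated as healthy rather than alarming.
    """
    fired = []
    acc = metrics.get("accuracy")
    if acc is not None:
        if acc < 0.50:
            fired.append(("Critical", "accuracy_drop"))
        elif acc < 0.52:
            fired.append(("Warning", "accuracy_degradation"))
    if metrics.get("availability", 1.0) < 0.99:
        fired.append(("Critical", "endpoint_failure"))
    if metrics.get("p99_latency_ms", 0) > 1000:
        fired.append(("Warning", "high_latency"))
    # Cost spike: today's spend exceeds 150% of the rolling daily average
    if metrics.get("daily_cost", 0) > 1.5 * metrics.get("avg_daily_cost", float("inf")):
        fired.append(("Warning", "cost_spike"))
    return fired
```

Treating missing metrics as healthy is a deliberate simplification here; the production rules should probably alarm on missing data instead.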
---
## 6. Implementation Phases
```mermaid
gantt
title Medical Coding ML System Implementation Timeline
dateFormat YYYY-MM-DD
section Phase 1 Foundation
Infrastructure Setup :done, p1a, 2024-01-01, 14d
Data Pipeline Foundation :done, p1b, after p1a, 14d
section Phase 2 ML Dev
Feature Engineering :active, p2a, after p1b, 14d
Model Training :p2b, after p2a, 28d
Inference Pipeline :p2c, after p2b, 14d
section Phase 3 Production
Integration & Testing :p3a, after p2c, 14d
Deployment & Monitoring :p3b, after p3a, 14d
section Phase 4 Optimization
Performance Optimization :p4a, after p3b, 28d
Epic Integration Prep :p4b, after p4a, 28d
```
### Phase 1: Foundation (Weeks 1-4)
```mermaid
flowchart LR
subgraph "Week 1-2: Infrastructure"
A1[AWS Account<br/>Structure] --> A2[VPC &<br/>Networking]
A2 --> A3[IAM Roles<br/>& Policies]
A3 --> A4[KMS Key<br/>Generation]
A4 --> A5[Snowflake<br/>AWS Integration]
end
subgraph "Week 3-4: Data Pipeline"
B1[S3 Bucket<br/>Structure] --> B2[Snowpipe<br/>Configuration]
B2 --> B3[AWS Glue<br/>ETL Setup]
B3 --> B4[Data Quality<br/>Framework]
B4 --> B5[Epic FHIR<br/>Connector]
end
A5 --> B1
```
### Phase 2: ML Development (Weeks 5-12)
```mermaid
flowchart LR
subgraph "Week 5-6: Features"
C1[Snowflake<br/>Feature Tables] --> C2[NLP<br/>Preprocessing]
C2 --> C3[Feature Store<br/>Setup]
C3 --> C4[Data Validation<br/>Rules]
end
subgraph "Week 7-10: Training"
D1[SageMaker<br/>Environment] --> D2[DRG-LLaMA<br/>Fine-tuning]
D2 --> D3[Clinical BERT<br/>Training]
D3 --> D4[Hyperparameter<br/>Optimization]
D4 --> D5[Model<br/>Evaluation]
end
subgraph "Week 11-12: Inference"
E1[Endpoint<br/>Deployment] --> E2[Batch Transform<br/>Setup]
E2 --> E3[A/B Testing<br/>Framework]
E3 --> E4[Performance<br/>Benchmarking]
end
C4 --> D1
D5 --> E1
```
### Phase 3: Production Readiness (Weeks 13-16)
```mermaid
flowchart LR
subgraph "Week 13-14: Integration"
F1[API Gateway<br/>Config] --> F2[End-to-End<br/>Testing]
F2 --> F3[Load<br/>Testing]
F3 --> F4[Security<br/>Scanning]
F4 --> F5[DR<br/>Testing]
end
subgraph "Week 15-16: Deployment"
G1[Production<br/>Deployment] --> G2[Monitoring<br/>Dashboard]
G2 --> G3[Alerting<br/>Configuration]
G3 --> G4[Documentation<br/>Completion]
G4 --> G5[Shadow Mode<br/>Activation]
end
F5 --> G1
```
### Phase 4: Optimization & Scale (Weeks 17-24)
```mermaid
flowchart LR
subgraph "Week 17-20: Optimization"
H1[Model<br/>Compression] --> H2[Inference<br/>Optimization]
H2 --> H3[Cost<br/>Optimization]
H3 --> H4[Cache<br/>Implementation]
end
subgraph "Week 21-24: Epic Integration"
I1[FHIR API<br/>Deep Integration] --> I2[Clarity DB<br/>Connection]
I2 --> I3[Workflow<br/>Integration Design]
I3 --> I4[Clinical<br/>Validation]
end
H4 --> I1
```
---
## 7. Cost Optimization Strategies
### 7.1 Compute Optimization
```yaml
SageMaker:
  - Savings Plans: 3-year commitment for up to 64% savings
  - Spot Training: up to 70% cost reduction for training jobs
  - Multi-model endpoints: Share infrastructure
  - Auto-scaling: Scale in during off-hours (scale-to-zero requires serverless inference)
Snowflake:
  - Auto-suspend: 10-minute idle timeout
  - Warehouse sizing: Start small, scale as needed
  - Result caching: 24-hour cache retention
  - Clustering keys: Optimize query performance
Lambda:
  - Reserved concurrency: Control costs
  - Graviton2: ~20% better price-performance
  - Memory optimization: Right-size based on profiling
```
### 7.2 Storage Optimization
```yaml
S3 Lifecycle Policies:
  - Infrequent Access: After 30 days
  - Glacier: After 90 days
  - Expiration: After 7 years (HIPAA retention requirement)
Data Compression:
  - Parquet format: ~70% compression ratio
  - Gzip for JSON: ~60% reduction
  - Model compression: Quantization to INT8
```
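The lifecycle policy above maps directly onto an S3 lifecycle configuration. A sketch in the shape accepted by boto3's `put_bucket_lifecycle_configuration` follows; the rule ID and `raw/` prefix are assumptions:

```python
# Lifecycle configuration matching the policy above, in the shape expected
# by s3.put_bucket_lifecycle_configuration(LifecycleConfiguration=...).
lifecycle_policy = {
    "Rules": [
        {
            "ID": "phi-retention",          # illustrative rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "raw/"},   # assumed data-lake prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access
                {"Days": 90, "StorageClass": "GLACIER"},      # archive
            ],
            # 7-year retention before deletion
            "Expiration": {"Days": 7 * 365},
        }
    ]
}
```

In practice this would be managed in Terraform alongside the bucket definitions rather than applied imperatively.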
---
## 8. Performance Targets & SLAs
### 8.1 System Performance
```yaml
Availability:
  - API Uptime: 99.9% (43.8 min/month downtime)
  - Batch Processing: 99.5% success rate
Latency:
  - Real-time Inference: p50 < 200ms, p99 < 1000ms
  - Batch Processing: < 2 hours for 100K records
Throughput:
  - Real-time: 1000 requests/second
  - Batch: 1M records/day
Accuracy:
  - DRG Top-1: > 54%
  - DRG Top-5: > 85%
  - ICD-10 Micro-F1: > 0.52
```
### 8.2 Operational Metrics
```yaml
Recovery Objectives:
  - RTO: 4 hours
  - RPO: 1 hour
Model Management:
  - Retraining Frequency: Monthly
  - A/B Test Duration: 2 weeks minimum
  - Rollback Time: < 5 minutes
```
---
## 9. Monitoring & Observability
```mermaid
graph TB
subgraph "Data Sources"
SM[SageMaker<br/>Endpoints]
Lambda[Lambda<br/>Functions]
Snow[Snowflake<br/>Queries]
API[API Gateway]
end
subgraph "Metrics Collection"
CW[CloudWatch Metrics]
Logs[CloudWatch Logs]
XRay[AWS X-Ray<br/>Tracing]
Custom[Custom Metrics<br/>via SDK]
end
subgraph "Monitoring Dashboards"
Perf[Model Performance<br/>• Accuracy<br/>• Latency<br/>• Throughput]
DQ[Data Quality<br/>• Missing Fields<br/>• Schema Violations<br/>• Data Drift]
Sec[Security<br/>• Access Attempts<br/>• PHI Events<br/>• Encryption Status]
Biz[Business Metrics<br/>• Daily Volume<br/>• Code Distribution<br/>• Cost/Prediction]
end
subgraph "Alerting"
Crit[Critical Alerts<br/>• Accuracy < 50%<br/>• Endpoint Failure<br/>• Data Breach]
Warn[Warning Alerts<br/>• Accuracy < 52%<br/>• High Latency<br/>• Cost Spike]
SNS[Amazon SNS]
PD[PagerDuty<br/>Integration]
end
SM --> CW
Lambda --> CW
Snow --> Custom
API --> CW
SM --> Logs
Lambda --> Logs
API --> XRay
CW --> Perf
Logs --> DQ
XRay --> Perf
Custom --> Biz
Perf --> Crit
DQ --> Warn
Sec --> Crit
Biz --> Warn
Crit --> SNS
Warn --> SNS
SNS --> PD
classDef source fill:#E1F5FE,stroke:#0277BD,stroke-width:2px
classDef collect fill:#F3E5F5,stroke:#6A1B9A,stroke-width:2px
classDef dashboard fill:#E8F5E9,stroke:#2E7D32,stroke-width:2px
classDef alert fill:#FFEBEE,stroke:#C62828,stroke-width:2px
class SM,Lambda,Snow,API source
class CW,Logs,XRay,Custom collect
class Perf,DQ,Sec,Biz dashboard
class Crit,Warn,SNS,PD alert
```
### 9.1 Dashboard Configuration
```python
# CloudWatch Dashboard Definition
dashboard = {
    "name": "MedicalCodingML-Operations",
    "widgets": [
        {
            "type": "metric",
            "properties": {
                "metrics": [
                    ["AWS/SageMaker", "ModelLatency", {"stat": "Average"}],
                    ["AWS/SageMaker", "Invocations", {"stat": "Sum"}],
                    ["Custom", "PredictionAccuracy", {"stat": "Average"}],
                ],
                "period": 300,
                "region": "us-east-1",
                "title": "Model Performance",
            },
        },
        {
            "type": "log",
            "properties": {
                "query": """
                    fields @timestamp, accuracy, model_version
                    | filter @type = "PREDICTION_RESULT"
                    | stats avg(accuracy) by bin(5m)
                """,
                "region": "us-east-1",
                "title": "Accuracy Trend",
            },
        },
    ],
}
```
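The `Custom/PredictionAccuracy` series in the dashboard has to be published by the application itself. A sketch of building one datum for `cloudwatch.put_metric_data(Namespace="Custom", MetricData=[...])` follows; the `ModelVersion` dimension is an assumption of this sketch:

```python
import datetime

def accuracy_datum(value: float, model_version: str) -> dict:
    """One datum for CloudWatch put_metric_data, feeding the
    PredictionAccuracy widget above. The ModelVersion dimension is
    illustrative; any stable identifier for the deployed model works.
    """
    return {
        "MetricName": "PredictionAccuracy",
        "Dimensions": [{"Name": "ModelVersion", "Value": model_version}],
        "Timestamp": datetime.datetime.now(datetime.timezone.utc),
        "Value": value,
        "Unit": "Percent",
    }
```

Batching up to 1000 data points per `put_metric_data` call keeps the per-request overhead (and cost) down at the stated prediction volumes.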
### 9.2 Alerting Rules
```yaml
Critical Alerts:
  - Model accuracy < 50%
  - Endpoint health check failures
  - Data pipeline failures > 2 consecutive
  - PHI access violations
Warning Alerts:
  - Inference latency p99 > 1s
  - Daily cost > $5000
  - Model drift detected
  - Queue depth > 10000
```
---
## 10. Future Enhancements
### Near-term (3-6 months)
- Multi-language support for clinical notes
- Automated retraining pipelines
- Explainability dashboard for predictions
- Integration with additional EHR systems
### Medium-term (6-12 months)
- Epic Cognitive Computing Platform integration
- Real-time learning from coder feedback
- Multi-site federated learning
- Advanced ensemble methods
### Long-term (12+ months)
- Full autonomous coding for specific specialties
- Predictive analytics for coding optimization
- Natural language query interface
- Cross-institutional benchmarking
---
## Appendix A: Configuration Files
### A.1 Terraform Variables
```hcl
variable "environment" {
  description = "Deployment environment"
  type        = string
  default     = "production"
}

variable "ml_instance_types" {
  description = "SageMaker instance types for endpoints"
  type        = map(string)
  default = {
    drg_model   = "ml.g5.2xlarge"
    icd10_model = "ml.g5.xlarge"
  }
}

variable "snowflake_account" {
  description = "Snowflake account identifier"
  type        = string
  sensitive   = true
}
```
### A.2 Model Configuration
```json
{
  "drg_model": {
    "name": "drg-llama-v1",
    "framework": "pytorch",
    "framework_version": "2.0",
    "max_sequence_length": 1024,
    "batch_size": 32,
    "quantization": "int8"
  },
  "icd10_model": {
    "name": "clinical-bert-icd10",
    "framework": "transformers",
    "framework_version": "4.35",
    "max_sequence_length": 512,
    "num_labels": 1000,
    "attention_heads": 12
  }
}
```
---
## Appendix B: Troubleshooting Guide
### Common Issues and Resolutions
1. **High Inference Latency**
   - Check endpoint instance metrics
   - Verify batch size configuration
   - Consider upgrading instance type
   - Enable SageMaker Model Monitor
2. **Data Pipeline Failures**
   - Validate Snowpipe notification configuration
   - Check S3 bucket permissions
   - Review CloudWatch logs for Lambda errors
   - Verify network connectivity
3. **Model Accuracy Degradation**
   - Analyze data drift metrics
   - Review recent data quality issues
   - Check for schema changes in source systems
   - Trigger manual retraining if needed
4. **Cost Overruns**
   - Review SageMaker endpoint utilization
   - Check for runaway Snowflake queries
   - Audit S3 storage classes
   - Implement auto-scaling policies