Research Data Pipeline Example

Scenario: An academic medical center needs to convert EHR data to the OMOP Common Data Model (CDM) for a multi-site research study on diabetes outcomes.

The Challenge

  • Multiple data sources (Epic, lab systems, pharmacy)
  • Complex OMOP concept mappings
  • IRB compliance and de-identification
  • Ongoing data refresh requirements

Wave Solution

1. Source Profiles:

  • Epic FHIR Condition resources
  • Epic FHIR MedicationRequest resources
  • Quest Diagnostics lab results (custom CSV)

2. Target Profiles:

  • OMOP CONDITION_OCCURRENCE table
  • OMOP DRUG_EXPOSURE table
  • OMOP MEASUREMENT table

3. Complex Mapping Example:

    # Generated by Wave for diabetes condition mapping
    # get_person_id and parse_fhir_date are helpers assumed to be defined
    # elsewhere in the generated module.
    def map_diabetes_condition_to_omop(fhir_condition: dict) -> dict:
        """Map FHIR Condition to OMOP CONDITION_OCCURRENCE"""

        # ICD-10 to OMOP concept mapping
        icd10_to_omop = {
            'E11.9': 201826,    # Type 2 diabetes without complications
            'E11.40': 4193704,  # Type 2 diabetes with diabetic neuropathy
            'E11.21': 4193323,  # Type 2 diabetes with diabetic nephropathy
        }

        # Use the first coding on the Condition as the source code
        icd10_code = fhir_condition['code']['coding'][0]['code']

        condition_occurrence = {
            'person_id': get_person_id(fhir_condition['subject']['reference']),
            # Fall back to 0, the OMOP convention for "no matching concept"
            'condition_concept_id': icd10_to_omop.get(icd10_code, 0),
            'condition_start_date': parse_fhir_date(
                fhir_condition['onsetDateTime']
            ),
            'condition_type_concept_id': 32020,  # EHR record
        }

        return condition_occurrence
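The mapping function above relies on two helpers, get_person_id and parse_fhir_date, that the example does not show. The following is a minimal sketch of what they might look like, not Wave's actual output. It assumes FHIR references of the form "Patient/<id>" and a hypothetical in-memory lookup table, PERSON_ID_BY_FHIR_ID, standing in for whatever the PERSON table build step produces.

```python
from datetime import date

# Hypothetical lookup from FHIR Patient id to OMOP person_id; in a real
# pipeline this would come from the step that populates the PERSON table.
PERSON_ID_BY_FHIR_ID = {"abc123": 1001}


def get_person_id(reference: str) -> int:
    """Resolve a FHIR reference such as 'Patient/abc123' to an OMOP person_id."""
    fhir_id = reference.rsplit("/", 1)[-1]
    return PERSON_ID_BY_FHIR_ID[fhir_id]


def parse_fhir_date(value: str) -> date:
    """Return the calendar date from a FHIR date or dateTime string.

    FHIR allows partial dates (YYYY or YYYY-MM); missing parts default to 1.
    """
    date_part = value.split("T", 1)[0]  # drop any time and timezone
    parts = [int(p) for p in date_part.split("-")]
    while len(parts) < 3:
        parts.append(1)
    return date(*parts)
```

Defaulting a missing month or day to 1 is one possible convention; a study protocol may call for a different rule or for rejecting partial dates outright.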

Results

  • ✅ 95% automated concept mapping accuracy
  • ✅ 2.3M patient records transformed successfully
  • ✅ Research-ready dataset in 3 days, versus 3 months manually

Key Takeaways

OMOP Expertise Built-In

Wave understands OMOP CDM structure and concept relationships, automatically generating mappings that follow OHDSI best practices.

Multi-Source Integration

Combine data from different EHR vendors, lab systems, and external sources into a unified research dataset.
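As an illustration of bringing a non-FHIR source into the same dataset, here is a rough sketch of turning rows from a lab CSV export into OMOP MEASUREMENT records. The CSV column names (patient_id, loinc_code, result_value, result_unit, collected_date), the function name, and the concept_id in LOINC_TO_CONCEPT are all assumptions made for the example; the real export layout and vocabulary lookup will differ.

```python
import csv
import io
from datetime import date

# Placeholder LOINC -> OMOP concept_id mapping. The concept_id is illustrative
# only; a real pipeline would look it up in the OMOP vocabulary tables.
LOINC_TO_CONCEPT = {"4548-4": 900002}  # 4548-4: Hemoglobin A1c


def lab_rows_to_measurements(csv_text: str, person_ids: dict[str, int]) -> list[dict]:
    """Convert lab CSV text into a list of MEASUREMENT-shaped dicts."""
    measurements = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        measurements.append({
            "person_id": person_ids[row["patient_id"]],
            # 0 is the OMOP convention for "no matching concept"
            "measurement_concept_id": LOINC_TO_CONCEPT.get(row["loinc_code"], 0),
            "measurement_date": date.fromisoformat(row["collected_date"]),
            "value_as_number": float(row["result_value"]),
            "unit_source_value": row["result_unit"],
            "measurement_source_value": row["loinc_code"],
        })
    return measurements
```

Keeping the original LOINC code in measurement_source_value means unmapped rows (concept_id 0) can still be reviewed and remapped later.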

Concept Mapping Automation

ICD-10, SNOMED, LOINC, and other coding systems are automatically mapped to OMOP standard concepts with high accuracy.
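In the OMOP vocabulary, a source code is typically resolved to a standard concept by finding its row in CONCEPT and then following a "Maps to" row in CONCEPT_RELATIONSHIP. The sketch below shows that lookup over small in-memory tables. It is an illustration of the general pattern, not Wave's internal logic; the function name is hypothetical and the source concept_id 900001 is a placeholder rather than a real vocabulary entry.

```python
# Illustrative rows only; a real pipeline would query the OMOP vocabulary tables.
CONCEPT = [
    # (concept_id, vocabulary_id, concept_code)
    (900001, "ICD10CM", "E11.9"),  # placeholder source concept_id
]
CONCEPT_RELATIONSHIP = [
    # (concept_id_1, relationship_id, concept_id_2)
    (900001, "Maps to", 201826),  # standard concept used in the example above
]


def to_standard_concepts(vocabulary_id: str, code: str) -> list[int]:
    """Return standard concept_ids for a source code, or [0] if unmapped."""
    source_ids = {
        cid for cid, vocab, c in CONCEPT if vocab == vocabulary_id and c == code
    }
    targets = [
        c2 for c1, rel, c2 in CONCEPT_RELATIONSHIP
        if c1 in source_ids and rel == "Maps to"
    ]
    return targets or [0]
```

The function returns a list because a single source code can map to more than one standard concept; callers then decide whether to emit one row per target.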

Compliance-Ready

Generated transformations include de-identification patterns and audit trails required for research compliance.
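Two de-identification patterns commonly used in research datasets are keyed pseudonymization of identifiers and per-person date shifting. The sketch below shows both under stated assumptions: the function names, the SECRET_KEY handling, and the 365-day window are hypothetical choices for illustration, and this is neither Wave's generated code nor a complete de-identification procedure.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical key; in practice it would be stored in a secrets manager,
# never in source code.
SECRET_KEY = b"replace-with-a-managed-secret"


def _digest(person_id: int, purpose: str) -> bytes:
    """Keyed SHA-256 digest, separated by purpose so outputs are unrelated."""
    message = f"{purpose}:{person_id}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()


def pseudonymize_person_id(person_id: int) -> str:
    """Return a stable 16-hex-character pseudonym for a person_id."""
    return _digest(person_id, "id").hex()[:16]


def shift_date(person_id: int, original: date, max_days: int = 365) -> date:
    """Shift a date by a fixed per-person offset in [-max_days, max_days]."""
    raw = int.from_bytes(_digest(person_id, "date")[:4], "big")
    offset = raw % (2 * max_days + 1) - max_days
    return original + timedelta(days=offset)
```

Because the offset depends only on the person and the key, every date for one patient moves by the same amount, so intervals between that patient's events are preserved.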