Exercise Solution: Create Synthetic Data for your FHIR Server
HAPI FHIR Server with Synthea Exercise
This exercise shows how to run a local HAPI FHIR server and populate it with synthetic patient data from Synthea.
Reference: the Synthea repository and its README (synthetichealth/synthea on GitHub)
Prerequisites
- Docker Desktop (with Docker Compose)
- Java 11 or 17 (LTS recommended)
- Git
- curl
NOTE: If you are using the zip file containing the FHIR sample data, you can skip steps 4 and 5 and go directly to step 6. From inside the extracted folder, run the commands there to load the data into the HAPI FHIR server.
Setup Configuration Files
Step 1: Create Docker Compose Configuration
Create a docker-compose.yml file with the following content:
version: '3.7'
services:
  # HAPI FHIR server container
  fhir:
    container_name: fhir
    image: "hapiproject/hapi:v7.0.0"
    ports:
      - "8080:8080"                              # Expose FHIR API on localhost:8080
    configs:
      - source: hapi
        target: /app/config/application.yaml     # Mount custom config
    depends_on:
      - db                                       # Wait for database to start first
  # PostgreSQL database for FHIR data persistence
  db:
    image: postgres:14
    restart: always
    environment:
      POSTGRES_PASSWORD: admin
      POSTGRES_USER: admin
      POSTGRES_DB: hapi
    volumes:
      - ./hapi.postgres.data:/var/lib/postgresql/data   # Persist data locally
configs:
  hapi:
    file: ./hapi.application.yaml                # Reference to HAPI config file
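Optional: before moving on, you can have Docker Compose parse and validate this file, which catches indentation or key errors early. This quick check is not part of the exercise itself:
# Validate and print the resolved Compose configuration
# (prints an error instead if docker-compose.yml is malformed)
docker compose config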
Step 2: Create HAPI Application Configuration
Create a hapi.application.yaml file in the same directory with the following content:
spring:
  datasource:
    # PostgreSQL connection details (must match the docker-compose db service)
    url: 'jdbc:postgresql://db:5432/hapi'
    username: admin
    password: admin
    driverClassName: org.postgresql.Driver
  jpa:
    properties:
      # Use the HAPI-specific PostgreSQL dialect for optimal performance
      hibernate.dialect: ca.uhn.fhir.jpa.model.dialect.HapiFhirPostgres94Dialect
      # Disable search indexing for faster startup (optional)
      hibernate.search.enabled: false
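Optional: once the containers are running (after Step 3), you can confirm that this file was actually mounted into the FHIR container. A minimal check, assuming the fhir service name from the docker-compose.yml above:
# Print the configuration file as seen from inside the fhir container
docker compose exec fhir cat /app/config/application.yaml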
Running the Services
Step 3: Start HAPI FHIR Server and Database
From the project folder containing your configuration files:
# Start the containers in detached mode
docker compose up -d
# Wait 30-60 seconds for services to initialize, then verify the server is running
# This should return the FHIR CapabilityStatement, indicating the server is ready
curl -s http://localhost:8080/fhir/metadata | head -n 20
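If the metadata request fails or hangs, the server is most likely still starting up or cannot reach the database. A quick way to troubleshoot, assuming the service names from the docker-compose.yml above:
# Show container status; both fhir and db should be "Up"
docker compose ps
# Follow the HAPI FHIR server logs and watch for a "Started Application" line or database errors
docker compose logs -f fhir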
Optional: Test FHIR Server Connectivity
Verify that the server can accept FHIR resources (the command should return the created Patient resource with a server-assigned ID):
curl -sS -H 'Content-Type: application/fhir+json' -X POST \
http://localhost:8080/fhir/Patient \
--data-binary '{"resourceType":"Patient","name":[{"use":"official","family":"Test","given":["Upload"]}],"gender":"female","birthDate":"1980-01-01"}'
Generating Synthetic Data
Step 4: Clone and Build Synthea
Clone Synthea alongside this project and build it:
# Clone the Synthea repository
git clone https://github.com/synthetichealth/synthea.git
cd synthea
# Build Synthea (skip tests for faster build)
./gradlew build -x test
Step 5: Generate Synthetic Patient Data
Generate 100 synthetic patients as FHIR R4 transaction bundles:
# Generate 100 patients with FHIR R4 export enabled
./run_synthea \
-p 100 \
--exporter.fhir.export=true \
--exporter.fhir.transaction_bundle=true \
--exporter.fhir.upload=false
Note: Generated files will be under synthea/output/fhir/
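Before uploading anything, it can help to sanity-check what Synthea produced. A small sketch, assuming you are still in the synthea directory and have jq installed (jq is not listed in the prerequisites):
# Count the generated bundle files
ls output/fhir/*.json | wc -l
# Inspect one bundle: resource type, bundle type, and number of entries (requires jq)
first=$(ls output/fhir/*.json | head -n 1)
jq -r '"\(.resourceType) / \(.type) with \(.entry | length) entries"' "$first"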
Loading Data into HAPI FHIR
Step 6: Upload Organization and Practitioner
Synthea generates system-wide bundles that patient records reference. Upload these first to ensure all references resolve correctly:
# Navigate to the FHIR output directory
cd output/fhir
# Upload hospital/organization information first
# This contains Organization resources that patient encounters reference
for f in hospitalInformation*.json; do
  [ -e "$f" ] || continue
  echo "Uploading $f"
  curl -sS -H 'Accept: application/fhir+json' \
       -H 'Content-Type: application/fhir+json;charset=utf-8' \
       -X POST http://localhost:8080/fhir \
       --data-binary "@$f" -o /tmp/resp.json -w "HTTP %{http_code}\n"
  head -n 40 /tmp/resp.json
done
# Upload practitioner information
# This contains Practitioner resources that patient encounters reference
for f in practitionerInformation*.json; do
  [ -e "$f" ] || continue
  echo "Uploading $f"
  curl -sS -H 'Accept: application/fhir+json' \
       -H 'Content-Type: application/fhir+json;charset=utf-8' \
       -X POST http://localhost:8080/fhir \
       --data-binary "@$f" -o /tmp/resp.json -w "HTTP %{http_code}\n"
  head -n 40 /tmp/resp.json
done
Why this step is important: the patient bundles contain references to Organization and Practitioner resources. Loading those resources first ensures that every reference resolves when the patient data is uploaded.
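If you want to see these references for yourself, a quick grep over one patient bundle shows where Organization and Practitioner resources are referenced. A rough sketch (run from output/fhir); the exact reference format can vary between Synthea versions:
# List Organization/Practitioner references inside one patient bundle
f=$(ls *.json | grep -v -e '^hospitalInformation' -e '^practitionerInformation' | head -n 1)
grep -o '"reference": *"[^"]*"' "$f" | grep -E 'Organization|Practitioner' | sort | uniq -c | head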
Step 7: Upload Patient Bundles
Upload all remaining transaction bundles, excluding the metadata files uploaded above:
# Navigate to the FHIR output directory if you are not already there from Step 6
cd output/fhir   # adjust the path as needed
# Upload all patient bundle files, skipping metadata files
for f in *.json; do
  case "$f" in
    hospitalInformation*|practitionerInformation*) continue;;
  esac
  echo "Uploading $f"
  code=$(curl -sS -o /tmp/resp.json -w "%{http_code}" \
              -H 'Accept: application/fhir+json' \
              -H 'Content-Type: application/fhir+json;charset=utf-8' \
              -X POST http://localhost:8080/fhir \
              --data-binary "@$f")
  echo "HTTP $code"
  if [ "$code" -ge 400 ]; then
    echo "Error response:"; head -n 120 /tmp/resp.json; break
  fi
done
Step 8: Verify the Data Import
# Check total number of patients imported
curl -s 'http://localhost:8080/fhir/Patient?_summary=count'
# Fetch a sample of patient records to verify data structure
curl -s 'http://localhost:8080/fhir/Patient?_count=5' | head -n 120
# Optional: Check other resource types
curl -s 'http://localhost:8080/fhir/Observation?_summary=count'
curl -s 'http://localhost:8080/fhir/Encounter?_summary=count'
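To get a quick overview across several resource types at once, you can loop over the count queries. A small sketch that pulls the total out of the returned Bundle with grep; the list of resource types is just a sample of what Synthea typically generates:
# Print resource counts for a few common Synthea resource types
for rt in Patient Encounter Observation Condition Procedure MedicationRequest Immunization; do
  total=$(curl -s "http://localhost:8080/fhir/$rt?_summary=count" \
          | grep -o '"total"[^0-9]*[0-9]*' | grep -o '[0-9]*$')
  echo "$rt: ${total:-unknown}"
done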