You have learned about saving and updating data in an openEHR system, but now you need to retrieve this saved data in ways that supports particular clinical requirements. This is where AQL (Archetype Query Language) comes in. This article provides a beginner-friendly introduction to AQL. Even though Simon tells us to start with "WHY", we are going explore the "WHAT" and "HOW" first and then the "WHY" will make perfect sense.
We will also go over a comparison of AQL with SQL, to get a intuitive understanding of AQL using a widely familiar declarative query language. If you are unfamiliar with SQL you can skip this section entirely.
If you are not very familiar with the two-level modelling approach of openEHR, here is an excellent article to get you upto speed. If you want to brush up on the technical basics of openEHR here is another excellent blog article.
What is AQL?#
AQL is a declarative query language specifically designed for openEHR systems, meaning you specify what data you want, not how to get it. AQL takes care of figuring out the most efficient way to retrieve the data from the openEHR system. It allows you to extract information from the complex, hierarchical structure of openEHR data in a way that's both powerful and intuitive.
Here's a simple AQL query:
SELECT
o/data/events/data/items[at0004]/value/magnitude AS systolic,
o/data/events/data/items[at0005]/value/magnitude AS diastolic
FROM EHR e
CONTAINS COMPOSITION c
CONTAINS OBSERVATION o
WHERE o/archetype_node_id = 'openEHR-EHR-OBSERVATION.blood_pressure.v2'
This query retrieves the systolic and diastolic blood pressure value from a observations stored within an EHR. Don't worry if it looks a bit intimidating at first; we'll break it down step by step.
AQL Basics#
The following are some key concepts of AQL queries:
Path-based syntax#
AQL uses paths to navigate the hierarchical structure of openEHR data. AQL uses the openEHR path syntax to refer to both high-level and detailed archetype nodes (objects and data values), as well as openEHR class instances and attributes that are part of the openEHR Reference Model but not defined in an archetype.
There are two type of openEHR paths,
archetype path : referring to a node within an archetype
eg. o/data/events/data/items[at0004]/value/magnitude from the above example which points to the location of systolic magnitude readings
RM class attribute path : pointing to an attribute of an openEHR Reference Model class
eg. c/uid/value that can be inserted in the above example like this
SELECT
c/uid/value
FROM EHR e
CONTAINS COMPOSITION c
CONTAINS OBSERVATION o
WHERE o/archetype_node_id = 'openEHR-EHR-OBSERVATION.blood_pressure.v2'
which will give us the composition ids of all the compositions from which we got the systolic and diastolic results in the previous example.
Predicates#
AQL uses predicates to define criteria for coarse-grained and fine-grained data. Predicate expressions are always enclosed by brackets ([]
). AQL supports three distinct types of predicates that filter data in different ways:
Standard Predicates#
Structure: Standard predicates always have a left-hand operand, an operator and a right-hand operand, e.g. [ehr_id/value='123456']
.
The left-hand operand is normally an openEHR path. The right-hand operand is normally a criterion value or a parameter, such as '123456'
, $ehrUid
. It is used for filtering based on data values.
eg: WHERE o/data/events/data/items[at0004]/value/magnitude > 140
The operator can be one of the following: >, >=, =, <, <=, !=
Archetype Predicates#
Archetype predicates provide a convenient shorthand to specify which archetype instances should be included in the query. For example:
CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v2]
This is equivalent to writing:
CONTAINS OBSERVATION o WHERE o/archetype_node_id = 'openEHR-EHR-OBSERVATION.blood_pressure.v1'
We can rewrite the first example like this
SELECT
o/data/events/data/items[at0004]/value/magnitude AS systolic,
o/data/events/data/items[at0005]/value/magnitude AS diastolic
FROM EHR e
CONTAINS COMPOSITION c
CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v2]'
and get the exact same results. The square bracket notation is more concise and immediately identifies which clinical concept is being queried.
Node Predicates#
A node predicate is also a shortcut of a standard predicate. eg:
Node Predicate | Standard Predicate |
---|---|
[at0002] | [archetype_node_id=at0002] |
[at0002, $nameValue] | [archetype_node_id=at0002 and name/value=$nameValue] |
[at0002, snomed_ct(3.1)::313267000] | [archetype_node_id=at0002 and name/defining_code/code_string='313267000' and name/defining_code/terminology_id/value='snomed_ct(3.1)'] |
A node predicate defines criteria on fine-grained data. |
NOTE : As a side note, don't get confused by archetype_node_id = 'openEHR-EHR-OBSERVATION.blood_pressure.v2' and archetype_node_id=at0002 both being valid node ids, they represent different levels of identification within the openEHR architecture. Archetype_node_id attribute exists on every node in the openEHR reference model. For top-level nodes (like OBSERVATION), it contains the full archetype ID like openEHR-EHR-OBSERVATION.blood_pressure.v2 and for internal nodes (like specific data points), it contains the at-codes like at0002.
FROM & CONTAINS Keyword#
The FROM and CONTAINS keywords in AQL establish the hierarchical relationship between different levels of openEHR data. This reflects the containment-based structure of the clinical information model:
FROM EHR e
CONTAINS COMPOSITION c
CONTAINS OBSERVATION o
This pattern creates a containment path that follows openEHR's natural data hierarchy:
- EHR (Electronic Health Record) - the top-level container for a patient's data
- COMPOSITION - clinical documents within the EHR (e.g., encounters, reports)
- ENTRY classes (OBSERVATION, EVALUATION, etc.) - clinical content within compositions
Left-hand operand class expression is the parent object of the right-hand operand class expression eg. COMPOSITION is a parent of OBSERVATION. Each containment level can be given an alias (e.g., 'e', 'c', 'o', 'a', 'b', 'c') that is used in SELECT and WHERE clauses to reference data at that level.
Logical operators AND
and OR
and parentheses ()
are used when multiple containment constrains are requiredLogical operators AND
and OR
and parentheses ()
are used when multiple containment constrains are required
EHR e
CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.referral.v1] AND COMPOSITION c1 [openEHR-EHR-COMPOSITION.report.v1]
EHR e
CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.referral.v1]
CONTAINS (OBSERVATION o [openEHR-EHR-OBSERVATION.laboratory-hba1c.v1] OR OBSERVATION o1 [openEHR-EHR-OBSERVATION.laboratory-glucose.v1])
The logical operator NOT
can be used in combination with CONTAINS
to express an exclusion constraint
Break down the example#
After going through the basics this example will be pretty clear, but lets break it down anyway
SELECT
o/data/events/data/items[at0004]/value/magnitude AS systolic,
o/data/events/data/items[at0005]/value/magnitude AS diastolic
FROM EHR e
CONTAINS COMPOSITION c
CONTAINS OBSERVATION o
WHERE o/archetype_node_id = 'openEHR-EHR-OBSERVATION.blood_pressure.v2'
- SELECT o/data/events/data/items[at0004]/value/magnitude AS systolic, o/data/events/data/items[at0005]/value/magnitude AS diastolic This specifies the data element to be returned. The path navigates through the observation structure to the blood pressure systolic and diastolic magnitudes.
FROM EHR e CONTAINS COMPOSITION c CONTAINS OBSERVATION o
: This defines the source of the data, starting with the EHR and going down to the observation level in the containment structure.WHERE o/archetype_node_id = 'openEHR-EHR-OBSERVATION.blood_pressure.v1'
: This filters the observations to only include those that are instances of the blood pressure archetype.
and we get back values that look like this
{
"columns":[{"path": "o/data/events/data/items[at0004]/value/magnitude","name":"systolic"},{"path":
"o/data/events/data/items[at0005]/value/magnitude","name":"diastolic"}],
"rows":[[120.0,80.0],[140.0,90.0]]
}
In table form
systolic | diastolic |
---|---|
120.0 | 80.0 |
140.0 | 90.0 |
Similarities and Differences with SQL#
While AQL and SQL share some similarities in their basic structure, they are designed for different purposes and have distinct features. Here's a comparison with examples to hopefully get an intuitive understanding of AQL from the shoulders of SQL :
Feature | SQL | AQL | Example |
Purpose | Querying relational databases | Querying openEHR data | |
Data Model | Tables with rows and columns | Hierarchical structure based on archetypes and templates | |
Syntax | SELECT ... FROM ... WHERE ... |
SELECT ... FROM ... CONTAINS ... WHERE ... |
|
Filtering | WHERE clause |
WHERE clause with path-based syntax |
To retrieve patients named "John Doe" in SQL: SELECT * FROM patients WHERE name = 'John Doe' . In AQL: SELECT e/ehr_id/value FROM EHR e CONTAINS COMPOSITION c CONTAINS DEMOGRAPHIC d WHERE d/name/value = 'John Doe' |
Joining | JOIN clause |
CONTAINS keyword |
To retrieve data from related tables in SQL: SELECT * FROM patients p JOIN admissions a ON p.id = a.patient_id . In AQL: SELECT p/name/value, a/admission_date/value FROM EHR e CONTAINS COMPOSITION c CONTAINS DEMOGRAPHIC p, COMPOSITION c2 CONTAINS ADMIN_ENTRY a WHERE c/archetype_node_id = 'openEHR-EHR-COMPOSITION.patient_summary.v1' AND c2/archetype_node_id = 'openEHR-EHR-COMPOSITION.admission.v1' |
Data Manipulation | INSERT , UPDATE , DELETE |
Not supported in AQL (for querying clinical data) |
It's important to note that while AQL is primarily focused on querying, unlike SQL which can also manipulate the data
Benefits of Using AQL and openEHR#
Combining AQL with openEHR offers several advantages for healthcare IT systems:
- Interoperability: AQL queries can be used across different openEHR systems, regardless of the underlying database technology. This is because AQL operates at a higher level of abstraction, focusing on the clinical concepts defined by archetypes and templates rather than the specific database schema.
- Flexibility: The openEHR data model is adaptable to changing clinical needs. AQL provides a powerful way to query this evolving data, ensuring that your queries remain relevant even as the underlying data structure changes.
- Clinical Focus: AQL allows clinicians to focus on the clinical content they need without having to understand the technical details of the database. This simplifies data retrieval and analysis for clinical users.
- Vendor Neutrality and Technology Agnostic: AQL is not tied to any specific vendor or technology. This gives healthcare providers the freedom to choose the best tools and systems for their needs without being locked into a particular vendor's ecosystem.
- Improved Patient Care: By enabling better access to and analysis of patient data, AQL and openEHR can contribute to improved clinical decision-making and patient care.
- Integration with RESTful APIs: AQL can be used in conjunction with RESTful APIs, which are a common way to exchange data between web applications. This allows developers to build applications that can easily access and query openEHR data.
NOTE: I promised to keep this beginner friendly, so I'll just leave a cURL of our example AQL query when run as RESTful API request in EHRbase.
curl --location 'https://[BASEURL_OF_YOUR_EHRBASE]/openehr/v1/query/aql' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data '{
"q": "SELECT o/data/events/data/items[at0004]/value/magnitude AS systolic, o/data/events/data/items[at0005]/value/magnitude AS diastolic FROM EHR e CONTAINS COMPOSITION c CONTAINS OBSERVATION o[$archetype]",
"query_parameters": {
"archetype": "openEHR-EHR-OBSERVATION.blood_pressure.v2"
}
}'
Conclusion#
AQL is a powerful query language that plays a crucial role in realising the benefits of openEHR. Its unique features, such as path-based syntax, composability, and focus on clinical concepts, make it well-suited for querying and analysing complex healthcare data. I hope this article removed some of the mystery surrounding AQL and got you excited to actually try it out.