Friday, January 20, 2012

What ODM (and SDTM) can learn from HL7-CDA

There has been a lot of discussion about using HL7-v3 messages in XML for submissions to the FDA. Some people at the FDA in particular (who are not XML experts at all) are in favor of this: they expect it to bring better integration with EHRs. But they forget that HL7-v3 messages are not about EHRs at all: they confuse HL7-v3 messages with CDA (the latter using a subset of HL7-v3).
Their logic is similar to: "If I have a truck (transport format) that can carry cows, and the same truck can carry oranges, then I will be able to breed cows that can produce orange juice".
More reasons for not choosing HL7-v3 for (SDTM) submissions can be found in my old article "Ten good reasons why an HL7-XML message is not always the best solution as a format for CDISC standard - and especially not for SDTM data". Recent postings from Gartner have also indicated that HL7-v3 messages have been a big failure.

I am currently getting training from HL7 Austria, which will (hopefully) make me an HL7 standards expert and, in particular, make me even better at working with CDA. If all goes well, this will also earn me HL7 certification.

As I am also teaching CDA at the university (as the Austrian EHR system "ELGA" will be based on CDA), I see more and more clearly (also technically) how information from EHRs can be used in clinical research. The answer is NOT to use HL7-v3 XML for FDA submissions, nor for data collection: we have CDISC ODM for that.

However, there are a few things we can learn from CDA, which we may want to introduce in ODM, or at least allow (through standardized extensions). Here is a first list:

- use of universal Object Identifiers (OIDs). These are not the current OIDs of ODM, but the ones used in CDA and ISO-21090. For example:


The "Object Identifier" here is "2.16.840.1.113883.6.1", which is the globally recognized identifier for the LOINC controlled terminology for lab tests, used by almost every hospital in the world that has computers. "8480-6" is the LOINC code for "systolic blood pressure".
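
To make the idea concrete, here is a minimal sketch in Python of how a machine could resolve such a coded element. The XML string is an assumed CDA-style `code` element written for illustration (the `displayName` attribute and the lookup table `CODE_SYSTEMS` are my own additions); only the OID and the LOINC code come from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: only the LOINC OID is taken from the text above.
CODE_SYSTEMS = {"2.16.840.1.113883.6.1": "LOINC"}

# Assumed CDA-style coded element, written for illustration only.
snippet = (
    '<code code="8480-6" codeSystem="2.16.840.1.113883.6.1" '
    'displayName="Systolic blood pressure"/>'
)


def describe(xml_text):
    """Return (code system name, code) for a CDA-style coded element."""
    elem = ET.fromstring(xml_text)
    oid = elem.get("codeSystem")
    # Fall back to the raw OID when the code system is not in the table.
    return CODE_SYSTEMS.get(oid, oid), elem.get("code")


print(describe(snippet))
```

Because the OID is universal, the receiving system needs no study-specific knowledge to know which terminology the code belongs to.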

In ODM, we would probably use an "Alias" for this, e.g.:
<Alias Context="LOINC" Name="8480-6"/>
or  <Alias Context="2.16.840.1.113883.6.1" Name="8480-6"/>

But better would be that we could incorporate this into the ItemDef, e.g. using:


Don't hold me to the "cda:" prefix for the extension namespace: it was chosen arbitrarily.
Note also that I added a "MeasurementUnitRef" and an "Alias", the latter to indicate that this data point should later be mapped to SDTM "VSORRES" for "VSTESTCD=SYSBP" (the latter belonging to the CDISC controlled terminology).
Note as well that this snippet contains all the information needed to retrieve data from an EHR (in CDA or CCD) into a CRF automatically, provided the systolic blood pressure in the EHR is coded using LOINC (it could also have been coded in SNOMED-CT, which would just add another line to the above snippet).
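
The automatic retrieval described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the ItemDef below carries the LOINC link as a plain ODM "Alias" (not the "cda:" extension), the SDTM alias value and the EHR observation are invented for the example, and the helper names `aliases` and `prefill` are hypothetical.

```python
import xml.etree.ElementTree as ET

ODM = "{http://www.cdisc.org/ns/odm/v1.3}"
HL7 = "{urn:hl7-org:v3}"

# Assumed ItemDef: the LOINC link is expressed as a plain Alias.
item_def_xml = """
<ItemDef xmlns="http://www.cdisc.org/ns/odm/v1.3"
         OID="IT.SYSTOL" Name="Systolic blood pressure" DataType="integer">
  <MeasurementUnitRef MeasurementUnitOID="MU.mmHg"/>
  <Alias Context="2.16.840.1.113883.6.1" Name="8480-6"/>
  <Alias Context="SDTM" Name="VS.VSORRES"/>
</ItemDef>
"""

# Assumed CDA-style observation coming from an EHR.
ehr_xml = """
<observation xmlns="urn:hl7-org:v3"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             classCode="OBS" moodCode="EVN">
  <code code="8480-6" codeSystem="2.16.840.1.113883.6.1"/>
  <value xsi:type="PQ" value="132" unit="mm[Hg]"/>
</observation>
"""


def aliases(item_def):
    """Collect (Context, Name) pairs from the Alias children of an ItemDef."""
    return {(a.get("Context"), a.get("Name")) for a in item_def.iter(ODM + "Alias")}


def prefill(item_def, observation):
    """Return CRF data for the item if the observation's code matches an alias."""
    code = observation.find(HL7 + "code")
    if (code.get("codeSystem"), code.get("code")) not in aliases(item_def):
        return None  # the EHR entry is about something else
    value = observation.find(HL7 + "value")
    return {
        "ItemOID": item_def.get("OID"),
        "Value": value.get("value"),
        "unit": value.get("unit"),
    }


result = prefill(ET.fromstring(item_def_xml), ET.fromstring(ehr_xml))
print(result)
```

A SNOMED-CT coded entry would be matched the same way, simply by adding one more alias with the SNOMED-CT OID as its context.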

Though CDA-R2 is not using ISO-21090 datatypes (yet - it is envisaged for CDA-R3), many people have been asking, or even demanding, that ODM replace its own datatypes with ISO-21090 datatypes. We must however take into account that ISO-21090 datatypes can currently only be used in captured data, not in definitions of to-be-captured data. So it is currently not possible (without adding new things to the ODM standard) to define that "the systolic blood pressure will be captured as a 'physical quantity' (PQ datatype)". One can only state that "the systolic blood pressure has been captured as a 'physical quantity'".

So for example, in CDA-R2, a systolic blood pressure observation is written as:



whereas in ODM it would be written as:
<ItemData ItemOID="IT.SYSTOL" Value="132">
    <MeasurementUnitRef MeasurementUnitOID="MU.mmHg"/>
</ItemData>
Note that CDA/ISO-21090 uses the code and codeSystem (in this case SNOMED-CT), which is more or less equivalent to the reference to the ODM OID of the ItemDef in "ItemOID". I say more or less, as each has its advantages and disadvantages.
The CDA/ISO-21090 "notation" has the advantage that it is universal: a machine will immediately understand that the data point is a systolic blood pressure.
The disadvantage is that a test code will not always be available for a (new?) test in a clinical study.
Note also that the CDA/ISO-21090 "notation" contains more information, such as the capture date and time, which would be a separate data point in ODM.
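
The comparison can be illustrated with a small Python sketch that reads the same measurement from both notations. The CDA-style observation below is an assumed example (the SNOMED-CT code, display name and effectiveTime value are chosen for illustration), and the ODM fragment uses "MeasurementUnitRef" as the child of "ItemData"; the functions `read_cda` and `read_odm` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

HL7 = "{urn:hl7-org:v3}"

# Assumed CDA-R2-style observation, for illustration only.
cda_xml = """<observation xmlns="urn:hl7-org:v3"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    classCode="OBS" moodCode="EVN">
  <code code="271649006" codeSystem="2.16.840.1.113883.6.96"
        displayName="Systolic BP"/>
  <effectiveTime value="200004071430"/>
  <value xsi:type="PQ" value="132" unit="mm[Hg]"/>
</observation>"""

# ODM fragment without namespace declaration, to keep the sketch short.
odm_xml = """<ItemData ItemOID="IT.SYSTOL" Value="132">
    <MeasurementUnitRef MeasurementUnitOID="MU.mmHg"/>
</ItemData>"""


def read_cda(xml_text):
    """The observation is self-describing: code system, code, time, value, unit."""
    obs = ET.fromstring(xml_text)
    code = obs.find(HL7 + "code")
    value = obs.find(HL7 + "value")
    return {
        "what": (code.get("codeSystem"), code.get("code")),
        "value": value.get("value"),
        "unit": value.get("unit"),
        "when": obs.find(HL7 + "effectiveTime").get("value"),
    }


def read_odm(xml_text):
    """ItemData only points to definitions; the capture time is a separate item."""
    item = ET.fromstring(xml_text)
    return {
        "what": item.get("ItemOID"),
        "value": item.get("Value"),
        "unit": item.find("MeasurementUnitRef").get("MeasurementUnitOID"),
        "when": None,
    }


print(read_cda(cda_xml))
print(read_odm(odm_xml))
```

Both readers recover the same value, but in the ODM case "what" and "unit" are study-local OIDs that only make sense together with the metadata, whereas the CDA-style entry carries universal identifiers and its own timestamp.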
So why not combine the best of both worlds?

Consider the construct:

Not valid ODM you say!?
This is valid ODM, using a "vendor" extension, allowing elements from the CDA-R2 standard or ISO-21090 standard to appear within the "ItemData" element.
Using this simple and valid construct, it is possible to combine the best of both worlds, and even to directly insert a data point from an EHR into a CRF.
Now, I am not an absolute supporter of ISO-21090, for at least two reasons:
  • You have to pay to obtain a copy of the specification - and it is not cheap: ISO will charge you 238 Swiss Francs (currently US$ 250 or € 200). The document is copyrighted, so passing a copy to your colleague is illegal unless you pay another 238 Swiss Francs.
    Therefore, I do not really consider ISO-21090 an open standard.
  • The XML is bad: have another look at the last snippet. Did you notice something weird?
    Have a look at the date "19990229". This is not an ISO-8601 date in the XML sense. It is not even a valid date: there was no February 29th in 1999.
    But if you validate against the schema or the Schematron, this error will go unnoticed.
    CDISC uses ISO-8601 in its XML (and also in SDTM), and if you validated some CDISC XML containing this date (now in the correct XML notation) as "1999-02-29", the validation engine would immediately and loudly protest. So the HL7 people made a big mistake here!
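
The difference between the two kinds of checking can be sketched in Python. The "eight digits" regular expression is only my stand-in for a validation that looks at the shape of the string and not at the calendar; it is an assumption for illustration, not the actual pattern of any HL7 schema.

```python
import re
from datetime import datetime

# Assumed stand-in for a shape-only check: eight digits and nothing more.
DIGITS_ONLY = re.compile(r"^\d{8}$")


def passes_pattern(timestamp):
    """Shape-only validation: does the string consist of eight digits?"""
    return DIGITS_ONLY.match(timestamp) is not None


def is_real_date(timestamp, fmt):
    """Calendar-aware validation: does the string denote an existing day?"""
    try:
        datetime.strptime(timestamp, fmt)
        return True
    except ValueError:
        return False


print(passes_pattern("19990229"))               # shape check accepts it
print(is_real_date("19990229", "%Y%m%d"))       # calendar check rejects it
print(is_real_date("1999-02-29", "%Y-%m-%d"))   # same day, ISO-8601 notation
print(is_real_date("2000-02-29", "%Y-%m-%d"))   # 2000 was a leap year
```

A date type that knows the calendar rejects the impossible day, whereas a check on digits alone lets it through.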
Another thing you might have noticed is the use of "mm[Hg]" in the "unit" attribute. This is UCUM notation (Unified Code for Units of Measure). Although healthcare all over the world uses UCUM units, CDISC decided to develop its own controlled terminology for this. In my opinion, it would be better if CDISC also used only UCUM units.

So what we still need, in order to make better use of EHRs in clinical research today, is a mapping between the CDISC controlled terminology for units of measure and UCUM units. Of course it would be better to deprecate our own controlled terminology for units and use only UCUM.

But for the moment, we could use the following construct in ODM:

<MeasurementUnit OID="mmHg" Name="millimeter mercury"
     ucum:unit="mm[Hg]">...</MeasurementUnit>

which again defines the link between the CDISC controlled terminology and UCUM.
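
As a minimal sketch of how such annotated definitions could be turned into a lookup table, consider the following Python fragment. The namespace URI bound to the "ucum:" prefix is a placeholder of mine, the second unit is invented for the example, and required ODM child elements such as "Symbol" are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the "ucum:" extension prefix.
UCUM_NS = "http://example.org/ucum-extension"

# Assumed ODM fragment; Symbol children are omitted to keep it short.
odm_xml = """
<BasicDefinitions xmlns:ucum="http://example.org/ucum-extension">
  <MeasurementUnit OID="mmHg" Name="millimeter mercury" ucum:unit="mm[Hg]"/>
  <MeasurementUnit OID="kg" Name="kilogram" ucum:unit="kg"/>
</BasicDefinitions>
"""


def ucum_map(xml_text):
    """Map each MeasurementUnit OID to the UCUM unit given in the extension attribute."""
    root = ET.fromstring(xml_text)
    attr = "{" + UCUM_NS + "}unit"
    return {
        mu.get("OID"): mu.get(attr)
        for mu in root.iter("MeasurementUnit")
        if mu.get(attr) is not None
    }


mapping = ucum_map(odm_xml)
print(mapping)
```

With such a table, a unit found in an EHR entry can be matched to the MeasurementUnit of the study without any manual mapping step.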

So, though CDA is not perfect (and HL7-v3 messages are a disaster), there are still a few things we can learn from CDA. Most of them can already be implemented (thanks to the "vendor" extension mechanism), both in ODM and in a future SDTM-XML.
Our truck can then carry both clinical research and healthcare information at the same time, even linking the two perfectly.

That is what "CDISC end-to-end" is really about!
