How best to represent DICOM lists with missing elements, in FHIR RDF? #141
To help drive discussion, I'm listing here some options, with pros/cons. Are there others I should add?
Option 1: Omit rdf:first elements from an RDF list ladder
This DICOM JSON:
would be this in Turtle:
PROS:
CONS:
Option 2: Use an RDF Sequence with a missing element
This DICOM JSON:
would be this in Turtle:
PROS:
CONS:
Option 3: Use a distinguished fhir:null value to represent null.
This DICOM JSON:
would be this in Turtle:
PROS:
CONS:
Option 4: Use explicit fhir:indexes, like in FHIR RDF R4
This DICOM JSON:
would be this in Turtle:
PROS:
CONS:
Option 5: Use a blank node typed as fhir:null to represent null.
This DICOM JSON:
"Value": [ "bar", null, "foo" ]
would be this in Turtle:
( "bar" [ a fhir:null ] "foo" )
PROS:
CONS:
Option 5 example [edit by ericP]
Option 5 expanded with type hierarchy for null flavors [edit by ericP]
Option 5a example [edit by DBooth]
Perhaps something more generic than fhir:null, like "rdf:null", since this seems to be a general RDF List issue rather than a FHIR issue?
I don't think so. The question is what null means in DICOM and whether that meaning is consistent. I think that here, in the lists, a null value is an existential variable: a variable that might be known in a different graph and/or be inferable. So a blank node is the way to go. The null issue outside of lists of values is larger.
I think, looking at the DICOM null flavours, we have a mix here. dicom:NA is an owl:Nothing-like value (which IMHO means that the triple should not be there at all). If we are talking in part about null flavors, we can do Option 5, capture this, and allow reasoning and graph merging to resolve it.
Option 6: Just use a plain blank node
This DICOM JSON:
"Value": [ 1, null, 2 ]
would be this in Turtle:
( 1 [] 2 )
Pros
Cons
@JervenBolleman we discussed the DICOM null flavors during the FHIR RDF call today. @ericprud had found and shared that link the other month at an earlier meeting. We discussed whether a generalized approach could be used, as missing values in an rdf:List have use cases outside of DICOM. My original approach was to just omit an rdf:first when no value is known, but @dbooth-boston explained to me that this is considered not "well-formed" and could cause issues. I lean towards a generalized (non-FHIR, non-DICOM) solution, as JSON-LD compaction would not be changed for anything less than a general solution, and to handle use cases outside of FHIR/DICOM. I do like having something that could indicate the "null flavor" for a FHIR/DICOM-specific solution; it adds detail and clarity. My first thought was to have the DICOM RDF / JSON-LD match the current DICOM JSON, save for the context, to make acceptance and tooling easier, but nulls in literal lists are problematic.
If we followed DICOM XML rather than the DICOM JSON (giving up on a DICOM JSON / JSON-LD match)
DICOM XML (fragment)
Option 7
RDF Turtle
Option 8
reduced scaffolding (use rdf:List, eliminate dcm:number, and move keyword and tag string values to ontology)
And in my "maybe" example: Option 9a
or possibly option 9b
option 9c (with EricP nulls)
I'd like to get this decided on our next teleconference (this week). Last week's discussion seemed to favor Option 5, but are there particular variants of option 5 that we should consider? For example, should we be recommending one specific null type? If so, specifically what URI? Or should we recommend a null-flavor type hierarchy? If so, specifically what, and what should the top-level null class be? If we can make options as concrete as possible, it will help facilitate our decision-making.
@dbooth-boston of all of the DICOM null flavors, "NI", described in the spec as "No information. This is the most general and default null flavor.", seems to be a good candidate for the root null flavor value, with NA, UNK, ASKU, NAV, NASK, MSK, OTH being subclasses of "NI". I like Option 5 (@ericprud edits) as well, as it's closest to DICOM JSON. I also like Option 9 (with @ericprud nulls), being fairly compact. Further, the DICOM string usage could be mapped safely to more performant datatypes, with ShEx/SHACL used to enforce the range/structure of values. Perhaps mapping scripts from 5 to 9 (and 9 to 5) for those who want a few fewer triples?
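The hierarchy proposed above can be sketched as plain subclass triples. This is only an illustration in Python, not an agreed vocabulary: the `dcm:` class names (`dcm:NI`, `dcm:UNK`, and so on) are hypothetical, and placing every flavor directly under NI follows the comment above rather than any nesting the HL7 NullFlavor code system may define.

```python
SUBCLASS_OF = "rdfs:subClassOf"
ROOT = "dcm:NI"  # hypothetical IRI for the root "No information" class

# Codes named in the comment above, all placed directly under NI (assumption).
FLAVOR_CODES = ["NA", "UNK", "ASKU", "NAV", "NASK", "MSK", "OTH"]


def null_flavor_triples():
    """Return (subject, predicate, object) triples making each flavor a subclass of ROOT."""
    return [(f"dcm:{code}", SUBCLASS_OF, ROOT) for code in FLAVOR_CODES]


def is_null_flavor(cls, triples, root=ROOT):
    """Walk rdfs:subClassOf links upward; True if `cls` is `root` or a descendant.

    Assumes each class has at most one declared superclass.
    """
    parent = {s: o for s, p, o in triples if p == SUBCLASS_OF}
    seen = set()
    while cls not in seen:
        if cls == root:
            return True
        seen.add(cls)
        if cls not in parent:
            return False
        cls = parent[cls]
    return False  # a cycle that never reaches root


triples = null_flavor_triples()
print(is_null_flavor("dcm:UNK", triples))
print(is_null_flavor("xsd:string", triples))
```

With such a hierarchy, a consumer that only cares whether a list slot is null can test against the root class, while one that cares about the reason can inspect the specific subclass.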
Just wanted to add a link to the HL7 NullFlavor code system, which is where the DICOM null flavor descriptions come from, I think: https://terminology.hl7.org/CodeSystem-v3-NullFlavor.html
Proposed 5B:
Per today's provocative, last-second assertion that
So refining an xsd:string to be e.g.
opt 5:
?s dcm:01234567 [ dcm:Value ?age ]
FILTER (?age = "018M")
opt 9 with specialized types:
?s dcm:01234567 ?age
FILTER (?age = "018M"^^dicom:AgeString)
Option 10 (dcm:Null bnode, XSD data types, VR types moved to ontology/ShEx/SHACL, moving Lists up to their properties)
<>
  dcm:00080008 ( "DERIVED" "PRIMARY" [ a dcm:Null ] "SINGLE A" ) ;
  dcm:00080016 ( "1.2.840.10008.5.1.4.1.1.12.1" ) ;
  dcm:00080018 ( "1.3.12.2.1107.5.4.3.11540117440512.19970422.140030.6" ) ;
  dcm:00080020 ( "19970422"^^xsd:date ) ;
  dcm:00080030 ( "131047"^^xsd:time ) ;
11/21/24: Edited by dbooth to use dcm:Null instead of dcm:null.
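As a rough illustration of the Option 10 shape, the following Python sketch turns a DICOM JSON object into predicate-object lines with the list attached directly to the tag property. It is a sketch under assumptions, not the group's converter: the function name `dataset_to_option10` is hypothetical, values are left as plain literals (no VR-to-XSD datatype mapping is attempted), and only flat attributes are handled (no SQ or PN values).

```python
import json


def dataset_to_option10(dataset, null_class="dcm:Null"):
    """Render DICOM JSON attributes as `dcm:<tag> ( ... )` predicate-object lines.

    - Attributes without a "Value" property produce no triple.
    - JSON null inside "Value" becomes a blank node typed as `null_class`,
      so later elements keep their positions.
    - Values stay plain literals; VR-specific datatypes are out of scope here.
    """
    lines = []
    for tag, attr in dataset.items():
        values = attr.get("Value")
        if values is None:
            continue
        # json.dumps gives a double-quoted, escaped string (or a bare number)
        items = [f"[ a {null_class} ]" if v is None else json.dumps(v) for v in values]
        lines.append(f"dcm:{tag} ( {' '.join(items)} )")
    return " ;\n".join(lines)


example = {
    "00080008": {"vr": "CS", "Value": ["DERIVED", "PRIMARY", None, "SINGLE A"]},
    "00080016": {"vr": "UI", "Value": ["1.2.840.10008.5.1.4.1.1.12.1"]},
    "00080050": {"vr": "SH"},  # no Value property, so no triple is emitted
}
print(dataset_to_option10(example))
```

The VR is dropped from the instance data on purpose in this sketch, on the assumption that it would live in the ontology or in ShEx/SHACL shapes as the option title suggests.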
AGREED: Option 10, though we have not yet decided on a namespace for the dcm: prefix. See also: #151
Copying this issue description from Erich Bremer's email.
From: Erich Bremer
Date: Mon, 25 Mar 2024 11:22:24 -0400
Let me re-state the two problems that I feel need to be resolved as far as
I see in having a DICOM RDF for those not part of the original conversation.
For reference, here is a snippet of DICOM compliant JSON:
{
"00020002": { "vr": "UI", "Value": [ "1.2.840.10008.5.1.4.1.1.12.1"]},
"00020003": {"vr": "UI", "Value":
["1.3.12.2.1107.5.4.3.321890.19960124.162922.29"]},
"00020010": { "vr": "UI", "Value": ["1.2.840.10008.1.2.4.50"]},
"00020012": { "vr": "DS", "Value": [ "999.999"]}, ...
https://dicom.nema.org/medical/dicom/current/output/chtml/part18/sect_F.2.3.html
https://dicom.nema.org/medical/dicom/current/output/chtml/part18/sect_F.2.4.html
says leave off the "value" property but the rest must be there.
https://dicom.nema.org/medical/dicom/current/output/chtml/part18/sect_F.2.5.html
element as the position itself may/may not be important.
Now to RDF ******
a) It's easy to handle 1+2+3 with RDF using blank nodes and RDF Lists:
[ dcm:00020002 [ dcm:vr "UI" ; dcm:Value ("1.2.840.10008.5.1.4.1.1.12.1") ];
dcm:00020003 [ dcm:vr "UI" ; dcm:Value
("1.3.12.2.1107.5.4.3.321890.19960124.162922.29")];
dcm:00020010 [ dcm:vr "UI" ; dcm:Value ("1.2.840.10008.1.2.4.50")];
dcm:00020012 [ dcm:vr "DS" ; dcm:Value ( "999.999")]; ...
The problem arises with 4 - nulls
For 3, we leave off the dcm:Value triple as "null maps to no triple". The
problem is in the RDF Lists.
( "1" "2" "3") is short-hand for:
_:myList rdf:first "1" ;
rdf:rest [ rdf:first "2" ;
rdf:rest [ rdf:first "3" ;
rdf:rest rdf:nil
]
] .
Following "no triple asserted is null", if I wanted to leave the second
element out of the list, I would just remove rdf:first "2". No triplestore
that I know would complain. I can SPARQL using the long list version and I
can write the SPARQL with an optional {_:SecondPosition rdf:first ?second }
or even a minus {}. In this fashion, the positional information is
preserved as needed by elements like
https://dicom.nema.org/medical/dicom/current/output/chtml/part03/sect_C.8.11.7.html#table_C.8-74f
or pixel spacing
https://dicom.innolitics.com/ciods/rt-dose/image-plane/00280030 and would
return an unbound value depending on the data.
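The idea in the paragraph above can be sketched in Python with triples modelled as plain tuples. This mirrors the hand-written ladder: a null element simply gets no rdf:first triple, while the rdf:rest chain stays intact, so positions survive. The helper names and the `_:bN` blank node labels are hypothetical, and whether such a ladder counts as a well-formed RDF list is exactly what the thread debates.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def list_to_ladder(values):
    """Return (head, triples) for a list ladder; a None value omits its rdf:first triple."""
    if not values:
        return RDF_NIL, []
    nodes = [f"_:b{i}" for i in range(len(values))]
    triples = []
    for i, value in enumerate(values):
        if value is not None:
            triples.append((nodes[i], RDF_FIRST, value))
        next_node = nodes[i + 1] if i + 1 < len(values) else RDF_NIL
        triples.append((nodes[i], RDF_REST, next_node))
    return nodes[0], triples


def ladder_to_list(head, triples):
    """Walk the rdf:rest chain; a missing rdf:first reads back as None,
    much like an unbound variable from a SPARQL OPTIONAL."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    out = []
    node = head
    while node != RDF_NIL:
        out.append(first.get(node))
        node = rest[node]
    return out


head, triples = list_to_ladder(["1", None, "3"])
print(ladder_to_list(head, triples))
```

Three rdf:rest triples and two rdf:first triples are produced for the three-element input, so the third value is still reached through the third list node.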
This path falls apart as there is no support (an implicit rdf:first must
always be present) for the missing rdf:first triple in the second position
for the shorthand ( "1" "2" "3"). I can put a variable ( "1" ?second "3" ),
but I cannot say the equivalent of ( "1" optional { ?second} "3") or even
( "1" minus { ?second} "3"). JSON-LD will simply remove the second element
if I say ( "1" null "3") with the thought that null is no triple
asserted. But this stance is not the same as removing the rdf:first
triple; it removes multiple triples [ rdf:first "2" ; rdf:rest [] ] and
then points _:first to _:third, which takes out the positional information
and changes the meaning of things as DICOM views its data. RDF List is
a container construct with triples that express the various positions and
relations and the associated positional values. What would be wrong (who
would be put out) if we allow something that is already allowable in the
RDF model? Honestly, I feel like RDF is not following its own rules here
and needs to be fixed even with DICOM out of the picture. The same, I
feel, applies to RDF Sequence containers:
ex:mySequence
rdf:type rdf:Seq ;
rdf:_1 "1" ;
rdf:_2 "2" ;
rdf:_3 "3" .
If I omitted rdf:_2, it should just now be:
ex:mySequence
rdf:type rdf:Seq ;
rdf:_1 "1" ;
rdf:_3 "3" .
If the current logic of RDF lists is applied it would become the below
which is not the same:
ex:mySequence
rdf:type rdf:Seq ;
rdf:_1 "1" ;
rdf:_2 "3" .
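The difference between the two rdf:Seq outcomes above can be shown with a small Python sketch. The two function names are hypothetical; each returns a mapping from membership property to value for one sequence.

```python
def seq_keep_positions(values):
    """Number by original 1-based position; null elements get no rdf:_n triple."""
    return {f"rdf:_{i}": v for i, v in enumerate(values, start=1) if v is not None}


def seq_compact(values):
    """Drop nulls first, then number what is left, so later positions shift down."""
    kept = [v for v in values if v is not None]
    return {f"rdf:_{i}": v for i, v in enumerate(kept, start=1)}


values = ["1", None, "3"]
print(seq_keep_positions(values))  # {'rdf:_1': '1', 'rdf:_3': '3'}
print(seq_compact(values))         # {'rdf:_1': '1', 'rdf:_2': '3'}
```

Only the first form lets a reader tell that "3" was the third element, which is the positional information the email argues DICOM needs.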
Keeping the positional triples in RDF Lists and sequences seems to be an
easier fix than introducing some type of null literal or typed
"null"^^xsd:integer but it seems to be an implicit taboo due to the lack of
the support in the syntactic sugar for lists and how JSON-LD compaction
behaves with nulls.
I appreciate and I think I understand the thinking of the elements as part
of the schema, but it seems to be heading to more complex territory. RDF
would allow us to deviate further from the DICOM JSON through using things
like custom data types to reduce the number of triples:
[ dcm:00020002 ( "1.2.840.10008.5.1.4.1.1.12.1"^^dcm:UI );
dcm:00020003 ( "1.3.12.2.1107.5.4.3.321890.19960124.162922.29"^^dcm:UI );
dcm:00020010 ( "1.2.840.10008.1.2.4.50"^^dcm:UI );
dcm:00020012 ( "999.999"^^dcm:DS ); ...
and even remove lists where DICOM value multiplicity is always 1 but it
makes things a bit more complicated. I think a DICOM RDF needs to be very
much bi-directional and keeping towards the DICOM JSON modeling makes it
more familiar for the people in the DICOM domain. It reduces the tooling
on both sides. Nothing stops an RDF person from making SPARQL update
transforms to mutate the data back and forth to a different design (perhaps
more performant) but I fear great becomes the enemy of the good. "a little
semantics goes a long way" - Erich
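For the custom-datatype variant mentioned in the email, a minimal Python sketch could fold each attribute's "vr" into the literal's datatype. The function name `to_typed_literals` and the `dcm:<VR>` datatype IRIs are assumptions taken from the email's example, and the sketch deliberately refuses null list elements, since how to represent them is the open question.

```python
import json


def to_typed_literals(dataset):
    """Return one `dcm:<tag> ( "value"^^dcm:<VR> ... )` line per attribute with a Value."""
    lines = []
    for tag, attr in dataset.items():
        values = attr.get("Value")
        if values is None:
            continue  # no Value property maps to no triple
        if any(v is None for v in values):
            raise ValueError(f"{tag}: null list elements are not handled by this sketch")
        # str() first so numeric JSON values also become quoted lexical forms
        items = " ".join(f"{json.dumps(str(v))}^^dcm:{attr['vr']}" for v in values)
        lines.append(f"dcm:{tag} ( {items} )")
    return lines


snippet = {
    "00020002": {"vr": "UI", "Value": ["1.2.840.10008.5.1.4.1.1.12.1"]},
    "00020012": {"vr": "DS", "Value": ["999.999"]},
}
for line in to_typed_literals(snippet):
    print(line)
```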