Representation of DICOM "arrays" (values with VM larger than 1) and other values (OB, OW, etc.) in RDF #149
Comments
Adding this here, as discussed at the last meeting, because it supports the argument for using Well-Known Text (WKT) to represent polygonal data in DICOM: WKT is part of the OGC GeoSPARQL specification, which is supported by Virtuoso (https://vos.openlinksw.com/owiki/wiki/VOS/VirtGeoSPARQLEnhancementDocs) as well as by other RDF systems.
Here is a snippet from a converted file with DICOM tag 30060050 as an OGC WKT literal instead of an rdf:List of x,y,z decimals. The OGC namespace for GeoSPARQL is included:
DICOM value multiplicity VM
DICOM defines the value multiplicity (VM) in PS3.5, Section 6.4.
The VM can be used to encode arrays. If the values are represented as character strings, they are separated by the backslash "\" as delimiter. Binary values of fixed length are simply concatenated without any delimiter.
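To make the two cases concrete, here is a minimal sketch of how a multi-valued element could be split into its values. The function names are hypothetical, and the binary case assumes little-endian unsigned 16-bit values (as for VR US); other VRs would need a different format string.

```python
import struct


def split_string_values(raw: str) -> list[str]:
    """Split a character-string value with VM > 1 at the backslash delimiter."""
    return raw.split("\\")


def split_binary_values(raw: bytes, fmt: str = "<H") -> list[int]:
    """Split concatenated fixed-length binary values.

    `fmt` is a struct format for ONE value; "<H" (little-endian
    unsigned short) is an assumption made for this example.
    """
    return [item[0] for item in struct.iter_unpack(fmt, raw)]


string_values = split_string_values("1.0\\2.5\\-3")
binary_values = split_binary_values(struct.pack("<3H", 1, 2, 3))
```

The string case needs the delimiter, the binary case only needs the known size of one value.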
Encoding as RDF list
A direct approach would be to put all values into RDF lists. Although this has a compact representation in Turtle, it is actually stored as a linked list (rdf:first / rdf:rest), which adds scaffolding elements to the data.
A very common use of these DICOM "arrays" is ROI (region of interest) polygons. These are typically used for display purposes and need to be read from the data in order and all at once. With an RDF list representation this is not possible with SPARQL alone; some cumbersome sorting algorithm is needed to recreate an array with the correct order.
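The scaffolding can be illustrated without any RDF library. The following is only a sketch: triples are plain Python tuples and the blank node labels are made up for the example. It shows that n values become 2n triples, and that the order can only be recovered by following rdf:rest node by node.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF + "first"
RDF_REST = RDF + "rest"
RDF_NIL = RDF + "nil"


def expand_rdf_list(values):
    """Return (head, triples) for the linked list that encodes `values`."""
    if not values:
        return RDF_NIL, []
    # Hypothetical blank node labels, one list cell per value.
    nodes = [f"_:b{i}" for i in range(len(values))]
    triples = []
    for i, value in enumerate(values):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF_NIL
        triples.append((nodes[i], RDF_FIRST, value))
        triples.append((nodes[i], RDF_REST, rest))
    return nodes[0], triples


def walk_rdf_list(head, triples):
    """Rebuild the ordered values by following rdf:rest from the head."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    values = []
    node = head
    while node != RDF_NIL:
        values.append(first[node])
        node = rest[node]
    return values


head, triples = expand_rdf_list([1.5, 2.5, 3.5])
```

A triple store returns the triples in no particular order, so the walk in `walk_rdf_list` is the step that has to happen outside of a simple query.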
DICOM encoding
Another idea is to use a string representation for DICOM values with a VM greater than 1.
One option is to stick with the DICOM notation and use the string delimiter as defined by DICOM itself.
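One detail to keep in mind with this option: the backslash is also the escape character in Turtle string literals, so it has to be doubled when the value is serialized. A minimal sketch with hypothetical helper names (it only escapes backslashes and double quotes, not newlines or other control characters):

```python
def dicom_join(values) -> str:
    """Join values with the DICOM backslash delimiter."""
    return "\\".join(str(v) for v in values)


def turtle_string_literal(text: str) -> str:
    """Quote `text` as a Turtle string literal, escaping '\\' and '"'."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


joined = dicom_join([1, 2.5, -3])
literal = turtle_string_literal(joined)
```

A conforming Turtle parser turns the doubled backslashes back into single ones, so the stored value is again the plain DICOM string.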
Well-known text representation
Another option is to use an array representation that is close to the semantic meaning of the DICOM data. For polygons this could be taken from the well-known text (WKT) representation of geometry. Other attributes in DICOM, for example DVHs, could be mapped to WKT as well (e.g. as MultiPoint).
The advantage is that there are already SPARQL tools that can handle a subset of the well-known text encodings.
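As a rough sketch of such a mapping, the function below turns a backslash-delimited list of x, y, z coordinates into a WKT polygon. The function name is hypothetical, and it assumes the contour is a closed polygon, so the first point is repeated at the end to close the WKT ring. The coordinates are kept as strings so that the decimal values are not altered by a float conversion.

```python
def contour_to_wkt(contour_data: str) -> str:
    """Convert 'x1\\y1\\z1\\x2\\y2\\z2...' into a WKT 'POLYGON Z' string."""
    numbers = contour_data.split("\\")
    if len(numbers) % 3 != 0:
        raise ValueError("expected a multiple of three coordinates")
    # Group the flat list into (x, y, z) triplets.
    points = [tuple(numbers[i:i + 3]) for i in range(0, len(numbers), 3)]
    # WKT rings must be closed: repeat the first point if necessary.
    if points[0] != points[-1]:
        points.append(points[0])
    ring = ", ".join(" ".join(point) for point in points)
    return f"POLYGON Z (({ring}))"


wkt = contour_to_wkt("0\\0\\1\\10\\0\\1\\10\\10\\1")
# wkt == "POLYGON Z ((0 0 1, 10 0 1, 10 10 1, 0 0 1))"
```

Whether a given triple store accepts the three-dimensional form would have to be checked per system.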
Binary encoded data
DICOM can define arrays of binary data types and serialize them directly in binary format. RDF still has no equivalent for this. As a workaround the values can be represented as decimal strings, but the conversion might change the accuracy of the values.
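The accuracy issue can be shown with a single-precision float. This is only an illustration with hypothetical helper names: it checks whether a binary float32 value survives the trip through a decimal string with a given number of significant digits.

```python
import struct


def float32_bytes(value: float) -> bytes:
    """Encode a value as a little-endian IEEE 754 single-precision float."""
    return struct.pack("<f", value)


def round_trips(raw: bytes, digits: int) -> bool:
    """True if binary -> decimal string -> binary reproduces the same bytes."""
    value = struct.unpack("<f", raw)[0]
    text = f"{value:.{digits}g}"  # decimal string with `digits` significant digits
    return float32_bytes(float(text)) == raw


raw = float32_bytes(1.0 / 3.0)
lossy = not round_trips(raw, 4)  # too few digits: the value changes
exact = round_trips(raw, 9)      # 9 significant digits suffice for float32
```

So a decimal string representation can be lossless, but only if the converter writes enough digits for the underlying binary type.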
Other values like OB, OW, etc.
These are streams that contain a sequence of multiple values, even though the VM of the stream itself is defined as 1. Typically these data sets are very large. They can be represented as strings, but this increases the length of the encoded data, and BASE64 encoding makes it difficult to access the data at a given index directly.
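Both effects can be seen in a few lines. This is a sketch with hypothetical function names: BASE64 maps every 3 bytes to 4 characters, and reading one byte at a given index means locating and decoding the 4-character group that contains it instead of indexing into the data directly.

```python
import base64


def encode_stream(data: bytes) -> bytes:
    """BASE64-encode a byte stream such as an OB value."""
    return base64.b64encode(data)


def byte_at(encoded: bytes, index: int) -> int:
    """Return the original byte at `index` without decoding everything."""
    group = index // 3                          # 3 input bytes per group
    chunk = encoded[4 * group:4 * group + 4]    # 4 output characters per group
    return base64.b64decode(chunk)[index % 3]


data = bytes(range(256)) * 3       # 768 bytes of example data
encoded = encode_stream(data)      # 1024 characters, one third larger
value = byte_at(encoded, 500)
```

For multi-byte values such as OW words the index arithmetic gets more involved, since one value can straddle two BASE64 groups.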