Should a DICOM StudyInstanceUID be unique to the patient?

While working with the DICOM study, series and media concepts, I wondered if these values are to be unique over all data, or only to the patient they belong to.
Phrased otherwise; can I have 2 patients having a study/series/sop instance uid that is the same value for both patients?
Or does the DICOM standard simply not care about that, leaving it open to the implementor to decide?

In DICOM, a Study (identified by its Study Instance UID) is always associated with a single Patient. See DICOM standard part 3 for details.
To answer your initial question/thought: a Unique Identifier (UID) has to be globally unique, i.e. world-wide over all patients, devices, hospitals, etc.

A UID in DICOM (no matter which UID) is always globally unique. So, to answer your question, uniqueness is not limited to the patient level.
The following is from the specification:
2017a Part 5 - Data Structures and Encoding (9 Unique Identifiers (UIDs))
Unique Identifiers (UIDs) provide the capability to uniquely identify a wide variety of items. They guarantee uniqueness across multiple countries, sites, vendors and equipment. Different classes of objects, instance of objects and information entities can be distinguished from one another across the DICOM universe of discourse irrespective of any semantic context.
More details about DICOM UID can be found in this answer.
Your comment on question as below:
My question was more about what to do in case I choose to clone a patient in my system and attach the same dicom(s) to it. Should I regenerate the dicom-uid's or could I keep them as-is.
I am not sure what you mean by "clone". If cloning changes the dataset, you should regenerate the SOP Instance UID. Even if you simply apply a lossy transfer syntax to your dataset, you should regenerate the SOP Instance UID. Any action that makes the dataset differ from the original requires a new SOP Instance UID. So if you are changing patient demographics while cloning, a new UID should be generated. Whether a new Study Instance UID should also be generated depends on what is changed.
OTOH, if you are just copying your dataset at different location, it is still same dataset. You do not need to regenerate UIDs in this case.
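As a rough illustration of that rule, here is a minimal Python sketch. It models a dataset as a plain dict keyed by attribute keyword, which is an assumption made for illustration only (a real application would use a DICOM toolkit). The function name `clone_dataset` is hypothetical, and the new UID uses the UUID-derived "2.25." form described in PS3.5 Annex B.2.

```python
import copy
import uuid


def clone_dataset(dataset, changes=None):
    """Copy a dataset; assign a new SOPInstanceUID only if content changes.

    `dataset` is a plain dict keyed by attribute keyword (illustrative only).
    """
    clone = copy.deepcopy(dataset)
    if changes:
        clone.update(changes)
        # The clone now differs from the original, so it needs its own
        # SOP Instance UID ("2.25." + decimal UUID).
        clone["SOPInstanceUID"] = "2.25." + str(uuid.uuid4().int)
    return clone
```

A plain copy (`clone_dataset(ds)`) keeps the UIDs as they are, while `clone_dataset(ds, {"PatientName": "DOE^JANE"})` returns a dataset with a fresh SOP Instance UID. Whether the Study and Series Instance UIDs also need replacing is left to the caller, since that depends on what is changed.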

Unfortunately, although the standard states that a UID should be globally unique, in my experience you cannot rely on it at the series level. I have come across series with duplicate UIDs across studies. To protect yourself, assume you have to use StudyUID + SeriesUID to ensure a unique series key.
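A minimal sketch of that defensive approach, assuming each instance header is available as a dict keyed by attribute keyword (an illustrative stand-in for parsed DICOM headers; `group_by_series` is a hypothetical helper):

```python
from collections import defaultdict


def group_by_series(instances):
    """Group SOP Instance UIDs by (StudyInstanceUID, SeriesInstanceUID)."""
    series = defaultdict(list)
    for inst in instances:
        # Composite key: a Series UID reused in another study stays separate.
        key = (inst["StudyInstanceUID"], inst["SeriesInstanceUID"])
        series[key].append(inst["SOPInstanceUID"])
    return dict(series)
```

With this key, two studies that wrongly share a Series Instance UID still end up as two distinct series in your own index.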

Related

How to generate unique DICOM UID?

I am working on DICOM gated (PET) data.
I would like to artificially create a DICOM image series which includes gated data. I am asking how the SOPInstanceUID, which labels each image slice in each phase or gate, is incremented.
These have different values for each slice in a gate and are incremented between gates, but I can't work out the logic by which this value is chosen.
Is there a reference to where and how these values are written?
Multiple algorithms for generating DICOM UIDs, with their drawbacks, are explained in this answer.
As per the DICOM specification, all UIDs, including the SOPInstanceUID in question, should be unique. This holds regardless of what data (gated PET or other) you are working with.
The following is from the specification:
2017a Part 5 - Data Structures and Encoding (9 Unique Identifiers (UIDs))
Unique Identifiers (UIDs) provide the capability to uniquely identify a wide variety of items. They guarantee uniqueness across multiple countries, sites, vendors and equipment. Different classes of objects, instance of objects and information entities can be distinguished from one another across the DICOM universe of discourse irrespective of any semantic context.
A UID consists of two parts:
Organization root:
This part of the UID ensures uniqueness across organizations. There are service providers who offer a root for free. Medical Connections is the one I am aware of. You can contact them to get one for free.
Suffix:
Further, you should generate the suffix in such a way that it guarantees uniqueness inside your organization.
The following are the general rules for a DICOM UID:
Total length must be <= 64 characters, including the stops
Must contain only digits 0-9 and full stops
Each numeric "component" (between stops) must be a valid and unambiguous integer number, and so must not have a leading zero (unless the whole component is zero)
Must be guaranteed to be unique - this means:
It must be derived from a proper official root under your sole control.
It must not be created by appending digits (however special you consider the combination!) to someone else's UID.
In particular, series UIDs for secondary capture images, KIN objects etc. must not be created as derivatives of the Study UID (unless you own that root!)
Related to the above, there is no expectation or requirement that the Study UID, Series UID and Instance UID for images should be derived from the same root (though in practice, Series UID and Instance UID normally are, as both must be generated internally by the equipment which generates the images)
Date and Time are useful for generating UIDs, but only if:
Each machine has a unique root (normally your company UID root + a machine specific suffix such as a serial number)
If it is possible for UIDs to be generated at > 1 per second, then a sequential counter should also be used
If on a multi-threaded machine, then the thread ID or a properly interlocked counter is needed to prevent 2 applications or 2 threads in the same application from generating identical UIDs simultaneously.
Do not use the time as a component on its own, as it is too easy to end up with a leading zero: e.g. instead of 20060724.093017 use 20060724093017
The same can be found in the specification.
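The syntactic rules above (total length, allowed characters, no leading zeros) can be checked mechanically. The following Python sketch does only that; `is_valid_uid` is a hypothetical helper name, and it cannot verify the rules about uniqueness or ownership of the root.

```python
import re

# Dot-separated components; each is "0" or a number without a leading zero.
_UID_PATTERN = re.compile(r"(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*")


def is_valid_uid(uid):
    """Check the syntactic UID rules: length, character set, leading zeros."""
    return len(uid) <= 64 and _UID_PATTERN.fullmatch(uid) is not None
```

For example, `is_valid_uid("1.2.840.10008.1.2")` passes, while `is_valid_uid("20060724.093017")` fails because the second component starts with a zero.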
The following example of generating a UID is from the DICOM specification. Please note that this is an Informative section.
2017a Part 5 - Data Structures and Encoding (B Creating a Privately Defined Unique Identifier (Informative))
B.1 Organizationally Derived UID:
The following example presents a particular choice made by a specific
organization in defining its suffix to guarantee uniqueness of a SOP
Instance UID.
"1.2.840.xxxxx.3.152.235.2.12.187636473"
In this example, the root is:
1 Identifies ISO
2 Identifies ANSI Member Body
840 Country code of a specific Member Body (U.S. for ANSI)
xxxxx Identifies a specific Organization.(assigned by ANSI)
In this example the first two components of the suffix relate to the
identification of the device:
3 Manufacturer defined device type
152 Manufacturer defined serial number
The remaining four components of the suffix relate to the
identification of the image:
235 Study number
2 Series number
12 Image number
187636473 Encoded date and time stamp of image acquisition
In this example, the organization has chosen these components to
guarantee uniqueness. Other organizations may choose an entirely
different series of components to uniquely identify its images. For
example it may have been perfectly valid to omit the Study Number,
Series Number and Image Number if the time stamp had a sufficient
precision to ensure that no two images might have the same date and
time stamp. Because of the flexibility allowed by the DICOM Standard
in creating organizationally derived UIDs, implementations should not
depend on any assumed structure of UIDs and should not attempt to
parse UIDs to extract the semantics of some of its components.
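As a hedged sketch of how an organizationally derived UID along these lines could be generated in Python: the layout (root, device suffix, timestamp, counter) is one possible choice rather than anything the standard prescribes, and `generate_uid`, `org_root` and `device_suffix` are hypothetical names. The caller is assumed to pass a root that is properly registered to their organization and a suffix that is unique per machine.

```python
import itertools
import threading
import time

_counter = itertools.count(1)
_lock = threading.Lock()


def generate_uid(org_root, device_suffix):
    """Build "<root>.<device>.<timestamp>.<counter>".

    The interlocked counter never repeats within this process, so two
    threads cannot produce the same UID, even within the same second.
    """
    with _lock:
        sequence = next(_counter)
    # Date and time form one component, so the time part cannot
    # introduce a leading zero.
    timestamp = time.strftime("%Y%m%d%H%M%S")
    uid = f"{org_root}.{device_suffix}.{timestamp}.{sequence}"
    if len(uid) > 64:
        raise ValueError("UID exceeds 64 characters")
    return uid
```

Note that the counter restarts when the process restarts, so a production implementation would probably persist it or add a process-specific component. This sketch does not cover that case.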
There is one more way mentioned in the specification:
2017a Part 5 - Data Structures and Encoding (B Creating a Privately Defined Unique Identifier (Informative))
B.2 UUID Derived UID:
UID may be constructed from the root "2.25." followed by a decimal representation of a Universally Unique Identifier (UUID). That decimal representation treats the 128 bit UUID as an integer, and may thus be up to 39 digits long (leading zeros must be suppressed).
A UUID derived UID may be appropriate for dynamically created UIDs, such as SOP Instance UIDs, but is usually not appropriate for UIDs determined during application software design, such as private SOP Class or Transfer Syntax UIDs, or Implementation Class UIDs.
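In Python, a UUID derived UID as described in B.2 can be sketched with the standard `uuid` module. The function name is hypothetical, and a random (version 4) UUID is an assumption here; the quoted text does not require a particular UUID version.

```python
import uuid


def uuid_derived_uid():
    """Return "2.25." followed by the UUID as a decimal integer."""
    # str() of an int never has leading zeros, as the standard requires.
    return "2.25." + str(uuid.uuid4().int)
```

The result is at most 5 + 39 = 44 characters long, which is within the 64 character limit.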

Secondary Capture Image: what is the correct workflow for creating and storing?

I need to create a Secondary Capture Image representing a report related to radiopharmaceutical and dose injected to the patient during the medical examination.
I know Secondary Capture Image is not the right choice to accomplish the task but that is what the customer requires.
Following are the steps I thought to implement for developing the feature and I would like to read some opinions or suggestion from the community.
Assumption: MWL is implemented and the Study Instance UID is generated in the RIS
1. Query the MWL (C-FIND) to get the requested procedure object
2. Parse the result to get the StudyInstanceUID and patient related information (name, sex, birthdate etc.)
3. Query (C-FIND) the modality looking for the specific Study Instance UID
4. Parse the result to get the Series Instance UID
5. Create the image setting the three mandatory attributes Study Instance UID, Series Instance UID, Modality (together with some type 2 attributes I got querying MWL and modality in the previous steps)
6. C-STORE to persist the image to the storage archive
7. Commit of the image (do I really need?)
I would really appreciate comments and opinions, or someone who can point me to a more solid architecture.
1. Correct.
2. Correct. Do not forget about attributes that are not so obvious, like Admission ID, Accession Number, Referring Physician's Name and others.
3. The majority of modalities do not support Query/Retrieve as an SCP. If you really need to query for the images, send the C-FIND to the PACS rather than the modality. The Study Instance UID comes with the worklist. Even if the UID you find by Query differs from that, I would strongly recommend using the one from the worklist. However, I do not see any sense in using attributes from sources other than the MWL and your own "acquisition".
4. Why would you want to add the image to an existing series? It would probably be more appropriate to create a new one. There are a lot of reasons for that, e.g. Modality and vendor/equipment information are series level information and probably different.
5. There are more mandatory attributes for SC (e.g. in the general image module). Not all come from the MWL.
6. Yes.
7. You do not have to. However, suppose that your images are lost:
a) you have received a storage commitment from the PACS -> blame on the PACS
b) you have not received a storage commitment from the PACS -> blame on ...? ;-)
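To make the advice about the worklist and a new series concrete, here is a rough Python sketch. It takes a worklist entry as a plain dict keyed by attribute keyword (an assumption for illustration; network calls and pixel data are left out) and builds the identifying part of a Secondary Capture header. `build_sc_header` and the attribute list are hypothetical and incomplete, and the Modality value "OT" is an assumption that should be checked against your customer's requirements.

```python
import uuid

# SOP Class UID of Secondary Capture Image Storage.
SC_IMAGE_STORAGE = "1.2.840.10008.5.1.4.1.1.7"

# Attributes copied from the worklist entry (illustrative, incomplete list).
WORKLIST_ATTRIBUTES = (
    "PatientName", "PatientID", "PatientBirthDate", "PatientSex",
    "AccessionNumber", "ReferringPhysicianName", "AdmissionID",
)


def build_sc_header(worklist_item):
    """Reuse the worklist Study Instance UID; create a new series and instance."""
    # Type 2 attributes may be empty, so missing values become "".
    header = {name: worklist_item.get(name, "") for name in WORKLIST_ATTRIBUTES}
    # The Study Instance UID must come from the worklist (KeyError if absent).
    header["StudyInstanceUID"] = worklist_item["StudyInstanceUID"]
    # New series and new instance, each with its own UUID derived UID.
    header["SeriesInstanceUID"] = "2.25." + str(uuid.uuid4().int)
    header["SOPInstanceUID"] = "2.25." + str(uuid.uuid4().int)
    header["SOPClassUID"] = SC_IMAGE_STORAGE
    header["Modality"] = "OT"  # assumption: "Other"
    return header
```

The remaining mandatory SC attributes and the C-STORE and storage commitment steps would still have to be handled by your DICOM toolkit.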

Maximum records can be stored at Riak database

Can anyone give an example of the maximum record limit in a Riak database, with specific hardware details? Please help me with this. I'm going to build a CDR information system. Would Riak be a suitable choice for my database?
Riak uses the 2^160 SHA-1 hash value to identify the partitions to store data in. Data is then stored in the identified partitions based on the bucket and key name. The size of the hash space is therefore not related to the amount of data that can be stored. Two different objects that happen to hash to the same value will therefore not overwrite each other.
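The idea of using the hash only to pick a partition can be illustrated with a small Python sketch. This is not Riak's actual implementation: how Riak serializes the bucket and key before hashing is not reproduced here, and `partition_for` and the ring size of 64 are assumptions chosen for illustration.

```python
import hashlib

HASH_SPACE = 2 ** 160  # size of the SHA-1 output space


def partition_for(bucket, key, ring_size=64):
    """Map a bucket/key pair to one of `ring_size` equal partitions."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")  # 0 <= position < 2**160
    return position * ring_size // HASH_SPACE
```

Many different keys map to the same partition, which is why the size of the hash space says nothing about how many objects can be stored.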
When working with Riak, it is important to model your data correctly and consider how it needs to be retrieved and queried during the design process. Ideally you should try to ensure that the vast majority of your queries can be done through direct key access. It is often recommended to de-normalise your data and use natural keys. For CDRs this may mean creating an object holding all CDRs for a subscriber per day. These objects can be named based on the subscriber id and date, making it easy to retrieve data directly by key. It is also often more efficient to retrieve a few larger objects than many small ones and perform filtering in the application rather than try to just get the exact data that is needed. I have described this approach in greater detail here.
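A minimal sketch of that key design in Python, independent of any Riak client library. The key format (`<subscriber>_<YYYYMMDD>`) and the record field names `subscriber_id` and `start` are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def cdr_key(subscriber_id, call_start):
    """Natural key: one object per subscriber per day."""
    return f"{subscriber_id}_{call_start:%Y%m%d}"


def group_cdrs(cdrs):
    """Group CDR dicts into per-subscriber, per-day objects."""
    objects = defaultdict(list)
    for cdr in cdrs:
        objects[cdr_key(cdr["subscriber_id"], cdr["start"])].append(cdr)
    return dict(objects)
```

Each resulting list would be stored as one object under its key, so all of a subscriber's calls for a given day can be fetched with a single direct key lookup and then filtered in the application.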
The limit to the number of records (or key/value pairs) you can store in Riak is governed only by the size of the hash space: 2^160. According to WolframAlpha, this is the number:
1461501637330902918203684832716283019655932542976
In other words, go nuts. :)

Matching unique ids in two different databases

I have two different databases that are not connected in any way. In fact, one is a public school database and one is a HUD (housing) database. By law they are not allowed to share names and other specific identifying addresses. Birthdates and addresses are okay - along with zip codes and other more general ids. The users need to be able to query the other database to get non-specific information so it would appear that they need to share the same unique id. I was considering such things as using birthdates and perhaps initials of name or perhaps last 4 digits of ssn along with the birthdate. The client was thinking of global positioning data but I'm concerned about apartments next to one another or moving of families. Any ideas?
First you need to determine what your measure of uniqueness will be. If more than one person in either database has the same value for that measure, you need to change your strategy. After that, put a constraint on both databases enforcing that these properties (birthday, SSN) are what make a Person record unique.
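The first check, whether a candidate measure of uniqueness really is unique, can be sketched in Python as follows. Records are assumed to be dicts, and the helper name `ambiguous_keys` and the field names used in the usage note are hypothetical.

```python
from collections import Counter


def ambiguous_keys(records, fields):
    """Return the candidate key values shared by more than one record."""
    counts = Counter(tuple(record[f] for f in fields) for record in records)
    return {key for key, n in counts.items() if n > 1}
```

If `ambiguous_keys(people, ["birthdate", "ssn_last4"])` returns anything other than an empty set for either database, that combination is not a usable shared id and more fields are needed.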

DICOM: What's the point of SOPInstanceUID tag?

DICOM already provides a unique enough identifier for the Series (e.g. Series Instance UID), so why also include one on the lower level objects (e.g. SOPInstanceUID)?
What I find really annoying is the fact that when referencing other objects - for example when RTPlan object references RTStruct object via ReferencedStructureSetSequence / ReferencedSOPInstanceUID - it's done using the SOP Instance UID. However any of the DICOM SCPs - such as find/move - don't work with SOP Instance UID, they work with the Series Instance UID. So what gives? Do I have to load the whole Series to find all the referenced objects?
This question was from quite a while ago, but I thought I'd add that, ignoring QR altogether, a SeriesInstanceUID is a globally unique identifier for a single series. SOPInstanceUID is a globally unique identifier for a DICOM file. A series can have multiple DICOM files, so each would share that same SeriesInstanceUID, but each file would have its own SOPInstanceUID.
As you probably know, DICOM has a hierarchy of identifiers for each individual SOP (Service Object Pair) Instance (Patient ID / Study Instance UID / Series Instance UID / SOP Instance UID). This hierarchy is built into the Query/Retrieve mechanism in DICOM, and is also used to identify specific SOP Instances.
In the specific case you're mentioning, I believe there could be the possibility of multiple RT Structure Sets within a Series/Study. The individual SOP Instance must be referenced so that you know which Structure Set the RT Plan is referencing.
As for products supporting retrieving by SOP Instance UID, unfortunately, relational queries are not widely supported in DICOM Query/Retrieve SCPs, as you've discovered, and some DICOM servers do not support Image level queries. In this specific case, you could query at the series level specifically for the RTSTRUCT modality, and only retrieve the Series that have this modality, thus narrowing down which data you need to download to just the RT Structure Sets.
SOPInstanceUID is the separate UID of each DICOM image file. The Study, Series and SOP Instance UIDs follow the DICOM data model: the Study Instance UID identifies a particular study, a study is divided into series, each identified by a Series Instance UID, and the SOP Instance UID identifies an individual DICOM image. It is a hierarchical structure. I never used SOPInstanceUID when I developed a PACS workstation in Java; in my experience, Study and Series UIDs were enough to represent a patient's data. Still, SOPInstanceUID gives a unique identity to each DICOM image.
SOP Instance UID: a unique identifier for an IOD instance. It is a Type 1 tag, so it must be present with a value.
For example:
Each DICOM image has a unique identifier.
A series reference is not specific enough. In the case of structure sets, the Referenced SOP Instance UID ties the contours in the structure set to the specific slice in the dataset. It's not enough to just reference the series because you have to ensure that the contour aligns exactly with a slice.
SOPInstanceUID is for image level identification.
Understand it like:
A study can have multiple series and a series can have multiple images/DICOM
So,
to identify study uniquely we use StudyInstanceUID
to identify series uniquely we use SeriesInstanceUID and
to identify an image/DICOM uniquely we use SOPInstanceUID
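One way to work with this hierarchy when following a ReferencedSOPInstanceUID is to keep a local index from SOP Instance UID to its study and series. The Python sketch below assumes the headers of the instances you already hold are available as dicts keyed by attribute keyword; `build_instance_index` is a hypothetical helper, not part of any DICOM toolkit.

```python
def build_instance_index(instances):
    """Map SOPInstanceUID -> (StudyInstanceUID, SeriesInstanceUID)."""
    index = {}
    for inst in instances:
        index[inst["SOPInstanceUID"]] = (
            inst["StudyInstanceUID"],
            inst["SeriesInstanceUID"],
        )
    return index
```

Given such an index, `index.get(referenced_uid)` tells you which study and series hold the referenced object, or returns `None` if it has not been retrieved yet.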

Resources