Steven C. Horii, MD
What DICOM does and the importance of its functions are best illustrated by a hypothetical transaction between two manufacturing companies (1,4).
Suppose that you, as the director of product development, decide that you need a dozen widgets to create a prototype product. You tell Betty, your engineering purchasing manager, to buy the widgets from the Acme Widget Company. Betty knows that the quickest way to get them would be to call Acme, but she also knows that the widget you want is popular and may be out of stock. She picks up the telephone, calls Acme, and hears the usual greeting: "Hello, this is Acme Widget; how may I direct your call?" Betty replies, "Please connect me with Bob Roberts in sales." The receptionist puts through Betty's call.
What Betty has done so far is much like what the device using DICOM does when it asks to communicate with another device. When she picks up the telephone, she is requesting that the line be opened to her. The dial tone she hears tells her that the line is available (no dial tone would indicate a problem with the line or that it was in use). Similarly, DICOM uses services to request communication with another device over the network. The network protocol will indicate that the network is either busy or available. If it is available, DICOM initiates a series of actions that request what is called an association with the other device.
By simply dialing (ie, using either a push-button or dial telephone) the desired number and listening for an answer, you have accomplished a surprising amount of communication protocol. First, you follow certain rules in dialing the number. If you are calling from within an office, it is likely that you must first dial a number to request a line outside your local switchboard. Then, you dial a "1" plus an area code if the number you are calling is not in your area code, or you leave those digits out if the number is local. When you hear the telephone on the other end ring, you know the line is not busy. If it is busy, your "protocol" will likely be to hang up and try again later.
When the person on the other end answers, the first few words tell you quite a bit. The two things of most importance to you are, first, that the person is speaking English, and second, that you have reached the correct number. If you hear, "Moshi moshi, Fujiyama Denki desu," you might either hang up or ask if you are talking to Acme Widget. In this case, some negotiation is taking place; you want to know if you have the correct number and if the person speaks English instead of Japanese. You are presuming that the person will recognize your language as English. The person's reply will determine what you ask next.
DICOM does some initial negotiating to establish the association mostly by determining what it is the requesting device wants to do and what the receiving device is capable of doing. With DICOM, it is not the device itself, but the software running on the device that does much of this negotiation. DICOM refers to the devices as application entities because it is the application layer (the uppermost layer in the communications stack) that initiates the communication process.
Just as your initial "negotiations" determine basic capabilities (eg, the language being used, the identity of the other person), so the association establishment in DICOM involves negotiating capability. The application entity requesting the association sends what amounts to a list of things it wants to do. The receiving application entity then replies with which items on that list it can do. Subsequent exchanges are based on the capabilities that the two entities have in common. An example is the way in which digital pixel values larger than 8 bits (1 byte) are represented. Some computers represent 2-byte (16-bit) numbers with the least significant byte (ie, the one representing the least significant digits of the number) being stored or sent first ("little endian"). Other computers do just the opposite ("big endian"). If two devices using the opposite representations exchange numeric data (eg, pixel data), the values will be incorrectly represented unless the devices know this and can effect conversion. DICOM allows for both methods of representation, so that one matter for negotiation is which method the devices will use during the exchange of information.
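The byte-order problem described above can be seen in a few lines of Python. This is only an illustrative sketch: the pixel value is made up, and the standard library's struct module stands in for whatever a real imaging device would use.

```python
import struct

# A hypothetical 16-bit pixel value (291 in decimal).
pixel_value = 0x0123

# "<H" packs an unsigned 16-bit number least significant byte first.
little = struct.pack("<H", pixel_value)
# ">H" packs the same number most significant byte first.
big = struct.pack(">H", pixel_value)

# A receiver that assumes the wrong byte order recovers the wrong value.
misread = struct.unpack(">H", little)[0]
# A receiver that knows the sender's byte order recovers the original.
correct = struct.unpack("<H", little)[0]

print(little.hex(), big.hex(), misread, correct)
```

The two packed forms contain the same two bytes in opposite order, which is why the devices must agree on a representation before exchanging pixel data.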
Another difference among imaging devices is the manner in which they represent values. As will be discussed later, DICOM breaks the information it needs to send into a series of data elements. Applications may have different ways of representing the value contained in any given data element. DICOM has very specific definitions of the different types of representations allowed; these are called value representations (VR). Examples include text strings of differing maximum lengths (a text string is a collection of successive characters with a starting and ending character, much like a sentence), binary numeric data, time and date data, and person name data. In the earlier ACR-NEMA standard, these VRs were defined in a section of the standard called a data dictionary. If you were writing software that received ACR-NEMA information and you wanted to understand the meaning of the data elements you were receiving, you would use the data dictionary to look up the ACR-NEMA element tag (which identifies the element) to find out what data were in the element and how the element was represented. With the DICOM standard, new VR types were added, and any future additions will increase the size and complexity of the data dictionary. This is one reason why DICOM defines both an ACR-NEMA method and a new method for handling the VR of a data element. The new method defined by DICOM includes the VR in the element, thus avoiding any possible ambiguity. For software designers, this method reduces the need to refer to the data dictionary. The ACR-NEMA method of defining the VR in the data dictionary is called implicit VR. The new method used in DICOM, in which the VR is included in the data element, is called explicit VR. DICOM allows both methods, and this is important to know as abilities are being negotiated.
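The difference between the two methods can be sketched by encoding the same data element both ways. The byte layouts below are an assumption based on the common little-endian encodings (tag, then length, then value for implicit VR; tag, two-character VR, then a 2-byte length for explicit VR with short VRs such as person name). Some VRs use a longer length field in explicit form, which this sketch omits. The function names and the patient name value are hypothetical.

```python
import struct


def encode_implicit(group: int, element: int, value: bytes) -> bytes:
    """Implicit VR: tag and 4-byte length only; the VR lives in the data dictionary."""
    return struct.pack("<HHI", group, element, len(value)) + value


def encode_explicit(group: int, element: int, vr: str, value: bytes) -> bytes:
    """Explicit VR (short form): the 2-character VR travels inside the element."""
    header = struct.pack("<HH", group, element) + vr.encode("ascii")
    return header + struct.pack("<H", len(value)) + value


# Patient name element, tag (0010,0010), with a made-up value.
# The value is 8 bytes long; DICOM values have an even length.
name = b"Doe^Jane"

implicit = encode_implicit(0x0010, 0x0010, name)
explicit = encode_explicit(0x0010, 0x0010, "PN", name)

print(implicit.hex())
print(explicit.hex())
```

A receiver of the implicit form must look up tag (0010,0010) to learn that the value is a person name, whereas the explicit form carries "PN" in the element itself.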
Both the VR method and the order of the bytes are part of a set of information that is crucial to successful information exchange. This information set is called a transfer syntax and is defined in DICOM. Which transfer syntax is to be used for exchange of information is negotiated very early in the association process. Thus, when people familiar with DICOM speak of "implicit VR, little endian," they are referring to one of the DICOM transfer syntaxes.
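As a rough sketch of this negotiation, a transfer syntax can be modeled as a pair of choices, and the agreed syntax as the first proposal that the receiving entity also supports. The class, the function, and the "first match wins" policy are assumptions made for illustration; they are not taken from the standard.

```python
from typing import NamedTuple, Optional, Sequence


class TransferSyntax(NamedTuple):
    vr_method: str   # "implicit" or "explicit"
    byte_order: str  # "little" or "big"


def choose_transfer_syntax(
    proposed: Sequence[TransferSyntax],
    supported: Sequence[TransferSyntax],
) -> Optional[TransferSyntax]:
    """Return the first proposed syntax the acceptor supports, or None."""
    for syntax in proposed:
        if syntax in supported:
            return syntax
    return None


# What the requesting application entity offers, in order of preference.
requester = [
    TransferSyntax("explicit", "big"),
    TransferSyntax("explicit", "little"),
    TransferSyntax("implicit", "little"),
]
# What the receiving application entity is able to handle.
acceptor = [
    TransferSyntax("implicit", "little"),
    TransferSyntax("explicit", "little"),
]

agreed = choose_transfer_syntax(requester, acceptor)
print(agreed)
```

If the two lists have nothing in common, the function returns None, which corresponds to the two entities being unable to exchange information in that form.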
After Betty is connected with Bob in the sales department at Acme, they exchange some pleasantries and then Bob asks her what she needs. Betty replies, "I need 15 widgets." Bob thinks for a second, tells her he is checking inventory, then says, "Yes, I have them in stock." Then he adds, "You know, Betty, we have just released the Mark II Widget. It is 10% faster than the Mark I Widget for the same price." Betty asks, "What about the interface?" Bob replies, "The Mark II is a drop-in replacement for the Mark I. We will be phasing the Mark I out as we use up inventory." Betty says, "Sounds good; I'll take the 15 widgets as Mark IIs, then."
These transactions seem straightforward only because the process is so familiar. If Betty did not know Bob in sales, for example, she might simply have asked for the sales department.
What is happening in this scenario is something that happens so many times during a typical day that we tend to ignore it unless something goes wrong. In this example, the definitions of terms and the model of the way things work are held in common by you, Betty, and Bob at Acme. This is true in virtually all of our communication: We work from common definitions and models, and if the terms or model are unfamiliar, we either ask for clarification or we run the risk of encountering problems. In this case, you (product development manager) asked for a dozen widgets, but Betty (engineering purchasing manager) ordered 15 of them. Betty is working from the historical information she has that, in your development projects, you invariably forget about the three widgets that will be used up in safety compliance testing and need to order more. Betty knows enough about your prototype project design to ask whether the Mark II Widget has the same interface as the Mark I. She also knows that the design would benefit from an increase in widget speed, so she sees no reason not to switch to the Mark II. There is an even higher level model invoked in this scenario. Some engineers would be wary of using a just-released product for fear that it is likely to cause more problems than a more time-tested product. However, in this scenario, both you and Betty know that Acme has such good quality assurance that the likelihood of encountering any problem with the Mark II Widget compared with the Mark I is negligible.
The most important functions that DICOM performs are to define as unambiguously as possible the terms it uses and to define models of image communication that are agreed on by those who adopt the standard. At a very high level of communication (that between users), it is important to agree on terminology. Both within radiology and between radiology and other specialties, use of unambiguous terms is vital if actions taken on the basis of the communication are to be correct. A difference of opinion about what constitutes "medial" and "lateral," for example, could have disastrous effects if surgery is undertaken on the basis of the radiology report. The College of American Pathologists has for some years been working on the Systematized Nomenclature of Medicine (SNOMED). This nomenclature proposes standard terms, also called a controlled vocabulary, for anatomic structures and pathologic conditions. The College of American Pathologists is now a DICOM Standards Committee member and is working with the committee to develop the sections of SNOMED that will address imaging (5). These sections are referred to as the SNOMED-DICOM Microglossary.
DICOM has also adopted conventions from other standards where appropriate, such as the person name format proposed by the Health Level 7 standards body. This format clarifies how names are represented and is of value when there are prefixes (eg, Dr, Fr) and suffixes (eg, Jr, II; academic degrees) along with the name. The format divides a name into components such as "family name," "given name," and "prefixes," and each component may hold a single value or several, to account for unusual constructions such as multiple middle names or suffixes.
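A small sketch shows how such a component-based name might be taken apart by software. The caret delimiter and the order of the five components (family, given, middle, prefix, suffix) are assumptions based on the usual DICOM person name encoding and are not spelled out in the text; the parser and the sample name are hypothetical.

```python
from typing import NamedTuple


class PersonName(NamedTuple):
    family: str = ""
    given: str = ""
    middle: str = ""
    prefix: str = ""
    suffix: str = ""


def parse_person_name(raw: str) -> PersonName:
    """Split a caret-delimited name; trailing components may be omitted."""
    parts = raw.split("^")[:5]
    return PersonName(*parts)


# An empty component (here, the middle name) is simply left blank.
name = parse_person_name("Roberts^Bob^^Dr^Jr")
print(name)
```

Because each component has its own position, software never has to guess whether "Jr" is a suffix or part of the family name.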
DICOM extends its definitions beyond terms, measurement values, conditions, and the like. It also defines the model of how these things, or entities, relate to each other. For example, how is a patient related to a study done on that patient? A simple relationship in this case is expressed by the word "has": A patient has a study. In turn, the study itself contains images. This process can be extended so that, through careful examination of a clinical operation, a model can be built that includes all the entities in the operation (eg, patients, studies, images, reports) and the relationships between them. This process is called entity-relationship (E-R) modeling. It is important to note that the resulting model, usually a diagram, does not depict the direction in which data move (Fig 3).
Each of the entities in an E-R model has other descriptors, or attributes. For example, a patient might be described in terms of name, age, sex, and medical record number, and perhaps height and weight. All of these are attributes of the patient; that is, they each carry some identifying or descriptive information about that particular person. DICOM defines not only E-R models for imaging but also the attributes that describe each entity. In fact, the data elements of the ACR-NEMA standard are the attributes used in DICOM. Figure 4 shows the structure of a DICOM attribute.
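The entities, attributes, and relationships described above can be sketched as simple data classes. This is a toy model under stated assumptions: the class and field names are hypothetical, and a real DICOM model defines many more entities and attributes than are shown here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Image:
    image_number: int


@dataclass
class Study:
    description: str
    # Relationship: a study contains images.
    images: List[Image] = field(default_factory=list)


@dataclass
class Patient:
    # Attributes that identify or describe the patient.
    name: str
    medical_record_number: str
    sex: str
    # Relationship: a patient has studies.
    studies: List[Study] = field(default_factory=list)


patient = Patient(name="Jane Doe", medical_record_number="12345", sex="F")
study = Study(description="Chest CT")
study.images.extend([Image(1), Image(2)])
patient.studies.append(study)

print(len(patient.studies), len(patient.studies[0].images))
```

Note that the model records only how the entities relate to one another; it says nothing about the direction in which data move between devices.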
The process of developing E-R models and identifying the attributes that describe the entities is part of an information analysis and modeling method called object-oriented analysis. This technique has caught on so rapidly in computer science that it is now nearly impossible to study any branch of information science without encountering object-oriented methods. Part of the goal of creating E-R models is to achieve an object-oriented result: the entities are objects, and the attributes describe them.
DICOM adopted the object-oriented approach as part of the design philosophy. Things such as images, reports, and patients are all objects in DICOM and are called information objects because their function is to carry information. The definition of what constitutes an information object in DICOM is called an information object definition, which is nothing more than a list of which attributes must be present (mandatory attributes), which are optional, and which are conditional (ie, must be present only in some situations). There is a subtle distinction that is important to make when describing information objects. The information object definition can be thought of as a "form" with a number of "blanks" to be filled in with information. Each piece of information is an attribute, such as patient name or medical record number. Even if the form is not filled in, the various blanks on the form give the information some structure. When the blanks are filled in, values are given to the attributes so the form is no longer generic; it applies to a specific patient, image, or other type of object. This filling-in process, or assigning of values to attributes, creates what is called an information object instance.
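The distinction between the blank form and the filled-in form can be sketched as follows. The attribute list is hypothetical and is not a real DICOM information object definition, and the sketch does not attempt to evaluate the situations under which a conditional attribute becomes required.

```python
from enum import Enum


class Requirement(Enum):
    MANDATORY = "mandatory"
    OPTIONAL = "optional"
    CONDITIONAL = "conditional"


# The blank "form": each attribute and whether it must be filled in.
# Hypothetical attribute list for illustration only.
EXAMPLE_DEFINITION = {
    "patient_name": Requirement.MANDATORY,
    "medical_record_number": Requirement.MANDATORY,
    "patient_weight": Requirement.OPTIONAL,
    "contrast_agent": Requirement.CONDITIONAL,
}


def create_instance(definition, **values):
    """Fill in the form; reject unknown blanks and missing mandatory ones."""
    unknown = sorted(set(values) - set(definition))
    if unknown:
        raise ValueError(f"attributes not in the definition: {unknown}")
    missing = [
        name
        for name, requirement in definition.items()
        if requirement is Requirement.MANDATORY and name not in values
    ]
    if missing:
        raise ValueError(f"missing mandatory attributes: {missing}")
    return dict(values)


# Assigning values to the attributes creates an instance.
instance = create_instance(
    EXAMPLE_DEFINITION,
    patient_name="Jane Doe",
    medical_record_number="12345",
)
print(instance)
```

The definition exists once and is generic; every call that fills it in yields a separate instance that applies to one specific patient, image, or other object.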
The ACR-NEMA Version 1 and 2 standards did not use object-oriented analysis or design. Instead, attributes (or elements, as they were called) were grouped according to use. For example, there were groups of elements that carried identifying information about the patient and others consisting of elements that described the methods of image acquisition. Because they were developed without an E-R model, these groups do not conform to conventional object-oriented definitions. For example, a collection of elements used in the ACR-NEMA Version 2 standard to identify and describe a computed tomographic (CT) image would also contain the patient name. In an E-R model, however, the patient name is an attribute of the patient object, not of the image object. In other words, the patient name is not needed to describe the CT image, even though it would be needed to identify the image. One might also view these complex objects as consisting of parts of more than one entity in an E-R model.
The problem for the DICOM developers (at that time, still the ACR and the NEMA) was that they wanted to maintain some compatibility with the earlier versions of the standard. In addition, there were some computer scientists who believed that objects that "broke the rules" by containing attributes not strictly inherent in that object had some value. For one thing, retrieving such complex objects from storage would be more efficient: If the objects were broken up into smaller ones, assembling the smaller objects for use would mean searching the storage device for all of them. Such a task would be more time-consuming than recovering the complex object in a single retrieval. Consequently, these complex data structures were retained in DICOM as composite objects. New objects that were defined by the E-R models of DICOM followed object-oriented design rules and did not contain attributes that were not inherent in the object. Such objects in DICOM are called normalized objects. The image objects (eg, CT, ultrasound [US], magnetic resonance [MR] imaging) are all composite; the objects that are used for image and results management are normalized.
At the hypothetical company of which you are product development manager, an important aspect of quality assurance is keeping track of what parts are incorporated into your products. That way, if a problem with your product is discovered either in your testing laboratories or by your customers, you can find out what parts went into that product and determine if the problem is a design flaw or a faulty part. To facilitate this process, parts would be assigned part numbers in your products. This would enable you to identify a part that failed, but it would not help you figure out which manufacturing lot it came from. In other words, the part number identifier you assign may be unique in a product line, but it would not be unique across multiple versions of that product. For example, the widgets that Betty ordered might be assigned part number 2011030-001 on the basis of your design drawings. However, all the products you make containing that widget will reference that part number. To identify a specific part, you need a unique identifier (UID). If you know that Acme assigns unique serial numbers, for example, you might use them to uniquely identify a particular widget in a particular product. Only one product should contain the widget with part number 2011030-001 and serial number 97-2003, so any subsequent follow-up with Acme necessitated by product failure will be made simpler.
Just as certain elements in industry are assigned numbers for unique identification, so the images, reports, or other information that are transmitted from one device to another in medical image communication must be identified in a unique fashion. DICOM uses UIDs to identify information objects in this way. The form of the UID conforms to an international standard, and, if properly applied, provides an identifier that is unique not only within an institution but worldwide. UIDs are designed mainly for computer software interpretation; as a result, their form is a bit cumbersome. The UID is used by DICOM whenever one thing is referenced by another. For example, the transfer syntaxes described earlier have UIDs so that the different machines using them can refer to a particular transfer syntax by its UID. As part of the international standardization process, the committee responsible for the DICOM standard applied for, and was granted, a numeric field to use as part of any UID that DICOM defines. This numeric field is called the organizational root; for DICOM, it is 1.2.840.10008. To this organizational root are appended additional numeric fields. For example, the UID for the DICOM explicit VR little endian transfer syntax is 1.2.840.10008.1.2.1. The organizational root will differ depending on whether the UID was assigned by a manufacturer or a user. The periods separating the numbers make these numeric fields look as though they should have some particular meaning, but they do not. A UID exists to give a unique identity to something, not to carry information about the thing it identifies.
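A brief sketch shows how software might check the form of a UID and test whether it was issued under the DICOM organizational root. The 64-character limit and the rule against leading zeros in a numeric field are assumptions based on the usual DICOM UID rules and are not stated in the text; the function names are hypothetical.

```python
import re

DICOM_ROOT = "1.2.840.10008"

# Numeric fields separated by periods; a field may not have a leading zero.
_UID_PATTERN = re.compile(r"^(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*$")


def is_valid_uid(uid: str) -> bool:
    """Check only the form of the UID, not whether it has been registered."""
    return len(uid) <= 64 and _UID_PATTERN.match(uid) is not None


def has_root(uid: str, root: str) -> bool:
    """True if the UID equals the root or extends it with further fields."""
    return uid == root or uid.startswith(root + ".")


explicit_vr_little_endian = "1.2.840.10008.1.2.1"

print(is_valid_uid(explicit_vr_little_endian))
print(has_root(explicit_vr_little_endian, DICOM_ROOT))
```

The check on the root compares whole numeric fields, so a UID beginning with 1.2.840.100089 is not mistaken for one defined by DICOM. Beyond such comparisons, software should treat the UID as an opaque identifier.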
1. National Electrical Manufacturers' Association. Digital Imaging and Communications in Medicine (DICOM). Rosslyn, Va: NEMA, 1996; PS 3.1-1996-3.13-1996.
4. Bennett B, McIntyre J. Understanding DICOM 3.0. Dallas, Tex: Kodak Health Imaging Systems, 1993.
5. Bidgood WD Jr, Horii SC. Modular extension of the ACR-NEMA DICOM standard to support new diagnostic imaging modalities and services. J Digit Imaging 1996; 9:67-77.