Value Level Metadata and Research Concepts

When people point to flaws in SDTM, what they describe typically looks to me like gaps in the existing standard. In general, CDISC started defining standards by focusing on the basic structural metadata (e.g. domains, variables, code lists). This makes sense because structural metadata is fundamentally useful, and relatively easy to understand and create. As the industry’s use of the standards has increased, so has the demand for standards that can be implemented more consistently and easily, as well as standards that are more computable. The limitations in the current standards are gaps, and addressing these gaps represents a natural evolution for the CDISC standards.

As noted in my previous post, “What’s in a SHARE Value Level Metadata Library?”, the CDISC standards do not currently contain Value Level Metadata (VLM) content, and this content represents a lot of new metadata. VLM is a gap in the existing standards. How do we know which variables are impacted by a specific --TESTCD? Much of that information can be conveyed through VLM, and the CDISC Terminology teams have started to address the VLM content gaps. In SHARE, we’re extending the CDISC model to capture VLM content.
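To make the question "which variables are impacted by a specific --TESTCD?" concrete, here is a minimal Python sketch of a VLM lookup. It is an illustration only: the `ValueLevelItem` class, the `VLM_LIBRARY` dictionary, the `variables_for_test` function, and the sample entries are all hypothetical, and the listed values are not normative CDISC or SHARE content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueLevelItem:
    """One variable that is relevant when a given test code is used."""
    variable: str
    allowed_values: tuple = ()  # empty tuple: no value restriction recorded


# Hypothetical VLM library keyed by (domain, test code).
# The entries are illustrative assumptions, not published metadata.
VLM_LIBRARY = {
    ("VS", "SYSBP"): [
        ValueLevelItem("VSORRESU", ("mmHg",)),
        ValueLevelItem("VSPOS", ("SITTING", "STANDING", "SUPINE")),
        ValueLevelItem("VSLOC"),
    ],
    ("VS", "TEMP"): [
        ValueLevelItem("VSORRESU", ("C", "F")),
        ValueLevelItem("VSLOC"),
    ],
}


def variables_for_test(domain, testcd):
    """Return the variable names recorded for a domain and test code."""
    return [item.variable for item in VLM_LIBRARY.get((domain, testcd), [])]
```

With a library like this, asking `variables_for_test("VS", "SYSBP")` answers the question directly, and an unknown test code simply returns an empty list, which is one way to surface a VLM gap.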

A Concept Layer exists in the SHARE meta-model. Conceptual metadata represents another gap in the CDISC standards, with the noted exception of Controlled Terminology. The SHARE team loaded NCI-created concepts for existing CDASH and SDTM variables as experimental content, since they are not part of the normative standard. Basically, these concepts consist of a natural language definition and a concept code (c-code). For example, the SDTM variable AESDTH implements the concept “Death Related to Adverse Event”, has a c-code of C48275, and has a CDISC definition of “The termination of life as a result of an adverse event.” Without these basic concepts, we don’t have consistently rendered definitions for our standards metadata.
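A basic concept of this kind can be pictured as a small record attached to a variable. The sketch below is only an illustration of that shape, not the SHARE meta-model: the `Concept` class, the `CONCEPTS_BY_VARIABLE` dictionary, and the `define` function are hypothetical names, while the AESDTH values are the ones quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A basic concept: a c-code plus a natural language definition."""
    c_code: str
    name: str
    definition: str


# Hypothetical mapping from a standards variable to the concept it implements.
CONCEPTS_BY_VARIABLE = {
    "AESDTH": Concept(
        c_code="C48275",
        name="Death Related to Adverse Event",
        definition="The termination of life as a result of an adverse event.",
    ),
}


def define(variable):
    """Render a consistent definition line for a variable, if one is recorded."""
    concept = CONCEPTS_BY_VARIABLE.get(variable)
    if concept is None:
        return f"{variable}: no concept recorded"
    return f"{variable} ({concept.c_code}): {concept.definition}"
```

Rendering every definition through one function like `define` is what "consistently rendered definitions" amounts to in practice: a variable without a concept is reported explicitly rather than silently left undefined.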

Why do we need concepts with basic definitions and c-codes for our standards metadata? When a TA standards team is creating a new standard, these definitions help Subject Matter Experts (SMEs), who strangely prefer not to speak in the language of SDTM domains and variables, decide whether their needs are met by the existing standards or whether the development of new standards metadata is warranted. When new standards metadata is developed, creating natural language definitions in terms understood by the SMEs helps to clarify and disambiguate the meaning and use of that metadata.

When developing new VLM, how do we know how each SDTM variable in a domain is related to a specific --TESTCD? The SMEs draw on their clinical, statistical, and data management expertise to identify the appropriate set of concepts and their relationships to the specified test. These related sets of concepts are then used to create VLM content in terms of SDTM variables and controlled terminology. The conceptual metadata needed to support this process does not explicitly exist in the CDISC standards today. Developing the conceptual metadata that supports the development of VLM and other standards metadata represents another logical next step in the evolution of the CDISC standards.

There’s more to the SHARE Concept Layer than basic concepts, including the means to combine basic concepts to represent clinical observations. The SHARE Concept Layer will be covered in more detail in a future post.
