Creating SDTM domains with SAS: A guide for Clinical SAS Programmers

Clinical programmers have many career paths available. The main goal is always to access the data, manipulate and transform it, analyze it, and report on it. A programmer can specialize in data management (DM) programming and spend most of the time cleaning the data through edit checks and the generation of patient listings and profiles.

Another task of the DM programmer is to transform the data from its raw format into a standard format. This standard format could be the CDISC Study Data Tabulation Model (SDTM) that is requested by regulatory agencies such as the FDA for submission of a new compound, or it could be a sponsor’s own standard. In the process of transforming the data, the DM programmer must make sure that the output conforms to the standard and is both compliant and valid.

Thus, another aspect of the job is to write programs that check the data against the standard and to run those programs whenever a new study is about to be analyzed. Finally, when all the data has been transformed, the DM programmer must create a transport file that will be sent to the regulatory agency reviewing the submission data.

The second type of clinical programmer is the statistical (STAT) programmer, who takes the data that has been cleaned and transformed by the DM programmer and creates tables, listings, and graphs (TLG) for the clinical study report (CSR). Sometimes the data is taken from its raw state and transformed directly into TLGs, but most often the STAT programmer creates analysis data sets from which the required output documents for the CSR can easily be generated. The STAT programmer is also tasked with creating ad hoc reports when needed, yearly safety updates, DSMB reports, and integrated safety and efficacy summaries.

There is a significant transition occurring for many clinical programmers in data management (DM). Many DM programmers are moving from writing programs in Base SAS to using new tools and solutions to create the data that is needed in a new drug application submission. What did the programmer do in the past to clean the data, and how has that process changed? Now that the data is requested in a standard format, what types of programs, macros, and formats were used to transform the data? What is done now to make the process easier, more efficient, and repeatable across protocols, compounds, and therapeutic areas? From the old methodology to the new tools, we will show how the transformation process can be changed and improved.

Implementing SDTM with Base SAS

One possible approach is to implement the SDTM data standard with Base SAS as the primary tool. In the simplest form, this involves importing the source data into Base SAS, transforming that data with DATA steps, PROC SQL, and other SAS procedures, and then saving the SDTM domains as permanent data sets. For this example of creating the DM file, sort the three source data sets by patient identifier and then merge them together. The remaining activity is to define each of the SDTM DM variables in a DATA step and save that DM file to the target LIBREF. As is the case with all legacy SAS work, we have at our disposal a code editor window and SAS documentation, perhaps in hard copy as well as online.
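
The sort, merge, and derive sequence described above can be sketched in a language-neutral way. The example below uses Python rather than SAS purely for illustration; the source data set names (demog, enroll, dosing), their columns, the study identifier, and the USUBJID convention are all hypothetical assumptions, not taken from any real study.

```python
from datetime import date

# Hypothetical raw source data sets, each keyed by patient identifier.
demog = [
    {"patid": "002", "birthdt": date(1975, 3, 9), "gender": "F"},
    {"patid": "001", "birthdt": date(1960, 11, 21), "gender": "M"},
]
enroll = [
    {"patid": "001", "siteid": "10", "trt": "Placebo"},
    {"patid": "002", "siteid": "11", "trt": "Active"},
]
dosing = [
    {"patid": "002", "firstdose": date(2020, 2, 3)},
    {"patid": "001", "firstdose": date(2020, 1, 15)},
]

STUDYID = "ABC123"  # hypothetical study identifier


def index_by_patient(rows):
    """Return a dict keyed by patient identifier (one row per patient assumed)."""
    return {row["patid"]: row for row in rows}


def build_dm(demog, enroll, dosing, studyid=STUDYID):
    """Merge the three sources by patient and derive a few SDTM DM variables."""
    enroll_by_pat = index_by_patient(enroll)
    dosing_by_pat = index_by_patient(dosing)
    dm = []
    # Sorting by patient mirrors the sort step that precedes a merge.
    for d in sorted(demog, key=lambda row: row["patid"]):
        pat = d["patid"]
        e = enroll_by_pat.get(pat, {})
        x = dosing_by_pat.get(pat, {})
        first = x.get("firstdose")
        dm.append({
            "STUDYID": studyid,
            "DOMAIN": "DM",
            # Assumed convention: study identifier plus patient identifier.
            "USUBJID": f"{studyid}-{pat}",
            "SUBJID": pat,
            "SITEID": e.get("siteid", ""),
            "RFSTDTC": first.isoformat() if first else "",
            "BRTHDTC": d["birthdt"].isoformat(),
            "SEX": d["gender"],
            "ARM": e.get("trt", ""),
        })
    return dm


dm = build_dm(demog, enroll, dosing)
```

Each entry in dm is one subject-level record. In a Base SAS program the same result would come from the sort, merge, and DATA step assignments, with the variable attributes declared by hand.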

Base SAS Approach – Challenges and Benefits

There are several challenges with relying on Base SAS alone to create SDTM domain data. A primary issue is the management of metadata, as Base SAS alone provides no metadata management. One thing to note about this program is that you need to write all the LENGTH and LABEL statements that define the SDTM metadata for the final domain data sets. Entering this type of metadata by hand is tedious, prone to error, and liable to result in inconsistencies across SDTM domain metadata for a trial. You also have no real control of the target metadata and no real-time validation that your resulting domain is valid SDTM data. When using this Base SAS approach, you also run into logistical and strategic issues with code maintenance and reusability of the SAS code. The Base SAS code itself can become difficult to read, which makes maintenance difficult. This kind of coding tends to be “one-off” in nature, resulting in limited reusability.

The primary advantage of the Base SAS approach (although some might actually consider it a disadvantage) is that you have no restrictions on what you can do with your SAS code. You have the full arsenal of Base SAS and can use any SAS procedure, macro code, or PROC SQL code to solve the problem of SDTM data conversion. Some programmers have taken the Base SAS approach to SDTM creation and have augmented it with commonly available tools such as Microsoft Access or Excel as a place to store and apply metadata. This augmented approach is better than Base SAS alone because you have your target SDTM metadata in a more manageable source, and you can consider the effort somewhat data-driven and less prone to metadata consistency errors.
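
To make the idea of storing target metadata outside the program more concrete, here is a minimal sketch, written in Python for illustration rather than SAS. It assumes the specification has been exported from a spreadsheet as CSV text; the column layout, the lengths, and the function names load_spec and check_record are hypothetical.

```python
import csv
import io

# Hypothetical metadata export (for example, saved from a spreadsheet).
# The lengths are illustrative and not taken from an official specification.
SPEC_CSV = """variable,label,type,length
STUDYID,Study Identifier,Char,12
DOMAIN,Domain Abbreviation,Char,2
USUBJID,Unique Subject Identifier,Char,20
SEX,Sex,Char,1
"""


def load_spec(text):
    """Parse the CSV specification into a list of dicts with integer lengths."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["length"] = int(row["length"])
    return rows


def check_record(record, spec):
    """Return a list of problems found when comparing one record to the spec."""
    problems = []
    expected = {row["variable"] for row in spec}
    # First pass: every variable in the spec must be present and fit its length.
    for row in spec:
        name = row["variable"]
        if name not in record:
            problems.append(f"{name}: missing")
        elif len(str(record[name])) > row["length"]:
            problems.append(f"{name}: longer than {row['length']}")
    # Second pass: flag variables that the spec does not define.
    for name in record:
        if name not in expected:
            problems.append(f"{name}: not in spec")
    return problems


spec = load_spec(SPEC_CSV)
good = {"STUDYID": "ABC123", "DOMAIN": "DM", "USUBJID": "ABC123-001", "SEX": "M"}
bad = {"STUDYID": "ABC123", "DOMAIN": "DM", "USUBJID": "ABC123-001",
       "SEX": "Male", "GENDER": "M"}

print(check_record(good, spec))
print(check_record(bad, spec))
```

Because the lengths and labels live in one table instead of being retyped in every program, a change to the specification only has to be made in one place.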

Implementing SDTM with SAS Enterprise Guide

A second approach to SDTM domain creation uses SAS Enterprise Guide, which features a graphical user interface and some additional features that facilitate SDTM file creation. The first step in this effort was to define a LIBREF called LIBRARY that points to the permanent format catalog associated with the source legacy data sets. Next, simply drag and drop the source data sets into the SAS Enterprise Guide Process Flow window. With the data in the process flow, it’s trivial to apply the PROC SORT-driven Sort task in SAS Enterprise Guide to sort the data by patient identifier. At this point, the SDTM conversion follows the same process used with the Base SAS solution: the data is merged, the SDTM variables are defined, and the permanent DM data set is saved just as shown in the Base SAS solution.

SAS Enterprise Guide Approach – Challenges and Benefits

Selecting SAS Enterprise Guide as the primary tool to create SDTM data results in challenges similar to those of using Base SAS alone. Metadata management is still lacking in this approach, and all the variable lengths and labels are still manually typed into the program code. Again, there is no real control of the target metadata and no real-time validation that your resulting domain is valid SDTM data. Although SAS Enterprise Guide provides the helpful Process Flow GUI, the “Tasks” that you can drop into your Process Flow are limited to sorting, appending, and transposing the data. The next section shows how the SAS Clinical Data Integration solution offers more data management tasks in the form of what it calls “Transformations” instead of “Tasks.”

There are some advantages to using SAS Enterprise Guide to create SDTM data over Base SAS alone. As with the Base SAS approach, there is always the full arsenal of Base SAS, and you can use any Base SAS procedure, SAS macro, or PROC SQL code to solve the problem of data conversion. However, with SAS Enterprise Guide you get some additional assistance in the form of automated “Tasks” that you can drag and drop into your project. You can see the PROC SORT-driven “Sort” task used in Exhibit 1 above, but there are other useful tasks for SDTM creation, such as the data splitting, data appending, and data transposing (rows to columns and columns to rows) tasks, that can be very helpful here. If we had a more difficult domain to create, these additional prepackaged “Tasks” could be included in the process flow and programming. Additionally, it is worth mentioning that with SAS Enterprise Guide 4.3, you get more of a true development environment in SAS than ever before.

SAS Enterprise Guide 4.3 includes the code completion facilities and interactive syntax guides found in other software development environments, which you will love as a SAS programmer. Because of the process flow view, the SDTM work tends to be more manageable and reusable in the long term, because the programming itself tends to be less like spaghetti code. Finally, just as with the Base SAS approach, the SAS Enterprise Guide approach can be used in conjunction with tools such as Microsoft Access or Excel to give you a minimal way of managing your SDTM metadata.

Implementing SDTM with SAS Clinical Data Integration

After exploring the creation of SDTM files with Base SAS and SAS Enterprise Guide, it is now a good idea to look at the “full monty” SAS approach to SDTM data creation using SAS Clinical Data Integration. SAS Clinical Data Integration is an ETL tool built on top of SAS Data Management that includes specific functionality to support clinical trials. To begin the same process in SAS Clinical Data Integration, drag and drop the SDTM DM domain from the metadata repository. That target domain already has its table-level and variable-level metadata defined, and it also includes appropriate integrity constraints on the data. Now that the target is defined, drag and drop the source data sets. The next step is to join the three source data sets via an SQL join, which is done by dragging and dropping the predefined “SQL Join” transform. The “Extract” transformation step you see in Exhibit 2 is where the SDTM DM variables get defined, in a process analogous to the Base SAS and SAS Enterprise Guide DATA step code in prior sections. Within SAS Clinical Data Integration, this is done through point-and-click PROC SQL code-building steps. The final step is to insert the “Table Loader” transformation, which takes the SAS data set from the “Extract” step and saves the permanent DM data set.
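
The join-then-extract pattern described above boils down to a single SQL query. The sketch below runs one against an in-memory SQLite database from Python, only to illustrate the shape of such a query. It is not the code that SAS Clinical Data Integration generates, SQLite's dialect differs from PROC SQL, and the table names, columns, and study identifier are hypothetical.

```python
import sqlite3

# In-memory database standing in for the three hypothetical source data sets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demog  (patid TEXT, birthdt TEXT, gender TEXT);
CREATE TABLE enroll (patid TEXT, siteid TEXT, trt TEXT);
CREATE TABLE dosing (patid TEXT, firstdose TEXT);
INSERT INTO demog  VALUES ('001', '1960-11-21', 'M'), ('002', '1975-03-09', 'F');
INSERT INTO enroll VALUES ('001', '10', 'Placebo'), ('002', '11', 'Active');
INSERT INTO dosing VALUES ('001', '2020-01-15');
""")

# Join the three sources on patient identifier and map the raw columns to
# DM variable names, roughly what the join and extract steps do together.
rows = conn.execute("""
SELECT 'ABC123'                  AS STUDYID,
       'DM'                      AS DOMAIN,
       'ABC123-' || d.patid      AS USUBJID,
       e.siteid                  AS SITEID,
       COALESCE(x.firstdose, '') AS RFSTDTC,
       d.gender                  AS SEX,
       e.trt                     AS ARM
FROM demog d
LEFT JOIN enroll e ON e.patid = d.patid
LEFT JOIN dosing x ON x.patid = d.patid
ORDER BY d.patid
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

The left joins keep every subject from the demographics source even when a dosing record is missing, which is why the second subject comes back with an empty RFSTDTC.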

SAS Clinical Data Integration Approach – Challenges and Benefits

Because SAS Clinical Data Integration handles many facets of SDTM data creation, the challenges are minimal. Probably the biggest challenge for a SAS programmer is learning to give up slinging Base SAS code and to rely on the tool to do the work. Additionally, SAS Clinical Data Integration relies on PROC SQL under the hood quite a bit, so “old-school” Base SAS programmers may need to improve their SQL skills. As with the Base SAS and SAS Enterprise Guide solutions, you can use any Base SAS procedures you need, but the key advantage of SAS Clinical Data Integration is its ability to manage your metadata. Therefore, avoid writing a lot of custom SAS code, because that limits the tool’s ability to control the work. It can be a bit of an adjustment to learn to work and program largely within the confines of the transforms available in SAS Clinical Data Integration. Metadata management is paramount, so a bit of setup is required to define your target metadata up front.

SAS Clinical Data Integration provides the same kind of process flow view and drag-and-drop tasks/transforms that SAS Enterprise Guide provides. More importantly, SAS Clinical Data Integration manages the metadata for your SDTM work, a benefit that neither Base SAS nor SAS Enterprise Guide can provide alone. It controls the target SDTM metadata, so compliance with a defined SDTM standard is built into the workflow. It also connects the metadata across your SDTM data creation jobs, so that you can analyze the impact of changes and updates and propagate a change across all of them.

Using SAS Clinical Data Integration as intended, with standard transforms, essentially enforces a degree of consistency of process in creating SDTM domains. This consistency, along with the process view, allows SDTM creation jobs to be more easily maintained and also allows jobs to be reused. SAS Clinical Data Integration also allows “typical” SDTM generation tasks, such as study day (--DY) or ISO date (--DTC) creation, to be standardized into user-written transforms that can be dragged and dropped into future SDTM jobs.
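
The two derivations named above are small enough to sketch. The Python functions below are illustrative only: the names to_iso_dtc and study_day are hypothetical, the raw date format is an assumption, and partial or missing date components are not handled. The study day logic follows the usual SDTM convention that there is no day 0: dates on or after the reference start date count from 1, and earlier dates count backward from -1.

```python
from datetime import date, datetime


def to_iso_dtc(raw, fmt="%m/%d/%Y"):
    """Convert a raw date string such as '01/15/2020' to ISO 8601 ('2020-01-15').

    The input format is an assumption; complete dates only.
    """
    if not raw:
        return ""
    return datetime.strptime(raw, fmt).date().isoformat()


def study_day(dtc, rfstdtc):
    """Return the study day of dtc relative to the reference start date.

    Both arguments are complete ISO 8601 dates (YYYY-MM-DD).
    Returns None when either date is missing.
    """
    if not dtc or not rfstdtc:
        return None
    delta = (date.fromisoformat(dtc) - date.fromisoformat(rfstdtc)).days
    # No day 0: the reference date itself is day 1, the day before is day -1.
    return delta + 1 if delta >= 0 else delta


visit_dtc = to_iso_dtc("01/20/2020")
print(visit_dtc, study_day(visit_dtc, "2020-01-15"))
```

Packaging logic like this once, whether as a user-written transform or a macro, is what keeps the derivation identical across domains and studies.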

Although SAS Enterprise Guide provides several common “tasks” that can be dragged and dropped into your process flow, SAS Clinical Data Integration provides a much more extensive list of transformations to choose from. Several of those are extremely handy for creating SDTM domains, including the sort, transpose, data joiner, lookup table, data extraction, and data loader transformations.

Finally, SAS Clinical Data Integration is integrated with the SAS Clinical Standards Toolkit associated with Base SAS software. There are pre-existing SAS Clinical Data Integration transformations that allow you to validate SDTM data sets based on the SDTM metadata and also to automatically create a define.xml file, which is a huge benefit.


There are multiple SAS approaches to the task of converting clinical trial data into the CDISC SDTM. This guide presented a Base SAS approach, a SAS Enterprise Guide approach, and a SAS Clinical Data Integration approach. The main distinction between the approaches is the move from little tool support to heavy, context-specific tool support for accomplishing the SDTM DM creation task. It used to be that when clinical SAS programmers were confronted by a data transformation task such as SDTM conversion, we had only Base SAS. Now there is a better GUI tool, SAS Enterprise Guide, that helps with SAS code development. More recently, SAS Clinical Data Integration has emerged as an exciting new industry-specific tool that is friendly to clinical trials and CDISC. SAS Clinical Data Integration gives us a way to manage our metadata and process that was not available before, while still allowing us to write Base SAS code when needed. It is key to having largely metadata-driven transformation processes that can scale to perform numerous data conversions in a reliable and efficient manner.

About SAS: SAS is a leader in analytics, and SAS programming is a highly sought-after advanced skill in today's data-driven world.

About Sankhyana: Sankhyana (SAS Authorized Training Partner in India) is a premium Clinical SAS training institute in India that offers online and live web training on SAS and data management tools.

