Sorting Through the Issues
Overview
This page is designed as a way to organize the issues around importing data into StudyTRAX from an existing database. "Out-of-the-Box" StudyTRAX uses MS Excel to import data. There is a good bit of functionality built in StudyTRAX to accommodate this, however, the questions and issues below should be addressed BEFORE importing data.
Current System
In order to prepare the data for import via Excel, it is helpful to know:
- What is used for the current repository (e.g., spreadsheet, database [e.g., MS Access], other software application, in a device)?
- What is the format? For example:
- One large spreadsheet with each row as a separate patient / visit
- In a relational database, if so what is the table structure
- In a device that can export to various formmats (e.g., CSV file)
- What is the state of the data?
- Are the fields clean? For example:
- If spreadsheet, only one value per cell
- Has there any validation rules been used (e.g., the "Gender" variable does NOT have 0s, 1s, Ms, Fs, ?s all mixed together)?
- Is there a single source for the data, or is it spread across multiple sources
- Are there different versions of the data source?
- Are the fields clean? For example:
Import Process
Depending on desired method of import, different processes options may be put in place and import options used. The main issues being whether it will be a one-time import, or will data be imported/updated from another system continuously (e.g., multiple occurrences over time)?
Something new / Something old
Often when converting data over form another system, there are a number of desired changes to be made. In short, a decision will need to be made on whether to import that data "As-Is" for historical reference, or will there be an attempt made to combine the old database and new project in StudyTRAX. If the later, typically there will need to be a way to identify subjects collected in the old system versus new. Also, errors of omission (i.e., old subjects not having values for the planned new fields [i.e., specificity errors]) will need to be discussed).
Additional considerations for effective imports:
StudyTRAX imports data utilizing the tabs as the means to reference different collection intervals. Those data points which are not collected on a longnitudinal basis are also collected using a unique tab in the spreadsheet. Ideally, subject data is imported in a blocked manner so as to reduce the amount of importing for any one subject. While importing can be repeating on any one subject's data fields, control can quickly become less available over time. Below is brief list of some of the most important tips to save you time, in preparation for StudyTRAX import...
- When importing data to StudyTRAX on a scheduled basis, It is invaluable to have the same contact extract the source data as cleanly as possible.
- Organization of this data should be cleaned in a procedural fashion, to be sure the same type of data is being imported into StudyTRAX.
- Separate all subject data as one row per subject and re-label all columns to fit the code names of the data fields defined in StudyTRAX
- If a screening interval is imported, but the screening dates are meaningless to the study, a dummy screening date will be defined as the date the import occurred.
- Make sure all longitudinal type data has a specific date column that defines the encounter date.
- Devise an import plan that best keeps the process procedurally organized. For example, if the number of total subjects in the source database is less than 100, we suggest importing data on a form by form basis for all subjects. This is especially true when importing longitudinal data.
- Be sure all column of unique data source types, fit the destination (StudyTRAX) data source type.
- If a "site variable" is defined in the source data, be sure this is modified to fit the destinations exact "site code" name.
Note: Your StudyTRAX support contact will ensure your data is imported using the appropriate techniques as suggested.