The data request process

The resources below will help you plan for your project and request the data you need from NACC. You may also want to review this information in the Checklist for Authors to help you prepare for publication and review.

1

Read about NACC Data for an overview of the available data sets.

3

Search ongoing and completed proposals to ensure that your work doesn't overlap with an existing project. It is the responsibility of the researcher, not NACC, to identify a unique research hypothesis and ensure there is no significant overlap with existing analyses.

4

Use the query system and the MRI preview system to get a sense of available NACC data.

5

Submit a data request answering a few brief questions about the project you have in mind. NACC will respond within three business days to acknowledge your request and ask for clarification, if needed.

6

Accept the terms of the NACC Data Use Agreement, which is designed to protect the privacy of study participants, ensure appropriate use of the data solely by the individuals identified in the data request, and ensure that NACC is acknowledged in any publications and kept informed before and during the publication submission process.

7

NACC provides your data set: Quick Access data files are distributed to you.


The NACC Handbook: A Researcher’s Guide

  • The Uniform Data Set (longitudinal follow-up)

    The Uniform Data Set (UDS) is the primary data set used by researchers interested in clinical data. The NIA/NIH Alzheimer's Disease Research Centers (ADRCs) began submitting UDS data to NACC in September 2005, using the UDS Forms to collect standardized clinical data from subjects who are evaluated on an approximately annual basis. Since 2005, the UDS forms have undergone two major revisions to reflect advances in the science and incorporate new diagnostic criteria. To combine data across the three versions, a Researcher’s Data Dictionary (RDD) was created. This document, the RDD-UDS, should be the first and primary resource for researchers analyzing NACC clinical and demographic data.

    As a resource for investigators in their analysis, NACC has provided a CSV file of all RDD-UDS variables and coding.

    For more information, please see the section below titled "Advice on research design and best variables to use."

    FTLD Module (frontotemporal lobar degeneration)

    Beginning in February 2012, a subset of UDS subjects have also been evaluated using the supplemental FTLD Module. At Centers participating in this voluntary effort, subjects with suspected FTLD and/or controls are evaluated with the FTLD Module in addition to the standard UDS Forms.

    LBD Module (Lewy body disease)

    Beginning in August 2017, a subset of UDS subjects have also been evaluated using the supplemental LBD Module. At Centers participating in this voluntary effort, subjects with suspected LBD and/or controls are evaluated with the LBD Module in addition to the standard UDS Forms. These data will be made available for research once sufficient numbers of participants and visits have accumulated.

    Researcher’s Data Dictionary: Neuropathology Data Set (autopsy data)

    The NP data set comprises subjects who have died and consented to autopsy. The NP data-collection form has undergone numerous revisions to reflect advances in the science and incorporate new diagnostic criteria. To combine data across versions, a Researcher’s Data Dictionary (RDD) was created. The RDD-NP should be the first and primary resource for researchers analyzing NACC neuropathology data.

    As a resource for investigators in their analysis, NACC has provided a CSV file of all RDD-NP variables and coding.

    Imaging available for download

    A subset of UDS subjects have one or more MRIs available to download as zip files (a subset because MRI submission is voluntary on the part of the Centers). Research structural MRIs are stored in DICOM and NIfTI formats and cover a variety of scan types (primarily T1, T2, FLAIR, and DTI). A very small subset of UDS subjects also have one or more amyloid PET scans available to download. Please note that PET scans are only available from a small number of Centers.

    Researcher’s Data Dictionary: Biomarker and Imaging Data Sets

    Three data dictionaries have been created to aid investigators in the analysis of NACC’s biomarker and imaging data sets. See below for a description of each of these data sets and their corresponding data dictionaries.

    Imaging data (MRI calculated volumes)

    Among the UDS subjects with MRI files stored at NACC, a subset have standardized calculated volume values (e.g., hippocampal volume). These data are provided to NACC by the IDeA lab at the University of California, Davis. Investigators requesting these data should review the description of the calculation methods and protocols. The Researcher’s Data Dictionary — Imaging Data (RDD-ID) includes MRI calculated volume variables, as well as variables associated with the DICOM and NIfTI files stored at NACC. For specific methods used to perform these calculations, please see this document provided by the IDeA Lab.

    CSF biomarker data (CSF Aβ, total Tau, p-Tau)

    For a small sample of UDS subjects, NACC stores CSF biomarker values from a single lumbar puncture or longitudinal lumbar punctures. The Data Element Dictionary — CSF (DED-CSF) describes the variables related to CSF biomarker data. Please note that these data come from a small number of Centers.

    Genetic data (APOE genotype, availability of genetic data)

    APOE genotype data are available for a large subset of UDS subjects. The Researcher’s Data Dictionary — Genetic Data (RDD-Gen) describes variables relating to APOE genotype data as well as variables indicating the availability of genetic data at the Alzheimer’s Disease Genetics Consortium (ADGC) and the NIA Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS).

    As a resource for investigators in their analysis, NACC has provided a CSV file of all RDD-Gen variables and coding.

    Minimum Data Set (MDS) (abstracted records)

    Before the UDS was implemented at Centers in 2005, data on Center subjects were collected retrospectively via data abstraction and were included in the MDS. Because of the limited scope of the MDS and its retrospective and cross-sectional nature, NACC recommends using data from the UDS rather than the MDS for most research proposals.

    1. Submit a data request. In doing so, you will be required to:
      1. Provide at least four or five sentences describing the research aims and hypotheses and statistical methods, as well as the main variables to be used in the analysis. Please follow the guidance below on how to describe these aims:
        1. State concisely the goals of the proposed research and summarize the expected outcome(s), including the impact that the results of the proposed research will exert on the research field(s) involved.
        2. List succinctly the specific objectives of the research proposed — e.g., to test a stated hypothesis, create a novel design, solve a specific problem, challenge an existing paradigm or clinical practice, address a critical barrier to progress in the field, or develop new technology. BE SPECIFIC.
      2. List the data sets that you would like to include in your analysis. For an overview, please refer to the "Data available from NACC" section above.
      3. Select the desired file type: CSV, SAS, or SPSS.
    2. Sign and date our electronic Data Use Agreement (DUA), which is linked at the end of the data request survey. If you are collaborating with researchers at one or more other institutions, please have the head researcher at each institution sign the data use agreement. NACC staff will provide you with the link to the DUA.
  • Standard data file download

    NACC data are posted for download on a secure website. Investigators receive a user name, password, and link to download their data files. Your download link remains active for a limited period of time. If you find that your link has expired and would like us to reactivate it, please email the NACC research support staff.

    Image file download

    • NACC will provide your MR images on Amazon Web Services (AWS) S3. NACC will send you an email containing an access key ID and secret access key. Below are some options available to download your MR image files:
      • If you prefer to use a software application, some options that are popular with our investigators are Cyberduck and S3 Browser. Both offer free versions and can connect to the S3 bucket, allowing you to download all the image files to your local computer.
      • If you prefer to use the command line, you can download your images using the AWS Command Line Interface (CLI). If you haven't used this before, you will need to install the CLI. Then, you can configure the CLI with the credentials above and proceed to download the bucket using the "aws s3 sync" command.
    • Once you have completed the downloads, please contact the NACC staff member who helped you obtain the data for administrative purposes. We ask that you complete your downloads within three weeks of receiving the access key from NACC.
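
As a rough illustration of the command-line route described above, the following Python sketch assembles the "aws s3 sync" call. The bucket name and destination folder are placeholders (NACC supplies the actual bucket), the function names are illustrative, and the sketch assumes the AWS CLI is installed and has already been configured with your access key ID and secret access key (for example, with "aws configure").

```python
import subprocess

def build_sync_command(bucket, dest_dir):
    """Return the `aws s3 sync` command that copies a bucket to a local folder."""
    return ["aws", "s3", "sync", f"s3://{bucket}", dest_dir]

def download_images(bucket, dest_dir, dry_run=True):
    """Run the sync; with dry_run=True, only return the command without running it."""
    cmd = build_sync_command(bucket, dest_dir)
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError if the CLI fails
    return cmd

# Placeholder names: replace with the bucket NACC gives you and your own folder.
print(" ".join(download_images("example-nacc-bucket", "./mri_images")))
```

Calling download_images(..., dry_run=False) would actually start the transfer, so it may be worth printing the command first to confirm the bucket and folder.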
  • If you have two or more NACC data sets that need to be merged, be sure to merge them by NACCID.

    If you are focusing your analysis on neuropathology data, please note that you will need to eliminate any longitudinal visits that you are not interested in. For example, if you are interested in matching a subject’s neuropathology data to the clinical data from the most recent visit before death, you will need to delete from your file any previous visits for that subject; for instance, if a subject has had five visits to date, you will need to restrict your file to the fifth visit for this subject, deleting visits 1 through 4.
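
The merge and the visit restriction described above might be sketched in Python as follows. The rows and ID values are made up for illustration, and the helper names are not part of any NACC tooling; in practice you would load the rows from the CSV files NACC provides.

```python
# Made-up rows standing in for a UDS file (long format) and a neuropathology file.
uds_rows = [
    {"NACCID": "NACC000001", "NACCVNUM": 1, "NACCAGE": 70},
    {"NACCID": "NACC000001", "NACCVNUM": 2, "NACCAGE": 71},
    {"NACCID": "NACC000002", "NACCVNUM": 1, "NACCAGE": 82},
]
np_rows = [{"NACCID": "NACC000001", "NACCBRAA": 6}]

def last_visit_per_subject(rows):
    """Keep only the row with the highest NACCVNUM for each NACCID."""
    latest = {}
    for row in rows:
        current = latest.get(row["NACCID"])
        if current is None or row["NACCVNUM"] > current["NACCVNUM"]:
            latest[row["NACCID"]] = row
    return latest

def merge_on_naccid(uds_by_id, np_rows):
    """Inner join: one merged row per subject present in both data sets."""
    return [
        {**uds_by_id[np_row["NACCID"]], **np_row}
        for np_row in np_rows
        if np_row["NACCID"] in uds_by_id
    ]

merged = merge_on_naccid(last_visit_per_subject(uds_rows), np_rows)
print(merged)
```

Restricting to the last visit before merging keeps the result at one row per subject, which avoids repeating the neuropathology values across visits.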

  • If you would like to narrow down the subjects in your file by certain criteria, please see the section below. For example, if you are interested in focusing on subjects with CSF biomarker data, be sure to request from NACC the pertinent CSF variable data set when submitting your data request.

    Please keep these important notes in mind when using NACC variables to restrict your sample:

    Restricting based on cognitive status and etiologic diagnosis:

    • On the UDS Clinician Diagnosis Form D1, subjects receive a diagnosis corresponding to cognitive status: normal cognition, impaired-not-MCI (mild cognitive impairment), MCI, or demented. Subjects also receive an etiologic diagnosis, that is, what the clinician suspected to be the cause (whether primary, contributing, or non-contributing) of any cognitive impairment. Both the variable on cognitive status (NACCUDSD) and one or more variables concerning etiologic diagnosis (for example, NACCALZD) will often be required to focus on a specific diagnostic group of interest.
    • For example, subjects with an etiologic diagnosis of Alzheimer’s disease (NACCALZD=1) can have a cognitive status of impaired-not-MCI, MCI, or dementia. The only way to focus on those with AD dementia is to use both the cognitive status variable (NACCUDSD=4) and the etiologic diagnosis variable (NACCALZD=1).
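
The AD dementia example above could be expressed as a simple two-condition filter. This is only a sketch: the rows are invented, and the only codes taken from the text are NACCUDSD=4 and NACCALZD=1; the other values simply stand for "anything else".

```python
def is_ad_dementia(visit):
    """True when cognitive status is dementia AND the etiologic diagnosis is AD."""
    return visit["NACCUDSD"] == 4 and visit["NACCALZD"] == 1

# Invented rows; codes other than NACCUDSD=4 / NACCALZD=1 are placeholders.
visits = [
    {"NACCID": "A", "NACCUDSD": 4, "NACCALZD": 1},  # dementia with AD etiology
    {"NACCID": "B", "NACCUDSD": 3, "NACCALZD": 1},  # AD etiology, but not dementia
    {"NACCID": "C", "NACCUDSD": 4, "NACCALZD": 0},  # dementia, but not AD etiology
]

ad_dementia_ids = [v["NACCID"] for v in visits if is_ad_dementia(v)]
print(ad_dementia_ids)
```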

    Restricting by diagnosis at a certain visit (e.g., MCI at the initial UDS visit):

    • To restrict to subjects with a diagnosis at the initial visit, select those who meet your criteria and have NACCVNUM=1.
    • To include subjects who have ever received the diagnosis of interest, you would look across all visits and determine whether the subject ever received that diagnosis at any UDS visit.
    • To restrict to subjects who have the diagnosis of interest at the most recent UDS visit, you would use the variables for visit number and/or visit date (NACCVNUM, VISITMO, VISITYR).
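
The three selection strategies above (initial visit, ever, most recent visit) can be sketched as follows. The rows are invented to exercise each case, the criterion used is dementia (NACCUDSD=4), and the function names are illustrative. The most recent visit is identified here by NACCVNUM; the visit date variables could be used instead.

```python
from collections import defaultdict

def ids_at_initial_visit(rows, meets):
    """Subjects who meet the criterion at NACCVNUM=1."""
    return {r["NACCID"] for r in rows if r["NACCVNUM"] == 1 and meets(r)}

def ids_ever(rows, meets):
    """Subjects who meet the criterion at any UDS visit."""
    return {r["NACCID"] for r in rows if meets(r)}

def ids_at_most_recent_visit(rows, meets):
    """Subjects who meet the criterion at their highest-numbered visit."""
    by_subject = defaultdict(list)
    for r in rows:
        by_subject[r["NACCID"]].append(r)
    return {
        naccid
        for naccid, visits in by_subject.items()
        if meets(max(visits, key=lambda r: r["NACCVNUM"]))
    }

# Invented rows; codes other than 4 are placeholders for "not dementia".
rows = [
    {"NACCID": "A", "NACCVNUM": 1, "NACCUDSD": 1},
    {"NACCID": "A", "NACCVNUM": 2, "NACCUDSD": 4},
    {"NACCID": "B", "NACCVNUM": 1, "NACCUDSD": 4},
    {"NACCID": "B", "NACCVNUM": 2, "NACCUDSD": 4},
    {"NACCID": "C", "NACCVNUM": 1, "NACCUDSD": 4},
    {"NACCID": "C", "NACCVNUM": 2, "NACCUDSD": 3},
]

def has_dementia(r):
    return r["NACCUDSD"] == 4

print(sorted(ids_at_initial_visit(rows, has_dementia)))
print(sorted(ids_ever(rows, has_dementia)))
print(sorted(ids_at_most_recent_visit(rows, has_dementia)))
```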

    Defining cognitive status based on the MMSE, MoCA, or Clinical Dementia Rating (CDR®) score:

    • Although many researchers choose to define cognitive status according to the clinician diagnosis provided on UDS Form D1, others choose to use the global CDR® score, the MMSE, or the MoCA. The following are examples of commonly used CDR® cut points: Normal cognition: CDR®=0; Mild cognitive impairment: CDR®=0.5; Demented: CDR®=1 (mild), 2 (moderate), or 3 (severe). For more information, see the section on Form B4 in the UDS Coding Guidebook. The MMSE score is sensitive to demographic and educational differences. Please consult the research literature to determine the best cut points for establishing cognitive status for your sample.
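
The example CDR® cut points listed above could be encoded as a small lookup. This is a sketch only: the function name and category labels are illustrative, and the name of the variable holding the global CDR® score in your file should be confirmed in the RDD-UDS.

```python
def cdr_category(cdr_global):
    """Map a global CDR score to the example categories listed above."""
    if cdr_global == 0:
        return "normal cognition"
    if cdr_global == 0.5:
        return "mild cognitive impairment"
    if cdr_global in (1, 2, 3):
        return "demented"
    return None  # missing code or unexpected value

for score in (0, 0.5, 1, 2, 3):
    print(score, cdr_category(score))
```

Returning None for anything outside the listed scores keeps missing codes from being silently assigned to a category.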

    Clinicopathological studies:

      If you are conducting a clinicopathological study, it may be advisable to restrict your sample to those who have clinical measures within one or two years of autopsy. The NACCINT variable, which indicates the months between the last UDS visit and death, can be used for this purpose.
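
A restriction on NACCINT might look like the sketch below. The rows are invented, the 24-month default reflects the "one or two years" suggestion above, and treating negative values as missing is an assumption; check the RDD for the actual missing codes of NACCINT.

```python
def within_months_of_death(rows, max_months=24):
    """Keep subjects whose last UDS visit was at most max_months before death."""
    # The lower bound of 0 drops negative values, assumed here to be missing codes.
    return [r for r in rows if 0 <= r["NACCINT"] <= max_months]

# Invented rows; -4 stands in for a missing value.
rows = [
    {"NACCID": "A", "NACCINT": 6},
    {"NACCID": "B", "NACCINT": 30},
    {"NACCID": "C", "NACCINT": -4},
]

kept_ids = [r["NACCID"] for r in within_months_of_death(rows)]
print(kept_ids)
```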

    Using the UDS data cross-sectionally:

    • To restrict to data from the initial visit, focus on visit data corresponding to NACCVNUM=1.
    • To restrict to data from the most recent visit, use the variables for visit number and/or visit date to determine the most recent visit (NACCVNUM, VISITMO, VISITYR).

    Restricting to those with non-missing data:

    • A number of missing codes have been included in the Researcher’s Data Dictionaries (RDD) to indicate why data are missing. To focus your analyses on subjects with non-missing data for a particular variable, be sure to exclude observations carrying that variable's missing codes, such as -4, 9, 99, and 995–998; if these codes are left in and treated as real values, they may skew your findings. Please refer to the section below titled "Missing data and changes in data collection among UDS versions" for additional details about missing codes.
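
The sketch below shows why missing codes need to be removed before computing summaries. The variable name and values are made up, and the set of missing codes must be looked up per variable in the RDD, since a value such as 9 can be a missing code for one variable and a valid value for another.

```python
def non_missing(rows, var, missing_codes):
    """Drop rows whose value for `var` is one of that variable's missing codes."""
    return [r for r in rows if r[var] not in missing_codes]

# Made-up variable and values; -4 and 99 stand in for its missing codes.
rows = [{"EXAMPLEVAR": 12}, {"EXAMPLEVAR": 99}, {"EXAMPLEVAR": -4}, {"EXAMPLEVAR": 16}]
kept = non_missing(rows, "EXAMPLEVAR", {-4, 99})

naive_mean = sum(r["EXAMPLEVAR"] for r in rows) / len(rows)  # missing codes included
clean_mean = sum(r["EXAMPLEVAR"] for r in kept) / len(kept)  # missing codes excluded
print(naive_mean, clean_mean)
```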

    Restricting by number of visits made:

    • The variables NACCAVST (total number of UDS visits) and NACCNVST (total number of in-person visits, excluding telephone visits) can be used to restrict to subjects with a minimum number of UDS visits completed.

    Restricting to data from a particular version of the UDS:

    • Use the FORMVER variable to determine the form version (e.g., FORMVER=1, which corresponds to version 1). To focus on subjects assessed with UDS version 3 forms, use FORMVER=3.
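
The two restrictions above (minimum number of visits and UDS form version) can be combined in one filter, sketched here with invented rows and an illustrative function name.

```python
def restrict(rows, min_visits=None, form_version=None):
    """Keep rows meeting a minimum NACCAVST and/or a specific FORMVER."""
    kept = []
    for r in rows:
        if min_visits is not None and r["NACCAVST"] < min_visits:
            continue
        if form_version is not None and r["FORMVER"] != form_version:
            continue
        kept.append(r)
    return kept

# Invented rows: one row per visit, NACCAVST repeated on each of a subject's rows.
rows = [
    {"NACCID": "A", "NACCVNUM": 1, "FORMVER": 2, "NACCAVST": 2},
    {"NACCID": "A", "NACCVNUM": 2, "FORMVER": 3, "NACCAVST": 2},
    {"NACCID": "B", "NACCVNUM": 1, "FORMVER": 3, "NACCAVST": 1},
]

v3_rows = restrict(rows, form_version=3)
repeat_visitors = restrict(rows, min_visits=2)
print(len(v3_rows), len(repeat_visitors))
```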
  • A custom file that NACC provides may include both longitudinal clinical data from the RDD-UDS (PDF) and neuropathology data from the RDD-NP (PDF).

    UDS data will be provided in long format, with one row of data per subject visit (so a subject with more than one visit completed will have more than one row of data). The illustration below shows how the data might look in your file, followed by a brief explanation.

    Subject ID  NACCVNUM  FORMVER  NACCAGE  SEX  NACCNEUR  NACCBRAA
    1           1         2        71       1    3         6
    1           2         2        89       1    3         6
    2           1         1        92       2    1         3
    3           1         1        63       1    3         6
    3           2         1        64       1    3         6
    3           3         2        65       1    3         6
    3           4         2        66       1    3         6
    3           5         2        67       1    3         6
    3           6         2        68       1    3         6
    3           7         2        69       1    3         6
    3           8         2        71       1    3         6
    3           9         2        71       1    3         6

    • Visit number: The chronological order of the visits is indicated by the NACCVNUM variable. For example, NACCVNUM=1 is the Initial Visit.
    • Number of visits completed: As you can see above, the first subject has completed 2 UDS visits, the second subject has completed 1 UDS visit, and the third subject has completed 9 UDS visits.
    • Form version: You can see that the first subject has only UDS version 2 data (FORMVER=2), the second subject has only UDS version 1 data (FORMVER=1), and the third subject has both UDS version 1 and version 2 data collected over time.
    • Neuropathology data are duplicated: The neuropathology data values are repeated/duplicated for each of a subject’s UDS visits (see NACCNEUR and NACCBRAA variables). For example, the third subject has the same value for NACCNEUR for all nine of the subject’s visits. Make sure you analyze the data at the subject level and do not double count those subjects who have more than one UDS visit.
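
To make the double-counting risk concrete, the sketch below tallies NACCBRAA for the three subjects in the illustration above, first per visit row and then per subject. Only the subject ID, visit number, and NACCBRAA columns are reproduced.

```python
from collections import Counter

# (subject ID, NACCVNUM, NACCBRAA) taken from the illustration above.
rows = [(1, 1, 6), (1, 2, 6), (2, 1, 3)] + [(3, v, 6) for v in range(1, 10)]

# Counting rows treats every visit as a separate autopsy result.
visit_level = Counter(braa for _, _, braa in rows)

# Collapsing to one value per subject first gives the correct tally.
braa_by_subject = {subject: braa for subject, _, braa in rows}
subject_level = Counter(braa_by_subject.values())

print(visit_level)
print(subject_level)
```

At the visit level the value 6 appears 11 times, although only two subjects actually have it.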

    Data freeze

    NACC data are frozen and archived approximately every three months. Your data set includes UDS data up through the data freeze indicated in the email sent by NACC. For example, if you are using data from the September 2015 data freeze, your file will include visit data collected and submitted to NACC from the beginning of the Uniform Data Set (September 2005) through the end of August 2015. If at any time you would like an updated version of your file, please submit an online request for an update.

    Visit number and visit date

    If you are using the longitudinal UDS data in your analysis, you can identify the order of the visits using the NACCVNUM variable (NACCVNUM=1 is initial visit; NACCVNUM=2 is first follow-up completed, etc.). However, the variables for the visit date (VISITMO, VISITDAY, and VISITYR) and days since Initial Visit (NACCFDYS) are generally better to use than the NACCVNUM variable when doing the following:

    • Selecting pertinent MRIs or other biomarker data
    • Time to event analyses
    • Longitudinal analyses such as linear mixed modeling
    • Calculating variables such as time since cognitive symptom onset to most recent visit

    Relating imaging data to UDS visits

    NACC does not associate MRI or amyloid PET scans with a particular UDS visit because often the scans and the UDS visit occur at different times, sometimes even years apart. We leave it up to investigators to match the scans to the UDS visit based on their study criteria. Investigators can determine which UDS visit is closest in time to a scan by comparing the UDS visit date variables (VISITMO, VISITDAY, VISITYR) and the MRI scan date variables (MRIMO, MRIYR, MRIDY) or PET scan date variables (APETMO, APETDY, APETYR).
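
One possible way to pick the UDS visit closest in time to an MRI scan is sketched below, using the date variables named above. The rows are invented, the helper names are illustrative, and a real analysis would likely add a maximum allowed gap based on the study criteria.

```python
from datetime import date

def visit_date(row):
    return date(row["VISITYR"], row["VISITMO"], row["VISITDAY"])

def mri_date(row):
    return date(row["MRIYR"], row["MRIMO"], row["MRIDY"])

def closest_visit(scan, visits):
    """Return the visit with the smallest absolute gap in days to the scan."""
    scanned_on = mri_date(scan)
    return min(visits, key=lambda v: abs((visit_date(v) - scanned_on).days))

# Invented rows for a single subject.
visits = [
    {"NACCVNUM": 1, "VISITYR": 2010, "VISITMO": 3, "VISITDAY": 15},
    {"NACCVNUM": 2, "VISITYR": 2011, "VISITMO": 4, "VISITDAY": 1},
]
scan = {"MRIYR": 2010, "MRIMO": 12, "MRIDY": 20}

match = closest_visit(scan, visits)
print(match["NACCVNUM"])
```

The same approach would apply to amyloid PET scans by building the date from APETYR, APETMO, and APETDY.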

    Telephone visits

    Telephone visits are completed by the subject’s co-participant when the subject is unable to attend an in-person visit. Visits completed over the telephone can be identified using the PACKET variable (PACKET=T). This variable can be used to exclude telephone visits if desired.
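
Excluding telephone visits then reduces to a one-line filter on PACKET, as in this sketch. The rows are invented, and the non-telephone PACKET values shown are placeholders; only PACKET=T is taken from the text above.

```python
def in_person_only(rows):
    """Drop visits completed over the telephone (PACKET=T)."""
    return [r for r in rows if r["PACKET"] != "T"]

# Invented rows; "I" and "F" are placeholder values for non-telephone packets.
rows = [
    {"NACCID": "A", "NACCVNUM": 1, "PACKET": "I"},
    {"NACCID": "A", "NACCVNUM": 2, "PACKET": "T"},
    {"NACCID": "A", "NACCVNUM": 3, "PACKET": "F"},
]

kept_visits = [r["NACCVNUM"] for r in in_person_only(rows)]
print(kept_visits)
```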

    Clinical diagnosis groups

    The manner of collecting the clinical diagnosis changed substantially from v1-v2 to v3 of Form D1 (Clinician Diagnosis); therefore, it is helpful to compare Form D1 for v2 and v3.

    Please also see the important note about "Cognitive status and etiologic diagnosis" under the section on "Narrowing your data set based on your eligibility criteria."

    Missing data and changes in data collection among UDS versions

    Please review the Researcher’s Data Dictionary for UDS data (RDD-UDS) (PDF) and Neuropathology data (RDD-NP) (PDF) to determine the missing/unknown codes for each variable. In most cases, the missing code is 8, 88, 888, 9, 99, or 999. Variables are coded as -4 if those particular variables were not collected for a given version of the UDS, or if a skip pattern on the form resulted in a missing value that could not be replaced by an implicit value based on the preceding question. To date, the UDS has been implemented in three versions. The version of the forms used for a subject’s visit can be determined using the FORMVER variable.

    Medication data

    Variables DRUG1 through DRUG40 correspond to each of the medications (up to 40) reported to be taken by the subject. For example, if a subject reports taking atenolol and losartan, then DRUG1=atenolol and DRUG2=losartan.
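
Because a given medication can appear in any of the 40 slots, it is usually easier to gather the slots into a single list per visit before searching. The sketch below assumes unused slots are blank or absent, which should be checked against the RDD; the function names are illustrative.

```python
def medications(row, max_drugs=40):
    """Collect the non-blank DRUG1..DRUG40 values for one visit, lowercased."""
    meds = []
    for i in range(1, max_drugs + 1):
        name = (row.get(f"DRUG{i}") or "").strip()
        if name:  # skip slots assumed to be blank or absent when unused
            meds.append(name.lower())
    return meds

def takes(row, drug_name):
    """True if the named medication appears in any DRUG slot for this visit."""
    return drug_name.lower() in medications(row)

visit = {"DRUG1": "atenolol", "DRUG2": "losartan", "DRUG3": ""}
print(medications(visit), takes(visit, "Losartan"))
```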

    Data values that can vary over time (e.g., age of onset of cognitive decline)

    Some variables are collected at each UDS visit, and because different clinicians perform the assessments over time, or because the subject’s symptoms have changed over time, the values for these longitudinally collected variables change over time. In particular, if you are using the Form B9 variable for age of onset of cognitive decline (DECAGE) or first predominant symptoms, be sure to ascertain whether values for these variables have changed over time. In some cases, it may be best to use the most recent non-missing value, because it may represent the best data available. In other cases, it may be better to use the value at the initial UDS visit, depending on your study design.
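
The two options described above (most recent non-missing value versus initial-visit value) might be compared as in the following sketch. The rows are invented, and 999 is used as a stand-in missing code for DECAGE; confirm the actual missing codes in the RDD-UDS.

```python
def most_recent_non_missing(visits, var, missing_codes):
    """Walk visits from latest to earliest and return the first usable value."""
    for v in sorted(visits, key=lambda r: r["NACCVNUM"], reverse=True):
        if v[var] not in missing_codes:
            return v[var]
    return None

def distinct_values(visits, var, missing_codes):
    """Set of non-missing values reported across a subject's visits."""
    return {v[var] for v in visits if v[var] not in missing_codes}

# Invented rows for one subject; 999 stands in for a missing code.
visits = [
    {"NACCVNUM": 1, "DECAGE": 68},
    {"NACCVNUM": 2, "DECAGE": 70},
    {"NACCVNUM": 3, "DECAGE": 999},
]
MISSING = {999}

latest = most_recent_non_missing(visits, "DECAGE", MISSING)
initial = visits[0]["DECAGE"]
changed = len(distinct_values(visits, "DECAGE", MISSING)) > 1
print(latest, initial, changed)
```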

  • UDS Form A5 data

    The data collected using Subject Health History Form A5 is usually focused on health conditions reported by the subject or co-participant; therefore, caution should be taken when using the associated variables in your analysis. If the health condition reported on Form A5 is also collected on other UDS forms, it may be advisable to use data from the other form, especially if the other form is based on clinician judgment (e.g., Forms D1 and D2). For example, Form A5 collects data on Parkinson’s disease; however, in most cases, it is advisable to use the data on Parkinson’s disease from the Clinician Diagnosis Form D1 instead, since it is diagnosed by the clinician, not reported by the subject. Alternatively, you may wish to consider examining all relevant data collected on the health condition to determine the best variable(s) to use.

    Mini Mental State Exam (MMSE, UDSv2) or MoCA (UDSv3) versus Clinical Dementia Rating (CDR®)

    The MMSE/MoCA and CDR® capture different information and have different measurement properties. The choice of the instrument depends on your analysis goals. The MMSE (replaced by the MoCA in UDS version 3) is good for screening and staging moderate and severe dementia, whereas the CDR® can measure (non-subtle) progression of cognitive and functional decline.

    Using NACC data for incidence and prevalence rates

    NACC data are not suited for an analysis of dementia incidence or prevalence at a city, state, or national level. This is because the sample is not population-based. Recruitment protocols differ by Center; depending on the ADRC, subjects may or may not have been randomly selected. Therefore, the NACC data are best viewed as a case series, and caution should be exercised when developing research aims surrounding the NACC data and when interpreting the results.

    Death, dropout, and discontinuation

    ADRCs periodically notify NACC, via the Milestones Form, about UDS subjects who have died, have dropped out of the study, or have been discontinued by the Center for other reasons. The frequency with which Centers provide these data varies by Center; therefore, researchers should be cautious in the interpretation of any analysis in which data on death, dropout, or discontinuation are used.

    Differences in the neuropathology database among UDS and MDS subjects

    All UDS subjects who died and consented to autopsy have data collected using NACC’s Neuropathology (NP) Form (version 8, 9, or 10). In the ADRCs’ earlier data set, the Minimum Data Set (MDS), some autopsied subjects have data based on earlier versions of the NP Form, if their autopsies were conducted after the NP Form was implemented in 2002. The remaining autopsied MDS subjects have limited NP data that were collected within the MDS Form. If you will be obtaining autopsy data for MDS subjects, a NACC consultant will help guide you through the data availability.

    Difference between primary neuropathologic diagnosis and neuropathologic features

    The latest version of the NP Form (version 10) has been updated to reflect the most current AD and FTLD neuropathological criteria. NP v10 focuses on assessing neuropathological features rather than the primary neuropathological diagnosis, as had been the case in previous versions of the NP Form. Upon request, researchers can obtain the data on primary neuropathological diagnosis as was collected in NP Form versions 1 through 9.

    Inconsistencies among the CDR®, neuropsychological testing, and the clinical diagnosis

    A small number of UDS subjects have seeming inconsistencies among their scores on the CDR®, their neuropsychological test results, and/or their clinical diagnoses. For example, a subject may have a clinical diagnosis of dementia on Form D1 but a CDR® of 0 (no impairment). These inconsistencies are generally verified as correct by Centers and may occur because different clinicians complete different parts of the UDS assessment.

    ZIP codes

    To protect the confidentiality of the data, NACC generally does not provide ZIP code data to researchers.

    Generalizability

    Because UDS recruitment methods vary by ADRC and over time, UDS participants are best described as a clinical case series of patients from each ADRC.

  • Please remember to review the NACC Checklist for Authors and submit ALL abstracts and publications using these data via the Manuscript/Abstract Submission system.

    All publications, posters, and presentations using NACC data must include the NACC grant number, as seen in the Checklist for Authors.

    NACC asks that you notify us of any changes in the status of your manuscript after you initially submit it to us (accepted for publication, change of journal, etc.). We generally do not provide a second review of the manuscript if you have edited it to submit to a new journal, but we do ask that you submit the edited manuscript to us for our records (on the submission web page, please choose one of the bottom two categories to indicate either a minor or major revision).

    If your manuscript is accepted for publication, please ensure that you are in compliance with the NIH Public Access Policy. For more information on the PMC submission process, please review the "Resources for navigating PMC submission."

    Until we receive the PMCID for your manuscript, NACC will send you periodic requests for status updates (e.g., accepted for publication, change of journal). Please email NACC with questions on status updates and PMCIDs.

  • Non-imaging biospecimens and genetic data

    Please refer to the section on NACC Partnerships for information on requesting biospecimens from NCRAD or genetic data from ADGC or NIAGADS.

    Biospecimens available at the ADRCs

    Brain tissue is stored by ADRCs on a subset of their UDS subjects, and the Centers may be willing to share specimens with outside investigators. To take the next steps, researchers can use the NACC biospecimen locator or submit a tissue location request.

    Other data available at the ADRCs

    Centers may store additional data of interest, such as scores on neuropsychological tests that were performed as part of the UDS visit but not submitted to NACC because they are not part of the standardized UDS protocol. NACC is not usually aware of additional data stored at the Centers. Generally, the best way to identify subjects with the data desired will be to determine which Centers may have data you require, based on published research found through a PubMed search or a search of the individual Centers’ websites, and then to contact the Center(s) directly. If you mention any previous contact with NACC in your communication with the Center, please include a statement like the following: "Please note that this is neither a NACC request nor a NACC initiative, and your participation is entirely voluntary."