
Research Data Files



Medicare HOS Research Data Files

Medicare HOS data files are available for research purposes in three forms: public use files (PUFs), limited data sets (LDSs), and research identifiable files (RIFs). Please note that the HOS PUFs are not intended to be generalizable or to be used for national estimates.

HOS PUFs contain most of the survey items collected on the HOS instrument (excluding beneficiary identifying information) as well as selected additional administrative variables. HOS PUFs are constructed to prevent the identification of any single beneficiary or Medicare Advantage Organization (MAO) and only respondents to the survey are included in the files. HOS PUFs are available at no cost and can be downloaded directly from this page (see below for additional information).

Medicare HOS Public Use Data Files (PUFs)

To facilitate the dissemination of data collected by the Medicare HOS project for additional research, PUFs have been created for each cohort of data. The files have been constructed in accordance with current CMS and Department of Health and Human Services (HHS) policies and other applicable statutes and laws. All identifying information has been excluded from the files, and demographic categories (i.e., age and race) have been aggregated such that identification of any given individual is not possible.

Two distinct categories of PUFs have been generated:

  1. Baseline PUFs contain the data collected during a given baseline survey administration.
  2. Analytic PUFs contain the merged baseline and two-year follow up files as well as supplemental variables.

Baseline PUFs contain data for all respondents in a new cohort. Analytic PUFs contain a completed cohort of data for all baseline respondents and are constructed to be self-contained, with a baseline component and a follow up component, if available, for each beneficiary's record. There is no field that allows identification of a particular individual across the cohorts in the analytic PUFs; however, baseline PUFs have been constructed with a unique anonymous ID field that does allow identification of the same individual across multiple baseline cohorts.
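
As an illustration of how the anonymous ID in the baseline PUFs might be used, the following minimal Python sketch finds IDs that appear in more than one baseline cohort. The field name ANON_ID, the function name, and the sample records are hypothetical; the actual field name and layout for each cohort are documented in the corresponding Data Users Guide.

```python
from collections import defaultdict

# Hypothetical field name; the real anonymous ID field is documented in the
# Data Users Guide for each Baseline PUF.
ID_FIELD = "ANON_ID"


def link_baseline_cohorts(cohorts, id_field=ID_FIELD):
    """Return {anonymous ID: sorted cohort numbers} for IDs seen in 2+ cohorts."""
    seen = defaultdict(set)
    for cohort_number, records in cohorts.items():
        for record in records:
            seen[record[id_field]].add(cohort_number)
    return {
        anon_id: sorted(numbers)
        for anon_id, numbers in seen.items()
        if len(numbers) > 1
    }


# Made-up records standing in for rows already loaded from two Baseline PUFs.
example_cohorts = {
    21: [{"ANON_ID": "X001"}, {"ANON_ID": "X002"}],
    22: [{"ANON_ID": "X002"}, {"ANON_ID": "X003"}],
}
repeat_respondents = link_baseline_cohorts(example_cohorts)
```

This approach applies only to baseline PUFs. The analytic PUFs carry no field that identifies an individual across cohorts, so the same lookup is not possible there.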

The PUFs are available in cohort-specific downloadable ZIP files. For Cohorts 1-22 Baseline and Cohorts 1-20 Analytic, the PUFs are formatted as fixed-width ASCII (flat) files. Beginning with the Cohort 23 Baseline and Cohort 21 Analytic, the PUFs are formatted as comma separated values (CSV) files that can be opened with any spreadsheet application. SAS® code is available by cohort to import the ASCII files for Cohorts 1-20 Analytic PUFs and create SAS® data sets. Each import program creates a cohort specific SAS® data set containing field names and labels. The PUF files as well as the Analytic PUF Import Code SAS® programs are available for download from the dropdown boxes below.
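
For users who do not work in SAS®, the two file formats could be read along the lines of the following Python sketch. It is an illustration only: the field names, column positions, and sample rows are invented, and the real layout for each cohort must be taken from the corresponding Data Users Guide.

```python
import csv
import io

# Hypothetical layout: (field name, start, end) with zero-based offsets and an
# exclusive end. Real positions come from the cohort's Data Users Guide.
EXAMPLE_LAYOUT = [
    ("CASE_ID", 0, 8),
    ("AGE_GROUP", 8, 9),
    ("RACE_GROUP", 9, 10),
]


def parse_fixed_width(lines, layout):
    """Slice each fixed-width line into a dict of stripped string fields."""
    records = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        records.append(
            {name: line[start:end].strip() for name, start, end in layout}
        )
    return records


def parse_csv(handle):
    """Read a CSV file with a header row into a list of dicts."""
    return list(csv.DictReader(handle))


# Made-up sample data standing in for an extracted PUF in each format.
fixed_sample = io.StringIO("A000000121\nA000000232\n")
csv_sample = io.StringIO(
    "CASE_ID,AGE_GROUP,RACE_GROUP\nA0000001,2,1\nA0000002,3,2\n"
)

fixed_records = parse_fixed_width(fixed_sample, EXAMPLE_LAYOUT)
csv_records = parse_csv(csv_sample)
```

In practice the in-memory samples would be replaced by files opened from the extracted ZIP archive. Both readers return every value as a string, so numeric fields need to be converted explicitly.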

Additional documentation that details the construction and content of each PUF is available in the corresponding Data Users Guide (DUG). The HOS PUF DUGs are available from the Data Users Guides section. These DUGs present detailed documentation regarding file construction and contents for all data sets distributed by this program. If you have any questions regarding the PUFs, please feel free to contact the Medicare HOS Information and Technical Support Line.

PUF Data Files for Download


PUF Data Files in Multiple Formats

PUF data files are also available for download from the Inter-University Consortium for Political and Social Research (ICPSR) website. ICPSR is a unit within the Institute for Social Research at the University of Michigan and is an international consortium of academic institutions and research organizations that maintains a data archive of research files in the social sciences. Users may download PUFs created in SAS, SPSS, Stata, and tab separated values (TSV) formats from the ICPSR website. The 1998-2012 HOS Cohorts 1-15 Baseline, the 2000-2004 Cohorts 1-5 Follow Up, and the 1998-2000 Cohort 1 Analytic through 2012-2014 Cohort 15 Analytic PUFs are available for download at no cost. A login account and online Terms of Use Agreement are required to download the files from ICPSR.

Medicare HOS Limited Data Set (LDS) and Research Identifiable File (RIF)

HOS LDSs and RIFs comprise the entire national sample for a given cohort (including both respondents and non-respondents) and contain all the HOS survey items. The RIFs contain all the variables in the LDSs; however, the RIFs also include specific direct person identifiers and plan identifiers that are excluded or modified in the LDSs. These data files are available as SAS data sets. A signed Data Use Agreement (DUA) with CMS is required to obtain either LDS or RIF data files.

The RIFs contain direct person identifiers (i.e., name, address, Medicare Beneficiary Identifier [MBI], Medicare Health Insurance Claim [HIC] number where available, and Social Security Number [SSN] where available) that allow identification of the same individuals across multiple cohorts. Note that HIC numbers are no longer included in RIFs beginning with Cohort 22 and SSNs are no longer included in RIFs beginning with Cohort 21. The RIFs also include plan identifiers and plan characteristics for the participating MAOs, such as MAO contract number, enrollment at sampling, and plan name. The HOS LDSs retain some protected beneficiary-level health information such as date of birth, gender, race/ethnicity, and county of residence; however, specific direct person identifiers (i.e., name, address, MBI, HIC number, and SSN) are not included in the LDSs, as outlined in the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule. Additionally, the MAO contract number is blinded in the LDS and certain fields describing MAOs have been modified (i.e., categorical enrollment) or excluded (i.e., plan name) to prevent identification of specific MAO contracts.

The downloadable LDS file specifications documents detail the characteristics of the merged baseline and follow up LDS for each available cohort of data that used the specified version(s) of the HOS (e.g., HOS 1.0, HOS 2.0, HOS 2.5, or HOS 3.0). The documentation describes the field name/description, field type, field length, additional information (including valid values), and indication of field inclusion or exclusion for the fields in each cohort file. The File Specifications documents for Cohorts 1-22 are linked below.

HOS LDS Requests

All research requests for LDS files must be submitted through the CMS Limited Data Set File Process. Instructions are available here: Limited Data Set (LDS) Files. The Medicare HOS Information and Technical Support at hos@hsag.com remains available to answer questions about the HOS LDS files. Questions about topics such as the availability of specific data cohorts and variables should be directed to the technical support email before contacting CMS. Technical support is also available for questions about the feasibility of using the LDS files to address specific research aims.

HOS RIF Requests

Requests for HOS RIF files will continue to be processed through the Research Data Assistance Center (ResDAC) at the University of Minnesota, a CMS contractor that provides assistance to academic, government, and non-profit researchers interested in using Medicare and/or Medicaid data. ResDAC is available to assist in the completion and/or review of data requisition forms for Medicare HOS research data files prior to their submission to CMS. For additional information and assistance obtaining Medicare HOS RIF files, please visit the ResDAC HOS page. ResDAC may also be contacted by calling (888)-9RESDAC (888-973-7322) between 8 a.m. and 4 p.m. CT, Monday through Friday, or by emailing resdac@umn.edu.

National Cancer Institute (NCI) SEER - MHOS Linked Data

The Surveillance, Epidemiology, and End Results (SEER) and Medicare Health Outcomes Survey (MHOS) linked data sets are a resource available to cancer researchers. These data sets link data on cancer patients to patient-reported outcome measures and give researchers the potential to investigate the health status and health related quality of life of older adults enrolled in Medicare Advantage Organizations with and without a cancer diagnosis. The SEER-MHOS linked data sets now available include HOS data from the baseline and follow up surveys for Cohorts 1-20, collected during 1998-2019. A flag available in NCI’s SEER*Stat statistical software identifies SEER cancer patients who responded to the MHOS surveys, as well as the number of surveys completed before and after diagnosis. This flag helps researchers develop a research proposal by giving them a rough estimate of the number of individuals who have been diagnosed with the cancer site of interest and who completed the MHOS before and after diagnosis. Researchers who are interested in using the SEER-MHOS linked data in their investigations can find information about obtaining the SEER-MHOS data set at the NCI SEER-MHOS page.

A technical report titled “Validation of health-related quality of life scales using the VR-12 in the SEER-MHOS Data Resource” provides a validated method for scoring the Eight Scales from the VR-12 with the Eight Scales of the SF-36® for the Medicare HOS cohorts. Extensive documentation and scoring algorithms are available at no cost from Dr. Lewis Kazis at Boston University School of Public Health or the National Cancer Institute.

For access to the full technical report and SAS code to generate the algorithms, please email SEER-MHOS@hsag.com.

Additional information about the VR-12 is available from Dr. Kazis at:  

Lewis E. Kazis, Sc.D.
Professor of Health Law, Policy & Management
Director, Center for the Assessment of Pharmaceutical Practices (CAPP)
Department of Health Law, Policy & Management
Boston University School of Public Health
715 Albany Street, Talbot 1 West
Boston MA 02118
Telephone: 617-414-1418
E-mail: lek@bu.edu



This page was last modified on 02/15/2024


This site is hosted and maintained by Health Services Advisory Group.
It is sponsored by the Centers for Medicare & Medicaid Services.