Government agencies and civil organizations collect, consolidate and publish Social Determinants of Health (SDOH) data. Public data sources provide social factors and health measures that can be used for research, predictive analytics and forecasting. For an overview of AI and SDOH data, see Social Determinants of Health Data and SDOH and AI Solutions.
Sources can generally be categorized as social factors or health indicators, although some sources include both. Sources can also include data on health behaviors. Whether to treat these as social factors is a judgment call, but a case can be made for including them because health behaviors can reflect the interplay between people and social factors. See Social Determinants and Health Behaviors: Conceptual Frames and Empirical Advances.
Social Factors
Social factors data is available from a number of agencies and organizations, but sources can follow different standards, cover different time periods, use different geographic definitions and define metrics differently. The methods of data access via downloads and APIs also vary widely. An excellent repository for U.S. SDOH data sources is available from the Agency for Healthcare Research and Quality (AHRQ). The AHRQ consolidates the sources into a single data structure. Geographic categories are provided for census tract, county and zip code. Measures are provided as counts and as ratios using populations. You can download data and codebooks, and documentation on the data sources is available. For more info on using AHRQ sources, see Finding the Best SDOH Factors for Your Data Science Project. The major sources included in the database are:
| The American Community Survey is a U.S. Census Bureau nationwide survey of social, economic, housing, and demographic data. Data are collected continuously throughout the year and pooled across a calendar year to produce estimates for that year. |
| The Environmental Protection Agency: National Walkability Index is a nationwide geographic data resource that ranks block groups according to their relative walkability. |
| The USDA Food Access Research Atlas provides food access data for populations within census tracts using measures of supermarket accessibility. |
| The USDA Food Environment Atlas provides indicators of food choices and diet quality to provide a spatial overview of a community’s ability to access healthy food and its success in doing so. |
| The CMS Provider of Services File – Hospital & Non-Hospital Facilities provides data on healthcare provider types including demographic information and the type of services provided. |
| The U.S. Census Bureau’s Small Area Income and Poverty Estimates produces estimates of income and poverty for U.S. states and counties as well as estimates of school-age children in poverty for school districts. |
| The Common Core of Data is the Department of Education’s primary database on public elementary and secondary education in the United States. |
| The Health Resources and Services Administration’s Area Health Resources Files provides data on health care professions, health facilities, populations, economics, hospital utilization, hospital expenditures, and environment. |
| The CDC Social Vulnerability Index uses U.S. Census data to determine the social vulnerability of census tracts. The Index ranks each tract on social factors, including poverty, lack of vehicle access, and crowded housing. |
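The relationship between the count and ratio measures mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the AHRQ's own method: the function name, the field names and the sample values are all hypothetical, and the real column names should be taken from the AHRQ codebooks.

```python
def percent_of_population(count, population):
    """Convert a raw count to a percentage of the population.

    Returns None when the population is missing or zero, so that
    an undefined ratio is not silently reported as 0.
    """
    if not population:
        return None
    return 100 * count / population


# Hypothetical county rows; the values are invented for illustration only.
rows = [
    {"county": "A", "households_no_vehicle": 50, "population": 10000},
    {"county": "B", "households_no_vehicle": 30, "population": 0},
]

percentages = {
    r["county"]: percent_of_population(r["households_no_vehicle"], r["population"])
    for r in rows
}
print(percentages)
```

Working with ratios rather than counts makes areas of very different size comparable, which matters when the same measure is examined at census tract, county and zip code level.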
There are several important sources for factors that are not included in the AHRQ database:
| Source | Description | Metrics | Geo Level |
| Federal Bureau of Investigation Crime Data Explorer | The Crime Data Explorer provides a view of estimated national and state data, crime statistics, and graphs of specific variables from the National Incident-Based Reporting System (NIBRS). | counts | state |
| Local Area Unemployment Statistics | US Bureau of Labor Statistics portal provides data on unemployment rates by month and 12-month net changes. | percentage | county, metro area |
Health Indicators
Health indicators are derived from patient data, Medicare claims and surveys. Like social factors data, sources have different formats, time periods, geographic definitions and metric definitions. Important sources for indicators are outlined below.
| Source | Description | Metrics | Geo Level |
| CMS Chronic Disease Data | Selected chronic conditions among Medicare beneficiaries. The dataset contains prevalence, use and spending based on claims. | counts, percentage | county |
| County Health Rankings | The County Health Rankings, a program of the University of Wisconsin Population Health Institute, measures the health of nearly all counties in the nation. The major source is the Behavioral Risk Factor Surveillance System telephone survey. | percentage | county |
| PLACES: Local Data for Better Health | PLACES applies small area estimation methods to the Behavioral Risk Factor Surveillance System telephone survey data to obtain 29 chronic disease measures. | percentage | county, zip code, metro area |
| CDC U.S. Chronic Disease Indicators | Chronic Disease Indicators data are obtained from several sources including vital statistics, disease registries, national health surveys, inpatient and emergency department databases, Medicare claims data, policy tracking systems, and the U.S. Census. | counts | state |
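Because social factor and health indicator sources use different formats and geographic definitions, combining them usually starts with agreeing on a common geographic key. The sketch below assumes both sources identify counties by a FIPS code, which is a 5-digit string whose leading zeros are often lost when a file is read as numbers. The helper names, field names and sample values are hypothetical and only show the idea.

```python
def normalize_county_fips(code):
    """Return a 5-character, zero-padded county FIPS string.

    Assumes the code is an integer or a digit string (not a float).
    """
    return str(code).strip().zfill(5)


def join_by_county(social_rows, health_rows, key="fips"):
    """Inner-join two lists of dict rows on the normalized county key."""
    health_by_fips = {normalize_county_fips(r[key]): r for r in health_rows}
    joined = []
    for s in social_rows:
        fips = normalize_county_fips(s[key])
        h = health_by_fips.get(fips)
        if h is None:
            continue  # county is missing from the health source
        merged = {key: fips}
        merged.update({k: v for k, v in s.items() if k != key})
        merged.update({k: v for k, v in h.items() if k != key})
        joined.append(merged)
    return joined


# Hypothetical rows; the values are invented for illustration only.
social = [{"fips": 1001, "unemployment_pct": 3.1},
          {"fips": 1003, "unemployment_pct": 2.9}]
health = [{"fips": "01001", "diabetes_pct": 11.0}]

combined = join_by_county(social, health)
print(combined)
```

An inner join is used here so that only counties present in both sources are kept. Sources published only at state level, such as some of those listed above, would need to be matched on a state key instead.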
Public SDOH data is also available from state, county and metro area agencies and civic organizations. You can also find specialized data sources from data.world, data.gov and openICPSR.
