Social determinants of health (SDOH) are the conditions in which people are born, grow, live, work, and age. They include factors such as socioeconomic status, education, neighborhood and physical environment, employment, social support networks, and access to health care. The health outcomes they influence include chronic conditions, behavioral health, functional limitations, and morbidity, as well as health expenditures.

Several frameworks define the set of factors in SDOH, including those from the World Health Organization, the Healthy People 2030 initiative, and the Kaiser Family Foundation (KFF). The figure below provides an outline of the KFF framework.

Image by KFF

A case can also be made for including health behaviors in SDOH, because behaviors reflect the interplay between people and social factors. See Social Determinants and Health Behaviors: Conceptual Frames and Empirical Advances.

Sources of SDOH include the following:

  • Individual Level Data in Health Records: Data is collected directly from individuals and captured in their health records. Data may be collected through health networks, schools, and government organizations such as the Centers for Medicare & Medicaid Services (CMS).
  • Individual Level Data based on Data Mining: Organizations provide data mining services that assemble data on individuals from public, private, and derived data sources. For example, see Socioeconomic Health Score | LexisNexis Risk Solutions.
  • Population Level Data: Government agencies and civil organizations collect, consolidate and publish SDOH data. Organizations include the Centers for Medicare & Medicaid Services (CMS), the Census Bureau, the Department of Labor, the Department of Transportation, and the Department of Education.

Analysis of SDOH data presents a number of challenges. Data sources have different structures and standards, cover different time periods, use different geographic definitions, and define metrics differently. Some of the important standards initiatives are:

  • Project Open Data Metadata Schema: The US Federal Data Strategy calls for accelerating the creation and adoption of data standards across agencies. The Data Catalog Vocabulary (DCAT-US) schema is the standardized metadata specification for describing datasets and APIs within an agency’s data inventory. Initiatives are in place to implement the schema in many agencies.
  • Data Documentation Initiative (DDI): DDI is an international standard for describing the data produced by surveys and other observational methods in the social, behavioral, economic, and health sciences.
  • Gravity Project: The Gravity Project is developing standards for exchanging SDOH data between US healthcare organizations.
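
To make the metadata idea concrete, the following is a minimal sketch of a dataset entry in the style of the Project Open Data (DCAT-US) v1.1 schema, with a small check for missing fields. It assumes the set of always-required fields from that schema version; the entry itself (title, publisher, identifier, contact) is entirely hypothetical, and the helper name missing_fields is introduced here only for illustration.

```python
import json

# Fields assumed to be required for every dataset in the Project Open Data
# v1.1 schema. Federal agencies also supply bureauCode and programCode,
# which are left out of this sketch.
REQUIRED_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)


def missing_fields(entry: dict) -> list:
    """Return the required fields that are absent or empty in an entry."""
    return [name for name in REQUIRED_FIELDS if not entry.get(name)]


# Hypothetical dataset entry; every value is made up for illustration.
example_entry = {
    "title": "County Food Access Indicators",
    "description": "Example county-level food access measures.",
    "keyword": ["sdoh", "food access"],
    "modified": "2023-01-15",
    "publisher": {"@type": "org:Organization", "name": "Example Agency"},
    "contactPoint": {
        "@type": "vcard:Contact",
        "fn": "Data Steward",
        "hasEmail": "mailto:steward@example.org",
    },
    "identifier": "example-agency-food-access-001",
    "accessLevel": "public",
}

print(json.dumps(example_entry, indent=2))
print("Missing fields:", missing_fields(example_entry))
```

A check like this is only a first pass; a real catalog would more likely validate entries against the published JSON schema.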

To address the lack of current standards and data uniformity, some organizations maintain curated repositories to provide consolidated, uniform SDOH data structures and metrics. For example, the Agency for Healthcare Research and Quality (AHRQ) curates many data sources and provides common geographic identifiers in the AHRQ Social Determinants of Health Database.
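
The value of common geographic identifiers can be illustrated with a small sketch: two tables from different sources are joined on a county FIPS code and year after the codes are normalized to five-character strings. The column names (county_fips, year, median_income, pct_hs_grad) and all row values below are hypothetical and do not come from any particular source.

```python
def normalize_county_fips(value) -> str:
    """Return a 5-character county FIPS string, restoring leading zeros."""
    return str(value).strip().zfill(5)


def join_on_county_year(left, right):
    """Inner-join two lists of row dicts on (county FIPS, year)."""
    # Index the right-hand rows by their normalized key for fast lookup.
    index = {
        (normalize_county_fips(row["county_fips"]), int(row["year"])): row
        for row in right
    }
    joined = []
    for row in left:
        key = (normalize_county_fips(row["county_fips"]), int(row["year"]))
        match = index.get(key)
        if match is not None:
            merged = {**match, **row}
            # Store the normalized key so downstream joins stay consistent.
            merged["county_fips"], merged["year"] = key
            joined.append(merged)
    return joined


# Made-up rows: one source stores FIPS as an integer (dropping the leading
# zero) and the year as text, the other uses zero-padded strings.
income = [
    {"county_fips": 1001, "year": "2020", "median_income": 58000},
    {"county_fips": "06037", "year": 2020, "median_income": 71000},
]
education = [
    {"county_fips": "01001", "year": 2020, "pct_hs_grad": 88.5},
    {"county_fips": "06037", "year": 2019, "pct_hs_grad": 79.1},
]

result = join_on_county_year(income, education)
print(result)
```

Only the first county matches here, because the second pair of rows covers different years; mismatched time periods silently drop records in an inner join, which is one reason curated repositories align both geography and time.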

Many federal, state, and local government agencies and civil organizations openly publish SDOH data. However, many of the platforms providing SDOH data have paid, tiered access models that can be restrictive for researchers or groups investigating population health. A catalog of open datasets is available from data.world and includes many of the topics important to SDOH research. A catalog of federal, state, and city government data resources is available from data.gov. Repositories also openly share data used in social, behavioral, and health science research. For example, openICPSR is a self-publishing repository for replication datasets associated with journal articles, so that other researchers can replicate findings. However, much of the published data uses the formats of proprietary statistical packages such as SPSS or Stata.

SDOH data sources are typically accessed via file downloads. There are initiatives to provide data via APIs and to adopt API standards. Socrata provides an open data API and an open source portal implementing the DCAT standard. An example implementation can be found at the Centers for Disease Control and Prevention Data Catalog. Many of the US government agency API implementations can be found at api.data.gov.
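
As a rough illustration of API access, the sketch below builds a query URL in the style of the Socrata Open Data API (SODA), where a dataset is exposed under /resource/<dataset-id>.json and filters are passed as $-prefixed parameters such as $limit and $where. The dataset ID abcd-1234 and the year column are placeholders, not a real CDC dataset, and the helper name build_soda_url is hypothetical.

```python
from urllib.parse import urlencode


def build_soda_url(domain: str, dataset_id: str, **params) -> str:
    """Build a SODA-style query URL.

    Keyword arguments are mapped to $-prefixed query parameters,
    for example limit=5 becomes $limit=5.
    """
    base = f"https://{domain}/resource/{dataset_id}.json"
    if not params:
        return base
    # Keep "$" unescaped so the parameter names stay readable.
    query = urlencode(
        {f"${name}": value for name, value in params.items()}, safe="$"
    )
    return f"{base}?{query}"


# Placeholder dataset ID and column name, used for illustration only.
url = build_soda_url("data.cdc.gov", "abcd-1234", limit=5, where="year = 2020")
print(url)

# Fetching would then look roughly like this (not executed in this sketch):
#   import json, urllib.request
#   with urllib.request.urlopen(url) as response:
#       rows = json.load(response)
```

Separating URL construction from the network call keeps the query logic easy to inspect and test without depending on a live endpoint.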