New! The SDR Master Box version 1.0 is available via Dataverse.

In the Data Harmonization project we combine data from various sources. The core of our empirical base is data from international survey projects. Altogether, we collected 1721 national samples from 89 waves of 22 projects, covering 142 countries/territories over a span of 47 years, with a total of over 2.2 million respondents. The harmonized variables include: basic socio-demographics (age, gender, education, urban/metropolitan locality), protest behavior (participation in demonstrations, signing petitions), trust in political institutions (the national parliament, legal system, political parties, national government), as well as social trust and interest in politics. Details on the harmonization of these variables can be found here.

These survey data are complemented with macro-level variables from non-survey sources describing the social, economic and political environment (e.g. population, GDP per capita, Freedom House Index, Gini coefficient) measured at the country-year level.

Additionally, we provide two types of methodological control variables:

  1. Quality-control variables capture how the data were gathered, described, and recorded. They result from evaluating the original materials: general survey documentation, specific descriptions of the data, and computer data files, respectively. We also evaluate the quality of weights to ensure the representativeness of the survey samples.
  2. Harmonization controls deal with the correspondence between source and target variables. They describe the main differences among the project-specific questionnaire items (source variables) that are combined across surveys into a common measure (target variable), and how the source variables were harmonized (e.g. the number of source variables within a given national survey, rescaling). These control variables may be used in assessing the inter-survey validity and reliability of the target variables.
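
To make the idea of rescaling plus a harmonization control more concrete, here is a minimal sketch in Python. It assumes a simple linear mapping of each source scale onto a common 0-10 target range, which is one common approach and not necessarily the procedure used in the project. The function names, the 0-10 range, and the `source_scale_points` control are hypothetical and chosen only for illustration.

```python
def rescale_linear(value, src_min, src_max, tgt_min=0.0, tgt_max=10.0):
    """Map a source-scale response onto the target range by linear rescaling.

    Assumption: the 0-10 target range is illustrative, not taken from the project.
    """
    if value is None:
        return None  # keep missing responses missing
    if src_max <= src_min:
        raise ValueError("source scale must have src_max > src_min")
    # Position of the response within the source scale, between 0 and 1.
    share = (value - src_min) / (src_max - src_min)
    return tgt_min + share * (tgt_max - tgt_min)


def harmonize_item(value, src_min, src_max):
    """Return the target value together with one harmonization control.

    The control (hypothetical name) records how many points the source
    scale had, so that later analyses can account for scale length.
    """
    return {
        "target_value": rescale_linear(value, src_min, src_max),
        "source_scale_points": src_max - src_min + 1,
    }


# A 4-point source item and an 11-point source item mapped onto the same target.
print(harmonize_item(3, 1, 4))
print(harmonize_item(7, 0, 10))
```

Keeping the source scale length next to the target value is what allows an analyst to test whether results depend on how the original question was asked.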

The resulting SDR Master Box consists of five data files and corresponding documentation: (1) the master file (MASTER) with individual-level data from cross-national surveys, (2) the country-level file (PLUG-COUNTRY), (3) the country-year-level file (PLUG-COUNTRY-YEAR), (4) the survey-level file (PLUG-SURVEY), and (5) the wave-level file (PLUG-WAVE). The MASTER file is the core of the Master Box and contains harmonized target variables, harmonization control variables, as well as flags for non-unique records, non-unique case IDs, and missing case IDs. The PLUG files contain contextual data, metadata, and data quality indicators.
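
Because the PLUG files are organized by level, contextual data can be attached to individual records by matching on the identifiers of that level. The sketch below shows this for country-year data using plain Python dictionaries. The key names (`country`, `year`), the variable names, and all values are placeholders made up for illustration, not the actual column names or contents of the SDR files, which should be taken from the documentation.

```python
def attach_context(master_rows, context_rows, keys=("country", "year")):
    """Left-join contextual rows onto individual-level rows by shared keys.

    Every individual record is kept; records without a matching
    contextual row simply gain no extra variables.
    """
    # Index the contextual rows by their key values for fast lookup.
    lookup = {tuple(row[k] for k in keys): row for row in context_rows}
    merged = []
    for row in master_rows:
        combined = dict(row)  # copy so the input rows stay unchanged
        context = lookup.get(tuple(row[k] for k in keys), {})
        for name, value in context.items():
            if name not in keys:
                combined[name] = value
        merged.append(combined)
    return merged


# Toy records with placeholder codes and values (not real SDR data).
master = [
    {"case_id": 1, "country": "AA", "year": 2000, "t_trust": 4.0},
    {"case_id": 2, "country": "AA", "year": 2000, "t_trust": 7.5},
    {"case_id": 3, "country": "BB", "year": 2005, "t_trust": 2.0},
]
country_year = [
    {"country": "AA", "year": 2000, "gdp_pc": 1000.0},
]

for record in attach_context(master, country_year):
    print(record)
```

The same function would work for the other levels by passing a different `keys` tuple, for example a single survey or wave identifier.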