The GHS is an annual household survey specifically designed to measure the living circumstances of South African households. The GHS collects data on education, employment, health, housing and household access to services.
Kind of Data
Sample survey data
Unit of Analysis
Households and individuals
v1.3 Edited, anonymised dataset for public distribution
"Birth and child" questions were asked in section 3 of the questionnaire. However, the original release of version 1 did not contain these questions. On request, Statistics SA provided DataFirst with these data files. They have been made available to the public for analysis. The Child data file has incomplete data for the variable q315aliv.
“Birth and child” questions were not asked in subsequent rounds of the GHS. Section 3 was replaced by a “tourism” section in GHS 2003.
Version 1.1 included new weights for comparability across GHS 2002-2008. Version 1.1 was released at the same time as the GHS 2008 (6 May 2010). Person files were reweighted to reflect Community Survey 2007 results and to update the estimates of the impact of HIV/AIDS on demographic trends in South Africa. Household files were weighted independently of the person files in v1.1. The reweighting procedure is discussed in the report "Reweighting of the GHS 2002-2008 data series".
As part of the revision process, some observations were removed. The following variables were renamed wherever applicable: c_gender (to “gender”), d_age (to “age”), e_race (to “race”). The following mismatches were found between v1 and v1.1: "status1" and "status2" in the worker file.
Version 1.2 includes the new weights for comparability across GHS 2002-2011. Version 1.2 was released at the same time as the GHS 2012 (22 August 2013). Reweighting was necessary in order to maintain the comparability of population estimates used in the GHS based on figures provided by the 2013 mid-year population estimation model that incorporates the demographic findings of Census 2011. Household files were weighted independently of person files.
“geotype” and “metro” were added to the house and person files.
“head_age” was added to the house file.
The following variable from the worker file contains a mismatch with v1.1: "q26indus"
Version 1.3 includes new weights for comparability across GHS 2002-2017. Version 1.3 was released at the same time as GHS 2017 (21 June 2018). It was decided to replace the 2013 series Mid-year Population Estimates used in the previous version with the more recent 2017 series Mid-year Population Estimates as benchmarks for weighting the GHS data files. Household files were weighted independently of person files.
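To illustrate what using population estimates as weighting benchmarks means, here is a minimal sketch of simple post-stratification in Python. It is not the procedure Statistics SA used, which may differ and is covered in its own reweighting documentation. The field names ("province", "weight"), the cell labels and all numbers are hypothetical.

```python
from collections import defaultdict


def poststratify(records, benchmarks, cell_key="province", weight_key="weight"):
    """Scale weights so that each cell's weights sum to its benchmark total.

    records: list of dicts, each with a cell label and a starting weight.
    benchmarks: dict mapping cell label -> benchmark population total.
    Returns new records; the input list is left unchanged.
    """
    # Sum the current weights within each cell.
    totals = defaultdict(float)
    for rec in records:
        totals[rec[cell_key]] += rec[weight_key]

    # Multiply each weight by benchmark / current total for its cell.
    adjusted = []
    for rec in records:
        factor = benchmarks[rec[cell_key]] / totals[rec[cell_key]]
        adjusted.append({**rec, weight_key: rec[weight_key] * factor})
    return adjusted


# Hypothetical person records and benchmark totals (made-up values).
persons = [
    {"province": "P1", "weight": 100.0},
    {"province": "P1", "weight": 300.0},
    {"province": "P2", "weight": 250.0},
]
benchmarks = {"P1": 500.0, "P2": 200.0}

reweighted = poststratify(persons, benchmarks)
```

When the benchmark series changes (for example from a 2013 to a 2017 series), only the `benchmarks` mapping changes in this sketch, which is why estimates from different dataset versions are not directly comparable.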
The scope of the General Household Survey 2002 includes:
Household characteristics: Dwelling type, home ownership, access to water and sanitation facilities, access to services, transport, household assets, land ownership, agricultural production
Individuals' characteristics: demographic characteristics, relationship to household head, marital status, language, education, employment, income, health, disability, access to social services, mortality.
Women's characteristics: fertility
The survey is representative at national level and at provincial level.
The lowest level of geographic aggregation is province.
The survey covers all de jure household members (usual residents) of households in the nine provinces of South Africa and residents in workers' hostels. The survey does not cover collective living quarters such as students' hostels, old age homes, hospitals, prisons and military barracks.
Producers and sponsors
Statistics South Africa
Government of South Africa
The sample is multi-stage stratified using probability proportional to size principles. The first stage is stratification by province, then by type of area within each province. Primary sampling units (PSUs) are then selected proportionally within each stratum (urban or non-urban) in all provinces. Altogether 3000 PSUs are selected. Within each PSU ten dwelling units are selected systematically for enumeration.
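The two selection steps described above can be sketched in Python as follows. This is an illustration only: it assumes systematic probability-proportional-to-size selection of PSUs within one stratum, followed by equal-interval systematic selection of ten dwelling units. The function names, the size measures and the dwelling list are hypothetical and are not taken from Statistics SA's sampling programs.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: list of positive size measures, one per unit.
    Returns the indices of the selected units. A unit larger than the
    sampling interval could be selected more than once.
    """
    interval = sum(sizes) / n
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    cumulative = 0.0
    idx = 0
    for k in range(n):
        target = start + k * interval
        # Advance to the unit whose cumulative size range contains the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def systematic_sample(units, n, rng):
    """Select n units at a fixed interval from a random start."""
    if len(units) <= n:
        return list(units)
    interval = len(units) / n
    start = rng.random() * interval
    last = len(units) - 1
    return [units[min(int(start + k * interval), last)] for k in range(n)]


rng = random.Random(2002)

# Hypothetical stratum: 50 PSUs with made-up dwelling counts as size measures.
psu_sizes = [rng.randint(80, 400) for _ in range(50)]
chosen_psus = pps_systematic(psu_sizes, 5, rng)

# Hypothetical dwelling listing for the first selected PSU.
dwellings = [f"dwelling-{i}" for i in range(psu_sizes[chosen_psus[0]])]
sample = systematic_sample(dwellings, 10, rng)
```

Larger PSUs occupy a longer stretch of the cumulative size scale, so they are more likely to contain one of the equally spaced selection points.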
Dates of Data Collection
Data Collection Mode
Statistics South Africa
Government of South Africa
Earlier versions of the GHS datasets 2002 to 2007 include a District Council (DC) variable. This is no longer available in the later versions issued by Statistics SA. Statistics SA cautions that although the GHS 2005-2007 sample was designed to report at DC level, estimates are not reliable at this level. The 2008-2013 sample was designed to report at provincial and metro level. However, Statistics SA did not take the absent population at metro level into account when weighting the data, and therefore these data are not reliable at metro level.
The new programs that were introduced for weighting the general household surveys from 2008 onwards discard all records with missing values for age, sex or population group (for observations at household level, these are the age, sex or population group of the household head). Missing values of those variables were therefore imputed before weighting. The emphasis was on obtaining reliable imputations rather than a 100% imputation rate, so some persons/households were still discarded during the weighting.
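The discard rule can be illustrated with a short Python sketch. It is not Statistics SA's weighting program. The field names ("age", "sex", "population_group"), the coded values and the records are hypothetical, and missing values are assumed to be represented as None.

```python
# Hypothetical names for the three variables the weighting step requires.
KEY_VARS = ("age", "sex", "population_group")


def split_complete(records, key_vars=KEY_VARS):
    """Separate records that can be weighted from those that would be discarded.

    A record is discarded if any key variable is absent or None, that is,
    if no reliable imputation was available for it.
    """
    kept, discarded = [], []
    for rec in records:
        if any(rec.get(var) is None for var in key_vars):
            discarded.append(rec)
        else:
            kept.append(rec)
    return kept, discarded


# Made-up person records; population group is shown as an arbitrary code.
persons = [
    {"id": 1, "age": 34, "sex": "F", "population_group": 1},
    {"id": 2, "age": None, "sex": "M", "population_group": 2},
    {"id": 3, "age": 58, "sex": "M", "population_group": 1},
    {"id": 4, "age": 12, "sex": "F"},
]

kept, discarded = split_complete(persons)
```

For household files, the same check would be applied to the values recorded for the household head.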
Statistics South Africa. General Household Survey 2002 [dataset]. Version 1.3. Pretoria: Statistics South Africa [producer], 2017. Cape Town: DataFirst [distributor], 2020. DOI: https://doi.org/10.25828/d2xh-m039