The GHS is an annual household survey specifically designed to measure the living circumstances of South African households. The GHS collects data on education, employment, health, housing and household access to services.
Kind of Data
Sample survey data
Unit of Analysis
Households and individuals
v1.4: Edited, anonymised dataset for public distribution
Version 1.2 contained weights for comparability across GHS 2002-2009. Person files were reweighted to reflect Community Survey 2007 results and update the estimates of the impact of HIV/Aids on demographic trends in South Africa. Household files were weighted independently of the person files in v1.2. The reweighting procedure was discussed in the report: "Reweighting of the GHS 2002-2008 data series".
From GHS 2009, the worker file is no longer released as a separate data file, although information about respondents’ economic activities was still recorded and is included in the person file.
The tourism file was retained in the GHS 2009 to help StatsSA verify the external validity of the first round of the Domestic Tourism Survey. However, the GHS tourism file will be phased out from 2010.
Version 1.3 data files contain new weights for comparability across GHS 2002-2011 released at the same time as the GHS 2012 (22 August 2013). Reweighting was necessary in order to maintain the comparability of population estimates used in the GHS based on figures provided by the 2013 mid-year population estimation model that incorporates the demographic findings of Census 2011. Household files were weighted independently of person files.
In the household file v1.2, some households had the value 888888 in the variable “q419rem” (remittance income). In v1.3, these values were recoded to 0.
Mismatches were found in the following imputed variables between household file v1.2 and v1.3: "econact"; "totmhinc"
The following imputed variables were added to the household file v1.3: "totalgrnt"; "undisab"; "disab"; "sevdisab"
The following variables were added to the household file of v1.3: "stratum"; "geotype"; "metro"; "q142msal"
The person file v1.2 contained "q142msalcat", which was removed in v1.3.
The following imputed variables were added to the person file v1.3: "disab"; "sevdisab"; "literacy3b"
The following variables were added to the person file of v1.3: "stratum"; "geotype"; "metro"
In the tourism file v1.3, the variable "q213tran" contains a category labelled “7”. This category did not exist in version 1.2.
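The recode of "q419rem" between the household files v1.2 and v1.3 can be sketched as follows, for users who need to reconcile the two versions. This is a minimal illustration only: the function name and the example values are hypothetical, and the meaning of the 888888 code is not stated in this record.

```python
# Value found in "q419rem" for some households in the v1.2 household file.
SENTINEL = 888888


def recode_remittance(values, sentinel=SENTINEL):
    """Return a copy of values with the sentinel replaced by 0, as done in v1.3."""
    return [0 if value == sentinel else value for value in values]


# Made-up example values, not taken from the dataset.
v12_values = [1500, 888888, 0, 320, 888888]
v13_values = recode_remittance(v12_values)
print(v13_values)
```

Note that after this recode a value of 0 in v1.3 can correspond to either 0 or 888888 in v1.2, so the two cases can only be separated by comparing against the v1.2 file.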
Version 1.4 includes new weights for comparability across GHS 2002-2017, released at the same time as GHS 2017 (21 June 2018). The 2013 series mid-year population estimates used in the previous version were replaced with the more recent 2017 series mid-year population estimates as benchmarks for weighting the GHS data files. Household files were weighted independently of person files.
Because the tourism file was not reweighted by StatsSA, we kept its version at v1.3.
The scope of the General Household Survey includes:
Worker's characteristics: unemployment, non-economic and economic activities
Household information: tourism information for the household, type of dwelling, ownership of dwelling and other assets, electricity, water and sanitation, environmental issues, services, transport, expenditure etc.
The survey is representative at national level and at provincial level.
The lowest level of geographic aggregation is province.
The survey covered all de jure household members (usual residents) of households in the nine provinces of South Africa and residents in workers' hostels. The survey does not cover collective living quarters such as students' hostels, old age homes, hospitals, prisons and military barracks.
Producers and sponsors
Statistics South Africa
A multi-stage, stratified random sample was drawn using probability-proportional-to-size principles. First level stratification was based on province and second-tier stratification on district council. The GHS 2009 represents the second year of a new master sample (the first year was GHS 2008) that will be used until 2010.
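To illustrate what selection with probability proportional to size within strata means, the sketch below draws primary sampling units (PSUs) from each stratum with systematic PPS selection. It is not StatsSA's actual procedure: systematic selection is only one common way to implement PPS, and the stratum labels, PSU names, size measures, sample size per stratum, and seed are all hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes maps each unit to its size measure (e.g. number of households).
    A unit larger than the sampling interval can be selected more than once.
    """
    total = sum(sizes.values())
    step = total / n                      # sampling interval
    start = rng.random() * step           # random start in [0, step)
    points = iter(start + k * step for k in range(n))

    selected = []
    cumulative = 0
    point = next(points, None)
    for unit, size in sizes.items():
        cumulative += size
        # Select the unit once for every selection point inside its range.
        while point is not None and point < cumulative:
            selected.append(unit)
            point = next(points, None)
    return selected


# Hypothetical strata (province, then district council) with PSU size measures.
strata = {
    "Province A / District 1": {"psu1": 120, "psu2": 300, "psu3": 80, "psu4": 200},
    "Province B / District 2": {"psu5": 200, "psu6": 200, "psu7": 300},
}

rng = random.Random(2009)  # fixed seed so the sketch is reproducible
sample = {name: pps_systematic(psus, 2, rng) for name, psus in strata.items()}
print(sample)
```

In a multi-stage design, a further stage would then select dwelling units within each sampled PSU; that stage is omitted here.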
Dates of Data Collection
Data Collection Mode
Statistics South Africa
Government of South Africa
The GHS uses questionnaires as data collection instruments.
The questionnaire for the General Household Survey has undergone various changes since 2002. Significant changes were made to the GHS 2009 questionnaire and this should be borne in mind when comparing across different datasets. See GHS 2009 statistical release for a detailed report on important differences between the questionnaires.
In GHS 2009-2010:
The variable on care provision (Q129acre) in the GHS 2009 and 2010 should be used with caution. The question to collect the data (question 1.29a) asks:
"Does anyone in this household personally provide care for at least two hours per day to someone in the household who - owing to frailty, old age, disability, or ill-health cannot manage without help?"
Response codes (in the questionnaire, metadata, and dataset) are:
1 = No
2 = Yes, 2-19 hours per week
3 = Yes, 20-49 hours per week
4 = Yes, 50 + hours per week
5 = Do not know
There is an inconsistency between the question, which asks about hours per day, and the response options, which record hours per week. As a result, a respondent who gives care for one hour per day (7 hours per week) would presumably not answer "Yes" to this question, even though 7 hours falls within response option 2. Someone giving care for 13 hours a week would also be excluded, as though they do not do serious caregiving, which is incorrect.
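The mismatch between the daily threshold in the question and the weekly bands in the response codes can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes care is spread evenly over seven days a week and reported in whole hours.

```python
QUESTION_MIN_HOURS_PER_DAY = 2  # "at least two hours per day" in question 1.29a


def weekly_hours(hours_per_day, days_per_week=7):
    """Convert daily care hours to weekly hours, assuming care on every day."""
    return hours_per_day * days_per_week


def response_band(hours_per_week):
    """Return the 'Yes' response code (2, 3 or 4) whose weekly band contains
    the given hours, or None if the hours fall below every band."""
    if hours_per_week >= 50:
        return 4
    if hours_per_week >= 20:
        return 3
    if hours_per_week >= 2:
        return 2
    return None


def meets_question_threshold(hours_per_day):
    """True if the carer meets the daily threshold stated in the question."""
    return hours_per_day >= QUESTION_MIN_HOURS_PER_DAY


# One hour per day is 7 hours per week: inside band 2, below the question threshold.
print(weekly_hours(1), response_band(weekly_hours(1)), meets_question_threshold(1))

# 13 hours per week is under two hours per day, yet still inside band 2.
print(response_band(13), meets_question_threshold(13 / 7))

# The question threshold itself corresponds to 14 hours per week.
print(weekly_hours(QUESTION_MIN_HOURS_PER_DAY))
```

Under these assumptions, weekly totals from 2 to 13 hours fit response code 2 but fail the question's own threshold, which is the range in which responses are most likely to be inconsistent.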
In GHS 2009-2015:
The variable on land size in the General Household Survey questionnaire for 2009-2015 should be used with caution. The data comes from questions on the households' agricultural activities in Section 8 of the GHS questionnaire: Household Livelihoods: Agricultural Activities. Question 8.8b asks:
“Approximately how big is the land that the household use for production? Estimate total area if more than one piece.” One of the response categories is worded as:
1 = Less than 500m2 (approximately one soccer field)
However, a soccer field is 5000 m2, not 500 m2, so response category 1 is incorrect. The category should read 5000 m2. This response option is correct in GHS 2002-2008, and the error was flagged and corrected by Statistics SA in the GHS 2016.
Statistics South Africa. General Household Survey 2009 [dataset]. Version 1.4. Pretoria: Statistics South Africa [producer], 2017. Cape Town: DataFirst [distributor], 2020. DOI: https://doi.org/10.25828/9vjw-ex06