The LFS is a twice-yearly rotating panel household survey, specifically designed to measure the dynamics of employment and unemployment in South Africa. It measures a variety of issues related to the labour market, including unemployment rates (official and expanded), according to standard definitions of the International Labour Organisation (ILO).
All editions of the LFS have been updated (some more than once) since their release. These version changes are detailed in a document available from DataFirst (in the "external documents" section titled "LFS 2000-2008 Collated Version Notes on the South African LFS").
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Households (dwellings) and individuals
v2: Edited, anonymised dataset for licensed distribution
The South African February 2001 LFS dataset was originally released in September 2001 as 4 data files (household, worker, general, and stratum PSU). A second version was downloaded from the Statistics South Africa website as 3 data files (incorporating the previously separate stratum PSU data file) on 11 August 2011. This version differed slightly from the originally obtained release in the following ways:
1. No "Feb2001" suffix attached to variable names
2. Year and Month variables included
3. Variable labels were altered. Previously, all variable labels were literal questions. Now the variable labels describe the variables.
4. A number of the variables have been renamed slightly. For example, "C Gender Feb2001" in the worker data file, version 1.0, is now simply "Gender" in version 2.0.
There are also a few substantive differences between the old and new versions of the dataset:
Worker data file
Sector: The variable representing employment sector has had a number of observations that were previously "Unspecified" recoded into various other categories. There are 91 differences between versions in total.
Industry: The variable representing "Main industry" has been recoded for a number of observations. Most importantly, the variable value label previously denoted by "Other" is now labelled "12". Some of the observations have different values from those in previous versions. There are 316 differences between versions in total.
Main Occupation: The variable representing "Main occupation" has been recoded for a number of observations. Most importantly, the variable value label previously denoted by "Occupation not adequately defined" is now labelled "91". Some of the observations have different values than before. There are 285 differences between versions in total.
Employment Status: The variables representing employment status have several substantive differences. These two variables reflect two definitions, “narrow” and “expanded”, of employment status. Some observations that were previously defined as having one particular status (within both variables!) are now defined as another. Official employment status (the “narrow” definition, STATUS1) has 359 differences, whereas expanded employment status (STATUS2) has 583 differences.
Household/general data file
The household/general file in version 2.0 has 13 extra observations: 3 each from the Free State, North West and Gauteng provinces and 4 from Limpopo. Each of these observations is coded missing or unspecified for every variable except Province and Household Weight.
Household characteristics, household listing, demographics, education, economic activity, work for pay, business ownership, unemployment, employers, main work activity in the past week, wages, salary, employment, migration
The LFS sample covers the non-institutional population except for workers' hostels. However, persons living in private dwelling units within institutions are also enumerated. For example, within a school compound, one would enumerate the schoolmaster's house and teachers' accommodation because these are private dwellings. Students living in a dormitory on the school compound would, however, be excluded.
Producers and sponsors
Statistics South Africa
The sampling procedure for the LFS was a two-stage complex sample. The first stage was the selection with probability proportional to size of PSUs from the 1996 Census list of Enumerator Areas (EAs), to form the Master Sample of 1999. The 1999 Master Sample is thus based on the 1996 Population Census list of EAs and the measure of size used was the number of dwelling units per PSU. The second stage involved the systematic selection of 10 dwelling units from each of the selected PSUs. The Master Sample was stratified into 18 strata, i.e. 9 provinces and within each province by urban / non-urban.
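As an illustration of the first-stage selection described above, the following Python sketch draws PSUs systematically with probability proportional to size (PPS). It is a minimal example under stated assumptions, not Stats SA's actual procedure: the function name `pps_systematic`, the PSU sizes and the sample size of 3 are all hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n units systematically with probability proportional to size.

    sizes: measure of size per unit (here, dwelling units per PSU).
    Returns the indices of the selected units in list order. A unit whose
    size exceeds the sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n               # sampling interval on the size scale
    start = rng.uniform(0, interval)   # random start within the first interval
    selected = []
    cumulative = 0.0                   # total size of the units before idx
    idx = 0
    for k in range(n):
        target = start + k * interval
        # Walk forward until the target falls inside unit idx's size range.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical measures of size: dwelling units per PSU within one stratum.
psu_sizes = [120, 150, 300, 100, 210, 180, 140]
chosen = pps_systematic(psu_sizes, 3, random.Random(0))
```

Larger PSUs occupy a wider range on the cumulative size scale, so a systematic pass with a fixed interval selects them with proportionally higher probability.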
The LFS is a twice-yearly rotating panel household survey. A rotating panel sample involves visiting the same dwelling units on a number of occasions (in this instance, five at most) and replacing a proportion of these dwelling units each round. New dwelling units are added to the sample to replace those that are taken out. The Master Sample is based on the 1996 Population Census enumerator areas (EAs) and the estimated number of dwelling units from the 1996 Population Census. A sample of 30 000 dwelling units was drawn from 3 000 primary sampling units (PSUs) in the Master Sample (that is, 10 dwelling units per enumerator area). A two-stage sampling procedure was applied, and the sample was stratified, clustered and selected to meet the requirements of probability sampling. The sample was explicitly stratified by province and area type (urban/rural): the EAs were grouped within each province by urban/rural, and a disproportional sample of EAs was taken from each group (stratum). Within each explicit stratum the PSUs were stratified by simply arranging them in geographical order by District Council, Magisterial District and, within the magisterial district, by average household income (for formal urban areas and hostels) or EA. The allocated number of EAs was systematically selected with probability proportional to size in each stratum.
The careful and scientific selection of the PSUs is the first stage of the sample selection. These identified PSUs must match those areas selected from 1996 census records. After boundary identification, the next stage was to list accurately all the dwelling units in the PSU. A PSU is either one EA from the Census or several EAs, where the originally selected EA was found to have fewer than 100 dwelling units. Each EA should have approximately 150 dwelling units, but many were found to contain fewer. Thus, in some cases it was necessary to add EAs to the original EA to meet the minimum requirement of 100 dwelling units per PSU. PSUs in the Master Sample consist of 100 to 2445 dwelling units. Persons in special dwellings, such as prisoners in prisons, patients in hospitals, people residing in boarding houses and hotels (whether temporarily or semi-permanently), and those in guest houses (whether catering or self-catering), schools and churches, are excluded from the sample. The second stage of the sample selection is from the dwelling unit listing. A systematic sample of 10 dwelling units was drawn from each PSU. However, if there was growth of more than 20% in a PSU, then the sample size was increased systematically according to the proportion of growth in the PSU. The same dwellings will be visited on, at most, five different occasions. After this, new dwelling units will be included for interviewing from the same PSU.
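The second-stage selection can be sketched in Python as follows. The exact rule for increasing the sample in fast-growing PSUs is not given here, so the sketch assumes that the base sample of 10 is scaled by the ratio of listed to expected dwelling units once growth exceeds 20%. That scaling rule, the function names and the example figures are assumptions made for illustration only.

```python
import random


def dwelling_sample_size(expected_units, listed_units, base=10, threshold=0.20):
    """Number of dwelling units to draw from one PSU."""
    growth = (listed_units - expected_units) / expected_units
    if growth > threshold:
        # ASSUMPTION: scale the base sample in proportion to the growth.
        return round(base * listed_units / expected_units)
    return base


def systematic_sample(n_units, n_sample, rng):
    """Systematic sample of positions (0-based) from a dwelling unit listing."""
    interval = n_units / n_sample
    start = rng.uniform(0, interval)   # random start within the first interval
    return [min(int(start + k * interval), n_units - 1) for k in range(n_sample)]


# Hypothetical PSU: 150 dwelling units expected, 225 found at listing (50% growth).
size = dwelling_sample_size(expected_units=150, listed_units=225)
positions = systematic_sample(n_units=225, n_sample=size, rng=random.Random(0))
```

Under this assumed rule a PSU with no growth keeps the base sample of 10, so the sampling fraction within a PSU stays roughly constant when the listing is larger than expected.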
The first pilot round of LFS fieldwork took place in February 2000, based on a probability sample of 10 000 dwelling units. The sample was increased to 30 000 dwelling units in September 2000. Both of these surveys were published as discussion documents. The third round took place in February 2001, using the same 30 000 dwelling units.
The initial weights (household weights), based on the sample design, were equal to the inverse of the probability of selection. The initial weight for each member of the household was the same as the weight for the household itself. Further adjustment factors were then calculated within PSUs to account for non-response. To adjust for under-enumeration and to align survey estimates with independent population estimates, the weights were calibrated against Person benchmarks. A software package called CALMAR was used to perform this calibration. Using an iterative procedure, CALMAR adjusted the weights so that Person estimates conformed as closely as possible to external Person benchmarks. Gender, race and age group parameters were used for the Person cross-classification of the population.
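The weighting steps above can be sketched in Python. The sketch uses simple raking (iterative proportional fitting) over marginal totals as a stand-in for the calibration step. It is not CALMAR's implementation, and all function names, starting weights and benchmark totals below are hypothetical.

```python
def design_weight(p_psu, p_dwelling):
    """Initial weight: inverse of the overall probability of selection."""
    return 1.0 / (p_psu * p_dwelling)


def nonresponse_factor(n_sampled, n_responding):
    """Adjustment factor within a PSU to account for non-response."""
    return n_sampled / n_responding


def rake(records, weights, benchmarks, max_iter=100, tol=1e-8):
    """Adjust weights so weighted totals match the benchmark margins.

    records: list of dicts giving each person's category per variable.
    benchmarks: {variable: {category: population total}}.
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in benchmarks.items():
            # Current weighted total in each category of this variable.
            totals = {}
            for rec, wi in zip(records, w):
                totals[rec[var]] = totals.get(rec[var], 0.0) + wi
            # Scale every weight so the category totals hit the benchmark.
            for i, rec in enumerate(records):
                factor = targets[rec[var]] / totals[rec[var]]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:   # every margin already matches
            break
    return w


# Hypothetical respondents, starting weights and external benchmarks.
persons = [
    {"gender": "male", "age_group": "15-34"},
    {"gender": "male", "age_group": "35-64"},
    {"gender": "female", "age_group": "15-34"},
    {"gender": "female", "age_group": "35-64"},
]
start_weights = [250.0, 310.0, 280.0, 330.0]
benchmarks = {
    "gender": {"male": 600.0, "female": 650.0},
    "age_group": {"15-34": 500.0, "35-64": 750.0},
}
calibrated = rake(persons, start_weights, benchmarks)
```

After raking, the weighted counts by gender and by age group both reproduce the benchmark totals, which is the sense in which the survey estimates are aligned with independent population estimates.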
Stats SA revised their population model to produce mid-year population estimates in the light of mortality data released in 2005 (see Stats SA Statistical Release P0309.3, 2005). The benchmarks for the LFS discussed in this statistical release have been adjusted accordingly. Weights were then adjusted according to those 2005 population estimates.
Statistics South Africa. Labour Force Survey: February 2001. [dataset]. Version 2. Pretoria: Statistics South Africa [producer], 2001. Cape Town: DataFirst [distributor], 2011. DOI: https://doi.org/10.25828/y5xa-r546