The National Income Dynamics Study (NIDS) is a face-to-face longitudinal survey of individuals living in South Africa as well as their households. The survey was designed to track dimensions of the well-being of South Africans over time. At the broadest level, these were:
Wealth creation in terms of income and expenditure dynamics and asset endowments;
Demographic dynamics as these relate to household composition and migration;
Social heritage, including education and employment dynamics, the impact of life events (including positive and negative shocks), social capital and intergenerational developments;
Access to cash transfers and social services.
Wave 1 of the survey, conducted in 2008, collected detailed information for the national sample. In 2010/2011, Wave 2 of NIDS re-interviewed these people, gathering information on developments in their lives since they were first interviewed in 2008. As such, the comparison of Wave 1 and Wave 2 information provides a detailed picture of how South Africans have fared over two years of very difficult socio-economic circumstances.
Completed and non-response interviews in the NIDS Data:
The NIDS datasets contain both completed and non-response interviews (e.g. refusals). It is recommended that researchers limit their research to completed interviews, to avoid the item non-response that non-response interviews introduce. The completed interviews can be identified by making use of the w`x'_`y'_outcome variables, where `x' represents the wave and `y' represents the relevant data file/outcome type indicator. These outcome variables can be found in each of the following data files: Adult, Child, Proxy, HHQuestionnaire and Link File.
The only exception is Wave 1, where no outcome variable exists, because at the household level all of the interviews are completed. However, this does not apply at the individual level, where non-response interviews can be identified by making use of the "Reason for refusal" variables, namely w1_a_refexpl and w1_c_refexpl in the Adult and Child data files respectively.
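As an illustration of the naming pattern described above, the following Python sketch builds an outcome variable name and keeps only completed interviews. It is a sketch under stated assumptions, not part of the NIDS program library: the file indicator "a" for the Adult file is inferred from variable names such as w1_a_refexpl, the value 1 for a completed interview is a placeholder that must be checked against the value labels in the data, and the "pid" key and the toy records are hypothetical.

```python
def outcome_var(wave: int, file_code: str) -> str:
    """Build an outcome variable name such as w2_a_outcome."""
    return f"w{wave}_{file_code}_outcome"


# ASSUMPTION: placeholder code for a completed interview. Take the real value
# from the value labels of the outcome variable in the data you are using.
COMPLETED = 1


def completed_only(rows, wave, file_code, completed_code=COMPLETED):
    """Keep only the rows whose outcome variable equals the completed code."""
    var = outcome_var(wave, file_code)
    return [row for row in rows if row.get(var) == completed_code]


# Toy records standing in for rows of the Wave 2 Adult file (hypothetical values).
adults = [
    {"pid": 1, "w2_a_outcome": 1},
    {"pid": 2, "w2_a_outcome": 2},
    {"pid": 3, "w2_a_outcome": 1},
]

kept = completed_only(adults, wave=2, file_code="a")
print([row["pid"] for row in kept])
```

The same filter does not apply to Wave 1, which has no outcome variable; there the "Reason for refusal" variables would have to be inspected instead.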
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Households and individuals
v4.0.0: Edited, anonymised dataset for public distribution.
This explains changes in any new versions of the data. Note that the version numbers for the latest versions of NIDS have not been updated in the do files in the program library that has been made available to assist researchers with data manipulation. Data users should update the global to the version of the data they are using.
Version 1 of the National Income Dynamics Study Wave 2 2010-2011 public release dataset was released on 9 May 2012.
CHANGES IN VERSION 2
The following changes were made to the National Income Dynamics Study (NIDS) Wave 2 2010-2011 Version 1 dataset to produce Version 2:
Data Corrections in Version 2
Discrepancies in birth history, and parent vital status (mother/father alive) data were corrected with data from call-backs to households. Duplicate households (resulting from interviews with the same respondents in more than one household) were identified during Wave 3 fieldwork and corrected. As a result there is a change in the number of individuals and households in version 2 of this dataset.
New Variables in version 2
These identify mothers and fathers in the NIDS panel even when they were not co-resident with their children or had died. For a complete list of new variables in Version 2 of the dataset, please refer to the User Manual.
Renamed Variables in Version 2
Variables were renamed in the Child file for consistency in the variable names across files. Please refer to the User Manual for a complete list of renamed variables in this file.
Dropped Variables in Version 2
The wx_r_age variable was dropped in the new versions of Wave 1 (v5) and Wave 2 (v2). Researchers should use the best_age variable in the individual derived file for age. Most of the dropped variables were empty. Please see the User Manual for a list of the dropped variables.
New Weights in Version 2
All weights were recalculated in version 2. Please see the NIDS Wave 3 User Manual for an explanation of how the weights were calculated and the relationship between the different weights.
CHANGES IN VERSION 2.1
Admin data has been added to the regular wave specific pack. Previously this was a separate item to download via the DataFirst catalogue. We hope that this convenience will enrich users' experience of developing research from this ever growing resource. The publicly available data matches the names of schools as collected by NIDS to Department of Basic Education's Ordinary School's Master List. Only a limited number of variables are made publicly available to protect the identities of NIDS respondents. A secure data facility is provided where researchers can match their own data sources based on EMIS numbers to the matched schools. See <http://www.nids.uct.ac.za/nids-data/secure-data> for further details.
The following variables have been added to the Wave 2 Admin Dataset:
The variable w2_a_wncom, which was incorrectly named in the Adult file in the last release, has been renamed back to w2_a_owncom.
Adult Unemployment duration
There was an error in the way the two unemployment duration variables (w2_a_unemwnt_dyunit and w2_a_unemwnt_dy) were released: they were missing some observations. These variables now accurately reflect what was collected in the field.
Household outcome change
One single-person household contained a deceased individual, but its household outcome was recorded as "interview completed". The household outcome has been corrected to "household all dead".
Through interaction with our users it was brought to our attention that the svyset command in Stata was retaining settings. We have subsequently removed these settings from all datasets.
CHANGES IN VERSION 2.2 (February 2014)
NIDS datasets have been reweighted to take into account the Census 2011 geographic data. Both the household level as well as the individual level panel weights have been adjusted.
Previous geographic variables have been given the suffix ‘2001’ to distinguish them from the new geographic variables. The following variables were affected:
Old Variable Name New Variable Name
*Secure dataset variables
Census 2011 Geographic Variables have been brought into the NIDS dataset. The new variables are:
New Variable Name
w2_gc_prov2011
w2_gc_dc2011
w2_gc_mdbdc2011
w2_hhgeo2011
w2_gc_eatype2011*
w2_gc_ea2011*
w2_gc_mp2011*
w2_mapped_prov2011*
w2_mapped_dc2011*
w2_mapped_mdbdc2011*
w2_mapped_mp2011*
w2_mapped_geo2011*
w2_mapped_ea2011*
w2_mapped_eatype2011
*Secure dataset variables
More detail about this change can be found in the document detailing the Inclusion of Census 2011 data in NIDS.
CHANGES IN VERSION 2.3
Version 2.3 of NIDS Wave 2 2010-2011 has minor changes to some variable labels.
CHANGES IN VERSION 3
Changes in numbers between releases are listed in the document nids-w2-v3-changes.
Main Respondent variable
The variable for the main respondent (w2_h_respondent) in the household questionnaire is now available in this release version.
In the Adult, Child and Proxy questionnaires the survey gathered information regarding all the locations in which respondents lived in the past (Questions b10 - b16 in the adult questionnaire). These questions are collectively known as the migration questions. In previous releases of the data these descriptions were coded using 2001 Census data to district municipality level (DC). In the latest release these descriptions are coded to both the 2001 and 2011 Census data, and both versions of the district municipality codes are made available.
New variables for migration have the suffix dc_2001 and dc_2011 for descriptions coded to the 2001 and 2011 Census data respectively.
Upon inspection, the NIDS team noticed that some respondents in the Adult dataset had answered that they had other self-employment activities, but the occupational codes for these activities did not exist in the data. A new variable, w2_a_emsothatc_isco_c, has been added; it contains the occupational codes of respondents with other self-employment activities.
Birth History Section
NIDS embarked on an exercise to identify and match all the children across Wave 1 - Wave 4 in the Birth History (BH) section. In cleaning this section, the NIDS team made calls to confirm the number of children the mother had given birth to. Therefore there were a lot of changes to this section, because some children were either added or dropped in the mother's birth history. An additional gain from this exercise is that each child in the BH section now has a PID to identify them.
Own Pid Variable
The variables for who in the household owns the dwelling (w2_h_ownpid2 & w2_h_ownpid3), in the household questionnaire, are now available in this release version.
Police District data
Police district data has now been included as part of the Admin data file. Variables include distance to the nearest police station and distance to the police station in the district in which the household is located. Only categorical distances have been included in the public release version of the data. Actual distances can be found in the secure (restricted access) version of the data.
An exercise to reduce inconsistencies in the parental information was carried out for all individuals across all waves. The cases with problems were identified by comparing parental information across waves. In cases where the information varied across waves, calls were made to verify this information. Information obtained from the calls was used to correct the inconsistent parental data. Where the respondents could not be contacted, the data remain unchanged.
A list of variables which have been renamed can be found in the document "nids-w2-v3-changes" released with the new version of the data.
CHANGES IN VERSION 3.1
Version 3.1 has changes to the weight variables, w1_pweight in the indderived data file and the w1_wgt in the hhderived data file. The weight variables were changed because:
1. Panel weights were missing for some babies born to CSM mothers after Wave 1 (2008)
2. The weight for one respondent was missing
This version also includes a syntax file to correct an error concerning the w*_a_unemwnt (number of years wanting work with no success) variable in the Adult data file. This variable was inconsistently renamed across the panel. The variable name will be corrected in the next NIDS data release.
CHANGES IN VERSION 4.0.0
Version 4.0.0 of NIDS Wave 2 includes changes to the number of individuals and households in each data file, largely driven by previously incorrect classification of TSM/CSM status, duplicate interviews and additional baby CSMs not captured in a previous version of this wave. Version 4.0.0 also contains new and renamed variables, and there are changes to the survey weights. For details on these changes please see the document Wave 2 Changes between V3.1 and V4.0.0, which is provided with the data.
Data on the following topics was collected during the survey:
HOUSEHOLD: Household characteristics, household roster, mortality history, living standards, expenditure, consumption, negative events, positive events, agriculture
ADULTS: Demographics, education, labour market participation, income, health, well-being, numeracy, anthropometric data
CHILDREN: Education, health, family support, grants, anthropometric data, numeracy
The NIDS data is nationally representative. The survey began in 2008 with a nationally representative sample of over 28,000 individuals in 7,300 households across the country. The survey is repeated every two years with these same household members, who are called Continuing Sample Members (CSMs). The survey is designed to follow people who are CSMs, wherever they may be in SA at the time of interview. The NIDS data is therefore, by design, not representative provincially or at a lower level of geography (e.g. District Council).
The lowest level of geographic aggregation in the NIDS public release data is District Municipality. However, the data is not representative at any level but the national level.
The target population for NIDS was private households in all nine provinces of South Africa, and residents in workers' hostels, convents and monasteries. The frame excludes other collective living quarters, such as student hostels, old age homes, hospitals, prisons and military barracks.
Producers and sponsors
Southern Africa Labour and Development Research Unit
University of Cape Town
SALDRU, University of Cape Town
Government of South Africa
In a panel dataset, say w1 – w2, the original sampling is what matters when specifying the stratification and clustering variables. Thus the weighting variable will change according to whether someone is doing a cross-sectional analysis or panel analysis (and which panels are involved). Therefore in a panel setting with w1 and w2 data, the svyset command would be as follows:
The paper "A comment on the use of "cluster" corrections in the context of panel data" by Martin Wittenberg provides more detail on this and is available from http://www.nids.uct.ac.za/publications/technical-papers
Dates of Data Collection
Data Collection Mode
National Income Dynamics Study (NIDS) supervisory staff
Data Collection Notes
There were two major changes in data collection methodology from Wave 1 to Wave 2:
1. The introduction of Computer Assisted Personal Interviewing (CAPI) as the means of data collection. This allowed us to take advantage of a range of data assurance and quality checks.
2. Tracking of CSMs to new addresses. In addition to in-field information gathering on CSMs that had moved, NIDS also uses an in-house call-centre to assist with tracking.
These methodological changes required careful pre-testing (over and above the changes made to the questionnaire) to ensure that the systems and field protocols functioned correctly. At the level of interviewing, the CAPI system followed the paper instruments as closely as possible.
Paper consent forms were issued in all languages and the informed consent process was conducted in the respondent's language of choice. For each questionnaire, two consent forms were signed. One signed copy remained with respondents and the other was returned to SALDRU. These forms carried unique bar-coded numbers that were entered into the CAPI system; similarly, the household and person level IDs were displayed on the CAPI system and written onto the consent forms so that cross-referencing was possible. Data coming in from the field were accepted as valid only if SALDRU had a signed consent form for each interview that produced the data. If signed consent forms were not located, the associated interviews were deleted from the dataset.
Fieldwork for Wave 2 (including both Phase 1 & Phase 2 fieldwork) commenced in May 2010 and concluded in September 2011. There were breaks in fieldwork from 15 December 2010 to 3 January 2011 and again from 9 May to 1 August 2011.
As in Wave 1, four types of questionnaires were administered in Wave 2:
Household questionnaire: One household questionnaire was completed per household by the oldest woman in the household or another person knowledgeable about household affairs and particularly household spending. Household questionnaires took approximately 45 minutes in non-agricultural households and 70 minutes in agricultural households to complete.
Individual Adult questionnaire: The Adult questionnaire was applied to all present Continuing Sample Members and other household members resident in their households who are aged 15 years or over. This questionnaire took an average of 45 minutes per adult to complete.
Individual Proxy Questionnaire: Should an individual qualifying for an Adult questionnaire not be present, a Proxy Questionnaire (a much reduced Adult Questionnaire using third party referencing in the questioning) was taken on their behalf with a present resident adult. On average a Proxy questionnaire took 20 minutes. Proxy Questionnaires were also administered for CSMs who had moved out of scope (out of South Africa or to a non-accessible institution such as prison) and could therefore not be tracked or interviewed directly, except if the whole household had moved out of scope.
Child questionnaire: This questionnaire collected information about all Continuing Sample Members and residents in their household younger than 15. Information about the child was gathered from the care-giver of the child. The questionnaire focused on the child's educational history, education, anthropometrics and access to grants. This questionnaire took an average of 20 minutes per child to complete.
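The assignment of individual questionnaires described above can be summarised in a short Python sketch. It is an illustration only, not the logic of the CAPI system: the function name and arguments are hypothetical, and the out-of-scope proxy case (CSMs who moved abroad or into an institution) is deliberately not modelled.

```python
ADULT_AGE = 15  # Adult questionnaire applies to those aged 15 years or over


def questionnaire_for(age: int, present: bool) -> str:
    """Return the individual questionnaire type implied by age and presence.

    Children are always covered by the Child questionnaire, because the
    information is gathered from the care-giver rather than the child.
    """
    if age < ADULT_AGE:
        return "Child"
    return "Adult" if present else "Proxy"


# Hypothetical household members: (age, present at time of interview)
for age, present in [(34, True), (52, False), (9, True), (15, True)]:
    print(age, present, questionnaire_for(age, present))
```

Note that a household member aged exactly 15 falls under the Adult questionnaire, since the Child questionnaire covers only those younger than 15.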
Phase Two of Wave 2:
In June 2011 NIDS commissioned a Phase Two of Wave 2 as a Non-Response Follow-Up from Phase 1 of Wave 2. Households included in this subsample were those that refused and those that could not be located or tracked in Phase 1. Out of a total of 1064 households attempted, an additional 389 households were successfully interviewed in Phase Two.
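For readers who want the Phase Two figures as a rate, the arithmetic can be checked with a few lines of Python. The two counts are taken from the paragraph above; the variable names are illustrative, and the resulting percentage is a simple ratio of households interviewed to households attempted, not an official NIDS response rate.

```python
attempted = 1064    # households attempted in Phase Two
interviewed = 389   # households successfully interviewed in Phase Two

rate = interviewed / attempted
print(f"{rate:.1%}")  # prints 36.6%
```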
Questionnaire Differences between W2 Phase 1 & W2 Phase 2
There are two important methodological differences between Phase 1 and Phase 2:
1. Not all sections of the original Wave 2 questionnaires were asked. This reduced respondent burden and the time required for fieldworker training. Questions NOT asked in Phase 2 are indicated with the non-response code “-2”. Core modules such as household composition and income were still asked. Consult the Wave 2 Phase 2 questionnaires for more details of these differences.
2. Movers out of Phase 2 dwelling units were not tracked further. Address information was collected for this sub-sample and they will be tracked as part of the Wave 3 fieldwork exercise. These individuals are classified as “Not tracked” in the Wave 2 dataset.
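Because questions not asked in Phase 2 carry the code -2, that code should be treated as missing before computing statistics. The Python sketch below shows one way to do this on a plain list of values. The function names and the sample values are hypothetical, and only the -2 code mentioned above is handled; any other non-response codes in the data would need the same treatment.

```python
NOT_ASKED = -2  # code for questions not asked in Phase 2 (from the text above)


def mask_not_asked(values):
    """Replace the not-asked code with None so it is not read as a real answer."""
    return [None if v == NOT_ASKED else v for v in values]


def mean_of_asked(values):
    """Average only the values that were actually asked and answered."""
    asked = [v for v in values if v is not None and v != NOT_ASKED]
    return sum(asked) / len(asked) if asked else None


# Hypothetical responses: two Phase 1 answers and two Phase 2 "not asked" codes.
responses = [3, -2, 5, -2]
print(mask_not_asked(responses))
print(mean_of_asked(responses))
```

Leaving the -2 values in place would pull the mean of this toy variable from 4.0 down to 1.0, which is why the masking step matters.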
Registering to use the NIDS data includes agreement that the data user will not attempt to identify specific individuals from the data.
Public use data, available to all
Southern Africa Labour and Development Research Unit. National Income Dynamics Study Wave 2, 2010-2011 [dataset]. Version 4.0.0. Pretoria: SA Presidency [funding agency]. Cape Town: Southern Africa Labour and Development Research Unit [implementer], 2016. Cape Town: DataFirst [distributor], 2016. DOI: https://doi.org/10.25828/j1h1-5m16