During October 1996 Statistics South Africa recorded the details of people living in more than nine million households in South Africa, as well as those in hostels, hotels and prisons. Census 1996 was the first nationwide census since the splitting up of the country under apartheid after 1970. It sought to apply the same methodology to everyone: visiting the household and obtaining details about all its members from a representative, who was either interviewed or filled in the questionnaire in their language of choice.
Kind of Data
Sample survey data
Unit of Analysis
Households and individuals
v1.1: Edited, anonymised dataset for licensed distribution.
Version 1 of the October Household Survey 1996 dataset is the dataset received from Statistics South Africa.
In this version of the dataset, version 1.1, some of the data files have string variables converted to numeric variables for ease of use. The "Other" and "Genpsu" files have not been changed, so they retain their original version number, version 1.
The scope of the OHS 1996 includes: employment, unemployment, informal sector, internal migration, services available by type of dwelling, access to health and social services, safety and well-being of household, households by average household size and type of dwelling, level of education, quality of life, health statistics, vital statistics.
The survey had national coverage.
The lowest level of geographic aggregation covered by the data is province.
The survey covered households and their members in the nine provinces of South Africa.
Producers and sponsors
Statistics South Africa
A sample of 1600 Enumerator Areas (EAs) was produced in conjunction with the sample for the 1996 Population Census post-enumeration survey. A two-stage sampling procedure was applied in the following manner.
The first stratification was done by province, as well as by type of EA (formal or informal urban areas, commercial farms, traditional authority areas or other non-urban areas). Originally, eight hundred EAs were allocated to each stratum by province proportionately. Later, some adjustments were made to ensure adequate representation of smaller provinces such as the Northern Cape. Independent systematic samples of EAs were drawn for each stratum within each province. The sampling frame was constructed from the preliminary database of EAs established during the demarcation and listing phase of the 1996 population census. In the second stage, 10 households were drawn from each EA on the western and eastern side of the EA drawn for the post-enumeration survey. This meant 10 households per EA in 1600 different EAs, that is, 16 000 households in total.
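The systematic selection of EAs within a stratum can be illustrated with a minimal Python sketch. It is not the procedure used by Statistics South Africa: the frame of 250 EAs, the sample size of 20, the seed and the function name `systematic_sample` are all hypothetical, chosen only to show a random start followed by a fixed sampling interval. The last line reproduces the sample-size arithmetic stated above.

```python
import random


def systematic_sample(frame, n, rng):
    """Draw a systematic sample of n items from an ordered frame.

    A random start is chosen inside the first sampling interval, then
    every item one interval further along the frame is selected.
    """
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and len(frame)")
    step = len(frame) / n            # sampling interval (may be fractional)
    start = rng.random() * step      # random start in [0, step)
    last = len(frame) - 1
    # min() only guards against floating-point rounding at the upper end.
    return [frame[min(int(start + i * step), last)] for i in range(n)]


# Hypothetical stratum: an ordered frame of 250 EAs, 20 of which are drawn.
frame = [f"EA{i:04d}" for i in range(250)]
sample = systematic_sample(frame, 20, random.Random(1996))

# Sample-size arithmetic from the text: 10 households in each of 1600 EAs.
total_households = 1600 * 10
```

In a stratified design, a function like this would be called once per stratum (province by EA type), each time with its own frame and allocation.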
The 1996 OHS was weighted to the population census of October 1996, as adjusted by the post-enumeration survey (PES). To calculate weights, generalised raking with a linear distance function was used to implement a population control adjustment. The marginal population frequencies of the variables sex and age group (0-4, 5-14, 15-19, 20-29, 30-39, 40-64 and 65+ years) were used within each province and each population group.
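A population control adjustment with a linear distance function is commonly written as a calibration problem: each design weight d_i is replaced by w_i = d_i (1 + x_i · λ), where x_i holds the respondent's margin indicators and λ is chosen so that the weighted margins equal the control totals. The sketch below illustrates that textbook form only. The four respondents, their design weights and the control totals are invented, one category per variable is dropped to keep the system non-singular, and nothing here is taken from the Statistics South Africa weighting programs. In the survey, an adjustment of this kind would be applied separately within each province and population group.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: drop a redundant margin")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def linear_calibrate(design_weights, x, totals):
    """Return weights d_i * (1 + x_i . lam) whose weighted sums of x equal totals."""
    k = len(totals)
    rows = list(zip(design_weights, x))
    # Left side: sum of d_i * x_i x_i^T. Right side: shortfall in each margin.
    a = [[sum(d * xi[r] * xi[c] for d, xi in rows) for c in range(k)]
         for r in range(k)]
    b = [totals[r] - sum(d * xi[r] for d, xi in rows) for r in range(k)]
    lam = solve(a, b)
    return [d * (1 + sum(l * v for l, v in zip(lam, xi))) for d, xi in rows]


# Hypothetical data: x = [1, is_female, is_aged_65_plus] for four respondents.
x = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
design_weights = [100.0, 100.0, 100.0, 100.0]
totals = [420.0, 220.0, 90.0]    # invented controls: all persons, females, 65+

weights = linear_calibrate(design_weights, x, totals)
margins = [sum(w * xi[j] for w, xi in zip(weights, x)) for j in range(3)]
```

After calibration, `margins` matches `totals` up to floating-point error. Unlike multiplicative raking, the linear form can produce negative weights when the controls are far from the design-weighted margins, so the results are worth inspecting.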
Previously, OHS surveys were weighted to reflect estimates of population size using the 1991 population census, and not the one of 1996. The data reported here for 1996 are thus not presently directly comparable with the previously published OHS figures. Statistics South Africa is in the process of re-weighting the earlier surveys to reflect estimates of the population size based on the 1996 population census.
This data set is also not directly comparable with the 1996 OHS data contained in Statistical release P0317.10. The data of the post-enumeration survey (PES), conducted just after the October 1996 population census, were used for weighting purposes for that release.
Dates of Data Collection
Data Collection Mode
The data files in the October Household Survey 1996 (OHS 1996) correspond to the following sections in the questionnaire:
House: Data from FLAP, Section 1 and Section 7
Person: Data from Section 2
Worker: Data from Section 3
Migrant: Data from Section 4
Death: Data from Section 5
Births: Data from Section 6 - This data had a considerable number of problems and will not be published.
Income: Data from Section 7 (included in House)
Domestic: Data from Section 8
The October Household Survey 1996 questionnaire had incorrect FLAP data: no Population Group question was indicated on the FLAP. DataFirst notified Statistics SA, which supplied a corrected questionnaire. This corrected questionnaire is the one now available with the dataset.
In the previous version of the 1996 October Household Survey dataset archived by DataFirst, the household identifiers (HHID) were not unique. This was corrected in the first version disseminated by DataFirst, version 1. Version 1.1 keeps this correction, but data users should check versions not obtained from DataFirst and replace these with the latest version available from DataFirst.
The metadata for the OHS 1996 provides an explanation for merging the files in the OHS 1996 dataset: "The data from different files can be linked on the basis of the record identifiers. The record identifiers are composed of the first few fields in each file. Each record contains the three fields Magisterial District, Enumeration area, and Visiting point number. These eleven digits together constitute a unique household identifier. All records with a given household identifier, no matter which file they are in, belong to the same household. For individuals, a further two digits constituting the Person number, when added to the household identifier, creates a unique individual identifier. Again, these can be used to link records from the PERSON and WORK files. The syntax needed to merge information from different files will differ according to the statistical package used" (October Household Survey 1996: Metadata: General Notes: 2).
According to the above, to generate household IDs it is necessary to use a combination of magisterial district number (mdnumber), enumeration area number (eanumber) and visiting point number (vpnumber). To generate person IDs it is necessary to use the above with the person number (personnu).
These variables are named as such in the OHS 1996 House, OHS 1996 Births, OHS 1996 Migrant, OHS 1996 Deaths, OHS 1996 Household Income Other, OHS 1996 Other, OHS 1996 Domestic and OHS 1996 Flap data files. However, in the OHS 1996 Worker and OHS 1996 Person data files the variable for magisterial district number is "distr", the variable for enumeration area is "ea" and the variable for visiting point number is "visp". The variable for person number in these files is "respno".
The metadata provided to DataFirst with this dataset does not discuss these changes.
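The identifier construction described above can be sketched in Python as follows. This is an illustration under stated assumptions, not DataFirst or Statistics South Africa code: the helper names, the example records and the "dwelling" field are hypothetical. Because the width of each component field is not given here, the sketch uses tuples of integers as identifiers instead of concatenating digits, which avoids guessing at zero-padding.

```python
from collections import Counter

# Key variable names by file family, as listed in the text above.
DEFAULT_KEYS = {"md": "mdnumber", "ea": "eanumber", "vp": "vpnumber",
                "person": "personnu"}
WORKER_PERSON_KEYS = {"md": "distr", "ea": "ea", "vp": "visp",
                      "person": "respno"}


def household_id(record, keys=DEFAULT_KEYS):
    """Household identifier: (magisterial district, EA, visiting point)."""
    return (int(record[keys["md"]]), int(record[keys["ea"]]),
            int(record[keys["vp"]]))


def person_id(record, keys=DEFAULT_KEYS):
    """Person identifier: the household identifier plus the person number."""
    return household_id(record, keys) + (int(record[keys["person"]]),)


def duplicate_household_ids(house_rows):
    """Return household identifiers that occur more than once."""
    counts = Counter(household_id(row) for row in house_rows)
    return [hid for hid, n in counts.items() if n > 1]


def link_persons_to_houses(house_rows, person_rows):
    """Pair each Person record with its House record (None if unmatched)."""
    houses = {household_id(row): row for row in house_rows}
    return [(p, houses.get(household_id(p, WORKER_PERSON_KEYS)))
            for p in person_rows]


# Hypothetical records, for illustration only.
house_rows = [{"mdnumber": 101, "eanumber": 23, "vpnumber": 7,
               "dwelling": "formal"}]
person_rows = [{"distr": "101", "ea": "23", "visp": "7", "respno": "1"},
               {"distr": 101, "ea": 24, "visp": 1, "respno": 1}]

linked = link_persons_to_houses(house_rows, person_rows)
```

Running `duplicate_household_ids` on the House file before linking is a quick way to confirm that a copy of the dataset has unique household identifiers.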
October Household Survey 1996 Births file:
Births data was collected by Section 6 of the OHS 1996 questionnaire, completed for all women younger than 55 years who had ever given birth. The metadata for this survey from Statistics SA states that "This data had a considerable number of problems and will not be published." The dataset provided by DataFirst therefore does not include the original "births" file. Those in possession of this file from unofficial versions of the dataset should note the following problems with the data in the OHS 1996 births file:
Variable name: eegender
Question 6.2: Is/was (the child) a boy or a girl?
Valid range: 1 (boy) - 2 (girl)
Data quality issue: There is a third response value of 0 with no description.
Variable name: livinghh
Question 6.4: If alive: Is (the child) currently living with this household?
Valid range: 1 (yes) - 2 (no)
Data quality issue: This variable has an additional response value (0), which has no description.
Variable name: agealive
Question 6.5: If alive: How old is he/she?
This question was asked of all women younger than 55 years who have ever given birth to provide the age of their living children.
Data quality issue: Responses for age of child range from 0 to 77 (assuming that 99 denotes a missing response). The upper values are outside the plausible range, since the question was asked only of women younger than 55 years.
Variable name: agenaliv
Question 6.6: If dead: How old was (the child) when he/she died?
Data quality issue: The format of the age at death variable is not clear.
Variable name: datebirt
Question 6.7: [All children]: In what year and month was (the child) born?
Data quality issue: There are problems with the format of the date of birth variable.
Variable name: wherebor
Question 6.8: [All children]: Where was (the child) born?
Data quality issue: The questionnaire offers only three options for the place of birth (in a hospital, in a clinic and elsewhere), but the data has 10 response values (0-9), with no explanation for this in the metadata.
Variable name: regstere
Question 6.9: [All children]: Was the birth registered?
Valid range: 1 (yes) - 2 (no)
Data quality issue: There are 4 response values (0-3) for this variable.
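Users who hold an unofficial copy of the births file could tabulate out-of-range codes with a check along the following lines. This is a minimal sketch: it covers only the three variables whose valid ranges are stated above, and the function name and example rows are invented for illustration.

```python
from collections import Counter

# Valid codes as documented in the questionnaire items listed above.
VALID_CODES = {
    "eegender": {1, 2},   # 1 = boy, 2 = girl
    "livinghh": {1, 2},   # 1 = yes, 2 = no
    "regstere": {1, 2},   # 1 = yes, 2 = no
}


def out_of_range_counts(rows, valid_codes=VALID_CODES):
    """Count, per variable, each value that falls outside the documented range.

    A variable that is absent from a row is counted under the value None.
    """
    report = {name: Counter() for name in valid_codes}
    for row in rows:
        for name, valid in valid_codes.items():
            value = row.get(name)
            if value not in valid:
                report[name][value] += 1
    return report


# Hypothetical rows, for illustration only.
rows = [
    {"eegender": 1, "livinghh": 2, "regstere": 1},
    {"eegender": 0, "livinghh": 1, "regstere": 3},
    {"eegender": 2, "livinghh": 0, "regstere": 0},
]

report = out_of_range_counts(rows)
```

Each entry of `report` maps a variable name to a tally of its undocumented values, which can then be compared with the issues listed above.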
Central Statistical Service. October Household Survey 1996 [dataset]. Version 1.1. Pretoria: Central Statistical Service (now Statistics South Africa) [producer], 1999. Cape Town: DataFirst [distributor], 2013. DOI: https://doi.org/10.25828/kn3m-x762