The Community Survey (CS) is a nationally representative, large-scale household survey which was conducted from February to March 2007. The Community Survey is designed to provide information on the extent of poor households in South Africa, and their access to services, and levels of unemployment, at national, provincial and municipal levels.
The main objectives of the survey were:
1. To fill data gaps from the absence of a national population census in 2006
2. To provide estimates at lower geographical levels than existing household surveys
3. To build capacities for conducting Census 2011
4. To provide inputs to the mid-year population projections.
Kind of Data
Sample survey data
Unit of Analysis
v1.2: Edited, anonymised dataset for licensed distribution.
Version 1 of the Community Survey 2007 dataset did not include fertility and mortality data (from sections F and I of the questionnaire respectively).
Version 1.1, downloaded from Statistics South Africa's website on 17 October 2011, includes fertility and mortality data. The metadata file provided with version 1.1 is the metadata supplied with version 1 and therefore does not cover the fertility and mortality data.
Geography variables were provided in a separate data file in version 1 of the dataset, but were included in the "Person", "Household" and "Mortality" files of version 1.1.
This version, version 1.2, includes the changes made in version 1.1. However, the variables in version 1.1 were strings; in version 1.2 they have been converted to numeric variables for ease of use.
The scope of the Community Survey (CS) includes:
Demographic characteristics (age and sex, population group, fertility, mortality), migration, economic activity, geographical distribution, marital status, disability, education, household goods, access to services (social security services, housing, water, energy, sanitation, communication services, refuse removal).
The survey covered the whole of South Africa, including all nine provinces as well as the four settlement types - urban-formal, urban-informal, rural-formal (commercial farms) and rural-informal (tribal areas).
The lowest level of geographic aggregation of the data is the local municipality.
The Community Survey covered all de jure household members (usual residents) in South Africa. The survey excluded collective living quarters (institutions) and some households in EAs classified as recreational areas or institutions. However, an approximation of the out-of-scope population was made from the 2001 Census and added to the final estimates of the CS 2007 results.
Producers and sponsors
Statistics South Africa
Government of South Africa
The Government of South Africa
The sampling procedure adopted for the CS was a two-stage stratified random sampling process. Stage one involved the selection of enumeration areas, and stage two was the selection of dwelling units. Since the data are required for each local municipality, each municipality was considered as an explicit stratum. The stratification was done for those municipalities classified as category B municipalities (local municipalities) and category A municipalities (metropolitan areas) as proclaimed at the time of Census 2001. However, the newly proclaimed boundaries, as well as any other higher level of geography such as province or district municipality, were treated as any other domain variable based on their link to the smallest geographic unit, the enumeration area.
The Census 2001 enumeration areas were used because they give a full geographic coverage of the country without any overlap. Although changes in settlement type, growth or movement of people have occurred, the enumeration areas assisted in getting a spatial comparison over time. Out of 80 787 enumeration areas countrywide, 79 466 were considered in the frame. A total of 1 321 enumeration areas were excluded (919 covering institutions and 402 recreational areas).
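The frame counts quoted above can be cross-checked with a few lines of arithmetic. This is only a consistency check on the figures in the text; the variable names are illustrative.

```python
# Figures as quoted in the text (EA counts for the first-level frame).
total_eas = 80_787              # enumeration areas countrywide
excluded_institutions = 919     # EAs covering institutions
excluded_recreational = 402     # EAs classified as recreational areas

excluded_eas = excluded_institutions + excluded_recreational
frame_eas = total_eas - excluded_eas

# Both derived figures agree with the totals stated in the text.
assert excluded_eas == 1_321
assert frame_eas == 79_466
```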
On the second level, the listing exercise yielded the dwelling frame which facilitated the selection of dwellings to be visited. The dwelling unit is a structure or part of a structure or group of structures occupied or meant to be occupied by one or more households. Some of these structures may be vacant and/or under construction, but can be lived in at the time of the survey. A dwelling unit may also be within collective living quarters where applicable (examples of each are a house, a group of huts, a flat, hostels, etc.).
The Community Survey universe at the second-level frame depends on whether the different structures are classified as dwelling units (DUs) or not. Structures where people stay/live were listed and classified as dwelling units. However, there are special cases of collective living quarters that were also included in the CS frame. These are religious institutions such as convents or monasteries, and guesthouses where people stay for an extended period (more than a month). Student residences - based on how long people have stayed (more than a month) - and old-age homes not similar to hospitals (where people are living in a communal set-up) were treated the same as hostels, thereby listing either the bed or room. In addition, any other families staying in separate quarters within the premises of an institution (like wardens' quarters, military family quarters, teachers' quarters and medical staff quarters) were considered as part of the CS frame. The inclusion of such group quarters in the frame is based on the living circumstances within these structures: members are independent of each other, with the exception that they sleep under one roof.
The remaining group quarters were excluded from the CS frame because they are difficult to access and have no stable composition. Excluded dwelling types were prisons, hotels, hospitals, military barracks, etc. This is in addition to the exclusion, at the first level, of the enumeration areas (EAs) classified as institutions (military bases) or recreational areas (national parks).
The Selection of Enumeration Areas (EAs)
The EAs within each municipality were ordered by geographic type and EA type. The selection was done by using systematic random sampling. The criteria used were as follows:
In municipalities with fewer than 30 EAs, all EAs were automatically selected.
In municipalities with 30 or more EAs, the sample selection used a fixed proportion of 19% of all sampled EAs. However, if fewer than 30 EAs were selected in a municipality, the sample in that municipality was increased to 30 EAs.
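The allocation criteria above can be sketched as a small function. This is an illustration only: it assumes the 19% proportion is applied to the number of EAs in the municipality's frame and that fractional results are rounded up, neither of which is stated in the text. The function and constant names are hypothetical.

```python
MIN_EAS = 30           # take-all threshold and minimum sample size, from the text
EA_PERCENT = 19        # fixed sampling proportion, from the text


def ea_sample_size(total_eas: int) -> int:
    """Number of EAs to select in one municipality (explicit stratum)."""
    if total_eas < 0:
        raise ValueError("total_eas must be non-negative")
    if total_eas < MIN_EAS:
        # Fewer than 30 EAs: all EAs are selected automatically.
        return total_eas
    # Assumption: 19% of the frame, rounded up (integer ceiling division).
    proportional = (EA_PERCENT * total_eas + 99) // 100
    # The sample is raised to 30 EAs when the proportion yields fewer.
    return max(proportional, MIN_EAS)


# Illustrative municipality sizes (not taken from the survey).
for n in (12, 30, 100, 200):
    print(n, ea_sample_size(n))
```

Under these assumptions, the 30-EA floor is the binding rule for any municipality with between 30 and 157 EAs; the 19% proportion only takes over from 158 EAs upwards.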
The Selection of Dwelling Units
The second level of the frame required a full re-listing of dwelling units. The listing exercise was undertaken before the selection of DUs. The adopted listing methodology ensured that the listing route was determined by the lister. This approach facilitated the serpentine selection of dwelling units. The listing exercise provided a complete list of dwelling units in the selected EAs. Only those structures that were classified as dwelling units were considered for selection, whether vacant or occupied. This exercise yielded a total of 2 511 314 dwelling units. The selection of the dwelling units was also based on a fixed proportion: 10% of the total listed dwellings in an EA. A constraint was imposed on small-size EAs where, if the listed dwelling units were fewer than 10 dwellings, the selection was increased to 10 dwelling units. All households within the selected dwelling units were covered. There was no replacement of refusals, vacant dwellings or non-contacts, owing to their impact on the probability of selection.
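A minimal sketch of the second-stage selection follows. It assumes the 10% proportion is rounded up, that the minimum of 10 applies to the number of dwelling units selected, and that an EA with fewer than 10 listed dwelling units is taken in full. The systematic selection with a random start is a generic textbook procedure, not necessarily the exact routine Stats SA used; all names are hypothetical.

```python
import random


def du_sample_size(listed_dus: int) -> int:
    """Number of dwelling units to select in one EA."""
    # Assumption: 10% of listed DUs, rounded up, with a floor of 10.
    target = max((listed_dus + 9) // 10, 10)
    # An EA cannot contribute more DUs than were listed.
    return min(target, listed_dus)


def systematic_sample(units, n, rng):
    """Select n units at a fixed interval along the listing order."""
    if n <= 0:
        return []
    if n > len(units):
        raise ValueError("cannot select more units than were listed")
    interval = len(units) / n            # sampling interval (>= 1)
    start = rng.random() * interval      # random start in [0, interval)
    last = len(units) - 1
    # Clamp guards against floating-point rounding at the upper edge.
    return [units[min(int(start + i * interval), last)] for i in range(n)]


# Illustrative EA with 250 listed dwelling units, in listing-route order.
listing = list(range(1, 251))
rng = random.Random(2007)                # seeded only for reproducibility
selected = systematic_sample(listing, du_sample_size(len(listing)), rng)
print(len(selected))
```

Because the units are taken at a fixed interval along the listing route, the selected dwellings are spread across the whole EA rather than clustered in one part of it.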
The Community Survey sample gives equal selection probabilities to all elements in the cluster, which makes it a self-weighting systematic random sample. Since the sample is stratified by municipalities as demarcated at the time of Census 2001, the overall inclusion probability of a dwelling unit is the product of the first-level probability of selecting its EA and the second-level probability of selecting the dwelling unit within that EA. Also, since all households within a selected dwelling unit are covered, their probability of selection, given the dwelling unit, is always one.
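The product of the two stage probabilities can be illustrated as follows. The sketch assumes each unit within a stage has the same chance of selection (number selected divided by number available) and treats the design weight as the inverse of the inclusion probability, which is standard survey practice but is not described in this document; the published CS weights may include further adjustments. The counts in the example are invented.

```python
from fractions import Fraction


def inclusion_probability(eas_selected, eas_in_stratum, dus_selected, dus_listed):
    """Overall inclusion probability of a household in a selected DU."""
    p_ea = Fraction(eas_selected, eas_in_stratum)   # first-level probability
    p_du = Fraction(dus_selected, dus_listed)       # second-level probability
    p_household = 1                                 # all households in a DU are covered
    return p_ea * p_du * p_household


def design_weight(probability):
    """Base weight as the inverse of the inclusion probability (assumption)."""
    return 1 / probability


# Hypothetical stratum: 38 of 200 EAs selected; 25 of 250 DUs in one EA.
p = inclusion_probability(38, 200, 25, 250)
print(p, float(design_weight(p)))
```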
Dates of Data Collection
Data Collection Mode
Data Collection Notes
The main objective of enumeration is to collect and document particulars of all individuals and housing units with the selected respondent(s).
The adopted enumeration method for CS 2007 was canvassing, whereby the enumerator conducts a face-to-face interview with the respondent while simultaneously completing the questionnaire. The Community Survey adopted both the de jure and de facto approaches in order to compare with other Stats SA social statistics definitions and to give a comparison over time between the censuses. The ultimate objective was to have two estimates of the population: the de jure population estimates are mostly useful for long-term planning, and the de facto population estimates are mostly used for demographic estimations.
Enumerators visited the sampled dwelling units to interview households and ensure that the information required from them was captured on the questionnaires. Self-enumeration was not allowed. Enumeration started as planned on 7 February 2007 and was carried out over a three-week period, followed by a one-week non-response follow-up period. The mop-up exercise was carried out from 1 to 7 March and included follow-up on non-contacts, vacant dwellings, and unoccupied dwellings. However, due to the high number of dwelling units that were being mapped for the non-response follow-up period, the contracts of enumerators were extended beyond 28 February to assist the supervisors during that period.
The Fieldwork Supervisors (FWS) and Fieldwork Coordinators (FWC) conducted 100% quality checks for accuracy and completeness on all completed questionnaires. In addition, the DSCs, PSCs and Monitors did quality checks on randomly selected questionnaires and DUs and addressed problematic questions as they came up. The FWC also did 2% spot checks of selected dwelling units within their assigned fieldwork coordination unit to minimise bogus enumeration. Training played a big role in ensuring good quality data from the field. At district level, retraining was done in areas where fieldwork monitors felt that the work was not of the expected quality. A close watch was also kept on individual enumerators, and Fieldwork Supervisors and Fieldwork Coordinators who had problems performing according to expectations were retrained where necessary. Their work was also checked more frequently.
FWS were required to package questionnaires in their EA boxes and hand them over to the FWCs soon after the completion of the EA. The FWCs were required to sign for the receipt of the boxes after verifying their contents. They were in turn required to hand over the completed boxes to the DLOs for reverse logistics. DLOs were also required to sign for the receipt of the boxes after verifying their contents. The boxes were then stored in designated storage areas awaiting shipping back to the data processing centre in Pretoria.
Progress reporting for data collection was done on a daily basis. Provinces were provided with procedures and timelines for progress reporting and were able to report progress daily, although there were problems in the initial stages.
The Pilot Survey
A pilot survey was conducted in February 2006. The purpose of the pilot was to test all the developed strategies, methodologies, systems, and the questionnaire. A total of 782 EAs were covered in the pilot survey. During the pilot survey the effectiveness of instruments, processes and methods used within the scope of the CS were tested. A range of lessons were learnt which led to the refinement of processes, methods and systems towards the main survey.
Statistics South Africa
The design of the CS questionnaire was household-based and intended to collect information on up to 10 people. It was developed in line with the household-based survey questionnaires conducted by Stats SA. The questions were based on the data items generated out of the consultation process described above. Both the design and questionnaire layout were pre-tested in October 2005 and adjustments were made for the pilot in February 2006. Further adjustments were made after the pilot results had been finalised.
The Community Survey results were released on 24 October 2007. After the evaluation of the data by the Stats Council, the Community Survey was found to be comparable in many aspects with other Stats SA surveys, censuses and other external sources. However, there are some areas of concern where Statistics South Africa is urging users to be more cautious when using the Community Survey data.
The main concerns are:
· The institutional population is merely an approximation to 2001 numbers and it is not new data.
· The measure of unemployment in the Community Survey is higher and less reliable due to the differences in questions asked relative to the normal Labour Force Surveys.
· The income data include unreasonably high incomes for children, presumably due to misinterpretation of the question, e.g. listing a parent's income for the child.
· The distribution of households by province has very little congruence with the General Household Survey or Census 2001.
· The interpretation of grants, or of those receiving grants, needs to be done with caution.
· Since the Community Survey is based on a random sample and not a census, any interpretation should allow for some random fluctuation in the data, particularly for cells with small populations. The user should understand that the figures lie within a certain confidence interval.
Users should be aware of these statements as part of the cautionary notes:
· The household estimates at municipal level differ slightly from the national and provincial estimates in terms of the household variables profile.
· The Community Survey has added on an approximation of the population in areas not covered by the survey, such as institutions and recreational areas. This approximation of people could not provide the number of those households (i.e. institutions). Thus, there is no household record for those people approximated as living out of CS scope.
· Any cross-tabulation giving small numbers at municipal level should be interpreted with caution: a small value in a table cell is likely to over- or underestimate the true population.
· No reliance should be placed on numbers for variables broken down at municipal level (i.e. age, population group etc.). However, the aggregated total number per municipality provides more reliable estimates.
· Usually a zero total figure (excluding those in institutions) reflects the fact that no sample was realised, and in such cases this is likely to be a significant underestimate of the true population.
· As an extension of the above statement, in a number of instances the number realised in the sample, though not zero, was very small (maybe as low as a single individual) and in some cases had to be re-weighted by a very large factor (maximum nearly 800 for housing weight and over 1000 for person weight).
· As a further consequence, small sub-populations are likely to be heavily over- or under-represented at a household level in the data.
· It should be noted that the estimates were produced using the de facto population and not the de jure population. The final results are presented on the de jure population.
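The caution about small cells and large weights can be made concrete with a short sketch. The weights below are invented for illustration and are not taken from the dataset; the helper names are hypothetical. The point is that two cells can yield the same weighted estimate while resting on very different amounts of sample evidence.

```python
def weighted_total(weights):
    """Estimated population of a cell: the sum of its respondents' weights."""
    return sum(weights)


def largest_weight_share(weights):
    """Share of the cell estimate contributed by the single largest weight."""
    total = sum(weights)
    return max(weights) / total if total else 0.0


# Hypothetical cell A: one sampled person carrying a person weight of 1000.
cell_a = [1000.0]
# Hypothetical cell B: fifty sampled persons with a weight of 20 each.
cell_b = [20.0] * 50

for name, cell in (("A", cell_a), ("B", cell_b)):
    print(name, len(cell), weighted_total(cell), largest_weight_share(cell))
```

Both cells estimate 1000 people, but the estimate for cell A depends entirely on one respondent, so adding or losing a single interview would change it by the full amount. Checking the unweighted count behind each cell before interpreting a municipal-level table is one simple way to act on the cautionary notes above.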
Statistics South Africa. Community Survey 2007 [dataset]. Version 1.2. Pretoria: Statistics South Africa [producer], 2008. Cape Town: DataFirst [distributor], 2010. DOI: https://doi.org/10.25828/0nqv-ns26