M&E Studio
AI for M&E, Built for Practitioners

© 2026 Logic Lab LLC. All rights reserved.

Survey Design

The process of designing structured questionnaires and survey protocols to collect reliable, valid, and actionable data from a defined population.

When to Use

Use survey design when you need to collect structured, comparable data across a large number of respondents to answer specific quantitative questions about a population. Surveys are the primary instrument for baselines, midlines, and endlines. They are also used for needs assessments, coverage monitoring, and performance surveys. Use them when you need data that is systematic, replicable, and statistically representative.

Surveys are not the right tool when you need to understand why something is happening (use focus group discussions or key informant interviews), when the population is too small for statistical analysis (use qualitative methods), or when the question requires narrative or interpretive answers.

How It Works

Step 1: Define the evaluation questions the survey must answer

Every survey item should trace directly to an evaluation or monitoring question. Items without a clear question-to-indicator link should be removed. Unfocused surveys produce data that no one uses.

Step 2: Draft the instrument

Write items using clear, simple language. Each item should measure one thing. Avoid double-barrelled questions ("Do you feel safe and supported?"), leading questions, and jargon. Use established, validated instruments wherever they exist (e.g., HDDS for dietary diversity, WDDS for women's dietary diversity, MDD-W for minimum dietary diversity).

Step 3: Design the question flow and skip logic

Organise items into logical sections. Use skip logic to route respondents past irrelevant sections. Begin with non-sensitive, rapport-building questions. Place sensitive items (income, violence) toward the end.
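The routing described above can be sketched in code. The Python below is an illustrative sketch with hypothetical item names, not any tool's API: each item carries an optional relevance condition over the answers collected so far, the same idea that XLSForm-based platforms express in a "relevant" column.

```python
# Minimal sketch of skip logic: each item may carry a "relevant"
# predicate over earlier answers. Item names are hypothetical.
ITEMS = [
    {"name": "owns_livestock", "label": "Does the household own livestock?",
     "relevant": None},
    {"name": "livestock_count", "label": "How many animals does it own?",
     "relevant": lambda a: a.get("owns_livestock") == "yes"},
    {"name": "water_source", "label": "What is the main drinking water source?",
     "relevant": None},
]

def items_to_ask(answers):
    """Return the items a respondent should actually be shown."""
    return [i["name"] for i in ITEMS
            if i["relevant"] is None or i["relevant"](answers)]

# A non-owner is routed past the livestock count item.
print(items_to_ask({"owns_livestock": "no"}))
# -> ['owns_livestock', 'water_source']
```

Encoding routing as data rather than interviewer judgment is what lets digital platforms enforce it consistently.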

Step 4: Pilot the instrument

Test the draft with a small sample (15-30 respondents) from the same population type as the study. Identify misunderstood items, translation issues, and skip logic errors. Revise based on findings. Do not skip piloting - it is the single highest-return investment in data quality.

Step 5: Train enumerators

Enumerators must be trained on the instrument, interview protocols, consent procedures, and data entry. Run calibration exercises where pairs of enumerators interview the same respondent independently and compare results.
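The calibration comparison above can be quantified with a simple agreement rate. This is a sketch under the assumption that paired records share item names; field names are hypothetical.

```python
# Calibration check sketch: two enumerators interview the same respondent
# independently; compute the share of shared items on which they agree.
def agreement_rate(record_a, record_b):
    """Fraction of items recorded identically by both enumerators."""
    shared = set(record_a) & set(record_b)
    if not shared:
        return 0.0
    matches = sum(record_a[k] == record_b[k] for k in shared)
    return matches / len(shared)

a = {"hh_size": 5, "roof_material": "iron", "water_source": "borehole"}
b = {"hh_size": 5, "roof_material": "thatch", "water_source": "borehole"}
print(round(agreement_rate(a, b), 2))  # 2 of 3 items agree -> 0.67
```

Low agreement on a specific item (here, roof material) flags a definition that needs to be re-trained before fieldwork begins.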

Step 6: Implement with quality controls

Use digital data collection (SurveyCTO, KoBoToolbox, ODK) to enforce skip logic, range checks, and required fields. Conduct field supervision with back-check surveys (re-interviewing a random 10% sample to verify enumerator data). Review daily data reports during data collection.
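Two of the quality controls above, range checks and random back-check sampling, can be sketched as follows. Field names, ranges, and the fixed seed are illustrative assumptions, not any platform's built-in behaviour.

```python
import random

# Sketch of two survey quality controls: a plausibility range check on a
# numeric field, and selection of a reproducible random 10% back-check
# sample for re-interview. Field names and ranges are hypothetical.
def range_check(record, field, lo, hi):
    """Return True if the value is present and within a plausible range."""
    value = record.get(field)
    return value is not None and lo <= value <= hi

def back_check_sample(household_ids, share=0.10, seed=42):
    """Pick a reproducible random subset of households for re-interview."""
    rng = random.Random(seed)          # fixed seed keeps the sample auditable
    k = max(1, round(len(household_ids) * share))
    return sorted(rng.sample(household_ids, k))

records = [{"hh_id": i, "hh_size": s} for i, s in enumerate([4, 6, 52, 3], 1)]
flagged = [r["hh_id"] for r in records if not range_check(r, "hh_size", 1, 30)]
print(flagged)                                 # the hh_size of 52 is flagged
print(len(back_check_sample(list(range(1, 101)))))  # 10 of 100 households
```

In practice these rules live inside the form definition (e.g. constraint expressions), so implausible values are rejected at entry rather than cleaned afterwards.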

Key Components

  • Coverage - which topics and indicators are included, and which are deliberately excluded
  • Question types - Likert scales, multiple choice, open-ended, ranking, observation-based
  • Response categories - exhaustive, mutually exclusive, and appropriate for the population's understanding
  • Skip logic - routing that prevents irrelevant questions and reduces respondent burden
  • Translation and back-translation - if conducted in a language other than English, translate forward, then independently back-translate to verify meaning
  • Piloting protocol - plan for who, where, and how the instrument will be tested before deployment
  • Data entry and validation rules - built-in range checks and required fields for digital data collection

Best Practices

Use validated instruments. Drafting new items for domains where established instruments already exist (dietary diversity, food security, WASH) introduces comparability problems and quality risks. Prefer tools with documented validity and reliability wherever they are available.

Collect outcome data, not just output data. Many surveys track what was delivered (outputs) rather than what changed (outcomes). Outcome indicators require outcome questions.

Collect baseline data before the programme starts. Without baseline data, change cannot be measured and impact cannot be assessed.

Match survey timing to measurement logic. Some outcomes need time to materialise. Collecting endline data three months after a two-year programme ends may be too early to detect genuine change.

Keep instruments short. Respondent fatigue produces lower quality data in the second half of long surveys. Aim for under 45 minutes for household surveys. Every item cut improves data quality on the items that remain.

Common Mistakes

Over-designing the instrument. Adding items "just in case" produces surveys that are too long, tire respondents, and generate data that is never analysed. Every item costs respondent time, enumerator time, and analysis effort.

Skipping the pilot. Pilots reveal translation problems, confusing items, and skip logic errors that are invisible on paper. Piloting with 20 respondents typically surfaces 80% of instrument problems.

Collecting data that cannot change the analysis. If you cannot afford to act on a negative finding, do not collect the data. Collecting data without intention to use it wastes respondent time and erodes community trust.

Failing to standardise across enumerators. If different enumerators interpret and administer items differently, the resulting data is not comparable. Calibration training and back-check protocols address this.

Examples

WASH baseline, East Africa. A UNICEF-funded WASH programme in Ethiopia used the WASH Conditions Assessment Tool as the basis for its baseline survey, adding 12 programme-specific items on hygiene behaviour. The 40-minute household survey was piloted in two villages outside the programme area before deployment. Calibration exercises between enumerator pairs identified a misunderstood definition of "improved latrine" that was corrected before field data collection. The final survey was administered to 1,800 households across three districts.

Food security survey, West Africa. A WFP-funded programme in Mali used the Household Food Insecurity Access Scale (HFIAS) and the Household Dietary Diversity Score (HDDS) as the core of its monitoring survey. These validated instruments enabled comparison with WFP's global database and with the programme's own baseline. Local language translation used forward-translation by bilingual programme staff followed by independent back-translation by a university linguist.

Compared To

Method | Data Type | Sample Size | Depth
Survey | Structured quantitative | Large (100-5,000+) | Shallow to medium
Focus Group Discussions | Qualitative | Small (6-12 per group) | Deep
Key Informant Interviews | Qualitative | Small (10-30) | Very deep
Observation Methods | Direct observation | Variable | Medium

Related Topics

  • Sampling Methods - how to select who to survey
  • Baseline Design - designing the first data collection point that surveys enable comparison against
  • Data Quality Assurance - the processes for verifying survey data quality
  • Validity - whether the survey measures what it is intended to measure
  • Reliability - whether the survey produces consistent results

Further Reading

  • USAID (2012). Performance Monitoring and Evaluation TIPS: Conducting Key Informant Interviews. USAID PNAC. Also covers surveys.
  • Grosh, M. & Glewwe, P. (eds.) (2000). Designing Household Survey Questionnaires for Developing Countries. World Bank. Comprehensive design reference.
  • KoBoToolbox (2024). Free digital data collection platform with survey design support. kobo.humanitarianresponse.info

At a Glance

Designs structured questionnaires that collect valid, reliable data from a representative population to answer specific evaluation or monitoring questions.

Best For

  • Baseline, midline, and endline data collection
  • Measuring outcomes across large populations
  • Generating comparable data across time points or sites

Linked Indicators

34 indicators across 4 donor frameworks

USAID, DFID, WHO, UNICEF

Examples

  • Percentage of survey items with confirmed face validity post-piloting
  • Interviewer consistency rate across enumerator pairs
  • Response rate for primary survey instrument

Related Topics

  • Sampling Methods - Systematic approaches for selecting a subset of a population to represent the whole, balancing statistical validity with practical constraints.
  • Baseline Design - A structured approach to collecting initial condition data that directly informs project decisions, minimizes burden, and enables valid comparison with endline measurements.
  • Data Quality Assurance - A systematic process for verifying that collected data meets five quality dimensions (Validity, Integrity, Precision, Reliability, and Timeliness), ensuring data is fit for decision-making.
  • Key Informant Interviews - In-depth, semi-structured interviews with individuals selected for their specific knowledge, experience, or perspectives relevant to the evaluation questions.
  • Focus Group Discussions - A qualitative data collection method that brings together 6-10 participants to discuss a specific topic, generating rich insights through group interaction and shared experiences.
  • Validity (Internal & External) - The degree to which an evaluation accurately demonstrates causal relationships (internal validity) and generalizes findings beyond the study context (external validity).
  • Reliability - The consistency and repeatability of a measurement: whether the same tool produces stable results across repeated applications, different raters, or different time periods.
  • Bias - Systematic error in data collection, analysis, or interpretation that distorts results and threatens the validity of M&E findings.

Related Guides

How to Write AI Prompts That Actually Work for M&E
Stop getting generic outputs. The 4Cs Framework helps you write prompts that produce donor-ready indicators, analysis, and reports on the first try.
How to Build Better Surveys with AI
Most AI survey tools stop at generating questions. This guide covers the full lifecycle: choosing question types, catching bias, adding skip logic, and piloting before you deploy.