🧬 6.1 Introduction to Qualitative Research

Qualitative research explores the why and how of experiences, behaviours, and meanings. Instead of measuring quantities (grams, mmol/L), it examines language, stories, and context—for example, why a family chooses certain foods, how participants experience a diet, or what barriers hippos (ok, stakeholders 🦛) face when changing routines.

Unlike quantitative methods that test hypotheses with numbers, qualitative approaches seek rich descriptions and interpretations. You’ll often work with interviews, focus groups, open-ended survey responses, diaries, field notes, or observations.

🎯 Objectives

Understand what qualitative research is (and isn’t), and when to use it.
Compare qualitative vs. quantitative approaches and how they complement each other.
Get hands-on: load and lightly explore open-ended text (food preference notes).
Prepare for rigorous analysis (sampling, reflexivity, ethics, coding frameworks).

Key concepts

Data types: interviews, focus groups, observation notes, diaries, open-ended survey text.

Designs: phenomenology (lived experience), grounded theory (theory building), ethnography (culture/setting), case study, narrative analysis, reflexive thematic analysis.

Sampling: purposive, maximum variation, snowball, theoretical sampling. Recruitment often continues until saturation, the point at which additional data yield no new themes.

Trustworthiness: credibility (member checks, triangulation), dependability (audit trail), confirmability (reflexivity), transferability (thick description).

Ethics: consent, confidentiality, anonymisation, data minimisation, secure storage.
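
To make the idea of saturation concrete, here is a minimal sketch that counts how many previously unseen codes each successive interview contributes. The code labels and the helper name `new_codes_per_interview` are invented for illustration; they are not part of the course dataset or any library.

```python
# Hypothetical example: codes assigned to each interview, in collection order.
# These labels are invented for illustration, not taken from the dataset.
codes_per_interview = [
    {"taste", "cost"},
    {"taste", "habit"},
    {"cost", "family"},
    {"habit", "taste"},
    {"family", "cost"},
]


def new_codes_per_interview(coded):
    """Return how many previously unseen codes each interview contributes."""
    seen = set()
    new_counts = []
    for codes in coded:
        fresh = codes - seen          # codes not met in earlier interviews
        new_counts.append(len(fresh))
        seen |= codes                 # remember everything seen so far
    return new_counts


new_counts = new_codes_per_interview(codes_per_interview)
print(new_counts)  # [2, 1, 1, 0, 0]
```

A run of zeros at the end is a prompt to discuss whether saturation has been reached, not proof of it; the judgement still depends on your sampling and research question.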

🧭 When to use qualitative methods?

To understand experiences (e.g., why participants prefer certain foods).
To explore contexts and systems (e.g., food access, cultural norms).
To generate hypotheses and inform interventions or surveys.
To explain unexpected quantitative results (mixed methods).

🔧 Setup (Colab)

Clone the course repository, or upload food_preferences.txt (open-ended responses) if cloning fails.

import os
from google.colab import files

MODULE = '06_qualitative'
DATASET = 'food_preferences.txt'
BASE_PATH = '/content/data-analysis-projects'
MODULE_PATH = os.path.join(BASE_PATH, 'notebooks', MODULE)
DATASET_PATH = os.path.join(MODULE_PATH, 'data', DATASET)

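try:
    if not os.path.exists(BASE_PATH):
        print('Cloning repository...')
        # Clone into BASE_PATH explicitly so the paths above resolve whatever the current directory is
        !git clone https://github.com/ggkuhnle/data-analysis-projects.git {BASE_PATH}
    os.chdir(MODULE_PATH)
    if not os.path.exists(DATASET_PATH):
        raise FileNotFoundError('Dataset missing after clone.')
    print('Dataset ready ✅')
except Exception as e:
    # Fallback: report what went wrong, then ask for a manual upload
    print(f'Setup fallback ({e}): upload file...')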
    os.makedirs('data', exist_ok=True)
    uploaded = files.upload()
    if DATASET in uploaded:
        with open(os.path.join('data', DATASET), 'wb') as f:
            f.write(uploaded[DATASET])
        print('Uploaded dataset ✅')
    else:
        raise FileNotFoundError('Upload food_preferences.txt to continue.')

📥 Load the qualitative data

We treat each line as one response (e.g., short interview note or survey comment).

import pandas as pd
from pathlib import Path

txt = Path('data')/'food_preferences.txt'
responses = [r.strip() for r in txt.read_text(encoding='utf-8').splitlines() if r.strip()]
df = pd.DataFrame({'response_id': range(1, len(responses)+1), 'text': responses})
print('N responses:', len(df))
df.head(5)

🔍 First look (light-touch)

Before coding/themes, a quick familiarisation pass helps: skim, note recurring words, surprising phrases, and tone.

for i, row in df.head(8).iterrows():
    print(f"{row['response_id']:>2}: {row['text']}")
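
As part of familiarisation, it can also help to know how long the responses are, since very short answers may need different handling from longer narratives. The sketch below is one possible way to summarise length in words. The function name `length_summary` and the sample responses are invented for illustration; in the notebook you could pass `df['text'].tolist()` instead.

```python
from statistics import median


def length_summary(texts):
    """Summarise response lengths in words (whitespace-separated tokens)."""
    lengths = [len(t.split()) for t in texts]
    return {
        "n": len(lengths),
        "min_words": min(lengths),
        "median_words": median(lengths),
        "max_words": max(lengths),
    }


# Invented sample responses, for illustration only
sample = [
    "I like sweet fruit after a swim.",
    "Grass is fine.",
    "Carrots are too bitter for me, but my calf loves them.",
]
summary = length_summary(sample)
print(summary)
```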

📏 Rigour in qualitative work

Reflexivity: keep a short reflexive memo about your assumptions, role, and decisions.
Audit trail: version a codebook, note inclusion/exclusion criteria, justify transformations.
Ethics: anonymise identifiers, store raw audio/text securely, manage consent and withdrawal.
Triangulation: compare data sources (e.g., interview + observation + logs).
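
The anonymisation point above can be sketched as a small redaction helper. This is a minimal illustration under stated assumptions: the regular expressions, the `anonymise` function, the pseudonym `P01`, and the example sentence are all hypothetical, and real projects need a reviewed, project-specific rule set plus a manual check.

```python
import re

# Illustrative patterns only; they will miss some identifiers and over-match others.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
PHONE = re.compile(r"(?<!\w)\+?\d[\d\s-]{7,}\d(?!\w)")


def anonymise(text, pseudonyms):
    """Mask e-mail addresses and phone numbers, then swap known names for pseudonyms."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    for name, pseudonym in pseudonyms.items():
        # Whole-word, case-insensitive replacement of each known name
        text = re.sub(rf"\b{re.escape(name)}\b", pseudonym, text, flags=re.IGNORECASE)
    return text


# Invented example sentence; the name, address, and number are fictional
raw = "Hilda said to email hilda@example.org or ring 07700 900123."
clean = anonymise(raw, {"Hilda": "P01"})
print(clean)  # P01 said to email [EMAIL] or ring [PHONE].
```

Keep the name-to-pseudonym mapping separate from the analysis files and store it securely, because it is itself identifying.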

🧪 Tiny, safe quantifications

Counting words isn’t the analysis, but it can help orient you. We’ll keep this light so that counts never drive the interpretation.

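import re

targets = ['carrot', 'carrots', 'fruit', 'grass', 'sweet', 'bitter']
word_counts = {target: 0 for target in targets}
for text in df['text'].str.lower():
    for target in targets:
        # Whole-word match, so 'carrot' is not also counted inside 'carrots'
        word_counts[target] += len(re.findall(rf'\b{re.escape(target)}\b', text))
word_counts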
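
Because a count strips away meaning, a keyword-in-context (KWIC) view is a useful companion: it shows each hit with the words around it. The sketch below is one simple way to do this. The function name `kwic`, the `width` parameter, and the sample responses are invented for illustration; in the notebook you could pass `df['text'].tolist()`.

```python
import re


def kwic(texts, keyword, width=30):
    """Return (response_number, snippet) pairs showing `keyword` with surrounding text."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", flags=re.IGNORECASE)
    hits = []
    for i, text in enumerate(texts, start=1):
        for m in pattern.finditer(text):
            start = max(m.start() - width, 0)      # clip at the start of the response
            end = min(m.end() + width, len(text))  # clip at the end of the response
            hits.append((i, text[start:end]))
    return hits


# Invented sample responses, for illustration only
sample = [
    "Sweet fruit is a treat, but only on market days.",
    "I avoid anything bitter.",
    "The children ask for sweet snacks when they are tired.",
]
hits = kwic(sample, "sweet", width=15)
for idx, snippet in hits:
    print(idx, "|", snippet)
```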

🧩 Next steps: from familiarisation → coding → themes

Generate initial codes (labels on meaningful segments).
Group codes into candidate themes.
Review & refine themes against the data.
Define & name themes; select vivid excerpts.
Report with thick description and transparent decisions.
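
The steps above can be sketched as a simple data structure: segments coded by a human, plus a mapping from codes to candidate themes. Everything here (the excerpts, the codes, the theme names, and the helper `group_by_theme`) is hypothetical and only illustrates how coded material might be organised; it does not automate the interpretive work.

```python
from collections import defaultdict

# Hypothetical coded segments: (response_id, excerpt, code) assigned by a human coder.
segments = [
    (1, "too expensive here", "cost"),
    (2, "the market is far", "travel"),
    (3, "sweet things after work", "taste"),
    (4, "prices went up again", "cost"),
]

# Hypothetical grouping of codes into candidate themes.
code_to_theme = {"cost": "Access", "travel": "Access", "taste": "Sensory appeal"}


def group_by_theme(segments, code_to_theme):
    """Collect coded segments under their candidate theme."""
    themes = defaultdict(list)
    for response_id, excerpt, code in segments:
        # Codes without a theme yet are kept visible rather than dropped
        theme = code_to_theme.get(code, "Unassigned")
        themes[theme].append((response_id, code, excerpt))
    return dict(themes)


themes = group_by_theme(segments, code_to_theme)
for theme, items in themes.items():
    print(f"{theme} ({len(items)} segments)")
    for response_id, code, excerpt in items:
        print(f"  [{response_id}] {code}: {excerpt}")
```

Keeping the excerpt next to its code and theme makes it easier to review themes against the data and to pick quotations for the report.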

👉 You’ll practise coding in 6.2 and 6.3 (with reliability checks).

🧩 Exercises

Reflexive memo (5–8 lines): what prior beliefs might shape your interpretations?
Context probe: list three non-text sources you’d triangulate (e.g., observation, diet logs, environmental notes)—why?
Ethics: identify any direct identifiers in these responses; propose an anonymisation rule.

✅ Conclusion

You’ve set a solid foundation for qualitative work—design choices, rigour, ethics, and a first “feel” for the text. Next, we’ll preprocess and move toward codes and themes.