Qualitative Data Analysis
Thematic Analysis
Research can feel like wandering in a forest without a map. You collect interviews, focus group transcripts, or field notes, and suddenly there is a pile of words staring back at you. Where do you even start? How do you make sense of all that information without losing what people actually said?
This is where thematic analysis comes in. It is a practical and flexible way to make sense of qualitative data. Instead of crunching numbers, you are looking for patterns, ideas, and stories hidden in the words. You start small, reading and re-reading, noticing repeated phrases, then giving those ideas short labels, called codes. Later, you group these codes into bigger themes that reveal the main patterns across your data.
In this post, we will walk through the four main phases of thematic analysis: familiarisation, coding, generating themes, and interpreting themes. For each phase, we will include concrete interview examples so you can see exactly how the process works. By the end, you will have a clear, practical understanding of how to move from raw transcripts to themes that tell a compelling story.
Familiarisation
Before you start analysing, you need to get close to your data. This phase is about understanding what participants actually said, not what you think they meant. You begin by reading one interview from start to finish, then another, and then you return to the first one again. Slowly, certain phrases and ideas begin to stand out.
For instance, imagine you interviewed secondary school teachers about their work experiences. In one transcript, a teacher says, “I take work home every single night.” In another, a different teacher mentions, “There’s never enough time to finish marking during school hours.” At this stage, you don’t assign codes yet. You simply notice that workload keeps appearing and jot down brief notes like “workload stress” or “time pressure.” These notes are your first impressions.
Sometimes, you may transcribe the interviews yourself. This helps you catch nuances such as tone, pauses, and emotion. For example, if a teacher sighs before talking about lesson preparation, that sigh conveys meaning beyond the words. Braun and Clarke (2006) describe familiarisation as immersion in the data to fully understand its content and context before starting formal analysis. This deep familiarity reduces the risk of misinterpretation later.
The goal here is not to be clever or quick. You are simply listening and reading attentively. Some parts of the transcripts may surprise you, while others might feel repetitive—but that repetition matters. By the end of this phase, you should feel like you understand the participants’ experiences, routines, and concerns.
Familiarisation forms the foundation for everything that comes next. Without it, coding and theme generation can feel mechanical, detached, or shallow. Taking time at this stage ensures your analysis truly reflects the real experiences and voices of the participants.
Coding
Once you are familiar with the data, you begin coding. This is where you actively start organising meaning. You go back to the transcripts and examine them line by line, highlighting sections that seem relevant to your research question.
Using the same teacher interviews, take this sentence:
“I barely have time to rest because I’m always preparing lessons or marking scripts.”
You might code this as workload pressure. Later in the same interview, the teacher says:
“We handle too many students with very limited support.”
That section could be coded as large class size or lack of institutional support.
Now imagine you repeat this process across ten interviews. Every time a teacher talks about long hours, excessive marking, or administrative paperwork, you apply the same code, workload pressure. Over time, you realise this code appears in almost every transcript. That tells you something important.
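If you keep your coded excerpts in a simple table, a few lines of Python can show how widely a code is spread across transcripts. The sketch below is only an illustration: the transcript IDs (T1 to T3) and the decision to tag each quote with a particular code are assumptions made for the example, not results from a real study.

```python
from collections import Counter

# Hypothetical coded segments: (transcript_id, excerpt, code).
# The IDs and the code assignments are illustrative assumptions.
coded_segments = [
    ("T1", "I barely have time to rest because I'm always preparing lessons or marking scripts.", "workload pressure"),
    ("T1", "We handle too many students with very limited support.", "large class size"),
    ("T2", "I take work home every single night.", "workload pressure"),
    ("T3", "There's never enough time to finish marking during school hours.", "workload pressure"),
]

def transcripts_per_code(segments):
    """Count how many distinct transcripts each code appears in."""
    # A set removes repeats of the same code within one transcript.
    seen = {(transcript_id, code) for transcript_id, _, code in segments}
    return Counter(code for _, code in seen)

coverage = transcripts_per_code(coded_segments)
for code, n in coverage.most_common():
    print(f"{code}: {n} transcript(s)")
```

Counting transcripts rather than raw mentions keeps one very talkative participant from inflating a code. The count is a prompt to go back and read, not a finding in itself.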
Coding can be done manually or using software, but the logic remains the same. You are tagging pieces of data with labels that capture their core meaning. According to Braun and Clarke (2006), coding should be systematic and inclusive, meaning you code all relevant data rather than only striking quotes.
This stage is flexible. You may change code names, merge similar codes, or split broad codes into smaller ones. That is normal. Coding is not about perfection. It is about building a structured map of what is happening in the data.
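Renaming and merging codes can be recorded as an explicit mapping, so every decision stays visible and repeatable. This is a minimal sketch; the first-pass code names and the merge decisions are hypothetical and would come from your own reading of the data.

```python
# Hypothetical working codes from a first pass through the transcripts.
first_pass = [
    "long hours",
    "too much marking",
    "marking overload",
    "long working hours",
    "paperwork",
]

# Assumed revision decisions: near-duplicate codes are merged under one name.
revisions = {
    "long hours": "long working hours",
    "too much marking": "marking overload",
    "paperwork": "administrative burden",
}

def revise(codes, revisions):
    """Apply rename/merge decisions; codes without a revision stay unchanged."""
    return [revisions.get(code, code) for code in codes]

revised = revise(first_pass, revisions)
print(sorted(set(revised)))
```

Keeping the mapping separate from the data means you can undo a merge later simply by deleting one entry.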
By the end of coding, your transcripts will be full of labels. These labels become the raw material for developing themes.
Generating Themes
Now, here is where the real sense-making begins.
You look at your list of codes and ask a simple question: which of these seem to talk about the same underlying issue? For example, you may have the following codes:
Workload pressure
Long working hours
Marking overload
Administrative burden
Individually, these are codes. Together, they tell a broader story. You might group them under a theme called Excessive Work Demands.
In another cluster, you may find codes like:
Lack of teaching materials
Overcrowded classrooms
Insufficient funding
These might form a second theme, such as Resource Constraints.
Themes are not just categories. They are patterns of shared meaning across the dataset. Braun and Clarke (2006) describe themes as capturing something important about the data in relation to the research question and representing a level of patterned response or meaning.
At this stage, you actively move codes around. Some codes may not fit anywhere and are set aside. Others may become subthemes. For instance, workload pressure could later split into emotional exhaustion and time management strain if the data supports it.
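One way to keep track of this sorting is a plain mapping from each theme to its codes, plus a check for codes that have no home yet. The sketch below assumes the two themes and seven codes from the teacher example; the extra code "pride in student progress" is hypothetical and included only to show a leftover.

```python
# Candidate themes and the codes grouped under them.
themes = {
    "Excessive Work Demands": [
        "workload pressure",
        "long working hours",
        "marking overload",
        "administrative burden",
    ],
    "Resource Constraints": [
        "lack of teaching materials",
        "overcrowded classrooms",
        "insufficient funding",
    ],
}

# Every code produced during coding, including one hypothetical leftover.
all_codes = [
    "workload pressure",
    "long working hours",
    "marking overload",
    "administrative burden",
    "lack of teaching materials",
    "overcrowded classrooms",
    "insufficient funding",
    "pride in student progress",  # hypothetical code, not from the example
]

def build_lookup(themes):
    """Invert the theme -> codes mapping into code -> theme."""
    return {code: theme for theme, codes in themes.items() for code in codes}

lookup = build_lookup(themes)

# Codes that do not belong to any theme yet.
leftover = [code for code in all_codes if code not in lookup]

for code in all_codes:
    print(f"{code} -> {lookup.get(code, '(not yet assigned)')}")
```

Moving a code to a different theme is then a one-line change, and the leftover list shows what still needs a decision.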
Generating themes requires interpretation. You are no longer just describing what was said, but explaining what it means across participants. This is where thematic analysis becomes powerful, because it turns individual voices into insights that reflect shared experiences.
Interpretation
So now that you have your themes, the next step is interpretation. This is where you stop just describing what the data says and start thinking about what it actually means. You’re looking at patterns, but also trying to explain them in a way that connects to real experiences.
Take the theme Excessive Work Demands. The codes behind it—workload pressure, long working hours, marking overload, administrative burden—show that teachers are constantly stretched in different directions. They aren’t just teaching lessons; they’re handling paperwork, marking mountains of scripts, and managing admin tasks. When you interpret this, you ask questions like: How does all this affect their energy, motivation, or stress levels? Does it change how they interact with students or plan lessons?
In a similar manner, if you notice another theme like Resource Constraints, you can start linking it to Excessive Work Demands. For example, not having enough materials or support may make the workload feel even heavier. So, the interpretation isn’t just about listing what’s happening—it’s about seeing why it matters and how the themes relate to each other.
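A quick way to see where two themes might be linked is to count how often they turn up in the same transcript. This is a rough sketch under stated assumptions: the transcript IDs and the themes assigned to each one are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical record of which themes were identified in each transcript.
themes_by_transcript = {
    "T1": {"Excessive Work Demands", "Resource Constraints"},
    "T2": {"Excessive Work Demands"},
    "T3": {"Excessive Work Demands", "Resource Constraints"},
}

def theme_cooccurrence(themes_by_transcript):
    """Count how many transcripts contain each pair of themes."""
    counts = Counter()
    for themes in themes_by_transcript.values():
        # Sorting makes each pair appear in one consistent order.
        for pair in combinations(sorted(themes), 2):
            counts[pair] += 1
    return counts

pairs = theme_cooccurrence(themes_by_transcript)
for (first, second), n in pairs.items():
    print(f"{first} + {second}: {n} transcript(s)")
```

Co-occurrence only tells you where to look. Whether limited resources actually make the workload feel heavier is something you establish by re-reading what those participants said.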
At this stage, it also helps to think about broader implications. Maybe teachers cope by prioritising tasks or working longer hours at home. Maybe some feel burnt out. You can compare these insights to what other studies have found, or think about what they tell you about the school environment.
Basically, interpretation is the stage where your data becomes a story. It shows what participants are experiencing, why those experiences happen, and why they matter. In the case of Excessive Work Demands, it points to the need for better workload management, more support from schools, and policies that protect teachers’ wellbeing.
So, in the end, you’re not just summarising data. You’re connecting the dots, explaining patterns, and showing why it all matters in real life.
By the end of this phase, your analysis begins to look like a story, not a list.