How to... undertake case study research
What is research design?
A research design is a plan for getting from your original question or hypothesis to obtaining workable results from your research, on which you can base defensible conclusions.
Good case study design involves providing empirical data for analysis and conclusion (Gummesson, 2007), and doing so in a way that stands up to scrutiny.
Defining the research
The first task is to decide what it is you are trying to find out by defining your research question. Carrying out a literature review is an essential precursor to most research, and is a good way of getting ideas for research questions.
The questions need to be suitable: large enough to provide sufficient scope for research, but new enough not to have already been answered.
Yin (2009, location no. 827) advises narrowing down the research question to something more specific, in order to look for relevant evidence. This may take the form of a proposition.
The role of theory
There is some debate as to whether or not it is appropriate to use theory at this stage, but some initial delving into theory will help you further define the parameters of the case you are investigating.
Note too that it is important to understand your own ontological and epistemological perspective: are you carrying out interpretive or positivist research? Broadly speaking, the positivist approach looks at objective reality, which exists beyond the human mind, whereas the interpretivist approach sees knowledge of the world as inevitably affected by the observer.
When considering your theoretical position at the outset, it is important not to lose sight of an important practical consideration: will the case you have chosen (or are considering choosing) cooperate with your research? You need a case where people will be helpful, leading you to key informants, providing access to documents, and allowing you to interview or survey staff. For example, a school might illustrate an important theoretical point, but if teachers and pupils refuse to engage and you can't gain access to classrooms, it will not be of much use.
Unit of analysis
The beginning of the research process is all about definition: not only your research question, but also your unit of analysis, which is the actual object or entity being studied. The unit must also be at the same level as the object of the proposition (Gerring and McDermott, 2007). For example, a company or business could be the unit of analysis, and the object of the proposition could be to examine company performance.
On the other hand, your unit of analysis might be an individual or a small group, for example if you were looking at the effects of a particular social intervention such as whether or not neighbourhood policing could reduce crime. It could even be something less tangible, such as a community, a decision, a project, or even a book marketing campaign.
Population and sampling in case selection
Population – the group of people or the area you are investigating – and the sample (the subset of the population you are studying) are both important research principles (see Sampling techniques). Both apply in case study research.
Seawright and Gerring (2008, pp. 295-296) claim that the selection of cases has the same objectives as random sampling in that what is desired is a representative sample and useful variation on the dimensions of theoretical interest. However, given the difficulties of getting a representative case, on both practical and theoretical grounds, they suggest that purposive sampling may be more appropriate (p. 296).
Developing the instrument – different case designs
The work of defining the research questions and proposition is unique to each study, but when it comes to selecting and developing the instrument, there are a number of different possible research designs for case studies.
Single vs multiple case design
This simply means choosing whether your study will include just one, or several cases.
Both types of case study design have their advantages.
Yin (2009, location no. 1201) lists five rationales for single cases:
- A critical case – i.e. one that can test a particular theory.
- An extreme or unique case – for example, a study of a rare disorder.
- A representative case – a case that is representative, or typical, of a particular situation.
- A revelatory case – one that reveals a phenomenon hitherto unexplored.
- A longitudinal case – a study of changes over time.
The big advantage of multiple case studies is that evidence is provided from many sources, thus making it easier to generalize. A single case, on the other hand, may be considered idiosyncratic. The use of multiple case designs has therefore become more frequent over recent years.
Another advantage of multiple case design is the methodological similarity with the experiment. This has been pointed out by a number of authors – Gerring and McDermott (2007), Lloyd-Jones (2003) and Yin (2009) – despite the fact that the case study is normally considered a qualitative method.
Its disadvantage, however, is its resource intensiveness.
Holistic vs embedded
Single cases and multiple cases can be holistic or embedded. A holistic case is one where the case is the unit of analysis; an embedded one is where there are several units of analysis in the case. This can be represented by Table I.
Table I
          | Single case                             | Multiple cases
Holistic  | One case with one unit of analysis      | Several cases each with one unit of analysis
Embedded  | One case with several units of analysis | Several cases each with several units of analysis
For example, a case could be about a school and its response to a new demographic trend or government edict, in which case it would be holistic. If within the school, several different classes were studied, then these sub-units, or "mini cases", would be embedded within the overall case.
Combining the case method with other methods
Some researchers combine case studies with other methods, such as a survey: for example, you could conduct a survey of several local councils, and provide a case study of one council. This type of approach has the benefit of combining qualitative and quantitative research.
Hitherto it has been assumed that theory is developed as part of the initial research work, drawing from the literature review, and that data are analysed against theory. However, with grounded theory design, the opposite happens: data are collected first of all, then theory developed, then more data are collected and compared with the theory, so the whole becomes an iterative process (see How to... implement grounded theory).
The iterative quality of grounded theory would seem to remove it from consideration in orthodox case study design.
Quality in case study research
All research needs to conform to the following quality criteria:
1. Construct validity – this is all about making sure the research uses the right operational measures, appropriate to what is being studied. Construct validity can be improved by:
- Multiple sources of evidence, i.e. data collection methods, which can be triangulated against one another.
- Having a chain of evidence.
- Letting key informants review the draft (Yin, 2009, location no. 1110).
2. Internal validity – this seeks to establish a causal relationship, and is relevant for explanatory rather than exploratory cases. The researcher needs to establish that x causes y, and show that there are no other factors that could have played a part in y.
3. External validity – the extent to which it is possible to generalize from the findings of case studies. Many would say that it is not possible, on the grounds that case studies are too particular (although this applies less to multiple case studies).
Surveys, based on a sample of a larger population, allow for statistical generalization. Case studies, on the other hand, can offer results which can be generalized against a particular theory. This is known as analytical generalization.
4. Reliability – another researcher should be able to go in and repeat the case study, and come up with the same findings. (Note that this is different from being able to replicate the results in another case.) The way to make this possible is by documenting the procedures in the research.
Using case studies to generate theory
Cepeda and Martin (2005) see theory building as a key stage in the case study research process. After the collection of data, there is a stage for reflection, which enables the researcher to update the initial conceptual framework on which the research was based. The result is a cyclical process of theory, producing a research process giving rise to data from which fresh theory can be formulated, and fresh research carried out. Because research takes place "in the field", there is a close relationship between theory and what is happening on the ground.
Figure 1. Cepeda and Martin's view of conceptual frameworks and the research cycle (2005, p. 861)
In the most elementary sense, the research design is the logical sequence that connects the empirical data to a study’s initial research questions and, ultimately, to its conclusions (Yin, 2003). It can also be seen as a blueprint, chain of evidence, or logical model of proof. It needs to maximize construct validity, internal validity, external validity and reliability.
In this research design, I present the methodological issues of the thesis: the unit of analysis, the reasons for selecting the organizations, the data sources that were used, and how the data were collected and analyzed. It is presented in such a way that other researchers can replicate this research (Yin, 2003).
Unit of analysis
For a case study it is important to define the case, in terms of what the case is, and where the case leaves off (Miles and Huberman, 1994; Yin, 2003). This is a problem for many researchers in case studies. The more a study contains specific propositions, the more it will stay within feasible limits, and also the context has to be clear (Yin, 2003).
This case study is multiple and holistic (Yin, 2003). I have conducted interviews within two different organisations. Case studies can be single or multiple-case designs, where a multiple design must follow a replication rather than sampling logic. When no other cases are available for replication, the researcher is limited to single-case designs. One of the rationales to justify a single case study is if theory has specified a clear set of testable propositions (Yin, 2003), which is the case in this research. To test the propositions I made use of the customization characteristics identified in the literature study. Also theory has to specify the circumstances within which the propositions are believed to be true (Yin, 2003). This is also the case in this research, because it focuses on online delivered content as the unit of analysis. During this research, I had the opportunity to conduct multiple case studies. Multiple case studies are preferred, because they can be more robust than a single case study and, depending on the results, can strengthen the external validity (Yin, 2003).
Single-case and multiple-case studies can further be classified as holistic or embedded. In an embedded case study, the case is split into multiple units of analysis, while a holistic case study has one unit of analysis for each case. The unit of analysis of each case is that part of the company that is relevant to answering the main research question, also called logical subunits (Yin, 2003). When no logical subunits can be identified, the holistic design is advantageous. When the unit of analysis changes during the study, the researcher can be forced to start over. The pitfall of an embedded case study is that the researcher focuses too much on a single subunit and fails to return to the larger unit of analysis.
Site selection criteria
During this thesis it was not possible to collect data from many sources. The sites to study should be appropriate and accessible. There are simply not many sources available where digital products are being customized over the Internet, which is a requisite for answering the main research question and the hypotheses that were identified during the literature review. The site to study should also be a location where entry or access to the sources is possible and where the appropriate people are likely to be available (Berg, 2004). The logic of using samples is to make inferences about some larger population from a smaller one, which is the sample (Berg, 2004).
There are various sampling principles to select the sites to study, such as maximum variation sampling, critical case sampling, snowball sampling, purposive sampling or convenience sampling (Miles and Huberman, 1994; Berg, 2004). Sampling involves decisions about which people to observe or interview. Maximum variation sampling, for example, involves looking for outliers to see whether main patterns still hold. Qualitative samples tend to be purposive rather than random, which is very important with small numbers of cases (Miles and Huberman, 1994). Two actions are involved when sampling in qualitative research. First, boundaries have to be set to define aspects of the cases that can be studied within the limits of the available time and means, that connect directly to the research questions, and that will probably include samples of what needs to be studied. Second, a sampling frame needs to be created to help uncover, confirm or qualify the basic processes or constructs of the study (Miles and Huberman, 1994).
During this research I made use of the principle of multiple-case sampling. Multiple case sampling adds confidence to findings. This approach connects directly to the overall research question, adding confidence on the ‘how’ research question. The overall research question:
How to support customization and personalization for pure digital products in the Internet economy to dramatically decrease complexity and search costs for consumers, so variety can be maximized?
When using this kind of sampling, an explicit sampling frame is needed (Miles and Huberman, 1994). The sampling frame was created by identifying suppliers that offer digital products online, allow those products to be customized, preferably offer a large variety, and make it easy for consumers to find the digital products they would like. Organizations that offer a high variety of digital products online are scarce, in particular companies that offer possibilities to customize their products. Currently, digital products in the form of music seem to meet these criteria. The two most popular organizations that offer customized music are Last.fm and Pandora Media. Both organizations agreed to be part of this research. These two sites offer digital products in the form of streaming music on the Internet. Another company that offers digital products in the form of music is Mercora, but it did not agree to participate.
Yin (2003) identifies six sources of evidence that can be collected during case studies, each with its own strengths and weaknesses:
- Documentation is stable because it can be reviewed repeatedly; it is also unobtrusive, exact and broad in coverage. However, it can be difficult to retrieve, its selection and reporting can be biased, and access can be deliberately blocked.
- Archival records share the characteristics of documentation, with the added advantage of being precise and quantitative and the added disadvantage of being difficult to access for privacy reasons.
- Interviews are targeted and insightful, but they can be biased by poorly constructed questions or poor responses, and inaccurate because of poor recall.
- Direct observations are real-time and contextual, but they can be time consuming and selective, and the observed event may proceed differently because it is being observed.
- Participant-observation has the same characteristics as direct observation, with the extra advantage of insight into interpersonal behaviour and the extra disadvantage of possible bias due to manipulation.
- Physical artefacts give insight into cultural features and technical operations; their disadvantages are selectivity and availability.
When conducting a case study, three principles of data collection can maximize the benefits of the above six sources of evidence (Yin, 2003). The first is to use multiple sources of evidence, which, if done properly, enables data triangulation. It also helps to avoid tunnel vision (Verschuren, 2003). The second principle is to create a case study database. Yin (2003) recommends keeping the data or evidence and reports separated. The last principle is to maintain a chain of evidence, which increases the reliability of the information.
There are many possible sources of evidence to identify, for example documentation and archival records in the form of existing reports. Interviews were a further source of evidence. The semi-structured interview was used, because the topics are clear and some questions can be predetermined, but it leaves space for probing beyond given answers (Berg, 2004). The interviews were recorded, transcribed and reviewed by the interviewees. The remaining sources of evidence, which are direct observations, participant-observation, and physical artefacts, were not used.
Data analysis in qualitative research can be defined as consisting of three concurrent flows of action: data reduction, data display, and conclusion drawing and verification. These flows are present in parallel during and after the collection of data (Miles and Huberman, 1994). Data reduction refers to the process of selecting, focusing, simplifying, abstracting and transforming the collected data. The data need to be reduced to make them more readily accessible and understandable (Berg, 2004; Kvale, 1996). Data display is intended to organize the collected data in such a way that it permits conclusion drawing (Miles and Huberman, 1994; Berg, 2004). The third component of the data analysis process is conclusion drawing and verification. During the collection of data, no definitive conclusions should be drawn, and preliminary conclusions should be verified during the process (Miles and Huberman, 1994).
Linking data to propositions can be done in a number of ways, for example the technique of pattern matching, whereby several pieces of the same case may be related to some theoretical proposition (Yin, 2003). Other strategies are explanation building, time-series analysis, logic models, and cross-case synthesis. Every case study should strive to have a general analytic strategy, defining priorities for what to analyze and why. Examples of such general strategies are relying on theoretical propositions, thinking about rival explanations, and developing a case description (Yin, 2003).
The general analytic strategy that I used for data analysis is the technique of relying on theoretical propositions. The four hypotheses that followed from reviewed literature and the overall research question, led to this case study. To develop internal validity and external validity, I followed the specific analytical technique of pattern matching (Yin, 2003). When all collected data is available in textual format, data can be methodologically analyzed (Miles and Huberman, 1994). In pattern matching, or pattern coding, an empirically based pattern is compared with a predicted or proposed one. Pattern coding has four important functions (Miles and Huberman, 1994). First, it reduces large amounts of data into a smaller number of analytical units. Second, it gets the researcher into analysis during data collection, so that later fieldwork can be more focused. Third, it helps the researcher elaborate a cognitive map for understanding interactions. Fourth, it lays the groundwork for cross-case analysis by surfacing common themes.
The above strategy was used during the case study. The dominant data source consisted of transcribed interviews, as I conducted three interviews within the two companies during this research. Kvale (1996) differentiates between five main approaches to analyzing interviews: meaning condensation, meaning categorization, meaning structuring through narratives, meaning interpretation, and ad hoc meaning generation. During this research I used a combination of the meaning condensation approach and the meaning categorization approach. Meaning condensation entails an abridgement of the meaning expressed by the interviewees into shorter formulations. Long statements are compressed into briefer statements in which the main sense of what is being said is rephrased in a few words. Meaning condensation thus involves a reduction of large interview texts into briefer, more succinct formulations (Kvale, 1996). Meaning categorization implies that the interview is coded into categories. Long statements are reduced to predefined categories, which can reduce and structure a large text into a few tables and figures (Kvale, 1996).
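As an illustration of how meaning categorization can turn coded interview text into a small table, the following minimal Python sketch tallies segments that a researcher has already assigned to predefined categories. The category names, the interviewee labels and the `tabulate` helper are hypothetical and are not taken from the thesis; the coding itself remains a manual, interpretive step.

```python
from collections import Counter

# Hypothetical predefined categories; a real study would derive them from its propositions.
CATEGORIES = ("variety", "complexity", "search costs")


def tabulate(coded_segments, categories=CATEGORIES):
    """Count manually coded segments per interviewee and category.

    coded_segments is an iterable of (interviewee, category) pairs.
    Returns a dict mapping each interviewee to a list of counts,
    ordered like `categories`.
    """
    counts = {}
    for interviewee, category in coded_segments:
        if category not in categories:
            raise ValueError(f"unknown category: {category!r}")
        counts.setdefault(interviewee, Counter())[category] += 1
    return {who: [c[cat] for cat in categories] for who, c in counts.items()}


# Hypothetical coded segments, for illustration only.
coded = [
    ("Interviewee 1", "variety"),
    ("Interviewee 1", "search costs"),
    ("Interviewee 2", "variety"),
    ("Interviewee 1", "variety"),
]

table = tabulate(coded)
for who, row in table.items():
    print(who, dict(zip(CATEGORIES, row)))
```

The resulting counts only summarize the coding; they do not replace reading the condensed statements behind each category.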
This paragraph summarizes the instrumentation which is used for this research. All measures are based on the review of the literature in Chapter 2.
Mass customization classification
To classify the mass customizer in terms of consumer involvement in the design process and product modularity, I follow Duray et al. (2000). They developed an instrument to classify mass customizers, with established scales to enhance validity, reliability and generalizability of measures.
According to Duray et al. (2000), consumer involvement can be scaled into two factors. The first factor is consumer involvement in the design and fabrication stages, and is considered as a high degree of customization. The second factor is consumer involvement in the assembly and use stages, and is considered as a low degree of customization. To measure the type of modularity employed, Duray et al. (2000) also identified two factors. The first factor is modularity through fabrication, and can be considered a measure of modularity in the design or fabrication of a product. The second factor is modularity through standardization. It contains items that address modularity in the form of options to standard products or interchangeability of components.
To operationalize the concept of point of customer involvement, the earliest point of involvement classifies the company. Once a customer is involved in the process, involvement carries throughout the whole production cycle. If a customer’s initial point of involvement is in the design stage of the production cycle, the customer’s preferences would be incorporated throughout the remaining stages of fabrication, assembly and use (Duray et al., 2000). The same is the case for the type of modularity employed. Once each company has been assigned one value for each of the variables, customer involvement and modularity, the classification process is simplified. Table 3.1 shows the identification of mass customizers.
Customer involvement: design / fabrication; assembly / use
Type of modularity: design / fabrication; assembly / use
Table 3.1: Classification of the mass customization configuration.
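The rule that the earliest point of involvement classifies the company can be sketched in a few lines of Python. This is an illustration under assumptions, not the instrument of Duray et al. (2000): the function names and the example input are hypothetical, and the two class labels simply mirror the axis labels of Table 3.1.

```python
# Production-cycle stages in the order given in the text.
STAGES = ["design", "fabrication", "assembly", "use"]


def classify_point(stages_observed):
    """Return the class implied by the earliest observed stage."""
    if not stages_observed:
        raise ValueError("at least one stage is required")
    unknown = set(stages_observed) - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    earliest = min(stages_observed, key=STAGES.index)
    if earliest in ("design", "fabrication"):
        return "design / fabrication"
    return "assembly / use"


def classify_company(involvement_stages, modularity_stages):
    """Assign one value per variable: customer involvement and modularity."""
    return {
        "customer_involvement": classify_point(involvement_stages),
        "modularity": classify_point(modularity_stages),
    }


# Hypothetical company: customers are involved from design onwards,
# while modularity only appears in assembly and use.
example = classify_company(["assembly", "design", "use"], ["assembly", "use"])
print(example)
```

Because only the earliest stage matters, listing later stages as well does not change the outcome.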
To measure variety, I follow Blecker et al. (2006), who propose a key-metrics-based approach to control variety-induced complexity in mass customization. Blecker et al. (2006) revealed that multiple use, interface complexity and platform efficiency are key metrics that directly influence the extent of product variations that can be offered by the mass customizer. The multiple use metric provides a measurement of the number of product variants required by consumers as compared to the total number of modules (Ericsson and Erixon, 1999 in Blecker et al., 2006).
Metric               | Source
Multiple use         | Blecker et al., 2006
Interface complexity | Blecker et al., 2006
Platform efficiency  | Blecker et al., 2006
Table 3.2: Possible variety metrics.
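One plausible reading of the multiple use metric, as described above, is a simple ratio of required product variants to modules. The sketch below assumes that reading; the exact formula in Blecker et al. (2006) may differ, and the function name and numbers are hypothetical.

```python
def multiple_use(num_required_variants: int, num_modules: int) -> float:
    """Ratio of product variants required by consumers to total modules.

    Assumption: the metric is read as a plain quotient; a higher value
    means each module is reused across more variants.
    """
    if num_modules <= 0:
        raise ValueError("num_modules must be positive")
    if num_required_variants < 0:
        raise ValueError("num_required_variants must not be negative")
    return num_required_variants / num_modules


# Hypothetical figures, for illustration only.
print(multiple_use(120, 30))  # 4.0
```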
Complexity and search costs
The literature review revealed that measuring complexity is not easy, because it is very subjective and depends on how the consumption experience is perceived (Desmeules, 2002). To evaluate the extent of perceived complexity, I follow Blecker et al. (2006), who propose two metrics. The first metric measures the average interaction length of time, in other words, how much time consumers need on average to completely configure a product variant. The second key metric refers to the abortion rate. If consumers are uncertain about their choices or overwhelmed by the interaction process, it is more likely that they give up configuration and leave the website of the mass customizer.
Metric                             | Source
Average interaction length of time | Blecker et al., 2006
Abortion rate                      | Blecker et al., 2006
Table 3.3: Perceived complexity metrics.
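Both perceived complexity metrics could be computed from configuration session logs, where such logs are available. The following Python sketch assumes a hypothetical `Session` record with a duration and a completion flag; it averages time over completed configurations only and treats every uncompleted session as aborted, which is one possible operationalization rather than the definition used by Blecker et al. (2006).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    seconds: float   # time spent in the configurator
    completed: bool  # True if a product variant was fully configured


def average_interaction_time(sessions):
    """Mean time, in seconds, of sessions that ended in a complete configuration."""
    done = [s.seconds for s in sessions if s.completed]
    if not done:
        raise ValueError("no completed sessions")
    return mean(done)


def abortion_rate(sessions):
    """Share of sessions in which the consumer gave up before finishing."""
    if not sessions:
        raise ValueError("no sessions")
    aborted = sum(1 for s in sessions if not s.completed)
    return aborted / len(sessions)


# Hypothetical log entries, for illustration only.
sessions = [
    Session(90.0, True),
    Session(150.0, True),
    Session(40.0, False),
    Session(120.0, True),
]
print(average_interaction_time(sessions))  # 120.0
print(abortion_rate(sessions))             # 0.25
```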
To answer the main research question, I need to be able to identify whether search costs are dramatically reduced for consumers. The measures identified for this purpose are primarily taken from Kurniawan et al. (2006), who studied decision quality in product selection.
Number of alternatives searched
Kurniawan et al., 2006
Kurniawan et al., 2006
Helander and Khalid, 2000
Table 3.4: Search costs metrics.
To identify methods or strategies that the mass customizer uses to reduce the perceived complexity and search costs, I will use the following metrics:
Reduce perceived complexity (customization)
Attribute vs. alternative
Huffman and Kahn, 1998
Dellaert and Stremersch, 2004
Dellaert and Stremersch, 2004; Piller et al., 2005;
Huffman and Kahn, 1998
Stegmann et al., 2006
Stegmann et al., 2006
Piller et al., 2005
Piller et al., 2005
Table 3.5: Reduce perceived complexity, or customization metrics.
Case study protocol and case study database
The case studies are conducted following a protocol containing the following:
- Procedures to introduce the case to the interviewees;
- Procedures to start and finish a case study;
- Procedures for conducting interviews including initial questions;
- Procedures for data recording in the case study database.
For every case, a case study database is constructed with the following structure.
- Introducing e-mails to the interviewees;
- Recorded interviews;
- Literal transcription of the interviews;
- Downloaded documents from the companies’ websites;
- Earlier interviews found on the Internet.
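The database structure above can be sketched as a small record type that makes it easy to see which components are still missing for a case. This is only an illustration: the class, field names and example entries are hypothetical, and the thesis does not state how the database was actually stored.

```python
from dataclasses import dataclass, field

# Field names mirror the five components listed above.
COMPONENTS = (
    "intro_emails",
    "recordings",
    "transcripts",
    "company_documents",
    "earlier_interviews",
)


@dataclass
class CaseStudyDatabase:
    case_name: str
    intro_emails: list = field(default_factory=list)
    recordings: list = field(default_factory=list)
    transcripts: list = field(default_factory=list)
    company_documents: list = field(default_factory=list)
    earlier_interviews: list = field(default_factory=list)

    def missing_components(self):
        """Return the names of components that have no entries yet."""
        return [name for name in COMPONENTS if not getattr(self, name)]


# Hypothetical case with placeholder file names.
db = CaseStudyDatabase(
    case_name="Case A",
    intro_emails=["intro_email.txt"],
    recordings=["interview_1.mp3"],
    transcripts=["interview_1.txt"],
)
print(db.missing_components())
```

Keeping the raw evidence in such a structure, separate from the case report, is in line with the recommendation to keep data and reports apart.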