Studying for your PhD: Quantitative and Qualitative Data Collection Methods

Generally, data is of two types: quantitative, which is numerical, and qualitative, which is not. Collecting data is often what people think PhDs are about. For some people, data collection will be a very large proportion of their study. For others, it will be a comparatively small part at the end of a lot of work devising their PhD project plan, tweaking their methodology or building their equipment. In my very first research project, I spent 9 months building a computer programme, which took 1 hour to run to generate my data. I did have to run it about 10 times to get my results, but whilst the computer was running the programme, I was able to go and watch cricket in the park next door to the lab! Data is only the first stage of your results: in many PhDs, the interpretation of those data will be much more significant, but this can only happen if you obtain valid data.

Quantitative Data Collection Methods

The most direct way to collect quantitative data is by measuring things. In science experiments, this can range from physical characteristics such as mass, temperature and time to quantities that need equipment to measure, such as frequencies in a spectrometer or the number of events triggering a detector or transducer.

Other disciplines use direct measurement too. Analysis of documents can include direct measurement of word frequencies and of sentence and paragraph length. It is important, though, to distinguish between direct measurement, such as of temperature, and measurement of numbers used as a proxy or indicator. For example, there are measures of the readability of documents that relate readability to the average length of sentences and paragraphs. In this case, we measure one thing and postulate that it is linked to another thing, which is what we actually seek to measure. This matters because in a PhD, you should always make your assumptions, and the evidence for those assumptions, transparent. All numerical data has a degree of uncertainty associated with it. In some cases, this will be small enough to be effectively ignored; in others, we need to quote the uncertainty alongside our data.
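The document measurements described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated readability measure: the function name describe_text, the rule that a sentence ends at a full stop, exclamation mark or question mark, and the rule that a word is a run of letters or apostrophes are all assumptions made for the example.

```python
import re
from collections import Counter


def describe_text(text):
    """Return word frequencies and the mean sentence length in words."""
    # Assumption: a sentence ends with '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a word is a run of letters or apostrophes, case ignored.
    words = re.findall(r"[A-Za-z']+", text.lower())
    mean_sentence_length = len(words) / len(sentences) if sentences else 0.0
    return Counter(words), mean_sentence_length


# Hypothetical sample text, used only to show the calculation.
sample = "Data is collected. Data is then analysed and interpreted."
frequencies, mean_length = describe_text(sample)

print(frequencies.most_common(2))  # the two most frequent words
print(mean_length)                 # mean words per sentence
```

The word counts here are direct measurements. Treating the mean sentence length as an indicator of readability is the proxy step, and it is that assumption you would need to state and justify.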

Other ways to generate quantitative data include surveys, which can produce numerical data through a variety of closed questions. These may ask a direct question with a binary response, or gather shades of opinion. A common approach is to use a Likert scale, in which we ask our respondents to tell us whether they agree or disagree with a statement and how strongly they agree or disagree. Again, this is not direct measurement, and the wording of the question can introduce bias into the responses.
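As a rough sketch of how Likert responses become numerical data, the example below codes a handful of answers on a five-point scale. The labels, the 1 to 5 coding and the responses themselves are hypothetical, chosen only to illustrate the step; your own survey may use a different number of points or different wording.

```python
from collections import Counter
from statistics import median

# Assumed five-point coding; labels and numbers are illustrative only.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}


def code_responses(responses):
    """Convert text responses to numeric codes, ignoring case and padding."""
    return [LIKERT_CODES[r.strip().lower()] for r in responses]


# Hypothetical responses to a single statement.
responses = [
    "Agree",
    "Strongly agree",
    "Disagree",
    "Agree",
    "Neither agree nor disagree",
]
codes = code_responses(responses)
counts = Counter(codes)   # how many respondents chose each point
middle = median(codes)    # middle response on the scale

print(codes)
print(counts)
print(middle)
```

The sketch reports counts and a median rather than a mean because the codes are ordered categories, and it is an assumption in its own right that the gap between "agree" and "strongly agree" is the same size as the gap between any other two points.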

Qualitative Data Collection Methods

Qualitative data comes in many forms. Traditionally, it is text based, either recorded directly as text or transcribed from audio recordings. As video recording has become more widely available, its use has increased. Video captures non-verbal cues, which can form a legitimate part of the data but still tend to be reduced to text for ease of analysis.

The data may be drawn from a wide range of sources, including interviews, focus groups, observations, open-ended survey questions and the analysis of documents, such as literary texts or operational documents in an organisational study.

The decisions over data gathering do not end there. Do you generate data as tape recordings to be transcribed, or as notes written up afterwards? There is no right answer: notes trade word-for-word veracity for a more open discussion. If tape recording is used, transcription can be very time consuming: do you use a third party to do this for you? Do you have it done twice, independently, to maximise veracity and minimise errors? In the context of a PhD, the answers to these questions and the choice of data collection methods are probably less important than your rationale for them.

What Can Go Wrong When Collecting Data

The biggest pitfall is ending up with too much or too little data. Studies gathering qualitative data can generate huge amounts of data, which are difficult to transcribe and then even harder to analyse. If you do not have enough data, it can be hard to answer your research question in a meaningful way. For example, in clinical research studies, it is important to check that there are enough patients with the relevant condition in your setting.

Do not confuse data with information or evidence. You collect data, but only after your analysis and interpretation do you generate information or evidence.