There are two main kinds of studies: observational studies and experimental studies. The statistical process can be viewed in the context of the general process of investigation that we talked about earlier. Its four steps are identifying a question or problem, collecting relevant data, analysing the data, and drawing a conclusion. Study design helps us improve the collection of data, makes data collection transparent, and helps us understand what kinds of inferences or conclusions we can make based on the data at hand.
Observational studies entail some kind of observation or measurement, but there’s no direct intervention; there’s no attempt to alter what’s going on in terms of social dynamics or the world around us. In an observational study, there can always be confounding (or lurking) variables affecting the results. This means that observational studies can almost never show causation. It is easier to control for confounding variables in an experiment.
In experimental studies the goal is to apply some kind of intervention. You might give some people a pill while others receive none, or you might try to alter the conditions in which they live or how they live. The key point is that in experimental studies the researcher deliberately intervenes; in observational studies there is no intervention.
Observational Studies
In observational studies we typically cannot prove cause and effect. The reason is that we’re just observing: we’re measuring what’s going on, and there’s no intervention. A survey is a typical kind of observational study. You take a sample from a population and ask them questions, for example about their approval or disapproval of certain drug use or their religious affiliation. Surveys are very commonly used as observational studies.
Observational studies are further divided into two types. In a cross-sectional study, data are collected at one point in time on a set of individuals. So, if you collect data only in 1955, that is a cross-sectional study. A longitudinal study, also known as a panel study, is an observational study in which data are collected over time on the same individuals or objects. So, if you ask people a question in 1950 and then ask the same group of people a question in 1955, 1960 and so on, that is a panel study, because you’re collecting data over time on the same individuals. Panel studies are much more expensive than cross-sectional studies, but they can be useful for analyzing trends in the data.
One key point with observational studies is that you need to be careful when interpreting the results. There can always be confounding or lurking variables affecting the results. Practically, this means that observational studies can rarely show causation, and it requires very strong assumptions about the world around us if you want to take an observational study and say something about cause and effect.
It helps to differentiate between association and causation. Two variables are associated if values of one variable tend to be related to values of the other variable; this is what observational studies mostly show. Two variables are causally associated if changing the value of the explanatory variable influences the value of the response variable, which is generally established through experimental studies.
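The difference between the two designs shows up in how the data are organized. The following is a minimal sketch, assuming a made-up set of survey records; the record layout and the helper names `cross_section` and `panel` are hypothetical and chosen only for illustration.

```python
# Hypothetical survey records: (person_id, year, approves).
# The values are invented purely to illustrate the two layouts.
records = [
    (1, 1950, True), (2, 1950, False), (3, 1950, True),
    (1, 1955, True), (2, 1955, True), (3, 1955, True),
    (1, 1960, False), (2, 1960, True), (3, 1960, True),
]

def cross_section(records, year):
    """Return every individual's record at a single point in time."""
    return [r for r in records if r[1] == year]

def panel(records):
    """Group repeated observations by individual, ordered by year."""
    by_person = {}
    for person_id, year, approves in sorted(records, key=lambda r: (r[0], r[1])):
        by_person.setdefault(person_id, []).append((year, approves))
    return by_person

snapshot_1955 = cross_section(records, 1955)  # one row per person, one year
histories = panel(records)                    # one time series per person
```

A cross-section gives one observation per individual, while a panel gives each individual a history, which is what makes it possible to follow how the same people change over time.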
Confounding Variables
We can take the example of obesity to understand the nuance. It’s very tempting to say that if you gain weight, you’re causing your friends to gain weight because of social structure. But it’s quite possible that a fast-food restaurant opened down the street and changed your eating habits and those of your friends. So, what appeared to be a cause-and-effect pattern due to the social structure of friends was really caused by the fact that a fast-food restaurant opened down the street. We like to identify cause-and-effect relationships because they give more conclusive results, and with an experiment it’s much easier to establish cause and effect.
A confounding variable, in general, is a variable not included in the study design that has an effect on the variables studied. So, in the case of social networks and obesity, a confounding variable could be the fast-food restaurant that opened down the street. To take another example, if you study sunscreen and cancer and just look at sunscreen use and cancer rates, you might conclude that sunscreen is causing cancer.
So, you might gather a sample of data, observe what’s going on over time, and find that people who use sunscreen are more likely to have skin cancer. However, there could be a confounding or lurking variable: you may not be considering the level of sun exposure in your study. People who spend more time in the sun, for example at the beach, tend to use sunscreen more often, but because of that sun exposure they still tend to have higher rates of skin cancer than people who stay indoors all the time. So, sun exposure is a confounding variable that you’re not adjusting for in your observational study.
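The sunscreen example can be illustrated with a small simulation. This is only a toy sketch: every probability below is an invented assumption, and in this model sunscreen has no effect on cancer at all. Cancer risk depends only on sun exposure, yet sunscreen users still show a higher cancer rate until you compare people with the same level of exposure.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def simulate_person():
    """Return (high_sun, uses_sunscreen, cancer) for one simulated person."""
    high_sun = random.random() < 0.5
    # Assumption: people with high sun exposure use sunscreen more often.
    uses_sunscreen = random.random() < (0.8 if high_sun else 0.2)
    # Assumption: cancer risk depends only on sun exposure, not on sunscreen.
    cancer = random.random() < (0.10 if high_sun else 0.02)
    return high_sun, uses_sunscreen, cancer

people = [simulate_person() for _ in range(100_000)]

def cancer_rate(rows):
    """Share of the given people who have cancer."""
    rows = list(rows)
    return sum(cancer for _, _, cancer in rows) / len(rows)

# Naive comparison that ignores the confounder.
rate_users = cancer_rate(p for p in people if p[1])
rate_nonusers = cancer_rate(p for p in people if not p[1])

def stratum_rates(high_sun):
    """Compare users and non-users within one level of sun exposure."""
    users = [p for p in people if p[0] == high_sun and p[1]]
    nonusers = [p for p in people if p[0] == high_sun and not p[1]]
    return cancer_rate(users), cancer_rate(nonusers)

print("all people:", rate_users, rate_nonusers)
print("high sun exposure:", stratum_rates(True))
print("low sun exposure:", stratum_rates(False))
```

In the naive comparison sunscreen users appear to be at higher risk, while within each exposure group the two rates are close to each other. Comparing within levels of the confounder is one simple way of adjusting for it, and it only works if the confounder was measured in the first place.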
Experimental Studies
The second type of statistical study is the experimental study. We use experimental studies to say something about cause and effect. Experimental studies entail the application of some intervention or treatment; the researcher is trying to change something about the world. The idea behind an experimental study is that subjects are randomly assigned to treatment and control groups. The treatment group receives the intervention; the control group does not. We try to remove the effect of confounders by controlling the environmental conditions.
Here’s another example, involving neighborhoods and obesity. Women and children living in poor neighborhoods were randomly assigned to have an opportunity to live in wealthier neighborhoods. The experimenters were trying to alter how people live, so there is an intervention. The researchers gave poor people an opportunity to live in a wealthier neighborhood and, having applied this intervention, examined obesity levels. They concluded that the opportunity to move from a neighborhood with a high level of poverty to one with a lower level of poverty was associated with modest but potentially important reductions in the prevalence of extreme obesity and diabetes.
So, they could make this claim about cause and effect, the cause being living in a poor neighborhood and the effect being obesity and diabetes levels. They could make this claim because they applied an intervention. Experiments have some major disadvantages, however. First of all, they can be unethical, especially if you conduct them on vulnerable populations or on animals, or if the treatment is known to be harmful.
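Random assignment itself is simple to sketch. The example below assumes a list of placeholder subject names and an even split into two groups; the function name `randomly_assign` and its `seed` parameter are hypothetical, introduced only for this illustration.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into treatment and control groups."""
    rng = random.Random(seed)   # local generator; seed makes the split repeatable
    shuffled = list(subjects)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Placeholder subject identifiers, not real data.
subjects = [f"subject_{i}" for i in range(20)]
treatment, control = randomly_assign(subjects, seed=42)
```

Because chance alone decides who ends up in which group, characteristics of the subjects, including confounders nobody measured, tend to balance out between the treatment and control groups.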
Fundamental Questions in Data Collection
There are two fundamental questions in data collection that determine how far a result extends. First, was the sample randomly selected from the population? Second, was the explanatory variable (the treatment groups) randomly assigned within the sample? A yes to both questions supports generalizability, and a no makes the study hard to generalize. More precisely, random selection is what lets you generalize from the sample to the population, while random assignment is what lets you make claims about cause and effect. Generalization simply means that the results from the sample apply to the wider population. Recall statistical inference, where we draw a sample to say something about the population.
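The role of random selection can be sketched with a quick simulation. Everything here is an assumption made for illustration: the population is a set of made-up measurements, and the "convenience" sample is deliberately extreme to make the contrast visible.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 measurements (invented values).
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Random selection: every member has the same chance of being chosen.
random_sample = rng.sample(population, 500)
random_sample_mean = statistics.mean(random_sample)

# A non-random sample: only the 500 largest values are observed.
convenience_sample = sorted(population)[-500:]
convenience_mean = statistics.mean(convenience_sample)

print("population mean:", population_mean)
print("random sample mean:", random_sample_mean)
print("convenience sample mean:", convenience_mean)
```

The random sample's mean lands close to the population mean, while the non-random sample's mean is far off. That is the sense in which a randomly selected sample lets you say something about the population.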