Correlation
Correlation coefficient: statistical index of the degree to which two variables are associated, or related.
We can determine whether one variable is related to another by seeing whether scores on the two variables covary, that is, whether they vary together.
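As a minimal sketch of how covariation turns into a single index, the Pearson coefficient divides the sum of cross-products of deviations by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the data below are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cross-products of deviations: large and positive when scores vary together.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5]
exam_grade = [52, 60, 61, 70, 77]
r = pearson_r(hours_studied, exam_grade)
print(round(r, 3))
```

The n (or n - 1) terms in the covariance and the standard deviations cancel, which is why the sketch works directly with sums of deviations.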
Examples of Correlation
Is there an association between:
- Children’s IQ and parents’ IQ?
- Degree of social trust and number of memberships in voluntary associations?
- Urban growth and air quality violations?
- GRA funding and number of publications by Ph.D. students?
- Number of police patrols and number of crimes?
- Grade on an exam and time spent on the exam?
Scatterplot
- The relationship between any two variables can be portrayed graphically on x- and y-axes.
- Each subject i has a pair of scores (x_i, y_i). When the scores for an entire sample are plotted, the result is called a scatterplot.
Direction of the relationship
Variables can be positively or negatively correlated.
- Positive correlation: as the value of one variable increases, the value of the other variable tends to increase.
- Negative correlation: as the value of one variable increases, the value of the other variable tends to decrease.
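The direction of a relationship shows up as the sign of the correlation coefficient. The sketch below uses hypothetical data and a hypothetical helper `direction` to label the sign; it is an illustration, not a test of significance.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def direction(r):
    """Label the sign of a correlation coefficient."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

# Hypothetical scores, invented for illustration only.
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 5, 4, 6]    # tends to increase as x increases
y_falling = [9, 7, 6, 4, 1]   # tends to decrease as x increases

print(direction(pearson_r(x, y_rising)))
print(direction(pearson_r(x, y_falling)))
```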
Standardized relationship
- The Pearson r can be thought of as a standardized measure of the association between two variables.
- That is, a correlation of .64 between two variables indicates the same strength of relationship as a correlation of .64 between two entirely different variables.
- The metric by which we gauge associations is a standard metric, ranging from -1 to +1 regardless of the units in which the variables are measured.
- Also, it turns out that correlation can be thought of as a relationship between two variables that have first been standardized or converted to z scores.
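One way to see the standardization is to convert both variables to z scores and average their products. The sketch below assumes sample standard deviations (dividing by n - 1), so the products are also averaged over n - 1. The data and the unit conversions are hypothetical and only serve to show that changing the measurement units leaves r unchanged.

```python
from math import sqrt

def z_scores(values):
    """Convert raw scores to z scores using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def r_from_z(xs, ys):
    """Pearson r as the average product of paired z scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical measurements, invented for illustration only.
height_cm = [150, 160, 165, 172, 180]
weight_kg = [52, 58, 63, 70, 79]
r_metric = r_from_z(height_cm, weight_kg)

# The same data expressed in different units.
height_in = [h / 2.54 for h in height_cm]
weight_lb = [w * 2.20462 for w in weight_kg]
r_imperial = r_from_z(height_in, weight_lb)

# The two coefficients agree up to floating-point rounding.
print(abs(r_metric - r_imperial) < 1e-9)
```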
Correlation Represents a Linear Relationship
- Correlation involves a linear relationship.
- "Linear" refers to the fact that, when we graph two correlated variables, the points tend to fall along a straight line.
- Correlation tells you how much two variables are linearly related, not necessarily how much they are related in general.
- In some cases two variables have a strong, or even perfect, relationship that is not at all linear. In these cases, the correlation coefficient can be zero or close to zero.
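A small sketch of such a case: below, y is completely determined by x (y = x squared), yet the Pearson coefficient is zero because the relationship is a symmetric curve, not a line. The data are hypothetical and chosen to be symmetric around zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# y depends perfectly on x, but through a curve rather than a line.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print(r)
```

The positive and negative cross-products cancel exactly, so the linear index reports no association even though y can be predicted perfectly from x.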
Coefficient of Determination r²
The proportion of shared variance is given by the square of the correlation coefficient, r² (multiply by 100 to express it as a percentage).
- Variance indicates the amount of variability in a set of data.
- If the two variables are correlated, we can account for some of the variance in one variable by the other variable.
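To make "accounting for variance" concrete, the sketch below computes r² directly and also as the proportion of variance in y explained by a least-squares line fitted on x. For a simple linear fit the two quantities coincide. The function name `r_squared_two_ways` and the data are hypothetical.

```python
def r_squared_two_ways(xs, ys):
    """Return (r squared, proportion of y variance explained by a fitted line)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variability in y

    # Square of the Pearson correlation coefficient.
    r_sq = sxy ** 2 / (sxx * syy)

    # Least-squares line predicting y from x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variability in y left over after using x to predict it.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    explained = 1 - ss_residual / syy
    return r_sq, explained

# Hypothetical scores, invented for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 3, 5, 4, 6, 8]
r_sq, explained = r_squared_two_ways(x, y)
print(round(r_sq, 3), round(explained, 3))
```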