Correlation and Covariance

Correlation

  • Positive correlation exists when larger values of $x$ correspond to larger values of $y$ and vice versa.

  • Negative correlation exists when larger values of $x$ correspond to smaller values of $y$ and vice versa.

  • Weak or no correlation exists if there is no such apparent relationship.

Covariance

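Covariance quantifies how a pair of variables vary together. Its sign gives the direction of the relationship. Its magnitude depends on the units of both variables, so on its own it is hard to read as a measure of strength.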

$$\operatorname{cov}(x,y)=\frac{1}{n}\sum_{i=1}^{n}(x_i-\overline{x})(y_i-\overline{y})$$
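As a minimal sketch of the formula above, the snippet below computes the covariance with the same 1/n factor. The function name `covariance` and the sample lists `hours`, `score`, and `tired` are made up for illustration.

```python
def covariance(xs, ys):
    """Covariance with the 1/n factor: mean product of deviations from the means."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Illustrative data (assumed, not from any real study).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]   # rises with hours
tired = [10, 8, 6, 4, 2]   # falls with hours

print(covariance(hours, score))  # 4.0  (positive: the variables move together)
print(covariance(hours, tired))  # -4.0 (negative: they move in opposite directions)
```

Note that Python's built-in `statistics.covariance` divides by n - 1 (the sample covariance), so it would give slightly different values than this 1/n version.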

Correlation Coefficient

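The correlation coefficient, or Pearson product-moment correlation coefficient, is another measure of the correlation between data. You can think of it as a standardized covariance: dividing by the standard deviations removes the units, so the result always lies between -1 and 1.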

$$r_{xy}=\frac{\operatorname{cov}(x,y)}{\sigma(x)\sigma(y)}=\frac{\sum_{i=1}^{n}(x_i-\overline{x})(y_i-\overline{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\overline{x})^2\sum_{i=1}^{n}(y_i-\overline{y})^2}}$$
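The right-hand side of this formula can be sketched directly in code. The helper name `pearson_r` and the sample data are assumptions chosen for illustration. The example also shows why the coefficient counts as "standardized": rescaling one variable leaves it unchanged.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient from sums of deviations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Illustrative data (assumed).
xs = [1, 2, 3, 4, 5]
ys_exact = [2, 4, 6, 8, 10]        # perfectly linear in xs
ys_scaled = [y * 100 for y in ys_exact]  # same data in different units
ys_noisy = [2, 1, 4, 3, 5]         # mostly increasing

print(pearson_r(xs, ys_exact))   # 1.0
print(pearson_r(xs, ys_scaled))  # 1.0 (unchanged by rescaling)
print(pearson_r(xs, ys_noisy))   # 0.8
```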

Make a scatter plot and look at it! The correlation coefficient only captures linear association, so you may see a relationship that the calculation does not.
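One way to see this, as a sketch with made-up data: below, y is completely determined by x (y = x²), yet the coefficient comes out as zero because the relationship is not linear. The helper `pearson_r` is a hypothetical name defined here so the snippet runs on its own.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient (hypothetical helper for this example)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Assumed data: a symmetric range of x values and their squares.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # [9, 4, 1, 0, 1, 4, 9]

print(pearson_r(xs, ys))  # 0.0, even though y depends entirely on x
```

A scatter plot of these points would show a clear U shape, which is exactly the kind of pattern the number alone hides.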

Correlation is not causation: a correlation between two variables does not mean that one causes the other.