Scatter Diagrams and Correlation

1. Bivariate Data

Bivariate data consists of pairs of values for two different variables recorded for the same subject.

Example: Height and weight of students.
The goal is to determine whether there is a relationship (correlation) between the two variables.

A scatter diagram plots bivariate data on a Cartesian plane.

Figure: Scatter diagram plot

Correlation describes the strength and direction of the relationship between variables.

Positive Correlation: As $x$ increases, $y$ tends to increase. Points trend upwards from left to right.
Negative Correlation: As $x$ increases, $y$ tends to decrease. Points trend downwards from left to right.
Zero Correlation: No apparent linear relationship between $x$ and $y$. Points are scattered randomly, with no upward or downward trend.
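
The three cases above can also be checked numerically. The notes do not name a specific measure, so the sketch below assumes Pearson's correlation coefficient $r$, one common choice: $r$ close to $+1$ suggests positive correlation, close to $-1$ negative, and close to $0$ little or no linear correlation. The function names, the `threshold` cut-off, and the height/weight figures are all made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


def describe_direction(r, threshold=0.1):
    """Label the direction of correlation.

    The threshold is an arbitrary illustrative cut-off, not a standard value.
    """
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "approximately zero"


# Hypothetical data: heights (cm) and weights (kg) of five students.
heights = [150, 160, 165, 172, 180]
weights = [45, 52, 58, 63, 72]

r = pearson_r(heights, weights)
print(describe_direction(r))  # prints "positive"
```

A scatter diagram should still be drawn first: $r$ only measures straight-line association, so a curved pattern can give a value near zero even when the variables are clearly related.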

Figure: Correlation diagrams

A line of best fit is a straight line drawn as close as possible to all the data points, with roughly as many points above the line as below it, to represent the overall trend.
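
When the line is calculated rather than drawn by eye, one standard method is least squares, which always passes through the mean point $(\bar{x}, \bar{y})$. The sketch below assumes that method; the notes themselves do not say how the line is obtained. The function name and the sample data are hypothetical.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal, so no unique line exists")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # Choosing the intercept this way forces the line through the mean point.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = line_of_best_fit(xs, ys)
print(slope, intercept)
```

For real data the points will not lie exactly on the line; the slope and intercept then describe the overall trend rather than any single point.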

Interpolation: Estimating a value within the range of the data set. Generally more reliable.
Extrapolation: Estimating a value outside the range of the data set. Less reliable, because the trend may not continue beyond the observed data.
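
The distinction can be made explicit when using a line of best fit to estimate a value. The sketch below is illustrative only: the function name `estimate`, the slope and intercept, and the observed range are all assumed values, not taken from these notes.

```python
def estimate(x, slope, intercept, x_min, x_max):
    """Estimate y from the line y = slope*x + intercept.

    Also reports whether x lies inside the observed range [x_min, x_max]
    (interpolation) or outside it (extrapolation).
    """
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return slope * x + intercept, kind


# Hypothetical line y = 2x + 1 fitted to data with x between 1 and 5.
print(estimate(3, 2.0, 1.0, 1, 5))   # prints (7.0, 'interpolation')
print(estimate(10, 2.0, 1.0, 1, 5))  # prints (21.0, 'extrapolation')
```

The arithmetic is identical in both cases; only the trustworthiness of the answer differs, which is why the label is returned alongside the estimate.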

Process:

Figure: Line of best fit example