Analyze correlation between variables. Part of the DevTools Surf developer suite. Browse more tools in the Statistics collection.
Use Cases
Test whether two business metrics are linearly related before building a model
Identify feature pairs to avoid multicollinearity in regression models
Validate whether a leading indicator actually predicts a lagging metric
Analyze the relationship between user behavior and conversion rates
Tips
Enter your data as two columns — the tool calculates both Pearson (linear) and Spearman (monotonic) correlation coefficients automatically
Check the scatter plot alongside the coefficient — a coefficient of 0 can hide non-linear relationships that are visible in the plot
Use the p-value output to assess statistical significance of the correlation, not just the coefficient magnitude
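As a sketch of the two coefficients the tool reports (a from-scratch illustration, not the tool's actual internals), Pearson's r can be computed directly from covariance and standard deviations, and Spearman's coefficient is simply Pearson applied to the ranks of the data:

```python
import math

def pearson(x, y):
    """Pearson's r: covariance scaled by the two standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks, with tied values receiving the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg_rank
        i = j + 1
    return out

def spearman(x, y):
    """Spearman's coefficient: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))
```

For the p-value, the usual route is a t statistic with n − 2 degrees of freedom; in practice, a library such as SciPy (`scipy.stats.pearsonr`, `scipy.stats.spearmanr`) returns both the coefficient and its p-value in one call.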
Fun Facts
The Pearson correlation coefficient was developed by Karl Pearson in 1895, building on earlier work by Francis Galton. Galton discovered correlation while studying the relationship between parents' and children's heights.
The famous correlation-causation warning dates to at least the 1880s, but the specific phrase 'correlation is not causation' was popularized in statistics textbooks in the mid-20th century. It remains one of the most violated principles in journalism.
Spurious correlations — statistically significant relationships between unrelated variables — occur naturally due to multiple testing. Tyler Vigen's book 'Spurious Correlations' (2015) illustrated this with examples like the 95.86% correlation between US per capita mozzarella consumption and civil engineering doctorates awarded.
FAQ
When should I use Pearson vs Spearman correlation?
Pearson measures the strength of a linear relationship between continuous variables; its standard significance test assumes the data are approximately normally distributed (the coefficient itself does not strictly require this). Spearman measures any monotonic relationship (consistently increasing or decreasing) and works with ordinal data or non-normal distributions.
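A concrete (hypothetical) dataset makes the difference visible: y = e^x is perfectly monotonic but far from linear, so Spearman rates it 1 while Pearson does not. A minimal self-contained sketch, with Spearman computed as Pearson on the ranks (no ties here, so the ranks of each variable are just 1..n):

```python
import math

def pearson(x, y):
    """Pearson's r: covariance scaled by the two standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = list(range(1, 11))
y = [math.exp(v) for v in x]   # strictly increasing, highly non-linear

r_pearson = pearson(x, y)      # well below 1 despite the perfect monotonic link
# Both x and y are strictly increasing, so each variable's ranks are 1..10
# and Spearman's coefficient is exactly 1.
r_spearman = pearson(list(range(1, 11)), list(range(1, 11)))
print(r_pearson, r_spearman)
```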
What does a correlation coefficient of 0.7 mean?
It means 49% of the variance in one variable is explained by the other (r^2 = 0.49). Common benchmarks: |r| < 0.3 is weak, 0.3-0.7 is moderate, > 0.7 is strong. Context matters — 0.3 can be highly significant in social science.
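The r to r² arithmetic above, as a quick check (the strength labels simply mirror the benchmarks quoted here, which are conventions rather than hard rules):

```python
def shared_variance(r):
    """Fraction of variance in one variable explained by the other (r squared)."""
    return r * r

def strength(r):
    """Rough benchmark labels for |r|; cutoffs are conventions and vary by field."""
    a = abs(r)
    if a < 0.3:
        return "weak"
    if a <= 0.7:
        return "moderate"
    return "strong"

print(round(shared_variance(0.7), 2))                  # 0.49
print(strength(0.25), strength(0.5), strength(0.85))   # weak moderate strong
```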
Can I use correlation to prove causation?
No — correlation only shows association. Establishing causation requires experimental design (randomized controlled trials) or causal inference methods (instrumental variables, difference-in-differences). Correlation alone is never sufficient for causation, and a genuine causal effect can even be masked in the data (for example, by a confounder that suppresses the observed correlation).