Scatter Plot - Chart, Correlation, Regression
Use this scatter plot calculator to chart paired X and Y values, fit a least-squares regression line, and read Pearson r, R-squared, slope, intercept, and standard deviations.
What Is a Scatter Plot?
A scatter plot is a two-dimensional chart that places each observation as a single dot on an X-Y grid, so the shape of the cloud of points reveals the relationship between two variables at a glance. The scatter plot calculator does the same job on the screen: it draws your data on a shared chart, fits a least-squares regression line, and reports Pearson r, R-squared, slope, and intercept next to the chart so you can read the relationship without picking it out of the dots by eye. It is the right first step when you want to see a numeric pattern before fitting a model.
- Check a correlation: Plot paired measurements to see whether a higher X tends to come with a higher or lower Y before fitting a model.
- Fit a line of best fit: Read the least-squares slope and intercept so you can predict Y from X for new inputs.
- Spot outliers and clusters: Eyeball the cloud of points to flag values that sit far from the trend or that form their own subgroup.
- Compare measurement methods: Plot two measurement methods against each other to see how tightly they agree before averaging them.
The chart is one of the oldest visual tools in statistics, and it is still the fastest way to translate a two-column spreadsheet into a picture of the relationship.
Once the chart shows a roughly straight-line pattern, the Linear Regression Calculator turns the same paired lists into a full regression equation with predictions and residuals.
How the Scatter Chart Calculator Works
The calculator parses your two lists, computes the mean of each, and uses the mean-corrected cross-product to derive slope, intercept, Pearson r, and R-squared. The same numbers drive the regression line that the chart overlays on top of your data, so the visual line and the printed equation match.
- X list: Independent variable observations, parsed from your text input and used as the horizontal axis.
- Y list: Dependent variable observations, parsed from your text input and used as the vertical axis. Must align position-by-position with the X list.
- xMean, yMean: Arithmetic mean of the X list and Y list respectively.
- Sxx, Syy: Sum of squared deviations around the X mean and the Y mean.
- Sxy: Sum of cross-products of deviations around the X mean and the Y mean.
- r: Pearson correlation coefficient; r in [-1, +1].
- slope, intercept: Coefficients of the least-squares line y = slope * x + intercept.
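Written out with the symbols from the list above, the quantities take the following form. The index i and the pair count n are notation added here for illustration; they are not named on the page itself.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \\
S_{xx} &= \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 \\
S_{xy} &= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
\text{slope} &= \frac{S_{xy}}{S_{xx}}, \qquad \text{intercept} = \bar{y} - \text{slope}\cdot\bar{x} \\
r &= \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}, \qquad R^2 = r^2
\end{aligned}
```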
If you only need the correlation number, the result panel still prints it as a separate output.
R-squared is the share of the Y variance the regression line explains, expressed as a percentage. Use it with r, because r preserves the sign while R-squared does not.
Worked example: a near-linear data set with a small amount of noise
X = 1, 2, 3, 4, 5, 6, 7, 8 ; Y = 2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.0
xMean = 4.5, yMean = 9, Sxx = 42, Syy = 166.12, Sxy = 83.5, slope = Sxy/Sxx = 83.5/42 = 1.9881, intercept = yMean - slope*xMean = 9 - 1.9881*4.5 = 0.0536, r = Sxy / sqrt(Sxx * Syy) = 83.5 / sqrt(42 * 166.12) = 0.9997.
r = 0.9997, R-squared = 99.93%, slope = 1.9881, intercept = 0.0536
The Y values sit within about 0.2 of the exact line Y = 2X, which would give r = 1, slope = 2, and intercept = 0. That tiny amount of noise drags r from exactly 1 to about 0.9997 while keeping the slope and intercept very close to the perfect-line case.
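The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that follows the definitions of Sxx, Syy, and Sxy given earlier; the function name `fit_line` is made up for this example and is not the calculator's own code.

```python
import math

def fit_line(xs, ys):
    """Return least-squares slope, intercept, and Pearson r for paired lists."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)   # sum of squares around the X mean
    syy = sum((y - y_mean) ** 2 for y in ys)   # sum of squares around the Y mean
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.0]
slope, intercept, r = fit_line(xs, ys)
print(f"slope={slope:.4f} intercept={intercept:.4f} r={r:.4f} R2={r * r:.2%}")
# prints: slope=1.9881 intercept=0.0536 r=0.9997 R2=99.93%
```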
According to the NIST/SEMATECH e-Handbook of Statistical Methods, the ordinary least-squares slope is the cross-product of deviations divided by the X sum of squares, and the intercept is the Y-mean minus the slope times the X-mean.
According to the NIST/SEMATECH e-Handbook of Statistical Methods, the Pearson correlation r is bounded between -1 and +1 and equals the sample cross-product of deviations divided by the square root of the product of the two sums of squares.
If the regression line is secondary and you only need the correlation r, r-squared, and a t-based p-value, the Pearson Correlation Calculator skips the chart and returns the correlation test directly.
Key Concepts Behind the Chart
Four ideas make the chart easier to read. Each concept shows up directly in the result panel or the chart, so you can connect the math to the picture.
Direction (positive vs negative)
If the cloud slopes up from left to right, larger X tends to come with larger Y; if it slopes down, larger X tends to come with smaller Y. The sign of r and the sign of the slope always agree, because both take their sign from Sxy.
Strength (tight vs loose cloud)
The closer the points hug a straight line, the closer r sits to 1 or -1. A loose cloud still has a best-fit line, but the line is a worse summary.
Shape (linear vs curved)
The chart exposes curved patterns that a correlation number hides. A cloud that bends will still print a slope and r, but a curved model will fit the data better.
Outliers and clusters
A single point that sits far from the cloud can pull the regression line toward itself and distort r, usually shrinking it but sometimes inflating it when the point lies far out along the trend. A group of points that form their own cluster can hint at a hidden subgroup.
These four ideas cover what the chart is used for in practice. A line of best fit is the natural follow-up.
If the cloud is curved, a correlation number can still look respectable while the regression line misses the bend.
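A small sketch makes the point concrete. The data below are invented for illustration: Y is exactly X squared, so the relationship is perfectly predictable but not straight. The correlation still comes out high, and the residuals (observed Y minus the line) show the bend the line misses.

```python
import math

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [x ** 2 for x in xs]  # deliberately curved: y = x^2

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
sxx = sum((x - x_mean) ** 2 for x in xs)
syy = sum((y - y_mean) ** 2 for y in ys)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

slope = sxy / sxx
intercept = y_mean - slope * x_mean
r = sxy / math.sqrt(sxx * syy)

# Vertical distance of each point from the straight-line fit.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

print(round(r, 3))                         # 0.976
print([round(e, 1) for e in residuals])    # [7.0, 1.0, -3.0, -5.0, -5.0, -3.0, 1.0, 7.0]
```

The residuals are positive at both ends and negative in the middle. That U-shaped pattern is the signature of a curve that a single straight line cannot follow, even though r looks respectable.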
When the cloud looks curved rather than straight, the Polynomial Graphing Calculator lets you overlay a polynomial fit to see whether a higher-degree model captures the bend.
How to Use This Scatter Chart Calculator
Five quick steps take you from two columns of numbers to a chart, a regression line, and a correlation number.
1. Paste your X values: Type or paste the independent variable values into the X Values box. Separate them with commas, spaces, tabs, or newlines, and enter at least 3 numeric values.
2. Paste your Y values: Type or paste the dependent variable values into the Y Values box, in the same order as X.
3. Set the axis labels: Type a short label for each axis (for example Hours Studied and Exam Score).
4. Choose whether to overlay the regression line: Leave Show Regression Line set to Yes to overlay the line, or set it to No to show only the points.
5. Read the chart and the result panel: Use the chart for direction, strength, and outliers. Use the result panel for Pearson r, R-squared, slope, intercept, means, and standard deviations.
Try the default data: X = 1, 2, 3, 4, 5, 6, 7, 8 and Y = 2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.0. The chart shows an almost-straight upward cloud, and the result panel prints r = 0.9997, R-squared = 99.93%, slope = 1.9881, intercept = 0.0536.
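If you want to prepare the same kind of input in your own script, the parsing and checks described in the steps can be sketched as follows. The function names and error messages are hypothetical, and rejecting non-finite values such as "nan" or "inf" is an assumption of this sketch; the calculator's actual validation code and wording are not shown on the page.

```python
import math
import re

def parse_values(text):
    """Split on commas, spaces, tabs, or newlines and convert each token to a float."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    values = []
    for token in tokens:
        try:
            value = float(token)
        except ValueError:
            raise ValueError(f"not a numeric value: {token!r}") from None
        if not math.isfinite(value):  # assumption: treat nan/inf as invalid input
            raise ValueError(f"not a finite number: {token!r}")
        values.append(value)
    return values

def parse_pairs(x_text, y_text, minimum=3):
    """Parse both lists and check that they pair up position by position."""
    xs, ys = parse_values(x_text), parse_values(y_text)
    if len(xs) != len(ys):
        raise ValueError(f"X has {len(xs)} values but Y has {len(ys)}")
    if len(xs) < minimum:
        raise ValueError(f"need at least {minimum} pairs, got {len(xs)}")
    return xs, ys

xs, ys = parse_pairs("1, 2\t3\n4", "2.1 3.9,6.2 7.8")
print(xs, ys)
```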
If a single column of numbers is the only thing you need to summarize, the Mean Median Mode Range Calculator returns the mean, median, mode, and range of that list without drawing a chart.
Benefits of Using This Scatter Chart Calculator
Here are five practical reasons to use this calculator instead of recomputing the correlation and regression by hand. Each benefit maps to a real workflow.
- See the relationship and the number at once: The chart and the result panel update together, so you can connect the visual cloud to Pearson r, R-squared, slope, and intercept without copying numbers.
- Fit a line of best fit in seconds: The least-squares slope and intercept appear next to the chart, so you do not have to solve normal equations by hand.
- Spot outliers visually: A single point that sits far from the cloud is obvious in the chart, and the standard deviations of X and Y help quantify how far.
- Catch bad inputs early: Validation messages stop the calculation when the two lists differ in length, when one list has fewer than 3 values, or when a value is not numeric.
- Label the axes with your own words: Type any short X and Y axis label (Hours Studied, Exam Score, Dose, Response) so the chart reflects the variables you measured.
These benefits make the chart the right first step before any model fitting.
When the spread of the cloud is the next thing you want to understand, the Normal Distribution Calculator turns the mean and standard deviation of X or Y into z-scores, tail probabilities, and percentile ranks.
Factors That Affect the Result
Five factors determine how informative the chart and the numbers are.
Sample size
More pairs make the chart and the fitted numbers more stable. With only 3 points, a perfect-looking line says almost nothing, because a near-perfect r can arise by chance; with 30 points, the same line of best fit carries far more weight.
Range of X
If the X values are clustered in a narrow band, the slope and r can swing a lot when the range widens. Always read the axis ticks before trusting a steep slope.
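The effect of a narrow X range is easy to reproduce. The data below are invented for illustration: Y follows 2X with a fixed wobble of plus or minus 3. Over the full range the wobble is small next to the trend; inside a narrow band of X it dominates.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from the sums of squares and cross-products of deviations."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

xs = list(range(1, 21))
# Made-up data: trend 2x plus an alternating +3 / -3 wobble.
ys = [2 * x + (3 if x % 2 else -3) for x in xs]

r_full = pearson_r(xs, ys)

# Keep only the pairs whose X falls in the narrow band 9..12.
band = [(x, y) for x, y in zip(xs, ys) if 9 <= x <= 12]
r_narrow = pearson_r([x for x, _ in band], [y for _, y in band])

print(f"full range r = {r_full:.3f}, narrow band r = {r_narrow:.3f}")
```

With these numbers r is roughly 0.97 over the full range and roughly 0.32 inside the band, even though the underlying rule is the same in both cases.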
Linearity of the relationship
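Pearson r and the least-squares line both assume a straight-line pattern. When the cloud bends or fans out, r and the line become poor summaries, and a bend typically makes r understate how strong the relationship is. A cloud that splits into two clusters can push r up or down, depending on where the clusters sit.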
Outliers
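A single extreme point can pull the regression line toward itself and shift r noticeably, usually shrinking it when the point sits off the trend and sometimes inflating it when the point sits far out along the trend. The calculator shows the line and the points together, so the influence of an outlier is visible at a glance.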
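To see how much one point can matter, the sketch below fits the calculator's default data twice: once as given, and once with a single made-up pair (X = 9, Y = 2.0) added far below the trend. The helper name `fit_line` and the extra pair are assumptions for this illustration only.

```python
import math

def fit_line(xs, ys):
    """Return least-squares slope, intercept, and Pearson r for paired lists."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean, sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.0]

clean = fit_line(xs, ys)
# Hypothetical outlier: a ninth pair that sits far below the trend.
with_outlier = fit_line(xs + [9], ys + [2.0])

print("clean:        slope=%.3f r=%.3f" % (clean[0], clean[2]))
print("with outlier: slope=%.3f r=%.3f" % (with_outlier[0], with_outlier[2]))
```

In this example the one extra pair cuts the slope from about 1.99 to under 1 and drops r from about 0.9997 to roughly 0.49.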
Measurement noise
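Real measurements are noisy, and even a tight cloud with r close to 1 has a band of vertical spread around the line. Read the standard deviation of Y together with R-squared to judge how wide that band is: the sum of squared vertical distances left around the line equals Syy * (1 - R-squared), so the band narrows as R-squared approaches 100%.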
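The width of the band around the line can be estimated from the residuals. The sketch below uses the default data and compares the standard deviation of Y with the residual standard deviation. Two choices here are assumptions: the n - 1 divisor for the standard deviation of Y (the page does not say whether the calculator uses the sample or the population form) and the n - 2 divisor for the residuals.

```python
import math

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.0]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
sxx = sum((x - x_mean) ** 2 for x in xs)
syy = sum((y - y_mean) ** 2 for y in ys)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

slope = sxy / sxx
intercept = y_mean - slope * x_mean
r = sxy / math.sqrt(sxx * syy)

# Sum of squared vertical distances from the fitted line.
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

sd_y = math.sqrt(syy / (n - 1))         # overall spread of Y (sample form, assumed)
residual_sd = math.sqrt(sse / (n - 2))  # spread of Y around the line

print(f"sd of Y = {sd_y:.2f}, residual sd = {residual_sd:.2f}")
print(f"Syy * (1 - r^2) = {syy * (1 - r * r):.4f}, SSE = {sse:.4f}")
```

For this data set the spread around the line is about 0.14, against about 4.87 for Y as a whole, and the two printed sums of squares agree.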
- A chart can show two variables move together, but cannot prove one causes the other. The link can come from a third variable driving both, coincidence in a small sample, or a time-order effect.
- The line of best fit minimizes the sum of squared vertical distances from the points. It is a useful summary, not a causal model, and it can mislead when the relationship is curved or when an outlier carries too much weight.
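The minimizing property of the line of best fit can be checked numerically. This sketch computes the sum of squared vertical distances for the fitted line on the default data, then for a few nearby lines. The helper `sse` and the nudge sizes are arbitrary choices made for this illustration.

```python
def sse(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and a given line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.0]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
sxx = sum((x - x_mean) ** 2 for x in xs)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = y_mean - slope * x_mean

best = sse(xs, ys, slope, intercept)

# Nudge the slope or the intercept a little in each direction (arbitrary sizes).
nudges = [(0.05, 0.0), (-0.05, 0.0), (0.0, 0.2), (0.0, -0.2)]
perturbed = [sse(xs, ys, slope + ds, intercept + di) for ds, di in nudges]

print(f"fitted line SSE = {best:.4f}")
for (ds, di), value in zip(nudges, perturbed):
    print(f"slope {ds:+.2f}, intercept {di:+.2f}: SSE = {value:.4f}")
```

Every nudged line gives a larger sum than the fitted one, which is what "least squares" means in practice.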
Treat the chart and the numbers as a starting point, not the final word. A chart is most useful when it leads to a better question.
According to the Wikipedia article "Scatter plot", the chart is a two-dimensional display of paired data used to assess the strength, direction, and shape of the relationship between an independent and a dependent variable.
When a single column of values is dense enough that a stem-and-leaf view is the right summary, the Stem-and-Leaf Plot Calculator shows the full distribution of one list without losing the individual values.
Frequently Asked Questions
Q: What is a scatter plot and what is it used for?
A: A scatter plot is a two-dimensional chart that places each observation as a single dot on an X-Y grid, so the cloud of points reveals the relationship between two variables at a glance. The calculator version draws the cloud, overlays an optional least-squares line, and prints Pearson r, R-squared, slope, and intercept next to the chart.
Q: How do you make a scatter plot from two lists of numbers?
A: Paste your X values into the first box and your Y values into the second in matching order, separated by commas, spaces, tabs, or newlines. The chart and the result panel update on every change.
Q: How do you read a scatter plot to find a relationship?
A: Look at direction (does the cloud slope up or down), strength (how tightly the points hug a line), and shape (straight or curved). An upward cloud with a narrow band usually means a positive relationship.
Q: What does the correlation coefficient tell you about a scatter plot?
A: Pearson r is a unitless number between -1 and +1 that summarizes how tightly the points line up. Pair r with a visual check, because a curved or clustered cloud can hide a strong but non-linear pattern.
Q: How do you draw a line of best fit on a scatter plot?
A: Keep the Show Regression Line toggle set to Yes and the calculator will overlay the least-squares line using the slope and intercept it computed. That line minimizes the sum of squared vertical distances from the points.
Q: What is the difference between correlation and causation in a scatter plot?
A: A scatter plot can show that two variables move together, but it cannot prove that one causes the other. The link can come from a third variable driving both, a coincidence in a small sample, or a time-order effect.