EssayFount's statistics hub delivers free step-by-step worked examples across descriptive statistics, probability, hypothesis testing, regression, ANOVA, and non-parametric tests, plus dedicated SPSS and R workflows and AP Statistics review. Every example is written or reviewed by credentialed statisticians and academic writers who explain the method behind each result so students can solve similar problems independently.
Authored by Dr. Naomi Alvarez, PhD Statistics, with fourteen years teaching biostatistics and dissertation methodology. Peer-reviewed by Dr. Clara Bennett, PhD Statistics, with eighteen years teaching probability and applied statistics. Last reviewed April 2026.
How students use the EssayFount statistics hub
Over the past twelve months, 81 verified writing experts holding a Master of Science or PhD in statistics, biostatistics, or applied mathematics produced this hub. Together they compiled 287 fully worked numerical problems and 46 SPSS and R reproducible analyses across the topic lanes below. Students reach the page in three predictable windows: the night before a problem set is due, the two-week stretch before a midterm, and the dissertation results-chapter writing season for graduate students.
Every worked problem passes a two-tier review. A subject-matter writer holding a doctorate or terminal master's drafts each solution; a second senior statistician verifies the numerical steps, the assumption checks, and the final reported value before publication. This mirrors the American Statistical Association's recommended practice for teaching reproducible quantitative analysis. Read more about our writers' coursework support and the credential verification process behind every byline.
The hub complements rather than replaces a course. Students should still attend lectures, read the assigned chapters in Moore, McCabe, and Craig's Introduction to the Practice of Statistics or Casella and Berger's Statistical Inference, and attempt problems unaided first. When a method does not click, the page provides a second explanation with a worked example. For peer subject support, see our math, programming, data science and machine learning, psychology research methods, and business analytics pillars. For a fully written assignment with model results, see our statistics writing service and, for results-chapter graduate work, our dissertation results chapter help.
Descriptive statistics
Descriptive statistics summarize a data set through measures of central tendency (mean, median, mode), measures of dispersion (range, variance, standard deviation, interquartile range), and graphical displays (histograms, boxplots). Descriptive work is the first step of every analysis, both for sanity-checking the data and for choosing which inferential procedure is appropriate.
Mean, median, mode, range, variance, standard deviation
The arithmetic mean is the sum of observations divided by the count. The median is the middle observation after sorting; for an even count, average the two middle values. The mode is the most frequent value, useful for categorical data. The variance is the average squared deviation from the mean; the standard deviation is its square root and shares units with the original data. Sample variance divides by n minus one (Bessel's correction) so the estimator is unbiased.
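These definitions can be cross-checked in a few lines of code. The sketch below uses Python's standard-library statistics module purely as a calculator (the hub's software walkthroughs use SPSS and R); note that statistics.variance and statistics.stdev apply the n minus one divisor described above.

```python
import statistics

data = [4, 7, 8, 10, 12, 13, 15, 18, 22]  # small illustrative sample

mean = statistics.mean(data)      # sum of observations divided by the count
median = statistics.median(data)  # middle value after sorting (n is odd here)
var = statistics.variance(data)   # sample variance: divides by n - 1
sd = statistics.stdev(data)       # square root of the sample variance
```

Running this on the sample above gives a mean of about 12.11, a median of 12, and a sample standard deviation of about 5.64, matching the hand formulas.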
Frequency distributions and histograms
A frequency distribution groups data into bins and counts observations per bin. The histogram is the graphical representation. Bin width matters: too few bins hide structure, too many introduce noise. The Freedman-Diaconis rule sets bin width at twice the interquartile range divided by the cube root of the sample size, a defensible default for unimodal data.
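The rule reads directly off the formula. A minimal Python sketch (purely for illustration; different packages compute quartiles slightly differently, so bin widths can vary a little between tools):

```python
import statistics

def fd_bin_width(data):
    """Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)."""
    # statistics.quantiles with n=4 returns the three quartile cut points;
    # the default "exclusive" method matches the hand calculation used here
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return 2 * iqr / len(data) ** (1 / 3)

width = fd_bin_width([4, 7, 8, 10, 12, 13, 15, 18, 22])
```

For this nine-value sample the interquartile range is 9, so the suggested width is 18 divided by the cube root of 9, about 8.65.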
Boxplots and outlier detection
A boxplot shows the median, the 25th and 75th percentiles (the box edges), and the whiskers extending to the most extreme value within 1.5 times the interquartile range from the box. Observations beyond the whiskers are plotted individually as candidate outliers. Outlier flagging is a starting point, not a deletion rule; investigate whether each flagged value is a data entry error, a genuine extreme observation, or evidence of a population subgroup.
Worked example: compute the five-number summary by hand
Given the sample 4, 7, 8, 10, 12, 13, 15, 18, 22 (n equals 9, already sorted): the minimum is 4 and the maximum is 22. The median is the fifth observation, which is 12. The first quartile is the median of the lower half (4, 7, 8, 10), which is 7.5. The third quartile is the median of the upper half (13, 15, 18, 22), which is 16.5. The interquartile range is 16.5 minus 7.5 equals 9. No values fall beyond 1.5 times 9 from either box edge, so no outliers are flagged.
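The same summary can be verified in code; Python is used here only as a calculator, and every number matches the hand computation above.

```python
data = [4, 7, 8, 10, 12, 13, 15, 18, 22]  # already sorted, n = 9

minimum, maximum = data[0], data[-1]
median = data[len(data) // 2]              # middle of an odd-length sorted list

lower, upper = data[:4], data[5:]          # halves, excluding the median
q1 = (lower[1] + lower[2]) / 2             # median of 4, 7, 8, 10 -> 7.5
q3 = (upper[1] + upper[2]) / 2             # median of 13, 15, 18, 22 -> 16.5
iqr = q3 - q1                              # 9.0

low_fence = q1 - 1.5 * iqr                 # -6.0
high_fence = q3 + 1.5 * iqr                # 30.0
outliers = [x for x in data if x < low_fence or x > high_fence]  # none here
```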
For deeper dives by topic, follow through to the SPSS sub-pillar, R programming sub-pillar, and regression analysis sub-pillar.
Probability and probability distributions
Probability distributions describe how the values of a random variable are distributed. Discrete distributions assign probability mass to each possible value (binomial, Poisson, geometric, hypergeometric); continuous distributions assign probability density across an interval (normal, t, chi-square, F, exponential, uniform). Identifying the right distribution for a problem is half the work.
Discrete distributions: binomial and Poisson
The binomial distribution counts successes in n independent Bernoulli trials each with success probability p. Mean equals n times p, variance equals n times p times one minus p. The Poisson distribution counts events in a fixed interval when events occur at a constant average rate lambda. Mean and variance both equal lambda. Use Poisson when n is large and p is small (lambda equals n times p remains moderate), as a computational shortcut for the binomial.
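A quick numeric check shows how closely the Poisson tracks the binomial in this regime. The sketch below (Python as a neutral calculator, with an illustrative n of 200 and p of 0.01 so that lambda equals 2) compares the two probability mass functions directly:

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

n, p = 200, 0.01                 # large n, small p: lambda = n * p = 2
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, n * p))
              for k in range(10))
# the two pmfs agree to roughly two decimal places across these values of k
```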
Continuous distributions: normal, t, chi-square, F
The normal distribution is fully described by its mean mu and standard deviation sigma. Standardizing by z equals (x minus mu) divided by sigma converts any normal to the standard normal with mean zero and standard deviation one. The Student t distribution adjusts the normal for small samples with unknown population variance and converges to the normal as the degrees of freedom increase. The chi-square distribution arises in variance estimation and goodness-of-fit testing. The F distribution arises as a ratio of two independent chi-square variables divided by their degrees of freedom and underlies analysis of variance.
Central limit theorem
The central limit theorem states that, for sufficiently large sample size, the sampling distribution of the sample mean is approximately normal with mean equal to the population mean and standard deviation equal to the population standard deviation divided by the square root of n, regardless of the population's distribution shape. The "sufficiently large" rule of thumb is n at least 30 for moderately skewed populations; for heavily skewed populations, n in the hundreds may be required.
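The theorem is easy to see in a small simulation. The sketch below (Python, with an illustrative right-skewed exponential population whose mean is one) repeatedly draws samples of size 30 and looks at the distribution of their means:

```python
import random
import statistics

random.seed(42)  # fixed seed so the simulation is reproducible

def sample_mean(n):
    # one sample of size n from an exponential population with mean 1
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean(30) for _ in range(2000)]

center = statistics.mean(means)   # close to the population mean, 1
spread = statistics.stdev(means)  # close to 1 / sqrt(30), about 0.18
```

Even though the population is heavily skewed, the simulated sample means cluster symmetrically around 1 with spread near the theoretical standard error.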
Worked example: normal probability calculation
SAT total scores are approximately normal with mean 1050 and standard deviation 200. What proportion of test takers score above 1300? Standardize: z equals (1300 minus 1050) divided by 200, which is 1.25. The standard normal probability above z equals 1.25 is about 0.1056 from the z table. So roughly 10.6 percent of test takers score above 1300.
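The z-table lookup can be reproduced exactly with the complementary error function, since the standard normal upper tail equals one half of erfc of z divided by root two:

```python
import math

def normal_upper_tail(x, mu, sigma):
    """P(X > x) for X ~ Normal(mu, sigma)."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

p = normal_upper_tail(1300, 1050, 200)  # z = 1.25, tail probability ~ 0.1056
```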
Inferential statistics and hypothesis testing
Hypothesis testing evaluates whether sample evidence is strong enough to reject a null hypothesis at a chosen significance level alpha. The test produces a test statistic and a p value; if the p value falls below alpha, the null hypothesis is rejected. The American Statistical Association's 2016 statement on p values cautions against treating significance as a binary truth and recommends reporting effect sizes and confidence intervals alongside any p value.
Null and alternative hypotheses
The null hypothesis (H zero) states no effect or no difference; the alternative hypothesis (H one) states the effect or difference of interest. Two-sided alternatives test for any difference (mu not equal to mu zero); one-sided alternatives test for a specific direction (mu greater than mu zero or mu less than mu zero). Choose the alternative before looking at the data.
One-sample and two-sample t-tests
A one-sample t-test compares a sample mean to a hypothesized population mean. The test statistic is t equals (sample mean minus hypothesized mean) divided by (sample standard deviation divided by square root of n), with degrees of freedom n minus one. A two-sample t-test compares two independent group means; with equal variances assumed, use the pooled variance form. With unequal variances, use Welch's t-test (the default in R's t.test() function).
Z-tests and confidence intervals
Use a z-test when the population standard deviation is known and the sample is large; use a t-test otherwise. A confidence interval for the mean is sample mean plus or minus t critical times standard error. A 95 percent confidence interval contains the true population mean in 95 percent of repeated samples, not "with 95 percent probability" for any single interval (a common misinterpretation).
Type I and Type II error
Type I error rejects a true null hypothesis; its probability equals alpha. Type II error fails to reject a false null hypothesis; its probability is denoted beta. Power is one minus beta, the probability of correctly detecting an effect of a given size. Increasing sample size increases power without inflating Type I error and is the most reliable way to improve a study.
Worked example: two-sample t-test for means
A nursing instructor compares mean test scores between two sections. Section A has n equals 30, sample mean 78, standard deviation 10. Section B has n equals 32, sample mean 73, standard deviation 12. Is the difference statistically significant at alpha equals 0.05? Welch's t equals (78 minus 73) divided by the square root of (10 squared divided by 30 plus 12 squared divided by 32), which is 5 divided by the square root of (3.333 plus 4.5), which is 5 divided by 2.798, which is 1.79. Welch's degrees of freedom approximation gives about 59. Two-sided p value is roughly 0.078. Fail to reject the null at alpha equals 0.05; the difference is not statistically significant, though the point estimate suggests a meaningful direction worth a larger study.
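The arithmetic above can be verified in a few lines (Python as a calculator; the hub's software walkthroughs use SPSS and R), including the Welch-Satterthwaite degrees-of-freedom approximation:

```python
import math

# summary statistics from the worked example
n1, m1, s1 = 30, 78, 10   # Section A
n2, m2, s2 = 32, 73, 12   # Section B

v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2          # per-group variance of the mean
se = math.sqrt(v1 + v2)                      # standard error of the difference
t = (m1 - m2) / se                           # Welch t statistic, ~ 1.79

# Welch-Satterthwaite degrees of freedom, ~ 59
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
```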
Regression analysis
Regression analysis models the conditional mean of a response variable as a function of one or more predictor variables. Linear regression is the workhorse; logistic regression handles binary outcomes; Poisson and negative binomial regression handle count outcomes. Every regression workflow runs through three checks: model specification, assumption verification, and inference on the estimated coefficients.
Simple linear regression and least squares
Simple linear regression fits y equals beta zero plus beta one times x plus epsilon, where epsilon is a normally distributed error term with mean zero and constant variance. The ordinary least squares estimates minimize the sum of squared residuals. The slope beta one represents the expected change in y for a one-unit increase in x; the intercept beta zero represents the expected y when x equals zero (which may not be physically meaningful and should not be over-interpreted).
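The least squares estimates have closed forms: the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the fitted line passes through the point of means. A minimal sketch with a small made-up data set:

```python
# illustrative data (hypothetical, chosen so the true slope is near 2)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # cross-deviations
sxx = sum((xi - mx) ** 2 for xi in x)                     # x deviations squared

b1 = sxy / sxx        # slope estimate: ~ 1.99 for this data
b0 = my - b1 * mx     # intercept: the line passes through (mean x, mean y)
```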
Multiple regression and adjusted R-squared
Multiple regression generalizes to two or more predictors. The R-squared statistic measures the proportion of variance in y explained by the model and always increases as predictors are added (even useless ones). The adjusted R-squared penalizes for the number of predictors and provides a fairer comparison across models. For predictive performance comparison across nested models, use cross-validated mean squared error rather than in-sample R-squared.
Logistic regression for binary outcomes
When the outcome y takes values 0 or 1, linear regression is inappropriate. Logistic regression models the log odds of y equals 1 as a linear function of the predictors. Coefficients are interpreted as log odds ratios; exponentiating a coefficient gives the odds ratio for a one-unit increase in the predictor. Goodness of fit uses the deviance and tests like the Hosmer-Lemeshow statistic; predictive performance uses the area under the receiver operating characteristic curve.
Diagnostic plots and assumption checks
R's plot() applied to an lm object produces four diagnostic plots: residuals versus fitted (linearity and constant variance), normal Q-Q (residual normality), scale-location (constant variance again, on a square-root scale), and residuals versus leverage with Cook's distance contours (influential points). Trust the diagnostic plots; if the residuals versus fitted plot shows a clear pattern, the linearity assumption is violated and the inference is unreliable regardless of the p value.
Worked example: fit and interpret a multiple regression
A college regresses freshman GPA on SAT total score and high school GPA for n equals 200 students. Estimated model: freshman GPA equals 0.20 plus 0.0008 times SAT plus 0.45 times high school GPA. Adjusted R-squared equals 0.42. Both coefficients are statistically significant at alpha equals 0.01. Interpretation: holding high school GPA constant, a 100 point increase in SAT is associated with an expected 0.08 increase in freshman GPA. Holding SAT constant, a one-point increase in high school GPA is associated with an expected 0.45 increase in freshman GPA. The model explains roughly 42 percent of the variance in freshman GPA, which is a meaningful but incomplete predictor.
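The coefficient interpretation can be made concrete by plugging the fitted equation into a prediction function (the coefficients below are the hypothetical values from the example, not real admissions data):

```python
# coefficients from the worked example above (hypothetical fitted model)
b0, b_sat, b_hsgpa = 0.20, 0.0008, 0.45

def predict_gpa(sat, hs_gpa):
    """Predicted freshman GPA from the fitted equation."""
    return b0 + b_sat * sat + b_hsgpa * hs_gpa

# holding high school GPA fixed at 3.5, a 100-point SAT increase shifts
# the prediction by 0.0008 * 100 = 0.08, exactly as interpreted above
delta = predict_gpa(1200, 3.5) - predict_gpa(1100, 3.5)
```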
Stuck on a regression interpretation or SPSS output for your dissertation chapter four? Send the file and receive a verified statistics writer quote within hours.
Analysis of variance (ANOVA)
Analysis of variance partitions total variability into between-group and within-group components and tests whether group means differ. ANOVA assumes independent observations, normally distributed errors within each group, and equal variances across groups (homoscedasticity). When assumptions fail, use Welch's ANOVA or non-parametric alternatives below.
One-way ANOVA
One-way ANOVA tests whether three or more group means differ. The F statistic equals the mean square between groups divided by the mean square within groups. A significant F means at least one group mean differs from the others; a follow-up post-hoc test (Tukey's honestly significant difference, Bonferroni-corrected pairwise t-tests) identifies which pairs differ.
Two-way ANOVA with interaction
Two-way ANOVA includes two categorical predictors and tests three effects: the main effect of factor A, the main effect of factor B, and the A by B interaction. A significant interaction means the effect of one factor depends on the level of the other; in that case, interpret the simple effects rather than the main effects.
Post-hoc tests: Tukey's honestly significant difference
Tukey's honestly significant difference test controls the family-wise Type I error rate across all pairwise comparisons. Bonferroni correction is more conservative (and less powerful) but flexible across any set of contrasts. Both should be reported with the underlying ANOVA F test, never as a substitute for it.
Worked example: one-way ANOVA on three groups
A clinical study compares mean systolic blood pressure across three diet groups (low-sodium, Mediterranean, control), each with n equals 25. Group means: low-sodium 124, Mediterranean 128, control 134. Within-group standard deviation is approximately 8 across all groups. F is approximately 9.9 with degrees of freedom 2 and 72; the p value is below 0.001. Reject the null. Tukey's honestly significant difference identifies low-sodium as significantly lower than control (p below 0.001) and Mediterranean as significantly lower than control (p equals 0.012). Low-sodium and Mediterranean do not differ significantly from each other (p equals 0.184).
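The F statistic can be re-derived from the summary statistics alone, since equal group sizes and a common within-group standard deviation are given:

```python
# summary statistics from the worked example
ns = [25, 25, 25]
means = [124, 128, 134]
sd_within = 8                      # common within-group standard deviation

grand = sum(n * m for n, m in zip(ns, means)) / sum(ns)   # grand mean

ss_between = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
ms_between = ss_between / (len(ns) - 1)   # df1 = k - 1 = 2
ms_within = sd_within ** 2                # df2 = N - k = 72

F = ms_between / ms_within
```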
Non-parametric tests
When the normality or equal-variance assumption fails and a transformation does not fix it, switch to non-parametric tests. They make weaker distributional assumptions at the cost of slightly lower power when the parametric assumptions actually hold.
Mann-Whitney U
The Mann-Whitney U test compares two independent groups on an ordinal or skewed continuous outcome. It tests whether one distribution is stochastically greater than the other and is equivalent to the Wilcoxon rank-sum test.
Kruskal-Wallis
Kruskal-Wallis is the non-parametric counterpart to one-way ANOVA, comparing three or more independent groups. Follow with Dunn's test for pairwise comparisons.
Wilcoxon signed-rank
The Wilcoxon signed-rank test compares paired observations or a single sample against a hypothesized median. Use it when the paired-differences fail the normality check that a paired t-test would require.
Chi-square test of independence
The chi-square test of independence assesses whether two categorical variables are associated. Compute expected counts under independence (row total times column total divided by grand total per cell), then sum (observed minus expected) squared divided by expected across cells. Compare to a chi-square distribution with (rows minus one) times (columns minus one) degrees of freedom. Cells with expected count below 5 inflate the Type I error rate; pool sparse categories or use Fisher's exact test for small tables.
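The expected-count and statistic formulas are straightforward to code. A sketch with an illustrative 2 by 2 table (made-up counts):

```python
# chi-square test of independence on a hypothetical 2 x 2 table
observed = [[30, 20],    # rows: groups; columns: outcome categories
            [15, 35]]

row_totals = [sum(row) for row in observed]          # [50, 50]
col_totals = [sum(col) for col in zip(*observed)]    # [45, 55]
grand = sum(row_totals)                              # 100

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # expected count under independence: row total * column total / grand
        expected = row_totals[i] * col_totals[j] / grand
        chi_sq += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)    # (r - 1)(c - 1) = 1
```

For this table the statistic is about 9.09 on 1 degree of freedom; all expected counts are well above 5, so the chi-square approximation applies.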
SPSS data analysis help
SPSS data analysis uses a graphical interface for descriptive, inferential, and multivariate statistics. The IBM SPSS Statistics package is the dominant tool in social science, public health, and nursing research because of its menu-driven workflow and structured output. Learning the menu equivalents of common procedures is the fastest path to productive use.
Importing data and variable view
Import a CSV with File > Open > Data. Switch to Variable View to set variable name, type (numeric, string, date), label (a longer description shown in output), values (numeric codes mapped to category labels), missing (codes for missing data), and measure (nominal, ordinal, scale). Setting the measure correctly upfront avoids procedure errors later.
Running descriptives and frequencies
Analyze > Descriptive Statistics > Descriptives produces means, standard deviations, minima, and maxima for scale variables. Analyze > Descriptive Statistics > Frequencies produces counts and percentages for categorical variables and is the right starting point for any analysis. Analyze > Descriptive Statistics > Crosstabs produces two-way contingency tables with an optional chi-square test.
Running t-tests, ANOVA, regression in SPSS
For an independent-samples t-test, Analyze > Compare Means > Independent-Samples T Test, then specify the test variable and the grouping variable. For one-way ANOVA, Analyze > Compare Means > One-Way ANOVA. For linear regression, Analyze > Regression > Linear, then specify the dependent and independent variables. SPSS output reports the unstandardized B, standard error, standardized beta, t statistic, and significance for each coefficient.
Reading SPSS output and reporting in APA format
Report a t-test result as: t(df) equals statistic, p equals value, with means and standard deviations for both groups in the prose. Report a regression as: F(df1, df2) equals statistic, p equals value, R squared equals value, with each coefficient's B, standard error, and significance in a table. APA seventh edition style sets the bar for clean reporting in published research.
R programming help for statistics
R programming for statistics is the open-source standard in academic statistics, biostatistics, and data science. The R Foundation maintains the language; the Comprehensive R Archive Network (CRAN) hosts thousands of contributed packages. RStudio (now Posit) is the dominant IDE.
Reading data with read.csv and tidyverse
Base R's read.csv() reads a comma-separated file into a data frame. The tidyverse alternative readr::read_csv() is faster, infers column types more robustly, and returns a tibble. The full tidyverse workflow uses dplyr verbs (filter, select, mutate, summarize, group_by, arrange) for data manipulation and ggplot2 for visualization, all chained with the pipe operator (the native pipe |> or magrittr's %>%).
lm() for linear regression and glm() for logistic
The lm() function fits a linear model: model equals lm(y tilde x1 plus x2, data equals df). summary(model) prints coefficients, standard errors, t statistics, and overall fit. anova(model) gives the ANOVA decomposition. For logistic regression, glm(y tilde x1 plus x2, family equals binomial, data equals df) fits the model; exponentiating the coefficients with exp(coef(model)) returns odds ratios.
ggplot2 for diagnostic visualization
ggplot2 builds plots layer by layer. A scatter plot with a fitted regression line: ggplot(df, aes(x, y)) plus geom_point() plus geom_smooth(method equals "lm"). Diagnostic plots from a lm object: plot(model) prints the four standard diagnostic plots, or use the performance package's check_model() for an enhanced version with statistical tests overlaid.
Reproducible reporting with R Markdown
R Markdown weaves narrative text and R code into a single document that renders to HTML, PDF, or Word. Code chunks are evaluated at render time, so the report and the analysis stay in sync. Quarto, its successor, extends the workflow to Python and Julia and is now the recommended tool for new projects. Reproducibility is among the most valued skills in modern statistical practice.
AP Statistics preparation
AP Statistics is organized around three big ideas defined by the College Board: variation and distribution; patterns and uncertainty; and data-based predictions, decisions, and conclusions. The curriculum maps to a one-semester introductory college statistics course and rewards conceptual understanding over rote calculation.
AP Statistics topic breakdown
Unit one covers exploring categorical and quantitative data. Unit two covers two-variable data. Unit three covers data collection. Unit four covers probability and discrete distributions. Unit five covers sampling distributions. Unit six covers inference for proportions. Unit seven covers inference for means. Unit eight covers chi-square inference. Unit nine covers inference for regression slopes.
Multiple-choice and free-response strategy
Multiple choice rewards eliminating two wrong answers fast and choosing among the remaining two. Free response rewards complete sentences, contextual interpretations, and explicit assumption checks. Lose one credit for skipping the assumption check; lose another for failing to interpret the answer in context. Practice with released free-response questions from the College Board Course and Exam Description.
Sample free-response walkthrough
Released free-response questions are graded on a four-point rubric (essentially correct, partially correct, minimal, no credit) per part. Read the rubric for each released question alongside your written answer and self-grade. Repeat with five questions, focus revision on the parts that consistently lose credit, then take a full timed practice section. Expect a one-letter-grade improvement after roughly twelve hours of structured practice.
Statistics writing service
EssayFount pairs statistics students with verified PhD and Master of Science statisticians and academic writers for annotated solutions, weekly study sessions with written follow-ups, and full assignment review across descriptive, inferential, regression, ANOVA, SPSS, and R workflows. Over the past year, the team supported students across 41 universities with a 96 percent on-time delivery rate and a 4.83 out of 5 average rating.
Credentialed statistics writers
Every writer holds a PhD or terminal Master of Science in statistics, biostatistics, applied mathematics, or quantitative social science. Each candidate completes a written test with two SPSS reproducibles and one written interpretation before being approved for student work. Writer credentials are visible on the writers page and on every assigned project.
Pricing, per problem and per assignment
Per-problem pricing starts at the low double digits for an undergraduate descriptive or inferential question and scales by complexity. Per-assignment pricing covers full problem sets, SPSS or R deliverables, and dissertation chapter four results sections. Visit our pricing page for the current rate card and turnaround tiers.
How it works, including dissertation chapter four
Submit the assignment or data set through the quote form. A statistics writer accepts the project within an hour during business windows. For dissertation chapter four (results), the writer cleans the data, runs the analyses specified by your committee-approved methodology, generates APA-style tables and figures, and writes the narrative interpretation. You receive an editable SPSS or R script and the written results section together so you can re-run and modify as needed during defense prep.
Free statistics resources
Beyond the worked-example hub above, EssayFount publishes free reference resources to support self-study. These work as a quick lookup during a homework session and as a study aid before a midterm.
Probability and t-tables
Standard normal, Student t, chi-square, and F distribution tables in printable PDF, formatted for legibility. Useful when an exam restricts calculator use or when a student needs to sanity-check software output by hand.
Worked-example library
The full library of 287 numerical examples and 46 SPSS and R reproducibles, indexed by topic and difficulty. Browse by topic from the navigation above or search by procedure name (one-way ANOVA, multiple regression, logistic regression).
SPSS and R cheat sheets
Side-by-side cheat sheets mapping common procedures across SPSS menu paths and R function calls. Helpful for students transitioning between courses that use different software.
Try the free hypothesis-test, regression, and ANOVA calculators for quick numeric checks during homework.
Ready to get a verified statistics writer on your assignment?
EssayFount pairs you with a credentialed statistics writer in under an hour during weekday windows. Whether you need a single problem walked through, an SPSS output corrected, or a full dissertation chapter four written, you receive an editable script and a clean APA write-up together. Request a step-by-step solution and have a verified statistician on your problem within hours.