In statistics, Studentization, named after William Sealy Gosset, who wrote under the pseudonym "Student", is the process of dividing a statistic derived from a sample (such as the sample mean) by a sample-based estimate of a population standard deviation. Unlike "normalization", where only the numerator is uncertain, Studentization has both numerator and denominator uncertain, typically (that is, under Gaussian assumptions) following a Student's t-distribution. The term is also used for the standardisation of a higher-degree statistic by another statistic of the same degree:[1][2] for example, an estimate of the third central moment would be standardised by dividing by the cube of the sample standard deviation.
While standardization typically involves dividing a centered variable by the known population standard deviation (σ), studentization is required when σ is unknown. A simple example is the process of dividing a sample mean by the sample standard deviation when data arise from a location-scale family. The consequence of "Studentization" is that the complication of treating the probability distribution of the mean, which depends on both the location and scale parameters, has been reduced to considering a distribution which depends only on the location parameter. However, the fact that a sample standard deviation is used, rather than the unknown population standard deviation, complicates the mathematics of finding the probability distribution of a Studentized statistic. Because the denominator is itself a random variable that fluctuates from sample to sample, the resulting statistic typically follows a more spread-out distribution, such as the Student's t-distribution, rather than the standard normal distribution.
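The contrast between standardization and studentization can be sketched in a few lines. The sample below is simulated, and the hypothesized mean and population standard deviation are made-up values chosen for illustration:

```python
import math
import random
import statistics

# Hypothetical small sample drawn from a Normal(10, 2) population.
random.seed(42)
sample = [random.gauss(10, 2) for _ in range(5)]

n = len(sample)
xbar = statistics.mean(sample)
mu0 = 10.0                   # hypothesized population mean
sigma = 2.0                  # population standard deviation, known only in this illustration
s = statistics.stdev(sample) # sample standard deviation (n - 1 in the denominator)

# Standardization: the denominator uses the known sigma, so z is standard normal.
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Studentization: the denominator uses the estimate s, so t follows a
# Student's t-distribution with n - 1 degrees of freedom.
t = (xbar - mu0) / (s / math.sqrt(n))
```

Because s fluctuates from sample to sample while sigma does not, t has heavier tails than z, which is exactly the extra spread the t-distribution captures.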
The development of studentization was driven by practical needs in industrial quality control during the early 20th century. The concept is closely associated with the work of William Sealy Gosset, a chemist and a mathematician who studied at New College, Oxford, and worked for the Guinness brewery in Dublin. Gosset faced practical quality-control problems involving small samples while analyzing the quality of raw materials like barley and hops.
At the time, the prevailing statistical methods, largely developed by Karl Pearson, relied on large datasets where the population standard deviation (σ) could be assumed to be known. The standard normal (Z) test was commonly used for inference about means, but it required knowledge of the population standard deviation. However, in industrial and laboratory contexts, the population variance was often unknown and had to be estimated from the sample.
Gosset recognized that replacing the population standard deviation with the sample standard deviation (s) altered the distribution of the test statistic, introducing additional uncertainty, particularly when sample sizes were very small. Because the brewery could only afford to take very small samples (often as few as three or four measurements), the traditional Z-test consistently underestimated the error, leading to incorrect conclusions about the quality of the beer.
To address this issue, he developed a family of probability distributions that accounted for this extra variability. His seminal work was published in 1908 in the journal Biometrika under the pseudonym "Student" (due to Guinness's policy of keeping technical discoveries secret), leading to what is now known as the Student's t-distribution. Studentization emerged as the central mechanism underlying this adjustment. Later, Ronald A. Fisher refined these ideas by formalizing the use of degrees of freedom, typically n − 1, which determine the shape of the t-distribution.[4]
Studentized residuals
In regression analysis, studentized residuals are a type of standardized residual that are particularly useful for identifying outliers and influential observations. In a typical linear regression model, the raw residuals (the difference between the observed values and the values predicted by the model) do not all have the same variance, even if the underlying errors have equal variance. This occurs because the variance of each residual depends on the "leverage" of its corresponding data point—points further from the mean of the independent variables have higher leverage and smaller residual variance.
To make residuals comparable and easier to interpret, statisticians use studentization to "equalize" them. This is done by dividing each raw residual by an estimate of its standard deviation. There are two main types of studentized residuals:
Internally studentized residuals: These use a variance estimate based on the entire dataset, including the observation being tested. While useful, a major drawback is that an extreme outlier can "pull" the model toward itself, inflating the global variance estimate. This is known as "masking," where the outlier's own influence makes it appear less extreme than it actually is. For example, in a dataset with one extreme outlier, the internally studentized residual may appear moderate while the externally studentized version reveals it clearly.
Externally studentized residuals (also known as deleted residuals): To overcome the masking effect, the variance for the i-th residual is estimated by fitting the model to the dataset excluding the i-th observation. This ensures that a single anomalous data point does not contaminate its own error estimate, making this method much more sensitive for outlier detection.
The use of studentized residuals is a standard part of regression diagnostics. By plotting these residuals against predicted values, researchers can verify if the assumptions of the linear model (such as homoscedasticity) hold true or if specific data points are distorting the results of the entire analysis.
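Both kinds of studentized residuals can be computed from the hat matrix of an ordinary least-squares fit. The sketch below uses a toy dataset with one planted outlier (all values are hypothetical) and the standard leave-one-out identity for the externally studentized version:

```python
import numpy as np

# Toy data: a near-linear relationship with one anomalous final point.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 2.0, 2.9, 4.2, 5.1, 9.0])  # last observation is the outlier

n = len(x)
X = np.column_stack([np.ones(n), x])            # design matrix with intercept
p = X.shape[1]                                  # number of fitted parameters
H = X @ np.linalg.inv(X.T @ X) @ X.T            # hat matrix
h = np.diag(H)                                  # leverages h_ii

beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta                            # raw residuals
ssr = resid @ resid

# Internally studentized: variance estimated from the full fit,
# so the outlier inflates its own denominator ("masking").
s2 = ssr / (n - p)
r_int = resid / np.sqrt(s2 * (1 - h))

# Externally studentized: variance estimate excludes observation i,
# via the leave-one-out identity s2_(i) = (SSR - e_i^2/(1-h_i)) / (n - p - 1).
s2_i = (ssr - resid**2 / (1 - h)) / (n - p - 1)
r_ext = resid / np.sqrt(s2_i * (1 - h))
```

For the planted outlier, the internally studentized residual stays modest while the externally studentized one is dramatically larger, which is the masking effect described above.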
Studentized range
Beyond its historical origins, Studentization appears in several modern statistical contexts, none more widely used than the studentized range. In statistics, the studentized range is defined as the difference between the maximum and minimum values of a sample, divided by the sample standard deviation:
q = (x_max − x_min) / s
where s is the sample standard deviation of the data.
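The definition above is direct to compute. The measurements below are made-up values used only to illustrate the formula:

```python
import statistics

# Hypothetical measurements from a single sample.
sample = [4.2, 5.1, 3.8, 6.0, 4.9]

s = statistics.stdev(sample)           # sample standard deviation
q = (max(sample) - min(sample)) / s    # studentized range
```

Dividing the raw range by s expresses it in units of the sample's own spread, which is what makes values of q comparable across samples measured on different scales.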
This statistic is the basis for Tukey's HSD (Honestly Significant Difference) test, which allows researchers to compare the means of several groups to see which ones are significantly different from each other. Without studentization, comparing multiple groups would significantly increase the risk of a Type I error (finding a difference where none exists). By using a studentized scale, the test provides a consistent threshold that accounts for the number of groups being compared, ensuring that the familywise error rate across all comparisons remains controlled.
In fields like biology and psychology, where experiments often involve multiple treatment groups, the studentized range distribution provides a more robust framework than running multiple individual t-tests.
1. Dodge, Y. (2003) The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-850994-4
2. Kendall, M.G., Stuart, A. (1973) The Advanced Theory of Statistics. Volume 2: Inference and Relationship, Griffin. ISBN 0-85264-215-6 (Section 20.31–2)
3. Davison, A.C., Hinkley, D.V. (1997) Bootstrap Methods and their Application, CUP. ISBN 0-521-57471-4
4. Student (March 1908). "The Probable Error of a Mean". Biometrika. 6 (1): 1–25.