Algebraic statistics is a branch of mathematical statistics that focuses on the use of algebraic, geometric, and combinatorial methods in statistics. While the use of these methods has a long history in statistics, algebraic statistics is continuously forging new interdisciplinary connections.
The field experienced a major revitalization in the 1990s. In 1996, Pistone and Wynn applied computational commutative algebra to the design and analysis of experiments, providing new tools for understanding confounding and identifiability in complex experimental settings. In 1998, Diaconis and Sturmfels introduced Gröbner bases for constructing Markov chain Monte Carlo algorithms for conditional sampling from discrete exponential families. These works, along with the monograph by Giovanni Pistone, Eva Riccomagno, and Henry P. Wynn, in which the term “algebraic statistics” was first used, played a pivotal role in establishing the field as a unified area of research.
Modern researchers in algebraic statistics explore a wide range of topics, including computational biology, graphical models, and statistical learning.
Consider a random variable $X$ which can take on the values $0, 1, 2$. Such a variable is completely characterized by the three probabilities
$$p_i = \Pr(X = i), \quad i = 0, 1, 2,$$
and these numbers satisfy
$$\sum_{i=0}^{2} p_i = 1 \quad \text{and} \quad 0 \le p_i \le 1.$$
Conversely, any three such numbers unambiguously specify a random variable, so we can identify the random variable $X$ with the tuple $(p_0, p_1, p_2) \in \mathbb{R}^3$.
Now suppose $X$ is a binomial random variable with parameters $q$ and $n = 2$, i.e. $X$ represents the number of successes when repeating a certain experiment two times, where each experiment has an individual success probability of $q$. Then
$$p_i = \Pr(X = i) = \binom{2}{i} q^i (1 - q)^{2 - i},$$
and it is not hard to show that the tuples $(p_0, p_1, p_2)$ which arise in this way are precisely the ones satisfying
$$4 p_0 p_2 - p_1^2 = 0.$$
The latter is a polynomial equation defining an algebraic variety (or surface) in $\mathbb{R}^3$, and this variety, when intersected with the simplex given by
$$\sum_{i=0}^{2} p_i = 1, \quad 0 \le p_i \le 1,$$
yields a piece of an algebraic curve which may be identified with the set of all binomial random variables with $n = 2$. Determining the parameter $q$ amounts to locating one point on this curve; testing the hypothesis that a given variable $X$ is binomial amounts to testing whether a certain point lies on that curve or not.
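The curve-membership test described above can be illustrated with a minimal Python sketch. It assumes the three states are 0, 1, 2, that a distribution is stored as a tuple (p0, p1, p2), and that the defining equation is 4·p0·p2 − p1² = 0, which follows from the binomial formula with n = 2. The function names are hypothetical and chosen for illustration only; exact rational arithmetic is used so that the polynomial equation can be checked without rounding error.

```python
from fractions import Fraction


def binomial_point(q):
    """Return (p0, p1, p2) for a binomial variable with n = 2 and success probability q."""
    return ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)


def in_simplex(p):
    """Check that the entries are probabilities summing to 1."""
    return sum(p) == 1 and all(0 <= x <= 1 for x in p)


def on_curve(p):
    """Check that p lies in the simplex and satisfies 4*p0*p2 - p1**2 == 0."""
    p0, p1, p2 = p
    return in_simplex(p) and 4 * p0 * p2 - p1 ** 2 == 0


def recover_q(p):
    """For a point on the curve, q equals the mean divided by n = 2, i.e. p2 + p1/2."""
    p0, p1, p2 = p
    return p2 + p1 / 2


# A point generated from q = 1/3 lies on the curve, and q can be read back from it.
point = binomial_point(Fraction(1, 3))
print(point, on_curve(point), recover_q(point))

# The uniform distribution on {0, 1, 2} is in the simplex but not on the curve.
uniform = (Fraction(1, 3), Fraction(1, 3), Fraction(1, 3))
print(on_curve(uniform))
```

With observed data the probabilities are only estimated, so an exact equality check like this one would be replaced by a statistical test; the sketch only illustrates the geometric statement.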