What is Algebraic Statistics?
Algebraic Statistics is a new field, less than a decade old, whose precise scope is still emerging. The term itself was coined by Giovanni Pistone, Eva Riccomagno and Henry Wynn as the title of their book [Pistone et al. 2000]. That book explains how polynomial algebra arises in problems from experimental design and discrete probability, and it demonstrates how computational algebra techniques can be applied to statistics. The algebraic view of discrete statistical models has been useful for a number of applications, including the study of Markov bases and conditional inference [Diaconis and Sturmfels 1998], phylogenetic tree reconstruction using invariants [Sturmfels and Sullivant 2005], disclosure limitation [Sullivant 2005], and parametric inference [Pachter and Sturmfels 2004], to name a few. The central idea, that discrete statistical models are the non-negative real points on certain algebraic varieties (so-called algebraic statistical models), has itself been the stepping stone to the discovery of other connections between algebraic geometry and statistics.
Example: A hidden Markov model (HMM) determines a subset M (the model) of the space P of possible probability distributions. M is an algebraic subset of a simplex, generally of high dimension. The model describes this subset parametrically, but the subset can also be described by systems of equations. One problem is to determine which point on the model M is closest to the probability distribution given by a data set (maximum likelihood estimation). Many problems in computational biology, e.g., sequence alignment, have the structure just described. The meeting is intended not only to explore the algebra of HMMs, but also to find and develop more connections to practical problems arising from the use of such models.
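
The parametric description in the example can be made concrete with a small sketch. The code below is an illustration only, not part of the meeting material: the function names (hmm_distribution, log_likelihood) and all numerical parameter values are assumptions chosen for the example. It enumerates every hidden path of a two-state, two-symbol HMM, so each coordinate of the resulting distribution is visibly a polynomial in the initial, transition and emission probabilities. The brute-force enumeration is exponential in the sequence length and is meant to expose the algebra, not to be efficient.

```python
from itertools import product
from math import log


def hmm_distribution(pi, A, B, n):
    """Map HMM parameters to a point in the probability simplex.

    pi[i]   : probability of starting in hidden state i
    A[i][j] : probability of moving from hidden state i to j
    B[i][k] : probability that hidden state i emits symbol k
    Returns a dict {observed sequence of length n: probability}.
    Each probability is a sum over hidden paths of products of
    parameters, i.e. a polynomial in the entries of pi, A and B.
    """
    states = range(len(pi))
    symbols = range(len(B[0]))
    dist = {}
    for obs in product(symbols, repeat=n):
        total = 0.0
        for path in product(states, repeat=n):
            # One monomial: start, then alternate transition and emission.
            p = pi[path[0]] * B[path[0]][obs[0]]
            for t in range(1, n):
                p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
            total += p
        dist[obs] = total
    return dist


def log_likelihood(counts, dist):
    """Log-likelihood of observed counts under a model point `dist`."""
    return sum(c * log(dist[obs]) for obs, c in counts.items() if c > 0)


# Hypothetical parameter values, chosen only for illustration.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
B = [[0.9, 0.1], [0.3, 0.7]]

# A point of the model M inside the 7-dimensional simplex (8 coordinates).
dist = hmm_distribution(pi, A, B, 3)

# Hypothetical data: how often each length-3 sequence was observed.
counts = {(0, 0, 0): 5, (0, 1, 1): 3, (1, 1, 1): 2}
ll = log_likelihood(counts, dist)
```

Varying pi, A and B traces out the model M inside the simplex. Maximum likelihood estimation searches for the parameter values that maximize log_likelihood for the given counts, which amounts to finding the point of M closest to the empirical distribution in the sense of Kullback-Leibler divergence.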
