
Upon my return [to academia, after years of private statistical consulting], I started reading the Annals of Statistics … and was bemused. Every article started with:


Assume that the data are generated by the following model…


followed by mathematics exploring inference, hypothesis testing, and asymptotics…. I [have a] very low … opinion … of the theory published in the Annals of Statistics. [S]tatistics [is] a science that deals with data.

The linear regression model led to many erroneous conclusions that appeared in journal articles waving the 5% significance level without knowing whether the model fit the data. Nowadays, I think most statisticians will agree that this is a suspect way to arrive at conclusions.
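A quick sketch of that failure mode (my example, not Breiman's; NumPy and SciPy assumed): a straight line fit to plainly nonlinear data still yields a slope that is "significant" at the 5% level, yet the residuals show the linear model does not fit, so the significance test alone says nothing about whether the conclusion is sound.

```python
# A sketch of the failure mode above (NumPy/SciPy assumed; not Breiman's code):
# data generated by a quadratic process, fit with a straight line.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 0.1 * x**2 + rng.normal(scale=1.0, size=x.size)   # truth is curved, not linear

fit = stats.linregress(x, y)
print(f"slope p-value: {fit.pvalue:.2g}")   # far below 0.05, so "significant"

# But the residuals curve systematically with x: the linear model is misspecified,
# so the significant slope is not evidence for the assumed straight-line relationship.
residuals = y - (fit.intercept + fit.slope * x)
print("residual/curvature correlation:",
      round(float(np.corrcoef(residuals, (x - x.mean())**2)[0, 1]), 2))
```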

In the mid-1980s … A new research community … sprang up. Their goal was predictive accuracy…. They began working on complex prediction problems where it was obvious that data models were not applicable: speech recognition, image recognition, nonlinear time series prediction, handwriting recognition, prediction in financial markets.

The advances in methodology and increases in predictive accuracy since the mid-1980s that have occurred in machine-learning research have been phenomenal…. What has been learned? The three lessons that seem most important:

  • Rashomon: the multiplicity of good models;
  • Occam: the conflict between simplicity and accuracy;
  • Bellman: dimensionality — blessing or curse.

Leo Breiman, "Statistical Modeling: The Two Cultures" (2001)

(The two cultures being: machine learning / artificial intelligence / algorithmists vs. model builders / statistics / econometrics / psychometrics.)
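To make Breiman's "Rashomon" lesson concrete, here is a minimal sketch (my example, assuming scikit-learn is available): two structurally different models reach roughly the same predictive accuracy on the same data, even though they describe it in very different terms, so the data alone do not single out one of them as "the" mechanism.

```python
# A small illustration of the "Rashomon" lesson (mine, not Breiman's):
# two structurally different models reach roughly the same predictive
# accuracy on the same data. Assumes scikit-learn is installed.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

linear = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Comparable accuracy from a linear model and an ensemble of trees:
print("logistic regression:", round(linear.score(X_te, y_te), 3))
print("random forest:      ", round(forest.score(X_te, y_te), 3))
```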
