Bayesian statistics: the parameters you want to infer are modeled as random variables with a non-empirical prior, and Bayes' rule is used to draw inferences.
Frequentist statistics: you construct estimators for the parameters you care about, subject to appropriate loss/risk criteria, but without any explicit "prior knowledge".
Frequentist statistics with Bayes' theorem: you use available empirical data, plus some exponential-family distribution, to construct an informed prior, then use Bayes' rule to update the prior on evidence. You use this Bayesian approach only for unobservable hypotheses, rather than for parameters which can be estimated.
Machine learning: you stack lots and lots of polynomial regressors onto each other and train them with a loss function until they predict well on the validation set.
A more charitable take on machine learning: you decide that your criterion is predictive accuracy, and you evaluate it on a holdout set (or you cross-validate).
The idea of evaluation on a holdout set is actually frequentist: it's equivalent to "I really want my model to work well on the true distribution, but that's unknown, so I shall approximate it by the empirical distribution of the data." The empirical distribution is the maximum likelihood fit to the data, if you allow yourself the entire space of distributions.
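To make that concrete, here is a minimal sketch of holdout evaluation as a frequentist risk estimate, with the empirical distribution of the held-out points standing in for the true (unknown) distribution. The data, the model, and the split ratio are all invented for illustration:

```python
import random

# Toy data from y = 2x + noise; seed fixed so the run is reproducible.
random.seed(0)
xs = [i / 10 for i in range(100)]
data = [(x, 2 * x + random.gauss(0, 0.5)) for x in xs]
random.shuffle(data)
train, holdout = data[:80], data[80:]

# Fit a through-origin least-squares line on the training split only.
slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)

# Holdout MSE: an average over the empirical distribution of the held-out
# points, used as a proxy for risk under the true distribution.
mse = sum((y - slope * x) ** 2 for x, y in holdout) / len(holdout)
print(f"slope ~ {slope:.2f}, holdout MSE ~ {mse:.3f}")
```

The point is only that the last line is a frequentist estimator of risk; nothing Bayesian happens anywhere.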
Compare to how Bayesians do model selection... I've seen several versions:
-- "I have a prior on the set of models, and I compute the model evidence using Bayesian principles, and thereby update my beliefs about the set of models." (This is a clean principled approach. Shame no one does it!)
-- "I compute model evidence using Bayesian principles. The model with the largest evidence is my favoured model." (This is nonsense.)
-- "I compute model evidence. I then use gradient descent to find the hyperparameter values that maximize evidence." This is what is done by all sorts of "Bayesian" frameworks, such as the Gaussian Process models in sklearn. (This is classic frequentism, but for some strange reason Bayesians claim it as their own.)
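For what it's worth, that third recipe (type-II maximum likelihood, a.k.a. empirical Bayes) is easy to demonstrate outside of GPs. Here is a sketch with a conjugate beta-binomial model, where the evidence is available in closed form; the data and the hyperparameter grid are made up, and I use grid search instead of gradient descent purely for simplicity, but it is the same move sklearn's GPs make over kernel hyperparameters:

```python
from math import lgamma, log

def log_evidence(k, n, a, b):
    """Log marginal likelihood ("evidence") of k successes in n Bernoulli
    trials under a Beta(a, b) prior; the binomial coefficient is dropped
    because it is constant in (a, b)."""
    log_beta = lambda p, q: lgamma(p) + lgamma(q) - lgamma(p + q)
    return log_beta(k + a, n - k + b) - log_beta(a, b)

k, n = 7, 10  # made-up data: 7 successes in 10 trials

# "Type-II maximum likelihood": pick the prior hyperparameter that
# maximizes the evidence -- an optimization over priors, which is exactly
# the frequentist move described above.
grid = [c / 10 for c in range(1, 101)]
best = max(grid, key=lambda c: log_evidence(k, n, c, c))
print(f"Beta({best}, {best}) maximizes the evidence")
```

Note that the prior is being *fit to the data*, which is precisely why calling the result a prior (or the procedure Bayesian) is a stretch.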
I certainly wouldn't argue that "predictive accuracy" is the be-all and end-all of modelling -- but it is a nice clean principled approach to model selection. I have honestly never seen a Bayesian who takes a principled approach to model selection.
> A more charitable take on machine learning: you decide that your criterion is predictive accuracy, and you evaluate it on a holdout set (or you cross-validate).
I'm doing a PhD in machine learning, so I'm quite aware. But it's Bayesian machine learning!
Bayesian statistics is sometimes called subjectivist statistics. Probability in Bayesian statistics reflects your degree of belief in some potential outcome.
If you conduct an experiment, you use Bayes’ theorem to update your degree of belief, which is now conditional on the outcome of your experiment.
By quantifying your degree of belief in a prior, you give yourself some starting point (rather than just assuming 0 probability), even if that prior is only an educated guess and not some well researched position. This can be good because you might not have done the research yet.
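That update is easiest to see with a conjugate pair. A minimal sketch, where the coin, the prior, and the observed counts are all invented for illustration:

```python
# Prior degree of belief about a coin's heads probability: Beta(2, 2),
# a mild educated guess centred on fairness, not a well-researched position.
prior_a, prior_b = 2, 2

# Outcome of the experiment: 10 flips.
heads, tails = 7, 3

# With a Beta prior and a binomial likelihood, Bayes' theorem reduces to
# adding the observed counts to the prior pseudo-counts.
post_a = prior_a + heads
post_b = prior_b + tails

posterior_mean = post_a / (post_a + post_b)
print(f"posterior: Beta({post_a}, {post_b}), mean = {posterior_mean:.3f}")
```

Your degree of belief, now conditional on the experiment, has shifted from 0.5 toward the observed frequency, with the prior acting as the starting point.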