Statistical models can be represented as a [[set]] of [[Statistical Distributions]] (or a set of [[Statistical Densities]]).
(Parametric) statistical models can have zero or more parameters. ^[ [[Parametric vs Non-Parametric Statistical Tests]] ] Models with no parameters are called non-parametric (e.g. a model defined by the [[Cumulative Distribution Function | CDF]] of the observed data). Since many distributions can also be defined as families of functions (e.g. $f(x; \mu, \sigma) = N(\mu, \sigma)$) we can think of the entire model as being parameterized by all the parameters for each distribution in the model's set of distributions.
Conventionally notation for parameterized model might look something like the following:
#todo figure out how to make latex elastic space
#todo figure out latex gothic bold
#todo figure out latex for spacing of colon in type annotations
$
F = \lbrace f(x;\mu,\sigma) = \dots, \mu \in \mathbb{R}, \sigma > 0 \rbrace
$
In a style familiar to programers more consistent with [[Type Theory]] and the syntax of [[PFPL]]:
$
F(\mu : \mathbb{R}, \sigma : \mathbb{R^+}) = \lbrace
f\{\mu,\sigma\}(x) = \dots
\rbrace
$
^[#todo this style still isn't great. F should reflect that it's a datatype and F(...) the constructor] ^[What about models defined by a set with an infinite number of densities? Structs can't easily be expressed as infinite products.]
#to-elaborate [[All Of Statistics - Larry Wasserman]] 7.2 defines a statistical model as a "set of distributions (or set of densities)". What's the intuition here? Why is it sufficient to say that a set of distributions is a representative model of a process? What *are* we modeling? the [[Random Variable]]? The process?
---
- Links: [[All Of Statistics - Larry Wasserman]] [[Statistics]] [[Machine Learning]] [[Type Family]]
- Created at: [[2021-09-26]]