If we take a [[Bayesian vs Frequentist Statistics|Bayesian]] point of view on [[Statistical Inference]], we treat the parameters $\theta$ as a random variable with some prior uncertainty $P(\theta)$ that we can update to account for new data:

$$
\begin{align}
P(\theta|\text{data}) &= \frac{P(\text{data}|\theta)\, P(\theta)}{P(\text{data})} \\
P(\theta|\text{data}) &\propto P(\text{data}|\theta)\, P(\theta)
\end{align}
$$

The posterior $P(\theta|\text{data})$ is a weighting of the prior distribution $P(\theta)$ and the likelihood of the observed data given our current assumptions about $\theta$. ^[If we don't care about getting a valid probability distribution (or just want a general intuition), we don't need to be concerned with the normalizing constant $P(\text{data})$.]

Idea: As you gather more and more data, the posterior becomes more heavily weighted toward the parameter values supported by the observed data and less weighted by the prior distribution (see the sketch at the end of this note).

## Resources

- [Bayesian Statistics: An Introduction - YouTube](https://www.youtube.com/watch?v=Pahyv9i_X2k&list=PLTNMv857s9WU729gegxdW2e4wto2wEP4S&index=5)

---

- Links:
- Created at: 2023-06-21
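
A minimal sketch of the "more data outweighs the prior" idea, assuming a Beta prior on a coin's bias $\theta$ with Bernoulli observations (a standard conjugate pair, so the posterior is again a Beta with updated counts). The true bias, prior parameters, and sample sizes below are all illustrative choices, not from the source:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

true_theta = 0.7   # hypothetical true bias of the coin
a0, b0 = 2.0, 2.0  # Beta(2, 2) prior: mild belief that theta is near 0.5

for n in [0, 10, 100, 1000]:
    data = rng.binomial(1, true_theta, size=n)  # n Bernoulli observations
    heads = int(data.sum())
    # Beta-Bernoulli conjugate update: posterior is Beta(a0 + heads, b0 + tails)
    posterior = stats.beta(a0 + heads, b0 + (n - heads))
    print(f"n={n:4d}  posterior mean={posterior.mean():.3f}  "
          f"95% interval=({posterior.ppf(0.025):.3f}, {posterior.ppf(0.975):.3f})")
```

As $n$ grows, the posterior mean should move from the prior mean (0.5) toward the empirical frequency near 0.7, and the credible interval should narrow, matching the intuition above.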