Student-T distribution - /s (strangemonad's notes)

If $z \sim \operatorname{N}(0, 1)$ and $y \sim \chi^2_k$ are independent, then the ratio $\frac{z}{\sqrt{y/k}} \sim t_k$ is t-distributed with $k$ degrees of freedom.

You get this construction in the following setup. Assume $x_1, \dots, x_n$ are independent draws with $x_i \sim \operatorname{N}(\mu, \sigma^2)$. Then the sample mean $\overline{x}$ satisfies $z = \frac{\overline{x} - \mu}{\sigma / \sqrt{n}} \sim \operatorname{N}(0,1)$. When $\sigma$ is not known, you normalize by the sample std-dev $s$ instead, where $s^2 = \frac{1}{n-1} \sum_i (x_i - \overline{x})^2$. Since $(n-1) s^2 / \sigma^2 \sim \chi^2_{n-1}$ and is independent of $\overline{x}$:

$$ \frac{\overline{x} - \mu}{s / \sqrt{n}} = \frac{(\overline{x} - \mu) / (\sigma / \sqrt{n})}{\sqrt{s^2 / \sigma^2}} \sim \frac{\operatorname{N}(0,1)}{\sqrt{ \frac{1}{n-1} \chi^2_{n-1}}} \sim t_{n-1} $$

**Typical use**: find the distribution of $\overline{x}$ when $\sigma$ is not known.

When you know the population variance, the sample mean is normally distributed. ^[For a normally distributed population with variance $\sigma^2$, the sample mean $\overline{x}$ for a sample of size $n$ is normally distributed with $\operatorname{Var}(\overline{x}) = \sigma^2/n$.] However, if you need to estimate the variance, the standardized sample mean $\frac{\overline{x} - \mu}{s / \sqrt{n}}$ is no longer normal but instead follows a t-distribution with $n-1$ degrees of freedom.

## Intuition

It's really close to a [[normal distribution]] except it's perturbed a little because we didn't know the underlying population variance and had to estimate it from the sample variance instead.

When $n$ is small, you get a distribution with a lower peak and heavier tails than the standard normal distribution (representing the fact that we have more uncertainty). As $n$ gets larger, it approaches the normal distribution. When $n = 100$, it's more or less identical to the normal distribution. ^[Probably where at least part of the idea that n=100 is a large enough sample to be representative comes from.]

---

- Links: [[Statistics]] [[Statistical Distributions]] [[Sampling Distributions]] [[MIT 2.830J Control of Manufacturing Processes - Lec 6]]
- Created at: [[2021-11-25]]
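A rough way to check the heavy-tails intuition from this note is to simulate it. The sketch below is not part of the original note: the helper names (`t_statistic`, `tail_fraction`) and the parameter values ($\mu = 10$, $\sigma = 2$, cutoff 1.96, 10000 trials) are assumptions picked for illustration. It draws normal samples, standardizes the sample mean with the sample std-dev, and counts how often the result lands beyond $\pm 1.96$, which a standard normal does about 5% of the time.

```python
import math
import random
import statistics


def t_statistic(sample, mu):
    """Standardize the sample mean using the sample std-dev (n-1 denominator)."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)
    return (xbar - mu) / (s / math.sqrt(n))


def tail_fraction(n, trials=10000, mu=10.0, sigma=2.0, cutoff=1.96, seed=0):
    """Fraction of simulated t-statistics with absolute value above `cutoff`.

    mu, sigma, cutoff and seed are illustrative assumptions, not values
    taken from the note.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(t_statistic(sample, mu)) > cutoff:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (5, 30, 100):
        print(f"n={n:3d}  P(|t| > 1.96) ~ {tail_fraction(n):.3f}")
```

With small $n$ the fraction should come out clearly above 5%, and it should drift toward 5% as $n$ grows, which matches the "higher tails, then approaches normal" picture.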