Confidence Interval for Population Mean (σ Unknown)
This situation is more realistic because population variability is rarely known in practical studies.
Objective
To estimate the true population mean (μ) when population variability is unknown.
Why Not Use Z-Distribution?
When σ is unknown, replacing it with the sample standard deviation (s) introduces additional estimation error.
- Sample standard deviation varies from sample to sample
- This adds extra uncertainty
- Using Z critical values with s underestimates this uncertainty, so the resulting intervals are too narrow
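The effect described above can be checked with a small simulation. The sketch below is illustrative only: the population (normal, mean 10, SD 2), the sample size of 5, and the helper name `coverage` are assumptions chosen for the demonstration. It counts how often an interval built from s contains the true mean when the critical value is 1.960 (Z) versus 2.776 (the standard t-table value for 95% confidence, df = 4).

```python
import random
import statistics

def coverage(critical_value, n=5, trials=10000, mu=10.0, sigma=2.0, seed=0):
    """Fraction of simulated intervals x_bar ± critical_value * s / sqrt(n)
    that contain the true mean mu (hypothetical normal population)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5  # estimated standard error
        if abs(mean - mu) <= critical_value * se:
            hits += 1
    return hits / trials

z_cov = coverage(1.960)  # Z critical value for 95% confidence
t_cov = coverage(2.776)  # t critical value for 95% confidence, df = 4

print(f"Coverage using Z critical value: {z_cov:.3f}")
print(f"Coverage using t critical value: {t_cov:.3f}")
```

With samples this small, the Z-based coverage should land noticeably below the nominal 0.95 (about 0.88 in theory), while the t-based coverage should sit close to 0.95.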
Properties of the t-Distribution
- Bell-shaped and symmetric (like normal distribution)
- Has heavier tails (more spread)
- Accounts for extra uncertainty in estimating σ
- Shape depends on Degrees of Freedom (df)
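One way to see the heavier tails and the dependence on df is to compare 95% critical values. The sketch below uses only the Python standard library; the helpers `t_pdf`, `t_cdf`, and `t_critical` are hypothetical names, and the numerical integration plus bisection is a stand-in for a t-table or a statistics package, not a production method.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """CDF for x >= 0: 0.5 plus Simpson's-rule integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value t_{alpha/2, df}, located by bisection on [0, 100]."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)  # normal critical value for 95% confidence
for df in (2, 4, 9, 24, 99):
    print(f"df = {df:>2}: t = {t_critical(0.95, df):.3f}")
print(f"normal : z = {z:.3f}")
```

For df = 24 the result should agree with the usual table value of about 2.064, and the values shrink toward z ≈ 1.960 as df grows.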
Degrees of Freedom
Degrees of Freedom (df) measure the number of independent values used to estimate variability.
\[ df = n - 1 \]
- n = sample size
Standard Error (Estimated)
Since σ is unknown, we estimate the Standard Error using the sample standard deviation:
\[ SE = \frac{s}{\sqrt{n}} \]
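As a minimal sketch of this calculation, the snippet below computes s and SE from raw data. The measurements are made up for illustration. Note that `statistics.stdev` divides by n − 1, which matches the degrees of freedom defined earlier.

```python
import math
import statistics

# Hypothetical battery-life measurements in hours (illustrative values only).
sample = [9.1, 10.4, 11.2, 8.7, 10.9, 9.8, 10.3, 9.6]

n = len(sample)
s = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
se = s / math.sqrt(n)         # estimated standard error of the mean

print(f"n = {n}, s = {s:.3f}, SE = {se:.3f}")
```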
Formula for Confidence Interval
\[ \bar{x} \pm t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}} \]
Where:
- x̄ = Sample Mean
- t = t-score from t-table
- s = Sample Standard Deviation
- n = Sample Size
- df = n − 1
Example 1: Estimating Average Battery Life
Given:
- Sample mean battery life = 10 hours
- Sample standard deviation = 2 hours
- Sample size = 25
- Confidence level = 95%
Step 1: Degrees of Freedom
df = 25 − 1 = 24
Step 2: Standard Error
\[ SE = \frac{2}{\sqrt{25}} = \frac{2}{5} = 0.4 \]
Step 3: t-value
From t-table for 95% confidence and df = 24:
t ≈ 2.064
Step 4: Margin of Error
\[ ME = 2.064 \times 0.4 = 0.8256 \approx 0.826 \]
Step 5: Construct Interval
10 ± 0.826, i.e. approximately (9.174, 10.826) hours
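The steps of this example can be reproduced in a few lines. This is a sketch, with `mean_interval` as a hypothetical helper name. The t critical value is the table value used above, and the Z-interval is computed only for comparison, as if σ were known to equal 2.

```python
import math
from statistics import NormalDist

def mean_interval(x_bar, s, n, critical_value):
    """Return (lower, upper) for x_bar ± critical_value * s / sqrt(n)."""
    margin = critical_value * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Values from Example 1; 2.064 is the table value for 95% confidence, df = 24.
t_low, t_high = mean_interval(10, 2, 25, 2.064)

# Comparison only: Z critical value, treating the SD of 2 as if it were sigma.
z = NormalDist().inv_cdf(0.975)
z_low, z_high = mean_interval(10, 2, 25, z)

print(f"t-interval: ({t_low:.3f}, {t_high:.3f})")
print(f"Z-interval: ({z_low:.3f}, {z_high:.3f})")
```

The t-interval should come out at about (9.174, 10.826), slightly wider than the Z-interval built from the same numbers.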
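Interpretation
We are 95% confident that the true mean battery life lies between about 9.17 and 10.83 hours. The interval is slightly wider than the corresponding Z-interval (10 ± 0.784, using z = 1.96), which reflects the added uncertainty from estimating σ.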
t-Distribution vs Normal Distribution
| Feature | Z-Distribution | t-Distribution |
|---|---|---|
| Population SD | Known | Unknown |
| Spread | Narrower | Wider |
| Tail Thickness | Thin tails | Heavy tails |
| Depends on df? | No | Yes |
When to Use t-Distribution
- Population standard deviation unknown
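- Sample size small (n < 30 is a common rule of thumb); for larger n the t-interval remains valid and nearly matches the Z-interval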
- Population approximately normal
- Random and independent sampling
Applications in Machine Learning
- Estimating true model performance with small validation sets
- Evaluating uncertainty in experimental results
- Comparing algorithms using limited data
- Estimating real-world prediction accuracy
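As a sketch of the first application, the snippet below builds a 95% t-interval for mean accuracy from five cross-validation folds. The fold scores are invented for illustration, and 2.776 is the standard t-table value for df = 4. Because folds share training data, their scores are not fully independent, so the interval should be read as a rough guide rather than an exact guarantee.

```python
import math
import statistics

# Hypothetical accuracy scores from 5-fold cross-validation (illustrative only).
fold_accuracy = [0.81, 0.84, 0.79, 0.86, 0.80]

n = len(fold_accuracy)
mean_acc = statistics.fmean(fold_accuracy)
se = statistics.stdev(fold_accuracy) / math.sqrt(n)  # estimated standard error

T_CRIT_95_DF4 = 2.776  # t-table value for 95% confidence, df = n - 1 = 4
margin = T_CRIT_95_DF4 * se
lower, upper = mean_acc - margin, mean_acc + margin

print(f"mean accuracy = {mean_acc:.3f}")
print(f"95% t-interval: ({lower:.3f}, {upper:.3f})")
```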
Key Insights
- Use t-distribution when σ is unknown
- Degrees of freedom control shape of distribution
- Smaller samples → larger uncertainty
- Interval is wider than the Z-interval for the same data and confidence level
- t-distribution approaches normal for large samples