📘 Confidence Interval for Population Proportion

A confidence interval for a population proportion estimates the true proportion of a population that possesses a particular characteristic.

This method is widely used in surveys, opinion polls, quality testing, and classification accuracy measurement.

🎯 Objective

To estimate the true population proportion (p) using the sample proportion (p̂).

The interval uses probability theory to quantify the sampling uncertainty in that estimate at a chosen confidence level.

👥 Understanding Proportion

A proportion represents a fraction of the population with a specific attribute.

Examples:

Proportion of voters supporting a candidate
Proportion of defective products
Proportion of patients responding to treatment
Classification accuracy in ML models

📐 Sample Proportion

Sample proportion is calculated as:

\[ \hat{p} = \frac{x}{n} \]

x = number of successes
n = total sample size

The sample proportion p̂ is the point estimate of the population proportion p.

📊 Standard Error of Proportion

The variability of the sample proportion is measured by:

\[ SE = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

The standard error is proportional to 1/√n, so it decreases as the sample size increases: quadrupling n halves the standard error.
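
As a minimal sketch of the standard error formula above, the snippet below computes SE for a fixed p̂ at several sample sizes. The function name `standard_error` and the chosen sample sizes are illustrative assumptions, not part of the original text.

```python
import math


def standard_error(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical sample sizes, each four times the previous one.
for n in (100, 400, 1600):
    print(f"n = {n:>4}: SE = {standard_error(0.5, n):.4f}")
```

Each fourfold increase in n cuts the standard error in half, which is why precision gains become more expensive as samples grow.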

📏 Formula for Confidence Interval

\[ \hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

Where:

p̂ = Sample proportion
Z_{α/2} = Critical value of the standard normal distribution for confidence level 1 − α (1.96 for 95%)
n = Sample size
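
The formula can be sketched in Python as follows. This is one possible implementation under stated assumptions: the function name `proportion_ci` is hypothetical, and the critical value is taken from the standard library's `statistics.NormalDist` rather than a Z-table.

```python
import math
from statistics import NormalDist
from typing import Tuple


def proportion_ci(x: int, n: int, confidence: float = 0.95) -> Tuple[float, float]:
    """Normal-approximation confidence interval for a population proportion."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("require n > 0 and 0 <= x <= n")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")

    p_hat = x / n                                   # sample proportion
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)     # Z_{alpha/2}
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)       # standard error
    margin = z * se                                 # margin of error
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(x=320, n=500, confidence=0.95)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The function does not clip the bounds to [0, 1]; for proportions very close to 0 or 1 the normal approximation itself is questionable, so clipping would only hide the problem.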

📋 Conditions for Use

Random sampling
Independent observations
Sample size sufficiently large: np̂ ≥ 5 and n(1−p̂) ≥ 5 (some textbooks use the stricter threshold of 10)

These conditions ensure the sampling distribution of p̂ is approximately normal.
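
The sample-size condition is easy to check in code before building an interval. The helper below is a hypothetical sketch (the name `normal_approx_ok` and the `threshold` parameter are assumptions); it relies on the identities np̂ = x and n(1−p̂) = n − x. Random sampling and independence depend on the study design and cannot be verified from the counts alone.

```python
def normal_approx_ok(x: int, n: int, threshold: int = 5) -> bool:
    """Check n * p_hat >= threshold and n * (1 - p_hat) >= threshold.

    Since p_hat = x / n, these are simply the counts of successes (x)
    and failures (n - x).
    """
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("require n > 0 and 0 <= x <= n")
    successes = x
    failures = n - x
    return successes >= threshold and failures >= threshold


print(normal_approx_ok(320, 500))  # plenty of successes and failures
print(normal_approx_ok(2, 50))     # only 2 successes: approximation is doubtful
```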

🔢 Example 1: Estimating Voter Support

Given:

In a survey of 500 voters, 320 support a candidate
Confidence level = 95%

Step 1: Sample Proportion

\[ \hat{p} = \frac{320}{500} = 0.64 \]

Step 2: Standard Error

\[ SE = \sqrt{\frac{0.64(1-0.64)}{500}} = \sqrt{\frac{0.2304}{500}} = \sqrt{0.0004608} \approx 0.0215 \]

Step 3: Z-value

For 95% confidence, α = 0.05 → Z_{α/2} = Z_{0.025} = 1.96

Step 4: Margin of Error

\[ ME = 1.96 \times 0.0215 \approx 0.042 \]

Step 5: Construct Interval

0.64 ± 0.042

Confidence Interval = (0.598 , 0.682)

In percentage form: (59.8% , 68.2%)
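
The five steps can be reproduced with a few lines of Python. This is a sketch that mirrors the hand calculation, using the rounded table value Z = 1.96; the variable names are illustrative.

```python
import math

# Given: 320 of 500 surveyed voters support the candidate, 95% confidence.
x, n = 320, 500
z = 1.96  # Step 3: Z-value for 95% confidence

p_hat = x / n                              # Step 1: sample proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)    # Step 2: standard error
me = z * se                                # Step 4: margin of error
lower, upper = p_hat - me, p_hat + me      # Step 5: interval

print(f"p_hat = {p_hat:.2f}")
print(f"SE    = {se:.4f}")
print(f"ME    = {me:.3f}")
print(f"CI    = ({lower:.3f}, {upper:.3f})")
```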

📊 Interpretation

We are 95% confident that the true proportion of voters supporting the candidate lies between 59.8% and 68.2%.
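
The "95% confident" statement refers to the procedure: if the survey were repeated many times, roughly 95% of the resulting intervals would contain the true proportion. The simulation below illustrates this under assumed values (a true proportion of 0.64, 500 voters per survey, 2000 repeated surveys, a fixed seed). The function names are hypothetical, and the exact rate printed depends on the random draws.

```python
import math
import random


def wald_interval(x, n, z=1.96):
    """Normal-approximation interval for x successes out of n."""
    p_hat = x / n
    me = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me


def coverage(true_p, n, trials, seed=0):
    """Fraction of simulated surveys whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Simulate one survey: each respondent is a success with probability true_p.
        x = sum(rng.random() < true_p for _ in range(n))
        lower, upper = wald_interval(x, n)
        if lower <= true_p <= upper:
            hits += 1
    return hits / trials


rate = coverage(true_p=0.64, n=500, trials=2000)
print(f"Share of intervals containing the true proportion: {rate:.3f}")
```

Any single interval either contains the true proportion or it does not; the 95% figure describes how often the method succeeds over repeated sampling.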

⚖️ Factors Affecting Interval Width

1️⃣ Sample Size

Larger sample → Narrower interval
Smaller sample → Wider interval

2️⃣ Confidence Level

Higher confidence → Wider interval
Lower confidence → Narrower interval

3️⃣ Proportion Value

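The product p̂(1 − p̂) peaks at p̂ = 0.5, so for a fixed sample size and confidence level the interval is widest there and narrows as p̂ approaches 0 or 1.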
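
The three factors can be compared numerically by varying one at a time. The sketch below assumes a helper named `margin_of_error` (not from the original text) and arbitrary illustrative values; the interval width is twice the margin of error.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Half-width of the normal-approximation interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


print("1) Sample size (p_hat = 0.64, 95% confidence)")
for n in (100, 500, 2000):
    print(f"   n = {n:>4}: width = {2 * margin_of_error(0.64, n):.3f}")

print("2) Confidence level (p_hat = 0.64, n = 500)")
for level in (0.90, 0.95, 0.99):
    print(f"   {level:.0%}: width = {2 * margin_of_error(0.64, 500, level):.3f}")

print("3) Proportion value (n = 500, 95% confidence)")
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"   p_hat = {p}: width = {2 * margin_of_error(p, 500):.3f}")
```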

📈 Practical Applications

Political opinion polls
Market research surveys
Public health statistics
Quality control testing
Social science research

🤖 Applications in Machine Learning

Estimating classification accuracy
Evaluating precision and recall rates
Measuring model success probabilities
Binary prediction reliability

Proportion confidence intervals help quantify reliability of ML classification systems.
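
For example, test-set accuracy is a sample proportion: x is the number of correct predictions and n is the number of test examples. The sketch below uses made-up labels and predictions (100 examples, 85 correct) and a hypothetical function name, and it assumes the test examples are independent draws from the data the model will see in use.

```python
import math


def accuracy_ci(y_true, y_pred, z=1.96):
    """Accuracy and its normal-approximation confidence interval."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))  # successes
    acc = correct / n                                      # sample proportion
    me = z * math.sqrt(acc * (1 - acc) / n)                # margin of error
    return acc, acc - me, acc + me


# Fabricated illustration data: 50 positives and 50 negatives.
y_true = [1] * 50 + [0] * 50
y_pred = [1] * 45 + [0] * 5 + [0] * 40 + [1] * 10  # 45 + 40 = 85 correct

acc, lower, upper = accuracy_ci(y_true, y_pred)
print(f"accuracy = {acc:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With only 100 test examples the interval spans roughly 14 percentage points, so small differences in accuracy between two models evaluated on a test set of this size should be read with caution.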

🧠 Key Insights

Used for binary outcomes (Yes/No, Success/Failure)
Sample proportion estimates population proportion
Z-distribution used for interval construction
Interval width depends on sample size, confidence level, and the proportion itself
Widely used in surveys and ML evaluation
