Chi-Square Test for Independence

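Is there a relationship between gender (Male/Female) and social media preference (Instagram, Facebook, Twitter), or are they independent? The null hypothesis H₀ is that the two variables are independent; the alternative H₁ is that they are associated.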

Observed Frequency Table

Social Media    Male    Female    Total
Instagram         50        80      130
Facebook          30        40       70
Twitter           20        30       50
Total            100       150      250

Chi-Square Formula

        χ² = Σ [(O - E)² / E]
        
        Where:
        - O = Observed frequency
        - E = Expected frequency
        

Step-by-Step Calculation

        Step 1: Calculate Expected Frequencies
        E = (Row Total * Column Total) / Grand Total

        Example for Instagram (Male):
        E = (130 * 100) / 250 = 52

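        Step 2: Compute Chi-Square Statistic
        χ² = Σ [(O - E)² / E]
        Expected frequencies (Male, Female): Instagram (52, 78), Facebook (28, 42), Twitter (20, 30)
        χ² = 4/52 + 4/78 + 4/28 + 4/42 + 0 + 0 ≈ 0.366

        Step 3: Compare χ² with Critical Value (from table)
        df = (rows - 1) * (columns - 1) = (3 - 1) * (2 - 1) = 2
        Critical χ² (α = 0.05, df = 2) = 5.991

        Step 4: Conclusion:
        - If χ² > Critical Value → Reject H₀ (Variables are dependent)
        - If χ² ≤ Critical Value → Fail to reject H₀ (No significant relationship)
        Here χ² ≈ 0.366 ≤ 5.991, so we fail to reject H₀: the data show no significant relationship between gender and social media preference.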
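
The four steps above can be sketched in plain Python using the observed table from this section. This is a minimal illustration, not a reference implementation: the function name `chi_square_independence` is hypothetical, and α = 0.05 is assumed here to match the level used in the other examples.

```python
# Observed counts: rows are platforms, columns are (Male, Female).
observed = {
    "Instagram": [50, 80],
    "Facebook": [30, 40],
    "Twitter": [20, 30],
}


def chi_square_independence(table):
    """Return (chi2, df, expected) for a dict of row label -> list of counts."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)

    # Step 1: E = (row total * column total) / grand total
    expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

    # Step 2: chi2 = sum of (O - E)^2 / E over every cell
    chi2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(rows, expected)
        for o, e in zip(o_row, e_row)
    )

    # Step 3: df = (rows - 1) * (columns - 1)
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


chi2, df, expected = chi_square_independence(observed)
CRITICAL_05_DF2 = 5.991  # table value for alpha = 0.05 (assumed), df = 2

print(f"chi2 = {chi2:.3f}, df = {df}")  # chi2 = 0.366, df = 2
# Step 4: compare the statistic with the critical value
print("reject H0" if chi2 > CRITICAL_05_DF2 else "fail to reject H0")
```

The critical value is hard-coded from a χ² table; a different α or table shape would need a different constant.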
        

Key Takeaways

        - The Chi-Square test checks whether two categorical variables are associated.
        - It is widely used in research, marketing, and medical studies.
        - Larger samples make the χ² approximation more reliable; it can be inaccurate when expected frequencies are small.
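
To make the sample-size point concrete, the sketch below checks the smallest expected frequency in a table. The threshold of 5 is a common rule of thumb that this tutorial does not state, so treat it as an assumption; `expected_counts` and `MIN_EXPECTED` are illustrative names.

```python
MIN_EXPECTED = 5  # conventional rule-of-thumb threshold (an assumption, not from the text)


def expected_counts(table):
    """Expected cell counts under independence for a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


# Instagram, Facebook, Twitter by (Male, Female), from the independence example
observed = [[50, 80], [30, 40], [20, 30]]

smallest = min(min(row) for row in expected_counts(observed))
print(f"smallest expected count = {smallest}")  # smallest expected count = 20.0
print("approximation looks adequate" if smallest >= MIN_EXPECTED else "sample may be too small")
```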
        

Chi-Square Goodness of Fit Test

Problem Statement

        The age distribution of a town in 2000 was:
        - Under 18: 20%
        - 18-35: 30%
        - Over 35: 50%
        
        In 2010, a sample of 500 individuals had the following age distribution:
        - Under 18: 121
        - 18-35: 288
        - Over 35: 91
        
        Using a significance level of α = 0.05, determine whether the age distribution has changed since 2000 (H₀: the 2010 distribution matches the 2000 proportions).
        

Observed vs. Expected Frequencies

Age Group    Observed (O)    Expected (E)    (O - E)² / E
< 18                  121             100            4.41
18 - 35               288             150          126.96
> 35                   91             250          101.12
Total                 500             500          232.49

Chi-Square Formula

        χ² = Σ [(O - E)² / E]

        Where:
        - O = Observed frequency
        - E = Expected frequency
        

Step-by-Step Calculation

        Step 1: Compute Expected Frequencies
        E = (Total Observations) * (Expected Proportion)
        Example for Under 18: E = 500 * 0.20 = 100

        Step 2: Compute Chi-Square Statistic
        χ² = Σ [(O - E)² / E]

        Step 3: Compare with Critical Value
        df = (number of categories - 1) = 3 - 1 = 2
        Critical χ² (α = 0.05, df = 2) = 5.991

        Step 4: Conclusion:
        - If χ² > 5.991 → Reject H₀ (Age distribution has changed)
        - If χ² ≤ 5.991 → Fail to reject H₀ (No significant change)
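
As a cross-check on the table and steps above, here is a minimal Python sketch of the goodness-of-fit calculation using only the standard library. The variable names are illustrative. The p-value line is an addition not in the text: it relies on the fact that, for df = 2 only, the χ² upper-tail probability has the closed form exp(-χ²/2).

```python
import math

observed = {"Under 18": 121, "18-35": 288, "Over 35": 91}         # 2010 sample
proportions = {"Under 18": 0.20, "18-35": 0.30, "Over 35": 0.50}  # 2000 distribution

n = sum(observed.values())

# Step 1: E = total observations * expected proportion
expected = {group: n * p for group, p in proportions.items()}

# Step 2: per-group contributions (O - E)^2 / E, then their sum
contributions = {
    group: (observed[group] - expected[group]) ** 2 / expected[group]
    for group in observed
}
chi2 = sum(contributions.values())

# Step 3: df = categories - 1, compared against the table value for alpha = 0.05
df = len(observed) - 1
critical = 5.991

# Upper-tail probability; this closed form is valid only when df == 2
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.2f}, df = {df}")  # chi2 = 232.49, df = 2
# Step 4: decision
print("reject H0" if chi2 > critical else "fail to reject H0")  # reject H0
```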
        

Final Conclusion

        Computed χ² = 232.49, which is much greater than the critical value of 5.991.
        Thus, we reject H₀ and conclude that the age distribution
        changed significantly between 2000 and 2010.
        

Z-Test for a Population Mean

Problem Statement

        Suppose IQ in a certain population is normally distributed with a mean (μ) of 100
        and a standard deviation (σ) of 15. A researcher wants to determine whether a new drug affects IQ levels,
        so they recruit 20 patients to try the drug and record their IQ levels.
        Hypotheses: H₀: μ = 100 (the drug has no effect); H₁: μ ≠ 100 (the drug changes mean IQ).
        

Step 1: Given Data

        Population Mean (μ) = 100
        Population Standard Deviation (σ) = 15
        Sample Size (n) = 20
        Sample IQ Scores = [95, 102, 98, 110, 105, 101, 99, 96, 104, 108, 
                            97, 103, 107, 111, 94, 100, 106, 92, 109, 98]
        

Step 2: Compute Sample Mean

        Sample Mean (x̄) = (Sum of Sample IQs) / (Sample Size)
                        = 2035 / 20
                        = 101.75


Step 3: Compute Z-Score

        Z = (x̄ - μ) / (σ / √n)
          = (101.75 - 100) / (15 / √20)
          ≈ 0.522


Step 4: Compute P-Value (Two-Tailed Test)

        P-value = 2 * (1 - Φ(|Z|))
               ≈ 2 * (1 - 0.6991)
               ≈ 0.6018


Step 5: Decision Rule

        Significance Level (α) = 0.05

        Since P-value (0.6018) > α (0.05), we fail to reject the null hypothesis (H₀).
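
Steps 1 through 5 can be reproduced from the raw scores with a short Python sketch. It assumes Python 3.8 or later, where `statistics.NormalDist` is available to evaluate Φ; the variable names are illustrative.

```python
import math
from statistics import NormalDist, mean

# Step 1: given data
mu, sigma, alpha = 100, 15, 0.05
scores = [95, 102, 98, 110, 105, 101, 99, 96, 104, 108,
          97, 103, 107, 111, 94, 100, 106, 92, 109, 98]

# Step 2: sample mean
n = len(scores)
x_bar = mean(scores)

# Step 3: Z = (x_bar - mu) / (sigma / sqrt(n))
z = (x_bar - mu) / (sigma / math.sqrt(n))

# Step 4: two-tailed p-value, 2 * (1 - Phi(|Z|))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: decision rule
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(f"n = {n}, mean = {x_bar}, z = {z:.3f}, p = {p_value:.4f}")
print(decision)
```

Computing the mean directly from the list, rather than typing it in by hand, keeps the later steps consistent with the data.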
        

Conclusion: The sample provides no statistically significant evidence that the drug affects IQ levels.