

Χ2 (Chi-Squared)


Overview

  • Used to compare two classification schemes, each of which may have multiple categories
  • Purpose is to determine the probability that observed data are (or are not) consistent with the hypothesis H0: the probability of outcomes in the different groups is the same
  • Used to approximate Fisher's Exact Test (2x2) for large numbers:
    • Accuracy of the approximation depends on the number of observations in each cell
    • The expected number of observations (calculated from the actual observations; see below) should be at least 5 in each cell
  • Used to extend Fisher's Exact Test for comparison of classification schemes with >2 categories
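
The rule of thumb above (an expected count of at least 5 in every cell) can be checked mechanically. The following is a minimal Python sketch, assuming the table is supplied as a list of rows of observed counts. The function names `expected_counts` and `chi_squared_applicable` and the two sample tables are illustrative assumptions, not part of the source.

```python
def expected_counts(observed):
    """Expected cell counts under H0: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared_applicable(observed, minimum=5):
    """True if every expected count reaches the rule-of-thumb minimum."""
    return all(e >= minimum for row in expected_counts(observed) for e in row)


# Hypothetical tables, for illustration only.
print(chi_squared_applicable([[20, 30], [25, 25]]))  # expected counts 22.5 and 27.5
print(chi_squared_applicable([[1, 9], [8, 2]]))      # expected counts 4.5 and 5.5
```

When the check fails for a 2x2 table, Fisher's Exact Test is the alternative named above.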

Χ2 for 2x2 Table

  • Used for tables whose counts are too large for Fisher's Exact Test
  • The process parallels Fisher's Exact Test; see that page for details of the initial set-up
2x2 Table
          Outcome 1   Outcome 2   Total
Group 1   O11         O12         R1
Group 2   O21         O22         R2
Total     C1          C2          N
 
2x2 Table Example
          Basketball   Softball/Baseball   Total
Boys      12           14.7                27
Girls     13           15.3                28
Total     25           30                  55
  • Start by assuming that H0 is true, and that p0 = p1 = p2
  • Calculate the expected 2x2 table based on the observed total numbers
    • Expected population "success rate" is p0 = C1 / N
    • Using p0 and the observed row totals R1 and R2, calculate the expected 2x2 table: E11 = R1 * p0, E12 = R1 * (1 - p0), E21 = R2 * p0, E22 = R2 * (1 - p0)
  • Compare the expected table to the observed table, by calculating test statistic T
  • One way of calculating T is to take, in each cell, the squared difference between the observed and expected values relative to the expected value, and then sum over all cells
    • T = (O11 - E11)^2 / E11 + (O12 - E12)^2 / E12 + (O21 - E21)^2 / E21 + (O22 - E22)^2 / E22
  • After some nifty mathematics, this can be calculated more simply from the original observed table
    • T = N * (|O11 * O22 - O12 * O21| - N/2)^2 / (R1 * R2 * C1 * C2)
    • The N/2 term is a continuity correction (Yates' correction); without it, this formula gives exactly the same value as the cell-by-cell sum
  • Because T is derived from the difference between observed and expected values, the larger T is, the more the tables differ, and the less likely H0 is to be true
  • In order to calculate the significance level, we need to evaluate the probability that the observed table was due to random sampling, which is related to the size of T. We also need to evaluate the probability of all the other possible tables that could have been observed (again, same as in Fisher's test)
  • When H0 is true, the probability distribution of T is approximately the same as the Χ2 distribution (with 1 degree of freedom for a 2x2 table)
  • We can therefore approximately determine the probability of observed T by evaluating the Χ2 function at the T level (by looking it up in a table)
  • Because these are approximations, the table typically gives critical values:
    Χ2 for 2x2 Table
    Probability   0.25    0.10    0.05    0.01    0.005   0.001
    T             1.323   2.706   3.841   6.635   7.879   10.83
    • This is the probability that the observed outcome (and any possible outcomes less likely than this one) occurred due to random sampling only
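
The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names and the sample counts are hypothetical, and the `correction` flag is an addition that makes explicit that the shortcut formula, with its term subtracting half of N, is a continuity-corrected variant and so does not reproduce the cell-by-cell sum exactly.

```python
def expected_table(o11, o12, o21, o22):
    """Expected 2x2 cells under H0, using the pooled rate p0 = C1 / N."""
    r1, r2 = o11 + o12, o21 + o22
    c1 = o11 + o21
    n = r1 + r2
    p0 = c1 / n
    return (r1 * p0, r1 * (1 - p0), r2 * p0, r2 * (1 - p0))


def t_cell_sum(o11, o12, o21, o22):
    """T as the sum over cells of (observed - expected)^2 / expected."""
    observed = (o11, o12, o21, o22)
    expected = expected_table(*observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def t_shortcut(o11, o12, o21, o22, correction=True):
    """T from the observed table only; `correction` subtracts N/2 as in the text."""
    r1, r2 = o11 + o12, o21 + o22
    c1, c2 = o11 + o21, o12 + o22
    n = r1 + r2
    diff = abs(o11 * o22 - o12 * o21)
    if correction:
        diff -= n / 2
    return n * diff ** 2 / (r1 * r2 * c1 * c2)


# Hypothetical counts, for illustration only.
print(t_cell_sum(10, 20, 20, 10))                    # cell-by-cell sum
print(t_shortcut(10, 20, 20, 10, correction=False))  # matches the cell-by-cell sum
print(t_shortcut(10, 20, 20, 10))                    # smaller, due to the correction
```

The corrected value is the more conservative of the two, since it shrinks T and therefore raises the looked-up probability.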

    Χ2 for 2x2 Table Example

    Observed
                     Graft Rejected   Engraftment
    Low cell dose    17               19
    High cell dose   4                28
     
    Expected (Calculated)
                     Graft Rejected   Engraftment
    Low cell dose    11               25
    High cell dose   10               22
    • T = 8.01
    • From the Χ2 table above, p is between 0.005 and 0.001.
    • We can therefore conclude that p < 0.005 and that high cell dose is significantly associated with engraftment
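
The numbers in this example can be reproduced with a short Python sketch. It uses only the observed counts from the table above and the shortcut formula for T. The final step, computing the tail probability directly with `math.erfc` instead of reading it from the table of critical values, is an addition not in the source; it relies on the identity P(X > t) = erfc(sqrt(t / 2)) for a Χ2 variable with 1 degree of freedom.

```python
import math

# Observed table from the example above.
o11, o12 = 17, 19  # low cell dose: graft rejected, engraftment
o21, o22 = 4, 28   # high cell dose: graft rejected, engraftment

r1, r2 = o11 + o12, o21 + o22  # row totals
c1, c2 = o11 + o21, o12 + o22  # column totals
n = r1 + r2

# Expected counts under H0: row total * column total / N.
expected = [[r * c / n for c in (c1, c2)] for r in (r1, r2)]

# Shortcut formula for T, including the N/2 term.
t = n * (abs(o11 * o22 - o12 * o21) - n / 2) ** 2 / (r1 * r2 * c1 * c2)

# Upper-tail probability for 1 degree of freedom (assumption: not in the source).
p = math.erfc(math.sqrt(t / 2))

print([[round(e) for e in row] for row in expected])  # [[11, 25], [10, 22]]
print(round(t, 2))                                    # 8.01
print(0.001 < p < 0.005)                              # True
```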
    This article is issued from Wikibooks. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.