# G-test

## Relation to the chi-squared test

The commonly used chi-squared tests for goodness of fit to a distribution and for independence in contingency tables are in fact approximations of the log-likelihood ratio on which the G-tests are based. The general formula for Pearson's chi-squared test statistic is

${\displaystyle \chi ^{2}=\sum _{i}{\frac {\left(O_{i}-E_{i}\right)^{2}}{E_{i}}}.}$
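How close the two statistics are can be checked numerically. The following is a minimal sketch in plain Python, assuming the usual definition of the G statistic, $G=2\sum _{i}O_{i}\ln(O_{i}/E_{i})$, which is not restated in this section. The function names and the sample counts are hypothetical and chosen only for illustration.

```python
import math


def pearson_chi2(observed, expected):
    """Pearson's chi-squared statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def g_statistic(observed, expected):
    """G statistic: 2 * sum of O * ln(O / E) over all cells.

    Cells with O == 0 are skipped, since O * ln(O / E) tends to 0 as O -> 0.
    """
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


# Hypothetical counts: four categories, 200 observations, uniform null.
observed = [48, 52, 55, 45]
expected = [50.0, 50.0, 50.0, 50.0]

chi2 = pearson_chi2(observed, expected)
g = g_statistic(observed, expected)

print(f"chi-squared = {chi2:.4f}")
print(f"G           = {g:.4f}")
print(f"difference  = {abs(g - chi2):.4f}")
```

When the observed counts are close to the expected counts, as in this example, the two values nearly coincide. They can differ noticeably when some cells deviate strongly from expectation.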

The approximation of G by chi-squared is obtained by a second-order Taylor expansion of the natural logarithm around 1. Karl Pearson developed this approximation because, at the time, calculating log-likelihood ratios was unduly laborious. With the advent of electronic calculators and personal computers, this is no longer a problem. A derivation of how the chi-squared test is related to the G-test and likelihood ratios, including a full Bayesian solution, is provided in Hoey (2012).[2]
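The expansion can be sketched as follows, assuming the usual definition $G=2\sum _{i}O_{i}\ln(O_{i}/E_{i})$ and equal totals $\sum _{i}O_{i}=\sum _{i}E_{i}$. The symbol $\delta _{i}$ is introduced here only for illustration and denotes the relative deviation $\delta _{i}=(O_{i}-E_{i})/E_{i}$, so that $O_{i}=E_{i}(1+\delta _{i})$ and $\sum _{i}E_{i}\delta _{i}=0$. Using $\ln(1+\delta )\approx \delta -\delta ^{2}/2$,

${\displaystyle G=2\sum _{i}E_{i}(1+\delta _{i})\ln(1+\delta _{i})\approx 2\sum _{i}E_{i}\left(\delta _{i}+{\frac {\delta _{i}^{2}}{2}}\right)=2\sum _{i}E_{i}\delta _{i}+\sum _{i}E_{i}\delta _{i}^{2},}$

where terms of order $\delta _{i}^{3}$ have been dropped. The first sum vanishes because the totals agree, leaving

${\displaystyle G\approx \sum _{i}E_{i}\delta _{i}^{2}=\sum _{i}{\frac {\left(O_{i}-E_{i}\right)^{2}}{E_{i}}}=\chi ^{2}.}$

The approximation is therefore good when every $\delta _{i}$ is small, that is, when the observed counts lie close to the expected counts.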

## Statistical software

• The R programming language has the likelihood.test function in the Deducer package.
• In SAS, one can conduct a G-test by applying the /chisq option in proc freq.[7]
• In Stata, one can conduct a G-test by applying the lr option after the tabulate command.
• Fisher's G-test in the GeneCycle package of the R programming language (fisher.g.test) does not implement the G-test as described in this article, but rather Fisher's exact test of Gaussian white noise in a time series.[8]

## References

1. Sokal, R. R. and Rohlf, F. J. (1981), Biometry: the principles and practice of statistics in biological research, New York: Freeman. ISBN 0-7167-2411-1.
2. Harremoës, P. and Tusnády, G. (2012). Information divergence is more chi squared distributed than the chi squared statistic, Proceedings ISIT 2012, pp. 538–543.
3. Quine, M. P. and Robinson, J. (1985), "Efficiencies of chi-square and likelihood ratio goodness-of-fit tests", Annals of Statistics, 13: 727–742.
4. Harremoës, P. and Vajda, I. (2008), "On the Bahadur-efficient testing of uniformity by means of the entropy", IEEE Transactions on Information Theory, 54: 321–331.
5. Dunning, Ted (1993). "Accurate Methods for the Statistics of Surprise and Coincidence", Computational Linguistics, Volume 19, issue 1 (March, 1993).
6. G-test of independence, G-test for goodness-of-fit in Handbook of Biological Statistics, University of Delaware. (pp. 46–51, 64–69 in: McDonald, J. H. (2009) Handbook of Biological Statistics (2nd ed.). Sparky House Publishing, Baltimore, Maryland.)
7. Fisher, R. A. (1929), "Tests of significance in harmonic analysis", Proceedings of the Royal Society of London: Series A, Volume 125, Issue 796, pp. 54–59.