Multinomial Exact Tests
met.py

met.py is a Python module that allows you to define a pair of multinomial distributions (conceptually 'control' and 'test' distributions) and then compute one- and two-sided p values to test whether the 'test' distribution is equivalent to the 'control' distribution. The module can also evaluate the likelihood of every possible 'control' distribution, given the observed 'control' data, and report the p value for each of those distributions weighted by its likelihood.

Module Contents

Functions

The met module defines the following functions:

met.fact(n)
Return the factorial of the argument n.
met.fillzeroes(case, factor)
The 'case' argument is a list of counts; if there are any zero counts in the case, the non-zero counts are multiplied by 'factor' and the zero counts are replaced with 1. This function is used to adjust zero counts (and hence probabilities) among the 'control' sample.
met.listprod(list)
Return the product of all elements of the list, or 1 if the list is empty.
met.all_multinom_cases(categories, items)
Return a list of all multinomial combinations (each a list) of 'items' items distributed in all possible ways over 'categories' categories.
met.obs_to_probs(obs)
Standardize the list of observations to be a sequence of probabilities that sum to 1.0.
met.onesided_exact_likelihood(ref_obs, site_obs [, ref_samples=None] [, fill_zeroes=False] [, fill_factor=10])
Calculate the distribution of one-sided exact test probabilities based on the likelihood of observing the given distribution of reference area samples. Return a list of tuples, where each tuple represents one of the possible distributions of reference area samples, and each tuple contains: 1) the distribution of reference area samples being evaluated; 2) the likelihood of that being the true reference area distribution given the observed distribution of reference area data; 3) the raw p value for the one-sided exact test for that reference area distribution; 4) the normalized likelihood for the case (normalized to the maximum likelihood), and 5) the final p value for the one-sided exact test, after the raw p value has been multiplied by the normalized likelihood.
ref_obs
A list of reference category counts, ordered from 'best' to 'worst'.
site_obs
A list of site category counts, ordered from 'best' to 'worst'.
ref_samples
The number of reference area samples used to enumerate the candidate reference distributions whose likelihoods are calculated. If None, sum(ref_obs) is used instead.
fill_zeroes
Boolean indicating whether zero counts in the reference data should be filled (as described for met.fillzeroes) before probabilities are calculated.
fill_factor
The factor by which non-zero reference counts are multiplied if fill_zeroes is true. Defaults to 10.
met.twosided_exact_likelihood(ref_obs, site_obs [, ref_samples=None] [, fill_zeroes=False] [, fill_factor=10])
Calculate the distribution of two-sided exact test probabilities based on the likelihood of observing the given distribution of reference area samples. Return a list of tuples, where each tuple represents one of the possible distributions of reference area samples, and each tuple contains: 1) the distribution of reference area samples being evaluated; 2) the likelihood of that being the true reference area distribution given the observed distribution of reference area data; 3) the raw p value for the two-sided exact test for that reference area distribution; 4) the normalized likelihood for the case (normalized to the maximum likelihood), and 5) the final p value for the two-sided exact test, after the raw p value has been multiplied by the normalized likelihood.
ref_obs
A list of reference category counts, ordered from 'best' to 'worst'.
site_obs
A list of site category counts, ordered from 'best' to 'worst'.
ref_samples
The number of reference area samples used to enumerate the candidate reference distributions whose likelihoods are calculated. If None, sum(ref_obs) is used instead.
fill_zeroes
Boolean indicating whether zero counts in the reference data should be filled (as described for met.fillzeroes) before probabilities are calculated.
fill_factor
The factor by which non-zero reference counts are multiplied if fill_zeroes is true. Defaults to 10.
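
The behavior described for the helper functions and for the likelihood weighting can be illustrated with a minimal, self-contained sketch. It does not import met.py and is not the module's source: the names fill_zero_counts, counts_to_probs, enumerate_cases, multinomial_prob, and normalized_likelihoods are hypothetical, and the likelihood of a candidate reference distribution is assumed here to be the multinomial probability of the observed reference counts under the probabilities implied by that candidate.

```python
from math import factorial


def fill_zero_counts(case, factor=10):
    """Scale non-zero counts by `factor` and replace zero counts with 1.

    Mirrors the documented behavior of met.fillzeroes; a case without
    zeroes is returned unchanged.
    """
    if 0 not in case:
        return list(case)
    return [count * factor if count else 1 for count in case]


def counts_to_probs(obs):
    """Convert a list of counts to probabilities that sum to 1.0."""
    total = float(sum(obs))
    return [count / total for count in obs]


def enumerate_cases(categories, items):
    """List every way to distribute `items` items over `categories` categories."""
    if categories == 1:
        return [[items]]
    cases = []
    for first in range(items, -1, -1):
        for rest in enumerate_cases(categories - 1, items - first):
            cases.append([first] + rest)
    return cases


def multinomial_prob(counts, probs):
    """Multinomial probability of `counts` given category probabilities `probs`."""
    coefficient = factorial(sum(counts))
    for count in counts:
        coefficient //= factorial(count)
    result = float(coefficient)
    for count, prob in zip(counts, probs):
        result *= prob ** count
    return result


def normalized_likelihoods(ref_obs):
    """Pair each candidate reference distribution with its likelihood,
    normalized to the maximum likelihood (assumed definition, see above)."""
    candidates = enumerate_cases(len(ref_obs), sum(ref_obs))
    raw = [multinomial_prob(ref_obs, counts_to_probs(c)) for c in candidates]
    peak = max(raw)
    return [(c, lik / peak) for c, lik in zip(candidates, raw)]
```

Under these assumptions, the candidate equal to the observed reference counts receives a normalized likelihood of 1.0, and candidates that assign zero probability to an observed category receive 0.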

Classes

The met module defines the following class:

class met.Multinom(ref_obs, site_obs)
Both arguments are lists. Each list contains the number (count) of observations in each category. If one-sided tests will be done, the categories should be listed in order from 'best' to 'worst' (implying that any alternative distribution with one or more counts moved from an earlier category to a later category represents a 'worse' condition). Both lists should be the same length, and corresponding elements in the two lists should represent the same category.

Exceptions

The met module defines the following exception:

exception met.Multinom_Error
An exception class for all errors in the met module.

Multinom Objects

A Multinom object must be instantiated so that its methods can be called to perform exact tests. In addition to the p values returned by those methods, objects of the Multinom class have attributes that store the results of the latest exact test.

Multinom objects have the following public methods:

Multinom.new_site_obs(obs)
Use different site (test) data with the same reference (control) data that were originally used to define the object.
Multinom.onesided_exact_test([fill_zeroes=False] [, fill_factor=10] [, save_cases=False])
Compute, return, and store the p value for a one-sided exact multinomial test of the distribution of site observations against reference observations.
fill_zeroes
Boolean: adjust zero counts (probabilities) in the reference data. The adjustment is made by multiplying non-zero counts by fill_factor and setting zero counts to 1.
fill_factor
The factor by which to multiply non-zero reference counts if fill_zeroes is true. Defaults to 10.
save_cases
Boolean: save a list of all 'more extreme' cases found during the test.
Multinom.twosided_exact_test([fill_zeroes=False] [, fill_factor=10] [, save_cases=False])
Compute, return, and store the p value for a two-sided exact multinomial test of the distribution of site observations against reference observations.
fill_zeroes
Boolean: adjust zero counts (probabilities) in the reference data. The adjustment is made by multiplying non-zero counts by fill_factor and setting zero counts to 1.
fill_factor
The factor by which to multiply non-zero reference counts if fill_zeroes is true. Defaults to 10.
save_cases
Boolean: save a list of all 'more extreme' cases found during the test.
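
As an illustration of what a two-sided exact multinomial test computes, the following self-contained sketch uses one common definition: the p value is the sum of the probabilities of all arrangements of the site counts that are no more probable, under the reference probabilities, than the observed arrangement. Whether met.py uses exactly this criterion, and whether it counts the observed arrangement among the 'more extreme' cases, are assumptions. The function names are hypothetical, and the sketch does not reproduce the ordering rule used by the one-sided test.

```python
from math import factorial


def multinomial_prob(counts, probs):
    """Multinomial probability of `counts` given category probabilities `probs`."""
    coefficient = factorial(sum(counts))
    for count in counts:
        coefficient //= factorial(count)
    result = float(coefficient)
    for count, prob in zip(counts, probs):
        result *= prob ** count
    return result


def enumerate_cases(categories, items):
    """List every way to distribute `items` items over `categories` categories."""
    if categories == 1:
        return [[items]]
    cases = []
    for first in range(items, -1, -1):
        for rest in enumerate_cases(categories - 1, items - first):
            cases.append([first] + rest)
    return cases


def twosided_exact_p(ref_obs, site_obs):
    """Return (p value, number of cases summed) for a two-sided exact test.

    Assumed criterion: a case is 'more extreme' when its probability does
    not exceed that of the observed site counts (the observed case itself
    is included).
    """
    total = float(sum(ref_obs))
    ref_probs = [count / total for count in ref_obs]
    observed = multinomial_prob(site_obs, ref_probs)
    p_value = 0.0
    n_extreme = 0
    for case in enumerate_cases(len(site_obs), sum(site_obs)):
        prob = multinomial_prob(case, ref_probs)
        # Small relative tolerance so ties are not lost to rounding error.
        if prob <= observed * (1 + 1e-9):
            p_value += prob
            n_extreme += 1
    return p_value, n_extreme
```

The number of cases grows quickly with the number of categories and site observations, so a full enumeration like this is practical only for small samples.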

Multinom objects have the following public attributes:

Multinom.cases
A list of the 'more extreme' arrangements of site data evaluated by the latest test. Valid only if save_cases was true for the latest test.
Multinom.n_extreme_cases
The number of 'more extreme' arrangements of site data evaluated by the latest test.
Multinom.p_value
The p value returned by the latest test.
Multinom.ref_obs
The list of reference area observations (counts).
Multinom.ref_probs
The list of reference area probabilities corresponding to ref_obs.
Multinom.site_obs
The list of site observations (counts).

Copyright and License

Copyright (c) 2009, R.Dreas Nielsen

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. The GNU General Public License is available at http://www.gnu.org/licenses/.