Likelihood Distributions
class oktopus.likelihood.Likelihood

    Defines a likelihood function.

    Methods

    __call__(params)
        Calls evaluate().
    evaluate(params)
        Evaluates the negative of the log likelihood function.
    fisher_information_matrix(params)
        Computes the Fisher Information Matrix.
    fit([optimizer])
        Minimizes the evaluate() function using scipy.optimize.minimize(),
        scipy.optimize.differential_evolution(), scipy.optimize.basinhopping(),
        or skopt.gp.gp_minimize().
    gradient(params)
        Returns the gradient of the loss function evaluated at params.
    hessian(params)
        Returns the Hessian matrix of the loss function evaluated at params.
    jeffreys_prior(params)
        Computes the negative of the log of Jeffrey's prior and evaluates it at params.
    uncertainties(params)
        Returns the uncertainties on the model parameters as the square root of the
        diagonal of the inverse of the Fisher Information Matrix evaluated at params.
    evaluate(params)

        Evaluates the negative of the log likelihood function.

        Parameters:
            params : ndarray
                Parameter vector of the model.

        Returns:
            neg_loglikelihood : scalar
                The negative of the log likelihood function evaluated at params.

    fisher_information_matrix(params)

        Computes the Fisher Information Matrix.

        Returns:
            fisher : ndarray
                Fisher Information Matrix.
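    The concrete subclasses below are typically used directly, but the base class
    can also be specialized by supplying the data and an evaluate(params) method
    that returns the negative log likelihood. A minimal sketch, in which the
    subclass name and the toy data are illustrative and not part of oktopus:

        import autograd.numpy as np
        from oktopus.likelihood import Likelihood

        class ConstantGaussianLikelihood(Likelihood):
            """Negative log likelihood (up to an additive constant) of i.i.d.
            unit-variance Gaussian data with an unknown constant mean."""
            def __init__(self, data):
                self.data = data

            def evaluate(self, params):
                mu = params[0]
                return 0.5 * np.sum((self.data - mu) ** 2)

        np.random.seed(0)
        y = 5.0 + np.random.normal(size=1000)
        logL = ConstantGaussianLikelihood(y)
        # fit() hands evaluate() to scipy.optimize.minimize by default
        result = logL.fit(x0=[0.0], method='Nelder-Mead')
        print(result.x)  # should be close to np.mean(y)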
class oktopus.likelihood.MultinomialLikelihood(data, mean)

    Implements the negative log likelihood function for the Multinomial
    distribution. This class also contains a method to compute maximum
    likelihood estimators for the probabilities of the Multinomial distribution.

    \[\arg \min_{\theta \in \Theta} - \sum_k y_k \cdot \log p_k(\theta)\]

    Examples

    Suppose our data is divided into two classes and we would like to estimate
    the probability of occurrence of each class under the constraint that
    \(P(class_1) = 1 - P(class_2) = p\). Suppose we have a sample with \(n_1\)
    counts from \(class_1\) and \(n_2\) counts from \(class_2\). Assuming the
    number of counts follows a binomial distribution, the MLE for \(P(class_1)\)
    is given by \(P(class_1) = \dfrac{n_1}{n_1 + n_2}\), and the Fisher
    Information Matrix is \(F(n, p) = \dfrac{n}{p(1 - p)}\). Let's see how we
    can estimate \(p\).
    >>> from oktopus import MultinomialLikelihood
    >>> import autograd.numpy as np
    >>> counts = np.array([20, 30])
    >>> def ber_pmf(p):
    ...     return np.array([p, 1 - p])
    >>> logL = MultinomialLikelihood(data=counts, mean=ber_pmf)
    >>> p0 = 0.5  # our initial guess
    >>> p_hat = logL.fit(x0=p0, method='Nelder-Mead')
    >>> p_hat.x
    array([ 0.4])
    >>> p_hat_unc = logL.uncertainties(p_hat.x)
    >>> p_hat_unc
    array([ 0.06928203])
    >>> 20. / (20 + 30)  # theoretical MLE
    0.4
    >>> print(np.sqrt(0.4 * 0.6 / (20 + 30)))  # theoretical uncertainty
    0.0692820323028
    Attributes

    data (ndarray)
        Observed count data.
    mean (callable)
        Event probabilities of the multinomial distribution.

    Methods

    __call__(params)
        Calls evaluate().
    evaluate(params)
    fisher_information_matrix(params)
    fit([optimizer])
        Minimizes the evaluate() function using scipy.optimize.minimize(),
        scipy.optimize.differential_evolution(), scipy.optimize.basinhopping(),
        or skopt.gp.gp_minimize().
    gradient(params)
    hessian(params)
        Returns the Hessian matrix of the loss function evaluated at params.
    jeffreys_prior(params)
        Computes the negative of the log of Jeffrey's prior and evaluates it at params.
    uncertainties(params)
        Returns the uncertainties on the model parameters as the square root of the
        diagonal of the inverse of the Fisher Information Matrix evaluated at params.

    n_counts

        Returns the sum of the number of counts over all bins.
class oktopus.likelihood.PoissonLikelihood(data, mean)

    Implements the negative log likelihood function for independent (possibly
    non-identically) distributed Poisson measurements. This class also contains
    a method to compute maximum likelihood estimators (MLE) for the mean of the
    Poisson distribution.

    More precisely, the MLE is computed as:

    \[\arg \min_{\theta \in \Theta} \sum_k \lambda_k(\theta) - y_k \cdot \log \lambda_k(\theta)\]

    Notes

    See here for the mathematical derivation of the Poisson likelihood expression.
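    For reference, this expression follows from the Poisson probability mass
    function: the full negative log likelihood of independent measurements
    \(y_k \sim \mathrm{Poisson}(\lambda_k(\theta))\) is

    \[- \log L(\theta) = \sum_k \left[\lambda_k(\theta) - y_k \log \lambda_k(\theta) + \log(y_k!)\right],\]

    and the \(\log(y_k!)\) term does not depend on \(\theta\), so it can be
    dropped from the objective being minimized.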
    Examples

    Suppose we want to estimate the expected number of planes arriving at
    gate 50, terminal 1, at SFO airport in a given hour of a given day using
    some data.

    >>> import math
    >>> from oktopus import PoissonLikelihood
    >>> import numpy as np
    >>> import autograd.numpy as npa
    >>> np.random.seed(0)
    >>> toy_data = np.random.randint(1, 20, size=100)
    >>> def mean(l):
    ...     return npa.array([l])
    >>> logL = PoissonLikelihood(data=toy_data, mean=mean)
    >>> mean_hat = logL.fit(x0=10.5)
    >>> mean_hat.x
    array([ 9.29000013])
    >>> print(np.mean(toy_data))  # theoretical MLE
    9.29
    >>> mean_unc = logL.uncertainties(mean_hat.x)
    >>> mean_unc
    array([ 3.04795015])
    >>> print("{:.5f}".format(math.sqrt(np.mean(toy_data))))  # theoretical Fisher information
    3.04795
    Attributes

    data (ndarray)
        Observed count data.
    mean (callable)
        Mean of the Poisson distribution. Note: if you want to compute
        uncertainties, this model must be defined with the autograd numpy wrapper.

    Methods

    __call__(params)
        Calls evaluate().
    evaluate(params)
    fisher_information_matrix(params)
    fit([optimizer])
        Minimizes the evaluate() function using scipy.optimize.minimize(),
        scipy.optimize.differential_evolution(), scipy.optimize.basinhopping(),
        or skopt.gp.gp_minimize().
    gradient(params)
    hessian(params)
        Returns the Hessian matrix of the loss function evaluated at params.
    jeffreys_prior(params)
        Computes the negative of the log of Jeffrey's prior and evaluates it at params.
    uncertainties(params)
        Returns the uncertainties on the model parameters as the square root of the
        diagonal of the inverse of the Fisher Information Matrix evaluated at params.
class oktopus.likelihood.GaussianLikelihood(data, mean, var)

    Implements the likelihood function for independent (possibly
    non-identically) distributed Gaussian measurements with known variance.

    The maximum likelihood estimator is computed as:

    \[\arg \min_{\theta \in \Theta} \dfrac{1}{2}\sum_k \left(\dfrac{y_k - \mu_k(\theta)}{\sigma_k}\right)^2\]

    Examples

    The following example demonstrates how one can fit a maximum likelihood
    line to some data:
    >>> from oktopus import GaussianLikelihood
    >>> import autograd.numpy as np
    >>> from matplotlib import pyplot as plt
    >>> x = np.linspace(0, 10, 200)
    >>> np.random.seed(0)
    >>> fake_data = x * 3 + 10 + np.random.normal(scale=2, size=x.shape)
    >>> def line(x, alpha, beta):
    ...     return alpha * x + beta
    >>> my_line = lambda a, b: line(x, a, b)
    >>> logL = GaussianLikelihood(fake_data, my_line, 4)
    >>> p0 = (1, 1)  # initial guess for alpha and beta
    >>> p_hat = logL.fit(x0=p0, method='Nelder-Mead')
    >>> p_hat.x  # fitted parameters
    array([ 2.96263393, 10.32860717])
    >>> p_hat_unc = logL.uncertainties(p_hat.x)  # get uncertainties on fitted parameters
    >>> p_hat_unc
    array([ 0.04874546, 0.28178535])
    >>> plt.plot(x, fake_data, 'o')
    >>> plt.plot(x, line(x, *p_hat.x))
    >>> # The exact values from linear algebra would be:
    >>> M = np.array([[np.sum(x * x), np.sum(x)], [np.sum(x), len(x)]])
    >>> alpha, beta = np.dot(np.linalg.inv(M), np.array([np.sum(fake_data * x), np.sum(fake_data)]))
    >>> print(alpha)
    2.96264087528
    >>> print(beta)
    10.3286166099
    Attributes

    data (ndarray)
        Observed data.
    mean (callable)
        Mean model.
    var (float or array-like)
        Uncertainties on the observed data.

    Methods

    __call__(params)
        Calls evaluate().
    evaluate(params)
    fisher_information_matrix(params)
        Computes the Fisher Information Matrix.
    fit([optimizer])
        Minimizes the evaluate() function using scipy.optimize.minimize(),
        scipy.optimize.differential_evolution(), scipy.optimize.basinhopping(),
        or skopt.gp.gp_minimize().
    gradient(params)
    hessian(params)
        Returns the Hessian matrix of the loss function evaluated at params.
    jeffreys_prior(params)
        Computes the negative of the log of Jeffrey's prior and evaluates it at params.
    uncertainties(params)
        Returns the uncertainties on the model parameters as the square root of the
        diagonal of the inverse of the Fisher Information Matrix evaluated at params.
class oktopus.likelihood.LaplacianLikelihood(data, mean, var)

    Implements the likelihood function for independent (possibly
    non-identically) distributed Laplacian measurements with known error bars.

    \[\arg \min_{\theta \in \Theta} \sum_k \dfrac{|y_k - \mu_k(\theta)|}{\sigma_k}\]

    Attributes

    data (ndarray)
        Observed data.
    mean (callable)
        Mean model.
    var (float or array-like)
        Uncertainties on the observed data.

    Methods

    __call__(params)
        Calls evaluate().
    evaluate(params)
    fisher_information_matrix(params)
    fit([optimizer])
        Minimizes the evaluate() function using scipy.optimize.minimize(),
        scipy.optimize.differential_evolution(), scipy.optimize.basinhopping(),
        or skopt.gp.gp_minimize().
    gradient(params)
        Returns the gradient of the loss function evaluated at params.
    hessian(params)
        Returns the Hessian matrix of the loss function evaluated at params.
    jeffreys_prior(params)
        Computes the negative of the log of Jeffrey's prior and evaluates it at params.
    uncertainties(params)
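    Examples

    A minimal usage sketch, assuming the same calling convention as the
    GaussianLikelihood example above (a mean model taking the free parameters,
    with var giving the per-point error bar); the toy data and parameter values
    are illustrative:

        from oktopus.likelihood import LaplacianLikelihood
        import autograd.numpy as np

        x = np.linspace(0, 10, 200)
        np.random.seed(0)
        # toy data: a line with heavy-tailed (Laplacian) noise
        fake_data = 3 * x + 10 + np.random.laplace(scale=2, size=x.shape)

        def line(alpha, beta):
            return alpha * x + beta

        logL = LaplacianLikelihood(data=fake_data, mean=line, var=2)
        # Nelder-Mead is derivative-free, which suits the absolute-value loss
        p_hat = logL.fit(x0=(1, 1), method='Nelder-Mead')
        print(p_hat.x)  # expected to be close to (3, 10)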
class oktopus.likelihood.MultivariateGaussianLikelihood(data, mean, cov, dim)

    Implements the likelihood function of a multivariate Gaussian distribution.

    Parameters:
        data : ndarray
            Observed data.
        mean : callable
            Mean model.
        cov : ndarray or callable
            If callable, the parameters of the covariance matrix will be fitted.
        dim : int
            Number of parameters of the mean model.

    Methods

    __call__(params)
        Calls evaluate().
    evaluate(params)
        Computes the negative of the log likelihood function.
    fisher_information_matrix(params)
    fit([optimizer])
        Minimizes the evaluate() function using scipy.optimize.minimize(),
        scipy.optimize.differential_evolution(), scipy.optimize.basinhopping(),
        or skopt.gp.gp_minimize().
    gradient(params)
    hessian(params)
        Returns the Hessian matrix of the loss function evaluated at params.
    jeffreys_prior(params)
        Computes the negative of the log of Jeffrey's prior and evaluates it at params.
    uncertainties(params)
        Returns the uncertainties on the model parameters as the square root of the
        diagonal of the inverse of the Fisher Information Matrix evaluated at params.
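    Examples

    A minimal usage sketch with a fixed (known) covariance matrix. The calling
    convention assumed here (the mean model receiving the first dim parameters,
    and the data given as a single observation vector) and the toy numbers are
    illustrative assumptions, not taken from the oktopus documentation:

        from oktopus.likelihood import MultivariateGaussianLikelihood
        import autograd.numpy as np

        np.random.seed(0)
        cov = np.array([[2.0, 0.5],
                        [0.5, 1.0]])                            # known covariance
        data = np.random.multivariate_normal([5.0, 5.0], cov)   # one 2-d measurement

        def mean_model(*theta):
            # single free parameter: a common mean for both components; written
            # to tolerate either a scalar or a length-1 array input, since the
            # exact calling convention is an assumption here
            mu = np.ravel(theta)[0]
            return np.array([mu, mu])

        logL = MultivariateGaussianLikelihood(data=data, mean=mean_model,
                                              cov=cov, dim=1)
        mu_hat = logL.fit(x0=[4.0], method='Nelder-Mead')
        print(mu_hat.x)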
class oktopus.likelihood.BernoulliLikelihood(data, mean)

    Implements the negative log likelihood function for independent (possibly
    non-identically distributed) Bernoulli random variables. This class also
    contains a method to compute maximum likelihood estimators for the
    probability of a success.

    More precisely, the MLE is computed as

    \[\arg \min_{\theta \in \Theta} - \sum_{i=1}^{n} \left[y_i\log\pi_i(\mathbf{\theta}) + (1 - y_i)\log(1 - \pi_i(\mathbf{\theta}))\right]\]

    Examples
    >>> import numpy as np
    >>> from oktopus import BernoulliLikelihood, UniformPrior, Posterior
    >>> from oktopus.models import ConstantModel as constant
    >>> # generate integer fake data in the set {0, 1}
    >>> np.random.seed(0)
    >>> y = np.random.choice([0, 1], size=401)
    >>> # create a model
    >>> p = constant()
    >>> # perform optimization
    >>> ber = BernoulliLikelihood(data=y, mean=p)
    >>> unif = UniformPrior(lb=0., ub=1.)
    >>> pp = Posterior(likelihood=ber, prior=unif)
    >>> result = pp.fit(x0=.3, method='powell')
    >>> # get best fit parameters
    >>> print(np.round(result.x, 3))
    0.529
    >>> print(np.round(np.mean(y>0), 3))  # theoretical MLE
    0.529
    >>> # get uncertainties on the best fit parameters
    >>> print(ber.uncertainties([result.x]))
    [ 0.0249277]
    >>> # theoretical uncertainty
    >>> print(np.sqrt(0.528678304239 * (1 - 0.528678304239) / 401))
    0.0249277036876
    Attributes

    data (array-like)
        Observed data.
    mean (callable)
        A functional form that defines the model for the probability of success.

    Methods

    __call__(params)
        Calls evaluate().
    evaluate(theta)
    fisher_information_matrix(theta)
    fit([optimizer])
        Minimizes the evaluate() function using scipy.optimize.minimize(),
        scipy.optimize.differential_evolution(), scipy.optimize.basinhopping(),
        or skopt.gp.gp_minimize().
    gradient(theta)
    hessian(params)
        Returns the Hessian matrix of the loss function evaluated at params.
    jeffreys_prior(params)
        Computes the negative of the log of Jeffrey's prior and evaluates it at params.
    uncertainties(theta)
class oktopus.likelihood.BernoulliGaussianMixtureLikelihood(data, mean, var=1.0)

    Implements the likelihood of \(Z_i = X_i + Y_i\), such that \(X_i\) is a
    Bernoulli random variable with probability of success \(p_i\), and \(Y_i\)
    is a normal random variable with zero mean and known variance \(\sigma^2\).

    The log likelihood is given as

    \[\log p(z^n) = \sum_{i=1}^{n} \log \left[(1 - p_i) \mathcal{N}(0, \sigma^2) + p_i \mathcal{N}(1, \sigma^2)\right]\]

    Examples
    >>> import numpy as np
    >>> from oktopus import BernoulliGaussianMixtureLikelihood, UniformPrior, Posterior
    >>> from oktopus.models import ConstantModel as constant
    >>> # generate fake data drawn from the two mixture components
    >>> np.random.seed(0)
    >>> y = np.append(np.random.normal(size=30), 1 + np.random.normal(size=70))
    >>> # create a model
    >>> p = constant()
    >>> # build likelihood
    >>> ll = BernoulliGaussianMixtureLikelihood(data=y, mean=p, var=1.)
    >>> unif = UniformPrior(lb=0., ub=1.)
    >>> pp = Posterior(likelihood=ll, prior=unif)
    >>> result = pp.fit(x0=.3, method='powell')
    >>> # get best fit parameters
    >>> print(np.round(result.x, 3))
    0.803
    >>> print(np.mean(y>0))  # theoretical MLE
    0.78
    Methods

    __call__(params)
        Calls evaluate().
    evaluate(theta)
    fisher_information_matrix(params)
        Computes the Fisher Information Matrix.
    fit([optimizer])
        Minimizes the evaluate() function using scipy.optimize.minimize(),
        scipy.optimize.differential_evolution(), scipy.optimize.basinhopping(),
        or skopt.gp.gp_minimize().
    gradient(params)
        Returns the gradient of the loss function evaluated at params.
    hessian(params)
        Returns the Hessian matrix of the loss function evaluated at params.
    jeffreys_prior(params)
        Computes the negative of the log of Jeffrey's prior and evaluates it at params.
    uncertainties(params)
        Returns the uncertainties on the model parameters as the square root of the
        diagonal of the inverse of the Fisher Information Matrix evaluated at params.