differential entropy of a gaussian
An important result is the differential entropy of a gaussian variable.
The pdf of a multivariate gaussian is
$$f(x) = \frac{1}{\sqrt{(2\pi)^n \det(\Sigma)}} \exp\!\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$$
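As a quick sanity check, here is a minimal Python sketch (assuming NumPy and SciPy are available, with arbitrarily chosen example values for $\mu$ and $\Sigma$) that evaluates this density directly and compares it against `scipy.stats.multivariate_normal`.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative parameters (arbitrary choices, not from the text)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])

def gaussian_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at x, evaluated exactly as in the formula above."""
    n = len(mu)
    diff = x - mu
    norm_const = 1.0 / np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))
    quad = diff @ np.linalg.solve(Sigma, diff)  # (x - mu)^T Sigma^{-1} (x - mu)
    return norm_const * np.exp(-0.5 * quad)

x = np.array([0.5, -1.0])
print(gaussian_pdf(x, mu, Sigma))                      # direct evaluation of the formula
print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))  # SciPy reference value
```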
Now, we know that entropy is the expectation of "surprise," $-\log p(x)$. In the derivation below, "LOE" stands for "linearity of expectation."
$$\begin{aligned}
H[X] &= -E[\log(p(X))] && \text{Moved the negative sign out, LOE} \\
&= -E\left[\log\!\left(\frac{1}{(2\pi)^{n/2}\det(\Sigma)^{1/2}}\right) - \frac{1}{2}(X-\mu)^T\Sigma^{-1}(X-\mu)\right] && \text{Substitution, square root to power} \\
&= -E\left[-\frac{n}{2}\log(2\pi) - \frac{1}{2}\log(\det(\Sigma)) - \frac{1}{2}(X-\mu)^T\Sigma^{-1}(X-\mu)\right] && \text{Log rule of multiplication} \\
&= \frac{n}{2}\log(2\pi) + \frac{1}{2}\log(\det(\Sigma)) + \frac{1}{2}E\left[(X-\mu)^T\Sigma^{-1}(X-\mu)\right] && \text{LOE} \\
&= \frac{n}{2}\log(2\pi) + \frac{1}{2}\log(\det(\Sigma)) + \frac{1}{2}E\left[\operatorname{tr}\!\left((X-\mu)^T\Sigma^{-1}(X-\mu)\right)\right] && \text{Quadratic form is a scalar; trace of a scalar is itself} \\
&= \frac{n}{2}\log(2\pi) + \frac{1}{2}\log(\det(\Sigma)) + \frac{1}{2}E\left[\operatorname{tr}\!\left(\Sigma^{-1}(X-\mu)(X-\mu)^T\right)\right] && \text{Cyclic property of the trace} \\
&= \frac{n}{2}\log(2\pi) + \frac{1}{2}\log(\det(\Sigma)) + \frac{1}{2}\operatorname{tr}\!\left(E\left[\Sigma^{-1}(X-\mu)(X-\mu)^T\right]\right) && \text{LOE (trace is linear)} \\
&= \frac{n}{2}\log(2\pi) + \frac{1}{2}\log(\det(\Sigma)) + \frac{1}{2}\operatorname{tr}\!\left(\Sigma^{-1}E\left[(X-\mu)(X-\mu)^T\right]\right) && \Sigma^{-1}\text{ is a constant} \\
&= \frac{n}{2}\log(2\pi) + \frac{1}{2}\log(\det(\Sigma)) + \frac{1}{2}\operatorname{tr}\!\left(\Sigma^{-1}\Sigma\right) && \text{Definition of the covariance matrix} \\
&= \frac{n}{2}\log(2\pi) + \frac{1}{2}\log(\det(\Sigma)) + \frac{1}{2}\operatorname{tr}(I_n) && \text{Matrix times its inverse is the identity} \\
&= \frac{n}{2}\log(2\pi) + \frac{1}{2}\log(\det(\Sigma)) + \frac{n}{2} && \text{Trace of the identity is the length of its diagonal} \\
&= \frac{n}{2}\log\!\left(2\pi e \det(\Sigma)^{1/n}\right) && \text{Combine terms into one log}
\end{aligned}$$
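The closed form is easy to check numerically. The sketch below (same assumed example parameters as above) compares a Monte Carlo estimate of $E[-\log p(X)]$ against the derived formula and against SciPy's built-in differential entropy; all values are in nats because the derivation uses the natural log.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Same illustrative parameters as above
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
n = len(mu)
dist = multivariate_normal(mean=mu, cov=Sigma)

# Monte Carlo estimate of H[X] = E[-log p(X)]
samples = dist.rvs(size=200_000, random_state=0)
mc_entropy = -dist.logpdf(samples).mean()

# Closed form derived above: (n/2) * log(2 * pi * e * det(Sigma)^(1/n)), in nats
closed_form = 0.5 * n * np.log(2 * np.pi * np.e * np.linalg.det(Sigma) ** (1 / n))

print(mc_entropy)      # stochastic estimate, close to the value below
print(closed_form)     # exact value from the derivation
print(dist.entropy())  # SciPy's built-in differential entropy, same quantity
```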
The key takeaway is that the differential entropy depends only on the covariance of the gaussian, and not on its mean.
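This is easy to see numerically as well: in the small sketch below (again with an arbitrary example covariance), shifting the mean leaves the reported differential entropy unchanged.

```python
import numpy as np
from scipy.stats import multivariate_normal

Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])

# Same covariance, very different means: the differential entropy is identical.
h_centered = multivariate_normal(mean=[0.0, 0.0], cov=Sigma).entropy()
h_shifted = multivariate_normal(mean=[100.0, -50.0], cov=Sigma).entropy()
print(h_centered, h_shifted)  # the two values agree
```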