mutual information
A measure of the amount of information (entropy) shared between two random variables.
$I(X;Y) = D_{\mathrm{KL}}\left(p(x,y)\,\|\,p(x)\,p(y)\right)$
where $D_{\mathrm{KL}}$ is the Kullback–Leibler (KL) divergence.
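For discrete variables, expanding the KL divergence gives the familiar sum form (a standard identity, not something specific to these notes):

```latex
I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}
```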
We are essentially asking for the "distance" between the joint distribution of the two variables and the product of their marginals.
So $I(X;Y)$ is the information gained by using the true joint distribution, which captures the dependence between the variables, instead of treating each variable as if it were independent (i.e., using the product of the marginals).
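A minimal sketch of computing $I(X;Y)$ for two discrete variables with numpy, assuming a small made-up joint probability table `p_xy` (the values are purely illustrative):

```python
import numpy as np

# Hypothetical 2x2 joint distribution p(x, y); rows index X, columns index Y.
p_xy = np.array([[0.40, 0.10],
                 [0.10, 0.40]])

p_x = p_xy.sum(axis=1, keepdims=True)   # marginal p(x)
p_y = p_xy.sum(axis=0, keepdims=True)   # marginal p(y)

# I(X;Y) = sum_{x,y} p(x,y) * log( p(x,y) / (p(x) p(y)) )
# Entries with p(x,y) = 0 contribute nothing, so mask them out to avoid log(0).
mask = p_xy > 0
mi = np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask]))

print(mi)  # ~0.19 nats for this table; 0 would mean X and Y are independent
```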