# joint entropy

The joint entropy of two random variables $$X$$ and $$Y$$ is

\begin{equation} H(X, Y) = -\sum_{x,y} p(x, y) \log p(x, y) \end{equation}
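
As a minimal sketch, the definition can be evaluated directly from a table of joint probabilities. The function name `joint_entropy`, the dictionary layout `{(x, y): p}`, and the two-coin example are illustrative assumptions. The logarithm is taken base 2 here so that the result is in bits; the formula itself leaves the base open.

```python
import math

def joint_entropy(joint):
    """Joint entropy in bits of a distribution given as {(x, y): p}."""
    # Outcomes with p == 0 are skipped (0 * log 0 is taken to be 0).
    return -sum(p * math.log2(p) for p in joint.values() if p > 0)

# Hypothetical example: two fair coins flipped independently.
fair_coins = {(x, y): 0.25 for x in "HT" for y in "HT"}
h_joint = joint_entropy(fair_coins)
print(h_joint)  # 2.0
```

Four equally likely outcomes need two bits to describe, which matches the printed value.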

The lower bound of joint entropy is

\begin{equation} H(X, Y) \geq \max(H(X), H(Y)) \geq 0 \end{equation}

This means that adding another unknown variable can never reduce the joint entropy: it is always at least as large as the larger of the two individual entropies.
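
The lower bound is met with equality in the extreme case where one variable adds no new uncertainty at all. The sketch below checks this on a made-up example in which $$Y$$ is an exact copy of a fair coin $$X$$. The helper names `entropy` and `marginals` and the base-2 logarithm are assumptions for illustration.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: p}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginals(joint):
    """Marginal distributions of X and Y from a joint {(x, y): p}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return px, py

# Hypothetical example: X is a fair coin and Y is an exact copy of X,
# so observing Y alongside X adds no new uncertainty.
copied = {("H", "H"): 0.5, ("T", "T"): 0.5}
px, py = marginals(copied)
h_xy, h_x, h_y = entropy(copied), entropy(px), entropy(py)
print(h_xy, h_x, h_y)  # 1.0 1.0 1.0
```

Here the joint entropy sits exactly at $$\max(H(X), H(Y))$$; any extra randomness in $$Y$$ that is not determined by $$X$$ would push it higher.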

An upper bound of joint entropy is

\begin{equation} H(X, Y) \leq H(X) + H(Y) \end{equation}

Equality holds if and only if $$X$$ and $$Y$$ are independent, in which case the information revealed by doing the two experiments together is exactly the sum of the information from doing each experiment individually. If they are dependent, some of the information is shared between them, so the joint entropy is strictly lower than the sum (we need fewer bits to describe both outcomes together than to describe each one separately).
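
To make the upper bound concrete, the following sketch compares $$H(X, Y)$$ with $$H(X) + H(Y)$$ for one independent and one dependent pair of coins. Both distributions are invented for illustration, the helper names are hypothetical, and entropies are measured in bits (base-2 logarithm).

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: p}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def sum_of_marginal_entropies(joint):
    """H(X) + H(Y) computed from a joint distribution {(x, y): p}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return entropy(px) + entropy(py)

# Hypothetical independent pair: p(x, y) = p(x) * p(y) for two fair coins.
independent = {(x, y): 0.25 for x in "HT" for y in "HT"}

# Hypothetical dependent pair: the two coins agree 80% of the time.
dependent = {("H", "H"): 0.4, ("H", "T"): 0.1,
             ("T", "H"): 0.1, ("T", "T"): 0.4}

gap_independent = sum_of_marginal_entropies(independent) - entropy(independent)
gap_dependent = sum_of_marginal_entropies(dependent) - entropy(dependent)
print(gap_independent, gap_dependent)
```

The gap is zero for the independent pair and positive for the dependent one, even though both pairs have the same fair-coin marginals.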

Finally, joint entropy is closely related to conditional entropy, in that

\begin{equation} H(X, Y) = H(Y|X) + H(X) \end{equation}

Performing the two experiments together yields exactly the information from $$Y$$ given that we have seen $$X$$, plus the information from $$X$$ itself; this is the additivity of entropy. If they are independent, the conditional entropy reduces to $$H(Y)$$ and we recover the equality case of the upper bound above.
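
The identity can be checked numerically by computing $$H(Y|X)$$ on its own, as the average over $$x$$ of the entropy of $$Y$$ given $$X = x$$, and then adding $$H(X)$$. This is a sketch on a made-up dependent distribution; `conditional_entropy` and the other names are hypothetical, and the logarithm is base 2.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: p}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal_x(joint):
    """Marginal distribution of X from a joint {(x, y): p}."""
    px = defaultdict(float)
    for (x, _), p in joint.items():
        px[x] += p
    return px

def conditional_entropy(joint):
    """H(Y|X) = sum over x of p(x) * H(Y | X = x), in bits."""
    px = marginal_x(joint)
    total = 0.0
    for x0, p_x in px.items():
        if p_x == 0:
            continue  # an impossible x contributes nothing
        # Distribution of Y restricted to X = x0, renormalised by p(x0).
        cond = {y: p / p_x for (x, y), p in joint.items() if x == x0}
        total += p_x * entropy(cond)
    return total

# Hypothetical dependent pair: the two coins agree 80% of the time.
joint = {("H", "H"): 0.4, ("H", "T"): 0.1,
         ("T", "H"): 0.1, ("T", "T"): 0.4}

h_xy = entropy(joint)
h_x = entropy(marginal_x(joint))
h_y_given_x = conditional_entropy(joint)
print(math.isclose(h_xy, h_y_given_x + h_x))  # True
```

Because $$H(Y|X)$$ is computed here from the conditional distributions rather than as $$H(X, Y) - H(X)$$, the check is not circular.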