The connection between thermodynamic entropy and information
The bottom line is that thermodynamic entropy
is best understood not as
a property or macroscopic state of matter (like mass, temperature, or
pressure),
but
as a lack of knowledge of the detailed configuration of matter. In
particular, thermodynamic entropy is a measure of our lack of
information about the
micro-state of a closed system of matter at equilibrium. To make this
concrete, I'll compare two similar simple systems, one of
particles and one of bits. Although the concept of entropy in classical
thermodynamics
was elucidated long before information
theory was developed, thermodynamic entropy can be viewed as a
straightforward application of information theory to a physical
problem.
There are many other fine discussions of this topic,
but few that strip it down to a simple example. I have strayed slightly
from some common conventions (e.g. I use "micro-state" instead of
"microstate", for emphasis), and suppress a range of worthwhile
elaborations. A more in-depth, but more technical, discussion of the
same topic is at Entropy
in thermodynamics and information
theory. But
this article and others have the same bottom line, with only a
variation of language:
"....it should be remembered that
Gibbs' statistical mechanical
entropy is only one application of information theory to physical
systems, relevant when the particular 'message' not yet communicated is
the underlying microstate of the physical system."
A good diagram illustrating this idea of "physical
information" is in M. P. Frank's paper "Physical
limits of Computing".
Consider
a perfectly insulated 2-D box of
simple particles. The macro-state
of an ideal gas can be specified by the total energy E, number of
particles N and volume V. There are a large but finite number of
possible micro-states that are all consistent with this system's
single, and unchanging, macro-state:

Ludwig
Boltzmann's leap of imagination was that
the number of
possible micro-states, Ω, was finite, and in some sense a
particle's state
is discrete. But it wasn't until quantum mechanics was developed that
this was clarified and shown to be strictly true.
Henri
Poincaré and others showed that such a particle system
would
necessarily
cycle through all possible micro-states, and that each would be visited
with equal probability.
Any one of
these micro-states is equally likely to be the actual micro-state (near
equilibrium) and we have no way of knowing which is the actual
micro-state. This lack of information is
not because we haven't examined the system closely; it reflects the
inaccessibility of this information near equilibrium. But
we can count how many
micro-states are possible.
The thermodynamic entropy, S, for this case is:
$S = \log(\Omega_p)$, where $\Omega_p$ = number of equally probable micro-states
Boltzmann's form of this equation is $S = k \ln(\Omega)$,
where k is Boltzmann's constant. Same
equation, different units. I'm using bits (implicitly $\log_2$)
as units, whereas Boltzmann
used the SI units J K$^{-1}$; setting k = 1 gives nats,
where 1 nat = 1/ln(2) bits.
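As a rough sketch of the unit conversion just described, the following Python snippet expresses one count of equally probable micro-states in bits, in nats, and in J/K. The function name `entropy_in_units` and the example value of Ω (2^16) are illustrative assumptions, not part of the original discussion.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K (exact by SI definition)

def entropy_in_units(omega: int) -> dict:
    """Entropy of `omega` equally probable micro-states in three unit systems."""
    nats = math.log(omega)           # natural log gives nats (k = 1)
    return {
        "bits": math.log2(omega),    # base-2 log gives bits
        "nats": nats,
        "J_per_K": K_B * nats,       # Boltzmann's S = k ln(omega)
    }

s = entropy_in_units(2 ** 16)        # hypothetical example value of omega
print(s)
```

The three numbers describe the same uncertainty; only the base of the logarithm and the constant in front differ.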
This statistical measure of thermodynamic entropy
quantifies the uncertainty about which micro-state is occupied. The
higher the number of equally probable possibilities, the more
uncertainty. Near equilibrium the system has a maximum entropy, because
there are the most possible micro-states near equilibrium (for example,
there are very few possibilities for all the particles clumped in one
corner of our insulated box, but many possible ways they can be roughly
evenly distributed across the box).
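To make the counting argument tangible, here is a minimal sketch that enumerates a toy box. The model is an assumption made purely for illustration: a 4 x 4 grid of cells holding 4 particles, at most one particle per cell, with "clumped" meaning all particles in one 2 x 2 corner quadrant and "even" meaning one particle in each quadrant.

```python
from collections import Counter
from itertools import combinations

SIDE = 4        # hypothetical 4 x 4 grid of cells
PARTICLES = 4   # hypothetical particle count, at most one per cell

def quadrant(cell: int) -> tuple:
    """Return the (row, col) index of the 2 x 2 quadrant containing a cell."""
    row, col = divmod(cell, SIDE)
    return (row // 2, col // 2)

def classify(cells) -> str:
    """Label an arrangement by how its particles spread over the quadrants."""
    occupied = {quadrant(c) for c in cells}
    if len(occupied) == 1:
        return "clumped"   # every particle in a single corner quadrant
    if len(occupied) == 4:
        return "even"      # exactly one particle in each quadrant
    return "other"

counts = Counter(
    classify(cells) for cells in combinations(range(SIDE * SIDE), PARTICLES)
)
print(counts)
```

In this toy model only 4 of the 1,820 arrangements are clumped, while 256 are evenly spread, which is the asymmetry the paragraph above describes.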
Compare this with a set of 2-D 4 x 4 arrays of bits
(images in this case, each one a kind of message), each with the same
macro-state specified by the number of bits (N = 16, represented by
black or white squares). If an acquaintance is to send you an
image/message of this form (a 16-bit email, for example), and you have
no prior information about which image/message is to be sent, then each
of a countable number
(65,536) of images/messages is equally probable.

The information
theory entropy (Shannon entropy), H,
for
this case is
defined as:
$H = \log(\Omega_p)$, where $\Omega_p$ = number of equally probable micro-states
The entropy
H quantifies the uncertainty
about what message is to be received. The
higher the number of equally probable possibilities, the more entropy.
The image/message has a maximum of entropy before it is
received. But after it is received and read, there is no longer any
uncertainty; there is only one possible micro-state, the image/message
itself; $\Omega_p = 1$ and $H = 0$.
If a single one of these arrays is received as an image/message, the information, I, contained in the image/message is:
$I = -\log(1/\Omega_s) = \log(\Omega_s)$, where $\Omega_s$ = number of
equally probable micro-states consistent with the message macro-state
If the micro-states are not equally probable, these formulas
for S, H and I need to be modified. They become
weighted sums over all possible states, where the probability of each
state is the weighting factor. See Entropy
in thermodynamics and information
theory.
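For the unequal-probability case, the weighted sum usually meant is the standard Shannon form H = -Σ p log2(p). The sketch below is one way to write it; the function name `shannon_entropy` and the example distributions are assumptions chosen for illustration.

```python
import math

def shannon_entropy(probs) -> float:
    """H = -sum(p * log2(p)) in bits; states with p == 0 contribute nothing."""
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [1 / 8] * 8                  # 8 equally probable states
skewed = [0.5, 0.25, 0.125, 0.125]     # 4 states, unequal probabilities

print(shannon_entropy(uniform))  # 3.0, the same as log2(8)
print(shannon_entropy(skewed))   # 1.75, less than log2(4) = 2
```

With equal probabilities the sum collapses to log(Ω), so the simpler formulas used in this article are the special case.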
The probability of this particular image/message
being sent is 1/Ωs. The larger
the number of possibilities, the more uncertainty is resolved, or
entropy reduced, when the particular image/message is received.
Information
is a measure of how much an image/message (an
observed micro-state) tells us, by comparison with the number of
other messages it could have been (those consistent with the
image/message's macro-state).
Here's the math. For this 16-bit message
with 65,536
possibilities, a single message contains $I = -\log_2(1/65{,}536)$ bits
$= -\log_2(2^{-16})$ bits $= 16$
bits. This is the amount by which the message was "surprising", or how much our
uncertainty (entropy) was reduced: it
could have been a lot of things but it was this singular message. But
this result, that 16 bits of information are contained in the message, is
not
surprising for this simple example; we knew we were to be sent 16 bits
and when we received the message we found out what each of the 16 bits
was.
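The same arithmetic can be checked in a couple of lines. This is only a sketch; the function name `information_bits` is a hypothetical helper, and it assumes all Ωs messages are equally probable.

```python
import math

def information_bits(omega_s: int) -> float:
    """I = -log2(1 / omega_s) for one message out of omega_s equally likely ones."""
    probability = 1 / omega_s
    return -math.log2(probability)

n_messages = 2 ** 16                 # every possible 4 x 4 one-bit image
print(n_messages)                    # 65536
print(information_bits(n_messages))  # 16.0
```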
H
and I might seem
redundant because the formulas are
similar. But H does not equal I. Entropy refers to the uncertainty
of an unknown
message, and information refers to the probability of a known message
occurring by chance alone.
More accurately, entropy is a measure of
uncertainty due to the unknown part
of a
message/particle system, and information is a measure of the
reduction of uncertainty due to the known part of a message/particle
system.
Information gained is equal to entropy
lost. Information and entropy are two sides of the same
probabilistic coin. While a flipped coin is spinning in the air the
entropy H is
one bit (an unknown heads or tails), and the information I is zero.
When it lands and is observed, the entropy H is zero, and the
information I is one bit (a
known heads or tails).
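The trade-off between entropy and information can be sketched for a message that is only partly known. The helper below, `entropy_and_information`, is a hypothetical illustration that assumes every bit is independent and equally likely to be 0 or 1.

```python
import math

def entropy_and_information(n_bits: int, known_bits: int) -> tuple:
    """Return (H, I) in bits for an n_bits message with known_bits revealed."""
    remaining = 2 ** (n_bits - known_bits)   # messages still consistent
    h = math.log2(remaining)                 # uncertainty that is left
    i = math.log2(2 ** n_bits) - h           # uncertainty already removed
    return h, i

# The coin: one bit, first in the air, then observed.
print(entropy_and_information(1, 0))   # (1.0, 0.0)
print(entropy_and_information(1, 1))   # (0.0, 1.0)

# The 16-bit image with a quarter of its pixels revealed.
print(entropy_and_information(16, 4))  # (12.0, 4.0)
```

In every case H + I equals the total number of bits, which is the sense in which information gained equals entropy lost.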
S and H are
equivalent, in that S = H of a thermodynamic system. S is reserved for thermodynamics,
but H can be applied to
any statistical system.
The entropy S is
a state function
of a thermodynamic system, but it can't be directly measured like
pressure and temperature (see measuring
entropy). There is no entropy-meter; entropy must be inferred by
varying the state of a system near equilibrium and observing how it responds. This is
one reason why the statistical
mechanics interpretation of entropy is so important:
"[The] ability to make macroscopic
predictions based on microscopic properties is the main asset of
statistical mechanics over
thermodynamics. Both theories are governed
by the
second
law of thermodynamics through the medium of
entropy.
However, entropy in thermodynamics
can only be known empirically, whereas in statistical mechanics, it is
a function of the distribution of the system on its microstates." (from
statistical mechanics)
It might seem like this statistical interpretation
of
matter can cause matter to be "influenced" by our knowledge, or lack of
knowledge, of its micro-states. What does information or knowledge
about micro-states have to do with how a steam engine works? But this
train of thought is a result of a misperception of microscopic states
in nature. Which micro-state a particle system is in is irreducibly
(inherently)
uncertain, in the same sense that the position and momentum of individual
particles are uncertain (Heisenberg's
uncertainty principle). The fact that entropy almost always increases or
stays the same (the second
law of thermodynamics) is
a statistical statement about the uncertainty of a particle
system's micro-state.
The fact that entropy can sometimes
decrease is often glossed over in discussions of, and even in
the statement of, the second law of thermodynamics. The usefulness of
the second law (its explanatory power) is due to how rarely
entropy decreases for any large number of particles. For even
small macroscopic systems (e.g. > 10,000 particles, >~$2^{10{,}000}$
possible states), it is highly improbable (p <<< 1/2) that a
measurable decrease of entropy (e.g. a fractional decrease of
1/10,000) will occur in the (current) lifetime of the universe (~$10^{10}$
years). Almost is good enough for physics too.
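To get a feel for how quickly a large spontaneous drop in entropy becomes improbable, consider a deliberately simplified toy model (an assumption for illustration, not the calculation behind the figures above): each particle is independently equally likely to be in either half of the box, and we ask for the chance that all of them are found in the left half at once. The function name is hypothetical, and the result is kept as a base-10 logarithm because the probability itself underflows ordinary floating point.

```python
import math

def log10_prob_all_in_left_half(n_particles: int) -> float:
    """log10 of (1/2) ** n_particles, the chance all particles are in one half."""
    return -n_particles * math.log10(2)

for n in (10, 100, 10_000):
    print(n, round(log10_prob_all_in_left_half(n), 1))
```

For 10 particles the chance is about 1 in 1,000; for 10,000 particles it is roughly 1 in 10^3010 under these assumptions.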
James Clerk Maxwell's thought
experiment Maxwell's demon is an example of the
importance of observability/uncertainty in discussing the second law.
The experiment's resolution, that the demon can't cheat the second law because
she
can't observe the micro-state without altering it, highlights the
importance of observability/uncertainty in physics.
To Do: Comparison of non-equilibrium systems moving to equilibrium.
To Do: Show how these ideas can be extended to easily perceived
messages, particularly images.
53,754-bit (184 x 289) images ($2^{53{,}754}$
possible images ). Each pixel is represented by one bit, black or
white. These three are particular images, not arbitrary selections from
all possible 184 x 289 one-bit images :

Low algorithmic complexity; "simple" (one of very few possible low information images); high "image available energy" (non-random intensity gradient); farthest from equilibrium.

Medium algorithmic complexity; "complex" (one of a few possible medium information images); medium "image available energy"; far from equilibrium.

High algorithmic complexity; "random" (one of many possible nearly random images); low "image available energy"; near equilibrium.