• Count Timothy von Icarus
    2k
    I have clarified my thinking on this and reduced the issue to just two paragraphs. I'll leave the original below.

    There is a claim in physics that the algorithmic complexity (Kolmogorov complexity) of the universe doesn't increase, and this is used to explain the claim that information is conserved in the universe (neither created nor destroyed). The concept is similar to Laplace's demon, and some argue that it would only obtain if the universe is fully deterministic; otherwise, truly stochastic quantum interactions would constantly inject new information into the universe.

    The common claim is that the shortest possible description of the universe at any time, T(short), gives you complete information about all states of the universe, because one can simply evolve T(short) into any other state. Thus no new information, in the algorithmic sense, is ever added: T(short) can always become any other state given a deterministic computation in which evolving T(short) produces the desired state with probability 1.

    The problem here, from my perspective, is that T(short) + evolution produces all states, not just a single state of the universe. To use T(short) as a "program" (conceivably the universe evolving itself as a quantum cellular automaton) that outputs a specific different time, T(diff), requires some way to halt the evolution at T(diff) and output the current state; otherwise the evolution just runs forever. This would seem to require that you already have a complete description of T(diff) to match against, comparing your evolved states to the one you want. And since T(diff) could require a much larger description than T(short), such an evolution scheme does NOT entail that the universe doesn't gain algorithmic complexity.
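    To make the halting problem concrete, here is a toy sketch (Python purely for illustration; a simple Rule 90 cellular automaton stands in for "the universe evolving itself," which is my assumption for the example, not a physics claim):

```python
# Toy illustration: the evolver can only stop at a specific later state
# T(diff) if it is *given* T(diff) to match against -- i.e., it already
# needs the full description it is supposed to produce.

def step(state):
    """One deterministic update (Rule 90): each cell becomes the XOR of its neighbours."""
    n = len(state)
    return tuple(state[(i - 1) % n] ^ state[(i + 1) % n] for i in range(n))

def evolve_until(t_short, t_diff, max_steps=10_000):
    """Evolve t_short, halting only when the evolved state matches t_diff."""
    state = t_short
    for _ in range(max_steps):
        if state == t_diff:
            return state
        state = step(state)
    return None  # never matched within the step budget

# t_short: a single live cell -- a very short description.
t_short = tuple(1 if i == 8 else 0 for i in range(16))
# Without already possessing t_diff, there is no condition on which to halt:
t_diff = step(step(step(t_short)))
assert evolve_until(t_short, t_diff) == t_diff
```

    The halting test `state == t_diff` is the whole point: the "program" built from the short description cannot output one particular later state without the full description of that state being supplied from outside.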


    -----OLD POST ----


    There is a (contested) claim in physics that information cannot be created or destroyed. While thinking this through, it occurred to me that formulations of this claim often rely on the "scandal of deduction," the idea that deductive reasoning produces no information.

    The problem with this is that the scandal of deduction is, on the face of it, ridiculous. It doesn't appear to be true. J.S. Mill once said of it: "one would have to make some significant advances in philosophy to believe it."

    The "scandal" follows from the fact that, in any deductively valid argument, the conclusion follows from the premises. There is no uncertainty in the process of deterministic deduction, nothing new that must be added to reach our conclusion. However, this implies that all it should take for us to understand all the possible truths of Euclidean geometry is for us to read its axioms once. This seems wrong.

    I wanted to look at this alongside the common claim that the algorithmic complexity (Kolmogorov complexity) of the universe cannot change over time. The claim is that it is not possible for a full description of the universe at some time T1 to be shorter or longer than a full description at some future time T2.

    At first glance this seems wrong. Our universe began in a very low-entropy state, i.e., given a description of the macrostate of the universe, there was a comparatively very small number of possible microstates the universe could have been in. Thus, specifying the entire universe early on appears to require ruling out a much smaller number of possible microstates, which in turn should require less information. Another intuitive way to think of it: early in the universe, conditions everywhere were extremely similar, meaning you could block-encode identical regions to a greater degree.

    But the common refutation of this point goes like this:

    Let’s pretend that we live in a deterministic world for a minute (my next point may or may not apply when you start dealing with quantum uncertainty…I certainly can't come close to intuiting how that math would play out). Given the universe at the beginning of time, it has a certain complexity C. Now let’s say that the complexity of the laws governing the physical world is V; then it seems hard to me to see why the complexity of the universe is not upwardly capped at C+V. Especially when you say that complexity is the question “how easy is it to predict what happens next?”. It seems that knowledge sufficient to replicate the initial universe (C), coupled with knowledge sufficient to know the laws of the universe (V), would be all you need to predict anything you want at any time (T). So although the complexity of the universe might increase in the very beginning of time, it would fairly quickly hit the upper limit once the added complexity matched the complexity of the laws of physics.

    The claim here is that, even if T1 can be described in fewer bits than T2, you can just evolve T1 into T2, thus a description of T1 actually is a description of T2! This implies the scandal of deduction, that nothing new is learned from deterministic computation.

    Here is why I don't think this works. T1 isn't a full description of T2 until it evolves into T2. When it evolves to T2, how do you tell your program to halt and output just T2's description? You can't give such an instruction unless you either already have a description of T2 to match against your evolved universe, or you assume some sort of absolute time and also assume that you already know how many "time steps" must occur between T1 and T2. Otherwise, you can't tell your universe model to "stop evolving" at T2, the time you're supposedly able to describe with T1.

    Not only do we have strong reason to reject the idea of such an absolute time relative to the universe, but there is also the problem that T1 does not contain the information "T2 will be reached in X steps of evolution" within it.

    Evolving the universe forward might allow you to turn T1 into any other time in the universe, but in order to halt the process and output the description of the time you want to describe using T1 you'd need to already have a total description of the time you want so that you can match the two. This is very similar to the Meno Paradox.

    Now, if T1 counts as a complete description of T2 then it seems like a program that iteratively outputs all possible combinations of 1s and 0s up to any length X also counts as a complete description of the universe, provided X is a length large enough to describe the universe. However, it seems ridiculous to say that an iterative string creation program is somehow equivalent to its outputs. No one will be happy if they hire a software engineer to create a specific program and they receive an iterative string generator that would, eventually, produce the ideal program they are looking for.
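    The iterative string generator is easy to write down, which is part of why its alleged equivalence to its outputs seems so odd; a minimal sketch (Python purely for illustration):

```python
from itertools import product

def all_strings_up_to(x):
    """Enumerate every bit string of length 1..x, shortest first."""
    for length in range(1, x + 1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

# The generator "produces" every 4-bit description, including any target...
outputs = list(all_strings_up_to(4))
assert "1011" in outputs

# ...but *selecting* the target from the enumeration takes as much
# information as the target itself: the target's index re-encodes its bits.
assert outputs.index("1011") == 2 + 4 + 8 + 0b1011  # offset past lengths 1-3, then the value
```

    The index needed to pick the desired output out of the enumeration carries exactly the information the generator was supposed to have saved us.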

    (Obviously the other big argument for conservation is that QM evolution is unitary, but seeing as unitary evolution is taken as axiomatic, I don't think this explains much.)
  • hypericin
    1.5k
    However, it seems ridiculous to say that an iterative string creation program is somehow equivalent to its outputs. No one will be happy if they hire a software engineer to create a specific program and they receive an iterative string generator that would, eventually, produce the ideal program they are looking for.Count Timothy von Icarus

    Such a program would be equivalent to its desired output if it were able to distinguish it from all the others. That of course would be the hard part.

    It seems like you are drawing a parallel between this program and:

    Evolving the universe forward might allow you to turn T1 into any other time in the universe, but in order to halt the process and output the description of the time you want to describe using T1 you'd need to already have a total description of the time you want so that you can match the two.Count Timothy von Icarus

    However, this seems like a much weaker argument, in that the difficulty is tractable, unlike with the bit generator. Why not pick a frame of reference, say the center of the universe? You specify a number of time steps (Planck units of time) that this reference point undergoes. Every other frame of reference would deterministically evolve its own time relative to the reference.
  • jgill
    3.6k
    I've thought about this in mathematical research. In a sense, once definitions and rules of procedure are in place one could say that all that deductively follows exists somehow. It's a matter of discovering these outcomes. Of course, new definitions crop up all the time. The ideas of measurement and continuity in Euclidean spaces led to the generalization of metric spaces and point-set topology. These, in turn led to Banach and Hilbert spaces and algebraic topology. It goes on and on.

    Although this line of thinking might be interesting among (analytic?) philosophers, most math researchers don't care. It's challenging enough to speculate and create the deductive schemes.
  • RogueAI
    2.5k
    The claim is that it is not possible for a full description of the universe at some time T1 to be shorter or longer than a full description at some future time T2.Count Timothy von Icarus

    What if there was no consciousness at T1, but consciousness later emerged at T2? Wouldn't the full description at T2, when there is consciousness, be longer than the consciousless description at T1?
  • unenlightened
    8.8k
    There is a (contested) claim in physics that information cannot be created or destroyed.Count Timothy von Icarus

    That seems like a very odd claim. My understanding is that entropy equates to disorder, and disorder increases (in this universe). This immediately results in information increasing, because order implies compressibility. So a claim that information does not increase over time amounts to the denial of thermodynamics. That is not 'a claim in physics'!

    Following on to the relation to deduction, one might say that the axioms of a theory are the compressed form of every deduction that follows from them, but one cannot read the compressed file without decompressing it first. Thus the initial conditions of a deterministic universe 'contain' all the information of every moment of its evolution, and time is the running decompression algorithm that 'expands' the initial equation.
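    The analogy can be made literal with a toy sketch (Python purely as illustration; zlib stands in for "time as the running decompression algorithm"):

```python
import zlib

# Every line below follows mechanically from one rule, like theorems from axioms.
expansion = "\n".join(f"{n} is even" for n in range(0, 10_000, 2)).encode()

# The "axioms": a compressed form much shorter than everything it entails...
compressed = zlib.compress(expansion, 9)
assert len(compressed) < len(expansion) // 2

# ...but unreadable until the decompression has actually been run:
assert zlib.decompress(compressed) == expansion
```

    The compressed blob "contains" every line, yet no line is available until the expansion step is performed.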
  • Pantagruel
    3.3k
    What if there was no consciousness at T1, but consciousness later emerged at T2? Wouldn't the full description at T2, when there is consciousness, be longer than the consciousless description at T1?RogueAI

    Does consciousness of information constitute additional information?
  • RogueAI
    2.5k
    Does consciousness of information constitute additional information?Pantagruel

    Not if one has the belief that mental states are the same as brain states. However, it seems obvious to me that a description of a conscious system (e.g., a working brain) is incomplete unless that description includes "is conscious". What do you think?
  • Pantagruel
    3.3k
    I agree. The notion of a P-zombie is inherently contradictory. Consciousness seems more like a 'degree of freedom' that emerges from brain states.
  • Count Timothy von Icarus
    2k
    Lots of good points here, I'll try to get to them all eventually.



    I thought it might be most immediately applicable to physics. The claim that the algorithmic complexity of a deterministic universe doesn't increase is fairly common as is the claim that a non-deterministic universe would have an increase in information.

    However, such "non-deterministic" universes aren't thought to be entirely random, both because QM isn't random and because there isn't much you can say about a universe that is totally random, given that no state bears any necessary relationship to any others. Generally this "randomness" is bracketed by probabilities assigned to outcomes.


    If your "random" outcomes are discrete (or observably discrete), then you can just create a program that says "starting with the initial conditions at T0, produce every possible universe." That is, you can turn a non-deterministic Turing Machine into a deterministic one by having it brute force its way through all combinatorially possible outcomes provided there are finitely many. This being the case, in what sense could quantum interactions actually create new information? If a program can describe the possible outcomes of all such interactions, and we're allowed to say that a state S1 that evolves into another state S2 is identical with the second state (S1 = S2), then it seems like there shouldn't be a distinction.

    Whereas if the information content of the universe is said to be infinite (it usually is not), then analogies to computation really aren't appropriate in the first place.

    IDK, it seems like a sort of Platonic hangover haunting the sciences, eternal relations and all.



    Thus the initial conditions of a deterministic universe 'contain' all the information of every moment of its evolution, and time is the running decompression algorithm that 'expands' the initial equation.unenlightened

    Right, that's exactly the normal position and what is meant by "conservation of information." It's a Laplace's Demon type conception of the universe. The problem I see is that, in the world of experience, logical consequence doesn't appear to exist eternally, rather it rolls out in a stepwise fashion. My throwing a ball at a window entails that the window will break, but the window doesn't break until the ball reaches it. A description of the ball and window before and after impact aren't the same thing unless you allow for a sort of "outside of time" computation to exist. It seems to me that committed nominalists should reject that sort of conservation of information.
  • Count Timothy von Icarus
    2k
    This seems somewhat related to the (seemingly) paradoxical fact that you can increase the algorithmic complexity of a set by removing elements.

    For example, it's very easy to uniquely define "the natural numbers up to 100,000." However, if you randomly chop out 1/3rd of them the description has to get significantly longer to uniquely specify the list.
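    A rough illustration of this intuition, using off-the-shelf compression as a stand-in for Kolmogorov complexity (an upper bound at best, so this is suggestive rather than a proof):

```python
import random
import zlib

random.seed(0)  # fixed seed so the illustration is reproducible

# Membership mask over {0, ..., 99_999}: byte '1' if the number is in the set.
full_set = b"1" * 100_000  # "the natural numbers up to 100,000": perfectly regular

# Randomly chop out roughly a third of the elements:
subset = "".join(
    "0" if random.random() < 1 / 3 else "1" for _ in range(100_000)
).encode()

full_len = len(zlib.compress(full_set, 9))
subset_len = len(zlib.compress(subset, 9))

# The complete, regular set compresses to almost nothing; the randomly
# punctured set needs far more bits to pin down which elements remain.
assert full_len < 1000 < subset_len
```

    Removing elements at random destroys the regularity that made the short description possible, so the smaller set needs the longer description.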

    Specifying a unique time-like slice of the universe might likewise require more information than specifying all such slices. However, if we accept this, it seems to beg the question re: eternalism and eternally existing truths, since it implies that a description of "all the moments in the universe" is a real thing that physics studies, as opposed to just the current universe being the only actual time-like slice. It seems like a sort of deeply embedded backdoor Platonism.

    If you assume past moments no longer exist and that future ones do not yet exist, i.e., that local becoming is real, then "specifying T2 takes more information than simply specifying all states of the universe, past and future" seems to be equivalent to saying "the universe has gained information."
  • Gnomon
    3.5k
    There is a (contested) claim in physics that information cannot be created or destroyed. While thinking this through, it occurred to me that formulations of this claim often rely on the "scandal of deduction," the idea that deductive reasoning produces no information. . . . At first glance this seems wrong. Our universe began in a very low entropy state, . . .Count Timothy von Icarus
    Since I have no formal training in Philosophy or higher Math, my comments on the notion that "deductive reasoning produces no information", may not be of high quality. To clearly define the ratio of "reasoning" to Information content, could get into some tricky reasoning beyond my limited abilities. But, since my amateur philosophical retirement hobby is based on Quantum & Information theories, I may be able to make some pertinent comments. I hope they are not off-topic.

    First, the quoted phrase above parallels the Second Law of Thermodynamics, and it seems to equate Information with Energy. Indeed, some scientists today do make that equation*1. Since the second law implies that the universe began with a limited amount of Energy/Information, some might infer that maximum Entropy at "Heat Death" would mean Zero Energy & Zero Information : hence Zero Universe. But, is it possible that local energy/information users could import energy from the almost empty vastness of space-time*2, in order to optimize their local throughput of energy & information? That may be how some sci-fi alien species are able to become entrepreneurial space-farers, exploring & conquering less efficient (or less-fortunate) species, such as Earthlings.

    In that case, with the Quality of Information in mind, instead of just Quantity, the value of Information, would be, not just conserved but compounded, like money in an interest-bearing account, or in risky-but-high-yield investments. Relative to the whole universe, increasing Entropy is equivalent to a negative interest rate, such as monetary inflation. Our Cosmos is a closed banking system, but it began with a stock-pile of low entropy-high energy/information. So, the evolution of increasing intelligence (information quality ) could conceivably offset the energy inflation (entropy) of expansion without new inputs*3.

    Therefore, I agree that Deduction shouldn't be scandalous, just because, unlike Induction/Production it doesn't output new information (facts). On the other hand, by testing old information (beliefs), Deduction should produce better quality information. For an evolutionary analogy, animals that developed intelligence, in response to environmental stresses, gradually evolved into new species. Physically, still animals, but metaphysically a new species (homo economicus), that is more efficient in processing information mentally. No longer dependent on teeth & claws, they use imagination & ideas to make their fortune in the world. :smile:

    PS___Energy and Information are not the same thing, but different forms of the same thing : Potential.


    *1. Information converted to energy :
    Toyabe and colleagues have observed this energy-information equivalence . . .
    https://physicsworld.com/a/information-converted-to-energy/

    *2. Vacuum Energy :
    Vacuum energy is an underlying background energy that exists in space throughout the entire Universe.
    https://en.wikipedia.org/wiki/Vacuum_energy

    *3. Weight of Evidence and Information Value :
    In data analysis, Logistic Regression models take as input categorical and numerical data, and output the probability of the occurrence of the event.
    https://www.analyticsvidhya.com/blog/2021/06/understand-weight-of-evidence-and-information-value/
    Note --- Shannon's mathematical definition of Information is based on statistical Probability
  • Gnomon
    3.5k
    There is a (contested) claim in physics that information cannot be created or destroyed. . . . The claim here is that, even if T1 can be described in fewer bits than T2, you can just evolve T1 into T2, thus a description of T1 actually is a description of T2! This implies the scandal of deduction, that nothing new is learned from deterministic computation.Count Timothy von Icarus
    The causal role of information in the world is of interest to me, both scientifically and philosophically. Can you provide a link to a site or publication where that "claim" is made? It might clarify any presumptions behind such an assertion.

    I'd be surprised if materialist/physicalist/deterministic scientists would think in terms of "learning" in a law-limited "deterministic" system*1. However, computer scientists and Information theorists do sometimes use such anthropomorphic terminology metaphorically*2. So, if the "laws of nature" are imagined as a computer program, the universe could conceivably learn, in the same sense that Artificial Intelligence does*3, by means of "non-deterministic algorithms"*4.

    But AI is not natural, and currently requires a natural Programmer to establish the parameters of the system. Would a self-organizing, self-learning world also require the services of a preternatural Programmer to bootstrap the system?

    As I mentioned above, this kind of sci-phi (science/philosophy) is over my head. But I'm learning. Does that mean I was not destined to post on such an abstruse question? :smile:


    *1. What is physics-informed learning? :
    What is physics-informed machine learning? Machine learning is a branch of artificial intelligence and computer science that focuses on the use of data and algorithms that attempt to imitate the function of the human brain, improving in accuracy over time.
    https://www.pnnl.gov/explainer-articles/physics-informed-machine-learning

    *2. Physicists working with Microsoft think the universe is a self-learning computer :
    https://thenextweb.com/news/physicists-working-with-microsoft-think-the-universe-is-a-self-learning-computer

    *3. Causal Determinism :
    Causal determinism is, roughly speaking, the idea that every event is necessitated by antecedent events and conditions together with the laws of nature.
    https://plato.stanford.edu/entries/determinism-causal/

    *4. What is a nondeterministic algorithm? :
    A nondeterministic algorithm is an algorithm that, given a particular input, can produce different outputs. This is in contrast to a deterministic algorithm, which will always produce the same output for a given input. Nondeterministic algorithms are often used in artificial intelligence (AI) applications.
    https://www.aiforanyone.org/glossary/nondeterministic-algorithm
  • Wayfarer
    20.8k
    There is a (contested) claim in physics that information cannot be created or destroyed.Count Timothy von Icarus

    I don't get this. I'm familiar with this principle in respect of energy, but not information. Is there a meaning of ‘information’ here specific to the context?
  • Count Timothy von Icarus
    2k


    "Information" is very tough term because it is defined loads of different ways. I suppose here I should have used "Kolmogorov Complexity," in every instance here. This is a measure of how many bits it takes to describe something (really how many bits a computer program would need to be to produce an output of a full description).

    So, that said, I would think that the "heat death" scenario, where the universe is in thermodynamic equilibrium, would have the greatest complexity, i.e., take the most bits to describe, since a description of its macroproperties rules out the fewest microstates, leaving a maximal number of possible microstates that must be excluded by specifying additional information.



    The context would be algorithmic information theory, or Kolmogorov complexity. The common claim is that the shortest possible description of the universe at any time, T(short), gives you complete information about all states of the universe, because one can simply evolve T(short) into any other state. Thus no new information, in the algorithmic sense, is ever added.

    The problem here, from my perspective, is that T(short) + evolution produces all states, not just a single state of the universe. To use T(short) as a "program" (conceivably the universe evolving itself as a quantum cellular automaton) that outputs a specific different time, T(diff), requires some way to halt the evolution at T(diff) and output the current state; otherwise the evolution just runs forever. This would seem to require that you already have a complete description of T(diff) to match against, comparing your evolved states to the one you want. And since T(diff) could require a much larger description than T(short), such an evolution scheme does NOT entail that the universe doesn't gain algorithmic complexity.

    If we allow that any program that produces an output of x is the shortest description of x, even if it outputs other things, then the shortest description of every item is a simple program that combinatorially enumerates all possible combinations of values for an object encodable in a string of length n. This seems like nonsense.

    I imagine the confusion comes from certain pancomputationalist conceptions of the universe, where the universe is essentially a quantum computer, and thus from the assumption that every moment that exists is an output, so that T0, the start of the universe, entails a unique output for all states. I don't think this is a useful way to use an analogy to computation in nature. If we accept that every system state should be considered an "output," then it follows that very simple closed systems, given infinite time, compute all computable functions, so long as they instantiate any of the basic cellular automata capable of universal computation.

    A related problem is that, even if you have a unique identifier for something, that doesn't mean the identifier actually allows you to construct it. E.g., "the coprimes of the centillionth factorial" uniquely describes a set of integers, but it doesn't give you a means of constructing that set, and even if doing so is straightforward enough, it would take a lot of resources because it's a huge computation.
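    A small-scale analogue (Python purely for illustration, with the primes below 10**6 standing in for the huge example above):

```python
# "The primes below 10**6" is a complete, unique description in a couple
# dozen bytes, but possessing the description is not the same as
# possessing the set: that takes actual computation.

description = "the primes below 10**6"

def primes_below(limit):
    """Sieve of Eratosthenes: do the constructive work the description elides."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytes(len(sieve[p * p::p]))  # cross off multiples
    return [i for i, flag in enumerate(sieve) if flag]

primes = primes_below(10**6)
assert len(primes) == 78498  # the set is fully determined by the description...
# ...yet the constructed object is vastly larger than its name:
assert len(description) < len(str(primes)) // 10_000
```

    The description pins the set down uniquely, but turning the name into the set still costs real computational steps and memory.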

    Thus, it's unclear whether the claim that "deterministic computation cannot create new information" (reduce uncertainty, in this case, usually framed as Shannon information or something similar) makes any sense. If we were given the set of coprimes in the example above, our uncertainty as to the elements' identities would appear to be reduced. Being able to know a set of facts S with 100% certainty given information x plus deduction is not the same thing as knowing S, because real computation doesn't occur in a Platonic realm where all logical consequence exists eternally. Even abstractly, computation should be thought of as requiring steps (time), as a stepwise enumeration of logical consequence, rather than as identical to the relationships it describes.

    Mazur's "When is One Thing Equal to Some Other Thing?" sort of gets at this same thing, but from a different angle and I don't understand category theory so he loses me midway through.

    https://people.math.harvard.edu/~mazur/preprints/when_is_one.pdf
  • Gnomon
    3.5k
    "Information" is very tough term because it is defined loads of different ways. I suppose here I should have used "Kolmogorov Complexity," in every instance here. This is a measure of how many bits it takes to describe something (really how many bits a computer program would need to be to produce an output of a full description).

    So, that said, I would think that the "heat death," scenario, where the universe is in thermodynamic equilibrium, would have the greatest complexity/take the most bits to describe as a description of its macroproperties excludes a maximal number of possible microstates that must be excluded by specifying information.
    Count Timothy von Icarus
    Sorry, "Kolmogorov Complexity"*1 is way over my little pointy head. And, while your comments in the quote may have something to do with the hypothesis that the universe is a computer program, it doesn't address my request*2 for a link to the Second Law assertion.

    The Santa Fe Institute studies physical Complexity from an information theoretic perspective. And they can get into some pretty abstruse word-salads. But my interest is much simpler, and primarily regarding the relationship between Meaningful Information and Causal Energy : i.e. Information defined in terms of Energy. Hence, I'd like to know more about the implications of the Second Law on the Conservation of Information.

    So, I'll ask again : "Can you provide a link to a site or publication where that "claim" is made? It might clarify any presumptions behind such an assertion". I'm really interested in the basis of that claim. Not at all interested in "Kolmogorov Complexity". :smile:


    *1. Kolmogorov complexity :
    The notion of Kolmogorov complexity can be used to state and prove impossibility results akin to Cantor's diagonal argument, Gödel's incompleteness theorem, and Turing's halting problem. In particular, no program P computing a lower bound for each text's Kolmogorov complexity can return a value essentially larger than P's own length (see section § Chaitin's incompleteness theorem); hence no single program can compute the exact Kolmogorov complexity for infinitely many texts.
    https://en.wikipedia.org/wiki/Kolmogorov_complexity
    Note --- "prove impossibility results"???? Not even in the ballpark of my layman dilettante vocabulary.

    *2. I don't know what the reference to "impossible complexity" has to do with this claim :
    "There is a (contested) claim in physics that information cannot be created or destroyed". I was hoping you could point me to the source of that claim, so I could understand its implications : e.g. Metaphysical Information is as fundamental/essential to reality as Physical Energy.