MYSTERIOUS QUANTUM PHYSICS

@mysteriousquantumphysics / mysteriousquantumphysics.tumblr.com

from frustration to a glimpse of understanding and further.

How to Use Quantum Computing as a Tool for Philosophy of Science

Recently, I attended the MCQST 2023 Conference, at which Lídia del Rio presented the research she and her collaborators have done on running quantum thought experiments on a quantum computer. They wrote a whole software package for this and describe the ideas in detail in [1]. The paper and the package are definitely worth checking out - to make you curious, let us look at an illustrative example [1, pp. 4-10].

Example Setting

Let us consider the following setting (as depicted in the image above): Alice has a two-level quantum system R (e.g. a qubit) in the state written in blue. Thus, the probability of obtaining a=0 in a measurement is 1/3, while the result a=1 is obtained with probability 2/3. Depending on the outcome, Bob receives a system S in state |0> (if Alice's result was a=0) or in state |+> ~ |0>+|1> (if Alice's result was a=1). In turn, Bob measures his system in the computational basis and can obtain the outcome b=0 or b=1. What conclusions can Bob draw about Alice's measurement outcome based on his own? It is assumed that Bob knows the rules by which Alice sends him the different systems. If his outcome is b=0, he cannot make any retrodiction, since b=0 could stem from both possible states |0> and |+>. However, if he measures b=1, he knows that his system must have been in |+>, and thus he can retrodict that Alice's outcome must have been a=1. Therefore, in one of the two cases Bob can draw a deterministic conclusion about Alice's outcome.
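
To make the retrodiction argument concrete, here is a small numerical sketch (plain numpy, with variable names of my own choosing - not taken from [1] or its package) that computes the joint outcome distribution and Bob's conditional retrodiction:

```python
import numpy as np

# Alice's qubit R: |psi> = 1/sqrt(3)|0> + sqrt(2/3)|1>
psi = np.array([1 / np.sqrt(3), np.sqrt(2 / 3)])
p_a = np.abs(psi) ** 2                 # P(a=0) = 1/3, P(a=1) = 2/3

# States Bob receives, conditioned on Alice's outcome a
ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
bob_state = {0: ket0, 1: plus}

# Joint distribution P(a, b) for Bob's computational-basis measurement
P = np.zeros((2, 2))
for a in (0, 1):
    for b in (0, 1):
        P[a, b] = p_a[a] * np.abs(bob_state[a][b]) ** 2

# Retrodictions: deterministic for b=1, uninformative for b=0
p_a1_given_b1 = P[1, 1] / P[:, 1].sum()   # = 1.0
p_a1_given_b0 = P[1, 0] / P[:, 0].sum()   # = 0.5, no deterministic conclusion
```

Running this reproduces the argument from the text: conditioned on b=1, Alice's outcome is a=1 with certainty, while b=0 leaves Bob none the wiser.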

So far so good. At this point I'd like to mention that even though this setup seems to be motivated by the Frauchiger-Renner thought experiment, we will not talk about apparent paradoxes or fundamental questions in the foundations of quantum mechanics themselves. Instead, the setting is meant to be easy to grasp and can therefore neatly serve to illustrate how a thought experiment can be formalized in terms of quantum circuits. Hence, we will use a simple example to discuss a tool which can be applied to quantum thought experiments in general.

Alice's and Bob's Brains in a Quantum Circuit

Next, we will translate this specific setting into a quantum circuit - by going through the above illustration of the resulting circuit. The first qubit is initialized in the state of Alice's system R. Even though this seems to be the only true quantum system at hand, we will act as if there were an external observer who looks at both Alice and Bob and their respective systems. Imagine you are in the position of this external observer and set the Heisenberg cut at this point: you are classical, while both Alice and Bob are quantum (as is done in Neo-Copenhagen interpretations). Then one also has to model the "brains"/"memory" of both Alice and Bob. We start with Alice: we assign a wire of the circuit to Alice's reasoning, which basically means that the possible measurement results are stored in this qubit. The wire representing her memory is initialized in state |0> and is connected to her system R via a CNOT gate. This means that if the system R is in state |0>, the qubit representing Alice's memory stays in |0>. However, if R is in |1>, Alice's memory ends up in |1> as well. This way, one can model the different measurement outcomes and Alice's memory in a unitary manner, without explicitly including measurements in the circuit yet. This is necessary since, from our external perspective, everything about Alice and Bob is considered to be quantum, i.e. must be modelled unitarily. Now we can look at the third wire: it is again initialized in state |0> and remains in this state if Alice's memory is in state |0>. However, if Alice's memory is in state |1>, the controlled Hadamard acts on the third wire and turns it into state |+>. Thus the controlled Hadamard models the system S which Bob receives - conditioned on Alice's measurement outcome of R. Finally, the last wire is again initialized in |0> and models Bob's memory. Just as with Alice's memory, the relation between Bob's memory and his system S is modelled via a CNOT gate.
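
As a sanity check, the four-wire circuit just described can be simulated directly with numpy statevectors. This is my own sketch (not code from [1]), assuming the wire ordering R, Alice's memory, S, Bob's memory:

```python
import numpy as np

I = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
ket0 = np.array([1.0, 0.0])

def controlled(U):
    """Two-qubit gate applying U to the target iff the control is |1>."""
    G = np.eye(4, dtype=complex)
    G[2:, 2:] = U
    return G

CNOT = controlled(np.array([[0, 1], [1, 0]]))
CH = controlled(H)

# Wire order (most significant first): q0 = R, q1 = Alice's memory,
# q2 = S, q3 = Bob's memory
r = np.array([1 / np.sqrt(3), np.sqrt(2 / 3)])       # Alice's system R
psi = np.kron(np.kron(np.kron(r, ket0), ket0), ket0)

psi = np.kron(CNOT, np.kron(I, I)) @ psi   # CNOT: R controls Alice's memory
psi = np.kron(I, np.kron(CH, I)) @ psi     # controlled-H: memory controls S
psi = np.kron(I, np.kron(I, CNOT)) @ psi   # CNOT: S controls Bob's memory

# Whenever Bob's memory reads 1, Alice's memory reads 1 as well
probs = np.abs(psi) ** 2
for idx in range(16):
    a_mem, b_mem = (idx >> 2) & 1, idx & 1
    if probs[idx] > 1e-12 and b_mem == 1:
        assert a_mem == 1
```

The nonzero amplitudes are exactly |0000>, |1100> and |1111>, each with probability 1/3 - so b=1 (last bit 1) indeed occurs only together with Alice's memory in |1>.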

As a result, we now have a quantum circuit which represents the setup above from an external perspective. I think one can already see the beauty of this approach at this point - while one needs quite a few sentences to explain the simple setup, it is very easily grasped from the neat quantum circuit. What is left to do is to model Bob's reasoning regarding his retrodiction of Alice's outcome. We found that Bob can draw a deterministic conclusion about Alice's measurement outcome if his outcome is b=1; in the other case he cannot. How can this be mapped into a quantum circuit?

Modelling Bob's reasoning

In the above circuit we added a couple of additional wires. One set represents the four possible logical inferences in this case, and the last two wires will show what Bob's prediction is, based on the initialized inferences. Let's go through this step by step: there are four possible inferences relating the measurement outcomes a and b, but only one of them is assumed to hold, namely (b=1 -> a=1), which is why only the wire corresponding to this inference is initialized in state |1>. The other three inferences, which are assumed not to hold, are initialized in |0>, and since there are control nodes of the nonlocal Toffoli-type gates on those wires, they do not contribute as long as one does not change the initialization. Those Toffoli-type gates each have two control nodes as well as a NOT at the lower end. One control node is placed on Bob's memory qubit, acting conditionally on the outcome b, and represents the antecedent of each possible inference. If the inference of a wire assumes b=1, the corresponding Toffoli node on Bob's memory is black, while it is white for b=0. The second control node of each Toffoli gate is black, so that it is activated according to which inference is initialized in state |1>. The consequent of those inferences is modelled by the lowest two wires: the NOT of each Toffoli is placed on the wire representing the corresponding consequent. Looking at the Toffoli for the inference b=1 -> a=1, one can see that if Bob's memory is in |1> and simultaneously the wire of the corresponding inference is initialized in |1> as well, the NOT on the lowest wire turns the respective state into |1> (the prediction wires are initialized in |0>). Thus, Bob's prediction can be read off from the states of the prediction wires. Finally, one can also run this circuit and check its consistency - how?

Consistency Checks

In the above image we have put together everything we have so far: Alice's actions from before, as well as Bob's actions and his reasoning as discussed right above. The consistency of such a model can be checked by measuring the prediction wires as well as Alice's memory qubit. In this case, the only deterministic inference shows itself when Bob's prediction wire for a=1 is in |1>, and this coincides with Alice's memory being in |1>. In the other case, no inference can be drawn. This way one can check the consistency of the model, and if the results show paradoxical outcomes, one knows that something went wrong - that something in the logical reasoning or the adopted interpretation of quantum theory is problematic. Having everything formalized as a quantum circuit makes the analysis of such issues easier.

Final Remarks

It appears to me that quantum circuits are not used here because one expects some computational advantage from running them on a quantum computer - instead, they are used to neatly formalize subsystems and possible inferences of thought experiments. This way, thought experiments become clearer and more transparent, and it is easier to spot the problem if the outcomes are not consistent. Their work thus shows that quantum circuits have a much broader field of application: it is not only about striving for some kind of quantum advantage for specific decision problems; they can also be used to formalize concepts in the foundations of quantum mechanics. This is something I had never thought about before, which is why I am so fascinated by the idea.

---

References: [1] Nurgalieva, Mathis, del Rio, Renner. Thought experiments in a quantum computer. 2022. arXiv:2209.06236


Quantum Circuit Cutting - with Randomly Applied Channels

Recently, I briefly introduced what circuit cutting is, why it is an advisable thing to do with current quantum hardware (NISQ devices), and what additional costs the cutting causes. However, I did not go into detail on what such a circuit cutting method can look like - this is what this entry will be about. In particular, we will have a look at the circuit cutting procedure proposed by Lowe et al. [1], in which randomly applied channels are able to cut a circuit.

Identity on Cut Circuits

As mentioned in the previous entry, circuit cutting requires finding a proper identity channel on the cut wires which has a reasonably low sampling overhead - the definition of the identity is thus the heart of every circuit cutting procedure. In general, such an identity has the form

Id(ρ) = Σ_i c_i Φ_i(ρ),  with real coefficients c_i,

where Φ_i is some properly chosen quantum channel. The corresponding cost depends on the value

κ = Σ_i |c_i|,

which is the L1 norm of the real coefficients of the identity channel above.

Thus, we see that the main possibility to reduce the sampling overhead is to reduce this value. One possibility with small sampling overhead is the following (it is, however, not minimal: the method described in [2] has a lower overhead, but we will not go into its details).

The identity channel used in [1] looks like

Id = (2d+1) E_z[ (-1)^z Ψ_z ].

Here, d=2^k is the dimension of the subspace governing the k qubits of the cut. The variable z denotes a Bernoulli random variable where z=1 appears with probability d/(2d+1). Later, we will derive this form of the identity channel and see how this probability and also the expectation E_z emerge. Since there are two values of z, there are also two quantum channels which can be applied. The first one, Ψ_0, is a measure-and-prepare channel:

Ψ_0(ρ) = E_U[ Σ_y <y|U^† ρ U|y> U|y><y|U^† ],

where the expectation is over the unitaries U of the design and the sum runs over the computational basis states |y>.

The unitaries which are applied to the state prior to measurement have to form (at least) a unitary 2-design, because otherwise the derivation would not work. Such a design is formed e.g. by the Clifford group, but there are many possibilities; one could also rely on approximate designs. Since the form of a quantum channel is not very pictorial, the following shows what this channel looks like in "circuit language":

This means one applies U^† on the k cut wires, measures in the computational (Z) basis and retrieves a bitstring y. A state in the computational basis corresponding to this bitstring is initialized, and then U is applied. All of this is repeated many times. Note that applying such a circuit in the middle of a larger circuit destroys entanglement of the global state, and this is also the reason why cutting requires a lot of sampling (quantified by the sampling overhead): the effects of entanglement on the final result must somehow be regained by repeating the procedure numerous times.
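
One can also convince oneself numerically of what this averaged channel does. Below is a small numpy experiment of my own (not code from [1]): it averages the measure-and-prepare channel over Haar-random unitaries (which in particular form a 2-design) and compares against the closed form (ρ + Tr[ρ]·I)/(d+1), which one can show is the exact average:

```python
import numpy as np

rng = np.random.default_rng(7)
d = 2  # one cut wire, k = 1

def haar_unitary(d, rng):
    """Haar-random unitary via QR of a complex Ginibre matrix."""
    z = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    ph = np.diag(r) / np.abs(np.diag(r))
    return q * ph           # multiply column j by phase ph[j]

def meas_and_prep(rho, U):
    """Measure in the rotated basis U|y>, then re-prepare the observed state."""
    out = np.zeros_like(rho)
    for y in range(d):
        ket = U[:, y]                          # U|y>
        p = np.real(ket.conj() @ rho @ ket)    # <y|U^dag rho U|y>
        out += p * np.outer(ket, ket.conj())
    return out

# a hypothetical example density matrix crossing the cut
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])

# Monte Carlo average of the channel over Haar-random unitaries
n = 20000
avg = sum(meas_and_prep(rho, haar_unitary(d, rng)) for _ in range(n)) / n

closed_form = (rho + np.trace(rho) * np.eye(d)) / (d + 1)
print(np.max(np.abs(avg - closed_form)))  # small, shrinks with more samples
```

The same average would be obtained by summing over a finite 2-design such as the Clifford group; Haar sampling is just the easiest to write down.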

The other channel, Ψ_1, is simpler. It is the so-called fully depolarizing channel, in which all of the information within the cut (within a sample) is lost:

Ψ_1(ρ) = Tr[ρ] · I/d

In practice, the action on the cut part of the circuit is as follows: first, one measures the k cut wires in the computational basis (and discards the result). Afterwards, one takes a uniformly sampled bitstring x and initializes it on the wires - as in the following circuit snippet:

Guiding through the Derivation

Now that we have settled the definitions, let's go through the derivation of this identity channel! First, we need an equation which we shall not prove, as this would be more involved (it is based on the Haar measure etc.). It is the Werner twirling channel, i.e. the twirl over a unitary 2-design, which projects onto the span of the identity I and the SWAP operator S:

E_U[ (U ⊗ U) X (U^† ⊗ U^†) ] = (Tr[X] - Tr[SX]/d)/(d^2 - 1) · I + (Tr[SX] - Tr[X]/d)/(d^2 - 1) · S

This equation is particularly nice because the right-hand side is much, much simpler than the left-hand side, which requires all of the unitaries in the design. At the same time, the right-hand side can quickly be written out by hand for e.g. d=2. This will come in handy in deriving the identity channel.

The idea of the derivation is to start with the channel Ψ_0, massage it a little to find an expression of Ψ_1 within it, and then massage it a little further to find an expression for the identity channel.

Hence, start with the channel Ψ_0:

A lot is going on here, so let's go through the equalities step by step: from the first to the second line, the only thing happening is that we insert an identity (using completeness of the computational basis). Going to the next line, the two scalar factors are swapped, and the states with index i can be used to rewrite the expression as a trace. By exploiting the properties of tensor products and the trace, one can pull a partial-trace expression out of the sum in the last line. This shape is nice, because we can recognize the left-hand side of the above twirling equation and simplify the underbraced expression:

Since the expression within the sum no longer depends on the index j, this sum merely gives a factor d. In the second line we pull the partial-trace factor into the brackets and recognize the Ψ_1 channel! Additionally, the part with the SWAP operator can be simplified as well, using Tr_1[(X ⊗ I) S] = X (you can easily prove this by checking it with the 4x4 SWAP matrix and a general matrix X). All of this helped us immensely in relating the two channels to each other: we end up with Ψ_0 = (Id + d Ψ_1)/(d+1). Reshaping this equation a little gives us:

Id = (d+1) Ψ_0 - d Ψ_1
   = (2d+1) [ (d+1)/(2d+1) Ψ_0 - d/(2d+1) Ψ_1 ]
   = (2d+1) E_z[ (-1)^z Ψ_z ]

In the second line, we pull a factor outside in order to retrieve the Bernoulli probabilities we defined previously, and then, respecting the additional sign, this can easily be rewritten as an expectation value in the last line.

What about the cost?

Now that we have both defined and derived the identity channel expression, let's relate it to the introductory sentences about the sampling overhead. The value of κ can easily be computed:

κ = (d+1) + d = 2d + 1 = 2^(k+1) + 1

As we can see, the sampling overhead is exponential in the number of cut qubits - a deficit in practice. Even though there are slightly better circuit cutting procedures, the overhead always scales exponentially in the number of cut qubits. Although unfortunate, this makes perfect sense intuitively: the cutting destroys part of the quantum properties of the system, and these must be reproduced classically (by sampling). Almost everything quantum which is simulated classically scales exponentially (since the Hilbert space dimension grows exponentially with the number of particles).

Overall, circuit cutting is an interesting new field in quantum computing which might help to go beyond the capabilities of current NISQ devices - nevertheless, there is always a price to pay, and it will become evident in future research whether circuit cutting becomes a common method or not.

--- References: [1] Angus Lowe, Matija Medvidović, Anthony Hayes, Lee J. O'Riordan, Thomas R. Bromley, Juan Miguel Arrazola, Nathan Killoran. Fast quantum circuit cutting with randomized measurements. 2022. arXiv:2207.14734

[2] Hiroyuki Harada, Kaito Wada, Naoki Yamamoto. Optimal parallel wire cutting without ancilla qubits. arXiv:2303.07340


Cutting Quantum Circuits into Pieces - why and how?

Even though quantum computing is a promising and huge field, it is still at an early stage of development. We know algorithms with a clear advantage over classical algorithms, such as Grover's or Shor's - however, we are far away from implementing those algorithms on real devices for e.g. breaking state-of-the-art RSA encryption.

Today's Possibilities of Quantum Computing

Thus, part of current research is to make use of the kind of quantum computers which are available today: Noisy Intermediate-Scale Quantum (NISQ) devices. They are far away from ideal quantum computers, since they provide only a limited number of qubits, have faulty gate implementations and measurements, and their quantum states decohere rather fast [1]. As a result, algorithms which require large-depth circuits cannot realistically be implemented nowadays. Instead, it is advisable to find out what can be done with the currently available NISQ devices. Good candidates are variational quantum algorithms (VQAs), in which one uses both quantum and classical methods: one constructs a parametrized quantum circuit whose parameters are optimized by a classical optimizer (e.g. COBYLA). These methods include, for instance, the variational quantum eigensolver (VQE), which can be used to find the ground state energy of a Hamiltonian (a problem which is often tackled without quantum computing, i.e. classically with tensor network approaches). Another method is solving QUBO problems with the quantum approximate optimization algorithm (QAOA). These are promising ideas, but one should note that it is not yet clear whether we can obtain quantum advantage with them or not [2].
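
To make the hybrid quantum-classical loop concrete, here is a deliberately tiny toy VQE of my own (not from the references): a one-parameter ansatz RY(θ)|0>, a one-qubit "Hamiltonian" Z with ground-state energy -1, and scipy's COBYLA as the classical optimizer. The circuit is simulated exactly, so this only illustrates the structure of the loop, not any quantum advantage:

```python
import numpy as np
from scipy.optimize import minimize

Z = np.array([[1.0, 0.0], [0.0, -1.0]])  # the "Hamiltonian" to minimize

def ansatz(theta):
    """Statevector of RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def energy(params):
    """Cost function: <psi(theta)| Z |psi(theta)> = cos(theta)."""
    psi = ansatz(params[0])
    return float(psi @ Z @ psi)

# classical outer loop: COBYLA tunes the circuit parameter
result = minimize(energy, x0=[0.1], method="COBYLA")
print(result.fun)  # close to the exact ground-state energy -1
```

On real hardware, `energy` would be estimated from measurement shots of the parametrized circuit instead of computed exactly, but the optimizer loop looks the same.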

Cutting Quantum Circuits

So far, we have learned that current quantum devices are faulty, hence still far away from fault-tolerant quantum computers. Thus, it is preferable to somehow make the quantum circuits of the above-mentioned VQAs smaller. Imagine the case in which you want to use the ibm_cairo system with 27 qubits, but the problem you want to solve requires 50 qubits - what can you do? One prominent idea is to cut the circuit of your algorithm into pieces (in this case, bipartitioning it). How can this be done? As you can imagine, such a task requires sophisticated methods to simulate the quantum behaviour of the large circuit even though one has fewer qubits available. Let's briefly look at how this can be done.

Wire Cutting vs. Gate Cutting

There are different ideas about where to place the cut. In some situations it might be advisable to cut a complicated gate [3, 4]. The more illustrative way is to cut one or more wires of a circuit by implementing a certain decomposition of the identity on the wire(s) to be cut [5, 6]. In general, such a decomposition looks like

Id(ρ) = Σ_i c_i Φ_i(ρ),  c_i real,  Φ_i: L(C^d) -> L(C^d),

where L(C^d) is the space of linear operators on the d-dimensional complex vector space. How should this be understood? For example, in [6] they apply a special case of this identity equation; in a given run of the circuit, only one of these terms (one channel) is applied at a time. This already indicates that cutting requires running the circuit multiple times in order to simulate the identity. This makes sense intuitively, since making a cut somewhere in a circuit makes it necessary to perform a measurement. As a result, some of the entanglement / quantum properties of the circuit are lost. To compensate for this, one has to artificially simulate this quantum behaviour by sampling (running the circuit more often). This so-called sampling overhead can be proven to be

O(κ^2 / ε^2)   with   κ = Σ_i |c_i|,

where ε is the desired accuracy of the estimated expectation value.

This can be derived with the help of defining an unbiased estimator and applying Hoeffding's inequality. A detailed derivation (which holds for general operators, not only for the identity) can be found in appendix E of [3]. The exact sampling cost depends on the explicit decomposition one wants to apply.
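
To see where the κ^2 in the sample count comes from, here is a self-contained quasiprobability-sampling sketch of my own for a single cut wire (d=2), using the decomposition Id = (d+1)Ψ_0 - d·Ψ_1 from [6]; the choice of the three Pauli measurement bases (whose six eigenstates form a single-qubit 2-design) and all names are my own illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2                       # one cut wire
kappa = 2 * d + 1           # L1 norm of coefficients in Id = (d+1) Psi_0 - d Psi_1

rho = np.array([[0.75, 0.25], [0.25, 0.25]])  # hypothetical state entering the cut
O = np.array([[1.0, 0.0], [0.0, -1.0]])       # observable measured after the cut

# Z-, X- and Y-eigenbases: their six eigenstates form a 2-design, which is
# all the measure-and-prepare channel Psi_0 needs
bases = [np.eye(2, dtype=complex),
         np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2),
         np.array([[1, 1], [1j, -1j]], dtype=complex) / np.sqrt(2)]

def sample_psi0(rho):
    """One shot of the measure-and-prepare channel Psi_0."""
    U = bases[rng.integers(3)]
    p = np.real(np.diag(U.conj().T @ rho @ U))   # Born probabilities
    y = rng.choice(d, p=p / p.sum())
    ket = U[:, y]
    return np.outer(ket, ket.conj())

def sample_psi1():
    """One shot of the fully depolarizing channel Psi_1: a random basis state."""
    out = np.zeros((d, d))
    x = rng.integers(d)
    out[x, x] = 1.0
    return out

# Quasiprobability sampling: apply Psi_z with probability |c_z|/kappa and
# weight each shot by kappa * sign(c_z); the estimator is unbiased, but its
# spread grows with kappa, hence the O(kappa^2) sampling overhead.
shots = 40000
total = 0.0
for _ in range(shots):
    if rng.random() < (d + 1) / kappa:
        sigma, sign = sample_psi0(rho), 1.0
    else:
        sigma, sign = sample_psi1(), -1.0
    total += kappa * sign * np.real(np.trace(O @ sigma))
estimate = total / shots

exact = np.real(np.trace(O @ rho))   # cutting should reproduce Tr[O rho]
```

Each shot is bounded by κ in magnitude, so Hoeffding's inequality gives the O(κ^2/ε^2) shot count quoted above.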

Closing remarks

To my knowledge, these circuit cutting schemes only work efficiently in special cases. Often, the cost depends on the size of the cut, i.e. how many wires are cut. Additionally, the original circuit must admit a reasonable partition. In the title picture you can see a mock circuit with five qubits. On the left side of the cut, there are gates which act on the first three qubits (1, 2, 3) only, while on the right side they act only on qubits 3, 4 and 5. Hence, the cut should be placed on the overlap of the two parts, i.e. on the middle qubit (3). The cut size is only one in this case, but in useful applications the cut size might be much larger. Since the cost often depends on the dimension of the cut qubits, it increases exponentially in the cut size (the Hilbert space dimension grows as 2^k for k cut wires).

Thus, we see that circuit cutting can be very powerful for special problem instances, where it can e.g. roughly halve the number of required qubits - this helps make circuits shallower and smaller. However, there are a lot of limitations, given by the set of suitable problem instances and the sampling overhead.

--- References

[1] Marvin Bechtold, Johanna Barzen, Frank Leymann, Alexander Mandl, Julian Obst, Felix Truger, Benjamin Weder. Investigating the effect of circuit cutting in QAOA for the MaxCut problem on NISQ devices. 2023. arXiv:2302.01792

[2] M. Cerezo, Andrew Arrasmith, Ryan Babbush, Simon C. Benjamin, Suguru Endo, Keisuke Fujii, Jarrod R. McClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio, Patrick J. Coles. Variational Quantum Algorithms. 2021. arXiv:2012.09265

[3] Christian Ufrecht, Maniraman Periyasamy, Sebastian Rietsch, Daniel D. Scherer, Axel Plinge, Christopher Mutschler. Cutting multi-control quantum gates with ZX calculus. 2023. arXiv:2302.00387

[4] Kosuke Mitarai, Keisuke Fujii. Constructing a virtual two-qubit gate by sampling single-qubit operations. 2019. arXiv:1909.07534

[5] Tianyi Peng, Aram Harrow, Maris Ozols, Xiaodi Wu. Simulating Large Quantum Circuits on a Small Quantum Computer. 2019. arXiv:1904.00102

[6] Angus Lowe, Matija Medvidović, Anthony Hayes, Lee J. O'Riordan, Thomas R. Bromley, Juan Miguel Arrazola, Nathan Killoran. Fast quantum circuit cutting with randomized measurements. 2022. arXiv:2207.14734


ZX Calculus - Another Perspective on Quantum Circuits. Part II

Last time, we introduced basic definitions and a small set of rules of the ZX calculus. While our aim is to analyze the Bell circuit in terms of this framework, you can find more sophisticated examples in [1, pp. 28]. For the Bell circuit we only need one further ingredient:

Cups and Caps

Cups and caps are the ZX-type representations of the Bell state |Φ^+>. As you surely know, this state "lives" in a four-dimensional Hilbert space and can be represented as a vector with four entries - in the ZX calculus this means:

In more complicated circuits it is neat to know that this Bell state actually acts as a bent piece of wire, which introduces a lot of flexibility into one's modifications of an expression. The cups and caps are merely vectorizations of the 2x2 identity matrix.

Application to the Bell Circuit

A brief reminder about the Bell circuit: it just applies a Hadamard and a CNOT to the input qubits. The outcome is supposed to be the Bell state |Φ^+>, i.e. a cup, as described above.

First, start by translating the circuit into ZX-language, by using the definitions we found in the previous entry. The circuit becomes:

Here, we simply expressed the |0> vectors as grey dots on the left, then applied a Hadamard on the first qubit and afterwards a CNOT. Application of the fusion rule to the two grey dots at the bottom yields:

Then, we apply the Hadamard on the grey dot (|0>) which changes its color:

Thus, we can again fuse two dots, in this case the two white dots above:

Then, we know that dots with a single incoming and a single outgoing leg are actually just identities! As a result, our expression simplifies:

And this is exactly the cup we desired! Translating the circuit into ZX-language and applying the rules led us to the result that we have a Bell state in the end. Of course, one could have evaluated this circuit easily by hand with the help of the matrix representations of the gates - nevertheless, I think it is a neat example of the simplicity and beauty of the ZX calculus. Check out [1] for more sophisticated examples!
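
If you want to double-check the diagrammatic result against plain linear algebra, a few lines of numpy (my own sketch) confirm that the Bell circuit produces the vectorized 2x2 identity, i.e. the cup, with the usual 1/sqrt(2) normalization:

```python
import numpy as np

I = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])
ket00 = np.array([1.0, 0.0, 0.0, 0.0])

bell = CNOT @ np.kron(H, I) @ ket00       # the Bell circuit applied to |00>
cup = np.eye(2).reshape(4) / np.sqrt(2)   # vectorized identity: (|00> + |11>)/sqrt(2)

print(np.allclose(bell, cup))  # True
```

The matrix calculation and the diagrammatic rewriting agree, as they must.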

Conclusion

Similar to tensor networks in general, the ZX calculus is a neat and beautiful framework which gives rise to a rich variety of applications - even though the two closely resemble each other, each is specifically tailored to different applications. A nice property of the ZX calculus is that it is universal: it can represent all 2^n x 2^m matrices, while simultaneously being a very intuitive and pictorial description [1, p. 18]. As a final note: if you are familiar with condensed matter and tensor networks, you know that the AKLT state is of particular importance. It can also be described with the help of the ZX calculus, and the framework is able to reveal its interesting properties, e.g. the string order [2].

--- References: The ZX graphics were created with tikzit.github.io. Furthermore, you can find a lot of valuable information on zxcalculus.com. [1] ZX-calculus for the working quantum computer scientist - Wetering. 2020. arXiv:2012.13966 [2] AKLT-States as ZX-Diagrams: Diagrammatic Reasoning for Quantum States - East, Wetering, Chancellor, Grushin. 2021. doi.org/10.1103/PRXQuantum.3.010302


ZX Calculus - Another Perspective on Quantum Circuits. Part I

Recently, I stumbled across a tensor network-type framework which was completely new to me - the ZX calculus. The ZX calculus is not only a neat way of representing possibly complicated mathematical equations, it also gives explicit rules to alter and simplify those expressions. It is particularly suited to describing matters in quantum information, which is why I'd like to provide a neat example of how to use this framework. As you might already know, quantum circuits can be fully analysed and understood with the help of tensor networks (actually, they are tensor networks) [1]. However, the ZX calculus is a specific framework which gives a very illustrative, graphical way of understanding quantum circuits, while the typical tensor network approaches are mostly tailored to many-body problems. All of the following is taken from [2], a very comprehensive introduction to the ZX calculus, and I fully recommend going through this paper if the following glimpse into the topic makes you curious.

In the following we will set up the very basic set of definitions and rules in order to understand how to evaluate the outcome of the well-known Bell circuit which creates a maximally entangled Bell state:

Basic Definitions: Spiders and Vectors

The most fundamental definition in the ZX calculus is the spider. The Z-spider has n inputs and m outputs and is defined as follows:

Z(α) = |0...0><0...0| + e^{iα} |1...1><1...1|,

with m-fold kets and n-fold bras. Thus, such a spider is simply a way of representing a specific kind of 2^n x 2^m matrix. Here, |0> and |1> denote the basis states of the Pauli Z operator. Similarly, an X-spider can be defined in terms of another basis, the eigenstates of the Pauli X operator, |+> and |->:

X(α) = |+...+><+...+| + e^{iα} |-...-><-...-|

Thus, the color of the dot encodes information about the basis. The usage of the basis states of both Pauli X and Pauli Z is eponymous for the ZX calculus. One could have chosen the Pauli Y basis as well, but the choice of X and Z results in nice symmetry properties [2, p. 22]. From this, we can already conclude the first identity which we will need to evaluate the Bell circuit: set n=m=1 as well as α=0. With these parameters, the spiders become plain 2x2 identity matrices (just look at the definitions!). Since α=0 is denoted by an empty dot, this observation can be represented as:

Thus, as soon as we encounter single, plain dots with one incoming and one outgoing leg, we can remove them. Additionally, we need to know how to represent simple basis vectors in this diagrammatic language. This is simply done by using dots with a single leg, together with the following simple consideration based on the definitions of the spiders:

Of course, one can also describe the |-> and |1> states - just set α=π. Note that we omit global phases here; thus, using a simple equality sign is actually a delicate matter.

The Hadamard Gate

The Hadamard gate is a unitary gate which simply transforms between the X and Z bases; e.g. applying the Hadamard gate to a |0> state results in |+>. Its graphical representation is just a plain box with one incoming and one outgoing leg - its action on the basis vectors is as follows:

Actually, this is a special case of the more general rule that the application of Hadamards changes colors as follows:

This of course also holds if the colors are reversed. In general, all ZX rules hold under coherent exchange of colors.

The CNOT Gate

Another central gate in quantum computing is the CNOT gate, i.e. a controlled NOT gate: the target qubit is flipped only if the control qubit is |1>; otherwise nothing happens. This two-qubit gate can be represented as

The equality sign should be taken with care as well, because the left-hand side is in quantum circuit notation, while the right-hand side is in ZX calculus notation. Its construction is explained explicitly in [2, pp. 11]. Since it is a bit lengthy to go through by representing the diagrams as matrices, I leave it to you to check the reference in case you are interested.

The Fusion Rule

In general, it is possible to "fuse" dots of the same color while adding their phases. Note that the addition is mod 2π, because α and β appear in the exponents of e.

Later, we will only use a special case of this, namely that we can fuse dots of the same color which are connected by one line.
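
Since the spiders are, in the end, just matrices, this special case of the fusion rule can be verified directly in numpy (a sketch with my own function names, following the spider definitions from above):

```python
import numpy as np

def z_spider(n, m, alpha):
    """Z-spider with n inputs, m outputs: |0..0><0..0| + e^{i alpha}|1..1><1..1|."""
    ket0 = np.zeros(2 ** m); ket0[0] = 1    # |0...0> (m-fold)
    ket1 = np.zeros(2 ** m); ket1[-1] = 1   # |1...1> (m-fold)
    bra0 = np.zeros(2 ** n); bra0[0] = 1    # <0...0| (n-fold)
    bra1 = np.zeros(2 ** n); bra1[-1] = 1   # <1...1| (n-fold)
    return np.outer(ket0, bra0) + np.exp(1j * alpha) * np.outer(ket1, bra1)

# Fusion along one connecting wire: composing two 1-1 spiders adds the phases
a, b = 0.7, 1.9
fused = z_spider(1, 1, a) @ z_spider(1, 1, b)
print(np.allclose(fused, z_spider(1, 1, a + b)))   # True

# and alpha = 0 with one input and one output is the plain identity wire
print(np.allclose(z_spider(1, 1, 0.0), np.eye(2)))  # True
```

The same check works for X-spiders after swapping the computational basis for |+>, |->, since all ZX rules hold under exchange of colors.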

Now we have finally settled the rough framework for analyzing the Bell circuit, which we will do in the next part!

--- References: The ZX graphics were created with tikzit.github.io. Furthermore, you can find a lot of valuable information on zxcalculus.com. [1] Tensor Networks in a Nutshell - Biamonte, Bergholm. 2017. arXiv:1708.00006 [2] ZX-calculus for the working quantum computer scientist - Wetering. 2020. arXiv:2012.13966


Studying in China Remotely from Germany - Some Experiences

As I mentioned in a previous entry, I studied at Tsinghua University in Beijing this past winter term. Unfortunately, I could not enter the country due to the Covid restrictions that were still in place when the semester started - nevertheless, I thought it might be valuable to share some experiences.

Studying Remotely

A frustrating part of the experience was that the exchange semester - which I had started organizing in summer 2021 - could not take place in person. The exchange semester started in September 2022, and the information that exchange students could not enter China was sent out at the end of June 2022. Thus, it was only roughly two months before the semester started that I knew for sure I would not be studying on the Tsinghua campus. This was unfortunate, because I had been thinking about cancelling the exchange, but it was too late to organize something else in Germany, such as an internship. I could have expected that it would not work out, but somehow I kept up some naive optimism until I knew for sure. Hence, after some consideration (Tsinghua expected a response about one week after they sent the notice), I decided to do the exchange semester nevertheless - even though this meant not having access to most of the experiences that make an exchange semester worthwhile, and spending another semester mostly at home, despite there being nearly no Covid restrictions left in Germany. Back then, I was at least happy that I could avoid the risk of ending up in a harsh Chinese lockdown - then the opposite happened: China gave up most of its Covid regulations. I'm happy about that, and I hope that future exchange students will be luckier than me in this respect.


Data Science meets the Many Body Problem

Since the machine learning course I took this semester at Tsinghua University was mainly focused on typical data science applications, I was curious to what extent those methods can be applied in physics. Of course, it is nothing new that neural networks can in principle also be used for physical applications - however, tensor network methods still seem to be dominant in the field of numerical many-body physics. Thus, I decided to dive a little into the literature on the usage of Restricted Boltzmann Machines (RBMs) in many-body physics.

What are RBMs?

RBMs are typically used for tasks such as recommendation (e.g. video recommendations on video platforms), among many others. In general, an RBM is an unsupervised learning technique based on minimizing its "energy". Thus, the intuition behind RBMs is, despite their data science applications, already related to physics: we will see that it is no surprise that "Boltzmann" is part of the method's name. An RBM takes input data, tries to extract meaningful features from it, and aims to find the probability distribution over the input. The physical intuition goes as follows: the RBM is a neural network with two layers, a visible and a hidden layer, where each node can take binary values. An example network looks like:

where the x denote the visible nodes, the h the hidden nodes, and W denotes the weights between the two layers. Note that there are no links between the nodes within a single layer; this is why these networks are called restricted Boltzmann machines.

The network is governed by a corresponding energy function as:

where we also have the offsets of the single nodes (a for the visible nodes and b for the hidden nodes). Given this energy function, one can determine the probability distribution via the Boltzmann distribution, where Z is a partition function, as familiar from classical statistical physics. As usual, the energy is to be minimized, which is done by the RBM's learning algorithm - explaining it in detail would go beyond the scope of a brief blog entry. At this point I'd only like to mention that there are some difficulties in determining e.g. the partition function (which is intractable in general) and that this requires some sophisticated algorithms. If you're interested in how RBMs work exactly, a neat and far more rigorous introduction can be found here.
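To make the energy-based picture concrete, here is a minimal numerical sketch of an RBM's energy function and the resulting Boltzmann probability. The network sizes and random weights are toy choices of mine, and the brute-force partition function is only feasible at these tiny sizes (which is exactly the intractability mentioned above):

```python
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 6, 4  # toy sizes, chosen only for illustration
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # weights between layers
a = np.zeros(n_visible)  # offsets of the visible nodes
b = np.zeros(n_hidden)   # offsets of the hidden nodes

def energy(x, h):
    """E(x, h) = -a.x - b.h - x^T W h for binary node vectors x and h."""
    return -a @ x - b @ h - x @ W @ h

def partition_function():
    """Brute-force Z = sum over all (x, h) of exp(-E); intractable beyond toy sizes."""
    Z = 0.0
    for i in range(2 ** n_visible):
        x = np.array([(i >> k) & 1 for k in range(n_visible)])
        for j in range(2 ** n_hidden):
            h = np.array([(j >> k) & 1 for k in range(n_hidden)])
            Z += np.exp(-energy(x, h))
    return Z

Z = partition_function()
x = rng.integers(0, 2, n_visible)  # one random binary configuration
h = rng.integers(0, 2, n_hidden)
p = np.exp(-energy(x, h)) / Z  # its Boltzmann probability
print(Z, p)
```

Real RBM training avoids this exhaustive sum entirely (e.g. via contrastive divergence); the brute force here is only to make the definition of Z tangible.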

One side note at this point: RBMs were introduced by Geoffrey Hinton after John Hopfield (a physicist) invented the so-called Hopfield networks, which are also energy-based models built on the physical intuition of Ising models.

Note that so far we have only talked about RBMs as they are used in data science - despite their physical intuition, they have had nothing to do with either quantum mechanics or the many-body problem. This is what comes next.

How can this be linked to condensed matter?

As introduced in [1], an RBM that can represent a quantum many body state would look like this:

In comparison to the previous network, we changed the labels from x to σ, where the σ's denote e.g. spin-1/2 configurations, bosonic occupation numbers, and so on. For these one has to choose a basis, e.g. the σ^z basis; a configuration can be summarized in the set S. Hence, the visible nodes are the N physical sites of the system. The M hidden nodes h play the role of auxiliary (spin) variables. The authors describe the understanding of such a neural-network quantum state as follows: "The many-body wave function is a mapping of the N−dimensional set S to (exponentially many) complex numbers which fully specify the amplitude and the phase of the quantum state. The point of view we take here is to interpret the wave function as a computational black box which, given an input manybody configuration S, returns a phase and an amplitude according to Ψ(S)" [1, p.2]. Thus, one gives a certain spin configuration as input and the RBM returns the corresponding amplitude, in the following form:

Thus, one can recognize that such a neural-network quantum state adopts a similar form to the aforementioned Boltzmann distribution (an exponential of the energy function). However, there is additionally a sum over all possible hidden configurations, which specifies the full state. After setting up a state in this form, one aim could be to find the ground state corresponding to a certain Hamiltonian - and according to the authors of [1], their RBM method gives decent results for this task!
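As a sketch of how such a neural-network quantum state is evaluated: since there are no intra-layer links, the sum over the binary hidden configurations factorizes into a product of cosh terms (as in [1]). The sizes and small random complex parameters below are arbitrary placeholders, not trained values:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 8, 16  # number of physical and hidden units (toy choice)
a = rng.normal(scale=0.01, size=N) + 1j * rng.normal(scale=0.01, size=N)
b = rng.normal(scale=0.01, size=M) + 1j * rng.normal(scale=0.01, size=M)
W = rng.normal(scale=0.01, size=(M, N)) + 1j * rng.normal(scale=0.01, size=(M, N))

def psi(sigma):
    """Amplitude Psi(S) for a spin configuration sigma (entries +1/-1).

    The sum over the binary hidden variables is carried out analytically:
    Psi(S) = exp(sum_i a_i s_i) * prod_j 2 cosh(b_j + sum_i W_ji s_i).
    """
    theta = b + W @ sigma
    return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))

sigma = rng.choice([-1, 1], size=N)  # one configuration in the sigma^z basis
amp = psi(sigma)
print(amp)  # one complex amplitude of the many-body state
```

A variational ground-state search as in [1] would then optimize a, b, W to minimize the energy expectation value, sampling configurations instead of enumerating them.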

Similarity to tensor networks

Interestingly, this framework (even though it appears very different) has quantities analogous to those of tensor network states. For example, the representational quality of a neural-network quantum state can be increased by increasing the number of hidden nodes: thus, the ratio M/N plays a similar role to the bond dimension of a matrix product state! There are many more similarities, which will not be discussed here but can be found in [3].

Nevertheless, I'd like to mention an important distinction, which is also crucial for tensor network states, because some algorithms (DMRG etc.) can only handle area-law states properly. While volume-law states have an entanglement entropy which scales with the volume of the partitions of a state, an area-law state has entropy scaling only with the area of the cut. Area-law states can be handled better numerically, because the bond dimension of tensor network states explodes for volume-law states (more on tensor networks and the area law can be found in [4]). According to [2], the difference between area-law and volume-law states can be captured in a neat way with RBM states: while volume-law states must have full connections between the hidden and physical nodes, an area-law state has fewer links - this imposes locality in a sense. RBM states thus give a neat intuition for the difference between the two kinds of states.

All in all, RBMs seem to be an interesting approach connecting data science methods and many-body physics. They may have strengths which the usual tensor network approaches lack: for instance, the authors of [2, p.888] suggest that RBMs might be able to handle volume-law states better than the usual tensor network approaches do, which would of course be a major benefit. Since I hadn't heard of this approach within the condensed matter framework before, I'm very curious how the importance of this method will evolve in future research!

--- References:

[1] Carleo, Troyer, Solving the Quantum Many-Body Problem with Artificial Neural Networks, arXiv:1606.02318

[2] Melko, Carleo, Carrasquilla, Cirac, Restricted Boltzmann machines in quantum physics, https://doi.org/10.1038/s41567-019-0545-1

[3] Chen, Cheng, Xie, Wang, Xiang, Equivalence of restricted Boltzmann machines and tensor network states, arXiv:1701.04831

[4] Hauschild, Pollmann, Efficient numerical simulations with Tensor Networks: Tensor Network Python (TeNPy), arXiv:1805.00055


Hi, sorry for being so inactive at the moment! This winter term I'm pursuing an online exchange at Tsinghua University, and since my subjects there are related more to computer science than to physics, it has been difficult so far to find enough time to dive into nice physics topics suitable for this blog! But I hope I'll be more active again soon! :)


Darwinian Evolution as thermodynamics in disguise?

As you know, biology is definitely not my area of expertise, but I recently stumbled across some papers which caught my interest. Even though biology is supposed to have laws acting on a completely different "layer of complexity" than physics, some authors ([1, 2, 3] and the references therein) suggest that it might be possible to formalize at least evolutionary processes in a manner similar to thermodynamics. In the following I note some aspects of these papers which make this idea intriguing, but I wholeheartedly suggest that you check out the references themselves!

Just as thermodynamics is only applicable in the so-called thermodynamic limit, the same holds true in the suggested thermodynamic description of evolution [2]: it requires a large number of particles/organisms for the thermodynamic description to be valid. Just as considering only small numbers of particles makes the thermodynamic description problematic, since fluctuations and random processes cannot be neglected, the same holds for populations which are too small - the behaviour of populations can only be described faithfully on a large enough scale.

Two Opposing Principles: Maximum Entropy and Learning

As is well known, the principle of maximum entropy is of major importance in physics. It states that the probability distribution of an ensemble must be such that the entropy is maximized under appropriate constraints. This means that one chooses the configuration of a system which is compatible with the prior knowledge about it and maximizes the entropy. This becomes intuitive if one regards a gas of molecules in a box: the configuration of the system is most likely to be an even distribution of molecules throughout the box (high entropy), while having all molecules in one half of the box (lower entropy) will not happen in reality. However, organisms work differently. They do not only increase the entropy but also decrease it, i.e. they organize/learn: the authors of [2] suggest that it is necessary to formulate an opposing principle, the "second law of learning", where learning can be formalized in the sense of neural networks [3]. The important point is that the learning law decreases the entropy (increases the "order" of the system) while the principle of maximum entropy strives to increase it - hence, evolutionary processes might be described by an interplay of these two opposing principles.

By defining an appropriate loss function (within the neural networks picture) one can maximize the entropy via Lagrange multipliers and derive a partition function Z, which is central in thermodynamics (for the details of the derivation please check [2], pp. 2-3). If you have taken a thermodynamics/statistical mechanics course before, you know that once the partition function is known, one can derive many other useful quantities, such as the free energy F. In the proposed picture of using thermodynamics to describe evolutionary processes, the partition function also has a definite interpretation: "Z represents macroscopic fitness or the sum over all possible fitness values for a given organism, that is, over all genome sequences that are compatible with survival in a given environment, whereas F represents the adaptation potential of the organism." [2, p.3]. For me, it is pretty intriguing that it seems possible to reformulate such familiar notions from thermodynamics to describe evolutionary processes.
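The generic maximum-entropy machinery behind this can be sketched in a few lines. Note that this illustrates only the standard textbook construction (Boltzmann distribution, partition function, free energy), not the specific loss function of [2], and the toy "fitness" spectrum is made up:

```python
import numpy as np

# Toy spectrum: "energies" of a handful of states (arbitrary numbers).
E = np.array([0.0, 0.5, 1.0, 2.0])
beta = 1.3  # inverse "temperature", also an arbitrary choice

Z = np.sum(np.exp(-beta * E))  # partition function
p = np.exp(-beta * E) / Z      # the maximum-entropy (Boltzmann) distribution
S = -np.sum(p * np.log(p))     # Gibbs-Shannon entropy of that distribution
F = -np.log(Z) / beta          # free energy F = -T ln Z, with T = 1/beta
E_mean = np.sum(p * E)         # average "energy" under p

# The familiar thermodynamic identity F = <E> - T S holds exactly:
print(F, E_mean - S / beta)
```

Both printed values agree, which is precisely why knowing Z gives access to the other thermodynamic quantities mentioned above.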

Major Transitions in Evolution as Phase Transitions

Moreover, the authors describe that so-called major transitions in evolution (such as the formation of cells out of pre-cellular life, the origin of eukaryotes, etc.) can be described as phase transitions. We have talked about phase transitions before, as well as their major relevance in e.g. condensed matter physics - hence, it is intriguing and surprising to apply this concept to evolutionary processes. It is possible to define a quantity called "evolutionary temperature" which can be tuned, and at a certain critical "evolutionary temperature" the phase transition takes place. The phase transition is considered to be of first order, and the authors note: "To describe phase transitions, we have to consider the system moving from one learning equilibrium (that is, a saddle point on the free energy landscape) to another." [2, p.4]

The proposal is phenomenological, not fundamental

As a closing remark, it should be noted that lifeless matter can also exhibit some features of life: e.g. crystals grow, and some behaviour of glasses can very roughly be brought into relation with some phenomena of living organisms [1, p.4].

Despite all this fascinating formalism, it is important to note that these formulas will not be an explanation of evolution on a fundamental level. Of course, life is a far richer phenomenon which cannot possibly be explained by the straightforward formulas of thermodynamics. Katsnelson, Wolf and Koonin also note that "Biological entities and their evolution do not simply follow the ‘more is different’ principle but, in some respects, appear to be qualitatively different from nonbiological phenomena, indicative of distinct forms of emergence that require new physical theory." [1] However, it is important to be aware that the aforementioned approach does not even claim to be a fundamental explanation - instead, it is a phenomenological way of describing the observed phenomena. Just as thermodynamics within the realm of physics is not a fundamental theory: it is phenomenological, and yet it has the power to explain a very wide range of phenomena.

---

Bibliography


How iMPS Can Shed Light on the Distinction between Symmetry Protected Topological Phases. Part II

Previously we introduced some notions from representation theory and how infinite chains can be usefully represented via infinite matrix product states (iMPS). Now comes the actually exciting part: we connect both pieces, symmetries and iMPS. First, we check how symmetries, i.e. their (projective) representations, act on an iMPS.

Symmetries in iMPS

Given a system which is invariant under a certain internal symmetry with matrix representation Σ, the corresponding transformation of the Γ tensors must leave the whole iMPS invariant. This action works as

at a single Γ tensor. The diagrammatic representation looks as follows:

One can see that there is only a single contracted leg on the left hand side which is the reason why we only sum over j' in Equ. (3).

The elements of the symmetry group are denoted by g (omitted in the diagrammatic representation). Hence, by acting on one site with the symmetry representation for every element g of the symmetry group, one obtains a set of phases e^(i θ_g) and a set of unitary matrices U_g. The phases form a 1D representation (character) of the symmetry group, while the U_g form a projective representation, i.e. their homomorphism property is extended by a phase, as introduced in the previous entry. The resulting factor set can distinguish between different symmetry protected topological phases of the system. How can one practically retrieve the U matrices? By applying iTEBD or iDMRG one can obtain the ground state of a given system, provided as an iMPS in canonical form. Then, one knows that transforming the state by the symmetry should leave it invariant. Thus, the overlap between the original and the transformed state has to remain <ψ|ψ'>=1. This inner product consists of several "generalized" transfer matrices:

Due to the required normalization, the largest eigenvalue of this generalized transfer matrix must be η=1, otherwise the state is not invariant under the given symmetry. We denote the corresponding eigenvector X_αα' (which is actually a matrix but can be shaped into a vector by vectorization). This gives us the anticipated U matrix, which is related to the eigenvector as

which can be explained by how the symmetry representation Σ acts on the Γ tensors as described in Equ. (3). Due to the fact that the U are unitary, only one U^† will effectively remain in the expression. This sole U^† can then be related to the eigenvector X from the fact that the iMPS is in canonical form. A diagrammatic representation of this computation works as follows:

First, we have a fraction of the generalized transfer matrix (Equ. (4)). Afterwards, we apply Equ. (3) and this yields:

From the fact that the U are unitary and commute with Λ, one can simplify the expression as:

The remaining U^† can be related to the left eigenvector of the transfer matrix via Equ. (5). How this helps us to numerically distinguish the phases will be discussed in the next section, where we consider a specific example.

Application to the Spin-1 Chain

Let us consider a spin-1 system evolving under the Hamiltonian

which gives rise to a symmetry protected topological phase. Note that the first term describes the usual Heisenberg model, while the second is an anisotropy of the system. Even though one can study more symmetries of this system, let us focus on the Z_2 x Z_2 symmetry. The representations Σ are given as

Previously, we mentioned that the factor set of a projective representation can be linked to a distinction of topological phases. In the present case this works as follows. Note that the Z_2 x Z_2 symmetry is abelian, i.e. the group elements commute. For a general projective representation U with group elements g and h, this means:

where we used the homomorphism property of the projective representation twice.

Applying this to the present Hamiltonian, one ends up with two different values of the quantity 𝒪 which can distinguish the phases:

Here, ϕ can either be 0 or π, leading to O=+1 or O=-1.

Having said this, the practical, numerical method to find the phases with respect to the Z_2 x Z_2 symmetry in the present system is the following: first, one implements the Hamiltonian such that it suits iTEBD. Then, one finds the corresponding ground state and applies the symmetry matrices R_x and R_y in order to obtain the overlap in the form of the generalized transfer matrices. Afterwards, one takes one of the transfer matrices and diagonalizes it (via sparse matrix methods, e.g. Lanczos). Having implemented this, one can check for which parameters D and J the largest eigenvalue is indeed 1 (if it is smaller than 1, the symmetry is not conserved in this phase, i.e. O=0). After this step, one can find U_x and U_y via Equ. (5), determine the phase as described above, and find two distinct symmetry-preserving phases: a trivial one, O=+1, and a topological, so-called Haldane phase, O=-1.
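The very last step can be sketched in a few lines of Python. One common way to extract the commutator phase ϕ is the quantity O = (1/χ) tr(U_x U_y U_x† U_y†); the U matrices below are plugged in by hand (identities for the trivial phase, Pauli matrices for the Haldane phase) rather than extracted from an actual iTEBD ground state:

```python
import numpy as np

def phase_indicator(Ux, Uy):
    """O = (1/chi) tr(Ux Uy Ux^dag Uy^dag): the commutator phase of the
    projective representation. O=+1: trivial phase, O=-1: Haldane phase."""
    chi = Ux.shape[0]
    return np.trace(Ux @ Uy @ Ux.conj().T @ Uy.conj().T).real / chi

# Trivial phase: the U's commute (a linear representation), e.g. identities.
print(phase_indicator(np.eye(2), np.eye(2)))  # -> 1.0

# Haldane phase: the U's anticommute, e.g. Pauli matrices acting on a
# chi=2 block of the entanglement spectrum.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
print(phase_indicator(sx, sy))  # -> -1.0
```

In a full implementation, Ux and Uy would come out of the eigenvector equation (Equ. (5)) for the generalized transfer matrices of the iTEBD ground state.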

Now, the overall result we achieved is that we found values which are capable of characterizing a topological phase. This is important, since topological phases do not obey the Landau paradigm and can therefore not be distinguished by spontaneous symmetry breaking. Hence, it is remarkable that one can find a way to distinguish these new phases by combining representation theory and MPS methods. There are even more methods to distinguish the symmetry protected phases in such 1D chains - in case you're interested, have a look at the remainder of [1].

---

References


How iMPS Can Shed Light on the Distinction between Symmetry Protected Topological Phases. Part I

Since I attended a class about topology in condensed matter theory this semester I thought it might be interesting to share some insights from a related paper I read [1]. It combines group/representation theory, numerical methods and the aforementioned topology in condensed matter. Hence, it might not be very accessible if you haven't heard about these things before, but maybe it might be interesting to those who have a more involved background in physics.

Phase Transitions, Symmetry Breaking and Landau Paradigm

Since the discovery of topological phases in condensed matter physics, it has become a central task of current research to find ways of classifying and distinguishing these new phases of matter exactly. They show behaviour which drastically renewed the understanding of phase transitions: while the Landau paradigm dictated that a phase transition is intrinsically related to spontaneous symmetry breaking, topological phases do not require any breaking of a global symmetry of the system per se. Even if one encounters a system which has topological phases but does not obey any symmetry, e.g. Chern insulators, the phase transitions are not triggered by symmetry breaking in Landau's sense. Diametrically opposed to spontaneous symmetry breaking, the subclass of symmetry protected topological (SPT) phases ultimately needs certain symmetries to be present - otherwise the interesting phases vanish. But if there is no symmetry breaking, one cannot define a local order parameter which characterizes the different phases of the system - hence, topological phases require new ways of finding properties that change at the phase transition, i.e. at the quantum critical point. This search is crucial if one attempts to distinguish different topological phases - in the following, symmetry protected topological phases will be considered. One possible way is to find nonlocal order parameters (e.g. the string order in the AKLT state), but it is also possible to make use of projective representations and tools from computational physics: if the symmetries of the system are known, it is possible to determine the projective representations of the respective symmetries from the system's (matrix product) state, which gives rise to a distinction of different symmetry protected phases.

Recap of Representation Theory

Since some notions of representation theory are crucial for the following discussion it is advisable to briefly recap/introduce them.

Given two groups G and G', a group homomorphism f is a map

such that

where "·" denotes the multiplication in G and "⊙'' in G'. A (linear) representation of a (symmetry) group G is a vector space V together with a group homomorphism f, where

Here, Aut(V) is the automorphism group of V. Intuitively this means that the representation f implements the structure of the group G on the vector space V. A slight adaption of this is the so-called projective representation. One deals with such a representation if the group homomorphism allows an additional phase, i.e.

The entirety of the phases ρ_(g_i, g_j) is called factor set.

Later on, a representation will be denoted by its homomorphism; the corresponding vector space will be omitted. The additional definitions used here can be found in any book on representation theory, e.g. [2].
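To make the notion of a factor set concrete, here is a small sketch in which the Pauli matrices are assigned by hand to the four elements of Z_2 x Z_2 - a standard example of a projective representation (the assignment below is my own illustration, not taken from [2]):

```python
import numpy as np

# Pauli matrices assigned to the elements (p, q) of Z_2 x Z_2,
# with group law "add componentwise mod 2".
f = {
    (0, 0): np.eye(2, dtype=complex),
    (1, 0): np.array([[0, 1], [1, 0]], dtype=complex),     # sigma^x
    (0, 1): np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma^y
    (1, 1): np.array([[1, 0], [0, -1]], dtype=complex),    # sigma^z
}

def mul(g, h):
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)

# Check f(g) f(h) = rho(g, h) f(g·h) with |rho| = 1: a projective representation.
rho = {}
for g in f:
    for h in f:
        M = f[g] @ f[h]
        K = f[mul(g, h)]
        idx = np.argmax(np.abs(K))  # pick a nonzero entry of K to read off the phase
        rho[(g, h)] = M.flat[idx] / K.flat[idx]
        assert np.allclose(M, rho[(g, h)] * K)
        assert np.isclose(abs(rho[(g, h)]), 1.0)

# The factor set is nontrivial: swapping the order of the factors changes the phase.
print(rho[((1, 0), (0, 1))], rho[((0, 1), (1, 0))])
```

Every product of two Pauli matrices is again a Pauli matrix up to a phase, which is exactly why the asserts in the loop pass; the nontrivial phases are what will later distinguish the topological from the trivial phase.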

Infinite Matrix Product States

For the upcoming purpose, it will be necessary to consider iMPS, i.e. infinite matrix product states, which can be used to describe translationally invariant 1D chains. For iMPS it is sufficient to regard a small unit cell (e.g. of length 2). Further details will not be discussed here, because introducing the exact workings of iMPS, iTEBD and/or iDMRG would go beyond the scope of this entry; instead, the reader may refer to an introduction to tensor network methods such as [3]. As is generally known, MPS have a gauge degree of freedom, and additionally they can be written in the so-called canonical form, which is diagrammatically depicted as follows:

Γ describes a 3-tensor with one physical leg and two virtual legs, and Λ is a diagonal matrix which consists of the Schmidt values of the respective bipartition of the chain. The transfer matrix in terms of tensor networks is defined as

which is a 4-tensor, but can easily be matricized (by reshaping the indices) to obtain an actual transfer matrix. The index j denotes the physical dimension, while the Greek letters denote the bond dimensions. The diagrammatic representation of the transfer matrix is the following:

In order to respect the crucial property that the iMPS must be normalized, the largest eigenvalue of its transfer matrix must be η=1. This is because the inner product of a state |ψ> with itself can be written in terms of the corresponding transfer matrix T:

where N is the number of physical sites and η_1 the largest eigenvalue. Hence, this particular normalization property can be linked to the fact that the proposed system should be in the thermodynamic limit or at least close to it.
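This normalization condition can be checked with a few lines of code: starting from a random iMPS tensor (an arbitrary toy example, not a ground state), rescaling it by the square root of the transfer matrix's largest eigenvalue enforces η=1:

```python
import numpy as np

rng = np.random.default_rng(2)
d, chi = 2, 3  # physical and bond dimension (toy sizes)
A = rng.normal(size=(d, chi, chi))  # random iMPS tensor A^j_{alpha beta}

def transfer_matrix(A):
    """T_{(alpha alpha'),(beta beta')} = sum_j A^j_{alpha beta} (A^j_{alpha' beta'})*,
    matricized by reshaping the paired bond indices."""
    chi = A.shape[1]
    return np.einsum("jab,jcd->acbd", A, A.conj()).reshape(chi**2, chi**2)

eta = np.max(np.abs(np.linalg.eigvals(transfer_matrix(A))))

# Rescaling A by eta^{-1/2} scales T by 1/eta, so <psi|psi> ~ eta^N becomes 1:
A_normalized = A / np.sqrt(eta)
eta_n = np.max(np.abs(np.linalg.eigvals(transfer_matrix(A_normalized))))
print(eta_n)  # largest eigenvalue is now 1, up to floating-point error
```

In an actual iTEBD/iDMRG implementation this normalization is maintained automatically by keeping the state in canonical form.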

What comes next

After having introduced these notions, the next time we will be able to review how the MPS and the projective representations can be used to distinguish different symmetry protected topological phases in 1D chains.

--- References


Are Quasiparticles "real"? Part II

As we have seen in the previous part, it is possible that new quantum states emerge based on the higher-level structure of a physical system. In particular, the SSH model served as an example to show that edge states can arise in a so-called topological phase - hence, their existence is due to the structure of the entirety of the physical system. However, these edge states are only one of the simplest examples - one can encounter more "particle-like" emergent states, e.g. phonons in metals, skyrmions, Goldstone modes, or the edge states in time-reversal symmetric two-dimensional topological insulators, which have well-defined momentum as they traverse the edge of the insulator. The name quasi-particle seems to suggest that they are less "real" than e.g. electrons. However, as we will see, this is not necessarily the case.

Strong Emergence vs. Weak Emergence

It is possible to distinguish between strong and weak emergence, as cited in [2]. In particular, "Strong Emergence of phenomena" is defined as:

"We can say that a high-level phenomenon is strongly emergent with respect to a low-level domain when the high-level phenomenon arises from the low-level domain, but truths concerning that phenomenon are not deducible even in principle from truths in the low-level domain." [my italics]

Weak Emergence, in turn, is characterized by the fact that the high-level phenomenon can indeed in principle be deduced from the truths of the low-level domain. Basically, this tells us that if a phenomenon is strongly emergent, it is genuinely new, whereas a weakly emergent phenomenon does not exhibit genuinely 'new physics'. In our case the high-level phenomena are mostly quasi-particles in condensed matter theory, while the low-level domain is the microscopic quantum (field) theory.

Ellis [2] argues that in condensed matter theory we face Strong Emergence rather than Weak Emergence. This is because within the "hierarchy" of levels within and between the sciences there is not only bottom-up causation but also top-down causation. While many physicists claim that bottom-up causation is sufficient (i.e. reductionist views in which it is argued that knowledge of a microscopic theory suffices to deduce all macroscopic phenomena), Ellis suggests that top-down causation is essential. In short, his argument ([2], pp. 7-16) goes as follows: starting from the pure microscopic theory (without additional assumptions), one could not conclude the existence of emergent phenomena. Instead, it is required to regard the high-level structure, study the emergent phenomena, and build these phenomena (in our case, quasi-particles) into the microscopic theory - only then is the microscopic theory capable of also describing the higher-level, emergent physics. This is directly related to the so-called interlevel Wave-Particle Duality ([2], p.8).

All Theories are Effective Theories

The term "Theory of Everything" (TOE) is often used to denote the holy grail of physics. It suggests that there is indeed a lowest layer of physics from which all higher levels can be deduced. However, there is no "proof" that such a most fundamental layer exists; it is just a philosophical (rather reductionist) assumption. If one lifts this assumption and leaves open the possibility that such a layer might not exist at all, it becomes reasonable to expect that all of our theories are effective theories (condensed matter theory is assumed to be an effective theory anyway, but particle physics is clearly effective, too). And if all of our theories are effective, there is no reason to assume that any of the layers should be a preferred level of fundamentality.

How is "fundamental" defined?

In order to embrace this argument, it is necessary to note what kind of understanding of "fundamentality" is adopted in this context. If a "fundamental" entity were required to be "non-composite", the current discussion would indeed not be supportable (as quasi-particles are made of constituents). The present understanding of "fundamentality" is instead focused on the idea that a "fundamental" entity is required to "be subject to fundamental laws" ([3], p.9). And if each layer of physics is equally fundamental, as argued above, then the emergent entities of each layer are equally fundamental as well.

If one accepts these points, Laughlin's characterization of emergent entities (as cited in [3], p.4) will also appear reasonable:

"Accordingly, upon emergence, new entities arise that are (i) physical (for they are made up of the physical entities that constitute their emergence basis), (ii) high-level (for they only exist at the level of an organized collection of their basal entities) and (iii) ontologically new (for they are as fundamental as their ultimate basal entities are)." [my italics]

Hence, we can conclude that it is fairly reasonable to regard quasi-particles in condensed matter theory as being as fundamental as the usual fundamental particles (electrons, etc.). They are ontologically new, obey fundamental laws, and therefore they are "real" and not only mathematical concepts that make our calculations easier.

---

References:

Quote of header picture from: [1] Laughlin (1999). Nobel Lecture: Fractional quantization. https://doi.org/10.1103/RevModPhys.71.863

[2] Ellis (2020). Emergence in Solid State Physics and Biology. arXiv:2004.13591

[3] Guay, Sartenaer (2018). Emergent Quasiparticles. Or How to Get a Rich Physics from a Sober Metaphysics.  Individuation, Process and Scientific Practices. New York, USA: Oxford University Press. pp. 214-235


Are Quasiparticles "real"? Part I

Previously we discussed emergence and its relation to symmetry breaking - however, emergence is such a rich feature of nature that it is worth further discussion, in particular its relation to quasi-particles. As Laughlin puts it: "One of the things emergence can do is create new particles" (as cited in [1]). Quasiparticles appear in many topics of condensed matter theory, e.g. in the BCS theory of superconductivity, in the integer quantum Hall effect, or in topological insulators. Their name, quasi-particle, seems to indicate that they lack "reality" - however, as we will see, it is not that simple to determine their ontological status.

First, what are quasiparticles?

Before we can dive into the discussion of the quasi-particles' ontological status, we all need to be on the same page - hence, we will briefly review what quasiparticles are, and for this purpose it will be helpful to regard a specific example: the Su-Schrieffer-Heeger (SSH) model [2]. The SSH model is a paradigmatic example of a topological insulator - however, the topology itself will not be the primary focus here.

The SSH-Model

The model describes a one-dimensional chain on which electrons can hop between the sites. The hopping amplitudes are staggered, and therefore one can regard fully dimerized cases, as we will see later. The system consists of N unit cells with two sites each. Since interactions between the electrons are ignored, it is reasonable to describe the system via a single-particle Hamiltonian:

Here, v and w denote the (generally different) hopping amplitudes. Hence, v is the hopping amplitude within the unit cells and w the amplitude between two neighbouring cells. Moreover, A and B denote the respective sublattices.

Let's have a look at two special cases, the fully dimerized cases in which we have either v=1, w=0 or v=0, w=1. The former defines a trivial phase of the system, while the latter describes a so-called topological phase.

In the above trivial case (v=1, w=0), the eigenstates of the Hamiltonian are simple singlet states within the unit cells:

For the topological case (v=0, w=1) things become more subtle; the states of the bulk look similar to those in the trivial case:

However, the sites at the edges (m=1 on sublattice A, m=N on sublattice B) are isolated from the bulk:

Therefore these sites give rise to zero energy states, i.e. they are eigenstates of the Hamiltonian with zero energy:

Note that there is a 4-fold degeneracy of the edge configurations: since the electrons are fermions, each edge site can only be occupied or empty - this leads to four states of equal (zero) energy: both occupied, both empty, and the two possibilities of having one site occupied and one empty.
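The two fully dimerized limits can be verified directly by diagonalizing a finite single-particle SSH Hamiltonian; the chain length below is an arbitrary toy choice:

```python
import numpy as np

def ssh_hamiltonian(n_cells, v, w):
    """Single-particle SSH Hamiltonian on an open chain of n_cells unit cells
    (sites A_m, B_m), with intracell hopping v and intercell hopping w."""
    n_sites = 2 * n_cells
    H = np.zeros((n_sites, n_sites))
    for m in range(n_cells):
        H[2 * m, 2 * m + 1] = H[2 * m + 1, 2 * m] = v  # A_m <-> B_m
        if m < n_cells - 1:
            H[2 * m + 1, 2 * m + 2] = H[2 * m + 2, 2 * m + 1] = w  # B_m <-> A_{m+1}
    return H

zero_modes = {}
for v, w, label in [(1.0, 0.0, "trivial"), (0.0, 1.0, "topological")]:
    E = np.linalg.eigvalsh(ssh_hamiltonian(10, v, w))
    zero_modes[label] = int(np.sum(np.abs(E) < 1e-8))

print(zero_modes)  # -> {'trivial': 0, 'topological': 2}: the two edge states
```

In the topological limit the isolated sites A_1 and B_N decouple entirely from the bulk, which is exactly why two zero-energy eigenvalues show up; away from the fully dimerized limit they remain exponentially close to zero as long as the phase persists.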

These edge states are among the simplest that can appear in a topological insulator. In higher dimensions, edge states can also appear in a more "particle-like" form, e.g. in time-reversal symmetric two-dimensional topological insulators, where the edge states have well-defined momentum and travel along the edge. Note that these zero-energy states only appear because of the "shape" of the whole system! Without the given chain or lattice structure, they would not be present - this is a crucial point for the remaining discussion of their ontological status.

---

References:

[1] Guay, Sartenaer (2018). Emergent Quasiparticles. Or How to Get a Rich Physics from a Sober Metaphysics.  Individuation, Process and Scientific Practices. New York, USA: Oxford University Press. pp. 214-235

[2] Asbóth, Orószlany, Pályi (2015). A Short Course on Topological Insulators: Band-structure topology and edge states in one and two dimensions. arXiv:1509.02295
