
Non-equilibrium dynamics and entropy production in the human brain

Christopher W. Lynn,1, 2 Eli J. Cornblath,2, 3 Lia Papadopoulos,1

Maxwell A. Bertolero,3 and Danielle S. Bassett1, 2, 4, 5, 6, 7

1 Department of Physics & Astronomy, College of Arts & Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
2 Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA 19104, USA
3 Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
4 Department of Electrical & Systems Engineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA 19104, USA
5 Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
6 Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
7 Santa Fe Institute, Santa Fe, NM 87501, USA

(Dated: July 31, 2020)

Living systems operate out of thermodynamic equilibrium at small scales, consuming energy and producing entropy in the environment in order to perform molecular and cellular functions. However, it remains unclear whether non-equilibrium dynamics manifest at macroscopic scales, and if so, how such dynamics support higher-order biological functions. Here we present a framework to probe for non-equilibrium dynamics by quantifying entropy production in macroscopic systems. We apply our method to the human brain, an organ whose immense metabolic consumption drives a diverse range of cognitive functions. Using whole-brain imaging data, we demonstrate that the brain fundamentally operates out of equilibrium at large scales. Moreover, we find that the brain produces more entropy – operating further from equilibrium – when performing physically and cognitively demanding tasks. By simulating an Ising model, we show that macroscopic non-equilibrium dynamics can arise from asymmetries in the interactions at the microscale. Together, these results suggest that non-equilibrium dynamics are vital for cognition, and provide a general tool for quantifying the non-equilibrium nature of macroscopic systems.

I. INTRODUCTION

The functions that support life – from processing information to generating forces and maintaining order – require organisms to operate far from thermodynamic equilibrium [1, 2]. For a system at equilibrium, the fluxes of transitions between different states vanish [Fig. 1(a)], a property known as detailed balance. The system ceases to produce entropy and its dynamics become reversible in time. By contrast, living systems exhibit net fluxes between states or configurations [Fig. 1(b)], thereby breaking detailed balance and establishing an arrow of time [2]. Critically, such non-equilibrium dynamics lead to the production of entropy, a fact first recognized by Sadi Carnot in his pioneering studies of irreversible processes [3]. At the molecular scale, enzymatic activity drives non-equilibrium processes that are crucial for intracellular transport [4], high-fidelity transcription [5], and biochemical patterning [6]. At the level of cells and subcellular structures, non-equilibrium activity enables sensing [7], adaptation [8], force generation [9], and structural organization [10].

Despite the importance of non-equilibrium processes at the microscale, there remain basic questions about the role of non-equilibrium dynamics in macroscopic systems composed of many interacting components. What, if anything, can non-equilibrium behaviors at large scales tell us about the fundamental non-equilibrium nature of a system at small scales? Moreover, just as microscopic non-equilibrium dynamics support molecular and cellular functions, does the breaking of detailed balance at large scales support higher-order biological functions?

To answer these questions, we study large-scale patterns of activity in the brain. Notably, the human brain consumes up to 20% of the body's energy in order to perform an array of cognitive functions, from computations and attention to planning and motor execution [12, 13], making it a promising system in which to probe for macroscopic non-equilibrium dynamics. Indeed, metabolic and enzymatic activity in the brain drives a number of non-equilibrium processes at small scales, including neuronal firing [14], molecular cycles [15], and cellular housekeeping [16]. One might therefore conclude that the brain – indeed any living system – must break detailed balance at large scales. However, by coarse-graining a system, one may average over non-equilibrium degrees of freedom, yielding "effective" macroscopic dynamics that produce less entropy [17, 18] and regain equilibrium [19]. Thus, even though non-equilibrium processes are vital at molecular and cellular scales, it remains both interesting and important to understand the role of non-equilibrium dynamics in the brain – and in all living systems generally – at large scales.


FIG. 1. Macroscopic non-equilibrium dynamics in the brain. (a-b) A simple four-state system, with states represented as circles and transition rates as arrows. (a) At equilibrium, there are no net fluxes of transitions between states – a condition known as detailed balance – and the system does not produce entropy. (b) Systems that are out of equilibrium exhibit net fluxes of transitions between states, breaking detailed balance and producing entropy in the environment. (c) Brain states defined by the first two principal components of the neuroimaging time-series of regional activity, computed across all time points and all subjects. Colors indicate the z-scored activation of different brain regions, ranging from high-amplitude activity (green) to low-amplitude activity (orange). Arrows represent possible fluxes between states. (d-e) Probability distribution (color) and net fluxes between states (arrows) for neural dynamics at rest (d) and during a gambling task (e). In order to use the same axes in panels (d) and (e), the dynamics are projected onto the first two principal components of the combined rest and gambling time-series data. The flux scale is indicated in the upper right, and the disks represent two-standard-deviation confidence intervals for fluxes estimated using trajectory bootstrapping [11] (see Appendix A; Fig. 5).

II. FLUXES AND BROKEN DETAILED BALANCE IN THE BRAIN

Here we develop tools to probe for and quantify non-equilibrium dynamics in macroscopic living systems. We apply our methods to analyze whole-brain dynamics from 590 healthy adults both at rest and across a suite of seven cognitive tasks, recorded using functional magnetic resonance imaging (fMRI) as part of the Human Connectome Project [20]. For each cognitive task (including rest), the time-series data consist of blood-oxygen-level-dependent (BOLD) fMRI signals from 100 cortical parcels [21] (see Appendix A), which we concatenate across all subjects.

We begin by visually examining whether the neural dynamics break detailed balance. To visualize the dynamics, we must project the time series onto two dimensions. For example, here we project the neural dynamics onto the first two principal components of the time-series data, which we compute after combining all data points across all subjects [Fig. 1(c)]. In fact, this projection defines a natural low-dimensional state space [22], capturing over 30% of the variance in the neural activity (see Appendix B, Fig. 6). One can then probe for broken detailed balance by calculating the net fluxes of transitions between different regions of state space, a method proposed by Battle et al. [23] (see Appendix A). Moreover, we can repeat this analysis for different cognitive tasks to investigate whether the brain's non-equilibrium behavior depends on the mental function being performed.

We first consider the brain's behavior during resting scans, wherein subjects are instructed to remain still without executing a specific task. At rest, we find that the brain exhibits net fluxes between states [Fig. 1(d)], thereby establishing that neural dynamics break detailed balance at large scales. Furthermore, given the intuition that biological functions are supported by non-equilibrium dynamics [1], one might expect the brain to break detailed balance even more strongly when performing a specific cognitive task.

FIG. 2. Simulating complex non-equilibrium dynamics using an asymmetric Ising model. (a) Two-spin Ising model with asymmetric interactions (left), where the interaction Jαβ represents the strength of the influence of spin β on spin α. Simulating the model with synchronous updates, the system exhibits a clear loop of flux between spin states (right). (b) Asymmetric version of the Sherrington-Kirkpatrick (SK) model, wherein directed interactions are drawn independently from a zero-mean Gaussian with variance 1/N, where N is the size of the system. (c) For an asymmetric SK model with N = 100 spins, we plot the probability distribution (color) and fluxes between states (arrows) for simulated time-series at temperatures T = 0.1 (left), T = 1 (middle), and T = 10 (right). In order to visualize the dynamics, the time series are projected onto the first two principal components of the combined data across all three temperatures. The scale is indicated in flux-per-time-step, and the disks represent two-standard-deviation confidence intervals estimated using trajectory bootstrapping (see Appendix A).

To test this hypothesis, we study task scans, wherein subjects respond to stimuli and commands that require attention, computations, and physical and cognitive effort. For example, during a gambling task in which subjects play a card guessing game for monetary reward, the brain's dynamics form a distinct loop of fluxes [Fig. 1(e)] that are nearly an order of magnitude stronger than those present during rest. Such closed loops of flux are a characteristic feature of non-equilibrium steady-state systems [24], and we verify that the brain operates in a stochastic steady state (see Appendix C, Fig. 7). Furthermore, to confirm that non-equilibrium dynamics encode the arrow of time, we show that if the time series are shuffled – thereby destroying the temporal order of the system – then the fluxes between states vanish and detailed balance is restored (see Appendix D, Fig. 8). Together, these results demonstrate that the brain fundamentally breaks detailed balance at large scales, and moreover, that the strength of broken detailed balance depends critically on the cognitive function being performed.

III. SIMULATING MACROSCOPIC NON-EQUILIBRIUM DYNAMICS

To understand how non-equilibrium dynamics arise at large scales, it is helpful to consider a canonical model of stochastic dynamics in complex systems. In the Ising model, the interactions between spins are typically constrained to be symmetric, yielding simulated dynamics that obey detailed balance and converge to equilibrium [25]. However, connections in the brain – from synapses between neurons to white matter tracts between entire brain regions – are inherently asymmetric [26]. If we allow for asymmetric interactions in the Ising model, then the system diverges from equilibrium, displaying closed loops of flux between spin states at small scales [Fig. 2(a)]. But can these fine-scale violations of detailed balance combine to give rise to macroscopic non-equilibrium dynamics?

To answer this question, we study a system of N = 100 spins (matching the 100 parcels in our neuroimaging data), with the interaction between each directed pair of spins drawn independently from a zero-mean Gaussian with variance 1/N [Fig. 2(b)]. This model is the asymmetric generalization of the Sherrington-Kirkpatrick (SK) model of a spin glass [27]. After simulating the system at three different temperatures, we perform the same procedure that we applied to the neuroimaging data (Fig. 1): projecting the time-series onto the first two principal components of the combined data and calculating net fluxes in this low-dimensional state space. At high temperature, stochastic fluctuations dominate the system, and we only observe weak fluxes between states [Fig. 2(c), right]. By contrast, as the temperature decreases, the interactions between spins overcome the stochastic fluctuations, giving rise to clear loops of flux [Fig. 2(c), middle and left]. These loops of flux demonstrate that asymmetries in the fine-scale interactions between elements can give rise to large-scale violations of detailed balance. Moreover, by varying the strength of microscopic interactions, a single system can transition from near equilibrium to far from equilibrium, just as observed for the brain during distinct cognitive tasks [Fig. 1(d-e)].

IV. QUANTIFYING ENTROPY PRODUCTION IN MACROSCOPIC SYSTEMS

While fluxes in state space reveal broken detailed balance, quantifying this non-equilibrium behavior requires measuring the "distance" of a system from equilibrium. One such measure is the rate at which a system produces entropy in its environment, a central concept in non-equilibrium statistical mechanics [28]. Importantly, this physical entropy production Sphys (often referred to as dissipation) is lower-bounded by an information-theoretic notion of entropy production Sinfo, which can be estimated simply by observing a system's dynamics [17]. For example, consider a system with joint transition probabilities Pij = Prob[xt−1 = i, xt = j], where xt is the state of the system at time t. If the dynamics are Markovian (as, for instance, is true for the Ising system), then the information entropy production is given by [29]

$$S_{\rm phys} \geq S_{\rm info} = \sum_{ij} P_{ij} \log \frac{P_{ij}}{P_{ji}}, \qquad (1)$$

where the sum runs over all states i and j.

The inequality in Eq. (1) provides a direct link between macroscopic dynamics and non-equilibrium behavior: If we can establish that the information entropy production is greater than zero (Sinfo > 0), then we can immediately conclude that the system is fundamentally out of equilibrium (Sphys > 0). From an information-theoretic perspective, we remark that Sinfo (which we refer to simply as entropy production) is equivalent to the Kullback-Leibler divergence between the forward transition probabilities Pij and the reverse transition probabilities Pji [30]. If the system obeys detailed balance (that is, if Pij = Pji for all pairs of states i and j), then the entropy production vanishes. Conversely, any violation of detailed balance leads to an increase in entropy production, thereby reflecting the distance of the system from equilibrium.
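As a concrete illustration of Eq. (1), the following Python sketch estimates the entropy production from a sequence of coarse-grained states. The function name and the choice to skip transitions observed in only one direction are our own illustrative conventions (the paper instead chooses k small enough that every transition is observed; see Appendix F), not the authors' code.

```python
import numpy as np

def entropy_production(states, n_states, dt=1.0):
    """Markovian estimate of the information entropy production, Eq. (1).

    states   : 1D integer array, the coarse-grained state at each time step
    n_states : number of coarse-grained states k
    dt       : sampling interval in seconds (0.72 s for the fMRI data),
               converting bits per step into bits per second
    """
    states = np.asarray(states)
    # Joint transition probabilities P_ij = Prob[x_{t-1} = i, x_t = j]
    P = np.zeros((n_states, n_states))
    np.add.at(P, (states[:-1], states[1:]), 1.0)
    P /= P.sum()
    # KL divergence between forward and reverse transition probabilities;
    # pairs observed in only one direction are skipped here, which is safe
    # only once k is small enough that every transition is observed
    mask = (P > 0) & (P.T > 0)
    return np.sum(P[mask] * np.log2(P[mask] / P.T[mask])) / dt

# Example: a forward-biased random walk on a three-state ring produces
# entropy, while shuffling it destroys the arrow of time
rng = np.random.default_rng(0)
ring = np.cumsum(rng.choice([1, -1], size=30_000, p=[0.8, 0.2])) % 3
print(entropy_production(ring, 3))                   # approx. 1.2 bits/step
print(entropy_production(rng.permutation(ring), 3))  # near zero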

Calculating the entropy production requires estimating the transition probabilities Pij. However, for complex systems the number of states grows exponentially with the size of the system, making a direct estimate of the entropy production infeasible. To overcome this hurdle, we employ a hierarchical clustering algorithm that groups similar states in the time series into a single cluster, yielding a reduced number of coarse-grained states [Fig. 3(a); see Appendix A]. Moreover, by choosing these clusters hierarchically [31], we prove that the estimated entropy production can only increase with the number of coarse-grained states (ignoring finite data effects; see Appendix E), thereby providing an improving lower bound on the physical rate of entropy production. Indeed, across all temperatures in the Ising system, we verify that the estimated entropy production increases with the number of clusters k [Fig. 3(b)]. Furthermore, as the temperature decreases the entropy production grows [Fig. 3(b)], thereby capturing the difference in the non-equilibrium nature of the system at high versus low temperatures [Fig. 2(c)].

V. ENTROPY PRODUCTION IN THE HUMAN BRAIN

We are now prepared to investigate whether the brain operates at different distances from equilibrium when performing distinct functions. We study seven tasks, each of which engages a specific cognitive process and associated anatomical system: emotional processing, working memory, social inference, language processing, relational matching, gambling, and motor execution [32]. To estimate the entropy production, we cluster the neuroimaging data (combined across all subjects and task settings, including rest) into k = 8 coarse-grained states, the largest number for which all transitions were observed at least once in each task (see Appendix F, Fig. 10). Across all tasks and rest, the brain produces a significant amount of entropy, confirming that the brain operates out of equilibrium [Fig. 4(a)]. Specifically, for all task settings the entropy production is significantly greater than the noise floor that arises due to finite data (one-sided t-test with p < 0.001). Furthermore, the entropy production is greater during all of the cognitive tasks than at rest, with each task inducing a distinct pattern of fluxes between states (see Appendix G, Fig. 11). In fact, the motor task (wherein subjects are prompted to perform specific physical movements) induces a 20-fold increase in entropy production over resting-state dynamics, thereby demonstrating that, depending on the function being performed, neural dynamics operate at a wide range of distances from equilibrium.

FIG. 3. Estimating entropy production using hierarchical clustering. (a) Schematic of clustering procedure where axes represent the activities of individual components (e.g., brain regions in the neuroimaging data or spins in the Ising model), points reflect individual states observed in the time-series, shaded regions define clusters (or coarse-grained states), and arrows illustrate possible fluxes between clusters. (b) Entropy production in the asymmetric SK model as a function of the number of clusters k for the same time-series studied in Fig. 2(c), with error bars reflecting two standard deviations estimated using trajectory bootstrapping (see Appendix A).

At small scales, living systems operate out of equilibrium in order to perform cellular and molecular functions [2]. Are macroscopic violations of detailed balance similarly associated with higher-order biological functions? Specifically, are the variations in the brain's entropy production [Fig. 4(a)] driven by physical and cognitive demands? To answer this question, we first consider the frequency of responses in each task as a measure of physical effort. Across tasks, we find that entropy production does in fact increase with the frequency of physical responses [Fig. 4(b)], with each response yielding an additional 0.07 ± 0.03 bits of information entropy.

In order to study the effect of cognitive effort, we note that the working memory task splits naturally into two conditions: one with high cognitive load and another with low cognitive load. Moreover, the frequency of physical responses is identical across the two conditions, thereby controlling for physical effort. We find that the brain operates further from equilibrium when exerting more cognitive effort [Fig. 4(c)], with the high-load condition inducing a two-fold increase in entropy production over the low-load condition. Finally, we verify that these findings do not depend on the Markov assumption in Eq. (1) (see Appendix H, Fig. 12), are robust to reasonable variation in the number of clusters k (see Appendix I, Fig. 13), and cannot be explained by head motion in the scanner (a common confound in fMRI studies [33]) nor by variance in the activity time-series (see Appendix J, Fig. 14). Together, these results demonstrate that large-scale violations of detailed balance in the brain are related to both physical effort and cognition. This conclusion, in turn, suggests that non-equilibrium dynamics in macroscopic living systems may be associated with higher-order biological functions.

VI. CONCLUSIONS

In this study, we describe a method for investigating macroscopic non-equilibrium dynamics by quantifying entropy production in living systems. While microscopic non-equilibrium processes are known to be vital for molecular and cellular operations [4–10], here we show that non-equilibrium dynamics also arise at large scales in complex living systems. Analyzing whole-brain imaging data, we find not only that the human brain breaks detailed balance at large scales, but that the brain's entropy production (that is, its distance from equilibrium) increases with physical and cognitive exertion. Notably, the tools presented are non-invasive, applying to any system with time-series data, and can be used to study stochastic steady-state dynamics, rather than deterministic dynamics that trivially break detailed balance. Furthermore, the framework is not limited to the brain, but can be applied broadly to probe for broken detailed balance in complex systems, including collective behavior in human and animal populations [34], correlated patterns of neuronal firing [35], and aggregated activity in molecular and cellular networks [36, 37].

ACKNOWLEDGMENTS

The authors thank Erin Teich, Pragya Srivastava, Jason Kim, and Zhixin Lu for feedback on earlier versions of this manuscript. The authors acknowledge support from the John D. and Catherine T. MacArthur Foundation, the ISI Foundation, the Paul G. Allen Family Foundation, the Army Research Laboratory (W911NF-10-2-0022), the Army Research Office (Bassett-W911NF-14-1-0679, Falk-W911NF-18-1-0244, Grafton-W911NF-16-1-0474, DCIST-W911NF-17-2-0181), the Office of Naval Research, the National Institute of Mental Health (2-R01-DC-009209-11, R01-MH112847, R01-MH107235, R21-MH-106799), the National Institute of Child Health and Human Development (1R01HD086888-01), the National Institute of Neurological Disorders and Stroke (R01 NS099348), and the National Science Foundation (NSF PHY-1554488, BCS-1631550, and NCS-FO-1926829).

FIG. 4. Entropy production in the brain varies with physical and cognitive demands. (a) Entropy production at rest and during seven cognitive tasks, estimated using hierarchical clustering with k = 8 clusters. (b) Entropy production as a function of response rate (i.e., the frequency with which subjects are asked to physically respond) for the tasks listed in panel (a). Each response induces an average 0.07 ± 0.03 bits of produced entropy (Pearson correlation r = 0.774, p = 0.024). (c) Entropy production for low cognitive load and high cognitive load conditions in the working memory task, where low and high loads represent 0-back and 2-back conditions, respectively, in an n-back task. The brain produces significantly more entropy during high-load than low-load conditions (one-sided t-test, p < 0.001, t > 10, df = 198). Across all panels, raw entropy productions [Eq. (1)] are divided by the fMRI repetition time ∆t = 0.72 s to compute an entropy production rate, and error bars reflect two standard deviations estimated using trajectory bootstrapping (see Appendix A).

CITATION DIVERSITY STATEMENT

Recent work in neuroscience and other fields has identified a bias in citation practices such that papers from women and other minorities are under-cited relative to the number of such papers in the field [38, 39]. Here we sought to proactively consider choosing references that reflect the diversity of the field in thought, form of contribution, gender, race, geographic location, and other factors. Excluding self-citations to the authors of this paper and single-author citations, the first and last authors of references are 58% male/male, 21% female/male, 14% male/female, and 7% female/female. We look forward to future work that could help us better understand how to support equitable practices in science.


Appendix A: Methods

1. Calculating fluxes

Consider time-series data gathered in a time window ttot, and let nij denote the number of observed transitions from state i to state j. The flux rate from state i to state j is given by ωij = (nij − nji)/ttot. For the flux currents in Figs. 1(d-e) and 2(c), the states of the system are points (x, y) in two-dimensional space, and the state probabilities are estimated by p(x, y) = t(x,y)/ttot, where t(x,y) is the time spent in state (x, y). The magnitude and direction of the flux through a given state (x, y) is defined by the flux vector [23]

$$\mathbf{u}(x, y) = \frac{1}{2} \begin{pmatrix} \omega_{(x-1,y),(x,y)} + \omega_{(x,y),(x+1,y)} \\ \omega_{(x,y-1),(x,y)} + \omega_{(x,y),(x,y+1)} \end{pmatrix}. \qquad (A1)$$

In a small number of cases, two consecutive states in the observed time-series x(t) = (x(t), y(t)) and x(t + 1) = (x(t + 1), y(t + 1)) are not adjacent in state space. In these cases, we perform a linear interpolation between x(t) and x(t + 1) in order to calculate the fluxes between adjacent states.
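A minimal sketch of this flux-field calculation, assuming the trajectory has already been binned to integer grid coordinates and omitting the interpolation step described above; the function and variable names are hypothetical:

```python
import numpy as np
from collections import defaultdict

def flux_field(xy, t_tot):
    """Net flux vectors u(x, y) on a 2D grid of states, following Eq. (A1).

    xy    : (T, 2) integer array of grid coordinates visited over time;
            consecutive points are assumed adjacent in this sketch
    t_tot : duration of the observation window
    """
    counts = defaultdict(int)
    for a, b in zip(map(tuple, xy[:-1]), map(tuple, xy[1:])):
        counts[(a, b)] += 1          # n_ij: observed i -> j transitions

    def omega(a, b):                 # flux rate (n_ij - n_ji) / t_tot
        return (counts[(a, b)] - counts[(b, a)]) / t_tot

    u = {}
    for s in set(map(tuple, xy)):
        x, y = s
        u[s] = 0.5 * np.array([
            omega((x - 1, y), s) + omega(s, (x + 1, y)),  # horizontal
            omega((x, y - 1), s) + omega(s, (x, y + 1)),  # vertical
        ])
    return u
```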

2. Estimating errors using trajectory bootstrapping

The finite length of time-series data limits the accuracy with which quantities can be estimated. In order to calculate error bars on all estimated quantities, we apply trajectory bootstrapping [11, 23]. We first record the list of transitions

$$I = \begin{pmatrix} i_1 & i_2 \\ i_2 & i_3 \\ \vdots & \vdots \\ i_{L-1} & i_L \end{pmatrix}, \qquad (A2)$$

where iℓ is the ℓth state in the time-series, and L is the length of the time-series. From the transition list I, one can calculate all of the desired quantities; for instance, the fluxes are estimated by

$$\omega_{ij} = \frac{1}{t_{\rm tot}} \sum_{\ell} \left( \delta_{i, I_{\ell,1}} \delta_{j, I_{\ell,2}} - \delta_{j, I_{\ell,1}} \delta_{i, I_{\ell,2}} \right). \qquad (A3)$$

We remark that when analyzing the neural data, although we concatenate the time-series across subjects, we only include transitions in I that occur within the same subject. That is, we do not include the transitions between adjacent subjects in the concatenated time-series.

To calculate errors, we construct bootstrap trajectories (of the same length L as the original time-series) by sampling the rows in I with replacement. For example, to compute errors for the flux vectors u(x) in Figs. 1(d-e) and 2(c), we first estimate the covariance matrix Cov(u1(x), u2(x)) by averaging over bootstrapped trajectories. Then, for each flux vector, we visualize its error by plotting an ellipse with axes aligned with the eigenvectors of the covariance matrix and radii equal to twice the square root of the corresponding eigenvalues (Fig. 5). All errors throughout the manuscript are calculated using 100 bootstrap trajectories.

FIG. 5. Visualizing flux vectors. Schematic demonstrating how we illustrate the flux of transitions through a state (vector) and the errors in estimating the flux (ellipse).

The finite data length also induces a noise floor for each quantity, which is present even if the temporal order of the time-series is destroyed. To estimate the noise floor, we construct bootstrap trajectories by sampling individual data points from the time-series. We contrast these bootstrap trajectories with those used to estimate errors above, which preserve transitions by sampling the rows in I. The noise floor, which is calculated for each quantity by averaging over the bootstrap trajectories, is then compared with the estimated quantities. For example, rather than demonstrating that the average entropy productions in Fig. 4(a) are greater than zero, we establish that the distribution over entropy productions is significantly greater than the noise floor using a one-sided t-test with p < 0.001.
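The two bootstrap variants described above might be sketched as follows, assuming an estimator that consumes a list of (i, j) transition pairs; the names and interfaces are our own illustrative choices:

```python
import numpy as np

def bootstrap_spread(states, estimator, n_boot=100, seed=0):
    """Error bars: resample rows of the transition list I (consecutive
    state pairs) with replacement, preserving individual transitions."""
    rng = np.random.default_rng(seed)
    pairs = np.stack([states[:-1], states[1:]], axis=1)  # the list I
    vals = [estimator(pairs[rng.integers(len(pairs), size=len(pairs))])
            for _ in range(n_boot)]
    return np.mean(vals), 2.0 * np.std(vals)  # two standard deviations

def noise_floor(states, estimator, n_boot=100, seed=1):
    """Noise floor: resample individual time points instead, destroying
    temporal order, so whatever signal remains is a finite-data artifact."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(n_boot):
        s = rng.choice(states, size=len(states))  # points, not transitions
        vals.append(estimator(np.stack([s[:-1], s[1:]], axis=1)))
    return np.mean(vals), 2.0 * np.std(vals)
```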

3. Simulating the asymmetric Ising model

The asymmetric Ising model is defined by a (possibly asymmetric) interaction matrix J, where Jαβ represents the influence of spin β on spin α [Fig. 2(a)], and a temperature T ≥ 0 that tunes the strength of stochastic fluctuations. Here, we study a system with N = 100 spins, where each directed interaction Jαβ is drawn independently from a zero-mean Gaussian with variance 1/N = 0.01 [Fig. 2(b)]. One can additionally include external fields hα, but for simplicity here we set them to zero. The state of the system is defined by a vector x = (x1, . . . , xN), where xα = ±1 is the state of spin α. To generate time series, we employ Glauber dynamics with synchronous updates, a common Monte Carlo method for simulating Ising systems [25]. Specifically, given the state of the system x(t) at time t, the probability of spin α being "up" at time t + 1 (that is, the probability that xα(t + 1) = 1) is given by

$$\text{Prob}[x_\alpha(t+1) = 1] = \frac{\exp\left(\frac{1}{T} \sum_\beta J_{\alpha\beta}\, x_\beta(t)\right)}{\exp\left(\frac{1}{T} \sum_\beta J_{\alpha\beta}\, x_\beta(t)\right) + \exp\left(-\frac{1}{T} \sum_\beta J_{\alpha\beta}\, x_\beta(t)\right)}. \qquad (A4)$$

Stochastically updating each spin α according to Eq. (A4), one arrives at the new state x(t + 1). For each temperature in the Ising calculations in Figs. 2(c) and 3(b), we generate a different time-series of length L = 100,000 with 10,000 trials of burn-in.
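A minimal sketch of this simulation, using the equivalent logistic form of Eq. (A4), Prob[xα(t+1) = 1] = 1/(1 + e^{−2hα/T}) with hα = Σβ Jαβ xβ(t). The parameter defaults follow the text, but the function itself is our own illustration rather than the authors' code:

```python
import numpy as np

def simulate_asymmetric_sk(N=100, T=1.0, L=100_000, burn=10_000, seed=0):
    """Asymmetric SK model with synchronous Glauber updates, Eq. (A4).

    Each directed coupling J[a, b] ~ N(0, 1/N) is drawn independently and
    NOT symmetrized, so in general J[a, b] != J[b, a] and detailed balance
    can be broken; external fields are set to zero, as in the text."""
    rng = np.random.default_rng(seed)
    J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
    x = rng.choice([-1, 1], size=N)
    traj = np.empty((L, N), dtype=np.int8)
    for t in range(burn + L):
        h = J @ x                                  # local fields
        p_up = 1.0 / (1.0 + np.exp(-2.0 * h / T))  # Eq. (A4), rewritten
        x = np.where(rng.random(N) < p_up, 1, -1)  # synchronous update
        if t >= burn:
            traj[t - burn] = x                     # record after burn-in
    return traj
```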

4. Hierarchical clustering

To estimate the entropy production of a system, one must first calculate the transition probabilities Pij = nij/(L − 1). For complex systems, the number of states i (and therefore the number of transitions i → j) grows exponentially with the size of the system N. For example, in the Ising model each spin α can take one of two values (xα = ±1), leading to 2^N possible states and 2^{2N} possible transitions. In order to estimate the transition probabilities Pij, one must observe each transition i → j at least once, which requires significantly reducing the number of states in the system. Rather than defining coarse-grained states a priori, complex systems (and the brain in particular) often admit natural coarse-grained descriptions that are uncovered through dimensionality-reduction techniques [22, 40, 41].

Although one can use any coarse-graining technique to implement our framework and estimate entropy production, here we employ hierarchical k-means clustering for two reasons: (i) generally, k-means is perhaps the most common and simplest clustering algorithm, with demonstrated effectiveness fitting neural dynamics [40, 41]; and (ii) specifically, by defining the clusters hierarchically we prove that the estimated entropy production becomes more accurate as the number of clusters increases (ignoring finite data effects; Fig. 9).

In k-means clustering, one begins with a set of states (for example, those observed in our time-series) and a number of clusters k. Each observed state x is randomly assigned to a cluster i, and one computes the centroid of each cluster. On the following iteration, each state is reassigned to the cluster with the closest centroid (here we use cosine similarity to determine distance). This process is repeated until the cluster assignments no longer change. In a hierarchical implementation, one begins with two clusters; then one cluster is selected (typically the one with the largest spread in its constituent states) to be split into two new clusters, thereby defining a total of three clusters. This iterative splitting is continued until one reaches the desired number of clusters k. In Appendix E, we show that hierarchical clustering provides an increasing lower-bound on the entropy production; and in Appendix F, we demonstrate how to choose the number of clusters k.
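A sketch of the divisive scheme, using scikit-learn's Euclidean k-means for brevity where the paper uses cosine similarity, and splitting the cluster with the largest spread at each step; this illustrates the procedure, not the authors' implementation:

```python
import numpy as np
from sklearn.cluster import KMeans

def divisive_kmeans(X, k, seed=0):
    """Divisive hierarchical clustering: start from two clusters, then
    repeatedly split the cluster with the largest spread until there are
    k clusters, so each level is a coarse-graining of the level above."""
    labels = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X)
    while labels.max() + 1 < k:
        # split the cluster whose points are most spread about its centroid
        spreads = [np.sum((X[labels == c] - X[labels == c].mean(axis=0)) ** 2)
                   for c in range(labels.max() + 1)]
        idx = np.flatnonzero(labels == np.argmax(spreads))
        sub = KMeans(n_clusters=2, n_init=10, random_state=seed)
        labels[idx[sub.fit_predict(X[idx]) == 1]] = labels.max() + 1
    return labels
```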

5. Neural data

The whole-brain dynamics used in this study are measured and recorded using blood-oxygen-level-dependent (BOLD) functional magnetic resonance imaging (fMRI) collected from 590 healthy adults as part of the Human Connectome Project [20, 32]. BOLD fMRI estimates neural activity by calculating contrasts in blood oxygen levels, without relying on invasive injections and radiation [42]. Specifically, blood oxygen levels (reflecting neural activity) are measured within three-dimensional non-overlapping voxels, spatially contiguous collections of which each represent a distinct brain region (or parcel). Here, we consider a parcellation that divides the cortex into 100 brain regions that are chosen to optimally capture the functional organization of the brain [21]. After processing the signal to correct for sources of systematic noise such as head motion (see Appendix K), the activity of each brain region is discretized in time, yielding a time-series of neural activity. For each subject, the shortest scan (corresponding to the emotional processing task) consists of 176 discrete measurements in time. In order to control for variability in data size across tasks, for each subject we only study the first 176 measurements in each task.

Appendix B: Low-dimensional embedding using PCA

In order to visualize net fluxes between states in a complex system, we must project the dynamics onto two dimensions. While any pair of dimensions can be used to probe for broken detailed balance, a natural choice is the first two principal components of the time-series data. Indeed, principal component analysis has been widely used to uncover low-dimensional embeddings of large-scale neural dynamics [22, 43]. Combining the time-series data from the rest and gambling task scans (that is, the data studied in Fig. 1), we find that the first two principal components capture over 30% of the total variance in the observed recordings [Fig. 6(a)], thereby comprising a natural choice for two-dimensional projections. Moreover, we confirm that the projected dynamics capture approximately the same amount of variance in both the rest and gambling tasks, confirming that PCA is not overfitting the neural dynamics in one task or another [Fig. 6(b)].

FIG. 6. PCA reveals low-dimensional embedding of neural dynamics. (a) Cumulative fraction of variance explained by first ten principal components (line) and explained variance for each individual principal component (bars) in the combined rest and gambling data. (b) For the same principal components (calculated for the combined rest and gambling data), we plot the cumulative fraction of variance explained (lines) and individual explained variance (bars) for the rest (red) and gambling (blue) data.
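A minimal sketch of this projection via the singular value decomposition; the function is our own illustration:

```python
import numpy as np

def project_onto_pcs(X, n_components=2):
    """Project a (time x regions) data matrix onto its leading principal
    components via the SVD, reporting the variance they explain."""
    Xc = X - X.mean(axis=0)                  # center each region's activity
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T        # low-dimensional trajectory
    explained = np.sum(s[:n_components] ** 2) / np.sum(s ** 2)
    return scores, explained
```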

Appendix C: The brain operates at a stochastic steady state

Some of the tools and intuitions developed in traditional statistical mechanics to study equilibrium systems have recently been generalized to systems that operate at non-equilibrium steady states [44]. For example, Evans et al. generalized the second law of thermodynamics to non-equilibrium steady-state systems by discovering the (steady state) fluctuation theorem [45]. More recently, Dieterich et al. showed that, by mapping their dynamics to an equilibrium system at an effective temperature, some non-equilibrium steady-state systems are governed by a generalization of the fluctuation-dissipation theorem [46]. Thus, it is both interesting and practical to investigate whether the brain operates at a non-equilibrium steady state.

FIG. 7. Small changes in state probabilities imply steady-state dynamics. Change in state probabilities ṗi, normalized by the standard deviation σṗi, plotted as a function of the first two principal components at rest (a) and during the gambling task (b).

We establish in Figs. 1 and 4 that the brain operates out of equilibrium. To determine if the brain functions at a steady state, we must examine whether its state probabilities are stationary in time; that is, letting pi denote the probability of state i, we must determine whether ṗi = dpi/dt = 0 for all states i. The change in the probability of a state is equal to the net rate at which transitions flow into versus out of a state. For the two-dimensional dynamics studied in Fig. 1, this relation takes the form

$$\frac{dp_{(x,y)}}{dt} = \omega_{(x-1,y),(x,y)} - \omega_{(x,y),(x+1,y)} + \omega_{(x,y-1),(x,y)} - \omega_{(x,y),(x,y+1)}, \qquad (C1)$$

where ωij = (nij − nji)/ttot is the flux rate from state i to state j, nij is the number of observed transitions i → j, and ttot is the temporal duration of the time-series [23].

Here, we calculate the changes in state probabilities for both the rest and gambling scans. Across all states in both task conditions, we find that these changes are indistinguishable from zero when compared to statistical noise (Fig. 7). Specifically, the changes in state probabilities are much less than twice their standard deviations, indicating that they cannot be significantly distinguished from zero with a p-value less than 0.05. Combined with the results from Figs. 1 and 4, the stationarity of the neural state probabilities demonstrates that the brain operates at a non-equilibrium steady state.
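A sketch of this steady-state check for a 2D-binned trajectory, implementing Eq. (C1) with the same flux rates as in Eq. (A1); the names are hypothetical:

```python
import numpy as np
from collections import defaultdict

def probability_change(xy, t_tot):
    """Rate of change dp/dt of each 2D state's probability via Eq. (C1):
    net flux into a grid state minus net flux out of it. Values that are
    consistent with zero, relative to bootstrap error bars, indicate that
    the dynamics are at a (possibly non-equilibrium) steady state."""
    counts = defaultdict(int)
    for a, b in zip(map(tuple, xy[:-1]), map(tuple, xy[1:])):
        counts[(a, b)] += 1

    def omega(a, b):
        return (counts[(a, b)] - counts[(b, a)]) / t_tot

    dp = {}
    for s in set(map(tuple, xy)):
        x, y = s
        dp[s] = (omega((x - 1, y), s) - omega(s, (x + 1, y))
                 + omega((x, y - 1), s) - omega(s, (x, y + 1)))
    return dp
```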

Appendix D: Shuffling time-series restores detailed balance

In Fig. 1, we demonstrate that the brain operates out of equilibrium by exhibiting net fluxes between states. These fluxes break detailed balance and establish an arrow of time. Here we demonstrate that if the arrow of time is destroyed by shuffling the order of the neural time-series, then the fluxes vanish and equilibrium is restored. Specifically, for both the rest and gambling task scans, we generate 100 surrogate time-series with the order of the data randomly shuffled. Averaging across these shuffled time-series, we find that the fluxes between states are vanishingly small compared to statistical noise (Fig. 8), thus illustrating that the system has returned to equilibrium. We remark that other common surrogate data techniques, such as the random phases and amplitude adjusted Fourier transform surrogates, are not applicable here because they preserve the temporal structure of the time-series data [47].
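A sketch of this surrogate test for a coarse-grained state sequence; unlike the noise-floor bootstrap of Appendix A (which resamples points with replacement), a permutation preserves the state histogram exactly while destroying temporal order. The interface is our own illustration:

```python
import numpy as np

def shuffle_test(states, estimator, n_shuffles=100, seed=0):
    """Permute the time ordering of a state sequence and re-estimate the
    entropy production; with detailed balance restored, the values should
    scatter around the finite-data noise floor."""
    rng = np.random.default_rng(seed)
    vals = [estimator(rng.permutation(states)) for _ in range(n_shuffles)]
    return np.mean(vals), np.std(vals)
```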

Appendix E: Bounding entropy production using hierarchical clustering

Complex systems are often high-dimensional, with the number of possible states or configurations growing exponentially with the size of the system. In order to estimate the information entropy production Sinfo of a complex system, we must reduce the number of states through the use of coarse-graining, or dimensionality reduction, techniques. Interestingly, the entropy production admits a number of strong properties under coarse-graining [17, 18, 28, 29]. Of particular interest is the fact that the entropy production can only decrease under coarse-graining [17]. Specifically, given two descriptions of a system, a "microscopic" description with states {i} and a "macroscopic" description with states {i′}, we say that the second description is a coarse-graining of the first if there exists a surjective map from the microstates {i} to the macrostates {i′} [that is, if each microstate i is mapped to a unique macrostate i′; Fig. 9(a)]. Given such a coarse-graining, Esposito showed [17] that the entropy production of the macroscopic description S′ can be no larger than that of the microscopic description S; in other words, the coarse-grained entropy production provides a lower bound for the original value, such that S′ ≤ S.

FIG. 8. Shuffled data do not exhibit net fluxes between brain states. Probability distribution (color) and nearly imperceptible fluxes between states (arrows) for neural dynamics, which are shuffled and projected onto the first two principal components, both at rest (a) and during a gambling task (b). The flux scale is indicated in the upper right, and the disks represent two-standard-deviation confidence intervals for fluxes estimated using trajectory bootstrapping (see Appendix A).

The monotonic decrease of the entropy production under coarse-graining implies two desirable mathematical results. First, if one finds that any coarse-grained description of a system is out of equilibrium (that is, if the coarse-grained entropy production is significantly greater than zero), then one has immediately established that the full microscopic system is out of equilibrium (since the physical microscopic entropy production is at least as large as the coarse-grained value). We use this fact to show – only by studying coarse-grained dynamics – that the brain fundamentally operates far from equilibrium (Fig. 4).

FIG. 9. Hierarchy of lower bounds on the entropy production. (a) Coarse-graining is defined by a surjective map from a set of microstates {i} to a set of macrostates {i′}. Under coarse-graining the entropy production can only decrease or remain the same. (b) In hierarchical clustering, states are iteratively combined to form new coarse-grained states (or clusters). Each iteration defines a coarse-graining from k states to k − 1 states, thereby forming a hierarchy of lower bounds on the entropy production.

Second, here we show that hierarchical clustering provides a hierarchy of lower bounds on the physical entropy production. In hierarchical clustering, each cluster (or coarse-grained state) at one level of description (with k clusters) maps to a unique cluster at the level below [with k − 1 clusters; Fig. 9(b)]. This process can either be carried out by starting with a large number of clusters and then iteratively picking pairs of clusters to combine (known as agglomerative clustering), or by starting with a small number of clusters and then iteratively picking one cluster to split into two (known as divisive clustering, which we employ in our analysis) [48]. In both cases, the mapping from k clusters to k − 1 clusters is surjective, thereby defining a coarse-graining of the system. Thus, letting S(k) denote the entropy production estimated with k clusters, hierarchical clustering defines a hierarchy of lower bounds on the true entropy production S:

$$0 = S^{(1)} \leq S^{(2)} \leq S^{(3)} \leq \cdots \leq S. \qquad (E1)$$

This hierarchy, in turn, demonstrates that the estimated entropy production S(k) becomes more accurate with increasing k.

We remark that the discussion above neglects finite data effects. We recall that estimating the entropy production requires first estimating the transition probabilities Pij from state i to state j. This means that for k clusters, one must estimate k^2 different probabilities. Thus, while increasing k improves the accuracy of the estimated entropy production in theory, in practice increasing k eventually leads to sampling issues that decrease the accuracy of the estimate. Given these competing influences, when analyzing real data the goal should be to choose k such that it is as large as possible while still providing accurate estimates of the transition probabilities. We discuss how to choose k in a reasonable manner in the following section.

Appendix F: Choosing the number of coarse-grained states

As discussed above, when calculating the entropy production, we wish to choose a number of coarse-grained states k that is as large as possible while still arriving at an accurate estimate of the transition probabilities. One simple condition for estimating each transition probability Pij is that we observe the transition i → j at least once in the time-series. For all of the different tasks, Fig. 10(a) shows the fraction of the k^2 state transitions that are left unobserved after coarse-graining with k clusters. We find that k = 8 is the largest number of clusters for which the fraction of unobserved transitions equals zero (within statistical errors) for all tasks; that is, the largest number of clusters for which all state transitions across all tasks were observed at least once. This is the primary reason why we used k = 8 coarse-grained states to analyze the brain's entropy production (Fig. 4).
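This selection criterion might be sketched as follows, assuming an integer-labeled state sequence; the function is our own illustration:

```python
import numpy as np

def unobserved_fraction(states, k):
    """Fraction of the k^2 possible transitions i -> j that never occur
    in a coarse-grained state sequence; k is then chosen as the largest
    number of clusters for which this fraction is zero in every task."""
    states = np.asarray(states)
    seen = np.zeros((k, k), dtype=bool)
    seen[states[:-1], states[1:]] = True   # mark each observed transition
    return 1.0 - seen.mean()
```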

Interestingly, we find that k = 8 coarse-grained states is a good choice for two additional reasons. The first comes from studying the amount of variance explained by k clusters [Fig. 10(b)]. We find that the increase in explained variance from k − 1 to k clusters is roughly constant for k = 3 and 4, then k = 5 to 8, and then k = 9 to 16. This pattern means that k = 4, 8, and 16 are natural choices for the number of coarse-grained states, since any further increase (say from k = 8 to 9) will yield a smaller improvement in explained variance. Similarly, the second reason for choosing k = 8 comes from studying the average distance between states within a cluster, which is known as the dispersion [Fig. 10(c)]. Intuitively, a coarse-grained description with low dispersion provides a good fit of the observed data. Similar to the explained variance, we find that the decrease in dispersion from k − 1 to k clusters is nearly constant for k = 3 to 4, then k = 5 to 8, and then k = 9 to 16, once again suggesting that k = 4, 8, and 16 are natural choices for the number of clusters. Together, these results demonstrate that the coarse-grained description with k = 8 states provides a good fit to the neural time-series data while still allowing for an accurate estimate of the entropy production in each task.

FIG. 10. Choosing the number of coarse-grained states k. (a) Fraction of the k^2 state transitions that remain unobserved after hierarchical clustering with k clusters for the different tasks. Error bars represent two standard deviations over 100 bootstrap trajectories for each task. (b) Percent variance explained (top) and the increase in explained variance from k − 1 to k clusters (bottom) as functions of k. (c) Dispersion, or the average distance between data points within a cluster (top), and the decrease in dispersion from k − 1 to k clusters (bottom) as functions of k.

Appendix G: Flux networks: Visualizing fluxes between coarse-grained states

In Fig. 4, we demonstrated that the brain has the capacity to operate at a wide range of distances from equilibrium. We did so by estimating the amount of entropy the brain produces during different cognitive tasks. In addition to investigating the entropy production, one can also examine the specific neural processes underlying the brain's non-equilibrium behavior, which are encoded in the fluxes between coarse-grained states.

We find that each of the k = 8 states corresponds to high-amplitude activity in one or two cognitive systems [21] [Fig. 11(a)]. For each task, we can visualize the pattern of fluxes as a network, with nodes representing the coarse-grained states and directed edges reflecting net fluxes between states [Fig. 11(b-i)]. These flux networks illustrate, for example, that the brain nearly obeys detailed balance during rest [Fig. 11(b)]. Interestingly, in the emotion, working memory, social, relational, and gambling tasks [Fig. 11(c-e,g,h)] – all of which involve visual stimuli – the strongest fluxes connect visual (VIS) states. By contrast, these fluxes are weak in the language task [Fig. 11(f)], which only involves auditory stimuli. Finally, in the motor task, wherein subjects are prompted to make physical movements, the dorsal attention (DAT) state mediates fluxes between disparate parts of the network [Fig. 11(i)], perhaps reflecting the role of the DAT system in directing goal-oriented attention [49, 50]. In this way, the brain's non-equilibrium dynamics are not driven by a single underlying mechanism, but rather emerge from a complex pattern of fluxes that changes depending on the task. Examining the structural properties and cognitive neuroscientific interpretations of these flux networks is an important direction for future studies.

Appendix H: Testing the Markov assumption

Thus far, we have employed a definition of entropy production [Eq. (1)] that relies on the assumption that the time-series is Markovian; that is, that the state $x_t$ of the system at time t depends only on the previous state $x_{t-1}$ at time t − 1. For real time-series data, however, the dynamics may not be Markovian, and Eq. (1) is not exact. In general, the entropy production (per trial) is given by [29, 51]

\[
S_{\text{info}} = \lim_{\ell \to \infty} \frac{1}{\ell} \sum_{i_1, \ldots, i_{\ell+1}} P_{i_1, \ldots, i_{\ell+1}} \log \frac{P_{i_1, \ldots, i_{\ell+1}}}{P_{i_{\ell+1}, \ldots, i_1}}, \tag{H1}
\]

where $P_{i_1, \ldots, i_{\ell+1}} = \mathrm{Prob}[x_{t-\ell} = i_1, \ldots, x_t = i_{\ell+1}]$ is the probability of observing the sequence of states $i_1, \ldots, i_{\ell+1}$.



FIG. 11. Flux networks reveal non-equilibrium dynamics unique to each cognitive task. (a) Coarse-grained brain states calculated using hierarchical clustering (k = 8), with surface plots indicating the z-scored activation of different brain regions. For each state, we calculate the cosine similarity between its high-amplitude (green) and low-amplitude (orange) components and seven pre-defined neural systems [21]: default mode (DMN), frontoparietal (FPN), visual (VIS), somatomotor (SOM), dorsal attention (DAT), ventral attention (VAT), and limbic (LIM). We label each state according to its largest high-amplitude cosine similarities; the eight resulting labels are VIS/LIM, VIS, VIS/DAT, DMN_1, DMN_2, SOM, FPN, and DAT. (b-i) Flux networks illustrating the fluxes between the eight coarse-grained states at rest (b) and during seven cognitive tasks: emotional processing (c), working memory (d), social inference (e), language processing (f), relational matching (g), gambling (h), and motor execution (i). Edge weights indicate flux rates, and fluxes are only included if they are significant relative to the noise floor induced by the finite data length (one-sided t-test, p < 0.001).



FIG. 12. Second-order approximation of entropy production in the brain. (a) Second-order entropy production at rest and during seven cognitive tasks (dark bars), estimated using hierarchical clustering with k = 8 clusters. For comparison, we also include the first-order entropy productions from Fig. 4(a) (light bars). (b) Second-order entropy production as a function of response rate for the tasks listed in panel (a) (dark points). Each response induces an average 0.07 ± 0.03 bits of produced entropy (Pearson correlation r = 0.770, p = 0.026). For comparison, we include the first-order entropy productions from Fig. 4(b) (light points). (c) We find a significant difference in the second-order entropy production between low cognitive load and high cognitive load conditions in the working memory task (dark bars), where low and high loads represent 0-back and 2-back conditions, respectively (one-sided t-test, p < 0.001, t > 10, df = 198). For comparison, we include the first-order entropy productions from Fig. 4(c) (light bars). Across all panels, second-order entropy productions [calculated using Eq. (H2)] are divided by the fMRI repetition time ∆t = 0.72 s to compute an entropy production rate, and error bars reflect two standard deviations estimated using trajectory bootstrapping (see Appendix A).

If the dynamics are Markovian, for example, then the limit converges for ℓ = 1 and we recover Eq. (1) [29]. In general, one can approximate Eq. (H1) by evaluating the function inside the limit for ℓ as large as possible. In order to do so, however, one must estimate k^{ℓ+1} different probabilities for a system with k states. Thus, given data limitations, it is often impractical to estimate the entropy production beyond the Markov approximation (ℓ = 1).
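To make the estimation problem concrete, here is a minimal plug-in estimator for the quantity inside the limit in Eq. (H1); with ell = 1 it reduces to the Markov estimate of Eq. (1). This is an illustrative sketch rather than the exact implementation used here; in particular, it simply skips sequences whose time-reverse is never observed, the convention adopted below for Eq. (H2).

import numpy as np
from collections import Counter

def entropy_production(states, ell=1):
    """Plug-in estimate of Eq. (H1) at order ell, in bits per time step.
    ell = 1 recovers the Markov approximation of Eq. (1); dividing by the
    repetition time (0.72 s here) converts the result to a rate."""
    seqs = Counter(tuple(states[t:t + ell + 1]) for t in range(len(states) - ell))
    total = sum(seqs.values())
    S = 0.0
    for seq, n in seqs.items():
        n_rev = seqs.get(seq[::-1], 0)
        if n_rev == 0:
            continue  # reverse sequence unobserved: term ignored, as for Eq. (H2)
        S += (n / total) * np.log2(n / n_rev)  # P log2(P_forward / P_reverse)
    return S / ell

states = np.random.randint(0, 8, size=176)  # placeholder coarse-grained sequence
S1 = entropy_production(states, ell=1)      # first-order (Markov) estimate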

Here we demonstrate that the main conclusions about entropy production in the brain (summarized in Fig. 4) do not depend qualitatively on the Markov approximation in Eq. (1). To do so, we consider the second-order approximation

\[
S_{\text{info}} \approx \frac{1}{2} \sum_{i,j,k} P_{ijk} \log \frac{P_{ijk}}{P_{kji}}, \tag{H2}
\]

which incorporates information about sequences of length three. Just as we did under the Markov assumption (Fig. 4), we cluster the neural data using k = 8 coarse-grained states. Given that we are now required to estimate k^3 = 512 probabilities rather than just k^2 = 64, there are inevitably entries in the sum in Eq. (H2) that are infinite (i.e., those corresponding to reverse-time sequences k → j → i that are not observed in the time-series). As is common [29, 51], we simply ignore these terms.
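Continuing the sketch above, the second-order estimate is the ell = 2 case of the same estimator, and the fraction of ignored terms can be checked directly; again, this is illustrative only.

# Second-order (ell = 2) estimate, divided by the repetition time to get a rate
S2_rate = entropy_production(states, ell=2) / 0.72  # bits/s

# Fraction of observed length-3 sequences whose time-reverse is never seen
seqs = Counter(tuple(states[t:t + 3]) for t in range(len(states) - 2))
ignored = sum(n for s, n in seqs.items() if s[::-1] not in seqs) / sum(seqs.values())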


Across the different task settings, we find that the second-order entropy productions are nearly identical to the first-order (Markov) approximations presented in Fig. 12(a). Moreover, the second-order entropy production remains significantly correlated with the frequency of physical responses in different tasks, with each response still inducing an additional 0.07 ± 0.03 bits of produced entropy [Fig. 12(b)]. Finally, in the working memory task, the second-order entropy production remains larger for high-load conditions than low-load conditions [Fig. 12(c)], suggesting that cognitive demands drive the brain away from equilibrium. Together, these results demonstrate that the brain's entropy production is well-approximated by the Markov formulation in Eq. (1) and Fig. 4.

Appendix I: Varying the number of coarse-grained states

In Appendix F, we presented methods for choosing the number of coarse-grained states k, concluding that k = 8 is an appropriate choice for our neural data. However, it is important to check that the entropy production results from Fig. 4 do not vary significantly with our choice of k. In Fig. 13(a), we plot the estimated entropy production for each task setting (including rest) as a function of the number of coarse-grained states k. We find that the tasks maintain approximately the same ordering across all choices of k considered, with the brain producing the least entropy during rest, the most entropy during the motor task, and the second most entropy during the gambling task. Furthermore, we find that the correlation between entropy production and physical response rate [Fig. 4(b)] remains significant for all k ≤ 8 [that is, for all choices of k for which we observe all transitions at least once in each task; Fig. 10(a)] as well as k = 9, 11, 12, 13, and 14 [Fig. 13(b)]. We remark that we do not study the case k = 2 because the entropy production is zero by definition for two-state systems [Fig. 13(a)]. Finally, we confirm that the brain produces significantly more entropy during high-cognitive-load conditions than low-cognitive-load conditions in the working memory task [Fig. 4(c)] for all choices of k considered [Fig. 13(c)]. Together, these results demonstrate that the relationships between entropy production and physical and cognitive effort are robust to reasonable variation in the number of coarse-grained states k.

Appendix J: Robustness to head motion and signal variance

The brain's entropy production is significantly correlated with the frequency of physical responses [Fig. 4(b)] and increases during periods of cognitive exertion [Fig. 4(c)]. Here, we show that the effects of physical and cognitive effort on entropy production cannot be explained


FIG. 13. Entropy production in the brain at different levels of coarse-graining. (a) Entropy production at rest and during seven cognitive tasks as a function of the number of clusters k used in the hierarchical clustering. The raw entropy production [Eq. (1)] is divided by the fMRI repetition time ∆t = 0.72 s to compute an entropy production rate, and error bars reflect two standard deviations estimated using trajectory bootstrapping (see Appendix A). (b) Slope of the linear relationship between entropy production and physical response rate across tasks for different numbers of clusters k. Error bars represent one-standard-deviation confidence intervals of the slope and asterisks indicate the significance of the Pearson correlation between entropy production and response rate (*p < 0.05, **p < 0.01). (c) Difference between the entropy production during high-load and low-load conditions of the working memory task as a function of the number of clusters k. Error bars represent two standard deviations estimated using trajectory bootstrapping (see Appendix A), and the entropy production difference is significant across all values of k (one-sided t-test, p < 0.001).



FIG. 14. Entropy production in the brain cannot be explained by head movement or signal variance. Entropy production versus the average DVARS (a) and the variance of the neural time-series (b) at rest and during seven cognitive tasks; neither correlation is significant [(a) r = 0.199, p = 0.637; (b) r = −0.312, p = 0.452]. Across both panels, entropy productions are estimated using hierarchical clustering with k = 8 clusters and are divided by the fMRI repetition time ∆t = 0.72 s to compute entropy production rates. Error bars reflect two standard deviations estimated using trajectory bootstrapping (see Appendix A).

by head movement within the scanner (a common confound in fMRI studies [33]) or by variance in the neural time-series. To quantify head movement, for each time point in every time-series, we compute the spatial standard deviation of the difference between the current image and the previous image. This quantity, known as DVARS, is a common measure of head movement in fMRI data [52]. Importantly, we find that entropy production is not significantly correlated with the average DVARS within each task [Fig. 14(a)], thereby demonstrating that the relationship between entropy production and physical response rate is not simply due to the confound of subject head movement within the scanner. Additionally, we find that entropy production is not significantly correlated with the variance of the neural data within each task [Fig. 14(b)]. This final result establishes that our entropy production estimates are not simply driven by variations in the amount of noise in the neural data across different tasks.
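For reference, the DVARS computation described above amounts to a few lines. The following sketch (with a placeholder image array) follows the verbal definition given here rather than any particular software package.

import numpy as np

def dvars(images):
    """Spatial standard deviation of the difference between each image and
    the previous one; returns one value per time point t = 1, ..., T-1."""
    diffs = np.diff(images, axis=0)  # current image minus previous image
    return diffs.std(axis=1)         # standard deviation across voxels/regions

images = np.random.randn(176, 100)   # placeholder (time points x regions)
avg_dvars = dvars(images).mean()     # per-scan average, as plotted in Fig. 14(a)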

Appendix K: Data processing

The resting, emotional processing, working memory, social inference, language processing, relational matching, gambling, and motor execution fMRI scans are from the S1200 Human Connectome Project release [20, 32]. Brains were normalized to fslr32k via the MSM-AII registration with 100 regions [53]. CompCor, with five principal components from the ventricles and white matter masks, was used to regress out nuisance signals from the time series. In addition, the 12 detrended motion estimates provided by the Human Connectome Project were regressed out from the regional time series. The mean global signal was removed and then time series were band-pass filtered from 0.009 to 0.08 Hz. Then, frames with greater than 0.2 mm frame-wise displacement or a derivative root mean square (DVARS) above 75 were removed as outliers. We filtered out sessions composed of greater than 50 percent outlier frames, and we only analyzed data from subjects that had all scans remaining after this filtering, leaving 590 individuals. The processing pipeline used here has previously been suggested to be ideal for removing false relations between neural dynamics and behavior [54]. Finally, for each subject and each scan, we only analyze the first 176 time points, corresponding to the length of the shortest task (emotional processing); this truncation controls for the possibility of data size affecting comparisons across tasks.
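A schematic of the filtering and censoring steps is given below. It is a simplified sketch under stated assumptions — a regional time-series array ts sampled at the 0.72 s repetition time, with per-frame displacement (fd) and DVARS (dv) vectors already computed — not the HCP or CompCor pipeline itself, and the order of operations is illustrative.

import numpy as np
from scipy.signal import butter, filtfilt

TR = 0.72       # fMRI repetition time (s)
fs = 1.0 / TR   # sampling frequency (Hz)

ts = np.random.randn(200, 100)     # placeholder regional time series
fd = np.abs(np.random.randn(200))  # placeholder frame-wise displacement (mm)
dv = 60 + np.random.randn(200)     # placeholder per-frame DVARS

# Remove the mean global signal, then band-pass filter 0.009-0.08 Hz
ts = ts - ts.mean(axis=1, keepdims=True)
b, a = butter(4, [0.009 / (fs / 2), 0.08 / (fs / 2)], btype='band')
ts = filtfilt(b, a, ts, axis=0)

# Censor outlier frames: FD > 0.2 mm or DVARS > 75
keep = (fd <= 0.2) & (dv <= 75)
if keep.mean() >= 0.5:      # discard sessions with > 50% outlier frames
    ts = ts[keep][:176]     # truncate to the shortest task length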

[1] Erwin Schrodinger, What is life? The physical aspect of the living cell and mind (Cambridge University Press, Cambridge, 1944).

[2] FS Gnesotto, Federica Mura, Jannes Gladrow, and CP Broedersz, "Broken detailed balance and non-equilibrium dynamics in living systems: A review," Rep. Prog. Phys. 81, 066601 (2018).

[3] Sadi Carnot, Reflexions sur la puissance motrice du feu (Bachelier, Paris, France, 1824).

[4] Clifford P Brangwynne, Gijsje H Koenderink, Frederick C MacKintosh, and David A Weitz, "Cytoplasmic diffusion: Molecular motors mix it up," J. Cell Biol. 183, 583–587 (2008).

[5] Hong Yin, Irina Artsimovitch, Robert Landick, and Jeff Gelles, "Nonequilibrium mechanism of transcription termination from observations of single RNA polymerase molecules," Proc. Natl. Acad. Sci. 96, 13124–13129 (1999).


[6] Kerwyn Casey Huang, Yigal Meir, and Ned S Wingreen, "Dynamic structures in Escherichia coli: Spontaneous formation of MinE rings and MinD polar zones," Proc. Natl. Acad. Sci. 100, 12724–12728 (2003).

[7] Pankaj Mehta and David J Schwab, "Energetic costs of cellular computation," Proc. Natl. Acad. Sci. 109, 17978–17982 (2012).

[8] Ganhui Lan, Pablo Sartori, Silke Neumann, Victor Sourjik, and Yuhai Tu, "The energy–speed–accuracy trade-off in sensory adaptation," Nat. Phys. 8, 422 (2012).

[9] Marina Soares e Silva, Martin Depken, Bjorn Stuhrmann, Marijn Korsten, Fred C MacKintosh, and Gijsje H Koenderink, "Active multistage coarsening of actin networks driven by myosin motors," Proc. Natl. Acad. Sci. 108, 9408–9413 (2011).

[10] Bjorn Stuhrmann, Marina Soares e Silva, Martin Depken, Frederick C MacKintosh, and Gijsje H Koenderink, "Nonequilibrium fluctuations of a remodeling in vitro cytoskeleton," Phys. Rev. E 86, 020901 (2012).

[11] Claude E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J. 27, 379–423 (1948).

[12] Julia J Harris, Renaud Jolivet, and David Attwell, "Synaptic energy use and supply," Neuron 75, 762–777 (2012).

[13] Christopher W Lynn and Danielle S Bassett, "The physics of brain network structure, function and control," Nat. Rev. Phys. 1, 318 (2019).

[14] Maria Erecinska and Ian A Silver, "ATP and brain function," J. Cereb. Blood Flow Metab. 9, 2–19 (1989).

[15] K Norberg and BK Siejo, "Cerebral metabolism in hypoxic hypoxia. II. Citric acid cycle intermediates and associated amino acids," Brain Res. 86, 45–54 (1975).

[16] Fei Du, Xiao-Hong Zhu, Yi Zhang, Michael Friedman, Nanyin Zhang, Kamil Ugurbil, and Wei Chen, "Tightly coupled brain activity and cerebral ATP metabolic rate," Proc. Natl. Acad. Sci. 105, 6409–6414 (2008).

[17] Massimiliano Esposito, "Stochastic thermodynamics under coarse graining," Phys. Rev. E 85, 041125 (2012).

[18] Ignacio A Martínez, Gili Bisker, Jordan M Horowitz, and Juan MR Parrondo, "Inferring broken detailed balance in the absence of observable currents," Nat. Commun. 10, 1–10 (2019).

[19] David A Egolf, "Equilibrium regained: From nonequilibrium chaos to statistical mechanics," Science 287, 101–104 (2000).

[20] David C Van Essen, Stephen M Smith, Deanna M Barch, Timothy EJ Behrens, Essa Yacoub, Kamil Ugurbil, Wu-Minn HCP Consortium, et al., "The WU-Minn Human Connectome Project: An overview," Neuroimage 80, 62–79 (2013).

[21] BT Thomas Yeo, Fenna M Krienen, Jorge Sepulcre, Mert R Sabuncu, Danial Lashkari, Marisa Hollinshead, Joshua L Roffman, Jordan W Smoller, Lilla Zollei, Jonathan R Polimeni, et al., "The organization of the human cerebral cortex estimated by intrinsic functional connectivity," J. Neurophysiol. 106, 1125–1165 (2011).

[22] John P Cunningham and M Yu Byron, "Dimensionality reduction for large-scale neural recordings," Nat. Neurosci. 17, 1500 (2014).

[23] Christopher Battle, Chase P Broedersz, Nikta Fakhri, Veikko F Geyer, Jonathon Howard, Christoph F Schmidt, and Fred C MacKintosh, "Broken detailed balance at mesoscopic scales in active biological systems," Science 352, 604–607 (2016).

[24] RKP Zia and B Schmittmann, "Probability currents as principal characteristics in the statistical mechanics of non-equilibrium steady states," J. Stat. Mech. 2007, P07012 (2007).

[25] M Newman and G Barkema, Monte Carlo methods in statistical physics (Oxford University Press, New York, USA, 1999).

[26] Penelope Kale, Andrew Zalesky, and Leonardo L Gollo, "Estimating the impact of structural directionality: How reliable are undirected connectomes?" Netw. Neurosci. 2, 259–284 (2018).

[27] David Sherrington and Scott Kirkpatrick, "Solvable model of a spin-glass," Phys. Rev. Lett. 35, 1792 (1975).

[28] Udo Seifert, "Entropy production along a stochastic trajectory and an integral fluctuation theorem," Phys. Rev. Lett. 95, 040602 (2005).

[29] Edgar Roldan and Juan MR Parrondo, "Estimating dissipation from single stationary trajectories," Phys. Rev. Lett. 105, 150607 (2010).

[30] Thomas M Cover and Joy A Thomas, Elements of information theory (John Wiley & Sons, 2012).

[31] Sid Lamrous and Mounira Taileb, "Divisive hierarchical k-means," in CIMCA (IEEE, 2006) pp. 18–18.

[32] Deanna M Barch, Gregory C Burgess, Michael P Harms, Steven E Petersen, Bradley L Schlaggar, Maurizio Corbetta, Matthew F Glasser, Sandra Curtiss, Sachin Dixit, Cindy Feldt, et al., "Function in the human connectome: Task-fMRI and individual differences in behavior," Neuroimage 80, 169–189 (2013).

[33] Karl J Friston, Steven Williams, Robert Howard, Richard SJ Frackowiak, and Robert Turner, "Movement-related effects in fMRI time-series," Magn. Reson. Med. 35, 346–355 (1996).

[34] Claudio Castellano, Santo Fortunato, and Vittorio Loreto, "Statistical physics of social dynamics," Rev. Mod. Phys. 81, 591 (2009).

[35] J Matias Palva, Alexander Zhigalov, Jonni Hirvonen, Onerva Korhonen, Klaus Linkenkaer-Hansen, and Satu Palva, "Neuronal long-range temporal correlations and avalanche dynamics are correlated with behavioral scaling laws," Proc. Natl. Acad. Sci. 110, 3585–3590 (2013).

[36] Gijsje H Koenderink, Zvonimir Dogic, Fumihiko Nakamura, Poul M Bendix, Frederick C MacKintosh, John H Hartwig, Thomas P Stossel, and David A Weitz, "An active biopolymer network controlled by molecular motors," Proc. Natl. Acad. Sci. 106, 15192–15197 (2009).

[37] Linda Van Aelst and Crislyn D'Souza-Schorey, "Rho GTPases and signaling networks," Genes Dev. 11, 2295–2322 (1997).

[38] Jordan D Dworkin, Kristin A Linn, Erin G Teich, Perry Zurn, Russell T Shinohara, and Danielle S Bassett, "The extent and drivers of gender imbalance in neuroscience reference lists," arXiv preprint arXiv:2001.01002 (2020).

[39] Neven Caplar, Sandro Tacchella, and Simon Birrer, "Quantitative evaluation of gender bias in astronomical publications from citation counts," Nat. Astron. 1, 1–5 (2017).

[40] Eli J Cornblath, Arian Ashourvan, Jason Z Kim, Richard F Betzel, Rastko Ciric, Graham L Baum, Xiaosong He, Kosha Ruparel, Tyler M Moore, Ruben C Gur, et al., "Temporal sequences of brain activity at rest are constrained by white matter structure and modulated by cognitive demands," Communications Biology, in press.


[41] Xiao Liu and Jeff H Duyn, "Time-varying functional network information extracted from brief instances of spontaneous brain activity," Proc. Natl. Acad. Sci. 110, 4392–4397 (2013).

[42] Marcus E Raichle, "Behind the scenes of functional brain imaging: A historical and physiological perspective," Proc. Natl. Acad. Sci. 95, 765–772 (1998).

[43] Xiaomu Song, Tongyou Ji, and Alice M Wyrwicz, "Baseline drift and physiological noise removal in high field fMRI data using kernel PCA," in 2008 IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE, 2008) pp. 441–444.

[44] Udo Seifert, "Stochastic thermodynamics, fluctuation theorems and molecular machines," Rep. Prog. Phys. 75, 126001 (2012).

[45] Denis J Evans, Ezechiel Godert David Cohen, and Gary P Morriss, "Probability of second law violations in shearing steady states," Phys. Rev. Lett. 71, 2401 (1993).

[46] E Dieterich, J Camunas-Soler, M Ribezzi-Crivellari, U Seifert, and F Ritort, "Single-molecule measurement of the effective temperature in non-equilibrium steady states," Nat. Phys. 11, 971–977 (2015).

[47] Gemma Lancaster, Dmytro Iatsenko, Aleksandra Pidde, Valentina Ticcinelli, and Aneta Stefanovska, "Surrogate data for hypothesis testing of physical systems," Phys. Rep. 748, 1–60 (2018).

[48] Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn, and Claire Mathieu, "Hierarchical clustering: Objective functions and algorithms," in Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (SIAM, 2018) pp. 378–397.

[49] Michael D Fox, Maurizio Corbetta, Abraham Z Snyder, Justin L Vincent, and Marcus E Raichle, "Spontaneous neuronal activity distinguishes human dorsal and ventral attention systems," Proc. Natl. Acad. Sci. 103, 10046–10051 (2006).

[50] Simone Vossel, Joy J Geng, and Gereon R Fink, "Dorsal and ventral attention systems: Distinct neural circuits but collaborative roles," Neuroscientist 20, 150–159 (2014).

[51] Edgar Roldan and Juan MR Parrondo, "Entropy production and Kullback-Leibler divergence between stationary trajectories of discrete systems," Phys. Rev. E 85, 031129 (2012).

[52] Jonathan D Power, Kelly A Barnes, Abraham Z Snyder, Bradley L Schlaggar, and Steven E Petersen, "Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion," Neuroimage 59, 2142–2154 (2012).

[53] Alexander Schaefer, Ru Kong, Evan M Gordon, Timothy O Laumann, Xi-Nian Zuo, Avram J Holmes, Simon B Eickhoff, and BT Thomas Yeo, "Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI," Cereb. Cortex 28, 3095–3114 (2018).

[54] Joshua S Siegel, Anish Mitra, Timothy O Laumann, Benjamin A Seitzman, Marcus Raichle, Maurizio Corbetta, and Abraham Z Snyder, "Data quality influences observed links between functional connectivity and behavior," Cereb. Cortex 27, 4492–4502 (2017).