# How to Describe Biological Systems Using Master Equations

**Introduction**

The master equation describes discrete Markov jump processes on a complex energy landscape and provides the appropriate formal setting for the study of stochastic processes. It contains information about all possible transitions among the discrete states in the phase space, together with the corresponding transition rates. The system moves between the discrete states much as a particle jumps between the nodes of a multi-dimensional lattice.

The master equation describes how the probability that the system is located at point *m* at time *t* changes: it increases due to transitions from other points *m’* to point *m* and decreases due to transitions from point *m* to other points. In principle, we can obtain the solution to the master equation using a matrix exponential. In practice, this approach is too costly to implement for all but very simple systems. To overcome this issue, one either works with approximations of the master equation or applies stochastic simulation techniques.
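As an illustration of the matrix exponential route, here is a minimal sketch for a two-state system, assuming the master equation is written as dp/dt = W p with p(t) = exp(W t) p(0). The rate constants `k01` and `k10`, the helper names, and the simple Taylor-series exponential are illustrative assumptions, not something taken from the text; for real problems a library routine would be preferable.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=30, squarings=10):
    """Approximate exp(a) by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    scale = 2 ** squarings
    scaled = [[x / scale for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes scaled**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)  # undo the scaling
    return result

# Hypothetical two-state system: 0 -> 1 at rate k01, 1 -> 0 at rate k10.
k01, k10 = 2.0, 1.0
# Generator acting on column vectors; each column sums to zero,
# which conserves total probability.
W = [[-k01, k10],
     [k01, -k10]]

def solve(p0, t):
    """Return p(t) = exp(W t) p0."""
    E = mat_exp([[w * t for w in row] for row in W])
    return [sum(E[i][j] * p0[j] for j in range(len(p0))) for i in range(len(p0))]

p = solve([1.0, 0.0], t=0.5)
print(p)
```

For two states the result can be checked against the closed form p0(t) = k10/(k01+k10) + k01/(k01+k10) · exp(−(k01+k10)t). The matrix has one row and column per state, which is why this direct route stops being practical once the state space grows.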

**How to build a master equation for a system**

It is not always clear how to build master equations for complex systems. Breuer and Petruccione proposed using the deterministic macroscopic equation as the starting point. The macroscopic variables are interpreted as the expected values of underlying stochastic variables, and the master equation is built so that the expected value of the stochastic process follows the macroscopic dynamics. This approach has two issues. First, in many systems the macroscopic equation is not known, and even if it is, the dynamics of the stochastic system might deviate significantly from the expected macroscopic trend. Second, it is not always possible to derive the transition rates between microscopic states from the macroscopic equations. Only for some ideal systems, such as well-stirred chemical systems, is it possible to construct the master equation from rigorous considerations of microphysics.
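A small worked example may make this construction concrete. The system below (production at a constant rate, first-order degradation) and the symbols *k*, *γ*, *n* and *P(n,t)* are chosen here for illustration and do not come from the text. Starting from the macroscopic equation, one candidate master equation assigns a jump n → n+1 with rate *k* and a jump n → n−1 with rate *γn*:

```latex
% Macroscopic equation (assumed example)
\frac{dx}{dt} = k - \gamma x

% Candidate master equation, with P(-1,t) \equiv 0
\frac{dP(n,t)}{dt} = k\,P(n-1,t) + \gamma\,(n+1)\,P(n+1,t) - (k + \gamma n)\,P(n,t)

% Multiplying by n and summing over n recovers the macroscopic dynamics
\frac{d\langle n\rangle}{dt} = k - \gamma\,\langle n\rangle
```

Here the mean follows the macroscopic equation exactly because both rates are linear in *n*. With nonlinear rates the equation for the mean generally involves higher moments, which is one way the stochastic dynamics can deviate from the macroscopic trend. The choice of rates is also not forced by the macroscopic equation alone, which reflects the second issue above.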

**Chemical Master Equation**

Traditionally, chemical reactions in biological contexts are modeled by deterministic population-average equations, which are generally valid when the number of reacting molecules is large. However, many intracellular reactions involve only a small number of molecules and are therefore strongly subject to molecular-level noise. Consequently, a discrete and stochastic treatment of such reactions is highly desirable.

Application of the master equation formalism to chemical reactions in *well-stirred, thermodynamically stable* systems gives rise to the chemical master equation (CME). The CME describes the evolution of the discrete probability distribution of the number of molecules of each species in a reaction network. It accounts for individual reaction channels but not for individual molecules. One can view the process as a sequence of Markovian jumps in the state space, each corresponding to a change in the molecular composition of the system. Between the jumps, the large number of nonreactive collisions randomizes the positions of the molecules, ensuring that each pair of molecules is equally likely to be the next to react. This effectively leads to a loss of memory, ensuring the Markov property of the system. A rigorous derivation of the CME from basic Newtonian physics and thermodynamics was first offered by Gillespie.
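For reference, the CME is commonly written in the following form. The notation is an assumption introduced here (it follows common usage rather than anything defined in the text): **x** is the vector of molecule counts, *M* is the number of reaction channels, *a_j* is the propensity function of channel *j*, and **ν**_j is the change in **x** caused by one firing of channel *j*.

```latex
\frac{\partial P(\mathbf{x},t)}{\partial t}
  = \sum_{j=1}^{M} \Big[ a_j(\mathbf{x}-\boldsymbol{\nu}_j)\,P(\mathbf{x}-\boldsymbol{\nu}_j,t)
  - a_j(\mathbf{x})\,P(\mathbf{x},t) \Big]
```

The first term in the sum collects the probability flowing into state **x** from states one reaction away, and the second term is the probability flowing out of **x**. There is one such equation for every reachable state **x**.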

The CME has been applied to describe a large number of biological processes, including DNA breathing, molecular motors, polymerization kinetics and protein denaturation dynamics. Most notably, the CME is frequently used to model gene expression (Arkin et al., 1998). Biological processes are, however, much more complex than simple chemical reactions, and a rigorous derivation of the propensity functions for such processes is not always possible. Consequently, assumptions and simplifications have to be made to complete the model. For instance, in gene expression models, the propensity function for transcription is often assumed to be proportional to the number of active genes. This representation is a rough approximation of the complex and nonlinear interactions between RNA polymerase and active genes, and of the possible involvement of activators and repressors that may facilitate or inhibit further transcription.
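To show what such a simplified model can look like in practice, here is a minimal sketch of propensity functions for a gene expression network in which transcription is proportional to the number of active genes. The four reaction channels, the rate constants in `RATES`, and all names are hypothetical choices made for illustration, not parameters from any cited model.

```python
# Hypothetical rate constants (per unit time); values are illustrative only.
RATES = {"k_tx": 0.5, "k_tl": 2.0, "d_m": 0.1, "d_p": 0.05}

# State-change vectors for (active_genes, mrna, protein), one per channel.
STOICHIOMETRY = [
    (0, 1, 0),   # transcription: gene -> gene + mRNA
    (0, 0, 1),   # translation: mRNA -> mRNA + protein
    (0, -1, 0),  # mRNA degradation
    (0, 0, -1),  # protein degradation
]

def propensities(state, rates=RATES):
    """Return one propensity per reaction channel for the given state."""
    active_genes, mrna, protein = state
    return [
        rates["k_tx"] * active_genes,  # assumed linear in active genes
        rates["k_tl"] * mrna,
        rates["d_m"] * mrna,
        rates["d_p"] * protein,
    ]

a = propensities((2, 10, 100))
print(a)
```

The linear transcription term is exactly the simplification discussed above: regulation by activators or repressors would replace it with a nonlinear function of the state.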

The solution of the CME suffers from the so-called curse of dimensionality: there is one equation per reachable state, and the number of states grows exponentially with the number of species involved and increases with the number of molecules of each species. Thus, except for small and simple systems, it is extremely difficult to obtain solutions of the CME, either analytically or numerically. Three approaches have been taken to resolve this issue. The first approach searches for approximations of the CME. If the mean number of molecules involved is much larger than unity, one may approximate discrete molecular numbers with a continuous molecular concentration. This leads to the Fokker-Planck equation and the corresponding Langevin stochastic differential equations (SDEs). The second approach relies on exact stochastic simulations of the CME through the Gillespie algorithm or its improved variants. The third approach employs the fluctuation-dissipation theorem to examine broader classes of random processes collectively, without focusing on a particular set of assumptions.
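The second approach can be sketched with a minimal version of the Gillespie direct method. The birth-death test system (production at rate `k`, degradation at rate `gamma * n`), the function name `gillespie`, and the parameter values are assumptions made for illustration; improved variants of the algorithm are not shown.

```python
import random

def gillespie(x0, stoichiometry, propensity_fn, t_end, rng):
    """Gillespie direct method: return lists of jump times and states."""
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while True:
        a = propensity_fn(x)
        a_total = sum(a)
        if a_total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick channel j with probability a[j] / a_total.
        threshold = rng.random() * a_total
        cumulative = 0.0
        for j, a_j in enumerate(a):
            cumulative += a_j
            if threshold < cumulative:
                break
        # Apply the state change of the chosen channel.
        x = [xi + nu for xi, nu in zip(x, stoichiometry[j])]
        times.append(t)
        states.append(tuple(x))
    return times, states

# Hypothetical birth-death process: production at rate k, decay at rate gamma*n.
k, gamma = 10.0, 1.0
times, states = gillespie(
    x0=(0,),
    stoichiometry=[(1,), (-1,)],
    propensity_fn=lambda x: [k, gamma * x[0]],
    t_end=50.0,
    rng=random.Random(0),  # fixed seed so the run is reproducible
)
print(len(times), states[-1])
```

Each run produces a single trajectory, so statistics of the probability distribution have to be estimated from many independent runs or from long time averages. The cost per run scales with the number of reaction events rather than with the size of the state space, which is what makes simulation attractive when the CME itself is intractable.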

**References**

Lu, T., et al., *Cellular growth and division in the Gillespie algorithm.* Syst Biol (Stevenage), 2004. **1**(1): p. 121-8.

Baras, F. and M.M. Mansour, *Reaction-diffusion master equation: A comparison with microscopic simulations.* Physical Review. E. Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics, 1996. **54**(6): p. 6139-6148.

Lipkow, K., S.S. Andrews, and D. Bray, *Simulated diffusion of phosphorylated CheY through the cytoplasm of Escherichia coli.* J Bacteriol, 2005. **187**(1): p. 45-53.

Shnerb, N.M., et al., *The importance of being discrete: life always wins on the surface.* Proc Natl Acad Sci U S A, 2000. **97**(19): p. 10322-4.

Parrish, J.K. and L. Edelstein-Keshet, *Complexity, pattern, and evolutionary trade-offs in animal aggregation.* Science, 1999. **284**(5411): p. 99-101.

Samoilov, M.S. and A.P. Arkin, *Deviant effects in molecular reaction pathways.* Nat Biotechnol, 2006. **24**(10): p. 1235-40.

Ambjornsson, T., et al., *Master equation approach to DNA breathing in heteropolymer DNA.* Phys Rev E Stat Nonlin Soft Matter Phys, 2007. **75**(2 Pt 1): p. 021908.

Lattanzi, G. and A. Maritan, *Master equation approach to molecular motors.* Phys Rev E Stat Nonlin Soft Matter Phys, 2001. **64**(6 Pt 1): p. 061905.

Wang, H., C.S. Peskin, and T.C. Elston, *A robust numerical algorithm for studying biomolecular transport processes.* J Theor Biol, 2003. **221**(4): p. 491-51