## Theory and Life

To many students of mathematics and the physical sciences, the vastness of the life sciences is almost overwhelming. In this post, I want to give my current impressions of the life sciences, with particular emphasis on the role that modeling plays and on the area broadly referred to as theoretical biology. This is by no means an introduction to mathematical and theoretical biology, but it should at least give mathematically inclined readers some ideas about where to look for more information, and perhaps some awareness of the challenges and exciting opportunities that these interdisciplinary areas provide. As a brief disclaimer, I am by no means an expert in these areas, and so I will defer to experts wherever possible.

There are many good introductions to the various areas of modeling in biology. The broad distinctions are usually divided up by methodology, and hence tend to relate to particular subfields of mathematics, physics, and computational science. The finer divisions often correspond to subfields in the life sciences themselves. An incomplete and very rough list of the broad subdivisions would be:

• Mathematical Biology: This is a broad term for research usually done by mathematicians using techniques and thinking from applied mathematics to tackle problems in or motivated by biology.
• Biophysics: This is an area that grew out of interactions between physics and biochemistry, and can roughly be thought of as the application of physics to understanding biological processes.
• Biostatistics: This is the development and application of statistical techniques used to model problems in biology and medicine. Biostatisticians seem to be very applied in general, often contributing statistical analysis to experimental work.
• Computational Biology: This area develops and applies computational methods to simulate complex processes in the life sciences, as well as analyze patterns and other meaningful information from very large data sets.

This list is clearly missing many important areas. Some of them sit between two of the areas above. Bioinformatics, for example, is the use of powerful statistical and computational tools to analyze very large data sets, such as those from genome sequencing. Others, such as systems biology, draw on these fields in less obvious ways. One could write a very long post about the semantics and nomenclature of all of these fields. Instead, I will focus on the general interactions between the theoretical and experimental life sciences, and try to give you an idea of what meaningful progress in this direction looks like.

Modeling has always provided science with frameworks to abstract fundamental notions from the complex details of phenomena. Physics has a long history of reducing and explaining incredibly diverse processes using relatively simple mathematics. These models often target particular length and time scales, and exploit physically realistic assumptions such as nonrelativistic speeds. There has been some success in applying these approaches to the study of biological phenomena. Professor James Murray says the following in his book, Mathematical Biology:

> This is not to say that nature always acts in the simplest possible way: parsimony is a human construct, and evolution is an opportunistic process which builds on the available materials, not according to any global optimization scheme. However, when building models, it ill behooves the modeller to capriciously add complexity when simple mechanisms will do the job.
>
> The best way to decide between competing models is through experiment. We feel that one of the modeller’s jobs is to present the experimental biologist with a shopping list of possibilities which are consistent not only with the observations, but with the known laws of physics and chemistry.
>
> These models will suggest experiments, and guide further model building. We see modelling and experiment cooperating in a feedback loop, just as chemistry and mechanics do in our models, the combination being a more efficient tool for research than either one acting alone.
>
> How complicated should a model be? Consider the task of explaining to someone how a clock works. It would help, of course, if they understood the mechanics of gears and levers; however, to understand the clock you would have to simply describe it: this gear turns that one, and so on.
>
> Now this is not a very satisfactory way to understand a phenomenon; it is like having a road map with a scale of one mile equals one mile. ‘Understanding’ usually involves some simplified conceptual representation that captures the essential features, but omits the details or secondary phenomena. This is as good a definition as any of what constitutes a model.
>
> Just how simplified a model can be and still retain the salient aspects of the real world depends not only on the phenomenon, but how the model is to be used. In these chapters on mechanical aspects of morphogenesis we deal only with mathematical models; that is, phenomena which can be cast in the form of equations of a particular type.
>
> Mathematical models can be used to make detailed predictions of the future behaviour of a system (as we have seen). This can be done only when the phenomenon is rather simple; for complex systems the number of parameters that must be determined is so large that one is reduced to an exercise in curve fitting. The models we deal with in this book have a different goal. We seek to explain phenomena, not simply describe them.
>
> If one’s goal is explanation rather than description then different criteria must be applied. The most important criterion, in our view, was enunciated by Einstein: ‘A model should be as simple as possible. But no simpler.’ That is, a model should seek to explain the underlying principles of a phenomenon, but no more. We are not trying to fit data nor make quantitative predictions. Rather we seek to understand. Thus we ask only that our models describe qualitative features in the simplest possible way.
>
> Unfortunately, even with this modest (or ambitious) goal, the equations we deal with are probably more complicated than even most physical scientists are accustomed to. This is because the phenomena we are attempting to describe are generally more complex than most physical systems, although it may reflect our own ineptness in perceiving their underlying simplicity.

There are several worthwhile notions in this passage. Firstly, he distinguishes between two conceptions of modeling. On the one hand, predictive, quantitative models are useful in science for obvious reasons. Usually, these models are phenomenological, or otherwise treat the fundamental mechanisms of what is really going on as a black box. Statistical models, machine learning techniques such as neural networks, and parameter-heavy equation models all tend to fit into this category. The other type of model he describes is the very simple kind that captures the qualitative properties of the system being described. These models are often not as useful for prediction, but can be used to elucidate mechanisms or causes for various phenomena. Differential equations, recurrence relations, and other simple relations between quantities and their rates of change are often used here.
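To make the second kind of model concrete, here is a minimal sketch of a qualitative model: logistic growth, dN/dt = rN(1 - N/K), integrated with a simple forward Euler step. The equation is a standard textbook example rather than anything taken from Murray's passage, and the function name, parameter values, and step size are all illustrative assumptions.

```python
def simulate_logistic(n0, r=1.0, k=100.0, dt=0.01, steps=5000):
    """Integrate dN/dt = r*N*(1 - N/k) with forward Euler and return the final N.

    n0 is the initial population, r the growth rate, k the carrying capacity.
    All default values are arbitrary choices made for illustration.
    """
    n = n0
    for _ in range(steps):
        # Growth is roughly exponential for small n and shuts off as n nears k.
        n += dt * r * n * (1.0 - n / k)
    return n


if __name__ == "__main__":
    # Very different starting populations end up near the same value.
    for n0 in (1.0, 50.0, 150.0):
        print(n0, simulate_logistic(n0))
```

The point is not the numbers themselves but the qualitative behavior: any positive starting population settles at the carrying capacity, and that conclusion does not depend on fitting the parameters to data.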

In reality, there is a spectrum between these two types of models. Many statistical models use a priori knowledge about the system to create a better model. Many qualitative models give certain terms a phenomenological form, as we don’t yet understand the fundamentals of the life sciences well enough to know what form every interaction takes. There are also other modeling paradigms that don’t fit neatly between these two, such as agent-based models.

This leads me to the second thing that Murray points out: mathematical and theoretical biology is hard. This is a nontrivial point, and for the details I will defer to Professor Michael Reed’s article on the topic. Overall, there are two major reasons for this difficulty. Firstly, we don’t have nearly the same fundamental understanding of the life sciences as we do of the physical sciences. There are practical and historical reasons for this, and it may be that we will never have as concise and clean a general formulation of any area of biology as we do in physics. Classical mechanics is a powerful, general-purpose way of describing phenomena very accurately, and its analogues in the life sciences rarely have as much descriptive or explanatory power.

The second issue is that of complexity. Experimentally speaking, many physical systems (e.g. chemical reactions) can be isolated in order to investigate their fundamental interactions. Reductionist approaches seem harder to apply to living systems, because the overall interactions of a living system appear to have emergent properties that are far more complex than one would predict from investigating the constituent parts. There are of course some simple biological systems, and some very complex systems in the physical sciences, but this idea of complex system dynamics is central to many biological systems.

So one might ask, “Well, if theoretical explorations are difficult due to the inherent complexity, why bother pursuing them at all? Why spend time or large amounts of grant money researching this?” There are several good reasons why theoretical explorations of biological systems are worthwhile. Professor Reed outlines a few in the article linked above, as does Professor Murray in the book cited above. Below, I will give my personal opinions on the matter.

Fundamentally, science is not just about knowledge. Science should not be confused with lists of facts or figures; it should instead be the logical explanation of phenomena. The key here is explanation. Because of the inherent complexity of their subject matter, many of the life sciences focus a great deal on simply categorizing and exploring what exists in nature. Observation of natural phenomena is crucial for science, but it is not the end goal. In order to understand these things, we have to go further and try to explain the how and the why of the systems we are studying. Some mechanisms are observable, or can at least be experimentally determined, such as the electrical behavior of the squid giant axon. Others were first conceptualized in a theoretical fashion which then matched well with experiments, such as Michaelis-Menten reaction kinetics. This interplay between observation, experiment, theoretical hypothesis, and explanation is really the heart and soul of many modern areas of science.

There are also important modern considerations. The life sciences are growing every year. Some trends can be seen in the 2012 survey of Doctorates Awarded By Field of Science and Engineering, and many of the other tables and articles can be found by searching for similar terms. Due to more people, more funding, and growing interest in biological research and medicine, the amount of data we have collected is enormous. This is a partial explanation for the involvement of statisticians and computer scientists: these groups are exceptionally well trained and prepared to understand patterns and useful relationships inside these huge datasets.

This is a good explanation for the need for predictive and quantitative models, but what about simpler mathematical models? What use are models derived from a priori physical principles, such as conservation of mass or momentum? Well, due to the complex and interconnected nature of many living systems, it becomes necessary to identify which aspects of a system are crucial to understanding its behavior. Modern applied mathematics, alongside computational, statistical, and experimental methods, is ideally suited to determining such relationships. Perturbation theory, bifurcation theory, and a host of other techniques exist to identify a system’s dependence on variables or parameters. These kinds of models may provide possible explanations for the causes or mechanisms of a particular phenomenon, and if validated, can be used to further explore the fundamental science behind what is happening.
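As a small illustration of what identifying a system’s dependence on a parameter can look like, here is a hedged sketch of a bifurcation calculation. It assumes a logistic population harvested at a constant rate h, dx/dt = rx(1 - x/K) - h, which is a standard textbook model and not one discussed elsewhere in this post. The function names and default parameter values are hypothetical choices for illustration.

```python
import math


def harvest_equilibria(h, r=1.0, k=100.0):
    """Return the equilibria of dx/dt = r*x*(1 - x/k) - h in increasing order.

    Setting the right-hand side to zero gives x**2 - k*x + h*k/r = 0,
    so the number of equilibria depends on the sign of the discriminant.
    """
    disc = k * k - 4.0 * h * k / r
    if disc < 0:
        return []  # harvesting too strong: no equilibrium, the population collapses
    root = math.sqrt(disc)
    if root == 0.0:
        return [k / 2.0]  # the two equilibria have merged (saddle-node point)
    # The lower equilibrium is unstable and the upper one is stable.
    return [(k - root) / 2.0, (k + root) / 2.0]


def critical_harvest(r=1.0, k=100.0):
    """Harvest rate at which the two equilibria collide and disappear."""
    return r * k / 4.0


if __name__ == "__main__":
    for h in (0.0, 16.0, critical_harvest(), 30.0):
        print(h, harvest_equilibria(h))
```

In this toy model a gradual change in one parameter produces an abrupt qualitative change: below the critical harvest rate rK/4 there is a stable population level, and above it there is none. Finding thresholds of this kind is the sort of question bifurcation theory is designed to answer.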

There is an ongoing debate about the idea of a Theory of Life. Such theories are analogues of unified theories in physics, and hope to explain both the origins and the current progression of life in its various forms. There are many pseudoscientific ideas being thrown around, but there are also useful advances being made in this direction, which may or may not prove to be catalysts for paradigm shifts, even if the ideas themselves turn out to be incorrect. There is also a discussion of how these general theoretical frameworks are essential to begin unifying ideas in the life sciences. Finally, I will link to this blog post by Philip Maini about the current state of mathematical biology, as well as a bit of its history.

There is quite a lot to say about this area, and about all of its potential and pitfalls. I will undoubtedly write more detailed and specific posts about this topic in the future, but I wanted to give a (not-so) brief overview here.