Whatever interests, is interesting.
William Hazlitt
Traditionally, astronomy, physics, and engineering have been the heaviest and almost exclusive users of advanced mathematical techniques other than statistical methods. In recent years, however, applications of mathematics to biology have been mushrooming. This trend is being driven by two forces. While problems in the physical sciences often lead to elegant and parsimonious mathematical models, living creatures tend to be complicated and unpredictable, which makes mathematical models of them messy and often intractable by traditional methods. Fortunately, thanks to the tremendous power of contemporary computers, these messy models of biological systems can now be studied numerically, often with enlightening results. This makes meaningful (i.e., biologically realistic) modeling in the life sciences possible. On the other hand, contemporary technology enables biologists to generate huge amounts of data, such as the billions of letters that make up the genome sequences of higher organisms. Analyzing these data and translating them into understanding of the workings of organisms or ecosystems requires mathematical models and algorithms based on these models. The need for such models and algorithms makes applications of mathematics to biology a necessity.
Organisms are complex systems of a large number of components that interact at various levels of organization. The lowest level consists of tens of thousands of chemical reactions that form giant biochemical networks. Whatever goes on at a higher level of organization, for example, at the level of cells, tissues, organs, or the whole organism, can be considered as emergent properties of the interactions at lower levels. Most of 20th-century biology concentrated on understanding the individual parts (such as individual biomolecules and individual reactions in the case of molecular biology). Currently the focus is shifting towards studying how properties of the whole system emerge from the interactions of these parts. This latter perspective is known as systems biology. In particular, biologists now want to understand what governs the dynamics of the entire biochemical network of a cell, and how this dynamics translates into emergent properties at higher levels of organization.
In the study of transmission of infectious diseases we want to predict whether or not an epidemic is likely to get started by introduction of one or a few infectious individuals into an otherwise susceptible population. If an epidemic is likely to occur, we want to predict the proportion of the overall population that will eventually be affected. One can often derive such predictions from low-dimensional ODE systems based on the proportion of individuals in a few so-called compartments (susceptible, infectious, and removed individuals in one classical model). But disease transmission is inherently a stochastic process, and the partition into compartments may ignore differences between individuals and important aspects of the underlying contact network that may significantly influence the outcome. Since our data about the structure of the contact network are usually very limited, it becomes an important mathematical problem to compare the predictions of various fine-grained models (that presuppose a specific contact network, for example) and coarser-grained compartment-based models in order to determine which aspects of the contact network most significantly influence the predicted outcome. Results of this type can inform both our strategies of data collection (what do we really need to know about the contact network?) and the design of optimal control measures (such as vaccination) within the available resources.
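As a rough illustration of the classical three-compartment model mentioned above, the following Python sketch integrates the standard SIR equations ds/dt = -beta*s*i, di/dt = beta*s*i - gamma*i, dr/dt = gamma*i for the proportions of susceptible, infectious, and removed individuals. The function name `simulate_sir`, the parameter values, and the fixed-step Runge-Kutta scheme are assumptions chosen for illustration only; they are not taken from any of the models discussed here.

```python
def simulate_sir(beta, gamma, i0=1e-4, dt=0.1, t_max=400.0):
    """Integrate the SIR equations with a fixed-step RK4 scheme.

    beta  : transmission rate (illustrative value supplied by the caller)
    gamma : removal rate
    i0    : initial proportion of infectious individuals
    Returns the proportions (s, i, r) at time t_max.
    """
    def deriv(s, i):
        new_infections = beta * s * i
        return -new_infections, new_infections - gamma * i

    s, i = 1.0 - i0, i0
    for _ in range(int(round(t_max / dt))):
        k1s, k1i = deriv(s, i)
        k2s, k2i = deriv(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i)
        k3s, k3i = deriv(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i)
        k4s, k4i = deriv(s + dt * k3s, i + dt * k3i)
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6
        i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6
    # r follows from conservation: s + i + r = 1.
    return s, i, 1.0 - s - i


if __name__ == "__main__":
    for beta in (0.2, 0.5):
        gamma = 0.25
        _, _, r_final = simulate_sir(beta, gamma)
        print(f"R0 = {beta / gamma:.1f}: final proportion removed = {r_final:.4f}")
```

In this deterministic sketch the ratio beta/gamma (the basic reproduction number) decides the outcome: when it is below 1 the outbreak fizzles, and when it is above 1 a sizeable fraction of the population is eventually affected. A stochastic network-based model of the same disease need not agree with these predictions, which is exactly the kind of comparison described above.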
Another aspect of interest is the influence of awareness that can trigger a behavioral response to an outbreak that may confer some degree of protection against infection. Together with Joan Saldaña we study questions of effectiveness of this type of behavior modification as a potential control measure. An interesting aspect of this problem is that awareness is usually transmitted via a different contact network than the pathogens that cause the actual disease.
On a more practical level, in collaboration with other members of the Tropical Disease Institute at Ohio University I am studying the transmission of Chagas disease, a vector-borne infection that is endemic in rural areas of South and Central America. The focus is on modeling the variety that is found in Ecuador, with the ultimate goal of designing effective control measures.
Dynamical systems provide a natural mathematical framework for studying how the variables in a biological system change over time. Time can be conceptualized as taking arbitrary real values, as in the study of flows generated by ODE systems, or as advancing in discrete steps, as in the study of iteration of maps. Similarly, the variables can be assumed to change continuously, in which case the state space is usually a manifold, or take on values from a discrete, often finite set. Dynamical systems may be deterministic or, alternatively, allow for a certain amount of stochasticity. The emphasis is usually on studying the long-term behavior of the system, which is reflected in the structure of its attractors and their basins of attraction.
Biological systems are a prime example of complex systems and usually involve a large number of variables (e.g., concentrations of chemicals in a biochemical network, voltages across membranes of individual neurons in neuronal networks, numbers of individuals of a given species in ecosystems). Typically, though, we have only partial knowledge of how these variables interact. This leads to interesting mathematical questions.
For example, the mathematical questions in the study of transmission of infectious diseases that were described above are in essence questions about the attractor of a dynamical system and its basin of attraction. My interests here focus on how these attractors change when we approximate a finer-grained dynamical system model (such as a stochastic network-based model or a model with a behavioral response) by a coarser-grained model (such as a low-dimensional ODE system or a model without a behavioral response).
On a more abstract theoretical level, we may want to know what structure of attractors and their basins of attraction is expected in generic systems that satisfy certain assumptions or even possible under these assumptions. For example, together with German A. Enciso and Maciej Malicki we investigate which conditions impose nontrivial provable upper bounds on the lengths of attractors in Boolean systems.
In Boolean dynamical systems, time is discrete and each variable can take only two states. Many biological systems such as gene regulatory or neuronal networks can be modeled by either an ODE or a Boolean model. The latter type of model is simpler and often easier to study. But under which conditions will the dynamics of the Boolean model reliably reflect the dynamics of the underlying more realistic ODE model? My current joint research with Todd R. Young and a group of students at Ohio University focuses on this type of question.
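To make the notions of attractor and attractor length concrete, here is a minimal Python sketch of a synchronous Boolean system. The three-variable update rule in `step` (each variable is the negation of the preceding one in a ring) is a hypothetical toy example, not a model from the research described here; `find_attractor` and `all_attractors` are likewise illustrative names.

```python
from itertools import product


def step(state):
    """Hypothetical toy update rule: each variable negates its predecessor."""
    x1, x2, x3 = state
    return (1 - x3, 1 - x1, 1 - x2)


def find_attractor(update, state):
    """Iterate `update` from `state` until a state repeats.

    Returns (transient_length, cycle), where `cycle` lists the states of
    the attractor that the trajectory eventually enters.
    """
    first_visit = {}
    trajectory = []
    while state not in first_visit:
        first_visit[state] = len(trajectory)
        trajectory.append(state)
        state = update(state)
    start = first_visit[state]
    return start, trajectory[start:]


def all_attractors(update, n):
    """Collect the attractors reached from all 2**n initial states."""
    attractors = set()
    for state in product((0, 1), repeat=n):
        _, cycle = find_attractor(update, state)
        attractors.add(frozenset(cycle))
    return attractors


if __name__ == "__main__":
    for attractor in all_attractors(step, 3):
        print(len(attractor), sorted(attractor))
```

Exhaustive enumeration like this is feasible only for small systems, since the state space has 2^n elements. That is one reason why provable bounds on attractor lengths, and conditions under which a Boolean model tracks an ODE model, are of interest.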
My interest in approximating ODE dynamics by discrete systems grew out of joint work in mathematical neuroscience with David Terman and graduate students Xueying Wang and Sungwoo Ahn. We proved an exact correspondence between the ODE dynamics and that of a discrete model for neuronal networks with certain architectures. This correspondence offers a possible explanation of why the experimentally observed firings of neurons in certain brain structures show distinct discrete episodes during which certain clusters of neurons fire together, while membership in a cluster may change from episode to episode. This phenomenon is called dynamic clustering. We then studied the expected properties of the dynamics of the discrete model under random connectivities as well as bounds on the lengths of attractors and transients for some important specified connectivities.
According to the famous competitive exclusion principle, two species that utilize the same resource cannot coexist. In plants, the limiting resource is usually sunlight. So why are there so many different plant species? The competitive exclusion principle does not apply when different species utilize the same resource at different intensities, and in communities of different plant species this may be the case when different species shade each other in varying degrees, a phenomenon called canopy partitioning. In order to show that canopy partitioning alone can be sufficient for coexistence of multiple species, Nevai and Vance developed a Kolmogorov-type differential equation model that predicts coexistence of two species at a stable positive equilibrium under certain conditions. They also showed that under certain assumptions this equilibrium will always be unique and stable. In joint work with Andrew Nevai we showed that if these assumptions are somewhat relaxed, the model allows for species to coexist at multiple stable and unstable equilibria.
An important problem in systems biology is the reconstruction of a network from data on its dynamics. This process is called reverse engineering. For example, our knowledge of the individual reactions in large biochemical networks is still very incomplete, but biologists are generating massive amounts of data, such as microarray data for the measurement of gene transcription levels under various experimental conditions, taken for tens of thousands of genes simultaneously. Developing algorithms for reverse engineering a gene-regulatory network from such data is a challenging problem. Reverse-engineering algorithms developed by Laubenbacher and Stigler use advanced algebraic tools called Groebner bases. In cooperation with Brandilyn Stigler we developed implementations of algorithms for efficiently computing Groebner bases in the setting relevant for reverse-engineering applications. I also studied the expected performance of the Laubenbacher-Stigler algorithm with the goal of deriving guidelines for choosing suitable parameters.
The field of bioinformatics is devoted to the design, analysis, and fine-tuning of algorithms for making inferences from massive sets of biomolecular data. The boundary between bioinformatics and systems biology is not clear-cut. Reverse-engineering of gene regulatory networks is a topic that belongs to both areas. One of the most fundamental problems in bioinformatics is the multiple sequence alignment problem of arranging a set of amino acid or nucleotide sequences into a matrix whose columns represent our best guess at the characters that are derived from individual ancestral loci. Results by myself and other authors show that the problem is in general computationally intractable (NP-hard). As far as is known, this makes it impossible to find an algorithm that runs reasonably fast (in polynomial time) and always finds the best alignment. On the other hand, together with Gianluca Della Vedova we showed that polynomial-time algorithms for approximating the best multiple sequence alignment exist under some conditions.
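For contrast with the NP-hard general problem, the special case of aligning just two sequences can be solved exactly in polynomial time by textbook dynamic programming (the Needleman-Wunsch recurrence). The Python sketch below computes only the optimal score. The scoring scheme (match +1, mismatch -1, gap -1) and the function name `global_alignment_score` are illustrative assumptions and are unrelated to the scoring functions used in the hardness and approximation results mentioned above.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the
    dynamic-programming table in memory.
    """
    # Row 0: aligning the empty prefix of `a` with prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix of `a` aligned against the empty prefix of `b`
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # a[i-1] opposite a gap
                curr[j - 1] + gap,           # b[j-1] opposite a gap
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Extending the same table to k sequences requires one dimension per sequence, so the running time grows exponentially with k. This is consistent with, though of course no proof of, the intractability of the general problem.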
Together with Molly R. Morris and graduate students Fang Zhu and Xiaolu Sun of Ohio University we studied game-theoretic models of animal behavior. Game theory is a branch of mathematics that investigates situations of conflict between two or more players and tries to predict their optimal behavior. In biological applications, the "players" are organisms competing for food, mates, or other resources. Success in a game is usually measured by the number of offspring a given organism produces. Game-theoretic models of such interactions are developed to explain or predict which behavioral patterns for conflict resolution will evolve under what circumstances. The predictions coming out of the mathematical model can be tested empirically by observing actual animal behavior and/or by running computer simulations. Game-theoretic models have been very successful in explaining why animal contests tend to be settled by ritualized displays rather than aggressive fights. But if a fight does occur, then which contestant will more often initiate it, the likely loser or the likely winner? The models developed jointly with Molly R. Morris and tested jointly with Xiaolu Sun shed light on this fascinating question.
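A standard textbook illustration of why escalated fights can remain rare is the Hawk-Dove game, sketched below in Python with replicator dynamics. This is not one of the models developed with Molly R. Morris; it is included only as a generic example of the approach. The function name `hawk_frequency`, the payoff values, and the Euler step size are assumptions. Here `value` is the worth of the contested resource and `cost` is the cost of losing an escalated fight.

```python
def hawk_frequency(value, cost, p0=0.1, dt=0.01, steps=10_000):
    """Long-run frequency of the aggressive strategy under replicator dynamics.

    Payoffs: Hawk vs Hawk (value - cost) / 2, Hawk vs Dove value,
             Dove vs Hawk 0,                  Dove vs Dove value / 2.
    """
    p = p0  # current proportion of Hawks in the population
    for _ in range(steps):
        payoff_hawk = p * (value - cost) / 2 + (1 - p) * value
        payoff_dove = (1 - p) * value / 2
        # Replicator equation, discretized with a forward Euler step.
        p += dt * p * (1 - p) * (payoff_hawk - payoff_dove)
    return p


if __name__ == "__main__":
    print(round(hawk_frequency(value=2.0, cost=5.0), 3))  # fights are costly
    print(round(hawk_frequency(value=5.0, cost=2.0), 3))  # fights are cheap
```

When the cost of injury exceeds the value of the resource, the Hawk frequency settles at value/cost, so most encounters are resolved without escalation. When fighting is cheap relative to the prize, aggression takes over the population.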
My original training and my earlier research were in the area of mathematical logic, in particular, set theory. Mathematical logic studies the methods by which mathematicians arrive at their conclusions.
Set theory is often considered the foundation of mathematics in the following sense: All the structures studied in various branches of mathematics can be interpreted as sets, and hence all mathematical theorems can, at least in principle, be derived from the rules governing the formation of sets. These rules can be reduced to a few particularly simple ones that are called the axioms of set theory. The axioms act as a grand unifying principle for all of mathematics: A mathematical statement is a theorem if and only if it ultimately follows from the axioms of set theory. This is the case regardless of whether the statement is about algebraic structures, differential equations, or probability distributions. The method for establishing that a mathematical statement is indeed a theorem is to give a proof of this statement.
But how can one establish that a given mathematical statement is not a theorem? The usual way is to prove that its negation is a theorem, thus establishing that the statement in question is a falsehood. This, however, is not always possible. As Kurt Gödel has shown with his famous First Incompleteness Theorem, there exist mathematical statements that are neither theorems nor falsehoods. Such statements are said to be independent of the axioms of set theory. It may be possible to prove that a given statement falls into this category; the arguments that lead to such conclusions are called independence proofs. I started my mathematical career by proving independence of a number of statements from the accepted axioms of set theory.
Since I switched to mathematical biology, I have not used set theory directly in my work. But my original training has shaped my habits of thinking about complex problems in ways that I find quite useful in my current research.
In my expository and instructional writing I try to convey not only the mathematical facts (definitions, theorems, and proofs), but also the process of discovering these facts. It is no accident that my joint set theory textbook with Martin Weese is titled "Discovering Modern Set Theory." In both Volume 1 and Volume 2 of this text, we continuously challenge the reader to actively participate in the process of creating the mathematics.
Similarly, in a book chapter that was co-authored by David Terman and Sungwoo Ahn we invite the reader to explore a class of discrete models of neuronal dynamics and provide guidance in the form of multiple projects and exercises that are structured from elementary ones to open research problems.
More recent expositions of models of infectious diseases that I wrote with several co-authors follow the same format.
Currently, in collaboration with several co-authors, I am developing modules for active student exploration of models of disease transmission on contact networks.
I have also been experimenting with the development of interactive tutorials that facilitate student engagement and give instantaneous feedback. A sample web-based tutorial can be found at this website.
© 2006 Winfried Just
Last modified March 2, 2015.