I'm a quantitative disease ecologist.

- I use computational, analytical and numerical methods in mathematical modeling and statistical analysis.
- I aim to advance the understanding of complex interactions between hosts, vectors, pathogens, and their biotic and abiotic environments.
- My goal is to protect human communities, livestock and wildlife, and to reduce the global burden and economic cost of infectious diseases, by developing early warning systems that prevent infections before they occur.

## From reaction to prevention

Current methods to control infectious diseases are largely reactive. If we could better predict where, when and under what conditions the risk of acquiring an infectious disease is highest, we could avert much of the burden and cost of disease before it occurs.

Since insects and other vectors that transmit pathogens are ectotherms and depend on specific habitats for their development, it is intuitive that land use, land cover and weather conditions would be good predictors of their abundance, and hence of the transmission risk of their associated pathogens. However, effective early warning systems that can predict transmission risk from abiotic conditions over large spatial scales are currently lacking, with notable exceptions for specific pathogens at specific locations.

Understanding the complex interactions between hosts, vectors, pathogens and their biotic and abiotic environments is essential for developing these early warning systems. Another essential element of these models is the effect of human activities, such as development, construction practices and behavior, on the transmission risk of infectious diseases. For example, the absence of dengue fever in the continental US, with the exception of southern Florida, is hypothesized to be partly explained by relatively low rates of mosquito bites, as people reside in air-conditioned, screened houses and tend to spend less time outdoors than people in dengue-endemic countries.

## Incorporating surveillance effort to correct for under-reporting

A particular difficulty in developing early warning systems for infectious diseases is under-reporting. The proportion of cases reported can vary in space, time and even across age groups. Assuming homogeneous levels of reporting is likely to lead to misleading results and interpretations. Estimating this variable level of under-reporting, by contrast, allows us to estimate the true number of cases more accurately. Mathematical models that incorporate surveillance and reporting allow us to identify the models that best fit the observations, including the most likely parameters for under-reporting. In addition to using these models to predict transmission risk, we can estimate the optimal levels of reporting that balance the cost of surveillance against the benefits of detecting outbreaks earlier.
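
As a minimal sketch of this idea, the Python example below estimates a reporting probability for two regions by maximum likelihood. It assumes, purely for illustration, that a transmission model has already supplied the expected number of true cases per week and that each true case is reported independently, so reported counts are Poisson with mean `rho * expected_true`. The function names, the grid search and all numbers are hypothetical and are not taken from any particular study; in practice the reporting rate and the transmission parameters would usually be estimated jointly.

```python
import math

def poisson_log_likelihood(observed, expected_true, rho):
    """Log-likelihood of reported counts when each true case is reported with probability rho."""
    total = 0.0
    for y, lam in zip(observed, expected_true):
        mu = rho * lam  # expected number of reported cases
        total += y * math.log(mu) - mu - math.lgamma(y + 1)
    return total

def estimate_reporting_rate(observed, expected_true, grid_size=999):
    """Grid-search maximum-likelihood estimate of rho over (0, 1)."""
    candidates = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return max(
        candidates,
        key=lambda rho: poisson_log_likelihood(observed, expected_true, rho),
    )

# Hypothetical weekly values, for illustration only.
expected_true = [100.0, 150.0, 220.0, 180.0]  # assumed model-predicted true cases
reported_region_a = [52, 73, 108, 91]         # region with fairly good reporting
reported_region_b = [9, 16, 23, 17]           # region with poor reporting

rho_a = estimate_reporting_rate(reported_region_a, expected_true)
rho_b = estimate_reporting_rate(reported_region_b, expected_true)

# Correct the reported counts by the region-specific reporting rate.
corrected_a = [y / rho_a for y in reported_region_a]
corrected_b = [y / rho_b for y in reported_region_b]

print(f"region A reporting rate: {rho_a:.3f}")
print(f"region B reporting rate: {rho_b:.3f}")
```

Under these assumptions, both regions share the same underlying incidence, yet the raw reported counts differ several-fold. Applying a single homogeneous reporting rate to both would make region B look far safer than it is.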

## Combining ecological and evolutionary information

Integrating information from multiple sources allows one to make statistical inferences that would be impossible to make from any one of the sources alone. Molecular information, such as genetic sequence data, is being amassed at an exponential rate. For example, the DNA sequences of pathogens isolated from reported cases hold information about the history of the pathogen. The relationship between the sequences of pathogens isolated from multiple types of hosts and vectors (i.e. the evolutionary tree) reflects the ecological interactions between those hosts, vectors and pathogens. This molecular information can be combined with ecological and epidemiological information by building eco-evolutionary models of infectious diseases. These models can be used to estimate parameters of both the ecological/epidemiological and the evolutionary processes involved, based on the datasets available for the particular pathogen. Eco-evolutionary models form an essential part of the toolkit for developing early warning systems for novel or re-emerging zoonotic diseases.

## Embracing uncertainty in models and data

All models, whether mathematical, statistical or phenomenological, are abstractions and simplifications of nature, and their predictions can only be as accurate as the underlying assumptions and parameter estimates allow. This uncertainty, whether it arises from inherent stochasticity, uncertainty in parameter estimates, or large variance in the data used to fit the model, needs to be incorporated into early warning systems for infectious diseases rather than ignored. Modern statistical methods, such as Approximate Bayesian Computation, can link the predictions of stochastic models in an early warning system with characteristics of the data used to fit the model. A particular type of uncertainty arises because multiple models can fit the same underlying data, perhaps with different parameter estimates. This structural uncertainty might lead to very different predictions from these models under the same circumstances, requiring model selection. However, selecting the best model from a set of competing stochastic models, from the simplest to the most complex, is still an open area of research.
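
To make the link between a stochastic model and the data concrete, here is a minimal sketch of rejection-based Approximate Bayesian Computation in Python. It is an illustration under stated assumptions rather than a method used in any specific early warning system: the outbreak model is a simple Reed-Frost chain-binomial model, the summary statistic is the final outbreak size, and the prior range, tolerance, population size and observed value are all hypothetical.

```python
import random

def simulate_final_size(p, population=100, initial_infected=1, rng=random):
    """Reed-Frost chain-binomial outbreak; returns the total number ever infected."""
    susceptible = population - initial_infected
    infected = initial_infected
    total_infected = initial_infected
    while infected > 0 and susceptible > 0:
        # A susceptible escapes infection only if it avoids every current infective.
        p_infection = 1.0 - (1.0 - p) ** infected
        new_cases = sum(1 for _ in range(susceptible) if rng.random() < p_infection)
        susceptible -= new_cases
        infected = new_cases
        total_infected += new_cases
    return total_infected

def abc_rejection(observed_size, n_draws=2000, tolerance=5, prior_max=0.05, seed=1):
    """Keep prior draws of p whose simulated final size is close to the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.uniform(0.0, prior_max)  # draw from a uniform prior (assumption)
        simulated = simulate_final_size(p, rng=rng)
        if abs(simulated - observed_size) <= tolerance:
            accepted.append(p)
    return accepted

# Hypothetical observation: 60 of 100 individuals were eventually infected.
accepted = abc_rejection(observed_size=60)
posterior_mean = sum(accepted) / len(accepted) if accepted else float("nan")

print(f"accepted draws: {len(accepted)}")
print(f"approximate posterior mean of p: {posterior_mean:.4f}")
```

The spread of the accepted values, not just their mean, is the quantity of interest: it carries the parameter uncertainty forward into any prediction made with the model. Comparing acceptance rates across competing models is one simple route towards model selection, although, as noted above, doing this reliably remains an open problem.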
