Diana is an alumna of the University of Kent, having obtained her PhD in Statistics in 2003 for the thesis Stochastic Branching Processes in Biology. She then worked as a Research Associate on Stochastic models for yeast prion propagation and then on Parameter redundancy in ecological models. She became a Lecturer in Statistics in 2007.

Research interests

Ecological statistics, Integrated Population Modelling, Identifiability, Parameter redundancy, Generalised linear mixed models.
Part of the SE@K (Statistical Ecology at Kent) research group and a member of the National Centre for Statistical Ecology (NCSE).   


  • Anita Jeyam (with Dr Rachel McCrea)
  • Marina Jimenez-Munoz (with Dr Eleni Matechou)



  • Wojczulanis-Jakubas, K., Jiménez-Muñoz, M., Jakubas, D., Kidawa, D., Karnovsky, N., Cole, D. and Matechou, E. (2020). Duration of female parental care and their survival in the little auk Alle alle - are these two traits linked? Behavioral Ecology and Sociobiology [Online] 74. Available at: https://dx.doi.org/10.1007/s00265-020-02862-9.
    Desertion of offspring before its independence by one of the parents is observed in a number of avian species with bi-parental care but reasons for this strategy are not fully understood. This behaviour is particularly intriguing in species where bi-parental care is crucial to raise the brood successfully. Here, we focus on the little auk, Alle alle, a small seabird with intensive bi-parental care, where the female deserts the brood at the end of the chick rearing period. The little auk example is interesting as most hypotheses to explain desertion of the brood by females (e.g. “re-mating hypothesis”, “body condition hypothesis”) have been rejected for this species. Here, we analysed a possible relationship between the duration of female parental care over the chick and her chances to survive to the next breeding season. We performed the study in two breeding colonies on Spitsbergen with different foraging conditions – more favourable in Hornsund and less favourable in Magdalenefjorden. We predicted that in Hornsund females would stay for shorter periods of time with the brood and would have higher survival rates in comparison with birds from Magdalenefjorden. We found that indeed in less favourable conditions of Magdalenefjorden, females stay longer with the brood than in the more favourable conditions of Hornsund. Moreover, female survival was negatively affected by the length of stay with the brood. Nevertheless, duration of female parental care over the chick was not related to their parental efforts earlier in the chick rearing period, and survival of males and females was similar. Thus, although female brood desertion and winter survival are linked, the relationship is not straightforward.
  • Nater, C., Vindenes, Y., Aass, P., Cole, D., Langangen, Ø., Moe, S., Rustadbakken, A., Turek, D., Vøllestad, L. and Ergon, T. (2020). Size- and stage-dependence in cause-specific mortality of migratory brown trout. Journal of Animal Ecology [Online]. Available at: http://dx.doi.org/10.1111/1365-2656.13269.
    Evidence‐based management of natural populations under strong human influence frequently requires not only estimates of survival but also knowledge about how much mortality is due to anthropogenic vs. natural causes. This is the case particularly when individuals vary in their vulnerability to different causes of mortality due to traits, life history stages, or locations. Here, we estimated harvest and background (other cause) mortality of landlocked migratory salmonids over half a century. In doing so, we quantified among‐individual variation in vulnerability to cause‐specific mortality resulting from differences in body size and spawning location relative to a hydropower dam. We constructed a multistate mark–recapture model to estimate harvest and background mortality hazard rates as functions of a discrete state (spawning location) and an individual time‐varying covariate (body size). We further accounted for among‐year variation in mortality and migratory behaviour and fit the model to a unique 50‐year time series of mark–recapture–recovery data on brown trout (Salmo trutta) in Norway. Harvest mortality was highest for intermediate‐sized trout, and outweighed background mortality for most of the observed size range. Background mortality decreased with body size for trout spawning above the dam and increased for those spawning below. All vital rates varied substantially over time, but a trend was evident only in estimates of fishers' reporting rate, which decreased from over 50% to less than 10% throughout the study period. We highlight the importance of body size for cause‐specific mortality and demonstrate how this can be estimated using a novel hazard rate parameterization for mark–recapture models. Our approach allows estimating effects of individual traits and environment on cause‐specific mortality without confounding, and provides an intuitive way to estimate temporal patterns within and correlation among different mortality sources.
  • Jourdain, N., Cole, D., Ridout, M. and Rowcliffe, J. (2020). Statistical Development of Animal Density Estimation Using Random Encounter Modelling. Journal of Agricultural, Biological, and Environmental Statistics [Online] 25:148-167. Available at: https://doi.org/10.1007/s13253-020-00385-4.
    Camera trapping is widely used in ecological studies to estimate animal density, although these studies are largely restricted to animals that can be identified to the individual level. The random encounter model (REM), developed by Rowcliffe et al. (J Appl Ecol 45(4):1228–1236, 2008), estimates animal density from camera-trap data without the need to identify animals. Although the REM can provide reliable density estimates, it lacks the potential to account for the multiple sources of variance in the modelling process. The density estimator in REM is a ratio, and since the variance of a ratio estimator is intractable, we examine and compare the finite sample performance of many approaches for obtaining confidence intervals via simulation studies. We also propose an integrated random encounter model as a parametric alternative, which is flexible and can incorporate covariates and random effects. A data example from Whipsnade Wild Animal Park, Bedfordshire, south England, is used to demonstrate the application of these methods.
  • Cole, D. (2019). Parameter Redundancy and Identifiability in Hidden Markov Models. Metron [Online] 77:105-118. Available at: https://doi.org/10.1007/s40300-019-00156-3.
    Hidden Markov models are a flexible class of models that can be used to describe time series data which depends on an unobservable Markov process. As with any complex model, it is not always obvious whether all the parameters are identifiable, or if the model is parameter redundant; that is, the model can be reparameterised in terms of a smaller number of parameters. This paper considers different methods for detecting parameter redundancy and identifiability in hidden Markov models. We examine both numerical methods and methods that involve symbolic algebra. These symbolic methods require a unique representation of a model, known as an exhaustive summary. We provide an exhaustive summary for hidden Markov models and show how it can be used to investigate identifiability.
  • Zhou, M., McCrea, R., Matechou, E., Cole, D. and Griffiths, R. (2019). Removal models accounting for temporary emigration. Biometrics [Online] 75:24-35. Available at: https://doi.org/10.1111/biom.12961.
    Removal of protected species from sites scheduled for development is often a legal requirement in order to minimize the loss of biodiversity. The assumption of closure in the classic removal model will be violated if individuals become temporarily undetectable, a phenomenon commonly exhibited by reptiles and amphibians. Temporary emigration can be modeled using a multievent framework with a partial hidden process, where the underlying state process describes the movement pattern of animals between the survey area and an area outside of the study. We present a multievent removal model within a robust design framework which allows for individuals becoming temporarily unavailable for detection. We demonstrate how to investigate parameter redundancy in the model. Results suggest the use of the robust design and certain forms of constraints overcome issues of parameter redundancy. We show which combinations of parameters are estimable when the robust design reduces to a single secondary capture occasion within each primary sampling period. Additionally, we explore the benefit of the robust design on the precision of parameters using simulation. We demonstrate that the use of the robust design is highly recommended when sampling removal data. We apply our model to removal data of common lizards, Zootoca vivipara, and for this application precision of parameter estimates is further improved using an integrated model.
  • Jimenez-Munoz, M., Cole, D., Freeman, S., Robinson, R., Baillie, S. and Matechou, E. (2019). Estimating age-dependent survival from age-aggregated ringing data - extending the use of historical records. Ecology and Evolution [Online] 9:769-779. Available at: https://doi.org/10.1002/ece3.4820.
    Bird ring-recovery data have been widely used to estimate demographic parameters such as survival probabilities since the mid-twentieth century. However, while the total number of birds ringed each year is usually known, historical information on age at ringing is often not available. A standard ring-recovery model, for which information on age at ringing is required, cannot be used when historical data are incomplete. We develop a new model to estimate age-dependent survival probabilities from such historical data when age at ringing is not recorded; we call this the historical data model. This new model provides an extension to the model of Robinson (2010) by estimating the proportion of the ringed birds marked as juveniles as an additional parameter. We conduct a simulation study to examine the performance of the historical data model and compare it with other models including the standard and conditional ring-recovery models. Simulation studies show that the approach of Robinson (2010) can cause bias in parameter estimates. In contrast, the historical data model yields similar parameter estimates to the standard model. Parameter redundancy results show that the newly developed historical data model is comparable to the standard ring-recovery model, in terms of which parameters can be estimated, and has fewer identifiability issues than the conditional model. We illustrate the new proposed model using Blackbird and Sandwich Tern data. The new historical data model allows us to make full use of historical data and estimate the same parameters as the standard model with incomplete data and in doing so, detect potential changes in demographic parameters further back in time.
  • Allen, S., Satterthwaite, W., Hankin, D., Cole, D. and Mohr, M. (2016). Temporally varying natural mortality: Sensitivity of a virtual population analysis and an exploration of alternatives. Fisheries Research [Online] 185:185-197. Available at: http://dx.doi.org/10.1016/j.fishres.2016.09.002.
    Cohort reconstructions (CR) currently applied in Pacific salmon management estimate temporally variant exploitation, maturation, and juvenile natural mortality rates but require an assumed (typically invariant) adult natural mortality rate (dA), resulting in unknown biases in the remaining vital rates. We explored the sensitivity of CR results to misspecification of the mean and/or variability of dA, as well as the potential to estimate dA directly using models that assumed separable year and age/cohort effects on vital rates (separable cohort reconstruction, SCR). For CR, given the commonly assumed dA = 0.2, the error (RMSE) in estimated vital rates is generally small (≤ 0.05) when annual values of dA are low to moderate (≤ 0.4). The greatest absolute errors are in maturation rates, with large relative error in the juvenile survival rate. The ability of CR estimates to track temporal trends in the juvenile natural mortality rate is adequate (Pearson's correlation coefficient > 0.75) except for high dA (≥ 0.6) and high variability (CV > 0.35). The alternative SCR models allowing estimation of time-varying dA by assuming additive effects in natural mortality, fishing mortality, and/or maturation rates did not outperform CR across all simulated scenarios, and are less accurate when additivity assumptions are violated. Nevertheless an SCR model assuming additive effects on fishing and natural (juvenile and adult) mortality rates led to nearly unbiased estimates of all quantities estimated using CR, along with borderline acceptable estimates of the mean dA under multiple sets of conditions conducive to CR. Adding an assumption of additive effects on the maturation rates allowed nearly unbiased estimates of the mean dA as well. The SCR models performed slightly better than CR when the vital rates covaried as assumed. These separable models could serve as a partial check on the validity of CR assumptions about the adult natural mortality rate, or even a preferred alternative if there is strong reason to believe the vital rates, including juvenile and adult natural mortality rates, covary strongly across years or age classes as assumed.
  • Cole, D. and McCrea, R. (2016). Parameter Redundancy in Discrete State-Space and Integrated Models. Biometrical Journal [Online] 58:1071-1090. Available at: http://dx.doi.org/10.1002/bimj.201400239.
    Discrete state-space models are used in ecology to describe the dynamics of wild animal populations, with parameters, such as the probability of survival, being of ecological interest. For a particular parametrisation of a model it is not always clear which parameters can be estimated. This inability to estimate all parameters is known as parameter redundancy or a model is described as non-identifiable. In this paper we develop methods that can be used to detect parameter redundancy in discrete state-space models. An exhaustive summary is a combination of parameters that fully specify a model. To use general methods for detecting parameter redundancy a suitable exhaustive summary is required. This paper proposes two methods for the derivation of an exhaustive summary for discrete state-space models using discrete analogues of methods for continuous state-space models. We also demonstrate that combining multiple data sets, through the use of an integrated population model, may result in a model in which all parameters are estimable, even though models fitted to the separate data sets may be parameter redundant.
  • Cole, D. (2016). Reply to determining structural identifiability of parameter learning machines. Neurocomputing 173:2039-2040.
    The paper Ran and Hu (2014, Neurocomputing) examines identifiability and parameter redundancy in classes of models used in machine learning. This note discusses the results on global identifiability and also clarifies that the paper's results on parameter redundancy already exist in the paper Cole et al. (2010, Mathematical Biosciences).
  • Cole, D., Morgan, B., McCrea, R., Pradel, R., Gimenez, O. and Choquet, R. (2014). Does Your Species Have Memory? Analysing Capture-Recapture Data with Memory Models. Ecology and Evolution [Online] 4:2124-2133. Available at: http://dx.doi.org/10.1002/ece3.1037.
    1. We examine memory models for multi-site capture-recapture data. This is an important topic, as animals may exhibit behaviour that is more complex than simple first-order Markov movement between sites, when it is necessary to devise and fit appropriate models to data.

    2. We consider the Arnason-Schwarz model for multi-site capture-recapture data, which incorporates just first-order Markov movement, and also two alternative models that allow for memory, the Brownie model and the Pradel model. We use simulation to compare two alternative tests which may be undertaken to determine whether models for multi-site capture-recapture data need to incorporate memory.

    3. Increasing the complexity of models runs the risk of introducing parameters that cannot be estimated, irrespective of how much data are collected, a feature which is known as parameter redundancy. Rouan et al. (JABES, 2009, pp 338-355) suggest a constraint that may be applied to overcome parameter redundancy when it is present in multi-site memory models. For this case, we apply symbolic methods to derive a simpler constraint, which allows more parameters to be estimated, and give general results not limited to a particular configuration. We also consider the effect sparse data can have on parameter redundancy, and recommend minimum sample sizes.

    4. Memory models for multi-site capture-recapture data can be highly complex, and difficult to fit to data. We emphasise the importance of a structured approach to modelling such data, by considering a priori which parameters can be estimated, which constraints are needed in order for estimation to take place, and how much data need to be collected. We also give guidance on the amount of data needed to use two alternative families of tests for whether models for multi-site capture-recapture data need to incorporate memory.
  • Hubbard, B., Cole, D. and Morgan, B. (2014). Parameter Redundancy in Capture-Recapture-Recovery Models. Statistical Methodology [Online] 17:17-29. Available at: https://doi.org/10.1016/j.stamet.2012.11.005.
    In principle it is possible to use recently-derived procedures to determine whether or not all the parameters of particular complex ecological models can be estimated using classical methods of statistical inference. If it is not possible to estimate all the parameters a model is parameter redundant. Furthermore, one can investigate whether derived results hold for such models for all lengths of study, and also how the results might change for specific data sets. In this paper we show how to apply these approaches to entire families of capture-recapture and capture-recapture-recovery models. This results in comprehensive tables, providing the definitive parameter redundancy status for such models. Parameter redundancy can also be caused by the data rather than the model, and how to investigate this is demonstrated through two applications, one to recapture data on dippers, and one to recapture-recovery data on great cormorants.
  • McCrea, R., Morgan, B. and Cole, D. (2013). Age-dependent mixture models for recovery data on animals marked at unknown age. Journal of the Royal Statistical Society: Series C (Applied Statistics) [Online] 62:101-113. Available at: http://dx.doi.org/10.1111/j.1467-9876.2012.01043.x.
    Data are often collected from wild animals that have been marked at unknown age. As a result, standard probability models, fitted by maximum likelihood, cannot incorporate age dependence in probabilities of annual survival. We propose and fit new mixture models to ring–recovery data on birds ringed of unknown age, in which it is possible to incorporate age dependence in survival. It is shown that it is important to analyse simultaneously data on animals marked as young, and of known age, as otherwise the mixture model is parameter redundant. The potential of the approach is illustrated by a new analysis of data on mallards, Anas platyrhynchos, and the wider performance of the approach is demonstrated through simulation. The models provide a way of analysing correctly large numbers of historical data sets.
  • Cole, D., Morgan, B., Catchpole, E. and Hubbard, B. (2012). Parameter redundancy in mark-recovery models. Biometrical Journal [Online] 54:507-523. Available at: http://dx.doi.org/10.1002/bimj.201100210.
    We provide a definitive guide to parameter redundancy in mark-recovery models, indicating, for a wide range of models, in which all the parameters are estimable, and in which models they are not. For these parameter-redundant models, we identify the parameter combinations that can be estimated. Simple, general results are obtained, which hold irrespective of the duration of the studies. We also examine the effect real data have on whether or not models are parameter redundant, and show that results can be robust even with very sparse data. Covariates, as well as time- or age-varying trends, can be added to models to overcome redundancy problems. We show how to determine, without further calculation, whether or not parameter-redundant models are still parameter redundant after the addition of covariates or trends.
  • Choquet, R. and Cole, D. (2012). A Hybrid Symbolic-Numerical Method for Determining Model Structure. Mathematical Biosciences [Online] 236:117-125. Available at: http://dx.doi.org/10.1016/j.mbs.2012.02.002.
    In this article, we present a method for determining whether a model is at least locally identifiable and in the case of non-identifiable models whether any of the parameters are individually at least locally identifiable. This method combines symbolic and numeric methods to create an algorithm that is extremely accurate compared to other numeric methods and computationally inexpensive. A series of generic computational steps are developed to create a method that is ideal for practitioners to use. The algorithm is compared to symbolic methods for two capture-recapture models and a compartment model.
  • Cole, D. (2012). Determining Parameter Redundancy of Multi-state Mark-recapture Models for Sea Birds. Journal of Ornithology [Online] 152:S305-S315. Available at: http://dx.doi.org/10.1007/s10336-010-0574-0.
    Multi-state mark–recapture models are structurally complex models, and in particular the complexity increases when there are unobservable states. Until recently, determining whether or not such models were parameter redundant was only possible numerically. In this paper, we show how it is now possible to examine parameter redundancy of such models symbolically. The advantage of this approach is that you can determine exactly how many parameters can be estimated in a model for any number of years of marking and recovery, as well as which combinations of parameters can be estimated. Here, we illustrate how the new methodology works for multi-state models. We further develop rules for determining the parameter redundancy status of a whole family of multi-state mark–recapture models.
  • Cole, D. and Morgan, B. (2010). A note on determining parameter redundancy in age-dependent tag return models for estimating fishing mortality, natural mortality and selectivity. Journal of Agricultural, Biological, and Environmental Statistics [Online] 15:431-434. Available at: http://dx.doi.org/10.1007/s13253-010-0026-6.
    Jiang et al. (JABES 12:177-194, 2007) present models for tag return data on fish. They examine whether the models are parameter redundant, but need to resort to numerical methods as symbolic methods were sometimes found to be intractable. Also, their results are only applicable for a specified number of years of tagging data and age-classes. Here we show how symbolic methods can in fact be used and also how conclusions apply to any number of years of tagging data and age-classes.
  • Cole, D. and Morgan, B. (2010). Parameter Redundancy with Covariates. Biometrika [Online] 97:1002-1005. Available at: http://dx.doi.org/10.1093/biomet/asq041.
    We show how to determine the parameter redundancy status of a model with covariates from that of the same model without covariates, thereby simplifying the calculation considerably. A matrix decomposition is necessary to ensure that the symbolic computation computer programmes return correct results. The paper is illustrated by mark-recovery and latent-class models, with associated Maple code.
  • Cole, D., Morgan, B. and Titterington, D. (2010). Determining the parametric structure of models. Mathematical Biosciences [Online] 228:16-30. Available at: http://dx.doi.org/10.1016/j.mbs.2010.08.004.
    In this paper we develop a comprehensive approach to determining the parametric structure of models. This involves considering whether a model is parameter redundant or not and investigating model identifiability. The approach adopted makes use of exhaustive summaries, quantities that uniquely define the model. We review and generalise previous work on evaluating the symbolic rank of an appropriate derivative matrix to detect parameter redundancy, and then develop further tools for use within this framework, based on a matrix decomposition. Complex models, where the symbolic rank is difficult to calculate, may be simplified structurally using reparameterisation and by finding a reduced-form exhaustive summary. The approach of the paper is illustrated using examples from ecology, compartment modelling and Bayes networks. This work is topical as models in the biosciences and elsewhere are becoming increasingly complex.
  • Byrne, L., Cole, D., Cox, B., Ridout, M., Morgan, B. and Tuite, M. (2009). The Number and Transmission of [PSI+] Prion Seeds (Propagons) in the Yeast Saccharomyces Cerevisiae. PLoS ONE [Online] Online. Available at: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0004670.
    Yeast (Saccharomyces cerevisiae) prions are efficiently propagated and the on-going generation and transmission of prion seeds (propagons) to daughter cells during cell division ensures a high degree of mitotic stability. The reversible inhibition of the molecular chaperone Hsp104p by guanidine hydrochloride (GdnHCl) results in cell division-dependent elimination of yeast prions due to a block in propagon generation and the subsequent dilution out of propagons by cell division.

    Principal Findings
    Analysing the kinetics of the GdnHCl-induced elimination of the yeast [PSI+] prion has allowed us to develop novel statistical models that aid our understanding of prion propagation in yeast cells. Here we describe the application of a new stochastic model that allows us to estimate more accurately the mean number of propagons in a [PSI+] cell. To achieve this accuracy we also experimentally determine key cell reproduction parameters and show that the presence of the [PSI+] prion has no impact on these key processes. Additionally, we experimentally determine the proportion of propagons transmitted to a daughter cell and show this reflects the relative cell volume of mother and daughter cells at cell division.

    While propagon generation is an ATP-driven process, the partition of propagons to daughter cells occurs by passive transfer via the distribution of cytoplasm. Furthermore, our new estimates of n0, the number of propagons per cell (500–1000), are some five times higher than our previous estimates and this has important implications for our understanding of the inheritance of the [PSI+] and the spontaneous formation of prion-free cells.
  • Byrne, L., Cox, B., Cole, D., Ridout, M., Morgan, B. and Tuite, M. (2007). Cell division is essential for elimination of the yeast [PSI+] prion by guanidine hydrochloride. Proceedings of the National Academy of Sciences of the United States of America [Online] 104:11688-11693. Available at: http://dx.doi.org/10.1073/pnas.0701392104.
    Guanidine hydrochloride (Gdn·HCl) blocks the propagation of yeast prions by inhibiting Hsp104, a molecular chaperone that is absolutely required for yeast prion propagation. We had previously proposed that ongoing cell division is required for Gdn·HCl-induced loss of the [PSI+] prion. Subsequently, Wu et al. [Wu Y, Greene LE, Masison DC, Eisenberg E (2005) Proc Natl Acad Sci USA 102:12789-12794] claimed to show that Gdn·HCl can eliminate the [PSI+] prion from alpha-factor-arrested cells leading them to propose that in Gdn·HCl-treated cells the prion aggregates are degraded by an Hsp104-independent mechanism. Here we demonstrate that the results of Wu et al. can be explained by an unusually high rate of alpha-factor-induced cell death in the [PSI+] strain (780-1D) used in their studies. What appeared to be no growth in their experiments was actually no increase in total cell number in a dividing culture through a counterbalancing level of cell death. Using media-exchange experiments, we provide further support for our original proposal that elimination of the [PSI+] prion by Gdn·HCl requires ongoing cell division and that prions are not destroyed during or after the evident curing phase.
  • Cole, D., Ridout, M., Morgan, B., Byrne, L. and Tuite, M. (2007). Approximations for expected generation number. Biometrics [Online] 63:1023-1030. Available at: http://dx.doi.org/10.1111/j.1541-0420.2007.00780.x.
    A deterministic formula is commonly used to approximate the expected generation number of a population of growing cells. However, this can give misleading results because it does not allow for natural variation in the times that individual cells take to reproduce. Here we present more accurate approximations for both symmetric and asymmetric cell division. Based on the first two moments of the generation time distribution, these approximations are also robust. We illustrate the improved approximations using data that arise from monitoring individual yeast cells under a microscope and also demonstrate how the approximations can be used when such detailed data are not available.
  • Ridout, M., Cole, D., Morgan, B., Byrne, L. and Tuite, M. (2006). New approximations to the Malthusian parameter. Biometrics [Online] 62:1216-1223. Available at: http://dx.doi.org/10.1111/j.1541-0420.2006.00564.x.
    Approximations to the Malthusian parameter of an age-dependent branching process are obtained in terms of the moments of the lifetime distribution, by exploiting a link with renewal theory. In several examples, the new approximations are more accurate than those currently in use, even when based on only the first two moments. The new approximations are extended to include a form of asymmetric cell division that occurs in some species of yeast. When used for inference, the new approximations are shown to have high efficiency.
  • Cole, D., Morgan, B. and Ridout, M. (2005). Models for strawberry inflorescence data. Journal of Agricultural, Biological, and Environmental Statistics [Online] 10:411-423. Available at: http://dx.doi.org/10.1198/108571105X80761.
    The flowers of strawberry plants grow on very variable branched structures called inflorescences, in which each branch gives rise to 0, 1, or 2 offspring branches. We extend previous modeling of the number of strawberry flowers at each individual level in the inflorescence structure conditional on the number of strawberry flowers at the previous level. We consider a range of logistic regression models, including models that incorporate inflorescence effects and random effects. The models can be used to summarize the overall structure of any particular variety and to indicate the main differences between varieties. For the data of the article, we show that models based on convolutions of correlated Bernoulli random variables outperform binomial regression models.
  • Cole, D., Morgan, B., Ridout, M., Byrne, L. and Tuite, M. (2004). Estimating the number of prions in yeast cells. Mathematical Medicine and Biology [Online] 21:369-395. Available at: http://dx.doi.org/doi:10.1093/imammb/21.4.369.
    Certain yeast cells contain proteins that behave like the mammalian prion PrP and are called yeast prions. The yeast prion protein Sup35p can exist in one of two stable forms, giving rise to phenotypes [PSI+] and [psi(-)]. If the chemical guanidine hydrochloride (GdnHCl) is added to a culture of growing [PSI+] cells, the proportion of [PSI+] cells decreases over time. This process is called curing and is due to a failure to propagate the prion form of Sup35p. We describe how curing can be modelled, and improve upon previous models for the underlying processes of cell division and prion segregation; the new model allows for asymmetric cell division and unequal prion segregation. We conclude by outlining plans for future experimentation and modelling.
  • Cole, D., Morgan, B. and Ridout, M. (2003). Generalized linear mixed models for strawberry inflorescence data. Statistical Modelling [Online] 3:273-290. Available at: http://dx.doi.org/10.1191/1471082X03st060oa.
    Strawberry inflorescences have a variable branching structure. This paper demonstrates how the inflorescence structure can be modelled concisely using binomial logistic generalized linear mixed models. Many different procedures exist for estimating the parameters of generalized linear mixed models, including penalized likelihood, EM, Bayesian techniques, and simulated maximum likelihood. The main methods are reviewed and compared for fitting binomial logistic generalized linear mixed models to strawberry inflorescence data. Simulations matched to the original data are used to show that a modified EM method due to Steele (1996) is clearly the best, in terms of speed and mean-squared-error performance, for data of this kind.


  • Cole, D. (2020). Parameter Redundancy and Identifiability. [Online]. Chapman and Hall/CRC. Available at: https://www.routledge.com/Parameter-Redundancy-and-Identifiability/Cole/p/book/9781498720878.
    Statistical and mathematical models are defined by parameters that describe different characteristics of those models. Ideally it would be possible to find parameter estimates for every parameter in that model, but, in some cases, this is not possible. For example, two parameters that only ever appear in the model as a product could not be estimated individually; only the product can be estimated. Such a model is said to be parameter redundant, or the parameters are described as non-identifiable. This book explains why parameter redundancy and non-identifiability are a problem and the different methods that can be used for detection, including in a Bayesian context.
  • Newman, K., Buckland, S., Morgan, B., King, R., Borchers, D., Cole, D., Besbeas, P., Gimenez, O. and Thomas, L. (2014). Modelling Population Dynamics: Model Formulation, Fitting and Assessment Using State-Space Methods. Springer.
    Provides a unifying framework for estimating the abundance of open populations that are subject to births, deaths and movement in and out of the population

    Going beyond the estimation of abundance, teaches ways of determining the reasons for variation in abundance over time and survival probabilities

    Ecologists and wildlife managers will learn to model dynamics in annual cycles for populations of large vertebrates, including discrete time models

    This book gives a unifying framework for estimating the abundance of open populations: populations subject to births, deaths and movement, given imperfect measurements or samples of the populations. The focus is primarily on populations of vertebrates for which dynamics are typically modelled within the framework of an annual cycle, and for which stochastic variability in the demographic processes is usually modest. Discrete-time models are developed in which animals can be assigned to discrete states such as age class, gender, maturity, population (within a metapopulation), or species (for multi-species models).

    The book goes well beyond estimation of abundance, allowing inference on underlying population processes such as birth or recruitment, survival and movement. This requires the formulation and fitting of population dynamics models. The resulting fitted models yield both estimates of abundance and estimates of parameters characterizing the underlying processes.


  • Cole, D. and Freeman, S. (2012). Estimating Age-Specific Survival Rates from Historical Ringing Data: Comment. University of Kent Technical Report.
    Robinson (2010) describes a model for recoveries of Sandwich Terns Sterna sandvicensis. As is often the case in the UK, the numbers of birds ringed each year, at least until recently, are not fully computerised. The model assumes a fixed proportion of birds in different age classes. We show that the proportion can be estimated from the data, improving accuracy in estimates of survival.
  • Cole, D. and Choquet, R. (2012). Parameter Redundancy in Models With Individual Random Effects. University of Kent.
    In parameter redundant models it is not possible to estimate all the parameters regardless of the amount of data collected. Introducing individual random effects may result in models that are no longer parameter redundant. Here we show how it is possible to determine whether or not a wide class of models with individual random effects are parameter redundant.
  • Cole, D. (2009). A Note on the Identifiability of Certain Latent Class Models. University of Kent.
    Wiering (2005, Statistics and Probability Letters, 75, 211-218) provides conditions for the identifiability of a class of latent models. Here we derive an alternative, more general method of proving this result, which is based on standard identifiability methods involving forming Jacobians.
  • Ridout, M., Woodcock, C. and Cole, D. (2006). Generation Number in a Pure Birth Process. University of Kent.


  • Jeyam, A. (2017). New Diagnostic Tools for Capture-Recapture Models.
    Capture-recapture models have increased in complexity over the last decades and goodness-of-fit assessment is crucial to ensure that considered models provide an adequate fit to the data. In this thesis, my primary emphasis is to develop new diagnostic tools for capture-recapture models in order to target possible reasons of lack-of-fit, which might provide biological insights and point towards better-fitting candidate models.

    Starting with the basic Cormack-Jolly-Seber model, I develop a new tool for detecting heterogeneity in capture. I then progress to the more complex multi-state models, for which I propose a test for detecting a mover-stayer structure within the population. Finally, I move on to more general models presenting additional levels of uncertainty: first partial observations and then unobservable states. In the presence of partial observations, some of the observations are assigned to states with certainty whereas others are not. I develop a new test for the underlying state-structure of the partial observations; this test detects when the partial observations are not generated by the observable states defined in the experiment. In the presence of unobservable states, the additional level of uncertainty relates only to the non-captures. I present a procedure to test whether one or two unobservable states need to be defined in order for the model to provide an adequate fit to the data.

    Lastly, I explore the use of multi-state models to incorporate individual time-varying covariates, based on a fine discretisation of the covariate space.
  • Jourdain, N. (2017). New Analytical Methods for Camera Trap Data.
    Density estimation of terrestrial mammals has become increasingly important in ecology, and robust analytical tools are required to provide results that will guide wildlife management. This thesis concerns modelling encounters between unmarked animals and camera traps for density estimation. We explore the Random Encounter Model (REM) of Rowcliffe et al. (2008), developed for estimating density of species that cannot be identified to the individual level from camera trap data. We demonstrate how REM can be used within a maximum likelihood framework to estimate density of unmarked animals, motivated by the analysis of a data set from Whipsnade Wild Animal Park (WWAP), Bedfordshire, south England. The remainder of the thesis focuses on developing and evaluating extended Random Encounter Models, which describe the data in an integrated population modelling framework. We present a variety of approaches for modelling population abundance in an integrated Random Encounter Model (iREM), where complicating features are the variation in the encounters and animal species. An iREM is a more flexible and robust parametric model compared with a nonparametric REM, which produces novel and meaningful parameters relating to density, accounting for the sampling variability in the parameters required for density estimation. The iREM model we propose can describe how abundance changes with diverse factors such as habitat type and climatic conditions. We develop models to account for induced bias in the density from faster moving animals, which are more likely to encounter camera traps, and address the independence assumption in integrated population models. The models we propose consider a functional relationship between a camera index and animal density and represent a step forward with respect to the current simplistic modelling approaches for abundance estimation of unmarked animals from camera trap data. We illustrate the application of the models proposed to a community of terrestrial mammals from a tropical moist forest at Barro Colorado Island (BCI), Panama.
  • Yu, C. (2015). The Use of Mixture Models in Capture-Recapture.
    Mixture models have been widely used to model heterogeneity. In this thesis, we focus on the use of mixture models in capture-recapture, for both closed populations and open populations.
    We provide both practical and theoretical investigations. A new model is proposed for closed populations and the practical difficulties of model fitting for mixture models are demonstrated for open populations.
    As the number of model parameters can increase with the number of mixture components, whether we can estimate all of the parameters using the method of maximum likelihood is an important issue. We explore this using formal methods and develop general rules to ensure that all parameters are estimable.
  • Hubbard, B. (2014). Parameter Redundancy With Applications in Statistical Ecology.
    This thesis is concerned with parameter redundancy in statistical ecology models. If it is not possible to estimate all the parameters, a model is termed parameter redundant. Parameter redundancy commonly occurs when parameters are confounded in the model so that the model could be reparameterised in terms of a smaller number of parameters. In principle, it is possible to use symbolic algebra to determine whether or not all the parameters of a certain ecological model can be estimated using classical methods of statistical inference.

    We examine a variety of different ecological models. We begin by exploring models based on marking a number of animals and observing the same animals at future time points. These observations can either be when the animal is marked and then recovered dead in mark-recovery modelling, or when the animal is marked and then recaptured alive in capture-recapture modelling. We also explore capture-recapture-recovery models where both dead recoveries and alive recaptures can be observed in the same study. We go on to explore occupancy models, which are used to obtain estimates of the probability of presence, or absence, for living species by the use of repeated detection surveys; these models have the advantage that individuals are not required to be marked. A variety of different occupancy models are examined, including those with season-dependent, group-dependent and species-dependent parameters, along with other models.

    We investigate parameter redundancy by deriving general results for a variety of different models in which the model's parameter dependencies can be relaxed to suit different studies. We also analyse how the results change for specific data sets and how sparse data influence whether or not a model is parameter redundant, using procedures written in Maple. This theory on parameter redundancy is vital for the correct use of these ecological models so that valid statistical inference can be made.