Year 2018, Volume 47, Issue 5, Pages 1335 - 1347 2018-10-16

Overdispersed count models for mRNA transcription

Burcin Simsek, Satish Iyengar


Direct detection of gene activity is often not possible because new proteins from an individual activation event are masked by proteins remaining from previous events. Researchers therefore determine gene activation or inactivation by observing messenger RNA (mRNA) production instead. Typically, mRNA transcription occurs in short, rapid bursts when the gene is in its on-state, and no transcription occurs during its off-state. This burstiness of mRNA production is not well modeled by a Poisson process. We propose the Conway-Maxwell-Poisson (COM-Poisson) distribution as a potential alternative to the more common negative binomial (NB) distribution. We use the generalized linear model versions of these distributions to incorporate covariate information, and we consider zero inflation to model excess zero counts. We illustrate the proposed methods with data from E. coli bacteria and mammalian cells. We find that when a biophysically derived distribution is available, it performs well. We also show that in the absence of such biophysical knowledge, the COM-Poisson is competitive with the NB. Both the COM-Poisson and the NB arise in queueing theory, suggesting that further application of that framework to the study of mRNA dynamics would be useful.
Keywords: Conway-Maxwell-Poisson, Link function, Model comparison, Negative binomial, Generalized linear model
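
As an illustration of why the COM-Poisson can capture the overdispersion described in the abstract, the following minimal sketch evaluates its probability mass function in the standard parameterization, P(X = x) proportional to lam^x / (x!)^nu, where nu = 1 recovers the Poisson, nu < 1 gives overdispersion, and nu > 1 gives underdispersion. The function names, the truncation point `max_count`, and the zero-inflation helper are assumptions made for this example; they are not taken from the paper, which may compute the normalizing constant differently.

```python
import math


def com_poisson_pmf(lam, nu, max_count=200):
    """Return [P(X=0), ..., P(X=max_count)] for a COM-Poisson(lam, nu).

    Assumption: the infinite normalizing sum is truncated at max_count,
    which is adequate only when the tail beyond max_count is negligible.
    """
    # Log of the unnormalized weight lam^x / (x!)^nu, for numerical stability.
    log_w = [x * math.log(lam) - nu * math.lgamma(x + 1)
             for x in range(max_count + 1)]
    shift = max(log_w)
    weights = [math.exp(v - shift) for v in log_w]
    z = sum(weights)  # truncated normalizing constant (up to the shift)
    return [w / z for w in weights]


def mean_and_variance(pmf):
    """Mean and variance of a pmf given as a list indexed by the count."""
    mean = sum(x * p for x, p in enumerate(pmf))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))
    return mean, var


def zero_inflate(pmf, pi_zero):
    """Mix a point mass at zero (weight pi_zero) with the count pmf."""
    mixed = [(1.0 - pi_zero) * p for p in pmf]
    mixed[0] += pi_zero
    return mixed


if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0):
        m, v = mean_and_variance(com_poisson_pmf(3.0, nu))
        print(f"nu={nu}: mean={m:.3f}, variance={v:.3f}")
```

Comparing the printed mean and variance across values of nu shows the dispersion changing direction around nu = 1. The zero-inflation helper simply adds extra mass at zero, which is one common way to represent excess zero counts.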
Primary Language en
Subjects Mathematics
Journal Section Statistics

Author: Burcin Simsek (Primary Author)
Country: United States

Author: Satish Iyengar
Country: United States

Bibtex @article{hujms471515, author = {Simsek, Burcin and Iyengar, Satish}, title = {Overdispersed count models for mRNA transcription}, journal = {Hacettepe Journal of Mathematics and Statistics}, year = {2018}, volume = {47}, number = {5}, pages = {1335--1347}, issn = {2651-477X}, address = {Hacettepe University} }