Direct detection of gene activity is often not possible because new proteins from an individual activation event are masked by proteins remaining from previous events. Researchers therefore determine gene activation or inactivation by observing messenger RNA (mRNA) production instead. Typically, mRNA transcription occurs in short, rapid bursts while the gene is in its on-state, and no transcription occurs during its off-state. This burstiness of mRNA production is not well modeled by a Poisson process. We propose the Conway-Maxwell-Poisson (COM-Poisson) distribution as a potential alternative to the more common negative binomial (NB) distribution. We use the generalized linear model versions of these distributions to incorporate covariate information, and we consider zero inflation to model excess zero counts. We illustrate the proposed methods with data from E. coli bacteria and mammalian cells. We find that when a biophysically derived distribution is available, it performs well. We also show that in the absence of such biophysical knowledge, the COM-Poisson is competitive with the NB. Both the COM-Poisson and the NB arise in queueing theory, suggesting that further application of that framework to the study of mRNA dynamics would be useful.
Conway-Maxwell-Poisson, Link function, Model comparison, Negative binomial, Generalized linear model
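
As a rough illustration of why the COM-Poisson distribution can accommodate bursty counts, the sketch below evaluates its standard probability mass function, P(Y = y) proportional to lambda^y / (y!)^nu, and compares the mean and variance for a few values of the dispersion parameter nu. The function names, the truncation length `max_terms`, and the parameter values are assumptions chosen for illustration and do not come from the paper; the normalizing constant is approximated by a truncated sum, which is only reasonable when the series converges quickly (for example, nu not too close to zero).

```python
import math


def com_poisson_pmf_table(lam, nu, max_terms=500):
    """Approximate COM-Poisson probabilities for y = 0, ..., max_terms - 1.

    The normalizing constant Z(lam, nu) = sum_j lam**j / (j!)**nu is
    approximated by truncating the series at max_terms (an assumption).
    """
    if lam <= 0 or nu < 0:
        raise ValueError("lam must be positive and nu non-negative")
    # Work on the log scale to avoid overflow in lam**j and j!.
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    shift = max(log_w)
    weights = [math.exp(v - shift) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def mean_and_variance(probs):
    """Mean and variance of a count distribution given as a probability list."""
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    # nu = 1 is the ordinary Poisson; nu < 1 is over-dispersed; nu > 1 is under-dispersed.
    for nu in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(com_poisson_pmf_table(lam=2.0, nu=nu))
        print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")
```

Setting nu = 1 recovers the Poisson distribution, while nu below 1 gives a variance larger than the mean, which is the over-dispersed regime relevant to bursty mRNA counts.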