Abstract
Regulatory networks consist of genes encoding transcription factors (TFs) and the genes they activate or repress. Various systems of ordinary differential equations (ODEs) have been proposed to model these networks, ranging from linear to Michaelis-Menten approaches. In practice, a serious obstacle to estimating these models is that the TFs are generally unobserved, because high-throughput techniques for measuring protein abundance in the cell are currently lacking. The challenge is to infer the TF activity profiles together with the kinetic parameters of the ODE using expression level measurements of the genes they regulate.
In this work we propose a general statistical framework to infer the kinetic parameters of regulatory networks with one or more TFs using time course gene expression data. Our approach is also able to predict the activity levels of the TF. We use a penalized likelihood approach in which the ODE is used as a penalty. The main advantage is that, unlike most proposed methods, the approach does not require the solution of the ODE explicitly. This makes it computationally efficient and suitable for large systems with many components. We use the proposed method to study an SOS repair system in Escherichia coli. The reconstructed TF profile behaves similarly to experimentally measured profiles, and the gene expression data are fitted well.
7 Appendix
7.1 Notation
For the proofs below we introduce some notation. We define a matrix

vectors Φk=(δk, θk, Σk, μ, αk),
and functions

k=1, …, m.
Proof of (15). (Derivation of the influence matrix.)
We can write the penalized log-likelihood as

The maximizer of lλ,k with respect to αk is given by 
From (14) it follows

where $S_{\lambda,k}=K_{\delta_k}(K_{\delta_k}+2\lambda\Sigma_k)^{-1}$ is the influence matrix, which depends on the regularization parameter $\lambda$.
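As a numerical illustration of the influence matrix S = K (K + 2λΣ)^{-1} stated above, the following minimal sketch evaluates it for a small example. The squared-exponential kernel, the time points, and the variance values are assumptions chosen only for illustration, and the helper names `rbf_gram` and `influence_matrix` are hypothetical; none of them are taken from the paper.

```python
import numpy as np


def rbf_gram(t, length_scale=1.0):
    """Gram matrix of a squared-exponential kernel (illustrative choice only)."""
    diff = t[:, None] - t[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def influence_matrix(K, Sigma, lam):
    """Return S = K (K + 2*lam*Sigma)^{-1} without forming an explicit inverse."""
    A = K + 2.0 * lam * Sigma
    # S A = K  <=>  A^T S^T = K^T, so solve for S^T and transpose back.
    return np.linalg.solve(A.T, K.T).T


t = np.linspace(0.0, 10.0, 8)            # hypothetical time points
K = rbf_gram(t)                          # stands in for the kernel matrix K
Sigma = np.diag(np.full(t.size, 0.25))   # stands in for the diagonal variance matrix
for lam in (0.1, 1.0, 10.0):
    S = influence_matrix(K, Sigma, lam)
    print(f"lambda={lam:5.1f}  trace(S)={np.trace(S):.3f}")
```

In smoothing problems the trace of such a matrix is often read as the effective degrees of freedom of the fit, and it shrinks as λ grows. Whether the trace is used in exactly this way in the paper's own procedure is an assumption here.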
Proof of (20). (Expectation step of EM algorithm.)
For every k=1, …, m we calculate
Denote by ΣH,k and ΣO,k the diagonal matrices whose diagonals are the vectors of variances of the hidden and observed observations of the kth equation, respectively. Splitting the likelihood into two parts corresponding to the hidden and the observed observations, we obtain that

where
In the previous expression we have that

By using the properties of the expectation and the variance and by factorizing terms it is straightforward to obtain that

Substituting eq. (29) into the expected likelihood we obtain that

Define
Grouping the terms we obtain that

By taking the sum over k’s and taking into account (18) it is straightforward to conclude the proof.
Proof of (22). (Maximization step of EM algorithm.) Denote by
Then for fixed δk and Σk the maximum of
is attained at the vector
By substituting αk into the expression and simplifying we obtain that

where
By summing over k we obtain

as we aimed to prove.
©2013 by Walter de Gruyter Berlin Boston