The similarity between the gene expression model and ...
I would like to introduce to SHVN members a new study by Dr. Đạt that has just been published in Molecular Systems Biology, a Nature Publishing Group journal.
Paper: http://www.nature.com/msb/journal/v2/n1/full/msb4100054.html
Commentary: http://www.nature.com/msb/journal/v2/n1/full/msb4100055.html
If anyone is interested, please help translate the abstract of the paper and the commentary into Vietnamese so that they can be featured on the SHVN front page.
Transcription regulation has been responsible for organismal complexity and diversity in the course of biological evolution and adaptation, and it is determined largely by the context-dependent behavior of cis-regulatory elements (CREs). Therefore, understanding principles underlying CRE behavior in regulating transcription constitutes a fundamental objective of quantitative biology, yet these remain poorly understood. Here we present a deterministic mathematical strategy, the motif expression decomposition (MED) method, for deriving principles of transcription regulation at the single-gene resolution level. MED operates on all genes in a genome without requiring any a priori knowledge of gene cluster membership, or manual tuning of parameters. Applying MED to Saccharomyces cerevisiae transcriptional networks, we identified four functions describing four different ways that CREs can quantitatively affect gene expression levels. These functions, three of which have extrema in different positions in the gene promoter (short-, mid-, and long-range) whereas the other depends on the motif orientation, are validated by expression data. We illustrate how nature could use these principles as an additional dimension to amplify the combinatorial power of a small set of CREs in regulating transcription.
COMMENTS:
Modeling gene expression control using Omes Law
Harmen J Bussemaker
Department of Biological Sciences and Center for Computational Biology and Bioinformatics, Columbia University, New York, NY, USA
The binding of transcription factors (TFs) to specific sites in the genome is a crucial step in the molecular process controlling gene expression. The in vitro sequence specificity of these regulatory proteins can generally be well represented by consensus DNA motifs or slightly more sophisticated sequence profiles called position-specific scoring matrices. These are widely used to scan genome sequences in order to find novel transcriptional target genes. Unfortunately, usually only a small fraction of the 'hits' thus obtained are functional in vivo, where local chromatin structure and TF–TF interactions come into play. Taking into account the context provided by the surrounding noncoding DNA is therefore essential. In a recent study currently published in Molecular Systems Biology, Nguyen and D'haeseleer (2006) present a promising strategy for determining which context features are most important for a given TF binding motif. Their approach belongs to a growing class of methods that fit simple mathematical models of transcription regulation to DNA microarray data to map gene regulation networks.
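For readers who have not worked with position-specific scoring matrices, here is a minimal sketch of how a motif scan might look. The count matrix, the pseudocount, the uniform background and the threshold are all made-up assumptions for illustration; none of them come from the paper or the commentary.

```python
import math

# Hypothetical base counts for a 4-bp motif, one dict per motif position.
# These numbers are invented for illustration only.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def build_pssm(counts, pseudocount=1.0, background=0.25):
    """Convert position-wise base counts into log2-odds scores."""
    pssm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pssm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Return (start, score) for every window scoring at least `threshold`."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


pssm = build_pssm(COUNTS)
# Toy "promoter" sequence containing the motif ACGT at offsets 2 and 10.
hits = scan("TTACGTAAGGACGTCC", pssm, threshold=4.0)
print(hits)
```

A scan like this only reports sequence matches. As the commentary stresses, most such hits are not functional in vivo, which is why the promoter context has to be modeled as well.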
Many of the molecular players that govern gene expression are known, but our knowledge about their interactions with the DNA and with each other is very incomplete. Information about the gene regulatory network is only implicitly represented in the large volume of functional genomics data now available to us. The strengths of the 'arrows' between TFs and their target genes and the condition-specific activities of the regulatory 'nodes' need to be inferred by computational means. A detailed mathematical model that accurately describes the molecular computations performed by the cell would greatly deepen our understanding of cellular physiology, and provide a framework for analyzing regulatory pathways or predicting the effects of genetic variation between individuals.
While the activity of a TF is often represented by its mRNA expression level (Segal et al, 2003), regulatory control is more often than not exerted at the level of subcellular localization or covalent modification of the protein, or the presence/absence of ligands. These variables really define the regulatory state of the cell, but they are much harder to measure experimentally than mRNA expression levels and therefore usually remain 'hidden'. Nguyen and D'haeseleer use multivariate linear modeling to computationally infer the hidden post-translational activity of each TF from the mRNA expression levels of its target genes, ignoring the mRNA expression level of the TF itself. This model-based approach was previously introduced (Bussemaker et al, 2001) as an alternative to clustering-based analysis of microarray data (Eisen et al, 1998; Beer and Tavazoie, 2004), and has been extended to include TF deletion data (Wang et al, 2002), position-specific scoring matrices (Conlon et al, 2003; Foat et al, 2005), and TF–TF interactions (Das et al, 2004). Since each individual microarray experiment is analyzed by itself, TF activities can be inferred in a condition-specific manner.
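To make the idea of inferring hidden TF activities concrete, here is a small sketch that fits a linear model to a single experiment. It assumes that the expression change of each gene is a sum over TFs of (number of motif matches in the promoter) times (TF activity), and solves for the activities by ordinary least squares. The motif counts and the "true" activities are invented, and this is not the authors' MED implementation. Note that the mRNA level of the TF itself never enters the fit.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def infer_tf_activities(motif_counts, expression):
    """Least-squares fit of expression[g] ~ sum_t motif_counts[g][t] * activity[t]."""
    n_tfs = len(motif_counts[0])
    # Normal equations: (N^T N) a = N^T e
    gram = [[sum(row[i] * row[j] for row in motif_counts) for j in range(n_tfs)]
            for i in range(n_tfs)]
    moment = [sum(row[i] * e for row, e in zip(motif_counts, expression))
              for i in range(n_tfs)]
    return solve_linear_system(gram, moment)


# Hypothetical data: 5 genes, 2 TFs. Entry [g][t] is the number of matches
# to the motif of TF t in the promoter of gene g.
MOTIF_COUNTS = [[1, 0], [0, 1], [2, 1], [1, 1], [0, 2]]
TRUE_ACTIVITIES = [1.5, -0.5]  # hidden values, used only to simulate the data
EXPRESSION = [sum(n * a for n, a in zip(row, TRUE_ACTIVITIES))
              for row in MOTIF_COUNTS]

activities = infer_tf_activities(MOTIF_COUNTS, EXPRESSION)
print(activities)
```

Because the simulated data are noise-free, the fit recovers the hidden activities. With real microarray data the same fit would be repeated for each experiment, giving one activity estimate per TF per condition.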
The ability to infer condition-specific TF activities makes it possible to estimate the regulatory coupling strength between a TF and a putative target gene, by comparing the mRNA expression profile of the gene with the inferred TF activity profile across a large number of microarray experiments. This approach has previously been used (Liao et al, 2003; Gao et al, 2004) to refine the gene regulatory network structure derived from genome-wide TF occupancy data (Harbison et al, 2004). Nguyen and D'haeseleer derive their initial guess of the network connectivity from matches to TF binding motifs in noncoding sequence, and subsequently use a modified version of the method of Liao et al (2003) to self-consistently infer a matrix of inferred activities of every TF in every condition and a matrix of regulatory coupling strengths between every TF and every gene. Their approach provides an alternative to the use of evolutionary conservation to distinguish functional DNA motifs from nonfunctional ones (Kellis et al, 2003). While this is already interesting per se, the unique insight of the authors is that the inferred regulatory couplings can in turn be analyzed to determine which aspects of the promoter context cause the same motif to be functional in one gene and nonfunctional in another. They use this approach to gain insight into the role of promoter geometry and the interplay between two elusive motifs called PAC and rRPE.
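As a rough illustration of how a coupling strength could be estimated once TF activities are known, the sketch below regresses a gene's expression profile on a TF's activity profile across conditions, using a fit through the origin. It handles a single TF and a single gene with invented numbers; the iterative, multi-TF procedure described above is considerably more involved.

```python
def coupling_strength(tf_activity, gene_expression):
    """Slope of the through-origin least-squares fit expression ~ G * activity."""
    numerator = sum(a * e for a, e in zip(tf_activity, gene_expression))
    denominator = sum(a * a for a in tf_activity)
    if denominator == 0:
        raise ValueError("TF activity profile is all zeros")
    return numerator / denominator


# Hypothetical profiles across six conditions (values are made up).
TF_ACTIVITY = [1.0, -0.5, 2.0, 0.0, -1.5, 0.5]
# A gene whose expression tracks the TF activity with coupling 2.
RESPONSIVE_GENE = [2.0, -1.0, 4.0, 0.0, -3.0, 1.0]
# A gene carrying the same motif whose expression is uncorrelated with the TF.
UNRESPONSIVE_GENE = [0.5, 1.0, 0.0, -1.0, 0.5, 1.5]

g_responsive = coupling_strength(TF_ACTIVITY, RESPONSIVE_GENE)
g_unresponsive = coupling_strength(TF_ACTIVITY, UNRESPONSIVE_GENE)
print(g_responsive, g_unresponsive)
```

Two genes carrying the same motif can end up with very different couplings. That difference is what the authors then try to explain in terms of promoter context.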
An appealing analogy exists between the linear model for transcription regulation used by Nguyen and D'haeseleer and the well-known linear equation called Ohm's Law, I=GV, which states that the electrical current (I) through a resistor is proportional to the voltage (V) across it. In the cell, TF activities play the role of the voltage and transcription rates that of the current, while the regulatory coupling between a TF and a target gene corresponds to the conductivity (G) of the resistor (see Figure 1). Changes in the mRNA expression level of all genes (often called the 'transcriptome') are interpreted as a response to changes in the regulatory activity of all TFs (which we might call the 'transfactome'), and this relationship is modeled by a linear equation one might refer to as 'Omes Law'. Nguyen and D'haeseleer show that Omes Law allows them to predict condition-specific expression levels that were held out from the data set used to fit their model parameters more accurately than the method of Beer and Tavazoie (2004).
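One way to write the analogy out is given below. The index notation is my own assumption and is not taken from the commentary: g labels genes, t labels TFs, and c labels experimental conditions.

```latex
% Ohm's Law for a single resistor
I = G\,V

% 'Omes Law' for the whole network (notation assumed for illustration)
\Delta E_{gc} = \sum_{t} G_{gt}\, A_{tc}
```

Here \Delta E_{gc} is the change in mRNA expression of gene g in condition c (the "current"), A_{tc} is the activity of TF t in condition c (the "voltage"), and G_{gt} is the regulatory coupling between TF t and gene g (the "conductivity").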
Electrical engineers will be surprised to learn that, in biology, the observed conductivity of a resistor strongly depends on where it gets inserted into the electronic circuit. With the work of Nguyen and D'haeseleer, we now have a computational strategy to systematically analyze how genomic context influences the in vivo responsiveness of TF binding sites.