Mem Inst Oswaldo Cruz, Rio de Janeiro, Vol. 101, No. 7, November 2006, pp. 775-777

The untranslated regions of genes from Trypanosoma cruzi: perspectives for functional characterization of strains and isolates

Adeilton Brandão

Departamento de Medicina Tropical, Instituto Oswaldo Cruz-Fiocruz, Av. Brasil 4365, 21045-900 Rio de Janeiro, RJ, Brazil

Received 16 May
2006

Code Number: oc06129

The sequencing of the Trypanosoma cruzi genome has been completed and a great deal of information is now available. However, the organization of protozoan genomes is somewhat elusive, and much effort must be applied to reveal all the information encoded in the nucleotide sequences. Among the DNA segments that need further investigation are the untranslated regions of genes. Many of the T. cruzi genes revealed by the genome sequencing lack information about their untranslated regions. In this paper, some features of these untranslated segments, as well as their applications in the study of T. cruzi populations, are discussed.

Key words: untranslated regions - genes - genomes - protozoa

The completion of the Trypanosoma cruzi genome sequencing (El-Sayed et al. 2005) has disclosed to the community of parasitologists an enormous volume of information about genes and other nucleotide sequences from this protozoan parasite. Paradoxically, this same flow of information constitutes a bottleneck for revealing all molecular aspects of the parasite life cycle, from its purely biological component (interaction with the invertebrate and vertebrate hosts) to its medical and social impact (Chagas disease and its clinical variations). At present, the speed with which nucleotide sequences are produced has led to a disequilibrium between the potential of the genomic information and its direct use for the elucidation of biological phenomena. The reductionist impulse stimulated by the availability of nucleotide sequences has only made clear that the complex network of biological interactions is not explicitly encoded in the order of the genome bases. Indeed, these interactions are the result of a dynamic process in which several living beings define which genomes will survive.
In the case of parasites and their hosts, there is always a dispute between two unrelated genomes, which can interact for millions of years, and the net result is an improvement in their skills for survival in hostile environments.

The genes of eukaryotes have segments at their 5' and 3' ends that are transcribed into RNA but not translated, and they are therefore called untranslated regions, or simply 5' and 3' UTRs. In these segments, sequence elements involved in the regulation of gene expression and in the interaction with newly described types of RNA (e.g., micro-RNA) have already been identified (Mignone et al. 2002, Hughes 2006). In T. cruzi (and all trypanosomatids), most genes have no introns and the messenger RNAs are processed by the mechanism of trans-splicing (Gull 2001, Liang et al. 2003). In principle, the analysis of transcripts in trypanosomatids could be easily accomplished as soon as the respective sequences are available. However, the definition of the "cardinal points" in the UTRs of T. cruzi genes, i.e. the beginning of the 5'UTR and the end of the 3'UTR, depends a great deal on experimental investigation, especially for the 3'UTR, since the polyadenylation signal typical of eukaryotes has not been recognized in trypanosomatids. For the time being, no computational tool has been developed to accurately predict the site of polyadenylation in trypanosomatids, and only with the aid of experimental work can the complete information be extracted from their genes. The absence of a canonical polyadenylation signal in trypanosomatids may be related to the mechanism that couples the polyadenylation of an upstream gene to the trans-splicing of a downstream gene; it has been shown that modification of the downstream splice site (dinucleotide AG) shifts the poly(A) site (Lebowitz et al. 1993, Mathews et al. 1994).

The nucleotide sequence per se does not indicate precisely at which points the transcription of T.
cruzi genes begins and ends, but the existence of cis elements (the dinucleotide AG, polypyrimidine-rich tracts, the adenine branch point) that are involved in the coupling of trans-splicing and polyadenylation provides the starting point for the generation of transcription recognition algorithms (Gopal et al. 2005, Benz et al. 2005). Although these algorithms proved to work fairly well in detecting the correct splice site for the organisms studied (> 80% of hits for T. brucei and Leishmania major), they suffer from the limitations imposed on gene prediction algorithms in general (Brent & Guigo 2004). Experimental approaches can also be used to complement the work of these algorithms, and some methods have been developed to explore the canonical signals associated with trypanosome mRNA processing (Webb et al. 2005, Siegel et al. 2005). Thus, if information generated by cDNA sequencing or transcription assays is not available, computational analysis of sequences allows only an estimate of these "cardinal points".

In this context, a window of opportunity opens for complementing the information of the T. cruzi genome, and to achieve this goal it is essential that an accurate description of the untranslated regions be available for thorough analysis. The power of current experimental methods allows one to target individual nucleotides in these regions and afterwards observe the effects on parameters such as transcription rates, gene expression and regulation, mRNA stability and half-life. The association of sequence motifs in the 3'UTR of T. cruzi genes with defined cellular and molecular effects has gained sound experimental support (D'Orso & Frasch 2001, Bartholomeu et al. 2002). Though both untranslated segments are expected to influence gene expression and function, most investigations point to the 3'UTR playing the major role through the presence of cis-acting elements in T. cruzi genes (Nozak & Cross 1995, D'Orso et al. 2003).
This can be viewed as the result of structural differences: the 3'UTR, being on average 2-3 times longer than the 5'UTR, has more possibilities to influence mRNA processing (more sequence elements, protein binding sites, and secondary structures). Genes such as glycoprotein 72 (gp72), glycoprotein 85 (gp85), glyceraldehyde 3-phosphate dehydrogenase (gapdh), amastin, β-tubulin, mucin, trans-sialidase, and FL-60 have elements (presumptively or accurately described) in their 3'UTR that in different ways affect gene expression in a stage-specific manner (Nozak & Cross 1995, Abuin et al. 1999, Weston et al. 1999, Coughlin et al. 2000, Di Noia et al. 2000, Bartholomeu et al. 2002).

The bottleneck created by the huge amount of nucleotide sequences offers new routes in the search for models and experimental strategies that translate the DNA code into biological phenomena. Thus, the translation of DNA sequence into biological phenomena is a process of searching for codes behind the code, i.e., the search for a meta-code. Except for housekeeping genes and functionally important genes in closely related species, the composition and structural characteristics of eukaryotic UTRs can be very distinct (Mignone et al. 2005). At first, this can be explained by the absence of the strong functional pressures that act on the protein-coding segments of genes. Inspection of T. cruzi sequences shows that the untranslated regions of some genes can be good molecular and functional markers for T. cruzi populations. Minor variations in each UTR (small insertions/deletions) can be the starting points for establishing relationships between gene expression and population structure. It is not clear on what basis this relationship should be structured, but some hints come from variations detected in the UTRs of evolutionarily conserved genes in strains of T. cruzi. It is worth noting that small changes in the UTRs can affect the expression of the gene (Weston et al.
1999, Teixeira et al. 1999), and this can be a smart evolutionary strategy to gain functional advantage without large mutations in the genes themselves. In protozoan parasites, the result of these alterations is a greater ability to survive in different environments.

Sequence analysis shows that the 3'UTR of the calmodulin gene in the major lineages of T. cruzi presents mutational events that cause alterations in the predicted secondary structure of the respective messenger RNA (Brandao & Fernandes 2006). This gene is very important for the cell and carries out multiple tasks in cellular metabolism (Hoeflich & Ikura 2002). Thus, it is very promising that a population group of T. cruzi, the so-called Zymodeme III (ZIII), which is predominant in a region that is one of the biggest repositories of genetic diversity on the globe (the Amazon region), presents a deletion of 13 nucleotides in the 3'UTR in comparison with the groups TcI and TcII. This different secondary structure of the calmodulin messenger RNA in T. cruzi groups may mean that either the expression or the regulation of this fundamental gene is driving the functional divergence of T. cruzi populations. If this is true, the network of gene interactions in T. cruzi and its biological consequences (interaction with the vertebrate and invertebrate hosts) show fluctuations that not only respond to the different environments but also create evolutionary alternatives for the species.

In the present description of the T. cruzi genome (El-Sayed et al. 2005), the ~ 6000 hypothetical genes are presumed to act one by one with their respective untranslated regions, but a completely different picture can emerge if we imagine this same number of genes interacting with each other. In this latter possibility, two separate (conceptually, not biologically) networks should be built: one for the gene products and one for the untranslated region interactions.
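
The comparison of UTRs between strains mentioned above, in which one group carries a short deletion relative to another, can be illustrated with a minimal Python sketch. It is not the method used in the cited calmodulin study. The function name `single_indel` and the toy sequences are hypothetical and were invented for illustration, and the approach assumes that the two sequences are identical apart from one contiguous insertion/deletion. Real strain comparisons would normally rely on a proper sequence alignment.

```python
from typing import Optional, Tuple


def single_indel(seq_a: str, seq_b: str) -> Optional[Tuple[int, int, str]]:
    """Locate one contiguous indel between two otherwise identical sequences.

    Returns (start, length, segment), where start is the 0-based position in
    the longer sequence and segment is the stretch absent from the shorter one.
    Returns None when both sequences have the same length.
    Raises ValueError if more than a single contiguous indel separates them.
    """
    longer, shorter = (seq_a, seq_b) if len(seq_a) >= len(seq_b) else (seq_b, seq_a)
    gap = len(longer) - len(shorter)
    if gap == 0:
        return None  # no length-changing event to report

    # Walk along the common prefix until the first mismatch.
    start = 0
    while start < len(shorter) and longer[start] == shorter[start]:
        start += 1

    # After skipping `gap` bases in the longer sequence, the tails must agree.
    if longer[start + gap:] != shorter[start:]:
        raise ValueError("sequences differ by more than one contiguous indel")

    return start, gap, longer[start:start + gap]


# Toy sequences (hypothetical, NOT real T. cruzi calmodulin 3'UTR data).
flank_5 = "ACGTTGCA"
deleted = "TTTCCGGAATACG"  # 13 nucleotides, chosen only to mirror the text
flank_3 = "CAGTACGT"

reference_utr = flank_5 + deleted + flank_3
variant_utr = flank_5 + flank_3

print(single_indel(reference_utr, variant_utr))
# -> (8, 13, 'TTTCCGGAATACG')
```

When the bases at the edge of the missing segment repeat in the flanking sequence, the reported position is the rightmost equivalent placement. A sketch like this only locates the event. Assessing its effect on the predicted secondary structure of the messenger RNA would require a dedicated RNA folding tool and, ultimately, experimental confirmation.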
The meeting of two very different organisms, a protozoan parasite and its vertebrate host, is a biological phenomenon in which the more complex genome (from the point of view of content and functions) will be tested in its capacity to deal with a less complex one. This will promote many events of selection, adaptation and genomic variation. Thus, from an evolutionary perspective, the parasitic David (less complex genomes) will keep on challenging the Goliath contained in the more complex genomes.
Copyright 2006 Instituto Oswaldo Cruz - Fiocruz