

Memórias do Instituto Oswaldo Cruz
Fundação Oswaldo Cruz, Fiocruz
ISSN: 1678-8060 EISSN: 1678-8060
Vol. 90, Num. 2, 1995, pp. 215-216
Memorias Instituto Oswaldo Cruz, Vol. 90(2):215-216
mar./apr. 1995

Sequencing and Identification of Expressed Schistosoma mansoni Genes by Random Selection of cDNA Clones from a Directional Library

Gloria R Franco, Andrew JG Simpson*, Sergio DJ Pena

Departamento de Bioquimica e Imunologia, Instituto de Ciencias Biologicas - UFMG, Avenida Antonio Carlos 6627, 31270-901 Belo Horizonte, MG, Brasil *Laboratorio de Biologia Molecular, Centro de Pesquisas "Rene Rachou"-FIOCRUZ, Avenida Augusto de Lima 1715, 30190-002 Belo Horizonte, MG, Brasil

Code Number:OC95043

We have initiated a gene discovery program in Schistosoma mansoni based on the technique of Expressed Sequence Tags (ESTs), i.e. partial sequences of cDNAs obtained from single passes in automatic DNA sequencers. ESTs can be used to identify genes on the basis of their homology with sequences from other species deposited in DNA or protein databases. Transcripts whose sequences have no matches in the databases may represent novel parasite-specific genes. This approach has proven to be very efficient: in less than two years a broad range of novel genes has been identified, more than doubling the number of known S. mansoni genes.

Key words: Schistosoma mansoni - automated DNA sequencing - EST - gene identification

Although a decade has passed since the first Schistosoma mansoni gene was cloned, relatively little information is currently available concerning its genome and its gene complement. Indeed, GenBank contains fewer than 100 different full-length cDNA or genome sequences from schistosomes (GenBank 1993). In the context of the global effort to systematically map and sequence the human genome (Olson 1993), a program that embraces the study of a number of model organisms such as yeast, Drosophila and Caenorhabditis elegans, we have embarked on a program to characterize the schistosome genome and its component genes in depth. Our view is that, since current efforts at understanding the biology of the organism, drug design and vaccine production are being undertaken with meager molecular knowledge, a gene discovery program may contribute importantly to these areas.

The collaboration involves the Federal University of Minas Gerais, the Fundação Oswaldo Cruz (Brazil) and The Institute for Genomic Research (U.S.A.) and is focused on defining the expressed gene content of S. mansoni by producing Expressed Sequence Tags (ESTs). ESTs are single-pass sequences of at least 150 bp derived from one or both ends of cDNAs, which we obtain using automated DNA sequencers. ESTs provide an "identity tag of the gene" that can be used to identify the respective gene on the basis of its homology with sequences from the same or other organisms in the DNA databases (Adams et al. 1991). The cDNAs from which the ESTs are derived then become important tools for selecting and identifying full-length copies of the gene within DNA libraries and for mapping the position of the gene within the organism's genome.

An adult worm size-selected cDNA library was constructed and cloned into the BA vector (a phagemid derived from pEMBL), essentially as described (Adams et al. 1993). The ESTs were obtained by partial sequencing from one or both ends of randomly selected cDNAs using M13 universal, M13-40 or M13 reverse primers, in automated DNA sequencers (ALF Pharmacia or 373A Applied Biosystems). Homology searches were done at the National Center for Biotechnology Information site using the Basic Local Alignment Search Tool (Altschul et al. 1990) for comparisons at the nucleotide level (BLASTN) or the amino acid level (BLASTX). The genes were identified on the basis of high-scoring alignments between the EST and the database sequences. The cDNAs were classified as "identified genes" when they showed matches with schistosome sequences or with those of other organisms in the databases, or as "non-identified genes" when they showed only partial homology with non-schistosome genes or had no significant database match.
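The classification rule described above can be sketched as a small decision function. This is only an illustration: the `BestHit` record and its two fields are hypothetical names introduced here, and the sketch does not attempt to reproduce the score criteria actually used to decide whether an alignment was significant or partial.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BestHit:
    """Hypothetical summary of the best database alignment for one cDNA."""
    from_schistosome: bool  # the matching database entry is a schistosome sequence
    partial_only: bool      # the alignment covers only a short region (assumed flag)


def classify_cdna(hit: Optional[BestHit]) -> str:
    """Return "identified" or "non-identified" following the rule in the text."""
    if hit is None:
        # No significant database match.
        return "non-identified"
    if hit.partial_only and not hit.from_schistosome:
        # Only partial homology with a non-schistosome gene.
        return "non-identified"
    # Match with a schistosome sequence or with a gene from another organism.
    return "identified"


print(classify_cdna(None))
print(classify_cdna(BestHit(from_schistosome=False, partial_only=True)))
print(classify_cdna(BestHit(from_schistosome=True, partial_only=False)))
```

In practice the two flags would have to be derived from the alignment scores and lengths reported by BLASTN or BLASTX; that step is deliberately left out because the thresholds are not given in the text.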

We obtained a total of 536 ESTs from 390 cDNA clones. Of these clones, 21.1% corresponded to undesirable sequences such as rRNA, mitochondrial sequences and plasmids without inserts. The remaining clones were grouped as follows: (1) 36.4% with no database match, which may correspond to untranslated regions of the cDNAs or to genes expressed only in S. mansoni; (2) 5.1% with a partial match with non-schistosome sequences, whose alignments consisted of short regions probably corresponding either to conserved motifs or domains or to regions enriched in a specific amino acid; (3) 15.9% homologous to previously sequenced schistosome genes; and (4) 20.5% showing homology with genes from other organisms. The first two groups consisted of "non-identified genes" and the last two groups of "identified genes". The ESTs with no database match were compared with one another to determine whether they correspond to the same or to different genes. The corresponding cDNAs were divided into classes of distinct genes, some of them with high frequency in the library and others representing medium-abundance or low-abundance transcripts. The ESTs collected to date represent a broad range of transcripts including enzymes and known antigens as well as cytoplasmic, structural, regulatory, membrane, nuclear and secretory proteins.
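The grouping of unmatched ESTs into classes of distinct genes can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes that the pairwise comparisons have already produced a list of EST pairs judged to come from the same gene (`matching_pairs`, a hypothetical input), and it merges them with a simple union-find structure. The EST identifiers are invented for the example.

```python
def group_into_gene_classes(est_ids, matching_pairs):
    """Group ESTs into classes, where each class stands for one putative gene.

    est_ids: list of EST identifiers (hypothetical labels).
    matching_pairs: pairs of ESTs assumed to derive from the same gene.
    Returns the classes sorted from most to least abundant.
    """
    parent = {est: est for est in est_ids}

    def find(x):
        # Follow parent links to the class representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matching_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    classes = {}
    for est in est_ids:
        classes.setdefault(find(est), []).append(est)
    return sorted(classes.values(), key=len, reverse=True)


ests = ["est1", "est2", "est3", "est4", "est5"]
pairs = [("est1", "est2"), ("est2", "est3")]
for members in group_into_gene_classes(ests, pairs):
    print(len(members), members)
```

The size of each class gives a rough indication of transcript abundance in the library: large classes correspond to high-frequency cDNAs and singletons to low-abundance transcripts.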

Using the EST strategy we have partially sequenced 168 different genes from the cDNA library. Fourteen of them represent genes previously characterized in S. mansoni and 154 correspond to new genes of this organism that had never been sequenced before. Of these, 42 were identified on the basis of matches with non-schistosome entries in the databases, 20 were not precisely identified although they showed partial homology with sequences of different organisms, and 92 were not identified because they did not match any entry in the databases.
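The gene counts reported above can be checked with a short tally. The counts are taken from the text; the variable and function names are illustrative, and the percentage share of new genes is a derived figure that does not appear in the original.

```python
# Gene counts as reported in the text.
PREVIOUSLY_CHARACTERIZED = 14
NEW_GENES = {
    "matched non-schistosome entries": 42,
    "partial homology only": 20,
    "no database match": 92,
}


def summarize_gene_counts():
    """Add up the reported categories and derive the share of new genes."""
    new_total = sum(NEW_GENES.values())
    overall = PREVIOUSLY_CHARACTERIZED + new_total
    return {
        "new": new_total,
        "total": overall,
        # Derived here for illustration only; not stated in the source.
        "new_share_percent": round(100 * new_total / overall, 1),
    }


print(summarize_gene_counts())
```

The three categories of new genes add up to 154, and together with the 14 previously characterized genes they give the 168 genes stated in the text.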

This EST project was initiated in 1992 and has already more than doubled the number of defined S. mansoni genes. It represents the first initiative toward a systematic study of the parasite's genome and expressed gene complement. In addition to increasing the S. mansoni EST database, we are currently embarking on the full-length sequencing of selected genes and the construction of a genetic map using the cDNAs as probes.

REFERENCES

Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, Kerlavage AR, McCombie WR, Venter JC 1991. Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252: 1651-1656.

Adams MD, Soares MB, Kerlavage AR, Fields C, Venter JC 1993. Rapid cDNA sequencing (Expressed Sequence Tags) from a directionally cloned human infant brain cDNA library. Nature Genet 4: 373-389.

Altschul SF, Gish W, Miller W, Myers EW, Lipman D 1990. Basic local alignment search tool. J Molec Biol 215: 403-410.

GenBank 1993, Release 80, December 10.

Olson MV 1993. The human genome project. Proc Natl Acad Sci USA 90: 4338-4344.

Copyright 1995 Fundacao Oswaldo Cruz (Fiocruz)
