Journal of Postgraduate Medicine
Medknow Publications and Staff Society of Seth GS Medical College and KEM Hospital, Mumbai, India
ISSN: 0022-3859 EISSN: 0972-2823
Vol. 48, Num. 1, 2002, pp. 83-87

Structure Determination of Proteins in Solution by Nuclear Magnetic Resonance Spectroscopy

Chary KVR, Atreya HS

Department of Chemical Sciences, Tata Institute of Fundamental Research, Colaba, Mumbai - 400005, India.
Address for Correspondence: K. V. R. Chary, Department of Chemical Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Colaba, Mumbai-400005 India. Email: chary@mailhost.tifr.res.in

Code Number: jp02029

Each cell in the human body contains a myriad of biological macromolecules and organelles, which are required for various cellular functions and metabolism. Proteins are a key component of such a system and play a crucial role in the proper functioning of the cell. There are an estimated 100,000 different proteins in the human body.1 The activity of a protein molecule inside the cell is indirectly governed by the overall fold of its polypeptide chains, in other words, by its three-dimensional (3D) structure.2 Knowledge of the 3D structure of a given protein is therefore essential for a complete understanding of its function inside the cell. Many human diseases, such as Alzheimer's disease, Parkinson's disease, prion diseases, cystic fibrosis and cancers, are attributed to the malfunctioning of proteins.1 Further, knowledge of the 3D structure of a protein involved in a disease can eventually be used to design drugs that target it. This sphere of activity is popularly called quantitative structure-activity relationship (QSAR).3

As of today, there are only two experimental techniques available to determine the 3D structure of proteins, namely X-ray crystallography4 and Nuclear Magnetic Resonance (NMR) spectroscopy.5 In this article, we review the various steps involved in unraveling the 3D structures of proteins by NMR. As an example, we consider the 3D structure of a calcium binding protein from Entamoeba histolytica (also called EhCaBP), which we have recently determined using NMR.6

The NMR Methodology

The phenomenon of NMR arises in certain elements or their isotopes whose nuclei possess a non-zero nuclear spin. Such nuclei are termed NMR-active nuclei. The nucleus of such an atom can be imagined as a ball spinning about an axis (see Scheme I below). The nuclear spin is measured in units of Planck's constant (more precisely, h/2π). Nuclei such as 1H (the hydrogen nucleus, more commonly known as the proton), 13C (an isotope of carbon) and 15N (an isotope of nitrogen) possess a nuclear spin of ½. When placed in a magnetic field, such spin-½ nuclei are distributed (quantized) between two distinct energy levels, with some nuclei of the sample in the lower energy level (called the ground state or α-state) and the remainder in the higher energy level (called the excited state or β-state) (Scheme I). This distribution is governed by the Boltzmann distribution, which results in more nuclei in the ground state than in the excited state.
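
The size of this population difference can be estimated with a short calculation. The sketch below is illustrative only: it assumes the energy gap between the two levels equals Planck's constant times the resonance frequency, and it assumes a sample temperature of 298 K; the function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34    # Planck constant, J s
BOLTZMANN_K = 1.380649e-23   # Boltzmann constant, J/K


def population_ratio(frequency_hz, temperature_k=298.0):
    """Ratio of excited-state to ground-state nuclei for a spin-1/2 nucleus."""
    delta_e = PLANCK_H * frequency_hz  # energy gap between the two levels
    return math.exp(-delta_e / (BOLTZMANN_K * temperature_k))


def excess_per_million(frequency_hz, temperature_k=298.0):
    """Excess of ground-state nuclei per million nuclei in the sample."""
    ratio = population_ratio(frequency_hz, temperature_k)
    return 1e6 * (1.0 - ratio) / (1.0 + ratio)


if __name__ == "__main__":
    # Proton resonance frequencies (MHz) chosen only as examples.
    for mhz in (90, 500, 1000):
        excess = excess_per_million(mhz * 1e6)
        print(f"{mhz} MHz: about {excess:.1f} excess ground-state spins per million")
```

At room temperature the excess amounts to only a few tens of nuclei per million, which is one reason NMR needs comparatively concentrated samples.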

Nuclei in the lower energy level can be taken to the higher energy level (a process known as excitation) by applying an energy ΔE that corresponds to the difference between the energies of the ground state and the excited state (see Scheme I). However, nuclei of the same kind in a given molecule cannot all be excited with the same energy ΔE. This is primarily because the splitting of the energy levels (ΔE) of nuclei of a given type (1H, 13C or 15N) depends on their surrounding chemical environment and is mostly different in different parts of the molecule. Thus, in a molecule such as ethanol (CH3CH2OH), the protons (1H nuclei) of the CH3 group have a different energy splitting from those of the CH2 group, which in turn differs from that of the OH proton, owing to their different chemical environments. This difference in splitting among different nuclei, and hence the difference in the energy required to excite them, forms the crux of high resolution NMR spectroscopy. A plot of frequency versus the intensity of absorption of energy forms the NMR spectrum. The frequency of absorption of an individual group of nuclei is associated with its chemical shift. An NMR spectrum of ethanol is shown in Figure 1A, where the three sets of lines arising from the three different types of protons are indicated. Chemical shifts (or frequencies) of resonance lines in an NMR spectrum are always reported with respect to a reference sample, which is usually taken as the zero of reference. The most common reference is tetramethylsilane (TMS). This is shown in Figure 1A, where TMS has been assigned 0 ppm (parts per million, the unit of chemical shift). The extent of splitting of the energy levels (ΔE in Scheme I) determines the sensitivity of an NMR spectrum and, in turn, depends on the strength of the external magnetic field (denoted B0 in Scheme I). B0 is usually quoted in terms of the frequency required to excite a proton at the given magnetic field strength. Modern-day spectrometers have field strengths ranging from 90 MHz to 1000 MHz.
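
The relation between absorption frequency, chemical shift in ppm and field strength quoted in MHz can be made concrete with a small sketch. It assumes the usual definition of the ppm scale (frequency offset from the reference divided by the spectrometer frequency) and the standard proton gyromagnetic ratio; the function names and the example numbers are hypothetical.

```python
import math

GAMMA_1H = 267.522e6  # proton gyromagnetic ratio, rad s^-1 T^-1


def chemical_shift_ppm(freq_hz, ref_freq_hz, spectrometer_hz):
    """Offset of a resonance line from the reference line, in parts per million."""
    return (freq_hz - ref_freq_hz) / spectrometer_hz * 1e6


def field_tesla(proton_freq_hz):
    """Magnetic field B0 (tesla) at which protons resonate at proton_freq_hz."""
    return 2.0 * math.pi * proton_freq_hz / GAMMA_1H


if __name__ == "__main__":
    # A line 600 Hz away from TMS on a 500 MHz spectrometer (example values).
    print(chemical_shift_ppm(500e6 + 600.0, 500e6, 500e6), "ppm")
    # The same 1.2 ppm line is only 108 Hz away from TMS at 90 MHz.
    print(chemical_shift_ppm(90e6 + 108.0, 90e6, 90e6), "ppm")
    print(round(field_tesla(500e6), 2), "T for a 500 MHz instrument")
```

Because the ppm value is a ratio, it does not change with the spectrometer, while the separation of the lines in Hz grows with the field.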

Although the NMR spectrum of ethanol (Figure 1A) appears very simple, with three distinct sets of peaks, an NMR spectrum of a protein molecule is extremely complex, with many hundreds of peaks. Since each type of proton in a molecule gives rise to a peak (or resonance line) in the NMR spectrum, a protein consisting of about 100 amino acids will display, on average, 600-700 resonance lines. As an illustration of this complexity, the one-dimensional (1D) NMR spectrum of a calcium binding protein from Entamoeba histolytica is shown in Figure 1B (this protein is made up of 134 amino acid residues and has ~900 observable proton resonances). The fact that analysis of such a complex spectrum is nearly impossible led to the concept of multidimensional NMR experiments.7 Consider a one-dimensional (1D) spectrum (Figure 1A and 1B) that consists of overlapping peaks, as depicted in Scheme II. If we spread the peaks over a two-dimensional (2D) plane, some peaks will be resolved, depending on the property we choose to separate them in the plane. However, some overlaps can still remain in a 2D spectrum, and these can be further resolved by resorting to a third dimension, or 3D (Scheme II).

Thus, multidimensional NMR spectra of a protein are easier to analyze owing to their better resolution. Normally, the dimensions in a multidimensional NMR spectrum (2D or 3D) consist of frequencies either of protons only (homonuclear) or of protons together with other nuclei such as 13C/15N (heteronuclear). For a given protein, various multidimensional spectra are recorded, which are then used in a concerted manner to obtain its 3D structure as discussed below.

The 3D structure determination of proteins by NMR proceeds in the following steps (see Scheme III), each of which is described in detail below.5

1. Protein Sample Preparation

The protein sample preparation step consists of dissolving the required amount of the protein under investigation in a small quantity (~600 microlitres) of water (H2O) or deuterium oxide (D2O) to obtain the desired concentration. The concentration required to obtain a good spectrum depends on factors such as the stability of the protein against aggregation, the sensitivity of the NMR spectrometer and the type of NMR experiments to be recorded on the sample. While a protein concentration of 1 to 2 mM suffices in most cases, the lower limit on concentration is largely determined by the spectrometer sensitivity. With highly sensitive modern-day spectrometers equipped with cryo-probes, one can obtain a good NMR spectrum at a protein concentration as low as 50 µM. For a protein with a molecular weight (Mr) of 10 kDa (roughly 90 amino acids; 1 kDa ≡ the mass of roughly 1000 protons), 10 mg dissolved in 500 µl of solvent corresponds to 2 mM, and 250 µg dissolved in the same amount of solvent corresponds to 50 µM.
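
The arithmetic behind these sample quantities can be checked directly. The sketch below assumes that the solvent volume is in microlitres and that the low-end concentration is in the micromolar range, which is what the quoted masses imply for a 10 kDa protein; the function name is hypothetical.

```python
def concentration_mM(mass_mg, mol_weight_kda, volume_ul):
    """Molar concentration (mM) of a protein sample.

    mass_mg        -- mass of protein dissolved, in milligrams
    mol_weight_kda -- molecular weight in kDa (1 kDa = 1000 g/mol)
    volume_ul      -- solvent volume in microlitres
    """
    moles = (mass_mg * 1e-3) / (mol_weight_kda * 1e3)  # grams / (grams per mole)
    litres = volume_ul * 1e-6
    return moles / litres * 1e3  # mol/L converted to mmol/L


if __name__ == "__main__":
    # 10 mg of a 10 kDa protein in 500 microlitres
    print(concentration_mM(10.0, 10.0, 500.0), "mM")
    # 250 micrograms (0.25 mg) of the same protein in the same volume
    print(concentration_mM(0.25, 10.0, 500.0) * 1e3, "micromolar")
```

The first case gives 2 mM and the second 0.05 mM, that is 50 micromolar, a 40-fold reduction in the amount of protein needed.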

In the case of large molecular weight proteins (Mr > 20 kDa), there is severe overlap of peaks in the NMR spectrum (as seen in Figure 1B). In such cases, one can utilize the good resolution and sensitivity offered by 13C and 15N nuclei in combination with multidimensional NMR experiments. However, as proteins are naturally deficient in 13C (only 1 13C atom in every 100 C atoms) and 15N (only 1 15N atom in every 300 N atoms), it is necessary to enrich them with the 13C and 15N isotopes. This is achieved by over-expressing the protein in a suitable host, such as E. coli. The host is grown in a medium containing 15NH4Cl and 13C-glucose (both available commercially) as the sole sources of nitrogen and carbon, respectively. This methodology is referred to as isotope labeling. A detailed description of this subject is beyond the scope of this review. It suffices to mention that, as of today, one can isotopically label any given protein with 13C and/or 15N. At times, it may also be necessary to partially or uniformly deuterate (2H) the protein, particularly for large molecular weight proteins (Mr > 20 kDa), for which the protocols are well established.8

2. NMR Experiments

The next step consists of recording a series of different NMR experiments on the protein sample. These experiments, which range from a simple 1D spectrum to complex 2D, 3D and 4D experiments, fall into two categories (see Figure 2 for an illustrative example of a 2D and a 3D NMR spectrum). The first category of experiments is aimed at assigning all the NMR-active nuclei in the protein, which constitutes Step 3, discussed below. Once such sequence specific resonance assignments (hereafter referred to as ssr-assignments) are done, the second category of experiments (e.g. 2D NOESY, 3D NOESY-HSQC etc.) is used to obtain different types of structural constraints.5 Such constraints are used as inputs in molecular modelling to compute the final 3D structure of the protein. Depending on the protein concentration, the type of experiment used and the spectrometer sensitivity, a 2D experiment can generally be recorded in a few hours, while a 3D experiment may take 12-48 hours to complete.

3. Sequence specific resonance assignments

Since different types of nuclei in a given molecule give rise to different NMR signals (Figure 1A), it is necessary to identify and assign all the signals in the NMR spectrum to their respective nuclei. In a protein, this means that each NMR signal has to be assigned to its respective nucleus in a specific amino acid residue. This process is called ssr-assignment. Ssr-assignment, if carried out manually, is a tedious and time-consuming task. However, in recent years, many methodologies have been proposed to carry out ssr-assignments in an automated fashion.9-13 Such methodologies, in general, use NMR data from different experiments as input and directly output the assignments.

4. Generating structural constraints

A stretch of amino acid residues in the protein primary sequence can adopt a specific local geometry, such as an α-helix, β-sheet, β-turn or loop; these are popularly called secondary structural elements.14 The overall disposition of the secondary structural elements in 3D space constitutes the tertiary structure of the protein. This implies that residues located far apart in the primary sequence can come close together in space, within a short distance (2-5 Å; 1 Å = 10^-10 meters). This is depicted in Scheme IV(A), where two protons are shown to come close in space owing to the tertiary structural fold of the protein. On the other hand, the presence of secondary structural elements in a protein places constraints on the local conformation of amino acid residues. Such a conformation is described using torsion angles, which are defined about a covalent bond as shown in Scheme IV(B). Thus, determination of the torsion angle values of amino acid residues can indicate whether they are part of an α-helix or a β-sheet.
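
For readers who prefer a computational definition, a torsion angle about a covalent bond can be calculated from the positions of four consecutively bonded atoms. The following is a minimal sketch using the standard vector formula; the function names and the test coordinates are hypothetical, and the sign convention (clockwise rotation of the far bond, viewed along the central bond, counted as positive) is assumed.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def torsion_angle_deg(p1, p2, p3, p4):
    """Signed torsion angle about the p2-p3 bond, in degrees (-180 to 180)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane containing p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane containing p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four atoms in one plane on the same side of the central bond (cis).
    print(torsion_angle_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0)))
    # Last atom rotated a quarter turn out of that plane.
    print(torsion_angle_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1)))
```

Applied to the backbone atoms of consecutive residues, the same function would give the torsion angles whose characteristic values distinguish helices from sheets.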

The identification of such short range (2-2.5 Å), medium range (2.5-4.0 Å) and long range (4.0-6.0 Å) distance contacts between nuclei belonging to different amino acid residues, together with the torsion angle values for the local conformation, helps in identifying a unique overall geometry of the protein. Experimentally, protons that are close in space in the protein transfer part of their energy to each other. The magnitude of this transfer can be used as a measure of their closeness. Torsion angle values, on the other hand, can be estimated by measuring the coupling between two nuclei that are separated by three covalent bonds (e.g. HN-Hα as in Scheme IV(B)).
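
Both kinds of constraint can be illustrated numerically. The sketch below rests on two assumptions that are not spelled out in this article: that the energy transfer between two protons falls off as the inverse sixth power of their distance, so an unknown distance can be calibrated against a pair of protons at a known distance, and that the three-bond HN-Hα coupling follows a Karplus-type relation in the backbone torsion angle φ. The Karplus coefficients are commonly quoted literature values and, like the function names and example numbers, should be treated as illustrative.

```python
import math


def noe_distance(intensity, ref_intensity, ref_distance):
    """Distance estimate (same unit as ref_distance) from a transfer intensity.

    Assumes intensity is proportional to 1 / distance**6 and that the
    reference pair of protons sits at the known distance ref_distance.
    """
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def karplus_j(phi_deg, a=6.4, b=-1.4, c=1.9):
    """Three-bond HN-H(alpha) coupling (Hz) for a backbone torsion angle phi.

    Karplus-type relation: J = a*cos^2(phi - 60) + b*cos(phi - 60) + c.
    The default coefficients are an assumed, commonly used parameter set.
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


if __name__ == "__main__":
    # A peak 64 times weaker than a reference pair at 2.0 angstroms.
    print(round(noe_distance(1.0, 64.0, 2.0), 2), "angstroms")
    # Typical textbook phi values for helix and sheet (assumed here).
    print("helix-like phi = -57:", round(karplus_j(-57.0), 1), "Hz")
    print("sheet-like phi = -120:", round(karplus_j(-120.0), 1), "Hz")
```

With these assumptions a helical residue gives a small coupling of roughly 4 Hz and an extended residue a large one close to 10 Hz, which is why measured couplings can be turned into torsion angle constraints.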

5. 3D structure computation using molecular dynamics

The different distance and torsion angle constraints generated in the previous step can now be used to define a unique geometry of the protein, starting from a random configuration of amino acid residues in space. This is done using molecular modeling programs, which use these constraints as input to obtain an energy minimized 3D structure.15 Energy minimized structures are preferred, as molecules tend to be in their minimum energy state in their native form.
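
The idea of starting from a random configuration and letting experimental constraints pull it into shape can be shown with a toy calculation. This is only a sketch: it uses plain gradient descent on a sum of squared distance violations for four points, not the torsion angle dynamics used by real structure calculation programs, and every name and number in it is hypothetical.

```python
import itertools
import math
import random


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def max_violation(coords, restraints):
    """Largest absolute deviation from any target distance."""
    return max(abs(distance(coords[i], coords[j]) - target)
               for (i, j), target in restraints.items())


def fold(n_points, restraints, steps=5000, rate=0.05, seed=1):
    """Return (start, final) coordinates after minimizing distance violations."""
    rng = random.Random(seed)
    coords = [[rng.uniform(-2.0, 2.0) for _ in range(3)] for _ in range(n_points)]
    start = [p[:] for p in coords]
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in range(n_points)]
        for (i, j), target in restraints.items():
            r = max(distance(coords[i], coords[j]), 1e-9)  # avoid division by zero
            scale = 2.0 * (r - target) / r  # derivative of (r - target)**2
            for k in range(3):
                g = scale * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for p, g in zip(coords, grads):
            for k in range(3):
                p[k] -= rate * g[k]  # step downhill in "energy"
    return start, coords


# Toy target: four "atoms" that must all sit 3.0 angstroms from one another.
RESTRAINTS = {pair: 3.0 for pair in itertools.combinations(range(4), 2)}

if __name__ == "__main__":
    start, final = fold(4, RESTRAINTS)
    print("largest violation at start:", round(max_violation(start, RESTRAINTS), 3))
    print("largest violation at end:  ", round(max_violation(final, RESTRAINTS), 3))
```

A real protein involves thousands of atoms, additional torsion angle and chemical terms, and repeated runs from different random starts, but the principle of driving constraint violations towards zero is the same.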

Case Study: A calcium binding protein from Entamoeba histolytica

The procedures outlined in the previous sections are demonstrated on a 15 kDa (134 amino acid residues) calcium binding protein from Entamoeba histolytica (EhCaBP). A number of biochemical experiments suggest that calcium (Ca2+) may be involved in the pathogenetic mechanisms of amoebiasis.16 Thus, in order to understand the mechanism by which Ca2+ affects virulence and to gain more insight into the function of the protein, a gene encoding a novel calcium binding protein was isolated from E. histolytica and cloned in E. coli.17 The protein was isotopically labelled with 13C and/or 15N for NMR experiments. Ssr-assignments were achieved using a series of heteronuclear multidimensional NMR experiments.18,19 Subsequently, the 1265 distance and 200 torsion angle constraints obtained for the protein were used as input to the molecular dynamics program DYANA,20 which computes the 3D structure of proteins using experimental structural constraints. One of the minimum energy structures obtained is shown in Figure 2. The 3D structure of EhCaBP reveals that it belongs to the well-known family of EF-hand proteins, a well-known member of which is calmodulin.21 This structure of EhCaBP can now be used as a template to further investigate the structure-function relationship in this protein.

Sources of support

The National Facility for High Field NMR supported by Department of Science and Technology, Department of Biotechnology, Council of Scientific and Industrial Research, and Tata Institute of Fundamental Research, Mumbai.

Acknowledgement

The facilities provided by the National Facility for High Field NMR, supported by the Department of Science and Technology (DST), the Department of Biotechnology (DBT), the Council of Scientific and Industrial Research (CSIR), and the Tata Institute of Fundamental Research, Mumbai, are gratefully acknowledged. We dedicate this paper to the memory of the late Prof. G. N. Ramachandran (1922-2001).

References

  1. Dobson CM. Protein misfolding, evolution and diseases. Trends Biochem Sci 1999;24:329-32.
  2. Xenarios I, Eisenberg D. Protein interaction databases. Curr Opin in Biotech 2001;12:334-9.
  3. Roderick EH. Can drugs be designed? Curr Opin in Biotech 1997; 8:696-700.
  4. Drenth J. Principles of Protein X-ray Crystallography. Springer Verlag; 1994.
  5. Wuthrich K. NMR of Proteins and Nucleic Acids. New York: John Wiley and Sons; 1986.
  6. Atreya HS, Sahu SC, Bhattacharya A, Chary KVR, Govil G. NMR derived solution structure of an EF-hand calcium binding protein from Entamoeba histolytica. Biochemistry 2001;40:14392-403.
  7. Clore GM, Gronenborn AM. NMR structure determination of proteins and protein complexes larger than 20 kDa. Curr Opin in Chem Biol 1998;2:564-70.
  8. Natalie KG, Kay LE. New developments in isotope labeling strategies for protein solution NMR spectroscopy. Curr Opin in Struct Biol 2000;10:585-92.
  9. Atreya HS, Sahu SC, Chary KVR, Govil G. A Tracked approach to automated NMR assignments in proteins (TATAPRO). J Biomol NMR 2000;99:125-36.
  10. Moseley HNB, Montelione GT. Automated analysis of NMR assignments and structures for proteins. Curr Opin in Struct Biol 1999; 9:635-42.
  11. James TL. Computational strategies pertinent to NMR solution structure determination. Curr Opin in Struct Biol 1994;4:275-84.
  12. Bartels C, Guntert P, Billeter M, Wuthrich K. GARANT-A general algorithm for resonance assignment of multidimensional nuclear magnetic resonance spectra. J Comput Chem 1997;18:139-49.
  13. Meadows RP, Olejniczak ET, Fesik SW. A computer based protocol for semi-automated assignments and 3D structure determination of proteins. J Biomol NMR 1994;4:79-96.
  14. Branden C, Tooze J. Introduction to Protein Structure. New York: Garland Publishing; 1991.
  15. Guntert P. Structure calculation of biological macromolecules from NMR data. Quart Rev Biophys 1998;31:145-237.
  16. Meza I. Extracellular Matrix-induced Signaling in Entamoeba histolytica: its role in invasiveness. Parasitol Today 2000;16:23-8.
  17. Prasad J, Bhattacharya S, Bhattacharya A. Cloning and sequence analysis of a calcium binding protein gene from a pathogenic strain of Entamoeba histolytica. Mol Biochem Parasitol 1992;52:137-40.
  18. Sahu SC, Atreya HS, Chauhan S, Bhattacharya A, Chary KVR, Govil G. Sequence specific 1H, 13C and 15N assignments of a calcium binding protein from Entamoeba histolytica. J Biomol NMR 1999;14:93-4.
  19. Sahu SC, Bhattacharya A, Chary KVR, Govil G. Secondary structure of a calcium binding protein from Entamoeba histolytica. FEBS Letters 1999;459:51-6.
  20. Guntert P, Mumenthaler C, Wuthrich K. Torsion angle dynamics for NMR structure calculation with the new program DYANA. J Mol Biol 1997;273:283-98.
  21. Rety S, Lewit-Bently A. EF-hand calcium binding proteins. Curr Opin in Struct Biol 2000;10:637-43.

This article is also available in full-text from http://www.jpgmonline.com/

© Copyright 2002 - Journal of Postgraduate Medicine

