Example of a PDB File
   The 1crn.pdb coordinate file is included in every downloaded copy of RasMol v2.6 as an example to be used for getting acquainted with the program. This page displays all of the text information at the top of that file and omits almost all of the X, Y, Z coordinates.
   The purpose is two-fold: 1) to provide glossary descriptions of the entries in this portion of a typical PDB file and to link some of these terms to the commands and parameters used by RasMol; and 2) to highlight topics that should be considered by "informed consumers" of X-ray crystallographic (and NMR) structural data, i.e. those who want to make valid interpretations of structural models. Consult your lecture notes or a text on X-ray crystallography to get a complete description of these topics. I recommend "Crystallography Made Crystal Clear" by Gale Rhodes, Academic Press (1993).
   As a complement to the descriptions on this page see the Chime display of Crambin. A RasMol command line is also provided that allows additional manipulation of the molecular model.

HEADER    PLANT SEED PROTEIN                      30-APR-81   1CRN      1CRND  1
COMPND    CRAMBIN                                                       1CRN   4
SOURCE    ABYSSINIAN CABBAGE (CRAMBE ABYSSINICA) SEED                   1CRN   5
AUTHOR    W.A.HENDRICKSON,M.M.TEETER                                    1CRN   6
REVDAT   5   16-APR-87 1CRND   1       HEADER                           1CRND  2
REVDAT   4   04-MAR-85 1CRNC   1       REMARK                           1CRNC  1
REVDAT   3   30-SEP-83 1CRNB   1       REVDAT                           1CRNB  1
REVDAT   2   03-DEC-81 1CRNA   1       SHEET                            1CRNB  2
REVDAT   1   28-JUL-81 1CRN    0                                        1CRNB  3
REMARK   1                                                              1CRN   7
REMARK   1 REFERENCE 1                                                  1CRNC  2
REMARK   1  AUTH   M.M.TEETER                                           1CRNC  3
REMARK   1  TITL   WATER STRUCTURE OF A HYDROPHOBIC PROTEIN AT ATOMIC   1CRNC  4
REMARK   1  TITL 2 RESOLUTION. PENTAGON RINGS OF WATER MOLECULES IN     1CRNC  5
REMARK   1  TITL 3 CRYSTALS OF CRAMBIN                                  1CRNC  6
REMARK   1  REF    PROC.NAT.ACAD.SCI.USA         V.  81  6014 1984      1CRNC  7
REMARK   1  REFN   ASTM PNASA6  US ISSN 0027-8424                  040  1CRNC  8
REMARK   1 REFERENCE 2                                                  1CRNC  9
REMARK   1  AUTH   W.A.HENDRICKSON,M.M.TEETER                           1CRN   9
REMARK   1  TITL   STRUCTURE OF THE HYDROPHOBIC PROTEIN CRAMBIN         1CRN  10
REMARK   1  TITL 2 DETERMINED DIRECTLY FROM THE ANOMALOUS SCATTERING    1CRN  11
REMARK   1  TITL 3 OF SULPHUR                                           1CRN  12
REMARK   1  REF    NATURE                        V.  290  107 1981      1CRN  13
REMARK   1  REFN   ASTM NATUAS  UK ISSN 0028-0836                  006  1CRN  14
REMARK   1 REFERENCE 3                                                  1CRNC 10
REMARK   1  AUTH   M.M.TEETER,W.A.HENDRICKSON                           1CRN  16
REMARK   1  TITL   HIGHLY ORDERED CRYSTALS OF THE PLANT SEED PROTEIN    1CRN  17
REMARK   1  TITL 2 CRAMBIN                                              1CRN  18
REMARK   1  REF    J.MOL.BIOL.                   V.  127  219 1979      1CRN  19
REMARK   1  REFN   ASTM JMOBAK  UK ISSN 0022-2836                  070  1CRN  20
REMARK   2                                                              1CRN  21
REMARK   2 RESOLUTION. 1.5 ANGSTROMS.                                   1CRN  22
REMARK   3                                                              1CRN  23
REMARK   3 REFINEMENT. RESTRAINED LEAST SQUARES (HENDRICKSON,W.A.,      1CRN  24
REMARK   3  KONNERT,J.H. COMPUTING IN CRYSTALLOGRAPHY, EDS.DIAMOND,R.,  1CRN  25
REMARK   3  RAMASESHAN,S.,VENKATESAN,K. (1980)).                        1CRN  26
REMARK   4                                                              1CRN  27
REMARK   4 CONFORMATIONAL HETEROGENEITY EXISTS AT ILE 7 AND ILE 25      1CRN  28
REMARK   4 WHERE CD1 ATOMS TAKE EITHER OF TWO STAGGERED POSSIBILITIES.  1CRN  29
REMARK   4 COMPOSITIONAL HETEROGENEITY ALSO EXISTS AT POSITIONS 22 AND  1CRN  30
REMARK   4 25.  REFINEMENT PARAMETERS SUGGEST THAT RESIDUE 22 IS ABOUT  1CRN  31
REMARK   4 60/40 PRO/SER AND THAT RESIDUE 25 IS ABOUT 60/40 ILE/LEU.    1CRN  32
REMARK   4 THE HETEROGENEITY AT RESIDUE 22 APPARENTLY CAUSES A          1CRN  33
REMARK   4 DISORDER IN TYR 29 - THE REFINED POSITION OF ITS OH ATOM     1CRN  34
REMARK   4 MAKES AN IMPOSSIBLY SHORT CONTACT OF 2.6 ANGSTROMS WITH      1CRN  35
REMARK   4 ATOM CD OF PRO 22 ON A SCREW-RELATED MOLECULE.  THE          1CRN  36
REMARK   4 DEPOSITED COORDINATES ARE ONLY FOR THE MAJOR CONTRIBUTOR AT  1CRN  37
REMARK   4 EACH SITE (PRO 22 AND ILE 25).  DEPOSITION OF THE MODEL OF   1CRN  38
REMARK   4 DISORDER AND SOLVENT STRUCTURE IS DEFERRED UNTIL HIGHER      1CRN  39
REMARK   4 RESOLUTION REFINEMENT.  THE R-FACTOR FOR THE COMPLETE MODEL  1CRN  40
REMARK   4 INCLUDING HETEROGENEITY AND SOLVENT IS 0.114 ISOTROPIC AND   1CRN  41
REMARK   4 0.104 ANISOTROPIC AGAINST ALL DATA IN THE 10.0 TO 1.5        1CRN  42
REMARK   4 ANGSTROM SHELL.                                              1CRN  43
REMARK   5                                                              1CRN  44
REMARK   5 THE SECONDARY STRUCTURE SPECIFICATIONS ARE THOSE DEFINED     1CRN  45
REMARK   5 IN REFERENCE 1 ABOVE AND DEPEND ON PARTICULAR DEFINITIONS    1CRN  46
REMARK   5 THAT MAY AFFECT THE DETERMINATION OF END POINTS.  PLEASE     1CRN  47
REMARK   5 CONSULT THE PRIMARY REFERENCE AND EXAMINE STRUCTURAL         1CRN  48
REMARK   5 DETAILS SUCH AS HYDROGEN BONDING AND CONFORMATION ANGLES     1CRN  49
REMARK   5 WHEN MAKING USE OF THE SPECIFICATIONS.                       1CRN  50
REMARK   6                                                              1CRNA  1
REMARK   6 CORRECTION. CORRECT RESIDUE NUMBER ON STRAND 1 OF SHEET S1.  1CRNA  2
REMARK   6  03-DEC-81.                                                  1CRNA  3
REMARK   7                                                              1CRNB  4
REMARK   7 CORRECTION. INSERT REVDAT RECORDS. 30-SEP-83.                1CRNB  5
REMARK   8                                                              1CRNC 11
REMARK   8 CORRECTION. INSERT NEW PUBLICATION AS REFERENCE 1 AND        1CRNC 12
REMARK   8  RENUMBER THE OTHERS.  04-MAR-85.                            1CRNC 13
REMARK   9                                                              1CRND  3
REMARK   9 CORRECTION. CHANGE DEPOSITION DATE FROM 31-APR-81 TO         1CRND  4
REMARK   9  30-APR-81.  16-APR-87.                                      1CRND  5
SEQRES   1     46  THR THR CYS CYS PRO SER ILE VAL ALA ARG SER ASN PHE  1CRN  51
SEQRES   2     46  ASN VAL CYS ARG LEU PRO GLY THR PRO GLU ALA ILE CYS  1CRN  52
SEQRES   3     46  ALA THR TYR THR GLY CYS ILE ILE ILE PRO GLY ALA THR  1CRN  53
SEQRES   4     46  CYS PRO GLY ASP TYR ALA ASN                          1CRN  54
HELIX    1  H1 ILE      7  PRO     19  1 3/10 CONFORMATION RES 17,19    1CRN  55
HELIX    2  H2 GLU     23  THR     30  1 DISTORTED 3/10 AT RES 30       1CRN  56
SHEET    1  S1 2 THR     1  CYS     4  0                                1CRNA  4
SHEET    2  S1 2 CYS    32  ILE    35 -1                                1CRN  58
TURN     1  T1 PRO    41  TYR    44                                     1CRN  59
SSBOND   1 CYS      3    CYS     40                                     1CRN  60
SSBOND   2 CYS      4    CYS     32                                     1CRN  61
SSBOND   3 CYS     16    CYS     26                                     1CRN  62
CRYST1   40.960   18.650   22.520  90.00  90.77  90.00 P 21          2  1CRN  63
ORIGX1      1.000000  0.000000  0.000000        0.00000                 1CRN  64
ORIGX2      0.000000  1.000000  0.000000        0.00000                 1CRN  65
ORIGX3      0.000000  0.000000  1.000000        0.00000                 1CRN  66
SCALE1       .024414  0.000000  -.000328        0.00000                 1CRN  67
SCALE2      0.000000   .053619  0.000000        0.00000                 1CRN  68
SCALE3      0.000000  0.000000   .044409        0.00000                 1CRN  69
ATOM      1  N   THR     1      17.047  14.099   3.625  1.00 13.79      1CRN  70
ATOM      2  CA  THR     1      16.967  12.784   4.338  1.00 10.80      1CRN  71
ATOM      3  C   THR     1      15.685  12.755   5.133  1.00  9.19      1CRN  72
ATOM      4  O   THR     1      15.268  13.825   5.594  1.00  9.85      1CRN  73
ATOM      5  CB  THR     1      18.170  12.703   5.337  1.00 13.02      1CRN  74
ATOM      6  OG1 THR     1      19.334  12.829   4.463  1.00 15.06      1CRN  75
Deleted 312 lines here (lines 76-387).
ATOM    319  N   ASN    46      13.966   6.502  13.739  1.00  5.80      1CRN 388
ATOM    320  CA  ASN    46      13.512   5.395  12.878  1.00  6.15      1CRN 389
ATOM    321  C   ASN    46      13.311   5.853  11.455  1.00  6.61      1CRN 390
ATOM    322  O   ASN    46      13.733   6.929  11.026  1.00  7.18      1CRN 391
ATOM    323  CB  ASN    46      12.266   4.769  13.501  1.00  7.27      1CRN 392
ATOM    324  CG  ASN    46      12.538   4.304  14.922  1.00  7.98      1CRN 393
ATOM    325  OD1 ASN    46      11.982   4.849  15.886  1.00 11.00      1CRN 394
ATOM    326  ND2 ASN    46      13.407   3.298  15.015  1.00 10.32      1CRN 395
ATOM    327  OXT ASN    46      12.703   4.973  10.746  1.00  7.86      1CRN 396
TER     328      ASN    46                                              1CRN 397
CONECT   20   19  282                                                   1CRN 398
CONECT   26   25  229                                                   1CRN 399
CONECT  116  115  188                                                   1CRN 400
CONECT  188  116  187                                                   1CRN 401
CONECT  229   26  228                                                   1CRN 402
CONECT  282   20  281                                                   1CRN 403
MASTER       62    0    0    2    2    1    0    6  327    1    6    4  1CRND  6
END                                                                     1CRN 405

  GLOSSARY of Terms Used in the 1CRN PDB File:
   When RasMol opens the 1crn.pdb file, the first two columns below are displayed in the command window. The additional third column indicates where the information occurs in this PDB file.

RasMol Category       Crambin Entry          PDB (cols 1-6) Label
RasMol>
Molecule name ....... CRAMBIN                COMPND
Classification ...... PLANT SEED PROTEIN     HEADER
Secondary Structure . PDB Data Records       HELIX, SHEET, & TURN
Brookhaven Code ..... 1CRN                   HEADER
Number of Groups .... 46                     SEQRES
Number of Atoms ..... 327                    ATOM    327
Number of Bridges ... 3                      SSBOND
Number of Helices ... 2                      HELIX
Number of Strands ... 2                      SHEET
Number of Bonds ..... 334
RasMol> |

The Number of Bonds on the last line is an internal RasMol calculation. The same information can be displayed at any time with the command show information.
30-APR-81 is the original submission date for the coordinates.
1CRN is the PDB four-character accession code for this coordinate set.
REVDAT lists the five revision records for this coordinate file: the original release (REVDAT 1) and four later amendments, which are explained in REMARKs 6-9 in the text. The four amendments correspond to PDB files with new Brookhaven codes: 1CRNA, 1CRNB, 1CRNC, & 1CRND.
REFERENCE 1 is the primary reference for this model, as explained in REMARK 5; the primary reference is usually preceded by the JRNL label.
RESOLUTION corresponds to the amount of X-ray diffraction data included in the final model building and ultimately to the amount of atomic detail that will be interpreted in the electron density maps. Low resolution is >2.8Å; medium resolution is 1.8-2.8Å; and high resolution is <1.8Å. NMR models do not have an entry in this category.
The REFINEMENT method used here was RESTRAINED LEAST SQUARES. Some additional computational methods used for refinement (and cited in this section in other PDB files) are: CORELS, TNT, MD, etc.
The R-FACTOR is a measure of how well the model agrees with the experimental diffraction data. In this case, when the structure-factor amplitudes calculated from the model were compared to the observed amplitudes, the residual difference was 0.114, or 11.4% (0.104 for the anisotropic model; see REMARK 4). NMR models do not have an entry in this category.
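To make the R-factor concrete, here is a minimal sketch. It assumes the conventional definition, R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), taken over all reflections. The function name and the amplitudes are hypothetical and chosen only for illustration; they are not crambin data.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor for paired lists of structure-factor amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    # Sum of absolute amplitude differences, normalised by the observed total.
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for four reflections (illustrative only).
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 55.0, 27.0, 22.0]
print(f"R = {r_factor(observed, calculated):.3f}")
```

A perfect model would give R = 0; larger values mean poorer agreement between model and data.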
Typically, a PDB file will have information in this remark section on the following root mean square estimates:

   RMS DEVIATIONS FROM IDEAL VALUES.
    BOND LENGTHS                 (A) :
    BOND ANGLES            (DEGREES) :
    DIHEDRAL ANGLES        (DEGREES) :

As noted in REMARK 4, the authors deferred making these estimates until higher resolution refinement had been done.
Three higher resolution crystal structures are now available at the Protein Data Bank:
 
  1CBN Teeter et al., 1991 (0.83Å resolution).
  1CNR Yamano & Teeter, 1994 (1.05Å resolution).
  1AB1 Teeter et al., 1997 (0.89Å resolution).
See below for NMR structures of crambin.
The four lines beginning with SEQRES 1 show the primary sequence of crambin from the N-terminal THR1 to the C-terminal ASN46. The same information (with residue numbers) can be displayed at any time with the command, show sequence.
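The SEQRES lines can also be read by a short script. The sketch below converts the four crambin SEQRES lines into a one-letter sequence. It assumes that residue names begin at column 20 and that the trailing entry code and line number (omitted here) lie beyond column 70; the function name is hypothetical.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# The four SEQRES lines of 1CRN, without the trailing "1CRN nn" field.
SEQRES_LINES = [
    "SEQRES   1     46  THR THR CYS CYS PRO SER ILE VAL ALA ARG SER ASN PHE",
    "SEQRES   2     46  ASN VAL CYS ARG LEU PRO GLY THR PRO GLU ALA ILE CYS",
    "SEQRES   3     46  ALA THR TYR THR GLY CYS ILE ILE ILE PRO GLY ALA THR",
    "SEQRES   4     46  CYS PRO GLY ASP TYR ALA ASN",
]


def sequence_from_seqres(lines):
    """Collect residue names from SEQRES lines and return one-letter codes."""
    residues = []
    for line in lines:
        if line.startswith("SEQRES"):
            # Residue names occupy columns 20-70 (slice 19:70).
            residues.extend(line[19:70].split())
    return "".join(THREE_TO_ONE[name] for name in residues)


sequence = sequence_from_seqres(SEQRES_LINES)
print(len(sequence), sequence)
```

The length of the result should match the residue count (46) given on each SEQRES line.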
If HETEROGEN atoms such as water, ligands, ions, etc. had been present in this file, they would have been listed here along with their names (HETNAM) and chemical formulas (FORMUL). The coordinates for these small molecules follow the protein data, each line beginning with HETATM in place of ATOM.
Display of hetero atoms is controlled by the set hetero and related commands.
The HELIX, SHEET, and TURN lines record the authors' assignments of the secondary structure. These are used by RasMol, if present, to display and color the secondary structure. If absent from the PDB file, the DSSP algorithm of Kabsch and Sander [8] is used to calculate helices, sheets and turns.
If the structure command is used, RasMol will discard the assignments in the PDB file and recalculate helices and sheets using the DSSP algorithm.
For crambin, the first helix extends from Ile7 to Pro19; the second, from Glu23 to Thr30. For both helices, the notes on the HELIX lines indicate 3₁₀-helical geometry at the C-termini. The remaining residues are understood to be alpha-helix.
The SHEET lines identify two beta-strands extending from Thr1 to Cys4 and from Cys32 to Ile35.
The TURN extends from Pro41 to Tyr44. Consult the primary reference to learn which of the several regular turns this one is.
The three SSBOND lines were also assigned by the authors. They are disulfide bonds between Cys3-Cys40, Cys4-Cys32, and Cys16-Cys26. (See also the comments on CONECT lines below.)
The CRYST1 line reports the unit cell dimensions in Angstroms (a = 40.96Å, b = 18.65Å, and c = 22.52Å). The next three entries are the unit cell angles alpha, beta and gamma in degrees (90.00, 90.77 and 90.00). These are followed by the space group, P 21, i.e. P2₁; the final entry, 2, is the number of molecules in the unit cell (Z). The same information can be displayed at any time with the command show symmetry.
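As an illustration of how the CRYST1 entries are used, the sketch below reads the line by fixed columns and computes the unit cell volume from the general triclinic formula. The column positions are assumed to follow the standard PDB layout (cell lengths in 9-character fields from column 7, angles in 7-character fields from column 34, space group in columns 56-66); the function names are hypothetical.

```python
import math

# CRYST1 line of 1CRN, without the trailing "1CRN nn" field.
CRYST1 = "CRYST1   40.960   18.650   22.520  90.00  90.77  90.00 P 21          2"


def parse_cryst1(line):
    """Return (a, b, c, alpha, beta, gamma, space_group) from a CRYST1 line."""
    a, b, c = (float(line[i:i + 9]) for i in (6, 15, 24))
    alpha, beta, gamma = (float(line[i:i + 7]) for i in (33, 40, 47))
    space_group = line[55:66].strip()
    return a, b, c, alpha, beta, gamma, space_group


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Angstroms; angles are given in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


a, b, c, alpha, beta, gamma, space_group = parse_cryst1(CRYST1)
volume = cell_volume(a, b, c, alpha, beta, gamma)
print(space_group, f"{volume:.1f} cubic Angstroms")
```

For a monoclinic cell such as this one (alpha = gamma = 90), the formula reduces to a * b * c * sin(beta).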
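The ORIGXn lines (n = 1, 2, 3) report the transformation from orthogonal coordinates to the submitted coordinates. Here they form an identity matrix with zero translation, so the two coordinate sets coincide.
The SCALEn lines (n = 1, 2, 3) report the transformation from orthogonal coordinates to fractional crystallographic coordinates.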
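The SCALEn transformation can be applied directly to any atom. The sketch below uses the three SCALE rows from this file to convert the orthogonal coordinates of ATOM 1 into fractional coordinates. It assumes the usual reading of each row as three matrix elements followed by one translation term; the names are hypothetical.

```python
# Rows of the SCALE matrix and their translation terms, copied from SCALE1-3.
SCALE = [
    ((0.024414, 0.000000, -0.000328), 0.0),
    ((0.000000, 0.053619, 0.000000), 0.0),
    ((0.000000, 0.000000, 0.044409), 0.0),
]


def to_fractional(xyz):
    """Apply the SCALE matrix and translation to orthogonal coordinates."""
    return tuple(
        sum(m * v for m, v in zip(row, xyz)) + shift
        for row, shift in SCALE
    )


# Orthogonal coordinates (Angstroms) of ATOM 1, the N of Thr1.
frac = to_fractional((17.047, 14.099, 3.625))
print(tuple(round(f, 4) for f in frac))
```

Each fractional coordinate expresses the atom's position as a fraction of the corresponding unit cell edge, so values between 0 and 1 lie inside the reference cell.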
ATOM 1 starts the first of 327 lines of coordinates for crambin (of which, most have been deleted above). Atom number 1 is the amino-N (nitrogen) on Thr1.
CA identifies atom number 2 as the C-alpha carbon. The other backbone atoms are C and O. The sidechain atoms are listed next: Thr1 has a CB (C-beta), an OG1 (O-gamma-1) and, among the lines deleted above, a CG2 (C-gamma-2).
THR is the Group (or residue) name. After picking this atom, the RasMol command line display would be:
Atom: C 3 Group: THR 1
1 is the Group (or residue) number in the sequence. THR1 can be selected using the command line atom expressions "thr1" or "resno=1".
RasMol can label selected residues with a variety of name/number formats.
The three values 18.170, 12.703 and 5.337 are the X, Y, & Z coordinates (in Angstroms) of ATOM 5.
The occupancy value follows the X, Y, Z coordinates on each line; it is usually listed as 1.00. However, if a residue is found to occupy two positions in the structure, the corresponding values will be listed as a fraction (or explained by the authors, as was done here in REMARK 4 for Ile7, Ile25, and others).
The last datum on each coordinate line is the B-Factor. Also called the temperature factor, the B-Factor is a measure of the uncertainty of atomic position due both to thermal motion and crystal imperfections; the units are Å². A value of 80Å² corresponds roughly to an RMS (root mean square) uncertainty of ±1Å.
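The link between a B-Factor and a positional uncertainty can be checked with a short calculation. The sketch below assumes the usual isotropic relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean square displacement; that relation is not stated on this page, and the function name is hypothetical.

```python
import math


def rms_displacement(b_factor):
    """RMS displacement in Angstroms for an isotropic B-factor in square Angstroms."""
    # Assumes B = 8 * pi^2 * <u^2>, so u_rms = sqrt(B / (8 * pi^2)).
    return math.sqrt(b_factor / (8 * math.pi ** 2))


# B-factors of the first and last backbone N atoms in 1CRN, plus the 80 example.
for b in (13.79, 5.80, 80.0):
    print(f"B = {b:6.2f}  ->  rms displacement = {rms_displacement(b):.2f} Angstroms")
```

Under this assumption the B-Factors in crambin, mostly between 5 and 15 Å², correspond to displacements well below half an Angstrom.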
Each line in a PDB file is numbered sequentially. 1CRN 75 identifies this one as line 75.
ATOM 327, the last atom in the file, is the C-terminal oxygen of Asn46. It is uniquely designated as OXT. However, it is chemically identical to another atom listed above. What is the atomno of its identical twin?
Check your answer by viewing the 1crn.pdb model in RasMol (or peek at the HTML text of this page).
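The ATOM entries described above sit in fixed columns, so a program should slice each line by position rather than split it on spaces. The sketch below parses the first ATOM line of this file. The column positions are assumed to follow the standard PDB layout, with the old-style entry code and line number in columns 73-80; the function and key names are hypothetical.

```python
# First ATOM line of 1CRN, copied from the listing above.
ATOM_LINE = "ATOM      1  N   THR     1      17.047  14.099   3.625  1.00 13.79      1CRN  70"


def parse_atom_line(line):
    """Split one ATOM or HETATM line into named fields using fixed columns."""
    return {
        "serial": int(line[6:11]),          # columns 7-11: atom number
        "name": line[12:16].strip(),        # columns 13-16: atom name
        "res_name": line[17:20],            # columns 18-20: group (residue) name
        "chain": line[21].strip(),          # column 22: chain (blank in 1CRN)
        "res_seq": int(line[22:26]),        # columns 23-26: group number
        "x": float(line[30:38]),            # columns 31-38
        "y": float(line[38:46]),            # columns 39-46
        "z": float(line[46:54]),            # columns 47-54
        "occupancy": float(line[54:60]),    # columns 55-60
        "b_factor": float(line[60:66]),     # columns 61-66
        "line_id": line[72:80].split(),     # old-style entry code and line number
    }


atom = parse_atom_line(ATOM_LINE)
print(atom)
```

Slicing by column keeps working when neighbouring fields touch each other, for example with five-digit atom numbers or large negative coordinates.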

TER identifies the end of each polypeptide (or polynucleotide) chain. Crambin has only one chain.
The CONECT lines instruct RasMol to make covalent bonds between the atoms listed in each row. In this example, the disulfide bridges in crambin are specified. The first column lists, by atom number, the SG atom of each of the six cysteines; the remaining two columns list the atoms bonded to it in ascending numerical order. In the first three lines these are the cysteine's own CB followed by the SG of its partner cysteine; in the last three lines the partner SG comes first. Thus, the first line defines the Cys3-Cys40 disulfide bridge. The RasMol command ssbonds controls the display of disulfide bonds. The CONECT records in a file can be discarded (and the connectivity recalculated by RasMol) by using the connect command.
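The pairing of the CONECT lines can be verified with a short script. The sketch below reads the six CONECT lines from this file and reports the bonds that are listed from both ends. In this particular file only SG atoms have CONECT records, so those mutual bonds are the three disulfide bridges; that shortcut is an assumption that would not hold for files with other CONECT entries. The function names are hypothetical.

```python
# The six CONECT lines of 1CRN, without the trailing "1CRN nnn" field.
CONECT_LINES = [
    "CONECT   20   19  282",
    "CONECT   26   25  229",
    "CONECT  116  115  188",
    "CONECT  188  116  187",
    "CONECT  229   26  228",
    "CONECT  282   20  281",
]


def parse_conect(line):
    """Return (central atom, [bonded atoms]) from one CONECT line."""
    # Atom numbers are right-justified in 5-character fields starting at column 7.
    fields = [line[i:i + 5] for i in range(6, min(len(line), 31), 5)]
    serials = [int(f) for f in fields if f.strip()]
    return serials[0], serials[1:]


def mutual_bonds(lines):
    """Bonds in which each atom lists the other in its own CONECT line."""
    bonds = dict(parse_conect(l) for l in lines if l.startswith("CONECT"))
    pairs = set()
    for atom, partners in bonds.items():
        for partner in partners:
            if atom in bonds.get(partner, []):
                pairs.add((min(atom, partner), max(atom, partner)))
    return sorted(pairs)


disulfides = mutual_bonds(CONECT_LINES)
print(disulfides)
```

Each reported pair is a pair of SG atom numbers; the SG-CB bonds are dropped because the CB atoms have no CONECT line of their own.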
END is the end-of-file statement for crystallographic models.
However, NMR models often have several sets of coordinates in each PDB file. The sets can be distinguished by the MODEL n labels at the beginning, and the ENDMDL labels at the end, of each set. For example, the NMR structures of crambin have been deposited at the PDB in two files:
 
  1CCM Bonvin et al., 1993 Ensemble of eight refined structures.
  1CCN Bonvin et al., 1993 Final minimized average structure.
To manipulate multiple NMR models in PDB files, see the Manual description under Primitive Expressions.
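Outside RasMol, the MODEL and ENDMDL labels make it easy to separate the coordinate sets with a short script. The sketch below groups ATOM and HETATM lines by model number. It assumes the model number ends at or before column 14 of the MODEL line; the sample lines are invented for illustration and are not taken from 1CCM, and the function name is hypothetical.

```python
def split_models(lines):
    """Group ATOM/HETATM lines by the MODEL number that encloses them."""
    models = {}
    current = None
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            current = int(line[6:14])      # model number, right-justified to column 14
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif current is not None and record in ("ATOM", "HETATM"):
            models[current].append(line)
    return models


# Invented two-model fragment (coordinates are illustrative only).
SAMPLE = [
    "MODEL        1",
    "ATOM      1  N   THR     1      17.047  14.099   3.625  1.00  0.00",
    "ATOM      2  CA  THR     1      16.967  12.784   4.338  1.00  0.00",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   THR     1      17.100  14.000   3.700  1.00  0.00",
    "ATOM      2  CA  THR     1      17.000  12.800   4.300  1.00  0.00",
    "ENDMDL",
    "END",
]

models = split_models(SAMPLE)
for number, atoms in models.items():
    print(f"model {number}: {len(atoms)} atoms")
```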