Sequences that lie relatively close to the transcription startpoint of a gene (usually within a few hundred to a few thousand nucleotides) and control its expression are collectively known as the promoter. The promoter acts as a basic recognition unit: it signals that a gene is available for transcription and provides the information RNAPol II needs to recognize the gene and initiate RNA synthesis correctly, at the right place and using the correct strand of DNA as template. The promoter also plays an important role in ensuring that RNA is synthesized at the right time and in the right cell. Most control regions in the promoter lie upstream of the transcription startpoint and are therefore not transcribed into RNA; occasionally, however, some promoter elements lie downstream of the startpoint and may actually be transcribed into RNA. The structure of promoters varies from gene to gene, but a number of key sequence elements can be identified within them. These elements may be present in varying combinations, with some present in one gene but absent in another. Sequences further from the transcription start site, some of which are known as enhancers, may also have a major impact on the transcription of a gene. Enhancers exert their effect independently of their orientation and of whether they lie upstream or downstream of the start site.
The efficiency and specificity of gene expression are conferred by cis-acting elements
A promoter exerts its effect because it lies on the same piece of DNA as the gene being transcribed; it is referred to as a cis-acting sequence or element to emphasize that it affects only the neighboring gene on the same chromosome. Because the promoter is critical to gene expression (without it the mRNA would not be made), it is often regarded as part of the gene it controls.
The nucleotide sequence immediately surrounding the startpoint varies from gene to gene. However, the first base in mRNA tends to be adenine (A), usually followed by a pyrimidine-rich sequence, termed the initiator (Inr). In general, it has the nucleotide sequence Py₂CAPy₅ (Py = pyrimidine base; C = cytosine; A = adenine) and is found between positions -3 and +5 relative to the startpoint. In addition to Inr, most promoters possess a sequence known as the TATA box. This sequence element usually lies approximately 25 bp upstream of the startpoint. The TATA box has an 8 bp consensus sequence that usually consists entirely of adenine-thymine (A-T) base pairs, although very rarely a guanine-cytosine (G-C) pair may be present. This sequence appears to be very important for transcription, as nucleotide substitutions that disrupt the TATA box markedly reduce the efficiency of transcription. The positions of Inr and the TATA box relative to the startpoint are relatively fixed (Fig. 33.1).
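As a rough illustration of how the Inr pattern and the upstream TATA position could be checked in a DNA sequence, the following Python sketch scans for Py₂CAPy₅ and then inspects an 8 bp window about 25 bp upstream of the A. The function names, the toy sequence, and the choice to centre the window exactly 25 bp upstream are assumptions made for this example; they are not taken from the text, and real promoters tolerate more variation than an exact pattern match allows.

```python
import re

# Py2CAPy5 written as a regular expression; in DNA a pyrimidine is C or T.
# The lookahead lets overlapping candidates be reported as well.
INR_PATTERN = re.compile(r"(?=([CT]{2}CA[CT]{5}))")


def find_inr_sites(seq):
    """Return the 0-based index of the A in every Py2CAPy5 match."""
    seq = seq.upper()
    # The A is the fourth base of the pattern, i.e. 3 positions after its start.
    return [m.start() + 3 for m in INR_PATTERN.finditer(seq)]


def upstream_window(seq, a_index, offset=25, width=8):
    """Return the `width`-bp window centred `offset` bp upstream of a_index.

    Centring the window is an assumption of this sketch; the text only says
    the TATA box lies approximately 25 bp upstream of the startpoint.
    """
    lo = max(0, a_index - offset - width // 2)
    hi = max(0, a_index - offset + width // 2)
    return seq.upper()[lo:hi]


def at_fraction(window):
    """Fraction of bases in the window that are A or T (0.0 if empty)."""
    if not window:
        return 0.0
    return sum(base in "AT" for base in window) / len(window)


# Hypothetical toy sequence: an A/T-only 8-mer, a spacer, then an Inr-like site.
TOY_SEQ = "GGAGGAGGAG" + "TATAAATA" + "GGA" * 6 + "TCCATTCTC" + "GGAGG"

if __name__ == "__main__":
    for a_index in find_inr_sites(TOY_SEQ):
        window = upstream_window(TOY_SEQ, a_index)
        print(a_index, window, at_fraction(window))
```

A high A/T fraction in the upstream window is only suggestive; it does not by itself establish that a functional TATA box is present.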
In addition to the TATA box, other commonly found cis-acting promoter elements have been described. For example, the CAAT box is often found upstream of the TATA box, typically about 80 bp from the startpoint, and may be present and functional in either orientation. As with the TATA box, it may be more important for increasing the strength of the promoter signal than for controlling tissue- or time-specific expression of the gene. Another commonly noted promoter element is the GC box. This is a GC-rich sequence that can function in either orientation, and multiple copies may be found in a single promoter region.
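Because elements such as the CAAT box and GC box can function in either orientation, a search for them has to consider the motif and its reverse complement. The Python sketch below shows one way to do that. The function names are hypothetical, and the motif CCAAT used in the usage lines is an illustrative core sequence supplied for this example rather than a consensus given in the text.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_either_orientation(seq, motif):
    """Return sorted (index, orientation) hits for a motif.

    '+' means the motif reads as given on the supplied strand; '-' means
    its reverse complement was found, i.e. the element sits in the
    opposite orientation. A palindromic motif is reported in both.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    for orientation, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, orientation))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence containing the motif once in each orientation.
TOY_PROMOTER = "GGCCAATCTGGATTGGCC"

if __name__ == "__main__":
    print(find_either_orientation(TOY_PROMOTER, "CCAAT"))
```

An exact string match is the simplest possible test; since promoter elements may differ by a few nucleotides between genes, a practical scan would normally allow some mismatches.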
Figure 33.1 Idealized version of a promoter comprising various elements. Each promoter element has a specific consensus sequence that binds ubiquitous transcription-activating factors. Binding of transcription factors encompasses the consensus site and a variable number of anonymous adjacent nucleotides, depending on the promoter element. CTF, a member of a protein family whose members act as transcription factors; TBP, TATA-binding protein; NF-1, nuclear factor-1; SP-1, ubiquitous transcription factor.
Identifying the function and specificity of nucleotide sequences
Consensus sequences are nucleotide sequences that contain unique core elements that identify the function and specificity of the sequence, for example the TATA box. The sequence of the element may differ by a few nucleotides in different genes, but a core, or consensus, sequence is always present. In general, the differences do not influence the effectiveness of the sequence.
Figure 33.1 lists some of the common cis-acting elements seen within promoters. These promoter elements bind protein factors (transcription factors) that recognize the DNA sequence of each particular element. Some transcription factors stimulate transcription, others suppress it; some are expressed ubiquitously, others in a tissue- or time-specific fashion. Thus the array of factors bound to a promoter region can vary from cell to cell and from tissue to tissue, and can be affected by the state of the organism. These factors, bound to promoter sequences, determine how actively RNAPol II copies the DNA into RNA.