Liverpool Life Sciences UTC Innovation Labs: August 2017

This month is a little later than I had hoped, mainly due to my choice of molecule. I decided I wanted to get more familiar with the (expanding number of) regulatory proteins involved in the control of gene transcription in bacteria, partly out of a research interest and partly because it allows me to combine my interest in language (or, more specifically, alphabets) and Science. The downside is that I will have to replace all of my as with αs and my bs with βs etc., which is always a little clunky on my free Blogging software! The molecule I am focusing on is RNA Polymerase (Pol), which I have covered earlier. However, this time I am going to take a look at the transcription factors, or "known associates", of this multi-subunit protein complex that bridges the gap between information and function.

The genomes of prokaryotes typically contain a set of between 2 000 and 5 000 protein-coding genes, together with a few hundred genes that encode functional RNAs. This is all information; but in order to "translate" from nucleic acid speak (Nu-speak: sorry George!) to the language of amino acids and proteins (Pep-speak?), the ribosome is required. However, a limited number of RNA species combine an information mode with function, such as the hammerhead ribozyme, which can catalyse specific RNA cleavage in the absence of any proteins (a good future molecular candidate perhaps?). The nucleotide sequence of a ribozyme is no different from that in the genome (apart from an additional oxygen atom per sugar, and U in place of T), and it also determines its three-dimensional fold, and therefore its biological function.

You may be interested to know the source of the images used in this post. I have chosen, where possible, to include the beautiful models created and exhibited at the Pingry Biomolecular Modelling Project web site, which is just one of the incredibly impressive initiatives at the Pingry School: more information on this ground-breaking collaboration between the Milwaukee School of Engineering (MSOE University) staff and the students and teachers at the Pingry School can be found here. On the right is an image of the components of RNA Pol in the early stages of transcriptional initiation. I hope you will agree with me that these models capture both structure and function in a beautiful and informative way.

The Basics

RNA Pols, in their simplest forms (let's leave bacteriophage enzymes on the side for now), comprise two α subunits, a β and a variation of β called β-prime (written β'). There are some enzymes in which the β-type subunits are fused, but these are only occasional exceptions. This hetero-tetrameric "apo-enzyme" then associates with a number of "regulators" to form the "holo-enzyme", the most important being the σ subunit, which is critical for determining the DNA sequence specificity associated with the choice of the promoter to be transcriptionally active. The image below the reaction scheme shows the promoter sequences recognised by the RNA Pol holoenzyme, with the -10 and -35 elements (recognised by the sigma factor) highlighted. As we shall see below, the σ subunit comes in a number of different "flavours". The reaction catalysed by RNA Pols is shown below. It is important to remember that, while I am discussing sequence-specific DNA binding, RNA Pols are catalysts: DNA is the template, nucleoside triphosphates (NTPs) are the substrates and RNA is the product.
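For readers who like to tinker, the idea of a σ factor "reading" the -35 and -10 elements can be sketched in a few lines of Python. This is only a toy, not how any real promoter-prediction tool works. It assumes the textbook E. coli σ⁷⁰ consensus hexamers (TTGACA and TATAAT) and a spacer of 16 to 18 base pairs, none of which are stated in this post. The function names and the test sequence are invented for illustration.

```python
# Textbook E. coli sigma-70 consensus hexamers (an assumption, not from the post).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"
SPACERS = range(16, 19)  # assumed typical gap (bp) between the two boxes


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(dna, max_mismatch=1):
    """Return (start of -35 box, start of -10 box, spacer) for each candidate."""
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 6 + 1):
        if mismatches(dna[i:i + 6], MINUS_35) > max_mismatch:
            continue
        # A -35-like hexamer was found: look for a -10-like hexamer downstream.
        for gap in SPACERS:
            j = i + 6 + gap
            box = dna[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j, gap))
    return hits


# A made-up sequence: padding, -35 box, 17 bp spacer, -10 box, tail.
toy = "GGCC" + "TTGACA" + "A" * 17 + "TATAAT" + "GCGCGCATG"
print(find_promoters(toy))  # -> [(4, 27, 17)]
```

Allowing one mismatch per box is an arbitrary choice: real promoters rarely match the consensus perfectly, and how closely they match is one of the things that sets promoter strength.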

The prefixes apo and holo are derived from the Greek, meaning away from and complete respectively, and are used frequently by Biochemists to describe proteins without (apo) a key component, such as a co-factor, compared with the fully functional molecule (holo): apo-haemoglobin lacks the haem, for example. Which brings me to the inevitable glossary: an essential set of definitions of terms, symbols and concepts needed to understand gene transcription. For those of you who are unfamiliar with the idiosyncrasies of the Greek alphabet, I have included my suggested (phonetic) pronunciations: remember, when discussing Science, it really helps if you feel confident about the pronunciation of some of the rather ludicrous terms!

The Greek alphabet and my advice on pronunciation!

[A "hard" consonant, e.g. the first and last G in gang, is written gg, while the soft G in German is written as a single j. Where there is no ambiguity, e.g. the letter D, it is shown as a single d. If the vowel is drawn out, like the two Es in meet, it is again doubled.]

α (alff-a)
β (bee-ta (UK), bayta (USA))
γ (ggamm-a)
δ (delt-a)
ε (ep-ssee-lon)
ζ (zee-ta)
η (ee-ta (UK), ayta (USA))
θ (thee-ta (UK), sometimes tha-yta (USA))
ι (eye-oh-ta)
κ (kapp-a)
λ (lamm-da)
μ (mew)
ν (new)
ξ (k-ss-eye)
ο (oh-mee-kron)
π (p-eye, or for English readers pie!)
ρ (row)
σ (ssigg-ma)
τ (torr)
υ (up-ssee-lon)
φ (ff-eye)
χ (kai, or k-eye [not kee])
ψ (p-ss-eye, as in psychology)
ω (oh-mee-ga (UK) or oh-may-ga (USA))

A short glossary

A
Apoenzyme: an incomplete molecule, which usually requires a coenzyme (such as FAD), an additional protein (such as σ) or an RNA molecule for full function.
H
Holoenzyme: a complete molecule, usually incorporating an essential coenzyme (such as FAD), an additional protein (such as σ) or an RNA molecule, and expressing full biological function.
O
Operator: the term given to the repressor (or activator) binding site that flanks a promoter. In effect, the sequence of the promoter is extended in either direction (or possibly both).
P
Promoter: a stretch of double-stranded DNA sequence to which an RNA Pol binds and which, through a series of orchestrated molecular interactions, marks the initiation point for the transcription of a particular gene or group of genes. In bacteria, the DNA sequence comes in two sections, the -10 and -35 "boxes": each is a short motif (around six base pairs) recognised by a σ factor (which is itself associated with the apo-enzyme form of RNA Pol). The numbers indicate the approximate distance, in base pairs, between each "box" and the nucleotide that forms the 5' end of the transcript; the negative sign shows that the boxes lie upstream of that nucleotide. The diagram below should help explain these concepts.
R
Ribosome: a multi-component molecular machine comprising rRNA and polypeptides in the form of two "subunits", referred to by their sedimentation properties in an analytical ultracentrifuge. The 30S (small) and 50S (large) subunits co-assemble during the initiation of protein synthesis in the presence of initiation factors, aminoacylated tRNAs, mRNA and an energy supply. You can read more here.
T
Transcription: the catalytic, template-mediated synthesis of RNA from double-stranded DNA. The products are a range of RNAs, including messenger, transfer etc., and the enzyme may be a single species, such as bacterial RNA Polymerase, or a dedicated one, such as RNA PolII in eukaryotes, which catalyses mRNA biosynthesis.
Translation: the biosynthesis of polypeptide chains from mRNA templates via the ribosome. Each ribosome can accommodate virtually any mRNA and, in higher organisms, aggregates of ribosomes are called polysomes.
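
To make the glossary entry for transcription a little more concrete, here is a minimal Python sketch of template-directed RNA synthesis. It assumes simple Watson-Crick pairing with U replacing T in the RNA, and it ignores everything interesting about the enzyme itself. The function names and the nine-base example sequence are made up for illustration.

```python
# Watson-Crick pairing used when copying a DNA template into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5_to_3):
    """Template strand (written 3'->5') paired with a coding strand (5'->3')."""
    return "".join(DNA_TO_DNA[base] for base in coding_5_to_3.upper())


def transcribe(template_3_to_5):
    """RNA (written 5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


coding = "ATGGCTTAA"                     # hypothetical coding strand
template = template_from_coding(coding)  # TACCGAATT
rna = transcribe(template)               # AUGGCUUAA
print(rna)
```

The transcript has the same sequence as the coding strand, with U in place of T, which is why gene sequences are conventionally quoted as the coding strand even though the polymerase reads the other one.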

Sigma factors

One of the many returns on our collective investment in genome sequencing has been the insight gained into those genes that are essential for cell growth and reproduction. Not surprisingly, the genes encoding the polypeptides that make up RNA Pols are essential for cell viability. However, while all prokaryotes possess the genes encoding the α (rpoA), β/β' (rpoB and rpoC) and the major σ factor, σ⁷⁰ (rpoD or sigA), there are some other regulatory factors that seem to confer advantages in regulating gene expression, and that are likely to add to the physiological versatility of the organisms in which they are expressed. In the well-studied prokaryote E. coli, in addition to σ⁷⁰, we find the following σ factors:

σ¹⁹ (fecI) - regulates the fec gene for iron transport
σ²⁴ (rpoE) - the extreme heat stress factor
σ²⁸ (rpoF) - the flagellar factor
σ³² (rpoH) - the heat shock factor, that is turned on when the bacteria are exposed to heat. Some of the enzymes that are expressed upon activation of σ³² are chaperones, proteases and DNA-repair enzymes.
σ³⁸ (rpoS) - the starvation/stationary phase sigma factor
σ⁵⁴ (rpoN) - the nitrogen-limitation factor.

Before (L) and After (R)

σ factors interact with the RNA Pol apoenzyme to generate the holoenzyme and, in doing so, provide the enzyme with the capacity to recognise the -10 and -35 elements of a promoter (see figure and scheme above). The "before and after" images (LHS) show the location of the (orange) σ factor in the complex, and how its elongated shape facilitates recognition of the -10 and -35 elements (the promoter is the blue and pale green duplex above the RNA Pol). The initiation of transcription of all constitutive genes only requires the RNA Pol holoenzyme, as in the "before" image. As soon as the transcriptional start site is exposed and a supply of NTPs is made available, the σ factor dissociates (the "after" image) and the elongation phase of transcription gets underway. The role of the σ factor is primarily to "target" the catalytic apparatus: by replacing the house-keeping σ factor with any of the above sigma variants, selective sets of genes can be expressed in response to one or more environmental cues. Pretty straightforward, I think you'll agree. This principle of combining a core function, in this case RNA synthesis, with a variety of targeting polypeptides (in this case sigma subunits), is a common strategy used in Biology, with antibodies being a well-known example.
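
The "one core, many targeting subunits" idea can be sketched as a simple lookup in Python. This is a cartoon rather than a model: the σ factors, genes and roles are those listed in this post, but the cue labels used as dictionary keys, the `Holoenzyme` class and the `assemble` function are all hypothetical names invented for the sketch.

```python
from dataclasses import dataclass

CORE = "alpha2-beta-beta'"  # the apoenzyme described earlier in the post

# (sigma factor, gene, role) as listed in this post. The keys are
# hypothetical cue labels invented for this sketch, not standard terms.
SIGMA_FOR_CUE = {
    "normal growth": ("sigma-70", "rpoD", "house-keeping genes"),
    "iron transport": ("sigma-19", "fecI", "fec iron transport"),
    "extreme heat": ("sigma-24", "rpoE", "extreme heat stress response"),
    "motility": ("sigma-28", "rpoF", "flagellar genes"),
    "heat shock": ("sigma-32", "rpoH", "chaperones, proteases, DNA repair"),
    "starvation": ("sigma-38", "rpoS", "stationary phase genes"),
    "nitrogen limitation": ("sigma-54", "rpoN", "nitrogen-limitation genes"),
}


@dataclass(frozen=True)
class Holoenzyme:
    core: str
    sigma: str
    gene: str
    targets: str


def assemble(cue):
    """Pair the core enzyme with the sigma factor matching a cue.

    Unknown cues fall back to the house-keeping sigma factor.
    """
    sigma, gene, targets = SIGMA_FOR_CUE.get(cue, SIGMA_FOR_CUE["normal growth"])
    return Holoenzyme(CORE, sigma, gene, targets)


for cue in ("normal growth", "heat shock", "starvation"):
    holo = assemble(cue)
    print(f"{cue}: {holo.core} + {holo.sigma} ({holo.gene}) -> {holo.targets}")
```

Note that the core never changes in this sketch: only the σ entry does, and with it the set of genes that gets transcribed.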

Anti-sigma factors

The potency of σ factors has led to the evolution of antagonistic molecules, called anti-sigma factors. In some organisms, σ factors need to be attenuated [slowed] (or even abrogated [stopped]): this can be achieved by the expression of anti-sigmas. Again, the logic is pretty simple. A σ factor can be maintained in complex with an anti-sigma until an environmental cue is triggered. Through an induced conformational switch, such as a pH transition or the binding of a small molecule to the anti-sigma component, the two components (see the image of the T4 phage anti-sigma-σ complex, RHS) are able to dissociate and the σ factor is free to promote targeted transcription.

Repressors

These molecules have a special place in the history of Molecular Genetics. The work of Jacob and Monod (see an earlier post on RNA Pol) in the early 1960s laid the foundations for our understanding of gene regulation in prokaryotes and higher organisms. At the centre of their logic was the concept of the repressor, which was later defined in molecular terms as a protein molecule (although it can also be an RNA molecule) that interferes with transcription. The mode of action of repressors can be simply described as creating a road block in the path of a promoter-bound RNA Pol. Since this simple concept was proposed, however, genetic, structural and kinetic studies have shown that repressors can inhibit RNA Pol progress by a variety of mechanisms, which do not always arise from simply blocking the path of the RNA Pol, or from competing for a specific sequence at or around the promoter. In fact, some repressors (including the lambda repressor shown left) are able to act as both repressors and activators of RNA Pol mediated transcription, and this forms the basis of the "plot" of the remarkable work from Mark Ptashne's laboratory, whose short book on this topic is a "must read" for all Molecular Biology students. Since most repressors do not form stable interactions with RNA Pols (although this is not meant to be a dogmatic statement), I will not discuss them further in this post.

Termination factor ρ, which is shown on the right, is responsible for terminating RNA Pol mediated transcription. Once again, ρ acts like a classical repressor in recognising a specific RNA termination sequence of around 70 nucleotides, signalling the end of the road for RNA Pol: the ρ protein does not form a stable complex with RNA Pol. Bacteria like E. coli invest significant energy in synthesising this hexameric homo-polymeric protein and it is essential for viability in most prokaryotes. In fact, the transcription of about half of the genes in E. coli is terminated via ρ, while the remainder are said to be ρ-independent, or alternatively utilise the proteins τ or NusA.

The ω and δ factors

These are both bona fide components of the RNA Pol holoenzyme. ω seems to be involved in chaperoning and stabilising the interactions of the β' subunit. Unlike ρ, it is dispensable, in that ω knockouts survive, but it does seem to improve the net efficiency of transcription: I expect growth rates in ω knockouts are lower than in wild-type strains. δ is also formally enshrined in the RNA Pol holoenzyme and, like ω, the gene encoding this factor is not essential, but its removal from a genome does give rise to some strange morphological changes in growing cells (abnormal elongation in particular). A complete understanding of the roles of these two factors in transcription remains to be elucidated, but both primary structures are highly conserved amongst prokaryotes and a number of groups are currently looking at the functions of these accessory factors during infections in pathogenic bacteria.

I want to close with a mention of a growing number of regulators that bind to DNA directly, or act indirectly via σ factors, and thereby modulate RNA Pol transcription in response to a redox signal mediated by an iron-sulphur (Fe-S) cluster, buried in the heart of the protein or sometimes in a flexible subdomain. One such regulator is SoxR (containing a 2Fe-2S cluster, shown in red-yellow on the LHS). I think you can see how the distortion of DNA might be induced by SoxR and how this can modulate transcription initiation. Environmental signals, such as reactive oxygen species and NO, trigger gene expression events that ultimately lead to the elaboration of processes that defend the cell against this metabolic challenge. The Wbl proteins are a class of gene regulators, originally identified in Streptomyces strains, which are also found amongst the Mycobacteria (think TB). The main reason for including them, however, is that they have become a major area of interest of one of my colleagues at Sheffield, Professor Jeff Green, and because I really like the story emerging from his lab (see a review here), which connects redox sensing and the control of gene expression. It may have wider implications for a number of prokaryotes and may possibly modulate the mode of action of some antibiotics.

In summary, RNA Pol is the extraordinary focal point for gene regulation in bacteria, and we are learning every day about the plethora of polypeptides and RNAs that influence its activity. I hope this has given you a flavour of the structure and function of this area of Molecular Biology. At some point, when I am brave enough, I'll look at the eukaryotic RNA Pols!


Wednesday 23 August 2017

The challenges of making RNA in bacteria. A late summer selection of molecules
