Messenger RNA (mRNA) is a single-stranded RNA molecule that corresponds to the genetic sequence of a gene and is read by the ribosome in the process of producing a protein. mRNA is created during the process of transcription, where the enzyme RNA polymerase converts genes into primary transcript mRNA (also known as pre-mRNA). This pre-mRNA usually still contains introns, regions that will not go on to code for the final amino acid sequence. These are removed in the process of RNA splicing, leaving only exons, regions that will encode the protein. This exon sequence constitutes mature mRNA. Mature mRNA is then read by the ribosome, and, utilising amino acids carried by transfer RNA (tRNA), the ribosome creates the protein. This process is known as translation. All of these processes form part of the central dogma of molecular biology, which describes the flow of genetic information in a biological system.
As in DNA, genetic information in mRNA is contained in the sequence of nucleotides, which are arranged into codons consisting of three nucleotides each. Each codon codes for a specific amino acid, except the stop codons, which terminate protein synthesis. This process of translation of codons into amino acids requires two other types of RNA: transfer RNA, which recognises the codon and provides the corresponding amino acid, and ribosomal RNA (rRNA), the central component of the ribosome's protein-manufacturing machinery.
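As a rough illustration of how codons map to amino acids, the following Python sketch builds the standard genetic code and reads an mRNA string three nucleotides at a time until it reaches a stop codon. The function name `translate`, the one-letter amino acid codes, and the example sequence are illustrative assumptions; the sketch ignores start-codon selection, tRNA, and the ribosome itself.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string, read from its first base, into one-letter amino acid codes."""
    protein = []
    # Step through the sequence one codon (three nucleotides) at a time.
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # a stop codon terminates protein synthesis
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example: AUG (Met), UUU (Phe), GGG (Gly), UGA (stop)
print(translate("AUGUUUGGGUGA"))  # MFG
```

Keeping the table as a plain dictionary makes the one-codon-to-one-amino-acid relationship explicit, with the stop codons as the only entries that do not add a residue.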
The existence of mRNA was first suggested by Jacques Monod and François Jacob, and mRNA was subsequently discovered by Jacob, Sydney Brenner and Matthew Meselson at the California Institute of Technology in 1961.
The brief existence of an mRNA molecule begins with transcription, and ultimately ends in degradation. During its life, an mRNA molecule may also be processed, edited, and transported prior to translation. Eukaryotic mRNA molecules often require extensive processing and transport, while prokaryotic mRNA molecules do not. A molecule of eukaryotic mRNA and the proteins surrounding it are together called a messenger RNP.
Transcription is the synthesis of RNA from a DNA template. During transcription, RNA polymerase copies a gene from the DNA into mRNA as needed. This process is similar in eukaryotes and prokaryotes. One notable difference, however, is that eukaryotic RNA polymerase associates with mRNA-processing enzymes during transcription so that processing can proceed quickly after the start of transcription. The short-lived, unprocessed or partially processed product is termed precursor mRNA, or pre-mRNA; once completely processed, it is termed mature mRNA.
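The copying step can be sketched in Python under a simplifying assumption: a gene is represented as a plain DNA string written 5' to 3'. The function names below are hypothetical. The mRNA has the same sequence as the coding strand with U in place of T, which is equivalent to the reverse complement of the template strand.

```python
# Map each DNA base on the template strand to the RNA base that pairs with it.
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def transcribe_coding(coding_strand):
    """Return the mRNA for a coding (sense) strand written 5'->3'."""
    return coding_strand.replace("T", "U")


def transcribe_template(template_strand):
    """Return the mRNA (5'->3') for a template (antisense) strand written 5'->3'."""
    # The template is read 3'->5', so complement each base and reverse the result.
    return template_strand.translate(DNA_TO_RNA_COMPLEMENT)[::-1]


# Hypothetical six-base gene fragment and its template strand.
print(transcribe_coding("ATGGCC"))    # AUGGCC
print(transcribe_template("GGCCAT"))  # AUGGCC
```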
Processing of mRNA differs greatly among eukaryotes, bacteria, and archaea. Non-eukaryotic mRNA is, in essence, mature upon transcription and requires no processing, except in rare cases. Eukaryotic pre-mRNA, however, requires several processing steps before its transport to the cytoplasm and its translation by the ribosome.
The extensive processing of eukaryotic pre-mRNA that leads to mature mRNA includes RNA splicing, a mechanism by which introns or outrons (non-coding regions) are removed and exons (coding regions) are joined together.
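A minimal Python sketch of splicing follows, assuming the exon boundaries are already known and given as hypothetical (start, end) index pairs. Real splicing is carried out by the spliceosome, which recognises sequence signals at the intron boundaries; this sketch only shows the effect of removing introns and joining exons.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a pre-mRNA, discarding everything between them.

    `exons` is a list of (start, end) index pairs in 5'->3' order, with `end`
    exclusive. The coordinates are supplied by the caller, not detected here.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical pre-mRNA: exon 1, a 12-nucleotide intron, then exon 2.
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon_2 = "UUUUAA"
pre_mrna = exon_1 + intron + exon_2

mature = splice(pre_mrna, [(0, 6), (18, 24)])
print(mature)  # AUGGCCUUUUAA
```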
A 5' cap (also termed an RNA cap, an RNA 7-methylguanosine cap, or an RNA m7G cap) is a modified guanine nucleotide that has been added to the "front" or 5' end of a eukaryotic messenger RNA shortly after the start of transcription. The 5' cap consists of a terminal 7-methylguanosine residue that is linked through a 5'-5'-triphosphate bond to the first transcribed nucleotide. Its presence is critical for recognition by the ribosome and protection from RNases.
Cap addition is coupled to transcription, and occurs co-transcriptionally, such that each influences the other. Shortly after the start of transcription, the 5' end of the mRNA being synthesized is bound by a cap-synthesizing complex associated with RNA polymerase. This enzymatic complex catalyzes the chemical reactions that are required for mRNA capping. Synthesis proceeds as a multi-step biochemical reaction.
In some instances, an mRNA will be edited, changing the nucleotide composition of that mRNA. An example in humans is the apolipoprotein B mRNA, which is edited in some tissues, but not others. The editing creates an early stop codon, which, upon translation, produces a shorter protein.
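The effect of such an edit can be sketched in Python with a toy sequence. This is not the real apolipoprotein B sequence, and the function names are hypothetical. The sketch assumes a single C-to-U change that turns a CAA codon into the stop codon UAA, which is how the apolipoprotein B case is commonly described.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_c_to_u(mrna, position):
    """Return a copy of the mRNA with the C at `position` replaced by U."""
    if mrna[position] != "C":
        raise ValueError("expected a C at the edited position")
    return mrna[:position] + "U" + mrna[position + 1:]


def codons_before_stop(cds):
    """Count the codons read from the first base before the first in-frame stop codon."""
    count = 0
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return count


# Toy coding sequence: AUG GCC CAA GGG UUU UAA
unedited = "AUGGCCCAAGGGUUUUAA"
edited = edit_c_to_u(unedited, 6)  # CAA becomes UAA

print(codons_before_stop(unedited))  # 5
print(codons_before_stop(edited))    # 2
```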
Polyadenylation is the covalent linkage of a polyadenylyl moiety to a messenger RNA molecule. In eukaryotic organisms most messenger RNA (mRNA) molecules are polyadenylated at the 3' end, but recent studies have shown that short stretches of uridine (oligouridylation) are also common. The poly(A) tail and the protein bound to it aid in protecting mRNA from degradation by exonucleases. Polyadenylation is also important for transcription termination, export of the mRNA from the nucleus, and translation. mRNA can also be polyadenylated in prokaryotic organisms, where poly(A) tails act to facilitate, rather than impede, exonucleolytic degradation.
Polyadenylation occurs during and/or immediately after transcription of DNA into RNA. After transcription has been terminated, the mRNA chain is cleaved through the action of an endonuclease complex associated with RNA polymerase. After the mRNA has been cleaved, around 250 adenosine residues are added to the free 3' end at the cleavage site. This reaction is catalyzed by polyadenylate polymerase. Just as in alternative splicing, there can be more than one polyadenylation variant of an mRNA.
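The cleavage and tail-addition steps can be sketched in Python as follows. The function name, the cleavage positions, and the toy sequence are hypothetical; the default tail length of 250 follows the figure given above. Choosing different cleavage sites on the same transcript yields different polyadenylation variants.

```python
def polyadenylate(pre_mrna, cleavage_site, tail_length=250):
    """Cleave a transcript at `cleavage_site` and append a poly(A) tail.

    Everything from `cleavage_site` onward is discarded, then `tail_length`
    adenosines are added to the new 3' end.
    """
    if not 0 < cleavage_site <= len(pre_mrna):
        raise ValueError("cleavage site must lie within the transcript")
    return pre_mrna[:cleavage_site] + "A" * tail_length


# Toy transcript with two hypothetical cleavage sites (short tails for display).
pre_mrna = "AUGGCCUAAGGCCUUGGCAUCC"
proximal_variant = polyadenylate(pre_mrna, 12, tail_length=5)
distal_variant = polyadenylate(pre_mrna, 20, tail_length=5)

print(proximal_variant)  # AUGGCCUAAGGCAAAAA
print(distal_variant)    # AUGGCCUAAGGCCUUGGCAUAAAAA
```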
Polyadenylation site mutations also occur. The primary RNA transcript of a gene is cleaved at the poly-A addition site, and 100-200 A's are added to the 3' end of the RNA. If this site is altered, an abnormally long and unstable mRNA construct will be formed.
Another difference between eukaryotes and prokaryotes is mRNA transport. Because eukaryotic transcription and translation are compartmentally separated, eukaryotic mRNAs must be exported from the nucleus to the cytoplasm, a process that may be regulated by different signaling pathways. Mature mRNAs are recognized by their processed modifications and then exported through the nuclear pore by binding to the cap-binding proteins CBP20 and CBP80, as well as the transcription/export complex (TREX). Multiple mRNA export pathways have been identified in eukaryotes.
In spatially complex cells, some mRNAs are transported to particular subcellular destinations. In mature neurons, certain mRNAs are transported from the soma to dendrites. One site of mRNA translation is at polyribosomes selectively localized beneath synapses. The mRNA for Arc/Arg3.1 is induced by synaptic activity and localizes selectively near active synapses based on signals generated by NMDA receptors. Other mRNAs also move into dendrites in response to external stimuli, such as β-actin mRNA. Upon export from the nucleus, actin mRNA associates with ZBP1 and the 40S subunit. The complex is bound by a motor protein and is transported to the target location (neurite extension) along the cytoskeleton. Eventually ZBP1 is phosphorylated by Src in order for translation to be initiated. In developing neurons, mRNAs are also transported into growing axons and especially growth cones. Many mRNAs are marked with so-called "zip codes," which target their transport to a specific location.
Because prokaryotic mRNA does not need to be processed or transported, translation by the ribosome can begin while the transcript is still being synthesized. Therefore, it can be said that prokaryotic translation is coupled to transcription and occurs co-transcriptionally.
Eukaryotic mRNA that has been processed and transported to the cytoplasm (i.e., mature mRNA) can then be translated by the ribosome. Translation may occur at ribosomes free-floating in the cytoplasm, or directed to the endoplasmic reticulum by the signal recognition particle. Therefore, unlike in prokaryotes, eukaryotic translation is not directly coupled to transcription. It is even possible in some contexts that reduced mRNA levels are accompanied by increased protein levels, as has been observed for mRNA/protein levels of EEF1A1 in breast cancer.
Coding regions are composed of codons, which are decoded and translated (in eukaryotes usually into one and in prokaryotes usually into several) into proteins by the ribosome. Coding regions begin with the start codon and end with a stop codon. In general, the start codon is an AUG triplet and the stop codon is UAA, UAG, or UGA. The coding regions tend to be stabilised by internal base pairs, which impedes degradation. In addition to being protein-coding, portions of coding regions may serve as regulatory sequences in the pre-mRNA as exonic splicing enhancers or exonic splicing silencers.
Untranslated regions (UTRs) are sections of the mRNA before the start codon and after the stop codon that are not translated, termed the five prime untranslated region (5' UTR) and three prime untranslated region (3' UTR), respectively. These regions are transcribed with the coding region and thus are exonic as they are present in the mature mRNA. Several roles in gene expression have been attributed to the untranslated regions, including mRNA stability, mRNA localization, and translational efficiency. The ability of a UTR to perform these functions depends on the sequence of the UTR and can differ between mRNAs. Genetic variants in the 3' UTR have also been implicated in disease susceptibility because of the change in RNA structure and protein translation.
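The division of a mature mRNA into a 5' UTR, a coding region, and a 3' UTR can be sketched in Python. The function name and example sequence are hypothetical, and the sketch makes a simplifying assumption: the first AUG in the sequence is taken as the start codon, and the coding region ends at the first in-frame stop codon after it.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_regions(mrna):
    """Split an mRNA into (5' UTR, coding region, 3' UTR).

    Returns None if no AUG is found or if no in-frame stop codon follows it.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Read codon by codon in the frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3  # the stop codon is part of the coding region
            return mrna[:start], mrna[start:end], mrna[end:]
    return None


# Hypothetical transcript: 5' UTR "GGACC", coding region AUG GCC UUU UAA, 3' UTR "GCUUAAA".
print(split_regions("GGACCAUGGCCUUUUAAGCUUAAA"))
# ('GGACC', 'AUGGCCUUUUAA', 'GCUUAAA')
```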
The stability of mRNAs may be controlled by the 5' UTR and/or 3' UTR due to varying affinity for RNA degrading enzymes called ribonucleases and for ancillary proteins that can promote or inhibit RNA degradation. (See also, C-rich stability element.)
Translational efficiency, including sometimes the complete inhibition of translation, can be controlled by UTRs. Proteins that bind to either the 3' or 5' UTR may affect translation by influencing the ribosome's ability to bind to the mRNA. MicroRNAs bound to the 3' UTR also may affect translational efficiency or mRNA stability.
Cytoplasmic localization of mRNA is thought to be a function of the 3' UTR. Proteins that are needed in a particular region of the cell can also be translated there; in such a case, the 3' UTR may contain sequences that allow the transcript to be localized to this region for translation.
Some of the elements contained in untranslated regions form a characteristic secondary structure when transcribed into RNA. These structural mRNA elements are involved in regulating the mRNA. Some, such as the SECIS element, are targets for proteins to bind. One class of mRNA element, the riboswitches, directly bind small molecules, changing their fold to modify levels of transcription or translation. In these cases, the mRNA regulates itself.
The 3' poly(A) tail is a long sequence of adenine nucleotides (often several hundred) added to the 3' end of the pre-mRNA. This tail promotes export from the nucleus and translation, and protects the mRNA from degradation.
An mRNA molecule is said to be monocistronic when it contains the genetic information to translate only a single protein chain (polypeptide). This is the case for most of the eukaryotic mRNAs. On the other hand, polycistronic mRNA carries several open reading frames (ORFs), each of which is translated into a polypeptide. These polypeptides usually have a related function (they often are the subunits composing a final complex protein) and their coding sequence is grouped and regulated together in a regulatory region, containing a promoter and an operator. Most of the mRNA found in bacteria and archaea is polycistronic, as is the human mitochondrial genome. Dicistronic or bicistronic mRNA encodes only two proteins.
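The distinction can be illustrated with a Python sketch that counts open reading frames on a transcript. The function names and toy sequences are hypothetical. The sketch assumes non-overlapping ORFs, each running from an AUG to the first in-frame stop codon, and it ignores the ribosome-binding signals that real cells use to select start sites.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna):
    """Return the non-overlapping ORFs (AUG through stop codon) found 5'->3'."""
    orfs = []
    pos = 0
    while True:
        start = mrna.find("AUG", pos)
        if start == -1:
            break
        end = None
        for i in range(start, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                end = i + 3
                break
        if end is None:
            pos = start + 1  # no in-frame stop; try the next AUG
            continue
        orfs.append(mrna[start:end])
        pos = end  # resume the search after this ORF
    return orfs


def classify(mrna):
    """Label a transcript by the number of ORFs it carries."""
    count = len(find_orfs(mrna))
    if count == 0:
        return "no ORF found"
    if count == 1:
        return "monocistronic"
    if count == 2:
        return "dicistronic"
    return "polycistronic"


# Toy transcripts; "GGAGG" is an arbitrary spacer between ORFs.
single = "GGAUGGCCUAAGG"
triple = "AUGGCCUAA" + "GGAGG" + "AUGUUUUGA" + "GGAGG" + "AUGAAAUAG"

print(classify(single))  # monocistronic
print(find_orfs(triple))  # ['AUGGCCUAA', 'AUGUUUUGA', 'AUGAAAUAG']
```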
In eukaryotes, mRNA molecules form circular structures due to an interaction between eIF4E and the poly(A)-binding protein, which both bind to eIF4G, forming an mRNA-protein-mRNA bridge. Circularization is thought to promote cycling of ribosomes on the mRNA, leading to time-efficient translation, and may also function to ensure that only intact mRNAs are translated (partially degraded mRNAs characteristically have no m7G cap or no poly-A tail).
Other mechanisms for circularization exist, particularly in virus mRNA. Poliovirus mRNA uses a cloverleaf section towards its 5' end to bind PCBP2, which binds poly(A)-binding protein, forming the familiar mRNA-protein-mRNA circle. Barley yellow dwarf virus has binding between mRNA segments on its 5' end and 3' end (called kissing stem loops), circularizing the mRNA without any proteins involved.
RNA virus genomes (the + strands of which are translated as mRNA) are also commonly circularized. During genome replication the circularization acts to enhance genome replication speeds, cycling viral RNA-dependent RNA polymerase much the same as the ribosome is hypothesized to cycle.
Different mRNAs within the same cell have distinct lifetimes (stabilities). In bacterial cells, individual mRNAs can survive from seconds to more than an hour. However, the lifetime averages between 1 and 3 minutes, making bacterial mRNA much less stable than eukaryotic mRNA. In mammalian cells, mRNA lifetimes range from several minutes to days. The greater the stability of an mRNA the more protein may be produced from that mRNA. The limited lifetime of mRNA enables a cell to alter protein synthesis rapidly in response to its changing needs. There are many mechanisms that lead to the destruction of an mRNA, some of which are described below.
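The link between stability and protein output can be made concrete with a small Python sketch. It assumes simple first-order (exponential) decay described by a half-life and a constant translation rate per intact mRNA; both are modelling assumptions rather than claims from the text, and the function names and numbers are hypothetical.

```python
import math


def fraction_remaining(t, half_life):
    """Fraction of an mRNA population still intact after time t."""
    return 0.5 ** (t / half_life)


def mean_lifetime(half_life):
    """Average lifetime of a single molecule under exponential decay."""
    return half_life / math.log(2)


def expected_proteins(translation_rate, half_life):
    """Expected proteins made per mRNA: constant rate times mean lifetime."""
    return translation_rate * mean_lifetime(half_life)


# Hypothetical values: a 2-minute half-life and 10 proteins per minute per mRNA.
print(fraction_remaining(4.0, 2.0))  # 0.25
print(expected_proteins(10, 4.0) / expected_proteins(10, 2.0))
```

Under these assumptions, doubling the half-life doubles the expected protein yield per transcript, which is the proportionality the paragraph above describes qualitatively.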
In general, in prokaryotes the lifetime of mRNA is much shorter than in eukaryotes. Prokaryotes degrade messages by using a combination of ribonucleases, including endonucleases, 3' exonucleases, and 5' exonucleases. In some instances, small RNA molecules (sRNA) tens to hundreds of nucleotides long can stimulate the degradation of specific mRNAs by base-pairing with complementary sequences and facilitating ribonuclease cleavage by RNase III. It was recently shown that bacteria also have a sort of 5' cap consisting of a triphosphate on the 5' end. Removal of two of the phosphates leaves a 5' monophosphate, causing the message to be destroyed by the exonuclease RNase J, which degrades 5' to 3'.
Inside eukaryotic cells, there is a balance between the processes of translation and mRNA decay. Messages that are being actively translated are bound by ribosomes, the eukaryotic initiation factors eIF-4E and eIF-4G, and poly(A)-binding protein. eIF-4E and eIF-4G block the decapping enzyme (DCP2), and poly(A)-binding protein blocks the exosome complex, protecting the ends of the message. The balance between translation and decay is reflected in the size and abundance of cytoplasmic structures known as P-bodies. The poly(A) tail of the mRNA is shortened by specialized exonucleases that are targeted to specific messenger RNAs by a combination of cis-regulatory sequences on the RNA and trans-acting RNA-binding proteins. Poly(A) tail removal is thought to disrupt the circular structure of the message and destabilize the cap binding complex. The message is then subject to degradation by either the exosome complex or the decapping complex. In this way, translationally inactive messages can be destroyed quickly, while active messages remain intact. The mechanism by which translation stops and the message is handed off to decay complexes is not understood in detail.
The presence of AU-rich elements in some mammalian mRNAs tends to destabilize those transcripts through the action of cellular proteins that bind these sequences and stimulate poly(A) tail removal. Loss of the poly(A) tail is thought to promote mRNA degradation by facilitating attack by both the exosome complex and the decapping complex. Rapid mRNA degradation via AU-rich elements is a critical mechanism for preventing the overproduction of potent cytokines such as tumor necrosis factor (TNF) and granulocyte-macrophage colony stimulating factor (GM-CSF). AU-rich elements also regulate the biosynthesis of proto-oncogenic transcription factors like c-Jun and c-Fos.
Eukaryotic messages are subject to surveillance by nonsense mediated decay (NMD), which checks for the presence of premature stop codons (nonsense codons) in the message. These can arise via incomplete splicing, V(D)J recombination in the adaptive immune system, mutations in DNA, transcription errors, leaky scanning by the ribosome causing a frame shift, and other causes. Detection of a premature stop codon triggers mRNA degradation by 5' decapping, 3' poly(A) tail removal, or endonucleolytic cleavage.
In metazoans, small interfering RNAs (siRNAs) processed by Dicer are incorporated into a complex known as the RNA-induced silencing complex or RISC. This complex contains an endonuclease that cleaves perfectly complementary messages to which the siRNA binds. The resulting mRNA fragments are then destroyed by exonucleases. siRNA is commonly used in laboratories to block the function of genes in cell culture. It is thought to be part of the innate immune system as a defense against double-stranded RNA viruses.
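The requirement for perfect complementarity can be sketched in Python. The function names and sequences are hypothetical, and the guide strand is much shorter than a real siRNA to keep the example readable. A site counts as a target only if the mRNA contains the exact reverse complement of the guide.

```python
# RNA base pairing: A pairs with U, G pairs with C.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def find_target_sites(mrna, guide):
    """Return start positions in the mRNA that pair perfectly with the guide strand."""
    target = reverse_complement(guide)
    sites = []
    pos = mrna.find(target)
    while pos != -1:
        sites.append(pos)
        pos = mrna.find(target, pos + 1)  # allow overlapping matches
    return sites


# Toy message and a guide designed against positions 3-12.
mrna = "AUGGCCUUUAAGCUAA"
guide = "GCUUAAAGGC"
mismatched_guide = "GCUUAAAGGA"  # one base changed, so pairing is no longer perfect

print(find_target_sites(mrna, guide))             # [3]
print(find_target_sites(mrna, mismatched_guide))  # []
```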
MicroRNAs (miRNAs) are small RNAs that typically are partially complementary to sequences in metazoan messenger RNAs. Binding of a miRNA to a message can repress translation of that message and accelerate poly(A) tail removal, thereby hastening mRNA degradation. The mechanism of action of miRNAs is the subject of active research.
Full-length mRNA molecules have been proposed as therapeutics since the beginning of the biotech era, but there was little traction until the 2010s, when Moderna Therapeutics was founded and managed to raise almost a billion dollars in venture funding in its first three years.
Theoretically, the administered mRNA sequence can cause a cell to make a protein, which in turn could directly treat a disease or could function as a vaccine; more indirectly the protein could drive an endogenous stem cell to differentiate in a desired way.
The primary challenges of RNA therapy center on delivering the RNA to the intended cells, even more than on determining what sequence to deliver. Naked RNA sequences will naturally degrade after preparation; they may trigger the body's immune system to attack them as an invader; and they cannot readily cross the cell membrane. Once within the cell, they must then leave the cell's transport mechanism to take action within the cytoplasm, which houses the ribosomes that direct manufacture of proteins.