Gene Expression

Ground Rules for Gene Expression

(AKA Central Dogma)


Central Dogma, in the broadest sense, encompasses the genetic mechanisms of Replication, Transcription and Translation. In the strictest sense, Central Dogma describes gene expression: information encoded in the nucleotides of DNA being used to construct proteins. The two core genetic processes involved in gene expression are Transcription (synthesis of RNA) and Translation (synthesis of proteins).

Central Dogma of Biology

Before digging into each process, let’s talk a little about what is at stake here. DNA holds our genetic history. It holds codes on how to build an organism, but what does that really mean?

The basic unit of life is the cell, and cells are bounded by membranes built from phospholipids, which naturally form bilayers. Furthermore, phospholipids can even naturally form spherical structures that create two fluid compartments, outside vs. inside. The phospholipids that make up the cellular membrane form the most basic feature of the cell: a dividing point, separating the inside from the outside (more on this in the weeks to come).

Membranes, though, are passive. As a selectively permeable barrier, a membrane lets only certain materials cross. Proteins add functionality to the membrane. By embedding proteins, a cell can change the permeability of its membrane. This is how cells balance what is on the inside and what is on the outside. Membrane proteins can also have enzymatic or signaling functions.

A common expression is that DNA holds the code to make an organism. The meaning of this phrase lies in the concept that by making proteins, we make phospholipid membranes and cells functional. From DNA, cells can build proteins for metabolic pathways, to produce various chemical compounds, anchor with other cells, and in multicellular complex life, we even have the development of special cellular roles that work together to form a composite whole.

The concept of how we go from DNA to RNA and then Proteins is one of the most critical concepts in biology! Today we are going to focus on some of the basics, the Ground Rules, of genetics.

All genetic processes work due to base complementarity. If you know the base complementarity rules, then the foundations of genetics will make sense. At times, this may seem repetitious, but I really want you to get these terms and concepts.
Genes are sometimes referred to as the unit of heredity, and with good reason. A gene is a segment of DNA that holds the code to make a protein (NOTE: or functional RNA, such as transfer RNA). In modern biology, we refer to gene products, which are just the expressed macromolecules coded by a gene.
Remember, a gene product can be either protein or functional RNA (e.g., tRNA). Functional RNA does not code for proteins; instead, these RNA strands have some function in cellular metabolism, most notably in the genetic process of Translation. Examples include transfer RNA (tRNA), ribosomal RNA (rRNA), and small nuclear RNA (snRNA).

All genes have non-coding portions that are critical for the correct transcription (synthesis of RNA). These non-coding areas are critical for regulation and aligning the transcription enzymes (e.g., RNA polymerase). Below is a graphic that shows the structure of a gene. The promoter of a gene is a sequence of DNA upstream of the actual code (coding region) that indicates the “Start” point for transcription. This is how your cell knows where to begin transcription. The loss of the promoter means that the gene will no longer be expressed.

In eukaryotic cells, a common promoter is a DNA sequence that reads TATAAA and is better known as the TATA box. In bacteria, the promoter is known as the Pribnow box (Pribnow-Schaller box). In both cases, the promoter is found in the major groove of the DNA molecule. As can be seen in the image to the right, the major groove is wide enough to “see” the base pairs. The base pairs have an electrochemical profile, and thus can interact with other chemicals (via hydrogen bonds and van der Waals forces). Thus, the major groove is a place where proteins (and other compounds) can bind to specific sequences of DNA! The image to the right shows a bacterial promoter event. One of the factors needed to start transcription (by recognizing the promoter) has bound into the major groove. This recognition event is needed to identify the start point of a gene. The Transcription Initiation Complex will then begin to form at this site and begin the transcription of the gene.
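To make the idea of "recognizing a promoter" a little more concrete, here is a minimal sketch that scans a DNA string for the TATAAA sequence mentioned above. The function name find_motif and the example sequence are made up for illustration. Real promoter recognition is done by proteins reading the chemical profile of the bases, and it tolerates sequence variation that this exact-match search does not.

```python
def find_motif(dna, motif="TATAAA"):
    """Return the 0-based positions where motif occurs in dna (exact matches only)."""
    dna = dna.upper()
    positions = []
    start = dna.find(motif)
    while start != -1:
        positions.append(start)
        # Continue searching one base further along so overlapping hits are found too.
        start = dna.find(motif, start + 1)
    return positions


# Hypothetical sequence written 5' to 3'; it is not taken from a real gene.
example_dna = "GCGCTATAAAGGCTAGCATGCCTGAA"
tata_positions = find_motif(example_dna)
print(tata_positions)
```

If the list comes back empty, this toy model would say the sequence has no recognizable "Start" signal, which mirrors the point that a gene without its promoter is not expressed.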
 
Many genes are regulated, meaning they can be turned on and off. Beyond a promoter, a regulated gene will typically have a non-coding region known as the Operator. The operator is located downstream of the promoter (meaning it will be between the promoter and the coding region). Regulatory proteins can bind to the operator, preventing transcription. Remember, cells are masters at energy conservation. They will not begin producing proteins that are unnecessary. Gene regulation is a common activity of Signal & Receptor systems. The image below is a good visual of the promoter & operator systems.

Housekeeping genes are those that are needed for the general function of the cell and can include genes for glycolysis, the citric acid cycle, and ribosomes. These genes are always ON, and are referred to as constitutive genes.
 
Messenger RNA (mRNA) is a molecule of RNA that carries the gene code for the construction of a protein. mRNA is sent to the Ribosome in order to produce a protein. The code for constructing a protein is in Nucleotide Language, meaning the code is a code of nucleotides. Specifically, the code in mRNA is in the ribonucleotide language (A, U, G, C). In order to make a protein, it is necessary to Translate the ribonucleotide language into the language of proteins, i.e., amino acid sequences.
 
In order to translate, you need an agent of translation. This agent of translation must be a molecule that contains both ribonucleotides and amino acids (think of it as the nucleotide-amino acid dictionary). A specific ribonucleotide sequence must directly correspond to an amino acid, just as translating human languages requires word-for-word relationships. This concept of a direct nucleotide to amino acid relationship is the basis of the Genetic Code.
 
The agent of translation is Transfer RNA (tRNA). In tRNA, there is a direct physical correspondence between a 3-nucleotide sequence (anticodon) and an amino acid. To the right are common ways of illustrating tRNA, with the 3rd image being the most common way of drawing the molecule. In the image, each molecule has a region known as the anticodon; this region will interact with mRNA. At the 3′ end of the molecule, a specific amino acid will be bound.
 
On the mRNA, the code is broken down into codons (think of these as genetic words). Codons consist of 3 adjacent nucleotides. Codons are complementary to anticodons found on tRNA. Each tRNA has a specific anticodon-amino acid relationship, so each codon then specifies an amino acid. The genetic code is NOT ambiguous. There is a direct correspondence between codon and amino acid; the tRNAs make sure of this.
The ribonucleotide language is divided into 64 3-nucleotide words known as codons. Codons specify, through tRNA, an amino acid. The Genetic Code is thus the translation scheme between codons and amino acids. [NOTE: another way to describe the genetic code is in terms of a computer algorithm]. Below is a rather unique way of viewing the genetic code. It is an excellent way of visualizing the number of redundancies in the code.
 
Genetic Code (circular chart)
 

The genetic code is redundant, which means that there are multiple codons (3 nucleotides) that specify the same amino acid. For example, around the 12 o’clock position of the above chart, you see the amino acid glycine. The codons GGU, GGC, GGA and GGG all specify Glycine. Phenylalanine is specified by UUU and UUC. There are only a few amino acids, such as methionine, that are specified by a single codon (in the case of methionine it is AUG).
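If it helps to see the redundancy counted out, the sketch below builds the standard genetic code as a lookup table and groups the codons by the amino acid they specify. The one-letter amino acid abbreviations (G for glycine, F for phenylalanine, M for methionine, "*" for stop) and the variable names are not from the reading above; they are a common compact notation chosen here for illustration.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, listed in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...). "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every 3-base combination with its amino acid letter.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid they specify.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print(len(CODON_TABLE))   # total number of codons
print(codons_for["G"])    # glycine
print(codons_for["F"])    # phenylalanine
print(codons_for["M"])    # methionine
```

Printing len(codons_for[aa]) for each amino acid gives a quick picture of which ones have four or six codons and which have only one.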

The presence of redundancies means that some alterations in the gene sequence have no effect on the protein. For example, changing GGU to GGA does not change the specified amino acid (Glycine). This is a silent mutation. Changing UUC (Phenylalanine) to UUA (Leucine) is a point mutation that does change the amino acid and may cause a problem, but both Leucine and Phenylalanine are hydrophobic, so the variation may be minor. Changing CAC to CAG, though, has more impact, as you are changing histidine, which can carry a positive charge, to the polar but uncharged glutamine. Remember, changing amino acids can easily change the way a protein folds. REMEMBER: The genetic code has redundancies, which will limit some problems with mutations.
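The three codon changes discussed above can be checked with a short sketch. It rebuilds the standard genetic code table so that it runs on its own, then reports the amino acid before and after each change. The function name compare_codons and the one-letter amino acid codes (G glycine, F phenylalanine, L leucine, H histidine, Q glutamine) are illustrative choices, not part of the reading.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG codon order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def compare_codons(before, after):
    """Return (old amino acid, new amino acid, True if the change is silent)."""
    old_aa = CODON_TABLE[before]
    new_aa = CODON_TABLE[after]
    return old_aa, new_aa, old_aa == new_aa


# The three examples from the paragraph above.
for before, after in [("GGU", "GGA"), ("UUC", "UUA"), ("CAC", "CAG")]:
    old_aa, new_aa, silent = compare_codons(before, after)
    label = "silent" if silent else "amino acid changed"
    print(before, "->", after, ":", old_aa, "->", new_aa, "(" + label + ")")
```

Note that the code can only tell you whether the amino acid changed. Judging how much a change matters (hydrophobic for hydrophobic versus charged for uncharged) still takes knowledge of the side chains.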

Below is a more classic way to represent the genetic code, in the form of a table. The way the table is arranged, you can easily see the various redundancies in the system. In both representations, notice that there are three codons that specify STOP. These stop codons, UAA, UAG, and UGA are essential for the termination of protein synthesis. In the image below, you will notice AUG has been tagged as the initiation (start) codon. All protein synthesis begins with the code AUG. We will talk more about this later in the week.
Genetic Code
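Here is a minimal sketch of how the start and stop codons frame the reading of an mRNA: find the first AUG, read three bases at a time, and stop at UAA, UAG, or UGA. The function name translate, the one-letter amino acid output, and the example mRNA are assumptions made for illustration. Real translation involves the ribosome and tRNAs and is more involved than a simple string search.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG codon order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read codons from the first AUG until a stop codon; return one-letter amino acids."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated in this toy model
    protein = []
    # Step through the message three bases at a time, using complete codons only.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: protein synthesis terminates
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNA, written 5' to 3': CC AUG UUU GGU CAC UAA GG
example_mrna = "CCAUGUUUGGUCACUAAGG"
print(translate(example_mrna))
```

For the example message, the codons read after the leading CC are AUG, UUU, GGU, CAC and then the stop codon UAA, so the output is methionine, phenylalanine, glycine, histidine in one-letter form.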


Daily Challenge

Today helps to set the stage for our discussion of the central dogma of biology (gene expression).  In reading, you find that DNA holds the codes to make various types of RNA and Proteins.  Most of the time, what concerns us is the production of proteins, as they will add functionality to our cells.
At the heart of the Central Dogma is the genetic code. This code shows how you move from the language of nucleic acids to the language of proteins (aka, amino acids). This code is Universal and Non-Ambiguous, but what does that mean? Your goal today is to read, in your text and in the optional reading, and reflect on the concept of gene expression and the genetic code. Why is it so important? How do we use it? How does this influence concepts ranging from hormonal changes at puberty to evolution and genetic engineering?

Nucleotides

 

Nucleic Acids

Nucleotide Structure: The following image from Wikipedia’s image gallery shows the basic structure of the nucleotide and the five nitrogenous bases.

The central component of all nucleotides will be a pentose sugar (5-carbon sugar). We will either see ribose or 2′-deoxyribose as the sugar (in deoxyribose, the 2′ carbon carries one less oxygen than in ribose). Off of the 5′ carbon of the sugar, you will find a phosphate group attached, while on the 1′ carbon, you will find a nitrogenous base. [NOTE: remember the numbering of carbon atoms in carbohydrates from yesterday? Do you see why the numbering is important?]
There are five nitrogenous bases, divided into two categories: Purines and Pyrimidines. Notice that the purines are a composite of two ring structures, while the pyrimidines are a single ring structure. When you take organic chemistry and biochemistry, the importance and complexity of these ring structures will be further discussed. At present, just become aware of their respective shapes and sizes (and the inclusion of nitrogen).

As with amino acids, the nucleotide contains a functional group: the nitrogenous base. Just like the side chain in an amino acid, the nitrogenous base will play an important part in the function of this biomolecule. The Sugar-Phosphate then becomes the backbone of the molecule (like the Amino-Chiral Carbon-Carboxyl of an amino acid). We will see in later weeks that the sugar-phosphates of nucleotides create the strands of DNA and RNA. The nitrogenous bases then play an informational role.

Base Complementarity:

The nucleic acids are referred to as informational biomolecules (biopolymers). This is because the sequence of nucleotides carries information on how to build RNA and Proteins. One of the central foundations of genetics (i.e., how it all works), is base complementarity. Here we are looking at the interactions between purines and pyrimidines:

A links with T through 2 hydrogen bonds.

G links with C through 3 hydrogen bonds.

A to T G to C

U has the binding properties of T, but is only found in RNA.
T is never found in RNA, only DNA.
NOTE: base complementarity is a critical concept to remember. All genetic processes rely on base complementarity!
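The pairing rules above are simple enough to write down as a lookup table. The sketch below is only an illustration (the function names are made up): it pairs each base of a DNA strand with its DNA partner, pairs it with its RNA partner (U in place of T), and adds up the hydrogen bonds using 2 per A-T pair and 3 per G-C pair.

```python
# Pairing rules from the list above.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# RNA partner for each DNA base: U takes the place of T.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Hydrogen bonds formed by the pair each base belongs to.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def dna_complement(dna):
    """Return the DNA base that pairs with each base, position by position."""
    return "".join(DNA_PAIR[base] for base in dna)


def rna_complement(dna):
    """Return the RNA base that pairs with each DNA base, position by position."""
    return "".join(RNA_PAIR[base] for base in dna)


def hydrogen_bonds(dna):
    """Count the hydrogen bonds holding this strand to its DNA partner."""
    return sum(H_BONDS[base] for base in dna)


strand = "ATGC"
print(dna_complement(strand))
print(rna_complement(strand))
print(hydrogen_bonds(strand))
```

One thing the bond count makes visible: a stretch rich in G and C is held together by more hydrogen bonds than an A and T rich stretch of the same length.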

Directionality

When we get to genetics, we will be talking about the directionality of the nucleic acids. For example, we will talk about DNA being built in the 5′ to 3′ direction. This is in reference to the carbon atoms in the ribose or deoxyribose. The 5′ carbon holds a phosphate, while the 3′ carbon holds an open -OH (hydroxyl) group. This concept of directionality is critical, and you are warned to learn how it works, and what the terms represent.
As with all biopolymers, monomers are added together through dehydration synthesis, and separation is through hydrolysis. When synthesis occurs, the 5′ phosphate of the incoming nucleotide links to the 3′ -OH at the end of the growing strand, forming a phosphodiester bond.
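Directionality matters as soon as you write down the partner of a strand. The sketch below assumes the standard fact, not spelled out above, that the two strands of a double helix run in opposite directions (antiparallel). Under that assumption, the partner of a strand written 5′ to 3′ must be reversed before it can also be read 5′ to 3′. The function name reverse_complement is an illustrative choice.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna_5_to_3):
    """Return the partner strand, also written 5' to 3'.

    Assumes the two strands are antiparallel, so the partner is read in the
    opposite direction: walk the input backwards and pair each base.
    """
    return "".join(DNA_PAIR[base] for base in reversed(dna_5_to_3))


strand = "ATGC"                      # 5'-ATGC-3'
partner = reverse_complement(strand)
print(partner)                       # partner strand, written 5' to 3'
```

Applying the function twice returns the original strand, which is a handy check that the pairing table and the reversal are consistent.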


Daily Challenge

The challenge today is to understand the history of the discovery of DNA.  Look up the following researchers and read about their discovery, how it was done, and the importance of the discovery.  In addition, watch the TED Talk from James Watson “How we discovered DNA.”  What are your impressions?

Carbohydrates and Molecular Interactions

I would like you to bookmark a website that will be helpful for you as you move through biology and biochemistry: Molecular Interactions. This website is a product of Loren Dean Williams’s lab at Georgia Tech, and is an excellent resource to help you understand the molecular interactions that allow for the structure and function of biomolecules. To start, look at the following sections of the page:


Carbohydrates

While carbohydrates are mainly used as chemical energy storage, carbohydrates are also used as modifiers of proteins and in forming cellular receptors and anchors. One of your goals is to gain a good understanding of the structure of carbohydrates, and a little about their naming.
A topic that will come up throughout the semester is how carbons are numbered in carbohydrates. This is important as we will find carbohydrates being components of other monomers (such as nucleotides), and again when we move through carbohydrate catabolism. The following image shows the linear form of glucose, and the two possible cyclic (pyranose ring) isomers.

http://images.tutorcircle.com/cms/images/44/glucose.png

 

The formation is based on aldehyde chemistry so we will leave some of this discussion to organic chemistry and biochemistry. For our purpose this semester, what is important is that we number carbons from the aldehyde. Notice in the above diagram that carbon 1 is to the left of the oxygen, we go around to carbon 5, and then carbon 6 is outside of the ring. If you see the expression 3′, it is referring to the third carbon. 5′ the fifth carbon. 6′ the sixth carbon, and so forth.
Notice also, that when the ring was formed, there were differences in the groups coming off of carbon 1. These differences are important and can influence how the sugar is metabolized. We say that these different forms are isomers (if you don’t know what an isomer is, look it up and add the definition to your notebook).
One critical difference comes when linking two monosaccharides together to form disaccharides and polysaccharides. For instance, here is maltose:

Maltose
https://chemstory.files.wordpress.com/2013/06/dokeo14.png

This is an α 1-4 glycosidic linkage. We have an α glucose (look at carbon 1) bound from its carbon 1 to carbon 4 of the second glucose. Since the glucose on the left-hand side is α at the 1 carbon, we form an α linkage. In comparison, look at cellobiose:

https://biochemphilic.files.wordpress.com/2013/03/cellobios.gif
 
Cellobiose has a β 1-4 glycosidic linkage. The designation of β comes from the sugar unit that donates carbon 1 to the bond.
 

So, what is the big deal? Maltose is digestible by humans; cellobiose is not. Just this slight isomeric difference changes the metabolism.

All carbohydrate monomers are connected through glycosidic linkages, whether it is a disaccharide, oligosaccharide or a polysaccharide. Make sure that you learn the different types of carbohydrates.


Molecular Visualization

Here are four different ways of visualizing α-D-Glucose. The first is a ball-n-stick model, with the black spheres representing carbon, the red oxygen, and the grey hydrogen. This is a useful way to begin understanding the 3-D orientation and structure of the molecule. If you notice, the other models do not always explicitly state where the carbon is located. Instead, the points where lines intersect are where you would find carbon. This is done to create a simplified diagram that allows you to see the geometry of the molecule.
https://upload.wikimedia.org/wikipedia/commons/8/8a/Alpha-D-Glucose.png
Glucose model – rotatable in 3 dimensions is a good place to go to gain a good visual impression of glucose.
 
Please start gaining a good visual of the biomolecules.

Challenge

Draw the structure for Ribose and Deoxyribose.   Number the carbon atoms.
Structurally, what is different between ribose and deoxyribose?
What are the differences in the electrochemistry of the molecule (for example, where do you find charged and partially charged regions)?