"The reference genome was like a dictionary when we announced it," Ma said. "Each gene was like a single word. However, there was a piece of critical information lacking: transcription initiation sites for individual genes."
Transcription initiation sites are locations in the DNA where the cell's transcription machinery, RNA polymerase recruited by specialized transcription-factor proteins, can attach and then build an mRNA copy of the adjacent gene. That mRNA is read and translated at a cell's ribosomes to create proteins, which are important for the chemical and physical function of every organism.
Knowing where the mRNA begins formation on the DNA strand is a significant part of understanding how genes are expressed. These initiation sites contain regulatory elements and provide information to the cell about when and where to transcribe each gene to make protein, and how frequently to do so at any point in time.
In genetics, it has generally been accepted that each gene has one transcription initiation site, located downstream of a core promoter region and typically around a TATA box, a DNA sequence rich in thymine and adenine repeats. But Ma and his colleagues no longer think this is the case.