Similarity between the language of nature and the language of computers

I attended the 2014 summer school at the Center for Computational Biology and Bioinformatics (CCBB) at the University of Texas at Austin. As a geneticist, I was overwhelmed by my first intense course on computer programming languages. The summer school at CCBB helped me a great deal in understanding bioinformatics (the interdisciplinary field combining biology and computer science). More importantly, on the second day of class, when the instructor talked about ASCII (American Standard Code for Information Interchange), the code of ones and zeros that makes up binary or computer language, it occurred to me (and it is very apparent to anyone who is a biologist or geneticist) that the similarity between the basic structure of the language of nature, A, T, G and C (four letters or nucleotides), and the language of computers (1 and 0) is fascinating.

Let me first explain the way
nature's language (the genetic code) works. DNA is a long chain, a linear polymer, of four nucleotides: A, T, G and C (adenine, thymine, guanine and cytosine). A specific sequence of these nucleotides, generally thousands of them, makes a gene. This sequence of nucleotides, a gene, codes for a specific protein that has a specific function in the cells of living beings. Humans have about 20,000 different genes (the exact number is still debated) that code for hundreds of thousands of proteins using various combinations of nucleotides, and these make us human. Every well-defined living thing on planet Earth is coded by different combinations of the nucleotides A, T, G and C, making different genes (sometimes similar between and among species) and producing the incredible diversity we find on this planet: millions of species of plants, animals and microorganisms such as bacteria, fungi and viruses.
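To make the idea concrete, here is a minimal sketch (my own illustration, not from any bioinformatics library) of DNA modeled as a string over the four-letter alphabet {A, T, G, C}, tallying each nucleotide and building the complementary strand (A pairs with T, G pairs with C):

```python
from collections import Counter

# The base-pairing rules of the double helix as a lookup table.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(dna: str) -> str:
    """Return the base-paired (complementary) strand of a DNA string."""
    return "".join(COMPLEMENT[base] for base in dna)

dna = "ATGCTAGCG"
print(Counter(dna))     # tally of each of the four nucleotides
print(complement(dna))  # -> TACGATCGC
```

The four letters behave exactly like the symbols of any other alphabet, which is what makes DNA so amenable to computational analysis.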
Source: "RNA-codons" by TransControl, licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons (http://commons.wikimedia.org/wiki/File:RNA-codons.png). U (uracil) in RNA is T (thymine) in DNA.
To be more specific, the way the genetic code works is that nucleotide sequences are read in triplets (known as codons). For example, if the sequence of a gene is ATGCTAGCGGCTAATCCGTACTAGATACCGAATAG, ATG would be read as its first codon, which codes for a specific amino acid (amino acids are the building blocks of proteins). ATG is a very special code because it is generally the place where protein synthesis begins (the start codon), i.e. it indicates the start of a gene. Similarly, there are codons that tell where protein synthesis should stop (stop codons), marking the end of a gene along the long DNA molecule. The sequence between the start and stop codons codes for different proteins with different metabolic functions. By reading these start and stop codons along an organism's genome using computer programs, bioinformaticians predict the number of genes in that genome.
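The start/stop logic described above can be sketched in a few lines of code. This is a toy illustration of my own, not a real gene-prediction tool (real predictors are far more sophisticated): it scans for the start codon ATG, then reads non-overlapping triplets until it hits one of the stop codons TAA, TAG or TGA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_first_orf(dna: str):
    """Return the first open reading frame (start codon through stop codon),
    or None if no complete frame is found."""
    start = dna.find("ATG")
    if start == -1:
        return None  # no start codon anywhere in the sequence
    # Read the sequence in triplets, in frame with the start codon.
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            return dna[start:i + 3]
    return None  # no in-frame stop codon found

gene = "ATGCTAGCGGCTAATCCGTACTAGATACCGAATAG"
print(find_first_orf(gene))  # -> ATGCTAGCGGCTAATCCGTACTAG
```

Running it on the example sequence from the paragraph above, the reading stops at the first in-frame TAG, exactly as the codon-by-codon description predicts.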
The start codon ATG (AUG in RNA) indicates the start of a gene, and a stop codon such as TAA (UAA in RNA) indicates the termination of a gene.
Source: http://leavingbio.net/heredity-higher%20level.htm
Genetic code chart. Source:
http://courses.bio.indiana.edu/L104-Bonner/F11/imagesF11/Genetics_MPs/MPs.html
The genetic code works in a very similar way to computer code/ASCII. In computer codes, combinations of zeros and ones make up the codes for letters or for instructions given through keys. To be more specific, if you type F on your computer, the binary code in which the computer stores and recognizes it is 1000110. Apart from the letters of the alphabet, ASCII defines binary codes for numbers and special characters. Not only that, ASCII even defines binary codes for the space between words and for a new line.
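You can see this for yourself in a few lines of code. This small sketch prints the ASCII number and seven-bit binary code for the letter F, the space character, and the newline character mentioned above:

```python
# Each character has an ASCII number, which the computer stores in binary.
# ord() gives the number; the 07b format spec renders it as seven bits.
for char, label in [("F", "letter F"), (" ", "space"), ("\n", "newline")]:
    print(f"{label:>8}: ASCII {ord(char):>3} -> binary {ord(char):07b}")
```

The letter F comes out as 70 in decimal, or 1000110 in binary, matching the code given in the paragraph above; space and newline get codes 32 and 10.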
A chart of ASCII from a 1972 printer manual. Source: Wikipedia,
http://en.wikipedia.org/wiki/ASCII
Even with this fundamentally similar principle (the use of simple notations to code for more complex things) behind ASCII and the genetic code, it is surprising to me that ASCII was not inspired by the genetic code; rather, ASCII was developed from manual telegraphic codes dating from well before the discovery of the genetic code. It is interesting to note that the technology we have developed and evolved is based on a principle similar to the one by which we ourselves evolved. It is very tempting to speculate that, just as the simple A, T, G and C have given rise to intelligence in the form of extremely complex brains through millions of years of evolution, computer code might give rise to artificial intelligence, given sufficient time and training of the machines. Proponents of intelligent design might say that just as ASCII, which was created by humans, could give rise to artificial intelligence, we humans are ourselves the creation of something superior; however, I do not subscribe to or advocate this idea.