1
Introduction to Gene Mining
Part B: How similar are plant and human versions of a gene?
After completing part B, you will demonstrate how to use NCBI BLASTp and www.Araport.org data to determine whether Arabidopsis thaliana and human muscle protein genes and gene products are homologous.
Each amino acid is represented by a particular letter.
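As a small illustration of the one-letter code, the sketch below maps each of the 20 standard amino acids to its three-letter abbreviation. The helper name `expand` and the five-residue example string are hypothetical and are not taken from the ACTA1 sequence.

```python
# Standard one-letter codes for the 20 common amino acids,
# mapped to their three-letter abbreviations.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def expand(sequence: str) -> str:
    """Spell out a one-letter protein sequence with three-letter codes."""
    return "-".join(ONE_TO_THREE[letter] for letter in sequence.upper())


# Hypothetical fragment, for illustration only.
print(expand("MKTAY"))
```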
25
Navigate to the BLASTp link on NCBI.
26
Paste the protein sequence for ACTA1 into the query box.
Enter Arabidopsis thaliana in the Organism field to restrict the search to that species.
Select blastp as the algorithm and then click on the BLAST button.
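The same search can in principle be set up programmatically. The sketch below only builds a request URL and does not send it. It assumes the parameter names of NCBI's BLAST URL API (`CMD`, `PROGRAM`, `DATABASE`, `QUERY`, `ENTREZ_QUERY`) as I understand them, so check the current NCBI documentation before relying on it. The choice of the `nr` database and the short query string are assumptions made for illustration; paste the real ACTA1 protein sequence in practice.

```python
from urllib.parse import urlencode

# Base address of the NCBI BLAST web service.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastp_request(protein_sequence: str, organism: str) -> str:
    """Build (but do not send) a blastp submission URL limited to one organism."""
    params = {
        "CMD": "Put",                              # submit a new search
        "PROGRAM": "blastp",                       # protein query vs. protein database
        "DATABASE": "nr",                          # assumption: non-redundant protein set
        "ENTREZ_QUERY": f"{organism}[Organism]",   # restrict hits to this organism
        "QUERY": protein_sequence,
    }
    return f"{BLAST_URL}?{urlencode(params)}"


# Placeholder sequence, NOT the real ACTA1 protein.
url = build_blastp_request("MKTAYIAK", "Arabidopsis thaliana")
print(url)
```

The web form used in the slides remains the simplest route for a single query; a scripted request like this is mainly useful when many sequences need the same search.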
27
The BLASTp report is similar to the BLASTn report.
Query sequence
28
“Descriptions” shows 4 actins with the same query coverage, E-value and Ident! There appear to be 4 possible homologous proteins, but which is most similar to the human ACTA1 protein?
29
There are a number of actin proteins with high Query coverage, very low E-values and high identity. Check them all (for some whose numbers are represented more than once, check the first listing). Then select “Multiple Alignment” to directly compare those sequences.
30
Conserved amino acids are shown in red. Which differences can you find quickly?
Can you spot a deletion?
Where is an amino acid replaced by a chemically similar type?
Where is an amino acid replaced by a chemically different type?
31
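The questions about the multiple alignment (deletions, chemically similar replacements, chemically different replacements) can also be answered by a short script. This is a minimal sketch under stated assumptions: the two sequences are already aligned with `-` marking gaps, "chemically similar" means belonging to the same group in one common textbook grouping of side chains, and the example strings are hypothetical rather than real actin sequences.

```python
from collections import Counter

# One common textbook grouping of amino acids by side-chain chemistry.
# Other groupings exist; this one is an assumption for illustration.
GROUPS = {
    "nonpolar": set("GAVLIMP"),
    "aromatic": set("FWY"),
    "polar": set("STCNQ"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def group_of(residue: str) -> str:
    """Return the chemical group of a residue, or 'unknown'."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    return "unknown"


def classify_column(a: str, b: str) -> str:
    """Classify one alignment column as gap, identical, similar or different."""
    if a == "-" or b == "-":
        return "gap"
    if a == b:
        return "identical"
    if group_of(a) == group_of(b) and group_of(a) != "unknown":
        return "similar"
    return "different"


def summarize(seq_a: str, seq_b: str) -> dict:
    """Count column types across two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return dict(Counter(classify_column(a, b) for a, b in zip(seq_a, seq_b)))


# Hypothetical aligned fragments, for illustration only.
print(summarize("MKDEL-TAV", "MRDQLGTAI"))
```

A gap column shows that one sequence has a residue the other lacks; whether that is read as a deletion in one protein or an insertion in the other depends on which sequence is taken as the reference.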
Protein sequence homology is analyzed by constructing a Distance tree of results. Check the desired
“hits”, then select “Distance tree”.
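To give a feel for what a distance tree is built from, the sketch below computes a simple pairwise distance (the fraction of differing residues, often called p-distance) between a query and each hit. This is a swapped-in simplification: the BLAST tree view applies its own distance model and tree-building method, which are not reproduced here. The sequence names and fragments are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)


def closest_to(query_name: str, aligned: dict) -> tuple:
    """Return the name of the sequence nearest the query, plus all distances."""
    query = aligned[query_name]
    distances = {
        name: p_distance(query, seq)
        for name, seq in aligned.items()
        if name != query_name
    }
    return min(distances, key=distances.get), distances


# Hypothetical aligned fragments, for illustration only.
toy_alignment = {
    "query": "MKDELTAV",
    "hit_1": "MKDELTAI",
    "hit_2": "MRDQLTGI",
}
nearest, distances = closest_to("query", toy_alignment)
print(nearest, distances)
```

Smaller distances correspond to shorter paths between two proteins in the tree; a tree-building method then groups the sequences so that branch lengths approximate these pairwise values.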
32
Query—human ACTA1 protein
Nodes represent a shared ancestral gene
These proteins are all homologs.
33
34
Of the proteins in Arabidopsis thaliana, ACT7 has the highest identity (88%) and lowest E-value (0.0) when compared to human ACTA1.
A gene tree program predicts the presence of ancestral genes between ACT7 and ACTA1.
Is that sufficient to confirm protein homology for experimental modeling?
35
A more restricted alignment between human ACTA1 and the three closest Arabidopsis proteins can check that ACT7 is the protein closest to the ancestral gene.
Check “Align two or more sequences”, then copy and paste the protein sequences for ACT7, ACT8 and ACT2 into the Subject Sequence box.
36
Multiple alignment results for human ACTA1 protein and the 3 closest Arabidopsis proteins.
37
What do the distance tree results indicate?
38
Do you have enough data to use the Arabidopsis ACT7 gene as a model for the human ACTA1 gene?
Discuss and report your ideas.
39
What criteria from published work indicated that these plant processes and human diseases involved homologous genes or proteins?
40
Homologous proteins will have:
• Very low E-values for sequence alignment (< 0.00001)
• >25% conserved sequences for >100 aa*
• Protein-protein interactions of one homolog which are similar to protein-protein interactions of the other homolog
• Similar co-expression of genes for each homolog
• Similar Function Gene Ontology (GO terms)
• Conserved sequences and protein domains*
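The two sequence-based criteria in the list above can be expressed as a simple check. This is a sketch under stated assumptions: ">25% conserved sequences" is interpreted here as percent identity, the `Hit` class and function name are hypothetical, and the alignment lengths in the examples are placeholders. The 88% identity and E-value of 0.0 for ACT7 are the values reported in the BLASTp results earlier in the slides.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLASTp hit, reduced to the fields the criteria use."""
    name: str
    evalue: float
    percent_identity: float
    alignment_length: int  # number of aligned amino acids


def passes_sequence_criteria(
    hit: Hit,
    max_evalue: float = 1e-5,     # "< 0.00001" from the list above
    min_identity: float = 25.0,   # ">25% conserved", read as percent identity
    min_length: int = 100,        # ">100 aa"
) -> bool:
    """True if the hit meets the E-value and conservation thresholds."""
    return (
        hit.evalue < max_evalue
        and hit.percent_identity > min_identity
        and hit.alignment_length > min_length
    )


# Identity and E-value from the slides; alignment length is a placeholder.
act7 = Hit("ACT7", evalue=0.0, percent_identity=88.0, alignment_length=370)
# Entirely hypothetical weak hit, for contrast.
weak = Hit("weak_hit", evalue=0.5, percent_identity=30.0, alignment_length=150)

print(passes_sequence_criteria(act7), passes_sequence_criteria(weak))
```

Passing these thresholds addresses only the first two criteria; interactions, co-expression, GO terms and domains still have to be checked separately.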
Let’s find homology information and data about the Arabidopsis ACT7 gene at http://www.Araport.org. Use the pull-down menu to access the ThaleMine tool.
Enter information about your gene of interest, in this case ACT7.
43
Results show 1 gene, 2 articles and 1 mRNA in the database.
We are only interested in studying the gene for now, so we will select the Gene category or just select the identifier for the gene from the list at right.
44
This is the Gene information sheet for the Arabidopsis thaliana ACT7 gene. How did the function listed under Curator Summary compare to your
previous prediction?
45
The blue bar under Curator Summary has tabs that take you quickly to each section further down the page. Click on the Homology tab.
Links to information about human homologs of ACT7.
46
Homologous proteins will have:
• Very low E-values for sequence alignment (< 0.00001)
• >25% conserved sequences for >100 aa*
• Protein-protein interactions of one homolog which are similar to protein-protein interactions of the other homolog
• Similar co-expression of genes for each homolog
• Similar Function Gene Ontology (GO terms)
• Conserved protein domains
Compare the first (human ACTA1) and second (Arabidopsis ACT7) sequences in each alignment: it is evident that far more than 25% of the amino acids align in any 100-residue stretch of any region.
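The claim that every 100-residue stretch exceeds 25% can be checked with a sliding window over the alignment. The sketch below assumes two pre-aligned sequences of equal length with `-` for gaps and counts only identical residues as conserved. The toy strings are hypothetical and use a window of 4 so the numbers are easy to follow by hand; a real check would use `window=100` on the full ACTA1 and ACT7 alignment.

```python
def window_identities(seq_a: str, seq_b: str, window: int = 100) -> list:
    """Percent identity for every window of the given width along an alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if len(seq_a) < window:
        raise ValueError("alignment is shorter than the window")
    # 1 where the residues match and are not gaps, else 0.
    matches = [int(a == b and a != "-") for a, b in zip(seq_a, seq_b)]
    running = sum(matches[:window])
    result = [100.0 * running / window]
    for i in range(window, len(matches)):
        # Slide the window one column to the right.
        running += matches[i] - matches[i - window]
        result.append(100.0 * running / window)
    return result


# Hypothetical aligned fragments with a small window, for illustration only.
values = window_identities("MKDELTAV", "MKDQLTGI", window=4)
print(values, "lowest window:", min(values))
```

If the lowest window value stays above 25, the conservation criterion holds along the whole alignment and not only on average.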
48
Homologous proteins will have:
• Very low E-values for sequence alignment (< 0.00001)
• >25% conserved sequences for >100 aa*
• Protein-protein interactions of one homolog which are similar to protein-protein interactions of the other homolog
• Similar co-expression of genes for each homolog
• Similar Function Gene Ontology (GO terms)
• Conserved protein domains
ACT7 and ACTA1 proteins each interact with a variety of other proteins. Because the same protein may have one name in plants and a different name in animals, further investigation is needed to tell from these data whether ACTA1 and ACT7 interact with the same proteins.
Arabidopsis ACT7 interacts with these proteins
Human ACTA1 interacts with these proteins
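Once plant and animal names have been reconciled, comparing the two interaction lists is a set operation. The sketch below assumes a name-mapping table is available from some source. All protein names here are placeholders and not real interactors of ACT7 or ACTA1, and the function name is hypothetical.

```python
def shared_interactors(plant: set, human: set, plant_to_human: dict) -> tuple:
    """Translate plant interactor names to human names, then measure the overlap.

    Plant proteins with no entry in the mapping are left out of the comparison.
    Returns the shared names and the Jaccard index (shared / combined).
    """
    translated = {plant_to_human[p] for p in plant if p in plant_to_human}
    shared = translated & human
    combined = translated | human
    jaccard = len(shared) / len(combined) if combined else 0.0
    return shared, jaccard


# Placeholder names only; these are NOT real interaction data.
plant_partners = {"plantP1", "plantP2", "plantP3"}
human_partners = {"humanH1", "humanH2", "humanH4"}
name_map = {"plantP1": "humanH1", "plantP2": "humanH2"}

shared, score = shared_interactors(plant_partners, human_partners, name_map)
print(sorted(shared), round(score, 2))
```

A low score may reflect an incomplete name mapping as much as genuinely different interaction partners, which is why the slide asks for further investigation.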
51
Homologous proteins will have:
• Very low E-values for sequence alignment (< 0.00001)
• >25% conserved sequences for >100 aa*
• Protein-protein interactions of one homolog which are similar to protein-protein interactions of the other homolog ??
• Similar co-expression of genes for each homolog
• Similar Function Gene Ontology (GO terms)
• Conserved protein domains
Have we found a suitable plant research model for nemaline myopathy?
What additional information would you want? Scientific literature searches for Arabidopsis information are easy to access through the apps at http://www.Araport.org: 50 years of Arabidopsis research!