https://www.youtube.com/watch?v=zdWDM-o9ZXU
/mnt/1T-5e7/mycodehtml/Biology/SangSooKim/Bioinformatics_basic/009/001_BlastP_to_Gene_Tree/main.html
================================================================================
Run BLASTP program at NCBI
- Query: human EGF receptor (EGFR), canonical isoform
- Database: RefSeq proteins
- Exclude model sequences (XM/XP)
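
The same search settings can be written down as a request to NCBI's BLAST URL API. This is a minimal sketch that only builds the form-encoded request body and sends nothing. The accession NP_005219.2 is an assumption for the canonical human EGFR isoform, and the function name is hypothetical. The "exclude model sequences" checkbox is left out because its URL API parameter is not assumed here.

```python
from urllib.parse import urlencode

# Endpoint of the NCBI BLAST URL API (shown for reference; no request is made here).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastp_request(query: str, database: str = "refseq_protein") -> str:
    """Return the form-encoded body of a BLASTP submission (hypothetical helper)."""
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastp",   # protein query against a protein database
        "DATABASE": database,  # RefSeq proteins
        "QUERY": query,        # accession or raw amino acid sequence
    }
    return urlencode(params)


# Assumed accession for human EGFR, canonical isoform.
body = build_blastp_request("NP_005219.2")
print(body)
```

Posting this body to BLAST_URL would return a request ID to poll for results; the web form used in these notes does the same thing interactively.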
================================================================================
click
================================================================================
click
================================================================================
Length: 1210 amino acids
================================================================================
Protein sequence
================================================================================
How to get the protein sequence:
Click FASTA
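
Once the FASTA text is copied, the sequence length can be checked in a few lines. This is a minimal sketch of a FASTA reader; the record below is a made-up toy sequence, not the real EGF receptor, and read_fasta is a hypothetical helper name.

```python
def read_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Toy record for illustration only (not the real EGFR sequence).
toy_fasta = """>toy_protein made-up example
MKTAYIAKQR
QISFVKSHFS
"""

sequences = read_fasta(toy_fasta)
for name, seq in sequences.items():
    print(name, len(seq))
```

Run on the real FASTA record, the printed length should agree with the 1210 amino acids shown on the protein page.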
================================================================================
Click BLAST
================================================================================
BLAST is a program that compares a query sequence against the sequences in a database and reports the similar ones
Goal here: compare the human EGF receptor sequence against the RefSeq protein sequences
================================================================================
Click
================================================================================
blastn: compares a nucleotide query against nucleotide sequences
blastp: compares a protein query against protein sequences
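
The choice between the two programs follows from the alphabet of the query. This is a rough sketch, assuming that a sequence made only of A, C, G, T, U, N is nucleotide; guess_program is a hypothetical helper, and a short protein that happens to use only those letters would be misclassified.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_program(sequence: str) -> str:
    """Pick 'blastn' for nucleotide-looking input, otherwise 'blastp'."""
    letters = set(sequence.upper())
    # Heuristic only: every letter must belong to the nucleotide alphabet.
    return "blastn" if letters <= NUCLEOTIDE_LETTERS else "blastp"


print(guess_program("ATGCGTACGT"))   # nucleotide-looking toy input
print(guess_program("MKTAYIAKQR"))   # protein-looking toy input
```

For the EGF receptor protein sequence, the answer is blastp.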
================================================================================
Change the database to RefSeq proteins
================================================================================
Exclude "model sequences" (XM/XP): RefSeq records predicted by automated annotation rather than curated
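
Model records can be recognized by their accession prefix, so the same exclusion can be applied to a list of hits after the fact. This is a minimal sketch; the accessions below are made-up placeholders, and is_model is a hypothetical helper.

```python
# RefSeq prefixes for model (predicted) records: XM_ = mRNA, XR_ = non-coding RNA, XP_ = protein.
MODEL_PREFIXES = ("XM_", "XR_", "XP_")


def is_model(accession: str) -> bool:
    """True if the accession belongs to a predicted (model) RefSeq record."""
    return accession.startswith(MODEL_PREFIXES)


# Made-up accessions for illustration only.
hits = ["NP_000001.1", "XP_000002.1", "NP_000003.1"]
curated = [acc for acc in hits if not is_model(acc)]
print(curated)
```

Only the NP_ entries remain, which mirrors what the checkbox does on the search form.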
================================================================================
Exclude "uncultured/environmental sample sequences"
Uncultured/environmental sample sequences: sequences obtained directly from environmental samples, from organisms (mostly microbes) that have not been grown in the lab
================================================================================
click
================================================================================
Comparison result
================================================================================
Recall that your query is 1210 amino acids long
================================================================================
Meaning: 1210 amino acids of this hit match your input query protein sequence
================================================================================
Mouse: percent identity is only 90%
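
Percent identity is the number of identical aligned positions divided by the alignment length. This is a minimal sketch on a made-up 10-residue alignment (not the real human/mouse EGFR alignment); percent_identity is a hypothetical helper.

```python
def percent_identity(query_aln: str, sbjct_aln: str) -> float:
    """Identical positions as a percentage of the alignment length (gaps count as length)."""
    if len(query_aln) != len(sbjct_aln):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(q == s and q != "-" for q, s in zip(query_aln, sbjct_aln))
    return 100.0 * matches / len(query_aln)


# Toy alignment: one substitution (I -> V) in ten positions.
print(percent_identity("MKTAYIAKQR", "MKTAYVAKQR"))
```

One mismatch in ten positions gives 90% here; for the real alignment the same ratio is taken over the full aligned length.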
================================================================================
Click that hit to see the alignment
Query: your input EGF receptor sequence
Sbjct: the mouse EGF receptor sequence (the database hit)
================================================================================
Positions where the two sequences differ
================================================================================
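Meaning of +: the two residues differ, but the substitution is conservative (positive score in the substitution matrix)
Example: valine (V) and isoleucine (I) differ only by one extra carbon (a CH2 unit) in the side chain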
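
The middle line of a BLAST alignment can be reproduced from the two aligned sequences and a substitution matrix. This is a minimal sketch, assuming the BLOSUM62 matrix and using only a small hand-copied subset of its positive scores; the sequences are made up and midline is a hypothetical helper.

```python
# Small subset of positive BLOSUM62 scores (assumed matrix; not the full table).
POSITIVE_PAIRS = {
    frozenset("VI"): 3,
    frozenset("IL"): 2,
    frozenset("KR"): 2,
    frozenset("DE"): 2,
    frozenset("ST"): 1,
    frozenset("FY"): 3,
}


def midline(query_aln: str, sbjct_aln: str) -> str:
    """Letter for an identity, '+' for a positive-scoring substitution, space otherwise."""
    out = []
    for q, s in zip(query_aln, sbjct_aln):
        if q == s and q != "-":
            out.append(q)
        elif POSITIVE_PAIRS.get(frozenset((q, s)), 0) > 0:
            out.append("+")
        else:
            out.append(" ")
    return "".join(out)


query = "MKVAD"
sbjct = "MKIAG"
print(query)
print(midline(query, sbjct))
print(sbjct)
```

The V/I column prints a +, while the D/G column, which is not in the positive subset, prints a space.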
================================================================================
Gap location: dashes (-) mark positions where one sequence has no residue aligned to the other
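
Gap positions can be listed directly from an aligned sequence. This is a minimal sketch on a made-up string; gap_runs is a hypothetical helper that returns (start index, length) for each run of dashes, with 0-based indices.

```python
def gap_runs(aligned: str) -> list[tuple[int, int]]:
    """Return (start, length) for every run of '-' in an aligned sequence."""
    runs = []
    start = None
    for i, ch in enumerate(aligned):
        if ch == "-":
            if start is None:
                start = i                      # a new gap opens here
        elif start is not None:
            runs.append((start, i - start))    # the gap closed just before i
            start = None
    if start is not None:                      # gap running to the end
        runs.append((start, len(aligned) - start))
    return runs


print(gap_runs("MK--AYI-K"))  # → [(2, 2), (7, 1)]
```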
================================================================================
EGF receptor isoform a
EGF receptor isoform b
They are alternative splicing products of the same gene
================================================================================
Click taxonomy reports
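
A taxonomy report is essentially a tally of hits per organism. This is a minimal sketch, assuming hit titles end with the organism name in square brackets as RefSeq protein titles usually do; the titles below are illustrative, not copied from the actual result, and organism_counts is a hypothetical helper.

```python
import re
from collections import Counter

# Organism name inside the last pair of square brackets of a hit title.
ORGANISM = re.compile(r"\[([^\[\]]+)\]\s*$")


def organism_counts(titles: list[str]) -> Counter:
    """Count hits per organism from titles of the form '... [Genus species]'."""
    counts = Counter()
    for title in titles:
        match = ORGANISM.search(title)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative titles only.
titles = [
    "epidermal growth factor receptor isoform a precursor [Homo sapiens]",
    "epidermal growth factor receptor isoform b precursor [Homo sapiens]",
    "epidermal growth factor receptor precursor [Mus musculus]",
]
counts = organism_counts(titles)
for organism, n in counts.most_common():
    print(organism, n)
```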
================================================================================
Click distance tree of results
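
A distance tree starts from pairwise distances between the aligned sequences. This is a minimal sketch of that first step only, using the simple p-distance (fraction of differing positions); NCBI's tree viewer uses its own distance models and tree-building methods, which are not reproduced here. The three sequences are made up and the names are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy aligned sequences for illustration only.
aligned = {
    "human": "MKTAYIAKQR",
    "mouse": "MKTAYVAKQR",
    "fish":  "MRTSYVAKHR",
}

distances = {
    (n1, n2): p_distance(aligned[n1], aligned[n2])
    for n1, n2 in combinations(aligned, 2)
}
closest = min(distances, key=distances.get)
print(distances)
print("closest pair:", closest)
```

A tree-building method would join the closest pair first (human and mouse in this toy data) and then attach the more distant sequence, which is the pattern the distance tree displays graphically.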
================================================================================