https://www.youtube.com/watch?v=zdWDM-o9ZXU
/mnt/1T-5e7/mycodehtml/Biology/SangSooKim/Bioinformatics_basic/009/001_BlastP_to_Gene_Tree/main.html
================================================================================
Run the BLASTP program at NCBI
- Query: human EGF receptor, canonical form
- Database: RefSeq proteins
- Exclude model sequences (XM/XP)
================================================================================
The protein is 1210 amino acids long
================================================================================
Protein sequence
================================================================================
How to get the protein sequence
Click FASTA
================================================================================
Click BLAST
================================================================================
BLAST is a program that compares a query sequence with the sequences in a database
Note your goal: compare the "human EGF receptor sequence" with the "RefSeq protein sequences"
================================================================================
blastn: compares nucleotide sequences
blastp: compares protein sequences
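A minimal sketch of the two checks described so far: reading a FASTA record to confirm its length, and telling nucleotide input (blastn) from protein input (blastp). The function names `read_fasta` and `pick_blast_program` are hypothetical helpers, and the toy record is made up; it is not the real EGF receptor sequence.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def pick_blast_program(sequence):
    """Return 'blastn' if every letter looks like a nucleotide, else 'blastp'."""
    nucleotide_letters = set("ACGTUN")
    return "blastn" if set(sequence.upper()) <= nucleotide_letters else "blastp"


# Made-up toy record (assumption for illustration), NOT the real EGFR sequence.
toy_fasta = """>toy_protein hypothetical example
MKTLLVAG
WQERP
"""

records = read_fasta(toy_fasta)
header, seq = records[0]
print(header, len(seq), pick_blast_program(seq))
```

Running the same length check on the FASTA text downloaded from NCBI should report the 1210 residues noted above.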
================================================================================
Change the database
================================================================================
Exclude "model sequences"
================================================================================
Exclude "uncultured/environmental sample sequences"
uncultured/environmental sample sequences: sequences from microbes that cannot be grown in the lab
================================================================================
Comparison result
================================================================================
Note that your query length was 1210 amino acids
================================================================================
Meaning: 1210 amino acids match your input query protein sequence
================================================================================
For mouse, the identity is only 90%
================================================================================
Click that hit
Query: your input EGF receptor sequence
Sbjct: the mouse EGF receptor sequence
================================================================================
Positions where the sequences differ
================================================================================
Meaning of +
The two residues differ but are chemically similar (a conservative substitution)
Example: amino acids V (valine) and I (isoleucine) differ by only one methyl group
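The middle line of a BLAST alignment and the identity percentage can be reproduced on a toy example. This is a sketch under stated assumptions: `SIMILAR_PAIRS` is a small hand-picked subset of residue pairs that score positively in BLOSUM62 (BLAST itself uses a full scoring matrix), and the two aligned strings are made up rather than taken from the human/mouse result.

```python
# Small illustrative subset of residue pairs with a positive BLOSUM62 score.
# Assumption for the demo: a real BLAST run consults the full matrix.
SIMILAR_PAIRS = {
    frozenset("IV"), frozenset("IL"), frozenset("LM"),
    frozenset("KR"), frozenset("DE"), frozenset("FY"), frozenset("ST"),
}


def midline(query, sbjct):
    """Build the line shown between Query and Sbjct: letter, '+', or space."""
    if len(query) != len(sbjct):
        raise ValueError("aligned sequences must have the same length")
    marks = []
    for q, s in zip(query, sbjct):
        if q == "-" or s == "-":
            marks.append(" ")          # gap column
        elif q == s:
            marks.append(q)            # identical residue
        elif frozenset((q, s)) in SIMILAR_PAIRS:
            marks.append("+")          # similar residue (conservative change)
        else:
            marks.append(" ")          # dissimilar residue
    return "".join(marks)


def percent_identity(query, sbjct):
    """Identical columns divided by alignment length (gap columns included)."""
    identical = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    return 100.0 * identical / len(query)


# Made-up aligned pair (hypothetical), with one gap in the query.
query = "MKVLA-GDE"
sbjct = "MKILWQGEE"

print("Query ", query)
print("      ", midline(query, sbjct))
print("Sbjct ", sbjct)
print(f"Identity: {percent_identity(query, sbjct):.1f}%")
```

The V/I column is marked with `+`, while the gap column and the dissimilar A/W column are left blank.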
================================================================================
Gap location
================================================================================
EGF receptor isoform a
EGF receptor isoform b
The two isoforms arise from alternative splicing
================================================================================
Click "Taxonomy reports"
================================================================================
Click "Distance tree of results"
================================================================================
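A distance tree is built from pairwise distances between the aligned sequences. The sketch below computes one simple distance, the p-distance (fraction of differing residues over ungapped columns), and finds the closest pair, which is the pair a tree-building method would typically join first. It is not NCBI's implementation, whose tree view has its own distance and tree-method settings. The names `p_distance` and `distance_table` and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing residues over columns where neither has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue                   # skip gap columns
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_table(aligned):
    """Map each pair of names (sorted) to its p-distance."""
    return {
        (m, n): p_distance(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }


# Made-up toy alignment; names and residues are hypothetical.
toy = {
    "seq_a": "MKVLAGDE",
    "seq_b": "MKILAGDE",
    "seq_c": "MRILSG-E",
}

table = distance_table(toy)
closest = min(table, key=table.get)
for pair, dist in sorted(table.items()):
    print(pair, round(dist, 3))
print("closest pair:", closest)
```

Sequences with a small distance, such as close orthologs, end up as neighbors in the tree, while more divergent sequences branch off earlier.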