BLAST
BLAST (Basic Local Alignment Search Tool) is an algorithm for searching primary sequence databases. It compares a DNA or protein sequence of interest against other known sequences to find regions of similarity between them [1].
The results of a BLAST search include a graphic summary showing how much of the query sequence aligns with each sequence hit from the database. The database sequences that produce significant alignments are listed in a table, which describes each hit, for example the protein that the sequence codes for. Each hit has an accession number; clicking on it shows the source organism of the protein and gives links to papers on the protein. The E value (Expect value) is a measure of statistical significance: it gives the number of hits with an equal or better score that would be expected to occur by chance. A lower E value therefore indicates a better match, because the alignment is less likely to be a chance result.
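As a rough illustration of why a lower E value means a more significant hit, the sketch below uses the commonly quoted approximation E = m × n × 2^(−S′), where m is the query length, n is the total database length and S′ is the bit score of the alignment. The function names and the 0.05 cutoff are hypothetical choices for this example, and the sketch ignores the length corrections that BLAST itself applies, so the numbers will not match a real report exactly.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E value from a bit score: E = m * n * 2**(-S').

    Simplified sketch: real BLAST uses corrected (effective) lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def is_significant(e_value: float, threshold: float = 0.05) -> bool:
    """Return True if the hit passes a (hypothetical) E value cutoff."""
    return e_value <= threshold


if __name__ == "__main__":
    # Same query and database, two different alignment scores.
    for score in (30.0, 60.0):
        e = expect_value(score, query_len=300, db_len=10**9)
        print(f"bit score {score}: E = {e:.3g}, significant = {is_significant(e)}")
```

Raising the bit score by one halves the E value, which is why high-scoring alignments are so unlikely to be chance hits.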
The BLAST web page can be found at: http://blast.ncbi.nlm.nih.gov/Blast.cgi
Basic BLAST Programs
There are five BLAST programs, chosen according to the type of query sequence and the type of database being searched:
Program | Description |
Nucleotide blast (blastn) | Search a nucleotide database using a nucleotide query |
Protein blast (blastp) | Search a protein database using a protein query |
blastx | Search a protein database using a translated nucleotide query |
tblastn | Search a translated nucleotide database using a protein query |
tblastx | Search a translated nucleotide database using a translated nucleotide query |
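The table above can be read as a simple lookup from query type and database type to program name. The following is a minimal sketch of that lookup; the function name, the string labels and the `translate_both` flag are hypothetical and are not part of any BLAST interface.

```python
def choose_blast_program(query_type: str, db_type: str,
                         translate_both: bool = False) -> str:
    """Pick a BLAST program name from the query and database types.

    query_type and db_type are "nucleotide" or "protein".
    translate_both selects tblastx, which translates both the
    nucleotide query and the nucleotide database.
    """
    key = (query_type, db_type)
    if translate_both:
        if key != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    programs = {
        ("nucleotide", "nucleotide"): "blastn",
        ("protein", "protein"): "blastp",
        ("nucleotide", "protein"): "blastx",
        ("protein", "nucleotide"): "tblastn",
    }
    if key not in programs:
        raise ValueError(f"unknown combination: {key}")
    return programs[key]


if __name__ == "__main__":
    print(choose_blast_program("nucleotide", "protein"))
```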
The blastp program compares an amino acid query sequence against a protein sequence database, and blastn compares a nucleotide query sequence against a nucleotide sequence database. With blastx you can compare a nucleotide query sequence, translated in all six reading frames, against a protein sequence database. With tblastn you can find similarities between a protein query sequence and nucleotide sequences from the database, which are dynamically translated in all six reading frames. Finally, tblastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database [1]. Links to these programs can be found in the table above.
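To make "translated in all reading frames" concrete, here is a small sketch of a six-frame translation using the standard genetic code. It is an illustration only, not the code BLAST uses; the function names and the frame labels (+1 to +3 for the forward strand, -1 to -3 for the reverse complement) are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Translate a DNA sequence in three forward and three reverse frames."""
    frames = {}
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

A translated search compares each of these six protein sequences rather than the nucleotide sequence itself, so a match can be found whichever strand and frame the coding region lies in.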
A low-complexity sequence is a region of biased composition that can cause problems when searching for sequence similarity. Such regions can often be spotted visually because they are usually repetitive, e.g. CCCCCCCGGCCCCCCGGGG. They should be filtered out of the search because they can produce matches that reflect the biased composition rather than homology, giving misleading results.
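The idea of filtering low-complexity regions can be sketched with a sliding-window Shannon entropy measure, shown below. This is a simplified stand-in for illustration and is not the filtering method BLAST itself uses; the function names, the window size of 12 and the entropy cutoff of 1.5 bits are all assumptions chosen for this example.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the letter composition of a window."""
    total = len(window)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(window).values()
    )


def mask_low_complexity(seq: str, window: int = 12,
                        min_entropy: float = 1.5,
                        mask_char: str = "N") -> str:
    """Replace every window whose entropy is below min_entropy with mask_char.

    Sequences shorter than the window are returned unmasked.
    """
    seq = seq.upper()
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < min_entropy:
            masked[start:start + window] = mask_char * window
    return "".join(masked)


if __name__ == "__main__":
    print(mask_low_complexity("CCCCCCCGGCCCCCCGGGG"))
    print(mask_low_complexity("ACGTACGTACGTACGT"))
```

The repetitive example sequence contains only C and G, so no window can exceed 1 bit of entropy and the whole sequence is masked, while a sequence that uses all four bases evenly is left unchanged.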
Specialised BLAST Programs
There is also the option to perform more specialised BLAST searches, such as a primer BLAST search. Links for these searches can be found below the main BLAST search links.
References
- [1] http://blast.ncbi.nlm.nih.gov/Blast.cgi
- [2] Information for the table taken from http://blast.ncbi.nlm.nih.gov/Blast.cgi