Differences Between the JCVI Aug 2011 and sift-dna.org Nov 2011 Databases
Background:
My group released earlier versions of the SIFT genome databases at JCVI (November 2009 and earlier). I moved from JCVI to GIS in 2010.
In Aug 2011, JCVI independently released a SIFT database (SIFT 4.0.3b).
My assessment of JCVI Aug 2011 (4.0.3b) database:
Conclusion
If you would still prefer to use JCVI's databases rather than our more recent versions of SIFT, this is what I'd recommend:
SIFT Predictions on Amino Acid Substitutions
Predictions from the JCVI Aug 2011 database had the expected accuracy (based on the HumDiv and HumVar datasets). Its performance on HumDiv and HumVar was similar to that of the sift-dna Nov 2011 database.
Please note:
Scores will differ between the JCVI and sift-dna databases for two reasons: 1) scores are highly dependent on the protein sequence database used to retrieve homologous sequences, and different protein databases were used; and 2) different SIFT versions were used. The first is a major factor. The second is a minor one: SIFT 4.0.4 used a newer version of BLAST, but the core algorithm did not change.
Scores are not expected to correlate between the two databases because scores are scaled probabilities, so a score of 0.25 has the same interpretation as a score of 0.75. Correlation is therefore not a relevant metric.
sift-dna.org has higher coverage due to the addition of RefSeq, Ensembl, and CCDS predictions. The figure below shows that sift-dna.org has annotations for an additional 1.95 million missense positions and that chrY predictions are missing from JCVI's database.
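To illustrate the note on scores, here is a minimal sketch of comparing the two databases by agreement of predictions instead of by correlation of raw scores. It assumes the commonly used SIFT cutoff of 0.05 for calling a substitution deleterious (the cutoff is not stated in the text above, and the handling of a score exactly at the boundary is left aside). The function names and all score values are hypothetical and are not taken from either database.

```python
# Assumption: 0.05 is used as the cutoff; scores below it are called
# "deleterious", all others "tolerated". Not stated in the text above.
THRESHOLD = 0.05


def call(score: float) -> str:
    """Turn a SIFT score into a categorical prediction."""
    return "deleterious" if score < THRESHOLD else "tolerated"


def concordance(scores_a, scores_b) -> float:
    """Fraction of substitutions that receive the same call in both lists."""
    pairs = list(zip(scores_a, scores_b))
    agree = sum(call(a) == call(b) for a, b in pairs)
    return agree / len(pairs)


# Hypothetical scores for the same five substitutions in each database.
jcvi = [0.25, 0.01, 0.60, 0.03, 0.90]
sift_dna = [0.75, 0.02, 0.10, 0.00, 0.30]

# 0.25 and 0.75 are numerically far apart but lead to the same call.
print(call(0.25), call(0.75))
print(concordance(jcvi, sift_dna))
```

In this made-up example the raw scores differ substantially between the two lists, yet every substitution receives the same call, which is why agreement of predictions is the more meaningful comparison.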