The current SIFT web server is hosted here, where it is publicly accessible to any user. To spare users the hassle of installing SIFT standalone and downloading its massive database, we have developed a web service. SIFT Web Service is built on the Amazon cloud and can scale up to handle a large number of SIFT jobs. With it, SIFT users can now submit batch jobs from the command line without installing SIFT standalone. By the end of this guide, you will be able to run SIFT smoothly through the Web Service.
1. Download SIFTexome_nssnvs.sh.tar.gz containing the script and config files
2. Place it in your Unix/Linux working directory. You may use WinSCP to do this. Alternatively, run wget in your working directory:
wget http://sift.bii.a-star.edu.sg/downloads/SIFTexome_nssnvs.sh.tar.gz
3. Extract the files by doing:
tar zxvf SIFTexome_nssnvs.sh.tar.gz
This will extract the compressed file into the current directory. You should see two files: SIFTexome_nssnvs.sh and SIFTexome_nssnvs.conf.
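Before moving on, it can help to confirm that the extraction really produced both files. The following is a minimal sketch only: check_extracted is a hypothetical helper name and is not part of the SIFT download.

```shell
#!/bin/sh
# check_extracted DIR FILE...
# Hypothetical helper (not shipped with SIFT): returns 0 only if every
# named file exists in DIR and is non-empty.
check_extracted() {
    dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example, run in the directory where the archive was extracted:
#   check_extracted . SIFTexome_nssnvs.sh SIFTexome_nssnvs.conf \
#       && chmod +x SIFTexome_nssnvs.sh
```

The chmod +x in the example is only a precaution in case the execute bit was lost; it is harmless if the script is already executable.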
4. SIFTexome_nssnvs.sh is the main program used to submit a job to the web service. It takes one parameter, the configuration file SIFTexome_nssnvs.conf, which is the only file you need to edit.
5. Let us move on to editing SIFTexome_nssnvs.conf.
Open SIFTexome_nssnvs.conf with your favourite editor, e.g. vi or emacs.
This is what you will see when you open the file:
As shown above, comment lines begin with '#'. They are not processed, so you do not need to touch them.
Editing SIFTexome_nssnvs.conf
The first entry, ORGANISM, can be set to ORGANISM='hg19' or ORGANISM='hg18', depending on the human database you intend to run SIFT on.
FILE1 is the path to your input file. For example, if your SIFT input file is in your home directory, set it to FILE1='/home/username/siftformatfile'. Download a SIFT test file.
EMAIL will contain your email address.
FINALOUTDIR='/path_to_output_directory/output'. Change the path to the directory where you want the output to be retrieved.
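The four entries above can be sanity-checked before a job is submitted. This is a rough sketch under stated assumptions: the helper names conf_value and check_conf are hypothetical, and the parsing assumes each entry sits on its own line in the KEY='value' form shown in the examples above. The web service may apply further checks of its own.

```shell
#!/bin/sh
# conf_value KEY FILE
# Prints the value of the last uncommented KEY=... line in FILE,
# with one pair of surrounding quotes removed.
conf_value() {
    sed -n "s/^[[:space:]]*$1=//p" "$2" | tail -n 1 |
        sed "s/^['\"]//; s/['\"][[:space:]]*\$//"
}

# check_conf FILE
# Returns 0 if ORGANISM, FILE1, EMAIL and FINALOUTDIR look usable.
check_conf() {
    conf=$1
    failed=0
    organism=$(conf_value ORGANISM "$conf")
    case $organism in
        hg18|hg19) ;;
        *) echo "ORGANISM must be hg18 or hg19, got '$organism'" >&2; failed=1 ;;
    esac
    input=$(conf_value FILE1 "$conf")
    if [ ! -f "$input" ] || [ ! -r "$input" ]; then
        echo "FILE1 is not a readable file: '$input'" >&2; failed=1
    fi
    email=$(conf_value EMAIL "$conf")
    case $email in
        ?*@?*) ;;
        *) echo "EMAIL does not look like an address: '$email'" >&2; failed=1 ;;
    esac
    outdir=$(conf_value FINALOUTDIR "$conf")
    case $outdir in
        ''|/path_to_output_directory/*)
            echo "FINALOUTDIR is empty or still the placeholder" >&2; failed=1 ;;
    esac
    return "$failed"
}

# Example:
#   check_conf ./SIFTexome_nssnvs.conf && echo "config looks ready"
```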
Following that, you will see a long list of options.
These correspond to the checkboxes you see on the SIFT website.
Here, instead of ticking checkboxes, set an option to '1' if you want to display the additional results. For example, to display multiple rows for multiple transcripts, set MULTI_TRANSCRIPTS=1. Leave the values at the default '0' to run SIFT with default settings.
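If you prefer not to open an editor for a single switch, an option can also be flipped from the command line. The sketch below assumes each option appears on its own uncommented line as KEY=0 or KEY=1, as in the MULTI_TRANSCRIPTS=1 example; set_option is a hypothetical helper name and is not part of the SIFT scripts.

```shell
#!/bin/sh
# set_option KEY VALUE FILE
# Rewrites an existing uncommented "KEY=..." line to "KEY=VALUE".
# Fails if FILE has no such line. Intended for simple values such as
# 0 or 1 (VALUE must not contain '|' or '&').
set_option() {
    key=$1
    value=$2
    file=$3
    if ! grep -q "^$key=" "$file"; then
        echo "no $key= line in $file" >&2
        return 1
    fi
    scratch=$(mktemp) || return 1
    # Write to a scratch file first, then copy back so the original
    # file keeps its permissions.
    sed "s|^$key=.*|$key=$value|" "$file" > "$scratch" && cat "$scratch" > "$file"
    status=$?
    rm -f "$scratch"
    return "$status"
}

# Example:
#   set_option MULTI_TRANSCRIPTS 1 ./SIFTexome_nssnvs.conf
```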
The last line of SIFTexome_nssnvs.conf contains the URL of our Web Service. Do not change it.
7. Save the changes you have made to SIFTexome_nssnvs.conf and exit the editor. We are now ready to submit our job to the Cloud server!
8. Navigate to the directory containing SIFTexome_nssnvs.sh. We are going to send the job request by doing:
./SIFTexome_nssnvs.sh ./SIFTexome_nssnvs.conf
9. The job has been submitted to our job queue and will be automatically processed and retrieved upon completion.
10. The time to completion varies with the size of your file, the nature of the job and the number of SIFT users on the Web Service at that moment.
You will see a string of processing text similar to below:
A message will be printed to screen informing you of the JobID. We recommend that you copy it down. Upon completion you will see "Job Completed". The script will exit with exit status 0. Move on to check out your output in your output directory!
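Because the JobID is only printed to the screen, one option is to keep a copy of everything the script prints and act on its exit status. The wrapper below is a sketch, not part of the SIFT distribution: run_logged and the log file name are hypothetical, and it assumes the submission script needs no interactive input.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND, saves everything it prints (stdout and stderr) to LOGFILE,
# then shows the log and returns COMMAND's own exit status.
run_logged() {
    log=$1
    shift
    "$@" > "$log" 2>&1
    status=$?
    cat "$log"
    return "$status"
}

# Example:
#   run_logged sift_job.log ./SIFTexome_nssnvs.sh ./SIFTexome_nssnvs.conf \
#       && echo "exit status 0, the JobID is recorded in sift_job.log"
```

With this approach the output appears only once the script has finished; to watch progress in the meantime, follow the log from a second terminal with tail -f sift_job.log.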
Hopefully, by this point you have successfully run your first SIFT Web Service tool!
Supported formats
Supported format: SIFT format. Variant files (e.g. Pileup, VCF4, Maq, SOAP, GFF3, CASAVA, cg) can be easily converted to SIFT format using this.
For more examples of SIFT format, please refer to SIFT format. Maximum file size: < 5 MB.
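The size limit can be checked locally before submitting. This sketch assumes that "5 MB" means 5 * 1024 * 1024 bytes; the service may count differently, so treat files close to the limit with caution. within_size_limit is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed limit: 5 MB interpreted as 5 * 1024 * 1024 bytes.
MAX_BYTES=$((5 * 1024 * 1024))

# within_size_limit FILE
# Returns 0 if FILE is a regular file smaller than MAX_BYTES,
# 1 if it is too large, 2 if it is not a regular file.
within_size_limit() {
    if [ ! -f "$1" ]; then
        echo "not a file: $1" >&2
        return 2
    fi
    # Arithmetic expansion strips the padding some wc versions print.
    size=$(($(wc -c < "$1")))
    [ "$size" -lt "$MAX_BYTES" ]
}

# Example:
#   within_size_limit /home/username/siftformatfile || echo "split the file first"
```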
Files larger than 5 MB will result in the following when submitting a job:
SIFT nonsynonymous single nucleotide variants tool
Test data
We also provide a sample data file for SIFT users to test-run. This file is derived from 1000 Genomes and then converted to SIFT format. Download it and start running!
User will assume all risks for using SIFT Web Service, including but not limited to, unsupported or unreplicated research or medical data and information.
User will use the information and software provided at their own risk.
The current SIFT web server is hosted here, where it is publicly accessible to any user. To spare users the hassle of installing SIFT standalone and downloading its massive database, we have developed a web service. SIFT Web Service is built on the Amazon cloud and can scale up to handle a large number of SIFT jobs. With it, SIFT users can now submit batch jobs from the command line without installing SIFT standalone. By the end of this guide, you will be able to run SIFT smoothly through the Web Service.
1. Download SIFTBLink.tar.gz containing the script and config files
2. Place it in your Unix/Linux working directory. You may use WinSCP to do this. Alternatively, run wget in your working directory:
wget http://sift.bii.a-star.edu.sg/downloads/SIFTBLink.tar.gz
3. Extract the files by doing:
tar -zxvf SIFTBLink.tar.gz
This will extract the compressed file into the current directory. You should see two files: SIFTBLink.sh and SIFTBLink.conf.
4. SIFTBLink.sh is the main program used to submit a job to the web service. It takes one parameter, the configuration file SIFTBLink.conf, which is the only file you need to edit.
5. As with all the other SIFT web service scripts, we first have to update the configuration file to our liking. Let us move on to it.
Open SIFTBLink.conf with your favourite editor, e.g. vi or emacs.
This is what you will see when you open the file:
As shown above, comment lines begin with '#'. They are not processed, so you do not need to touch them.
Editing SIFTBLink.conf
EMAIL will contain your email address, for example EMAIL='email@email.com'.
GI_NUMBER takes an NCBI GI number or a RefSeq ID. The default value is 22209009, for gi:22209009. Change it to the GI of interest.
Similarly, you can try it with a RefSeq ID, e.g. NP_665861. We accept both formats. If you are unsure how to get the GI number, head over to the SIFT BLink input format FAQ to find out how.
SUBSTITUTION_FILE is the full path to your substitution file (What is it?). If you do not want to input any specific substitutions, we suggest creating an empty file.
For example, in Unix I would type this to make an empty file:
touch empty.file
This creates an empty file named empty.file in the current working directory.
Follow that up by setting SUBSTITUTION_FILE='/path/to/file/empty.file'.
*If you have problems creating this file, go to Downloads to get one.
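Since SUBSTITUTION_FILE expects a full path, the two steps above can be combined so that the absolute path is worked out for you. This is only a sketch: empty_substitution_line is a hypothetical helper name, and it simply prints the line to paste into SIFTBLink.conf.

```shell
#!/bin/sh
# empty_substitution_line DIR
# Creates DIR/empty.file if it does not exist yet and prints the matching
# SUBSTITUTION_FILE line with an absolute path. Refuses to continue if a
# non-empty file of that name is already there.
empty_substitution_line() {
    target_dir=$(cd "$1" && pwd) || return 1
    touch "$target_dir/empty.file" || return 1
    if [ -s "$target_dir/empty.file" ]; then
        echo "$target_dir/empty.file already exists and is not empty" >&2
        return 1
    fi
    printf "SUBSTITUTION_FILE='%s'\n" "$target_dir/empty.file"
}

# Example:
#   empty_substitution_line .
```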
SEQUENCES_TO_SELECT takes the value SEQUENCES_TO_SELECT='ALL' or SEQUENCES_TO_SELECT='BEST'. 'ALL' includes sequences from all BLAST hits, while 'BEST' takes only the best BLAST hits for the calculation.
SEQ_IDENTITY_FILTER takes a value from 0 to 100. It is used to remove sequences that are more than SEQ_IDENTITY_FILTER percent identical to the query. The default is 90 (%).
Last but not least, FINALOUTDIR='/path_to_output_directory/output'. Change the path to the directory where you want the output to be retrieved.
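The value ranges described above can be checked with a few small functions before the job is submitted. This is a sketch under assumptions the guide does not confirm: the function names are hypothetical, the RefSeq pattern (two capital letters, an underscore, digits, optional version suffix) is inferred from the NP_665861 example, 'ALL' and 'BEST' are treated as case-sensitive, and SEQ_IDENTITY_FILTER is assumed to be a whole number.

```shell
#!/bin/sh
# valid_query ID
# Accepts a bare GI number (digits only) or a RefSeq-shaped accession.
valid_query() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]+|[A-Z]{2}_[0-9]+(\.[0-9]+)?)$'
}

# valid_selection VALUE
# Accepts exactly ALL or BEST.
valid_selection() {
    [ "$1" = ALL ] || [ "$1" = BEST ]
}

# valid_identity VALUE
# Accepts whole numbers from 0 to 100.
valid_identity() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    # The length check keeps very long digit strings away from the
    # numeric comparison.
    [ "${#1}" -le 3 ] && [ "$1" -le 100 ]
}

# Example:
#   valid_query NP_665861 && valid_selection BEST && valid_identity 90 \
#       && echo "values look plausible"
```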
7. We have successfully edited SIFTBLink.conf! Save the changes you made to SIFTBLink.conf and exit the editor. We are now ready to submit our job to the Cloud server!
8. Navigate to the directory containing SIFTBLink.sh. We are going to send the job request by doing:
./SIFTBLink.sh ./SIFTBLink.conf
9. The job has been submitted to our job queue and will be automatically processed and retrieved upon completion.
You will see a string of processing text similar to below:
A message will be printed to screen informing you of the JobID. We recommend that you copy it down. Upon completion you will see "Job Completed". The script will exit with exit status 0. Move on to check out your output in your output directory!
Hopefully, by this point you have successfully run your first SIFT Web Service tool!
Supported formats
Supported format: GI number (e.g. 22209009) or RefSeq ID (e.g. NP_665861), as described above.
SIFT BLink tool
Test data
We also provide a sample three-line substitution file for SIFT users to test-run. More detail on this substitution file is available here. Download it and start running!
Here is an empty file if you do not want to run with any specific substitution.