Tuesday, November 13, 2012


Fetch files or sequences

The tools in this section fetch things such as files or sequences, e.g., from the Web.
To use a script, cut and paste the code from the light green or blue box into a terminal window, change the bold, red text as needed, and hit Enter.
See More Information for notes on using these tools.

Fetch a sequence from a popular Internet database

Fetch a sequence from a popular Internet database (fetch_sequence_web)

Gets a sequence with a given id from a given database. (The database must be one of: swiss, genbank, genpept, embl, refseq.) The format of the fetched sequence (fasta by default) can be embl, fasta, gcg, genbank, swiss, or a whole bunch of other formats: see The Bioperl SeqIO HOWTO for details.
This script requires Bioperl to be installed on whichever machine the script runs on. Many biology computers will have it installed. If the script breaks because it "can't locate Bio/Perl.pm", you can download Bioperl from bioperl.org.
$database: Database name
$id: Identifier
$format: Format to write sequence in
seq.fasta: Output file
perl -MBio::Perl -e ' $database="embl"; $id="AI129902"; $format="fasta"; $sequence = get_sequence($database, $id); write_sequence(">-", $format, $sequence); warn "Wrote $database sequence $id in $format format\n"; ' > seq.fasta
Example: Get the AI129902 sequence from EMBL in FASTA format, and put it in seq.fasta by running the above script.
Output file (seq.fasta):
 >AI129902; qc41b07.x1 Soares_pregnant_uterus_NbHPU Homo sapiens cDNA [etc.]
 CTCCGCGCCAACTCCCCCCACCCCCCCCCCACACCCC
Screen output:
 Wrote embl sequence AI129902 in fasta format
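If you fetch sequences often, it may be handy to wrap the one-liner in a shell function so the database, id, format, and output file become arguments. The sketch below is only one way to do that: the function name, the argument order, and the default output filename are assumptions made here for illustration, not part of the original script. It checks the database name against the five names listed above before calling Bioperl, so a typo is caught without touching the network.

```shell
# Hypothetical wrapper around the fetch_sequence_web one-liner.
# Arguments (order is an assumption): database, id, format, output file.
fetch_sequence_web() {
  database=$1
  id=$2
  format=${3:-fasta}              # fasta is the default format
  outfile=${4:-seq.$format}       # assumed default output filename

  # The database must be one of the five names the script accepts.
  case "$database" in
    swiss|genbank|genpept|embl|refseq) ;;
    *)
      echo "Unknown database '$database' (expected swiss, genbank, genpept, embl, or refseq)" >&2
      return 2
      ;;
  esac

  # Same Bioperl calls as the one-liner; values arrive through @ARGV.
  perl -MBio::Perl -e '
    ($database, $id, $format) = @ARGV;
    $sequence = get_sequence($database, $id);
    write_sequence(">-", $format, $sequence);
    warn "Wrote $database sequence $id in $format format\n";
  ' "$database" "$id" "$format" > "$outfile"
}
```

With this defined, running fetch_sequence_web embl AI129902 fasta seq.fasta should do the same thing as the example above. It still needs Bioperl and an Internet connection.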

Get a file from the Web

Fetch a file from the web (fetch_file_web)

Given an http or ftp address, get a file and store it in a given filename. This assumes you have an Internet connection, the file exists, etc. If something breaks, it should print an error message.
$web_file: Web address
$store: Name of file to save to
perl -MLWP::Simple -e ' $web_file="ftp://ftp.ncbi.nih.gov/genbank/GB_Release_Number"; $store="GB.txt"; if (is_success(getstore($web_file, $store))) { warn "Downloaded $web_file into $store\n"; } else { warn "Error downloading $web_file\n" } '
Example: Run the above script to download the current GenBank release number to a file GB.txt. The resulting file will contain one line, giving the release number (as of this writing, 151).
Example 2: Download the NCBI home page by setting $web_file to "http://ncbi.nih.gov" and $store to "ncbi.html".
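When a download fails, it can help to see why. getstore returns a numeric response code, and LWP::Simple also exports status_message from HTTP::Status, which turns that code into text. The sketch below wraps the one-liner in a shell function (the function name and argument order are assumptions for illustration) and adds the code and its message to the error line.

```shell
# Hypothetical wrapper around the fetch_file_web one-liner.
# Arguments (order is an assumption): web address, file to save to.
fetch_file_web() {
  if [ "$#" -ne 2 ]; then
    echo "usage: fetch_file_web WEB_ADDRESS FILE_TO_SAVE_TO" >&2
    return 2
  fi

  perl -MLWP::Simple -e '
    ($web_file, $store) = @ARGV;
    $status = getstore($web_file, $store);   # numeric response code
    if (is_success($status)) {
      warn "Downloaded $web_file into $store\n";
    } else {
      # Report the code and its text so the failure is easier to diagnose.
      warn "Error downloading $web_file: $status ", status_message($status), "\n";
      exit 1;
    }
  ' "$1" "$2"
}
```

For instance, fetch_file_web "http://ncbi.nih.gov" ncbi.html would repeat Example 2. The function returns a nonzero status when the download fails, so it can be chained with other commands.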
