Featured Article
NCBI is phasing out sequence GIs - use Accession.Version instead!
As of September 2016, the integer sequence identifiers known as "GIs" will no longer be included in the GenBank, GenPept, and FASTA formats supported by NCBI for sequence records. The FASTA header will be further simplified to report only the sequence accession.version and record title for accessions managed by the International Sequence Database Collaboration (INSDC) and NCBI’s Reference Sequence (RefSeq) project. As NCBI makes this transition, we encourage any users whose workflows depend on GIs to begin planning to use accession.version identifiers instead. After September 2016, any processes solely dependent on GIs will no longer function as expected.
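For workflows that parse FASTA headers, one way to prepare is to read the accession.version from both header styles, so the same code keeps working before and after the change. The sketch below is only an illustration: the function name parse_header and the example record are hypothetical, and it assumes the common pipe-delimited legacy layout (>gi|GI|db|accession.version| title) and the simplified layout (>accession.version title). Headers from other sources, such as PDB entries or local identifiers, may not match either pattern.

```python
import re

# Assumed legacy layout: >gi|<GI>|<db tag>|<accession.version>| <title>
LEGACY = re.compile(r"^>gi\|(\d+)\|[a-z]+\|([^|\s]+)\|\s*(.*)$")
# Assumed simplified layout: >accession.version title
SIMPLE = re.compile(r"^>([A-Za-z0-9_]+\.\d+)\s+(.*)$")


def parse_header(line):
    """Return (accession_version, title) from either header style.

    The GI, if present, is deliberately ignored so that downstream
    code depends only on accession.version.
    """
    line = line.rstrip("\n")
    match = LEGACY.match(line)
    if match:
        return match.group(2), match.group(3)
    match = SIMPLE.match(line)
    if match:
        return match.group(1), match.group(2)
    raise ValueError("unrecognized FASTA header: " + line)


if __name__ == "__main__":
    # Hypothetical headers for illustration only, not real records.
    old_style = ">gi|123456|ref|NM_000000.1| Example record title"
    new_style = ">NM_000000.1 Example record title"
    print(parse_header(old_style))
    print(parse_header(new_style))
```

Keying downstream lookups on the returned accession.version rather than on the GI means the legacy pattern can simply be dropped once GI-bearing headers are no longer encountered.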
RefSeq release 75 is now available
RefSeq release 75 is accessible via FTP and through NCBI’s programming utilities. This full release incorporates genomic, transcript and protein data available as of March 7, 2016 and includes 92,936,289 records, 61,034,675 proteins, 14,035,988 RNAs, and sequences from 58,776 organisms.
March 23, 2016: NCBI to offer workshop for advanced SRA and dbGaP users
On March 23 at 12 PM EST, NCBI staff will present a workshop for advanced users of SRA and dbGaP who are interested in using public datasets, and:
Search for WGS Sequences using Stand-alone BLAST!
It is now much easier to search WGS (Whole Genome Shotgun) with stand-alone BLAST on your own computer. New tools from the NCBI allow you to BLAST just the WGS projects you are interested in. You can also search a taxonomic subset of WGS (e.g., all human or all bacterial sequences). These new tools for WGS make the existing search mechanism obsolete.
As of August 5, 2016, the current single WGS BLAST database will be retired from the NCBI FTP site and BLAST server. We suggest moving to the new tools as soon as possible.
First of the New Bookshelf NCBI Insights Blog Posts - New Streptococcus pyogenes book
The first in a new series of NCBI Insights blog posts highlighting books and documents on NCBI’s Bookshelf is now available. It showcases a new book: “Streptococcus pyogenes: Basic Biology to Clinical Manifestations”.
Tree Viewer's Next Update is Available
An updated version (v.1.8.0) of the NCBI Tree Viewer, a tool for viewing your own phylogenetic tree data, has been released. It includes several new features and improvements, as well as some bug fixes.
NCBI to assist Brandeis University in hosting Boston genomics hackathon in April
From April 25 to 27, NCBI will assist Brandeis University in hosting a genomics hackathon focusing on advanced bioinformatics analysis of next-generation sequencing data and metadata. This event is for students, postdocs, investigators and other researchers already engaged in the use of pipelines for genomic analyses from next-generation sequencing data or metadata. Researchers and/or data scientists from the Boston area are especially encouraged to apply, but the event is open to anyone selected for the hackathon and able to travel to Brandeis.
GenBank release 212.0 available via FTP
GenBank release 212.0 (2/13/2016) has 190,250,235 non-WGS, non-CON records containing 207,018,196,067 base pairs of sequence data. In addition, there are 333,012,760 WGS records containing 1,399,865,495,608 base pairs of sequence data, as well as 92,132,318 TSA records containing 81,932,555,094 base pairs of sequence data.
March 2nd webinar: NCBI Resources for Cancer Researchers
On Wednesday, March 2, NCBI staff will discuss the facets of NCBI resources relevant to cancer in a live webinar. The databases and tools included in this overview are: BLAST, GenBank, DNA-Seq, RNA-Seq, epigenomics and metagenomics datasets, as well as tools and APIs at NCBI that can be used to extract relevant subsets of data for cancer research.
Zika virus resource page provides access to nucleotide, protein sequences from latest outbreak
The new Zika virus resource page makes it easy to find and analyze relevant sequence data. The page includes links to the following Zika virus data at NCBI: nucleotide and protein sequences, the reference genome with updated mature peptide annotation, and publications.
dbSNP Build 146 for salmon, barrel medic, cottonwood and mouse now available
dbSNP Build 146 data for salmon, barrel medic, cottonwood and mouse are available now on the web and FTP.