NCBI Prokaryotic Genome Annotation Pipeline
Genome annotation is a multi-level process that includes prediction of protein-coding genes, as well as other functional genome units such as structural RNAs, tRNAs, small RNAs, pseudogenes, control regions, direct and inverted repeats, insertion sequences, transposons and other mobile elements.
NCBI has developed an automatic annotation pipeline that combines ab initio gene prediction algorithms with homology-based methods. The first version of the NCBI Prokaryotic Genome Automatic Annotation Pipeline (PGAAP), developed in 2005 (see PubMed article), has been replaced with an upgraded version.
See description of the Annotation process
See description of the Annotation standards
Read Release notes
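To illustrate what combining ab initio predictions with homology-based evidence can look like, here is a minimal toy sketch in Python. It is not the NCBI pipeline's actual algorithm; the `Gene` record, the `overlaps` test, and the `merge_predictions` rule (keep every homology-supported gene, then add only those ab initio predictions that do not overlap one on the same strand) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    start: int
    end: int
    strand: str  # "+" or "-"
    source: str  # "homology" or "ab_initio"


def overlaps(a: Gene, b: Gene) -> bool:
    """True if the two genes are on the same strand and share at least one base."""
    return a.strand == b.strand and a.start <= b.end and b.start <= a.end


def merge_predictions(homology: list[Gene], ab_initio: list[Gene]) -> list[Gene]:
    """Toy rule (an assumption, not NCBI's method): homology-supported genes
    take precedence; ab initio genes are kept only where they do not overlap
    a homology-supported gene on the same strand."""
    kept = list(homology)
    for gene in ab_initio:
        if not any(overlaps(gene, h) for h in homology):
            kept.append(gene)
    return sorted(kept, key=lambda g: (g.start, g.end))


# Made-up coordinates for demonstration only.
homology_hits = [Gene(100, 400, "+", "homology")]
ab_initio_calls = [
    Gene(150, 420, "+", "ab_initio"),  # overlaps the homology hit, so dropped
    Gene(500, 800, "-", "ab_initio"),  # no overlap, so kept
]
merged = merge_predictions(homology_hits, ab_initio_calls)
```

A real pipeline weighs far more evidence (alignment quality, frame, start-site selection, RNA features), so this sketch only conveys the general idea of reconciling two sources of gene calls.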
GenBank
The NCBI prokaryotic annotation pipeline is intended to help submitters with genome annotation. The pipeline can annotate both complete genomes and draft WGS genomes consisting of multiple contigs. To use the annotation pipeline for a genome you are submitting to GenBank, follow the guidelines in the Bacterial Genome Submission Guide.
RefSeq
Historically, RefSeq prokaryotic genomes relied on author-submitted annotation. Annotation from different submitters varies in quality, resulting in inconsistent annotation even among closely related genomes. The NCBI Prokaryotic Annotation Pipeline can produce consistent, high-quality automatic annotation.
See information on Eukaryotic Genome Annotation
Any questions can be sent to: genomes@ncbi.nlm.nih.gov