At NIF we try to mix it up. Instead of bringing you the top databases this month, we thought a list of the most annotated papers might be of more interest.
Heartfelt congratulations to the most annotated papers of all time! Being the most annotated is sort of like being the most cited, but it is more like being the most useful.
The winners are clearly publications about databases: congratulations to InterPro, the Protein Data Bank, and the Gene Ontology itself, and also to the complete genomes of various organisms. The dark horse in the race asks “How many drug targets are there?” and the answer appears to be at least 7000. Heyndrickx and Vandepoele may have the most citations per human being, because they wrote the only paper on this list with only 2 authors; most of the rest dilute their citations by an order of magnitude.
In any case, we thought that this would be a fun set of data to look at.
* Mulder et al The InterPro Database 2003 cited 49030 times
* Camon et al The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro 2003 cited 49030 times
* Heyndrickx and Vandepoele Systematic identification of functional plant modules through the integration of complementary data sources 2012 cited 37987 times
* Matsuyama et al ORFeome cloning and global analysis of protein localization in the fission yeast Schizosaccharomyces pombe 2006 cited 14156 times
* Barbe et al Toward a confocal subcellular atlas of the human proteome 2008 cited 9574 times
* Simmer et al Genome-wide RNAi of C elegans using the hypersensitive rrf-3 strain reveals novel gene functions 2003 cited 8249 times
* Heidelberg et al DNA sequence of both chromosomes of the cholera pathogen Vibrio cholerae 2000 cited 8192 times
* Imming et al Drugs, their targets and the nature and number of drug targets 2006 cited 7742 times
* Overington et al How many drug targets are there? 2006 cited 7664 times
* Gibbs et al Rat Genome Sequencing Project Consortium. Genome sequence of the Brown Norway rat yields insights into mammalian evolution 2004 cited 7594 times
* Berman et al The Protein Data Bank 2000 cited 7107 times
* Ceron et al Large-scale RNAi screens identify novel genes that interact with the C elegans retinoblastoma pathway as well as splicing-related components with synMuv B activity 2007 cited 6691 times
* Moran et al Genome sequence of Silicibacter pomeroyi reveals adaptations to the marine environment 2004 cited 6389 times
* Read et al The genome sequence of Bacillus anthracis Ames and comparison to closely related bacteria 2003 cited 6155 times
* Young et al Odorant receptor expressed sequence tags demonstrate olfactory expression of over 400 genes, extensive alternate splicing and unequal expression levels 2003 cited 5935 times
* Buell et al The complete genome sequence of the Arabidopsis and tomato pathogen Pseudomonas syringae pv tomato DC3000 2003 cited 5518 times
* Heidelberg et al Genome sequence of the dissimilatory metal ion-reducing bacterium Shewanella oneidensis 2002 cited 5183 times
* Methé et al Genome of Geobacter sulfurreducens: metal reduction in subsurface environments 2003 cited 4084 times
* Nelson et al Whole genome comparisons of serotype 4b and 1/2a strains of the food-borne pathogen Listeria monocytogenes reveal new insights into the core genome components of this species 2004 cited 3881 times
* Ward et al Genomic insights into methanotrophy: the complete genome sequence of Methylococcus capsulatus (Bath) 2004 cited 3877 times
The data are composed of annotations aggregated from over 40 individual databases and the Gene Ontology Consortium. For a complete and current list of the databases included, please see the NIF annotations information page; for a complete list of current annotations, see the NIF annotations complete data set.
Note: these numbers are accurate as of July 20th, 2013, but may change as data are added.
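If you would like to play with the list yourself, here is a minimal Python sketch. It assumes every entry follows the shape used above ("* authors and title, year, cited N times"), and the names `ENTRIES`, `LINE` and `parse_entries` are made up for illustration. Only two entries are pasted in to keep it short, and the per-author figure uses the two-author count mentioned above for Heyndrickx and Vandepoele.

```python
import re

# Two entries copied from the list above; paste in the rest as needed.
ENTRIES = """\
* Heyndrickx and Vandepoele Systematic identification of functional plant modules through the integration of complementary data sources 2012 cited 37987 times
* Berman et al The Protein Data Bank 2000 cited 7107 times
"""

# Assumed line shape: "* <authors and title> <year> cited <count> times"
LINE = re.compile(r"^\* (?P<text>.+) (?P<year>\d{4}) cited (?P<count>\d+) times$")


def parse_entries(block):
    """Turn the bulleted list into a list of dicts (text, year, count)."""
    rows = []
    for line in block.splitlines():
        match = LINE.match(line.strip())
        if match:  # silently skip lines that do not fit the assumed shape
            rows.append({
                "text": match["text"],
                "year": int(match["year"]),
                "count": int(match["count"]),
            })
    return rows


rows = parse_entries(ENTRIES)

# Heyndrickx and Vandepoele: the post says this paper has 2 authors.
per_author = rows[0]["count"] / 2
print(per_author)
```

Author counts for the other papers are not part of the list, so they would need to be looked up before making the same per-author comparison across the board.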