NIH Plan for Increasing Access to Scientific Publications and Digital Scientific Data

Posted on March 4th, 2015 in Anita Bandrowski, Interoperability, News & Events | No Comments »

The NIH put out a plan to increase access to scientific data.

What do they really mean and what does this mean to researchers?

Researchers have been asked to provide PubMed Central (PMC) identifiers in grant applications, and this single requirement has pushed authors to submit their papers to PMC. Many journals now do this as a matter of course, leading to a large corpus of publications that are fully searchable texts. I think that researchers are now familiar with this process and see the benefit, as I do when I am at home and need to look up a piece of information from my old paper that a publisher tries to charge me $36 to find.

What happens to data and what is meant by data?
Will authors need to submit all of their supplementary data files to PMC?

Perhaps not. Some wording in the NIH document shows that they know data are not homogeneous. They recognize that they can’t handle the diversity well without working with existing repositories.

They point out that data should be FAIR:
Findable
Accessible
Interoperable
Reusable
This is known as the FAIR standard.

They also state:
“A strategy for leveraging existing archives, where appropriate, and fostering public-private partnerships with scientific journals relevant to the agency’s research; Encourage public-private collaboration; Encourage public-private collaboration to … otherwise assist with implementation of the agency plan; Ensure that publications and metadata are stored in an archival solution that… uses standards, widely available and, to the extent possible, nonproprietary archival formats for text and associated content (e.g., images, video, supporting data).”

So will there be a set of repositories that are “approved” community standards? Will the NIH have a box for grantees to put in their community repository IDs?
Seems like a good direction!

For now, NIF has a very large list of repositories that will house your data.
Try this registry search.
There are over 1000 that respond to the query, but which one or which ones can you use?
It does not seem that the NIH is willing to be prescriptive, so it will be left to individual communities to rally around the repositories that best serve them.
For now, NIF just aggregates the information around these and attempts to make them findable (the F in FAIR).

Integrated Annotation just added the 7-millionth record

Posted on February 27th, 2015 in Anita Bandrowski, Data Spotlight, News & Events | No Comments »

Yes, we do have annotations!

What can we do with these annotations?

* When you are reading a paper, would you like to know if the data you are looking at has been stored somewhere?

* Would you like to know if someone figured out what antibody the authors used?

* What about the mouse described in the paper, is there additional information in MGI?

The integrated annotation view is an aggregate of any database included in NIF that contains the PubMed Identifier.

In over 50 databases there are citations containing PubMed Identifiers, a reference for a particular data record. While each database is different, there are some themes. Records may include reagents used in the paper like AddGene plasmids, data that is stored somewhere like ModelDB computational models, or they may include a set of values that were extracted from the paper like BioNumbers.
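To make the idea concrete, here is a minimal sketch of how records from different databases could be grouped by the PubMed Identifier they cite. It is an illustration only, not NIF's actual implementation: the function name, record layout, record IDs and PMIDs are all made up, and only the database names come from the paragraph above.

```python
from collections import defaultdict

# Hypothetical records: the database names appear in the post,
# but every record ID and PMID below is invented for illustration.
RECORDS = [
    {"source": "AddGene", "record_id": "plasmid-1", "pmid": "11111111"},
    {"source": "ModelDB", "record_id": "model-7", "pmid": "11111111"},
    {"source": "BioNumbers", "record_id": "bn-42", "pmid": "22222222"},
    {"source": "ModelDB", "record_id": "model-9", "pmid": None},  # cites no paper
]


def build_annotation_index(records):
    """Group data records by the PubMed Identifier they cite."""
    index = defaultdict(list)
    for record in records:
        pmid = record.get("pmid")
        if pmid:  # records without a PMID cannot annotate a paper
            index[pmid].append((record["source"], record["record_id"]))
    return dict(index)


index = build_annotation_index(RECORDS)

# Everything known about one paper, across all databases.
for source, record_id in index.get("11111111", []):
    print(source, record_id)
```

Looking up a single PMID then returns every database record that references that paper, which is the kind of question the bullets above ask.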

Through a software tool called the LinkOut Broker, we submit these data to PubMed (unless the database does this already) as an annotation saying that the paper is referenced in a particular database. However, these citations are not searchable in PubMed, so we have made the integrated annotation view to allow NIF users to search these same annotations.

We also know that people read papers in many places, in PDF readers and online, so we have started working with several groups, including a team at Science Direct, to push the data into the places where the readers are. We are proud to work with the Elsevier Antibody App team, who created an application visible in Science Direct in all Elsevier papers that have an antibody annotated in the antibodyregistry.org.

An example paper from Experimental Neurology can be viewed here http://www.sciencedirect.com/science/article/pii/S0014488614003896

The NIA Butler-Williams Scholars Program

Posted on February 27th, 2015 in Anita Bandrowski, News & Events | No Comments »

The NIA Butler-Williams Scholars Program (formerly Summer Institute on Aging Research) is accepting applications for an intensive introduction to aging research. This program for investigators that are new to aging research is focused on the breadth of research supported by the National Institute on Aging, including basic biology, neuroscience, behavioral and social research, geriatrics and clinical gerontology. As an offering through the NIA Office of Special Populations, program content will include a focus on health disparities, research methodologies, and funding opportunities. The Butler-Williams Scholars Program (B-W Scholars) is one of the premier, short-term training opportunities for new investigators. New researchers are defined as those who have recently received the M.D., Ph.D. or other doctoral level degree. The B-W Scholars Program provides participants with unparalleled access to NIA and NIH staff in an informal setting.

The 2015 B-W Scholars Program will be held July 27-31, 2015 in Bethesda, Maryland. Support in most cases is available for travel and living expenses.  The B-W Scholars Program is sponsored by NIA with support from the National Hartford Centers of Gerontological Nursing Excellence.

***Applications are due Friday, March 27, 2015***

Researchers with an interest in health disparities research are encouraged to apply. Applicants from diverse backgrounds, including individuals from underrepresented racial and ethnic groups, individuals with disabilities and women are always encouraged to apply for NIH support. Applicants must be U.S. citizens, non-citizen nationals, or permanent residents.
Please view more information on the NIA web site:  www.nia.nih.gov/about/events/2014/butler-williams-scholars-program-2015

For more information, please contact:
Ms. Andrea Griffin-Mann
Office of Special Populations
National Institute on Aging
National Institutes of Health

griffinmanna@mail.nih.gov

Did you know? The IMPC maintains a large list of predicted mouse gene phenotypes

Posted on February 16th, 2015 in Anita Bandrowski, Data Spotlight, News & Events | No Comments »

The Monarch project (monarchinitiative.org), together with the NIF project, has brought in many sources containing a wealth of phenotype information; these are now available from NIF and from many of the SciCrunch portals.

The International Mouse Phenotyping Consortium (IMPC) is one of these sources. It creates, curates, and maintains targeted knockout mutations in embryonic stem cells for 20,000 known and predicted mouse genes. The resulting phenotypes are available through several views showing the variant phenotypes.

What can be learned from phenotype data? Phenotype is a superset of disease, so these data can be instrumental in figuring out whether a better model exists for the disease you are studying and which traits are associated with each organism. A worm researcher may not be aware that a fly mutation expresses the same phenotype, though perhaps as a result of different genotypes or knockouts.

 

Check out other sources of phenotype data also available:

WormBase provides anatomical and genetic information of C. elegans and related research nematodes. This Worm:VariantPhenotypes view curates the relationship between an allele and a phenotype, where the allele can be a genetic or RNAi-induced change. 100.00% (543,874 Results)

Online Mendelian Inheritance in Man (OMIM) curates human genetic diseases from the literature. The OMIM:VariantPhenotype view describes the curated relationships between genes, allelic variants (if available), and diseases/traits. 100.00% (28,706 Results)

WormBase provides anatomical and genetic information of C. elegans and related research nematodes. The GeneExprLoc view shows the localization of gene expression in C. elegans anatomy. 100.00% (72,346 Results)

OMIM is a human curated authoritative source of information about disease to gene connections. The DiseaseGeneAssociation view is organized by the OMIM phenotype/disease identifiers, and lists all genes and text annotated to a given disease or phenotype. 100.00% (4,809 Results)

HPO annotations provide annotations of human phenotypes and diseases. This phenotype to gene view shows the associations between a phenotype and its putative causative gene, based on the link between a gene and its known involvement in a disease. 100.00% (284,441 Results)

The Mouse Phenome Database is a project at the Jackson Laboratory, which characterizes mouse studies based on the types of measurements that are made in each study. This MeasurementDefinitions view shows the curated mappings of the assay measurements to the relevant phenotype, trait, and anatomy terms that are measured. 100.00% (14,765 Results)

The HPO group provides annotations of phenotypes of human diseases, linked to OMIM, Orphanet, and DECIPHER.    100.00% (116,600 Results)

Online Mendelian Inheritance in Animals (OMIA) is a data set describing phenotype relationships with individual breeds and genes. This BreedPhenotypes view curates species and breed-specific-phenotype relationships for non-model organisms. 100.00% (15,516 Results)

Animal Quantitative Trait Loci Database collects and provides publicly available trait mapping data, i.e. QTL (phenotype/expression, eQTL), candidate gene and association data (GWAS), and copy number variations (CNV) mapped to livestock animal genomes to facilitate locating and comparing discoveries within and between species. Additional information regarding QTL data can be found at the Animal QTL Database FAQ.   100.00% (28,751 Results)

The ZFIN Genotype-Phenotype View  contains Genotype-to-Phenotype mappings in ZFIN, with experimental-environmental context. This Genotype-Phenotype view is a combination of intrinsic (organismal) and extrinsic (experimental/morphant) genotypes, in the context of environmental conditions. The effective genotypes are extracted and built from ZFIN genotype-phenotype data following the GENO genotype ontology model as developed by the Monarch Initiative. 100.00% (85,118 Results)

FlyBase is a database of genetic and molecular data for D. melanogaster and other Drosophila species. Flybase:Phenotypes are the curated links for phenotypes of the flies of a specified genotype, in a specified environment, attributed to a publication. 100.00% (275,697 Results)

The International Mouse Phenotyping Consortium creates, curates, and maintains targeted knockout mutations in embryonic stem cells for 20,000 known and predicted mouse genes. The IMPC:MousePhenotypes view reports on the genotypes and associated phenotypes collected from a broad based primary phenotyping pipeline in all the major adult organ systems. All phenotype calls are found to be significant with a p-value < 1 x 10^-4. 100.00% (7,156 Results)

Mouse Genome Informatics offered by Jackson Laboratory includes information on integrated genetic, genomic, phenotypic, and biological data of the laboratory Mouse. The MGI:Phenotypes view presents the curated relationships between genotypes and phenotypes. 100.00% (275,856 Results)

The NHGRI Elements of Morphology: Human Malformation Terminology is being developed by a group of international clinicians working in the field of dysmorphology to standardize terms used to describe human morphology, thereby increasing the utility of descriptions of human phenotype and facilitating reliable comparisons of findings among patients. 100.00% (400 Results)

The Mouse Phenome Database is a project at The Jackson Laboratory which collects and curates mouse strain survey data for behavior, physiology, and anatomy. Data are available for inbred and recombinant inbred strains, chromosome substitution strains, other classical panels, Collaborative Cross (CC) lines and Diversity Outbred (DO) populations. 100.00% (235 Results)

ClinVar aggregates information about sequence variation and its relationship to human health. The ClinVar:VariantPhenotypes view provides information on sequence alterations present in genes and the resulting phenotypes. For records listing more than one variation, data is presented with the assumption that the individual sequence alterations are in cis. 100.00% (458,639 Results)

The Mouse Phenome Database is a project at The Jackson Laboratory which collects and curates mouse strain survey data for behavior, physiology, and anatomy. Data are available for inbred and recombinant inbred strains, chromosome substitution strains, other classical panels, Collaborative Cross (CC) lines and Diversity Outbred (DO) populations. The MPD:StrainPhenotypes view computes the extreme outlier phenotypes (>2 s.d.) as compared to the overall mean for each assay, and maps the quantitative measurements to their qualitative phenotype. (The strains measured for each assay vary, and therefore the means computed may be drawn from collections of different strains.) 100.00% (8,605 Results)
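As a rough illustration of the outlier rule described for the MPD:StrainPhenotypes view, the sketch below flags strains whose measurement lies more than 2 standard deviations from the mean of one assay. This is not MPD's code: the function name, the strain names and the values are invented, and the choice of the population standard deviation (rather than the sample one) is an assumption.

```python
from statistics import mean, pstdev


def outlier_strains(measurements, threshold=2.0):
    """Map each outlier strain to a qualitative call ("high" or "low").

    `measurements` is {strain: value} for a single assay. A strain is an
    outlier when its value is more than `threshold` standard deviations
    from the mean of all strains measured in that assay.
    """
    values = list(measurements.values())
    mu = mean(values)
    sd = pstdev(values)  # assumption: population s.d.; a sample s.d. is equally plausible
    if sd == 0:
        return {}  # all strains identical, so there are no outliers
    calls = {}
    for strain, value in measurements.items():
        z = (value - mu) / sd
        if z > threshold:
            calls[strain] = "high"
        elif z < -threshold:
            calls[strain] = "low"
    return calls


# Hypothetical assay: eight unremarkable strains and one extreme one.
assay = {
    "strain_A": 10.0, "strain_B": 10.5, "strain_C": 9.5,
    "strain_D": 10.2, "strain_E": 9.8, "strain_F": 10.0,
    "strain_G": 9.9, "strain_H": 10.1, "strain_I": 20.0,
}
calls = outlier_strains(assay)
print(calls)
```

Because the mean is taken over whichever strains were measured in the assay, the same strain could be an outlier in one assay and not in another, which is the caveat the description above makes.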

The International Mouse Phenotyping Consortium creates, curates, and maintains targeted knockout mutations in embryonic stem cells for 20,000 known and predicted mouse genes. The IMPC:KnockoutPhenotypes view reports on the phenotypes collected from a broad based primary phenotyping pipeline in all the major adult organ systems. 100.00% (7,156 Results)

Webinar from BioCaddie (aka DDI): Jeff Grethe presents NIF

Posted on February 9th, 2015 in Anita Bandrowski, News & Events, Webinar Announcement | No Comments »

Cooperative and collaborative data and resource discovery platforms for scientific communities – The Neuroscience Information Framework (NIF) and SciCrunch

 

Date: Thursday, February 12, 2015
Time: 10:00 AM – 11:00 AM (PST); 1:00 PM – 2:00 PM (EST)

 

Presenter

Jeffrey S. Grethe, Ph.D.
Associate Director, Center for Research in Biological Systems
University of California, San Diego

Abstract

Data and information on research resources are everywhere, in numerous repositories and download sites, and more floods in every day. What’s a researcher to do? In order to be able to use shared data, the first fundamental rule is that you have to be able to find it. We have search engines like Google for web documents, PubMed and Google Scholar for articles, NCBI for selected genomics resources. The Neuroscience Information Framework (NIF; neuinfo.org) was instantiated in 2006 in response to a Broad Agency Announcement from the NIH Blueprint for Neuroscience Research citing an overwhelming need for an “information framework for identifying, locating, and characterizing neuroscience information”. NIF was tasked with surveying the neuroscience resource landscape and developing a resource description framework and search strategy for locating, accessing and utilizing research resources, defined here as data, databases, tools, materials, literature, networks, terminologies, or information that can accelerate the pace of neuroscience research and discovery. NIF adds value to these existing biomedical resources by increasing their discoverability, accessibility, visibility, utility and interoperability, regardless of their current design or capabilities and without the need for extensive redesign of their components or information models. Unlike more general search engines, NIF provides deeper access to a more focused set of resources that are relevant to neuroscience, provides search strategies tailored to neuroscience, and also provides access to content that is traditionally “hidden” from web search engines. To accomplish this, NIF has deployed an infrastructure allowing a wide variety of resources to be searched and discovered at multiple levels of integration, from superficial discovery based on a limited description of the resource (NIF Registry), to deep content query (NIF Data Federation).
It is currently one of the largest sources of biomedical information on the web, searching over 13,000 research resources in its Registry and the contents of 250+ data resources comprising more than 800 million records in its Data Federation.

Building on the NIF infrastructure, SciCrunch was designed to help communities of researchers create their own portals to provide access to resources, databases and tools of relevance to their research areas. A data portal that searches across hundreds of databases can be created in minutes. Communities can choose from our existing SciCrunch data sources and also add their own. SciCrunch was designed to break down the traditional types of portal silos created by different communities, so that communities can take advantage of work done by others and share their expertise as well. SciCrunch currently supports a diverse collection of communities in addition to NIF, each with their own data needs: CINERGI – focuses on constructing a community inventory and knowledge base on geoscience information resources; NIDDK Information Network (dkNET) – serves the needs of basic and clinical investigators by providing seamless access to large pools of data relevant to the mission of The National Institute of Diabetes, Digestive and Kidney Disease (NIDDK); Research Identification Initiative (RII) – aims to promote research resource identification, discovery, and reuse.

Biography

Dr. Jeffrey S. Grethe, Ph.D. is a Principal Investigator (MPI) for the Neuroscience Information Framework (NIF; http://neuinfo.org) and the NIDDK Information Network (dkNET; http://dknet.org) in the Center for Research in Biological Systems (CRBS; http://crbs.ucsd.edu) at the University of California, San Diego. Following a B.S. in Applied Mathematics from the University of California, Irvine, he received a doctorate in neurosciences with a focus on neuroinformatics and computational modeling from the University of Southern California. Throughout his career, he has been involved in enabling collaborative research, data sharing and discovery through the application of advanced informatics approaches. This started at USC with his involvement in the Human Brain Project and continues today with his work on NIF, dkNET and with standards bodies such as the International Neuroinformatics Coordinating Facility.

Details on how to join Webinar:

*Must use Web and Audio*

Please include name and company when joining meeting. Web: http://www.readytalk.com/ (Join meeting with Access Code 2201876)
Audio: 1-866-740-1260 (Access Code 2201876)
Click here to test your computer’s compatibility before the meeting.

This webinar is open to all.
More information about this webinar, future webinars and events can be found at:

https://biocaddie.org/events/webinars

The Journal of Comparative Neurology Reaches 100 RRID papers!

Posted on February 5th, 2015 in Anita Bandrowski, News & Events | No Comments »

Have you ever wondered how you can find validated antibodies?
Have you ever wondered where most of our literature mentions come from in the antibodyregistry.org?

Well, wonder no more. The Neuroscience Information Framework, and more specifically antibodyregistry.org, has been working closely with the Journal of Comparative Neurology (Bandrowski et al., JCN 2013) to enhance the already high standards of antibody identification in JCN papers.

In the Research Resource Identification initiative we have just crossed a major milestone: 100 papers in a single journal. Not surprisingly, the journal that reached it is JCN. Most of these papers identify the antibodies they use with unique identifiers in the following format, RRID:AB_2298772, which can be searched for in Google Scholar or PubMed.
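Because the identifiers follow a fixed format, they are also easy to pull out of text automatically. The sketch below is one way to do that, under the assumption that an antibody RRID is always written as "RRID:" followed by "AB_" and digits, as in the example above. The function name and the sample methods sentence are made up for illustration.

```python
import re

# Assumed pattern, inferred only from the example RRID:AB_2298772:
# "RRID:", an optional space, then "AB_" followed by digits.
RRID_ANTIBODY = re.compile(r"\bRRID:\s?(AB_\d+)\b")


def find_antibody_rrids(text):
    """Return the antibody IDs cited in RRID form, in order, without repeats."""
    found = []
    for match in RRID_ANTIBODY.finditer(text):
        identifier = match.group(1)
        if identifier not in found:
            found.append(identifier)
    return found


# Hypothetical methods-section sentence.
methods = (
    "Sections were incubated with a primary antibody (RRID:AB_2298772) "
    "and imaged; the same antibody (RRID:AB_2298772) was used for blots."
)
found = find_antibody_rrids(methods)
print(found)
```

A free-text antibody name would need a human to interpret it, whereas a string in this format can be matched mechanically and looked up in the registry.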

The growing number of papers and the growing number of journals (42 and counting) suggest that this standard fills a need in the community.

If you wish to join the initiative as an author, just visit the SciCrunch resource portal and search for your antibodies; a convenient “cite this” button will give you the text to use in your methods section.
If you wish to join as an editor, we have draft letters to authors and author guidelines that you can use at Force11.

Research Resource Identification has made it into PLoS

Posted on January 29th, 2015 in Anita Bandrowski, News & Events | No Comments »

Thanks to some great folks at PLoS, we will be adding the PLoS Genetics and Biology journals to our list of journals joining the RRID pilot.

For those of you who don’t know about the initiative, NIF with the help of Force11, Monarch, MGI, RGD, and many other groups involved in some way with informatics launched an initiative to create a standard for research resource reporting. In this case the original agreement was among 25 journal chief editors who agreed to ask authors to report identifiers for: model organisms, antibodies, software tools and databases they used in the course of their research (methods section) in a format that is uniform across publishers.

The initiative has been very successful, thus far, with identifiers for resources appearing in over 40 journals (link to google scholar). Now, we will bring on board two journals in the PLoS family and we hope that others will soon follow.

See blog post at PLoS.

Statement of Commitment from Earth and Space Science Publishers and Data Facilities:

Posted on January 14th, 2015 in Anita Bandrowski, Interoperability, News & Events | No Comments »

This is an important commitment from the CODATA and earth science community. We look forward to circulating a similar document from the Neuroscience community.

 

Coalition on Publishing Data in the Earth and Space Sciences

 
Earth and space science data are special resources, critical for advancing science and
addressing societal challenges – from assessing and responding to natural hazards and
climate change, to use of energy and natural resources, to managing our oceans, air, and
land. The need for and value of open data have been encoded in major Earth and space
science society position statements, foundation initiatives, and more recently in
statements and directives from governments and funding agencies in the United States,
United Kingdom, European Union, Australia, and elsewhere. This statement of
commitment signals important progress and a continuing commitment by publishers and
data facilities to enable open data in the Earth and space sciences.
 

Scholarly publication is a key high-value entry point in making data available, open,
discoverable, and usable. Most publishers have statements related to the inclusion or
release of data as part of publication, recognizing that inclusion of the full data enhances
the value and is part of the integrity of the research. Unfortunately, the vast majority of
data submitted along with publications are in formats and forms of storage that make
discovery and reuse difficult or impossible.
 

Repositories, facilities, and consortia dedicated to the collection, curation, storage, and
distribution of scientific data have become increasingly central to the scientific enterprise.
The leading Earth and space science repositories not only provide persistent homes for
these data, but also ensure quality and enhance their value, access, and reuse. In addition
to data, these facilities attend to the associated models and tools. Unfortunately, only a
small fraction of the data, tools, and models associated with scientific publications makes
it to these data facilities.
 

Connecting scholarly publication more firmly with data facilities thus has many
advantages for science in the 21st century and is essential in meeting the aspirations of
open, available, and useful data envisioned in the position statements and funder
guidelines. To strengthen these connections, with the aim of advancing the mutual
interests of authors, publishers, data facilities, and end-users of the data, a recent Earth
and space science data and publishing conference, supported by the National Science
Foundation, was held at AGU Headquarters on 2-3 October 2014. It brought together
major publishers, data facilities, and consortia in the Earth and space sciences, as well as
governmental, association, and foundation funders. Further informational meetings were
held with Earth and space science societies, publishers, facilities, and librarians that were
not present at the October meeting. Collectively the publishers, data facilities, and
consortia focused on open data for Earth and space science formed a working group:
Coalition on Publishing Data in the Earth and Space Sciences. As one outcome, this
group collectively endorsed the following commitments to make meaningful progress
toward the goals above. We encourage other publishers and data facilities and consortia
to join in support.
 

Signatory data facilities, publishers, and societies, in order to meet the need for
expanding access to data and to help authors, make the following commitments:
● We reaffirm and will ensure adherence to our existing repository, journal, and
publisher policies and society position statements regarding data sharing and
archiving of data, tools, and models.
● We encourage journals, publishers, and societies that do not have such statements
to develop them to meet the aspirations of open access to research data and to
support the integrity and value of published research. Examples of policies and
position statements from signatory journals and societies are listed here.
● Earth and space science data should, to the greatest extent possible, be stored in
appropriate domain repositories that are widely recognized and used by the
community, follow leading practices, and can provide additional data services.
We will work with researchers, funding agencies, libraries, institutions, and other
stakeholders to direct data to appropriate repositories, respecting repository
policies.
● Where it is not feasible or practical to store data on community-approved
repositories, journals should encourage and support archiving of data using
community-established leading practices, which may include supplementary
material published with an article. These should strive to follow existing NISO
guidelines.

Over the coming year, the signatory Earth and space science publishers, journals, and
data facilities will work together to accomplish the following:
● Provide a usable online community directory of appropriate Earth and space
science community repositories for data, tools, and models that meet leading
standards on curation, quality, and access that can be used by authors and journals
as a guide and reference for data deposition.
● Promulgate metadata information and domain standards, including in the online
directory, to help simplify and standardize data deposition and re-use.
● Promote education of researchers in data management and organize and develop
training and educational tools and resources, including as part of the online
directory.
● Develop a working committee to update and curate this directory of repositories.
● Promote referencing of data sets using the Joint Declaration of Data Citation
Principles, in which citations of data sets should be included within reference
lists.
● Include in research papers concise statements indicating where data reside and
clarifying availability.
● Promote and implement links to data sets in publications and corresponding links
to journals in data facilities via persistent identifiers. Data sets should ideally be
referenced using registered DOIs.
● Promote use of other relevant community permanent identifiers for samples
(IGSN), researchers (ORCID), and funders and grants (FundRef).
● Develop workflows within the repositories that support the peer review process
(for example, embargo periods with secure access) and within the editorial
management systems that will ease transfer of data to repositories.
 

A major challenge today is that much more Earth and space science data are being
collected than can be reasonably stored, curated, or accessed. This includes physical
samples, information about them, and digital data (sometimes streaming at rates of
terabytes per minute). Researchers and publishers are looking for guidance on what
constitutes archival data across diverse fields and disciplines. The major data repositories
provide leading practices that should help guide the types of samples, data, metadata, and
data processing descriptions that should be maintained, including information about
derivations, processing, and uncertainty.
 

To enable improved coordination and availability of open data, we encourage funders to
support these commitments, ensure a robust infrastructure of data repositories, and enable
broad outreach with researchers. As a general rule, data management plans promulgated
by funders should indicate that release into leading repositories, where available, of those
data necessary to support published results is expected at publication. The ultimate
measure of success is in the replicability of science, generation of new discoveries, and in
progress on the grand challenges facing society that depend on the integration of open
data, tools, and models from multiple sources.
 

Signatories
American Astronomical Society
American Geophysical Union
American Meteorological Society
Biological and Chemical Oceanography Data Management Office, Woods Hole
Oceanographic Institution (BCO-DMO)
Center for Open Science
CLIVAR and Carbon Hydrographic Data Office (CCHDO)
Community Inventory of EarthCube Resources for Geosciences Interoperability
(CINERGI)
Council of Data Facilities
Elsevier
European Geophysical Union
Geological Data Center of Scripps Institution of Oceanography
ICSU World Data System
Incorporated Research Institutions for Seismology (IRIS)
Integrated Earth Data Applications (IEDA)
John Wiley and Sons
Magnetics Information Consortium (MagIC)
Mineralogical Society of America
National Snow and Ice Data Center
Nature Publishing Group
Proceedings of the National Academy of Sciences
Rolling Deck to Repository (R2R)
Science

Your chance to provide input on NIDA strategic plan

Posted on December 15th, 2014 in Anita Bandrowski, General information, News & Events | 1 Comment »

 

NIDA is asking for input on their strategic plan for the next 5 years. They are listening; what do you want to tell them? Text about the RFI from NIDA is below:

 

The Division of Basic Neuroscience and Behavioral Research (DBNBR) is interested in your thoughts on how to drive basic science research on drug abuse forward over the next 5 years.  I want to personally invite you to submit your perspective and ideas on NIDA’s strategic priorities for 2016-2020 and hope you will share some of your valuable time to participate in this critical process!  We are particularly interested in your thoughts regarding the basic science strategic priorities outlined in the RFI.  Which of these priorities are most important?  Are there priorities that are missing and should be added?  Are there areas that need just a small investment to make big advancements?  This is your chance to provide input into NIDA’s strategic plans.  Please take just a few minutes to submit your thoughts and ideas to this RFI.

 

The RFI is here: http://grants.nih.gov/grants/guide/notice-files/NOT-DA-15-005.html.

Share your vision and ideas related to basic science by emailing them here:  NIDAOSPCPlanning@mail.nih.gov

 

One final note:  Please share this information with any relevant scientific colleagues, post-doctoral fellows and graduate students.  We would like to get as much input from the basic science research community as possible.
 

Community based standards, a BD2K request for comments

Posted on December 5th, 2014 in Anita Bandrowski, News & Events | No Comments »

One of the Big Data to Knowledge (BD2K) activities has been to hold workshops to assess the state of the bioinformatics resource landscape. One important feature of that landscape is the community standard, something NIF has been very concerned about for the last 8 years. We know that there are thousands of scientific databases in the public space, but how many are used by the community, and how many are even consistently maintained, is a difficult question to answer.

 

What is a community standard?

Many communities of scientists, especially those who share some aspects of their data, have developed sets of information that they routinely capture about the experiments they run. For example, all scientists would capture the species on which their experiments were performed, though sometimes this may be quite difficult if one is working with mouse genes in a humanized E. coli cell line.

 

Why do they want to know about your community standards?

The BD2K grants have come into a complex landscape of existing community resources, supported resources, and unsupported or abandoned resources. Which resources should be adopted broadly and which can be scrapped is a very important question, but it can’t be answered without community feedback.

 

How can I comment? Click here to see draft report and comment

The bottom of the workshop report has a comment section. These comments are necessary to get a broader picture of the landscape. Please add your two cents to this, especially if you run a repository or other data resource. NIH and the BD2K project leaders need to hear from you, before January 20th!