Statement of Commitment from Earth and Space Science Publishers and Data Facilities:

Posted on January 14th, 2015 in Anita Bandrowski, Interoperability, News & Events

This is an important commitment from the CODATA and earth science community. We look forward to circulating a similar document from the neuroscience community.


Coalition on Publishing Data in the Earth and Space Sciences

Earth and space science data are special resources, critical for advancing science and
addressing societal challenges – from assessing and responding to natural hazards and
climate change, to use of energy and natural resources, to managing our oceans, air, and
land. The need for and value of open data have been encoded in major Earth and space
science society position statements, foundation initiatives, and more recently in
statements and directives from governments and funding agencies in the United States,
United Kingdom, European Union, Australia, and elsewhere. This statement of
commitment signals important progress and a continuing commitment by publishers and
data facilities to enable open data in the Earth and space sciences.

Scholarly publication is a key high-value entry point in making data available, open,
discoverable, and usable. Most publishers have statements related to the inclusion or
release of data as part of publication, recognizing that inclusion of the full data enhances
the value and is part of the integrity of the research. Unfortunately, the vast majority of
data submitted along with publications are in formats and forms of storage that make
discovery and reuse difficult or impossible.

Repositories, facilities, and consortia dedicated to the collection, curation, storage, and
distribution of scientific data have become increasingly central to the scientific enterprise.
The leading Earth and space science repositories not only provide persistent homes for
these data, but also ensure quality and enhance their value, access, and reuse. In addition
to data, these facilities attend to the associated models and tools. Unfortunately, only a
small fraction of the data, tools, and models associated with scientific publications makes
it to these data facilities.

Connecting scholarly publication more firmly with data facilities thus has many
advantages for science in the 21st century and is essential in meeting the aspirations of
open, available, and useful data envisioned in the position statements and funder
guidelines. To strengthen these connections, with the aim of advancing the mutual
interests of authors, publishers, data facilities, and end-users of the data, a recent Earth
and space science data and publishing conference, supported by the National Science
Foundation, was held at AGU Headquarters on 2-3 October 2014. It brought together
major publishers, data facilities, and consortia in the Earth and space sciences, as well as
governmental, association, and foundation funders. Further informational meetings were
held with Earth and space science societies, publishers, facilities, and librarians that were
not present at the October meeting. Collectively the publishers, data facilities, and
consortia focused on open data for Earth and space science formed a working group:
Coalition on Publishing Data in the Earth and Space Sciences. As one outcome, this
group collectively endorsed the following commitments to make meaningful progress
toward the goals above. We encourage other publishers and data facilities and consortia
to join in support.

Signatory data facilities, publishers, and societies, in order to meet the need for
expanding access to data and to help authors, make the following commitments:
● We reaffirm and will ensure adherence to our existing repository, journal, and
publisher policies and society position statements regarding data sharing and
archiving of data, tools, and models.
● We encourage journals, publishers, and societies that do not have such statements
to develop them to meet the aspirations of open access to research data and to
support the integrity and value of published research. Examples of policies and
position statements from signatory journals and societies are listed here.
● Earth and space science data should, to the greatest extent possible, be stored in
appropriate domain repositories that are widely recognized and used by the
community, follow leading practices, and can provide additional data services.
We will work with researchers, funding agencies, libraries, institutions, and other
stakeholders to direct data to appropriate repositories, respecting repository
● Where it is not feasible or practical to store data on community-approved
repositories, journals should encourage and support archiving of data using
community-established leading practices, which may include supplementary
material published with an article. These should strive to follow existing NISO

Over the coming year, the signatory Earth and space science publishers, journals, and
data facilities will work together to accomplish the following:
● Provide a usable online community directory of appropriate Earth and space
science community repositories for data, tools, and models that meet leading
standards on curation, quality, and access that can be used by authors and journals
as a guide and reference for data deposition.
● Promulgate metadata information and domain standards, including in the online
directory, to help simplify and standardize data deposition and re-use.
● Promote education of researchers in data management and organize and develop
training and educational tools and resources, including as part of the online
● Develop a working committee to update and curate this directory of repositories.
● Promote referencing of data sets using the Joint Declaration of Data Citation
Principles, in which citations of data sets should be included within reference
● Include in research papers concise statements indicating where data reside and
clarifying availability.
● Promote and implement links to data sets in publications and corresponding links
to journals in data facilities via persistent identifiers. Data sets should ideally be
referenced using registered DOIs.
● Promote use of other relevant community permanent identifiers for samples
(IGSN), researchers (ORCID), and funders and grants (FundRef).
● Develop workflows within the repositories that support the peer review process
(for example, embargo periods with secure access) and within the editorial
management systems that will ease transfer of data to repositories.
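
As a rough illustration of how an editorial or repository workflow might screen the identifiers named above before linking a paper to a data set, here is a minimal Python sketch. The function names and the DOI pattern are assumptions made for this example and are not part of the statement. The DOI check is purely syntactic and says nothing about whether the DOI is registered. The ORCID check uses the published ISO 7064 MOD 11-2 check character.

```python
import re

# Loose syntactic pattern for a DOI. This is an assumption for illustration;
# matching it does not mean the DOI is registered or resolves.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(value: str) -> bool:
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(value.strip()))


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 ORCID digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Check the 0000-0000-0000-000X layout and the final check character."""
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", value):
        return False
    digits = value.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


if __name__ == "__main__":
    print(looks_like_doi("10.1000/182"))
    print(is_valid_orcid("0000-0002-1825-0097"))
```

A check like this only catches typing errors early. Confirming that an identifier actually resolves to the intended data set or researcher still requires a lookup against the registry.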

A major challenge today is that much more Earth and space science data are being
collected than can be reasonably stored, curated, or accessed. This includes physical
samples, information about them, and digital data (sometimes streaming at rates of
terabytes per minute). Researchers and publishers are looking for guidance on what
constitutes archival data across diverse fields and disciplines. The major data repositories
provide leading practices that should help guide the types of samples, data, metadata, and
data processing descriptions that should be maintained, including information about
derivations, processing, and uncertainty.

To enable improved coordination and availability of open data, we encourage funders to
support these commitments, ensure a robust infrastructure of data repositories, and enable
broad outreach with researchers. As a general rule, data management plans promulgated
by funders should indicate that release into leading repositories, where available, of those
data necessary to support published results is expected at publication. The ultimate
measure of success is in the replicability of science, generation of new discoveries, and in
progress on the grand challenges facing society that depend on the integration of open
data, tools, and models from multiple sources.

American Astronomical Society
American Geophysical Union
American Meteorological Society
Biological and Chemical Oceanography Data Management Office, Woods Hole
Oceanographic Institution (BCO-DMO)
Center for Open Science
CLIVAR and Carbon Hydrographic Data Office (CCHDO)
Community Inventory of EarthCube Resources for Geosciences Interoperability
Council of Data Facilities
European Geosciences Union
Geological Data Center of Scripps Institution of Oceanography
ICSU World Data System
Incorporated Research Institutions for Seismology (IRIS)
Integrated Earth Data Applications (IEDA)
John Wiley and Sons
Magnetics Information Consortium (MagIC)
Mineralogical Society of America
National Snow and Ice Data Center
Nature Publishing Group
Proceedings of the National Academy of Sciences
Rolling Deck to Repository (R2R)

Your chance to provide input on NIDA strategic plan

Posted on December 15th, 2014 in Anita Bandrowski, General information, News & Events


NIDA is asking for input on its strategic plan for the next 5 years. They are listening; what do you want to tell them? Text about the RFI from NIDA is below:


The Division of Basic Neuroscience and Behavioral Research (DBNBR) is interested in your thoughts on how to drive basic science research on drug abuse forward over the next 5 years.  I want to personally invite you to submit your perspective and ideas on NIDA’s strategic priorities for 2016-2020 and hope you will share some of your valuable time to participate in this critical process!  We are particularly interested in your thoughts regarding the basic science strategic priorities outlined in the RFI.  Which of these priorities are most important?  Are there priorities that are missing and should be added?  Are there areas that need just a small investment to make big advancements?  This is your chance to provide input into NIDA’s strategic plans.  Please take just a few minutes to submit your thoughts and ideas to this RFI.


The RFI is here:

Share your vision and ideas related to basic science by emailing them here:


One final note:  Please share this information with any relevant scientific colleagues, post-doctoral fellows and graduate students.  We would like to get as much input from the basic science research community as possible.

Community-based standards, a BD2K request for comments

Posted on December 5th, 2014 in Anita Bandrowski, News & Events

One of the Big Data to Knowledge (BD2K) activities has been to hold workshops to assess the state of the bioinformatics resource landscape. One important feature of that landscape is the community standard, something NIF has been very concerned about for the last 8 years. We know that there are thousands of scientific databases in the public space, but how many are used by the community, and how many are even consistently maintained, is a difficult question to answer.


What is a community standard?

Many communities of scientists, especially those who share some aspects of their data, have developed sets of information that they routinely capture about the experiments they run. For example, all scientists would capture the species on which their experiments were performed, though sometimes this may be quite difficult if one is working with mouse genes in a humanized E. coli cell line.


Why do they want to know about your community standards?

The BD2K grants have come into a complex landscape of existing community resources, supported resources, and unsupported or abandoned resources. Which resources should be adopted broadly and which can be scrapped is a very important question, but it can’t be answered without community feedback.


How can I comment? Click here to see draft report and comment

The bottom of the workshop report has a comment section. These comments are necessary to get a broader picture of the landscape. Please add your two cents to this, especially if you run a repository or other data resource. NIH and the BD2K project leaders need to hear from you, before January 20th!

Worm brain in Robot body!

Posted on December 4th, 2014 in Anita Bandrowski, Data Spotlight

Well, we have done it: captured the imagination of media types!
We wish it sounded a little more like science and less like science fiction, but heck, this is a little bit of science posing as science fiction, right in our back yard.

In terms of projects that started at UCSD/CRBS, the Open Worm is a fantastic success. Dr. Larson, the creator of NeuroLex, founded the Open Worm and funded it with Kickstarter (take that, NIH). It is an open “hacker” community that asks, “Can we fully simulate C. elegans?” Biomechanics, neurons, molecules… you name it. If you put these models together into one virtual organism, will the organism function as expected?

If there is ever an organism that is amenable to simulation, it is the worm (sorry Human Brain Project folks) and the argument goes that what we learn from the worm simulations we can start to apply to other simulated species.

Check out Stephen’s TED talk:

NIF’s top 10 brain regions and cells

Posted on December 2nd, 2014 in Anita Bandrowski, Inside NIF

Back by popular demand,

NIF brings you the top brain region and cellular search terms, most of which actually lost to the weirdest search term for this month: lettuce! No, we have no idea why people are looking for lettuce on our site, but apparently we have a very nice scanning electron microscopic image of it.

Top 10 Brain Regions
Frontal Lobe
Fusiform Gyrus
Cerebral Crus
anterior cingulate cortex

Top 10 Cells
Retinal Ganglion Cell
cholinergic neuron
Medium Spiny Neuron
Photoreceptor Cell
Bergmann glia
Cone Cell
Satellite Cell
Amacrine Cell

Call for task force members – bioCADDIE (BD2K – DDIC)

Posted on November 21st, 2014 in Anita Bandrowski, News & Events

As part of the NIH’s Big Data to Knowledge (BD2K) kick off, we are pleased to announce the launch of the Data Discovery Index Consortium, bioCADDIE. bioCADDIE (biomedical and healthCAre Data Discovery and Indexing Ecosystem) was established to help define requirements for building a Data Discovery Index to make it easier for researchers to find, access, reuse and share data. bioCADDIE will be engaging the community through task forces and funding small pilot projects to explore the best ways to search for, access and cite data from NIH-funded researchers.

We are inviting community members to join bioCADDIE task forces.

For more information please visit bioCADDIE website.

INCF & NIF social – Monday night in DC

Posted on November 12th, 2014 in Anita Bandrowski, News & Events

NIF and INCF are present at Neuroscience 2014 in Washington DC, November 15-19 at the Walter E Washington Convention Center. Please visit us in booths #3516 & #3517, located near other neuroinformatics exhibitors. Are you working on an interesting project, or maybe looking for new collaborators? Come by and talk to us!

Neuroscience tools and projects demos at the INCF booth
For the eighth year running, we are hosting neuroinformatics demonstrations in our booth during all exhibition days. Tools and projects from INCF Programs, National Nodes and community will be presented. You can find the schedule and demo abstracts on our webpage via the following links:
* Demo schedule:
* Demo abstracts:

Monday Nov 17 Neuroinformatics social with INCF, NIF and NITRC
On Monday November 17, 17:30 – 19:30, we will be hosting a social together with NIF and NITRC, at the Cuba Libre Restaurant & Rum bar on 801 9th St. NW (the corner of 9th & H Streets NW) in the Penn Quarter neighbourhood of Washington DC. Drinks and finger food will be served – come by and mingle!

You can also follow us on Twitter; we will be posting regular updates all through the meeting.

Hope to see you in Washington!

From Force11: Abstract deadline in 5 days for Force2015

Posted on November 10th, 2014 in Anita Bandrowski, General information, News & Events


  • New! Travel Fellowship Applications Open
  • New! Call for Vision Flash Talk Submissions Open
  • Call for Session Abstracts Submission Deadline November 15
  • Call for Demo/Posters Submission Deadline December 1
  • Call for Pre-Conference Workshop Submissions
  • Registration
The FORCE2015 Research Communication and e-Scholarship Conference (FORCE2015) will be held 12-13 January, 2015 at the University of Oxford, UK. FORCE2015, the successor of the Beyond-the-PDF conferences, will bring together researchers, scholars, librarians, archivists, information scientists, publishers, and research funders in a lively forum.

NEW! Travel fellowships applications – Deadline December 1, 2014

Travel fellowships are available for students, postdocs and scholars from developing nations.

NEW! Flash Vision Session: Deadline December 1, 2014

This is your chance to get your idea on the map and help change the future of scholarly communication. Submissions are invited from anyone in attendance to give a 5 min, 3 slide pitch on your idea.

Call for Session Abstracts – Deadline November 15
Submit your abstracts for the “Valuing the Diversity of Scholarly Impact in a Networked World” and “Credit Where Credit is Due” Sessions.

Call for posters and demos – Deadline December 1, 2014
Show the future of research and how new technologies are shaping the way scholarship is being done, reported and pushed forward. Present a poster or demonstration.

Call for Pre-Conference Workshops
On the preceding day, 11 January 2015, there will be workshops, informal and formal collaborations, and business meetings associated with the main conference. You may submit requests for hosting a meeting on Sunday.

Registration – Discount registration Deadline December 15

We hope to see you there!

Big Data vs Small Data: Is it really about size?

Posted on October 31st, 2014 in Anita Bandrowski, Curation, Data Spotlight, Inside NIF, Interoperability

We have been hearing for some time that when it comes to data, it is all about size. The “bigger is better” mantra has been all over the press, but is it really size that matters?

There are the so-called “Big Data” projects such as the Allen Brain Atlas, which generates data, sans hypothesis, over the whole brain for thousands of genes. This is great because the goal of the project is to generate consistent data and not worry about which disease will or will not be impacted by each data point. That may be a great new paradigm for science, but there are not many projects like this “in the wild”.

Most data being generated in the world of science can be considered small, i.e., it would fit on a personal computer, and there are a LOT of labs out there generating this sort of data. So the question that we addressed in the recent Big Data issue of Nature Neuroscience is whether small data could organize to become big data. If such a thing is desirable, then what would be the steps to accomplish this lumping?

Here are the principles that we have extracted from working on NIF that we think will really help small data (from Box 2):

Discoverable. Data must be modeled and hosted in a way that they can be discovered through search. Many data, particularly those in dynamic databases, are considered to be part of the ‘hidden web’, that is, they are opaque to search engines such as Google. Authors should make their metadata and data understandable and searchable, (for example, use recognized standards when possible, avoid special characters and non-standard abbreviations), ensure the integrity of all links and provide a persistent identifier (for example, a DOI).

Accessible. When discovered, data can be interrogated. Data and related materials should be available through a variety of methods including download and computational access via the Cloud or web services. Access rights to data should be clearly specified, ideally in a machine-readable form.

Intelligible. Data can be read and understood by both human and machine. Sufficient metadata and context description should be provided to facilitate reuse decisions. Standard nomenclature should be used, ideally derived from a community or domain ontology, to make it machine readable.

Assessable. The reliability of data sources can be evaluated. Authors should ensure that repositories and data links contain sufficient provenance information so that a user can verify the source of the data.

Useable. Data can be reused. Authors should ensure that the data are actionable, for example, that they are in a format in which they can be used without conversion or that they can readily be converted. In general, PDF is not a good format for sharing data. Licenses should make data available with as few restrictions as possible for researchers. Data in the laboratory should be managed as if it is meant to be shared; many research libraries now have data-management programs that can help.
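
To make the five principles above a little more concrete, here is a minimal Python sketch of a metadata record and a check that reports which principles a data set does not yet meet. The class name, field names, and the specific checks are hypothetical choices for illustration. They are not taken from the Nature Neuroscience article or from any NIF tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical minimal metadata record for a small data set."""
    title: str
    persistent_id: Optional[str] = None  # e.g. a DOI (Discoverable)
    access_url: Optional[str] = None     # download or web service (Accessible)
    ontology: Optional[str] = None       # community vocabulary used (Intelligible)
    provenance: Optional[str] = None     # who produced the data and how (Assessable)
    file_format: Optional[str] = None    # e.g. "csv" (Useable)
    license: Optional[str] = None        # reuse terms (Useable)


def sharing_gaps(record: DatasetRecord) -> List[str]:
    """Return a list of principles the record does not yet satisfy."""
    gaps = []
    if not record.persistent_id:
        gaps.append("Discoverable: no persistent identifier")
    if not record.access_url:
        gaps.append("Accessible: no download or service location")
    if not record.ontology:
        gaps.append("Intelligible: no standard vocabulary named")
    if not record.provenance:
        gaps.append("Assessable: no provenance information")
    if not record.file_format or record.file_format.lower() == "pdf":
        gaps.append("Useable: no reusable file format")
    if not record.license:
        gaps.append("Useable: no license stated")
    return gaps


if __name__ == "__main__":
    draft = DatasetRecord(title="Example recordings", file_format="PDF")
    for gap in sharing_gaps(draft):
        print(gap)
```

A lab could run a check like this before depositing data. The point is only that each principle can be turned into something a machine can test, not that these particular fields are sufficient.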


RISE program at NMSU, highlighting success

Posted on October 21st, 2014 in Anita Bandrowski, Author, General information, News & Events

We at NIF were very fortunate to host Elba Serrano, the director of the RISE program at NMSU, this summer. Together we worked to develop coursework and improve outreach to students and faculty at New Mexico State. We believe that the scientific workforce in the coming decade will need to be aware of computational problems, statistics and bioinformatics resources. It is critical that we train this workforce, and the RISE program is exactly the right place to start.

We are thrilled that this collaboration has resulted in a set of workshops that will be held in the spring and a set of course modifications for NMSU students to start becoming aware of bioinformatics in the fall.

A news story outlining the success of the program can be found below; a 90% promotion rate of minority and disadvantaged students is truly something to be proud of!