Archive for the ‘Essays’ Category

Draft Declaration of Data Citation Principles: community comments are being sought

Posted on November 22nd, 2013 in Anita Bandrowski, Interoperability, News & Events

NIF is proud to support this important effort by members of FORCE11 (the future of scholarly communications) and requests that the NIF community comment.

Announcing the “Draft Declaration of Data Citation Principles.” The Data Citation Synthesis Group, 40 individuals from 25+ organizations, developed these draft principles over the past nine months and now welcomes feedback and comments from the community. The feedback received by the end of 2013 will be reviewed and incorporated into the final principles. Once the final principles are published, a mechanism will be in place for worldwide endorsement.

Thanks for your input.


Major Update of BAMS data

Posted on November 19th, 2013 in Anita Bandrowski, Curation, Data Spotlight

From our friends up the coast.

We are happy to announce a major BAMS update:

1. Data. More than 4,000 connection reports collated from the primary rat literature have been added. The number of publicly available connectivity reports accessible through both BAMS and BAMS2 now exceeds the 70,000 benchmark. Most of the newly added reports relate to cortico-cortical connections in the rat; consequently, the rat cortico-cortical matrix is better populated and is accessible through the “Connectomes” tab of the classic BAMS menu (http://brancusi.usc.edu/connectomes/standard_rat.php).

2. Interfaces and functionalities.
A. Many of BAMS’ users asked for a simpler way to view connection data, so we implemented an additional list view of the connections in classic BAMS. This interface is available in the “Reference” section of BAMS (searching for information by reference). A second request from users concerned exporting data from classic BAMS; consequently, the connection data retrieved by searches can now be downloaded by any user in Excel format. This new functionality was also implemented in the “Reference” section of BAMS.
Example: the connectivity data collated from Cenquizca & Swanson, 2007: http://brancusi.usc.edu/connections/conef-det2.php?ref=1653
We will expand these two new simple tools to the other BAMS modules and data types after feedback from users.
B. The connectivity matrices that can be constructed online allow inspection of detailed data by clicking on their colored squares. You can verify this for any of the matrices available at: http://brancusi.usc.edu/connectomes/standard_rat.php
C. A new functionally relevant rat brain circuit, the connections between the medial cortex and the amygdala, can be accessed online: http://brancusi.usc.edu/connectomes/custom_rat.php

3. Last but not least, the presence of BAMS at the 2013 SfN meeting is described, together with two very important neuroinformatics platforms, NIF and NeuroLex, in an article available at:
http://sfari.org/news-and-opinion/conference-news/2013/society-for-neuroscience-2013/online-tools-sift-through-storehouses-of-brain-data

Elsevier and the Neuroscience Information Framework Work Together to Improve Reporting of Research in Neuroscience Literature

Posted on November 7th, 2013 in Anita Bandrowski, Data Spotlight, Interoperability

I am very excited to share the following press release with the NIF community.

 

Elsevier recommends that authors follow the Minimal Data Standards

Amsterdam, November 7, 2013 – Elsevier, a world-leading provider of scientific, technical and medical information products and services, announces its collaboration with the Neuroscience Information Framework (NIF) to incorporate the Minimal Data Standards across four of its neuroscience journals.

Minimal Data Standards are a set of recommendations developed by NIF, the most comprehensive portal of available web-based resources in the field of neuroscience, to facilitate resource identification in published neuroscience articles. One of the big challenges that neuroscientists face today is that research findings reported in the literature often lack sufficient detail to enable reproducibility of the methodology or reuse of the data. With the launch of the Minimal Data Standards, NIF aims to address this issue.

Elsevier is one of the first scholarly publishers to adopt the Minimal Data Standards guidelines. Initially, four Elsevier journals will take part in the pilot: Brain Research, Experimental Neurology, Journal of Neuroscience Methods, and Neurobiology of Disease. These journals will incorporate the guidelines into their article submission process, recommending that authors include gene/genome accession numbers, species-specific nomenclature, antibody identifiers, and software details in the methods sections of their articles. More Elsevier neuroscience journals will join the initiative as the pilot develops further in 2014.

Prof. Maryann Martone, Professor of Neuroscience at the University of California San Diego, and Executive Director of the Future of Research Communications and e-Scholarship (FORCE11), said, “Scientific reproducibility starts with materials and methods. We are pleased to work with Elsevier to help neuroscientists make their methods more understandable for not only humans but also machines. This pilot is a step towards changing the way we write papers to take advantage of 21st century technology for searching and linking across vast amounts of information.”

Michael Osuch, Publishing Director for Neuroscience & Psychology at Elsevier said, “With our support for the Minimal Data Standards, we aim to make it easier for the community to identify the key resources used to produce the data in published studies. Neuroscience is a highly multi-disciplinary field with thousands of relevant web-based resources and data repositories. Direct linking to all of them would have been impossible without NIF’s capacity to serve as a central portal.”

Supporting NIF in rolling out the Minimal Data Standards pilot, with the aim of developing better and more accurate resource identification within the neuroscience literature, falls within the scope of the Article of the Future, Elsevier’s ongoing program to improve the format of the scientific article.
# # #


About the Neuroscience Information Framework

An initiative of the NIH Blueprint for Neuroscience Research, the Neuroscience Information Framework (NIF) advances neuroscience research by enabling discovery of and access to public research data and tools worldwide through an open source, networked environment. In addition to providing access to over 200 neuroscience-relevant databases and data sets, NIF hosts millions of annotations on the literature, which include information about the reagents used in a paper, links to data, and comments about the data or arguments presented in the paper.

About Elsevier
Elsevier is a world-leading provider of scientific, technical and medical information products and services. The company works in partnership with the global science and health communities to publish more than 2,000 journals, including The Lancet and Cell, and close to 20,000 book titles, including major reference works from Mosby and Saunders. Elsevier’s online solutions include ScienceDirect, Scopus, SciVal, Reaxys, ClinicalKey and Mosby’s Suite, which enhance the productivity of science and health professionals, helping research and health care institutions deliver better outcomes more cost-effectively.

A global business headquartered in Amsterdam, Elsevier employs 7,000 people worldwide. The company is part of Reed Elsevier Group plc, a world leading provider of professional information solutions. The group employs more than 30,000 people, including more than 15,000 in North America. Reed Elsevier Group plc is owned equally by two parent companies, Reed Elsevier PLC and Reed Elsevier NV. Their shares are traded on the London, Amsterdam and New York Stock Exchanges using the following ticker symbols: London: REL; Amsterdam: REN; New York: RUK and ENL.

Media contact
Shamus O’Reilly
Publisher Neuroscience
Elsevier
+44 1865 843651
s.oreilly@elsevier.com

Resource Identification Guidelines – now at Elsevier

Posted on September 6th, 2013 in Anita Bandrowski, Curation, Interoperability, NIFarious Ideas

The problem of reproducibility of results has been attributed by many groups to practices such as scientists working with very large data sets and highlighting the interesting, yet most likely statistically anomalous, findings, along with other scientific no-no’s like reporting only positive results.

Our group has been working to improve the reporting of methods and reagents, and I am happy to report that these ideas have been finding resonance.

Under the sponsorship of FORCE11, a group of researchers, reagent vendors, and publishers has been meeting to discuss how best to accomplish better reporting across the literature, and both the NIH and publishers themselves are now taking an interest in its success. The latest and greatest evidence of this can be found on the Elsevier website as a guideline to authors; this will soon be followed by a pilot project, to be launched at the Society for Neuroscience meeting, involving over 25 journals and most major publishers.

Of course, there is no reason to wait for an editor to ask before including catalog numbers or stock numbers for transgenic animals. These are things we should be trained to do in graduate school as good practice for reporting our findings.

We seem to be getting ready to change (or change back) to more rigorous methods reporting, which should strengthen the recently eroded credibility of the scientific enterprise. I, for one, hope that the message communicated will be: “scientists don’t hide problems, even endemic ones; we examine them and find workable solutions”.

There is a Link between literature and data; it has been there for years, but nobody ever found it

Posted on July 10th, 2013 in Anita Bandrowski, Curation, Data Spotlight, Force11, Interoperability, NIFarious Ideas

The NIH recently issued a request for information about the NIH Data Catalog, to which our group and many others have responded. Many voices, including fairly important ones from the White House, are now calling for making scientific research data open, available, and linked to the publications written about those data. This is a very good thing. It should lead to better handling and comparison of data, and to better science.

However, sitting in many recent meetings with members of various national libraries, who shall remain nameless, I have been astounded to learn that not only scientists but also librarians have never found the LinkOut feature in PubMed.

LinkOut is a little option at the bottom of every article record in PubMed, hidden by the good staff in complete obscurity; please see the screenshot below if you don’t believe me that such a feature exists.

[Screenshot: the LinkOut links listed at the bottom of a PubMed article record]

The article above links to two data sets: one from OMIM.org, a curated set of annotations linking genes to genetic disorders, and the other from antibodyregistry.org, a set of statements about the antibody reagents used in the paper. Links from other papers lead to computational model code described in the paper, activation foci, or data repositories.

Although the feature is certainly rarely used, the model organism communities, data repositories, and researchers have been diligently adding their data to PubMed in the form of links. We may quibble about the fact that PubMed asks many of us to reduce specific links to data to generic links that lead to another version of the same article, but the fact is that the links to data are present! Because they are present, if the National Library of Medicine ever decides to search them, export them, or simply acknowledge their existence, they would be a treasure trove of data-to-literature links that would not require a huge new investment in infrastructure.
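
In fact, the plumbing to read these links back out already exists: NCBI’s E-utilities expose LinkOut through the ELink service (cmd=llinks). Below is a minimal, hedged sketch of pulling the LinkOut URLs attached to a single PubMed record; the PMID is only a placeholder, and the exact XML element names reflect my reading of the ELink output and should be checked against the current E-utilities documentation.

```python
# Hedged sketch: list LinkOut provider links for one PubMed record via the
# NCBI E-utilities ELink endpoint (cmd=llinks). The PMID is a placeholder,
# and the XML element names should be verified against the current ELink docs.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"

def linkout_urls(pmid):
    """Return (provider, url) pairs that LinkOut exposes for a PubMed record."""
    query = urllib.parse.urlencode({"dbfrom": "pubmed", "id": pmid, "cmd": "llinks"})
    with urllib.request.urlopen(f"{EUTILS_ELINK}?{query}") as response:
        root = ET.fromstring(response.read())
    links = []
    for obj_url in root.iter("ObjUrl"):  # one element per LinkOut provider link
        provider = obj_url.findtext("Provider/Name", default="unknown provider")
        url = obj_url.findtext("Url", default="").strip()
        links.append((provider, url))
    return links

if __name__ == "__main__":
    for provider, url in linkout_urls("12345678"):  # placeholder PMID
        print(f"{provider}: {url}")
```

Nothing in this sketch requires new infrastructure; it simply reads back the links that curators and resource providers have already deposited.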

I am not suggesting that our infrastructure could not be upgraded (in fact, we have many more technical gripes that I will not bring up here), but I am suggesting that we all take advantage of the massive investment of time and energy by curators and authors over the last decades to meticulously link their data or data repositories to the literature.

The LinkOut broker has helped NIF aggregate a list of about 250,000 links from roughly 40 databases, but PubMed must hold a much, much larger set. The links provided by NIF can be searched through the NIF site, filtered by category and by database, and extracted and embedded into other sites such as ScienceDirect (see neuinfo.org/developers). Of this quarter million links that we provide to PubMed, only between 100 and 200 users find them per month. I think that we can and should do better.

  • We can ask that PubMed make links to data prominent.
  • We can ask that any links in PubMed be of good quality, e.g., results of text-mining should not be included without verification by authors or curators.
  • We can ask that the links show actual data, as opposed to a representation of the paper on another site (as is currently required).

If you feel the sudden urge to be an arm-chair activist, then please let PubMed know that it would be nice if they celebrated the current links between data and publications instead of hiding them.

How long does it take to get a resource into NIF? The case of the Open Source Brain.

Posted on June 4th, 2013 in Anita Bandrowski, Data Spotlight, Force11, Inside NIF, Interoperability

Believe it or not, there really is a project called the Open Source Brain, and it is a wonderful community of hackers that attempts to do very novel things with open-source models, mainly in a format called NeuroML.

What is the Open Source Brain?

Well, it takes models, converts them into cool visualizations, and then allows users to manipulate them in the browser, with functionality similar to Google Body. The hope is to harness some significant computational power from the Neuroscience Gateway’s massive clusters so that the pretty pictures can be fully functional, but for now this is a great way of exploring three-dimensional neurons and connectivity.

[Screenshot: an Open Source Brain model visualization running in the browser]

But the reason I am blogging about this project is not just the “ooohh-aaaahhh” factor that nice graphics usually have on me, but also the fact that this resource came to NIF in an interesting way: by a human flying in from London on his way to another meeting. Unfortunately, until last week we did not know about the Open Source Brain, but Padraig knew about NIF and wanted to register the project, hoping to integrate his data or at least “get the process started”.

At 10:30 am we were sufficiently caffeinated to begin and created a registry entry, from which we obtained an identifier.

The identifier was then used to create a sitemap entry in the DISCO database (essentially, anyone who has logged in to NeuroLex can do this by clicking a button at the bottom of a curated registry entry).

Then we added an “interop” file, which instructs our crawler to put the XML data output by Open Source Brain into our local data warehouse, making sure to specify the appropriate tables and columns.

Then we went to lunch, came back after fighting much larger crowds at the Indian place than we expected before finals, and created the “view” of the data (basically, we wrote a SQL statement and used our concept mapping tool to define what data would be displayed).
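
For readers wondering what that last step amounts to, here is a minimal, self-contained sketch of the ingest-then-view idea using an in-memory SQLite database. The feed contents, table, and column names are all invented for illustration; the real DISCO interop configuration and concept mapping tool are NIF-internal and not shown here.

```python
# Hypothetical sketch of "crawler loads harvested XML, then a SQL view defines
# what gets displayed". All names and records below are invented for illustration.
import sqlite3
import xml.etree.ElementTree as ET

# Pretend this is the XML the crawler harvested from the registered resource.
sample_xml = """
<models>
  <model id="osb-001" name="CA1 pyramidal cell" format="NeuroML" species="rat"/>
  <model id="osb-002" name="Granule cell network" format="NeuroML" species="mouse"/>
</models>
"""

conn = sqlite3.connect(":memory:")  # stand-in for the local data warehouse
conn.execute("CREATE TABLE osb_models (id TEXT, name TEXT, format TEXT, species TEXT)")

# The crawler's job: load the XML records into the warehouse table.
for model in ET.fromstring(sample_xml):
    conn.execute(
        "INSERT INTO osb_models VALUES (?, ?, ?, ?)",
        (model.get("id"), model.get("name"), model.get("format"), model.get("species")),
    )

# The "view" of the data: one SQL statement choosing which columns are shown.
conn.execute(
    "CREATE VIEW osb_model_view AS "
    "SELECT name AS model_name, species, format FROM osb_models"
)

for row in conn.execute("SELECT * FROM osb_model_view"):
    print(row)
```

In the real pipeline the SQL and column mappings point at the warehouse tables the crawler populated, but the shape of the work is the same: load the harvested records, then write one statement that defines what gets displayed.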

By 3:30 pm we had a view deployed. Well, okay, we did have to import the data twice because we messed up the file once, and this deployment was on the beta server, so we had to wait until Friday night to update production, but that is still pretty darn fast in my opinion.

The question for many people who have data has been: how much effort will it take to make my data interoperable with community resources? For the first time ever, we can report an answer… it will only take a couple of hours (we should insert many caveats here).

We have A LOT of neuroscience information, and would like to share….

Posted on May 14th, 2013 in Curation, Inside NIF, Jonathan Cachat

Over the past four years, the Neuroscience Information Framework has systematically scanned the literature, the internet, and social buzz for all things neuroscience (and biomedical science). This tedious bookkeeping has resulted in the largest, most comprehensive catalog of neuroscience-relevant information ever amassed – with the added bonus of semantically enhanced search functions. And now we would like to share it with you via myNIF… but before those details…

What do we mean by “neuroscience information”?

Neuroscience information includes data, resources, literature, grants, multimedia, social buzz, a lexicon, and more…

Data: Over 140 independent databases (e.g., CCDB, Grants.gov, GENSAT) are deeply indexed and semantically mapped by NIF, representing over 400 million pieces of data. These data are considered part of the “hidden web”, not indexed by major search engines, because retrieving them requires specialized database queries that reach inside the database rather than the surface pages surrounding it. NIF has developed technologies to regularly re-crawl and update this data content, index it, and provide search within the contents of all of these databases simultaneously. Moreover, data returned by a search can be exported with a single click into standard data formats for subsequent analysis. This can simply save you time: if you need to know what types of serotonin receptors have been classified in zebrafish (Danio rerio), searching NIF for ‘zebrafish serotonin receptor’ provides results from authoritative data providers (HomoloGene, EntrezGene) that can be compared instantly, rather than visiting each site separately and comparing through notes, multiple windows, or several downloads. In addition to this primary information, the results also include related, and sometimes very helpful, information about zebrafish and serotonin: signaling pathways, antibodies, and grant information.

Resources: Need to find a software analysis package for microarray data? NIF can recommend 41 options, as well as 100+ unique organizations, centers, labs, and websites with similar interests. Looking for non-governmental funding of ALS research? Here are 7. What about a tissue bank with Alzheimer’s disease CNS tissue samples available for researchers? NIF is aware of around 88 worth a look. All of this is to convey that a resource is an object or entity, with a website, that provides potential value to neuroscience research or researchers. Importantly, this catalog of resources indexed by NIF is maintained at NeuroLex, a semantic MediaWiki website. It is homologous to Wikipedia, in that anyone can contribute their own resource or favorite resources, but it is endowed with reasoning capabilities that permit logical queries over the relationships between data (e.g., list all GABAergic neurons).

How Do You Evaluate a Database?

Posted on May 3rd, 2013 in Author, Essays, Force11, Maryann Martone, News & Events, NIFarious Ideas

by Maryann E Martone

I was speaking with a colleague recently who, like many of us, had experienced the frustration of trying to support his on-line resources. He has assembled a comprehensive on-line resource; it is used by the community and has been used by others to publish their studies. It is not GenBank or EBI; it is one of the thousands of on-line databases created by individuals or small groups that the Neuroscience Information Framework and others have catalogued. My colleague has spent years on this resource, pored over hundreds of references, and entered close to a million statements into the database. By many measures, it is a successful resource. But in the grant review, he was criticized for not having enough publications. I experienced the same thing in a failed grant for the resource that I had created, the Cell Centered Database. In fairness, that was not the most damning criticism, but it just seemed so very misplaced. I had succeeded in standing up and populating a resource well before there was any thought of actually sharing data. People used the database and published papers based on it, but apparently I should have been spending more time writing about it and less time working on it.

The problems of creating and maintaining these types of resources are well known and were discussed at Beyond the PDF2:  to be funded, you have to be innovative.  But you don’t have to be innovative to be useful.  To quote or paraphrase Carole Goble at the recent conference,  “Merely being useful is not enough.”

But presumably there is a threshold of perceived value at which “merely being useful” is enough. I am thinking of the Protein Data Bank or PubMed. These resources are well funded and well used, but hardly innovative. I am guessing that many resources like the ones my colleague and I created were started with the hope that they would be as well supported and as integral to people’s work as the PDB or PubMed. But the truth is, they are not in the same class. They are still valuable, however, and represent works of scholarship. We are now allowed to list them on our NSF biosketches. So my question to you is: how do we evaluate these thousands of smaller databases?

Ironically, our peers have no trouble evaluating an article about our databases, but they have much more trouble evaluating the resource itself. How does one weigh 30,000 curated statements against one article? What level of page views, visits, downloads, and citations makes a database worthwhile? If my colleague had published 10 papers, the reviewers likely wouldn’t have checked how often they were cited, particularly if they were recent. What is the equivalent of a citation classic for databases? If you don’t have the budget of NCBI, what level of service can you reasonably expect from these databases? I thought that the gold standard was a published study, by a group unconnected to you, that used your database to do something else. Grant reviewers found that unconvincing. Perhaps I didn’t have enough of them? But how many of these do you need, relative to the size of your community, and on what time frame should you expect them to appear? Sometimes studies take years to publish. Do they need to come from the community that you thought you were targeting (and whose institute may have funded your resource), or does evidence from other communities count?

So perhaps if we want to accept databases and other artefacts in lieu of the article, we should help define a reasonable set of criteria by which they can be evaluated.  Anyone care to help here?

What is the Cerebral Cortex?

Posted on January 14th, 2013 in Anita Bandrowski, Curation, Essays, Force11, Interoperability, News & Events

by Anita Bandrowski,

This may seem a silly question, but let’s see if you are more like a fifth grader or more like me. It appears that a fifth-grade class I recently interacted with can answer a question that I am having a lot of trouble with. They rattle off “the outside part of the brain”. True enough.
They can point to it; it’s the part that is “squiggly”. True enough.
“It is the part that thinks”. OK, we can go with that answer.

So why are these fifth graders smarter than I am? Pun intended.


Why I started blogging – a scientist’s perspective

Posted on December 19th, 2012 in Essays, Force11, Maryann Martone, News & Events, NIFarious Ideas

by Maryann Martone

A recent post at the London School of Economics Social Science Impact blog on “Finding the time to blog” reminded me that I wanted to write a post about why I started to blog. The use of social media and its proper place in academic communications is being discussed in many circles. Over at FORCE11, we aggregate quite a few blog feeds, like the one from LSE, where these issues are thoroughly covered. I wanted, however, to share a personal perspective. Like many scientists, I suspect, I was at first reluctant to blog. I did write a few posts for the NIF blog when we started it up, but then stopped because “it takes too much time”. Each post took me several weeks before I was happy with it and, as is well advertised, blogs don’t count towards academic promotion, etc. So if I was going to spend that amount of time, I might as well spend it on something that does count: writing papers, giving talks, training, teaching, networking and, oh, doing research. Besides, who would want to hear what I had to say?

Well, the astute reader might have noted that many of our rewarded activities involve someone (funders, conference organizers, students) actually paying to hear what we have to say. And, the astute reader might also note that a blog is a much more effective communication vehicle than most of these for accomplishing these tasks. I started to blog for real when I realized that a blog is my communication with the world. A lot of money has been invested in me as a vehicle for knowledge acquisition and integration. The more I share that with the world, the better I do my job. A blog is not a learned treatise which needs to carefully consider all angles, acknowledge all references in a specified format and go through rounds and rounds of editing to craft the language so as to offend nobody with unsupported statements. A blog is a written yet highly interactive version of the type of conversation I engage in every day with students, colleagues, audiences. It is my thoughts on a topic, developed over a lifetime of active inquiry, open to correction and discussion. You can believe them or not, just as you choose to believe them when I am speaking to you in an informal or formal setting.

But unlike these other forms of transient communication, where my words evaporate into the air, blogs live on the net. They are searched by Google, so they can be found easily. And they are living things, open to comment, discussion, updating. Once I realized what a blog could be, I could fire one off in a matter of minutes. Do I get some things wrong? Sure. But isn’t that why we communicate with each other in science, so we can try to put our thoughts in order in a way where flaws can be exposed? It was a magical moment when I read over a blog that I had posted earlier and realized that I had left out a part of the argument. Oh no! But then I just opened edit and put it in. But what if I misrepresent some part of an argument or forget to acknowledge someone? Isn’t that why we have peer review? Well, if you want peer review, just read the comments. Usually, someone will correct you if they care enough. And again, you can immediately acknowledge that input and modify your posting or post a new one. So rather than blogging taking me away from my job, I actually think it lets me do it better. It is a freeing form of communication. Scientists generally are interesting people, but you would never know it from the articles they produce. But you do when you get them talking. And that, imho, is what a blog should be: scientists talking for everyone’s benefit.