Archive for the ‘Force11’ Category

How Do You Evaluate a Database

Posted on May 3rd, 2013 in Author, Essays, Force11, Maryann Martone, News & Events, NIFarious Ideas | 3 Comments »

by Maryann E Martone

I was speaking with a colleague recently who, like many of us, had experienced the frustration of trying to support his on-line resources. He has assembled a comprehensive on-line resource; it is used by the community, and others have drawn on it to publish their studies. It is not GenBank or EBI; it is one of the thousands of on-line databases created by individuals or small groups that the Neuroscience Information Framework and others have catalogued. My colleague has spent years on this resource, pored over hundreds of references, and entered close to a million statements in the database. By many measures, it is a successful resource. But in grant review, he was criticized for not having enough publications. I experienced the same thing in a failed grant for the resource that I had created, the Cell Centered Database. In fairness, that was not the most damning criticism, but it just seemed so very misplaced. I had succeeded in standing up and populating a resource well before there was any thought of actually sharing data. People used the database and published papers based on it, but apparently I should have been spending more time writing about it and less time working on it.

The problems of creating and maintaining these types of resources are well known and were discussed at Beyond the PDF2:  to be funded, you have to be innovative.  But you don’t have to be innovative to be useful.  To quote or paraphrase Carole Goble at the recent conference,  “Merely being useful is not enough.”

But presumably there is a threshold of perceived value where “merely being useful” is enough. I am thinking of the Protein Data Bank or PubMed. These resources are well funded and well used, but hardly innovative. I am guessing that many resources like the ones my colleague and I created were started with the hope that they would become as well supported and as integral to people’s work as the PDB or PubMed. But the truth is, they are not in the same class. They are still valuable, though, and represent works of scholarship. We are now allowed to list them on our NSF biosketch. So my question to you is: how do we evaluate these thousands of smaller databases?

Ironically, our peers have no trouble evaluating an article about our databases, but they have much more trouble evaluating the resource itself. How does one weigh 30,000 curated statements against one article? What level of page views, visits, downloads, and citations makes a database worthwhile? If my colleague had published 10 papers, the reviewers likely wouldn’t have checked how often they were cited, particularly if they were recent. What is the equivalent of a citation classic for databases? If you don’t have the budget of NCBI, what level of service can you reasonably expect from these databases? I thought the gold standard was a published study, by a group unconnected to you, that used your database to do something else. Grant reviewers found that unconvincing. Perhaps I didn’t have enough of them? But how many do you need, relative to the size of your community, and on what time frame should you expect them to appear? Sometimes studies take years to publish. Do they need to come from the community you thought you were targeting (and whose institute may have funded your resource), or does evidence from other communities count?

So perhaps if we want to accept databases and other artefacts in lieu of the article, we should help define a reasonable set of criteria by which they can be evaluated.  Anyone care to help here?

Lab Data Management Practices?

Posted on April 28th, 2013 in Force11, Jonathan Cachat, NIFarious Ideas | 2 Comments »

A number of groups, from libraries and universities to academic projects, are striving to implement flexible data management systems that harness the latest semantic web technologies to integrate data and facilitate breakthrough interdisciplinary analysis.

Every lab and every individual research group, regardless of discipline, has developed an internal data management system that “works” (e.g., literature & data collection > Excel > stats > graphing > word processor). But what has your lab found useful, and what are your biggest frustrations?
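For concreteness, here is a minimal sketch of what a scripted version of that informal pipeline might look like in Python. The file name and column names (measurements.csv, group, value, control, treated) are made-up placeholders, and the snippet is an illustration of one possible stack (pandas/SciPy/matplotlib), not a recommendation.

# Hypothetical, minimal stand-in for the
# "data collection > Excel > stats > graphing" pipeline.
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

# Load a CSV exported from the lab's spreadsheet (placeholder file name).
df = pd.read_csv("measurements.csv")  # assumed columns: group, value

# Descriptive statistics per experimental group.
summary = df.groupby("group")["value"].agg(["count", "mean", "std"])
print(summary)

# Simple two-group comparison, assuming groups named "control" and "treated".
control = df.loc[df["group"] == "control", "value"]
treated = df.loc[df["group"] == "treated", "value"]
t_stat, p_value = stats.ttest_ind(control, treated, equal_var=False)
print("Welch t = %.2f, p = %.3g" % (t_stat, p_value))

# Graphing step: save a boxplot alongside the data.
df.boxplot(column="value", by="group")
plt.savefig("value_by_group.png", dpi=150)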

Please feel free to comment below, or join the discussion on ResearchGate.


Nature seeing a trend?

Posted on March 31st, 2013 in Anita Bandrowski, Force11, Inside NIF, News & Events | No Comments »

Although the pace of science often seems glacial, returning from the Beyond the PDF2 meeting I have been struck by how much scientific publishing has changed in the 18 months since the first Beyond the PDF meeting.

Not many scientific meetings really are transformative, so this one has been a rare gem in the great sea of ‘same old’. It seems that major publishers (most of whom were present at the meeting) have also started to pay attention.

This week Nature has used its influence to comment on this very issue in the form of various editorials.

The editorials address open access to the scientific literature, which was a big topic of conversation at the last Beyond the PDF meeting but was conspicuously absent this time. Open access is here; we still need to work out some kinks, as Nature points out, but essentially it is solved. OK, perhaps not solved, especially when it comes to access by machines (e.g., statistical algorithms), and publication costs are still too high for the humanities and for the developing world. But on the question of whether we should have access, the jury is in and implementation awaits. So now that anyone will have access to an increasing share of the world’s publications, what comes next?

My read of the next steps is largely what the meeting kept oscillating between: data vs. prose. The data side seems to believe that data should become a publication in its own right, because prose is a highly imperfect way of communicating about data: if we all had access to everyone’s raw data, then all would be well. The other side (including the advocates for pen and paper) also has an interesting point: it turns out that the human species has evolved to communicate using language, not statistical inference.

My analysis of this meeting: yes, you will have access to the scientific literature, but the problem of data is far from solved; in fact, the problem of data is only now starting to be considered. I do not sit on the sidelines of this discussion: at NIF we have data and we compare data from different sources. Even so, I wonder whether the absence of prose will prove as great a problem as prose alone.

Top 25 for March

Posted on March 15th, 2013 in Force11, Inside NIF, News & Events | No Comments »

Top 25 Databases:
1. Grants.gov/Opportunity
2. AntibodyRegistry/ABs
3. BioNOT/Negation
4. ABCD/Brain Regions
5. AmiGO/Genes
6. BCBC/ABs
7. BioGRID/Interactions
8. NIF Registry/Info
9. AddGene/Plasmids
10. BrainMaps/Atlas
11. ClinicalTrials/ClinTr
12. NIF Integrated Disease/Info
13. BAMS/Brain Regions
14. BrainInfo/Brain Region
15. AllenInstitute/MouseBrainAtlas
16. RePORTER/CurrentNIHGrants
17. Allenbrain/Atlas
18. HumanBrainAtlas/Michigan
19. CellImageLibrary/CIL
20. ModelDB/Models
21. OMIM/Genes
22. BREDE/Activation Foci
23. ResearchCrossroads/Grants
24. SumsDB/Activation Foci
25. NIF Integrated CT/Registry

Top 25 Search terms:
1. antibodyregistry
2. database
3. “Anterior Nuclear Group”
4. “Drug Related Gene Database”
5. “Forebrain”
6. Brodmann area 44
7. “Cerebellum”
8. “Gene Ontology Tools”
9. cerebellum
10. hippocampus
11. “Fusiform Gyrus”
12. “Mitral Cell”
13. “Oligodendrocyte Precursor Cell”
14. “Brainstem”
15. “Flocculonodular Lobe”
16. “Fornix”
17. “Frontal Lobe”
18. “Hypothalamus”
19. “Medium Spiny Neuron”
20. “Basal Ganglia”
21. “Photoreceptor Cell”
22. gene
22. “Amygdala”
23. “Cerebral Crus”
24. “Cerebrum”
25. “Motor Neuron”

Call for INCF Abstract Submissions

Posted on March 14th, 2013 in Force11, General information, News & Events | No Comments »

6th INCF Congress of Neuroinformatics, Stockholm, Sweden, August 27–29, 2013

Call for abstract submissions

Neuroinformatics 2013 is now accepting abstracts; submission this year is hosted by Frontiers in Neuroscience. When you submit your abstract, please indicate whether you would like to be considered for an oral presentation by choosing “Oral presentation”.

Submit your abstract here!

We are accepting submissions until April 8th.

NIF Announcement Team

INCF.org

Call for papers: Frontiers in Neuroinformatics special topic “Recent advances and the future generation of neuroinformatics infrastructure”

Posted on March 3rd, 2013 in Force11, General information, News & Events | No Comments »

Call for papers:

Frontiers in Neuroinformatics special topic: “Recent advances and the future generation of neuroinformatics infrastructure”

Topic Editors: Xi Cheng, Daniel R. Weinberger, Daniel Marcus, John Van Horn, Venkata S. Mattay, Qian Luo

Deadline for Abstract Submission: 31 Jul 2013

Deadline for Article Submission: 31 Oct 2013

The huge volume of multi-modal neuroimaging data across different neuroscience communities has posed a daunting challenge to traditional methods of data sharing, data archiving, data processing, and data analysis. Neuroinformatics plays a crucial role in creating advanced methodologies and tools for handling varied and heterogeneous datasets in order to better understand the structure and function of the brain. These tools and methodologies not only enhance the collection, analysis, integration, interpretation, modeling, and dissemination of data, but also promote data sharing and collaboration. This Neuroinformatics special topic aims to summarize the state of the art of current achievements and to explore directions for the future generation of neuroinformatics infrastructure. We welcome submissions from the following four topic areas: 1) data archiving, 2) data processing and workflow, 3) data mining, and 4) system integration methodologies. The data archiving section focuses on novel methods for the efficient collection, storage, and querying of huge volumes of neuroimaging data, genetics data, and all other related measures, as well as metadata and processing results. The data processing and workflow section emphasizes methods that facilitate large-scale parallel data processing tasks in heterogeneous computational environments. The data mining section emphasizes novel data analysis methodologies that extract meaningful information from the data. Finally, the system integration section focuses on best practices and novel approaches for integrating multiple systems (database, workflow, warehouse, etc.). More specifically, we invite a spectrum of subtopics from the four areas, including but not limited to:

1. Data archiving

1.1. Novel data annotation and visualization

1.2. Archiving automation/simulation/de-identification/compression/quality control/ontology, etc.

1.3. Novel storage architectures (e.g., grid, web storage, XML, OO, …, etc.) and search engines

1.4. Data warehousing

2. Data processing and workflow

2.1. Novel neuroimaging data processing algorithms, packages, and libraries

2.2. Distributed/parallel algorithm design (cluster/GPU/FPGA/cloud/HPC, etc.)

2.3. Neuroimaging workflow design

2.3.1. Visualization/annotation

2.3.2. Workflows for distributed/collaborative environments

2.3.3. Provenance and metadata management

3. Data mining

3.1. Novel multi-modal integration methods

3.2. Feature extraction and visualization methods

3.3. Novel machine learning methods

3.4. Novel applications (biomarker detection/clinical application/imaging-genomics/brain-computer interface/publication mining, etc.)

4. System integration methodologies, frameworks, architectures, and best practices

4.1. Integration of different software components, e.g., database/data warehouse/workflow system/data analysis packages and libraries

4.2. Cross-platform, cross-modality integration

4.3. Cross-domain integration

4.3.1. Integration of imaging databases with other types of databases

4.3.2. Integration of imaging workflows with other types of workflows

4.4. Distributed system integration

4.4.1. Data synchronization/query/transaction control/access control/single sign-on/processing

More Information:

http://www.frontiersin.org/neuroinformatics/researchtopics/recent_advances_and_the_future/1577

If you have questions regarding whether your research fits within the scope of this special topic, please send inquiries to infrastructure.neuroinfo@gmail.com.

How to define an Action Potential?

Posted on February 26th, 2013 in Anita Bandrowski, Force11, Inside NIF, News & Events, NIFarious Ideas | No Comments »

Dear electrophysiologists,

If you could spare about 5 minutes of your time, the INCF electrophysiology task force is attempting to survey the landscape of definitions of the action potential. The main interest is to determine how working physiologists view it: is it an event, a property of the membrane, or a process?

The following link will take you to a Google survey; click here or follow this link:

https://docs.google.com/forms/d/1RjFbxxQ1APZ-wZ3qRudgLMF7B6x6C0E5QhATqd4JYxQ/viewform?sid=afb1a6cad01cf6e&token=3MImFj0BAAA.ISZ-71P88ISEhUfd4WLe_Q.mHMDtq5u5_29UwuZ1EYquw

The results will be stored in a Google spreadsheet, and the answers will be used to design a relevant knowledge model for describing electrophysiology data with the Experimental Neurophysiology Ontology (ENO).
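As a purely hypothetical illustration of why that distinction matters for a knowledge model, the short sketch below uses rdflib to express the three candidate readings as alternative class assertions in a toy namespace; none of the terms or URIs are taken from ENO.

# Hypothetical sketch: three alternative ways a knowledge model could
# classify "action potential". Namespace and class names are invented.
from rdflib import Graph, Namespace, RDF, RDFS, Literal

EX = Namespace("http://example.org/ephys#")
g = Graph()
g.bind("ex", EX)

g.add((EX.ActionPotential, RDF.type, RDFS.Class))
g.add((EX.ActionPotential, RDFS.label, Literal("action potential")))

# Option 1: an event (something that happens at a point in time).
g.add((EX.ActionPotential, RDFS.subClassOf, EX.Event))
# Option 2: a property of the membrane (uncomment to model it this way).
# g.add((EX.ActionPotential, RDFS.subClassOf, EX.MembraneProperty))
# Option 3: a process (something that unfolds over an interval).
# g.add((EX.ActionPotential, RDFS.subClassOf, EX.Process))

print(g.serialize(format="turtle"))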

Thank you in advance for any help you can provide.

Introduction to the Neuroscience Gateway Workshop

Posted on February 17th, 2013 in Data Spotlight, Force11, News & Events | No Comments »

Introduction to the Neuroscience Gateway Workshop

When: 10:00 – 12:00 PDT, 14 Mar, 2013

This workshop introduces participants to the Neuroscience Gateway (NSG), which is being developed for the computational neuroscience community with funding from the National Science Foundation. The NSG provides an easy, user-friendly environment for accessing XSEDE high performance computing (HPC) resources to run computational neuroscience software such as NEURON, GENESIS, and MOOSE. The NSG’s simple web-based interface makes it quick and easy to create an account, upload neuronal model code, run simulations, and get back results.

The workshop will begin with an introduction to the NSG. This will be followed by a hands-on session in which attendees will use the NSG portal to upload neuronal models, set appropriate HPC parameters, run simulations with NEURON on XSEDE HPC resources, monitor simulation status, and retrieve output results.

Agenda:

10:00 AM – 10:45 AM (PDT): Overview of NSG
• Background of NSG
• Software architecture of NSG
• Functionality of NSG
• How to use NSG
10:45 AM – 11:45 AM (PDT): Hands on usage of NSG
• Login
• Neuronal model upload
• Setting parameters
• Submission of simulation to XSEDE HPC resource
• Retrieval (and storage) of output results
11:45 AM – 12:00 PM (PDT): Questions and Discussion
————————————————————-
More information about the Neuroscience Gateway:

http://www.nsgportal.org

This training session is offered at no charge.
Interested users are encouraged to register at:
https://www.xsede.org/web/xup/course-calendar/-/training/class/109 (for onsite attendees)
https://www.xsede.org/web/xup/course-calendar/-/training/class/108 (for online attendees)

Registration opens 9:00 PST, 12 Feb, 2013
and closes 16:00 PST, 8 Mar, 2013

NIF Webinar: Feb 19th 11 am PDT: Sage Bionetworks discusses ClearScience

Posted on February 12th, 2013 in Force11, News & Events, Webinar Announcement | No Comments »

Please join us to discuss the future of scientific communication as seen by Sage Bionetworks.

Title: clearScience: Dragging Scientific Communication into the Information Age
Presenters: Erich Huang & Brian Bot (Sage Bionetworks)
Date: February 19th, 2013; 11 am PDT
Where: http://connect.neuinfo.org/webinar

Abstract: Scientific communication must be re-engineered. Complex analyses from biomedical research are outpacing the means to convey them effectively. Imprisoning the insights gleaned from these data in a few two-dimensional representations is wholly inadequate for transmitting the complexity of “big science” to our peers and the public. Numerous editorials and papers, as well as the cancer genomics scandal at Duke University, highlight the need for infrastructure that supports reproducible and transparent science. When only 11% of landmark cancer studies can be independently confirmed, it is clear that serving our patients requires a new standard of openness and reproducibility. If scientific progress depends on our being able to communicate our science effectively so that the community can build upon it, the evidence shows that we need to improve.

As a not-for-profit research organization, Sage Bionetworks is charged with exploring open models in the practice of biomedical science and enhancing the value of medical research to the community. The foundation for this exploration is Synapse, a cloud-based platform which co-locates data, code, and computing resources for analyzing genome-scale data and seamlessly integrates these services. While typical scientific publications are a minor elaboration on a 15th century technology, we propose that ‘publication’ of data-intensive science should not just be a representation of science, but the science itself.

We will present clearScience, a pilot project in building infrastructure for effective scientific communication. By leveraging Synapse services, we demonstrate how scientists can move seamlessly from exploring data to executing science to providing the scientific community with all the resources needed to recreate their analyses. By capturing the complete lifecycle of a project, reproducibility becomes a byproduct rather than a burden of publication. Further, we provide for “forking an analysis”, allowing anyone to explore and elaborate on ‘published’ work. If the goal of biomedical research is to deliver results that will ultimately alleviate suffering and minimize harm to patients, being able to transparently share, reproduce, and build on one another’s work is critical to scientific progress. clearScience represents one compelling model for facilitating this progress.
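For readers wondering what co-locating data, code, and provenance looks like in practice, here is a minimal sketch using the Synapse Python client (synapseclient). The entity IDs, file path, and repository URL are placeholders, and the snippet is a generic illustration, not part of the clearScience pilot itself.

# Hypothetical sketch: store an analysis result in Synapse with provenance.
# All IDs, paths, and URLs below are placeholders.
import synapseclient
from synapseclient import File, Activity

syn = synapseclient.login()  # uses cached or prompted Synapse credentials

# Upload a result file into an existing (placeholder) project or folder.
result = File("results/expression_clusters.csv", parent="syn0000000")

# Record provenance: which data were used and which code was executed.
provenance = Activity(
    name="cluster analysis",
    used=["syn1111111"],  # input dataset (placeholder ID)
    executed=["https://github.com/example/analysis-code"],
)

result = syn.store(result, activity=provenance)
print("Stored", result.id)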

Run parallel simulations via the Neuroscience Gateway!

Posted on February 8th, 2013 in Anita Bandrowski, Data Spotlight, Force11, News & Events | No Comments »

Computational neuroscientists are invited to use the Neuroscience Gateway portal, http://www.nsgportal.org (NSG), to run parallel simulations on high performance computing (HPC) resources. We are developing the NSG, with support from NSF, as a means to reduce the administrative and technical barriers that keep many users from employing HPC.

NSG users do not have to divert time and effort from their own research, because there is:

* NO formal request for allocation of CPU time; in fact, there is no paperwork at all, and no charges, since the NSG employs HPC resources that are already supported by NSF and other agencies
* NO need to install or configure software
* NO need to wrestle with HPC system software or job scheduling

Instead, the NSG provides a simple web-based interface that makes it quick and easy to create an account, upload model code, run simulations, and get back results. To get started, go to http://www.nsgportal.org, click on the “Go to the NSG Portal” button, and follow the instructions in the sentence “New users who are interested in getting an account should fill out the form and email it to nsghelp@sdsc.edu” on that page.

Currently the NSG has the latest version of NEURON installed, and we plan to make other simulators such as GENESIS, MOOSE, and NEST available in the next few months.
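To give a feel for the kind of model code one might upload, below is a minimal parallel NEURON script written against NEURON’s Python API. It is a generic illustration (one Hodgkin-Huxley soma per MPI rank with a current step), not an official NSG example, and the parameters are made up.

# Minimal illustrative parallel NEURON model (not an official NSG example).
# Each MPI rank builds one Hodgkin-Huxley soma, injects a current step,
# and reports its peak membrane potential.
from neuron import h
h.load_file("stdrun.hoc")

pc = h.ParallelContext()
rank, nhost = int(pc.id()), int(pc.nhost())

soma = h.Section(name="soma")
soma.L = soma.diam = 20           # um
soma.insert("hh")

stim = h.IClamp(soma(0.5))
stim.delay, stim.dur = 1, 5       # ms
stim.amp = 0.1 * (rank + 1)       # nA, varied per rank for illustration

v = h.Vector().record(soma(0.5)._ref_v)

pc.set_maxstep(10)
h.finitialize(-65)
pc.psolve(20)                     # advance each rank to t = 20 ms

print("rank %d of %d: peak Vm = %.1f mV" % (rank, nhost, v.max()))
pc.barrier()
pc.done()
h.quit()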

Investigators who have already obtained allocations on HPC resources and would like to use the NSG as a convenient interface for those allocations are also invited to contact us to facilitate this.

For any questions related to the NSG portal, please contact us at nsghelp@sdsc.edu.