Archive for the ‘NIFarious Ideas’ Category

Do you know what you don’t know? A gap analysis of Neuroscience Data.

Posted on October 17th, 2013 in Anita Bandrowski, Data Spotlight, Inside NIF, NIFarious Ideas

My thesis adviser, a colorful spirit and one whose wisdom will long be missed, used to say that undergraduate or professional students differed from graduate students in that they were asked to learn what was known about a subject, while graduate students were asked to tackle the unknown.

We, in higher education, are essentially seeking to find out what is not known and start to come up with new answers. How does one find out what is not known? In fact, is it possible to do that? Don’t most graduate students or postdocs add onto a lab’s existing body of knowledge? Adding to the unknown by building on the known? If this is how we work then does this create a very skewed version of the brain? How would we even know what is truly unknown?

Now we enter the omics era, where we try to find out all things about a set of things. We no longer want to know about a gene, we want to know about all of the genes, the genome of an organism. We want to account for all things of the type DNA and figure out which parts do what. In neuroscience, this tends to be a little more difficult. Mainly because we do not have a finite list of things that we can account for. We have a large quantity of species with brains, or at least ganglia, we have billions of cells and many more connections between them in a single human brain. The worst part is that these connections are not even static so a wiring diagram is only good for a few minutes for a single brain and then the brain reorganizes some of these connections.

Is there hope for an “omics” approach to neuroscience?

Well, the space is not infinite and has been studied over the last 100+ years so we have some ways of getting at the problem. We have a map!
Can we use this map to figure out some basic information about what we do and do not study? Well, the short answer at least for some things seems to be yes!

The Neuroscience Information Framework (NIF) project has been aggregating data of various sorts that are useful to neuroscientists, along with a set of vocabularies for all of the brain parts: the map of the nervous system. So we can start to look at which labels are used for tagging data and which are found in the literature. Are all parts of the brain represented by relatively even amounts of data or papers, or are there hot spots and cold spots for data?

Below is a heat map generated using the Kepler tool for data sources vs brain regions across the canonical brain regions (a hierarchy built to resemble what one may find in a graduate level text book of neuroanatomy).
Screen Shot 2013-10-17 at 1.28.39 PM

Although the heat map is very hard to read (the darker the green, the more data; you can generate your own by clicking on the graph icon in NIF), there is little doubt that not all brain regions are equal: some have very little data, while others have a plethora. This raises the question: are there popular brain regions and not-so-popular brain regions?

Screen Shot 2013-10-28 at 8.38.30 AM

Indeed, some brain region annotations are found more often than others when looking at data and, much like pop stars, they tend to have shorter names. The most popular data label is actually brain, and the least popular appears to be the Oculomotor nerve root. This is starting to tell us that most data are labeled only at the level of “brain vs kidney”, but can we do better as neuroscientists? In fact, we can break down the labels into major regions like hindbrain, midbrain and forebrain and add up all of the data that fit into each of these. Note that most of the data are attributed to the forebrain, which houses some of the most popular brain regions such as the cerebral cortex and the hippocampus, but the hindbrain also comes back with a reasonable amount of data, mainly for the cerebellum. Adding up all the data labels for midbrain regions, however, leaves the awkward impression that the midbrain may be completely non-essential to brain research. On the other hand, the midbrain is essential to life, so why do neuroscientists not know much, or at least publish much, about the midbrain?
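For readers who want to try this kind of roll-up on their own label counts, here is a minimal sketch in Python. It is not the code NIF uses: the parent map, the function name `roll_up` and the counts are all made up for illustration, and a real analysis would take the hierarchy from the NIF brain-region vocabulary.

```python
from collections import defaultdict

# Hypothetical parent map: each fine-grained label points to its major division.
# A real map would come from the brain-region hierarchy, not be typed by hand.
PARENT = {
    "cerebral cortex": "forebrain",
    "hippocampus": "forebrain",
    "superior colliculus": "midbrain",
    "cerebellum": "hindbrain",
}

def roll_up(label_counts, parent=PARENT):
    """Sum data counts per major division; unknown labels go to 'unassigned'."""
    totals = defaultdict(int)
    for label, count in label_counts.items():
        totals[parent.get(label, "unassigned")] += count
    return dict(totals)

# Made-up counts, purely to show the mechanics (not NIF numbers).
example_counts = {
    "cerebral cortex": 50,
    "hippocampus": 30,
    "superior colliculus": 2,
    "cerebellum": 15,
    "brain": 100,  # generic label that fits no single division
}

for division, total in roll_up(example_counts).items():
    print(division, total)
```

Generic labels such as "brain" land in the "unassigned" bucket here, which is one simple way to keep them from inflating any single division.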

Screen Shot 2013-10-17 at 3.57.58 PM

So if you are hiding a big pile of data about the midbrain in your desk drawer, I would like to formally ask you to share it with NIF (just email us) so that I can stop thinking of the midbrain as the tissue equivalent of fly-over country.


Resource Identification Guidelines – now at Elsevier

Posted on September 6th, 2013 in Anita Bandrowski, Curation, Interoperability, NIFarious Ideas

Many groups have attributed the problem of irreproducible results to scientists working with very large data sets and highlighting the interesting, yet most likely statistically anomalous, findings, along with other science no-no’s such as reporting only positive results.

Our group has been working to improve the reporting of methods and reagents, and I am happy to report that these ideas are resonating.

In a working group sponsored by FORCE11, researchers, reagent vendors and publishers have been meeting to discuss how best to achieve better reporting across the literature, and both the NIH and the publishers themselves are now becoming interested in its success. The latest and greatest evidence of this can be found on the Elsevier website, as a guideline to authors; this will soon be followed by a pilot project, to be launched at the Society for Neuroscience meeting, with over 25 journals and most major publishers.

Of course, there is no reason to wait for an editor to ask us to include catalog numbers or stock numbers for transgenic animals. These should be things that we are trained to do in graduate school as good practices for reporting our findings.

We seem to be getting ready to change (or change back) to more rigorous methods reporting, which should strengthen the recently eroded credibility of the scientific enterprise. I, for one, hope that the message that will be communicated is: “scientists don’t hide problems, even endemic ones, we examine them and find workable solutions”.

Top 20 – Publications – of the month!

Posted on July 22nd, 2013 in Anita Bandrowski, Data Spotlight, NIFarious Ideas

At NIF we try to mix it up. Instead of bringing you the top databases this month, we thought that this may be of more interest.

A heart-felt congratulations to the most annotated papers of all time! It is sort of like being the most cited, but it is more like being the most useful.

The winners are clearly publications about databases: congratulations to InterPro and Protein Data Bank, the Gene Ontology itself, and also complete genomes for various organisms. The dark horse in the race asks “How many drug targets are there?” and the answer appears to be at least 7000. Heyndrickx and Vandepoele may have the most citations per human being because they wrote the only paper on this list with only 2 authors; most of the rest dilute their citations by an order of magnitude.

In any case, we thought that this would be a fun set of data to look at.

* Mulder et al The InterPro Database 2003 cited 49030 times

* Camon et al The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro 2003 cited 49030 times

* Heyndrickx and Vandepoele Systematic identification of functional plant modules through the integration of complementary data sources 2012 cited 37987 times

* Matsuyama et al ORFeome cloning and global analysis of protein localization in the fission yeast Schizosaccharomyces pombe 2006 cited 14156 times

* Barbe et al Toward a confocal subcellular atlas of the human proteome 2008 cited 9574 times

* Simmer et al Genome-wide RNAi of C elegans using the hypersensitive rrf-3 strain reveals novel gene functions 2003 cited 8249 times

* Heidelberg et al DNA sequence of both chromosomes of the cholera pathogen Vibrio cholerae 2000 cited 8192 times

* Imming et al Drugs, their targets and the nature and number of drug targets 2006 cited 7742 times

* Overington et al How many drug targets are there? 2006 cited 7664 times

* Gibbs et al Rat Genome Sequencing Project Consortium. Genome sequence of the Brown Norway rat yields insights into mammalian evolution 2004 cited 7594 times

* Berman et al The Protein Data Bank 2000 cited 7107 times

* Ceron et al Large-scale RNAi screens identify novel genes that interact with the C elegans retinoblastoma pathway as well as splicing-related components with synMuv B activity 2007 cited 6691 times

* Moran et al Genome sequence of Silicibacter pomeroyi reveals adaptations to the marine environment 2004 cited 6389 times

* Read et al The genome sequence of Bacillus anthracis Ames and comparison to closely related bacteria 2003 cited 6155 times

* Young et al Odorant receptor expressed sequence tags demonstrate olfactory expression of over 400 genes, extensive alternate splicing and unequal expression levels 2003 cited 5935 times

* Buell et al The complete genome sequence of the Arabidopsis and tomato pathogen Pseudomonas syringae pv tomato DC3000 2003 cited 5518 times

* Heidelberg et al Genome sequence of the dissimilatory metal ion-reducing bacterium Shewanella oneidensis 2002 cited 5183 times

* Methé et al Genome of Geobacter sulfurreducens: metal reduction in subsurface environments 2003 cited 4084 times

* Nelson et al Whole genome comparisons of serotype 4b and 1/2a strains of the food-borne pathogen Listeria monocytogenes reveal new insights into the core genome components of this species 2004 cited 3881 times

* Ward et al Genomic insights into methanotrophy: the complete genome sequence of Methylococcus capsulatus (Bath) 2004 cited 3877 times

The data are composed of annotations aggregated from over 40 individual databases and the Gene Ontology Consortium. For a complete and current list of databases included please see the NIF annotations information page, and to see a complete list of current annotations see the NIF annotations complete data set.

Note, these numbers are accurate for July 20th, 2013, but may change as data are added.

There is a Link between literature and data, it has been there for years, but nobody ever found it

Posted on July 10th, 2013 in Anita Bandrowski, Curation, Data Spotlight, Force11, Interoperability, NIFarious Ideas

The NIH recently issued a request for information about the NIH data catalog, to which our group and many others have responded. Many voices, including fairly important ones from the White House, are now calling for making scientific research data open, available and linked to the publications written about the data. This is a very good thing. It should lead to better handling and comparison of data and better science.

However, sitting in many recent meetings with members of various national libraries, who shall remain nameless, I am astounded to learn that not only the scientists, but also librarians have never found the LinkOut feature in PubMed.

LinkOut is a little option at the bottom of all articles in PubMed, hidden by the good staff in complete obscurity. Please see the screen shot below if you don’t believe me that such a feature exists.

Screen Shot 2013-07-10 at 3.10.35 PM

The article above links to two data sets: one is based on a curated set of annotations linking genes to genetic disorders, and the other is a set of statements about antibody reagents used in this paper. Links from other papers lead to computational model code described in the paper, activation foci or data repositories.

Although the feature is certainly rarely used, the model organism communities, data repositories and researchers have been diligently adding their data to PubMed in the form of links. We may quibble about the fact that PubMed asks many of us to reduce the specific links to data to generic links that lead to another version of the same article, but the fact is that the links to data are present! Because they are present, if the National Library of Medicine ever decides to search them, export them, or acknowledge their existence, it would be a treasure trove of data-to-literature links that would not require a huge new investment in infrastructure.

I am not suggesting that our infrastructure could not be upgraded, in fact we have many more technical gripes that I will not bring up here, but I am suggesting that we all take advantage of the massive investment of time and energy of curators and authors over the last decades to meticulously link their data or data repositories to the literature.

The LinkOut broker has helped NIF aggregate a list of about 250,000 links from ~40 databases, but what PubMed must have is a much, much larger set of data. The links provided by NIF can be searched through the NIF site, they can be filtered by category and by database, and they can be extracted and embedded into other sites like ScienceDirect. Of the quarter-million links that we provide to PubMed, only between 100 and 200 users find them per month. I think that we can and should do better.

  • We can ask that PubMed makes links to data prominent.
  • We can ask that any links in PubMed be of good quality, e.g., results of text-mining output should not be included without verification by authors or curators.
  • We can ask that the links show actual data as opposed to the representation of the paper in another site (currently required).

If you feel the sudden urge to be an arm-chair activist, then please let PubMed know that it would be nice if they celebrated the current links between data and publications instead of hiding them.

The experience of a bench scientist with open publishing.

Posted on June 21st, 2013 in Anita Bandrowski, Force11, NIFarious Ideas

I recently asked a bench scientist about her experiences in publishing in this very new mode of scholarly communication, i.e. in F1000Research, which is open access, has an open review process and is about as transparent as the community has ever asked any journal to be. The question was how did she view this process.

To give a bit of background, she is still attempting to publish 3 articles in F1000Research about work that she has done on tracking down the switch from benign to malignant tumor growth. Two of the articles are now accepted for publication and in the process of being indexed by PubMed (F1000Research 2013, 2:10 (doi: 10.12688/f1000research.2-10.v1), F1000Research 2013, 2:9 (doi: 10.12688/f1000research.2-9.v2)) and the last is in the bowels of the publishing machinery (Witkiewicz et al Article I).

I asked her a set of questions about the review process, which she discusses below. She agreed to let me post her answers here. Just as a note, prior to publication the articles were viewed 1415, 1373 & 1005 times and downloaded 231, 330 & 321 times, respectively. This sort of buzz is seldom generated by published work, so I have been quite surprised that it can be generated prior to publication.

Your questions are easy to answer; however, I would like to point out that my answers may not well represent the larger community of younger bench scientists. My sense of right and wrong has been shaped in different countries (Poland, Austria, and Canada) and at different times. Nevertheless, here they are for whatever it is worth:

How do you view the landscape of open scholarly communication, do you get lost in it?

If I do not feel lost in the maze of the new ways of communicating, it may be because I have not explored it enough. So far I have been relying mostly on the traditional ways of searching literature: PubMed and following references within articles found that way, as needed. I do get personal copies of The Scientist and Nature Methods and attend meetings in San Diego that are relevant to my work. I think it was in The Scientist that I first read about PLoS ONE and later F1000Research. From the meetings I get new clues for additional searches of the literature on my own.

If you were asked to change your methods to include catalog numbers or unique identifiers, would this make you mad and would you comply?

The catalog numbers for antibodies, the strain of GFP labeled mice and references to cell lines are all in the first versions of the articles, as they should be. These sorts of things although tedious do not bother me and in the long run having all practical details in one easy to find place is helpful.

Were there things you appreciated about having an open review?

Yes, definitely. I very much appreciated the professional editorial help up front. Another and even more critical point is that if the referees listed by the journal decline the invitation to write a review, others, not listed there may be considered as well. I waited too long for second and third reviews not realizing that they would not come.

Were there things that were a lot harder?

No. It is perhaps a little hard to take that defending one’s position does not change anything in the end. The editor does not judge one way or another. However, I do not mind that because the negative comments do not disqualify the article if there are others. That is fair enough. Any rules are fine with me provided all parties play by the same rules. ‘Dura lex sed lex’: harsh law but law.

Do you think that open review is more or less fair than traditional reviews?

Open review is more fair although fewer people are free enough to take sides in public.

Cannabis in NIF’s Data Holdings

Posted on June 13th, 2013 in Anita Bandrowski, Data Spotlight, Inside NIF, NIFarious Ideas

NIF was asked to give the National Institute on Drug Abuse a report on the state of the data holdings for one abused substance: Cannabis. The report is included below. The data reflect the state of NIF in early 2013; following the links will potentially lead users to updated numbers.

Within NIF, many sources have information about cannabis or the endocannabinoid system (we did not include an analysis of the literature). These results are broken down by number of records below (for an interactive graph click here and then select the Graph Filters Box).



Looking at genes, we find that the two human genes CNR1 and CNR2, which encode endocannabinoid receptors, have counterparts (CNRIP1, Cnr1, cnrip1, Cnr2, Cnrip1, and cnr1) in many other species, including mammals, birds, fish and tunicates. This indicates that the gene family is quite widespread.

Three clusters of genes have emerged based on the Homologene clustering algorithm. These gene families are for CNR1, CNR2 and the interacting protein called CNRIP1.



Several drugs (largely derived from THC) interact with the endocannabinoid receptors.

DrugBank, a leading source of drug information, tells us that two small molecule drugs have been used to affect the endocannabinoid system.

Nabilone has been used largely as an anti-anxiety agent or an antiemetic. Nabilone is a cannabinoid with therapeutic uses. It is an analog of dronabinol (also known as tetrahydrocannabinol or THC), the psychoactive ingredient in cannabis. It is reserved for use in individuals who do not respond to the more commonly used anti-emetics.

Dronabinol has also been used as an antiemetic, as well as an analgesic, a non-narcotic psychotropic drug and a hallucinogen. Marinol (a brand name of dronabinol) may have complex effects on the central nervous system (CNS), including cannabinoid receptors. Dronabinol may inhibit endorphins in the emetic center, suppress prostaglandin synthesis, and/or inhibit medullary activity through an unspecified cortical action.

The NIMH Chemical Synthesis and Drug Supply Program lists three more specific drugs including two specific antagonists and ChEBI lists 12 variants of THC, which have less associated data, but may be useful as highly experimental substances that may have some specificity as agonists or antagonists, see below.

Screen Shot 2013-06-13 at 2.27.39 PM
ChEBI compounds can be found below.



ChEBI ID | Chemical Formula | Name
CHEBI:219639 | C21H27F3O2 | 6,6,9-Trimethyl-3-(5,5,5-trifluoro-pentyl)-6a,7,10,10a-tetrahydro-6H-benzo[c]chromen-1-ol (5′-F3-delta8-THC)
CHEBI:219603 | C21H29FO2 | 3-(5-Fluoro-pentyl)-6,6,9-trimethyl-6a,7,10,10a-tetrahydro-6H-benzo[c]chromen-1-ol (5′-F-delta8-THC)
CHEBI:164237 | C21H30O2 | 6,6,9-Trimethyl-3-pentyl-6a,7,8,10a-tetrahydro-6H-benzo[c]chromen-1-ol (delta9-THC)
CHEBI:566631 | C37H54O4 | alpha-Cadinyl delta9-Tetrahydrocannabinolate
CHEBI:566632 | C37H54O4 | gama-Eudesmyl delta9-Tetrahydrocannabinolate
CHEBI:566609 | C32H46O4 | alpha-Terpenyl delta9-Tetrahydrocannabinolate
CHEBI:566610 | C32H46O4 | beta-Fenchyl delta9-Tetrahydrocannabinolate
CHEBI:566618 | C32H46O4 | alpha-Fenchyl delta9-Tetrahydrocannabinolate
CHEBI:566619 | C32H46O4 | epi-Bornyl delta9-Tetrahydrocannabinolate
CHEBI:566620 | C32H46O4 | Bornyl delta9-Tetrahydrocannabinolate
CHEBI:566630 | C32H46O4 | 4-Terpenyl delta9-Tetrahydrocannabinolate
CHEBI:566636 | | delta9-tetrahydrocannabinolic acid A

Several of these compounds have been tested in various brain regions against the two known cannabinoid receptors, and from the Ki database we find that there are more data for CB1 (345 results) than for CB2 (173). Results are also organized by species and brain region, with rat (296 results) and human (192) tested most frequently, followed by mouse (18), zebra finch (8) and newt (5).
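To put those Ki database counts in proportion, the short Python sketch below turns them into percentage shares. The counts are the ones quoted above; the helper name `shares` and the choice to round to one decimal place are ours, not part of NIF or the Ki database.

```python
# Result counts quoted in the paragraph above (state of NIF in early 2013).
ki_by_receptor = {"CB1": 345, "CB2": 173}
ki_by_species = {"rat": 296, "human": 192, "mouse": 18, "zebra finch": 8, "newt": 5}

def shares(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares(ki_by_receptor).items():
    print(f"{name}: {pct}% of receptor results")

for name, pct in shares(ki_by_species).items():
    print(f"{name}: {pct}% of species results")
```

By this count CB1 accounts for roughly two thirds of the results, and rat plus human together make up well over 90% of the species breakdown.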



WikiPathways, a collaborative platform from UCSF, lists the following pathways for CNR1 and CNR2:

Pathway Name

  • GPCRs, Class A Rhodopsin-like (11)
  • Small Ligand GPCRs (11)
  • GPCRs, Other (3)

Gene Symbol

  • CNR2 (7)
  • CNR1 (6)


Organism

  • Mus musculus (5)
  • Pan troglodytes (5)
  • Rattus norvegicus (5)
  • Homo sapiens (4)
  • Bos taurus (2)
  • Danio rerio (2)
  • Gallus gallus (2)


MultiMedia Information

Scientists studying endo- and exo-cannabinoids have written hundreds of blogs. Most of these focus on cannabis risks or the treatment of various disorders. A quick survey of these blogs indicates that there are links to Fragile X syndrome, multiple sclerosis, depression and decreases in intelligence on standardized tests. In addition, many scientists study the risk of motor vehicle accidents and possible interactions between mothers’ cannabinoid exposure and the proclivity of the offspring toward opiates. Surprisingly, some have also reported a Marijuana-Borne Salmonella Outbreak and the poisoning of workers by herbicides used to kill the plants.

A few video talks and interviews with leading researchers are also available (see several examples below).


NIHVideo Molecular Dissection of Cannabis Sensitivity in the Developing Brain Tibor Harkany, PhD, Department of Medical Biochemistry and Biophysics, Karolinska Institute
NIHVideo New Developments in Cannabinoid Research: The Path from Plant to Modern Prescription Medicine Guy, Geoffrey W. National Institutes of Health (U.S.)
The Guardian: Science Videos Cannabis ‘more harmful to under-18s than adults’ – video
NIHVideo Brain Stress Systems and Addiction Koob, George F. National Institutes of Health (U.S.)


Funding Sources

Not surprisingly, NIDA funds most research on cannabis, but a few current grants are also given by NIMH, NIAAA, and NINDS. The breakdown of the number of recent grants by institute follows: National Institute on Drug Abuse (375), National Institute of Mental Health (26), National Institute on Alcohol Abuse and Alcoholism (26), National Institute on Aging (11), and many others.

From older grants (Research Crossroads dataset covering both federal and foundation grants) we find that many institutes and foundations have given out grants related to the cannabinoids, including:

  • National Institute on Drug Abuse (NIDA) (2937)
  • National Center for Research Resources (NCRR) (152)
  • National Institute of Neurological Disorders and Stroke (NINDS) (90)
  • National Institute on Alcohol Abuse and Alcoholism (NIAAA) (64)
  • National Institute of Mental Health (NIMH) (57)
  • National Institute of General Medical Sciences (NIGMS) (52)
  • National Cancer Institute (NCI) (33)
  • National Eye Institute (NEI) (33)
  • National Institute of Allergy and Infectious Diseases Extramural Activities (NIAID) (15)
  • National Institute of Child Health & Human Development (NICHD) (15)
  • Medical Research Council (UK) (12)
  • National Heart, Lung, and Blood Institute (NHLBI) (12)
  • CORDIS (11)
  • National Institute of Environmental Health Sciences (NIEHS) (9)
  • National Center for Complementary and Alternative Medicine (NCCAM) (7)
  • National Institute of Diabetes and Digestive and Kidney Diseases (7)
  • National Institute on Aging (NIA) (7)
  • Fogarty International Center (FIC) (5)

Our search also reveals that even private funders like the American Diabetes Association gave out a grant focusing on the therapeutic effects of endogenous cannabinoids in diabetic retinopathy.


Diseases and Clinical Studies

There are few diseases directly associated with marijuana; however, PubMed Health mentions four:

Marijuana intoxication

Aspergillosis, an infection or allergic response caused by the Aspergillus fungus, which is commonly found growing on dead leaves, stored grain, compost piles, or other decaying vegetation. It can also be found on marijuana leaves. Although most people are frequently exposed to Aspergillus, infections caused by the fungus rarely occur in people who have a normal immune system. The rare infections caused by Aspergillus include pneumonia and fungus ball (aspergilloma).

Lung cancer – non-small cell, which mentions that “research shows that smoking marijuana may help cancer cells grow, but there is no direct link between the drug and developing lung cancer.”

Paraquat poisoning, which describes paraquat (dipyridylium) as a highly toxic weed killer once promoted by the United States for use in Mexico to destroy marijuana plants. Research found that this herbicide was dangerous to workers who applied it to the plants. This article discusses the health problems that can occur from swallowing or breathing in Paraquat.

NIF searches across two sources of clinical data. The US-based source contains information about 294 clinical trials, while the European-based EU Clinical Trials Register finds only 27 additional trials. Below you can find the main conditions, interventions and sponsors for the clinical trials, with numbers indicating the number of clinical trial results.


Conditions

  • Marijuana Dependence (15)
  • Marijuana Abuse (13)
  • Multiple Sclerosis (11)
  • Cannabis Dependence (10)
  • Healthy (8)
  • Marijuana Smoking (6)
  • Pain (5)
  • Schizophrenia (4)
  • Schizophrenia;Schizoaffective Disorder (4)
  • Substance-Related Disorders


Interventions

  • Behavioral: Behavior Therapy (6)
  • Drug: Dronabinol (6)
  • Drug: Cannabis (4)
  • Drug;Drug: Sativex®;Placebo (4)
  • Drug: GW-1000-02 (3)
  • Drug: Nabilone (3)
  • Drug: Smoked Cannabis (3)
  • Drug;Drug: GW-1000-02;Placebo (3)

Sponsored By

  • National Institute on Drug Abuse (NIDA); NIH (23)
  • GW Pharmaceuticals Ltd.; Industry (15)
  • New York State Psychiatric Institute;National Institute on Drug Abuse (NIDA); Other;NIH (9)
  • Center for Medicinal Cannabis Research; Other (7)
  • National Institute of Mental Health (NIMH); NIH (6)
  • Yale University; Other (6)


Inference Data

The Comparative Toxicogenomics Database (CTD) includes data about genes, pathways and diseases that have been found to be statistically associated with cannabinoids. The data are inferred rather than drawn from direct assertions, so they should be taken with a certain degree of skepticism. However, possibly interesting interactions with the following genes, diseases and pathways have been reported.


Genes

  • AKT1 (580)
  • TNF (414)
  • ABCB1 (400)
  • SCARB1 (146)
  • IL1B (144)
  • IFNG (140)
  • CNR1 (139)
  • VEGFA (110)
  • FDFT1 (104)
  • HMOX1 (100)
  • PTGS2 (97)


Diseases

  • Marijuana Abuse (1512)
  • Breast Neoplasms (74)
  • Prostatic Neoplasms (69)
  • Stomach Neoplasms (55)
  • Lung Neoplasms (47)
  • Carcinoma, Hepatocellular (45)
  • Myocardial Ischemia (43)
  • Schizophrenia (43)
  • Cocaine-Related Disorders (39)
  • Liver Cirrhosis, Experimental (37)


Pathways

  • Signal Transduction (11)
  • Immune System (9)
  • Disease (8)
  • Membrane Trafficking (6)
  • Metabolism (6)
  • Neuronal System (5)
  • Glioma (4)
  • Hemostasis (4)
  • Muscle contraction (4)
  • Neurotrophin signaling pathway (4)




Microarray and Gene Expression Data

The Gene Expression Omnibus (GEO) contains data from 5 studies having to do with cannabinoids.

One of those studies was analyzed in significant detail by Gemma. The study, which specifically targeted an animal model for cutaneous contact hypersensitivity, showed that mice lacking both known cannabinoid receptors display exacerbated allergic inflammation. The study looked at CNR knockout mice, and the main experimental factors were dinitrofluorobenzene vs. control group and Cnr1-/-/Cnr2-/- vs. C57BL/6J (knockout vs. control).

The Drug Related Gene Database reports on several studies, including a heroin withdrawal / cannabidol withdrawal interaction study showing some interactions in the caudoputamen. The brain regions, experimental conditions and organisms most commonly studied are listed below:

Brain Region

  • Anterior prefrontal cortex (556)
  • Dorsolateral caudoputamen (8)
  • Medial caudoputamen (8)
  • Mid-lateral caudoputamen (8)
  • CA1 stratum lacunosum moleculare (5)
  • CA1 stratum oriens (5)
  • CA1 stratum radiatum (5)
  • CA3 stratum lucidum (5)
  • CA3 stratum oriens (5)
  • CA3 stratum radiatum (5)

Exp vs Control

  • Cocaine + THC + PCP vs. Control (139)
  • Cocaine vs. Control (139)
  • PCP vs. Control (139)
  • THC vs. Control (139)
  • Delta9 THC vs. 1:1:18 solution of ethanol, emulphor, and saline (17)
  • Alpha7 nicotinic acetylcholine + cannabinoid receptor 1 vs. Alpha7 nicotinic acetylcholine (9)


Organism

  • Human: , 13-64 years Adolescent – Adult human (556)
  • Sprague Dawley Rat: Male, Adult 200-250 g (43)
  • Long Evans rat Rat: Male, 230-250 g at the beginning of the experiment (30)
  • Mouse: , (17)
  • Sprague Dawley Rat: Male, Adult rat 380-410 g (2)

Protocol Type

  • dna microarray (558)
  • immunohistochemistry (47)
  • in situ hybridization / immunohistochemistry (18)
  • in situ hybridization / double in situ hybridization (16)
  • in situ hybridization /double in situ hybridization (9)


In mice, the brain structures that express cnr1 and cnr2, based on data from the Mouse Genome Informatics database, the Allen Brain Atlas and GENSAT, are:

  • hypothalamus (10)
  • olfactory bulb (10)
  • thalamus (10)
  • cerebellum (9)
  • cerebral cortex (9)
  • midbrain (9)
  • pons (8)
  • amygdala (7)
  • hippocampus (7)
  • pallidum (7)
  • hippocampal formation (6)
  • lateral septal complex (6)
  • anterior olfactory nucleus (5)
  • basal ganglia (5)
  • corpus striatum (5)
  • diencephalon (5)
  • entorhinal cortex (5)
  • globus pallidus (5)
  • hindbrain (5)
  • inferior colliculus (5)

Most common assays are:

  • rt-pcr (976)
  • bac-cre recombinase driver (38)
  • rna in situ hybridization (34)
  • rna in situ (26)

The age of the organism:

  • postnatal week 6-8 (732)
  • postnatal (244)
  • adult (53)
  • embryonic day 14.5 (26)
  • p7 (19)


Brain Volume and Brain Activation Foci Data

Based on a publication in 2010, the cerebellar volume of marijuana abusers of 18 years of age appears to be significantly smaller than the norm. The norm is established by looking at the volumes reported from many publications.

The brain activation foci from SumsDB involved in marijuana use or abuse are shown as gray dots on the brain below. Each gray dot represents a coordinate from a study cataloged by SumsDB that involves cannabis. These have been pulled from the WebCaret software, from the laboratory of David Van Essen, and are accessible by clicking the “View on Brain” button within the SumsDB data result. Below, we show several views of the same result set, because not all points are visible from any one view of the brain. This suggests that no single brain region is involved in cannabis abuse; rather, many regions are involved.


Medial View


Posterior View


Imaging Data

The Cell Image Library has several data sets from Margaret I. Davis, mainly of interneurons expressing EGFP from the 5HT3 receptor promoter (Tg(Htr3a-EGFP)DH30Gsat), colabelled for the CB1 cannabinoid receptor.

The image below is from the pyramidal cell layer in hippocampal CA1.


The figures show the distribution of interneurons expressing EGFP from the 5HT3 receptor promoter (Tg(Htr3a-EGFP)DH30Gsat) in the dorsal hippocampus, colabelled for the CB1 cannabinoid receptor (red) and counterstained with DAPI (blue) to show the cell layers. In this experiment EGFP expression was amplified with chicken anti-GFP (Abcam, 1:2000); cell bodies and fibers are present throughout all layers of the hippocampus but enriched in the hilus and stratum lacunosum moleculare (see associated images). CB1 immunoreactivity (L15 rabbit polyclonal 1:200, K. Mackie) is prominent in the terminals of basket cells synapsing in the pyramidal cell layer. CB1 is also enriched in axons with distinct intensities in the inner and outer molecular layer of the dentate gyrus. CB1 immunoreactivity is also present in the stratum radiatum and stratum lacunosum moleculare.


The data above can be complemented by 168 images that GENSAT gathered from this mouse line, which was generated by that project.



Animals, antibodies, and databases, as well as a game useful for teaching undergraduates, are available from various sources.

The mouse line generated by GENSAT (images above) is now available from the MMRRC resource, where two founder lines of knockout mice are available. The International Mouse Strain Resource (IMSR) lists several other mice that may be available to the research community.

The Antibody Registry, our in-house antibody aggregator, aims to serve the scientific community by providing a set of unique identifiers for commercial and non-commercial antibody reagents. It lists 242 antibody reagent offerings related to cannabinoids from over 10 vendors.


The NIF registry does not encode many resources that are specifically devoted to cannabinoids, but it does note several resources that mention cannabis in their descriptions, including:
University of California at San Diego, Center for Medicinal Cannabis Research Resource Type: institutional portal
The Center for Medicinal Cannabis Research (CMCR) will conduct high quality scientific studies intended to ascertain the general medical safety and efficacy of cannabis products and examine alternative forms of cannabis administration. The Center will be seen as a … See full record: nif-0000-10503

Mouse Party Resource Type: training material, video
Mouse Party is an interactive website that teaches how various drugs disrupt the synapse by taking a look inside the brains of mice on drugs! Every drug of abuse has its own unique molecular mechanism. Where applicable, this presentation primarily depicts how drugs… See full record: nif-0000-00429


SPROUTS- Structural Prediction for Protein Folding Utility System Resource Type: database, data analysis service
SPROUTS is a database of predicted protein folding related data. It was designed to gather all the results from a study concerning the comparison between tools devoted to the prediction of stability changes upon point mutations. The second aim of this database is…  See full record: nif-0000-03491

Universal Virus Database Resource Type: web accessible database
The ICTVdB is a dynamic database containing information about viruses of animals, plants, bacteria, and fungi. Though initially designed for taxonomic (or classification) research, the ICTVdB has evolved to become a major reference resource and research tool. The … See full record: nif-0000-21213

Psychoactive Drug Screening Program Ki Database Resource Type: database, data repository
Database of information on the abilities of drugs to interact with an expanding number of molecular targets. It serves as a data warehouse for published and internally-derived Ki, or affinity, values for a large number of drugs and drug candidates at an expanding n… See full record: nif-0000-01866

Subviral RNA Database Resource Type: database
The Subviral RNA database facilitates the research and analysis of viroids, satellite RNAs, satellite viruses, the human hepatitis delta virus, and related RNA sequences. It integrates a large number of Subviral RNA sequences, their respective RNA motifs, analysis … See full record: nif-0000-03507

What if scientists dropped the jargon?

Posted on May 5th, 2013 in Anita Bandrowski, News & Events, NIFarious Ideas | No Comments »

This is a question asked by Alan Alda, a tireless advocate for clear communication of science to the public. He has been the host of the PBS series “Scientific American Frontiers” for many years and has worked since 2005 on a novel and amazing idea: scientists should drop the jargon and actually try to communicate clearly to the public.

Apparently, if you drop the big, relatively inscrutable words, most medical procedures begin to sound a little more comprehensible. Alan used the example of the end-to-end anastomosis, an inscrutable-sounding surgery that he often performed on M*A*S*H. When a doctor had to perform this same surgery on him, it was explained a little more clearly: a part of the colon that had gone bad needed to be removed and the healthy ends needed to be reattached.

This does not take a genius to understand, but the terminology in medicine and science in general can be so full of jargon that even scientists in adjacent fields have trouble. In fact, here is a sentence that I wrote, which I would be hard pressed to ask anyone outside of a small field of neuroscientists to interpret correctly: “In CNQX, ACPD only decreased EPSPs, but APV or bicuculline did not change the effect of ACPD.” This was not meant for a general audience, but I can see that we must do better in explaining our findings to each other, if not the general public. Why have we built a system that has become so difficult to understand?

If we think that this is not a good system, as Alan Alda suggests, what can be done to change it?

Alan, for one, is not sitting around asking questions. This year at Stony Brook University he will be starting a new chapter in his illustrious career, as an academic. In 2009, he launched the “Alan Alda Center for Communicating Science” and has been touring universities, teaching workshops and campaigning since. This is a great effort and I, for one, wish him the best of luck and hope that more scientists can think and talk clearly about science.

For more, see:

How Do You Evaluate a Database

Posted on May 3rd, 2013 in Author, Essays, Force11, Maryann Martone, News & Events, NIFarious Ideas | 3 Comments »

by Maryann E Martone

I was speaking with a colleague recently who, like many of us, had experienced the frustration of trying to support his on-line resources. He has assembled a comprehensive on-line resource that is used by the community and that others have used to publish their studies. It is not Genbank or EBI; it is one of the thousands of on-line databases created by individuals or small groups that the Neuroscience Information Framework and others have catalogued. My colleague has spent years on this resource, pored over hundreds of references and entered close to a million statements in the database. By many measures, it is a successful resource. But in the grant review, he was criticized for not having enough publications. I experienced the same thing in a failed grant for the resource that I had created, the Cell Centered Database. In fairness, that was not the most damning criticism, but it just seemed so very misplaced. I had succeeded in standing up and populating a resource, well before there was any thought of actually sharing data. People used the database and published papers on it, but apparently I should have been spending more time writing about it and less time working on it.

The problems of creating and maintaining these types of resources are well known and were discussed at Beyond the PDF2:  to be funded, you have to be innovative.  But you don’t have to be innovative to be useful.  To quote or paraphrase Carole Goble at the recent conference,  “Merely being useful is not enough.”

But presumably there is a threshold of perceived value where “merely being useful” is enough. I am thinking of the Protein Data Bank or PubMed. These resources are well funded and also well used but hardly innovative. I am guessing that many of the resources like those my colleague and I created were started with the hope that they would be as well supported and integral to people’s work as the PDB or PubMed. The truth is, they are not in the same class. But they are still valuable and represent works of scholarship. We are now allowed to list them on our biosketch for NSF. So my question to you is: how do we evaluate these thousands of smaller databases?

Ironically, our peers have no trouble evaluating an article about our databases, but they have much more trouble evaluating the resource itself. How does one weigh 30,000 curated statements against 1 article? What level of page views, visits, downloads and citations makes a database worthwhile? If my colleague had published 10 papers, the reviewers likely wouldn’t have checked how often they were cited, particularly if they were recent. What is the equivalent of a citation classic for databases? If you don’t have the budget of NCBI, then what level of service can you reasonably expect from these databases? I thought that the gold standard was a published study that utilized your database to do something else, by a group unconnected to you. Grant reviewers found that unconvincing. Perhaps I didn’t have enough? But how many of these do you need, relative to the size of your community, and on what time frame should you expect them to appear? Sometimes studies take years to publish. Do they need to be from the community that you thought you were targeting (and whose institute may have funded your resource) or does evidence from other communities count?

So perhaps if we want to accept databases and other artefacts in lieu of the article, we should help define a reasonable set of criteria by which they can be evaluated.  Anyone care to help here?

Lab Data Management Practices?

Posted on April 28th, 2013 in Force11, Jonathan Cachat, NIFarious Ideas | 2 Comments »

A number of groups, from libraries and universities to academic projects, are striving to implement flexible data management systems that harness the latest and greatest in semantic web technologies to integrate data and facilitate breakthrough interdisciplinary analysis.

It is known that every lab and every individual research group, regardless of discipline, has developed internal data management systems that “work” (i.e. literature & data collection > Excel > stats > graphing > word processor). But what has your lab found useful, and what are your biggest frustrations?

Please feel free to comment below, or join the discussion on ResearchGate.


Have you been wanting to tell the National Academy all of your views about scientific data?

Posted on April 11th, 2013 in Anita Bandrowski, News & Events, NIFarious Ideas | No Comments »

…Well now you can.

Two Planning Meetings: Public Access to Federally-Supported Research and Development Data and Publications

National Academy of Sciences
Meetings are free and open to the public, but registration is required

Public Comment Meeting: PUBLICATIONS

14 May 2013, 9 a.m. – 5 p.m.
15 May 2013, 9 a.m. – 12 p.m.
National Academy of Sciences
2100 Constitution Avenue
Washington DC 20418
Register Here

On 22 February 2013, the Office of Science and Technology Policy (OSTP) issued a memorandum to the heads of executive departments and agencies, directing them to “develop a plan to support increased public access to the results of research funded by the Federal Government.”

As part of this planning process, a group of cooperating federal agencies has requested that the National Research Council (NRC) Division on Behavioral and Social Sciences and Education (DBASSE) organize this meeting to draw in representatives of all stakeholder groups and interested parties.  The purpose of the meeting is to allow those who wish to do so an opportunity to provide input into the development of acceptable models of public access to the outputs of federally supported research and development.

The National Academy of Sciences, in keeping with its 150-year tradition of service to the nation, will provide an unbiased forum in which all views may be heard.  Time constraints may require that selection be made among the many who will apply to speak; every effort will be made to assure that those who are selected are representative of all viewpoints.  In addition, all written statements that are submitted through this registration site will be passed on to the sponsors (no other written materials will be accepted).

The meeting will be webcast, which means that the proceedings can be followed by stakeholders around the world, and, after the meeting is over, the video will be posted for easy access and convenience.

Questions about the meeting should be directed to:

Public Comment Meeting: DATA

16 May 2013, 9 a.m. – 5 p.m.
17 May 2013, 9 a.m. – 12 p.m.
National Academy of Sciences
2100 Constitution Avenue
Washington DC 20418
Register Here
