A storm has been brewing over the last week or so around the results of a study published in Science on genes related to longevity. At the heart of it are key issues in experimental design and quality control: a story of two platforms, genotype calling errors and at least two variants that appear to be false positives (call it SNP Gate). What I’m most curious about, though, is how many scientists will read the media snapshots of this and actually see some points that can help them in their own studies:

  1. Batch effects can introduce noise into a study, possibly confounding interpretation of the results. To avoid this, possible issues need to be taken into consideration BEFORE the study starts. In this case, it might have been wise to order/run all the arrays at one time, rather than to start running arrays and find out the manufacturer stopped making them before you complete your study.
  2. Studies can live and die by the genotyping. QA/QC and an examination of the impact of the genotype calling algorithm are steps that need to happen before the analysis begins.
  3. Replication is still King (sorry LeBron). No matter how persuasive the results look, hold your breath until you have replication in an independent population.
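
To make the first two points a little more concrete, here is a minimal sketch of one possible QC check, not the method used in the study: for each SNP, compare genotype counts for the same kind of samples (say, controls only) typed in two batches or on two platforms, and flag SNPs whose counts differ more than chance would suggest. The SNP names, the counts and the `alpha` cutoff are all made up for illustration, and a genome-wide scan would need a much stricter threshold.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of counts (rows = batches)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def batch_p_value(table):
    """P-value for a 2 batches x 3 genotypes table.

    Assumes all three genotypes are observed, so the test has 2 degrees of
    freedom, where the chi-square tail probability is exactly exp(-x / 2).
    """
    return math.exp(-chi_square_statistic(table) / 2.0)


def flag_batch_sensitive_snps(counts_by_snp, alpha=0.001):
    """Return SNPs whose genotype counts differ between the two batches."""
    return [snp for snp, table in counts_by_snp.items()
            if batch_p_value(table) < alpha]


# Hypothetical counts of (AA, AB, BB) genotypes in controls, one row per batch.
example_counts = {
    "snp_a": [[50, 100, 50], [50, 100, 50]],  # same distribution in both batches
    "snp_b": [[25, 50, 25], [60, 30, 10]],    # distribution shifts with the batch
}

flagged = flag_batch_sensitive_snps(example_counts)
print(flagged)
```

A flagged SNP is not necessarily a bad SNP, but it is one where you would want to look at the cluster plots before believing any association.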

Electronic searching means that no relevant paper is likely to go unread, but narrowing the definition of “relevance” risks reducing the cross-fertilisation of ideas that sometimes leads to big, unexpected advances. As a wag once put it, an expert is someone who knows more and more about less and less until, eventually, he knows everything about nothing. It would be ironic if that is the sort of expertise that the world wide web is creating.  – The Economist 

Research by James Evans from the University of Chicago in the latest issue of Science suggests that science is experiencing what could be considered a ‘lineage truncation’ with respect to citations in current articles. As more journals become available online, fewer articles are being cited in the reference lists of the research papers published within them. Those articles that are cited tend to have been recently published themselves. For every additional year of back-issues of a journal available online, the average age of the articles cited from that journal fell by a month. It also appears that once a journal is online, there is a drop in the number of papers in it that get any citations at all.

As for why this is happening, he suggests that the forced browsing of print archives may have led to broader scholarship, perhaps due to the discovery of new concepts or work along the way. This led me to consider several questions:

  1. Is this really just a failing with respect to our curriculum? In the Google/digital era, is the art of a literature search, as well as the skills that accompany it, seen as no longer necessary?
  2. Does the web make it easier for us to find “prevailing opinion” such that we no longer are synthesizing on our own? If this trend continues, it will become harder and harder to find the original references as papers become ‘shallower’ in their citations and cite more and more of the same papers.
  3. Or is this another side effect of information overload? Is there just not enough time / capacity for an individual to synthesize all of the articles in his/her area, especially as domains are now more integrative and multi-disciplinary?

This is important for two reasons. The first is collegial: no one likes to see their work ignored, or worse yet forgotten, when another (overlapping) paper is published without citing the original contributions to the field. Second, there is a certain magic that occurs when you are exposed to new research or ideas (particularly those outside of your field) and you find a connection to your work. How can we keep this potential for synergy and mental stimulus alive?

Perhaps variations on tools like eTBLAST (given, of course, that they have complete open access to the journal databases) are part of the solution to help us avoid jumping off this cliff into the intellectual void.