Dublin was a fantastic setting for the joint meeting of ISMB and ECCB: a full five days packed with talks, discussion and networking, starting with the SIGs (BOSC, BIOVIS, HiTSeq and others) and Satellite Meetings (CAMDA, 3DSig), followed by the main conference. While all of the science was excellent, a wonderful highlight was David Searls’s special presentation, “James Joyce’s Ulysses: A Bioinformatics Perspective”.

Ulysses Map of Dublin

If you missed the meeting, here are some key resources:

Dates set for ISMB 2016: 8-12 July in Orlando, Florida

Joint 2013 ISMB/ECCB Conference (at ICC Berlin)

We are about to kick off five days of presentations, workshops, discussions and tutorials at the largest computational biology meeting in the world. ISMB/ECCB brings together scientists from computer science, biology, mathematics and statistics, as well as other disciplines, all of whom are focused in one way or another on the development and application of advanced computational methods to solve biological problems. This is the ideal place to spark new ideas, methods and collaborations.

Live Feed from the Meeting: 

General Twitter Feed (#ISMBECCB) 

PLOS Community Blog

Quick Links:: Conference: 

Entire Conference Schedule (July 19th-23rd)

ISMB/ECCB 2013 SIGs & Satellite Meetings (July 19th-20th)

ISCB Student Council Symposium 2013 (July 19th)

Junior PI Meeting (July 20th)

Tutorials (July 20th)

Workshops (July 21st-23rd)

Birds of a Feather (BoF) (July 22nd)

Quick Links:: Berlin:

Interactive map (Stadtplan)

S+U Bahn Map

Berlin’s Top 30 Restaurants

Visit Berlin (Official Tourism Info)

Online Meter Reader by Jonathan Harris (via the New York Times 6/20/13)

The term “big data” was first used in print by Michael Cox and David Ellsworth in 1997, in the context of visualization challenges for large data sets from NASA simulations. As Gil Press highlights in his recent blog post “A Very Short History of Big Data”, the “information explosion” and the management of large data were topics of research interest as early as the 1940s (Note: another big data timeline was posted last month by Uri Friedman). O’Reilly’s Roger Magoulas and others have been influential in bringing the concept into the mainstream across disciplines.

The concept of “BIG” data refers not only to scale but also to complexity. Dr. Jim Gray highlighted this in his final public talk on data-intensive science in 2007, under the framework of the “fourth paradigm”, in which the deluge of data inundates researchers and requires new methods for data management, integration, visualization and interpretation (eScience).

Phrase map of frequently occurring keywords from big data related publications from 2006-2012 (from a post by Gali Halevi, MLS, PhD & Dr. Henk F. Moed).

Big data has become an increasingly popular area of research. An analysis of Scopus entries, published in a recent Research Trends blog post, highlighted the increase in big data related publications across a diverse array of disciplines. Most striking was the phrase map based on the 50 most frequently occurring keywords in publications from 2006-2012. From the post: “These maps visualize two main characteristics of the text: (1) connections between terms are depicted by the gray lines, where a thicker line notes a stronger relationship between the terms; and (2) the centrality of the terms which are depicted by their font size (the bigger the font, the more frequently a term appears in the text). Clusters of connections may appear when a connection is found between single words but not to other clusters.” Compared with the phrase map for 1995-2005, there is a clear increase in the complexity and connectivity of the map as the research area has developed.
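For readers curious how the two quantities behind such a phrase map might be computed, here is a minimal sketch in Python. It is not the method used in the Research Trends post: it assumes each publication comes with a list of keywords, takes term size to be the number of publications that mention a term, and takes line thickness to be the number of publications in which two terms appear together. The function name `phrase_map_stats` and the three toy keyword lists are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def phrase_map_stats(keyword_lists, top_n=50):
    """Return (term_sizes, link_weights) for the top_n most frequent keywords.

    term_sizes maps a keyword to the number of publications that list it
    (an assumed stand-in for font size). link_weights maps a sorted pair of
    keywords to the number of publications that list both (an assumed
    stand-in for line thickness).
    """
    term_counts = Counter()
    for keywords in keyword_lists:
        # Count each keyword once per publication.
        term_counts.update(set(keywords))

    top_terms = {term for term, _ in term_counts.most_common(top_n)}

    link_weights = Counter()
    for keywords in keyword_lists:
        kept = sorted(set(keywords) & top_terms)
        # Every unordered pair of kept keywords co-occurs in this publication.
        link_weights.update(combinations(kept, 2))

    term_sizes = {term: term_counts[term] for term in top_terms}
    return term_sizes, link_weights


# Toy data, invented for this example only.
papers = [
    ["big data", "hadoop", "cloud computing"],
    ["big data", "hadoop"],
    ["big data", "visualization"],
]
sizes, links = phrase_map_stats(papers)
print(sizes["big data"], links[("big data", "hadoop")])
```

Drawing the map itself (layout, fonts, line widths) is a separate step that this sketch leaves out.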

As research in this area intensifies and the popularity of the term increases further, there is some need for caution to avoid big data tunnel vision. Dr. Phil Bourne stated this succinctly: “We need to be less fixated on the big data problems”, highlighting the need to also focus on data management issues for the long tail (i.e., “scientists who generate small quantities of data (collectively much larger than the big data problems but distributed) that are not managed and subsequently analyzed in a way that is optimal”). Steve Lohr, in his recent article for the New York Times, noted the limitations of focusing on big data in a vacuum and reiterated the need to also emphasize experience and intuition. Indeed, balanced thinking and perspective are critical in how we focus not only our research but also policy and education around big data.

1992 was a leap year that showcased the maiden voyage of Space Shuttle Endeavour, the release of the Macintosh LC II, riots in LA, the release of the first video phone (AT&T, $1,499), and the end of apartheid in South Africa. It was also the year a group of researchers interested in artificial intelligence and molecular biology participated in a joint NLM meeting with the National Science Foundation on the future of what was then termed artificial intelligence in molecular biology. The following year, the meeting evolved into the first Intelligent Systems for Molecular Biology (ISMB) conference, held in Washington DC.

Today, ISMB celebrates 20 years as the longest-running annual bioinformatics/computational biology conference. Over the next five days in Long Beach, California, scientists from diverse disciplines and geographic coordinates will meet to share their work, exchange ideas and discuss challenges and opportunities. This meeting is near and dear to my heart given its focus on methodology and research centered on key biological problems.

Short list of useful links for in-person and virtual attendees:
Proceedings and News Feeds
Logistics
Meeting Coordinates: Long Beach Convention Center 
Public Transit: Passport Shuttle
Local Attractions: Long Beach Interactive Map

In a two-for-one combo, ISMB and ECCB join forces once again (3rd joint conference since 2004). In the days approaching the meeting, here is a short list of helpful links, whether you are attending in person or virtually:

Blogging/News Feeds/Proceedings:

Twitter Feed

ISMB/ECCB 2011 Friend Feed

Online meeting proceedings

PLOS Computational Biology ISMB 2011 related blogs

Conference Logistics:

Meeting Coordinates:  Austria Center Vienna, Bruno-Kreisky-Platz, 1220 Wien

Nearest underground station to ACV: red line, U1, “Kaisermühlen – Vienna International Center” (Underground Map)

Meeting Schedule: Overview & Satellite Meetings/SIGs

ISCB Student Council: Symposium & Other Events

Food Recommendations: NY Times Where to Eat (Map), Vegetarian, Trip Advisor

The door closes on Day 1 of the conference in Boston. It was a day marked by the Overton Prize (to Steve Brenner), a strong and diverse collection of talks reflecting the maturity of the field, and an impromptu session on the science of the World Cup held in the corridor of the conference center (the winner of course being the cephalopod). Microblogging again provides the best snapshot of the meeting proceedings. Key links include:

  • Official Friend Feed Room ISMB 2010
  • Twitter search on #ISMB2010
  • Blog on ISCB Student Council Symposium and ISMB related activities
  • Blog to coordinate the VIZBI BOFA session on Monday

    Celebration on the streets of Boston

A storm has been brewing over the last week or so about the results of a study published in Science on genes related to longevity. At the heart of this are key issues in experimental design and quality control. It is a story of two platforms, genotype calling errors and at least two variants that appear to be false positives: SNP Gate. However, what I’m most curious about is how many scientists will read the media snapshots of this and actually see some points that can help them in their own studies:

  1. Batch effects can introduce noise into a study, possibly confounding interpretation of the results. To avoid this, possible issues need to be taken into consideration BEFORE the study starts. In this case, it might have been wise to order/run all the arrays at one time, rather than to start running arrays and find out the manufacturer stopped making them before you complete your study.
  2. Studies can live and die by the genotyping. QA/QC and an examination of the impact of the genotype calling algorithm are steps that need to happen before the analysis begins.
  3. Replication is still King (sorry LeBron). No matter how persuasive the results look, hold your breath until you have replication in an independent population.
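The first point can be checked with very little code before any association analysis is run. The sketch below is my own illustration, not the analysis from the Science paper: it assumes samples were genotyped on one of two platforms and asks whether platform is tangled up with case/control status, using a plain chi-square statistic on a 2x2 table. The function name `chi_square_2x2` and all counts are hypothetical.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table.

    table is [[a, b], [c, d]], for example
    rows = cases / controls, columns = platform A / platform B.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one sample")
    # Shortcut formula for 2x2 tables: n * (ad - bc)^2 / product of margins.
    return n * (a * d - b * c) ** 2 / (
        row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1]
    )


# Hypothetical counts: cases run mostly on platform A, controls split evenly.
skewed = [[90, 10], [50, 50]]
# Hypothetical counts: both groups split evenly across platforms.
balanced = [[50, 50], [50, 50]]

# With 1 degree of freedom, a statistic above about 3.84 corresponds to p < 0.05.
print(chi_square_2x2(skewed))
print(chi_square_2x2(balanced))
```

A large statistic does not prove the results are wrong, but it is a warning that platform differences could masquerade as genetic association, which is exactly the situation worth catching BEFORE the study starts.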

    Photo by Roberto Hurtado

Scripps Translational Science Institute, Navigenics, Affymetrix and Microsoft will team up to determine how personal genomics information influences health decisions.  

The 10,000-participant study will examine individuals’ long-term psychological reactions and behavior change (or its absence) resulting from receiving individualized risk information. Participants will complete a questionnaire about a wide range of health behaviors at the start of the study, before they receive their genetic disease risk results. They will report their psychological and physical response to this information after three months and at the end of the first year, then annually or once every two years for the next 19 years. Individuals’ data will be available on Microsoft’s HealthVault, a Web-based electronic medical-record system launched last year.

“It’s intended to be the foundational study of preventative genomic medicine” – Vance Vanier, chief medical officer of Navigenics.

A minor but important note is that participants will be recruited through the Scripps Health System and could be more health conscious and better educated than the general public. It is not clear how representative the results will be of the general population, particularly of those who would need more information about interpreting the genomic data, their risk and genetics in general.

 

Over the years, I have used a number of tools and websites for communication, collaboration, sharing information, social networking, etc. However, using Friend Feed at ISMB 2008 this year, I am struck by what may seem to be an obvious observation: Friend Feed (FF) is a VERY effective way to share information with friends and colleagues. (Note: if you are not familiar with Friend Feed, Cameron Neylon has an excellent introductory overview of the site, with great screen shots, in the blog Science in the Open.)

What perhaps I didn’t realize (and what has tremendous implications for science) is that it is in fact a brilliant way to build a collaborative knowledgebase. In most of the sessions at ISMB, there have been 2-3 people simultaneously microblogging about a talk as it happens! This allows an automatic aggregation of different viewpoints and perspectives on the material. It also fills in gaps: notes that were missed, references or URLs. A single query is usually answered within a few seconds with the missing material. At the end, you have a virtual e-record, very handy at meetings for reviewing or if you have missed the talk.

Obviously, the key issue is that you trust the providers and therefore their content. Also, the level of detail can vary, and unless you are actively contributing to a feed, you will most likely still need your own notes. But the mechanistic potential, as well as the community building, should make us sit up and take note. And there seem to be some obvious places where sites like Friend Feed could make an immediate impact. For example, as Chris Heuer points out, FF rooms can easily replace mailing lists.

“We get to have it [Friend Feed content] on the Web instead of locked in our email inbox.” – Chris Heuer

The design of the aggregation stream allows us to build and, just as importantly, maintain relationships with our friends and colleagues. This is what Facebook set out to do. However, if you look at the new design of Facebook, it appears they have raised the white flag and surrendered to Friend Feed’s concept. With the new design, there is now just a single content stream, with the ability to include outside information such as Twitter. Applications have taken a back seat and are now on their own tab, a click away from being visible. Given that imitation is the sincerest form of flattery, and in light of the striking redesign by Facebook, I wonder if FF is blushing?

Note: If you are at ISMB and interested in discussing this further, there will be a Birds of a Feather (BoF) session on Tuesday, July 22nd. And yes, if you miss the session, there will be plenty of coverage on FF in the ISMB room for you to read to catch up!