FREE THE ARTICLES! (Full-text for researchers & scientists and their machines)

At a recent plenary I gave [earlier post] at the Colorado Association of Research Libraries Next Gen Library Interfaces conference, I went a little off-script and was educating (/haranguing) the mostly librarian audience about the present-and-near-future importance of the accessibility of full-text research articles to their researchers and scientists.

By accessibility of full-text I didn't mean the ability of a human to access the PDF or HTML of an article via a web browser: I was referring to the machine-accessibility of the text contained in the article (and the metadata and the citation information).

I was concerned because of the increasing number of discipline-specific tools that use full-text (& metadata & citations) to let users navigate and analyze the research literature, and discover new ideas and relationships in it, via text mining, semantic analysis, etc. The general label for this kind of research is 'literature-based discovery', where new knowledge hidden in the literature is exposed using text mining and other tools.
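
To make 'literature-based discovery' a little more concrete, here is a minimal Python sketch of one simple approach: look for two terms that never appear in the same article but are both linked to a shared intermediate term. Everything in it is an assumption for illustration: the four "abstracts" are invented toy strings (loosely modeled on the fish oil example in [3]), the term list is a hypothetical stand-in for a real controlled vocabulary, and the function names are mine. Real tools are far more sophisticated, but they have the same basic requirement: machine access to the text.

```python
from collections import defaultdict
from itertools import combinations

# Toy "abstracts", invented for illustration only (not real article text).
DOCS = [
    "fish oil reduces blood viscosity",
    "fish oil lowers platelet aggregation",
    "high blood viscosity is observed in raynaud syndrome",
    "platelet aggregation is elevated in raynaud syndrome",
]

# Hypothetical term list; a real tool would use a proper controlled vocabulary.
TERMS = ["fish oil", "blood viscosity", "platelet aggregation", "raynaud syndrome"]


def cooccurrence(docs, terms):
    """Map each term to the set of terms it shares at least one document with."""
    links = defaultdict(set)
    for doc in docs:
        present = [t for t in terms if t in doc]
        for a, b in combinations(present, 2):
            links[a].add(b)
            links[b].add(a)
    return links


def hidden_links(docs, terms, start):
    """Return {candidate: bridging terms} for terms that never co-occur with
    `start` in any document but are reachable through an intermediate term."""
    links = cooccurrence(docs, terms)
    found = defaultdict(set)
    for bridge in links[start]:
        for candidate in links[bridge]:
            if candidate != start and candidate not in links[start]:
                found[candidate].add(bridge)
    return dict(found)


result = hidden_links(DOCS, TERMS, "fish oil")
print(result)
```

With these toy documents, "fish oil" and "raynaud syndrome" never share a document, yet the sketch connects them through the two intermediate terms. Note that even this trivial version needs the text of every article as plain strings, which is exactly what most licenses and interfaces do not provide.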

Most publisher licenses do not allow for the sort of access to the full-text that many of these discovery and exploration tools need.

When I asked for a show of hands on who was aware of this issue, not one of the ~200 people in the audience raised a hand.

I went on to suggest/rant that librarians should expect more of their researcher/scientist patrons to need/demand this sort of access to the full-text of (licensed) journal articles. Librarians need to anticipate this demand, and I suggested the following non-mutually-exclusive strategies:
  • demanding licenses from publishers and aggregators that allow them to offer access to full-text for analysis by arbitrary patron tools
  • asking publishers to publish their full-text in the Open Text Mining Interface (OTMI)
  • supporting Open Access journals, which allow for much of this out of the box (but often have very difficult APIs, or none at all and only web pages to get at the content!!)

Recently I retro-discovered an article [1] in The Economist, which explains to the layperson some of the kinds of things that can be done with access to the literature. This study [2] shows how researchers discovered the biochemical pathway involved in drug addiction from the literature alone. They did no experiments. This discovery was derived from an analysis and extraction of information from more than 1000 articles! This is not the first time this sort of thing has happened [3]. Clearly, this sort of analysis can save time and money in discovering important and relevant scientific knowledge.

[1] Drug Addiction: Going by the book (2008). The Economist, January 10 print issue.
[2] Li, C., Mao, X., Wei, L. (2008). Genes and (Common) Pathways Underlying Drug Addiction. PLoS Computational Biology, 4(1), e2. DOI: 10.1371/journal.pcbi.0040002
[3] Swanson, D. (1986). Fish oil, Raynaud's syndrome, and undiscovered public knowledge. Perspect Biol Med, 30(1), 7-18.

Additional reading:
Update 2008 April 7: Peter Suber's posts on how OA facilitates meta-analysis and text-mining.

Thanks to Martha Lee UCLA via NGC4LIB.

Comments

Daniel Lemire said…
My impression is that librarians do not take open access seriously. I suppose they consider that they are gatekeepers to the locked information.

Who needs them if people can freely get access to the data?

Ah! Precisely. We don't need gatekeepers. There is still a huge need for people to help organize the data, through software or through manual intervention, but please, we don't need people to grant us access to the data.

Glen Newton said…
While traditionally librarians have often played a gatekeeper role, and certainly there has been an existential crisis in the library community over the last ~10 years, I have to mostly disagree with your comment.

My experience is that most librarians are supportive of Open Access; many of them were and are impacted by the quickly rising costs of commercial journals. They have experienced reduced buying power and a commensurate reduction in the availability of journals for their users. And, for the most part, librarians want information to be as freely available as possible. There is no question that their roles are evolving and that the uncertainty of this evolution has made many librarians uncomfortable.


References:

Library sees red over rising journal prices

The Cost of Journals

Biomedical Journal Costs and Trends