Posts

Showing posts from 2009

eScience Librarians

The School of Information Studies (iSchool) at Syracuse University in Syracuse, N.Y., has introduced a new program (in collaboration with Cornell University Library) called "Building an eScience Librarianship Curriculum for an eResearch Future". It is focused on creating librarians with a better understanding of eScience and the research process, as well as of the new types of digital resources (in particular research data and their long-term preservation and use) and how to manage them. Right now they have a call out for scholarship applications for this new program. The lack of librarians who are savvy about eScience and research data is one of the gaps identified by Research Data Canada and is the focus of its capacity working group.

Government and Open Source Software

A colleague of mine is having some difficulties getting an Open Source solution made available within his government organization. In providing support to him, I've collected the resources below. Of particular interest is the 2007 Government Open Source Policies from the Center for Strategic and International Studies, listing the Open Source policies of hundreds of national, state/province/territory and local governments (including Canada's). 2009 The DoD and Open Source Software. Oracle White Paper 2009 Open source software could save millions (Scotland) 2009 Open Source Vendors welcome new UK Government policy, but want more action 2009 Liam Maxwell The High Cost of government IT 2009 Guide to Open Source Software for Australian Government Agencies 2009 D.Gatto & J.Herzfeld. Department of Defense Debunks Myths and Endorses Use of Open Source Software 2008. The acquisition of (open-source) software A guide for ICT buyers in the public and semi-public sectors (Dut

Open Source and Data Sharing questions in UK Parliament (Nov 12 2009)

It was very interesting to recently discover this Hansard exchange from the UK parliament dated Nov 12 2009 involving Open Source and sharing data: House of Commons Hansard Written Answers for 12 Nov 2009 Public Bodies: Databases Mr. Maude : To ask the Minister for the Cabinet Office what steps her Department is taking to facilitate data sharing among public sector bodies. [299480] Angela E. Smith : The Ministry of Justice is the lead Department on data sharing. The Cabinet Office supports technical elements of secure data handling and ensures that considerations of Data Sharing informs our work to promote more joined up public services. Sharing data securely is a requirement of the Data Handling Review, which all public bodies must adhere to. Public Sector: ICT Mr. Maude: To ask the Minister for the Cabinet Office what assessment has been made of the levels of compliance with her Department's guidance on public sector open source software procurement; and what steps are being tak

Opening government funded research to improve research, teaching and learning in higher education

The report Harnessing Openness to Improve Research, Teaching and Learning in Higher Education (A Report by the Digital Connections Council of the Committee for Economic Development, 2009 ISBN #0-87186-184-7) has some very relevant sections dealing with Open Access and Open Data in the context of higher education and the research process: Chapter 5. Openness in Higher Education: Changes in Research a. Resistance to Greater Openness b. Openness and Open-Access Journals c. Digital Repositories d. Educating Faculty Members on Their Intellectual Property Rights e. Openness and Commercial Support of Research f. Access to Government-Funded Research Results g. Openness and University Libraries h. Openness and Academic Presses i. Openness and Technology Transfer Of particular interest to those who - perhaps at a more general level - are working on getting better access to government-funded research, are the following recommendations on this particular issue: f. Access to Governme

Symposium on the Data Sharing Plans and on the Scientific Benefits of Data Sharing in GEOSS

Today in Washington, D.C., the CODATA-organized Symposium on the Data Sharing Plans and on the Scientific Benefits of Data Sharing in GEOSS was held. Among other things, it looked at the draft GEOSS data sharing plan: The Plan, now endorsed by 80 government Members and 56 Participating Organizations, highlights the following GEOSS Data Sharing Principles: There will be full and open exchange of data, metadata, and products shared within GEOSS, recognizing relevant international instruments and national policies and legislation. All shared data, metadata, and products will be made available with minimum time delay and at minimum cost. All shared data, metadata, and products being free of charge or no more than cost of reproduction will be encouraged for research and education. Programme: Part One: Implementing the GEOSS Data Sharing Principles How We Got There and Where We're Going. Beth Greenaway. UK Environmental Observation Network An Overview of the Key Substantive Provisions

frAgile programming...

Ravi Mohan has posted to his blog, Pin Dancing, a provocative (and likely correct) evaluation of the Agile/xtreme/lean programming wave we have seen over the last couple of years ("Let the Agile Fad Flow By" - Sept 26 2009). Enjoy.

Data Life Cycle Patterns in the Life Sciences

The UK Research Information Network (RIN) and the British Library (BL) have produced an amazing report looking at the patterns of data flow, production and transformation in the research process of life scientists: Patterns of information use and exchange: case studies of researchers in the life sciences. A report by the Research Information Network and the British Library November 2009. They have seven case studies that look at the data lifecycles of researchers in the following life science disciplines: Animal genetics and animal diseases Transgenesis in the chick and development of the chick embryo Epidemiology of zoonotic diseases Neuroscience Systems biology Regenerative medicine They have extended Chuck Humphrey's data lifecycle model, and use this extended model to illustrate how the data lifecycles are expressed in these different disciplines: The diagrams (below) for the different disciplines are very revealing, and show the great deal

The Future of Science: Semantic Web Applications in Scientific Discourse

For those who want to take a glimpse at where science and scientific discourse are going, take a look at some of the papers at this workshop: Workshop on Semantic Web Applications in Scientific Discourse (October 26, 2009; Proceedings), part of The 8th International Semantic Web Conference (ISWC 2009). Keynote: Enabling Semantic Publication and Integration of Scientific Information David Shotton. Presentation. A Short Survey of Discourse Representation Models Tudor Groza, Siegfried Handschuh, Tim Clark and Simon Buckingham Shum Paper Presentation Strategic Reading and Scientific Discourse Allen Renear and Carole Palmer Paper 'Confortation': about a new qualitative category for analyzing biomedical texts Delphine Battistelli, Antonietta Folino, Patricia Geretto, Ludivine Kuznik, Jean-Luc Minel and Florence Amardeilh Paper Presentation Hypotheses, Evidence and Relationships: The HypER Model of Representing Scientific Knowledge Anita de Waard, Simon Buckingham Shu

UK Government Recognizes the Value of Research Data (in 2004!)

I recently discovered the (at least partial) root of some of the excellent activity in the area of research data management in the UK: the UK government's Science & innovation investment framework 2004-2014 . Of particular interest: 2.23 The growing UK research base must have ready and efficient access to information of all kinds – such as experimental data sets, journals, theses, conference proceedings and patents. This is the life blood of research and innovation [emphasis added] . Much of this type of information is now, and increasingly, in digital form. This is excellent for rapid access but presents a number of potential risks and challenges. For example, the digital information from the last 15 years is in various formats (versions of software and storage media) that are already obsolete or risk being so in the future. Digital information is also often transient in nature, especially when published formally or informally on websites; unless it is collected and archived

New work: The Fourth Paradigm: Data-Intensive Scientific Discovery

Microsoft Research has put together a quite amazing collection looking at the revolution that is data-intensive research, calling it the fourth paradigm: The Fourth Paradigm: Data-Intensive Scientific Discovery, edited by Tony Hey, Stewart Tansley, and Kristin Tolle

"The granting system turns young scientists into bureaucrats and then betrays them"

Lawrence PA (2009) Real Lives and White Lies in the Funding of Scientific Research . PLoS Biol 7(9): e1000197 . doi: 10.1371/journal.pbio.1000197 In this article, Lawrence convincingly describes (in a Kafkaesque fashion) the present system - with the help of a number of quotes from working scientists - as broken: “ The problem is, over and over again, that many very creative young people, who have demonstrated their creativity, can't figure out what the system wants of them—which hoops should they jump through? By the time many young people figure out the system, they are so much a part of it, so obsessed with keeping their grants, that their imagination and instincts have been so muted (or corrupted) that their best work is already behind them. This is made much worse by the US system in which assistant professors in medical schools will soon have to raise their own salaries. Who would dare to pursue risky ideas under these circumstances? Who could dare change their re

Canadian Science Policy Conference

The Canadian Science Policy Conference is to be held in Toronto, October 28-30, 2009. While the five themes, sub-panels and speakers look both interesting and relevant, I am disappointed and perhaps a little alarmed that there appears to be no (explicit) mention of research data issues, research data management or research data archiving, despite the series of Canadian consultations examining these issues (National Consultation on Access to Scientific Research Data (NCASRD), 2004; National Data Archive Consultation Building Infrastructure for Access to and Preservation of Research Data, 2002; Data Access in Canada: Issues for Global Change Research (Royal Society of Canada), 1996) and indicating the pressing need for 1) a research data archiving strategy and policy; 2) a research data archive. I think this is an issue that has too long been neglected (although there are some positive signs, like International Polar Year), impacting Canadian science and innovation, especiall

New Journal for rejected math papers: "Rejecta Mathematica"

What is Rejecta Mathematica ? From the FAQ : Rejecta Mathematica is an open access online journal that publishes only papers that have been rejected from peer-reviewed journals in the mathematical sciences. But weren't those papers rejected for a reason? Quite probably, yes. So why publish them? We believe that many previously rejected papers (even those rejected for legitimate reasons) can nonetheless have legitimate value to the academic community. This value may take many forms: "mapping the blind alleys of science" : papers containing negative results can warn others against futile directions; "reinventing the wheel" : papers accidentally rederiving a known result may contain new insight or ideas; "squaring the circle" : papers discovered to contain a serious technical flaw may nevertheless contain information or ideas of interest; "applications of cold fusion" : papers based on a controversial premise may contain ideas applicable in more

Project Torngat: Building Large-Scale Semantic 'Maps of Science' with LuSql, Lucene, Semantic Vectors, R and Processing from Full-Text

Project Torngat is a research project here at NRC-CISTI [ Note that I am no longer at CISTI and that I am now continuing this work at Carleton University - GN 2010 04 07 ] that looks to use the full-text of journal articles to construct semantic journal maps for use in -- among other things -- projecting article search results onto the map to visualize the results and support interactive exploration and discovery of related articles, terms and journals. Starting with 5.7 million full-text articles from 2200+ journals (mostly science, technology and medical (STM)), and using LuSql, Lucene, Semantic Vectors, R, and Processing, a two-dimensional mapping of a 512-dimension semantic space was created which revealed an excellent correspondence with the 23 human-created journal categories: Semantic Journal Space of 2231 Journals Scaled to Two Dimensions This initial work was undertaken to find a technique that would scale, and follow-up work is looking at integrating t
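The excerpt does not say which method was used to scale the 512-dimension space down to two dimensions, so the following is only a minimal sketch of the general idea, assuming a stand-in technique: principal component analysis computed by power iteration, in plain Python. All function names and the synthetic "journal" vectors are hypothetical and are not taken from Project Torngat, Semantic Vectors or R.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract the per-dimension mean so the data is centred on the origin."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    return [[row[j] - means[j] for j in range(dims)] for row in rows]


def top_component(rows, iterations=100, seed=0):
    """Estimate the leading principal direction of centred rows by power iteration."""
    rng = random.Random(seed)
    dims = len(rows[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(dims)]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # Apply X^T X to v without building the full covariance matrix.
        scores = [dot(row, v) for row in rows]
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dims)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project_2d(vectors, seed=0):
    """Map high-dimensional vectors to (x, y) pairs along the top two components."""
    rows = center(vectors)
    first = top_component(rows, seed=seed)
    x = [dot(row, first) for row in rows]
    # Remove the first component, then find the strongest remaining direction.
    residual = [[r - xi * f for r, f in zip(row, first)] for row, xi in zip(rows, x)]
    second = top_component(residual, seed=seed + 1)
    y = [dot(row, second) for row in residual]
    return list(zip(x, y))


def make_demo_vectors(dims=512, per_group=3, seed=42):
    """Synthetic data: two groups of similar vectors (hypothetical 'journals')."""
    rng = random.Random(seed)
    vectors = []
    for _ in range(2):
        base = [rng.gauss(0.0, 1.0) for _ in range(dims)]
        for _ in range(per_group):
            vectors.append([b + rng.gauss(0.0, 0.05) for b in base])
    return vectors


if __name__ == "__main__":
    for point in project_2d(make_demo_vectors()):
        print(point)
```

In the synthetic data the two groups of similar vectors land on opposite sides of the first axis, which is the kind of clustering a journal map relies on. For thousands of journals a numerical library would be the practical choice; the pure-Python version is only meant to make the steps visible.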

Springer LNCS, or, How not to do alerts!

I subscribe to Springer Lecture Notes in Computer Science (LNCS) alerts. Several times now, I have received alerts when there was no web content at the URLs that they sent me. Very annoying, and a waste of my time. Please, this is 2009: try to make things work! The latest was yesterday: at Mon, 20 Jul 2009 14:53:35 -0700 (PDT) I got an alert email from Springer LNCS: Dear Glen Newton, We are pleased to deliver your requested table of contents alert for a new volume of "Lecture Notes in Computer Science", subseries: "Lecture Notes in Artificial Intelligence". Volume 5632: Machine Learning and Data Mining in Pattern Recognition by Petra Perner is now available on the SpringerLink web site at http://springer.r.delivery.net/r/r?2.1.Ee.2Tp.1gRdiL.ByTshW..N.I9y2.3DBm.bW89MQ%5f%5fDJNcFRf0 Going to this page (as of Tues 14:26 ET July 21 2009, ~24hrs later), or to any of the URLs for the articles (including DOIs, like http://dx.doi.org/10.1007/978-3-642-03070-3_5 ) gives me -

Emacs 'mode' and learning `modes`

I've used emacs as my primary editor, (emersive?) environment and de facto almost-OS for about 20 years now. I read and send my email in it ( vm ), write/run/debug Java in it ( JDEE ), edit and compile my LaTeX in it, edit all other files with it, sometimes with complex macros for tasks that others would use Perl to do, and interact with shells inside of it. In the past I've edited and debugged C and C++, HTML and XML in various fooML modes. The only other major thing I have running on my workstation is my web browser (and occasionally OpenOffice for reading Word files). Of course, I will have additional emacs windows open on the 3 or 4 servers I am editing and running code on (and also use tramp to transparently edit remote files). Yes, I've tried Eclipse. I know it quite well: I've even written an Eclipse plugin and published a paper about it ( Takaka: Eclipse Image Processing Plug-in ). But it does not work for me like emacs does. If Eclipse works for you: that

Bibliography: The Socioeconomic Effects of Public Sector Information

Chapter 14 (Measuring the Social and Economic Costs of Public Sector Information Online: A Review of the Literature and Future Directions by Paul F. Uhlir, Raed M. Sharif, and Tilman Merz) of The Socioeconomic Effects of Public Sector Information (PSI) on Digital Networks has a bibliography of useful links in this area. I have reproduced the bibliography portion of the chapter below: Models of Public Sector Information Provision via Trading Funds. David Newbery, Lionel Bently, and Rufus Pollock. 2008. EcoGeo Project. Stéphane Roche, et al. 2007. Fair Use in the U.S. Economy: Economic Contribution of Industries Relying on Fair Use. Thomas Rogers and Andrew Szamosszegi. 2007. The Power of Information: An Independent Review. Ed Mayo and Tom Steinberg. 2007. The Socio-Economic Impact of the Spatial Data Infrastructure of Catalonia. Pilar Garcia Almirall, Montse Moix Bergadà, and Pau Queraltó Ros. Edited by Max Craglia. 2007; published 2008. Benefits of the New

The Socioeconomic Effects of Public Sector Information

Paul Uhlir (who recently spoke at the session I was chairing at the Ottawa ICSTI2009 conference) has produced a report for the U.S. National Academies entitled The Socioeconomic Effects of Public Sector Information (PSI) on Digital Networks , which is a collection of papers on PSI policy from a number of OECD countries. Paul is also on the U.S. National Committee for CODATA. Table of Contents: Overview of U.S. Federal Government Information Policy Nancy Weiss, Institute of Museum and Library Services, United States PSI Implementation in the UK: Successes and Challenges Jim Wretham, Office of Public Sector Information United Kingdom The Value to Industry of PSI: The Business Sector Perspective Martin Fornefeld, MICUS Management Consulting Germany Achieving Fair and Open Access to PSI for Maximum Returns Michael Nicholson, PSI Alliance United Kingdom Public Sector Information: Why Bother? Robbin te Velde, Dialogic The Netherlands Measuring the Economic Impact of the PSI Directive i

ICSTI Conference: Managing Data for Science

CISTI (Canada Institute for Scientific and Technical Information) is hosting the ICSTI (International Council for Scientific and Technical Information) conference "Managing Data for Science" here in Ottawa at the LAC (Library and Archives Canada). The conference is June 9-10, 1 1/2 days long with an excellent international single stream program structured into four sessions, "Foundations", "Libraries", "Data Services" and "Semantic Science". I will be attending all sessions and will also be moderating the "Semantic Science" session.

IBM on Linux: "Lean, clean, and green"

IBM developerWorks has an article (Linux: Lean, clean, and green: How GNU/Linux is becoming more eco-friendly - 26 May 2009) which examines some of the Green benefits of the Linux operating system. It focuses primarily on the low resource demands Linux places on systems (as well as its support for older systems), thus extending the life of machines that would otherwise be junked. Also discussed are virtualization and aspects of the Linux OS that reduce power consumption in servers. Additional Green Linux and Open Source resources: LessWatts.org - Saving Power on Intel systems with Linux Ten ways Linux can turn you green. 2009 Green Computing With Open Source Software. 2009 Open Source is Already Naturally Green: Fewer Lawyers, Fewer Showers, More Real People Add Up to Big Green Wins. 2009 Is Linux the Greenest Operating System. 2009 Go Green, Save Green with Linux. 2008 Linux captures the 'green' flag, beats Windows 2008 power-saving measures 2008 Canadian Green Party

Canadian Federal Natural Resources Department selects Open Source library system

This is old news from February, but I seem to have missed it: the NRCan (Federal Natural Resources ministry in Canada) libraries have chosen the Open Source Evergreen integrated library system (ILS). Kudos to my colleague George Duimovich and others at the NRCan library. Here is the opening search interface. It is great to see that sensible Web 2.0 and Open Source choices can be made in such organizations.

NSF Workshop report: Information Seeking Support Systems Workshop

The final report for the NSF Information Seeking Support Systems Workshop has been released . "The general goal of the workshop will be to coalesce a research agenda that stimulates progress toward better systems that support information seeking." From the executive summary: Our nation and our world depend on citizens who are able to seek, assess, understand, and use diverse kinds of information. Much of the information we need is complex with different components held in disparate electronic sources and many of our efforts to gather, assess, and use this information are done in collaboration with others. Additionally, much of the information we need is not discretely anticipated, but rather emerges as seeking and reflection continues over time. Information seeking in the digital age is a kind of problem solving activity that demands agile and symbiotic coordination of human and cyber resources; in short, a fundamental kind of computationally-augmented thinking. Computation

JCDL 2009 Poster Session to also be in Second Life

The 2009 ACM/IEEE Joint Conference on Digital Libraries ( JCDL ) poster session will be held both in real life and in Second Life . This is the first time that the JCDL has done this, and allows for remote participation in at least this part of the conference. More information on all of the JCDL2009 sessions . BTW, I will be chairing session #7 on Wednesday, June 17.

Google announces Maps Data API

http://code.google.com/apis/maps/documentation/mapsdata/

W3C: Service Modeling Standards Extend Reach of XML Family

The W3C has today announced Service Modeling Language 1.1 and SML Interchange Format 1.1 (SML-IF), two XML-based standards: SML, SML-IF Enable Validation of Sets of XML Documents To illustrate what SML adds to the XML ecosystem, consider what happens when someone purchases an airline ticket. Suppose the reservation information is stored as an XML document that includes passenger information. The reservation also refers to a second XML document that stores departure time and other information about the flight. One department manages customer information, another manages flight information. Before any transaction with the customer, the airline wants to ensure that the system as a whole is valid. SML allows the airline to verify that certain constraints are satisfied across the reservation and flight data. This makes it easier to manage inconsistencies, and to do so without writing custom code. As a result, the airline lowers the cost of managing tasks such as informing passengers when
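To make the airline example concrete, here is a minimal sketch of the kind of hand-written cross-document check that the announcement says SML is meant to make unnecessary. It is not SML and does not use any SML tooling; it only uses Python's standard `xml.etree.ElementTree`. The element names, attribute names, flight numbers and the `check_reservation` function are all hypothetical, invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical reservation document, managed by one department.
RESERVATION_XML = """
<reservation id="R1">
  <passenger name="A. Traveller"/>
  <flightRef number="AC101"/>
</reservation>
"""

# Hypothetical flight document, managed by another department.
FLIGHTS_XML = """
<flights>
  <flight number="AC101" departure="08:30"/>
  <flight number="AC202" departure="17:45"/>
</flights>
"""


def check_reservation(reservation_xml, flights_xml):
    """Return a list of cross-document constraint violations (empty if none)."""
    reservation = ET.fromstring(reservation_xml.strip())
    flights = ET.fromstring(flights_xml.strip())

    # Index the flight document by flight number for lookup.
    known_flights = {f.get("number") for f in flights.findall("flight")}

    problems = []
    if reservation.find("passenger") is None:
        problems.append("reservation has no passenger")

    ref = reservation.find("flightRef")
    if ref is None:
        problems.append("reservation has no flightRef")
    elif ref.get("number") not in known_flights:
        # The constraint that spans both documents: the reference must resolve.
        problems.append(f"flight {ref.get('number')} not found in flight data")
    return problems


if __name__ == "__main__":
    print(check_reservation(RESERVATION_XML, FLIGHTS_XML))
```

Each new constraint in this style means more custom code to write and maintain, which is the cost the quoted passage says a declarative, standards-based approach is intended to reduce.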

First issue of "Journal of Information Architecture"

Volume 1, Issue 1 (Spring 2009) of the Journal of Information Architecture is now available. Topics for this journal are (from the site): Theoretical foundations of information architecture; Pervasive information architecture; History of information architecture; Information architecture techniques and best practices; card sorting; freelisting; Way-finding in digital environments; human information seeking; human information interaction; navigation and navigation behaviors; findability; Labeling and representation in digital environments; Organization of information; pace layering; taxonomies; folksonomies; collaborative tagging; Social media; social computing; social networks; Information architecture and digital genres; Information architecture development in organizations, in communities, in society, globally; The role of information architecture in information systems development; The value of information architecture for organizations; The impact of information architecture in