Tuesday, December 22, 2009

eScience Librarians

The School of Information Studies (iSchool) at Syracuse University in Syracuse, N.Y., has introduced a new program (in collaboration with Cornell University Library) called "Building an eScience Librarianship Curriculum for an eResearch Future". It is focused on creating librarians with a better understanding of eScience and the research process, as well as the new types of digital resources - in particular research data and their long-term preservation and use - and how to manage them.

They currently have a call out for scholarship applications for this new program.

The lack of librarians savvy in eScience and research data is one of the gaps identified by Research Data Canada, and is the focus of its capacity working group.

Monday, November 23, 2009

Government and Open Source Software

A colleague of mine is having some difficulty getting an Open Source solution made available within his government organization. To support him, I've collected the resources below. Of particular interest is the 2007 Government Open Source Policies report from the Center for Strategic and International Studies, which lists the Open Source policies of hundreds of national, state/province/territory and local governments (including Canada's).

Open Source and Data Sharing questions in UK Parliament (Nov 12 2009)

It was very interesting to recently discover this Hansard exchange from the UK parliament dated Nov 12 2009 involving Open Source and sharing data:

House of Commons Hansard Written Answers for 12 Nov 2009

Public Bodies: Databases

Mr. Maude: To ask the Minister for the Cabinet Office what steps her Department is taking to facilitate data sharing among public sector bodies. [299480]

Angela E. Smith: The Ministry of Justice is the lead Department on data sharing. The Cabinet Office supports technical elements of secure data handling and ensures that considerations of Data Sharing informs our work to promote more joined up public services.

Sharing data securely is a requirement of the Data Handling Review, which all public bodies must adhere to.

Public Sector: ICT

Mr. Maude: To ask the Minister for the Cabinet Office what assessment has been made of the levels of compliance with her Department's guidance on public sector open source software procurement; and what steps are being taken to ensure compliance. [299407]

Angela E. Smith: The Open Source, Open Standards and Re-use Action Plan was published in February 2009 and is Government policy.

The Office of Government Commerce (OGC) is currently developing guidance for the procurement of open source, working with departments and local authorities that have successfully implemented open source applications, to share best practice and effective methods for procurement. The basis of the guidance has been prepared and material based on practical experience is now being sought from industry and government bodies to enhance the content.

The Cabinet Office does not gather centralised data regarding software procurement.
I'm glad someone's parliament is at least talking about these issues.

Wednesday, November 18, 2009

Opening government funded research to improve research, teaching and learning in higher education

The report Harnessing Openness to Improve Research, Teaching and Learning in Higher Education (A Report by the Digital Connections Council of the Committee for Economic Development, 2009, ISBN 0-87186-184-7) has some very relevant sections dealing with Open Access and Open Data in the context of higher education and the research process:
  • Chapter 5. Openness in Higher Education: Changes in Research
    • a. Resistance to Greater Openness
    • b. Openness and Open-Access Journals
    • c. Digital Repositories
    • d. Educating Faculty Members on Their Intellectual Property Rights
    • e. Openness and Commercial Support of Research
    • f. Access to Government-Funded Research Results
    • g. Openness and University Libraries
    • h. Openness and Academic Presses
    • i. Openness and Technology Transfer
For those working, perhaps at a more general level, on getting better access to government-funded research, the following recommendations on this issue are of particular interest:
  • f. Access to Government-Funded Research Results
    Governments should:
    • Retain the existing requirements of the NIH public-access policy regarding the results of NIH-funded research (public availability within 12 months of publication).
    • Stimulate research and increase the pace of innovation by extending the NIH public-access policy to cover all non-classified research funded by the 11 federal agencies providing over $100 million each in research support.
    • Extend the NIH public-access policy, under appropriate conditions, to primary data resulting from federally funded research and data gathered in support of government regulatory activities.
    • Extend the NIH public-access policy to publicly funded research at institutions of higher education at the state, and local levels.
    • Adopt policies that promote the accessibility and utilization of all non-classified government procedures and processes, data and information products (e.g. databases, publications, audio and video products etc.) as well as materials held in government-funded museums and collections. Lower, to the extent practicable, barriers to access and use, including permission and attribution requirements and technological barriers. Consider the utilization of standardized formats and metadata to facilitate searching and use. (Policies should neither favor one commercial entity over another nor commercial entities over noncommercial entities.)
    • Develop long-term plans and policies for ongoing permanent public access to government information in whatever form, taking into account the fragility of digital media and the format migration that has impeded access.

Thanks to Bill St. Arnaud for this info.

Monday, November 16, 2009

Symposium on the Data Sharing Plans and on the Scientific Benefits of Data Sharing in GEOSS

Today in Washington, D.C., the CODATA-organized Symposium on the Data Sharing Plans and on the Scientific Benefits of Data Sharing in GEOSS was held. Among other things, it looked at the draft GEOSS data sharing plan:

The Plan, now endorsed by 80 government Members and 56 Participating Organizations, highlights the following GEOSS Data Sharing Principles:
  1. There will be full and open exchange of data, metadata, and products shared within GEOSS, recognizing relevant international instruments and national policies and legislation.
  2. All shared data, metadata, and products will be made available with minimum time delay and at minimum cost.
  3. All shared data, metadata, and products being free of charge or no more than cost of reproduction will be encouraged for research and education.
  • Part One: Implementing the GEOSS Data Sharing Principles
    • How We Got There and Where We're Going. Beth Greenaway. UK Environmental Observation Network
    • An Overview of the Key Substantive Provisions of the Implementation Guidelines. Robert Chen, CODATA and Columbia University
    • Panel Discussion with the Symposium Participants Moderated by Roberta Balstad
  • Part Two: The Scientific Benefits of Data Sharing
    • Data Sharing and Innovation. Christopher Tucker, Open Geospatial Consortium (OGC) Board
    • Understanding Ecosystems and Their Services. Anthony Janetos, Director, Joint Global Change Research Inst., PNL/University of Maryland
    • Earthquakes, Tsunamis and Nuclear Explosion: Open Data Exchange for Research and Monitoring. David Simpson, President, IRIS (Incorporated Research Institutions for Seismology)

Disclaimer: I am a member of the Canadian National Committee (CNC) for CODATA

frAgile programming...

Ravi Mohan has posted to his blog, Pin Dancing, a provocative (and likely correct) evaluation of the Agile/xtreme/lean programming wave we have seen over the last couple of years ("Let the Agile Fad Flow By" - Sept 26 2009). Enjoy.

Tuesday, November 10, 2009

Data Life Cycle Patterns in the Life Sciences

The UK Research Information Network (RIN) and the British Library (BL) have produced an amazing report on how data flows, and is produced and transformed, in the research process of life scientists.
Patterns of information use and exchange: case studies of researchers in the life sciences. A report by the Research Information Network and the British Library November 2009.
They have seven case studies that look at the data lifecycles of researchers in the following life science disciplines:
  • Animal genetics and animal diseases
  • Transgenesis in the chick and development of the chick embryo
  • Epidemiology of zoonotic diseases
  • Neuroscience
  • Systems biology
  • Regenerative medicine
They have extended Chuck Humphrey's data lifecycle model, and use this extended model to illustrate how the data lifecycles are expressed in these different disciplines:

The diagrams (below) for the different disciplines are very revealing, and show the great deal of diversity in and between these disciplines, as well as the complexity within many of these areas.

In doing this analysis, the researchers found that an effective representation of this information involved the following two axes:
  • "the volume of data being handled"
  • "the complexity or heterogeneity of that data"
The researchers plotted the seven disciplines into this space, shown in the following diagram:

Discipline data lifecycle diagrams:
  • Animal genetics and animal diseases

  • Transgenesis in the chick and development of the chick embryo

  • Epidemiology of zoonotic diseases

  • Neuroscience
  • Systems biology (Computing-based and Lab-based)

  • Regenerative medicine

  • Regenerative medicine

Note: All of the diagrams presented here are from the publication, which is licensed under the Creative Commons Attribution-Non-Commercial-Share-Alike 2.0 UK: England and Wales License.

Thursday, November 05, 2009

The Future of Science: Semantic Web Applications in Scientific Discourse

For those who want to take a glimpse at where science and scientific discourse are going, take a look at some of the papers at this workshop:
Workshop on Semantic Web Applications in Scientific Discourse, October 26, 2009 (Proceedings), part of the 8th International Semantic Web Conference (ISWC 2009)
  • Keynote: Enabling Semantic Publication and Integration of Scientific Information
    David Shotton. Presentation.

  • A Short Survey of Discourse Representation Models
    Tudor Groza, Siegfried Handschuh, Tim Clark and Simon Buckingham Shum
    Paper Presentation

  • Strategic Reading and Scientific Discourse
    Allen Renear and Carole Palmer

  • 'Confortation': about a new qualitative category for analyzing biomedical texts
    Delphine Battistelli, Antonietta Folino, Patricia Geretto, Ludivine Kuznik, Jean-Luc Minel and Florence Amardeilh
    Paper Presentation

  • Hypotheses, Evidence and Relationships: The HypER Model of Representing Scientific Knowledge
    Anita de Waard, Simon Buckingham Shum, Annamaria Carusi, Jack Park, Matthias Samwald and Agnes Sandor
    Paper Presentation

  • SWAN/SIOC: Aligning Scientific Discourse Representation and Social Semantics
    Alexandre Passant, Paolo Ciccarese, John Breslin and Tim Clark
    Paper Presentation

  • Harnessing the Power of the Community in a Library of Biomedical Ontologies
    Natasha Noy, Michael Dorf, Nicholas Griffith, Csongor Nyulas and Mark Musen.
    Paper Presentation

  • myExperiment: An ontology for e-Research
    David Newman, Sean Bechhofer and David De Roure
    Paper Presentation

  • System Description: Reaching Deeper into the Life Science Bibliome with CORAAL
    Vit Novacek, Tudor Groza and Siegfried Handschuh
    Paper Presentation

  • Nano-Publication in the e-Science Era
    Barend Mons and Jan Velterop.
    Paper Presentation

Friday, October 30, 2009

UK Government Recognizes the Value of Research Data (in 2004!)

I recently discovered the (at least partial) root of some of the excellent activity in the area of research data management in the UK: the UK government's Science & innovation investment framework 2004-2014. Of particular interest:
2.23 The growing UK research base must have ready and efficient access to information of all kinds – such as experimental data sets, journals, theses, conference proceedings and patents. This is the life blood of research and innovation [emphasis added]. Much of this type of information is now, and increasingly, in digital form. This is excellent for rapid access but presents a number of potential risks and challenges. For example, the digital information from the last 15 years is in various formats (versions of software and storage media) that are already obsolete or risk being so in the future. Digital information is also often transient in nature, especially when published formally or informally on websites; unless it is collected and archived it will disappear. There are other challenges too, navigating vast online data/information resources, determining the provenance and quality of the information, and wider issues of security and access.

2.24 It is clear that the research community needs access to information mechanisms which:

  • systematically collect, preserve and make available digital information;
  • are easily navigable;
  • are quality assured;
  • tie into international efforts (e.g. to ensure compatibility); and
  • take on board the current

2.25 The Government will therefore work with interested funders and stakeholders to consider the national e-infrastructure (hardware, networks, communications technology) necessary to deliver an effective system. These funders and stakeholders include the British Library, which plays an important role in supporting scientific research and potential, including providing benefits to smaller businesses in the UK through access to science, engineering and technology information sources. Due to the potential importance of a national e-infrastructure to the needs of the research base and its supporting infrastructure in meeting the Government’s broader science and innovation goals, as a first step OST will take a lead in taking forward discussion and development of proposals for action and funding, drawing in other funders and stakeholders as necessary.

Friday, October 16, 2009

New work: The Fourth Paradigm: Data-Intensive Scientific Discovery

Microsoft Research has put together a quite amazing collection looking at the revolution that is data-intensive research, calling it the fourth paradigm:
The Fourth Paradigm: Data-Intensive Scientific Discovery
Edited by Tony Hey, Stewart Tansley, and Kristin Tolle

Wednesday, September 16, 2009

"The granting system turns young scientists into bureaucrats and then betrays them"

Lawrence PA (2009) Real Lives and White Lies in the Funding of Scientific Research. PLoS Biol 7(9): e1000197. doi:10.1371/journal.pbio.1000197
In this article, Lawrence convincingly describes (in a Kafkaesque fashion) the present system - with the help of a number of quotes from working scientists - as broken:
The problem is, over and over again, that many very creative young people, who have demonstrated their creativity, can't figure out what the system wants of them—which hoops should they jump through? By the time many young people figure out the system, they are so much a part of it, so obsessed with keeping their grants, that their imagination and instincts have been so muted (or corrupted) that their best work is already behind them. This is made much worse by the US system in which assistant professors in medical schools will soon have to raise their own salaries. Who would dare to pursue risky ideas under these circumstances? Who could dare change their research field, ever? - Ted Cox, Edwin Grant Conklin Professor of Biology, Director of the Program on Biophysics, Princeton University [quoted in article]

Lawrence also offers up how to solve the science-numbing effects of the present research granting system (of most western countries), again with the assistance of a number of scientists:

“My solution? Everyone should get slotted into a funding category and assessed every five years. If you're productive, you get five more years of resources. If productivity is down, you are moved down a category. If it is high, you can apply to move up. Starting PIs are in a different category and must apply to get onto the treadmill. The difference: PIs would be judged by overall productivity, not grantsmanship. We can stop wasting our time writing grants, and the system can be more easily calibrated to train a sustainable number of postdocs. It is depressing to train people who will struggle for funding.

A peer-reviewed, 5-year renewable, productivity-based ‘track’ system with a set amount of money at each level would stabilize funding, encourage innovation and productivity, allow each PI to control how their money is allocated, and permit us to make nationwide decisions about the size of our science enterprise. It also has the merit of simplicity.”—Ross Cagan, Professor of Developmental and Regenerative Biology, Mount Sinai School of Medicine [quoted in article]

A simpler, more efficient, fairer, and more productive system is that operated by research institutes, such as the IMCB in Singapore, where investigators are given a budget, allowed to get on with their research and reviewed after five years. - Philip Ingham, Professor, Institute of Molecular and Cell Biology, Singapore/MRC Centre for Developmental and Biomedical Genetics, University of Sheffield [quoted in article]

Lawrence's own solution distills down to:
  • Shorten grant applications (less time writing and less time reviewing)
  • Make grants last longer (e.g., five years)
  • Scrutinize large groups making grant submissions on their ability and available time to manage people, projects, etc.
  • Limit the number of publications cited in support of a grant submission

As a side note, I believe this is one of the first scientific papers I have seen (albeit a perspective/opinion piece) that cites a US presidential inaugural speech.

Monday, September 07, 2009

Canadian Science Policy Conference

The Canadian Science Policy Conference is to be held in Toronto, October 28-30, 2009.

While the five themes, sub-panels and speakers look both interesting and relevant, I am disappointed and perhaps a little alarmed that there appears to be no (explicit) mention of research data issues, research data management or research data archiving. This is despite the series of Canadian consultations that examined these issues (National Consultation on Access to Scientific Research Data (NCASRD), 2004; National Data Archive Consultation Building Infrastructure for Access to and Preservation of Research Data, 2002; “Data Access in Canada: Issues for Global Change Research” (Royal Society of Canada), 1996) and indicated the pressing need for 1) a research data archiving strategy and policy; 2) a research data archive.

I think this is an issue that has too long been neglected (although there are some positive signs, like International Polar Year), impacting Canadian science and innovation, especially given the advances of other countries (Investigating Data Management Practices in Australian Universities; Open Data for Global Science: A Review of Recent Developments in National and International Scientific Data Policies and Related Proposals; Infrastructure planning and data curation: a comparative study of international approaches to enabling the sharing of research data version 1.6).

Conference themes:
  • Theme 1: Major Issues in Canadian Science and Technology Policy
    • Canada's National Science & Technology Strategies
    • Implementing scientific knowledge in the decision making process: Lessons learned and new models
    • Who Speaks for Science? Stakeholder Communication in the Canadian Scientific Community
  • Theme 2: Scientific Research in Economic Growth and Recession
    • Private Sector Research & Development, role of R&D in global economy
    • Innovation Commercialization – From Bench to Market
  • Theme 3: Science and Technology and Canada’s Future Challenges
    • Meeting the challenges ahead, Canada’s policies on environment and energy
    • Canadian economy, from resource based to knowledge driven
    • Governance of Emerging Technologies
  • Theme 4: Science and Public Engagement
    • The Next Generation of Scientists: science education and a new culture of civic engagement
    • The Democratization of Science
    • Science journalism, media and communication
  • Theme 5: Science and Technology in the Global Village
    • Best Science Policy Practices from Other Nations
    • Science Diplomacy and International Cooperation

Here are the speakers:
  • Eric Archambault President and Founder Science-Metrix
  • Alain Beaudet President Canadian Institutes of Health Research
  • Peter Brenders President & CEO BIOTECanada
  • Elana Brief President The Society of Canadian Women in Science and Technology
  • Deb de Bruijn Executive Director Canadian Research Knowledge Network
  • Tom Brzustowski RBC Financial Group Professor in the Commercialization of Innovation, University of Ottawa
  • Christian Burks President and CEO Ontario Genomics Institute
  • Peter Calamai Science Reporter The Toronto Star
  • David Castle Canada Research Chair in Science and Society University of Ottawa
  • Hadi Dowlatabadi Canada Research Chair & Professor of Applied Mathematics and Global Change Institute for Resources, Environment and Sustainability University of British Columbia
  • Paul Dufour International Development Research Centre, on exchange as Senior International S&T Advisor to Natural Resources Canada
  • Ronald J. Dyck Assistant Deputy Minister, Research, Alberta Advanced Education and Technology
  • Suzanne Fortier President Natural Sciences and Engineering Research Council
  • Peter R. Frise Professor of Automotive Engineering, AUTO21 Program Leader & CEO, The University of Windsor
  • Chad Gaffield President Social Sciences and Humanities Research Council (SSHRC)
  • Peter Hackett President Alberta Ingenuity
  • Mark Henderson Managing Editor Research Money
  • J. Adam Holbrook Adjunct Professor and Associate Director, Centre for Policy Research on Science and Technology, Simon Fraser University
  • Ramin Jahanbegloo Professor of Political Science University of Toronto
  • Robert James Director General, Policy Branch Science and Innovation Sector National Research Council (NRC)
  • Rees Kassen Chair of The Partnership Group for Science and Engineering (PAGSE) University Research Chair in Experimental Evolution, University of Ottawa
  • Kei Koizumi Assistant Director for Federal Research and Development at the White House Office of Science and Technology Policy
  • Jeff Kinder Manager, S&T Strategy Natural Resources Canada
  • John Leggat Past President Canadian Academy of Engineering and President International Council of Academies of Engineering and Technological Sciences
  • Alidad Mafinezam Co-founder Mosaic Institute
  • Robert Mann President Canadian Association of Physicists
  • Sunny Marche Associate Dean of Graduate Studies Dalhousie University
  • Hiromi Matsui Former President Canadian Coalition of Women in Engineering, Science, Trades & Technology
  • Ann McMillan Former President Canadian Coalition of Women in Engineering, Science, Trades & Technology
  • Andrew Miall President Academy of the Royal Society of Canada
  • Geoff Munro Associate ADM and Chief Scientist, Science and Policy Integration
  • Heather Munroe-Blum President McGill University
  • Peter Nicholson President, Council of Canadian Academies
  • Jorge Niosi Canada Research Chair in Management of Technology, Université du Québec à Montréal
  • Christopher Paige Vice-President, Research University Health Network
  • Nils Petersen Director General of the NRC National Institute for Nanotechnology
  • Reinhart Reithmeier Past President, Canadian Society for Biochemistry and Molecular Biology Professor and Chair, Department of Biochemistry, University of Toronto
  • Mark Romoff President and CEO Ontario Centres of Excellence Inc.
  • David Rose Chair Department of Biology University of Waterloo
  • Nathalie Des Rosiers President, Canadian Federation for the Humanities and Social Sciences Acting Secretary of the University in 2008 and Acting Vice-President, Governance, University of Ottawa
  • Marc Saner Director of Research Regulatory Governance Carleton University
  • Kevin Shortt President Canadian Space Society
  • Peter Singer Director McLaughlin-Rotman Centre for Global Health
  • Halla Thorsteinsdóttir Associate Professor Dalla Lana School of Public Health
  • Caroline Wagner Research Scientist Center for International Science and Technology Policy George Washington University
  • Bryn Williams-Jones Professeur Programmes de bioethiques Université de Montréal

Thanks to Tracey for the heads-up on this.

Monday, August 17, 2009

New Journal for rejected math papers: "Rejecta Mathematica"

What is Rejecta Mathematica?
From the FAQ:
Rejecta Mathematica is an open access online journal that publishes only papers that have been rejected from peer-reviewed journals in the mathematical sciences.
But weren't those papers rejected for a reason?
Quite probably, yes.
So why publish them?
We believe that many previously rejected papers (even those rejected for legitimate reasons) can nonetheless have legitimate value to the academic community. This value may take many forms:
  • "mapping the blind alleys of science": papers containing negative results can warn others against futile directions;
  • "reinventing the wheel": papers accidentally rederiving a known result may contain new insight or ideas;
  • "squaring the circle": papers discovered to contain a serious technical flaw may nevertheless contain information or ideas of interest;
  • "applications of cold fusion": papers based on a controversial premise may contain ideas applicable in more traditional settings;
  • "misunderstood genius": other papers may simply have no natural home among existing journals.

Wednesday, July 29, 2009

Project Torngat: Building Large-Scale Semantic 'Maps of Science' with LuSql, Lucene, Semantic Vectors, R and Processing from Full-Text

Project Torngat is a research project here at NRC-CISTI [Note that I am no longer at CISTI and that I am now continuing this work at Carleton University - GN 2010 04 07] that looks to use the full text of journal articles to construct semantic journal maps for use in, among other things, projecting article search results onto the map to visualize the results and support interactive exploration and discovery of related articles, terms and journals.

Starting with 5.7 million full-text articles from 2200+ journals (mostly science, technology and medical (STM)), and using LuSql, Lucene, Semantic Vectors, R, and Processing, a two-dimensional mapping of a 512-dimension semantic space was created, which revealed an excellent correspondence with the 23 human-created journal categories:

Semantic Journal Space of 2231 Journals
Scaled to Two Dimensions

This initial work set out to find a technique that would scale. Follow-up work is looking at integrating this with a search interface, and evaluating whether better structure is revealed within semantic journal mappings of single categories.
This may be the first time full text on such a large scale has been used in this fashion, without the help of article metadata.

Try out the prototype (needs Java in the browser) [the site appears to be down right now], which displays journals in the 2-D space.

How it was done:
  1. Using a custom LuSql filter, for each of 2231 journals, concatenate the full-text of all a journal's articles into a single document.
  2. Using LuSql, create a Lucene index of all the journal documents (took ~14hrs, highly multithreaded on multicore, resulting in 43GB index)
  3. Using Semantic Vectors BuildIndex, create a docvector index of all journal documents, with 512 dimensions (58 minutes, 3.4GB index)
  4. Using Semantic Vectors Search, find the cosine distance between all journal documents (8 minutes)
  5. Build journal-journal distance matrix
  6. Use R's multidimensional scaling (MDS) to scale distance matrix to 2-D
  7. Build visualization using Processing
NB: all of the above software is Open Source.
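
Steps 4 to 6 above can be sketched in a few lines of Python. This is only an illustrative stand-in, not the code used in the project: the real work used Semantic Vectors for the cosine distances and R for the scaling. The function names (`cosine_distance`, `distance_matrix`, `classical_mds`) and the toy three-dimensional "journal" vectors are hypothetical, and classical (Torgerson) MDS computed by power iteration is an assumption, since the steps above do not say which MDS variant was used.

```python
import math


def cosine_distance(u, v):
    """Step 4: 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(vectors):
    """Step 5: symmetric matrix of pairwise cosine distances."""
    n = len(vectors)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = cosine_distance(vectors[i], vectors[j])
    return d


def _mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def _top_eigenpair(m, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    v = [1.0 + 0.01 * i for i in range(len(m))]  # arbitrary start vector
    for _ in range(iterations):
        w = _mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    # The Rayleigh quotient recovers the eigenvalue together with its sign.
    eigenvalue = sum(a * b for a, b in zip(v, _mat_vec(m, v)))
    return eigenvalue, v


def classical_mds(d, dims=2):
    """Step 6: classical (Torgerson) MDS of distance matrix d into dims dimensions."""
    n = len(d)
    sq = [[x * x for x in row] for row in d]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        eigenvalue, vec = _top_eigenpair(b)
        if eigenvalue <= 1e-12:
            break  # dominant remaining eigenvalue is not positive; stop here
        scale = math.sqrt(eigenvalue)
        for i in range(n):
            coords[i][k] = scale * vec[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - eigenvalue * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


# Toy stand-ins for the 512-dimension journal vectors (made-up numbers).
journals = {
    "Journal A": [0.9, 0.1, 0.0],
    "Journal B": [0.8, 0.2, 0.1],
    "Journal C": [0.0, 0.2, 0.9],
}
names = list(journals)
coords = classical_mds(distance_matrix([journals[name] for name in names]))
for name, (x, y) in zip(names, coords):
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

With real data the matrix would be 2231 x 2231, and a library eigensolver (or R, as in step 6) would be the sensible choice; the pure-Python power iteration here is only meant to keep the sketch self-contained.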

You can read more about it in the preprint:
Newton, G., A. Callahan & M. Dumontier. 2009. Semantic Journal Mapping for Search Visualization in a Large Scale Article Digital Library. Second Workshop on Very Large Digital Libraries at the European Conference on Digital Libraries (ECDL) 2009.

Thanks to my collaborators, Alison Callahan and Michel Dumontier, Carleton University.

Built with Open Source Software.

Tuesday, July 21, 2009

Springer LNCS, or, How not to do alerts!

I subscribe to Springer Lecture Notes in Computer Science (LNCS) alerts. Several times now, I have received alerts when there was no web content at the URLs they sent me. Very annoying, and a waste of my time. Please, this is 2009: try and make things work!

The latest was yesterday: at Mon, 20 Jul 2009 14:53:35 -0700 (PDT) I got an alert email from Springer LNCS:
Dear Glen Newton,

We are pleased to deliver your requested table of contents alert
for a new volume of "Lecture Notes in Computer Science",
subseries: "Lecture Notes in Artificial Intelligence".

Volume 5632: Machine Learning and Data Mining in Pattern Recognition
by Petra Perner
is now available on the SpringerLink web site at

Going to this page (as of Tues 14:26 ET July 21 2009, ~24hrs later), or to any of the URLs for the articles (including DOIs, like http://dx.doi.org/10.1007/978-3-642-03070-3_5) gives me - No, not an error page - but a BLANK PAGE:

How hard is it to make sure you don't send out alerts with links to web pages until after the linked-to pages actually have the content you are planning on presenting?

And at least try and have a decent error page when you do not have the content in place.
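
A pre-send check along these lines is not hard to sketch. The following Python is a hypothetical illustration only, not anything Springer is known to run: the function names (`fetch`, `page_is_ready`, `alert_can_be_sent`) and the rule that "has content" means an HTTP 200 status with a non-blank body are assumptions made for the example.

```python
from urllib.error import URLError
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Return (status, body) for url, or (None, b"") if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status, response.read()
    except (URLError, OSError):
        return None, b""


def page_is_ready(status, body):
    """Assumed rule: HTTP 200 and a body that is not blank."""
    return status == 200 and len(body.strip()) > 0


def alert_can_be_sent(urls, fetcher=fetch):
    """True only if every linked page passes page_is_ready."""
    return all(page_is_ready(*fetcher(url)) for url in urls)
```

The alert would go out only once `alert_can_be_sent` returns True for every URL in the email. A check this simple would catch the blank page, but not a generic landing page served in place of the real content; for that, it would also need to look for something specific in the body, such as the volume title.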

Oh, and 1999 called: they want their web infrastructure back.

[Of course, by the time you read this entry these URLs may be working...]

Update: 2009 07 21 16:03 ET: Now when I go to the above pages, instead of a blank page I get a general launching-off page for SpringerLink:

Still not good enough! :-)

Update: 2009 07 21 22:40 ET:
Now the URLs for the articles work:

but the URLs for the publication do not:

Wednesday, July 15, 2009

Emacs 'mode' and learning `modes`

I've used emacs as my primary editor, (immersive?) environment and de facto almost-OS for about 20 years now. I read and send my email in it (vm), write/run/debug Java in it (JDEE), edit and compile my LaTeX in it, edit all other files with it (sometimes with complex macros where others would use Perl), and interact with shells inside of it. In the past I've edited and debugged C and C++, HTML and XML in various fooML modes. The only other major thing I have running on my workstation is my web browser (and occasionally OpenOffice for reading Word files). Of course, I will have additional emacs windows open on the 3 or 4 servers I am editing and running code on (and also use tramp to transparently edit remote files).

Yes, I've tried Eclipse. I know it quite well: I've even written an Eclipse plugin and published a paper about it (Takaka: Eclipse Image Processing Plug-in). But it does not work for me like emacs does. If Eclipse works for you, that is great. But I don't think it necessarily works for everyone. Emacs + JDEE IS my IDE for Java (and is other things as well).

And I have no interest in starting any editor war ('IDE war'?)

Before I actually tried Eclipse in a significant way, I thought that I was just unwilling to change because of the learning curve of something new and the momentum of the 'known', and that once I tried it I would find it better, as many other people seemed to (and certainly many of the pundits). But when I did invest in making the change, I discovered that was not the reason. It still didn't work as well for me as emacs: I was simply less productive in Eclipse than in emacs. Again, this is not a criticism of Eclipse.

My theory is that human-machine interaction is like learning, where it has been established that different people learn in different ways (Visual/Verbal, Visual/Nonverbal, Auditory/Verbal, Tactile/Kinesthetic): particular modes of interaction are better suited to some individuals than others.

I am not an HCI expert, so I don't know if the various modes have been as clearly defined, delineated and validated as in learning, but I could imagine at least one HCI mode mapping to the Microsoft Word/NetBeans/Eclipse mode of mouse-oriented, busy GUIs, and another to the emacs mode [I think I've poorly described the attributes of the former and won't attempt to describe the latter].

This all said, I found it quite interesting to find these two recent blog entries today:

"I think that there is some confusion between mastering emacs, and using emacs. You can learn to use emacs in 1/2 an hour. Is that a shockingly long time? Yes. Great design usually makes its uses obvious.

But emacs makes up for that initial investment with accelerating returns --- where Notepad, or even Eclipse, you stop gaining power and knowledge relatively quickly, emacs is like the universe: no matter how long you look at it, there is always more to learn --- and the best part is the more you learn, the faster you can learn more."- comment by Sam Bleckley on Learn Emacs in Ten Year post
With a little looking around, I found a couple more blog postings that capture some of the emacs-ness of emacs that might be of interest:
XKCD can have the final word (for now) on emacs:

Tuesday, June 23, 2009

Bibliography: The Socioeconomic Effects of Public Sector Information

Chapter 14 (Measuring the Social and Economic Costs of Public Sector Information Online: A Review of the Literature and Future Directions by Paul F. Uhlir, Raed M. Sharif, and Tilman Merz) of The Socioeconomic Effects of Public Sector Information (PSI) on Digital Networks has a useful bibliography of links in this area. I have reproduced the bibliography portion of the chapter below:

The Socioeconomic Effects of Public Sector Information

Paul Uhlir (who recently spoke at the session I was chairing at the Ottawa ICSTI2009 conference) has produced a report for the U.S. National Academies entitled The Socioeconomic Effects of Public Sector Information (PSI) on Digital Networks, which is a collection of papers on PSI policy from a number of OECD countries.

Paul is also on the U.S. National Committee for CODATA.

Table of Contents:
  • Overview of U.S. Federal Government Information Policy Nancy Weiss, Institute of Museum and Library Services, United States
  • PSI Implementation in the UK: Successes and Challenges Jim Wretham, Office of Public Sector Information United Kingdom
  • The Value to Industry of PSI: The Business Sector Perspective Martin Fornefeld, MICUS Management Consulting Germany
  • Achieving Fair and Open Access to PSI for Maximum Returns Michael Nicholson, PSI Alliance United Kingdom
  • Public Sector Information: Why Bother? Robbin te Velde, Dialogic The Netherlands
  • Measuring the Economic Impact of the PSI Directive in the Context of the 2008 Review Chris Corbin, ePSIplus United Kingdom
  • Different PSI Access Policies and Their Impact Frederika Welle Donker, Delft University of Technology The Netherlands
  • The Price of Everything but the Value of Nothing Antoinette Graves, Office of Fair Trading United Kingdom
  • Enhancing Access to Government Information: Economic Theory as It Applies to Statistics Canada Kirsti Nilsen, University of Western Ontario Canada
  • Assessing the Impact of Public Sector Geographic Information Max Craglia, Institute for Environment and Sustainability, JRC Italy
  • Assessing the Economic and Social Benefits of NOAA Data Online Rodney Weiher, NOAA United States
  • Exploring the Impacts of Enhanced Access to Publicly Funded Research John Houghton, Victoria University Australia
  • Measuring the Social and Economic Costs of Public Sector Information Online: A Review of the Literature and Future Directions Paul F. Uhlir, Raed M. Sharif, and Tilman Merz
Thanks to Tracey for pointing this out.

Monday, June 08, 2009

ICSTI Conference: Managing Data for Science

CISTI (Canada Institute for Scientific and Technical Information) is hosting the ICSTI (International Council for Scientific and Technical Information) conference "Managing Data for Science" here in Ottawa at the LAC (Library and Archives Canada). The conference runs June 9-10 (1 1/2 days) and has an excellent international single-stream program structured into four sessions: "Foundations", "Libraries", "Data Services" and "Semantic Science". I will be attending all sessions and will also be moderating the "Semantic Science" session.

Wednesday, May 27, 2009

IBM on Linux: "Lean, clean, and green"

IBM developerWorks has an article (Linux: Lean, clean, and green: How GNU/Linux is becoming more eco-friendly - 26 May 2009) which examines some of the Green benefits of the Linux operating system. It focuses primarily on the low resource demands Linux has on systems (as well as its support for older systems), thus extending the life of machines that would otherwise be junked. Also discussed is virtualization and aspects of the Linux OS that reduce power consumption in servers.

Additional Green Linux and Open Source resource:

Tuesday, May 26, 2009

Canadian Federal Natural Resources Department selects Open Source library system

This is old news from February, but I seem to have missed it: the NRCan (Federal Natural Resources ministry in Canada) libraries have chosen the Open Source Evergreen integrated library system (ILS). Kudos to my colleague George Duimovich and others at the NRCan library. Here is the opening search interface. It is great to see that sensible Web 2.0 and Open Source choices can be made in such organizations.

NSF Workshop report: Information Seeking Support Systems Workshop

The final report for the NSF Information Seeking Support Systems Workshop has been released.
"The general goal of the workshop will be to coalesce a research agenda that stimulates progress toward better systems that support information seeking."
From the executive summary:
Our nation and our world depend on citizens who are able to seek, assess, understand, and use diverse kinds of information. Much of the information we need is complex with different components held in disparate electronic sources and many of our efforts to gather, assess, and use this information are done in collaboration with others. Additionally, much of the information we need is not discretely anticipated, but rather emerges as seeking and reflection continues over time. Information seeking in the digital age is a kind of problem solving activity that demands agile and symbiotic coordination of human and cyber resources; in short, a fundamental kind of computationally-augmented thinking. Computation has expanded our ability to do scalable what if thinking that leverages the best capabilities of humans and machines to abstract, synthesize, and iterate intellectual actions, and today’s search engines are the primitives on the technical side of information seeking. We must rise to the challenge to move information seeking from search engine support that provides discrete items in response to simple queries to tools and services that support reflective and interactive search over time and in collaboration....[emphasis added]

...Three kinds of challenges are defined and preliminary steps toward meeting the challenges are presented in this report: robust models of human‐information interaction; new tools, techniques, and services to support the full range of information seeking activities; and techniques and methods to evaluate information seeking across communities, platforms, sources, and time. Special attention is given to collaborative information seeking and the need for industry‐academic collaboration. Much broader and intensive efforts on the part of the academy, government, and industry are required if we are to meet the grand challenges of usable and ubiquitous information seeking support systems that empower people to solve problems, create new knowledge, and increase participation in efforts to improve the global human condition [emphasis added]. Preliminary efforts as illustrated in this report provide promising directions, however, sustained efforts are urgently needed to support research that leads to understanding information seeking as computationally augmented learning and problem solving, better seamless and ubiquitous systems for supporting information seeking, methods for training people to practice effective and efficient information seeking, and techniques and measures for assessing the tools and practices.

JCDL 2009 Poster Session to also be in Second Life

The 2009 ACM/IEEE Joint Conference on Digital Libraries (JCDL) poster session will be held both in real life and in Second Life. This is the first time that the JCDL has done this, and it allows for remote participation in at least this part of the conference.

More information on all of the JCDL2009 sessions.
BTW, I will be chairing session #7 on Wednesday, June 17.

Tuesday, May 12, 2009

W3C: Service Modeling Standards Extend Reach of XML Family

The W3C has today announced Service Modeling Language 1.1 and SML Interchange Format 1.1 (SML-IF), two XML-based standards:
SML, SML-IF Enable Validation of Sets of XML Documents

To illustrate what SML adds to the XML ecosystem, consider what happens when someone purchases an airline ticket. Suppose the reservation information is stored as an XML document that includes passenger information. The reservation also refers to a second XML document that stores departure time and other information about the flight. One department manages customer information, another manages flight information. Before any transaction with the customer, the airline wants to ensure that the system as a whole is valid. SML allows the airline to verify that certain constraints are satisfied across the reservation and flight data. This makes it easier to manage inconsistencies, and to do so without writing custom code. As a result, the airline lowers the cost of managing tasks such as informing passengers when flight times change.

An organization may also find that it needs to apply additional constraints when using data in a particular context, for example because of local laws. Developers can use SML to layer on context-specific constraints without duplicating content. -From the W3C Press release
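To make the airline example a bit more concrete, here is a rough Python sketch of the kind of hand-written cross-document check that, per the press release, SML is meant to make unnecessary. To be clear, this is not SML and does not use any SML tooling; the element and attribute names (reservation, flightRef, flight, number, departure) and the sample data are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: all element/attribute names are assumptions,
# not taken from the SML specification or the press release.
RESERVATION_XML = """
<reservation id="R100">
  <passenger>A. Traveller</passenger>
  <flightRef number="AC123" departure="2009-06-09T08:30"/>
</reservation>
"""

FLIGHT_XML = """
<flights>
  <flight number="AC123" departure="2009-06-09T09:15"/>
</flights>
"""


def check_reservation(reservation_xml, flights_xml):
    """Return a list of cross-document violations (empty if consistent)."""
    reservation = ET.fromstring(reservation_xml)
    # Index the flight document by flight number for lookup.
    flights = {f.get("number"): f
               for f in ET.fromstring(flights_xml).iter("flight")}

    problems = []
    ref = reservation.find("flightRef")
    flight = flights.get(ref.get("number"))
    if flight is None:
        # The reservation points at a flight the other document lacks.
        problems.append("unknown flight " + ref.get("number"))
    elif flight.get("departure") != ref.get("departure"):
        # Both documents exist, but they disagree on the departure time.
        problems.append("departure mismatch for " + ref.get("number"))
    return problems


print(check_reservation(RESERVATION_XML, FLIGHT_XML))
```

Each document here could be perfectly valid on its own; the problem only shows up when the two are checked together, which is the gap the press release describes SML as filling without custom code like this.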

Friday, May 08, 2009

First issue of "Journal of Information Architecture"

Volume 1, Issue 1 (Spring 2009) of the Journal of Information Architecture is now available. Topics for this journal are (from the site):
  • Theoretical foundations of information architecture;
  • Pervasive information architecture;
  • History of information architecture;
  • Information architecture techniques and best practices; card sorting; freelisting;
  • Way-finding in digital environments; human information seeking; human information interaction; navigation and navigation behaviors; findability;
  • Labeling and representation in digital environments;
  • Organization of information; pace layering; taxonomies; folksonomies; collaborative tagging;
  • Social media; social computing; social networks;
  • Information architecture and digital genres;
  • Information architecture development in organizations, in communities, in society, globally;
  • The role of information architecture in information systems development;
  • The value of information architecture for organizations;
  • The impact of information architecture in organizational information policy and information strategy;
  • Multilingual, multicultural information architecture; global information architecture;
  • Information architecture design and evaluation for various applications in business, managerial, organizational, educational, social, cultural, and other domains;
  • The impact of information, information architecture or information technology on people's attitude, behavior, performance, perception, and productivity;
  • Information architecture education.
The Volume One Issue One table of contents is:

Tuesday, April 28, 2009

EU Digital Preservation Workshop: Planets in Denmark

"Digital Preservation – the Planets way" is a Planets (Preservation and Long-term Access through NETworked Services) outreach and training event to be held at the Royal Library in Copenhagen, Denmark, on 22-24 June 2009.

Monday, April 27, 2009

H1N1 Swine Flu TimeMap

The very brilliant Rod Page (of whom I've blogged previously) has made a very cool time/map visualization mashup showing confirmed and suspected cases. The application uses the RSS feeds from the original 2009 Swine Flu Outbreak Map.

Thursday, April 16, 2009

Energy is not the problem: energy source mix is the problem

Bill St. Arnaud made a simple point today about ICT and global change (in addition to a number of other interesting and important points) at the ISACC (ICT Standards Advisory Council of Canada) meeting; it generalizes to all other industries and activities:
Energy isn't the issue: carbon is the issue.
If you have an energy expensive process, but all the energy comes from wind, solar, hydro (carbon neutral), that is OK.
And energy efficiency doesn't really get you anywhere.
I will be speaking this afternoon; my talk is titled "World Wide Web Consortium (W3C) Standards for Industry and Governments".

Bill St. Arnaud's blog.

Wednesday, March 25, 2009

Appointed to Canadian National Committee for CODATA

I am happy to announce that I have just been appointed to the Canadian National Committee for CODATA, which is an ICSU committee. I have been an observer on this committee since 1999, and look forward to continuing my work with this committee. :-)