Friday, October 27, 2006

Extensible Text Framework (XTF): FLOSS platform for access to digital content
XTF is the California Digital Library's amazing access platform for digital content. It is based on Lucene, a tool that is well known as a scalable and stable full-text engine. But XTF is more than Lucene: it is a full end-to-end system, offering über-configurable indexing, querying and display. It is Java-based, has a completely XSLT-driven presentation layer, is extensible to things like Shibboleth, and has some very nice additional features like an OAI-PMH provider and SRU. From what I can tell it does not have an SOA architecture, but it offers a high degree of modularity which could easily be wrapped in Web services, etc.

Wednesday, October 25, 2006

Google not cashing-in on Amazon linking?
In my ever-vigilant interest in making sure that Google has covered all the funding streams it can ;-) , it seems to me that it is missing an important one: whenever I search Google and there is a link to a book on Amazon, the URL does not seem to have an Amazon associates ID. Why isn't Google an Amazon Associate member, cashing-in on the click-throughs to Amazon, getting a % of the sales from people it directs to Amazon? They are likely the top forwarder to Amazon and it shouldn't be too hard to insert their Amazon Associates ID etc. into their Amazon-bound URLs...
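As a rough illustration of how little work that insertion would be, here is a minimal sketch in Python. It assumes the associate ID travels in a `tag` query parameter, which is how Associates links are commonly tagged; the function name `add_associate_tag`, the ID `example-20` and the sample URLs are all made up for illustration, and the host check is deliberately simplistic.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def add_associate_tag(url: str, tag: str) -> str:
    """Return url with an associate tag added, if it points at amazon.com.

    Assumption: the associate ID is carried in a `tag` query parameter.
    URLs on any other host are returned unchanged.
    """
    parts = urlparse(url)
    host = parts.hostname or ""
    if not (host == "amazon.com" or host.endswith(".amazon.com")):
        return url

    # Drop any existing tag so the result carries exactly one.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "tag"
    ]
    query.append(("tag", tag))
    return urlunparse(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical product URL and associate ID.
    print(add_associate_tag("https://www.amazon.com/dp/0123456789?ref=x", "example-20"))
```

Anything not bound for Amazon passes through untouched, so a rewrite like this could in principle be applied to every outbound link in a result page.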

Saturday, October 21, 2006

Proprietary vs. Open Source development analogy:
like training-for-a-race vs. running-a-race
In reading about the new (to me, at least) transactional database engine for MySQL (v >= 5.1) called the PrimeBase XT storage engine (PBXT), I ran across an interview with its creator, Paul McCullagh. It seems that Paul came from the proprietary software development world, and was surprised by the response of the Open Source community around this project, and by the new friends he has found. He felt it was a very different environment from what he was used to. In his words, from the article:
I like to take marathon running as an example. Think of the difference between training for a marathon and running a race. The closed source industry is like training for a marathon. You are basically on your own. The open source community is like running a race. Not because you want to win. Most people don't run a marathon to win, they run to complete. But during the race you experience a comradeship and sense of doing something together that makes running much easier then in training.

Going by this comparison, I have been in training too long.

Thursday, October 19, 2006

Big Ball of Mud pattern
Reading Grady Booch's very well developed "Snake Oil-oriented Architecture" [a must-read for anyone doing or buying SOA] in his blog (Software architecture, software engineering, and Renaissance Jazz) brought me to a truly joyous article for a pattern that I had forgotten about: the Big Ball of Mud pattern. Read and enjoy (and remember architectures of days long gone - but still with us!). :-)

Tuesday, October 17, 2006

Tapping the power of text mining
In his closing plenary to the Access 2006 conference in Ottawa, Clifford Lynch listed text mining as one of the exciting areas of activity for the near future, soon (hopefully!) realizing its potential for discovery on large text corpora. In the September 2006 issue of Communications of the ACM, Fan et al. have a good general introduction to this area.

Fan, W., Wallace, L., Rich, S., and Zhang, Z. 2006. Tapping the power of text mining. Commun. ACM 49, 9 (Sep. 2006), 76-82. DOI=

More text mining
ACM & IEEE team-up for Wiki for Discussing and Promoting Best Practices in Research
The scope is somewhat(!) narrower than the title suggests, focusing on the challenges in running and managing conferences in the areas on which the ACM and IEEE focus. The Wiki includes categories dealing with: acceptance rates (too high & too low), creative ideas (like lightning talks), allowing author responses to reviewer concerns, (technical) competitions, tracking reviews (if a paper is rejected by conference X and is usually re-submitted to conference Y, then with some organizing & cooperation the two conferences can have the reviews carried over (shared)), two-phase reviewing, double-blind submissions, and scaling of programme committees using hierarchy and not agglomeration.

Hill, M. D., Gaudiot, J., Hall, M., Marks, J., Prinetto, P., and Baglio, D. 2006. A Wiki for discussing and promoting best practices in research. Commun. ACM 49, 9 (Sep. 2006), 63-64.

Saturday, October 14, 2006

Stan Rueker
I've just learnt about the nora project, an amazing visual-based search construction interface, from Stan Rueker, University of Alberta, in his David Binkley Award presentation at Access 2006 in Ottawa today. I can see why he was presented this award, as he is building truly beautiful and functional prototypes...

Friday, October 13, 2006

Eclipse Plug-in Architecture Article
ACM Queue magazine this month has a very good article on Eclipse, The Heart of Eclipse, focused on its plug-in architecture.

ACM Queue vol. 4, no. 8 - October 2006
by Dan Rubel, Instantiations

Thursday, October 12, 2006

Access 2006: Day 1: v1.1
The Hackfest (and Ad Hoc Fest) results were presented at the Access 2006 conference. For the projects that were worked on, please go to Donna Dinberg (LAC) organized the effort, with Dan Chudnov, Ross Singer and Art Rhyno supporting

The original 40 spaces were taken up and the 28 people on the waiting list were eventually added to an additional fest, called the Ad Hoc Fest. The Hack Fest was hosted at Carleton University, and the Ad Hoc Fest was held at the Library and Archives of Canada.
WWW 2007 Call for Papers out
The 16th International World Wide Web conference in Banff (Alberta, Canada) CFP is out. Hopefully see you all there... :-)
Access 2006: First Day v1.0
While I am here at the Access 2006 conference, I can't help feeling that I am missing out on some great Web 2.0 discussions, knowing that I will not be going out with Michael Stephens et al. to the Thai restaurant at Internet Librarian International in London, which I did last year. Richard Wallis reports in Panlibis that he has the good fortune of doing so this year.

That said, I am sure that Access will not disappoint, as it has always been a great conference for the library techie crowd...

Wednesday, October 11, 2006

Digital Libraries Come of Age...Yet again...
In the September/October 2006 issue of IEEE Computing in Science & Engineering, Pam Gorder[1] presents a view of digital libraries. Project Gutenberg (on which perhaps a little too much time is spent), the ACM/IEEE Joint Conference on Digital Libraries, DSpace, Fedora and the US National Science Digital Library (NSDL) are all discussed, as well as Google's digitization efforts (and copyright woes).

Pam Frost Gorder, Digital Libraries Come of Age, Computing in Science & Engineering, vol. 8, no. 5, September/October 2006, pp. 6-10.

Tuesday, October 03, 2006

XML11: Amazing AJAX Toolkit
XML11 is a very exciting AJAX toolkit inspired by the X11 protocol. It allows Java applications to be rendered on a web browser, but also under Java Swing and Java AWT. In addition (and very wild), as there is an X11 server implemented (WeirdX) in Java, you can also have an X11 application working in a web browser!

Seeing xcalc and xeyes rendered on Firefox, via AJAX, WeirdX, AWT and X11 is borderline bizarre. Check out the Google TechTalks video by Arno Puder, it is quite amazing. I would have liked to have seen Firefox running inside of this convoluted set of protocols and environments inside of Firefox.

He also coins a wonderful phrase: "JavaScript is the assembly of the Web...", basically claiming that while JavaScript is fundamental to the Web (or at least AJAX), no sane person wants to use it (like assembler today: it is a "pain" to write in). You would prefer to use a proper high-level programming language like Java, C++, etc. I have to agree...

Logic code can either run on the original platform (X11, Java) or can run on the client via a Java-bytecode-to-XML-to-XSLT-to-JavaScript (wow!!) cross-compiler. This is configurable at the class level, I believe. If something on the browser needs a component on the server, some transparent middleware looks after making this connection...

They are also looking at getting VNC (via a VNC Java client) to work inside of a browser, and looking at something that works with .NET...

Some other Java/AJAX toolkits/frameworks: