
Human Assisted Search February 25, 2007

Posted by Andre Vellino in Search, Statistical Semantics.
2 comments

I tried the human-powered “search with guide” feature on the ChaCha search engine the other day. I can’t see human-guided search becoming a business success story in the mass market – for most purposes searching is becoming sufficiently easy that we don’t need help any more.

But the idea of having a human guide to help with sophisticated searches (which has been floating around in on-line libraries for a while) may work well in a scientific digital library where the help of a trained librarian or subject specialist could really be welcome. I’m optimistic that this kind of service will be offered in on-line science libraries both because my experience at ChaCha was quite good and because I believe there are situations where there aren’t likely to be automated alternatives.

(more…)

Global Warming February 21, 2007

Posted by Andre Vellino in Global Warming.

It is a little ironic that my considerable preoccupation with Global Warming found some peace during my visit to the Canadian Museum of Nature the other day. The museum is in the process of being renovated and the two re-vamped exhibits I saw (Fossils and Birds) were world-class. The whole experience was quite uplifting in a paradoxical kind of way.

(more…)

GapMinder February 20, 2007

Posted by Andre Vellino in Data Mining, Visualization.

Speaking of social science data, Richard Akerman pointed me to an interesting social data visualization tool. It comes from GapMinder (inspired by “Mind the Gap” announcements in the London Underground). Gapminder is a non-profit venture that develops and distributes free software for visualising human development data (population / CO2 emissions per capita / Internet access per 1000 people / infant mortality etc.) and plotting its change over time.

There’s quite an entertaining TED video-lecture by Hans Rosling, professor of international health and founder of GapMinder. Be prepared to reconsider a few misconceptions about “the third world” and to be impressed by how much a visualization tool can do for your understanding of social trends.

MetaData in Social Science February 13, 2007

Posted by Andre Vellino in Data Mining.
3 comments

I was aware that there are repositories of and search engines for many databases in various “hard science” disciplines like Chemistry and Astronomy, but until a few weeks ago, it hadn’t occurred to me that social scientists also have large and valuable collections of digital data and that these too are “published”.

(more…)

Darwin Biography February 6, 2007

Posted by Andre Vellino in Book Review.
2 comments

I have just finished reading a magnificent, two-volume biography of Charles Darwin by Janet Browne. She took some 15 years to write these books (Charles Darwin, Voyaging and Charles Darwin, The Power of Place) and the quality of the research shines! As one reviewer put it about Voyaging, in anticipation of The Power of Place, “if Browne’s second volume is as comprehensively lucid as her first there will be no need for anyone to write another word on Darwin”.

(more…)

Citations February 3, 2007

Posted by Andre Vellino in Citation, Digital library, Information retrieval.

A lot of “social information” can be gleaned from journal articles in a scientific digital library. The most obvious source of social information is found in citations. Citation indexes measure the number of times an article or monograph is referenced by other documents, hence giving a measure of the cumulative impact and relevance of an individual’s scientific research output. This simple measure has been improved upon by the Hirsch index (h-index), which measures citation relevance as a function of the distribution of citations received by a given researcher’s publications.
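To make the idea concrete, here is a minimal sketch of the h-index calculation under its usual definition: the largest number h such that h of a researcher's publications each have at least h citations. The function name and the sample citation counts are made up for illustration.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Illustrative (invented) citation counts for one researcher's papers.
print(h_index([10, 8, 5, 4, 3]))
```

A raw citation total would be dominated by a single heavily cited paper, whereas this measure depends on how citations are spread across the whole list.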

Members of the CISTI Research team are looking at the question of how to use networks of citations to rank search results by relevance, in the same way that web search results are sorted by PageRank. I am not working on citations myself, but I have been wondering whether it would be possible to improve that ranking measure by extracting more detailed information about citations. For example, one could (i) count the co-occurrence of different citations across a collection, (ii) count the number of occurrences of each citation inside the article and (iii) weight these citation occurrences according to their location in the article. (more…)
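The three counts suggested above could be sketched roughly as follows. This is only an illustration and not anything the CISTI team has built: the input format (each article as a list of citation mentions tagged with the section they appear in), the function name and the section weights are all assumptions of mine.

```python
from collections import Counter
from itertools import combinations

# Hypothetical weights: a citation in the methods or results section is
# assumed to matter more than one mentioned in passing in the introduction.
SECTION_WEIGHTS = {
    "introduction": 0.5,
    "methods": 1.0,
    "results": 1.0,
    "discussion": 0.75,
}


def citation_features(articles):
    """articles maps an article id to a list of (cited_id, section) mentions."""
    co_occurrence = Counter()  # (i) pairs of works cited together in an article
    in_text_counts = {}        # (ii) mentions of each citation inside an article
    weighted = {}              # (iii) mentions weighted by where they occur

    for article_id, mentions in articles.items():
        counts = Counter(cited for cited, _ in mentions)
        in_text_counts[article_id] = counts

        scores = Counter()
        for cited, section in mentions:
            # Unknown sections fall back to a neutral weight of 1.0.
            scores[cited] += SECTION_WEIGHTS.get(section, 1.0)
        weighted[article_id] = scores

        # Each unordered pair of distinct cited works counts once per article.
        for pair in combinations(sorted(counts), 2):
            co_occurrence[pair] += 1

    return co_occurrence, in_text_counts, weighted


# Invented sample data: two articles citing works x, y and z.
sample = {
    "A": [("x", "introduction"), ("y", "methods"), ("x", "results")],
    "B": [("x", "methods"), ("y", "discussion"), ("z", "introduction")],
}
co, counts, scores = citation_features(sample)
print(co[("x", "y")], counts["A"]["x"], scores["A"]["x"])
```

How such features would then be combined into a ranking score is left open here; the sketch only shows that the raw counts are cheap to extract once citation mentions have been located in the text.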