Top 10 Mac Software for 2013 December 26, 2013

Posted by Andre Vellino in Software Review.
2 comments

This is a top 10 list of Mac software for 2013. Most of these programs are not new, but many are new to me this year.

(1) 1Password : https://agilebits.com/onepassword

This is easily the most useful and valuable piece of software I own. It’s a password vault that securely generates and stores passwords for all your logins. Free and Open Source equivalents include Password Safe and KeePass, but 1Password beats them all on user interface, and that’s important when you use something every day. It’s true that Open Source alternatives have the security advantage that anyone can inspect the code for back-doors and security mistakes, but I am willing to trust AgileBits. Maybe it’s because they’re Canadian.

(2) BoxCryptor : https://www.boxcryptor.com/

Worry about storing your files in the cloud no more. Boxcryptor provides file encryption for cloud storage services, including Dropbox, Google Drive and SkyDrive. For file encryption or even disk-level encryption I would have recommended TrueCrypt, except that it hasn’t been updated in more than a year. For Windows systems, I would suggest AxCrypt.

(3) GPG Mail : https://gpgtools.org/

GNU Privacy Guard (GPG) is a tool for encrypting, decrypting, signing and verifying files or messages. Despite adding my GPG signature on all my e-mails for the past 5 months, no one has yet sent me an encrypted e-mail, but once everyone uses it, I predict it will be the spam-killer app.

(4) Things 2 : http://culturedcode.com/things/

If you’re not the most organized person in the world you’ll be grateful for this tool: it helps remind you of what you need to do, and when you need to do it.

(5) TeXStudio : http://texstudio.sourceforge.net/

TeX is 35 years old and still going strong.  TeXStudio is a pretty good text editor and a pretty interface for this rather complicated typesetting system.  Essential for writing camera-ready copy, particularly if it involves mathematical equations and symbols.

(6) Pixelmator : http://www.pixelmator.com/

If you don’t have the patience to learn Photoshop or even Gimp, Pixelmator likely does most of what you’ll want if you are a casual photo editor.

(7) WhatSize : http://www.whatsizemac.com/

Even a rarely used piece of software can be quite valuable. When free space starts disappearing from your disk, you really need to see how that space is allocated. WhatSize does only one thing but it does it well.

(8) GoBan : http://www.sente.ch/

I don’t play computer games much, but when I do it’s the game of Go – still by far the most beautiful board game ever invented. This app has a very nice interface for playing others online or against Go software like GNU Go or Pachi.

(9) Stellarium : http://www.stellarium.org/

Starry Night used to be the king of the hill of sky simulators for astronomy – and perhaps it still is – but Stellarium is quite a fine Open Source alternative that is quite a bit less complicated.

(10) Audacity : http://audacity.sourceforge.net/

Audio editing software is probably frustrating no matter how good the user interface. And Audacity’s user interface is frustrating!  But I keep coming back to it because it’s so available and does so much that’s useful (noise reduction, normalization, export to various formats, etc.)

Needless to say, I have no commercial or other interest in any product mentioned above and I have paid for all my personal product licenses for the commercial software listed above: 1Password, Things 2, WhatSize, Pixelmator and GoBan.

The HIP-index: A Better Measure of Research Impact November 16, 2013

Posted by Andre Vellino in Bibliometrics, Citation Analysis, Statistical Semantics.
4 comments

Eighteen months ago, Xiaodan Zhu, Peter Turney, Daniel Lemire and I embarked on an experiment to see if we could find the features in an article that would let us distinguish its critical references from its incidental ones. We thought that being able to identify the crucial references would help us devise a better researcher productivity index than the h-index.

I am happy to report that we were successful!  In September I gave an overview presentation to the U. Ottawa School of Information Studies that describes the problem we were trying to solve, our methods and results. Since then our paper has been accepted for publication in JASIST, most likely in a 2014 issue.

To automatically identify the subset of references in a bibliography that have a central academic influence on the citing paper, we examined the effectiveness of a variety of candidate features – positional features, semantic features, context features and citation-frequency features – that might be predictors of the academic influence of a citation. We asked the authors of 100 papers to identify the key references in their own work and created a dataset in which citations were labeled according to their academic influence (note that this dataset is made available under the Open Data Commons Public Domain Dedication and License). We then used supervised machine learning to perform feature selection and found a model that predicts academic influence effectively using only four features.

The performance of these features inspired us to design an influence-primed h-index (the hip-index). Unlike the conventional h-index, which counts every citation equally, the hip-index weights each citation simply by how many times the reference is mentioned in the citing paper. We show that the hip-index has better precision than the conventional h-index at predicting ACL Fellows on a collection of 20,000 articles from the ACL Digital Archive of Research Papers.
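To make the difference between the two indices concrete, here is a minimal Python sketch. It assumes the weight of a citation is just the raw number of times the citing paper mentions the reference; the exact weighting used in the paper may differ. The function names and the sample data are made up for illustration.

```python
def h_index(scores):
    """Largest h such that at least h papers have a score >= h."""
    ranked = sorted(scores, reverse=True)
    h = 0
    for rank, score in enumerate(ranked, start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h


def citation_counts(papers):
    """Conventional score: every citing paper counts once."""
    return [len(mentions) for mentions in papers]


def mention_weighted_counts(papers):
    """Assumed hip-style score: every citing paper counts once per mention."""
    return [sum(mentions) for mentions in papers]


# Hypothetical author with four papers. Each inner list holds one number per
# citing paper: how many times that citing paper mentions the author's paper.
author_papers = [
    [1, 1, 1],  # cited by 3 papers, mentioned once in each
    [4, 2],     # cited by 2 papers, mentioned 4 and 2 times
    [1],        # cited by 1 paper, mentioned once
    [3, 3, 1],  # cited by 3 papers, mentioned 3, 3 and 1 times
]

conventional = h_index(citation_counts(author_papers))
influence_primed = h_index(mention_weighted_counts(author_papers))
print(conventional, influence_primed)  # prints "2 3"
```

In this made-up example the second paper has only two citations, but both citing papers lean on it heavily, so it counts for more under the mention-weighted score.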

P.S. (Nov. 18) Daniel Lemire in his related blog post gives the following credit, which I entirely share: Most of the credit for this work goes to my co-authors. Much of the heavy lifting was done by Xiaodan Zhu.

Protecting Yourself from Spies September 7, 2013

Posted by Andre Vellino in Ethics, Human Rights, Information.

I once worked for a company that makes the kind of software that the NSA and CSIS appear to be using to monitor email and internet metadata (see the Guardian for a quick survey of the metadata that exists in different digital media).

I might add that I think there is nothing morally wrong with the surveillance technology itself – indeed it can be used to protect privacy and prevent harm. It is more a question of whether our privacy rights are violated when the technology is used and whether those rights should be relinquished to the state for the greater good.

The recent revelation that even encrypted transactions cannot be presumed private adds fuel to my concern that people don’t make informed decisions about what information they disclose, and that they don’t try to protect their information even when it is quite easy to do. This post highlights some software solutions you can use to reduce the likelihood that your private information is monitored.

Web Browsing

Let’s start with web browsing. The amount of information that a web server can glean from your web browser’s attempt to connect with it is quite voluminous. To see what a server can find out about your browser and computer, try this link:

http://www.mybrowserinfo.com/detail.asp?bhcp=1

Furthermore, the combination of these browser characteristics, while it may not provide personal identity information, can still identify you uniquely. Try this test from the Electronic Frontier Foundation:

https://panopticlick.eff.org/

When I try it, they assert that my browser information-collection, i.e. my browser “fingerprint” is unique among the 3M or so they have tested.
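As a rough back-of-the-envelope illustration of what “unique among 3M” means: a fingerprint that singles out one browser among N carries at least log2(N) bits of identifying information. The Python sketch below just does that arithmetic; the function name is mine, and the 3 million figure is the approximate sample size mentioned above, not an exact number.

```python
import math


def identifying_bits(population_size):
    """Bits of information needed to single out one member of a population."""
    return math.log2(population_size)


# Approximate number of browsers tested, per the figure quoted above.
bits = identifying_bits(3_000_000)
print(f"at least {bits:.1f} bits of identifying information")
```

For 3 million browsers this comes to a little over 21 bits, which a handful of characteristics such as installed fonts, plugins and screen size can easily supply between them.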

There is not much you can do to limit the uniqueness of your browser’s fingerprint other than having a generic computer and a generic browser configuration.  Using the TOR browser / network (see below) helps to reduce the uniqueness of your browser-fingerprint, but there are tradeoffs (response speed for one thing).

HTTPS

There was a time when I thought that HTTP-Secure (“https”) was a reliable way of ensuring that information between your browser and the end-point server (e.g. a bank) could not be intercepted or tampered with. The revelation that the NSA is able to decrypt such communications reduces my confidence that this method is “secure” in any meaningful way, but at least it offers some degree of assurance that not just anybody can either read or tamper with such transactions.

If that level of confidence is sufficient for you, then you might consider adding the HTTPS Everywhere plugin (brought to you by the Electronic Frontier Foundation) to your browser.

TOR

This browser / encrypted network system describes itself as

…free software and an open network that helps you defend against a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships, and state security

In principle, the Onion Routing technology behind it offers the end-user a high degree of anonymity and untraceability. However, if anyone can break SSL, the next step is to break TOR.

File and file system encryption

If you want to protect computer files, or indeed a whole file system (e.g. in case your laptop is stolen or your USB key is lost), you should try TrueCrypt. It offers operating-system level, on-the-fly encryption, file-level encryption and partition encryption. Best of all, TrueCrypt is open source (so you can check for yourself, if you have the patience and know-how, that there are no backdoors for the NSA or CSIS).

Also, for Windows PCs (or Wine-enabled Macs), AxCrypt is a pretty good and easy-to-use tool for encrypting files.

Email

Securing email is a bit trickier. There is no meaningful way to encrypt e-mail metadata. The very nature of e-mail addressing and store-and-forward protocols like SMTP requires that metadata, which, of course, is a fundamental design flaw with email.

However, if you want to protect the content of what you say from prying eyes, you can try GNU Privacy Guard (GPG). Its precursor was PGP (Pretty Good Privacy) and Edward Snowden thinks it works.

Conclusion

It appears that most people think that their privacy is worth sacrificing in exchange for safety and protection by government.  This is short-sighted. A benevolent government in whose integrity you trust might do the right thing at any point in time, but the issue is a matter of principle. You should not relinquish your right to privacy to the state.

As Bruce Schneier wrote in The Guardian:

By subverting the internet at every level to make it a vast, multi-layered and robust surveillance platform, the NSA has undermined a fundamental social contract…..

We have a moral duty to [dismantle the surveillance state], and we have no time to lose.

In the meantime we can at least do better to protect ourselves.

Some Problems with MOOCs August 17, 2013

Posted by Andre Vellino in Education, Ethics.

Michael Sandel’s acclaimed undergraduate lectures at Harvard on Justice are now offered as a MOOC at edX, and watching them for a second time gave me an insight into a few of the significant shortcomings of recorded lectures.

First, they have a limited shelf-life. However perennial the issues are (e.g. “What is Justice?”), what makes it a learning experience for the students is the process of investigation and enquiry.  While Sandel’s recordings of his lectures are a master class on how to engage students, how to foster critical thinking and make issues pertinent and alive,  their very nature as recordings ultimately limits them to being historical documents.

For instance, since 2005 – the year in which these lectures were recorded – the richest person in the world (taken as an example of [potential] financial injustice) is no longer Bill Gates (it’s Carlos Slim Helu), significant examples of greed and inequality are better illustrated with the 2007-2008 financial crisis and there have been many changes in U.S. politics since the election of President Obama.

At least as importantly, watching these lectures leaves the viewer wanting interaction with the lecturer. Listening to young minds grappling with the issues is pedagogically interesting, but as a student what you really want is to be in the audience asking questions, taking positions and arguing with the lecturer and fellow students.

As a taste of how a student might benefit from a Harvard education, having a course such as this on-line is wonderful. And it is clearly of value to anyone who would be unable to attend or afford such an education.  But it is no substitute for the real experience.

So, for these two reasons alone, I think that MOOCs will, at best, be a complement to a university education, not an alternative to it.

Freedom Abhors a Chill March 24, 2013

Posted by Andre Vellino in Ethics.

Is Clippy the Future? February 8, 2013

Posted by Andre Vellino in Artificial Intelligence, Collaborative filtering, Data Mining.

The student-led Information without Borders conference that I attended at Dalhousie yesterday was truly excellent – as much for its organization (all by students!) as for its diverse topics: the future of libraries, cloud computing, recommender systems, sciverse apps and the foundations for innovation.

At the panel discussion in which I participated, I suggested that to predict the future one need only look at the past. To predict the iPad one needed only look at the Apple Newton (which died in 1998). What was the analog, I wondered, for an information retrieval tool, now dead and buried, that might still evolve into something we all want in the field of information management?

I proposed that the future of information retrieval might be something like an evolved Office Assistant (affectionately nicknamed “Clippy”) – the infamous, now deceased Microsoft Paperclip that assisted you in understanding and navigating Microsoft products.

My vision for a next generation Clippy was clearly not well articulated since it prompted the following tweet from Stephen Abram:

[Image: tweet from Stephen Abram]

I think that Siri (about which I posted a few years ago) belongs to the old Clippy style of annoying and in-the-way-of-what-I-want-to-do applications. I am surprised it has survived so long and was promoted by Apple so strongly. I predict it will join Clippy, Google Wave and Google Glass on the growing heap of unwanted technologies that were not ready for prime-time.

Watson (who is now going to medical school, and about which I also posted a couple of years ago) is, however, just the sort of Natural Language Understanding component technology that I have in mind for an interactive, personal information assistant. When a computer that now costs three million dollars with 15 terabytes of RAM can fit in your pocket and cost $500, a Watson-like system that understands natural language queries will be an important component of Clippy++.

What neither Watson nor Siri has – and what I foresee in my crystal ball as the most significant attribute of “Clippy++” – is personalization and autonomy. What will make true personalization possible with “Clippy++” is our collective willingness to accept the intrusion of a mechanical supervisor that learns from our behaviour about what we want, need and expect.

This culture-shift is happening right now – we gladly and willingly disclose our information consumption habits to supervisory software and data-analytics engines in exchange for entertainment and social networking. It won’t be long before we’re willing to do that for serious, personalized information management purposes as well.

The key, though, is going to be the interaction – the dialog that we have with Clippy++ – and it will have to have explanations for its actions and recommendations. That’s going to be the hallmark of its evolution to Machina Sapiens.
