Archive for the 'tagclouds' Category:

FeedVis 2.0: custom visualization for your feeds

this is what feedvis looks like

My FeedVis project, the interactive tagcloud for a group of feeds, has been out for a week now, and I’ve been thrilled at the positive response I’ve gotten so far. One rather glaring problem with the program, though, was that you could only look at the top 50 edublogs.

Not anymore. After a few late nights, I’ve got a beta system for uploading and analyzing your own sets of feeds. You just upload your OPML file, wait a few minutes, and you’re set: FeedVis gives you a custom page that you can bookmark and return to anytime you like; it’ll continue to update every time you visit. You can also browse visualizations of other people’s feeds.
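For anyone curious what "upload your OPML" involves under the hood, here is a minimal sketch of pulling feed URLs out of an OPML file. It is written in Python for illustration and is not FeedVis's own code; the function name `feed_urls_from_opml` and the sample feed addresses are made up for the example. It assumes the common convention that each subscribed feed is an `outline` element carrying an `xmlUrl` attribute.

```python
import xml.etree.ElementTree as ET


def feed_urls_from_opml(opml_text):
    """Return the feed URL of every outline element that has one."""
    root = ET.fromstring(opml_text)
    # Outlines without an xmlUrl attribute are typically folders, so skip them.
    return [
        outline.attrib["xmlUrl"]
        for outline in root.iter("outline")
        if "xmlUrl" in outline.attrib
    ]


# Hypothetical sample file: one folder containing two feeds.
SAMPLE_OPML = """
<opml version="1.1">
  <head><title>My edublogs</title></head>
  <body>
    <outline text="Edublogs">
      <outline text="Blog A" type="rss" xmlUrl="http://example.com/a/feed"/>
      <outline text="Blog B" type="rss" xmlUrl="http://example.org/b/rss.xml"/>
    </outline>
  </body>
</opml>
"""

if __name__ == "__main__":
    for url in feed_urls_from_opml(SAMPLE_OPML):
        print(url)
```

Walking the whole tree with `iter` rather than only the top level means feeds nested inside folders are picked up as well.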

It’s pretty untested, and I’m sure use will uncover some bugs.  But it’s got potential; I’m excited to see what people think.

FeedVis: a deeper tagcloud for edublogs

a screenshot of feedvis

Tagclouds have value, but, as I’ve written before, they have a number of shortfalls as well. I’ve just finished my attempt to remedy some of these problems: FeedVis. It’s an animated tagcloud that lets you compare word frequencies across different time periods and authors, then check out the posts that used the words. The demo is using the feeds for Scott McLeod’s Technorati-compiled list of top 50 edublogs, since that’s what got me started thinking about feeds and tagclouds in the first place (although the program will work with any set of feeds). More details about how it works are on the demo page.
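To make "comparing word frequencies across time periods" concrete, here is a rough sketch in Python. It is an illustration of the general idea, not how FeedVis itself is implemented; the function names, the tokenizing regex, and the choice to compare each word's share of all words (rather than raw counts) are assumptions for the example.

```python
import re
from collections import Counter


def word_counts(posts):
    """Count lowercase word occurrences across a list of post texts."""
    counts = Counter()
    for text in posts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def frequency_change(earlier_posts, later_posts):
    """Map each word to the change in its share of all words between samples."""
    before = word_counts(earlier_posts)
    after = word_counts(later_posts)
    # Guard against empty samples so the division below is always defined.
    total_before = sum(before.values()) or 1
    total_after = sum(after.values()) or 1
    return {
        word: after[word] / total_after - before[word] / total_before
        for word in set(before) | set(after)
    }


if __name__ == "__main__":
    change = frequency_change(["blogs blogs wiki"], ["blogs video video"])
    # Print words from biggest gain to biggest loss.
    for word, delta in sorted(change.items(), key=lambda item: -item[1]):
        print(f"{word}: {delta:+.2f}")
```

Using shares instead of raw counts keeps a busy week from making every word look like it is growing. The same comparison works per author by passing in each author's posts as the two samples.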

I think what I’m really most excited about is the way this uses animation to let you actually see the words changing from one sample to the next. Motion is such an important part of the way we see the world, and it’s been underemployed in information visualization, I think (although this is changing; Hans Rosling’s TED talks have gotten a lot of buzz, for instance).

The project has been really fun, and a great learning experience; it’s gotten me really pumped about infoVis for learning about online interaction. I think there is a lot of potential there for ed tech research. I’m also pretty excited about programming; I started learning in February (with PHP), and then started JavaScript a couple months ago. It’s been a really mind-expanding experience, and I’m looking forward to my next project, probably once I get done with grad school apps.

The trouble with tagclouds

Tag clouds, those darlings of early web 2.0, have been seeing something of a backlash lately. Zeldman was suggesting that tag clouds were the new mullets back in 2005; more recently, ReadWriteWeb wondered if tagclouds were dead altogether. The main complaint in both cases wasn’t that tag clouds were just no good, but that they’d become trendy and thus overused. Later criticism has argued that the increasingly common practice of using tag clouds for navigation is fundamentally flawed.

But the problems of tag clouds–and their close cousin, word clouds–go deeper, to their usefulness as a visualization method.  These aren’t problems with how the method is used or misused, but with the idea itself.

Moritz Stefaner points out (and presents his own solution for) several problems with the format:

  • Tag clouds give a great picture of the “big head” of tags: the most frequently used tags that change little over time. They overlook, though, the “long tail,” where many of the interesting tags are located.
  • Tag clouds don’t show change over time. Chirag Mehta has created a tag cloud with a time slider, which helps with this. But as Stefaner points out, animating tag clouds doesn’t work very well, as the changing size of the cloud moves the words around so they’re hard to follow.
  • Finally, tag clouds don’t show the relationships between tags (pretty much everyone who criticizes tag clouds mentions this one).

The IBM Many Eyes site has one of the best tag cloud tools I’ve seen (it does word clouds, too), allowing users to get lots of data from each tag while keeping the interface clean and simple. They make a great point about an inherent limitation of the tool: the size and shape of the words themselves aren’t controlled for. So, long words seem more dominant than short ones, and words with lots of ascenders and descenders (the vertical strokes of letters like ‘b’ or ‘p’) tend to dominate as well. This can subtly alter the overall gist that tag clouds are supposed to deliver.
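The word-length problem is easy to see with a little arithmetic. The sketch below is in Python and makes several assumptions that are mine, not Many Eyes': font size is scaled by the logarithm of the count (one common choice among many), the pixel range is 12 to 48, and each character is treated as roughly 0.6 times the font size wide. The function names are hypothetical.

```python
import math


def font_size(count, min_count, max_count, min_px=12, max_px=48):
    """Map a positive count to a font size using log scaling (assumed)."""
    if max_count == min_count:
        return (min_px + max_px) / 2
    scale = (math.log(count) - math.log(min_count)) / (
        math.log(max_count) - math.log(min_count)
    )
    return min_px + scale * (max_px - min_px)


def approx_width(word, size_px, char_ratio=0.6):
    """Estimate rendered width; char_ratio is a rough guess, not a font metric."""
    return len(word) * size_px * char_ratio


if __name__ == "__main__":
    # Two words used equally often get the same font size...
    size = font_size(10, min_count=1, max_count=100)
    for word in ("web", "visualization"):
        # ...but the longer word still takes up far more of the cloud.
        print(word, round(size), round(approx_width(word, size)))
```

Since both words get the same font size, the difference in on-screen footprint comes entirely from spelling, which is exactly the bias described above. Ascenders and descenders would need real font metrics to model, so they are left out here.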

The academic community has noted shortcomings of the technique, as well. Hearst and Rosner (2008) observe that the alphabetical layout of the cloud may lead to a sort of “false clustering” effect, as users misinterpret words because of surrounding tags.  Renninger and Shumar (2007) found that tag cloud quadrants have different rates of recall, a fact which most tag cloud designs ignore.  In fact, their findings suggest that a simple list of tags, ordered by frequency, may deliver a more accurate overall impression than a tag cloud.  Several researchers have sought to improve shortcomings in tag cloud presentation with packing and sorting algorithms that manage whitespace and cluster relevant concepts (Kaser and Lemire, 2007; Seifert, Kump, Kienreich, Granitzer, and Granitzer, 2008).

Now, this isn’t to say that tag clouds have no value; in fact, I think they have great potential. It’s just that we need to know when tag clouds and word clouds are appropriate, know their shortcomings, and (this is the fun part) try to find ways to make them better. Most of the sources cited above have set about doing just that. In my next post, I’ll discuss a few of these “next-generation tag cloud” concepts; in particular, I’ll be examining methods of using word clouds to compare different versions of a text.