information

Making Web Log Analysis Tools Better

design, information, internet

March 05, 2004, 12:22 AM

We're starting our final project in Mapping and Diagramming, which is self-defined. I've decided to focus on designing visualizations of web log data (not to be confused with weblogs, although webloggers are my primary user group), since I've always been of the opinion that the visualizations generated by most current tools tend to suck. So far I've come up with a few sketches and a description of the project.

Some things I want to explore include:

And possibly others as well. I'd like to check out more existing web analysis tools as well as look over some of the relevant research that's been done on the subject, so if anyone has some good pointers, let me know.

Oh, and Dan is doing something wacky with linking together the music people buy, or something.

IBM's Social Computing

information, society & sociology

February 29, 2004, 11:26 AM

Wendy Kellogg, from the IBM T.J. Watson Research Center, gave a talk at the HCII seminar series last Wednesday on research directions in social computing. It was one of those "look at all the cool stuff we did" talks, but there were some fairly interesting ideas underlying the mishmash of technologies.

IBM research have moved from developing funky visualizations of social phenomena to emphasizing technologies that universally represent users in the computing environment. They call them "people proxies", but they are essentially a form of digital identity. For instance, one of these technologies, Grapevine, is an intelligent electronic business card that allows recipients to contact you via multiple mediums (phone, email, IM, etc.) without disclosing your actual phone number, email address, etc. Another, Rendezvous, aims to make conference calling more transparent by making it easier to bring in more people without hassling with special phone numbers and the like. IBM are also interested in personal middleware, the idea that individuals should be able to create and manage personal web services which rove inter- and intranets to locate information and perform other tasks for them, and that these services (dare I say "agents"?) should be sharable (although the really hard question, how everyday users are supposed to create these services, was completely skirted by Wendy). The theory behind all these approaches is that the vast majority of a company's information assets exists in employees' heads, whereas only 4% exists in enterprise database systems. Yet currently, 80% of a company's IT budget is spent managing that 4%. These technologies aim to facilitate sharing the remaining information.

I ran into my friend Cristen Torrey in the hall last Friday and we had a short discussion about the talk. She was concerned about privacy issues, which always rear their heads when the subject of consolidated online identities comes up. IBM assumes that making certain information transparent will improve productivity and enhance communication, but it could also increase the power of those on top of the management (or government) chain, encourage micromanagement, strip us of the right to choose which details of our lives are public and which are not, and bring a host of other unintended consequences. Where are the guarantees that we will have control over our digital selves? Where are the researchers who are bringing these issues to the table? I have yet to see them.

Information Design In Practice: Agnew Moyer Smith

design, information

February 25, 2004, 11:57 PM

Karen, our Mapping and Diagramming teacher, took us on a field trip yesterday to Agnew Moyer Smith, an information design consultancy in Pittsburgh's Southside. Don Moyer, Karen's husband and a partner in the firm, gave us a tour of the facilities and talked to us about a few of his current projects as examples of how he practices information design. Agnew Moyer Smith specializes in making complex technical topics clear and accessible to non-experts, something I'm particularly interested in.

Don spent the most time on a project he'd been working on to develop an information design piece that clearly explains how RFID tags can be used to improve supply chain management. He started out by reviewing an existing piece on supply chain management that AMS had produced for a previous client. This enabled him to set the stage for his own research by working from a similar document whose quality he trusted. He also began to look for sources of information on the domain, investigating a technical paper on RFID, some articles about it in the popular press, and his client's own initial attempts at explaining the process. Don warned against getting caught up in too much research; he could probably have found hundreds of articles on the subject, and although he might have learned a little bit of important information from every one of them, he would have spent days and days going through them all. It's best, he said, to find the book on the topic written for 8-year-olds; that way someone else has already done all the hard work of pulling all the information together and making it understandable. If such a book doesn't exist, however, then you'll have to make do with the clearest and most comprehensive sources you can find.

Next, he started collecting together a list of all the nouns (things, actors, etc.) that existed in his problem domain. Then he started listing all the verbs that each of these actors could perform. Interestingly enough, this is the same approach software engineers take when modeling a problem domain for developing an object-oriented software design.
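To make the parallel concrete, here is a minimal sketch in Python of how a noun-and-verb list might turn into an object model. The nouns (`Tag`, `Pallet`, `Reader`) and the single verb (`scan`) are my own hypothetical stand-ins for an RFID supply chain; Don's actual lists aren't reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Noun: an RFID tag, identified by a hypothetical tag_id."""
    tag_id: str


@dataclass
class Pallet:
    """Noun: a pallet of goods carrying one tag."""
    tag: Tag
    location: str = "factory"


@dataclass
class Reader:
    """Noun: a tag reader installed at some location."""
    location: str
    seen: list = field(default_factory=list)

    def scan(self, pallet: Pallet) -> str:
        """Verb: read the pallet's tag and record where it was seen."""
        pallet.location = self.location
        self.seen.append(pallet.tag.tag_id)
        return pallet.tag.tag_id


# Usage: a pallet passes a reader at the (made-up) warehouse dock.
dock = Reader(location="warehouse dock")
pallet = Pallet(tag=Tag("tag-001"))
dock.scan(pallet)
print(pallet.location)
```

Each noun becomes a class and each verb becomes a method on the actor that performs it, which is roughly the mapping an object-oriented designer would start from.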

Don prepared a brief of the project for his client to ensure that they were both on the same page as far as the project's requirements went (in theory a brief like this is prepared by the client, but Don says that has never happened in all the years he's been working; the brief, if one exists, is prepared by the designer, just as requirements are developed by software engineers and not handed to them by clients). This brief contained:

  1. A short objective for the project.
  2. A description of the intended audience of the communication piece.
  3. The information that the client wants to know before completing the communication piece.
  4. The information the client already knows (or thinks he knows) about the subject.
  5. A list of the main messages or "story elements" that the final communication piece must clearly portray.

After getting the client's approval on the brief, Don started developing sketches of the process as well as a notation language for the nouns and verbs he had identified earlier. He then started assembling the spread itself. Don prefers to develop information spreads before powerpoint presentations or other types of information design; he finds that the page works well to force constraints on the design project that ensure clarity and effectiveness of results that might be lacking in a more permissive medium like a presentation or video. This is very much in line with my own rants on the value of constraints.

Once Don reached a solution he was happy with, he moved the project along. Although he would have liked to explore more solution possibilities, he recognizes that making a decision in the development phase early enough to leave room for sufficient refinement is important.

Next, Don wrote the prose for the piece in the form of numbered steps; he believes that it's important for designers to be able to write their own clear, concise copy. He then merged the text with his sketches, and called in a colleague to conduct a critique of his work. His colleague came into the crit cold and played at being a user, pointing out areas where he didn't understand what was being communicated. Don remarked that it's common practice in AMS to post designs up in the kitchen or on boards around the space to get comments from the other employees as they wander around the floor.

Finally Don handed off his project to an illustrator to come up with some final iterations. Don described the finished project as "not just a picture, but a story". Thus the need for spending eight weeks on it.

Afterwards, Don showed us around the very impressive AMS space, which completely blows anything at CMU out of the water. The space was very much tailored to the work; projectors were configured to be easily connected to any laptop, bookshelves with design-related books abounded, small workspaces could be reserved by teams so they'd have a permanent location to store work relating to their project. Interestingly, there were no personal private spaces at AMS; everyone had an open desk. Even more interestingly, Don himself had the same setup; his desk didn't really look much different from that of any other employee. No clearly visible hierarchy in this organization. There were also many common areas to encourage conversation and collaboration, although the atmosphere also created enough quiet in the working areas for employees to get things accomplished without interruption. There were closed off conference rooms as well as small booths for private meetings and telephone calls.

Finally, Karen showed us some small booklets that Don produced apparently just for fun about a variety of topics that he wanted to know more about. Don's motives were mixed, however, since he also sent copies of these booklets around to important contacts in the hope that they would bring more work. In many respects, this reminded me of the motivations of open source software contributors.

It was a great experience to see a practicing designer at work. I wish we got more field trips in this graduate school thing.

Dan went along as well; his commentary is also available.

A Map of my Daily Walk

design, information

January 28, 2004, 10:58 PM

For my Mapping and Diagramming class I created a map of my walking route from the front door of my house to Margaret Morrison A11, the room where our class is held. Karen asked us to imagine that an old friend (and not just any friend, a particular friend) had come into town to visit and we needed to leave directions at our house so that they could find their way to CMU to have lunch with us. I chose to design my map for my friends Kim and Alaina.

I'm fairly pleased with the results. The form of my walking route in this schematic line representation is interesting and the elements of the piece fit together rather nicely. I tend to favor a minimalist approach to visual design perhaps because I still have too little confidence in my instincts to add much "decoration". But I think it worked for me with this piece; the simplicity made it easier to focus on attractiveness and understandability.

It struck me that this piece came together fairly easily because the users and their tasks were so focused and narrow. We were designing for one person, and all we had to do was get them from our houses to school. The design solution flowed almost naturally. This leads me to believe that one approach worth trying with more complex problems is to split them up into their component tasks and then design solutions for each task. You can then integrate the "sub-designs" into a larger metaphor that satisfies all tasks, taking into account the relevant tradeoffs and priorities. This has the side benefit of creating easily dividable work for the members of a design team.

Of Maps and Diagrams

design, information

January 24, 2004, 11:45 PM

I'm taking a course called "Mapping and Diagramming" this semester from the Design department, taught by Karen Moyer, whom some may remember from my postings on CDF. This time, however, I'm learning about the design of (you guessed it) maps and diagrams, which Karen describes as essentially that subset of information design that deals with high-density information artifacts. I've always found maps and diagrams fascinating in and of themselves, quite apart from any information they convey, so this is a neat course to have the opportunity to take.

So far we've talked a lot about analyzing maps as information objects, and the field is more complex than you may think. Many think of maps as merely representing factual information. The design of a map doesn't seem quite so hard since the information already exists in the world; all the designer must do is record it accurately and in a reasonably aesthetic representation. But nothing could be further from the truth. All maps are selective in the information they represent, whether that information is geographical, political, or something else. Many maps outright lie with respect to certain kinds of information to more accurately or prominently display others (Karen showed us a geodesic dome of the world where all the continents were accurately sized, but the oceans were completely off; the purpose of the map was to allow comparison of the continent sizes while retaining the spherical form of the earth). Some maps may even make a political argument through the information they choose to analyze; Charles Minard's famous map of Napoleon's Russian campaign is an example.

The definition of a "map" we're using is also broader than one may think; kayaking eskimos use a piece of wood carved in the shape of the coastline to navigate by. A series of photographs of the intersections of a highway is another example.

The essential goal of a well-designed map is (usually) to present a large amount of information that is intuitively understandable to its audience at a glance. Richard Saul Wurman's book Understanding is a nice collection of year 2000 statistics on the USA presented with this design goal in mind. Information design of this quality should be regularly produced by the government in a healthy democracy rather than relying on the good will of a cabal of famous designers (Wurman and his compatriots lost money on the project).

A final note: Karen mentioned a group of urban planners who had members of the community they were seeking to redesign draw maps of the areas in which they lived to find out what places were most important, dangerous, etc. to the inhabitants. An interesting user-research technique.

Commentary

Posted by Andyed on January 25, 2004 at 09:49 AM

There's some seminal work from a couple of other CMU'ers, Larkin & Simon, that provides a compelling cognitive psych account of how diagrams can serve as a very rich form of external memory, extending the reach of computation:

http://citeseer.nj.nec.com/context/24189/0

The Case of Powerpoint

design, information

January 06, 2004, 12:02 AM

Kevin has a nice summary of several of the Powerpoint-is-bad memes that are out there on the internet. Going over these again got me thinking about how one cause of this problem might not be any particular design flaw of Powerpoint's, but rather an extension of the tool into areas it was never meant to enter. Powerpoint is reasonably good at putting together visual aids for lectures and talks, and as long as it's kept to that purpose it tends to shine (Neema, who is an ardent defender of Powerpoint, proved this with his excellent presentation, "A Spiritual Journey", on his 9-month adventure across Europe/Asia/Africa). The problem arises when people use Powerpoint as their sole venue for information presentation. As Tufte points out, the slideshow format is inappropriate for many communication purposes.

However, there are many who use Powerpoint not only as a means of presenting visuals for a talk, but also as a way of publishing take-away materials for the audience and even for storing databases of information so that it can be quickly pulled together into a presentation format. On the one hand, this is a perfectly reasonable strategy for busy people who don't have the time to maintain their information in a more appropriate tool, then convert all the relevant bits into Powerpoint slides. On the other, this leads to all the problems Tufte rails against for the reasons he defends far more eloquently than I can hope to do here.

Perhaps the answer lies in developing tools that support more robust single-sourcing techniques, both for authoring and publication. Such tools should not only make it easy to store content generically and squish it into a variety of presentation templates, but should afford using the appropriate templates for each information presentation purpose. But who will provide us with a vision of what such a tool might look like? It'd be an interesting challenge to take on.

Metacrap and Categories Revisited

information

October 04, 2003, 03:12 PM

I came across (Micah -> The Midnight Blog -> The Well) Cory Doctorow's essay on Metadata, entitled "Metacrap", where he outlines the reasons why the sort of universal metadata-enhanced world that information storage and retrieval experts dream of (he calls it meta-utopia) is practically impossible. Of special interest is his section on why schemas aren't natural, where he basically makes the same point I did in my post on why categories don't work so well (in brief: they are too hard to get right). He makes some interesting, true, and slightly scary points, since XML as a mechanism for universal information storage basically relies on bringing about the meta-utopia he pans.

Commentary

Posted by Jordan on October 04, 2003 at 04:21 PM

Metacrap is a great essay -- makes one think about leveraging the power of groups to get to accurate metadata -- I don't know exactly how to do this -- maybe we can learn something from wikipedia, etc.

Posted by Rob on October 05, 2003 at 10:12 AM

You might be interested in some of the things we're studying in CSCW. Unfortunately Bob axed all the readings on Recommender Systems in the interests of time, but many of the other things we're learning might also apply to ideas like this one.

Dana and I are planning to try to distill as much of the readings as we can into digestible design principles. I'll post more about that when there is more to post about.

Newsable: Ready to Rock!

information, internet, software development, usability

October 04, 2003, 01:14 PM

It is, after all, Rocktober.

I've just unveiled Newsable 0.8 beta to a breathlessly-awaiting public (this public, in case you were curious, consists of Kerry and Jordan), thus marking the first release of my web-based news aggregator / modest experiment in merging open source and usability. The low-down on the current status follows.

First off, go ahead and create an account to take Newsable for a test drive. No, I really mean it. Go do that now.

Ok, now that you're back, let me confess that a few things still need work in the current release:

  1. The interface is ugly as all hell. Jordan is kindly lending me some of his awesome graphic design skills to help fix that. Additional help would of course be greatly appreciated.
  2. There are still known bugs. Namely, the harvester isn't entirely behaving itself, and the RSS autodetection algorithm needs a bit of improvement to handle the real web. I'll try to get these cleared up ASAP, but the system should be usable in the meantime.
  3. The help system isn't available yet. If you get confused about something, feel free to email me. This will help me know what needs documentation / usability improvements as well.
  4. Some of the interaction design decisions are tentative. The "Archived Stories" tab exemplifies this best. I don't yet have enough user data to make guided design decisions about these features. The personas still need some refinement.
  5. Some more advanced features aren't yet available. OPML import / export support is one of these. I hope to have these in place before 1.0.

As you can see, there are many opportunities to get involved with improving Newsable even if you aren't interested in writing Perl code (unlike the majority of open source projects). Please email me if you have any ideas for improvements in the aforementioned areas. I may try to set up a mailing list soon to create a central place for discussion.

Happy reading!

Commentary

Posted by Mathilde on October 04, 2003 at 01:21 PM

The release also consists of myself. :-) I volunteer to help Jordan with the visual design. I *need* to fix it if I am going to use it. It's *soooo* ugly right now. :-P

Posted by Rob on October 04, 2003 at 03:28 PM

Huzzah! There is benefit in doing a crappy job; it convinces others they need to give you a hand ;).

If anyone else has any interest in helping with the design or implementation of Newsable, do speak up! It doesn't even have to be anything significant; every little bit helps.

Posted by Mathilde on October 05, 2003 at 02:03 PM

Oh no! You started saying "Huzzah!" too! Curt has corrupted you. ;-)

Posted by jeff on October 05, 2003 at 07:44 PM

Congrats on the launch Rob.

Now for some constructive criticism. One long page of feed seems very hard to scan. I especially have trouble seeing the breaks between the sites.

When I order by newest, I see the newest post of the newest feed, then scroll through the entirety of the older posts of that feed before hitting the newest post of the second-newest feed. Repeat.

Since every ordering scheme except alphabetical seems to aggregate by feed, seeing breaks between feeds could significantly assist scanning.

A potential solution would be to remove the feed title from every post, and use it instead as a header that separates one feed from the previous feed. This has the added benefit of saving one line per post. Many lines over the course of the page.

Posted by Rob on October 05, 2003 at 08:16 PM

Oldest-to-newest and Newest-to-oldest actually interleave feeds, ordering entirely by Posted Time. What's probably throwing you off is that some RSS feeds don't contain Posted Time information, so Newsable has to substitute the time that it harvested the feed for the Posted Time (RSS 0.91 feeds are the offenders, in case you are curious). This means that when you first add a feed, all the posts in it are assigned the same Posted Time (the time you added it) which has the effect of aggregating them by site initially (this will change as Newsable harvests new content, though). Of course, there might also be a few bugs in Newsable at the moment that are exacerbating this problem. I'm looking into it.
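For the curious, here is a minimal Python sketch of that fallback. It is not Newsable's actual Perl code; the `Story` class, its field names, and the sample feeds are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Story:
    feed: str
    title: str
    posted: Optional[datetime]  # None when the feed carries no Posted Time
    harvested: datetime         # when the aggregator fetched the story


def effective_time(story: Story) -> datetime:
    """Use the Posted Time if the feed supplied one, else the harvest time."""
    return story.posted if story.posted is not None else story.harvested


def newest_first(stories):
    """Interleave all feeds, ordering purely by effective time."""
    return sorted(stories, key=effective_time, reverse=True)


# Feed A supplies dates; feed B does not, so both of its stories
# inherit the same harvest time and end up next to each other.
harvest_b = datetime(2003, 10, 5, 12, 0)
stories = [
    Story("A", "a1", datetime(2003, 10, 5, 9, 0), datetime(2003, 10, 5, 20, 0)),
    Story("B", "b1", None, harvest_b),
    Story("A", "a2", datetime(2003, 10, 5, 13, 0), datetime(2003, 10, 5, 20, 0)),
    Story("B", "b2", None, harvest_b),
]
titles = [s.title for s in newest_first(stories)]
print(titles)
```

The two dated stories from feed A land on either side of feed B's stories, while B's undated stories clump together, which is the by-site clustering described above.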

Your point about site-separation is well-taken for the by-site ordering. I have an initial design idea (alternate the background gray-and-white by site, instead of by story) that should go into 0.8.1 or .2, but I like your "pull out the feed title" suggestion as well (although that'll be a bit harder to program).

Thanks for the great comments!

Posted by Jeff on October 07, 2003 at 04:27 PM

The more I understand how Newsable scrapes sites for feeds, the more impressed I become. It seems very much in the spirit of Mark Pilgrim's Ultra-liberal RSS Locator. Nice work.

Posted by Rob on October 07, 2003 at 07:56 PM

Thanks, Jeff. It turned out to be a much harder programming problem than I originally thought it'd be, given the mess that is the current state of real RSS feeds out on the real web. Conflicting standards and incorrect implementations galore.

I just came across Mark's RSS locator as well. He also has an ultra-liberal feed parser available; both of these projects are open source. Sadly, Newsable is written in Perl and thus can't take advantage of these excellent resources. Had I known Mark had done all this work for me three months ago I would have written the damn thing in Python. I hear it's a cleaner language anyway.

The good news, though, is that I found a Perl port of Mark's ultra-liberal RSS locator yesterday and plan to integrate it into Newsable in the next release. This should greatly improve the "Add to Newsable" functionality.
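As a rough illustration of the easy part of feed detection, here is a minimal Python sketch that looks for `<link rel="alternate">` tags in a page's HTML. It is neither Newsable's code nor Mark's locator, which handles far messier pages; the function and class names, the list of accepted MIME types, and the example URL are my own assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds in this sketch (an assumption, not a standard list).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects feed URLs advertised via <link rel="alternate"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None for bare attributes; normalize to "".
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        is_feed = a.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and a.get("href"):
            # Resolve relative hrefs against the page's own URL.
            self.feeds.append(urljoin(self.base_url, a["href"]))


def find_feeds(html, base_url):
    """Return the feed URLs a well-behaved page declares in its markup."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


page = """<html><head>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="/index.rdf">
</head><body>Hello</body></html>"""
found = find_feeds(page, "http://example.com/blog/")
print(found)
```

This only covers pages that declare their feeds properly. The hard part on the real web is everything else: pages with no such tag, wrong MIME types, or broken markup.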

Posted by Kevin Fox on October 17, 2003 at 12:50 AM

Oooh... OPML support... I can't wait! (and I'm *not* being sarcastic!)

On the Inadequacy of Categories

design, information

August 24, 2003, 10:11 PM

As I just mentioned yesterday, a common technique for organizing information is to put this information into categories, thus making it easier for users to locate specific information as well as search for types of information based on predefined properties. Generally speaking, this works pretty well; libraries and other information-rich environments have always lived by their categorization schemes.

But then someone got the bright idea to allow ordinary users to define their own categories for their own information. This, I'd argue, doesn't work so well.

A case in point is the very weblog you're currently reading. Shortly after starting roBlog, I went through a fairly rigorous process to develop categories for my current and future posts. I was hoping these categories would be fairly exhaustive (most every future post would fall into one or more of them), fairly meaningful to my readers, fairly few (I didn't want a huge list of categories that was hard to scan), and fairly balanced (I didn't want all my posts clustered into one category).

I'm becoming more and more dissatisfied with the results. I'm finding that most of the posts are clustering into a few categories and even I can't always tell what category a particular post should fall under, so I can't imagine it's intuitive for you guys and gals. One option is to simply create more specific categories, and a wider variety of them, but I'm concerned that this will soon cause the number of categories to balloon into something unmanageable.

In "Inmates", Cooper complains about file systems (which are generally hierarchical, but hierarchies are essentially nested categories); he feels it is unrealistic to expect users to effectively organize all their personal information through this paradigm. And I can certainly relate to this; I know many people, all of whom are highly intelligent and very computer-savvy, who constantly lose track of where they put things in the file system. But even more specialized applications have this problem; I've given up on trying to categorize my MP3s by genre; I'm never able to remember what I filed a particular song under and I rarely want to hear all songs in a given genre.

The basic problem, I believe, is that user-built categorization schemes expect users to know what kinds of questions they're going to want to ask when searching for their information before they actually need to ask them. Most users don't have sufficient experience in the domain to know their own behavior patterns. And most of those users that do won't have reflected sufficiently on their own practices to design effective categorization schemes anyway.

The upshot is that designing good categorization schemes often requires extensive training. Library Science is an entire field that focuses mainly on solving this very problem. For those of us that lack such training, it's difficult to know how to categorize the information we work with for maximum productivity.

For my own categorization problem with roBlog, I'm considering taking a suggestion Micah made; he thought I should group posts by certain interesting topics (such as some of my professors' names) since this would gather information on those topics in one location for Google searchers and the like. I'm thinking that these topics wouldn't be exhaustive; they'd be more like the "features" sections that Micah has on his own weblog.

Commentary

Posted by Dan on August 25, 2003 at 07:28 AM

Micah made the same suggestion to me. I've considered making a MT category for each professor, but, over the course of a semester, since I'm trying to write about each class, the prof would have dozens of entries. Plus, I worry about stealing a prof's mojo in my blog. :)

As far as your categorizations, you took a "top-down" approach to creating them: trying to create broad categories to contain everything. That approach usually means some things will be odd fits. The other approach, "bottom-up," is the opposite: the content's characteristics create the categories or, in other words, the information architecture. As you pointed out, this can be too specific. This is why IAs get paid money.

An IA would probably do a card sorting exercise with your users, having them arrange your entries (one per card) into piles, then having them name the piles. Do that with several groups of users and you have some semblance of a navigation scheme. You've also spent several days and many thousands of dollars. :)

Have you looked at your logs to see how much your categories are even used? In six months of reading, I've never used them, but perhaps I am an atypical user. Search seems to fit the content of your blog much better.

Dan

Posted by Rob on August 26, 2003 at 11:15 PM

Yeah, I made a conscious decision to take a top-down approach to developing the categories; I didn't want to make them up as needed for each post since I was afraid of winding up with the "category soup" that lots of weblogs suffer from. I wanted the list to be short, sweet, and manageable. Plus, coming up with categories on the spot tends to produce lots of categories at wildly different levels of generality.

I agree with your assessment of the "correct" approach to take in this situation, but as you implied, I can't spend several thousand dollars on this weblog. I think this lends credence to my claim in the post that developing effective categories requires training and most ordinary users (even fairly design-savvy ones like me) can't do it effectively.

I have checked my logs, actually, and surprisingly categories are accessed fairly frequently. Much less so than the main page, of course, and less so than some of the more popular individual entries, but still a respectable amount and much more than the search URL. Which surprises me too; I certainly search my weblog much more than I browse categories. But then I'm as far from a "typical" user as you can get (or maybe not; I have a sneaking suspicion that I use this weblog much more than anyone else does ;)

Info Design's LATCH

design, information

August 23, 2003, 07:22 PM

This one's gonna be a quickie, but I wanted to throw it up while I was thinking about it.

During the last week of CDF on information design, our instructor, Bob Swineheart, remarked that there were five ways of organizing information items. The mnemonic is "LATCH":

  1. Location
  2. Alphabetical
  3. Time
  4. Category
  5. Hierarchy

I've been thinking it over, and so far I haven't come up with an organization scheme that doesn't in some way fit into one of the above. Thoughts?

Ok, now I'm heading over to Walnut Street to drink and party with the new MHCI students. Less bloggin', more boozin'! :)

Commentary

Posted by Jeff on August 23, 2003 at 08:01 PM

Richard Saul Wurman covers LATCH in Information Anxiety 2. He describes it as a finite list of organizational possibilities. I think this is largely because Category is such a catch-all scheme.

The choice of how to organize information isn't always obvious, since more than one scheme can apply, either exclusively or in tandem. Case in point is the Vietnam Memorial Wall, first approached as an alphabetical listing, and then refined as chronological, or the yellow pages, both categorical and alphabetical.

Posted by Rob on August 24, 2003 at 05:44 PM

Thanks for the reference; I knew Bob said he'd heard of LATCH from somewhere else, but I'd forgotten the name of the guy. Here's the book on Amazon.com, for posterity.

I thought about this a little more, though, and realized this list doesn't cover a "network" as a means of organizing information. Granted, networks are rarely useful as organizational schemes because they tend to get confusing quickly, but I'd imagine they are appropriate (or perhaps unavoidable) for at least some types of information organization.

Posted by Jeff on August 24, 2003 at 06:36 PM

Isn't the concept of "network" an offshoot of the location principle? Maybe I'm not understanding exactly how you mean. In any event, network seems too much like a "thing", sort of like saying that a bookcase is an organizational principle.

Posted by Vishi on August 25, 2003 at 12:20 AM

Hierarchies are basically tree structures. Categories make hierarchies. Location, alphabet and time are types of categories. There may be lots of other types of categories, like author, application used, etc., which are the metadata of the object being described.

Now, a network or a semi-lattice is made up of hierarchies or trees. More info on this is here (http://www.rudi.net/bookshelf/classics/city/alexander/alexander1.shtml).

Posted by Rob on August 25, 2003 at 01:09 AM

By a network, I mean an organizational structure where each information item is organized by its association with some subset of the other information items in the structure. Maybe "graph" would be a better term. Mathematicians define hierarchies as special cases of graphs (hierarchies usually have a single root and every child element has only one parent, whereas graphs have no root and every element can be associated with an arbitrary number of other elements). Friendster's mechanism for organizing your personal network would be an example (although maybe not the best one).

I suppose this could be considered an offshoot of "location". I was thinking that part of the definition of "location" was that each information item had some absolute position in space, independent of the positions of other information items, whereas in a graph information items are ordered purely with respect to one another.

Vishi: It's true that in some sense location, lexical order, and time are all categories; philosophers would say that all these things are properties of the information item. But to an information designer, thinking at this level of abstraction is rarely useful since good design is very much about staying concrete. Most users don't see lexical ordering and chronological ordering as the same thing. Remember, we're trying not to be astronauts here. :)

Posted by Vishi on August 25, 2003 at 12:34 PM

The lesson of the architecture astronauts is that abstract patterns make it easier for designers and programmers to work, but they should not lose focus on the user. Task-based interfaces help in working with abstract patterns while keeping the user in focus. So, what is important to an information designer is which categories are important to people while doing a task. The importance of location, alphabetical order, time, or something else depends on the task the user is doing and cannot be generalized.

Posted by Jeff on August 26, 2003 at 10:10 AM

We talked about LATCH in studio yesterday. Cheryl brought up the possibility that a sixth organizational scheme was "Random." This caused much discussion, including the possibility that the acronym could be LARTCH. I prefer L'CHART.

Posted by Rob on August 26, 2003 at 12:26 PM

"Random"... That's a good one. Almost always a bad organizational scheme, but unfortunately also often the easiest to program. Until last weekend, my "Friends" sidebar was organized using that scheme, as you may have noticed. Then I learned more Perl...

Posted by Rob on August 27, 2003 at 11:39 AM

Just wanted to point out that Micah has more on this, including some interesting analysis of LATCH by Ryan.

Posted by Jeff on August 31, 2003 at 12:18 AM

In the first edition of Information Anxiety, Wurman hadn't yet come up with the clever acronym and was referring to "hierarchy" as "continuum". There's a subtle difference, but maybe not enough to justify a sixth method.

Posted by Rob on January 21, 2004 at 09:10 AM

As a quick follow-up: Jeff had it right when he noted that "hierarchy" was originally called "continuum", which is really a more accurate name. "Hierarchy" implies some sort of tree structure to many people, which is really just categories of categories. But the hierarchy in LATCH refers to a continuum organization where each information item is organized in relation to the other items based on some property; e.g. widget A is bigger than widget B, which is bigger than widget C, etc.

On the Properties of Links

information, internet

June 22, 2003, 11:40 PM

I've recently become peripherally aware of a discussion that's going on among the widely-read webloggers on what makes a weblog a weblog (as opposed to a normal website). This reminded me of a discussion I had with Micah after his CHI Weblogs BOF. At the session, Micah gave a quick definition of a weblog entry, which he described as necessarily having four components: a main link, some commentary on the link, a "posted at" date by which the entries are ordered, and a permalink. At the time, I took issue with the "main link" idea; I felt that this assumed a certain type of weblog, namely the "MLP-style weblog" that tends to be centered around spreading links to interesting content. Kevin, who was also participating in the discussion, remarked that he frequently posts entries that are spawned by his own thoughts and not directly from something he read on the web, a sentiment that I echoed (to be fair, Micah emphasized that he wasn't trying to present a rigorous definition so much as he was trying to give an idea of what a weblog is).

Since then, however, I've been thinking a lot about links and how they are used in weblogs. There's been a call recently for more expressive links on the web. Right now, all an <a> tag can really say is "this text refers to this other document". As a human, you may be able to infer the author's purpose in linking that particular text to that particular document, but a machine cannot do so, limiting the analysis capabilities of web robots. But what if you could, through a more powerful <a> tag, say things like "this text is rejecting the argument presented in this other document" or "this text refers to the person who wrote this other document"? This is in line with the "semantic web" concept: Berners-Lee's vision of a more meaning-driven web that internalizes the separation of content and presentation. I find this idea quite intriguing; I have a strong intuition that there are lots of interesting things we could do with the web if links could express why they were made. Merely knowing whether the author agrees or disagrees with the documents he links to could open up whole new dimensions for social network analysis tools, for instance.
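To make the "more expressive link" idea a little more concrete, here is a rough sketch in Python of what a web robot could do if links carried a stated purpose. It leans on the rel attribute that HTML already allows on <a> tags, but the relation names used here ("disagrees", "author", and the default "refers-to") are invented for illustration, as are the example.com URLs; nothing here is an existing standard for link semantics.

```python
from html.parser import HTMLParser


class TypedLinkParser(HTMLParser):
    """Collects (href, relation) pairs from the <a> tags in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # A link with no rel attribute is today's plain "refers to" link.
        # "refers-to" is a made-up default name, not a standard value.
        relation = attributes.get("rel") or "refers-to"
        self.links.append((href, relation))


def extract_typed_links(html):
    """Return every link in the HTML along with its stated purpose."""
    parser = TypedLinkParser()
    parser.feed(html)
    return parser.links


def links_by_relation(links):
    """Group link targets by relation, e.g. everything the author disagrees with."""
    grouped = {}
    for href, relation in links:
        grouped.setdefault(relation, []).append(href)
    return grouped


# Hypothetical weblog entry using the invented relation names.
SAMPLE_ENTRY = (
    '<p>I think <a rel="disagrees" href="http://example.com/essay">this essay</a> '
    'is wrong, although <a rel="author" href="http://example.com/about">its author</a> '
    'is usually right. See also <a href="http://example.com/other">this</a>.</p>'
)

if __name__ == "__main__":
    for relation, targets in links_by_relation(extract_typed_links(SAMPLE_ENTRY)).items():
        print(relation, targets)
```

With something like this in place, a social network analysis tool could count agreements and disagreements between weblogs instead of treating every link as an anonymous reference.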

This led me to reflect on my linking habits for these weblog entries I write every now and again. I realized that I have several types of links:

This is not meant to be an exhaustive list of the purposes of links; I can think of many more. But I think this post demonstrates that even within the limits of roBlog, the diversity of purpose that lies behind those little blue words is astounding.

Waypath: A "What's Related" for Weblogs

information, internet

June 13, 2003, 07:50 PM

I ran across a service called "Waypath" today that basically implements the "Related Entries" functionality I discussed last April. You can give Waypath a URL to a weblog entry or a few keywords and it will return a list of related weblog entries. They even appear to have plugins available for Movable Type and Radio that allow you to query their server for posts related to yours and embed links to those posts on your site. Very cool.

It doesn't look like they are implementing it in quite the way I conceived of such a system; they are polling the "recently updated weblogs" lists such as weblogs.com and blo.gs instead of having each weblog ping them when they update, which makes more sense than the pinging model given that they aren't very popular yet. If they take off, however, a pinging model will probably prove much more efficient, as it did for weblogs.com. This is actually a nice feature of the recently-updated lists; if you're trying out a new metablogging tool you can use them as a source of active weblogs until you get off the ground and build your own user community.
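The difference between the two models is easy to see in a toy version. The Python sketch below is only an illustration of the general idea; the class and method names are hypothetical and have nothing to do with how Waypath, weblogs.com, or blo.gs actually work. Timestamps are plain integers to keep things simple.

```python
class RelatedEntriesIndex:
    """Toy index of weblogs, fed either by polling or by pings."""

    def __init__(self):
        self.last_indexed = {}  # weblog URL -> timestamp of the newest update indexed
        self.scanned = 0        # list rows examined while polling
        self.fetch_count = 0    # times a weblog was actually re-read

    def _index(self, url, updated_at):
        # Stand-in for fetching the weblog and indexing its new entries.
        self.fetch_count += 1
        self.last_indexed[url] = updated_at

    def poll(self, recently_updated):
        """Polling model: walk an entire recently-updated list looking for news."""
        for url, updated_at in recently_updated:
            self.scanned += 1
            if updated_at > self.last_indexed.get(url, 0):
                self._index(url, updated_at)

    def receive_ping(self, url, updated_at):
        """Pinging model: the weblog tells the service directly that it changed."""
        self._index(url, updated_at)


if __name__ == "__main__":
    index = RelatedEntriesIndex()
    updates = [("http://a.example/", 1), ("http://b.example/", 2)]
    index.poll(updates)   # both weblogs are new, so both get indexed
    index.poll(updates)   # nothing changed, but the whole list is scanned again
    index.receive_ping("http://a.example/", 5)  # no scanning needed at all
    print(index.scanned, index.fetch_count)
```

Polling costs a scan of the whole list every time, whether or not anything of interest changed, which is fine while the lists are the only source of active weblogs. Once weblogs know to notify the service themselves, each update costs exactly one unit of work.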

I tried out a couple of searches using Waypath; some of them were pretty accurate and relevant, others were pretty mysterious. But this is my impression of most automated "what's related" engines I've encountered, including Google's. And as of now Waypath seems to be just a couple of guys hacking in their spare time, so you can't expect a polished product yet.

Best of luck to them, though. They're working on a cool technology.

Commentary

Posted by Steve on June 15, 2003 at 10:32 PM

Glad you like it!

The pinging model hasn't escaped our attention. We have the notion to talk to the CMS publishers about a special post-level ping, of which Waypath could be one of the recipients, but we haven't had the time to pursue it.

(Right now, we're starting to collaborate with other Metablog service sites and I'm hoping that between us, we'll both be able to lobby for this kind of thing from the software vendors and have the bandwidth to implement the listener.)

I took a look at your post from last April. The weblogs.com + related posts feed is on our road map, too. I like your idea of a ping-back with the related posts--building that functionality into the CMS platform instead of relying on plugins.

Lots of great ideas; so little time. We could use a patron. But then, who couldn't? :)

The Users We'll Never See

design, information, society & sociology, usability

June 03, 2003, 05:10 PM

About an hour ago, I felt a need to feed my Chipwich addiction, so I invited Kerry to go to Entropy with me. On the way back, she needed to stop by the Hub to get some registration and class scheduling issues worked out. Since I had nothing imminently pressing to do, I waited for her.

While sitting in the Hub waiting area, surrounded by brochures, magazines, important-looking forms, computers, and the various other trappings of a functioning administrative center, I started to reflect on how much information was out there in the world, and how much of it had to be dealt with. That's the thing about information; it always seems to need processing and organizing and analyzing and basically loads and loads of attention. Information can be rather childish in that sense.

I was reminded of Herb Simon's quote: "In the future, the scarce resource will be human attention". I reflected for a bit about how right he was.

But wait a minute, a little voice in my head remarked. Exactly how right was he? Sure the information-overload problem is a big deal for people at CMU, and probably is for information workers everywhere. But is everyone like us?

My father owns a house in rural West Virginia (Monroe County, for those of you familiar with the area). Life's quite different out there. People aren't so busy, for one thing; they don't appear to be so bothered by all this information. The information is there; there are construction projects and farm vehicles and weather patterns and plant DNA (scads of information lies in the DNA of even a simple organism) and who knows what else. But this information doesn't seem to be so concerned with getting your attention. It couldn't care less about being processed.

My point is not that this "simple" life is ideal or perfect; it isn't. My point is not even that it's better; "better" is a matter of opinion and generally doesn't mean much in any absolute sense. Hell, I happen to like city life; I wouldn't move to the country if I was given a million bucks to do it. Rather, my point is that old Herb was not thinking about these people when he made up that quote. And how could he? He was here at CMU, where everyone is information-overloaded. Of course he saw the future in those terms; he was extrapolating from the present, just as any rational person would.

As human beings, our conceptions of the world and its problems are necessarily a product of where we are situated in society, both physically and status-wise, and who we choose to associate with. Dan Sieworek, Scott Hudson, and many other researchers here at CMU, are taking Herb's quote seriously with projects like Aura and Situationally-Appropriate Interaction to try to solve this information-attention problem of the future. But whose problem is this, really? It's their own problem, ultimately, and the problem of the people they associate with and the people who fund them.

As user-centered designers, we can only design based on our conceptions of the world. We can only perceive and design for the slice of reality we find ourselves in, and for most of us, this slice is quite limited.

Around this point, I was interrupted from my meditations by the secretaries' loud discussion about whether picture files are supposed to end in ".jpg" or ".jpeg". They spent a good five minutes trying to figure out the correct answer (of course, both extensions will work fine for most modern programs). I smiled. File extensions are one of those known usability problems that no one can seem to get rid of. Usability advocates routinely complain about how design decisions such as this one are made by programmers who don't understand the mentality of the users they are designing for. But in the large scale, in the "redesign society" sense that researchers and the upper echelons of industry and government confront daily, does anyone really know who they are designing for? Does even the most skilled user-centered designer have any meaningful grasp on who the users are when the user base includes the population of entire countries (or the entire planet)? In this scenario, even we user-centered designers may be no better than the archetypal isolated programmer in a cubicle we so frequently revile.

Sometimes I wonder if at least a subset of the designers of the world need to step out of their bounds sometimes. Some of us should go out into the world to observe the people whose lifestyles and values clash with the ones we are familiar with. Perhaps this would help us get a larger view of the world we are so eager to change, essential if we are to do more good than harm.

This idea has some background in design research; last semester we read a paper from DIS 2000 that advocated designing for "extreme characters" as an exercise to help break out of the mould of the existing interface design frameworks. But this is just a thought experiment; the extreme characters described in the paper are (by admission) caricatures of real people. A designer can't truly know what the elements of society outside his spheres of experience are like without seeing them for himself.

Now if we really were able to see the people of the world, the "end-users" of our world-changing visions, as who they are, and not just who we assume them to be, well, how mind-blowing would that be?

Threads of Thinking

design, information, internet

May 24, 2003, 01:21 PM

One beef I've had with the generally excellent Movable Type engine that I use to maintain this here weblog is that it doesn't have any convenient way to connect posts that are continuing the same basic topic thread. You can categorize posts to organize them by topic, but most posts, although nominally on the same topic, discuss completely different things. What I need is a way to get an overall picture of an entire rolling thought process that I've posted about. This need also arose in the Weblogs Informal SIG at CHI last month.

Right now, I do this by posting links to related previous posts in the text of the weblog entry (like I just did in the previous sentence). This works reasonably well, but it would be nice to have a convenient way of seeing the entire topic thread in one glance. So in a future iteration of this weblog, I'm thinking of adding a sidebar to the individual entry template like the following:

[Image: LookingForwardBehind.jpg, a mockup of the sidebar with "Looking Forward" and "Looking Behind" lists of posts]

The "Looking Forward" posts are future posts that reference this post. The "Looking Behind" posts are just the collection of links to previous posts that appear in this post. I'm hoping this will help make it more clear which posts are connected to this one and afford reading entire threads of thought at once.
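Computing the two lists is mostly a matter of inverting the links between posts. Here is a minimal sketch in Python, assuming the links have already been extracted into a dictionary that maps each post to the earlier posts it links to. The post identifiers are made up, and none of this uses Movable Type's actual template tags or plugin API.

```python
def looking_behind(post_id, links):
    """Earlier posts that this post links to."""
    return list(links.get(post_id, []))


def looking_forward(post_id, links):
    """Later posts that link back to this post."""
    return [source for source, targets in links.items() if post_id in targets]


def thread(post_id, links):
    """Every post connected to this one through any chain of links, either way."""
    seen = {post_id}
    queue = [post_id]
    while queue:
        current = queue.pop(0)
        neighbors = looking_behind(current, links) + looking_forward(current, links)
        for neighbor in neighbors:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical post identifiers, mapped to the earlier posts each one links to.
LINKS = {
    "related-entries": [],
    "chi-weblogs-sig": [],
    "waypath": ["related-entries"],
    "threads-of-thinking": ["related-entries", "chi-weblogs-sig"],
}

if __name__ == "__main__":
    print(looking_forward("related-entries", LINKS))
    print(looking_behind("threads-of-thinking", LINKS))
    print(sorted(thread("waypath", LINKS)))
```

The thread function goes one step beyond the sidebar: it follows links in both directions until it runs out, which is one way to gather an entire rolling thought process rather than just the immediate neighbors.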

Lessig's Modest Crusade

information, politics

May 16, 2003, 06:18 PM

Lawrence Lessig, law professor at Stanford, advocate of free speech and the public domain, fellow Movable Type weblogger, and all-around good guy, has a modest proposal for undoing the major damage of the Sonny Bono Mickey Mouse Protection Copyright Extension Act. He is trying to get a bill proposed to Congress that will require copyright holders to pay a $1 tax on their copyrights after the first 50 years. If the tax is paid, the owner keeps the copyright but must provide a contact point for requests to license the copyright (though the copyright holder need not actually grant any of these requests). The reasoning behind this move is that the vast majority of works are no longer economically valuable after 50 years, and thus they should enter the public domain so that anyone can make use of them without having to license the copyright or worry about litigation. So Disney can keep Mickey, but the rest of us will have access to the countless works that are locked up by copyright even though they are not available on the market and aren't making anyone any money.

Unfortunately, Lessig is encountering resistance even to this seemingly very reasonable law from lobbyists on Capitol Hill. He attributes it to the willingness of content creation companies like Disney to stomp out any form of public domain competition at any cost. Perhaps they are just afraid to let him get his foot in the door. But regardless of the reason, this is a law that could provide a lot of benefit for revitalizing the public domain and can't be even vaguely construed as hurting anyone. It's just a good idea. So write your Congresspersons a quick email asking them to please sponsor the Eric Eldred Act, or at least support it if someone else does. It's the least we can do for a good cause.

The Spread of Misinformation

information, internet

May 13, 2003, 10:05 PM

There was an article on Slashdot yesterday that makes an interesting case study in how easily the truth can get warped as information propagates from person to person (or website to website), and should make us all very cautious of blindly believing what we read, online or off.

The article was titled "Google To Create "Blog" Search; Potentially Remove From Main". The upshot of the writeup was that Google had announced that (1) they were planning on creating a special "blog" search engine and (2) this means they would be removing all weblog posts from their main index.

(1) is not too surprising; after acquiring Blogger this is the logical next step for Google to take. But (2) is a big deal, especially to webloggers like me who are fairly fond of having Google index our writings. From the comments, it appears most people took this point at face value and produced reams of commentary about it (mostly in favor of Google's supposed decision, I might add).

The problem is Google has decided no such thing. They haven't even hinted that they are considering removing weblogs from their main search index. So where did this idea come from?

The story links to an article by Andrew Orlowski, who is known to have some issues with the whole weblog concept. Andrew links to a Reuters story on Yahoo News, which contains the following comments from Google's CEO:

Google allows people to search Web pages, as well as search specific types of content such as news sources, shopping sites through its "Froogle" service, Usenet groups. Soon the company will also offer a service for searching Web logs, known as "blogs," Schmidt said.

That's the only reference the original article makes to weblogs. But over at the Register, Andrew adds:

It isn't clear if weblogs will be removed from the main search results, but precedent suggests they will be. After Google acquired Usenet groups from Deja.com, it developed a unique user interface and a refined search engine, and removed the groups from the main index.

He provides no further evidence for this hunch than what appears above: on the basis of a "precedent" of one very different situation, web-accessible Usenet news, he concludes that Google's mention of a weblog search tool means they will remove weblogs from their main index.

What's amazing is how quickly Slashdot's readers took his words at face value, even though the original source material that would have thrown doubt on these claims was only a couple of clicks away. The sad fact is that most people don't bother to check the facts too carefully on most of what they read, even when checking the facts isn't onerously difficult. And I include myself in that statement; I frequently don't bother to follow the links in a K5 story, or do a quick Google search on a meme mentioned on Slashdot.

And this isn't just limited to online, collaborative media like Slashdot, either. K5 has a story on the Klingon language interpreter myth that gives an example of this principle in a traditional media setting.

In today's world, where we experience an enormous information influx as the result of advanced communications technologies, it's more important than ever to reserve a certain amount of skepticism for every new fact we read about, and try to check up on the facts when possible even when we would like to believe what's being said.

Filtering Feeds to Free Up Focus

design, information, internet

May 09, 2003, 09:52 AM

Micah mentioned to me yesterday that he has been having problems recently with the volume of content he subscribes to with his news aggregator. He says it is getting to the point where catching up on his online reading is a multiple-hour engagement and is starting to take up too much of his time. At least judging from the "in the near future, the scarce resource will be human attention" meme that's running through HCI research circles, this is likely a problem for many and will only get worse.

Micah was thinking of adding a "priority" to posts, where I, as the author, can tag each entry in the RSS feed with how important I think it is. Then Micah, as the reader, can tell his news aggregator something like "Don't show me any posts from Rob, unless they're of High importance or greater".

After thinking about this, I'm not sure how well it would work, however, for two reasons. One is that I can only encode how important I think the post is to me; I don't know how important my readers will think it is. Worse, for most posts I imagine every reader would have a different opinion of its importance based on their interests.

I mentioned using categories to divide up content into topic areas, then providing separate RSS feeds for each category so my readers can subscribe only to the categories they are interested in. Movable Type can probably already support this. But this isn't really an optimal solution either, since my categorization scheme may not be the scheme my readers want, I may not be consistent with my assignment of categories, etc.

So here's my suggestion: what if news aggregators allowed you to have a set of feeds you subscribed to just as they do now, and then a different set of feeds you "monitored"? You could specify certain search keywords or other criteria that must appear in the feed's content in order for it to appear in your list. This way your aggregator could automate some of the weblog filtering process for you so your valuable attention could be directed to more important tasks, like reading the posts you're interested in. Additionally, this may have the side benefit of encouraging sites to syndicate all of their content, rather than just excerpts, in the hopes that the full content will match more people's filter criteria than just the partial content.
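Here is roughly what I have in mind, as a small Python sketch. It assumes the aggregator has already fetched and parsed its feeds into simple entry records; the Entry class, the feed names, and the plain substring matching are all placeholders for illustration, not features of any real aggregator.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    feed: str      # name of the feed the entry came from
    title: str
    content: str


def matches(entry, keywords):
    """True if any keyword appears in the entry's title or content."""
    text = (entry.title + " " + entry.content).lower()
    # Crude substring matching; a real filter would want word boundaries.
    return any(keyword.lower() in text for keyword in keywords)


def reading_list(entries, subscribed, monitored):
    """Pick the entries worth showing to the reader.

    subscribed: set of feed names whose entries are always shown.
    monitored:  dict mapping a feed name to the keywords an entry
                from that feed must match in order to be shown.
    """
    selected = []
    for entry in entries:
        if entry.feed in subscribed:
            selected.append(entry)
        elif entry.feed in monitored and matches(entry, monitored[entry.feed]):
            selected.append(entry)
    return selected


if __name__ == "__main__":
    entries = [
        Entry("Micah", "CHI notes", "Notes from the BOF."),
        Entry("roBlog", "Chipwich run", "Went to Entropy."),
        Entry("roBlog", "Log tools", "Sketches of visualizations for web log data."),
    ]
    shown = reading_list(entries, {"Micah"}, {"roBlog": ["visualization"]})
    print([entry.title for entry in shown])
```

Because the filter looks at the entry's content, a feed that syndicates only excerpts gives it much less to match against, which is where the incentive to publish full content comes from.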

Just an idea.

A Note on XML and Single Sourcing

design, information, software development, usability

May 09, 2003, 01:29 AM

In reference to my post a few days ago on a single sourcing paradigm for research content management, Dave asked why I didn't mention XML as a solution to the semantic content / presentation separation problem. That's a good question and raises a point I want to clarify.

XML provides a technical solution to the single sourcing problem by allowing people to define markup languages specifically for semantic content and markup languages for presentation specifications (XHTML, CSS, SVG, etc.) and a means of translating between them given some content-to-presentation mapping rules (XSLT). This is a fairly neat answer for the technical problem, but isn't very intrinsically interesting since there have been other technologies that have purported to solve this same problem before. XML is really only interesting because it has broad industry support and thus has a chance of catching on and lasting.
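For anyone who hasn't seen the separation in action, here is a small sketch of the idea in Python. To be clear about the substitution: a plain dictionary of mapping rules stands in for an XSLT stylesheet, and the standard library's ElementTree does the translating. The element names (paper, title, finding) and the sample text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Semantic content: says what each piece of text is, not how it should look.
SEMANTIC_DOC = """
<paper>
  <title>Single Sourcing for Research</title>
  <finding>Styles break the WYSIWYG metaphor.</finding>
  <finding>Users think in terms of appearance.</finding>
</paper>
"""

# Content-to-presentation mapping rules, standing in for an XSLT stylesheet.
WEB_RULES = {"title": "h1", "finding": "p"}
HANDOUT_RULES = {"title": "h2", "finding": "div"}


def render(semantic_xml, rules):
    """Translate semantic elements into presentation elements using the rules."""
    source = ET.fromstring(semantic_xml.strip())
    body = ET.Element("body")
    for element in source:
        tag = rules.get(element.tag)
        if tag is None:
            continue  # no rule for this kind of content, so leave it out
        target = ET.SubElement(body, tag)
        target.text = element.text
    return ET.tostring(body, encoding="unicode")


if __name__ == "__main__":
    print(render(SEMANTIC_DOC, WEB_RULES))
    print(render(SEMANTIC_DOC, HANDOUT_RULES))
```

The same content comes out in two different presentations without being touched, which is the whole point of single sourcing. Notice, though, that someone still had to write both the semantic markup and the rules by hand.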

The interesting and difficult problem that I discussed in my post was the interface problem: the "how do you build an interface that lets the user define semantic content and presentation of that content separately, yet still provides a metaphor that makes sense to the average user?" problem. The difficulty is that we don't think in terms of semantics, we think in terms of how things look. Headings in documents are headings because they are in big bold sans-serif fonts. So when we are starting a new section and want to make a heading for it, our first impulse is to type some text, then change it to a big, bold, sans-serif font. If the program instead requires us to highlight the text and click on "make this a heading 3" then go into a bunch of dialog boxes to say "oh and by the way, heading 3s in this document will be in a big, bold, sans-serif font" then I guarantee people won't use it. This is why so few people use styles in Word, despite their many benefits; styles break the WYSIWYG paradigm that fits so well with people's tasks. And what if the user underlines the third heading in the document? Should the style change, hence underlining all headings in the document? Should just that heading change, creating a new, local style? And so on. I found an excellent article a while back that expands on the ambiguity of styling nicely.

It's a hard problem, but one that I think could be solved much more neatly than any interface I've seen (Word's styles and Dreamweaver's CSS support both spring to mind as bad examples). And that's what XML doesn't do for us, and what I was driving at in my post.

Visions of a Distributed Future

information, patterns, society & sociology

May 07, 2003, 07:00 PM

While walking home today I was thinking about the direction wireless and networking technologies are moving in and what impact this might have on society in the long term.

Since at least the beginning of the Industrial Age, we humans have lived a daily routine that looks something like the following:

  1. Leave home in the morning and go to the workplace.
  2. Do work at the workplace all day.
  3. Leave the workplace in the early evening and go have fun, either by returning home or going to some other location like a bar, a restaurant, a club, etc.
  4. Go home to relax and sleep.

This has been the structure of our daily life, dictated to us by the Protestant work ethic which, according to Max Weber, has defined the spirit of the Industrial Age.

But the sands of time are flowing, and the world isn't what it was. Information technology has already enabled easy, global distribution of information and rapid, decentralized communication, and we are beginning to discover how to harness this power. Moreover, information/communication technology will soon become ubiquitous with the advent of universal wireless networking, wearable computing, and other mobile and embedded technologies.

So what does this mean for the structure of our lives? This ubiquity of communication is starting to make the concept of a "workplace" obsolete. Projects such as Aura, Civium, and many others like them are looking into ways of ensuring that your information is always with you, will always be there for you when you need it. So there is no need to go to the office to fetch the information you need. Moreover, wireless communication technologies, of which cell phones are only the beginning, make it easy to communicate with the people you work with, either completely virtually or as a quick means of setting up a real-world meeting. So your "workplace" could be anywhere; your project group could teleconference and agree to meet in a coffee shop or a park, then you could go back home to do the work you took on. We're already starting to see this happen technologically with the increasing sophistication of computer-supported cooperative work systems and socially with the popularity of telecommuting.

All this also means that the traditional structuring of the day into work, play, sleep may no longer be necessary in the future. We have the option of decentralizing, working when we want, coordinating our schedules only as necessary to get together with our colleagues. In fact, the entire concept of a "work time" could be discarded, and our jobs can be seamlessly interwoven with the rest of our lives. Work and play will coexist peacefully, and frequently become one.

In "The Hacker Ethic", Pekka Himanen discusses how this has already begun in certain cultures of information technologists, who interweave their work into their lives, their passions, and their dreams. These technologists ("hackers" in the original sense of the word) prefer this method of leading their lives to the "nine-to-five rat race / daily grind" option.

Many people fear the interweaving of work and play, of public and private life. They're afraid of getting "too connected"; they worry that if work can be done anywhere then they will never have time to do what they want. I believe this is an artifact of the Protestant work ethic, which has always clearly separated "work" from "fun" and emphasized duty over pleasure. But that was a fact of life in the Industrial Age, and the Industrial Age is fading.

In "A Pattern Language", Christopher Alexander describes the "Community of 7000" pattern as a means of capping the size of towns and other communes to humane limits. Alexander is big on decentralization and distribution as means of building healthy societies. His patterns describe a society that meets these criteria; it lives at peace with nature and provides a humane environment for its inhabitants. What's interesting is how much easier it could be to facilitate many of the changes required by such a society were we to have a robust information infrastructure that gave us all "floating workplaces" so we were free to spend time in the places we want to and cultivate those places to become even more appealing to us and our peers. The chains that pull us all into overcrowded cities out of the sad necessity of centralization would disappear.

In our hierarchical world with global superpowers, multinational corporations, sprawling cities, an obscene rich/poor divide, and many millions of lives choked to death by inhumane conditions of every sort, it is difficult to imagine that a modern society could really work in the way he describes. But perhaps, just perhaps, distributed information technologies can be the catalyst that will bring this about. It is far from certain. No technology can change the world; people change the world, and if this vision is to become real we must make it happen. But I, for one, believe it will soon be in our reach.

A Wiki for the CMU HCI Master's Program

information, internet, personal

May 07, 2003, 10:49 AM

A while back, Micah and I were talking about his excellent MHCI Cheat Sheet and how we were going to maintain it and make sure it is available and useful to new students when he's gone. He suggested setting up a Wiki and putting the cheat sheet in it, and the more I think about the idea, the more I like it.

Wikis are basically web sites where anyone who comes to the site (or at least, a largish group of people who come to the site) can edit every page on the site just by pushing a button. They're useful when you have a large amount of frequently-changing information that needs to be updated by a distributed group of peers, and when things like consistency of appearance and style are not too important. I think this matches the MHCI program fairly well, since what we need is a collaboratively-owned "knowledge space" that makes it easy for information to flow between those of us who have it and those of us who need it. This is especially true when those of us who have it have graduated, left Pittsburgh, and moved on with their lives, and those who need it are new students who just arrived in town and now have to figure out a million little things before they can get up and running. Since this is only a one year program with little student carryover, this is a common situation.

I think an MHCI Wiki could serve several purposes, including:

There are a couple of concerns I have about the idea as well.

I'm thinking of setting one up on Loki's Labs to try it out. If there is enough interest I could probably find some way to finagle CMU into hosting it somewhere permanently.

Commentary

Posted by Rob on May 07, 2003 at 12:30 PM

I added the bullet point on "Posting finished student projects" around 12:30. I knew I'd forgotten something when I posted this thing...

Research and Content Management

information, software development, usability

May 03, 2003, 12:00 PM

I've recently become more and more aware of the need for a "single source paradigm" in productivity applications. To gloss over a complicated topic, single sourcing calls for the separation of the content you create (the text, images, raw data, etc.) from the medium you view or present it in. Here's a diagram of such a system:

UniversalContentDiagram.jpg

Imagine this weblog post is the semantic content. I'd use my entry editing tool (Movable Type in this case) to create the semantic post; I'd write this text and the diagram above and break the post down into its semantic parts (such as the "main points", "examples", etc.) Once I had that, I'd be able to tell the system to generate not only the HTML pages that normally make up the post, but a printable PDF, a Powerpoint presentation to show to my bosses, etc. based on stylesheets for each presentation medium I had previously defined. Having a system like this would bring us closer to the view of application-agnostic computer use that I've mentioned before, since we could use many types of editors to create the appropriate view of the content we've already defined.
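As a rough sketch of what that workflow might look like in code, here is a toy version in Python. Everything in it is hypothetical: the `Post` structure, the two renderers, and the `RENDERERS` table standing in for per-medium stylesheets. It is not how Movable Type or any real tool works.

```python
from dataclasses import dataclass, field
from html import escape

@dataclass
class Post:
    """Semantic content only: no fonts, colors, or layout."""
    title: str
    main_points: list = field(default_factory=list)
    examples: list = field(default_factory=list)

def to_html(post):
    """Web rendering: keeps everything."""
    parts = [f"<h1>{escape(post.title)}</h1>", "<ul>"]
    parts += [f"  <li>{escape(p)}</li>" for p in post.main_points]
    parts.append("</ul>")
    parts += [f"<p><em>Example:</em> {escape(e)}</p>" for e in post.examples]
    return "\n".join(parts)

def to_slide_outline(post):
    """Slide rendering: title and main points only, examples are dropped."""
    lines = [post.title.upper()]
    lines += [f"  * {p}" for p in post.main_points]
    return "\n".join(lines)

# One "stylesheet" per presentation medium (stand-ins for real stylesheets).
RENDERERS = {"html": to_html, "slides": to_slide_outline}

post = Post(
    title="Single sourcing",
    main_points=["Separate content from presentation",
                 "Generate each medium from one source"],
    examples=["A weblog post that also becomes a slide deck"],
)
outputs = {name: render(post) for name, render in RENDERERS.items()}
print(outputs["slides"])
```

Note that each medium gets to decide how much of the content to show; the slide outline leaves the examples out, while the source itself is written only once.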

Single sourcing has long been a promise of computing but has yet to be delivered in any satisfactory form. So I thought I'd describe the problem we're having where single sourcing might apply to help reason about what an adequate single source solution would need to do.

For my work on Usability and Software Architecture (U&SA) with Bonnie and Len, we have a number of "scenario packages" we are developing that contain, among other things, a description of a system usage scenario with architectural implications, a list of responsibilities the system must fulfill to implement this scenario, benefits to the user of including the scenario in the system, etc. These scenario packages are the cornerstones of our work.

There are a number of properties these packages have that would make them amenable to a single source solution. Here's a list of them:

  1. We produce a number of publications about these scenarios. These publications span data formats (we produce Word documents and PDFs for academic journals, Powerpoint presentations for giving tutorials on our scenarios, I'm now trying to put together a U&SA website, etc.) as well as presentation formats (even publications that all accept Word document submissions all have different ideas of how the paper should be organized, what the font and spacing should be, etc.). We waste a lot of time fooling around with getting the scenario text, graphics, etc. out of one presentation format and into another.
  2. The scenario packages change frequently. Since this is still an area of active research, we're still developing the notion of what makes up an architecturally-sensitive usability scenario package. And when we change our minds, we want all our future publications to include the updated scenarios and not the old stuff. We've also wasted a lot of time fooling around with doing manual "diffs" of old material to make sure it's up to date with respect to our new understandings of the problem.
  3. We don't want to be limited in what we can express by the particular presentation-oriented tool we're using. For example, most of the diagrams we're developing are currently done in Powerpoint, but Powerpoint is not a diagramming tool and it places limits on the ease with which we can create and modify these graphics. I've proposed using Omnigraffle instead, but Omnigraffle and Powerpoint don't always play well together. Plus, when I create a diagram in Omnigraffle it may be more detailed than what I want to show on a Powerpoint slide; I may want to be able to generate a stylized version of the diagram for presentation purposes.
  4. We use different machines running different platforms (mostly Windows and Mac OS) to develop this material. This isn't directly solved by the single source paradigm, but by centralizing the content in a single location, we could more easily share our work, track versions, and attribute changes to who made them. This would help solve a frequently occurring problem where Bonnie changes her copy of our slides, I change mine, and then I have a messy and error-prone manual merge task to perform.

Micah is hopeful that Office 2003 will help solve some of these problems. I'm skeptical, but willing to reserve judgement. At this point I'm willing to see anyone take the next step forward, since this idea has been in gestation far too long without much real progress to show for it.

A Recommender System for End-Consumer Service Providers

information, internet, society & sociology

April 25, 2003, 08:46 PM

A sector of the economy that I've always experienced problems with, and I think many others have too, is the end-consumer service providers industry. This is just a fancy way of referring to auto mechanics, maid services, landlords, and anyone else who sells a service directly to individual private citizens (as opposed to other businesses or government organizations, which are a different story). Note that many companies that nominally produce goods may fall into this category as well since most provide some sort of service to support their products. For example, Compaq makes computers but also has a tech support line.

I don't mean to imply that it's impossible to get excellent service from this sector; the problem I'm describing is that it's hard to determine who will provide good service and who will not without a lot of trial and error. Part of the problem is that many of these services have slow cycles; unless I frequently experience car problems I won't need the services of a mechanic too often. Landlords generally sign leases for a minimum of one year. So you may have to go through several providers before you find one that's good. Meanwhile the not-so-good providers may stay in business through people who are uninformed or just don't want to have to take a chance with someone else. To compound the problem, if you're like me and you move fairly frequently then even if you do locate a good service provider, you'll have to start all over again when you reach your new home.

To help solve this problem, I propose a web site that allows consumers to rate service providers and discuss their service experiences. Kind of the same intent as Amazon.com's product rating system, except for service providers. This would allow consumers to mobilize and take crappy service providers to task, while promoting the businesses of honest, competent providers.

Aside from all the problems that normally confront recommender systems, one obstacle for this system might be slander and libel laws. Users of the system who are upset at a service provider may post comments to the site that aren't true, and of course it is impossible to check up on the facts for all comments. I'm no lawyer, but I have listened to lectures on business law and torts, and my understanding is that you can get sued for publishing libelous comments that another person made. I'm not sure if this applies differently to community web sites, but I imagine the proprietor of such a system might need to be prepared to defend himself in court.

What's Related to this Post?

design, information, internet

April 23, 2003, 06:31 PM

One question I've frequently wanted to know the answer to since I started this here weblog is: "Now that I've finished this entry, what things are other webloggers saying that are related to the topics I discussed?" So I have a modest proposal for a solution to this problem.

I propose someone create a server that listens for weblogs.com XML-RPC pings from weblog updates. But instead of just providing a recently-updated list like weblogs.com does, when this server receives a ping it would fetch the weblog's RSS feed and extract the entry content and any associated metadata (title, category, whatever). Then it would run a "what's related" algorithm comparing the new post to all the posts in its database. Finally it would generate a report of weblog entries that are related to the new entry, and spit out an HTML page to a web server that could distribute this information to the masses.

The final system would look like this:

RelatedEntriesArchitecture.png
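The "fetch the feed and extract the entries" step could be sketched roughly as follows in Python. This assumes an un-namespaced, RSS 2.0 style feed; the sample feed, the `extract_entries` and `handle_ping` names, and the list used as a database are all invented for illustration, and the XML-RPC listener and the HTTP fetch are left out entirely.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed; a real server would download this after receiving a ping.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <item>
      <title>What's Related to this Post?</title>
      <link>http://example.com/archives/000123.html</link>
      <description>A proposal for finding related weblog entries.</description>
      <category>internet</category>
    </item>
  </channel>
</rss>"""

def extract_entries(feed_xml):
    """Pull entry content and metadata out of each <item> in the feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "content": item.findtext("description", default=""),
            "categories": [c.text or "" for c in item.findall("category")],
        })
    return entries

def handle_ping(feed_xml, database):
    """What the server might do once a ping arrives and the feed is in hand."""
    new_entries = extract_entries(feed_xml)
    database.extend(new_entries)  # stored so later posts can be compared to these
    return new_entries

database = []
added = handle_ping(SAMPLE_FEED, database)
print(added[0]["title"], added[0]["categories"])
```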

A nice extension to the system would be to send the URL of the final report back to the weblog tool so it could include a "Related Entries" link next to the post (if the weblog's author wanted it to do so, of course). But I don't know of any way to accomplish this without extending existing weblog tools.

I must confess I'm not sure how "what's related" algorithms work or how useful the current state of the art tools are. Google's concept of what's related to my page is a little strange (why does Micah's gesture literature review page come up, but not his weblog? And why does Mathilde's friend Mav's page appear?) although not completely inaccurate. But hey, that's a problem for the Language Technologies people to solve :).
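For what it's worth, one very simple way to score relatedness is to count the words in each post and take the cosine of the angle between the two count vectors. The sketch below does only that; it is a textbook baseline picked for illustration, not a claim about what Google or any weblog tool actually uses, and the sample posts are made up.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Lowercase bag-of-words vector for a piece of text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_related(new_post, stored_posts, top_n=3):
    """Rank stored posts by similarity to the new post, best first."""
    new_vec = word_counts(new_post)
    scored = [(cosine_similarity(new_vec, word_counts(p)), p) for p in stored_posts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]

stored = [
    "Wikis let a group of people edit every page on a site",
    "Weblog tools send pings when a weblog is updated",
    "Recommender systems rate movies and products",
]
ranked = most_related("What can we do with weblog pings?", stored, top_n=1)
print(ranked[0][1])
```

Raw word counts overweight common words like "the", so a real system would probably want something smarter (stop-word removal or TF-IDF weighting, for instance), but the overall shape would be similar.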

Currently, the automatic notification ("ping") features built into most popular weblog tools are, in my opinion, underexploited. We could probably think of lots of other cool things to do with them if we tried.

MAYA's Marketplace of Mutable Meanings

design, information, internet, software development

April 17, 2003, 12:36 AM

I went to the HCII Seminar Series talk today given by Peter Lucas, the CEO of MAYA Design (which is where Mathilde works nowadays). He talked about Civium, a "vision and architecture for a seamless, distributed public information space".

Civium is a distributed data storage system that implements an information commons through location-transparent persistence. In this sense, it's similar to Gnutella or Freenet: information can be widely distributed across machines to provide effectively limitless physical storage space and no one machine contains the canonical representation of one piece of information, so information is much harder to destroy. So in this sense, Civium's not very interesting.

Civium also provides various clients to visualize the information present in the distributed database. Peter demoed a geographic information system (GIS) client that accessed information MAYA obtained from publicly available government sources. But I've seen several such systems before, and this one didn't appear to do anything particularly novel (except that it was potentially more extensible, as we'll see). So in this sense, Civium's not very interesting.

What is interesting is the way that Civium handles data semantics and their dissemination. At the core of the system lies VIA, a universal database. VIA isn't entirely relational or object-oriented, but instead is based on three types of entities:

  1. U-forms are sets of attribute name / value pairs with a Universally Unique Identifier (UUID) bound to them. So essentially, they're glorified hashtables that can be uniquely distinguished from all the other glorified hashtables on the network. Note that this is essentially what an RDBMS row is (provided it has a UUID primary key), so U-forms can work similarly to relational table rows in terms of building more complex data structures using foreign keys, etc.
  2. Shepherds are software agents that push U-forms around from repository to repository (a repository is any physical storage medium connected to VIA) without having to understand the semantic meaning of the U-form content. They take care of the whole "distributed" part of the system.
  3. Roles are the schemata that define the semantic meaning of the data that appears in U-forms. But the neat thing is, Roles are stored in U-forms themselves. So if you need to discover the semantics of data in a U-form you've found, you just need to locate the data's Role U-form from the VIA system. If you want to extend an existing Role, you just need to insert your own U-form into VIA that includes the modifications you need and references the original Role.
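Here is a toy sketch in Python of how these three pieces might fit together, based only on the description above. The `Repository` class, the `shepherd_copy` and `role_fields` helpers, and the attribute names `role`, `extends`, and `fields` are all my own inventions for illustration; they are not VIA's actual API or data layout, and the sample values are illustrative.

```python
import uuid

class Repository:
    """Hypothetical in-memory stand-in for one VIA repository."""
    def __init__(self):
        self.uforms = {}

    def insert(self, attributes):
        """Store a U-form: attribute name/value pairs bound to a fresh UUID."""
        uform_id = str(uuid.uuid4())
        self.uforms[uform_id] = dict(attributes)
        return uform_id

    def get(self, uform_id):
        return self.uforms[uform_id]

def shepherd_copy(uform_id, source, destination):
    """Push a U-form to another repository without interpreting its attributes."""
    destination.uforms[uform_id] = dict(source.get(uform_id))

def role_fields(repo, role_id):
    """Collect a Role's field definitions, following 'extends' to the Role it builds on."""
    role = repo.get(role_id)
    fields = {}
    parent_id = role.get("extends")
    if parent_id is not None:
        fields.update(role_fields(repo, parent_id))
    fields.update(role.get("fields", {}))
    return fields

repo_a, repo_b = Repository(), Repository()

# Roles are stored as ordinary U-forms; the second one extends the first by reference.
place_role = repo_a.insert({"fields": {"name": "text", "latitude": "number", "longitude": "number"}})
city_role = repo_a.insert({"extends": place_role, "fields": {"population": "number"}})

# A data U-form points at its Role by UUID, much like a foreign key.
pittsburgh = repo_a.insert({"role": city_role, "name": "Pittsburgh", "latitude": 40.44, "longitude": -80.0})

shepherd_copy(pittsburgh, repo_a, repo_b)
city_fields = role_fields(repo_a, city_role)
print(sorted(city_fields))
```

The point of the sketch is that extending a format needs no central registry: anyone can insert a new Role U-form that references an existing one.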

Assuming that the semantics of Role U-forms are defined by the system (Peter skipped most of his slides on Roles, unfortunately), then this system essentially implements a "free market of data formats" in which anyone can create new formats or use existing ones, no one has control over a particular format, and market forces can determine which formats become "standards". I don't know of any other systems that provide real architectural support for this concept.

Granted, I have my doubts as to whether this approach would truly be effective in improving computing's recurring "format wars". After all, the browser wars of the nineties between Netscape and Microsoft were a good example of market forces influencing a shared format, HTML, while the relevant standards body, the W3C, stood by and watched, irrelevant. And that was a bad time to be a web developer. But at least in Civium, we wouldn't wind up with one powerful monopoly corporation controlling the format that won out.

Whether market forces, rather than standards committees, are truly the best way to produce consensus on data semantics and representations is a question I'm not prepared to answer. But Civium, if it ever becomes a major force, will certainly put this idea to the test.

CHI Report, Recommender Systems

design, information, internet, society & sociology, usability

April 12, 2003, 04:40 PM

After lunch, I went with Mathilde to a short talk session on Recommender Systems. Three out of the four talks were interesting, so I considered it a pretty successful session.

The first talk was on how the ratings in systems like Amazon.com's customer-supplied product rating system can influence future customers' ratings and lead to inaccurately high or low ratings for some products. The speaker studied a system called MovieLens, which looks similar to an Amazon.com that only does theater movies. He found that around 8% of the time people would raise or lower their ratings based on the rating that was already present. Thus if users would rate a movie as 2 stars, but see a rating of 4 stars on MovieLens, they might adjust their rating, consciously or unconsciously, up to 3 stars. Though these individual effects may be slight, the inaccuracies may spread.
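To put a number on "slight", here is a back-of-the-envelope calculation in Python. It uses a toy model of my own, not the speaker's analysis: I assume a user shifts exactly one star toward the displayed rating with probability 0.08 and otherwise records their true opinion.

```python
def expected_recorded_rating(true_rating, shown_rating, shift_probability=0.08):
    """Average recorded rating under the toy one-star-shift model described above."""
    if shown_rating > true_rating:
        shifted = true_rating + 1
    elif shown_rating < true_rating:
        shifted = true_rating - 1
    else:
        shifted = true_rating
    return (1 - shift_probability) * true_rating + shift_probability * shifted

# A movie users would honestly rate 2 stars, displayed with a 4-star rating.
biased = expected_recorded_rating(true_rating=2, shown_rating=4)
print(round(biased, 2))  # 0.92 * 2 + 0.08 * 3 = 2.08
```

Under these assumptions a single round of rating nudges the average by less than a tenth of a star; the worry is that the nudged average is then what the next group of users sees.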

Another interesting finding he reported was that users could tell if a system had wildly inaccurate ratings, and that this would affect the users' opinions of the system. So if users found good movies rated low or bad movies rated high consistently, they reported a much lower opinion of the system as a whole. Unfortunately, he studied only a fairly extreme case of inaccuracy, so it is difficult to say where the relevant threshold value is.

The next speaker, David McDonald from the University of Washington School of Information, discussed a recommender system for group-ware tasks such as finding the relevant expert in a company who may be able to answer a given question. What made his talk particularly interesting was that he was using social networks to build such a system for a medical software company. Here are my main takeaway points:

The final interesting speaker was Thomas Erickson from IBM T.J. Watson Research Center. He discussed visualizations of social activity, such as those produced by the Sociable Media Group at the MIT Media Lab (which I applied to, by the way, and never heard back from...). He had a few interesting things to say:

One side comment I thought of while watching these presentations: talks such as these are useful for making a few small, well-argued points that are interesting enough to encourage your audience to look into the area in more depth (or just take away your main points). The speakers who tried to go into great detail and make complicated arguments quickly lost the audience, whereas those who had simple points that powerfully portrayed their message were more effective. More on this when I get to Norman's closing plenary.
