A More User-Centered HTTP Log Analysis Tool?
internet, usability
September 06, 2003, 12:39 PM
I was poring over my logs yesterday and I realized I'm getting more and more frustrated with the inadequacies of Webalizer, especially with regard to performing log analyses for the purposes of enhancing usability. Webalizer takes a very system-centric approach of basically compiling and graphing HTTP request data, rather than a user-centric approach of trying to generate visualizations of the log data that answer the questions of its users. Although there are a few promising features (tracking "visits" and extracting search query terms from referer URLs), in general the tool leaves much to be desired.
So I'm wondering if there are any better tools out there, ideally open-source, but I'd be willing to purchase a proprietary product if the price is reasonable. Here are some of the kinds of features I'm looking for:
- Better capabilities for associating visitors with the URLs they viewed. It's less helpful for me to see the top 30 URLs by hits than it is to see what sites visited what URLs and how many times they visited them.
- Some tracking of the paths visitors take through the site. This can help give a sense of what content people are most interested in, what content is hard to find, etc.
- Association of search terms with the pages they linked the visitor to.
- Better graphing of hits, sites, and visits, such as the ability to view daily trends over multiple months.
- Just working better in general; for instance, allowing me to tell it to completely ignore requests from htdig in performing the analyses. I've supposedly told Webalizer to IgnoreSites from the Labs itself, but this only works for some of the calculations.
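As a rough sketch of how little is needed for the first few items on this wish list, here is one way the per-visitor views might be built. It is only an illustration under several assumptions: the logs are in Apache's "combined" format, search engines pass the query in a `q` parameter, and htdig identifies itself in its user-agent string. The function and variable names are made up for this example and are not part of Webalizer or any other tool.

```python
import re
from collections import Counter, defaultdict
from urllib.parse import urlparse, parse_qs

# Assumed layout: Apache "combined" log format.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None

def search_terms(referer):
    """Return the search query carried by a referer URL, or None.

    Assumes the search engine uses a 'q' query parameter.
    """
    terms = parse_qs(urlparse(referer).query).get("q")
    return terms[0] if terms else None

def analyze(lines, ignore_agents=("htdig",)):
    """Group requests by visiting host, skipping ignored user agents entirely."""
    urls_by_host = defaultdict(Counter)   # host -> how often it requested each URL
    paths_by_host = defaultdict(list)     # host -> URLs in the order requested
    terms_by_url = defaultdict(Counter)   # URL -> search terms that led to it
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        agent = rec["agent"].lower()
        if any(name in agent for name in ignore_agents):
            continue
        host, url = rec["host"], rec["url"]
        urls_by_host[host][url] += 1
        paths_by_host[host].append(url)
        terms = search_terms(rec["referer"])
        if terms:
            terms_by_url[url][terms] += 1
    return urls_by_host, paths_by_host, terms_by_url
```

Calling `analyze(open("access_log"))` (the filename is a placeholder) returns three dictionaries: which URLs each site requested and how often, the order in which each site requested them, and which search terms led to which pages. Grouping by host is only a crude stand-in for "visitor", since proxies and shared machines blur the picture.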
I realize that much of the data users really want to see (exactly who is reading their site, how many real people are reading, etc.) cannot be obtained from HTTP request headers. But all the features I've mentioned above are perfectly implementable. The designers of these tools just need to take a more user-centered perspective.
So, anyone know of any better alternatives? 'Cuz I'd sure like to.
Email Rob:
Teaching, IT, and Making Research Matter
internet, teaching & learning
September 05, 2003, 07:22 PM
Last Wednesday, I attended a discussion forum on "Information Technology and the Research University", which I mentioned I was invited to earlier. There were a bunch of bigwigs from the National Academies present; they were apparently interested in how CMU faculty and students were using IT currently, what IT research was going on to try to improve education, and what the future trends appeared to be. Ken Koedinger, the cognitive tutors and educational technology expert from the HCII, was there to talk about his research. So was Jared Cohon, the president of CMU, so this must have been a reasonably big deal to the university.
I spoke a little about the types of communication mediums that students commonly use nowadays, including weblogs, instant messaging, and email. Some of the committee members seemed very interested in multiplayer gaming, which I didn't know too much about although some of the other students did.
They were also concerned about how many of the CSCW technologies introduced into classrooms to foster community tended to get dropped after a few weeks of use. I pointed out that part of the reason for this (I believe) is that these technologies tend to get imposed in a top-down fashion without much regard for whether they will fit into the students' (and teachers') styles of working and learning. Most of the technologies that have been successful (email, IM, file sharing services, and now, perhaps, weblogs) were not imposed from above but instead were picked up by students and pulled into the classroom. The committee didn't seem to have much to say about this; one of the members responded by claiming (essentially) that attempts to impose technologies in a top-down fashion have worked successfully in the Army, so why not in the university? I don't think I have to elaborate on the problems with that reasoning. My hypothesis is that educational technologies (community-oriented or not) have to take into account their users (teachers and students) and their users' goals if they wish to be successful. And this means properly integrating themselves with the lesson plans and the student and teacher work flows. Hopefully I'll have more to say about this as CSCW wears on.
As we were wrapping up, Joel asked the committee if they had recommendations on what future research should be done in the area of IT and the university. Their reaction was surprising and encouraging. Pretty much all the committee members responded that we already know quite enough; the problem isn't that we need more research, the problem is that we need to figure out a way to take the research we do have and actually put it into practice. All of them seemed to be aware that very little of the work that gets done in the modern university actually influences the work of industry and government. (Interesting factoid: I was talking to a woman who works for the Office of Technology Transfer. She said the government gets a return of two cents for every dollar they spend on university research. That's a net loss of 98%. Think about that for a minute.)
Several of the faculty who were present pointed out that there currently exists no incentive structure for them to work towards getting their research adopted in practice. There is also little incentive structure in a research institution (like CMU) for the faculty to spend time improving their lesson plans and teaching skills. President Cohon and the Vice Provost of Education resisted this claim on the grounds that CMU had many wonderful teachers, but the committee responded that they didn't doubt that; they were just asking if there were better ways of doing business at the organizational level. And of course the committee is right. Sure, there will be those who take initiative on their own, but if there are no organizational support and award structures to support good teaching, this will be the exception and not the rule. Matt made this point quite nicely in a discussion of one of my old posts. Essentially this is the same argument that the CMM makes about the importance of software development processes; relying on virtuosos may produce great results sometimes, but these results won't be repeatable. If you want your organization to grow, you need a good process.
At any rate, it's good to know the decision makers seem to be thinking about the right issues. Whether they do anything effective about them remains to be seen.
Spam-Protect Your Email Links
internet, software development
September 05, 2003, 12:49 AM
I've been putting the finishing touches on the MHCI students' bios page this evening. One thing I wanted to do was spam-protect everyone's email address (we get enough web-harvested spam already since the SCS, in its infinite wisdom, posts our addresses unprotected in its directory). I was going to roll my own perl script to scan the HTML files and replace mailto: links with impossible-to-harvest javascript generation code, but instead I decided to search the web to see if someone else had thought of this already and came across a very nice solution already put together for me.
Jody Brabec, if you ever read this, thanks for the great script! And the take home lesson for the rest of us is: oftentimes spending five minutes with Google will save you hours of coding, testing, and debugging time (just to be safe, I took a few minutes to modify the variable names in Jody's javascript to fool spambots that might happen to know of his script; I'd suggest you do the same). Reuse is a Good Thing™ when you can get it.
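For anyone curious what the roll-your-own version might have looked like, here is a minimal sketch (in Python rather than perl). To be clear, this is not Jody's script; it is an illustration of the general idea, and the function names and the JavaScript variable names `u` and `d` are invented for the example. It assumes simple links of the form `<a href="mailto:user@domain">text</a>` with `href` as the only attribute, and it will not help if the link text itself spells out the address.

```python
import re

# Matches only plain links whose sole attribute is a mailto: href (an assumption).
MAILTO_RE = re.compile(
    r'<a\s+href="mailto:(?P<user>[^@"]+)@(?P<domain>[^"]+)"\s*>(?P<text>.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

def js_string(value):
    """Quote a value as a single-quoted JavaScript string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace("'", "\\'")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )
    return "'" + escaped + "'"

def protect(match):
    """Build a script element that writes the link when the page loads."""
    user, domain, text = match.group("user", "domain", "text")
    # The address and the word "mailto:" are split into pieces so that
    # neither appears whole anywhere in the page source.
    return (
        '<script type="text/javascript">'
        "var u = {u}; var d = {d}; "
        "document.write('<a href=\"mai' + 'lto:' + u + '@' + d + '\">' + {t} + '<\\/a>');"
        "</script>"
    ).format(u=js_string(user), d=js_string(domain), t=js_string(text))

def protect_html(html):
    """Replace every matching mailto: link in an HTML string."""
    return MAILTO_RE.sub(protect, html)
```

Running `protect_html()` over each HTML file's contents leaves everything except the matched links untouched. Visitors with JavaScript turned off will not see the link at all, which is the usual trade-off with this approach.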
Empirical Research on the Reasons for Free Riding
society & sociology, systems
September 04, 2003, 08:51 PM
For CSCW this week, we had to read an empirical research article relating to free riding / social loafing and summarize it for the class. I chose an article that studied the reasons why people tend to free ride as a long-term trend in Prisoner's Dilemma-style situations, since I'm pretty interested in game theory economics (although not ordinarily so interested that I'm willing to slog through academic journal articles on the topic). I'm posting it here because I found the results interesting enough that I believe a higher-level summary of the experiment is worth disseminating. Just in case you're interested in the real thing, the reference for the article, in APA-style, is: Andreoni, J. (1988). Why free ride? Strategies and learning in public goods experiments. Journal of Public Economics 37: 291-304.
In "Why Free Ride? Strategies and Learning in Public Goods Experiments", James Andreoni is concerned with determining the cause of "free riding" in game economics experiments. Free riding is the term used to describe behavior where an individual acts in his own self-interest at the expense of the group because this will result in greater benefits to him, even though the optimal behavior is for all members of the group to act in the group's interest. For example, the classic experiment in free riding studies involves forming groups of five subjects and giving each member of the group 50 "tokens". Tokens can only be redeemed for cash by investing them in one of two funds: an "Individual Exchange" fund returns 1 cent to the investor and nothing to his group members, whereas a "Group Exchange" fund returns a half cent to the investor and everyone else in the group. Note that the game involves the same general situation as the Tragedy of the Commons; the rational individual will choose to invest in his Individual Exchange fund since that always returns 1 cent to him (and he'll also gain from any Group Exchange investments made by his teammates), but the best strategy for the group as a whole is for all members to invest all their money in the Group Exchange, since that returns 2.5 cents to each group member. It's also important to note that none of the participants are allowed to communicate about the game, so they cannot, for example, all agree to follow the optimal strategy.
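The incentive structure is easier to see with the arithmetic written out. The sketch below only encodes the numbers given above (five players, 50 tokens each, 1 cent per token kept in the Individual Exchange, half a cent to every group member per token in the Group Exchange); the function name and its parameters are made up for illustration and do not come from Andreoni's paper.

```python
def payoff_cents(own_group_tokens, others_group_tokens, endowment=50,
                 individual_rate=1.0, group_rate=0.5):
    """Earnings in cents for one player in one round of the game.

    own_group_tokens: tokens this player puts in the Group Exchange.
    others_group_tokens: Group Exchange tokens from each of the other players.
    Whatever the player does not put in the Group Exchange goes to the
    Individual Exchange.
    """
    kept = endowment - own_group_tokens
    group_total = own_group_tokens + sum(others_group_tokens)
    return kept * individual_rate + group_total * group_rate

everyone_cooperates = payoff_cents(50, [50, 50, 50, 50])  # 125.0 cents each
everyone_free_rides = payoff_cents(0, [0, 0, 0, 0])       # 50.0 cents each
lone_free_rider = payoff_cents(0, [50, 50, 50, 50])       # 150.0 cents
lone_cooperator = payoff_cents(50, [0, 0, 0, 0])          # 25.0 cents
```

Whatever the others do, each token a player moves into the Group Exchange costs him half a cent (he gives up 1 cent and gets 0.5 back), even though it adds 2.5 cents to the group's total. That is why free riding is individually rational and collectively costly.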
Andreoni notes that all experiments of this type tend to result in a convergence on free riding behavior when the game is played repeatedly, whereas the behavior is evenly divided between free riding and public interest when the games are "one shot" deals. He describes two theories for why this occurs. The "learning hypothesis" postulates that individuals don't initially understand the game, but, over time, they learn that free riding is the optimal strategy for their personal gain. The "strategies hypothesis" goes further and claims that some individuals learn the optimal strategy but seek to further maximize their gains by occasionally contributing to the Group Exchange fund; this prevents others from learning the game or realizing that they are playing rationally, so that their groupmates will contribute more to the Group Exchange fund. However, at some point near the end they will "bail out" and stop contributing to the group fund, which explains the end-game free riding behavior.
To test these theories, Andreoni ran an experiment where a control group played the normal repeated game (he calls this group the "Partners"), whereas a variable group played a version of the game where the participants in each group were randomly shuffled at each iteration, so that effectively each player was just playing a series of "one shot" games (he calls this group the "Strangers"). This eliminated the effects of the strategies hypothesis (it makes no sense to try to "psych out" your groupmates if you are assigned new groupmates after each iteration) so that only learning could explain any observed behavior in the variable group. Moreover, both groups were subjected to an unexpected "restart" where the game continued for three additional rounds after they were told it would end. This was to isolate the learning hypothesis; if learning was primarily responsible neither group should be affected since both had already learned the game. If either group was affected, then something else had to be going on.
Andreoni summarizes his results as a series of six observations:
- Investments in the Group Exchange were greater among the Strangers than among the Partners in all rounds. This is evidence against the strategies hypothesis, which predicts that the Partners' investments would always be greater, since players invest more in the Group Exchange when they are fooling or being fooled.
- The percentage of Partners who free ride is greater than that of Strangers in all the rounds, with the greatest difference in the last round. This is more evidence against the strategies hypothesis.
- The Partners give the least in the last round, but most have still not reached the free riding equilibrium (the amount they would contribute as the number of rounds approaches infinity). Once again, more evidence against "strategies".
- The Strangers give more than the Partners in the last round. If learning is all that's going on, then the Partners must be learning much faster than the Strangers, which is hard to accept. This suggests that learning is not solely responsible for the trend towards free riding.
- The Strangers appeared to be only temporarily affected by the restart.
- The Partners are affected, and return to high levels of investment in the Group Exchange in the restart. This appears to have a lasting effect. Remember that the learning hypothesis predicts that neither group will be affected if it is the sole cause of the free riding trend.
Andreoni summarizes by asserting that his experimental data contradicts both the learning and the strategies hypotheses as accounting for the tendency towards free riding.
Finally, he concludes with a discussion on what theories and new experiments may account for free riding behavior. For example, he points out that it is possible that participants may have learned the optimal behavior for the "one shot" game, but don't yet understand the dynamics of the repeated game. This would help explain the fluctuations after the restart, since participants may not recognize that the continuation of the game should not affect their behavior. He also suggests the examination of theories of "non-standard behavior" that consider the reasons subjects make their decisions rather than just the characteristics of the equilibrium. For example, subjects may get non-monetary pleasures from cooperating. Or they may be enforcing social norms on cooperation by trying to punish those who don't contribute to the Group Exchange. Or participants who do worse than expected may play more cautiously and not contribute as much to the Group Exchange. In conclusion, Andreoni calls for an empirical examination of a broader range of alternatives such as these.
Note to Phone Manufacturers: Innovate!
design, writing & communication
September 03, 2003, 07:47 PM
Why is it that phone manufacturers can't seem to create a phone that contains a single innovative, useful, and usable feature to save their lives? Even cell phones, which certainly have quite a bit of market variety (too much; I just spent ages trying to find the right one for me), don't seem to produce much real innovation. It's all well and good that the phone can synch with my address book (or so claims the box) but if I have to go through some godawful 12 step process to do it, the feature might as well not be there. I want these features to save time, not waste it! And no, "let's stick the internet on our phone!" doesn't count as innovation.
How about adding in the ability to not ring if the caller is not identifiable? I was just interrupted from writing this very post by "Unknown Name, Unknown Number"; the fourth time tonight. Or perhaps speak the name of the caller using text-to-speech? My phone should be my ally in warding off the unwelcome telemarketers, not collaborating with them.
So come on, guys and gals. I know cell phones are new, but a little interaction design and usability would do wonders for them. And if you don't hurry up, Neema just might beat you to it...
The Tragedy of the Commons
politics, society & sociology, systems
September 03, 2003, 12:38 PM
Last night, I read a paper for CSCW on "The Tragedy of the Commons", which I believe is the seminal paper that applied this concept to modern political science. If ever there was an argument that knocked down the "Invisible Hand will fix everything" theory of capitalism (still popular among some of the more overzealous libertarians), this is it.
The paper is very good and pretty readable, so I'd strongly suggest you read it. But since I know most of you won't, here's a summary of the argument, as applied to the problem of overpopulation:
- As the number of people increases, the amount of available resources on this planet per capita will decrease.
- Since population, unchecked, grows exponentially, we will soon reach a point where there are only enough resources per capita for bare subsistence, so no one will have sufficient resources to enjoy life (in reality I'd argue this is an unlikely scenario; it's much more likely that a minority will have sufficient resources and the majority will not have enough to survive, but the important point is that either scenario is bad news).
- Therefore, overpopulation is a big problem.
- However, from the perspective of the rational self-interested individual actor (whom the invisible hand is supposedly guiding), having another child is almost entirely beneficial, since the burden of overpopulation is distributed evenly among all humans and thus is negligible to him, whereas not having another child is entirely detrimental to him.
- Therefore, the rational self-interested individual actor will conclude that he should always have another child.
- And, of course, so will all the other rational, self-interested actors out there.
- Therefore, the overpopulation problem will continue to get exponentially worse until the subsistence scenario described in the second point occurs.
- Sucks to be us.
This is an insight that exposes a core problem confronted by modern economists and other social thinkers. It proves that a blind, simplistic trust in markets is misguided, and that the world is much more complicated than most laissez faire propaganda would have you believe.
This isn't to say that markets are bad, of course, just that they need to be fixed sometimes. The interesting debates are over how to do this; how to balance freedom with responsibility. But in order to engage in this debate, you first have to accept the reality that it must occur.
Hardin claims that attempts to convince people through education that they must voluntarily reject the course of action that benefits them the most for the sake of society as a whole are inherently flawed and unworkable. He argues:
If we ask a man who is exploiting a commons to desist "in the name of conscience," what are we saying to him? What does he hear? — not only at the moment but also in the wee small hours of the night when, half asleep, he remembers not merely the words we used but also the nonverbal communication cues we gave him unawares? Sooner or later, consciously or subconsciously, he senses that he has received two communications, and that they are contradictory: 1. (intended communication) "If you don't do as we ask, we will openly condemn you for not acting like a responsible citizen"; 2. (the unintended communication) "If you do behave as we ask, we will secretly condemn you for a simpleton who can be shamed into standing aside while the rest of us exploit the commons."
Similar to the Prisoner's Dilemma, the system ensures that the people trapped in it cannot escape through cooperation unless they can be convinced to all trust one another, an unlikely scenario.
The only alternative, Hardin argues, is "Mutual Coercion Mutually Agreed Upon". Essentially this involves stamping out the commons and replacing it with systemic structures that either carve out portions and allocate them to individuals ("private property") or artificially make the costs of abusing the commons much higher than the benefits (fines, incarceration, etc). This entails giving up freedoms, but it's a mutual sacrifice that humans agree to of their own volition. Of course, Hardin skirts the issue of what exactly those people who choose not to enter into the agreement are supposed to do... I don't like this conclusion, but so far I must say his argument is convincing.
The core idea, however, is that we as intelligent actors have to alter the system as a whole to work for our long-term best interests rather than against them. But this introduces problems of its own. Systems are complex, and it is hard to predict all the real effects changes will have. I've mentioned the "Law of Unintended Consequences" before. Hardin himself brings up the phrase "Quis custodiet ipsos custodes?" or "Who shall watch the watchers themselves?". Whenever you introduce changes, there have to be mechanisms in place to enforce those changes. Usually this requires enforcers, who are human too and subject to the same rules as all the other humans in the system. You can't expect enforcers to be angels; they will look out for their own self-interest just the same as everyone else will.
Is it even possible to design perfect systems that can really address all these issues? Or is this just a cleverly disguised technical solution to a "no technical solutions problem", which Hardin deplores in the first few paragraphs? T.S. Eliot once said "It is impossible to design a system so perfect that no one has to be good". Where, then, lies the hope for lasting betterment?
Novice Teachers
teaching & learning
September 02, 2003, 11:53 AM
I'm reading an article called "Information Ecologies" for Jodi Forlizzi's Visual Interface and Interaction Design class. As a side point, the authors mention a company, Farallon, that has developed an interesting hiring practice for technical support persons. Rather than hiring highly technical people, they hire people with little or no computer experience, like former cocktail waitresses and social workers. They focus on hiring people who are "natural teachers" because they found that the technical people, while they understood the computer, didn't understand how to teach others about it. In the words of their technical support manager, "You can teach people to use a computer but it's real hard to teach patience. I look for natural born teachers because that's what they're doing all day".
Although I'm not sure if I agree with the phrase "natural born teacher" (a term that would irritate Matt, if I'm not mistaken), I do believe Farallon has a great idea here. It's well-known that the people who tend to be the best at teaching concepts are those who just learned them, not the long-time experts. This is because the newly minted expert is much more familiar with the novice's way of thinking; he understands what it is like to not understand. The long-time expert has forgotten this experience; to her, the novice's mindset has become alien and thus she has great difficulty communicating concepts in terms the novice will understand. I can relate; I find it very hard to teach introductory programming concepts to new programmers because these concepts are so second-nature to me. It almost seems odd that anyone could not understand them. And I'm a relatively patient teacher; many technical support people are easily frustrated by "inane" customer questions, as their culture makes quite clear.
The backwards thing is that our society assumes that it's ideal to have subject-matter experts teach such introductory material. This is the idea behind the teaching university, where top researchers in the field teach undergraduates who are just beginning to grasp the core concepts. And my 5+ years as a student in academia have borne out this theory; I've found that these researchers tend to be excellent teachers when they are directly teaching their research. At teaching the more introductory material (which is most of an undergrad's education), they tend to suck.
Analyzing Communities
internet, society & sociology
September 01, 2003, 12:57 AM
The fall semester has begun here at CMU, and I'm taking a class in Computer-Supported Cooperative Work (CSCW) from Bob Kraut. This semester, the class is focusing on Designing Online Communities and Bob is co-teaching it with Paul Resnick, a recommender systems (think Amazon.com ratings) expert who is visiting from the University of Michigan.
In the first class, which was last Friday, we discussed several types of social structures that exist both in the real world and online:
- Groups are a small number of people who come together to accomplish well-defined goals and have a specific, agreed-upon purpose. Most of the teams you may have worked with for projects in a class or at a workplace fall into this category.
- Voluntary associations are generally larger collections of people who all share common goals or interests and have agreed to congregate around those interests. Unlike groups, the goals of a voluntary association may not be very well-defined. Voluntary associations tend to last longer than groups, however; many may even have indefinite lifespans. An example is your local chapter of the YMCA.
- Communities are like voluntary associations but might be more loosely organized. I don't believe we talked much about this category, so I don't recall a whole lot of distinguishing features. I'd imagine communities involve people coming together to socialize and share in each other's lives, and are potentially even longer-running than voluntary associations (you may belong to some communities for your entire life).
- Third places are locations people go to socialize that are separate from the home and the workplace (which are the first and second places). Third places are characterized by people coming together to revel in the uniqueness of each other's personalities; the patrons of the third place go for each other and not as much because the place is enjoyable per se. An example would be a neighborhood bar. Cheers is kind of the prototypical third place.
- Social networks are collections of people who are associated with one another through social interactions such as friendships, working relationships, etc. Your social network defines who you talk to, who you can ask favors of, who you can get information out of, etc. Bob went into an interesting digression where he showed a drawing of a social network some sociologists had observed; he pointed out how the network formed certain "clusters" that were only connected to other clusters through a single link between one node in each. Bob remarked that the two people who formed that link were frequently (de facto) powerful individuals since they controlled the communication between those two social clusters. This is especially true if the groups must frequently exchange important information, since they get to play gatekeeper for that information.
- Social capital describes the sense of trustworthiness and shared identity that people feel towards one another. Your social capital is a measure of this sense that people experience towards you; if you have high social capital among a certain group of people they will tend to value what you say and listen to you; if you have a low social capital, they're more likely to ignore you. This is a different way of looking at social relationships than the more structural ones I mentioned before.
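The "single link between clusters" idea from the social networks item can be made concrete with a little graph code. This is only a toy sketch: the people and friendships are invented, the function names are hypothetical, and the brute-force approach (drop each link in turn and see whether the network falls apart) is chosen for readability, not efficiency. It assumes the network starts out as one connected piece with no duplicate links.

```python
from collections import defaultdict

def is_connected(nodes, edges):
    """Return True if every node can be reached from every other node."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        person = stack.pop()
        for friend in neighbors[person]:
            if friend not in seen:
                seen.add(friend)
                stack.append(friend)
    return seen == set(nodes)

def gatekeeper_links(edges):
    """Return the links whose removal would split the network in two."""
    nodes = {person for edge in edges for person in edge}
    return [
        edge for edge in edges
        if not is_connected(nodes, [other for other in edges if other != edge])
    ]

# Two tight clusters of three people, joined only through Cat and Dan.
friendships = [
    ("Ann", "Bob"), ("Bob", "Cat"), ("Cat", "Ann"),
    ("Dan", "Eve"), ("Eve", "Fay"), ("Fay", "Dan"),
    ("Cat", "Dan"),
]
links = gatekeeper_links(friendships)
```

Here `links` comes out as `[("Cat", "Dan")]`: Cat and Dan are the two people who control all communication between the clusters.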
Bob also made an interesting point about how physical architectures (as in, the structures of buildings) define the environment in which a community operates and thus have a large influence on the community itself. The readings drew an analogy between the building architectures and city plans (the kinds of things Alexander discusses) that physical communities live in and the design of the software systems that virtual communities live in. The suggestion, of course, is that the software system design influences these virtual communities in a similar fashion. This was really interesting to me; I had originally thought of going to grad school to study online communities in general and specifically how the design of social software systems influenced the way their communities operated. I hope we'll talk more about this issue in later classes.
Neema is also in the class and has posted his thoughts on his weblog. Chad is taking the course as well; hopefully he'll put up a few reflections as the semester wears on. If you, dear reader, happen to be a member of the class and have a weblog of your own, please post it in a comment below or send me a trackback ping. I'd like to keep tabs on what other people think of the course.
Posted by Andy Edmonds on September 08, 2003 at 10:36 PM
Summary is my fav for low cost commercial solutions and tracks search terms to pages and does good path analysis:
http://www.summary.net
AwStats looks better than webalizer by a touch in the demos:
http://awstats.sourceforge.net/
Posted by Rob on September 22, 2003 at 09:18 PM
Thanks Andy!
Dan told me in meatspace today that he's been experimenting with Sawmill and that he's pretty happy with it. I might have to check out that one too once I get a break from this never-ending work...
Posted by Jehiah on July 30, 2004 at 04:10 PM
I know this topic is old, but I had the same problem finding something for Path Analysis of my log files, and I ended up developing my own application. If you want to check out what I have for PathAnalysis please do, and let me know if you see any ways for me to improve it.