context-aware access to a software project memory

In my blog’s new spirit of academic openness, I’ve decided to post some details about a project I did for a human-computer interaction course last term. The work is admittedly rough and a lot more would need to be done to pursue it further, but I think the idea is interesting and I’d love it if something like this were to come to life some day. If you’re interested enough, you can read the paper I wrote about this project. The following is a summary.

This work was motivated by my experiences in industry, where the act of building software produced vast numbers of artifacts like wiki pages, meeting notes, emails, tickets, bug reports, stick-and-arrow drawings, and maybe even a note on the back of a napkin with a pint glass imprint. The historical aggregation of these artifacts forms a kind of software project memory.

In my particular domain, these artifacts were often of an informal nature. When I actually got down to writing code, I’d have these artifacts scattered about my workspace and I’d be digging through them and referring to them constantly to guide my implementations, especially in writing test cases. Essentially, this loose collection of informal artifacts defined the specifications for the code I had to write. Wouldn’t it be awesome if the artifact containing the information I needed at any particular time could be automatically discovered and recommended to me?

One way to do this would be to use a developer’s context to automatically find relevant artifacts in the project memory. Hipikat allows developers to manually query a project memory to find related artifacts, but I want something that is context-aware and can automatically discover artifacts related to the task that I’m working on.

The problem is that automatically determining a developer’s context is a tricky thing. A developer might be juggling numerous tasks at once: checking email, talking on the phone, chatting with other developers when they pop by the office, writing documentation, and writing code. All of these tasks contribute to a developer’s context and some of them are difficult to sense using software. So, because context is such a tricky thing, I chose to restrict my discussion of context to a developer’s interaction with an Integrated Development Environment (IDE) like Eclipse.

I came up with a few heuristics that a system could use to infer context in an IDE. Each heuristic definition consists of an action in the IDE (a contextual cue) and an associated method of query into the software project memory. For example, let’s imagine that a developer is tasked with implementing some new feature. Based on discussions with colleagues, she knows that she has to make use of a class called DefaultHttpMethodRetryHandler, but she has no idea how to instantiate or use this class. So, she just starts typing the name of the class in her IDE. The IDE senses that she’s working with this class and automatically triggers a query consisting of the name of the class. The IDE recommends the following relevant artifacts:

  • A tutorial document with a code sample utilizing DefaultHttpMethodRetryHandler
  • API documentation for DefaultHttpMethodRetryHandler
  • The source code file that implements DefaultHttpMethodRetryHandler
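
To make the idea of a heuristic a bit more concrete, here is a minimal sketch in Python (not the language of my prototype). It pairs a contextual cue with a query, as described above. Everything in it is an assumption made for illustration: the names Artifact, Heuristic, class_name_cue and recommend are hypothetical, the project memory is just an in-memory list with made-up artifact text, and the "query" is a plain substring match rather than a real search.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Artifact:
    title: str
    kind: str  # e.g. "tutorial", "api-doc", "source"
    text: str


@dataclass(frozen=True)
class Heuristic:
    name: str
    # Maps an IDE event (here, just the text being typed) to a query
    # string, or None when the cue does not apply.
    cue_to_query: Callable[[str], Optional[str]]


def class_name_cue(typed: str) -> Optional[str]:
    """Hypothetical cue: the last CamelCase identifier the developer typed."""
    matches = re.findall(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b", typed)
    return matches[-1] if matches else None


def recommend(typed: str, heuristics: List[Heuristic],
              memory: List[Artifact]) -> List[Artifact]:
    """Run every heuristic whose cue fires and collect matching artifacts."""
    results: List[Artifact] = []
    for heuristic in heuristics:
        query = heuristic.cue_to_query(typed)
        if query is None:
            continue
        for artifact in memory:
            # Assumption: naive substring match stands in for a real query.
            if query in artifact.text and artifact not in results:
                results.append(artifact)
    return results


# Made-up project memory, for illustration only.
MEMORY = [
    Artifact("Tutorial", "tutorial",
             "Sample: new DefaultHttpMethodRetryHandler(3, false)"),
    Artifact("API docs", "api-doc",
             "Class DefaultHttpMethodRetryHandler"),
    Artifact("Source file", "source",
             "public class DefaultHttpMethodRetryHandler { }"),
    Artifact("Meeting notes", "notes", "Discussed the release schedule."),
]

HEURISTICS = [Heuristic("class name typed", class_name_cue)]

if __name__ == "__main__":
    hits = recommend("handler = new DefaultHttpMethodRetryHandler(",
                     HEURISTICS, MEMORY)
    for artifact in hits:
        print(artifact.kind, "-", artifact.title)
```

Adding another heuristic would mean adding another cue function to the list; the recommend loop stays the same.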

The following screen capture shows how this case looks in the prototype application that I built as part of this project. Clicking on contextual cues on the left activates associated queries into the software project memory. The application displays relevant artifacts on the right, grouped by artifact type. Of course, it looks nothing like an IDE, but hopefully you get the idea.

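[Screen capture: recommenderdemo-full.jpg, the prototype application with contextual cues on the left and recommended artifacts on the right]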
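
The grouping on the right-hand side of the display is simple enough to sketch. The following Python snippet is only an illustration of the idea, not code from the prototype: group_by_type is a hypothetical helper, and the artifact kinds and titles are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_type(artifacts: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (kind, title) pairs by kind, keeping the original order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for kind, title in artifacts:
        grouped[kind].append(title)
    return dict(grouped)


if __name__ == "__main__":
    # Made-up recommendations, for illustration only.
    recommendations = [
        ("Documentation", "Tutorial"),
        ("Source code", "DefaultHttpMethodRetryHandler.java"),
        ("Documentation", "API reference"),
    ]
    for kind, titles in group_by_type(recommendations).items():
        print(kind)
        for title in titles:
            print("  ", title)
```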

I wrote the application in Objective-C on Mac OS X using Xcode and Interface Builder. It’s loosely based on the Spotlighter example application that is included with the Mac OS X developer tools. It uses the Spotlight search framework to query an artifact repository stored on the developer’s local disk and I seeded the repository with artifacts of different types from the Jakarta Commons HttpClient project.
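
If you want to play with the same kind of query without writing any Objective-C, one rough stand-in is the mdfind command-line tool that ships with Mac OS X, which searches the Spotlight index. To be clear, this is not how my prototype works (it uses the Spotlight framework directly); the Python sketch below is a substitute technique. The function names and the repository path are hypothetical, and on a machine without mdfind the search simply returns nothing.

```python
import shutil
import subprocess
from typing import List


def build_mdfind_command(query: str, repo_dir: str) -> List[str]:
    """Build an mdfind invocation restricted to the artifact repository."""
    # -onlyin limits the Spotlight search to a single directory tree.
    return ["mdfind", "-onlyin", repo_dir, query]


def search_repository(query: str, repo_dir: str) -> List[str]:
    """Return paths of matching artifacts, or [] if mdfind is unavailable."""
    if shutil.which("mdfind") is None:
        return []  # Not on Mac OS X, or Spotlight tools are missing.
    completed = subprocess.run(
        build_mdfind_command(query, repo_dir),
        capture_output=True,
        text=True,
        check=False,
    )
    return [line for line in completed.stdout.splitlines() if line]


if __name__ == "__main__":
    # Hypothetical repository location, for illustration only.
    for path in search_repository("DefaultHttpMethodRetryHandler",
                                  "/tmp/project-memory"):
        print(path)
```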

This recommendation system is entirely automatic, from sensing a developer’s tasks, to inferring context, to querying a project memory, to recommending relevant artifacts. As such, there are bound to be times when a developer finds the system useful and also times when a developer doesn’t need it. The user interface therefore needs to provide an ambient, minimally distracting display of artifact recommendations, especially as the developer’s context changes.

As I said, I think this is an interesting idea. As it stands, however, there are a few issues that I see with taking this work further.

First, I suspect that the usefulness of such a system would vary widely by domain. I conducted a small survey of developers that informed my designs in this project, but the results admittedly suffered from a large sampling bias in that most respondents work for my former employer. In the extreme case, if you’re writing code for a space shuttle, you probably have an air-tight specification sitting in front of you and you don’t need artifact recommendations. More thorough study of how developers use artifacts to guide their implementations in different domains would be useful.

Second, I’ve admittedly kind of waved my hands over how to infer context in an IDE. I came up with a few heuristics, but they’re far from complete. One would have to come up with a bunch more heuristics and evaluate their usefulness.

So, that’s what I came up with. If you’re curious and have questions or comments about this work, definitely check out the paper and get in touch.

