I've read a lot of interesting papers this past week that deal with information seeking in software development. After reviewing the literature in this space, it strikes me as a good area in which to position my work in software project visualization. The work falls into three major categories: exploratory studies of information seeking among developers, applications, and frameworks.
Exploratory Studies
Information Needs in Collocated Software Development Teams: Ko et al. report on an exploratory case study of developers at Microsoft that asked: what information do software developers seek; where do developers seek this information; and what prevents them from finding information? The authors observed 17 developers in 90-minute sessions and coded their information-seeking activities by type. Although these sessions were relatively short, some interesting patterns emerged. What I found most interesting is that developers' questions about historical design rationale were rated as highly important, but they were often deferred because the information wasn't readily available. Also, although developers rated coworker awareness as unimportant, they frequently sought this information and it was readily available.
Maintaining Mental Models: A Study of Developer Work Habits: LaToza et al. conducted surveys of software developers at Microsoft to gather information on the activities they undertake during their work and interviewed developers about the problems they encounter. The survey asked developers to indicate the fraction of time they spent on various activities (based on a taxonomy developed by the authors) and to rate the seriousness of various problems in software development (as proposed by the authors). The developers rated these problems as most serious:
- understanding the rationale behind existing code
- having to switch tasks because of manager or teammate requests
- being aware of changes elsewhere
- finding code duplicates
Follow-up interviews on these problems revealed a general theme that "Developers go to great lengths to create and maintain rich mental models of code that are rarely permanently recorded." It seems that seeking out the relevant information to build a mental model carries a very high cost. It's not particularly surprising, but it's interesting to see this theme come to light based on empirical evidence from a large software company. The ugly implication is that every developer goes through this model-building process, and the effort is never reinvested in making things easier for newcomers who go through the same process later.
An Exploratory Study of How Developers Seek, Relate, and Collect Relevant Information during Software Maintenance Tasks: Ko et al. performed a study of 10 developers working on five maintenance tasks on a small, unfamiliar piece of software in a laboratory setting. I'm most interested in how developers seek information in the rich context of an industrial setting, so this study was less interesting to me. One interesting finding is that for the vast majority of tasks, developers "began with a textual search for what [they] perceived to be a task-relevant identifier in the code." If these textual search activities are already an integral part of a developer's normal workflow, then maybe they'll be interested in trying out other search tools without too much coercion.
Applications
Ligature: Combining node-and-link graph rendering with a timeline for sensemaking in software development repositories: Gina Venolia describes a prototype visualization for software project artifacts and the relationships between them in a Microsoft tech report. Ligature shares some common ground with our work on threading software project histories. I'd love to see an evaluation of a tool like this in front of real developers in an industrial setting.
Backstory: A Search Tool for Software Developers Supporting Scalable Sensemaking: Gina Venolia describes a prototype search tool for software developers in a Microsoft tech report.
"My goal is to help developers make better use of the written resources where knowledge may be lying fallow, and so reduce the need to interrupt teammates and increase knowledge flow within the team."
The paper frames Backstory as a tool for conducting sensemaking investigations, specifically root-cause analysis of software defects. In contrast with Ligature, Backstory emphasizes the content of artifacts over their structure. This seems like an interesting tool and it'd be nice to see it evaluated by developers.
FASTDash: a visual dashboard for fostering awareness in software teams: Biehl, Czerwinski, et al. describe and evaluate a dashboard UI that displays indications of team activity with the goal of increasing awareness among team members. The tool emphasizes source code modifications as the primary indicator of team activity. The authors evaluated the tool using a pre/post observation design where observers coded communication between team members while they were working in situ. The authors also administered a pre/post situational awareness test. Surprisingly, using the tool resulted in increased communication among team members. According to the situational awareness test, there was a significant reduction in attentional demands and in ratings of the instability of the situation when the visualization was present. I'd be curious to get more details on the nature of the communication that increased while using FASTDash: is this useful communication, or is the tool acting as a degenerative proxy for meddling and micro-managing? I'm also a little curious about the pre/post awareness test that the authors used. These results seem to imply that awareness is important for developers, whereas the Ko et al. case study described above found that developers perceived awareness as relatively unimportant. Given this apparent discrepancy, I think further study of the importance of awareness in software development would be interesting.
Frameworks
A socio-technical framework for supporting programmers: Ye et al. propose a framework to guide the design of tools that support information seeking in software development. The framework views individuals as "information resources equally important as code, documents, and various kinds of information repositories." In other words, a software project is conceptualized as a "socio-technical information space." The paper defines this space simply as code, documents, programmers, and the relationships between these three elements. The most interesting part of the paper for me is the implication that individuals could be rendered in the same fashion as artifacts in an information seeking tool.
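To make that last point concrete, here is a minimal sketch of how such a space might be modelled. This is my own illustration, not anything taken from the paper: the names Kind, Resource, and InformationSpace, and the example resources, are all hypothetical. The only idea borrowed from the framework is that programmers sit in the same structure as code and documents, connected by plain relationships.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    # The three element types the paper lists; the enum itself is an assumption.
    CODE = "code"
    DOCUMENT = "document"
    PROGRAMMER = "programmer"


@dataclass(frozen=True)
class Resource:
    # People and artifacts share one type, so a tool can treat them uniformly.
    kind: Kind
    name: str


@dataclass
class InformationSpace:
    # Adjacency map: each resource points to the set of resources it relates to.
    links: dict = field(default_factory=dict)

    def relate(self, a: Resource, b: Resource) -> None:
        # Relationships are stored symmetrically (an assumption of this sketch).
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def related(self, resource: Resource, kind: Kind = None) -> list:
        # Return the neighbours of a resource, optionally filtered by kind.
        neighbours = self.links.get(resource, set())
        if kind is not None:
            neighbours = {r for r in neighbours if r.kind is kind}
        return sorted(neighbours, key=lambda r: (r.kind.value, r.name))


# Hypothetical example data.
space = InformationSpace()
parser = Resource(Kind.CODE, "parser.py")
spec = Resource(Kind.DOCUMENT, "grammar-spec.txt")
alice = Resource(Kind.PROGRAMMER, "alice")
space.relate(parser, spec)
space.relate(parser, alice)

for r in space.related(parser):
    print(r.kind.value, r.name)
```

Because a programmer is just another Resource, the same query that finds documents related to a file also finds the people related to it, which is roughly what rendering individuals "in the same fashion as artifacts" would require of the underlying data.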