Proposals

Marc Roper and Murray Wood, Department of Computer and Information Sciences, The University of Strathclyde, Glasgow

1. Cost Estimation
The difficulty of producing accurate estimates for software projects is well known and persists in industry. Broadly, current estimation techniques fall into two categories: those based on a model (e.g. COCOMO) and those that base their estimates on data from previous projects. Of these, techniques that draw on previous projects and estimate by analogy (e.g. Case-Based Reasoning, CBR) have been shown to be more accurate. However, neither approach has yet been widely adopted by industry.
Our particular interest in this area lies in providing accurate cost estimates for web-based applications. We have already been exploring the application of CBR to one set of project data, but a major barrier to work in this area is the lack of data sets to experiment with. This may also be a barrier to industrial adoption, since data is needed to provide convincing evidence that an approach works. If the Observatory holds accurate records of development data describing project characteristics (nature of the project, languages and platforms used, details of developers, actual development time, etc.), these could be a very valuable source of data for further experimentation with techniques such as CBR, both to refine their accuracy and to explore the practicalities of applying them.
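To illustrate the kind of analogy-based estimate that such project records would support, the following is a minimal sketch in Python. It is not our CBR implementation: it assumes a simple nearest-neighbour retrieval over numeric project features, and the names used (Project, estimate_effort, the feature keys and the effort figures in the usage note) are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Project:
    """A completed project: numeric characteristics plus the actual effort."""
    name: str
    features: dict[str, float]   # hypothetical keys, e.g. "pages", "developers"
    effort_hours: float


def estimate_effort(target: dict[str, float], past: list[Project], k: int = 2) -> float:
    """Estimate effort for `target` as the mean effort of its k most similar past projects."""
    if not past:
        raise ValueError("at least one past project is required")
    keys = sorted(target)

    # Min-max range of each feature, so that no single feature dominates the distance.
    lo = {f: min([p.features[f] for p in past] + [target[f]]) for f in keys}
    hi = {f: max([p.features[f] for p in past] + [target[f]]) for f in keys}

    def scale(f: str, value: float) -> float:
        span = hi[f] - lo[f]
        return 0.0 if span == 0 else (value - lo[f]) / span

    def distance(p: Project) -> float:
        # Euclidean distance over the scaled features.
        return sqrt(sum((scale(f, p.features[f]) - scale(f, target[f])) ** 2 for f in keys))

    nearest = sorted(past, key=distance)[:k]
    return sum(p.effort_hours for p in nearest) / len(nearest)
```

With three invented past projects of 10, 12 and 50 pages (100, 120 and 600 hours of effort respectively), a new 11-page project estimated with k=2 would take the mean of the two small projects. A richer data set would allow experiments with categorical features such as language and platform, with feature weighting, and with different values of k.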

2. Recommender Systems
We have been exploring the problem of developer awareness in distributed systems. Understanding who is doing what on a system is difficult enough in co-located teams, and it becomes a significant problem when teams are distributed in space and possibly also in time. To this end we have developed and implemented a model, CRI (Continuum of Relevance Index), which monitors all developer interactions in Eclipse for every member of a software development team. The CRI system then uses this data to calculate the tasks, artifacts and developers most relevant to whatever task or artifact a developer is currently working on. We currently have data on all relevant Eclipse interactions (creates, views, updates and deletes) from a small number of short-term student projects, which may be mined to create project histories or social networks of developers, tasks and artifacts. It would be possible to upload this data to the Observatory, or alternatively to install the CRI tool to gather more data from longer-term projects in the future. There is interest in continuing this work and extending it into the more general area of recommender systems to support a wider set of development and maintenance activities.
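As an indication of how interaction data of this kind might be mined, the following Python sketch builds a simple developer network from a list of interaction events. It does not reproduce the CRI model or its data format: the event layout (developer, action, artifact), the function names and the sample names in the usage note are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Assumed event layout: (developer, action, artifact), where action is one of
# "create", "view", "update" or "delete".
Event = tuple[str, str, str]


def interaction_counts(events: list[Event]) -> Counter:
    """Count how many times each developer interacted with each artifact."""
    counts: Counter = Counter()
    for developer, _action, artifact in events:
        counts[(developer, artifact)] += 1
    return counts


def developer_network(events: list[Event]) -> Counter:
    """Link two developers once for every artifact that both have touched."""
    developers_by_artifact: dict[str, set[str]] = defaultdict(set)
    for developer, _action, artifact in events:
        developers_by_artifact[artifact].add(developer)

    edges: Counter = Counter()
    for developers in developers_by_artifact.values():
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(developers), 2):
            edges[(a, b)] += 1
    return edges
```

For example, if developers "ann" and "bob" have both interacted with two files, the edge ("ann", "bob") carries a weight of 2. Edge weights of this kind give one crude notion of which developers are relevant to one another; the same events could equally be grouped by task or ordered by time to produce project histories.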