After three very hectic first months of 2012, the final version of my Ph.D. thesis has been submitted and I’ve gone through the graduation ceremonies. From the 1st of April I will be a postdoctoral associate in bioinformatics at the National Institute of Biomedical Innovation in Osaka, Japan. I will comment further on my Ph.D. experience and my entry into bioinformatics when I can.
Being a Ph.D. student in the Honiden laboratory has been a great experience, and I am very grateful to Professor Honiden and to the other lab members for their support.
My thesis and the associated slides are available. The abstract is as follows.
In the last few decades, software systems have become less and less atomic, and increasingly built according to the component-based software development paradigm: applications and libraries are increasingly created by combining existing libraries, components and modules. Object-oriented programming languages have been especially important in enabling this development through their essential feature of encapsulation: separation of interface and implementation. Another enabling technology has been the explosive spread of the Internet, which facilitates simple and rapid acquisition of software components. As a consequence, now, more than ever, different parts of software systems are maintained and developed by different people and organisations, making integration and reintegration of software components a very challenging problem in practice.
One of the most popular and widespread object-oriented programming languages today is the Java language, which through features such as platform independence, dynamic class loading, interfaces, absence of pointer arithmetic, and bytecode verification, has simplified component-based development greatly. However, we argue that Java encapsulation, in the form supported by its interfaces, has several shortcomings with respect to the need for integration. API clients depend on the concrete forms of interfaces, which are collections of fields and methods that are identified by names and type signatures. But these interfaces do not capture essential information about how classes are to be used, such as usage protocols (sequential constraints), the meaning and results of invoking a method, or useful ways for different classes to be used together. Such constraints must be communicated as human-readable documentation, which means that the compiler cannot by itself perform tasks such as integrating components and checking the validity of an integration following an upgrade. In addition, many trivial interface changes, such as the ones that may be caused by common refactorings, do not lead to complex semantic changes, but they may still lead to compilation errors, necessitating a tedious manual upgrade process. These problems stem from the fact that client components depend on exact syntactic forms of interfaces they are making use of. In short, Java interfaces and integration dependencies are too rigid and capture both insufficient and excessive information with respect to the integration concern.
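To make the point about usage protocols concrete, here is a small plain-Java sketch of my own; it is not taken from the thesis. The `Channel` interface, the `BufferedChannel` class and the `ProtocolDemo` class are hypothetical names invented for this illustration. The interface says nothing about the rule that `open()` must precede `send()`, so the compiler accepts a client that breaks the rule, and the mistake only shows up at run time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: the signatures say nothing about call order.
interface Channel {
    void open();
    void send(String message);
    void close();
}

// Hypothetical implementation that can only enforce the protocol at run time.
class BufferedChannel implements Channel {
    private boolean isOpen = false;
    private final List<String> sent = new ArrayList<>();

    @Override
    public void open() {
        isOpen = true;
    }

    @Override
    public void send(String message) {
        if (!isOpen) {
            throw new IllegalStateException("send() called before open()");
        }
        sent.add(message);
    }

    @Override
    public void close() {
        isOpen = false;
    }

    List<String> sentMessages() {
        return sent;
    }
}

class ProtocolDemo {
    // This client type-checks, yet it violates the undocumented protocol.
    static boolean misuseFails() {
        Channel channel = new BufferedChannel();
        try {
            channel.send("hello");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // This client follows the protocol: open, send, close.
    static List<String> correctUse() {
        BufferedChannel channel = new BufferedChannel();
        channel.open();
        channel.send("hello");
        channel.close();
        return channel.sentMessages();
    }

    public static void main(String[] args) {
        System.out.println("misuse rejected at run time: " + misuseFails());
        System.out.println("correct use sent: " + correctUse());
    }
}
```

Both clients look equally valid to the compiler. The difference between them exists only in documentation, which is the kind of information the abstract argues is missing from interfaces.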
We propose a Java extension, Poplar, which enriches interfaces with a semantic label system, which describes functional properties of variables, as well as an effect system. This additional information enables us to describe integration requests declaratively using integration queries. Queries are satisfied by integration solutions, which are fragments of Java code. Such solutions can be found by a variety of search algorithms; we evaluate the use of the well-known partial order planning algorithm with certain heuristics for this purpose. A solution is guaranteed to have at least the useful effects requested by the programmer, and no destructive effects that are not permitted. In this way, we generate integration links (solutions) from descriptions of intent, instead of making programmers write integration code manually. When components are upgraded, the integration links can be verified and accepted as still valid, or regenerated to conform to the new components, if possible. The design of Poplar is such that verification and reintegration can be carried out in a modular fashion. Poplar aims to provide a sound must-analysis for the establishment of labels, and a sound may-analysis for the deletion of labels. We describe the semantics of Poplar informally using examples, and provide a formal specification of Poplar, which is based on Middleweight Java (MJ). We describe an implementation of a Poplar integration checker and generator, called Jardine, which compiles Poplar code to pure Java. We evaluate the practical applicability of Jardine through a case study, which is carried out by refactoring the JFreeChart library. We also discuss the applicability of Poplar to Martin Fowler’s well-known collection of refactorings. Our results show that Poplar is highly applicable to a wide range of refactorings and that the evolution of integrated components becomes considerably simpler.
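For readers who want a feel for the idea of finding a solution from labels and effects, the following is a rough toy model in plain Java. It is not Poplar syntax and not the Jardine implementation, and it uses a simple breadth-first forward search instead of the partial order planning evaluated in the thesis. All names here (`Action`, `ToyIntegrator`, the label strings and the action names) are made up for the illustration. Each action requires some labels, establishes some, and may delete some. A query asks for goal labels to be established while a set of protected labels is never deleted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical model of a method: labels it needs, establishes, and may delete.
final class Action {
    final String name;
    final Set<String> requires;
    final Set<String> adds;
    final Set<String> deletes;

    Action(String name, Set<String> requires, Set<String> adds, Set<String> deletes) {
        this.name = name;
        this.requires = requires;
        this.adds = adds;
        this.deletes = deletes;
    }
}

final class ToyIntegrator {
    // Breadth-first search for a shortest sequence of actions that establishes
    // every goal label and never uses an action that deletes a protected label.
    static Optional<List<String>> solve(Set<String> initial, Set<String> goal,
                                        Set<String> protectedLabels, List<Action> actions) {
        Set<String> start = new HashSet<>(initial);
        Deque<Set<String>> queue = new ArrayDeque<>();
        Map<Set<String>, List<String>> plans = new HashMap<>();
        queue.add(start);
        plans.put(start, new ArrayList<>());

        while (!queue.isEmpty()) {
            Set<String> state = queue.remove();
            List<String> plan = plans.get(state);
            if (state.containsAll(goal)) {
                return Optional.of(plan);
            }
            for (Action action : actions) {
                if (!state.containsAll(action.requires)) {
                    continue; // required labels are not established
                }
                if (!Collections.disjoint(action.deletes, protectedLabels)) {
                    continue; // destructive effect that is not permitted
                }
                Set<String> next = new HashSet<>(state);
                next.removeAll(action.deletes);
                next.addAll(action.adds);
                if (plans.containsKey(next)) {
                    continue; // label set already reached by a plan at least as short
                }
                List<String> nextPlan = new ArrayList<>(plan);
                nextPlan.add(action.name);
                plans.put(next, nextPlan);
                queue.add(next);
            }
        }
        return Optional.empty(); // no sequence of actions satisfies the query
    }

    // Made-up actions over a list: two ways to remove duplicates, one way to sort.
    static List<Action> demoActions() {
        return List.of(
            new Action("sort", Set.of("list"), Set.of("sorted"), Set.of()),
            new Action("dedupeViaHashSet", Set.of("list"), Set.of("unique"), Set.of("sorted")),
            new Action("dedupeAdjacent", Set.of("sorted"), Set.of("unique"), Set.of()));
    }

    public static void main(String[] args) {
        Optional<List<String>> plan = solve(
            Set.of("list"), Set.of("sorted", "unique"), Set.of("sorted"), demoActions());
        System.out.println(plan);
    }
}
```

In this toy query the label `sorted` is protected, so the search rejects `dedupeViaHashSet` and the only remaining route is `sort` followed by `dedupeAdjacent`. If a component upgrade renamed or replaced one of these actions, the same query could simply be run again, which is a very simplified picture of the reintegration step described in the abstract.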