Research Statements
Software architecture [1], the blueprint of a software system that organizes complex functionality in terms of computational entities (components), the communications between those entities (connectors), and their structure, including their mapping to physical hosts (topology), is a discipline that has made great strides in helping software developers manage development, maintenance, and evolution tasks. While many domains, such as distributed and embedded systems, have adopted software architecture as a key part of their overall software engineering methodology, high-performance scientific computing has been slow to adopt both software architecture and general software engineering principles.
A common myth in the high-performance software community is that abstractions that aid the developer come only at an unacceptable performance penalty. Nonetheless, these abstractions could greatly benefit the high-performance scientific developer, and I have found that they do not induce the overhead that domain practitioners fear. I explore several benefits of applying software architecture to high-performance computing below:
High Performance Connectors. Encapsulating the locus of communication in first-class software connectors removes a significant burden from the developer of parallel software. The developer neither needs to be an expert in the type of communication required nor is forced to interlace communication code with computational algorithms, as the prevalent single program, multiple data (SPMD) paradigm requires.
Componentization. Current initiatives to introduce component-based software engineering into scientific application development, such as the Common Component Architecture [2], are evidence that the scientific computing community wants to reuse software functionality. Software architecture further aids the developer of component-based solutions by exposing architectural mismatch [3] and by standardizing interfaces and communications via connectors.
Maintenance and Evolution. Scientific codes are extremely long-lived. This is largely an artifact of an irony: scientific software developers tend to be novice computer scientists, yet they develop some of the most complex code of any computer science sub-domain. Legacy languages such as procedural C and Fortran dominate scientific programming because scientific developers are wary of porting code to new languages despite the potential benefits. Architectures allow new components to be introduced into legacy systems and offer a roadmap for evolution.
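To make the separation described under High Performance Connectors concrete, here is a minimal sketch, assuming a toy sum-of-squares computation. It is not the connector design from my own work. The names Connector, InProcessConnector, sum_of_squares, and run are hypothetical and chosen only for illustration. The in-process connector stands in for a real transport such as message passing.

```python
from abc import ABC, abstractmethod
from typing import Callable, List


class Connector(ABC):
    """Hypothetical first-class connector: owns all communication logic."""

    @abstractmethod
    def scatter(self, data: List[float], parts: int) -> List[List[float]]:
        """Distribute the input among `parts` workers."""

    @abstractmethod
    def reduce(self, partials: List[float]) -> float:
        """Combine the workers' partial results into one value."""


class InProcessConnector(Connector):
    """Assumed stand-in for a real transport; everything runs locally."""

    def scatter(self, data: List[float], parts: int) -> List[List[float]]:
        # Deal elements round-robin so each worker gets a near-equal share.
        return [data[i::parts] for i in range(parts)]

    def reduce(self, partials: List[float]) -> float:
        return sum(partials)


def sum_of_squares(chunk: List[float]) -> float:
    """Component: pure computation with no communication code in it."""
    return sum(x * x for x in chunk)


def run(component: Callable[[List[float]], float],
        connector: Connector,
        data: List[float],
        workers: int) -> float:
    """Wire one component to one connector (a trivial topology)."""
    chunks = connector.scatter(data, workers)
    partials = [component(chunk) for chunk in chunks]
    return connector.reduce(partials)


result = run(sum_of_squares, InProcessConnector(), [1.0, 2.0, 3.0, 4.0], 2)
```

Because the component never touches the transport, swapping InProcessConnector for a distributed implementation would leave sum_of_squares unchanged. This is the property that removes the communication burden from the algorithm developer.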
Software architecture is uniquely positioned to address these issues while also guiding the role of the software architect in developing large-scale scientific applications and allowing scientific developers to focus on the algorithm development - the piece of the software system which should be paramount to them.
While my Master's degree work was largely in the first area highlighted above (high performance connectors), I am focusing on the latter two points, componentization and maintenance/evolution, for my PhD. I have developed methods for injecting software architectural constraints into legacy scientific applications, such as the Fortran and C programs I have used while working as a software engineer at NASA's Jet Propulsion Laboratory. As my work progresses, I hope to develop automatic techniques for componentizing parallel scientific code via code parsing and generation. In addition, I intend to leverage existing research in component migration techniques to enhance the autonomic configuration of high-performance software deployed on commodity cluster environments.
[1] D. E. Perry and A. L. Wolf. Foundations for the study of software architecture. ACM SIGSOFT Software Engineering Notes, 17(4):40-52, October 1992.
[2] B. A. Allan et al. A component architecture for high-performance scientific computing. The International Journal of High Performance Computing Applications, 20(2):163-202, 2006.
[3] D. Garlan, R. Allen, and J. Ockerbloom. Architectural mismatch: Why reuse is so hard. IEEE Software, 12(6):17-26, 1995.