Software is the key crosscutting technology that enables advances in mathematics, computer science, and domain-specific science and engineering to achieve robust simulations and analysis for science, engineering, and other research fields. However, software itself has not traditionally received focused attention from research communities; rather, software has evolved organically and inconsistently, developed largely as a by-product of other initiatives. Moreover, challenges in scientific software are expanding due to disruptive changes in computer hardware, increasing scale and complexity of data, and demands for more complex simulations involving multiphysics, multiscale modeling, and outer-loop analysis. In recent years, community members have established a range of grass-roots organizations and projects to address these growing technical and social challenges in software productivity, quality, reproducibility, and sustainability. This article provides an overview of such groups and discusses opportunities to leverage their synergistic activities while nurturing work toward emerging software ecosystems.
Significance and Impact
The paper describes a variety of organizations and their approaches to effecting cultural change in the research software community. This work is important because the amount of attention paid to research software has not tracked the explosive growth in the importance of software to the scientific enterprise. At the same time, the developers of that software face growing challenges due to the increasing complexity and scale of modeling, simulation, and data, compounded by disruptive changes in computer hardware.
- The paper presents 17 different organizations working through different methods to influence thinking, practice, and policy around research software.
During the past 20 years, computation has penetrated essentially all areas of research, including science, engineering, technology, and society; advanced modeling, simulation, and data analysis drive new discoveries and new understanding as complements to experimental and theoretical methods.1,2 Reusable software is a key element of these advances; software libraries and community codes encapsulate cutting-edge algorithms and domain-specific expertise, thereby enabling use and reuse and facilitating collaboration. We can understand the impact of software in research across all fields by asking researchers, by examining their published papers, and by examining their funding. Two surveys, of academic researchers at Russell Group universities in the U.K.3 and members of the National Postdoctoral Association in the United States,4 found that about 65% of respondents said they could not do their research without software, while about 25% said they could, but it would be much more difficult, and only a few percent said it would make no difference. In addition, a study of 40 papers in Nature from January to March 2016 showed that 32 explicitly mentioned software, with each paper mentioning an average of 6.5 software tools, almost all of which were research software.5 Likewise, searching the NSF award database for projects that mention “software” in their abstracts between 1995 and 2016 finds 18,592 awards totaling $9.6 billion.6
Software is essential to most of today’s research. The software used for a research project is almost never developed entirely by that project; rather, it depends on, uses, and builds on research software from other projects and other developers. Research software, much like research itself, is thus developed and maintained by a community, forming an ecosystem of competing and collaborating products. Much of this circumstance is due to the open source movement and its culture of sharing and collaboration, similar to the idealized culture of open science and open research toward which we are slowly moving.
Open source has created a tremendous variety of software, but this plethora of solutions is not easy for researchers to find and use out of the box. Moreover, researchers face growing challenges in creating more ambitious software for all research areas.7,8 For example, in computational science and engineering (CSE), challenges include coupling physics, scales, and analytics while adapting to disruptive changes in computing architectures. Due to limitations in funding models and reward structures, researchers face pressure to publish new scholarly results quickly rather than investing in development of sustainable software that reliably supports longer-term research and interdisciplinary collaboration. Researchers need training in best practices for software engineering, customized to address the unique needs of disciplinary cultures, yet typical graduate and undergraduate programs do not adequately cover these topics.
To address these circumstances and promote research collaboration through emerging software ecosystems, community members have recently established a variety of grass-roots organizations and projects, which have been further inspired by the growth of digital resources (that can more easily be shared), the growth of the Internet (making sharing easier), and the growth of collaborative tools such as GitHub and Slack. Community organizations that focus on a particular discipline, a particular technology, or particular functional skills can help researchers understand relevant parts of the ecosystem, including what software is available and what is not, and how the available software packages compare. In addition, these community organizations can support the health of the ecosystem, for example, by encouraging policies, reuse, and collaboration and supporting best practices for software development. The remainder of this paper highlights such organizations and how they work.