SESYNC sought to provide the best cyberinfrastructure (CI) for facilitating computation- and data-related synthesis activities. SESYNC’s computational personnel assisted teams with database development and data integration, analysis, and visualization. While our staff performed many CI tasks directly for teams, we also strove to build CI capacity; therefore, we preferred to work with team members so they could learn how to perform the tasks themselves. Since SESYNC teams were quite diverse in their approaches to synthesis and in the types of data they brought to a problem, our CI staff had a broad range of disciplinary backgrounds and expertise. Through an array of integrated services, the staff and infrastructure enabled researchers to collaborate remotely; to address research problems at unprecedented scales; and to build their own capacity to do computationally enabled socio-environmental analysis. To support a variety of groups, the CI staff developed procedures to engage teams actively and to meet their CI needs throughout each project’s life cycle, from pre-award through completion and for at least one year beyond.
Many SESYNC synthesis teams focused exclusively on highly quantitative data (biophysical, social, or both). Although the social sciences are becoming increasingly quantitative, survey data, interviews, and case studies remain very important for understanding how behaviors, attitudes, decisions, and local context interact with or drive changes in the environment, and vice versa. Teams relying on both quantitative and qualitative information could employ mixed methods analyses, meta-study approaches, or agent-based modeling to analyze interactions among variables, as well as outcomes across diverse scales. Many groups sought to integrate theoretical frameworks with modeling, and this approach became more common over time as SESYNC expanded its support and learning materials for socio-environmental modeling. Some teams were well equipped to undertake mixed methods analyses, but we found that most natural scientists were not familiar with such methods, and many struggled to incorporate qualitative data into their synthesis. Therefore, we provided advice, courses, and one-on-one assistance to help participants overcome these and other barriers.
Computing Infrastructure
The suite of hardware and software offered by SESYNC aimed to provide maximum computing power with minimal barriers to entry. Team members had access to large data storage (75 TB), database servers, and a 64-core computational cluster via SSH, a web file gateway, a virtual desktop, or SESYNC’s RStudio Server. We tightly integrated all services so that researchers would have seamless, remote access to all their group’s resources, regardless of which SESYNC service they were using. SESYNC’s key infrastructure was either virtualized or run on bare metal (RStudio Server and the computing cluster), allowing staff to reconfigure and redistribute resources as research needs changed. Because all infrastructure was accessible remotely, group members could collaborate regardless of their locations. SESYNC’s computing environment was scalable and adaptable to the changing landscape of collaborative computing. As necessary, SESYNC pursued partnerships with University of Maryland-based resources and national groups, such as CyVerse, to support the computing needs of our teams and researchers.
In-House Expertise
The eight-member CI staff offered a wide range of expertise crucial to successful synthesis research, including statistical programming and advanced statistical methods, simulation and analytical modeling, database design and administration, geospatial data handling, and parallel computing. Half of the CI team were domain scientists, and the remainder were computer scientists; this diversity allowed SESYNC to connect our researchers’ varied objectives with appropriate computing resources.
Computational Training and Community Capacity Building
SESYNC recognized the urgent, largely unmet need of S-E domain researchers seeking to acquire and apply data science skills in their work; thus, we regularly offered hands-on short courses combined with research-focused hackathons to familiarize our community with tools and methods such as R, the command line (shell), collaborative code development (git), SQL, and qualitative methods. This instruction not only accelerated progress on SESYNC projects but also connected the Center’s education mission to CI by preparing researchers for future work on computationally demanding S-E problems.