This is a three-day workshop for Jupyter developers, high-performance computing (HPC) engineers, and staff from experimental/observational science (EOS) facilities. The main purpose of the workshop is to foster a new collaborative community that can make Jupyter the pre-eminent interface for managing EOS workflows and data analytics at HPC centers. EOS scientists need Jupyter to work well at their facilities and HPC centers, and this workshop will help us address the technical, sociological, and policy challenges involved. The workshop itself will include presentations, posters, and two half-day hackathon/breakout sessions for collaboration.
During the workshop, participants will be invited to begin collaborating on a survey white paper that documents the current state of the art in Jupyter deployments at various facilities and HPC centers. The document will include deployment descriptions, maintenance and user support strategies, security discussions, use cases, and lessons learned. A forward-looking summary at the end of the white paper will tie together common threads across facilities and highlight areas for future research, development, and implementation. We aim to have the paper completed and posted to arXiv within three months of the end of the workshop.
Advances in technology at EOS facilities (e.g., telescopes, particle accelerators, light sources, genome sequencers), in robust high-bandwidth global networking, and in HPC have resulted in exponential growth in the data scientists must collect, manage, and understand. Interpreting these data streams requires computational and storage resources that greatly exceed those available on laptops, workstations, or university department clusters. Funding agencies increasingly look to HPC centers to address the growing and changing data needs of their scientists, as these institutions are uniquely equipped to provide the resources needed for extreme-scale science. At the same time, scientists seek new ways to seamlessly and transparently integrate HPC into their EOS workflows.
Jupyter’s software ecosystem provides many of the missing pieces scientists need to manage HPC-enabled EOS workflows in real time, at both small and large scales. Staff and researchers at HPC and EOS facilities have begun developing open source components and best practices to adapt Jupyter, JupyterHub, and JupyterLab to their specific computational environments. Jupyter has successfully gained a foothold at these institutions, but how can that foothold be consolidated, expanded, and strengthened? Realizing the vision of interactive supercomputing for data-intensive science requires: