Today we officially announced our ongoing partnership with the Allen Institute, a Seattle-based independent nonprofit bioscience and medical research institute founded by Microsoft co-founder and philanthropist Paul G. Allen. Composed of five scientific units and more than 800 employees, the Allen Institute conducts large-scale research through foundational science to fuel the discovery and acceleration of new treatments and cures for diseases such as Alzheimer’s disease, heart disease, cancer, addiction, and more.
In keeping with its belief in open science, one of the Allen Institute’s unique core values is to make all data and resources publicly available for external researchers and institutions to access and use. Code Ocean shares this deep commitment to scientific discovery and open collaboration, and that shared commitment has served as the foundation of our partnership.
The Catch-22 of Computational Research
Research organizations such as the Allen Institute often face a dilemma when it comes to managing large-scale computational research and computational analysis.
Most computational platforms cater to technical users who are well-versed in command-line interfaces, Git, Docker, and APIs. Research organizations, on the other hand, typically employ scientists who can write experimental code but may not possess deep engineering expertise. Therein lies the paradox for research organizations: the people handling massive amounts of data are not necessarily equipped to manage the complex technical work of generating insights from their datasets. Meanwhile, the individuals who do have this training (engineers, developers, administrators) aren’t necessarily familiar with the research topics and processes.
Historically, this paradox has required research organizations to deploy teams of research scientists and engineers to work in parallel. Although there may be some short-term advantages to this model, most organizations find that it ultimately leads to an inefficient allocation of time and resources.
Ultimately, for scientists to do efficient, collaborative, and fully reproducible computational research while focusing on the science, research organizations need to implement a computational platform that automates the technical complexity involved with large-scale scientific computing. In addition to affording optimal flexibility, taking this step can enable organizations to reallocate engineering resources to more strategic activities.
A Match Made in Data Heaven
When it comes to implementing a platform for computational research, it’s crucial to select the right partner. Naturally, this requires aligning the proficiency of your staff with the platform’s technical capabilities, but it’s equally important to make sure that your technical partner understands and embraces your research organization’s larger objectives. The Allen Institute found the right match in Code Ocean.
Since implementing Code Ocean one year ago, the Allen Institute has achieved dramatic improvements in reproducibility, interoperability, and productivity. The organization has been able to complete more with less and at a quicker pace, enabling more than 100 researchers to actively use the platform to make discoveries faster and more efficiently.
Using Code Ocean has resulted in efficiency gains as well as a reallocation of time and resources to higher-leverage activities:
- The time required to build a pipeline has decreased from 12 weeks to 3 weeks, a 4x increase in workflow efficiency.
- The Allen Institute has been able to reallocate the time of 5 engineers to more impactful projects.
- With fully interoperable Code Ocean Pipelines, sharing a pipeline now takes just one click.
Code Ocean’s “no lock-in” platform has also helped the Allen Institute stay true to its mission of advancing open science by reaching and collaborating with more than 200 external users.
Helping Research Organizations to Scale Effectively
Code Ocean has developed a wide range of dedicated features to help large organizations such as the Allen Institute operate effectively at scale. These features include fine-grained custom metadata management and search, data organization in Collections, automated data provenance for trusted science and reproducibility, and integration with external data warehouse providers and leading machine learning platforms. Two noteworthy statistics are the number of times scientists committed code to Git (2,370) and the percentage of users committing to Git (61%). These figures surpass what’s typically observed within the broader scientific community (Ram, 2013). This achievement has been made possible by Code Ocean’s dedicated UI features, which remove the need for command-line operations that require Git expertise.
These statistics show that the Allen Institute not only reaps the benefits of implementing these foundational technologies at a broad scale but also reinforces industry-standard best practices. Among other benefits, Code Ocean has enabled the Allen Institute to apply machine learning at a broad scale and, for the first time, to handle petabyte-scale data.
“Our partnership with Allen Institute has helped us to refine our ability to accommodate the specialized needs of large organizations,” said Dr. Daniel Koster, Vice President of Product at Code Ocean. “We have adapted our services to meet requirements such as handling petabyte-scale data, deploying FAIR-based analysis for reliable, scalable science, ensuring the reproducibility of scientific outcomes from experiment initiation to final presentation, complying with necessary regulatory standards, and providing a single pane of glass that integrates the multitude of tools utilized internally.”
Less than a year into their partnership, the Allen Institute has already generated a data corpus on Code Ocean of over a petabyte, with a growing user base of 282 running over 15,000 computations totaling over 10,000 hours.
For a more extensive report on how Code Ocean is accelerating neuroscience research for the Allen Institute, read the full case study.
To learn more about how Code Ocean can help your team, schedule a demo with us!