Help found 2i2c!

J. Colliander (PIMS and UBC)

Acknowledgement: This document was built with feedback and encouragement from many people including Ian Allison, Joe Hamman, Lindsey Heagy, Scott Henderson, Chris Holdgraf, Michael Lamoureux, Ed Lazowska, Meredith Lee, Rick McGeer, Yuvi Panda, Jesse Perla, Fernando Perez, and Anthony Suen.

Background

Interactive Computing: transformational medium for communication

Intuitive web interfaces for interacting with code, data, text, and visualizations (e.g. Jupyter, RStudio, Apache Zeppelin) have been deployed using open source infrastructure technologies (e.g. Docker, Kubernetes, Dask) to create a transformational new medium for human communication called Interactive Computing. Interactive Computing is a new way to use the internet that will be more revolutionary than email for research collaboration.

Diverse academic disciplines are enthusiastically responding to the opportunities created by Interactive Computing. Pioneering work on Data8 at UC Berkeley triggered global interest in new interdisciplinary data science programs. The Pacific Institute for the Mathematical Sciences (PIMS) launched a national-scale interactive computing platform called Syzygy. An entrepreneurial group of young researchers responded to computational challenges obstructing collaboration by creating Pangeo, a community platform for big data climate science. These early successes are inspiring similar efforts in digital humanities, economics, K12 education (Callysto), neuroscience, genomics, and infectious disease modelling. A team of more than ten thousand analysts uses an interactive computing platform operated by the National Security Agency.

Interactive Computing is driving the creation of new workflows for collaborative research and knowledge sharing. Real-time data streams (e.g. signals from distributed sensor networks or IoT devices, financial data streams, social media data, telescope imagery) delivered via API can be integrated with live code and expository text in easily shared interactive computing documents. Transformational research ecosystem themes (reproducibility, extensibility, open data, open source code, open access publishing) are converging into a new open toolchain that is accelerating discovery and knowledge mobilization. Binder, JupyterBook, and related emerging technologies have the potential to supplant PDF and paper as the standard media for scholarly publication.

The universities that found 2i2c will be in a leading position to collaboratively define the business model and best practices for interactive computation during this transformation.

Risks and challenges

There are risks and challenges associated with the emergence of Interactive Computing. The cloud computing expertise required for Interactive Computing is specialized and expensive. This creates a risky dynamic with inequitable outcomes. Leading universities with the capacity to deploy interactive computing will advance faster than other colleges and universities. Academic disciplines that happen to have the right personnel to deploy Interactive Computing platforms will advance faster than other disciplines.

There are risks associated with sourcing Interactive Computing from the major commercial cloud providers. The cost of access to knowledge from proprietary publishers, driven by their bundling and digital subscription strategies, provides a background for study and can inform strategies for avoiding these risks.

Universities should control the Interactive Computing stack used for research and education.

Solution: 2i2c

We propose to bring universities and other partners together to create a new organization, the International Interactive Computing Collaboration (2i2c). 2i2c will deliver Interactive-Computing-as-a-Service (ICaaS) and, through 2i2c Labs, will be an active partner that helps universities advance their education, research, and service mission. The new organization will be built around core values of trust, collaboration, and transparency with a rigorous governance structure to maintain alignment with the university mission.

2i2c will offer a menu of ICaaS offerings designed to meet the needs and growing demand for Interactive Computing within colleges and universities. These services will be integrated within the academic technology ecosystem (LMS, library, laboratory, data stream integrations) with server hosting offered on a variety of infrastructures (Amazon, Google, Microsoft, local metal). The underlying cloud infrastructure will be sourced with discount pricing leveraging volume discounts that are larger than a single university can secure. 2i2c may partner with CloudBank or other commercial cloud resource aggregators. The services will be robustly delivered under a service level agreement following best practices for information security and privacy protection. Pricing for these services will be transparent and fair.

2i2c will include 2i2c Labs, an organizational structure designed to flexibly support research and development activities through partnerships. Through 2i2c Labs, university research teams will have access to 2i2c expertise (as consultants, embedded on-site experts, student internship training programs, etc.). 2i2c Labs will serve as a collaborating partner on grant proposals. 2i2c Labs and our partners will contribute improvements to the Interactive Computing toolchain for research and education. 2i2c Labs will also develop expertise on the use of Interactive Computing with sensitive data (e.g. HIPAA, personalized medicine, cybersecurity, etc.).

We envision 2i2c as focused on serving colleges and universities in North America. 2i2c will also serve as a catalyst and first node in an international network of similar organizations that provide interactive computing services and expertise with local leadership and customizations optimized for other regions.

Benefits for universities

Lessons from Data8 at Berkeley and Syzygy in Canada informed the vision for 2i2c. For universities, the benefits include significant cost savings through the pooling of expertise in areas like Kubernetes, volume discounts, and autoscaling on cloud computing. Robust Interactive Computing platforms also provide new mechanisms for better education and research programs with improved collaboration, reproducibility, and extensibility. Through 2i2c and 2i2c Labs, domain experts within the university become connected with the open source communities developing tools that help drive research advances.

Operating a single dedicated Interactive Computing platform for an education program like Data8 at Berkeley requires robustness. Students and instructors expect reliable access to the computing resources to do their work. Effective monitoring and maintenance of such a mission-critical platform requires two or more engineers with the necessary expertise in cloud operations, autoscaling, privacy, integrations, etc. It turns out, based on the collective experience from Data8 and Syzygy, that a team that can operate a single robust Interactive Computing platform can use the power of the cloud to operate many similar platforms. This capacity to use the cloud to scale Interactive Computing resources and meet the needs of many universities is a main insight that drives the idea to build 2i2c. Moreover, managed deployments of platforms that serve many universities simultaneously generate significant cost-saving opportunities through volume discounts and clever infrastructure management. Through 2i2c, universities avoid the problem of inefficiently developing capacity at the wrong scale.

The engineering expertise required to operate robust Interactive Computing platforms is extremely valuable. Our experience with Data8 and Syzygy has revealed that hiring and retaining these kinds of experts within the HR matrix at most universities is difficult. 2i2c and 2i2c Labs provide a flexible mechanism for universities to obtain and retain the highly skilled talent required for effective Interactive Computing.

After 2i2c is launched and achieves some success, there will be a relatively standardized Interactive-Computing-as-a-Service available across many universities. With this resource in place, universities can be even more effective at doing what they do best. Knowledge will become more shareable and collaboration will become easier.

Governance

The success of 2i2c as an organization that serves universities well requires the right governance. We propose to work with a small collection of three to seven founding universities to collaboratively define the 2i2c governance structure. 2i2c should be overseen by a Board with representatives from universities and the open source communities that define the Interactive Computing software ecosystem. To achieve success, 2i2c will require some agility, so the governance structure should be designed accordingly. 2i2c could be incorporated as a for-profit or not-for-profit organization.

It is imperative that 2i2c be launched with the right values and culture. 2i2c needs to lead by example by including women and members of underrepresented groups in the organization's leadership and through the development of programs that ensure full and fair access to Interactive Computing resources and training opportunities.

2i2c Board of Directors (Draft)

Target Size: 7 people

  • 3 University Representatives
  • 3 Others from:
    • Philanthropy
    • Government
    • Open source
    • Industry
  • 1 CEO of 2i2c

The Board may need to expand. A board of 7 people may be more agile at launch. A larger board of 11 or 13 people may emerge as the right size.

Org Chart at Launch (Draft)

graph TD
    Universities --- Board
    OpenSource --- Board
    Philanthropy --- Board
    Industry --- Board
    Board --- CEO
    CEO --- VPSales
    VPSales --- Sales
    CEO --- VPEng/Labs
    Sales -.- Customers
    VPEng/Labs --- Engineers/Support
    Engineers/Support -.- Customers
    Engineers/Support -.- Partners

Initial Team:

  • CEO
  • VPEng/Labs
  • Engineer
  • Engineer
  • (CFO) consultant, help from Board
  • (VPSales) after initial validation with founding universities

Org Chart in Sustain Stage (Draft)

graph TD
    Universities --- Board
    OpenSource --- Board
    Philanthropy --- Board
    Industry --- Board
    Board --- CEO
    CEO --- VPSales
    CEO --- VPEng
    CEO --- VPLabs
    CEO --- VPFinance
    VPFinance --- Finance
    VPSales --- Marketing
    Marketing -.- Conferences
    Marketing -.- Prospects
    Sales -.- Prospects
    VPSales --- Sales
    Sales -.- Customers
    VPEng --- Support
    VPEng --- Engineers
    Support -.- Customers
    VPLabs --- Engineers
    VPLabs -.- Partners
    Engineers -.- Partners
    Support -.- Partners

Service level agreement, privacy protection, information security

needs work

Pricing

2i2c will offer ICaaS with transparent and fair pricing. There appear to be two natural pricing structures:

  • FTE-based pricing
  • Cost-based pricing

Universities often use budget models for education programs that are based on full-time equivalent (FTE) student counts within an academic program term or calendar year. These budget models may require that ICaaS from 2i2c be priced by FTE within the appropriate service interval (quarter, semester, calendar year). This type of pricing structure (e.g. $XX/student-FTE/year) may not reflect actual costs since students will use the computing resources in different ways.
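The gap between a flat FTE price and actual usage can be illustrated with a minimal sketch. The program names, enrolments, costs, and the price per FTE below are all hypothetical placeholders chosen for illustration; they are not proposed 2i2c prices.

```python
from dataclasses import dataclass


@dataclass
class Program:
    name: str
    student_fte: float        # full-time-equivalent students in the interval
    actual_cloud_cost: float  # what the program's usage actually cost, in dollars


def fte_charge(program: Program, price_per_fte: float) -> float:
    """Flat charge for one service interval under FTE-based pricing."""
    return program.student_fte * price_per_fte


def fte_gap(program: Program, price_per_fte: float) -> float:
    """Charge minus actual cost: positive means the program pays more than it used."""
    return fte_charge(program, price_per_fte) - program.actual_cloud_cost


# Hypothetical programs with equal enrolment but different usage patterns.
programs = [
    Program("intro-data-science", student_fte=200, actual_cloud_cost=3000.0),
    Program("climate-modelling", student_fte=200, actual_cloud_cost=9000.0),
]
PRICE_PER_FTE = 30.0  # stands in for the $XX figure; an assumption, not a quote

for p in programs:
    print(p.name, fte_charge(p, PRICE_PER_FTE), fte_gap(p, PRICE_PER_FTE))
```

Under these assumed numbers both programs are charged the same amount, although one uses a third of the resources of the other. This is the mismatch that FTE-based pricing may introduce.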

Another pricing model is for 2i2c to charge a markup (e.g. 20%) on actual cloud computing costs used by the university. 2i2c could potentially generate reports on usage with granularity showing the cloud resources used by individual students, instructors, researchers, classes, or programs. These reports could be used by universities to fairly distribute charges across the units that are using ICaaS from 2i2c. As 2i2c develops, cloud computing costs will likely decrease through volume discounts negotiated with commercial cloud infrastructure providers and through improved autoscaling strategies.
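As a rough sketch of cost-based pricing, the following shows how a markup and a per-unit usage report might be combined. The 20% markup is the example figure from the text. The record format, unit names, and dollar amounts are assumptions made for illustration and do not describe an actual 2i2c reporting system.

```python
from collections import defaultdict

MARKUP = 0.20  # the 20% example from the text, not a settled rate


def invoice_total(cloud_cost: float, markup: float = MARKUP) -> float:
    """Actual cloud cost plus the markup."""
    return round(cloud_cost * (1 + markup), 2)


def charges_by_unit(usage, markup: float = MARKUP) -> dict:
    """Aggregate (unit, cloud_cost) records and apply the markup per unit."""
    totals = defaultdict(float)
    for unit, cost in usage:
        totals[unit] += cost
    return {unit: invoice_total(cost, markup) for unit, cost in totals.items()}


# Hypothetical usage records for one billing period.
usage = [
    ("Statistics", 1200.0),
    ("Earth Sciences", 4500.0),
    ("Statistics", 300.0),
]

charges = charges_by_unit(usage)
print(charges)
```

Because each unit's charge is derived from its own recorded usage, the sum of the per-unit charges equals the university's total invoice, which is what would allow the charges to be distributed across units.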

2i2c and its founding universities will work together to demonstrate the benefits and cost savings accrued through this partnership. The 2i2c Board will provide oversight to ensure that pricing is transparent and fair.

Invitation: help found 2i2c?

We invite you and your university to help found 2i2c. Launching 2i2c requires capital and collaboration. Your university can help 2i2c get started in many ways:

  1. Commit to being a founding university of 2i2c. Identify a person or a group of people that will work with us to launch 2i2c. Ensure that those appointees have the endorsement of administrators with the authority to bind your university to 2i2c once the organization is effectively defined.

Will Berkeley nominate a Founding Chair of the 2i2c Board?

  2. Commit to providing in-kind support to help found 2i2c. Can your university offer space to help us launch? Can your university connect 2i2c to startup incubators or accelerators? Can your university provide expertise and guidance on the creation of an ideal service level agreement with the right structures to protect privacy and keep information secure? Can your university help 2i2c interface with commercial cloud providers to obtain credits or cash investments to help us bootstrap? Can your university second software developers to 2i2c to help us build the platforms and integrations required for launch?

Can Berkeley share or second devops talent to help launch 2i2c?

  3. Commit to working together to raise initial capital from other sources. Help 2i2c connect with prospective investors. Provide a letter of endorsement of 2i2c from your university with an expressed commitment to eventually purchase ICaaS from 2i2c after some milestones are achieved. Philanthropic organizations may be interested in making an initial bootstrap investment provided there is a high probability that 2i2c can achieve sustainability.
  4. Commit to working with 2i2c Labs. Help launch 2i2c by identifying research programs that will benefit from Interactive Computing. Connect prospective PIs from those programs with 2i2c so that we can work together on grant proposals to secure funds to advance those lines of investigation and support the launch of 2i2c.

F. Perez wishes to collaborate with 2i2c Labs on Jupyter meets the Earth.

  5. Commit to buying ICaaS from 2i2c by agreeing to an advance purchase pending the achievement of certain milestones. For example, after the governance structure and business model for 2i2c are defined and agreed upon, your university might commit to transferring $X00K to 2i2c as an advance purchase of ICaaS. To ensure sustainability, perhaps those funds are used to pay for 50% of the actual costs with the other 50% paid out on an ongoing basis.
  6. Commit to paying an annual membership fee of $X0K pending the achievement of milestones.
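One possible reading of the advance purchase idea above is that each period's actual cost is split evenly between the prepaid balance and a new invoice until the prepaid funds run out. The sketch below works through that reading. The prepaid amount and monthly costs are hypothetical, and the 50/50 split is the tentative figure from the text rather than an agreed term.

```python
def draw_down(prepaid: float, monthly_costs, prepaid_share: float = 0.5):
    """Split each month's cost between the prepaid balance and a new invoice.

    Returns one (from_prepaid, invoiced, balance_after) tuple per month.
    """
    ledger = []
    balance = prepaid
    for cost in monthly_costs:
        # Take the prepaid share of the cost, but never more than what is left.
        from_prepaid = min(balance, cost * prepaid_share)
        invoiced = cost - from_prepaid
        balance -= from_prepaid
        ledger.append((from_prepaid, invoiced, balance))
    return ledger


# Hypothetical figures: a 100K advance purchase and six months of 40K costs.
ledger = draw_down(100_000.0, [40_000.0] * 6)
for month, row in enumerate(ledger, start=1):
    print(month, row)
```

With these assumed numbers the advance covers half of the costs for five months; from the sixth month the full cost is invoiced.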

Target Launch Date?

June 1, 2020.
