
# Background

*Last updated: 2024-11-15*

* URL for this page: https://hackmd.io/@investinopen/Infra-Finder-background
* Open the sidebar [using this link](https://hackmd.io/@investinopen/Infra-Finder) to see all Infra Finder documentation.

## Motivation for Creating Infra Finder

IOI was founded to help increase adoption of and investment in the open infrastructure needed to drive equitable access and participation in research. A core part of that mission is providing targeted, evidence-based tools that help adopters and funders of open infrastructure make more informed decisions when investing in and adopting research and scholarship infrastructure.

After collectively developing and launching the Catalog of Open Infrastructure Services pilot project in 2022 and learning from our community, we realized we needed to re-envision what outcomes we could affect and rethink what solutions could look like.

From a policy standpoint, we wanted to further the aims of UNESCO’s [Recommendation on Open Science](https://www.unesco.org/en/open-science/about), which sets an intention that “access to scientific knowledge should be as open as possible” and emphasizes: “The world needs open science now.” We also knew that infrastructure and information and communications technology are a key part of the United Nations’ [Sustainable Development Goal 9](https://sdgs.un.org/goals/goal9).
In addition, the US government published the [Nelson memo](https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-access-Memo.pdf), which calls for an update to public access policies for Federal funders “no later than December 31st, 2025.”

In looking to prior and existing efforts, we consulted and sought inspiration from works including the [Principles of Open Scholarly Infrastructure](https://openscholarlyinfrastructure.org/), the [Scholarly Communication Technology Catalogue](https://www.scomcat.net/), and the [HELIOS Scholarly Communication Infrastructure Guide](https://static1.squarespace.com/static/61b3819ed113b0380812d182/t/642acd3079fcf27b36e14494/1680526641186/HELIOS+Shared+Infrastructure+Decision-Making+Guide.pdf).

To ensure that what we’re building will be useful for adoption of open infrastructure, we took a product development approach: reviewing learnings and outcomes from prior and existing work, conducting additional user research, interviewing infrastructure service providers, and leading focus group discussions and user testing to refine our understanding of the problems and frictions experienced when looking to adopt open infrastructure.

A key theme came to the fore: leaders of libraries and research institutions were looking for open infrastructures that could enable the sharing of scholarly output in a way that advances their institution’s vision for “open”. We are therefore building a tool that can help decision makers in libraries and research institutions discover the services and resources that could help them, one that can grow and evolve with the needs of the research environment.

We also want to make sure Infra Finder serves the needs of open infrastructure service providers.
From our conversations with the service providers and their feedback, we understand Infra Finder has value in bringing visibility to the work of infrastructure service providers that are often hidden in the background but are critical to the research and scholarship ecosystem.

From our user research and engagement, we've discovered that users see Infra Finder as a way to start a number of conversations: about potential infrastructures to support a service or program, about gaps in services or synergies between them, and about essential funding to sustain the open infrastructure ecosystem.

We are excited to build a tool that lets users find, discover, and compare open infrastructures. We also think there will be plenty of inspiration to be found in the information contained in Infra Finder, and we are eager to explore it with the community.

## Project History and Timeline

**Between August 2021 and September 2023**, IOI developed the Catalog of Open Infrastructure Services (COIs), a prototype for Infra Finder. COIs, launched in January 2022, enabled us to collect community feedback, gauge infrastructure providers' interest, and conduct additional user research to better understand the utility and value proposition of such a tool.

- Read further about [the background of COIs](https://hackmd.io/@investinopen/InfraFinder-bakckground)
- Read about [the findings from our COIs user research](https://investinopen.org/blog/exploring-the-values-of-the-catalog-of-open-infrastructure-services-cois/) in August 2022

In **April 2023**, we [received grant funding from the Mellon Foundation](https://investinopen.org/blog/ioi-receives-1m-from-the-mellon-foundation-to-scale-the-catalog-of-open-infrastructure-services-cois/) to further develop and productize COIs. With the funding, we hired a product team, including a Product Lead, an Engagement Coordinator, and contract developers and designers, to further develop the tool.

Between **July and August 2023**, we redesigned the infrastructure intake form. Building on existing resources and COIs user research findings, we asked questions on topics like technology infrastructure, history, organizational structure, finances, and community engagement that both enable infrastructures to showcase their achievements and meet users’ needs in discovery and evaluation ([more below](#Designing-the-data-collection-instrument)). We also developed a list of infrastructure providers to invite to participate in the next release of the tool ([more below](#Selection-of-initial-service-providers)) and prefilled intake forms with publicly available information to reduce the burden on providers.

In **September 2023**, we began sending invitations to infrastructure service providers to participate. In having introductory conversations with them, we developed a better understanding of their motivation to participate and collected valuable feedback, which greatly helped evolve the form.

Simultaneously, we conducted focus groups with 13 library directors and members of staff (casemakers) to deepen our understanding of their experiences and how the tool can support them. We learnt that there's an opportunity for the tool to ease early-stage evaluation of infrastructure services by showcasing verified information on aspects including costs (of implementation and maintenance), technical dependencies, and open values alignment. Based on the findings, our designer developed low-fidelity wireframes, which we tested with 5 librarians. The user testing was critical in helping further refine the user experience and design. We also developed a workflow to validate information that providers submitted through the intake form.

We invited a total of 84 infrastructure service providers to participate in this first release and received responses from 57 services as of January 2024.
We shared the designs for and data in Infra Finder with the providers for their feedback and review in mid-December 2023. We also confirmed a new name for the tool: Infra Finder.

Between **December 2023 and April 2024**, our product delivery team focussed on loading the data into our data store and implementing and adjusting the application design.

Infra Finder was released to the public in **April 2024.** We monitored usage, fielded [Expressions of Interest](https://investinopen.org/add-an-infrastructure-to-infra-finder/), and continued gathering user feedback while developing plans that would enable open infrastructures in Infra Finder to self-service data updates.

In response to trends and needs that we saw within the open science ecosystem, in **June 2024** we began the process of expanding Infra Finder to include services, protocols, standards, and software tools that are in active use for the analysis and sharing of research data, and for digital preservation and archiving.

In **September 2024,** we launched self-service data update capability within Infra Finder and added filters requested by users to the frontend. See [this blog post](https://investinopen.org/blog/help-shape-the-future-of-infra-finder-explore-new-features-and-share-your-feedback/) for details.

In **November 2024,** [we announced](https://investinopen.org/blog/infra-finder-grows-90-open-infrastructures-now-available/) 37 additional open infrastructures in Infra Finder and 9 new solution categories, and launched a campaign to grow attention, usage, and product feedback.

## Selection criteria

We recognize that open infrastructure is a spectrum, rather than a binary. To help scope the infrastructures found in Infra Finder, we created the following selection criteria.

A service may be eligible for inclusion in Infra Finder if it is in active use as a **service, protocol, standard or software** that the academic ecosystem needs in order to perform its functions throughout the research lifecycle.

Additionally, we prioritize services that meet one or more of the following criteria:

* Meets the definition of [open source software (OSS)](https://en.wikipedia.org/wiki/Open-source_software?ref=investinopen.org);
* Primarily or exclusively distributes openly licensed (open access) content;
* Is free to use by anyone (free of charge or other restrictions);
* Is community-governed and is transparent in its operations and finances;
* Is operated by a non-profit or non-commercial entity.

## Resources Consulted

* [Mapping the Scholarly Communication Landscape 2019 Census](https://educopia.org/2019-Census/) and [bibliographic scan](https://educopia.org/mapping-the-scholarly-communication-landscape-bibliographic-scan/)
* [Scholarly Communication Technology Catalogue (SComCAT)](https://www.scomcat.net/)
* [The FOREST Framework for Values-Driven Scholarly Communication](https://educopia.org/forest-framework-for-values-driven-scholarly-communication/)
* [The Principles of Open Scholarly Infrastructure (POSI)](https://openscholarlyinfrastructure.org/)
* [The HELIOS Scholarly Communication Infrastructure Guide](https://static1.squarespace.com/static/61b3819ed113b0380812d182/t/642acd3079fcf27b36e14494/1680526641186/HELIOS+Shared+Infrastructure+Decision-Making+Guide.pdf)
* [re3data](https://www.re3data.org/)
* [fairsharing.org](https://fairsharing.org)
* [EOSC Marketplace](https://marketplace.eosc-portal.eu/)
* [Open Access Publishing Tools](https://radicaloa.disruptivemedia.org.uk/resources/publishing-tools/) from the [Radical Open Access Collective](https://radicaloa.disruptivemedia.org.uk/)
* [400+ Tools and Innovations in Scholarly Communication](https://docs.google.com/spreadsheets/d/1KUMSeq_Pzp4KveZ7pb5rddcssk1XBTiLHniD0d3nDqo/edit#gid=0)
compiled by [Jeroen Bosman and Bianca Kramer](https://101innovations.wordpress.com/) of Utrecht University Library

## Data Sources

To ensure that the data in Infra Finder is up-to-date and verifiable, we scoped the first Infra Finder release to providers that are primarily in the data and content repositories space, for reasons we explain below. This group's input has been invaluable in helping refine and co-develop our data collection and application design.

### Selection of initial service providers

As mentioned above and based on the [selection criteria](https://hackmd.io/gdcbXFT7SOe9sYLG28XvHA?both#Selection-criteria), we invited 84 service providers to participate in Infra Finder at launch in 2024. This initial list of invited providers represents a small subset of a more comprehensive list that we pulled together from [the resources we consulted](#resources-consulted) and then refined and analysed. We are incredibly grateful to the colleagues who built those resources for their foundational work.

In refining the list and defining our initial release, we decided to focus on infrastructure services that enable the sharing of research data and content. We see increasing the visibility of infrastructure services in this area and advancing their adoption as key to furthering the aims of policies including UNESCO’s [Recommendation on Open Science](https://www.unesco.org/en/open-science/about), the United Nations’ [Sustainable Development Goal 9](https://sdgs.un.org/goals/goal9), and the US Office of Science and Technology Policy's [Nelson memo](https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-access-Memo.pdf). The shared focus across our current core programmes of work also allows us to better align our research, community engagement, and strategic support efforts.

Some of the 84 invited services came from an [earlier Expression of Interest](https://investinopen.org/blog/next-steps-for-the-catalog-of-open-infrastructure-services-cois/). Others were invited in the course of ongoing research efforts (e.g. [the “reasonable costs” investigation](https://investinopen.org/data-room/reasonable-costs)). Of the 84 services we invited, we received responses from 57 services as of January 2024.

### Designing the data collection instrument

The data collection instrument's initial design was inspired by many of [the resources we consulted](#resources-consulted) and refined by the insights and findings from our user interviews and focus groups. In designing the instrument, we prioritized collecting information that would help users make decisions or make the case to adopt open infrastructure, and information that would highlight and showcase the features and achievements of open infrastructures.

The initial iteration featured over 50 questions on various aspects, from organizational and governance details to technical attributes. The initial group of service providers, librarians, and institutional leaders we spoke with in the run-up to our launch provided pivotal input, which helped clarify the questions and associated documentation and enabled our team to understand where, for example, certain questions were challenging to answer or not applicable to certain providers.

The process of designing the user interface and experience has also given our team a much deeper understanding of how we can improve the instrument to collect data that can be easily interpreted by users, efficiently managed in the Infra Finder database, and scalably displayed in the Infra Finder interface.

We will continue to improve the instrument and process we use to collect Infra Finder data. The preliminary round of data collection will also help us move from free text responses to controlled lists for some questions, based on the responses we received.

### Data collection and review

The data collection for the first release of Infra Finder was done collaboratively between IOI and the infrastructure service teams. IOI developed the intake form on [Jotform](https://jotform) and prefilled each infrastructure service provider's form with information gathered by searching publicly available data sources. We tracked organization details, technical attributes, and other pieces of information.

When a provider agreed to participate in Infra Finder, they filled in the remaining fields on the intake form and verified the information IOI had prefilled. Once completed, the information was reviewed and verified once again against publicly available data sources by the IOI team. The IOI team also took care to document the verification process and capture any artefacts and data sources used in it.

From December 2023, we also provided a mechanism for infrastructure service providers to submit updates to their data to us.

## Technology Used

As both advocates of open source solutions and believers in the transformative value of open source, IOI endeavors to leverage open solutions whenever possible. In line with this commitment, Infra Finder is built largely using open source solutions, with the code hosted in a GitHub repository, as is common practice in open source software development. While the repository is currently private, we intend to make the source code and data openly available pending a security review to identify any potential vulnerabilities or privacy concerns.

The following is a brief overview of the technology used to develop Infra Finder. As the development of Infra Finder is ongoing, we will update this documentation at key milestones to reflect the current state of the technology and the operating structure of the application.

### Data Storage

After the IOI team has reviewed and verified providers' information, it is loaded into a [PostgreSQL relational database](https://en.wikipedia.org/wiki/PostgreSQL), an open source relational database management system, hosted in a [DigitalOcean](https://en.wikipedia.org/wiki/DigitalOcean) [Kubernetes](https://en.wikipedia.org/wiki/Kubernetes) container, which automates operational tasks for deploying, changing, scaling, and monitoring applications. The initial data model is built to power the Infra Finder application.

### Hosting

We chose cloud hosting on DigitalOcean as it relieves IOI of the need for specialised infrastructure engineering knowledge. The Infra Finder application and our database are [containerized](https://en.wikipedia.org/wiki/Containerization_(computing)), following modern application development best practices. Among other benefits, containerization enables portability to other cloud infrastructures should the need arise.

After finalizing the design, the application development team created a web application using the open source [Rails framework](https://en.wikipedia.org/wiki/Ruby_on_Rails), written in the open source [Ruby programming language](https://en.wikipedia.org/wiki/Ruby_(programming_language)). The application also makes use of the open source interface framework [ActiveAdmin](https://activeadmin.info/).

---

This page first published: 2024-01-04

###### tags: `Infra-Finder-documentation`, `under-construction`

---

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a><br />This work is made available under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>. Users are free to share, remix, and adapt this work.
(Please attribute [Invest in Open Infrastructure](https://investinopen.org/) in any derivative work).