Open Government and Digital Infrastructure for Transparency: AI Strategies for Democratizing Public Information
===========================

## 1. Name

Thiago Andrade da Paixão

## 2. E-mail

tap@thiagopaixao.com

## 3. Organization Name(s) / or: individual researcher *

Thiago Paixão (main), Aline Lima and Ely Oliveira

## 4. Is this concept note primarily focused on research or implementation? *

- [x] Research
- [ ] Implementation

## 5. What is your research question? *

> (30 words) | Please describe concisely what research you are proposing and its context

How can governments and civil society use the open digital infrastructure of FOSS AI to democratize access to public information and to promote transparency and innovation in open government?

## 6. Why is this question important to answer and how does it relate to our fund? *

> (500 words)

In the last year we have experienced an explosion of new technologies known as generative Artificial Intelligence, including Large Language Models (LLMs). This is not a grand revolution but a natural evolution of techniques and technologies that have existed since the mid-20th century, such as neural networks, machine learning, and deep learning. As a result, the debate on how automation might "steal" our jobs, and on the need for regulation, is intensifying, with no consensus yet on best practices and little data available to inform it.

There is also a less visible debate on the socio-economic impacts of unequal access to these new technologies (many of them closed), and on how knowledge is restricted to the specific economic groups that can afford them, groups often involved in the 'corporate capture' of States. Free and open digital infrastructure makes innovation accessible to all, enabling more transparent and ethical new businesses as well as the implementation of public policies and citizen participation.
Governments committed to the Open Government agenda have transparency, participation, and collaboration as their pillars, promoting measures that ensure greater accountability, open technology, innovation, and open data. The Open Government Partnership (OGP), founded in 2011, created through broad public participation a set of strategies and a commitment model signed between civil society actors and governments, promoting open infrastructures that guarantee participation and access to quality data and information for all citizens. From this perspective, it is not enough for governments to commit to publishing massive amounts of data and information if these are not translated into simple language accessible to all sectors of society. This is the biggest challenge.

Studying how open-source software and digital infrastructure converge with social and governmental movements is crucial to understanding how these elements can be sustained and developed equitably and responsibly. Answering this question will enable the development of pragmatic interventions and communication strategies that translate knowledge into practice, benefiting various sectors of society and contributing to a more open and democratic digital future.

We hope to answer the following sub-questions:

- How can Large Language Models serve as a tool for building applications aimed at the simplification, dissemination, and transparency of open government data and information within an Open Government perspective?
- What initiatives of this type are currently promoted by the public sector and civil society?
- What could be the impacts on society?
- Is regulation of Artificial Intelligence technologies needed, what is the current debate, and what are the main proposed paths?
- What are the risks of using these new AI technologies?
- Research, mapping, and testing of emerging FOSS generative Artificial Intelligence technologies that can contribute to transparency and dissemination.
- Proposal of best practices and a conceptual framework with an architecture composed of the FOSS Artificial Intelligence technologies and models that perform best in tests.

## 7. What research methods will you use to answer this question? *

> (500 words) | Please describe the methodologies and scope of the proposed research

The methodology of this research will be multidisciplinary, combining quantitative and qualitative methods to address the question comprehensively.

Initially, we will conduct a systematic review of the literature to identify existing initiatives, challenges, and best practices in implementing LLMs in governmental contexts.

Subsequently, we will conduct empirical analyses of emerging FOSS technologies, using action-research methods to test the efficacy of different LLMs in simplifying and disseminating governmental information. These tests will involve collaboration with governmental and civil society stakeholders to ensure the relevance and applicability of the results.

Concurrently, we will develop detailed case studies to assess the potential impact of LLMs on society, considering aspects such as accessibility, equity, and inclusion. This will include interviews with experts, surveys of the general public, and participatory live streams to collect insights and feedback from various perspectives.

Finally, we will integrate the results to propose best practices and a robust conceptual framework that guides the implementation of LLMs in governmental environments, with clear recommendations for ethical and responsible practices.

## 8. What data or other resources will you use to answer the question? *

> (500 words)

1. **Open Governmental Data:** We will use datasets provided by Brazilian governmental bodies, which include legislative, regulatory, administrative, and financial information, contributing to a comprehensive analysis of the political and governmental scenario.
2. **Academic Literature and Technical Reports:** We will conduct a systematic review of academic literature, technical reports, articles, and publications pertinent to the themes of Open Government, Digital Rights, Artificial Intelligence, and open-source technologies, aiming to build a solid theoretical base and identify knowledge gaps.
3. **Open Source Tools and Technologies:** We will use open-source Large Language Models (LLMs), such as those provided by Hugging Face, and other FOSS (Free and Open Source Software) technologies to develop, test, and analyze innovative solutions in open digital infrastructure.
4. **Interviews and Surveys:** We will conduct field research, including interviews with experts, governmental agents, and activists, and administer surveys to relevant stakeholders, with the aim of capturing diverse insights, opinions, and experiences.
5. **Forums and Online Communities:** We will participate in forums, discussion groups, and online communities related to Open Government, Digital Rights, and open-source technologies, seeking to interact, learn, and share knowledge with professionals, researchers, and enthusiasts in these areas.
6. **Social Media and Media Data:** We will analyze public data from social networks and media platforms to understand public discourse, predominant narratives, and society's perceptions of themes related to the project.
7. **Legislative and Regulatory Analyses:** We will investigate legislation, norms, and regulatory frameworks related to Artificial Intelligence, Digital Rights, and Open Government in Brazil, to understand the legal environment and possible regulatory implications.
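
As an illustration only, one possible way to compare an original government text with a simplified version produced by an open-source LLM (item 3 above) is a crude readability proxy. The sketch below is not the project's benchmark: the function names, the two indicators (words per sentence and share of long words), the length threshold, and the sample texts are all assumptions made for the example, and the indicators are not a validated readability formula.

```python
import re


def readability_proxy(text: str, long_word_len: int = 7) -> dict:
    """Return crude readability indicators for a text.

    Hypothetical metric for illustration only: lower values suggest
    plainer language. It is not a validated readability formula.
    """
    # Split into sentences on ., ! or ? followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    # \w+ also matches accented letters, so Portuguese text is handled.
    words = re.findall(r"\w+", text)
    if not sentences or not words:
        return {"words_per_sentence": 0.0, "long_word_share": 0.0}
    long_words = [w for w in words if len(w) >= long_word_len]
    return {
        "words_per_sentence": len(words) / len(sentences),
        "long_word_share": len(long_words) / len(words),
    }


def is_simpler(original: str, simplified: str) -> bool:
    """True when the simplified text scores lower on both indicators."""
    before = readability_proxy(original)
    after = readability_proxy(simplified)
    return (
        after["words_per_sentence"] < before["words_per_sentence"]
        and after["long_word_share"] < before["long_word_share"]
    )


# Made-up example texts (not real government data or real model output).
original = (
    "The municipality hereby establishes supplementary budgetary "
    "appropriations pursuant to the applicable legislation."
)
simplified = "The city added money to the budget. The law allows this."

print(readability_proxy(original))
print(is_simpler(original, simplified))
```

A proxy like this only measures surface form. It says nothing about whether the simplified text is still accurate, so any real test would also need human review by the stakeholders involved in the research.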

The combination of these resources and data will allow for a multifaceted and interdisciplinary approach, providing a deep and integrated analysis of the challenges and opportunities associated with implementing Open Government practices and developing open-source Artificial Intelligence technologies.

## 9. If applicable: What is the research finding that you are moving into practice?

> (500 words) | Only applies if you are proposing an implementation project

Not applicable, as we will not be implementing any final product or application. However, to achieve the research objectives, we will conduct some tests that may result in minor developments such as:

- Web scraping scripts
- Technology benchmarking scripts
- Functional test automation
- Small prototypes for proofs of concept

All code generated by the research will be published on GitHub or GitLab under a FLOSS license.

## 10. What is the specific context / project / community that will be targeted with your research or its implementation - and why is the intervention needed? *

> (500 words)

The specific context of this project centers on the convergence of Open Government, Digital Rights, open-source communities, and the development of Large Language Models (LLMs). The project directs its investigation towards:

1. **Open Government and Digital Rights**: We will focus on communities and organizations of activists working on these themes, who seek to promote transparency, citizen participation, and governmental accountability, and who advocate for open access to information and the protection of digital rights.
2. **Political, Regulatory, and Government Scenario in Brazil and Latin America**: The political and regulatory environment in Brazil and across Latin America, marked by its complexity and dynamism, will be a central point of our investigation, considering its impacts on open government practices and on the development and use of information and communication technologies.
3. **Free Software Community and Activists**: We will investigate the initiatives and perspectives of the free software community and digital activists, who play a crucial role in promoting open technologies and in advocating for a more equitable and inclusive digital ecosystem.
4. **International LLM Developers Community (such as Hugging Face)**: We will explore the contributions and knowledge of the international LLM developers community, seeking to understand and apply best practices and innovations in the field of large language models.

### Need for Intervention:

Intervention is necessary because Brazil and all of Latin America, with their unique and diverse context, present significant challenges and opportunities in implementing open government practices and in developing open and inclusive digital technologies.

The convergence of different communities and disciplines can foster a deeper and more holistic understanding of the challenges and opportunities present in this territory, contributing to the advancement of innovative and sustainable solutions in Open Government and Digital Rights.

The promotion of free software and integration with international LLM development communities can enrich the debate and practice in Brazil, enhancing the development of technologies that are ethical, transparent, and aligned with the principles of open government and digital rights.

Moreover, the intervention seeks to answer critical questions related to the regulation, social impacts, and effective implementation of AI technologies in governmental contexts, providing valuable insights and pragmatic strategies for promoting an open and democratic digital infrastructure.

## 11. Please summarize your proposed work and the key activities that you will undertake *

> (500 words)

The first activity will be the development of an executive project containing a detailed agenda of activities for each team member, each with its respective deliverable and quantitative and qualitative goals. The executive project will contain the following activities:

### Phase 1: Preparation and Planning (1 month)

- Detailed definition of the project, schedule, and resource allocation.
- Review and final adjustments to the research plan.
- Preparation of research instruments, such as questionnaires and interview scripts.

### Phase 2: Literature Review and Preliminary Research (2 months)

- Systematic review of related literature.
- Identification and analysis of existing studies, initiatives, and technologies.
- Definition of inclusion and exclusion criteria for studies.

### Phase 3: Data Collection and Research (2 months)

- Identification and analysis of relevant data sources.
- Development and application of forms, and interviews with experts in the field and government agents.
- Organization and preparation of the collected data for subsequent analysis.

### Phase 4: Computational Environment Development and Testing (2 months)

- Configuration and preparation of the computational environment for tests.
- Development, implementation, and validation of LLMs.
- Tests with the identified technologies.

### Phase 5: Data Analysis and Development of Preliminary Results (2 months)

- Detailed analysis of the collected data.
- Interpretation and synthesis of the test results.
- Preparation of preliminary results and development of insights for public debate.

### Phase 6: Public Debate and Results Review (1 month)

- Organization and hosting of at least one live stream or podcast featuring a public debate with experts.
- Collection of additional feedback and insights during the debate.
- Review and adjustment of the research results based on the feedback received.

### Phase 7: Final Report Development (1 month)

- Compilation and synthesis of the research results.
- Writing and formatting of the project's final report.
- Preparation of materials to disseminate the results.

### Phase 8: Accountability and Closing (1 month)

- Preparation and submission of the detailed accountability report.
- Organization of a closing event to present the project results.
- Submission of the final report.
- Final evaluation of the project and planning of next steps.

### Continuous Monitoring and Evaluation

Throughout the project, periodic monitoring and evaluation meetings will be held to ensure that activities are progressing as planned and to adjust the plan as necessary.

This plan is flexible and can be adjusted as the project's needs and requirements evolve over time.

## 12. What partnerships and programs are critical to this work and how do you envision outreach activities? *

> (400 words)

### CDR - Coalizão Direitos na Rede (Rights on the Network Coalition)

We will have the institutional support of the CDR (Rights on the Network Coalition), where the leader of this work, Thiago Paixão, is a member of the executive team as a consultant for digital security and infrastructure. This support will provide access to more than 50 academic and civil society organizations defending digital rights in Brazil, which will help support and inform the research and the public debate on the subject.

### IBEBrasil - Instituto Bem Estar Brasil (Well Being Brazil Institute)

We will also have the institutional support of the social organization IBEBrasil (Well Being Brazil Institute), which works mainly with Community Networks and the social regulation of telecommunications services in Brazil. Team members Aline Lima and Ely Oliveira are volunteer collaborating members there, and Thiago Paixão is a volunteer executive director.

### São Paulo City Hall

We intend to partner with the Secretariat of Institutional Relations of the São Paulo City Hall, the municipality's main channel in its partnership with the OGP through the Open Government Agents program, in which team members Thiago Paixão and Aline Lima have already participated in three editions, selected through public notice.

### OGP - Open Government Partnership

We will seek to partner with OGP itself, to bring the international scenario into the debate and to obtain recommendations of sources and experts who can contribute to the work.

### Academic Institutions

Lastly, to give the proposed work greater academic relevance and rigor, we will contact academic research groups with whom we have built relationships in previous work, aiming to receive guidance and additional contributions. Among the candidate academic institutions are:

- **USP / Poli / Gaesi** - Automation Research and Development Group of the Polytechnic School of the University of São Paulo, São Paulo / SP - Brazil
- **IFF** - Fluminense Federal Institute, Campos dos Goytacazes / RJ - Brazil
- **UENF** - State University of Northern Fluminense, Campos dos Goytacazes / RJ - Brazil
- **PUC-SP** - Pontifical Catholic University of São Paulo, São Paulo / SP - Brazil

### Dissemination

- **Latinoware - Latin American Free Technologies Event**
- **CONEPE - Congress of Teaching, Research, and Extension**
- **FIB - Brazilian Internet Forum**
- **IGF - Internet Governance Forum**

## 13. What is your vision of success and what impact might your project have? *

> (400 words)

### Vision for Success

Our vision for success is anchored in fostering an open and robust digital infrastructure through the development and application of open-source LLMs, aimed at democratizing and simplifying government information and making it transparent:

1. **Integrate Open-source LLMs**: Achieve effective integration of exclusively open-source LLMs to disseminate government information and data transparently and accessibly, thus strengthening open digital infrastructure.
2. **Promote Public and Collaborative Dialogue**: Organize public debates, such as podcasts, with the participation of experts, government representatives, and civil society, promoting a deep and constructive discussion about the importance of open digital infrastructure and open source.
3. **Establish Strategic and Multidisciplinary Alliances**: Form partnerships with academic institutions, civil society organizations, and government entities, aiming to create a collaborative and multidisciplinary ecosystem in favor of open digital infrastructure and the development of open technologies.
4. **Generate Practical and Innovative Knowledge**: Produce insights, frameworks, and practical guidelines based on open-source models that can be effectively implemented by different stakeholders interested in promoting a more open and inclusive digital infrastructure.
5. **Collaborate with Public Debate on AI Regulation**: Contribute to informed public discussion about the impacts of AI technologies and the need for their regulation, providing a deeper understanding of the benefits and challenges of implementing AI in open digital infrastructure.

### Potential Impact

1. **Empower Civil Society and Strengthen Democracy**: Boost open digital infrastructure and access to information, allowing more citizens to participate actively in democratic and decision-making processes.
2. **Promote Innovation and Sustainable Development**: Encourage research and development of open-source artificial intelligence technologies, catalyzing innovation and sustainability in the technological ecosystem.
3. **Leverage Digital and Social Inclusion**: Democratize access to information and knowledge in order to include traditionally marginalized groups, promoting equity and social justice through open digital infrastructure.
4. **Educate and Raise Awareness about Open Technologies**: Disseminate knowledge and promote educational debates about the advantages and challenges of open-source technologies and open digital infrastructure.
5. **Build Collaboration and Knowledge Networks**: Solidify and expand collaboration networks through the alliances formed, encouraging the exchange of experiences and the joint development of innovative solutions in favor of an open digital infrastructure.

Ultimately, the success and impact of this project will be measured not only by the tangible results obtained but also by its ability to inspire transformations, stimulate new initiatives, and foster a more equitable, open, and inclusive digital environment.

## 14. Tell us more about the project team and collaborators *

> (500 words) | (Links are appreciated)

### Thiago Paixão

Thiago Paixão is a founding member of the collective Nós Livres - Community Networks and Development of Free Solutions, and he volunteers as a director of the social organization IBEBrasil. He holds a degree in Software Development, and his specialization lies in developing embedded automotive systems and web and mobile applications. He served as a technical researcher for GAESI (Automation and IT Group of Poli-USP) via CNPq (National Council for Scientific and Technological Development, an entity linked to the Ministry of Science, Technology, and Innovations that encourages research in Brazil) at the Laboratory of Technologies and Open Protocols for Urban Mobility of the City of São Paulo – MobiLab, in partnership with SPTrans and CET. In this context, he participated in software development projects focused on urban mobility and smart cities.

Thiago is a fervent advocate for the Free Software Movement, with over 10 years of involvement in the Free Software Association – ASL. He has coordinated efforts for the Latin American Free Software Installation Festival – FLISOL in São Paulo and nationally. Additionally, he is a dedicated activist for privacy rights and a free internet within the Rights on the Network Coalition. He contributed to the commitment plan for Open Government in the City of São Paulo and was an Open Government Agent of the São Paulo Open program for two years, selected via public notice. He is also a speaker and contributor at various events, mainly on Free Software, such as FISL, Campus Party, FLISOL, and others.

### Aline Lima

Aline Lima is a social communicator and a student of Information Technology at Univesp - Virtual University of the State of São Paulo. With solid experience in community networks, Aline serves as a volunteer communication assistant at IBEBrasil, where she is actively involved in implementing Community Networks projects throughout Brazil, using her expertise to promote digital inclusion and the democratization of access to information.

Her experience in community networks also extends to international collaboration. She acts as a peer in projects of the APC - Association for Progressive Communications, where she contributes to collaborative community network initiatives, exploring the potential of information and communication technologies as tools for social progress.

Aline is a fervent advocate for gender equity in technology and free software, engaging in debates and studies about female participation in these communities.

### Ely Oliveira

Ely holds degrees in Business Administration and Economics, bringing an integrated and multifaceted view to his various fields of action.
He is a dedicated collaborator in community network projects at IBEBrasil, where his background in administration and economics provides valuable insights for the development and implementation of digital inclusion and information access initiatives.

Beyond his involvement with community networks, Ely is a dedicated independent researcher in macroeconomics and capitalist systems. He explores the complexities and nuances of economic systems and seeks to understand how macroeconomic dynamics interact with and affect contemporary social and political structures.

**Links:** https://hackmd.io/@noslivres/B1OJzEwea

## 15. What cost level do you expect this work to be at?

- [x] Between 50k and 75k dollars
- [ ] Between 75k and 100k dollars
- [ ] Between 100k and 125k dollars

> (Specific budgets, connected to the level provided, will be discussed after selection and are tied to the overall available budget and the selected portfolio)

## 16. How many months do you expect this work to take? *

- [ ] 6 months
- [x] 12 months
- [ ] more than 12 months (the exception is up to 18 months for part-time projects)

## 17. Can we share your concept note with other funders? *

- [x] yes
- [ ] no