---
tags: liber-dslib
---
# WG Meeting Notes 2023
## 2023-12-13: WG Meeting #24
Agenda
1. News
2. Update about the LIBER Winter Event workshop
3. Update about the literature review
Participants
* Péter Király
* Jez Cope
* Angela Vorndran
* Annika Lindh
* Athanasia Salamoura
* Peter Verhaar
* Rosie Allison
### news
- Annika Lindh: publishers restrict the possibility of text/data mining on subscribed content
- Jez Cope: This is the main thing that's happened in my world in the last 2 months... https://blogs.bl.uk/living-knowledge/2023/11/cyber-incident.html
just published:
- Proceedings of the Workshop on Humanities-Centred Artificial Intelligence co-located with 46th German Conference on Artificial Intelligence, September 26 - September 29, 2023: Berlin, Germany (KI2023)
https://ceur-ws.org/Vol-3580/
- Proceedings of the Computational Humanities Research Conference 2023 (CHR 2023), Paris, France, December 6-8, 2023.
https://ceur-ws.org/Vol-3558/
- AI in Museums. Reflections, Perspectives and Applications
https://www.transcript-verlag.de/978-3-8376-6710-3/ai-in-museums/?c=311000020&number=978-3-8376-6710-3
- Maria Collins, Xiaoyan Song, and Sherri Schon: The Use of Python to Support Technical Services Work in Academic Libraries (Code4Lib Journal, Issue 58, 2023-12-04)
https://journal.code4lib.org/articles/17701
- LIBER and OCLC webinar - Here is the next event for OCLC: https://libereurope.eu/event/oclc-liber-building-for-the-future-facilitated-discussion-2-data-driven-decision-making/
Jez Cope: "Data-driven decisionmaking" rings alarm bells for me, data should be informing decisions not driving them...
Annika Lindh (IReL): I think it's the standard way to talk about using data to inform decisions rather than automating - but may be misleading if you're not in the data analytics space. Data-driven is usually used this way to indicate that the decision is based on data so I think the content will not be too controversial 😁 though transparency is definitely important
### LIBER Winter Event
- 150 attendees
- nice landscape near Firenze
- the workshop went well
- 30 ppl in the room
- presentation: https://drive.google.com/file/d/1-k-s_ihHx2HhpDNDTUCjXJRKixi-BHZN/view?usp=drive_link
- many questions: skillset, type of trainings
- lively discussion
- challenges
- 45 minutes about the hub
- Neha created a prototype of the learning hub: https://nehamoopen.github.io/liber-learning-hub/
- ppl can vote on the topics (from the topic guide section of the hub + new topics) - the resulting flash cards will be digitized
- request: data science summer school organised by LIBER
- next steps: collect the input, prioritize, begin developing a strategy to write the guides
### Literature review
We collect reviews here:
https://drive.google.com/drive/folders/1jWlqn0DwVPOrSfkVIsb4ns34HpUpSDm0
You should copy the template (https://docs.google.com/document/d/1EHoknj5xWDqckxAe8s-vQaM-UAywUpkw8Rz2dzS4hJo/edit#heading=h.6rurhqaa2fak) from that directory, then upload your file there.
Peter (Verhaar) assigned one publication from the Zotero bibliography to each member of the working group:
https://docs.google.com/document/d/1cKBWovMKuZQ1SmmTKXE_UrXlSMafhDBl6QfvIEN9krA/edit
Athanasia Salamoura: Can we also review articles already uploaded to Zotero if we find something interesting?
Péter: sure
- [ ] We will use tags to distinguish papers that are assigned, reviewed, or unassigned
- [ ] check if everybody can add/modify entries (Jez does not have that access right)
### Discussion of
Candela, G., Chambers, S., & Sherratt, T. (2023). An approach to assess the quality of Jupyter projects published by GLAM institutions. Journal of the Association for Information Science and Technology. Portico. https://doi.org/10.1002/asi.24835
### Next meeting
17th of January
## 2023-11-01: WG Meeting #23
Agenda
1. News
2. Update about the LIBER Winter Event workshop preparation
3. Update about the planned literature review
4. Discussion of
Candela, G., Chambers, S., & Sherratt, T. (2023). An approach to assess the quality of Jupyter projects published by GLAM institutions. Journal of the Association for Information Science and Technology. Portico. https://doi.org/10.1002/asi.24835
5. discussion about the _modus operandi_ of this event
Participants
* Neha Moopen
* Péter Király
* Peter Verhaar
* Camilla Lindelöw
* Niamh Malin
* Arben Hajra
* Asimina Vlachaki
* Athanasia Salamoura
### 1. News
* Wikimedia and Libraries User Group: AI Salon with Andrew Lih and Richard Knipel. Hour-long conversations on Google Meet to discuss artificial intelligence's potential threats and long-term implications for the Wikimedia ecosystem. Each AI Salon will be introduced by leading experts and directed toward the practical interests of librarians.
* September
https://meta.wikimedia.org/wiki/Wikimedia_and_Libraries_User_Group/Salons/2023/September
recording: https://www.youtube.com/watch?v=1NY87Tazen4
* October: AI Salon with Silvia Gutiérrez. We held our second AI Salon at 12 p.m. EST on Thursday, October 27. Our guest speaker this month is Silvia Eunice Gutiérrez De la Torre, Senior Program Officer, Libraries, Wikimedia Foundation.
https://meta.wikimedia.org/wiki/Wikimedia_and_Libraries_User_Group/Salons/2023/October
recording: https://www.youtube.com/watch?v=mphVbFvdZKU
* Just published: Alkemade, H., Claeyssens, S., Colavizza, G., Freire, N., Lehmann, J., Neudecker, C., Osti, G., & van Strien, D. (2023). Datasheets for Digital Cultural Heritage Datasets. Journal of Open Humanities Data, 9:17, pp. 1–11. DOI: https://doi.org/10.5334/johd.124
* Padilla, T., Scates Kettler, H., Varner, S., & Shorish, Y. (2023). Vancouver Statement on Collections as Data. Zenodo. https://doi.org/10.5281/zen
Peter: Leiden University & Elsevier Symposium on Digital Sovereignty (29th of November 2023): https://www.library.universiteitleiden.nl/news/2023/09/leiden-symposium-on-digital-sovereignty
Arben: OpenAlex free how-to webinars are on the second Thursday of each month at 12pm ET. https://openalex.org/webinars
* November 9: International Intelligence with OpenAlex
* December 14: Visualised research analytics with OpenAlex and VOSviewer
Peter: biweekly Lunch Time Seminar: https://www.universiteitleiden.nl/en/artificial-intelligence/sails-programme/past-events-archive/lunch-time-seminars#
### 2. Update about LIBER Winter Event workshop
Peter:
The Winter Event website: https://libereurope.eu/eventscalendar/liber-winter-event-2023-in-florence-italy/
Our documents (in progress):
* https://docs.google.com/document/d/1vSIi5WMi7icP6Byeju9UP8tmrnnASXxACOImeRr3wYU/edit#heading=h.q2tkcabgv2bj
* https://docs.google.com/document/d/1FagTZNCULkkPTy3rdclFkMeXnnQGS8Zrhm2mvCgyJcU/edit#heading=h.q2tkcabgv2bj
Neha: sneak peek at the learning hub website:
https://nehamoopen.github.io/liber-learning-hub/
the individual chapters will follow the topic list from one of the google docs above.
the chapters will not be totally new (educational) materials but curate what is already available...links, recommendations, learning pathways
Arben: https://openeconomics.zbw.eu/en/knowledgebase/introduction-to-open-data/?cat=90
Neha: A really nice resource, Arben. I think it could be featured in the eventual RDM 'chapter' and maybe in a domain-specific subsection if that fits.
Neha: in short, use the workshop to:
- get input on the topic list
### 3. Update about the planned literature review
- methods/tools
- challenges
Neha: a good landscape analysis:
Europe's Digital Humanities Landscape: A Study From LIBER's Digital Humanities & Digital Cultural Heritage Working Group
https://zenodo.org/records/3247286#.XQylxI9cI2w
Asimina: we would like to participate, it would be a beneficial contribution to the libraries
Neha: So maybe some literature could be shared/assigned among WG members and we should all summarize it according to some criteria by a certain date?
Asimina: I participated in the AI in libraries webinar. Limitations: copyright, ethics. The recordings: https://www.youtube.com/watch?v=xrgXFdgnnvA.
Neha: If we want to do our literature review with AI, we have a tool developed at Utrecht University for that: https://asreview.nl/
I have not used it yet, so I don't know how it works out. But I've always wanted a project to try it out with haha.
Arben: Yes Neha, this can be a good approach.
So in the beginning, we select a collection of publications and store them somewhere (eg Zotero).
Then each of us reads one/two and provides a summary based on the defined criteria.
Our Zotero group: https://www.zotero.org/groups/4344603/
Action: we will discuss it again in the next meeting.
### 5. modus operandi
Neha: what helps is really working together. Common goals
Camilla: the news/paper discussion would be an ice breaker
## 2023-10-04: WG Meeting #22
Agenda
1. member changes
2. the submission and acceptance of the Winter Event workshop
3. discussing the paper (Kent Fitch, “Searching for Meaning Rather Than Keywords and Returning Answers Rather Than Links” https://journal.code4lib.org/articles/17443)
participants
* Arben Hajra
* Asimina Vlachaki
* Birgit Schmidt
* Camilla Lindelöw
* Jez Cope
* Neha Moopen
* Simone Cocchi
* Péter Király
### member changes
Simone is a new member
XX leaving
### Update on the submission and acceptance of the Winter Event workshop
LIBER conference, discussion with DSDCH WG, they are creating a training hub
At winter event: work session on the details of such a hub, e.g. who could be the target audience, how to annotate the content
Just received the notification that the workshop was accepted. Peter Verhaar will be there. Let us know if you are planning to be there as well.
Planning meeting with DSDCH WG on 18 Oct - anyone interested to join? Neha happy to help.
Documents - add links
### discussing the paper
Fitch, K. (2023). Searching for Meaning Rather Than Keywords and Returning Answers Rather Than Links. The Code4Lib Journal, 57. https://journal.code4lib.org/articles/17443
Jez Cope: This reminds me of a different approach taken by the people at CORE: https://blog.core.ac.uk/2023/03/17/core-gpt-combining-open-access-research-and-ai-for-credible-trustworthy-question-answering/
https://doi.org/10.1007/978-3-031-43849-3_13
That "CORE-GPT" approach is interesting because it's quite naive. I dispute its claims of "trustworthiness" though...
Arben: the usage of vocabularies improves the result list quality
Suggestions for next time:
- Candela, G., Chambers, S., & Sherratt, T. (2023). An approach to assess the quality of Jupyter projects published by GLAM institutions. Journal of the Association for Information Science and Technology. Portico. https://doi.org/10.1002/asi.24835
- Datasheets in cultural heritage (Birgit)
### closing notes
Neha: All Zotero invites have been accepted, sorry for the delay! Peter Kiraly also has admin role now - just in case :)
## 2023-09-06: WG Meeting #21
Agenda
1. Welcome to new members
2. Review of the Data Science workshop during the LIBER Conference
3. Collaboration with the Digital Scholarship and Digital Cultural Heritage Collections Working Group and Call for Proposals for the LIBER Winter Event
4. Landscape study: next steps
5. Paper suggestions
### participants
- Annika Lindh (IReL)
- Arben Hajra
- Asimina Vlachaki
- Birgit Schmidt
- Niamh Malin (She/Her)
- Peter Kiraly
- Peter Verhaar
- Simone Cocchi - UNIMORE
### new members
- Annika Lindh (IReL): data cleaning takes lots of time. Background: research and sw development, mainly in Python. IReL is an Irish
- Asimina Vlachaki: a librarian from Athens (Nat. Kapodistrian Univ.). Cataloger and data producer
- Niamh Malin: data analyst (Cambridge). Open access
### 2. Review of LIBER conference
* was planned as a workshop but ended up in v large auditorium w a small audience, tried to make it interactive through questions
* findings from the survey were presented
* notes: https://bit.ly/dslib-2023-notes
* slides: https://bit.ly/dslib-2023
* Second half was on the Digital Scholarship and Digital Cultural Heritage Learning Hub (Minimal Viable Product), w personas etc.
### 3. Collaboration with DSDCH WG and Call for Winter Event
not clear who will attend from the DSDCH WG
LIBER Winter Event, 23-24 Nov
workshop on creating a learning hub
e.g. cookbook on DS in libraries, based on different personas, different needs
DARIAH Teach, Programming Historian, Carpentries, etc.
Q: What is the focus, content or a platform?
A: on the content, learning paths, etc.
Competences of librarians
Input welcome, how do you engage in training/teaching
### 4. Landscape Study: next steps
We had the survey, which produced useful results. The next step is to move forward and develop a report.
Literature review - Zotero library, https://www.zotero.org/groups/4344603/liber_dslib
(TODO: ask Neha to add members/change ownership)
Planning:
https://docs.google.com/document/d/1LU1N8grFc3shthStAsoPSHIhmcjSvZJG7Ca9KQwxaKE/edit#
Outline
https://docs.google.com/document/d/1_ZlIO2hS7tLSpa1nL-AhDT952miON495T32t0vENDQo/edit#
Suggestion:
- discuss the literature review during the WG meeting in November
- discuss draft writeup of the report in October
Participants for creating the first draft:
- Peter (lead)
- Péter
- Arben
### 5. Suggested paper for the next meeting
Erin Wolfe, "ChronoNLP: Exploration and Analysis of Chronological Textual Corpora"
https://journal.code4lib.org/articles/17502
Kent Fitch, "Searching for Meaning Rather Than Keywords and Returning Answers Rather Than Links"
https://journal.code4lib.org/articles/17443
selected for discussion in Oct
## 2023-06-07: WG Meeting #20
Agenda
1. Welcome to Nicola (de Bellis)
2. New date for working group meeting
3. State of the survey, workshop at LIBER 2023 in Budapest
4. Exploration possibilities. Automated generation of metadata (KB report)
5. Cooperation with Research Data Management WG and OCLC
6. Cooperation with Digital Scholarship & Digital Cultural Heritage (DSDCH) WG on a training hub
7. Paper suggestion
### Participants
* Peter Kiraly
* Jez Cope
* Nicola De Bellis
* Camilla Lindelöw
* Arben Hajra
* Athanasia Salamoura
* Matthijs
* Rosie Allison
* Michael Hertig
### 3a. State of the survey
A preliminary collection of survey answers (50) was presented.
### 3b. LIBER 2023
The workshop will be held jointly with the Digital Scholarship & Digital Cultural Heritage (DSDCH) WG. Here is the updated programme:
https://liberconference.eu/pre-conference-workshop-1-introducing-a-collaborative-and-sustainable-training-hub-for-supporting-liber-member-skills-development-in-digital-scholarship-digital-cultural-heritage-and-data-science/
### 4. discussion on Exploration possibilities paper
Arben: Annif in ZBW:
https://github.com/NatLibFi/Annif-tutorial/blob/master/data-sets/stw-zbw/README.md
Nicola De Bellis:
this is an example of the works I was referring to: https://dl.acm.org/doi/abs/10.1145/3581754.3584126
### 5. Minutes of the meeting organised by OCLC
23 May 2023
Meeting with LIBER groups
Agenda:
Attending: Rachel, Rebecca, Titia, Mercy, Renee, Gemma, Barbara, Rosie
Attending: Steering committee State of the Art Services:
Data Science in Libraries Working Group
* Péter Király, Göttingen, researcher and software developer
Research Data Management Working Group
* Mari Elisa Kuusniemi (MEK). University of Helsinki
* Liisi Lembinen, University of Tartu, Estonia
1. Round of introductions
2. Introduction of the OCLC-LIBER Engagement Program, “Building for the Future” (Titia). Titia shared the OCLC-LIBER Open Science Discussion Series.
3. Brief overview of the 2023-24 plan focusing on “state of the art services” (Rebecca)
a. Opening plenary (virtual, 12 October)
b. Interactive session: LIBER Winter Conference (Nov 23 or 24)
c. Interactive session: OCLC EMEA Regional Council (in person, TBA, around March 2024)
d. Interactive session: virtual (late spring 2024)
e. Closing plenary (virtual, June 2024)
4. Discussion on the interactive sessions. What angles/themes would you like to explore on these topics?
a. Ethical implications of AI. Péter says their WG has discussed the Responsible Operations report. He thought the paper was good, but it’s a hard task. One of the suggestions was to create task forces at our own institutions, and Péter thinks this is hard. He thinks it could be nice to talk on the practical side–[this sounds like a #socialinteroperability issue]. For instance, there are a lot of discussions today re: ChatGPT, but how do we discuss this–on FB, Twitter as individuals? Or should the library create a report. There’s probably also a skills component here, too. What are the ethical skills here? MEK says there’s a CS unit that has organized a MOOC about AI, and it’s a 3-5 credit class. The library encourages librarians to take this course, so they know basic things. It is affecting how they think about indexing. Translation is now easy with these tools. It was a huge thing for their library to give that time to everyone to take that course. She was initially sceptical about why librarians should know about this. But not so much now. There are some concrete results, but some are coming in the future.
b. Data science in libraries. Péter says their WG is doing a survey re: landscape analysis of data science in libraries. They have 4 subtopics:
i. improving services,
ii. improving internal workflows and services (from the librarian perspective),
iii. staff and trainings, and
iv. research intelligence (scientometrics, research analytics, etc.)
c. Research data management. Liisi says they are discussing re: the skills, what things are required. They started to map the skills last year, re: what skills librarians have. See the report on their Winter Event Workshop: https://libereurope.eu/wp-content/uploads/2023/04/Report-on-Winter-Event-Workshop_PDF-1.pdf
There is also an interest in what curricula should include re: skills development. Looking at both technical skills and soft skills.
MEK mentions the data curation webinar: https://libereurope.eu/event/research-data-curation-what-libraries-can-offer-rdm-working-group-workshop/
She says they are talking about how they organize RDM services. Different kinds of service models, roles of libraries in universities. Is there a coordinator, service provider around RDM. The library might just provide consulting services, or perhaps they also provide data repositories, etc. Some libraries have been providing these services for years but others haven’t. Data curation is a new effort for a lot of libraries. How to link data science and RDM. That might be an interesting topic. They need to provide new services all the time but where do they find the trained people?
The group noted that because the RDM WG is already organizing webinars on certain topics - the LIBER-OCLC series need to be complementary to these webinars and not overlap.
MEK says that “state of the art services” can mean different things to different countries.
Liisi asks re: the audience. Titia responds that our approach is that these sessions are open to anyone who wants to come to the table. Whoever shows up defines the kind of discussion we can have. And that is OK. As long as the discussion was meaningful to those who attended.
MEK: it’s a challenge to discuss RDM topics to a general audience
Rebecca: the hope is that participants are already really familiar with the topic; open but it should attract practitioners who have some familiarity already. A takeaway here might be that we might want to write a “who should attend” statement for these?
Péter adds a suggestion for the RDM session: it might be interesting to explore the different RDM repository/tech stacks, the costs and the organizational structures. Whether it is self service, curated, with data stewards, and what is the organization of data stewardship. Workshop at the LIBER meeting in July on how to set up a repository. It’s not so much about the costs [RB look at this]. https://liberconference.eu/pre-conference-workshop-7-how-to-set-up-data-repository-services-in-a-university/
Titia asks Péter, Liisi and MEK if they can share the list of workshops they are planning for the next year, so we can coordinate – and not overlap.
5. Plenary sessions: are there panelist(s) you would recommend for the opening plenary?
Rebecca mentions that we could invite Thomas Padilla and maybe an OCLC colleague with ML-experience working with duplicate detection in WorldCat.
Péter suggests Herbert van de Sompel, who gave a recent presentation on the future of scholarly communications. Living in Vienna now.
Barbara will identify someone from LIBER to talk at the opening session, introducing the theme and speak a bit about “state of the art” services.
Barbara will share about this meeting with others who were unable to attend. We may get some additional comments from the steering committees.
Want to know more? Look at the program content plan.
### 6. Cooperation with Digital Scholarship & Digital Cultural Heritage (DSDCH) WG on a training hub
Purpose of the service
What need will a Digital Scholarship and Data Science in Libraries training material hub fill? Why is it worth creating a new hub for LIBER members, rather than simply relying on existing training hubs and networks elsewhere?
The exponential expansion of digital collections, and the computational methods and tools to interact with them, presents an opportunity for research library professionals to not only support digital scholars on increasingly more complex computationally driven research, but also to apply such methods in the care and curation of heritage collections. Capacity building in digital scholarship methods, including data science, is unevenly distributed across this community however, and it is this which the hub ultimately aims to redress. Though training for the sector around the use of data science and computational methods in the library context has grown considerably within the last decade [citation], fifty-eight percent of LIBER Member respondents still noted as recently as 2019 ‘technical knowledge - such as coding or tool expertise’ as the primary deficit in their environments in Europe's Digital Humanities Landscape: A Study From LIBER's Digital Humanities & Digital Cultural Heritage Working Group.
The provision of these valuable resources remains fragmented online and across institutions at a local, national and international level. This leads to considerable inefficiency and inconsistent skills acquisition across LIBER member institutions and sustains an imbalance between the wide-spread ambitions of scholars to undertake computational research with library and heritage collections, and a paucity of institutions with capacity to enable these. Equally, it impedes research libraries and cultural institutions from benefiting from the digital transformations computational methods can offer.
In addressing some of these issues, a new Digital Scholarship & Data Science in Libraries Learning HQ will:
* present a central destination for newcomers to gain a better understanding and gentle introduction to digital scholarship and data science practice in libraries;
* provide a holistic overview of the latest developments and applications of digital scholarship and data science in libraries;
* enable library staff, in LIBER membership and beyond, to explore and immediately access a wealth of self and group study educational resources available to them through faceted search and discovery;
* offer a centralised, curated and trusted home for educators to contribute, and have reused, their high-quality training materials;
* foster a culture of continuous learning and professional development and facilitate the development of localised training opportunities in LIBER institutions by providing guidance and training materials suitable for self and group study;
* raise awareness of local, national and international networks, and associated training events, relevant to the field;
* furnish current reports and competency skills frameworks pertaining to digital scholarship and data science in libraries;
* provide a platform for holistically assessing, measuring and promoting digital scholarship and data science in libraries.
### 7. suggested paper for the next meeting:
Candela et al., "A Checklist to Publish Collections as Data in GLAM Institutions"
https://doi.org/10.48550/arXiv.2304.02603
### 8. others
- should we have a meeting in July?
## 2023-04-26: WG Meeting #19
Agenda
1. Welcome to Erika
2. New date for working group meeting
3. Plan for Landscape study
4. Workshop at LIBER 2023 in Budapest
5. Topic for the next meeting of the Working Group in May
**Participants**
- Peter Verhaar
- Péter Kiraly
- Erika Kurucz
- Michael Hertig
- Arben Hajra
**Notes from meeting**
2. Matthijs's suggestion:
- the first Wednesday of the month; or
- the last Friday of the month
Peter: Wednesday would be the preference
Michael: Wednesday would be the preference
Erika: Wednesday would be the preference
Péter: no preference
Peter: move to 3pm
3. Plan for Landscape study
Provisional plan:
https://docs.google.com/document/d/1LU1N8grFc3shthStAsoPSHIhmcjSvZJG7Ca9KQwxaKE/edit#
Outline:
https://docs.google.com/document/d/1_ZlIO2hS7tLSpa1nL-AhDT952miON495T32t0vENDQo/edit#
Potential action: Add the invitation to fill out the survey to the LIBER INSIDER (the LIBER monthly newsletter)
Highlight: the situation in Europe
4. Workshop at LIBER 2023 in Budapest
See provisional programme:
https://docs.google.com/document/d/1izOFeMN1EiVzsa-8P75o6BrsvMPyNtNMVb_KV8W6uyU/edit?pli=1
Main questions:
* Which working group members will attend the session?
* Who wants to chair the session?
* Who wants to be involved in the preparation of this workshop?
5. Literature suggestions:
Kleppe et al., "Exploration possibilities. Automated Generation of Metadata."
https://zenodo.org/record/3375192#.ZEjzGHZBzt0
Candela et al., "A Checklist to Publish Collections as Data in GLAM Institutions"
https://doi.org/10.48550/arXiv.2304.02603
## 2023-03-29: WG Meeting #18
Agenda
- Peter Verhaar's introduction on Padilla's Responsible Operations
- discussion about the report
**Participants**
- Peter Verhaar
- Arben Hajra
- Athanasia Salamoura
- Kiera
- Kirsten Krogh Kruuse
- Michael Hertig
- Neha
- Rosie
- Péter Király
**Notes from meeting**
- Neha: "I'm on the move and I can't use the mic, but I notice this report primarily involves contributors from the US. It would be interesting to see where the landscape is the same in the EU and where it is not, since there are some stricter regulations like the GDPR and possibly others coming in. In terms of what our working group can do, I think we want to come out with a landscape analysis and we could incorporate some of the recommendations in our report."
- Arben: a) staffing: do we need new people with a different mindset? b) algorithmic methods: we need a clear guide on how to handle these methods. We should be aware of the biases.
- Kirsten Krogh Kruuse: Some of our PhD students use AI tools to find literature for their studies. The development in this field happens so quickly that it is difficult for us in the library to keep up with it.
- Neha: There is an AI in Libraries Cookbook: https://github.com/CENL-Network-Group-AI/Recipes but they are not very active, so we could pick it up.
- Athanasia: There are two books that I think you may already know, called "Practical Data Science for Information Professionals" and "Data Science in the Library: tools and strategies for supporting data-driven research...". Just mentioning them for anyone interested!
- Peter: https://unloch.github.io/lod/
- Kirsten: the AI tools:
- CiteSpace
- Citation Chaser
- Citation Tree
- Connected Papers
- Elicit
- Inciteful
- Iris.ai
- Keenious
- LibKey
- Litmaps
- Local Citation Network
- OpenAlex
- Pure Suggest
- ResearchRabbit
- Scholarcy
- Semantic Scholar
- Scite
- VOSviewer
- Yewno Discovery
- Peter: https://researchsoftware.pubpub.org/infrastructures - it has some links to tools interesting in this context
## 2023-03-01: WG Meeting #17
Agenda
- Is there any news relevant to our topic? Published papers, reports, events, submission deadlines?
- AI in libraries: should we have a dedicated slot/subgroup/new WG?
- Peter Verhaar's presentation
- Who would like to present next?
- paper selection
- anything else
**Notes from meeting**
**Participants**
- Peter Kiraly
- Peter Verhaar
- Angela Vorndran
- Birgit Schmidt
- Michael Hertig
- Arben Hajra
- Athanasia Salamoura
**Relevant events and news**
- SWIB (Semantic Web in Libraries): 11 – 13 September 2023, Berlin http://swib.org/swib23/
- LIBER conference. The review process is almost finished, notification will happen next week.
- IFLA conference, Rotterdam, August 2023 - call for papers
- [Göttingen Open Science meetup on 15 March: AI and Open Science](https://pad.gwdg.de/OpenScienceGOE20230315#). There is still room for short presentations if anyone is interested.
**Survey**
respondents so far:
- University of Pécs Library and Knowledge Centre (Hungary)
- Leiden University Libraries (Netherlands)
- AU Library, Royal Danish Library (Denmark)
- Iowa State University (USA)
**AI in libraries**
- https://ai4lam.org Artificial Intelligence for Libraries, Archives & Museums
- Your opinion on the relationship of DS and AI
- Should we have a separate AI WG?
AI as a subset of DS vs. DS as the more basic level and AI as the more advanced
AI4LAM - monthly community calls, documents from the WG
AI literacy - relevant for other WGs as well (e.g. Digital Heritage, Ethics & values) perhaps add this as a discussion point to the LIBER conference workshop
**Peter Verhaar's presentation**
Centre for Digital Scholarship at Leiden UL, provide hands-on support for DS
- c 35K students
- training, consultancy, services
- open access, open science
- text and data mining, data science
- research software engineering
- collaborating in research projects
- mediating Islam in the digital age (MIDA), EC-funded project
- help PhD students: Twitter API - downloading relevant tweets (Charlie Hebdo) - machine learning -> classify the tweets. 7 views, labelled 700 tweets manually
- Turkish TV show (Payitaht) analysis. Focus on symbols that have a strong political meaning, developed a model of classification via comp vision techniques. In collaboration w Dutch eScience Center.
- Project on war of independence in Indonesia.
- letters, memoirs, diaries -> transcribed
- collocations, word types, word embeddings
- comparing Dutch and Indonesian soldiers' writings
- IIIF - support for teachers that would like to use images in their classes
hieratic text + structured annotations
online lessons
- FAIR data: support to researchers to publish their data as FAIR data
- FAIRify the research data of a book historian (word documents, Excel sheets) -> searchable database. Data modelling, Python code to convert documents, identifiers (OpenStreetMap API, Wikidata) -> RDF triples
- connecting data with the Short Title Catalogue (STCN)
- Challenges
- capacity planning and prioritization
- keeping knowledge up to date (e.g. working with BERT/Large Language Model)
- involvement of researchers (they often lack the technical or statistical knowledge, client-supplier model not ideal, better when they are directly involved)
Your views on how to address these challenges?
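
The FAIRification example mentioned above (Python code that converts documents, attaches external identifiers, and emits RDF triples) could look roughly like the sketch below. This is only an illustration and not the code used at Leiden: the base URI, the choice of Dublin Core properties, the field names of the row, and the Wikidata ID are all assumptions made up for the example.

```python
# Illustrative sketch only: namespaces, field names and the sample row are
# assumptions, not taken from the project described in the notes.
BASE = "https://example.org/book/"          # hypothetical base URI
DCT = "http://purl.org/dc/terms/"           # Dublin Core terms namespace
WD = "http://www.wikidata.org/entity/"      # Wikidata entity namespace


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def record_to_ntriples(record: dict) -> list[str]:
    """Turn one spreadsheet-like row into a list of N-Triples lines."""
    subject = f"<{BASE}{record['id']}>"
    title = escape_literal(record["title"])
    lines = [f'{subject} <{DCT}title> "{title}" .']
    # Link the place of publication to an external identifier when one is known.
    place_id = record.get("place_wikidata_id")
    if place_id:
        lines.append(f"{subject} <{DCT}spatial> <{WD}{place_id}> .")
    return lines


# "Q12345" is a placeholder ID, not a verified match for any place.
row = {"id": "1", "title": 'A "sample" title', "place_wikidata_id": "Q12345"}
triples = record_to_ntriples(row)
print("\n".join(triples))
```

In practice a library such as rdflib would probably handle serialization and escaping; the plain-string version is used here only to keep the sketch free of dependencies.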
Python courses
- introductory courses org 3 times per year (https://cdsleiden.github.io/python-tutorial/Welcome.html)
- advanced DS courses offered 2 times per year
- tutorials in the form of Jupyter notebooks
The Carpentries
- DC, programming in R
- LC, Python and Git
- 3 certified trainers
- typically c 30 participants (students)
Interested to hear about similar courses, sharing learning materials
Learning materials about LOD
tog w CLARIAH
- LOD tutorial
- GLAM workbenches (Jupyter notebooks)
e.g. based on Europeana, Wikidata
Collections as data
- Padilla, report
- T. Tasovac, DARIAH position paper
- e.g. KB LAB, Data Foundry (National Library of Scotland)
Interest to work collectively on notebooks
Scoping document
make own collections available as data
- type of data
- methods for providing access (APIs, LOD, bulk downloads)
- methods for stimulating reuse (e.g. GLAM workbenches)
- what is needed to stimulate the use of our data as training material for ML
- Digital Heritage Reference Document (?)
Topics for discussion
- How to org the support for researchers
- can we coll in some way on org of DS workshops/training
- what is needed to publish coll as data
Cmts/questions
Princeton, Center for DH, similar model
Berkeley Inst for DS
as much effort goes into the project as the researcher invests, ie 50:50
from which areas?
PV: mostly from humanities and social sciences
12 employees at CDS, incl 2 data stewards, 1 legal/copyright, 2 OA, 2 research software engineers, 1 management of research software
Q: courses w credits?
PV: at the library but do not have the authority to assign credits
certificates of attendance
Centre for DH, part of the faculty
have been asked to embed into a linguistics course
**Who would like to present next?**
**paper selection**
- Péter's suggestion: Kleppe et al., “Exploration possibilities. Automated Generation of Metadata.”
## 2023-01-25: WG Meeting #16
- **introductions**: we have new members, some members left the group
- **update profiles on LIBER website**
- **co-chairship**: Neha stepped down as a co-chair, we are looking for volunteers for co-chairing
- **survey**:
- the survey is out, we should start filling it out and promote it (https://survey.uu.nl/jfe/form/SV_eswlDMEJuaf9nGS)
- the wg members should at least complete the survey to start
- discuss other promotion
- **LIBER 2023 conference (Budapest, July 5-7)**: who is planning to go? We have a deadline to propose a workshop e.g. about the results of the survey. Should we submit a proposal?
- **Data Science showcase**: we started a showcase to present on data science in our libraries. Who would like to be the next presenter?
- **journal club**: Do you have a suggestion for paper to read and discuss?
Join our Zotero library via: https://www.zotero.org/groups/4344603/liber_dslib/
**Attendees**
* Peter
* Neha
* Angela
* Arben
* Jez
* Matthijs
* Rosie
* Kirsten
* Peter V
* Joe
* Kiera
* Athanasia
* Birgit
**Notes from meeting**
- Round of (re)introductions
- Check your info on [DSLib WG website](https://libereurope.eu/working-group/liber-data-science-in-libraries-working-group/members/)
- Co-chairing: thank you to Neha, any volunteers?
offers: Athanasia, Peter
Peter will arrange a meeting.
- Survey, open until 17 April 2023
Simplified version based on first draft which we tested among WG members, not easy to fill because the info is often distributed across the institution.
So far: 3 responses (Leiden University Library; AU Library, Royal Danish Library; Iowa State University)!
- Promotion of the survey: so far only at the LIBER winter event and via the LIBER newsletter
- Ideas for dissemination channels (please add):
- Twitter (see e.g. https://twitter.com/GottingeneRA/status/1603357309845094400 in German)
- Mastodon
- UKB (network of Dutch university libraries) Digital Scholarship Working Group + RDM working group maybe
- RESEARCH-DATAMAN@JISCMAIL.AC.UK
- German community: forschungsdaten@listserv.dfn.de, inetbib@ub.uni-dortmund.de, DINI
- Future LIBER newsletters (obviously :)
- Workshop at LIBER2023
- who will be there?
- Athanasia, Kirsten, Birgit, Peter (for some tbc)
- possible topics: first survey outcomes
- Short presentation / showcase
- Peter V volunteers
- Suggestions for reading a paper
- Suggestions welcome
- Join the Zotero library
- Topic suggestion: ML and machine-analysis in automatic text and image generation, how does this affect library operations, ethics, modalities
- Tools for the WG
- [Hackmd start page](https://hackmd.io/@nehamoopen/liber-dslib/https%3A%2F%2Fhackmd.io%2F%40nehamoopen%2FHyFm4Prn_)
- [Gitlab repo](https://gitlab.com/nehamoopen/liber-dslib) - all our files
- Zotero library