---
tags: liber-dslib
---
# Landscape Analysis
## Introduction
Data science is typically viewed as an interdisciplinary field that employs computational methods to generate insights from data. Others view data science as a 'professional ecosystem' brought about by advances in statistics and computer science and by the abundance of data (Burton et al., 2018). A famous conceptualisation positions data science as a domain that combines substantive expertise, math and statistics knowledge, and hacking skills (Figure 1). For some, data science is equated with the use of machine learning algorithms. For others, data science is a term that replaces professional functions such as ‘statistician’, ‘analyst’, and ‘administrator’. In short, as with many novel terms, there is no single definition or understanding of the concept ‘data science’.

Figure 1: Data Science Venn Diagram. Source: <http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram>

This multitude of views on data science is also evident within library environments and in the discussions of our Working Group. Our members span different organisations, each with a somewhat different operational focus, so their interest in data science for libraries also takes different routes. For some, data science in libraries is geared towards researchers and goes hand-in-hand with a call for research libraries to engage with research data curation and to assist researchers with research data management. For others, data science in libraries is envisioned as a skillset to be embraced by library staff. This includes the automation of data-related routines within the library as well as the use of data science methods in the design and provision of library services. Finally, data science in libraries can be envisioned as a new take on the research management services that some libraries provide. Here the focus is on investigating research performance and Open Science metrics.
What counts as data in these four broad directions for data science in libraries?
* **Collections as Data**
By 'Collections as Data' we mean data science activities that facilitate the use of library collections in computationally-driven research and teaching. These can be activities that ensure that data are high-quality, rich in information, reliable, suitable for analysis, and easily accessible for computational interaction. Examples include data pipelines created to enhance the quality of data, and machine learning and computer vision techniques used to generate data, discover resources, and identify and extract rich metadata and full text from documents.
* **Library Intelligence**
By 'Library Intelligence' we mean data science activities geared towards the improvement of traditional library services and support for decision-making by library management. Examples are data-driven item suggestions for library patrons, the application of machine learning techniques in the management of library material flows, the use of library loan data analytics in collection management, and automated library analytics for day-to-day planning and annual reports.
* **Research Support**
By 'Research Support' we mean the use of data science to support researchers through the research lifecycle. In line with developments in open science and digital technologies/skills, libraries are expanding services and tools for researchers to leverage their production and use of data and information. This covers areas such as research data management, digital humanities, and (digital) information skills. Examples in research data management include data management planning, research data/software engineering, ensuring FAIRness (findability, accessibility, interoperability, reusability) of data, in addition to data curation and preservation. Examples in digital humanities include working with Linked Open Data and digital corpora. With respect to (digital) information skills, an example is the use of data science methods in (automated) systematic reviews of literature.
* **Research Intelligence**
Research intelligence (RI), like business intelligence, concerns compiling and visualising data to support decisions and benchmarking within the research community. Given the scale of the data available, RI often requires the implementation of data pipelines and dashboard tools. Examples of data collected are metadata of publications and other research outputs, and data related to these outputs, such as citations. An integral part of RI is also the continuous development of analysis workflows, for instance combining traditional citation metrics with alternative metrics such as policy citations.
This is not an exhaustive list of data science activities in libraries, yet it captures the main types as viewed by the members of the Data Science in Libraries Working Group.
**References**
1. Burton, M., Lyon, L., Erdmann, C., & Tijerina, B. (2018). Shifting to Data Savvy: The Future of Data Science In Libraries. Monograph, Pittsburgh, PA: University of Pittsburgh. Retrieved June 27, 2021, from <http://d-scholarship.pitt.edu/33891/>
2. Padilla, T., Allen, L., Frost, H., Potvin, S., Russey Roke, E., & Varner, S. (2019). Final Report -- Always Already Computational: Collections as Data. Zenodo. Retrieved June 27, 2021, from <https://doi.org/10.5281/zenodo.3152935>
### About the Working Group
The [LIBER Data Science in Libraries (DSLib) working group](https://libereurope.eu/working-group/liber-data-science-in-libraries-working-group/) explores and promotes library engagement in applying data science and analytical methods in libraries, taking into account all kinds of processes and workflows around library collections and metadata as well as digital infrastructures and service areas.
This working group operates as part of LIBER’s Strategic Direction on [Research Infrastructure](https://libereurope.eu/strategy/research-infrastructures), which in turn is one of the key pillars of our [2018-2022 Strategy](https://libereurope.eu/strategy/).
Libraries are data-rich environments and present vast opportunities for applying data science and analytical methods. These activities and initiatives emerge at the overlap between data and information sciences as well as disciplinary domains which build research questions based on digital collections.
We are already seeing investment in analytical capacity and skills, particularly in relation to mining and annotating collections, enriching (meta)data, monitoring scholarly communication, publication behaviour and related spending, providing infrastructure and support for text and data mining of library collections, and developing training opportunities for librarians and students.
#### Goals
Areas foreseen to be investigated by the working group may include but are not limited to:
- Identify and analyse key initiatives and projects across Europe and beyond;
- Explore how data science methods and tools can be tailored to library-specific environments and requirements;
- Identify good practices as well as challenges and gaps;
- Evaluate methods and strategies for skills and capacity development.
## Key Recommendations
- One page
## Data Science in Libraries
### Awareness
Synthesise existing surveys? For example, from the Digital Humanities WG and on AI. There may be no surveys directly regarding data science, but there is overlap with DH, AI, etc.
### Organization
How are data science activities/teams/services organised within libraries? Do we want to run a survey? If not, we can skip this part and rely on the Library Profiles, including for how this is organised.
### Funding
How are data science activities/teams/services funded within libraries?
### Collections
Discuss library collections fit for data science. Open up collections and make them machine-readable for AI.
### Data Science Activities & Use Cases
Split it up into the 4 sub-sections (under Introduction):
* integration of data science methods, pilots, discovery systems
* 'version' of collection sections
* DS in Libraries Cookbook with Use Cases? See AI Cookbook for Libraries: https://github.com/CENL-Network-Group-AI/Recipes/wiki/AI-Cookbook-for-Libraries
## Staff & Skills
(may be included in the form for Library profile?)
- Library Staff & Job Titles (might be hard to state clearly)
- Skills Gaps
- How Skills Are Acquired
- Types of Training Offered
- Capacity & Skill Development
- [Library Carpentry](https://librarycarpentry.org/lessons/)
- link to other training resources
## Partners / Network / Community
- Target Audiences & Research Areas
- Networking, Community Building & Collaboration
- Code4Lib [conference](https://code4lib.org/) and [journal](https://journal.code4lib.org/)
- DARIAH Bibliographic Data Science Working Group
- AI4Lib
## Impact & Future
Open Science, Citizen Science, reward and recognition: how do they tie in here?
- Raising Awareness
- Impact Evaluation, Infrastructures & Future
## Library Profiles
Template for library profiles? How do we select them: our own libraries, or good examples we know about? Find profiles within our sub-sections (under Introduction). Linda's idea: construct a matrix with the 4 sub-sections and the other dimensions (organisation, funding, etc.).
## Uncategorized
- Spaces (physical and digital)
- Ethical issues (may need to be addressed)