# 2021-05-06 DataONE Community Call
[![hackmd-github-sync-badge](https://hackmd.io/dpQ2gMlWTRGXzUTtnAfrjg/badge)](https://hackmd.io/dpQ2gMlWTRGXzUTtnAfrjg)
**Topic:** Data Licensing - a Discussion of Alternative Licenses
**Time:**
```
UTC: 2021-05-06, 17:00
Other time zones:
Auckland (New Zealand): 05:00 (05-07)
Sydney (Australia): 03:00 (05-07)
Shanghai (China): 01:00 (05-07)
Kolkata (India): 22:30 (05-06)
Nairobi (Kenya): 20:00 (05-06)
London (Great Britain): 18:00 (05-06)
Eastern (US): 13:00 (05-06)
Central (US): 12:00 (05-06)
Mountain (US): 11:00 (05-06)
Pacific Time (US): 10:00 (05-06)
Alaska Time (US): 09:00 (05-06)
Tahiti (Pacific): 07:00 (05-06)
```
**Description:**
The selection of licenses for data and metadata hosted in repositories, shared by researchers, and reused by end users necessarily reflects an attempt to balance a variety of interests. A sample of these interests includes researcher interest in receiving proper attribution for their work in creating data, sponsor interests in maximizing the impact of the support they provide, repository interests in having the necessary rights to manage and provide access to data and metadata they host, and end user interests in being able to use the data they find with minimal restrictions and requirements. This DataONE Community Call will bring together the community and a group of invited participants to discuss the lessons learned and decisions made in selecting data licenses for use in a variety of contexts.
**Invited Introductory Speakers**
* Marty Downs - Long Term Ecological Research (LTER) Network Office
* Katie Fortney - California Digital Library
**Participants**
Peak attendance was 35; some people arrived late and some left early.
Please provide your name and affiliation
- Marty Downs, LTER Network
- Katie Fortney, California Digital Library
- Amber Budden, DataONE
- Matt Jones, DataONE and Arctic Data Center, NCEAS
- Kevin Ashley, Digital Curation Centre (UK)
- Mara Sedlins, Colorado State University Libraries
- Adam Rountrey, U Mich
- Adrienne Canino
- Annie Simpson, U.S. Geological Survey
- Bobby Candey
- Brian Westra
- Carrie Iwema
- Dave Vieglais, University of Kansas / DataONE
- Donna Scott
- Ge Peng
- Greg Maurer, JRN LTER
- H. K. "Rama" Ramapriyan, SSAI & NASA GSFC
- Jason Burton
- Varsha Khodiyar, Springer Nature
- Katie Mika, Harvard Library
- Susan Borda, UMich Library
- Renée F. Brown, McMurdo Dry Valleys LTER
- Jennifer Gonçalves Amato, SciELO (BR)
- Wendy Kozlowski, Cornell University Library
- ...
**High-Level Agenda**
1. Brief Introduction, Logistics (5-min, Amber)
2. Introductory Stage-Setting Presentations from Marty Downs and Katie Fortney (15-min)
3. Discussion, Q&A
**Shared Notes**
- Katie's notes and slides: https://bit.ly/data1kf
- Data Citation - a view from the field. https://docs.google.com/presentation/d/1lBgLuJ25-apU2do2yx7V-wJhiOTxSqcDCLQTsaK1b14/edit?usp=sharing
- Gave an overview of the different people involved and their drivers for licensing
- "What license should I put on my data?" Ask instead: "Should I put a license on my data?"
- facts are not copyrightable (1991 court case in the U.S.) [499 U.S. 340](https://supreme.justia.com/cases/federal/us/499/340/)
- CC0 is a waiver, not a license
- databases can sometimes be copyrightable
- Data may not be owned by the researcher directly; for instance, it may belong to the institution or funder
- researchers are (mis)using licences as a way of trying to get credit for sharing their data
- suggestions:
    - is there anything to own? think early about this
    - If it's all about attribution, is a license the best way to handle that?
    - expect questions, provide contact info (try to keep it up to date)
**Discussion and questions**
- Is there growing support for ODC (Open Data Commons)?
- MD: one of the licenses considered, but not clear on extent of use
- KF: can't really go wrong with either choice
- Kevin Ashley: in his experience in Europe, CC is popular, though the situation on what can be incorporated is nuanced. Even when a license is used only to signal intent, CC can work fine: if you are trying to give permission, the intention is sufficient even if the license isn't enforceable. On the flip side, if your intent is to restrict, then the limitations might be problematic. Data repositories are better placed to handle licence/copyright decisions.
- Matt Jones: Arctic Data Center uses CC0 (15%) and CC-BY (85%) for data, but a license is still needed for prose in metadata
- Wendy: when repositories license data, who owns that? do you transfer rights to the repository, or do the contributors keep rights?
- Brian Westra: methods in both a paper and dataset metadata may be in legal limbo
- maybe a deeper conversation is needed around protocols and methods and what "is" a publication (e.g., protocols.io)
- BW: copyright and institutions asserting ownership; if you can't copyright the data, can they really do that?
- KF: universities often aren't asserting legal ownership, but rather stewardship?
- Philipp Conzett: DataverseNO uses CC0 as its default license; researchers question the choice, but don't express concern because of the statement that community norms require attribution
- Adam Rountrey: widely shared belief that everything is copyrightable; carried forward by repos that only allow application of CC licenses; some of the data have commercial and/or cultural heritage implications
- having the CC0 might conflate rights with other CC licenses
- Marty: is there an equivalent to CC-BY in the Open Data Commons?
- See Cornell's page: https://data.research.cornell.edu/content/intellectual-property
- Brian: ODC-BY (Open Data Commons Attribution License) corresponds
- Matt: How to license research objects/compendia that contain many sub-objects for which different licenses might be appropriate (e.g., code versus data versus text versus images)
- Plus how to do that in a machine readable way
- Wendy: especially an issue if people try to apply CC-BY to collections that contain software, for which CC recommends against using CC licenses
- Katie Mika: how to handle having an overarching license for a package, with custom terms for specific files or components
- really hard to tell what things apply to when there are mixed license models
- Katie: we try not to put an umbrella license on the package when individual licenses apply to the files
- H. K. "Rama" Ramapriyan - NASA policies might be of interest - https://earthdata.nasa.gov/earth-observation-data/data-use-policy
- https://earthdata.nasa.gov/earth-observation-data/data-citations-acknowledgements
- RDA/CODATA Legal Interoperability IG
- https://www.rd-alliance.org/groups/rdacodata-legal-interoperability-ig.html
- Slides on compound data licenses: https://www.rd-alliance.org/sites/default/files/2019-04-03_RDA_IG-Legal-Interoperability_ResearchCompendium.pdf
- Varsha Khodiyar:
- Why not just divide it up into different packages?
- For "reproducible" packages that integrate and create derived data
- Madison Langseth: they also want to deal with these integrated packages, especially when there's an assumption of public domain from a federal agency
- would be helpful to be able to flag that there are mixed licenses within a package
- KF: mixed compositions are not legally challenging, but the technical and communication issues are difficult
- KF: responsibility lies with the consumer to ensure they have the rights they need to reuse; won't be such an issue if people aren't getting litigious
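
One possible way to address the points above about machine readability and flagging mixed licenses is a per-file license manifest. The following is only a minimal sketch and was not presented on the call: the manifest layout, file names, and function names are hypothetical, and the license strings are assumed to follow SPDX identifiers (e.g., `CC0-1.0`, `CC-BY-4.0`, `MIT`).

```python
from collections import defaultdict

# Hypothetical manifest for one package: file path -> SPDX license identifier.
MANIFEST = {
    "data/observations.csv": "CC0-1.0",
    "data/derived_summary.csv": "CC-BY-4.0",
    "code/analysis.py": "MIT",
    "docs/methods.md": "CC-BY-4.0",
}


def group_by_license(manifest):
    """Return a dict mapping each license to the sorted list of files using it."""
    groups = defaultdict(list)
    for path, license_id in manifest.items():
        groups[license_id].append(path)
    return {license_id: sorted(paths) for license_id, paths in groups.items()}


def has_mixed_licenses(manifest):
    """True when the package contains more than one distinct license."""
    return len(set(manifest.values())) > 1


def cc_by_on_software(manifest, code_suffixes=(".py", ".R", ".sh")):
    """Return code files that carry a CC-BY family license.

    Reflects the Creative Commons recommendation, quoted in the chat,
    against using CC licenses for software. The suffix list is an
    assumption for illustration only.
    """
    return sorted(
        path
        for path, license_id in manifest.items()
        if license_id.startswith("CC-BY") and path.endswith(code_suffixes)
    )


if __name__ == "__main__":
    print("Mixed licenses:", has_mixed_licenses(MANIFEST))
    for license_id, paths in group_by_license(MANIFEST).items():
        print(license_id, "->", ", ".join(paths))
    print("Code files under CC-BY:", cc_by_on_software(MANIFEST))
```

A repository could surface the `has_mixed_licenses` result as a package-level flag while leaving the per-file terms authoritative, in line with the preference expressed above for not putting an umbrella license on a package whose files carry individual licenses.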
**Notes from the zoom chat:**
10:18:16 From Adrienne Canino to Everyone : Can you share that bitly link please Katie?
10:18:17 From Stevan Earl to Everyone : can we get the link to those slides in the chat or notes?
10:18:29 From Matt Jones to Everyone : yes, we’ll post it all online at the end
10:18:33 From Matt Jones to Everyone : on the page for this call
10:19:34 From Katie Fortney (she/her) to Everyone : http://bit.ly/data1kf
10:19:59 From Adrienne Canino to Everyone : Thanks!
10:20:09 From Katie Fortney (she/her) to Everyone : https://think-lab.github.io/d/107/
10:28:05 From Kevin Ashley to Everyone : +1 to that!
10:29:15 From Brian Westra to Everyone : I agree with Matt's interpretation as to why researchers would choose CC-BY. They look at CC0 and assume it means free to use without crediting the source.
10:32:33 From susan borda to Everyone : Licensing repository metadata seems like an unnecessary complication.
10:33:14 From Bobby Candey to Everyone : https://resources.data.gov/open-licenses/ mostly calls for CC0
10:33:35 From Philipp Conzett (UiT) to Everyone : We have explicitly waived any rights to our repository metadata.
10:33:47 From Philipp Conzett (UiT) to Everyone : In an academic context, you may rely on good research ethics implying giving credit to / citing your sources, but what about reuse outside research?
10:33:55 From Philipp Conzett (UiT) to Everyone : The real hard work comes with datasets that are derived from multiples sources with different licenses or in the worst case customized Terms of Use, some of which may allow for fair use...
10:34:40 From Adam Rountrey- UMich to Everyone : Use of CC licenses on data that may not really be copyrightable can contribute to the false belief that data in that category are subject to copyright- then that perceived copyright can create problems- particularly in the non-academic world. Consider CyArk claiming copyright to cultural heritage data from other countries because they scanned it.
10:36:25 From Kevin Ashley to Everyone : Agree that assertion of ownership by the institution can also help it take responsibility for long-term stewardship; that's not always a given
10:37:28 From Kevin Ashley to Everyone : (Whilst also agreeing with Katie that defining what 'ownership' means isn't straightforward)
10:38:15 From Varsha Khodiyar to Everyone : Metadata files created for Scientific Data manuscripts are shared as CC0, because we wanted to make explicit that these are available to be reused
10:39:14 From Brian Westra to Everyone : Cornell's page, which I've found helpful: https://data.research.cornell.edu/content/intellectual-property
10:46:41 From Kevin Ashley to Everyone : That's my understanding as well re CC & software
10:47:09 From Katie Fortney (she/her) to Everyone : “Can I apply a Creative Commons license to software?
We recommend against using Creative Commons licenses for software. Instead, we strongly encourage you to use one of the very good software licenses which are already available. We recommend considering licenses listed as free by the Free Software Foundation and listed as “open source” by the Open Source Initiative.”
10:48:15 From William Corey to Everyone : I've used the Cornell info when crafting my RDM guides page on licensing, among others. I also have links to the other licenses. All in one place if you want to explore: https://guides.lib.virginia.edu/c.php?g=515290&p=7170132
10:50:17 From Ge Peng to Everyone : @Katie: If only caring for attribution, what would you recommend in terms of a machine-readable license to use for datasets?
10:50:22 From Madison Langseth to Everyone : It seems like we need a license that indicates multiple licenses within the object.
10:51:32 From Varsha Khodiyar to Everyone : Wouldn't it be easier to specify that all items within a package should have the same licence? Then incorporate links to related outputs so can easily navigate between all the outputs
10:52:40 From Bobby Candey to Everyone : The NASA Heliophysics archives (I’m head of SPDF) mostly distribute data via automated web services, with many research projects using many datasets. We don’t specify a license yet, but CC0 is probably the only one we would use. We would have a note that this doesn’t exempt users from following community norms for attribution. CC0 provides simplicity, interoperability and universal recognition.
10:53:32 From Philipp Conzett (UiT) to Everyone : + 1 That's how we do it in https://dataverse.no/.
10:54:51 From Matt Jones to Everyone : @kevin, the links are in the notes
10:55:34 From Wendy Kozlowski to Everyone : @Bobby - we also like to mention "community norms" - and encourage that human readable statement of intent also be included. This seems to give some reassurance to those that really are so worried that no one will cite them with a CC0, but it's still hard... people want (and deserve) credit for their work and they fear handing it off with CC0 (in my experience).
10:55:40 From Amber Budden to Everyone : Matt added the RDA links to the hackpad notes
10:57:16 From Kevin Ashley to Everyone : Thanks Matt - a quick search of the RDA groups does suggest as well that it is only that joint CODATA/RDA legal interoperability group that's tackled data licenses. And yes, it's been quiet for 18 months or so
10:57:54 From Adam Rountrey- UMich to Everyone : Having everything together might be transformative and the package itself might have a license (on that transformative bit) that is different from the licenses of the individual components, right?
10:59:38 From Adrienne Canino to Everyone : Thank you to all the speakers and these great questions, a great conversation!
11:01:06 From susan borda to Everyone : Not all repository systems are flexible enough to handle these complex licensing issues.
11:01:48 From Varsha Khodiyar to Everyone : Thanks all for a very interesting session
11:01:49 From Philipp Conzett (UiT) to Everyone : Thanks!
11:01:49 From Wendy Kozlowski to Everyone : Thank you!!