Friday December 13
Information / knowledge sharing (plenary roundtable led by Hadi Ashgari)
Aim: identify information sharing needs by participants
• What access requests have been done? Variety of use cases – type of company / data
• How can we have a system to facilitate access requests?
o Role: access collective database of SAR responses for (case)
o Willingness to Share (parts of – there may be personal information you don’t want to share) the SAR responses
o Asked to share – what could be the obstacles to go along with sharing.
• Two cases:
• 1. Supermarkets – facial recognition
• 2. Facebook
----Parallel Breakout Supermarket Group----
a. Concrete aims
i. What would the ideal SAR look like? Having a template for a SAR and that would be useful for various supermarkets so not necessarily directed at one particular supermarket.
1. SAR aimed at cameras – they want to identify you physically and they also ask additional info such as ID (Michael). So it would be useful to wear a very distinct t-shirt which would be hard to miss in the footage. Do you want to make your own footage?
ii. Ideal strategy / storyline to get what we want / method – what kind of physical behavior?
iii. Is there any specific case that we can identify that we can file a SAR?
b. Expertise needed
c. Action plan and suggested timeframe
- Identify the actors involved.
- First step: when you go to a supermarket, you want to know what types of actors are involved. Is the supermarket part of a shopping mall, or does it have its own security staff? In the case of AH that would usually be a single entity.
- Focal question – is there anecdotal evidence that FR is being used.
- Sensor data in the supermarket context. Generalized question: what types of analysis are being done with those sensors – FC, behavior.
- How can you make a SAR to unravel these issues – what kind of sensors and how they use it.
- If a public entity is using the same tech vendors, then Freedom of Information is relevant – public transport for example. Identify the supermarket / mall, identify whether the vendors are selling services / products to public actors, and go from there.
- One approach could be to compare different types of supermarkets, i.e. low budget and high end supermarkets, or to compare by neighborhood.
- Question / example: consent when FC/AI systems are being used? They are directed at different purposes: biometric systems (usually for the purpose of identifying the individual) versus AI systems (which might be directed at other purposes, for example spotting who forgot a piece of luggage). The categories of data are then different, and so is consent. These complexities can be used when filing a SAR.
- Borderline technologies – even if they want to make an aggregate outcome and not necessarily directed at identification
- What is the difference as regards transparency obligations? Some technologies are deliberately deployed so that they do not retain personal data. The accessible data is only aggregate data, and the Bavarian DPA said this is not processing of personal data.
o Can you access purpose of processing with a SAR
- Tech designed for aggregate analytics deliberately avoids making a connection to personal data. The vendors do not retain data that enable you to be identified, and they do not connect the aggregate data to a personal identification. If you file a SAR, they hold no personal data, so it might not be successful.
- What? Discriminatory practices, self-checkout
- Example: Whole Foods (Amazon). With Amazon Go shopping you just go in and then go out. Detecting your skin color and face is related to biometric data, and this would be related to ethnicity. The technologies determine that people are visibly different from each other and deliberately connect this to your account, so you can make a data protection case out of this.
- Small shops – Facewatch sells normal live stream CCTV for smaller shops. If Facewatch provides a platform and streaming all those CCTV footage
- Deciding on the approach to take when filing a SAR: there are two ways, either from a vendor perspective (like the Amazon or Facewatch stories) or from the user perspective, where you want to know what happens when you go into a supermarket. Facewatch says that they are the central controller across all supermarkets but you need to put effort in SAR?
o Example: Julie filed a SAR with Monoprix, a French supermarket. They sent a document protected with a password, then didn’t send the password, and then they said that they don’t have her data.
• General points to consider
1. How FR is penetrating supermarkets?
2. Are there other security measures taken?
3. These are usually to identify shoplifters, but is it actually happening / used for that purpose? Usually the purpose of deploying such technologies is security, age recognition or face recognition. Camera systems can also be used for non-security related purposes, such as to identify shoppers and find patterns, and this can even include employees, who then also become part of the surveillance. In theft cases the police are usually involved, so you also have FOIA claims (with differences among member states). Schiphol, for example, uses wifi and Bluetooth tracking.
4. Then you have to do an analysis of who is the supermarket / addressees?
5. Sensors: you can use SARs to discover what information is being collected and used.
6. How much does it cost to implement these systems and the gain from prevention of theft etc?
o Questions that fit into SAR and other general questions – does it make sense to combine all those together when filing a SAR? From a journalism perspective, you can first make a storyline but they might also ignore you because you are journalist but also depends on the kind of company – for example with good PR. This might also relate to whether journalists expect a constructive conversation to take place or whether you want to keep it to a legal procedure.
o If you file a SAR asking with whom they share the data, they would have to tell you, including the processors
1. Who are all the actors?
2. Follow up requests? – if you are going after vendors
There may be cases where you can identify the vendors. There is little chance that supermarkets are developing their own software but they usually own the hardware.
What are the goals of a story? From a journalism perspective what happens when you are in a supermarket. Are you being tracked? How? Wifi, bluetooth, FR systems, how is this being used?
- Physical tracking is an appealing subject to consider and it enables uncovering other broader issues.
- Self-checkout FR and physical tracking. From an academic perspective, governance based on transparency, to push the transparency regulations to find out whether there is compliance, if not, how? Data protection perspective – the right to know what can
Focus should be on pattern recognition of people’s physical movement – based on types of technologies.
Other data category focuses:
Two types of tracking:
1. passive tracking
2. active tracking
Active tracking: code being executed on your phone when your phone tracks you. This usually works with ultrasound beacons and apps – usually in retail.
In the EU there will be consent issues.
One target could be metatarget fidelity: name of the app is storecard. App and if you embed code in the app it allows you to be tracked in the retail space.
- What is the best template for a SAR? Judith usually uses CNIL templates.
Loyalty cards – then you consent to be tracked but one problem could be ‘reasonable expectation’
Another target could be frustrated DPOs.
Action points:
What are the methods to use / what to ask from the supermarket?
a. Your request has to concern you as a data subject, or you’ll need several data subjects with different variables, such as with a loyalty card, without one, over or below 18
b. What is the vocabulary to use? They might direct you to the privacy policy. You might want to ask for categories of data, facial data, CCTV footage. Depending on how detailed the privacy policy is, it might not include all types of data. Include a list of processing purposes and categories of data when filing a request.
c. Ask to explicitly confirm / deny. And if they confirm, ask concrete purposes, legal basis for each personal data category.
d. Ask for all the third parties with whom they share the data – sources, processors and joint controllers.
e. Have a list of what can happen in a supermarket. If you are a journalist, do you combine a story and your request, or do you construct a story afterwards? Then how do you optimize your legal rights? Get as many leads as possible and combine that information with further action. For example, identify the vendors and then approach them about what their systems do. Make a list of the vendors that are the processors, the cameras they use, and the processing they do. In such cases they are joint controllers, so they would have to inform you about the purposes of processing.
f. Mall owners, association of shop owners and mole producers can also be useful for spotting vendors.
i. It can be useful to work together with consumer protection agencies. They can for example have a simple table saying what type of data is being collected in this supermarket in which jurisdiction. Loyalty card – non-loyalty card.
g. Anonymized datasets to allow them to work it
h. You can also look at the devices in the supermarket, find who they sell etc - identify the vendor, and then find who does the processing.
i. Then do you want to have a distinctive clothing or behavior?
f. How important is it to include the DPO when filing a request? One suggestion can be to file a request to the DPO first and get some strategies or cc them on the email.
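Action point (a) suggests using several data subjects who differ on variables such as loyalty card and age. A minimal sketch of how such a comparison could be laid out is given here. The variable names and values are illustrative assumptions taken from the examples above, not an agreed design.

```python
from itertools import product

# Hypothetical variables, taken from the examples in action point (a).
VARIABLES = {
    "loyalty_card": ["with", "without"],
    "age_group": ["under_18", "18_or_over"],
}


def subject_profiles(variables):
    """Return one dict per combination of variable values."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


profiles = subject_profiles(VARIABLES)
for profile in profiles:
    # Each profile is one type of data subject who could file a SAR.
    print(profile)
```

Each printed profile is one type of requester, so responses can later be compared across loyalty card holders and non-holders, or across age groups.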
A challenge with SARs can be that you get the basic information, but more advanced information might be very hard to obtain. A possible solution can be to object to the processing, so that they have to explain why they process the data.
To do list:
- Analysis of different privacy policies
- Analysis of different FOIA results related to vendors
- Analysis of different vendor PR websites and documents
- Tabulate by jurisdictions
Obstacles or additional concerns to be considered in case of a litigation context were discussed.
The purpose of the access request can be to get your personal data in order to discover what they are actually doing with it. What do we do with the data that we get? One challenge is transient processing. These kinds of practices seem not to retain the data: they get rid of the personal data and just use it for analytical purposes. So maybe we need to construct strategies to approach that. Should you be getting these kinds of scores? You can put up a wiki?
First level of action: figure out what is actually happening
Second level of action: is it actually affecting people?
The focus of the discussion was not just to have access to the data but to figure out what is actually happening?
Friday December 13 - 13:30 – 14:30
Case study ---- Timo location based marketing platform (AdTech)
Sharing format / tools:
Exercise:
Encode a SAR – describe how you would share it (depends on the purposes of filing a SAR)
- The process that led you to file a request but it usually is a back and forth procedure.
- Consider perspective of sharer and consumer
The app - Timo - locates customers through their phones, whether they are close etc to send targeted ads.
‘geoconquesting’
SAR filed based on the CNIL template.
Timo's reply: identification – give us proof of an advertising ID, a phone number or a screenshot. She sent the invoice and the screenshot, and she was sent a file with numbers. She asked for an explanation: the location is precise to a square of 225 meters, with information from the past 24 hours.
Third parties - This data was not shared with anyone except for Google Cloud platform - in addition, ‘the data sent to us by google ad exchange. We don’t know who sent the data to google ad exchange’.
What does the 'events' column mean? She got the response that they were 'events'.
Format to share this:
- Who? Company
o Type of company Adtech etc
- CNIL template
- Identifier used?
- Here, the proof of identifier was requested from the complainant with examples
- Category of data requested?
- What to redact when sharing responses to SAR?
- A valid question is who are you sharing it with?
- How do you encode qualitative data such as how easy it was to communicate with the company? In this case, the company gave examples as regards the possible identifiers, so they were relatively collaborative.
- Source of the data (Google RTB)– this is useful to determine who to contact for further targets.
- Format to share? Screenshot, excel document, combination of different formats (Redact device ID and location)
o Describe the categories of data you were sent instead of providing the exact data.
- Constraints: how much do you want to standardize the data you received – too much standardization may result in the loss of context.
- Asking for additional explanations but on what aspects?
- Licensing raw data if you are willing to share
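The two sharing options above (redact device ID and location, or describe only the categories of data) can be sketched as follows. This is a minimal illustration under assumptions: the field names and the example row are made up and do not reproduce the actual file that was received.

```python
REDACTED = "[REDACTED]"


def redact_rows(rows, sensitive_fields):
    """Return copies of the rows with sensitive field values replaced."""
    return [
        {key: (REDACTED if key in sensitive_fields else value)
         for key, value in row.items()}
        for row in rows
    ]


def describe_categories(rows):
    """List the field names present, in first-seen order, without any values."""
    seen = []
    for row in rows:
        for key in row:
            if key not in seen:
                seen.append(key)
    return seen


# Made-up example row; field names are hypothetical.
response = [
    {"device_id": "abc-123", "date": "2019-11-02",
     "lat": 48.85, "lon": 2.35, "events": 3},
]

shared = redact_rows(response, {"device_id", "lat", "lon"})
categories = describe_categories(response)
print(shared)
print(categories)
```

Sharing only `categories` loses the most context but carries the least privacy risk. Sharing `shared` keeps the structure of the response while hiding the identifier and location.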
Airline
Summary of all the data forms they collected - sources of data, legal basis and if they disclose it to third parties in an excel sheet that makes it easy to redact certain data.
There are also more complex data sent - with another structure, it is a lot of work to redact certain information.
- The format in which the data is sent may not allow easy redaction / editing.
- should you include descriptions on what certain ‘indications’ mean (what does ‘ee’ mean)
---Report Back---
Facebook Group
The aim was to discuss SAR-ing Facebook as a non-user. Some wanted to test the process without having a relationship with Facebook. The approaches were different and the group settled on the following flowchart.
[Insert flowchart picture here]
The sender of the SAR is either an adult who deleted their account, or an adult who has never had a Facebook account and expects Facebook to have no idea who they are. We also explored a child perspective because anecdotal information showed that Facebook builds shadow profiles. How do you identify the adult-child relationship? This needs more thought. We start with making a SAR and we provide all the required details. We start with a cookie. They reply with delaying tactics and we get stuck in a perpetual loop. It also becomes a question of security, and even if you pass all these hurdles, they can just come back and say we don’t have data on this cookie. The result of the discussion was a list of FAQs. We can only answer this list of FAQs after having done due research on Facebook.
Maybe the FAQs can be divided based on where they can be applied. Some FAQs are good for Facebook but probably not for other groups. We also found that everyone had had different responses, so we couldn’t combine our results.
Group Supermarket
There were a few things about supermarkets that we were interested in. What is going on, what is the process and who are the actors? We also included other sources of information, i.e. the vendors that provide supermarkets with the tech. Processing covers, for example, biometric data, movement data, beacons, Bluetooth, wifi and other apps that track. It is very easy for them to deny if we miss our mark slightly, so we have to be precise. We will have to liaise with data protection officers. Some DPOs are also confused about what to do and we need to work with them. Journalists also have a part to play, i.e. they can make journalistic requests but they can also make SARs. We’re trying to map out the purposes: age recognition, tracking, checkout and other relationships. We also see a SaaS relationship, and the way we see it changes the picture quite a lot. There are some interesting cases that tell us how this works in practice. In practice, the supermarket finds a bug in the software provided by the vendor and sometimes sends data back to the vendor. Vendors can use the data for their own purposes and hence they might be concerned about sharing the data. We want to build an understanding of how the sector deals with this technology.
Ideas about the process:
Data subjects are people with loyalty cards, people with coupons, and people who have had a run-in with the supermarkets. They will send a SAR with our help. It will allow us to ask new questions and tackle the problem at hand.
Blacklists of supermarkets also have to be registered with the Dutch DPA, so you can file a FOIA request with the DPA for this information. So would it help to send a SAR to the supermarket when you can send a request to the DPA?
It depends on the jurisdiction. The Dutch DPA is different. In some countries we might have to file a SAR and FOIA might not help.
We also want to stay close to the everyday question to build an attractive story. Something that normal people can relate to.
Should we collect all the data or just the metadata about the process? The challenge with collecting the aggregated data is that these companies don’t store this information. They process it and then delete it, and hence they don’t have anything to give back. Then you have to go back to the old debate: is it personal data? What happens now? Etc.
It is also interesting due to collaboration with consumer associations. Something that the group should keep in mind.
There are two levels of SARs here: 1) get knowledge about what is happening and 2) ask deeper questions, like how it affects normal people.
You can also compare user/non-user with loyalty card or without it.
Do you want to keep it clean in legal terms, i.e. tell me what data you have about me, or do you want to separate the two? Process first and then data. Do you think they won’t cooperate and it will just be a legal discussion?
DPOs are sometimes interested in more information than the actual SAR. It will take time to see what actual data you can get out of this.
Tinder changed their algorithm after one researcher published a book about it, so you don’t have to know what you’re looking for. You can find great stuff without knowing what to look for.
----Info sharing session----
The post-its from the morning session were organized according to themes. We can use them to identify trends or stories and also compare use cases. The themes also highlighted some collective motivations, like increasing impact and making research easier for others. An interesting theme was designing a collaborative process. Some people want to share only categories or names of firms, but some said they want to share everything. Most people said they wanted to share all data about themselves because it was their own decision. Information can be broad, but we can also delve deep into particular cases. An example of that is when 10 people decided to show their genome data and people built more mechanisms for privacy around that. The concerns that came out were about free riding, scooping others’ stories, abuse, validity or comparability, and the main one was data subject privacy. There should thus be a form of governance built out of this. Other interesting concerns were what happens if controllers start using this data, and the cases where you have a pact with the controller not to share data.
Participants raised the actual security of the data. The actual SAR isn’t shared with others, but an encoded version of the SAR can be shared.
The challenge here is that there is no mechanism to hand over data to others in such a case. If someone takes on data like this, they become a liability in themselves. It’s a tricky thing because someone has to take liability for it but at the same time research repositories handle such cases quite a lot. The onus should not be on individual researchers but on the data handler as well.
We can at least connect the process data. In this case that is not the actual data but what was requested in the SAR and what was given in response. People are also interested in best practices.
Personaldata.io is making a lot of information public and Paul Olivier Dehaye is comfortable with taking that liability up. We can learn a lot from the Wikimedia model where they comply with GDPR despite having an open mechanism.
In one way, internet archives and Wikimedia can become targets of litigation because they’re fundamentally against GDPR as of now. Hence, you’re taking a huge liability in this case.
The governance questions or sharing mechanisms are a side question and we want to put them forward. There is a lot of information about governance but we also have to focus on the sharing format and the tools which is the point of this session.
Will it be open?
Part of this can be open and part of it can be received by contacting the person in charge of the information.
--Exercise—
Encode one of your own SAR responses in groups of 3-4.
1) Amazon (Claudio)
2) Airlines (Rene)
3) DHS (Department Homeland Security) (Paul Olivier)
4) Timo (Judith)
5) Spotify (Trystan)
Claudio:
SAR response from Amazon, which took 23 days. He had to pass through 5 links. Finally he got a link saying “click on it if you want your data”. After 23 days he got another link and a page that says download your data. He gets 6 files from Amazon, such as customer service, a list of categories, and amazon.de and amazon.fr, which he never used. Even opt out fields are marked as yes even though he has opted in. The most interesting results were all the search data. File shows how the data was accessed like while windows etc. It was through a made-up subject. File columns contain unknown data fields.
Claudio is willing to share raw data of all the response via a drive link.
Rene
Sharing the response from Brussels Airlines. It is one response out of 100 collected from a campaign targeting airlines.
There is summary tab on the excel file including data type, description, sources, purpose and sharing.
The next tab is access request, then marketing.
One tab is Ingenico, which is a picture within the structure. It has data types and data values.
There are other pictures that are very hard to understand. It has abbreviations and there can be a problem with sanitizing it.
Judith
Sharing a response from Teemo, a French geolocation service. They find customers for businesses. They send targeted ads to people within range. She sent screenshots of ad IDs from her phone and an invoice from her phone company to prove she is the owner of the phone. She then asked for clarifications.
The readable data shows that they have coordinates for her. They know at which latitude and longitude she was on a certain date. They say it wasn’t shared with anyone and it was on Google Cloud. It was received through Google ads.
Fields include phone id, date, lat and long, concerns, events(ambiguous) number of events (could be the number of times the company received a bid from this)
Willing to share the data fields without her info.
Trystan (Spotify)
Wanted to do Spotify but used it as an example. They talked about data management and fields. We are willing to share research findings but not the actual data, and the DPO wanted confidentiality. So the data and the data controller can’t be shared.
Willing to share
- Meta data about the request and response
- ID requested
- Method by which response was sent back
- Type of data controller
- Dates of request / responses / follow up
- Success
- Contact details
- File format
- Status of the process
- Objective
- Some sort of description of share
- Sharing the SAR response with redaction useful for journalist
- Schema for the data but not always feasible so a free text description
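The list of items that participants are willing to share can be read as a small record format. A minimal sketch of such a record is given here. The class name, field names and all example values are assumptions made for illustration; the workshop did not agree on a schema, and the dates shown are made up.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class SarRecord:
    """Hypothetical metadata record for one shared SAR (no personal data)."""
    controller_type: str           # type of data controller, e.g. "adtech"
    objective: str                 # why the request was filed
    id_requested: str              # what proof of identity was asked for
    response_method: str           # how the response was sent back
    file_format: str               # e.g. "spreadsheet", "screenshot"
    date_requested: str            # ISO date of the request
    dates_followup: List[str] = field(default_factory=list)
    status: str = "open"           # status of the process
    success: Optional[bool] = None # None while the outcome is unknown
    contact: Optional[str] = None  # whom to contact for the full response
    description: str = ""          # free text when no schema is feasible


# Illustrative values only; the dates are invented.
record = SarRecord(
    controller_type="adtech",
    objective="find out what location data is held",
    id_requested="advertising ID, phone number or screenshot",
    response_method="email",
    file_format="spreadsheet",
    date_requested="2019-01-01",
    dates_followup=["2019-02-01"],
    description="Columns: phone id, date, lat, long, events.",
)

encoded = json.dumps(asdict(record), indent=2)
print(encoded)
```

The free-text `description` field reflects the last point in the list: a full schema for the data is not always feasible, so a short description stands in for it.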
----Michael Veale UCL---
A session about looking forward: how AR can be done in the future. We will do forecasting and back-casting. The aim is mainly forecasting and projecting futures for 2030. We have to find a range of possible futures that may happen based on our struggles. We are not predicting the future but rather figuring out the range. We are going to ask the participants to do 2 sets of post-its. 1) Things that are drivers for the future in terms of data rights. 2) Wildcards – surprising, discrete but plausible events – things that can happen that would have a huge impact on the future of data rights. People can talk around and come up with drivers and wildcards different from the themes of the workshop. Driver pages will also have graphs to show the speed of effect of the driver by 2030.
Participants were asked to place their notes on the graph below:
GRAPH HERE
When the ideas are placed on the graph, participants are asked to make 3 groups. Each group is then asked to create a scenario of the world as they see it in 2030. Participants are asked to use ideas from the graph.
Group 1
Picture here
Group 2
Truths become less and less obvious.
Deep fakes spread
Tech complexity keeps going up but relationship with the smart solutions goes down
SARs become ineffective because the targeted data becomes more and more complex
Privacy by design will help the companies
Wildcard: big movement says we can’t govern based on data and everything goes back to face to face meetings
Still have Christmas
Group 3
Picture here
---Closing---
Reflections and Takeaways:
Jef:
Main takeaway: even more than anticipation, there are so many strands where you can go, multiple use cases. A lot of enthusiasm. Hope others’ takeaway is that we’ll take concrete steps.
Amy:
Knowledge sharing is the main takeaway. How much more we all have to do and how we can answer our questions through sharing?
Rene:
1) In organizing this, they tried to bring together people from different backgrounds and the design choice was very open but it was scary. He feels he can trust the group and collaborate with them.
Michael
Dangerous that we’re becoming technocratic and there is a lot more anxiety coming in about the change. We need to think holistically.
Fabienne:
I have more concrete avenues and I can tap into so much knowledge in the group.
Trystan:
Concrete avenues and more knowledge to share. Tiring. Longer breaks next time. It’s great to have a network. Meet more people.
Claudio:
Congratulations and goodbye. A lot to understand.
Paul:
Very happy with the two days, and different from a journalistic meeting. Easier to ask questions here. A bit tiresome and tough work.
Paul Olivier:
The trust built is great. Most progress was made when people were sharing SARs. It’s crucial that we develop the capacity to share and scale the movement.
Hugo:
My takeaway is that it was great to learn and I found out so many different ways and new connections.
Ala:
I am happy to see that there still is appetite for litigation based on data rights and noncompliance. There are gaps but I will pass it along in my field. It’s a great development that different people are coming together.
Joris:
I can say that I was happy to see this group come together. I am mostly interested in bigger and political questions around access rights which can be tested in a digital economy. It has become a robust field with people coming in with different expertise.
Karolina:
I would like to appreciate the organizers for bringing an open group together, I am happy that the project of bringing information together is gaining momentum. I have found different ways to tackle them.
Judith:
I go back home with positive energy and enthusiasm. Thank you to the organizers because we can now build trust. An example is how I was approached by Noyb but I didn’t trust them and now I trust them.
Bengi:
It was very interesting and it got very intense. I learnt a lot about access rights and I can now talk about it.
Adrianne:
I knew very little about SARs and now I know more and thanks for that. I got a lot of ideas about stories and I found people who are working on interesting topics. I appreciate this network.
Nayantara:
Main takeaway is that SARs are not an end in themselves; we have to keep in mind the larger cultural and political insight. I had a very immersive GDPR experience. I can take these points back to India.
Hadi:
It was fun and the interactions were great, especially the humor. My mind opened up with a lot of intense information. Thanks to the organizers.