---
# Telefonbuch Research project
This documentation captures the main research and methodological ideas of the project.
---
## A repo for name matching
This Python-based repo [namematch](https://github.com/urban-labs/namematch) provides code for matching names across datasets even when the names are recorded inconsistently.
This is relevant for us because Telefonbuch data is entered by hand, which leads to inconsistencies across years. However, we may want to track individuals moving within or across cities over a given time period, so the repo can help with the matching task.
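As a rough illustration of the kind of inconsistency at stake, the sketch below scores how similar two hand-entered names are. It uses only Python's standard library (`difflib`) and is not namematch's own API; the helper names `normalize` and `name_similarity` and the sample names are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, transliterate German umlauts and collapse whitespace."""
    replacements = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}
    name = name.lower().strip()
    for src, dst in replacements.items():
        name = name.replace(src, dst)
    return " ".join(name.split())


def name_similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] for two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Hypothetical phone book spellings of the same and of different people.
same_person = name_similarity("Müller, Hans", "Mueller,  Hans")
different_person = name_similarity("Müller", "Schmidt")
```

A simple score like this would only serve as a sanity check or baseline; the threshold above which two entries count as the same person would still have to be chosen on labelled data.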
### Exogenous event
Consider one suitable exogenous event (such as a flood or the opening of a coal mine) to construct test data sets.
- Candidates:
  - Flooding 2002 Oder
  - Wiki article on relocations in Germany: [wiki](https://de.wikipedia.org/wiki/Liste_abgebaggerter_Ortschaften). Garzweiler looks like a solid candidate, as the relocation to newly founded districts seems to have happened in multiple steps.
### Training data set requirements
The repo states that the training data set needs to fulfill the following requirements:
1. Already have a unique person or entity identifier that can be used to link records (e.g. SSN or Fingerprint ID)
2. Be granular enough that some people or entities appear multiple times (e.g. the same person being arrested two or three times)
3. Contain inconsistencies in identifying fields like name and date of birth (e.g. arrested once as John Browne and once as Jonathan Brown)

Suggested solutions, one per requirement:
1. Construct an ID variable ourselves.
2. Combine a set of cities across years, which yields a granular data set in the sense of the repo.
3. Should be fulfilled anyway, since the hand-entered data contains inconsistencies.
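
The three requirements can be checked mechanically once a candidate training set exists. The following is a minimal sketch under stated assumptions: the record layout `(person_id, year, city, name)`, the sample rows and the function `check_training_requirements` are all hypothetical and not part of namematch.

```python
from collections import defaultdict

# Hypothetical hand-labelled records stacked across cities and years:
# (person_id, year, city, name)
records = [
    ("p1", 1995, "Bonn", "Mueller, Hans"),
    ("p1", 1996, "Bonn", "Müller, H."),
    ("p2", 1995, "Koeln", "Schmidt, Anna"),
    ("p2", 1997, "Bonn", "Schmidt, Anna"),
    ("p3", 1996, "Koeln", "Becker, Paul"),
]


def check_training_requirements(records):
    """Summarize how well the records meet the three requirements."""
    names_by_id = defaultdict(list)
    for person_id, _year, _city, name in records:
        names_by_id[person_id].append(name)

    # Requirement 1: every record carries a non-empty identifier.
    has_ids = all(pid not in (None, "") for pid in names_by_id)
    # Requirement 2: some identifiers appear more than once.
    repeated = [pid for pid, names in names_by_id.items() if len(names) > 1]
    # Requirement 3: some repeated identifiers have differing spellings.
    inconsistent = [pid for pid in repeated if len(set(names_by_id[pid])) > 1]

    return {
        "has_ids": has_ids,
        "repeated_ids": repeated,
        "inconsistent_ids": inconsistent,
    }


result = check_training_requirements(records)
```

If `repeated_ids` or `inconsistent_ids` comes back empty, the set of cities or years would need to be widened before the data is useful for training.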
---
### Cases that have to be accounted for in the matching
- People marry --> name change
- People request that their entries be deleted from the phone book
- The phone book captures only people who are registered in the phone network, i.e. presumably the person who signed the contract. Accordingly, if a family (including persons A and B) moves, and A was registered before the move and B after it, the move cannot be detected. (This still needs to be checked, but it is most likely how it works.)
- The phone book contains some fictitious persons, which are used to detect companies misusing the data; see [wiki](https://dewiki.de/Lexikon/Telefonbuch). This will most likely not be relevant.
- People die
- New phone book users are "born" (e.g. someone moving out and getting their own entry)
- Since 1998 you have the right to keep your phone number when moving (you still had to request this and pay a fee for it).
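
The last point suggests one possible extra signal: from 1998 onwards, an unchanged phone number with a changed address may indicate a move, even if the name is spelled differently. The sketch below illustrates this idea only; the entry layout `(year, phone, name, address)`, the sample rows and the function `candidate_moves` are hypothetical, and whether numbers were actually kept often enough to be useful is an open assumption.

```python
from collections import defaultdict

PORTABILITY_YEAR = 1998  # year taken from the note above


def candidate_moves(entries):
    """Flag address changes under an unchanged phone number from 1998 on.

    entries: iterable of (year, phone, name, address) tuples.
    Returns a list of (phone, year_before, address_before,
    year_after, address_after) tuples.
    """
    by_phone = defaultdict(list)
    for year, phone, name, address in entries:
        if year >= PORTABILITY_YEAR:
            by_phone[phone].append((year, name, address))

    moves = []
    for phone, rows in by_phone.items():
        rows.sort()  # chronological order within each phone number
        for (y0, _n0, a0), (y1, _n1, a1) in zip(rows, rows[1:]):
            if a0 != a1:
                moves.append((phone, y0, a0, y1, a1))
    return moves


# Hypothetical entries; the 1997 row is ignored because it predates 1998.
entries = [
    (1997, "0228-111", "Mueller, Hans", "Hauptstr. 1"),
    (1999, "0228-111", "Mueller, Hans", "Hauptstr. 1"),
    (2000, "0228-111", "Müller, H.", "Ringweg 5"),
    (1999, "0228-222", "Schmidt, Anna", "Markt 3"),
    (2000, "0228-222", "Schmidt, Anna", "Markt 3"),
]
moves = candidate_moves(entries)
```

Flagged pairs are only candidates: a reassigned number or a corrected address would look the same, so they would still need to be confirmed by name matching.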
### Some info on the phone book in general