Historic Person Linkage

Studies of past societies have relied heavily on written records such as books and diaries. However, this tends to skew our perception towards the educated parts of society that were in a position to produce those kinds of records. For the vast majority of people, the only written records we have are entries in birth lists, baptism records, marriage records, and death lists. The difficulty is that none of these carry identifiers that link entries across them, so a person's life cannot be traced directly.

The Historic Person Linkage project aims to address this by developing algorithms and tools that automatically identify where the same person appears across multiple records. The project combines machine learning, crowdsourcing, and heuristics. First, machine learning techniques identify candidate links, that is, pairs of records that may refer to the same person. These candidates are then validated through crowdsourcing. Based on the crowdsourcing feedback, the system then learns general heuristics that can be applied to the dataset. The goal is to develop methods that allow the machine learning models to generalise.
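
To make the first two stages concrete, here is a minimal Python sketch of candidate generation followed by crowd validation. It is an illustration only, not the project's implementation: the description above does not specify the machine learning model, so a plain string-similarity score from the standard library stands in for it. The `Record` fields, the function names, the thresholds, and the sample records are all assumptions made up for this example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One transcribed register entry (hypothetical schema)."""
    record_id: str
    source: str                 # e.g. "baptism", "marriage", "death"
    name: str
    birth_year: Optional[int]   # None when the entry gives no year


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in for a learned model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_links(left, right, min_similarity=0.85, max_year_gap=2):
    """Return (left_id, right_id, score) tuples worth showing to the crowd."""
    candidates = []
    for a, b in product(left, right):
        # Skip pairs whose known birth years are too far apart.
        if (a.birth_year is not None and b.birth_year is not None
                and abs(a.birth_year - b.birth_year) > max_year_gap):
            continue
        score = name_similarity(a.name, b.name)
        if score >= min_similarity:
            candidates.append((a.record_id, b.record_id, score))
    # Highest-scoring candidates first.
    return sorted(candidates, key=lambda c: c[2], reverse=True)


def validate(candidates, votes, min_votes=3, min_agreement=0.6):
    """Keep candidates that enough crowd workers confirmed as the same person."""
    accepted = []
    for a_id, b_id, _score in candidates:
        pair_votes = votes.get((a_id, b_id), [])
        if len(pair_votes) < min_votes:
            continue  # not enough judgements yet
        if sum(pair_votes) / len(pair_votes) >= min_agreement:
            accepted.append((a_id, b_id))
    return accepted


# Invented sample data, for illustration only.
baptisms = [
    Record("b1", "baptism", "Johann Schmidt", 1821),
    Record("b2", "baptism", "Maria Weber", 1830),
]
marriages = [
    Record("m1", "marriage", "Johan Schmidt", 1821),
    Record("m2", "marriage", "Anna Weber", 1830),
]

candidates = candidate_links(baptisms, marriages)
# True means a crowd worker judged the pair to be the same person.
votes = {("b1", "m1"): [True, True, False]}
accepted = validate(candidates, votes)
```

The accepted pairs would then be the input to the third stage, learning general heuristics from the crowd feedback, which this sketch does not attempt. Comparing every record with every other record also becomes impractical on full registers, so a real system would normally narrow the pairs first, for example by parish or decade.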