(The Journal on Information Technology in Healthcare, 2009-11) Philip, Achimugu; Abimbola, Soriyan; Babajide, Afolabi
The process of identifying record pairs that represent the same entity (duplicate
records) is technically known as record linkage and is one of the essential elements of data
cleansing. This paper proposes a fast and efficient method for linkage detection within the
healthcare domain. The proposed approach has two main features: an embedded fast blocking
method with a string-matching function that accounts for keystroke mistakes made when a
patient’s name is entered, and a module that dynamically generates blocks of possibly
associated and unique records.
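
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea (blocking followed by typo-tolerant name matching), not the authors' method. The blocking key (first letter of the name), the optimal string alignment distance, the threshold `max_distance`, and the sample records are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: insertions, deletions,
    substitutions, and adjacent transpositions each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, a common keystroke mistake.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def block_key(name: str) -> str:
    # Hypothetical blocking key: first letter of the normalized name.
    return name.strip().lower()[:1]


def candidate_pairs(records, max_distance=2):
    """Group records into blocks, then compare names only within a block."""
    blocks = defaultdict(list)
    for rec_id, name in records:
        blocks[block_key(name)].append((rec_id, name.strip().lower()))
    matches = []
    for members in blocks.values():
        for (id_a, name_a), (id_b, name_b) in combinations(members, 2):
            if osa_distance(name_a, name_b) <= max_distance:
                matches.append((id_a, id_b))
    return matches


# Made-up sample records: (record id, patient name).
records = [
    (1, "Adebayo Johnson"),
    (2, "Adebayo Jonhson"),  # transposed letters
    (3, "Amina Bello"),
    (4, "Chidi Okafor"),
    (5, "Chidi Okaforr"),    # repeated keystroke
]
pairs = candidate_pairs(records)
```

Blocking keeps the number of comparisons small because names are compared only within a block. With a key this simple, a typo in the first letter would place a duplicate in the wrong block, which is one reason a practical system would likely need a more robust key.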