Journal of Systems Integration, Vol 6, No 3 (2015)


Frequency Ratio: a method for dealing with missing values within nearest neighbour search

Rosanne Janssen, Pieter Spronck, Pauline Dibbets, Arnoud Arntz

Abstract


In this paper we introduce the Frequency Ratio (FR) method for dealing with missing values within nearest neighbour search. We test the FR method on known medical datasets from the UCI Machine Learning Repository. For the purpose of classification, we compare the accuracy of the FR method with that of five commonly used methods (three “imputation” and two “bypassing” methods) for dealing with values that are “missing completely at random” (MCAR). We find that in most cases the FR method outperforms the other methods. We conclude that the FR method is a strong addition to the commonly used methods for dealing with missing values within the nearest neighbour method.
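
To make the terminology in the abstract concrete, the following is a minimal sketch of the two families of baseline methods it names, under stated assumptions. It does not implement the FR method, which the abstract does not specify. All function names (`mask_mcar`, `impute_mean`, `bypass_distance`, `nearest_label`) are hypothetical, and mean imputation and a rescaled partial Euclidean distance are chosen here only as typical representatives of "imputation" and "bypassing"; they are not necessarily the five methods the paper evaluates.

```python
import math
import random

MISSING = None  # marker for a missing attribute value (an assumption of this sketch)

def mask_mcar(rows, rate, rng):
    """Simulate MCAR: drop each value independently with probability `rate`."""
    return [[MISSING if rng.random() < rate else v for v in row] for row in rows]

def impute_mean(rows):
    """Imputation example: replace each missing value with its column mean."""
    means = []
    for col in zip(*rows):
        known = [v for v in col if v is not MISSING]
        # Fall back to 0.0 if a column has no known values at all.
        means.append(sum(known) / len(known) if known else 0.0)
    return [[means[j] if v is MISSING else v for j, v in enumerate(row)]
            for row in rows]

def bypass_distance(a, b):
    """Bypassing example: Euclidean distance over the attributes known in
    both rows, rescaled by the fraction of attributes actually compared."""
    pairs = [(x, y) for x, y in zip(a, b)
             if x is not MISSING and y is not MISSING]
    if not pairs:
        return math.inf  # nothing to compare on
    squared = sum((x - y) ** 2 for x, y in pairs)
    return math.sqrt(squared * len(a) / len(pairs))

def nearest_label(query, rows, labels, dist):
    """1-nearest-neighbour classification with a pluggable distance."""
    best = min(range(len(rows)), key=lambda i: dist(query, rows[i]))
    return labels[best]
```

With imputation, the completed rows can be passed to any ordinary distance function; with bypassing, the incomplete rows are used directly and the distance function itself skips what is missing. For example, `nearest_label(query, rows, labels, bypass_distance)` classifies an incomplete `query` without filling in any values.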


DOI: http://dx.doi.org/10.20470/jsi.v6i3.233

ISSN: 1804-2724

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 Czech Republic License.