Publication: On reporting performance of binary classifiers
Article, open access, peer-reviewed, published
Authors
Škrabánek, Pavel
Doležel, Petr
Publisher
Univerzita Pardubice
Abstract
In this contribution, the question of reporting the performance of binary
classifiers is examined in the context of the so-called class imbalance problem. The class
imbalance problem arises when a dataset with a highly imbalanced class distribution
is used in the training or evaluation process. In such cases, only measures that
are not biased by the class distribution of the dataset should be used; however, they
cannot be chosen arbitrarily. They should be selected so that their outcomes provide
the desired information and, at the same time, allow a full comparison of the
evaluated classifier's performance with the performance of other solutions. As is
shown in this article, the dilemma of reporting the performance of binary classifiers can
be solved using so-called class balanced measures. Class balanced measures are
generally applicable and appropriate for reporting the performance of binary
classifiers on balanced as well as imbalanced datasets. On the basis of the
information presented, a suggestion is given for generally applicable, full-fledged
reporting of binary classifier performance.
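
The bias described in the abstract can be illustrated with a minimal sketch. It compares plain accuracy with balanced accuracy (the mean of the true positive rate and the true negative rate), a widely used measure that does not depend on the class distribution. The abstract does not define the class balanced measures, so treating balanced accuracy as a stand-in is an assumption made here; the function names and the toy data are likewise illustrative and do not come from the article.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for labels encoded as 1 (positive) and 0 (negative)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of correct predictions; depends on the class distribution."""
    return (tp + tn) / (tp + fp + tn + fn)


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of per-class recalls; assumes both classes occur in y_true."""
    tpr = tp / (tp + fn)  # recall on the positive class
    tnr = tn / (tn + fp)  # recall on the negative class
    return (tpr + tnr) / 2


# Hypothetical imbalanced test set: 5 positives and 95 negatives.
y_true = [1] * 5 + [0] * 95
# Trivial classifier that always predicts the majority (negative) class.
y_pred = [0] * 100

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
bal_acc = balanced_accuracy(*counts)
print(f"accuracy = {acc:.2f}, balanced accuracy = {bal_acc:.2f}")
```

On this toy data the trivial classifier reaches an accuracy of 0.95 but a balanced accuracy of only 0.50, which shows why a measure biased by the class distribution can be misleading on imbalanced datasets.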
Keywords
machine learning, binary classification, class imbalance problem, performance measures, reporting of results