UPCE Digital Library
 

Extraction of Outliers from Imbalanced Sets

Conference object, peer-reviewed, postprint
Publication date

2017

Publisher

Springer

Abstract

In this paper, we present an outlier detection method designed for small datasets, such as those in animal group behaviour research. The method targets global outliers in unlabelled datasets where the inliers form one predominant cluster, the outliers lie far from the centre of that cluster, and the inliers greatly outnumber the outliers. The extraction of exceptional observations (EEO) method is based on the Mahalanobis distance and has one tuning parameter. We propose a visualization method that allows an expert to estimate the value of the tuning parameter. The method was tested and evaluated on 44 datasets. On datasets satisfying the method's requirements, the results were excellent and fully comparable with those of other methods. For large datasets, the higher computational requirements of this method might be prohibitive. This drawback can be partially mitigated with an alternative distance measure. We propose to use the Euclidean distance in combination with standard deviation normalization as a reliable
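
The distance-based idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' EEO implementation: the function names, the use of the sample mean and sample covariance, and the reading of the single tuning parameter as a plain distance threshold are all assumptions made here for illustration. The sketch computes each observation's Mahalanobis distance from the cluster centre, plus the Euclidean distance on standard-deviation-normalized features that the abstract mentions as a cheaper alternative.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean.

    Assumption: the centre and covariance are estimated from all rows,
    including any outliers (the abstract does not specify the estimator).
    """
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    # Quadratic form diff_i^T * cov_inv * diff_i, evaluated for every row i.
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))


def standardized_euclidean_distances(X):
    """Euclidean distance from the mean after dividing each feature by its
    standard deviation (ignores correlations between features)."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant features unscaled
    z = (X - X.mean(axis=0)) / std
    return np.sqrt((z ** 2).sum(axis=1))


def flag_outliers(distances, threshold):
    """Hypothetical decision rule: flag observations whose distance exceeds
    the threshold, which stands in for the tuning parameter."""
    return np.asarray(distances) > threshold


if __name__ == "__main__":
    # Toy data: nine inliers on a small grid and one distant observation.
    inliers = [[i, j] for i in (-1, 0, 1) for j in (-1, 0, 1)]
    X = np.array(inliers + [[10, 10]], dtype=float)

    d_mahal = mahalanobis_distances(X)
    d_eucl = standardized_euclidean_distances(X)
    print("Mahalanobis:", np.round(d_mahal, 2))
    print("Standardized Euclidean:", np.round(d_eucl, 2))
    print("Flagged (threshold 2.5):", np.flatnonzero(flag_outliers(d_mahal, 2.5)))
```

In this sketch the threshold would be chosen by inspecting the sorted distances, which is loosely in the spirit of the expert-guided visualization the abstract mentions. The standardized Euclidean variant avoids estimating and inverting the full covariance matrix.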

Page range

p. 402-412

ISSN

0302-9743

Source document

Hybrid Artificial Intelligent Systems : 12th International Conference, HAIS 2017, proceedings

Publisher's version

https://link.springer.com/chapter/10.1007%2F978-3-319-59650-1_34

Access to the electronic version

embargoed access

Event name

12th International Conference, HAIS 2017 (21.06.2017 - 23.06.2017, La Rioja)

ISBN

978-3-319-59649-5

Keywords

outlier analysis, distance based method, global outlier, single cluster, Mahalanobis distance, biology
