Extraction of Outliers from Imbalanced Sets

Škrabánek, Pavel; Martínková, Natália

dc.contributor.advisor
dc.contributor.author	Škrabánek, Pavel	cze
dc.contributor.author	Martínková, Natália	cze
dc.date.accessioned	2018-02-27T02:36:38Z
dc.date.available	2018-02-27T02:36:38Z
dc.date.issued	2017	eng
dc.identifier.isbn	978-3-319-59649-5	eng
dc.identifier.issn	0302-9743	eng
dc.identifier.uri	https://hdl.handle.net/10195/69780
dc.description.abstract	In this paper, we presented an outlier detection method, designed for small datasets, such as datasets in animal group behaviour research. The method was aimed at detection of global outliers in unlabelled datasets where inliers form one predominant cluster and the outliers are at distances from the centre of the cluster. Simultaneously, the number of inliers was much higher than the number of outliers. The extraction of exceptional observations (EEO) method was based on the Mahalanobis distance with one tuning parameter. We proposed a visualization method, which allows expert estimation of the tuning parameter value. The method was tested and evaluated on 44 datasets. Excellent results, fully comparable with other methods, were obtained on datasets satisfying the method requirements. For large datasets, the higher computational requirement of this method might be prohibitive. This drawback can be partially suppressed with an alternative distance measure. We proposed to use Euclidean distance in combination with standard deviation normalization as a reliable	eng
dc.format	p. 402-412	eng
dc.language.iso	eng	eng
dc.publisher	Springer	eng
dc.relation.ispartof	Hybrid Artificial Intelligent Systems : 12th International Conference, HAIS 2017, proceedings	eng
dc.rights	embargoed access	eng
dc.subject	outlier analysis	eng
dc.subject	distance based method	eng
dc.subject	global outlier	eng
dc.subject	single cluster	eng
dc.subject	Mahalanobis distance	eng
dc.subject	biology	eng
dc.title	Extraction of Outliers from Imbalanced Sets	eng
dc.title.alternative	Extrakce odlehlých hodnot z nevyvážených datových sad	cze
dc.type	ConferenceObject	eng
dc.description.abstract-translated	The article describes a method for detecting outliers in datasets with a small number of observations, where the inliers form a single cluster. For the method to work properly, the number of inliers must be substantially higher than the number of outliers. The method is based on the Mahalanobis distance.	eng
dc.event	12th International Conference, HAIS 2017 (21.06.2017 - 23.06.2017, La Rioja)	eng
dc.peerreviewed	yes	eng
dc.publicationstatus	postprint	eng
dc.identifier.doi	10.1007/978-3-319-59650-1_34
dc.relation.publisherversion	https://link.springer.com/chapter/10.1007%2F978-3-319-59650-1_34
dc.identifier.wos	000432880600034
dc.identifier.scopus	2-s2.0-85021705636
dc.identifier.obd	39879485	eng
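
Note on the abstract: the record describes a distance-based approach (Mahalanobis distance from the centre of a single cluster, with Euclidean distance plus standard deviation normalization as a cheaper alternative) but gives no algorithmic detail. The following is only a minimal Python sketch of generic distance-from-centre thresholding under those two measures; it is not the authors' EEO method. The function names and the `threshold` argument are hypothetical, and `threshold` merely stands in for the single tuning parameter, whose actual role the record does not specify.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean."""
    X = np.asarray(X, dtype=float)
    centre = X.mean(axis=0)
    # Sample covariance of the columns; pinv tolerates a singular matrix.
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    inv_cov = np.linalg.pinv(cov)
    diff = X - centre
    # Row-wise quadratic form diff_i^T * inv_cov * diff_i.
    squared = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return np.sqrt(np.maximum(squared, 0.0))


def normalized_euclidean_distances(X):
    """Euclidean distance from the mean after dividing each column by its std."""
    X = np.asarray(X, dtype=float)
    centre = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return np.sqrt((((X - centre) / std) ** 2).sum(axis=1))


def flag_outliers(distances, threshold):
    """Boolean mask: True where the distance exceeds the (assumed) threshold."""
    return np.asarray(distances) > threshold


if __name__ == "__main__":
    # Toy data (made up for illustration): a 3x3 grid of inliers plus one far point.
    grid = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
    X = np.array(grid + [(8, 8)], dtype=float)
    print(flag_outliers(mahalanobis_distances(X), threshold=2.5))
    print(flag_outliers(normalized_euclidean_distances(X), threshold=2.5))
```

The normalized Euclidean variant avoids inverting a covariance matrix, which is one plausible reading of why the abstract suggests it for larger datasets; it ignores correlations between variables, so the two measures can rank observations differently.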