Tax default prediction using feature transformation-based machine learning

Abedin, Mohammad Zoynul; Chi, Guotai; Uddin, Mohammed Mohi; Satu, Md Shahriare; Khan, Imran; Hájek, Petr

dc.contributor.author	Abedin, Mohammad Zoynul
dc.contributor.author	Chi, Guotai
dc.contributor.author	Uddin, Mohammed Mohi
dc.contributor.author	Satu, Md Shahriare
dc.contributor.author	Khan, Imran
dc.contributor.author	Hájek, Petr
dc.date.accessioned	2022-06-03T12:29:25Z
dc.date.available	2022-06-03T12:29:25Z
dc.date.issued	2021
dc.identifier.issn	2169-3536
dc.identifier.uri	https://hdl.handle.net/10195/79305
dc.description.abstract	This study proposes to address the economic significance of unpaid taxes by using an automatic system for predicting a tax default. Too little attention has been paid to tax default prediction in the past. Moreover, existing approaches tend to apply conventional statistical methods rather than advanced data analytic approaches, including state-of-the-art machine learning methods. Therefore, existing studies cannot effectively detect tax default information in real-world financial data because they fail to take into account the appropriate data transformations and nonlinear relationships between early-warning financial indicators and tax default behavior. To overcome these problems, this study applies diverse feature transformation techniques and state-of-the-art machine learning approaches. The proposed prediction system is validated by using a dataset showing tax defaults and non-defaults at Finnish limited liability firms. Our findings provide evidence for a major role of feature transformation, such as logarithmic and square-root transformation, in improving the performance of tax default prediction. We also show that extreme gradient boosting and the systematically developed forest of multiple decision trees outperform other machine learning methods in terms of accuracy and other classification performance measures. We show that the equity ratio, liquidity ratio, and debt-to-sales ratio are the most important indicators of tax defaults for 1-year-ahead predictions. Therefore, this study highlights the essential role of well-designed tax default prediction systems, which require a combination of feature transformation and machine learning methods. The effective implementation of an automatic tax default prediction system has important implications for tax administration and can assist administrators in achieving feasible government expenditure allocations and revenue expansions.	eng
dc.format	p. 19864-19881	eng
dc.language.iso	eng
dc.publisher	IEEE (Institute of Electrical and Electronics Engineers)	eng
dc.relation.ispartof	IEEE ACCESS, volume 9, issue: 29.12.2020	eng
dc.rights	open access	eng
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	finance	eng
dc.subject	data analysis	eng
dc.subject	machine learning	eng
dc.subject	predictive models	eng
dc.subject	feature extraction	eng
dc.subject	economics	eng
dc.subject	support vector machines	eng
dc.subject	default prediction	eng
dc.subject	corporate tax	eng
dc.subject	feature transformation	eng
dc.title	Tax default prediction using feature transformation-based machine learning	eng
dc.title.alternative	Predikce nesplácení daní pomocí strojového učení založeného na transformaci atributů	cze
dc.type	article	eng
dc.description.abstract-translated	Tato studie navrhuje řešit ekonomický význam nezaplacených daní pomocí automatického systému pro predikci nesplácení daní. Predikce daňové neschopnosti byla v minulosti věnována příliš malá pozornost. Stávající přístupy navíc spíše používají konvenční statistické metody než pokročilé přístupy analýzy dat, včetně nejmodernějších metod strojového učení. Stávající studie proto nemohou účinně odhalit informace o daňovém selhání v reálných finančních datech, protože nezohledňují vhodné transformace dat a nelineární vztahy mezi finančními ukazateli včasného varování a chováním při daňovém selhání. K překonání těchto problémů používá tato studie různé techniky transformace atributů a nejmodernější přístupy strojového učení. Navrhovaný predikční systém je ověřen pomocí souboru dat zobrazujícího daňové selhání a neselhání u finských společností s ručením omezeným. Naše zjištění poskytují důkazy o významné roli transformace atributů, jako je logaritmická transformace a transformace s odmocninou, při zlepšování výkonnosti predikce daňového selhání. Ukazujeme také, že extrémní gradientní boosting a systematicky rozvíjený les vícenásobných rozhodovacích stromů překonávají ostatní metody strojového učení z hlediska přesnosti a dalších měřítek klasifikační výkonnosti. Ukazujeme, že poměr vlastního kapitálu, poměr likvidity a poměr dluhu k tržbám jsou nejdůležitějšími ukazateli daňového selhání pro předpověď na 1 rok dopředu. Tato studie proto zdůrazňuje zásadní roli dobře navržených systémů pro predikci daňových selhání, které vyžadují kombinaci transformace atributů a metod strojového učení. Efektivní implementace automatického systému predikce daňového selhání má důležité důsledky pro správu daní a může správcům pomoci při dosahování proveditelných alokací vládních výdajů a expanze příjmů.	cze
dc.peerreviewed	yes	eng
dc.publicationstatus	published version	eng
dc.identifier.doi	10.1109/ACCESS.2020.3048018
dc.relation.publisherversion	https://ieeexplore.ieee.org/document/9310180
dc.rights.licence	CC BY 4.0
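dc.project.ID	GA19-15498S/Modelování emocí ve verbální a neverbální manažerské komunikaci pro predikci podnikových finančních rizik (Modelling emotions in verbal and nonverbal managerial communication for the prediction of corporate financial risks)	cze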
dc.identifier.wos	000615028400001
dc.identifier.scopus	2-s2.0-85099111620
dc.identifier.obd	39886241
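
The abstract names logarithmic and square-root transformation as the feature transformations that most improve tax default prediction. The record does not give the exact formulas used in the paper, so the following is only a minimal sketch of what such transformations might look like. The sign-preserving variants, the function names, and the sample debt-to-sales values are all assumptions made for illustration; they are not taken from the study or its dataset.

```python
import math


def signed_log(x: float) -> float:
    """Sign-preserving log transform: sign(x) * ln(1 + |x|).

    Assumption: financial ratios can be zero or negative, so a plain
    log(x) would be undefined for some observations.
    """
    return math.copysign(math.log1p(abs(x)), x)


def signed_sqrt(x: float) -> float:
    """Sign-preserving square-root transform: sign(x) * sqrt(|x|)."""
    return math.copysign(math.sqrt(abs(x)), x)


def transform_column(values, fn):
    """Apply one transformation to every value of a feature column."""
    return [fn(v) for v in values]


# Hypothetical debt-to-sales ratios with one extreme and one negative value.
debt_to_sales = [0.0, 0.2, 0.5, 1.5, 24.0, -3.0]

log_col = transform_column(debt_to_sales, signed_log)
sqrt_col = transform_column(debt_to_sales, signed_sqrt)

for raw, lg, sq in zip(debt_to_sales, log_col, sqrt_col):
    print(f"raw={raw:7.2f}  log={lg:7.3f}  sqrt={sq:7.3f}")
```

Both transformations are monotonic, so they keep the ordering of firms on each ratio while compressing extreme values. The transformed columns would then be passed to a classifier such as the extreme gradient boosting model mentioned in the abstract.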