Publication: Similarity Space and Its Applications
Dissertation, open access
| dc.contributor.advisor | Mareš, Jan | |
| dc.contributor.author | Rozinek, Ondřej | |
| dc.contributor.referee | Horálek, Josef | |
| dc.contributor.referee | Kukal, Jaromír | |
| dc.date.accepted | 2024-06-21 | |
| dc.date.accessioned | 2024-07-08T11:44:53Z | |
| dc.date.available | 2024-07-08T11:44:53Z | |
| dc.date.issued | 2024 | |
| dc.date.submitted | 2024-04-26 | |
| dc.description.abstract | Mathematical spaces have been studied for centuries and belong to the basic mathematical theories used in various real-world applications. In general, a mathematical space is a set of mathematical objects with an associated structure. This structure can be specified by a number of operations on the objects of the set. These operations must satisfy certain axioms of the mathematical space. Similarity and dissimilarity functions are widely used in many research areas, including information retrieval, data mining, machine learning, and cluster analysis, with applications in database search, protein sequence comparison, and many more. When a dissimilarity function is used, a distance metric is normally required. Similarity functions, on the other hand, are widely used even though there is no formally accepted definition of the concept. This dissertation uses the novel term similarity space for the first time. A significant contribution of this dissertation is the identification of a class of functions that satisfy the axioms of similarity space, alongside the development of novel mathematical theorems and definitions that extend our understanding of similarity. This includes the exploration of the duality between similarity and metric spaces, the introduction of normalization transformations that address an open problem, and the establishment of new descriptions and definitions for convergence, continuity, and other fundamental properties within similarity spaces. A significant section is dedicated to developing a new fixed-point theory in similarity space, establishing solutions for differential equations, and introducing a new convergence criterion for the Newton method. Another theoretical contribution is the novel application of similarity space in linear regression.
Within the framework of Natural Language Processing (NLP) and Artificial Intelligence (AI), this dissertation applies theoretical insights to address real-world challenges, particularly in the areas of approximate string matching, complex fuzzy record matching, and deduplication. By developing a novel convolution-based string matching model, proposing an advanced mathematical model for fuzzy record similarity, and introducing an optimal Q-gram filter for bipartite matching, this research presents novel solutions that significantly improve upon state-of-the-art methods in terms of efficiency, accuracy, and applicability. In conclusion, this dissertation not only advances the theoretical understanding of similarity spaces but also demonstrates their vast potential for application in data processing and analysis. By bridging the gap between abstract mathematical theory and practical computational challenges, this work lays the groundwork for future innovations across a broad range of fields. | eng |
| dc.description.defence | After the doctoral candidate, Ing. Ondřej Rozinek, was introduced, the committee was acquainted with the supervisor's statement on the dissertation and on the candidate. The candidate presented his dissertation to the committee. The opponents' reviews were then read, and the candidate responded to the opponents' comments. In the subsequent public discussion, the candidate answered questions from the committee members, which are recorded on separate sheets. The committee assessed the dissertation and decided that it is not plagiarized. Finally, a secret ballot was held. The record of the voting results is provided in a separate attachment. | cze |
| dc.description.department | Fakulta elektrotechniky a informatiky | cze |
| dc.description.grade | Completed thesis, successfully defended | cze |
| dc.format | 177 p. | |
| dc.identifier | University Library (study room) | cze |
| dc.identifier.signature | D40714 | |
| dc.identifier.stag | 48999 | |
| dc.identifier.uri | https://hdl.handle.net/10195/83131 | |
| dc.language.iso | eng | |
| dc.publisher | Univerzita Pardubice | cze |
| dc.rights | no restrictions | cze |
| dc.subject | similarity metric | eng |
| dc.subject | similarity space | eng |
| dc.subject | normalized similarity | eng |
| dc.subject | edit distance | eng |
| dc.subject | Jaccard coefficient | eng |
| dc.subject | Q-gram filter | eng |
| dc.subject | indexing method | eng |
| dc.subject | approximate string matching | eng |
| dc.subject | record linkage | eng |
| dc.subject | entity resolution | eng |
| dc.subject | record deduplication | eng |
| dc.subject | similarity search | eng |
| dc.subject | similarity join | eng |
| dc.subject | linear regression | eng |
| dc.subject | fixed point | eng |
| dc.thesis.degree-discipline | Electrical Engineering and Informatics | cze |
| dc.thesis.degree-grantor | Univerzita Pardubice. Fakulta elektrotechniky a informatiky | cze |
| dc.thesis.degree-name | Ph.D. | |
| dc.thesis.degree-program | Electrical Engineering and Informatics | cze |
| dc.title | Similarity Space and Its Applications | eng |
| dc.type | dissertation | cze |
| dspace.entity.type | Publication |
Files
Original bundle
1 - 4 of 4
- Name: RozinekO_SimilaritySpace_JM_2024.pdf
- Size: 6.77 MB
- Format: Adobe Portable Document Format
- Description: Full text of the thesis

- Name: MaresJ_SimilaritySpace_OR_2024.pdf
- Size: 81.36 KB
- Format: Adobe Portable Document Format
- Description: Supervisor's review

- Name: HoralekJ_SimilaritySpace_OR_2024.pdf
- Size: 138.37 KB
- Format: Adobe Portable Document Format
- Description: Opponent's review

- Name: KukalJ_SimilaritySpace_OR_2024.pdf
- Size: 94.03 KB
- Format: Adobe Portable Document Format
- Description: Opponent's review