Publication: Estimation of atmospheric visibility by deep learning model using multimodal dataset
Article · open access · peer-reviewed · published
Date
Journal title
Journal ISSN
Volume title
Publisher
Elsevier
Abstract
Accurate estimation of atmospheric visibility is essential for numerous safety-critical applications, particularly in the field of transportation. In this study, a deep learning-based approach is investigated using a multimodal input representation that combines RGB images from a fixed-position surveillance camera with tabular meteorological variables collected from a nearby meteorological station. The meteorological input includes temperature, absolute pressure, relative humidity, dew point, wet bulb temperature, average and maximum wind speed, amount of precipitation, solar radiation, and ultraviolet index. Six neural network models for visibility estimation were developed and compared: a multimodal model utilizing both image and tabular meteorological inputs; two ablation models that use only unimodal input (image or meteorological data); a regions-of-interest (ROIs)-based model that extracts features from predefined image subregions; and two ablation models that use only a reduced set of meteorological variables. The multimodal model uses EfficientNetV2M for feature extraction and a set of fully connected neural networks to integrate the two modalities. The ROIs-based model also uses EfficientNetV2M, but only on manually selected reference regions of the scene. Evaluation was performed on a dataset of 1000 annotated images, with visibility manually determined based on reference points in the scene. The multimodal model achieved a mean squared error of 129,716 m², a mean absolute error of 165.4 m, and an R² score of 0.8861, with 84.46 % of predictions falling within a 10 % relative error margin. Although the ROIs-based model slightly outperformed the multimodal model in some regression metrics, its accuracy within tolerance thresholds was lower, and its reliance on manual scene annotation limits scalability. In contrast, the ablation models clearly demonstrated lower performance in almost all evaluated criteria.
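The evaluation criteria quoted above (MSE in m², MAE in m, R², and the share of predictions within a relative error margin) can be computed from paired visibility annotations and predictions as in the following NumPy sketch; the function name and the default 10 % tolerance are illustrative, not taken from the paper.

```python
import numpy as np

def visibility_metrics(y_true, y_pred, rel_tol=0.10):
    """Regression metrics of the kind reported in the abstract:
    mean squared error (m^2), mean absolute error (m), R^2 score,
    and the fraction of predictions within `rel_tol` relative error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - np.mean(y_true)) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    within_tol = float(np.mean(np.abs(err) / y_true <= rel_tol))
    return {"mse": mse, "mae": mae, "r2": r2, "within_tol": within_tol}

# Illustrative data: three annotated visibilities (m) and model estimates.
metrics = visibility_metrics([1000, 2000, 4000], [1100, 1900, 4000])
```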
The results show that the proposed multimodal input strategy provides a balanced and practical approach to automated visibility estimation. Compared to conventional unimodal input models, this architecture offers improved accuracy, stability, and generalisation ability, making it suitable for real-world applications where both visual and environmental data are available.
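The fusion step of the multimodal model — pooled image features from the CNN backbone concatenated with the ten tabular meteorological variables, then regressed to a visibility value through fully connected layers — can be sketched as below. The feature dimension of 1280 (EfficientNetV2M's pooled output size), the hidden width, and the random stand-in weights are assumptions for illustration; the trained model would use the actual backbone output and learned head weights.

```python
import numpy as np

rng = np.random.default_rng(0)

IMG_FEAT_DIM = 1280   # pooled EfficientNetV2M feature size (assumption)
MET_DIM = 10          # the ten meteorological variables listed in the abstract
HIDDEN = 256          # illustrative hidden width of the fully connected head

# Randomly initialised weights stand in for the trained regression head.
W1 = rng.standard_normal((IMG_FEAT_DIM + MET_DIM, HIDDEN)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, 1)) * 0.01
b2 = np.zeros(1)

def fuse_and_regress(img_features, met_vector):
    """Concatenate image and tabular features; regress visibility (metres)."""
    x = np.concatenate([img_features, met_vector], axis=-1)
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden layer
    return float((h @ W2 + b2)[0])     # scalar visibility estimate

img_feat = rng.standard_normal(IMG_FEAT_DIM)   # stand-in for backbone output
met = rng.standard_normal(MET_DIM)             # standardised met. variables
pred = fuse_and_regress(img_feat, met)
```

The design point illustrated here is late fusion: each modality is encoded separately and the joint representation is formed only at the fully connected stage, so either branch can be ablated (as in the unimodal comparison models) without changing the other.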
Description
Keywords
Atmospheric visibility, Neural network, Multimodal dataset, Meteorological variables, Deep learning