Authors: Neshov, N. N., Tonchev, K., Bozhilov, I. B., Petkova, R. R., Manolova, A. H.
Title: Improving Texture Recognition via Multi-Layer Feature Aggregation from Pre-Trained Vision Architectures
Keywords: DTD, FMD, GTOS-Mobile, KTH-TIPS-2, Multi-Layer Perceptron, texture recognition, transformer architectures
Abstract: Texture recognition is a fundamental task in computer vision, with diverse applications in material sciences, medicine, and agriculture. The ability to analyze complex patterns in images has been greatly enhanced by advancements in Deep Neural Networks and Vision Transformers. To address the challenging nature of texture recognition, this paper investigates the performance of several pre-trained vision architectures for texture recognition, including both CNN- and transformer-based models. For each architecture, multi-level features are extracted from early, intermediate, and final layers, concatenated, and fed into a trainable Multi-Layer Perceptron (MLP) classifier. The architecture is thoroughly evaluated using five publicly available texture datasets, KTH-TIPS2-b, FMD, GTOS-Mobile, DTD, and Soil, with MLP hyperparameters determined through an exhaustive grid search on one of the datasets to ensure optimal performance. Extensive experiments highlight the comparative performance of each architecture and demonstrate that aggregating features from different hierarchical levels improves texture recognition in most cases, outperforming even architectures that require substantially higher computational resources. The study also shows the particular effectiveness of transformer-based models, such as BEiTv2, in achieving state-of-the-art results on four of the five examined datasets.
References
- Agarwal M. Singhal A. Lall B. 3D local ternary co-occurrence patterns for natural, texture, face and bio medical image retrieval Neurocomputing 2018 313 333 345 10.1016/j.neucom.2018.06.027
- Akiva P. Purri M. Leotta M. Self-supervised material and texture representation learning for remote sensing tasks Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition New Orleans, LA, USA 18–24 June 2022 8203 8215
- Swetha R. Bende P. Singh K. Gorthi S. Biswas A. Li B. Weindorf D.C. Chakraborty S. Predicting soil texture from smartphone-captured digital images and an application Geoderma 2020 376 114562 10.1016/j.geoderma.2020.114562
- Liu L. Chen J. Fieguth P. Zhao G. Chellappa R. Pietikäinen M. From BoW to CNN: Two decades of texture representation for texture classification Int. J. Comput. Vis. 2019 127 74 109 10.1007/s11263-018-1125-z
- Zhai W. Cao Y. Zha Z.J. Xie H. Wu F. Deep structure-revealed network for texture recognition Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Seattle, WA, USA 13–19 June 2020 11010 11019
- Zhai W. Cao Y. Zhang J. Zha Z.J. Deep multiple-attribute-perceived network for real-world texture recognition Proceedings of the IEEE/CVF International Conference on Computer Vision Seoul, Republic of Korea 27 October–2 November 2019 3613 3622
- Chen Z. Li F. Quan Y. Xu Y. Ji H. Deep texture recognition via exploiting cross-layer statistical self-similarity Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Nashville, TN, USA 20–25 June 2021 5231 5240
- Fujieda S. Takayama K. Hachisuka T. Wavelet convolutional neural networks for texture classification arXiv 2017 10.48550/arXiv.1707.07394 1707.07394
- Liu Z. Mao H. Wu C.Y. Feichtenhofer C. Darrell T. Xie S. A convnet for the 2020s Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition New Orleans, LA, USA 18–24 June 2022 11976 11986
- Dosovitskiy A. An image is worth 16x16 words: Transformers for image recognition at scale arXiv 2020 2010.11929
- Touvron H. Cord M. Jégou H. Deit iii: Revenge of the vit Proceedings of the European Conference on Computer Vision Tel Aviv, Israel 23–27 October 2022 Springer Berlin/Heidelberg, Germany 2022 516 533
- Haralick R.M. Shanmugam K. Dinstein I.H. Textural features for image classification IEEE Trans. Syst. Man Cybern. 1973 SMC-3 610 621 10.1109/TSMC.1973.4309314
- Lazebnik S. Schmid C. Ponce J. A sparse texture representation using local affine regions IEEE Trans. Pattern Anal. Mach. Intell. 2005 27 1265 1278 10.1109/TPAMI.2005.151
- Jégou H. Douze M. Schmid C. Pérez P. Aggregating local descriptors into a compact image representation Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition San Francisco, CA, USA 13–18 June 2010 3304 3311
- Lowe D.G. Distinctive image features from scale-invariant keypoints Int. J. Comput. Vis. 2004 60 91 110 10.1023/B:VISI.0000029664.99615.94
- Gabor D. Theory of communication. Part 1: The analysis of information J. Inst. Electr. Eng. Part III Radio Commun. Eng. 1946 93 429 441 10.1049/ji-3-2.1946.0074
- Ojala T. Pietikainen M. Maenpaa T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns IEEE Trans. Pattern Anal. Mach. Intell. 2002 24 971 987 10.1109/TPAMI.2002.1017623
- Bu X. Wu Y. Gao Z. Jia Y. Deep convolutional network with locality and sparsity constraints for texture classification Pattern Recognit. 2019 91 34 46 10.1016/j.patcog.2019.02.003
- Xue J. Zhang H. Dana K. Deep texture manifold for ground terrain recognition Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Salt Lake City, UT, USA 18–23 June 2018 558 567
- Zhang H. Xue J. Dana K. Deep ten: Texture encoding network Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Honolulu, HI, USA 21–26 July 2017 708 717
- Peeples J. Xu W. Zare A. Histogram layers for texture analysis IEEE Trans. Artif. Intell. 2021 3 541 552 10.1109/TAI.2021.3135804
- Xu Y. Li F. Chen Z. Liang J. Quan Y. Encoding spatial distribution of convolutional features for texture representation Proceedings of the Advances in Neural Information Processing Systems 34 (NeurIPS 2021) Online 14 December 2021 Volume 34 22732 22744
- Mao S. Rajan D. Chia L.T. Deep residual pooling network for texture recognition Pattern Recognit. 2021 112 107817 10.1016/j.patcog.2021.107817
- Zhai W. Cao Y. Zhang J. Xie H. Tao D. Zha Z.J. On exploring multiplicity of primitives and attributes for texture recognition in the wild IEEE Trans. Pattern Anal. Mach. Intell. 2023 46 403 420 10.1109/TPAMI.2023.3325230
- Chen Z. Quan Y. Xu R. Jin L. Xu Y. Enhancing texture representation with deep tracing pattern encoding Pattern Recognit. 2024 146 109959 10.1016/j.patcog.2023.109959
- Scabini L. Zielinski K.M. Ribas L.C. Gonçalves W.N. De Baets B. Bruno O.M. RADAM: Texture recognition through randomized aggregated encoding of deep activation maps Pattern Recognit. 2023 143 109802 10.1016/j.patcog.2023.109802
- Florindo J.B. Fractal pooling: A new strategy for texture recognition using convolutional neural networks Expert Syst. Appl. 2024 243 122978 10.1016/j.eswa.2023.122978
- Maurício J. Domingues I. Bernardino J. Comparing vision transformers and convolutional neural networks for image classification: A literature review Appl. Sci. 2023 13 5521 10.3390/app13095521
- Scabini L. Sacilotti A. Zielinski K.M. Ribas L.C. De Baets B. Bruno O.M. A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis J. Imaging 2024 11 304 10.3390/jimaging11090304 41003354
- Liu Z. Lin Y. Cao Y. Hu H. Wei Y. Zhang Z. Lin S. Guo B. Swin transformer: Hierarchical vision transformer using shifted windows Proceedings of the IEEE/CVF International Conference on Computer Vision Montreal, QC, Canada 10–17 October 2021 10012 10022
- Yang H. Zhang S. Shen H. Zhang G. Deng X. Xiong J. Feng L. Wang J. Zhang H. Sheng S. A multi-layer feature fusion model based on convolution and attention mechanisms for text classification Appl. Sci. 2023 13 8550 10.3390/app13148550
- Tang H. Li Z. Zhang D. He S. Tang J. Divide-and-conquer: Confluent triple-flow network for RGB-T salient object detection IEEE Trans. Pattern Anal. Mach. Intell. 2024 47 1958 1974 10.1109/TPAMI.2024.3511621 40030445
- Liu Z. Hu H. Lin Y. Yao Z. Xie Z. Wei Y. Ning J. Cao Y. Zhang Z. Dong L. et al. Swin transformer v2: Scaling up capacity and resolution Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition New Orleans, LA, USA 18–24 June 2022 12009 12019
- Tu Z. Talebi H. Zhang H. Yang F. Milanfar P. Bovik A. Li Y. Maxvit: Multi-axis vision transformer Proceedings of the European Conference on Computer Vision Tel Aviv, Israel 23–27 October 2022 Springer Berlin/Heidelberg, Germany 2022 459 479
- Yu W. Wang X. Mambaout: Do we really need mamba for vision? Proceedings of the Computer Vision and Pattern Recognition Conference Nashville, TN, USA 11–15 June 2025 4484 4496
- Gu A. Dao T. Mamba: Linear-time sequence modeling with selective state spaces arXiv 2023 10.48550/arXiv.2312.00752 2312.00752
- Peng Z. Dong L. Bao H. Ye Q. Wei F. Beit v2: Masked image modeling with vector-quantized visual tokenizers arXiv 2022 2208.06366
- Sheth F. Mathur P. Gupta A.K. Chaurasia S. An advanced artificial intelligence framework integrating ensembled convolutional neural networks and Vision Transformers for precise soil classification with adaptive fuzzy logic-based crop recommendations Eng. Appl. Artif. Intell. 2025 158 111425 10.1016/j.engappai.2025.111425
- Caputo B. Hayman E. Mallikarjuna P. Class-specific material categorisation Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05), Volume 1 Beijing, China 17–20 October 2005 Volume 2 1597 1604
- Sharan L. Rosenholtz R. Adelson E. Material perception: What can you see in a brief glance? J. Vis. 2009 9 784 10.1167/9.8.784
- Song K. Yang H. Yin Z. Multi-scale boosting feature encoding network for texture recognition IEEE Trans. Circuits Syst. Video Technol. 2021 31 4269 4282 10.1109/TCSVT.2021.3051003
- Cimpoi M. Maji S. Kokkinos I. Mohamed S. Vedaldi A. Describing Textures in the Wild Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Columbus, OH, USA 23–28 June 2014 3606 3613 10.1109/CVPR.2014.461
- Neshov N. Tonchev K. Manolova A. LBCNIN: Local Binary Convolution Network with Intra-Class Normalization for Texture Recognition with Applications in Tactile Internet Electronics 2024 13 2942 10.3390/electronics13152942
- pytorch.org, Installation of PyTorch v1.12.1 Available online: https://pytorch.org/get-started/previous-versions/ (accessed on 28 November 2025)
- Wightman R. Pytorch Image Models (timm) Available online: https://github.com/rwightman/pytorch-image-models (accessed on 28 November 2025)
- Farhan Sheth (Phantom-fs); Contributors. Agro-Companion-Modules 2025 Available online: https://github.com/Phantom-fs/Agro-Companion-Modules (accessed on 28 November 2025)
- Montgomery D.C. Design and Analysis of Experiments 9th ed. Wiley Hoboken, NJ, USA 2017
- Hollander M. Wolfe D.A. Chicken E. Nonparametric Statistical Methods 3rd ed. Wiley Hoboken, NJ, USA 2013
- Wightman R. Pytorch Image Models (timm)—Huggingface Available online: https://huggingface.co/timm (accessed on 28 November 2025)
Issue: Electronics (Switzerland), vol. 14, 2025, Switzerland, https://doi.org/10.3390/electronics14234779
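The pipeline summarized in the abstract (features taken from early, intermediate, and final layers, concatenated, and passed to an MLP classifier) can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: no pre-trained backbone is loaded, the feature maps are random stand-ins, global average pooling before concatenation is an assumption (the abstract does not say how each layer is reduced to a vector), the MLP is untrained and forward-only, and all names, layer sizes, and the class count are hypothetical.

```python
import math
import random


def global_average_pool(feature_map):
    """Reduce a (channels x positions) feature map to one mean value per channel.

    Pooling is an assumption of this sketch, not a detail taken from the abstract.
    """
    return [sum(channel) / len(channel) for channel in feature_map]


def aggregate_layers(feature_maps):
    """Pool every layer's feature map and concatenate the pooled vectors."""
    aggregated = []
    for feature_map in feature_maps:
        aggregated.extend(global_average_pool(feature_map))
    return aggregated


class TinyMLP:
    """One hidden ReLU layer followed by softmax; forward pass only, untrained."""

    def __init__(self, in_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(num_classes)]
        self.b2 = [0.0] * num_classes

    def forward(self, x):
        hidden = [
            max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        logits = [
            sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(self.w2, self.b2)
        ]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Random stand-ins for activations of three backbone layers (hypothetical sizes).
rng = random.Random(42)


def fake_feature_map(channels, positions):
    return [[rng.random() for _ in range(positions)] for _ in range(channels)]


early = fake_feature_map(4, 16)
intermediate = fake_feature_map(8, 8)
final = fake_feature_map(16, 4)

features = aggregate_layers([early, intermediate, final])
mlp = TinyMLP(in_dim=len(features), hidden_dim=12, num_classes=5)
probs = mlp.forward(features)

print(len(features), "aggregated features;", len(probs), "class probabilities")
```

In a real setting the stand-in feature maps would come from a frozen pre-trained model, and only the MLP would be trained; the concatenated vector length is simply the sum of the channel counts of the selected layers.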