Authors: Neshov, N. N., Tonchev, K., Bozhilov, I. B., Petkova, R. R., Manolova, A. H.
Title: Improving Texture Recognition via Multi-Layer Feature Aggregation from Pre-Trained Vision Architectures
Keywords: DTD, FMD, GTOS-Mobile, KTH-TIPS-2, Multi-Layer Perceptron, texture recognition, transformer architectures

Abstract: Texture recognition is a fundamental task in computer vision, with diverse applications in material sciences, medicine, and agriculture. The ability to analyze complex patterns in images has been greatly enhanced by advancements in Deep Neural Networks and Vision Transformers. To address the challenging nature of texture recognition, this paper investigates the performance of several pre-trained vision architectures for texture recognition, including both CNN- and transformer-based models. For each architecture, multi-level features are extracted from early, intermediate, and final layers, concatenated, and fed into a trainable Multi-Layer Perceptron (MLP) classifier. The architecture is thoroughly evaluated using five publicly available texture datasets, KTH-TIPS2-b, FMD, GTOS-Mobile, DTD, and Soil, with MLP hyperparameters determined through an exhaustive grid search on one of the datasets to ensure optimal performance. Extensive experiments highlight the comparative performance of each architecture and demonstrate that aggregating features from different hierarchical levels improves texture recognition in most cases, outperforming even architectures that require substantially higher computational resources. The study also shows the particular effectiveness of transformer-based models, such as BEiTv2, in achieving state-of-the-art results on four of the five examined datasets.
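
The aggregation step described in the abstract (features taken from early, intermediate, and final layers, concatenated, and passed to an MLP classifier) can be illustrated with a minimal pure-Python sketch. Several things here are assumptions for illustration and are not taken from the paper: the feature maps are random stand-ins rather than outputs of a pre-trained backbone, each map is reduced by global average pooling before concatenation, the MLP has one hidden layer with untrained random weights, and all function names and sizes are hypothetical.

```python
import math
import random


def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a length-C vector.

    Assumption: the pooling choice is illustrative; the abstract does not say
    how each layer's features are reduced before concatenation.
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def aggregate_layers(feature_maps):
    """Pool every layer's feature map and concatenate the pooled vectors."""
    descriptor = []
    for feature_map in feature_maps:
        descriptor.extend(global_average_pool(feature_map))
    return descriptor


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_mlp(in_dim, hidden_dim, num_classes, rng):
    """Random, untrained weights for a one-hidden-layer MLP (hypothetical sizes)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": matrix(hidden_dim, in_dim), "b1": [0.0] * hidden_dim,
        "w2": matrix(num_classes, hidden_dim), "b2": [0.0] * num_classes,
    }


def mlp_forward(x, params):
    """ReLU hidden layer followed by a softmax over the class logits."""
    hidden = [max(0.0, h) for h in linear(x, params["w1"], params["b1"])]
    logits = linear(hidden, params["w2"], params["b2"])
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fake_feature_map(channels, height, width, rng):
    """Random stand-in for the output of one backbone layer."""
    return [
        [[rng.random() for _ in range(width)] for _ in range(height)]
        for _ in range(channels)
    ]


rng = random.Random(0)
# Early, intermediate, and final layers: more channels, lower resolution.
feature_maps = [
    fake_feature_map(4, 8, 8, rng),
    fake_feature_map(8, 4, 4, rng),
    fake_feature_map(16, 2, 2, rng),
]
descriptor = aggregate_layers(feature_maps)  # length 4 + 8 + 16
params = init_mlp(len(descriptor), hidden_dim=10, num_classes=5, rng=rng)
probs = mlp_forward(descriptor, params)
print(len(descriptor), len(probs))
```

In a real setting the stand-in feature maps would come from a frozen pre-trained model, and only the MLP weights would be trained on the texture dataset.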

Issue

Electronics (Switzerland), vol. 14, 2025, Switzerland, https://doi.org/10.3390/electronics14234779

Citations:
1. Gupta V., Mishra A., Shrivastava N., HyTexNet: Percentile-guided local encoding and deep feature fusion for enhanced texture classification, 2026, Knowledge Based Systems, issue 0, vol. 338, DOI 10.1016/j.knosys.2026.115482, issn 09507051 - 2026 - in publications indexed in Scopus

Type: journal article, publication in a journal with an impact factor, publication in a refereed journal, indexed in Scopus and Web of Science