Transforming Earth Observation: An Extensive Evaluation of Vision Transformers for Satellite Images-Based Land Cover Classification

Research output: Contribution to journal > Article > peer-review

Abstract

Satellite imagery offers rich information for land cover classification, but choosing a feature extractor (backbone architecture) that is both effective and efficient remains challenging. In this study, I benchmark 25 vision transformers across 10 public land cover datasets to guide backbone selection for downstream classification tasks. The proposed approach encodes each satellite image into a fixed-length feature vector via a pre-trained transformer, then trains and tests a linear support vector classifier on these encodings to isolate the impact of the backbone alone. I report average classification accuracy and F1-score over three random stratified splits per dataset, and I also measure training time to assess the computational cost. Results show that encodings produced by large-receptive-field transformers with advanced self-attention, particularly deit3_base_patch16_224 and twins_svt_large, achieve the highest accuracies without incurring prohibitive training times. In contrast, encodings from the compact variants train faster but incur notable performance drops of around 7%–8%. These findings reveal a clear trade-off between representational power and efficiency. Practitioners can use these rankings to select a transformer backbone that best balances accuracy and computational efficiency for satellite image-based land cover classification tasks, accelerating the development of robust and resource-aware systems.
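
The evaluation protocol described in the abstract can be sketched as follows. This is a minimal illustration and not the author's code: it assumes the images have already been encoded into fixed-length vectors, and it substitutes synthetic features from scikit-learn for those encodings so that it runs on its own. The function name evaluate_encodings, the 80/20 split ratio, the macro-averaged F1, and the regularisation value C=1.0 are assumptions; the abstract does not state them.

```python
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.svm import LinearSVC


def evaluate_encodings(features, labels, n_splits=3, test_size=0.2, seed=0):
    """Average accuracy, macro F1 and classifier fit time over random stratified splits.

    `features` stands in for the per-image vectors a frozen backbone would
    produce; the split ratio and F1 averaging mode are assumed, not sourced.
    """
    splitter = StratifiedShuffleSplit(
        n_splits=n_splits, test_size=test_size, random_state=seed
    )
    accuracies, f1_scores, fit_times = [], [], []
    for train_idx, test_idx in splitter.split(features, labels):
        classifier = LinearSVC(C=1.0, max_iter=10000)  # linear SVC on fixed encodings
        start = time.perf_counter()
        classifier.fit(features[train_idx], labels[train_idx])
        fit_times.append(time.perf_counter() - start)  # time only the classifier fit
        predictions = classifier.predict(features[test_idx])
        accuracies.append(accuracy_score(labels[test_idx], predictions))
        f1_scores.append(f1_score(labels[test_idx], predictions, average="macro"))
    return {
        "accuracy": float(np.mean(accuracies)),
        "f1": float(np.mean(f1_scores)),
        "train_time_s": float(np.mean(fit_times)),
    }


# Synthetic stand-in for backbone encodings: 300 "images", 64-dim vectors, 4 classes.
X, y = make_classification(
    n_samples=300, n_features=64, n_informative=16, n_classes=4, random_state=0
)
result = evaluate_encodings(X, y)
print(result)
```

In practice, X would be replaced by the feature matrix obtained from one pre-trained backbone, and the same function would be called once per backbone and dataset so that the resulting scores differ only through the encodings.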

Original language: English
Article number: e70082
Journal: Expert Systems
Volume: 42
Issue number: 7
State: Published - Jul 2025

Bibliographical note

Publisher Copyright:
© 2025 John Wiley & Sons Ltd.

Keywords

  • deep learning
  • image-based land cover classification
  • support vector machine
  • vision transformers

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computational Theory and Mathematics
  • Artificial Intelligence
