Abstract
Robotic vision for managing densely fruited crops (e.g., blueberry) remains challenging due to complex real-world conditions such as irregular fruit structures, overlapping clusters, varying berry sizes, inconsistent lighting, and cluttered backgrounds. These factors are compounded by the scarcity of diverse, high-quality annotated data from production environments, which is critical for training robust detection models. To address this gap, we introduce a comprehensive large-scale dataset repository for dense blueberry analysis, comprising three subsets aimed at advancing learning paradigms in this domain: DB-1) 1,195 fully annotated high-resolution images for supervised learning, DB-2) 141K frames from 520 videos for weakly/semi-supervised learning, and DB-3) 10K synthetic images with annotations, generated via our proposed data realization algorithm, which mimics real-field complexity and offers a scalable, cost-effective foundation for blueberry annotation and robust model generalization. We validated the utility of DB-1 by benchmarking a strong baseline and proposing a customized framework for densely clustered blueberries, achieving 75.06% sensitivity (SEN), 56.85% intersection over union (IoU), and 72.49% Dice score (DICE), outperforming the strongest baseline by 24.71%, 13.5%, and 8.6%, respectively. Further implementation and supplementary details are available at our GitHub repository: https://github.com/Owais-CodeHub/PVT-SN-SAM-RN.
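For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how SEN, IoU, and DICE are commonly computed for a pair of binary segmentation masks. It uses the standard textbook definitions and is not taken from the authors' repository; the function name `segmentation_metrics` and the nested-list mask format are assumptions made for illustration, and the paper's own evaluation protocol (for example, how scores are averaged across images) may differ.

```python
from itertools import chain


def segmentation_metrics(pred, target):
    """Compute SEN, IoU and DICE for two binary masks.

    Hypothetical helper for illustration only: `pred` and `target` are
    same-shaped nested lists of 0/1 values (rows of pixels).
    """
    p = [bool(v) for v in chain.from_iterable(pred)]
    t = [bool(v) for v in chain.from_iterable(target)]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    # Pixel-level confusion counts for the foreground (berry) class.
    tp = sum(a and b for a, b in zip(p, t))      # predicted berry, is berry
    fp = sum(a and not b for a, b in zip(p, t))  # predicted berry, is background
    fn = sum(b and not a for a, b in zip(p, t))  # missed berry pixels

    # Standard definitions; return 0.0 when a denominator is empty.
    sen = tp / (tp + fn) if (tp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"SEN": sen, "IoU": iou, "DICE": dice}


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [1, 0]]
    print(segmentation_metrics(predicted, ground_truth))
```

For a single mask pair, the two overlap scores are related by DICE = 2·IoU / (1 + IoU), so DICE is never lower than IoU.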
| Original language | English |
|---|---|
| Article number | 2014 |
| Journal | Scientific Data |
| Volume | 12 |
| Issue number | 1 |
| DOIs | |
| State | Published - Dec 2025 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025.
ASJC Scopus subject areas
- Statistics and Probability
- Information Systems
- Education
- Computer Science Applications
- Statistics, Probability and Uncertainty
- Library and Information Sciences