Human decision-making regularly draws on visual information from different perspectives or views, yet in machine-learning-based object classification the decision is typically made from a single image of the object. The visual information in a single image may not be enough for a correct decision, especially in complex classification problems. Using multiple views as the 3D representation has so far achieved state-of-the-art performance for object classification. However, view-based 3D object classification methods have shortcomings: they use all views captured from fixed camera viewpoints, some of which may be neither discriminative nor useful for classification, and they use all regions of each view, which can confuse the classifier. These observations motivate smarter, more efficient selective multi-view classification models that extract multi-view images from 3D objects and select the most significant regions of the most influential views. Hence, this research will propose selective 3D deep classification models that extract multi-view images from 3D data representations, select the discriminative views using an importance-scores method, and highlight the most significant regions of those views using Grad-CAM techniques. Experimental evaluation will be conducted on the ModelNet40 dataset by applying the proposed selective 3D CNN classification model to 3D objects to assess its classification performance. The proposed methods will make the technology practical for real-world applications such as medical image analysis, automated driving, intelligent robots, crowd surveillance, automated building-structure inspection, virtual/augmented reality, and archaeology.
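The two selection steps described above — scoring rendered views to keep only the discriminative ones, and localising salient regions with Grad-CAM — could be sketched roughly as follows. This is a minimal NumPy-only illustration: the max-activation importance score, the softmax weighting, and all array shapes are assumptions for demonstration, not the project's actual method, and a real system would compute the activations and gradients from a trained CNN.

```python
import numpy as np

def select_views(view_features, k):
    """Score each rendered view and keep the top-k most informative.

    view_features: (V, D) array, one feature descriptor per view.
    Returns the indices of the k highest-scoring views and the
    softmax-normalised importance weights over all V views.
    """
    scores = view_features.max(axis=1)       # placeholder importance: peak activation per view
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    top = np.argsort(weights)[::-1][:k]      # highest-weight views first
    return top, weights

def grad_cam(activations, gradients):
    """Grad-CAM heatmap for one view.

    activations: (C, H, W) feature maps from a convolutional layer.
    gradients:   (C, H, W) gradients of the class score w.r.t. those maps.
    """
    alphas = gradients.mean(axis=(1, 2))     # channel weights: global average of gradients
    cam = (alphas[:, None, None] * activations).sum(axis=0)
    cam = np.maximum(cam, 0.0)               # ReLU: keep positively contributing regions
    if cam.max() > 0:
        cam /= cam.max()                     # normalise to [0, 1] for visualisation
    return cam
```

In a full pipeline, `select_views` would run over descriptors pooled from each rendered viewpoint, and `grad_cam` would then be applied only to the selected views to crop or mask the most significant regions before the final classification.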
|Effective start/end date
|1/02/23 → 1/02/24