
PhD Oral Exam - Ahmed Zgaren, Information and Systems Engineering

Context Mining for Visual Object Counting


Date & time: Tuesday, May 20, 2025, 2 p.m. – 5 p.m.
Cost: This event is free
Organization: School of Graduate Studies
Contact: Dolly Grewal
Accessible location: Yes

When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge in the thesis subject, as well as the candidate’s own contributions to it. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.

Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.

Abstract

Visual object counting is a fundamental task in computer vision that aims to accurately estimate the number of objects of interest within an image. It has applications across many domains, including environmental monitoring, surveillance, retail analytics, and medical imaging. Traditional counting methods often struggle with object occlusion, variations in scale and appearance, and complex scene backgrounds. Although deep learning has significantly advanced the field, limitations remain, particularly in accurately capturing contextual information.

This thesis develops novel approaches to visual object counting, targeting key research problems of accuracy, efficiency, and robustness in both class-specific and class-agnostic counting scenarios. It makes three main contributions. First, we propose a novel hybrid counting method that combines local detection with global estimation to count objects accurately in aerial imagery; by exploiting both local and global information, this approach improves counting accuracy in high-density scenes. Second, we introduce a self-attention-based model for class-agnostic counting that effectively encodes repetitive object patterns, allowing precise counting even in the presence of object variations and background clutter; the improved feature representation and matching yield greater robustness and generalization. Finally, we present a novel box-free counting model that requires only one annotation point per object, significantly reducing the annotation burden; it employs contextual transformers and a position-aware attention encoder to achieve accurate counts with minimal annotation effort.

The effectiveness of the proposed methods is demonstrated through extensive experiments on both public and private datasets. Comparisons with state-of-the-art methods show the superior performance of our approaches on several key challenges in visual object counting. Together, these contributions advance the field of visual object counting by providing more accurate, efficient, and robust counting methods, opening new possibilities for automated counting in a wide range of applications.
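For readers unfamiliar with the area, the class-agnostic idea above can be pictured as matching the features of one annotated exemplar against the rest of the image, then reading a count off a density map. The following minimal Python sketch illustrates only that intuition; it is not the thesis model, and the random stand-in "backbone", the feature shapes, and the clip-based "decoder" are placeholder assumptions.

    # Minimal illustrative sketch, NOT the thesis model: class-agnostic
    # counting by matching one exemplar's feature vector against an image
    # feature map, then summing a density map to estimate a count.
    import numpy as np

    def similarity_map(features, exemplar, eps=1e-8):
        """Cosine similarity between each spatial feature and the exemplar.

        features: (H, W, C) feature map from some backbone.
        exemplar: (C,) feature taken at a single annotated point.
        Returns an (H, W) map where repeated instances of the exemplar's
        pattern score high; this is the intuition behind exemplar matching.
        """
        f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + eps)
        e = exemplar / (np.linalg.norm(exemplar) + eps)
        return f @ e

    def count_from_density(density):
        """A density map is trained to integrate to the object count,
        so the estimated count is simply the sum over all pixels."""
        return float(density.sum())

    # Toy run with random features standing in for a real CNN/transformer.
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(32, 32, 64))    # (H, W, C) "backbone" output
    exemplar = feats[10, 12]                 # feature at one annotated point
    sim = similarity_map(feats, exemplar)
    density = np.clip(sim, 0.0, None)        # crude stand-in for a decoder
    print(f"estimated count: {count_from_density(density):.1f}")

In practice, the feature extractor and the density decoder are learned end to end, which is where the attention-based encoding and point-level supervision described in the abstract come in.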
