Object Counting with Deep Learning
dc.contributor.advisor | Stavness, Ian Kent | |
dc.contributor.committeeMember | Makaroff, Dwight | |
dc.contributor.committeeMember | Roy, Chanchal | |
dc.contributor.committeeMember | Ko, Seok-Bum | |
dc.creator | Aich, Shubhra | |
dc.date.accessioned | 2019-07-04T14:32:02Z | |
dc.date.available | 2019-07-04T14:32:02Z | |
dc.date.created | 2019-04 | |
dc.date.issued | 2019-07-04 | |
dc.date.submitted | April 2019 | |
dc.date.updated | 2019-07-04T14:32:03Z | |
dc.description.abstract | This thesis explores various empirical aspects of deep learning or convolutional-network-based models for efficient object counting. First, we train moderately large convolutional networks from scratch on comparatively small datasets containing a few hundred samples, using conventional image-processing-based data augmentation. Then, we extend this approach to unconstrained, outdoor images using more advanced architectural concepts. Additionally, we propose an efficient, randomized data augmentation strategy based on sub-regional pixel distribution for low-resolution images. Next, we investigate the effectiveness of depth-to-space shuffling of feature elements for efficient segmentation on simpler problems such as binary segmentation, which is often required in the counting framework. This depth-to-space operation violates the basic assumption of encoder-decoder segmentation architectures. Consequently, it helps to train the encoder model as a sparsely connected graph. Nonetheless, our depth-to-space models achieve accuracy comparable to that of standard encoder-decoder architectures. After that, we illustrate the subtleties arising from the lack of localization information in the conventional scalar count loss for one-look models. Without using additional annotations, we propose a possible solution based on regulating a network-generated heatmap through a weak, subsidiary loss. The models trained with this auxiliary loss alongside the conventional loss perform much better than their baseline counterparts, both qualitatively and quantitatively. Lastly, the intricacies of tiled prediction for high-resolution images are studied in detail, and a simple and effective trick of eliminating the normalization factor in an existing computational block is demonstrated.
All of the approaches employed here are thoroughly benchmarked against previous state-of-the-art approaches across multiple heterogeneous object counting datasets. | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | http://hdl.handle.net/10388/12155 | |
dc.subject | Object counting | |
dc.subject | Deep learning | |
dc.subject | One-look models | |
dc.subject | Heatmap Regulation | |
dc.subject | Global Sum Pooling | |
dc.title | Object Counting with Deep Learning | |
dc.type | Thesis | |
dc.type.material | text | |
thesis.degree.department | Computer Science | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | University of Saskatchewan | |
thesis.degree.level | Masters | |
thesis.degree.name | Master of Science (M.Sc.) | |