Thesis (Selection of subject)

6D pose estimation of objects in images

Thesis title in Czech:	Odhad 6D polohy objektů v obrazech
Thesis title in English:	6D pose estimation of objects in images
Key words:	Počítačové vidění|Odhad 6D polohy|Nekalibrovaná kamera|Detekce nových objektů|Vizuální rozpoznávání|Hluboké učení
English key words:	Computer vision|6D pose estimation|Uncalibrated camera|Novel object detection|Visual recognition|Deep learning
Academic year of topic announcement:	2023/2024
Thesis type:	diploma thesis
Thesis language:	English
Department:	Department of Software and Computer Science Education (32-KSVI)
Supervisor:	Josef Šivic
Author:	Mgr. Martin Cífka - assigned and confirmed by the Study Dept.
Date of registration:	09.10.2023
Date of assignment:	19.12.2023
Confirmed by Study dept. on:	19.12.2023
Date and time of defence:	14.02.2024 09:00
Date of electronic submission:	11.01.2024
Date of submission of printed version:	11.01.2024
Date of proceeded defence:	14.02.2024
Opponents:	doc. RNDr. Elena Šikudová, Ph.D.

Guidelines

Understanding scenes at the object level has many applications in augmented reality and robotics, e.g., accurate robotic manipulation. The goal of this thesis is to make a step toward estimating the 6D pose (3D translation and 3D rotation) of objects depicted in an image, based on a database of objects with known 3D models. Two challenges are considered: (i) estimating the 6D pose from images taken by an uncalibrated camera (e.g., frames from videos downloaded from YouTube) and (ii) detecting objects in the image without costly object-specific training. The thesis aims to approach these challenges by building on state-of-the-art methods: FocalPose [1], MegaPose [2], SAM [3], and CNOS [4]. More specifically, the objectives are:

1. Review state-of-the-art methods for object 6D pose estimation from images and for object detection in images. Identify their limitations and benefits.

2. Explore the possibilities for addressing the identified limitations and improving the state-of-the-art methods for object pose estimation [1]. For example, promising directions are creating synthetic training data biased toward the training distribution or improving the retrieval of the relevant instance from the object database.

3. Explore the possibilities for addressing the limitations of object detection methods for unseen objects, i.e., objects that were not used for training, taking the approach of CNOS [4] as a baseline. Promising directions include, for example, tuning and evaluating object proposal mechanisms (SAM [3] or selective search [5]) or combining classification methods for unseen object models, such as the MegaPose coarse model [2] or CNOS [4].
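
To make the term "6D pose" and the uncalibrated-camera challenge from the guidelines concrete, the following is a minimal sketch in plain Python. It is not taken from FocalPose or MegaPose; the function names, the model point, and the pose values are hypothetical and chosen only for illustration. It assumes a standard pinhole camera with the principal point at the image origin.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix about the z-axis (one of the three rotational degrees of freedom)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(R, t, point):
    """Apply a 6D pose (rotation R, translation t): object frame -> camera frame."""
    return [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]

def project(point_cam, focal_px, principal_point=(0.0, 0.0)):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point_cam
    return (focal_px * x / z + principal_point[0],
            focal_px * y / z + principal_point[1])

# Hypothetical values, for illustration only.
model_point = [0.1, 0.0, 0.0]        # a point on the 3D model, in metres
R = rotation_z(math.pi / 2)          # 90-degree rotation about z
t_near = [0.0, 0.0, 2.0]             # object 2 m in front of the camera
t_far = [0.0, 0.0, 4.0]              # object 4 m in front of the camera

pixel_a = project(transform(R, t_near, model_point), focal_px=600.0)
pixel_b = project(transform(R, t_far, model_point), focal_px=1200.0)

# For this point (which lies at the depth of the object origin), doubling both
# the focal length and the depth yields the same pixel. With an unknown focal
# length, depth cannot be read off from image size alone.
print(pixel_a, pixel_b)
```

For a real object with depth extent the two projections differ slightly through perspective effects, which is the kind of cue a method that estimates focal length jointly with the pose can exploit.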

References

[1] Ponimatkin, Georgy, et al. "Focal length and object pose estimation via render and compare." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[2] Labbé, Yann, et al. "MegaPose: 6D pose estimation of novel objects via render & compare." 6th Annual Conference on Robot Learning (CoRL). 2022.

[3] Kirillov, Alexander, et al. "Segment anything." arXiv preprint arXiv:2304.02643 (2023).

[4] Nguyen, Van Nguyen, et al. "CNOS: A Strong Baseline for CAD-based Novel Object Segmentation." arXiv preprint arXiv:2307.11067 (2023).

[5] Van de Sande, Koen E. A., et al. "Segmentation as selective search for object recognition." 2011 International Conference on Computer Vision. IEEE, 2011.