6D pose estimation of objects in images
Thesis title in Czech: | Odhad 6D polohy objektů v obrazech |
---|---|
Thesis title in English: | 6D pose estimation of objects in images |
Key words: | Počítačové vidění, Odhad 6D polohy, Nekalibrovaná kamera, Detekce nových objektů, Vizuální rozpoznávání, Hluboké učení |
English key words: | Computer vision, 6D pose estimation, Uncalibrated camera, Novel object detection, Visual recognition, Deep learning |
Academic year of topic announcement: | 2023/2024 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Department of Software and Computer Science Education (32-KSVI) |
Supervisor: | Josef Šivic |
Author: | Mgr. Martin Cífka - assigned and confirmed by the Study Dept. |
Date of registration: | 09.10.2023 |
Date of assignment: | 19.12.2023 |
Confirmed by Study dept. on: | 19.12.2023 |
Date and time of defence: | 14.02.2024 09:00 |
Date of electronic submission: | 11.01.2024 |
Date of submission of printed version: | 11.01.2024 |
Date of completed defence: | 14.02.2024 |
Opponents: | doc. RNDr. Elena Šikudová, Ph.D. |
Guidelines |
Understanding scenes on the object level has many applications in augmented reality or robotics, e.g. accurate robotic manipulation. The goal of this thesis is to make a step toward estimating the 6D pose (3D translation and 3D rotation) of objects depicted in the image based on a database of objects with known 3D models. Two challenges are considered: (i) estimating 6D pose from images taken by an uncalibrated camera (e.g., frames from videos downloaded from YouTube) and (ii) detecting objects in the image without costly object-specific training. The thesis aims to approach these challenges by building on state-of-the-art methods: FocalPose [1], MegaPose [2], SAM [3], and CNOS [4]. More specifically, the objectives are:
1. Review state-of-the-art methods for object 6D pose estimation from images and for object detection in images. Identify their limitations and benefits.
2. Explore ways to address the identified limitations and improve the state-of-the-art methods for object pose estimation [1]. Promising directions include, for example, creating synthetic training data biased toward the training distribution or improving the retrieval of the relevant instance from the object database.
3. Explore ways to address the limitations of object detection methods for unseen objects, i.e., objects that were not used for training, taking the approach of CNOS [4] as a baseline. Promising directions include, for example, tuning and evaluating object proposal mechanisms (SAM [3] or selective search [5]) or combining classification methods for unseen object models, such as the MegaPose coarse model [2] or CNOS [4]. |
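For readers unfamiliar with the terminology, the following minimal Python sketch illustrates what a 6D pose is and why an uncalibrated camera complicates its estimation. It is not taken from FocalPose, MegaPose, or any other cited method. The helper names (`rotation_z`, `transform`, `project`) and all numeric values are hypothetical and chosen only for illustration. The sketch assumes a simple pinhole camera with focal length `focal` and principal point `(cx, cy)`.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix about the z-axis (one of the three rotational degrees of freedom)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(R, t, p):
    """Apply the rigid transform x_cam = R p + t to a 3D model point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def project(point_cam, focal, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point_cam
    return (focal * x / z + cx, focal * y / z + cy)

# Hypothetical 6D pose: 3D rotation R plus 3D translation t (metres).
R = rotation_z(math.pi / 2)
t = [0.0, 0.0, 2.0]
model_point = [0.1, 0.0, 0.0]  # a point on the object's 3D model

# Projection with an assumed focal length of 600 pixels.
u1, v1 = project(transform(R, t, model_point), focal=600.0, cx=320.0, cy=240.0)

# Doubling both the focal length and the object's depth gives the same pixel
# for this point, so the image alone does not pin down focal length and depth.
t_far = [0.0, 0.0, 4.0]
u2, v2 = project(transform(R, t_far, model_point), focal=1200.0, cx=320.0, cy=240.0)

print((u1, v1), (u2, v2))
```

In this example the two projections coincide, which hints at why the focal length has to be estimated together with the pose when the camera intrinsics are unknown.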
References |
[1] Ponimatkin, Georgy, et al. "Focal length and object pose estimation via render and compare." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
[2] Labbé, Yann, et al. "MegaPose: 6D pose estimation of novel objects via render & compare." 6th Annual Conference on Robot Learning (CoRL). 2022.
[3] Kirillov, Alexander, et al. "Segment anything." arXiv preprint arXiv:2304.02643 (2023).
[4] Nguyen, Van Nguyen, et al. "CNOS: A strong baseline for CAD-based novel object segmentation." arXiv preprint arXiv:2307.11067 (2023).
[5] Van de Sande, Koen E. A., et al. "Segmentation as selective search for object recognition." 2011 International Conference on Computer Vision. IEEE, 2011. |