6D pose estimation of objects in images
Thesis title in Czech: | Odhad 6D polohy objektů v obrazech |
---|---|
Thesis title in English: | 6D pose estimation of objects in images |
Keywords (Czech): | Počítačové vidění|Odhad 6D polohy|Nekalibrovaná kamera|Detekce nových objektů|Vizuální rozpoznávání|Hluboké učení |
Keywords (English): | Computer vision|6D pose estimation|Uncalibrated camera|Novel object detection|Visual recognition|Deep learning |
Academic year of topic announcement: | 2023/2024 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Department of Software and Computer Science Education (32-KSVI) |
Supervisor: | Josef Šivic |
Author: | Mgr. Martin Cífka - assigned and confirmed by the Study Dept. |
Date of registration: | 09.10.2023 |
Date of assignment: | 19.12.2023 |
Date of confirmation by the Study Dept.: | 19.12.2023 |
Date and time of defence: | 14.02.2024 09:00 |
Date of electronic submission: | 11.01.2024 |
Date of printed submission: | 11.01.2024 |
Date of completed defence: | 14.02.2024 |
Opponents: | doc. RNDr. Elena Šikudová, Ph.D. |
Guidelines |
Understanding scenes at the object level has many applications in augmented reality and robotics, e.g., accurate robotic manipulation. The goal of this thesis is to make a step toward estimating the 6D pose (3D translation and 3D rotation) of objects depicted in an image, based on a database of objects with known 3D models. Two challenges are considered: (i) estimating the 6D pose from images taken by an uncalibrated camera (e.g., frames from videos downloaded from YouTube) and (ii) detecting objects in the image without costly object-specific training. The thesis aims to approach these challenges by building on state-of-the-art methods: FocalPose [1], MegaPose [2], SAM [3], and CNOS [4]. More specifically, the objectives are:
1. Review state-of-the-art methods for object 6D pose estimation from images and for object detection in images. Identify their limitations and benefits.
2. Explore the possibilities for addressing the identified limitations and improving the state-of-the-art methods for object pose estimation [1]. For example, a promising direction is creating synthetic training data biased toward the training distribution or improving the retrieval of the relevant instance from the object database.
3. Explore the possibilities for addressing the limitations of object detection methods for unseen objects, i.e., objects that were not used for training, taking the approach of CNOS [4] as a baseline. Promising directions include, for example, tuning and evaluating object proposal mechanisms (SAM [3] or selective search [5]) or combining unseen-object model classification methods such as the MegaPose coarse model [2] or CNOS [4]. |
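As a rough illustration of the terms used in the guidelines above, the sketch below applies a 6D pose (a rotation R and a translation t) to a model point and projects it with a pinhole camera whose focal length is given in pixels. It is not taken from FocalPose, MegaPose, or the thesis; the function names, the z-axis rotation, and all numeric values are assumptions chosen only for the example.

```python
import math

# Hypothetical helpers for illustration; not the API of any cited method.

def rotation_z(angle_rad):
    """3x3 rotation matrix about the z-axis (one simple choice of 3D rotation)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(R, t, point):
    """Apply the 6D pose (R, t): map a model point into the camera frame."""
    return [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]

def project(point_cam, focal_px, principal_point=(0.0, 0.0)):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal_px * x / z + principal_point[0],
            focal_px * y / z + principal_point[1])

# Assumed example values: 90-degree rotation, object 2 m in front of the camera.
R = rotation_z(math.pi / 2)
t = [0.0, 0.0, 2.0]
p_model = [0.1, 0.0, 0.0]

u, v = project(transform(R, t, p_model), focal_px=600.0)

# With an uncalibrated camera the focal length is unknown: for this point,
# doubling both the focal length and the depth gives the same pixel.
u2, v2 = project(transform(R, [0.0, 0.0, 4.0], p_model), focal_px=1200.0)
```

In this example both calls map the point to the same pixel, which hints at why an unknown focal length makes the pose harder to recover from a single image: focal length and object distance can partly compensate for each other.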
References |
[1] Ponimatkin, Georgy, et al. "Focal length and object pose estimation via render and compare." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
[2] Labbé, Yann, et al. "Megapose: 6d pose estimation of novel objects via render & compare." 6th Annual Conference on Robot Learning (CoRL). 2022.
[3] Kirillov, Alexander, et al. "Segment anything." arXiv preprint arXiv:2304.02643 (2023).
[4] Nguyen, Van Nguyen, et al. "CNOS: A Strong Baseline for CAD-based Novel Object Segmentation." arXiv preprint arXiv:2307.11067 (2023).
[5] Van de Sande, Koen EA, et al. "Segmentation as selective search for object recognition." 2011 International Conference on Computer Vision. IEEE, 2011. |