Мохрат М., Махмуд Ж., Юманов М.А. (scientific supervisor: Колюбин С.А.) Online integration of visual, language, and geometric data for an open-vocabulary semantic SLAM system
In this work, we propose a system that tightly integrates visual data from a camera, the output of a semantic and language description model, and geometric features derived from the SLAM process. This integration not only improves mapping and localization accuracy but also enables interactive, on-the-fly object search via natural language queries, a capability useful for a range of robotic systems.
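To make the idea of natural-language object search concrete, the following is a minimal sketch and not the authors' implementation. It assumes each mapped object stores a 3D centroid from SLAM and an embedding in a space shared with text, and that a query has already been embedded by some language model. The names `MapObject`, `cosine_similarity`, and `search` are hypothetical, and the three-dimensional vectors are toy stand-ins for real embeddings.

```python
import math
from dataclasses import dataclass


@dataclass
class MapObject:
    """Hypothetical map entry: a SLAM-derived position plus a language-aligned embedding."""
    label_hint: str                       # human-readable tag, used here only for display
    centroid: tuple[float, float, float]  # object position in the map frame (metres)
    embedding: list[float]                # toy stand-in for a vision-language embedding


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; returns 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(objects: list[MapObject], query_embedding: list[float], top_k: int = 1):
    """Rank map objects by similarity to the query embedding, best first."""
    scored = [(cosine_similarity(o.embedding, query_embedding), o) for o in objects]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy semantic map; all values are made up for illustration.
semantic_map = [
    MapObject("chair", (1.0, 0.2, 0.0), [0.9, 0.1, 0.0]),
    MapObject("mug", (0.3, 1.5, 0.8), [0.1, 0.9, 0.1]),
    MapObject("plant", (2.0, 2.0, 0.0), [0.0, 0.2, 0.9]),
]

# Assumed to be the text embedding of a query such as "something to drink from".
query = [0.2, 0.95, 0.05]

best_score, best = search(semantic_map, query)[0]
print(best.label_hint, best.centroid)
```

Because objects are ranked by embedding similarity rather than matched against a fixed label set, a query does not need to use any class name known in advance, which is the sense in which such a search is open-vocabulary.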
Мохрат М., Махмуд Ж., Юманов М.А. (scientific supervisor: Колюбин С.А.) Online integration of visual, language, and geometric data for an open-vocabulary semantic SLAM system // Collection of Abstracts of the Congress of Young Scientists. Electronic edition. – St. Petersburg: ITMO University, [2025]. URL: https://kmu.itmo.ru/digests/article/15095