SiamSA: Scale-Aware Siamese Object Tracking for Vision-Based UAM Approaching

Guangze Zheng Changhong Fu* Junjie Ye Bowen Li Geng Lu Jia Pan
Third perspective: a fixed camera. First perspective: the onboard camera.
A demo of unmanned aerial manipulator (UAM) visual tracking while approaching the object.

UAM Tracking Benchmark

12 kinds of challenges involved in UAM tracking.

UAMT100 -- 100 image sequences

[Google Drive] [Baidu Pan]

UAMT20L -- 20 long image sequences

[Google Drive] [Baidu Pan]

TII 2022 Abstract

In many industrial applications of unmanned aerial manipulators (UAMs), visually approaching the object is crucial to subsequent manipulation. In comparison with the widely studied manipulation, research on the key to efficient vision-based UAM approaching, i.e., UAM object tracking, is still limited. Since traditional model-based UAM tracking is costly and cannot track arbitrary objects, an intuitive solution is to introduce state-of-the-art model-free Siamese trackers from the visual tracking field. Although Siamese tracking is the most suitable choice for onboard embedded processors, severe object scale variation in UAM tracking brings formidable challenges. To address these problems, this work proposes a novel model-free scale-aware Siamese tracker (SiamSA). Specifically, a scale attention network is proposed to emphasize scale awareness in feature processing, and a scale-aware anchor proposal network is designed for anchor proposal. In addition, two novel UAM tracking benchmarks are recorded for the first time. Comprehensive experiments on the benchmarks validate the effectiveness of SiamSA. Furthermore, real-world tests confirm its practicality for industrial UAM approaching tasks, with high efficiency and robustness.


IROS 2022 Abstract

Although manipulation with the unmanned aerial manipulator (UAM) has been widely studied, vision-based UAM approaching, which is crucial to the subsequent manipulation, generally lacks effective design. The key to visual UAM approaching lies in object tracking, while current UAM tracking typically relies on costly model-based methods. Moreover, UAM approaching often confronts more severe object scale variation, which makes it inappropriate to directly employ state-of-the-art model-free Siamese-based methods from the object tracking field. To address these problems, this work proposes a novel Siamese network with pairwise scale-channel attention (SiamSA) for vision-based UAM approaching. Specifically, SiamSA consists of a pairwise scale-channel attention network (PSAN) and a scale-aware anchor proposal network (SA-APN). PSAN acquires valuable scale information for feature processing, while SA-APN mainly attaches scale awareness to anchor proposal. Moreover, a new tracking benchmark for UAM approaching, namely UAMT100, is recorded with 35K frames on a flying UAM platform for evaluation. Exhaustive experiments on the benchmarks and real-world tests validate the efficiency and practicality of SiamSA with a promising speed.


Video for SiamSA Evaluation and Real-World Tests

Qualitative comparisons on five sequences are included in this demo. Click for the video on YouTube. View on Bilibili.

Video for IROS 2022 Presentation

The presentation of our work at IROS 2022. Click for the video on YouTube. View on Bilibili.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 62173249), the Natural Science Foundation of Shanghai (No. 20ZR1460100), and the Key R&D Program of Sichuan Province (No. 2020YFSY0004). The code is implemented based on pysot, SiamAPN, and SiamSE. We would like to express our sincere thanks to the contributors. We would like to thank Ziang Cao for his advice on the code. We appreciate the help from Fuling Lin, Haobo Zuo, and Liangliang Yao. We would like to thank Kunhan Lu for his advice on TensorRT acceleration.

The design of this project page is based on PointFlow and avian-mesh.