BEV
Papers
Vision-Centric BEV Perception: A Survey
- arxiv: https://arxiv.org/abs/2208.02797
- github: https://github.com/4DVLab/Vision-Centric-BEV-Perception
Multi-Camera 3D Object Detection
Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D
- intro: ECCV 2020
- intro: NVIDIA, Vector Institute, University of Toronto
- project page: https://nv-tlabs.github.io/lift-splat-shoot/
- arxiv: https://arxiv.org/abs/2008.05711
- github: https://github.com/nv-tlabs/lift-splat-shoot
BEVDet: High-Performance Multi-Camera 3D Object Detection in Bird-Eye-View
- intro: PhiGent Robotics
- arxiv: https://arxiv.org/abs/2112.11790
BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection
- intro: PhiGent Robotics
- arxiv: https://arxiv.org/abs/2203.17054
BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving
- intro: Tsinghua University & PhiGent Robotics
- arxiv: https://arxiv.org/abs/2205.09743
- github: https://github.com/zhangyp15/BEVerse
BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
- intro: Nanjing University & Shanghai AI Laboratory & The University of Hong Kong
- arxiv: https://arxiv.org/abs/2203.17270
- github: https://github.com/zhiqi-li/BEVFormer
HFT: Lifting Perspective Representations via Hybrid Feature Transformation
- intro: Institute of Automation, Chinese Academy of Sciences & PhiGent Robotics
- arxiv: https://arxiv.org/abs/2204.05068
- github: https://github.com/JiayuZou2020/HFT
M^2BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation
- project page: https://xieenze.github.io/projects/m2bev/
- arxiv: https://arxiv.org/abs/2204.05088
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird’s-Eye View Representation
- project page: https://bevfusion.mit.edu/
- arxiv: https://arxiv.org/abs/2205.13542
- github: https://github.com/mit-han-lab/bevfusion
BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework
- intro: Peking University & Alibaba Group
- arxiv: https://arxiv.org/abs/2205.13790
- github: https://github.com/ADLab-AutoDrive/BEVFusion
A Simple Baseline for BEV Perception Without LiDAR
- intro: Carnegie Mellon University & Toyota Research Institute
- project page: http://www.cs.cmu.edu/~aharley/bev/
- arxiv: https://arxiv.org/abs/2206.07959
BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection
- intro: Megvii Inc. (Face++) & Huazhong University of Science and Technology & Xi’an Jiaotong University
- arxiv: https://arxiv.org/abs/2206.10092
PolarFormer: Multi-camera 3D Object Detection with Polar Transformers
- intro: Fudan University & CASIA & Alibaba DAMO Academy & University of Surrey
- arxiv: https://arxiv.org/abs/2206.15398
- github: https://github.com/fudan-zvg/PolarFormer
ORA3D: Overlap Region Aware Multi-view 3D Object Detection
- intro: Korea University & KAIST & Hyundai Motor Company R&D Division
- arxiv: https://arxiv.org/abs/2207.00865
MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection
- intro: Fudan University & Meituan
- arxiv: https://arxiv.org/abs/2209.03102
HD Map Construction
HDMapNet: An Online HD Map Construction and Evaluation Framework
- intro: ICRA 2022
- intro: Tsinghua University & MIT & Li Auto
- project page: https://tsinghua-mars-lab.github.io/HDMapNet/
- arxiv: https://arxiv.org/abs/2107.06307
- github: https://github.com/Tsinghua-MARS-Lab/HDMapNet
VectorMapNet: End-to-end Vectorized HD Map Learning
- intro: Tsinghua University & MIT & Li Auto
- arxiv: https://arxiv.org/abs/2206.08920
UniFormer: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird’s-Eye-View
- intro: Zhejiang University & DJI & Shanghai AI Lab
- arxiv: https://arxiv.org/abs/2207.08536
MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
- intro: Huazhong University of Science & Technology & Horizon Robotics
- arxiv: https://arxiv.org/abs/2208.14437
- github: https://github.com/hustvl/MapTR
Semantic Segmentation
LaRa: Latents and Rays for Multi-Camera Bird’s-Eye-View Semantic Segmentation
- intro: Valeo.ai & Inria
- arxiv: https://arxiv.org/abs/2206.13294
CoBEVT: Cooperative Bird’s Eye View Semantic Segmentation with Sparse Transformers
- intro: University of California, Los Angeles & University of Texas at Austin & University of California
- arxiv: https://arxiv.org/abs/2207.02202