A Language Agent for Autonomous Driving

¹University of Southern California, ²Stanford University, ³NVIDIA
* indicates equal contribution.




Agent-Driver

Agent-Driver transforms the conventional perception-prediction-planning framework by introducing LLMs as an agent for autonomous driving.

Abstract

Human-level driving is an essential goal of autonomous driving. Conventional approaches formulate autonomous driving as a perception-prediction-planning framework, yet their systems do not capitalize on the inherent reasoning ability and experiential knowledge of humans. In this paper, we propose a fundamental paradigm shift from current pipelines, exploiting Large Language Models (LLMs) as a cognitive agent to integrate human-like intelligence into autonomous driving systems. Our system, termed Agent-Driver, transforms the traditional autonomous driving pipeline by introducing a versatile tool library accessible via function calls, a cognitive memory of common sense and experiential knowledge for decision-making, and a reasoning engine capable of chain-of-thought reasoning, task planning, motion planning, and self-reflection. Powered by LLMs, our Agent-Driver is endowed with intuitive common sense and robust reasoning capabilities, thus enabling a more nuanced, human-like approach to autonomous driving. We evaluate our system on the large-scale nuScenes benchmark, and extensive experiments show that Agent-Driver outperforms state-of-the-art driving methods by a large margin. Our approach also demonstrates better interpretability and few-shot learning ability than these methods.


Method



  • We present Agent-Driver, an LLM-powered agent that revolutionizes the traditional perception-prediction-planning framework, establishing a powerful yet flexible paradigm for human-like autonomous driving.
  • Agent-Driver integrates a tool library for dynamic perception and prediction, a cognitive memory for human knowledge, and a reasoning engine that emulates human decision-making, all orchestrated by LLMs to enable a more anthropomorphic autonomous driving process.
  • Agent-Driver outperforms state-of-the-art autonomous driving systems by a large margin, with an improvement of over 30% in collision rate for motion planning. Our approach also demonstrates strong few-shot learning ability and interpretability on the nuScenes benchmark.
  • We provide a wide range of ablation studies to dissect the proposed architecture and understand the efficacy of each module, to facilitate future research in this direction.
Illustration of function calls in the tool library.
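
To make the idea of a tool library "accessible via function calls" concrete, here is a minimal sketch, not Agent-Driver's actual interface. It assumes the LLM emits a JSON object with a `name` and `arguments`, and that a dispatcher looks the name up in a registry. The tool names `detect_objects` and `predict_trajectory`, their parameters, and the hard-coded scene are all hypothetical stand-ins for real perception and prediction modules.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical registry: tool names here are illustrative only,
# not the functions shipped with Agent-Driver.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so it can be invoked by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def detect_objects(max_distance_m: float) -> List[dict]:
    # Stand-in for a perception module: filters a hard-coded scene.
    scene = [
        {"id": 1, "category": "car", "distance_m": 8.5},
        {"id": 2, "category": "pedestrian", "distance_m": 31.0},
    ]
    return [obj for obj in scene if obj["distance_m"] <= max_distance_m]


@tool
def predict_trajectory(object_id: int, horizon_s: int) -> List[Tuple[float, float]]:
    # Stand-in for a prediction module: constant 2 m/s motion along x.
    return [(2.0 * t, 0.0) for t in range(1, horizon_s + 1)]


def dispatch(call_json: str) -> Any:
    """Run one function call expressed as JSON text (assumed LLM output format)."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


if __name__ == "__main__":
    print(dispatch('{"name": "detect_objects", "arguments": {"max_distance_m": 10.0}}'))
```

Keeping the registry separate from the dispatcher means new tools can be added without touching the code that interprets the model's output.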

Illustration of memory search.
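
One simple way to picture a memory search is nearest-neighbour retrieval over stored experiences. The sketch below assumes each past experience is summarized as a small feature vector and ranked by cosine similarity to the current scene. This is an illustrative assumption, not a description of how Agent-Driver retrieves memories. The feature layout, the `MEMORY` entries, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_memory(
    query: Sequence[float],
    memory: List[Tuple[Sequence[float], str]],
    k: int = 1,
) -> List[str]:
    """Return the descriptions of the k stored experiences most similar to the query."""
    ranked = sorted(memory, key=lambda item: cosine(query, item[0]), reverse=True)
    return [description for _, description in ranked[:k]]


# Hypothetical experiences: (feature vector, what the driver did).
# Assumed features: [ego speed, lead-vehicle proximity, traffic-light flag].
MEMORY = [
    ([0.0, 5.0, 1.0], "stopped behind a lead vehicle"),
    ([10.0, 0.0, 0.0], "cruised on an empty road"),
    ([3.0, 2.0, 0.0], "slowed for a crossing pedestrian"),
]

if __name__ == "__main__":
    print(search_memory([9.0, 0.5, 0.0], MEMORY, k=1))
```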

Illustration of reasoning engine.
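
The four stages named in the abstract (chain-of-thought reasoning, task planning, motion planning, self-reflection) can be read as a sequential pipeline. The following is a rough sketch of that control flow under strong simplifying assumptions: the LLM is any callable from prompt to text, motion planning is reduced to a hypothetical `propose_trajectory` helper that emits straight-line waypoints, and self-reflection is reduced to a distance check that truncates the plan near an obstacle. None of these helpers come from the Agent-Driver code.

```python
import math
from typing import Callable, List, Tuple

Waypoint = Tuple[float, float]


def propose_trajectory(task_plan: str, steps: int = 4) -> List[Waypoint]:
    """Hypothetical motion planner: straight ahead at 1 m per step, or nothing if told to stop."""
    if "stop" in task_plan.lower():
        return []
    return [(float(i), 0.0) for i in range(1, steps + 1)]


def reflect(trajectory: List[Waypoint], obstacles: List[Waypoint],
            safe_dist: float = 1.0) -> List[Waypoint]:
    """Self-reflection stand-in: cut the plan before the first waypoint too close to an obstacle."""
    safe: List[Waypoint] = []
    for wp in trajectory:
        if any(math.dist(wp, ob) < safe_dist for ob in obstacles):
            break
        safe.append(wp)
    return safe


def reasoning_engine(llm: Callable[[str], str], observation: str,
                     obstacles: List[Waypoint]) -> List[Waypoint]:
    # 1. Chain-of-thought reasoning over the observation.
    thoughts = llm(f"Reason step by step about: {observation}")
    # 2. Task planning: a high-level driving decision in text.
    task_plan = llm(f"Given the reasoning '{thoughts}', state a high-level driving plan.")
    # 3. Motion planning: turn the decision into waypoints.
    trajectory = propose_trajectory(task_plan)
    # 4. Self-reflection: verify and correct the trajectory.
    return reflect(trajectory, obstacles)


def stub_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; always gives the same answer."""
    return "keep lane and go straight"


if __name__ == "__main__":
    print(reasoning_engine(stub_llm, "lead car ahead", obstacles=[(3.0, 0.5)]))
```

Passing the LLM in as a plain callable keeps the pipeline testable with a stub and independent of any particular model provider.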

Demos

BibTeX

@article{mao2023agentdriver,
  author = {Mao, Jiageng and Ye, Junjie and Qian, Yuxi and Pavone, Marco and Wang, Yue},
  title = {A Language Agent for Autonomous Driving},
  year = {2023},
}