Ubiquitous Mobile Intelligence

Modern society depends on transportation, yet transportation faces four major challenges:
- Safety: Transportation accidents are the leading cause of death for ages 15–29;
- Mobility: US commuters waste an average of 97 hours per year in traffic and collectively lose $160 billion to urban congestion;
- Environment: Transportation accounts for 29% of US greenhouse gas emissions, making it the largest source;
- Space: Today’s cars sit unused 95% of the time and demand a large amount of parking space.
The Fourth Industrial Revolution (AI and cyber-physical systems) is driving a paradigm shift toward intelligence-centric mobility.
| Domain | Intelligence-centric capabilities |
| --- | --- |
| Vehicles | Autonomous and connected, with perception, reasoning, and decision-making capabilities. |
| Infrastructure | Smart roads, adaptive traffic lights, and predictive maintenance. |
| Logistics & Supply Chain | End-to-end optimization via AI-powered routing, inventory forecasting, and multimodal orchestration. |
| Urban Mobility | Integration of public transit, ride-sharing, and micro-mobility through AI-driven platforms (“Mobility as a Service”). |
| Policy & Sustainability | Data-driven transport policy, carbon optimization, and AI-assisted urban planning. |
When, where, and how should Large Language Models (LLMs) be applied to autonomous driving?
Hierarchical decision-making in autonomous driving:
| Level | Function | Time scale | Examples |
| --- | --- | --- | --- |
| Long-term / Strategic | Route and mission planning | Seconds to minutes | Path through city, highway exit, detour, mission-level reasoning |
| Mid-term / Tactical | Behavioral planning | 100–500 ms | Lane changes, overtaking, merging, stop/yield behavior |
| Short-term / Reactive | Control and safety loops | 10–50 ms | Obstacle avoidance, braking, steering control, emergency actions |
LLMs are best suited to high-level and mid-level planning: reasoning, interpretation, and multimodal context understanding.
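
The time scales in the hierarchy above suggest a simple placement rule, sketched here under stated assumptions: an LLM can sit in a layer's loop only if one inference fits that layer's time budget, and the reactive control and safety loop stays with a classical controller regardless. The names `DecisionLayer` and `assign_planners`, the 60 s stand-in for "seconds to minutes", and the example latencies are all hypothetical illustrations, not part of any cited system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionLayer:
    name: str
    function: str
    budget_ms: float        # upper end of the layer's time scale
    safety_critical: bool   # True for the control and safety loop


# Budgets use the upper end of each time scale in the hierarchy.
# 60 s is an assumed stand-in for "seconds to minutes".
LAYERS = [
    DecisionLayer("strategic", "route and mission planning", 60_000.0, False),
    DecisionLayer("tactical", "behavioral planning", 500.0, False),
    DecisionLayer("reactive", "control and safety loops", 50.0, True),
]


def llm_eligible(layer: DecisionLayer, llm_latency_ms: float) -> bool:
    """An LLM fits a layer if it is not safety-critical and meets the budget."""
    return (not layer.safety_critical) and llm_latency_ms <= layer.budget_ms


def assign_planners(llm_latency_ms: float) -> dict:
    """Map each layer name to 'llm' or 'classical' for a given LLM latency."""
    return {
        layer.name: "llm" if llm_eligible(layer, llm_latency_ms) else "classical"
        for layer in LAYERS
    }


if __name__ == "__main__":
    # Hypothetical latencies: a fast on-board model and a slower remote one.
    for latency in (300.0, 2000.0):
        print(latency, assign_planners(latency))
```

With an assumed 300 ms inference latency, the sketch places the LLM in the strategic and tactical layers. At 2000 ms, only the strategic layer remains eligible, which is one way to see why inference cost decides where an LLM can be applied.
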
VLA: Vision-Language-Action Model

Advantages:
- Unified multimodal perception and reasoning
- Natural-language reasoning, explainability, and adaptive learning
- Generalization and flexible task adaptation
- Hierarchical and interpretable action planning
- Broad applicability to other autonomous systems

Challenges:
- Training requires enormous GPU/TPU clusters
- Inference is also demanding, which makes real-time deployment difficult on embedded or safety-critical systems such as vehicles
- No single standard metric measures how well a VLA model understands and acts
Central vs. Distributed Training?
Central training
- Pros:
  - Abundant computing resources
  - Large aggregated data set
- Cons:
  - Data privacy and security risks
  - Single point of failure, not scalable, not personalized
  - Communication overhead and delay

End-device training
- Pros:
  - Data privacy and security
  - Personalized model
  - No communication overhead
- Cons:
  - Limited computing resources
  - Limited data set
Federated learning for VLA [1]
Advantages:
- Distributed intelligence
  - Diversity improves generalization
  - Scalability: resources at the end, edge, and cloud
- Privacy preservation
- Reduced communication
- Support for continuous, on-device learning and personalization
- Fast reactions to real-time, location-relevant needs
- …

Challenges:
- Encoding multi-modal sensing information
- Heterogeneous vehicles with different resource constraints
  - Model distillation, fine-tuning, and personalization
- Communication bottleneck in training and inference
- Delay, delay, delay
- Safety, security, accountability
- …
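
To make the aggregation step concrete, here is a minimal sketch of federated averaging (FedAvg), the standard baseline for federated learning. It is not presented as the method of [1]. Each vehicle trains locally and uploads only its model parameters together with its local sample count; the server combines them as a sample-weighted average. The function name `fedavg`, the flat-list weight layout, and the three-vehicle round are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Weights = Dict[str, List[float]]  # parameter name -> flat list of values


def fedavg(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Sample-weighted average of client models (FedAvg aggregation step)."""
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros with the same shape as the first client's model.
    first_weights, _ = updates[0]
    aggregated = {name: [0.0] * len(vals) for name, vals in first_weights.items()}

    for weights, n in updates:
        share = n / total  # clients with more local data count for more
        for name, vals in weights.items():
            acc = aggregated[name]
            for i, v in enumerate(vals):
                acc[i] += share * v
    return aggregated


# Hypothetical round: three vehicles with different amounts of local data.
vehicle_updates = [
    ({"w": [1.0, 2.0], "b": [0.0]}, 100),
    ({"w": [3.0, 4.0], "b": [1.0]}, 300),
    ({"w": [5.0, 6.0], "b": [2.0]}, 100),
]
global_model = fedavg(vehicle_updates)

if __name__ == "__main__":
    print(global_model)
```

Raw sensor data never leaves the vehicle in this scheme, which is the privacy advantage listed above. The cost is that every round moves a full set of parameters per client, so with LLM- or VLA-sized models the communication bottleneck remains a real challenge.
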
[1] Tianao Xiang, Mingjian Zhi, Yuanguo Bi, Lin Cai, Yuhao Chen, FLAD: Federated Learning for LLM-based Autonomous Driving in Vehicle-Edge-Cloud Networks, https://arxiv.org/abs/2511.09025