Meta-enhanced hierarchical multi-agent reinforcement learning for dynamic spectrum management and trust-based routing in cognitive vehicular networks
Authors: Nahar A.; Das D.; Yadav R.; Halimjon K.; Reypnazarov E.
Publication: Ad Hoc Networks (Elsevier), Volume 175, 1 August 2025.
This research introduces the Meta-Enhanced Recurrent Multi-Agent Reinforcement Learning (M-RMARL) framework, designed to tackle the challenges of reliable routing and dynamic spectrum management in Cognitive Radio Vehicular Ad Hoc Networks (CR-VANETs). The framework is built on Model-Agnostic Meta-Learning (MAML), using meta-learned Deep Recurrent Q-Networks (DRQNs) to sharply reduce training time so that vehicles can identify good routes and refine spectrum sensing with only a few adaptation steps. M-RMARL also features a dynamic spectrum management system that employs Long Short-Term Memory (LSTM)-based meta-predictive models to forecast future spectrum availability and network conditions; these forecasts let the DRQNs make proactive, informed decisions that improve spectrum efficiency.

To secure communication, the framework incorporates a Trust-Based Meta-Coordination mechanism, which continuously evaluates agent trustworthiness and folds these assessments into the decision-making process. The framework further leverages a Hierarchical Meta-Agent Coordination architecture, in which Roadside Units (RSUs) handle global coordination and meta-learning updates while vehicle agents execute the derived policies. This division of labor improves scalability and resource management, making M-RMARL effective in complex, rapidly changing decision environments. Extensive simulations demonstrate the framework's effectiveness, showing improvements of 18% in spectrum utilization, 25% in training convergence, 20% in spectrum prediction accuracy, 30% in training efficiency, and 17% in trust evaluation reliability.
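To make the MAML idea concrete, the sketch below shows a first-order MAML loop on toy per-channel tasks: an inner loop adapts a parameter vector to one task with a few gradient steps, and an outer loop moves the shared meta-initialization toward parameters that adapt well across tasks. All names here (`maml_adapt`, `meta_update`, the quadratic stand-in task) are illustrative assumptions, not the paper's implementation; the actual framework adapts Deep Recurrent Q-Networks rather than a linear parameter vector.

```python
import numpy as np

def maml_adapt(theta, grad_fn, inner_lr=0.1, steps=3):
    """Inner loop: a few gradient steps from the meta-initialization
    theta on one task's loss (fast per-vehicle adaptation)."""
    w = theta.copy()
    for _ in range(steps):
        w = w - inner_lr * grad_fn(w)
    return w

def make_task(target):
    """Stand-in task: quadratic loss whose minimum is a channel-specific
    target parameter vector (a placeholder for a DRQN's TD loss)."""
    loss = lambda w: float(np.sum((w - target) ** 2))
    grad = lambda w: 2.0 * (w - target)
    return loss, grad

def meta_update(theta, tasks, meta_lr=0.05, **inner_kw):
    """Outer loop (first-order MAML): evaluate each task's gradient at
    its adapted parameters, then nudge the meta-initialization toward
    points from which adaptation succeeds quickly."""
    meta_grad = np.zeros_like(theta)
    for _, grad in tasks:
        w = maml_adapt(theta, grad, **inner_kw)
        meta_grad += grad(w)
    return theta - meta_lr * meta_grad / len(tasks)

# Meta-train on two related spectrum "tasks", then adapt to a new one.
tasks = [make_task(np.full(2, t)) for t in (1.0, 1.2)]
theta = np.zeros(2)
for _ in range(200):
    theta = meta_update(theta, tasks)

new_loss, new_grad = make_task(np.full(2, 1.1))
adapted = maml_adapt(theta, new_grad)
```

After meta-training, a handful of inner-loop steps on an unseen task already lands near its optimum, whereas the same number of steps from a cold start does not; this gap is what the paper exploits to cut per-vehicle training time.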