This survey presents a first systematic review of how graphs can empower AI agents. It

  • Focuses on the potential of graph learning to bolster agent planning, agent execution, agent memory, and multi-agent coordination.
  • Explores the reciprocal relationship, detailing how AI agents can, in turn, empower and refine graph learning processes.
  • Outlines promising applications and identifies key future research opportunities.

Preliminaries

AI Agents: An AI agent is an intelligent model capable of perceiving its environment and making autonomous decisions to achieve specific goals.

Reinforcement Learning: Reinforcement learning (RL) sits at the intersection of machine learning, control theory, and cognitive science. RL lets an agent learn by acting, observing the consequences, and using scalar rewards or penalties to refine its behavior, with or without human labels.

Large Language Models: LLMs acquire extensive world knowledge during large-scale pretraining and have demonstrated strong capabilities in natural language understanding and generation tasks. An LLM can serve as an agent’s foundation model for various tasks without costly re-training.

Graph Learning: The graph learning process can generally be divided into two key parts:

  • Data organization (Graph): Organizing data into a suitable graph is the foundation and an important part of information and operator structurization.
  • Knowledge extraction (Graph Learning): Graph models, such as graph neural networks (GNNs), can extract task-required knowledge by leveraging and aggregating the information at each node and its neighborhood in the graph.
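The neighborhood aggregation idea behind GNNs can be sketched in a few lines. The following is a minimal toy illustration (node names and features are hypothetical, and the mean-aggregation rule is a simplification of what real GNN layers do):

```python
# One GNN-style aggregation step on a toy graph: each node's new feature
# is the mean of its own feature and its neighbors' features.

def aggregate_step(adj, feats):
    """adj: node -> list of neighbor nodes; feats: node -> scalar feature."""
    new_feats = {}
    for node, neighbors in adj.items():
        values = [feats[node]] + [feats[n] for n in neighbors]
        new_feats[node] = sum(values) / len(values)
    return new_feats

adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
feats = {"a": 1.0, "b": 2.0, "c": 3.0}
print(aggregate_step(adj, feats))  # -> {'a': 1.5, 'b': 2.0, 'c': 2.5}
```

Stacking several such steps lets information from multi-hop neighborhoods flow into each node's representation, which is the knowledge-extraction mechanism described above.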

Graphs for agent planning

Planning refers to the process in which an AI agent understands the task and devises a series of rational and ordered action plans after taking into account various available factors.

Graphs can play a role in several aspects of agent planning: organizing the form of task reasoning, arranging the task decomposition procedure, and constructing an efficient task decision search process.

Task Reasoning

  1. Knowledge Graph-Auxiliary Reasoning: enhances the agent’s task reasoning by leveraging additional information from an auxiliary knowledge graph.
    • A knowledge graph (KG) is a structured representation of knowledge. The nodes in a KG denote entities or concepts that the model can recognize or generate, while the edges signify the relationships between these entities or concepts
  2. Structure-Organized Reasoning: helps LLM agents understand key task-related knowledge more efficiently by structuring the reasoning process with trees or more general graph forms.
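A KG’s triple structure can be illustrated with a toy example. The entities, relations, and lookup helper below are hypothetical, intended only to show how an agent can pull structured facts into its reasoning:

```python
# Toy knowledge graph stored as (head, relation, tail) triples.
# All facts and names here are illustrative, not drawn from a real KG.
KG = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
    ("warfarin", "is_a", "anticoagulant"),
]

def neighbors(entity, relation=None):
    """Return tail entities linked to `entity`, optionally filtered by relation."""
    return [t for h, r, t in KG if h == entity and (relation is None or r == relation)]

# An agent reasoning about aspirin can retrieve structured context:
print(neighbors("aspirin"))                    # -> ['headache', 'warfarin']
print(neighbors("aspirin", "interacts_with"))  # -> ['warfarin']
```

Injecting such retrieved triples into the prompt is the basic mechanism by which KG-auxiliary reasoning grounds an agent’s answers.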

Task Decomposition

In AI agents, task planning requires decomposing user requests into specific sub-tasks. Proper task decomposition is important because reasonable sub-task decomposition can improve the accuracy and efficiency of the agent in performing tasks.

These sub-tasks often have dependencies, such as one task’s output serving as another’s input. These sub-tasks and their dependencies form a task dependency graph (TDG). TDG is the primary organized graph structure used in the task decomposition process.
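A TDG can be executed by ordering sub-tasks so every dependency runs before its consumers, i.e., a topological sort. The sketch below uses Kahn’s algorithm on a hypothetical trip-planning decomposition:

```python
from collections import deque

def topological_order(tdg):
    """tdg: sub-task -> list of sub-tasks that consume its output (edges)."""
    indegree = {task: 0 for task in tdg}
    for consumers in tdg.values():
        for c in consumers:
            indegree[c] += 1
    queue = deque(t for t, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)
        for c in tdg[task]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    if len(order) != len(tdg):
        raise ValueError("cycle in task dependency graph")
    return order

# Hypothetical decomposition of "plan a trip"; edges point producer -> consumer.
tdg = {
    "pick dates": ["book flight", "book hotel"],
    "book flight": ["plan itinerary"],
    "book hotel": ["plan itinerary"],
    "plan itinerary": [],
}
print(topological_order(tdg))
# -> ['pick dates', 'book flight', 'book hotel', 'plan itinerary']
```

The cycle check also catches ill-formed decompositions in which sub-tasks mutually depend on each other.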

Task Decision Searching

Task decision searching involves making sequential decisions in complex environments to achieve specific goals.

Search algorithms traverse state transitions within the decision space, which naturally form a graph. These states and their transitions constitute a state space graph (SSG). Formally, each node in the SSG represents a state, with the node’s properties carrying the state’s information, such as its textual description or parameters. The transitions between states serve as the edges between nodes.
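A search over an SSG can be sketched as breadth-first search for the shortest sequence of decisions from a start state to a goal. The states and transitions below are hypothetical placeholders, not any particular agent framework:

```python
from collections import deque

def shortest_decision_path(ssg, start, goal):
    """BFS over a state space graph; ssg: state -> list of successor states."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in ssg.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None  # goal unreachable

# Hypothetical decision states; edges are allowed transitions.
ssg = {
    "start": ["draft", "search"],
    "draft": ["review"],
    "search": ["draft", "review"],
    "review": ["done"],
}
print(shortest_decision_path(ssg, "start", "done"))
# -> ['start', 'draft', 'review', 'done']
```

Practical systems substitute heuristics or learned value estimates for the uniform BFS frontier, but the underlying graph structure is the same.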

Graphs for Agent Execution

After planning, the execution phase is where the agent puts the formulated plan into action. At this stage, two main modules can play an important role.

  1. Tool usage: the agent needs to call upon appropriate external tools in combination with its own knowledge in order to complete the specified actions

  2. Environment interaction: the agent should interact with the surrounding environment to perceive relevant information and conduct actions based on current circumstances.

Graphs can help arrange the scheduling of these numerous tools and model the rich relationships between agents and the environments in which they reside.
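Tool scheduling can also be framed as a graph walk: given a desired output, the agent resolves which tools must run first by following the dependency edges. The tool names and dependency graph below are invented for illustration:

```python
# Toy tool dependency graph: each tool lists the tools whose outputs it needs.
# Tool names are hypothetical.
TOOLS = {
    "answer": ["summarize", "cite"],
    "summarize": ["web_search"],
    "cite": ["web_search"],
    "web_search": [],
}

def resolve_calls(goal, tools, done=None):
    """Return a valid call order producing `goal`, running prerequisites first."""
    done = done if done is not None else []
    for dep in tools[goal]:
        if dep not in done:
            resolve_calls(dep, tools, done)
    done.append(goal)
    return done

print(resolve_calls("answer", TOOLS))
# -> ['web_search', 'summarize', 'cite', 'answer']
```

Note that the shared prerequisite `web_search` is invoked only once; graph-based scheduling avoids redundant tool calls that a flat plan might repeat.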

Graphs for Agent Memory

Memory is a crucial capability that allows agents to store and recall past experiences or related knowledge. It allows agents to accumulate experience, thus facilitating more informed and appropriate actions.

When applied to agent memory, a graph-based memory organization can effectively uncover latent associations among the various information encountered by the agent.

Memory Organization

  • Agents equipped with graph-structured memory can store knowledge and experiences as interconnected representations.
  • Recent LLM-based agents have explored a spectrum of memory representations, from unstructured text chunks to structured knowledge graphs. In particular, knowledge graphs and other structured forms (e.g., atomic facts and summarized notes) are increasingly used to organize an agent’s long-term memory.
  • Integrating richer information, some works begin to introduce hierarchical or hybrid graphs with multilevel or multigranularity information to improve organization.

Memory Retrieval

Given a structured memory, it is critical for LLM agents to retrieve useful information from it accurately and efficiently as reliable guidance.

  • G-Retriever and GFM-RAG: integrate the semantic similarity and the graph metrics
  • Subgraph RAG: a lightweight perceptron with the triple-scoring mechanism
  • LightRAG: local retriever for entity-level questions and a global retriever for complex queries
  • GRAG: conducts efficiency-oriented optimization using a divide-and-conquer strategy to retrieve the optimal subgraph in linear time, along with two complementary context views to help LLMs understand the graph context
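The common thread in these retrievers is blending semantic similarity with graph metrics. A simplified sketch of such hybrid scoring follows; the linear blend, the weight `alpha`, and the toy data are assumptions for illustration, not any listed system’s actual formula:

```python
from math import sqrt
from collections import deque

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def hop_distance(graph, src, dst):
    """BFS hop count between two memory nodes (inf if disconnected)."""
    frontier, seen = deque([(src, 0)]), {src}
    while frontier:
        node, d = frontier.popleft()
        if node == dst:
            return d
        for n in graph.get(node, []):
            if n not in seen:
                seen.add(n)
                frontier.append((n, d + 1))
    return float("inf")

def retrieve(graph, embeddings, query_emb, anchor, alpha=0.7):
    """Rank memory nodes by semantic similarity blended with graph proximity."""
    scores = {}
    for node, emb in embeddings.items():
        proximity = 1.0 / (1.0 + hop_distance(graph, anchor, node))
        scores[node] = alpha * cosine(query_emb, emb) + (1 - alpha) * proximity
    return sorted(scores, key=scores.get, reverse=True)

graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(retrieve(graph, embeddings, [1.0, 0.0], anchor="a"))  # -> ['a', 'c', 'b']
```

Node "c" outranks "b" despite being farther from the anchor because its embedding is closer to the query, which is exactly the trade-off the hybrid score is meant to capture.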

Memory Maintenance

Beyond organizing and retrieving structured knowledge, a pivotal and challenging aspect of graph-based agent memory is its dynamic evolution: the capacity to continuously update and refine memory representations and graph topologies in response to new experiences and interactions.

  • Dynamic graph-based memory architectures:

    • A-MEM: creates interconnected knowledge networks through dynamic indexing and linking
    • AriGraph: consists of episodic and semantic memories
  • Hierarchical dynamic graphs: to better capture and store the long-term relationship of elements

    • DAMCS: proposes a goal-oriented hierarchical knowledge graph for long-term memory, with lower-level experience nodes and goal nodes for the agent’s journey tracking. Higher-level long-term goal nodes are generated to provide an overview of the long-term progress.
    • Zep: a temporal-aware hierarchical knowledge graph engine that dynamically integrates unstructured conversational data, maintaining historical relationships.
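A minimal dynamic memory-graph update can be sketched as follows. The keyword-overlap linking rule is a deliberate simplification, loosely in the spirit of A-MEM’s dynamic linking but not its actual mechanism, and all experience contents are invented:

```python
# Dynamic memory graph: each new experience becomes a node and is linked to
# existing nodes that share at least one keyword (a simplified linking rule).

class MemoryGraph:
    def __init__(self):
        self.nodes = {}  # node id -> set of keywords
        self.edges = {}  # node id -> set of linked node ids

    def add_experience(self, node_id, keywords):
        self.nodes[node_id] = set(keywords)
        self.edges[node_id] = set()
        for other, kws in self.nodes.items():
            if other != node_id and kws & self.nodes[node_id]:
                self.edges[node_id].add(other)
                self.edges[other].add(node_id)

g = MemoryGraph()
g.add_experience("e1", ["kitchen", "key"])
g.add_experience("e2", ["key", "door"])
g.add_experience("e3", ["garden"])
print(sorted(g.edges["e1"]))  # -> ['e2'] (shares "key")
print(sorted(g.edges["e3"]))  # -> [] (no shared keywords yet)
```

Real systems replace keyword overlap with embedding similarity or LLM-judged relevance, and may also merge, summarize, or prune nodes as the graph grows.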

Graphs for Multi-agent Coordination

A multi-agent system (MAS) is a complex system that integrates multiple AI agents for mutual integration, collaboration, and competition. Its goal is to accomplish, through the interaction between various agents, complex tasks that are difficult for a single agent to complete. In these systems, each agent has its own specialized advantages, such as domain-specific knowledge or reasoning ability, and the overall effect of coordination depends on the information exchange and decision consensus among the agents.

The core focus of coordinating multiple agents is relational modeling. Graph organization and learning can take advantage of their inherent ability to process relational data to excel in such modeling.

To make the review consistent, we define the graph organization of multi-agent coordination as the agent coordination graph (ACG). Formally, in the ACG, the core nodes are agents. Node features represent the information of each agent. Edges are communication paths between agent nodes that are formed to pass messages between agents.

Agents for Graph Learning

The paradigm of AI agents can also empower graph learning and graph-related tasks, opening up new avenues for automatic and effective graph processing. This synergy manifests primarily in two aspects: (1) graph annotation and synthesis, and (2) graph understanding tasks.

Graph Annotation and Synthesis

  • Graph Annotation: As graph annotation can be regarded as a decision-making process targeting label prediction and correction, RL agents can be optimized for annotation through trial and error.
  • Graph Synthesis: With the abundant pre-trained knowledge of LLMs, many works have adopted the LLM agent paradigm to generate specific graphs for training or simulation.

Graph Understanding

  • RL Agents: Traditional graph modeling methods are based on fixed design logic. However, real-world graph structures are diverse, and fixed modeling logic cannot fully utilize the potential of graph learning. Therefore, methods like RL agents have started to design adaptive aggregation mechanisms for graph learning.
  • LLM Agents: With the powerful knowledge of LLMs, many works are exploring how to design and tune LLM agents to fully leverage the capabilities of LLMs in graph-modeling tasks. The core of these works is to post-train LLMs on graph data. Thus, LLM agents can automatically understand graph data through text-described graph information and carry out downstream tasks like node classification.
  • LLM Multi-Agents: Graph modeling can be seen as a complex task that involves multiple sub-modeling tasks, such as graph data processing, graph architecture selection, and graph task execution, which is difficult for a single agent to handle in full. Therefore, many recent works have introduced multiple LLMs to construct a multi-agent system, with each agent focusing on a specific sub-modeling task.