Huachi Zhou, Jiahe Du, Chuang Zhou, Chang Yang, Yilin Xiao, Yuxuan Xie, Xiao Huang
The Hong Kong Polytechnic University
GDL4LLM introduces a novel approach to text-attributed graph learning with LLMs by treating graphs as a language rather than using natural language descriptions. Recognizing that natural language is too verbose and unstructured for modeling complex graph relationships, GDL4LLM translates graphs into a concise corpus on which LLMs can be pre-trained. This enables efficient representation of subgraphs with minimal tokens during fine-tuning. Experiments on real-world datasets show GDL4LLM outperforms description-based and embedding-based approaches by effectively modeling multi-hop neighborhoods.
- We convert the problem of modeling graph structures for LLMs into a graph language learning problem, and we justify this approach by proving that the graph language learning objective enables LLMs to learn graph structural information.
- We introduce GDL4LLM, a simple yet effective framework. It generates a graph language corpus from the given graph and pre-trains LLMs on this corpus to understand the graph. The framework then samples from the graph language corpus to represent subgraphs centered around target nodes for fine-tuning on downstream tasks (a minimal corpus-construction sketch follows this list).
- Through extensive experiments on three real-world datasets, we demonstrate that GDL4LLM outperforms competitive baselines, surpassing both description-based and textual attribute embedding-based approaches by efficiently modeling different orders of neighbors with LLMs.
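The corpus-construction step can be pictured with a minimal sketch, assuming graph sentences are fixed-length random walks whose node IDs are rendered as dedicated tokens. The function names, the `<node_*>` token format, and the toy graph below are illustrative assumptions, not the paper's implementation.

```python
import random

def sample_graph_sentence(adj, start, length, rng=random):
    """Sample one 'graph sentence': a random walk of `length` node tokens
    starting from `start`. Each node is rendered as a dedicated token
    (e.g. "<node_17>") so an LLM can read the walk as ordinary text."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return " ".join(f"<node_{v}>" for v in walk)

def build_corpus(adj, walks_per_node, length, rng=random):
    """Build a pre-training corpus: `walks_per_node` graph sentences per node.
    Running the LLM's usual next-token objective on this corpus exposes it
    to the graph's connectivity patterns."""
    corpus = []
    for node in adj:
        for _ in range(walks_per_node):
            corpus.append(sample_graph_sentence(adj, node, length, rng))
    return corpus

# Toy usage on a 4-node graph.
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(build_corpus(adj, walks_per_node=2, length=5)[:3])
```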
Node classification performance comparison among baselines w.r.t. micro classification accuracy across three datasets.

| NLP Models | GNNs | ACM (Val.) | ACM (Test) | Wiki (Val.) | Wiki (Test) | Amazon (Val.) | Amazon (Test) |
|---|---|---|---|---|---|---|---|
| **Fine-tuned LMs + GNNs** | | | | | | | |
| Bert | - | 74.4 | 73.2 | 69.5 | 68.8 | 86.2 | 87.0 |
| Bert | GCN | 77.6 | 77.1 | 69.4 | 68.4 | 92.3 | 92.8 |
| Bert | GAT | 77.9 | 78.0 | 70.5 | 69.8 | 92.5 | 92.4 |
| Bert | GraphSAGE | 77.3 | 76.8 | 73.1 | 72.7 | 92.0 | 92.3 |
| Roberta | - | 78.1 | 76.6 | 67.8 | 68.1 | 84.9 | 85.9 |
| Roberta | GCN | 80.1 | 79.4 | 68.5 | 68.0 | 92.3 | 92.5 |
| Roberta | GAT | 79.7 | 78.9 | 70.1 | 71.0 | 92.5 | 92.4 |
| Roberta | GraphSAGE | 78.5 | 78.3 | 72.7 | 72.1 | 92.2 | 92.1 |
| | GraphSAGE | 80.9 | 79.5 | 73.2 | 70.4 | 94.3 | 94.1 |
| **Specialized Frameworks for Text-Attributed Graphs** | | | | | | | |
| MPAD | - | 80.1 | 78.9 | 68.8 | 68.0 | 93.1 | 92.8 |
| GLEM | - | 81.4 | 79.8 | 72.6 | 71.2 | 92.5 | 93.3 |
| GraphFormers | - | 75.3 | 75.1 | 66.8 | 67.5 | 85.6 | 86.4 |
| LLAGA | - | 77.2 | 77.5 | 71.7 | 72.0 | 90.1 | 90.8 |
| InstructGLM | - | 75.4 | 74.5 | 72.2 | 70.6 | 94.3 | 94.2 |
| GDL4LLM | - | 81.9 | 81.4 | 74.3 | 73.2 | 94.6 | 94.6 |
| **Fine-tuned Large Language Models +/- GNNs** | | | | | | | |
| GraphAdapter | - | 80.8 | 80.4 | 71.9 | 71.7 | 94.1 | 93.4 |
| Llama3-8b | - | 80.7 | 80.6 | 71.9 | 71.2 | 92.0 | 91.6 |
| Llama3-8b | GraphSAGE | 82.0 | 81.3 | 72.8 | 73.0 | 93.1 | 92.8 |
| GDL4LLM w/ attr | - | 83.9 | 82.8 | 74.0 | 73.4 | 95.8 | 95.5 |
The figure compares mainstream methods with GDL4LLM on the node classification task. Panel (a) uses LLMs to embed node attributes and a GNN to aggregate the embeddings. Panel (b) presents natural-language descriptions of the graph structure centered around target nodes. Panel (c) illustrates how LLMs are pre-trained to capture graph structures through graph language learning, and how textual attributes are further integrated to enhance LLM fine-tuning.
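To make panel (c) concrete, here is a minimal sketch of how a fine-tuning prompt for a target node might combine sampled graph sentences (structure) with the node's textual attributes (semantics). The prompt wording, the `<node_*>` tokens, and the helper `build_node_prompt` are illustrative assumptions, not the exact format used in the paper.

```python
import random

def build_node_prompt(target, adj, attrs, k, length, rng=random):
    """Assemble a fine-tuning prompt for one target node: k sampled graph
    sentences describe its subgraph in graph language, followed by the
    node's textual attributes and the downstream question."""
    sentences = []
    for _ in range(k):
        walk = [target]
        while len(walk) < length and adj.get(walk[-1]):
            walk.append(rng.choice(adj[walk[-1]]))
        sentences.append(" ".join(f"<node_{v}>" for v in walk))
    return (
        "Graph context:\n" + "\n".join(sentences) + "\n"
        f"Node text: {attrs.get(target, '')}\n"
        "Question: Which category does the target node belong to?"
    )

# Toy usage.
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
attrs = {0: "A paper on graph neural networks."}
print(build_node_prompt(0, adj, attrs, k=3, length=5))
```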
(i) Pre-training significantly enhances GDL4LLM's performance, especially when combined with textual attributes in prompts, creating a synergistic effect in which structural understanding complements semantic comprehension.
(ii) GDL4LLM performs better with Llama-3 than with Llama-2, owing to architectural improvements and better training data, and ablation studies confirm the effectiveness of the pre-training objective across different LLM architectures.
We examine two critical hyperparameters: the length l of sampled graph sentences and the number k of sampled sentences. The figure shows optimal performance at l = 5 and k = 10, with only marginal gains as the hyperparameters approach these values.
These results demonstrate our framework's effectiveness in modeling high-order structural information, such as inter-order dependencies. For instance, a sentence length of 5 captures fourth-order structural information, whereas GNNs, which often converge at about two layers, typically capture only second-order information.
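As a back-of-envelope illustration of this efficiency on a synthetic graph (all numbers below are illustrative, not results from the paper), the sketch compares the graph-token budget k * l with the size of the (l-1)-hop neighborhood that an explicit natural-language description would have to enumerate.

```python
import random
from collections import deque

def khop_size(adj, src, hops):
    """Count nodes within `hops` hops of `src` via BFS: the neighborhood an
    explicit textual description of the subgraph would have to enumerate."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == hops:
            continue
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, d + 1))
    return len(seen)

# Synthetic graph where every node points to ~6 random neighbors.
rng = random.Random(0)
n = 2000
adj = {v: rng.sample(range(n), 6) for v in range(n)}

l, k = 5, 10  # sentence length and number of sampled sentences
print("graph tokens per node:", k * l)                    # 50 tokens
print("nodes within l-1 = 4 hops:", khop_size(adj, 0, l - 1))  # grows into the hundreds/thousands
```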
For inquiries or contributions, please contact us at huachi.zhou@connect.polyu.hk.