Enhancing Graph Transformers with Hierarchical Distance Structural Encoding

22 Aug 2023 · Yuankai Luo, Hongkang Li, Lei Shi, Xiao-Ming Wu

Graph transformers need strong inductive biases to derive meaningful attention scores. Yet, current methods often fall short in capturing long-range dependencies, hierarchical structures, or community structures, which are common in various graphs such as molecules, social networks, and citation networks. This paper presents a Hierarchical Distance Structural Encoding (HDSE) method to model node distances in a graph, focusing on their multi-level, hierarchical nature. We introduce a novel framework that seamlessly integrates HDSE into the attention mechanism of existing graph transformers, allowing simultaneous application with other positional encodings. To apply graph transformers with HDSE to large-scale graphs, we further propose a high-level HDSE that effectively biases linear transformers towards graph hierarchies. We theoretically prove the superiority of HDSE over shortest path distances in terms of expressivity and generalization. Empirically, we demonstrate that graph transformers with HDSE excel in graph classification and regression on 7 graph-level datasets, and in node classification on 11 large-scale graphs, including those with up to a billion nodes.
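The abstract describes HDSE as biasing attention with multi-level node distances derived from a graph hierarchy. The sketch below is a rough illustration only, not the authors' implementation: the greedy matching-style coarsening, the distance cap, and the names `hierarchical_distances` and `HDSEBias` are all assumptions of this sketch. It computes shortest-path distances on a stack of progressively coarsened graphs and maps them to a per-head additive bias on the pre-softmax attention logits.

```python
# Minimal sketch of hierarchical distance encoding as an attention bias.
# Assumes graphs with integer node labels 0..n-1 (as NetworkX generators produce).
import networkx as nx
import numpy as np
import torch
import torch.nn as nn

def shortest_path_distances(G, max_dist=8):
    """Dense all-pairs shortest-path matrix, clipped at max_dist
    (unreachable pairs also get max_dist)."""
    n = G.number_of_nodes()
    D = np.full((n, n), max_dist, dtype=np.int64)
    for s, lengths in nx.all_pairs_shortest_path_length(G):
        for t, d in lengths.items():
            D[s, t] = min(d, max_dist)
    return D

def coarsen(G):
    """Toy one-level coarsening: greedily merge each node with one unmatched
    neighbor. Returns the coarse graph and a node -> cluster-id map."""
    cluster, next_id = {}, 0
    for u in G.nodes():
        if u in cluster:
            continue
        cluster[u] = next_id
        for v in G.neighbors(u):
            if v not in cluster:
                cluster[v] = next_id
                break
        next_id += 1
    Gc = nx.Graph()
    Gc.add_nodes_from(range(next_id))
    for u, v in G.edges():
        if cluster[u] != cluster[v]:
            Gc.add_edge(cluster[u], cluster[v])
    return Gc, cluster

def hierarchical_distances(G, num_levels=3, max_dist=8):
    """Stack of [n, n] distance matrices: level 0 is the plain shortest-path
    matrix; level k measures distances between the nodes' level-k clusters."""
    n = G.number_of_nodes()
    graphs, maps = [G], [np.arange(n)]       # maps[k]: original node -> level-k cluster
    for _ in range(1, num_levels):
        Gc, cl = coarsen(graphs[-1])
        graphs.append(Gc)
        maps.append(np.array([cl[c] for c in maps[-1]]))
    levels = []
    for Gk, mk in zip(graphs, maps):
        Dk = shortest_path_distances(Gk, max_dist)
        levels.append(Dk[np.ix_(mk, mk)])    # lift coarse distances back to nodes
    return torch.from_numpy(np.stack(levels))  # [num_levels, n, n], int64

class HDSEBias(nn.Module):
    """Embeds the stack of hierarchical distances into one scalar bias per
    attention head, to be added to the pre-softmax attention logits."""
    def __init__(self, num_levels, num_heads, max_dist=8):
        super().__init__()
        # One shared table; each level is offset into its own index range.
        self.emb = nn.Embedding((max_dist + 1) * num_levels, num_heads)
        self.max_dist = max_dist

    def forward(self, hdse):                      # hdse: [L, n, n] integer distances
        L = hdse.shape[0]
        offsets = torch.arange(L).view(L, 1, 1) * (self.max_dist + 1)
        bias = self.emb(hdse + offsets)           # [L, n, n, H]
        return bias.sum(0).permute(2, 0, 1)       # [H, n, n], summed over levels

# Usage on a toy graph:
G = nx.erdos_renyi_graph(12, 0.3, seed=0)
hdse = hierarchical_distances(G, num_levels=3)
bias = HDSEBias(num_levels=3, num_heads=4)(hdse)  # [4, 12, 12]
# attn = softmax(Q @ K.transpose(-2, -1) / sqrt(d) + bias)
```

Offsetting each level into its own slice of a shared embedding table lets the model weight ordinary short-range distances and coarse, hierarchical distances differently per head, which is the kind of hierarchy-aware bias the abstract refers to.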

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Graph Classification | CIFAR10 100k | GraphGPS + HDSE | Accuracy (%) | 76.180±0.277 | #2 |
| Graph Classification | Peptides-func | GraphGPS + HDSE | AP | 0.7156±0.0058 | #2 |
| Graph Regression | ZINC-500k | GraphGPS + HDSE | MAE | 0.062 | #4 |
