H2GB.sampler.get_NeighborLoader

sampler.get_NeighborLoader(dataset, batch_size, shuffle=True, split='train')

A heterogeneous graph sampler that performs neighbor sampling as introduced in the “Inductive Representation Learning on Large Graphs” paper. This loader allows for mini-batch training of GNNs on large-scale graphs where full-batch training is not feasible.

More specifically, neighbor_sizes in the configuration specifies how many neighbors are sampled for each node in each iteration. This list is passed to NeighborLoader as num_neighbors; in iteration i, the loader samples up to num_neighbors[i] neighbors for each node reached in iteration i - 1.

Sampled nodes are sorted based on the order in which they were sampled. In particular, the first batch_size nodes represent the set of original mini-batch nodes.
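
The sampling procedure and node ordering described above can be illustrated with a minimal pure-Python sketch. This is not the H2GB or NeighborLoader implementation: it assumes a homogeneous toy graph stored as an adjacency dictionary, and the names sample_neighborhood, adj, and rng are hypothetical and chosen only for this example. The real loader samples per edge type on a heterogeneous graph.

```python
import random


def sample_neighborhood(adj, seeds, num_neighbors, rng=None):
    """Return sampled node ids in the order in which they were first sampled.

    adj           -- hypothetical adjacency dict: node id -> list of neighbor ids
    seeds         -- the mini-batch (seed) nodes
    num_neighbors -- fan-out per iteration, e.g. [2, 1]
    """
    rng = rng or random.Random()
    order = list(seeds)      # seed nodes always come first
    seen = set(seeds)
    frontier = list(seeds)   # nodes whose neighbors are sampled in this iteration
    for fanout in num_neighbors:
        next_frontier = []
        for node in frontier:
            neighbors = adj.get(node, [])
            # Sample at most `fanout` neighbors; fewer if the node has fewer.
            for nbr in rng.sample(neighbors, min(fanout, len(neighbors))):
                if nbr not in seen:
                    seen.add(nbr)
                    order.append(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return order


# Toy graph with six nodes; nodes 0 and 1 act as the seed nodes (batch_size = 2).
adj = {0: [1, 2, 3], 1: [0, 4], 2: [0, 5], 3: [0], 4: [1], 5: [2]}
seeds = [0, 1]
order = sample_neighborhood(adj, seeds, num_neighbors=[2, 1], rng=random.Random(0))
print(order)
```

In this sketch the first len(seeds) entries of order are always the seed nodes, mirroring the statement that the first batch_size nodes of a batch are the original mini-batch nodes. With fan-outs [2, 1], the number of sampled nodes is bounded by batch_size * (1 + 2 + 2 * 1).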

Parameters:
  • dataset (Any) – An InMemoryDataset object.

  • batch_size (int) – The number of seed nodes (first nodes in the batch).

  • shuffle (bool) – Whether to shuffle the data or not (default: True).

  • split (str) – Specifies which data split (train, val, or test) this sampler serves. The split determines some sampling parameters loaded from the configuration file, such as iter_per_epoch.