H2GB.datasets.PokecDataset

class PokecDataset(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)

Bases: InMemoryDataset

H-Pokec is a heterogeneous friendship graph of a Slovak online social network, collected from SNAP at Stanford University.

The dataset consists of multiple types of entities: users, and multiple fields of the hobby clubs they joined (e.g., movies, music). It also contains multiple types of directed relations, representing friendships between users and users' memberships in hobby clubs. Each user node is associated with a 66-dimensional feature vector extracted from the user profile information, such as geographical region, age, and visibility of the user profile. Each user node carries a binary label indicating the user's reported gender. The dataset is randomly split into training, validation, and test sets.

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.HeteroData object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.HeteroData object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
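
Example:

The parameters above can be illustrated with a minimal sketch. It assumes the H2GB package is installed and that the dataset stores its graph at index 0, as is usual for an InMemoryDataset. The helper names load_pokec and log_types, and the root path in the usage note, are hypothetical and not part of the library. Because the class signature shown above does not list force_reload, the sketch does not pass it.

```python
from typing import Any, Callable, Optional


def log_types(data: Any) -> Any:
    """Hypothetical transform: report node and edge types, return data unchanged."""
    # node_types and edge_types are standard HeteroData properties.
    print("node types:", data.node_types)
    print("edge types:", data.edge_types)
    return data


def load_pokec(
    root: str,
    transform: Optional[Callable] = None,
    pre_transform: Optional[Callable] = None,
) -> Any:
    """Hypothetical helper: build PokecDataset and return its first graph."""
    # Imported lazily so the helpers can be defined without H2GB installed.
    from H2GB.datasets import PokecDataset

    dataset = PokecDataset(
        root=root,
        transform=transform,          # applied on every access
        pre_transform=pre_transform,  # applied once, before saving to disk
    )
    # Assumption: the heterogeneous graph is stored at index 0.
    return dataset[0]
```

Calling load_pokec("./data/pokec", transform=log_types) would build the dataset under the given root and print the node and edge types each time the graph is accessed, since transform runs before every access while pre_transform runs only once during processing.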