std::cout << "Number of edges in the spanning tree: " << num_edges << std::endl;
if (visited.size() != num_vertices)
{
  std::cerr << "Warning: Could not connect all vertices while building spanning tree (possibly due to low max_degree). Graph might be disconnected." << std::endl;
}

visited.clear();
unvisited.clear();

// Graph is connected, with every vertex having at least one edge
// and a maximum degree of max_degree.

// Add additional random edges between vertices (will not exceed max_degree).
// The graph will of course remain connected.

// heuristic
const auto max_new_edges = num_edges * 2;
const auto min_new_edges = num_edges / 2;

// choose a number between min_new_edges and max_new_edges
Because DGraph makes it possible to learn on extremely large graphs, we push graph sizes further for full-graph GNN training. We generate a synthetic graph with 1 billion vertices.

## Data Generation

### Building the Graph Generator

We provide a fast generator for large graphs. Given a number of vertices and a maximum degree, it produces a graph that respects both. The only requirement is `GCC>10.3`. Build the generator in the `Generator` directory:
```bash
cd Generator
make
```
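For intuition, the strategy visible in the generator source (a random spanning tree first, then extra random edges capped by the maximum degree) can be sketched in a few lines of Python. This is an illustrative sketch, not the C++ implementation: the function name `generate_graph`, the `seed` parameter, and the cap on attempts in the second phase are assumptions made for the example.

```python
import random


def generate_graph(num_vertices, max_degree, seed=0):
    """Return a list of neighbor sets for a random bounded-degree graph."""
    rng = random.Random(seed)
    adj = [set() for _ in range(num_vertices)]

    # Phase 1: random spanning tree. Each new vertex is attached to a
    # random already-visited vertex that still has spare degree.
    order = list(range(num_vertices))
    rng.shuffle(order)
    open_vertices = [order[0]]  # visited vertices with degree < max_degree
    num_edges = 0
    for v in order[1:]:
        if not open_vertices:
            # Mirrors the warning case: max_degree is too low to connect
            # the remaining vertices.
            break
        i = rng.randrange(len(open_vertices))
        u = open_vertices[i]
        adj[u].add(v)
        adj[v].add(u)
        num_edges += 1
        if len(adj[u]) >= max_degree:
            # Swap-remove u, which is now saturated.
            open_vertices[i] = open_vertices[-1]
            open_vertices.pop()
        if len(adj[v]) < max_degree:
            open_vertices.append(v)

    # Phase 2: add between num_edges / 2 and num_edges * 2 extra edges,
    # never exceeding max_degree. The attempt cap is an assumption that
    # keeps the loop finite when few valid pairs remain.
    target = rng.randint(num_edges // 2, num_edges * 2)
    added = 0
    for _ in range(10 * target):
        if added >= target:
            break
        u = rng.randrange(num_vertices)
        v = rng.randrange(num_vertices)
        if u == v or v in adj[u]:
            continue
        if len(adj[u]) >= max_degree or len(adj[v]) >= max_degree:
            continue
        adj[u].add(v)
        adj[v].add(u)
        added += 1
    return adj
```

Adding edges to a connected graph cannot disconnect it, so connectivity depends only on the first phase succeeding.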
### Generating the Graph

The generator takes the number of vertices and the maximum degree as input, and outputs a text file in the METIS graph format. Run the following command to generate a graph with 1 billion vertices with a maximum degree of 5:
This will generate an undirected graph with 1 billion vertices and a maximum degree of 5, saved to the file `1B5D.graph`. The generator takes a few minutes to run and requires `~150GB` of memory.

The graph is written in the METIS format, a simple text format. The first line contains the number of vertices and the number of edges. Each following line lists the neighbors of one vertex, so the neighbors of the i-th vertex appear on line i+1.
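To make the file layout concrete, here is a minimal Python sketch that writes and reads the plain, unweighted METIS format on a tiny graph. The helper names `to_metis` and `from_metis` are hypothetical, and the sketch assumes no comment lines and no weights. METIS numbers vertices from 1, so the helpers convert to and from 0-based indices.

```python
def to_metis(adj):
    """Serialize 0-based adjacency lists to METIS text (1-based ids)."""
    num_vertices = len(adj)
    # Each undirected edge appears in two adjacency lists.
    num_edges = sum(len(nbrs) for nbrs in adj) // 2
    lines = [f"{num_vertices} {num_edges}"]
    for nbrs in adj:
        lines.append(" ".join(str(v + 1) for v in sorted(nbrs)))
    return "\n".join(lines) + "\n"


def from_metis(text):
    """Parse METIS text back into (num_vertices, num_edges, adjacency)."""
    rows = text.split("\n")
    num_vertices, num_edges = (int(tok) for tok in rows[0].split()[:2])
    # Row i + 1 holds the neighbors of vertex i (0-based after conversion).
    adj = [[int(tok) - 1 for tok in rows[i + 1].split()]
           for i in range(num_vertices)]
    return num_vertices, num_edges, adj


# A path graph 0 - 1 - 2.
path = [[1], [0, 2], [1]]
text = to_metis(path)
```

For a billion-vertex file, the same parsing would need to stream the file line by line rather than hold the text in memory.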
### Partition the graph

We assume there is a working `METIS` installation built with the flags `i64=1` and `r64=1`. `ParMETIS` may be useful as well.

To partition the graph into `<num_partitions>` partitions, run the following command:
```bash
gpmetis 1B5D.graph <num_partitions>
```
This will generate a file `1B5D.graph.part.<num_partitions>` that contains the partitioning of the graph. The i-th line of the file contains the partition id of the i-th vertex. Partition ids are 0-indexed. This step also requires `~150GB` of memory (with the flag `-ondisk`).
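As a quick sanity check on a partition file, one might count the vertices per partition and the number of edges that cross partitions. The sketch below does this in Python for a small in-memory graph. The helper names `read_partition` and `partition_stats` are hypothetical, and the adjacency lists are assumed to be 0-based.

```python
from collections import Counter


def read_partition(text):
    """Parse a partition file: one 0-indexed partition id per vertex."""
    return [int(tok) for tok in text.split()]


def partition_stats(adj, parts):
    """Return (vertices per partition, number of cut edges)."""
    sizes = Counter(parts)
    # Count each undirected edge once (u < v) if its ends are in
    # different partitions.
    cut = sum(
        1
        for u, nbrs in enumerate(adj)
        for v in nbrs
        if u < v and parts[u] != parts[v]
    )
    return sizes, cut


# A path graph 0 - 1 - 2 - 3 split into two halves.
path = [[1], [0, 2], [1, 3], [2]]
parts = read_partition("0\n0\n1\n1\n")
sizes, cut = partition_stats(path, parts)
```

At the 1-billion-vertex scale this would need a streaming or array-based variant, since Python lists of this size are impractical.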
### Preprocess for DGraph

To finish the graph generation and make the data ready for DGraph, we take the graph file and partition file and run the following command: