WebGraph is a framework for graph compression aimed at studying web graphs. It provides simple ways to manage very large graphs, exploiting modern compression techniques. The big version is a fork of the original WebGraph that can handle more than 231 nodes. For more details on WebGraph that are common between the standard and the big version, please see WebGraph. WebGraph is free software distributed under either the GNU Lesser General Public License 2.1+ or the Apache Software License 2.0.
Main differences
If you are used to WebGraph, the main difference is that nodes are
indexed by long integers. Correspondingly, iterators on nodes are
LazyLongIterator
s, and all array-based methods
(such as ImmutableGraph.successorBigArray(long)
or
ArcLabelledImmutableGraph.labelBigArray(long)
)
return big arrays.
Some classes have not been ported, and will be ported on an “as-needed” basis.
Porting code
If you want to port code written for WebGraph to the big version, the main
nuisance is the fact that ImmutableGraph.successorBigArray(long)
returns, as the name says, a big array, which
cannot be accessed like a standard Java array. Watch out in particular for accesses to the
length
field, which will be syntactically correct even on a big array, but
must be replaced by calls to a suitable method (e.g.,
BigArrays.length(long[][])
). In general, you
must get accustomed to big-array methods before porting code.
To simplify many mundane matters, such as unit tests, ImmutableGraph
provides two static wrapping methods (ImmutableGraph.wrap(it.unimi.dsi.webgraph.ImmutableGraph)
and ImmutableGraph.wrap(ImmutableGraph)
) that
turn a standard ImmutableGraph
into a big ImmutableGraph
and viceversa. Thus, for instance, there is no big version of
ArrayListMutableGraph
: it is expected that instances will be just
wrapped should you need to use them in the big framework.
Compatibility
The serialisation format of the standard and big versions of BVGraph
are compatible (but
you cannot load a graph with more than 231 elements using the standard version). The same
graph loaded with instances of the two classes, however, will not by equal.
You must wrap one or the other (see above) to check for equality.
Note also that usually satellite data generated by various utilities (e.g., StronglyConnectedComponents
)
are written using formats that are not compatible.