public class Transform extends Object
Most methods take an ImmutableGraph
(along with some other data, that
depend on the kind of transformation), and return another ImmutableGraph
that represents the transformed
version.
Modifier and Type  Class and Description 

static interface 
Transform.ArcFilter
Provides a method to accept or reject an arc.

static class 
Transform.BatchGraph 
static interface 
Transform.LabelledArcFilter
Provides a method to accept or reject a labelled arc.

static class 
Transform.LowerBound
An arc filter that rejects arcs whose wellknown attribute has a value smaller than a given threshold.

static class 
Transform.NodeClassFilter
An arc filter that only accepts arcs whose endpoints belong to the same
(if the parameter
keepOnlySame is true) or to
different (if keepOnlySame is false) classes. 
Modifier and Type  Field and Description 

static it.unimi.dsi.webgraph.Transform.NoLoops 
NO_LOOPS
A singleton providing an arc filter that rejects loops.

Modifier and Type  Method and Description 

static ArcLabelledImmutableGraph 
compose(ArcLabelledImmutableGraph g0,
ArcLabelledImmutableGraph g1,
LabelSemiring strategy)
Returns the composition (a.k.a.

static ImmutableGraph 
compose(ImmutableGraph g0,
ImmutableGraph g1)
Returns the composition (a.k.a.

protected static boolean 
ensureNumArgs(String[] param,
int n)
Ensures that the arguments are exactly
n , if n is nonnegative, or
at least n , otherwise. 
static ArcLabelledImmutableGraph 
filterArcs(ArcLabelledImmutableGraph graph,
Transform.LabelledArcFilter filter)
Returns a labelled graph with some arcs eventually stripped, according to the given filter.

static ArcLabelledImmutableGraph 
filterArcs(ArcLabelledImmutableGraph graph,
Transform.LabelledArcFilter filter,
ProgressLogger ignored)
Returns a labelled graph with some arcs eventually stripped, according to the given filter.

static ImmutableGraph 
filterArcs(ImmutableGraph graph,
Transform.ArcFilter filter)
Returns a graph with some arcs eventually stripped, according to the given filter.

static ImmutableGraph 
filterArcs(ImmutableGraph graph,
Transform.ArcFilter filter,
ProgressLogger ignored)
Returns a graph with some arcs eventually stripped, according to the given filter.

static int[] 
grayCodePermutation(ImmutableGraph g)
Returns a permutation that would make the given graph adjacency lists in Graycode order.

static int[] 
hostByHostGrayCodePermutation(ImmutableGraph g,
int[] hostMap,
boolean strict)
Returns a permutation that would make the given graph adjacency lists in hostbyhost Graycode order.

static int[] 
lexicographicalPermutation(ImmutableGraph g)
Returns a permutation that would make the given graph adjacency lists in lexicographical order.

static ImmutableSequentialGraph 
line(ImmutableGraph g,
String mapBasename,
File tempDir,
int batchSize,
ProgressLogger pl)
Computes the line graph of a given symmetric graph.

protected static ImmutableGraph 
load(Class<?> graphClass,
String baseName,
boolean offline,
ProgressLogger pl)
Loads a graph with given data and returns it.

protected static void 
logBatches(ObjectArrayList<File> batches,
long pairs,
ProgressLogger pl) 
static void 
main(String[] args) 
static ImmutableGraph 
map(ImmutableGraph g,
int[] f)
Remaps the the graph nodes through a function specified via
an array.

static ImmutableGraph 
map(ImmutableGraph g,
int[] map,
ProgressLogger pl)
Remaps the the graph nodes through a partial function specified via
an array.

static ImmutableSequentialGraph 
mapOffline(ImmutableGraph g,
int[] map,
int batchSize)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via an array.

static ImmutableSequentialGraph 
mapOffline(ImmutableGraph g,
int[] map,
int batchSize,
File tempDir)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via an array.

static ImmutableSequentialGraph 
mapOffline(ImmutableGraph g,
int[] map,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via an array.

static int 
processBatch(int n,
int[] source,
int[] target,
File tempDir,
List<File> batches)
Sorts the given source and target arrays w.r.t.

static int[] 
randomPermutation(ImmutableGraph g,
long seed)
Returns a random permutation for a given graph.

static ImmutableGraph 
symmetrize(ImmutableGraph g)
Returns a symmetrized graph.

static ImmutableGraph 
symmetrize(ImmutableGraph g,
ImmutableGraph t)
Returns a symmetrized graph.

static ImmutableGraph 
symmetrize(ImmutableGraph g,
ImmutableGraph t,
ProgressLogger pl)
Returns a symmetrized graph.

static ImmutableGraph 
symmetrize(ImmutableGraph g,
ProgressLogger pl)
Returns a symmetrized graph.

static ImmutableGraph 
symmetrizeOffline(ImmutableGraph g,
int batchSize)
Returns a symmetrized graph using an offline transposition.

static ImmutableGraph 
symmetrizeOffline(ImmutableGraph g,
int batchSize,
File tempDir)
Returns a symmetrized graph using an offline transposition.

static ImmutableGraph 
symmetrizeOffline(ImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns a symmetrized graph using an offline transposition.

static ImmutableGraph 
transpose(ImmutableGraph g)
Returns an immutable graph obtained by reversing all arcs in
g . 
static ImmutableGraph 
transpose(ImmutableGraph g,
ProgressLogger pl)
Returns an immutable graph obtained by reversing all arcs in
g . 
static ArcLabelledImmutableGraph 
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize)
Returns an arclabelled immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ArcLabelledImmutableGraph 
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize,
File tempDir)
Returns an arclabelled immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ArcLabelledImmutableGraph 
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns an arclabelled immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ImmutableSequentialGraph 
transposeOffline(ImmutableGraph g,
int batchSize)
Returns an immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ImmutableSequentialGraph 
transposeOffline(ImmutableGraph g,
int batchSize,
File tempDir)
Returns an immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ImmutableSequentialGraph 
transposeOffline(ImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns an immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ArcLabelledImmutableGraph 
union(ArcLabelledImmutableGraph g0,
ArcLabelledImmutableGraph g1,
LabelMergeStrategy labelMergeStrategy)
Returns the union of two arclabelled immutable graphs.

static ImmutableGraph 
union(ImmutableGraph g0,
ImmutableGraph g1)
Returns the union of two immutable graphs.

public static final it.unimi.dsi.webgraph.Transform.NoLoops NO_LOOPS
public static ImmutableGraph filterArcs(ImmutableGraph graph, Transform.ArcFilter filter)
graph
 a graph.filter
 the filter (telling whether each arc should be kept or not).public static ImmutableGraph filterArcs(ImmutableGraph graph, Transform.ArcFilter filter, ProgressLogger ignored)
graph
 a graph.filter
 the filter (telling whether each arc should be kept or not).ignored
 a progress logger.public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter)
graph
 a labelled graph.filter
 the filter (telling whether each arc should be kept or not).public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter, ProgressLogger ignored)
graph
 a labelled graph.filter
 the filter (telling whether each arc should be kept or not).ignored
 a progress logger.public static ImmutableGraph map(ImmutableGraph g, int[] map, ProgressLogger pl)
map.length=g.numNodes()
,
and map[i]
is the new name of node i
, or 1 if the node
should not be mapped. If some
index appearing in map
is larger than or equal to the
number of nodes of g
, the resulting graph is enlarged correspondingly.
Arcs are mapped in the obvious way; in other words, there is
an arc from map[i]
to map[j]
(both nonnegative)
in the transformed
graph iff there was an arc from i
to j
in the original graph.
Note that if map
is bijective, the returned graph
is simply a permutation of the original graph.
Otherwise, the returned graph is obtained by deleting nodes mapped
to 1, quotienting nodes w.r.t. the equivalence relation induced by the fibres of map
and renumbering the result, always according to map
.
This method requires ImmutableGraph.randomAccess() random access.
g
 the graph to be transformed.map
 the transformation map.pl
 a progress logger to be used during the precomputation, or null
.public static ImmutableGraph map(ImmutableGraph g, int[] f)
g
 the graph to be transformed.f
 the transformation map.map(ImmutableGraph, int[], ProgressLogger)
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize) throws IOException
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.IOException
symmetrizeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.IOException
symmetrizeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
The symmetrized graph is the union of a graph and of its transpose. This method will
compute the transpose on the fly using transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
.
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger, or null
.IOException
public static ImmutableGraph symmetrize(ImmutableGraph g, ImmutableGraph t, ProgressLogger pl)
The symmetrized graph is the union of a graph and of its transpose. This method will use the provided transposed graph, if any, instead of computing it on the fly.
g
 the source graph.t
 the graph g
transposed; if null
, the transposed graph will be computed on the fly.pl
 a progress logger, or null
.public static ImmutableGraph symmetrize(ImmutableGraph g, ImmutableGraph t)
The symmetrized graph is the union of a graph and of its transpose. This method will use the provided transposed graph, if any, instead of computing it on the fly.
g
 the source graph.t
 the graph g
transposed; if null
, the transposed graph will be computed on the fly.public static ImmutableGraph symmetrize(ImmutableGraph g, ProgressLogger pl)
g
 the source graph.pl
 a progress logger.symmetrize(ImmutableGraph, ImmutableGraph, ProgressLogger)
public static ImmutableGraph symmetrize(ImmutableGraph g)
g
 the source graph.symmetrize(ImmutableGraph, ImmutableGraph, ProgressLogger)
public static ImmutableGraph transpose(ImmutableGraph g, ProgressLogger pl)
g
.
This method can process offline graphs).
g
 an immutable graph.pl
 a progress logger, or null
.g
.public static int processBatch(int n, int[] source, int[] target, File tempDir, List<File> batches) throws IOException
n
 the index of the last element to be sorted (exclusive).source
 the source array.target
 the target array.tempDir
 a temporary directory where to store the sorted arrays, or null
batches
 a list of files to which the batch file will be added.n
because duplicates are eliminated).IOException
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize) throws IOException
g
, using an offline method.g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.g
.IOException
transposeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
g
, using an offline method.g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.g
.IOException
transposeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
g
, using an offline method.
This method should be used to transpose very large graph in case transpose(ImmutableGraph)
requires too much memory. It creates a number of sorted batches on disk containing arcs
represented by a pair of gapcompressed integers ordered by target
and returns an ImmutableGraph
that can be accessed only using a node iterator
. The node iterator
merges on the fly the batches, providing a transposed graph. The files are marked with
File.deleteOnExit()
, so they should disappear when the JVM exits. An additional safetynet
finaliser tries to delete the batches, too.
Note that each NodeIterator
returned by the transpose requires opening all batches at the same time.
The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case,
another safetynet finaliser closes all files when the iterator is collected.
This method can process offline graphs.
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger, or null
.g
.IOException
protected static void logBatches(ObjectArrayList<File> batches, long pairs, ProgressLogger pl)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, int[] map, int batchSize) throws IOException
g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.g
.IOException
mapOffline(ImmutableGraph, int[], int, File, ProgressLogger)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, int[] map, int batchSize, File tempDir) throws IOException
g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.g
.IOException
mapOffline(ImmutableGraph, int[], int, File, ProgressLogger)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, int[] map, int batchSize, File tempDir, ProgressLogger pl) throws IOException
map(ImmutableGraph, int[], ProgressLogger)
for the semantics of this method and transpose(ImmutableGraph, ProgressLogger)
for
implementation and performancerelated details.g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger, or null
.g
.IOException
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize) throws IOException
g
, using an offline method.g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.g
.IOException
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir) throws IOException
g
, using an offline method.g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.g
.IOException
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
g
, using an offline method.
This method should be used to transpose very large graph in case transpose(ImmutableGraph)
requires too much memory. It creates a number of sorted batches on disk containing arcs
represented by a pair of integers in DataInput
format ordered by target
and returns an ImmutableGraph
that can be accessed only using a node iterator
. The node iterator
merges on the fly the batches, providing a transposed graph. The files are marked with
File.deleteOnExit()
, so they should disappear when the JVM exits. An additional safetynet
finaliser tries to delete the batches, too. As far as labels are concerned, they are temporarily stored in
an inmemory bit stream, that is permuted when it is stored on the disk
Note that each NodeIterator
returned by the transpose requires opening all batches at the same time.
The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case,
another safetynet finaliser closes all files when the iterator is collected.
This method can process offline graphs. Note that no method to transpose online arclabelled graph is provided currently.
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger.g
.IOException
public static ImmutableGraph transpose(ImmutableGraph g)
g
.
This method can process offline graphs.
g
 an immutable graph.g
.transpose(ImmutableGraph, ProgressLogger)
public static ArcLabelledImmutableGraph union(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelMergeStrategy labelMergeStrategy)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
 the first graph.g1
 the second graph.labelMergeStrategy
 the strategy used to merge labels when the same arc
is present in both graphs; if null
, Labels.KEEP_FIRST_MERGE_STRATEGY
is used.public static ImmutableGraph union(ImmutableGraph g0, ImmutableGraph g1)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
 the first graph.g1
 the second graph.public static ImmutableGraph compose(ImmutableGraph g0, ImmutableGraph g1)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
 the first graph.g1
 the second graph.public static ArcLabelledImmutableGraph compose(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelSemiring strategy)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
 the first graph.g1
 the second graph.strategy
 a label semiring.public static ImmutableSequentialGraph line(ImmutableGraph g, String mapBasename, File tempDir, int batchSize, ProgressLogger pl) throws IOException
Two additional files are created, with names stemmed from mapBasename
; the ith entries of the two files
identify the source and target node (in the original graph) corresponding the node i in the line graph.
g
 the graph (it must be symmetric and loopless).mapBasename
 the basename of two files that will, at the end, contain as many integers as the number of nodes in the line graph: the ith
integer in the file mapBasename.source
will contain the source of the arc corresponding to the ith
node in the line graph, and similarly mapBasename.target
will give the target.tempDir
 the temporary directory to be used.batchSize
 the size used for batches.pl
 the progress logger to be used.g
.IOException
public static int[] grayCodePermutation(ImmutableGraph g)
Gray codes list all sequences of n zeros and ones in such a way that adjacent lists differ by exactly one bit. If we assign to each row of the adjacency matrix of a graph its index as a Gray code, we obtain a permutation that will make similar lines nearer.
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the Graycode permutation produced from a matrix that has been Graycode sorted will not be, in general, the identity.
The important feature of Graycode ordering is that it is completely endogenous (e.g., determined by the graph itself), contrarily to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated to each node).
g
 an immutable graph.map(ImmutableGraph, int[], ProgressLogger)
).public static int[] randomPermutation(ImmutableGraph g, long seed)
g
 an immutable graph.seed
 for XorShift1024StarRandom
.public static int[] hostByHostGrayCodePermutation(ImmutableGraph g, int[] hostMap, boolean strict)
This permutation differs from grayCodePermutation(ImmutableGraph)
in that Gray codes
are computed host by host. There are two variants, strict and loose. In the first case,
we restrict the adjacency matrix to the submatrix corresponding to a host and compute the ordering. In
the second case, we just restrict to the rows corresponding to a host, but then entire rows are used
to compute the ordering.
g
 an immutable graph.hostMap
 an array mapping each URL to its host (it is sufficient that each host is assigned a distinct number).strict
 if true, hostbyhost Gray code computation will be strict, that is, the order is computed only
between columns of the same host of the rows.map(ImmutableGraph, int[], ProgressLogger)
).public static int[] lexicographicalPermutation(ImmutableGraph g)
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the lexicographical permutation produced from a matrix that has been lexicographically sorted will not be, in general, the identity.
The important feature of lexicographical ordering is that it is completely endogenous (e.g., determined by the graph itself), contrarily to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated to each node).
Warning: rows are numbered from zero from the left. This means, for instance, that nodes with an arc towards node zero are lexicographically smaller than nodes without it.
g
 an immutable graph.map(ImmutableGraph, int[], ProgressLogger)
).protected static boolean ensureNumArgs(String[] param, int n)
n
, if n
is nonnegative, or
at least n
, otherwise.protected static ImmutableGraph load(Class<?> graphClass, String baseName, boolean offline, ProgressLogger pl) throws IllegalArgumentException, SecurityException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, IOException
graphClass
 the class of the graph to be loaded.baseName
 the graph basename.offline
 whether the graph is to be loaded in an offline fashion.pl
 a progress logger.IllegalArgumentException
SecurityException
IllegalAccessException
InvocationTargetException
NoSuchMethodException
IOException
public static void main(String[] args) throws IOException, IllegalArgumentException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, ClassNotFoundException, JSAPException