public class Transform extends Object
Most methods take an ImmutableGraph
(along with some other data, that
depend on the kind of transformation), and return another ImmutableGraph
that represents the transformed
version.
Modifier and Type  Class and Description 

static interface 
Transform.ArcFilter
Provides a method to accept or reject an arc.

static class 
Transform.BatchGraph 
static interface 
Transform.LabelledArcFilter
Provides a method to accept or reject a labelled arc.

static class 
Transform.LowerBound
An arc filter that rejects arcs whose wellknown attribute has a value smaller than a given threshold.

static class 
Transform.NodeClassFilter
An arc filter that only accepts arcs whose endpoints belong to the same
(if the parameter
keepOnlySame is true) or to
different (if keepOnlySame is false) classes. 
Modifier and Type  Field and Description 

static it.unimi.dsi.big.webgraph.Transform.NoLoops 
NO_LOOPS
A singleton providing an arc filter that rejects loops.

Modifier and Type  Method and Description 

static ArcLabelledImmutableGraph 
compose(ArcLabelledImmutableGraph g0,
ArcLabelledImmutableGraph g1,
LabelSemiring strategy)
Returns the composition (a.k.a.

static ImmutableGraph 
compose(ImmutableGraph g0,
ImmutableGraph g1)
Returns the composition (a.k.a.

static ArcLabelledImmutableGraph 
filterArcs(ArcLabelledImmutableGraph graph,
Transform.LabelledArcFilter filter)
Returns a labelled graph with some arcs eventually stripped, according to the given filter.

static ArcLabelledImmutableGraph 
filterArcs(ArcLabelledImmutableGraph graph,
Transform.LabelledArcFilter filter,
ProgressLogger ignored)
Returns a labelled graph with some arcs eventually stripped, according to the given filter.

static ImmutableGraph 
filterArcs(ImmutableGraph graph,
Transform.ArcFilter filter)
Returns a graph with some arcs eventually stripped, according to the given filter.

static ImmutableGraph 
filterArcs(ImmutableGraph graph,
Transform.ArcFilter filter,
ProgressLogger ignored)
Returns a graph with some arcs eventually stripped, according to the given filter.

static long[][] 
grayCodePermutation(ImmutableGraph g)
Returns a permutation that would make the given graph adjacency lists in Graycode order.

static long[][] 
lexicographicalPermutation(ImmutableGraph g)
Returns a permutation that would make the given graph adjacency lists in lexicographical order.

protected static void 
logBatches(ObjectArrayList<File> batches,
long pairs,
ProgressLogger pl) 
static void 
main(String[] args) 
static ImmutableSequentialGraph 
mapOffline(ImmutableGraph g,
long[][] map,
int batchSize)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array.

static ImmutableSequentialGraph 
mapOffline(ImmutableGraph g,
long[][] map,
int batchSize,
File tempDir)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array.

static ImmutableSequentialGraph 
mapOffline(ImmutableGraph g,
long[][] map,
int batchSize,
File tempDir,
ProgressLogger pl)
Remaps the the graph nodes through a partial function specified via
a big array, using an offline method.

static int 
processBatch(int n,
long[] source,
long[] target,
File tempDir,
List<File> batches)
Sorts the given source and target arrays w.r.t.

static long[][] 
randomPermutation(ImmutableGraph g,
long seed)
Returns a random permutation for a given graph.

static ImmutableGraph 
symmetrizeOffline(ImmutableGraph g,
int batchSize)
Returns a symmetrized graph using an offline transposition.

static ImmutableGraph 
symmetrizeOffline(ImmutableGraph g,
int batchSize,
File tempDir)
Returns a symmetrized graph using an offline transposition.

static ImmutableGraph 
symmetrizeOffline(ImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns a symmetrized graph using an offline transposition.

static ArcLabelledImmutableGraph 
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize)
Returns an arclabelled immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ArcLabelledImmutableGraph 
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize,
File tempDir)
Returns an arclabelled immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ArcLabelledImmutableGraph 
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns an arclabelled immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ImmutableSequentialGraph 
transposeOffline(ImmutableGraph g,
int batchSize)
Returns an immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ImmutableSequentialGraph 
transposeOffline(ImmutableGraph g,
int batchSize,
File tempDir)
Returns an immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ImmutableSequentialGraph 
transposeOffline(ImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns an immutable graph obtained by reversing all arcs in
g , using an offline method. 
static ArcLabelledImmutableGraph 
union(ArcLabelledImmutableGraph g0,
ArcLabelledImmutableGraph g1,
LabelMergeStrategy labelMergeStrategy)
Returns the union of two arclabelled immutable graphs.

static ImmutableGraph 
union(ImmutableGraph g0,
ImmutableGraph g1)
Returns the union of two immutable graphs.

public static final it.unimi.dsi.big.webgraph.Transform.NoLoops NO_LOOPS
public static ImmutableGraph filterArcs(ImmutableGraph graph, Transform.ArcFilter filter, ProgressLogger ignored)
graph
 a graph.filter
 the filter (telling whether each arc should be kept or not).ignored
 a progress logger, which will be ignored.public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter, ProgressLogger ignored)
graph
 a labelled graph.filter
 the filter (telling whether each arc should be kept or not).ignored
 a progress logger, which will be ignored.public static ImmutableGraph filterArcs(ImmutableGraph graph, Transform.ArcFilter filter)
graph
 a graph.filter
 the filter (telling whether each arc should be kept or not).public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter)
graph
 a labelled graph.filter
 the filter (telling whether each arc should be kept or not).public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize) throws IOException
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.IOException
symmetrizeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.IOException
symmetrizeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
The symmetrized graph is the union of a graph and of its transpose. This method will
compute the transpose on the fly using transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
.
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger, or null
.IOException
public static int processBatch(int n, long[] source, long[] target, File tempDir, List<File> batches) throws IOException
n
 the index of the last element to be sorted (exclusive).source
 the source array.target
 the target array.tempDir
 a temporary directory where to store the sorted arrays, or null
batches
 a list of files to which the batch file will be added.n
because duplicates are eliminated).IOException
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize) throws IOException
g
, using an offline method.g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.g
.IOException
transposeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
g
, using an offline method.g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.g
.IOException
transposeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
g
, using an offline method.
This method creates a number of sorted batches on disk containing arcs
represented by a pair of gapcompressed long integers ordered by target
and returns an ImmutableGraph
that can be accessed only using a node iterator
. The node iterator
merges on the fly the batches, providing a transposed graph. The files are marked with
File.deleteOnExit()
, so they should disappear when the JVM exits. An additional safetynet
finaliser tries to delete the batches, too.
Note that each NodeIterator
returned by the transpose requires opening all batches at the same time.
The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case,
another safetynet finaliser closes all files when the iterator is collected.
This method can process offline graphs.
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger.g
.IOException
protected static void logBatches(ObjectArrayList<File> batches, long pairs, ProgressLogger pl)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize) throws IOException
g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.g
.IOException
mapOffline(ImmutableGraph, long[][], int, File, ProgressLogger)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir) throws IOException
g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.g
.IOException
mapOffline(ImmutableGraph, long[][], int, File, ProgressLogger)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir, ProgressLogger pl) throws IOException
More specifically, LongBigArrays.length(map)=g.numNodes()
,
and LongBigArrays.get(map, i)
is the new name of node i
, or 1 if the node
should not be mapped. If some
index appearing in map
is larger than or equal to the
number of nodes of g
, the resulting graph is enlarged correspondingly.
Arcs are mapped in the obvious way; in other words, there is
an arc from LongBigArrays.get(map, i)
to LongBigArrays.get(map, j)
(both nonnegative)
in the transformed
graph iff there was an arc from i
to j
in the original graph.
Note that if map
is bijective, the returned graph
is simply a permutation of the original graph.
Otherwise, the returned graph is obtained by deleting nodes mapped
to 1, quotienting nodes w.r.t. the equivalence relation induced by the fibres of map
and renumbering the result, always according to map
.
See transposeOffline(ImmutableGraph, int, File, ProgressLogger)
for
implementation and performancerelated details.
g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger, or null
.g
.IOException
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize) throws IOException
g
, using an offline method.g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.g
.IOException
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir) throws IOException
g
, using an offline method.g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.g
.IOException
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
g
, using an offline method.
This method creates a number of sorted batches on disk containing arcs
represented by a pair of long integers in DataInput
format ordered by target
and returns an ImmutableGraph
that can be accessed only using a node iterator
. The node iterator
merges on the fly the batches, providing a transposed graph. The files are marked with
File.deleteOnExit()
, so they should disappear when the JVM exits. An additional safetynet
finaliser tries to delete the batches, too. As far as labels are concerned, they are temporarily stored in
an inmemory bit stream, that is permuted when it is stored on the disk
Note that each NodeIterator
returned by the transpose requires opening all batches at the same time.
The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case,
another safetynet finaliser closes all files when the iterator is collected.
This method can process offline graphs. Note that no method to transpose online arclabelled graph is provided currently.
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.tempDir
 a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger.g
.IOException
public static ArcLabelledImmutableGraph union(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelMergeStrategy labelMergeStrategy)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
 the first graph.g1
 the second graph.labelMergeStrategy
 the strategy used to merge labels when the same arc
is present in both graphs; if null
, Labels.KEEP_FIRST_MERGE_STRATEGY
is used.public static ImmutableGraph union(ImmutableGraph g0, ImmutableGraph g1)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
 the first graph.g1
 the second graph.public static ImmutableGraph compose(ImmutableGraph g0, ImmutableGraph g1)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
 the first graph.g1
 the second graph.public static ArcLabelledImmutableGraph compose(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelSemiring strategy)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
 the first graph.g1
 the second graph.strategy
 a label semiring.public static long[][] grayCodePermutation(ImmutableGraph g)
Gray codes list all sequences of n zeros and ones in such a way that adjacent lists differ by exactly one bit. If we assign to each row of the adjacency matrix of a graph its index as a Gray code, we obtain a permutation that will make similar lines nearer.
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the Graycode permutation produced from a matrix that has been Graycode sorted will not be, in general, the identity.
The important feature of Graycode ordering is that it is completely endogenous (e.g., determined by the graph itself), contrarily to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated to each node).
g
 an immutable graph.mapOffline(ImmutableGraph, long[][], int, File, ProgressLogger)
).public static long[][] randomPermutation(ImmutableGraph g, long seed)
g
 an immutable graph.seed
 for XorShift1024StarRandom
.public static long[][] lexicographicalPermutation(ImmutableGraph g)
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the lexicographical permutation produced from a matrix that has been lexicographically sorted will not be, in general, the identity.
The important feature of lexicographical ordering is that it is completely endogenous (e.g., determined by the graph itself), contrarily to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated to each node).
Warning: rows are numbered from zero from the left. This means, for instance, that nodes with an arc towards node zero are lexicographically smaller than nodes without it.
g
 an immutable graph.mapOffline(ImmutableGraph, long[][], int)
).public static void main(String[] args) throws IOException, IllegalArgumentException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, ClassNotFoundException, JSAPException