Class Transform
public class Transform extends Object
Most methods take an ImmutableGraph
(along with some other data, that
depend on the kind of transformation), and return another ImmutableGraph
that represents the transformed
version.

Nested Class Summary
Nested Classes Modifier and Type Class Description static interface
Transform.ArcFilter
Provides a method to accept or reject an arc.static class
Transform.BatchGraph
static interface
Transform.LabelledArcFilter
Provides a method to accept or reject a labelled arc.static class
Transform.LowerBound
An arc filter that rejects arcs whose wellknown attribute has a value smaller than a given threshold.static class
Transform.NodeClassFilter
An arc filter that only accepts arcs whose endpoints belong to the same (if the parameterkeepOnlySame
is true) or to different (ifkeepOnlySame
is false) classes. 
Field Summary
Fields Modifier and Type Field Description static it.unimi.dsi.big.webgraph.Transform.NoLoops
NO_LOOPS
A singleton providing an arc filter that rejects loops. 
Method Summary
Modifier and Type Method Description static ImmutableGraph
compose(ImmutableGraph g0, ImmutableGraph g1)
Returns the composition (a.k.a.static ArcLabelledImmutableGraph
compose(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelSemiring strategy)
Returns the composition (a.k.a.static ImmutableGraph
filterArcs(ImmutableGraph graph, Transform.ArcFilter filter)
Returns a graph with some arcs eventually stripped, according to the given filter.static ImmutableGraph
filterArcs(ImmutableGraph graph, Transform.ArcFilter filter, ProgressLogger ignored)
Returns a graph with some arcs eventually stripped, according to the given filter.static ArcLabelledImmutableGraph
filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter)
Returns a labelled graph with some arcs eventually stripped, according to the given filter.static ArcLabelledImmutableGraph
filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter, ProgressLogger ignored)
Returns a labelled graph with some arcs eventually stripped, according to the given filter.static long[][]
grayCodePermutation(ImmutableGraph g)
Returns a permutation that would make the given graph adjacency lists in Graycode order.static long[][]
lexicographicalPermutation(ImmutableGraph g)
Returns a permutation that would make the given graph adjacency lists in lexicographical order.protected static void
logBatches(ObjectArrayList<File> batches, long pairs, ProgressLogger pl)
static void
main(String[] args)
static ImmutableSequentialGraph
mapOffline(ImmutableGraph g, long[][] map, int batchSize)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array.static ImmutableSequentialGraph
mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array.static ImmutableSequentialGraph
mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir, ProgressLogger pl)
Remaps the the graph nodes through a partial function specified via a big array, using an offline method.static int
processBatch(int n, long[] source, long[] target, File tempDir, List<File> batches)
Sorts the given source and target arrays w.r.t.static long[][]
randomPermutation(ImmutableGraph g, long seed)
Returns a random permutation for a given graph.static ImmutableGraph
simplify(ImmutableGraph g, ImmutableGraph t)
Returns a simplified (loopless and symmetric) graph using the graph and its transpose.static ImmutableGraph
simplify(ImmutableGraph g, ImmutableGraph t, ProgressLogger pl)
Returns a simplified (loopless and symmetric) graph using the graph and its transpose.static ImmutableGraph
simplifyOffline(ImmutableGraph g, int batchSize)
Returns a simplified (loopless and symmetric) graph using an offline transposition.static ImmutableGraph
simplifyOffline(ImmutableGraph g, int batchSize, File tempDir)
Returns a simplified (loopless and symmetric) graph using an offline transposition.static ImmutableGraph
simplifyOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl)
Returns a simplified graph(loopless and symmetric) using an offline transposition.static ImmutableGraph
symmetrizeOffline(ImmutableGraph g, int batchSize)
Returns a symmetrized graph using an offline transposition.static ImmutableGraph
symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir)
Returns a symmetrized graph using an offline transposition.static ImmutableGraph
symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl)
Returns a symmetrized graph using an offline transposition.static ImmutableSequentialGraph
transposeOffline(ImmutableGraph g, int batchSize)
Returns an immutable graph obtained by reversing all arcs ing
, using an offline method.static ImmutableSequentialGraph
transposeOffline(ImmutableGraph g, int batchSize, File tempDir)
Returns an immutable graph obtained by reversing all arcs ing
, using an offline method.static ImmutableSequentialGraph
transposeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl)
Returns an immutable graph obtained by reversing all arcs ing
, using an offline method.static ArcLabelledImmutableGraph
transposeOffline(ArcLabelledImmutableGraph g, int batchSize)
Returns an arclabelled immutable graph obtained by reversing all arcs ing
, using an offline method.static ArcLabelledImmutableGraph
transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir)
Returns an arclabelled immutable graph obtained by reversing all arcs ing
, using an offline method.static ArcLabelledImmutableGraph
transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl)
Returns an arclabelled immutable graph obtained by reversing all arcs ing
, using an offline method.static ImmutableGraph
union(ImmutableGraph g0, ImmutableGraph g1)
Returns the union of two immutable graphs.static ArcLabelledImmutableGraph
union(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelMergeStrategy labelMergeStrategy)
Returns the union of two arclabelled immutable graphs.

Field Details

NO_LOOPS
public static final it.unimi.dsi.big.webgraph.Transform.NoLoops NO_LOOPSA singleton providing an arc filter that rejects loops.


Method Details

filterArcs
public static ImmutableGraph filterArcs(ImmutableGraph graph, Transform.ArcFilter filter, ProgressLogger ignored)Returns a graph with some arcs eventually stripped, according to the given filter. Parameters:
graph
 a graph.filter
 the filter (telling whether each arc should be kept or not).ignored
 a progress logger, which will be ignored. Returns:
 the filtered graph.

filterArcs
public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter, ProgressLogger ignored)Returns a labelled graph with some arcs eventually stripped, according to the given filter. Parameters:
graph
 a labelled graph.filter
 the filter (telling whether each arc should be kept or not).ignored
 a progress logger, which will be ignored. Returns:
 the filtered graph.

filterArcs
Returns a graph with some arcs eventually stripped, according to the given filter. Parameters:
graph
 a graph.filter
 the filter (telling whether each arc should be kept or not). Returns:
 the filtered graph.

filterArcs
public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter)Returns a labelled graph with some arcs eventually stripped, according to the given filter. Parameters:
graph
 a labelled graph.filter
 the filter (telling whether each arc should be kept or not). Returns:
 the filtered graph.

symmetrizeOffline
Returns a symmetrized graph using an offline transposition. Parameters:
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method. Returns:
 the symmetrized graph.
 Throws:
IOException
 See Also:
symmetrizeOffline(ImmutableGraph, int, File, ProgressLogger)

symmetrizeOffline
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOExceptionReturns a symmetrized graph using an offline transposition. Parameters:
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice. Returns:
 the symmetrized graph.
 Throws:
IOException
 See Also:
symmetrizeOffline(ImmutableGraph, int, File, ProgressLogger)

symmetrizeOffline
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOExceptionReturns a symmetrized graph using an offline transposition.The symmetrized graph is the union of a graph and of its transpose. This method will compute the transpose on the fly using
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
. Parameters:
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger, ornull
. Returns:
 the symmetrized graph.
 Throws:
IOException

simplify
Returns a simplified (loopless and symmetric) graph using the graph and its transpose. Parameters:
g
 the source graph.t
 the graphg
transposed.pl
 a progress logger, ornull
. Returns:
 the simplified (loopless and symmetric) graph.

simplify
Returns a simplified (loopless and symmetric) graph using the graph and its transpose. Parameters:
g
 the source graph.t
 the graphg
transposed. Returns:
 the simplified (loopless and symmetric) graph.

simplifyOffline
Returns a simplified (loopless and symmetric) graph using an offline transposition. Parameters:
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method. Returns:
 the simplified (loopless and symmetric) graph.
 Throws:
IOException
 See Also:
simplifyOffline(ImmutableGraph, int, File, ProgressLogger)

simplifyOffline
public static ImmutableGraph simplifyOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOExceptionReturns a simplified (loopless and symmetric) graph using an offline transposition. Parameters:
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice. Returns:
 the simplified (loopless and symmetric) graph.
 Throws:
IOException
 See Also:
simplifyOffline(ImmutableGraph, int, File, ProgressLogger)

simplifyOffline
public static ImmutableGraph simplifyOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOExceptionReturns a simplified graph(loopless and symmetric) using an offline transposition.The simplified graph is the union of a graph and of its transpose, with the loops removed. This method will compute the transpose on the fly using
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
. Parameters:
g
 the source graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger, ornull
. Returns:
 the simplified (loopless and symmetric) graph.
 Throws:
IOException

processBatch
public static int processBatch(int n, long[] source, long[] target, File tempDir, List<File> batches) throws IOExceptionSorts the given source and target arrays w.r.t. the target and stores them in a temporary file. Parameters:
n
 the index of the last element to be sorted (exclusive).source
 the source array.target
 the target array.tempDir
 a temporary directory where to store the sorted arrays, ornull
batches
 a list of files to which the batch file will be added. Returns:
 the number of pairs in the batch (might be less than
n
because duplicates are eliminated).  Throws:
IOException

transposeOffline
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize) throws IOExceptionReturns an immutable graph obtained by reversing all arcs ing
, using an offline method. Parameters:
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method. Returns:
 an immutable, sequentially accessible graph obtained by transposing
g
.  Throws:
IOException
 See Also:
transposeOffline(ImmutableGraph, int, File, ProgressLogger)

transposeOffline
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOExceptionReturns an immutable graph obtained by reversing all arcs ing
, using an offline method. Parameters:
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice. Returns:
 an immutable, sequentially accessible graph obtained by transposing
g
.  Throws:
IOException
 See Also:
transposeOffline(ImmutableGraph, int, File, ProgressLogger)

transposeOffline
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOExceptionReturns an immutable graph obtained by reversing all arcs ing
, using an offline method.This method creates a number of sorted batches on disk containing arcs represented by a pair of gapcompressed long integers ordered by target and returns an
ImmutableGraph
that can be accessed only using anode iterator
. The node iterator merges on the fly the batches, providing a transposed graph. The files are marked withFile.deleteOnExit()
, so they should disappear when the JVM exits. An additional safetynet finaliser tries to delete the batches, too.Note that each
NodeIterator
returned by the transpose requires opening all batches at the same time. The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case, another safetynet finaliser closes all files when the iterator is collected.This method can process offline graphs.
 Parameters:
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger. Returns:
 an immutable, sequentially accessible graph obtained by transposing
g
.  Throws:
IOException

logBatches

mapOffline
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize) throws IOExceptionReturns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array. Parameters:
g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method. Returns:
 an immutable, sequentially accessible graph obtained by transforming
g
.  Throws:
IOException
 See Also:
mapOffline(ImmutableGraph, long[][], int, File, ProgressLogger)

mapOffline
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir) throws IOExceptionReturns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array. Parameters:
g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice. Returns:
 an immutable, sequentially accessible graph obtained by transforming
g
.  Throws:
IOException
 See Also:
mapOffline(ImmutableGraph, long[][], int, File, ProgressLogger)

mapOffline
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir, ProgressLogger pl) throws IOExceptionRemaps the the graph nodes through a partial function specified via a big array, using an offline method.More specifically,
LongBigArrays.length(map)=g.numNodes()
, andLongBigArrays.get(map, i)
is the new name of nodei
, or 1 if the node should not be mapped. If some index appearing inmap
is larger than or equal to the number of nodes ofg
, the resulting graph is enlarged correspondingly.Arcs are mapped in the obvious way; in other words, there is an arc from
LongBigArrays.get(map, i)
toLongBigArrays.get(map, j)
(both nonnegative) in the transformed graph iff there was an arc fromi
toj
in the original graph.Note that if
map
is bijective, the returned graph is simply a permutation of the original graph. Otherwise, the returned graph is obtained by deleting nodes mapped to 1, quotienting nodes w.r.t. the equivalence relation induced by the fibres ofmap
and renumbering the result, always according tomap
. SeetransposeOffline(ImmutableGraph, int, File, ProgressLogger)
for implementation and performancerelated details. Parameters:
g
 an immutable graph.map
 the transformation map.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger, ornull
. Returns:
 an immutable, sequentially accessible graph obtained by transforming
g
.  Throws:
IOException

transposeOffline
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize) throws IOExceptionReturns an arclabelled immutable graph obtained by reversing all arcs ing
, using an offline method. Parameters:
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method, plus an additionalFastByteArrayOutputStream
needed to store all the labels for a batch. Returns:
 an immutable, sequentially accessible graph obtained by transposing
g
.  Throws:
IOException
 See Also:
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)

transposeOffline
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir) throws IOExceptionReturns an arclabelled immutable graph obtained by reversing all arcs ing
, using an offline method. Parameters:
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method, plus an additionalFastByteArrayOutputStream
needed to store all the labels for a batch.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice. Returns:
 an immutable, sequentially accessible graph obtained by transposing
g
.  Throws:
IOException
 See Also:
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)

transposeOffline
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOExceptionReturns an arclabelled immutable graph obtained by reversing all arcs ing
, using an offline method.This method creates a number of sorted batches on disk containing arcs represented by a pair of long integers in
DataInput
format ordered by target and returns anImmutableGraph
that can be accessed only using anode iterator
. The node iterator merges on the fly the batches, providing a transposed graph. The files are marked withFile.deleteOnExit()
, so they should disappear when the JVM exits. An additional safetynet finaliser tries to delete the batches, too. As far as labels are concerned, they are temporarily stored in an inmemory bit stream, that is permuted when it is stored on the diskNote that each
NodeIterator
returned by the transpose requires opening all batches at the same time. The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case, another safetynet finaliser closes all files when the iterator is collected.This method can process offline graphs. Note that no method to transpose online arclabelled graph is provided currently.
 Parameters:
g
 an immutable graph.batchSize
 the number of integers in a batch; two arrays of integers of this size will be allocated by this method, plus an additionalFastByteArrayOutputStream
needed to store all the labels for a batch.tempDir
 a temporary directory for the batches, ornull
forFile.createTempFile(java.lang.String, java.lang.String)
's choice.pl
 a progress logger. Returns:
 an immutable, sequentially accessible graph obtained by transposing
g
.  Throws:
IOException

union
public static ArcLabelledImmutableGraph union(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelMergeStrategy labelMergeStrategy)Returns the union of two arclabelled immutable graphs.The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
 Parameters:
g0
 the first graph.g1
 the second graph.labelMergeStrategy
 the strategy used to merge labels when the same arc is present in both graphs; ifnull
,Labels.KEEP_FIRST_MERGE_STRATEGY
is used. Returns:
 the union of the two graphs.

union
Returns the union of two immutable graphs.The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
 Parameters:
g0
 the first graph.g1
 the second graph. Returns:
 the union of the two graphs.

compose
Returns the composition (a.k.a. matrix product) of two immutable graphs.The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
 Parameters:
g0
 the first graph.g1
 the second graph. Returns:
 the composition of the two graphs.

compose
public static ArcLabelledImmutableGraph compose(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelSemiring strategy)Returns the composition (a.k.a. matrix product) of two arclabelled immutable graphs.The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
Warning: this implementation requires outdegrees smaller than 2^{32}.
 Parameters:
g0
 the first graph.g1
 the second graph.strategy
 a label semiring. Returns:
 the composition of the two graphs.

grayCodePermutation
Returns a permutation that would make the given graph adjacency lists in Graycode order.Gray codes list all sequences of n zeros and ones in such a way that adjacent lists differ by exactly one bit. If we assign to each row of the adjacency matrix of a graph its index as a Gray code, we obtain a permutation that will make similar lines nearer.
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the Graycode permutation produced from a matrix that has been Graycode sorted will not be, in general, the identity.
The important feature of Graycode ordering is that it is completely endogenous (e.g., determined by the graph itself), contrarily to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated to each node).
 Parameters:
g
 an immutable graph. Returns:
 the permutation that would order the graph adjacency lists by Gray order
(you can just pass it to
mapOffline(ImmutableGraph, long[][], int, File, ProgressLogger)
).

randomPermutation
Returns a random permutation for a given graph. Parameters:
g
 an immutable graph.seed
 forXoRoShiRo128PlusRandom
. Returns:
 a random permutation for the given graph

lexicographicalPermutation
Returns a permutation that would make the given graph adjacency lists in lexicographical order.Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the lexicographical permutation produced from a matrix that has been lexicographically sorted will not be, in general, the identity.
The important feature of lexicographical ordering is that it is completely endogenous (e.g., determined by the graph itself), contrarily to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated to each node).
Warning: rows are numbered from zero from the left. This means, for instance, that nodes with an arc towards node zero are lexicographically smaller than nodes without it.
 Parameters:
g
 an immutable graph. Returns:
 the permutation that would order the graph adjacency lists by lexicographical order
(you can just pass it to
mapOffline(ImmutableGraph, long[][], int)
).

main
public static void main(String[] args) throws IOException, IllegalArgumentException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, ClassNotFoundException, JSAPException
