Class Transform
Most methods take an ImmutableGraph (along with some other data that depends on the kind of transformation) and return another ImmutableGraph that represents the transformed version.
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic interface
Provides a method to accept or reject an arc.static final class
static interface
Provides a method to accept or reject a labelled arc.static final class
An arc filter that rejects arcs whose well-known attribute has a value smaller than a given threshold.static final class
An arc filter that only accepts arcs whose endpoints belong to the same (if the parameter keepOnlySame is true) or to different (if keepOnlySame is false) classes.
-
Field Summary
Modifier and Type | Field | Description
static final it.unimi.dsi.big.webgraph.Transform.NoLoops | NO_LOOPS | A singleton providing an arc filter that rejects loops.
-
Method Summary
Modifier and Type | Method | Description
static ImmutableGraph | compose(ImmutableGraph g0, ImmutableGraph g1) | Returns the composition (a.k.a. matrix product) of two immutable graphs.
static ArcLabelledImmutableGraph | compose(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelSemiring strategy) | Returns the composition (a.k.a. matrix product) of two arc-labelled immutable graphs.
static ImmutableGraph | filterArcs(ImmutableGraph graph, Transform.ArcFilter filter) | Returns a graph with some arcs possibly stripped, according to the given filter.
static ImmutableGraph | filterArcs(ImmutableGraph graph, Transform.ArcFilter filter, ProgressLogger ignored) | Returns a graph with some arcs possibly stripped, according to the given filter.
static ArcLabelledImmutableGraph | filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter) | Returns a labelled graph with some arcs possibly stripped, according to the given filter.
static ArcLabelledImmutableGraph | filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter, ProgressLogger ignored) | Returns a labelled graph with some arcs possibly stripped, according to the given filter.
static long[][] | grayCodePermutation | Returns a permutation that would put the adjacency lists of the given graph in Gray-code order.
static long[][] | lexicographicalPermutation | Returns a permutation that would put the adjacency lists of the given graph in lexicographical order.
protected static void | logBatches(ObjectArrayList<File> batches, long pairs, ProgressLogger pl) |
static void | main(String[] args) |
static ImmutableSequentialGraph | mapOffline(ImmutableGraph g, long[][] map, int batchSize) | Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array.
static ImmutableSequentialGraph | mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir) | Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array.
static ImmutableSequentialGraph | mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir, ProgressLogger pl) | Remaps the graph nodes through a partial function specified via a big array, using an offline method.
static int | processBatch(int n, long[] source, long[] target, File tempDir, List<File> batches) | Sorts the given source and target arrays w.r.t. the target and stores them in a temporary file.
static long[][] | randomPermutation(ImmutableGraph g, long seed) | Returns a random permutation for a given graph.
static ImmutableGraph | simplify | Returns a simplified (loopless and symmetric) graph using the graph and its transpose.
static ImmutableGraph | simplify(ImmutableGraph g, ImmutableGraph t, ProgressLogger pl) | Returns a simplified (loopless and symmetric) graph using the graph and its transpose.
static ImmutableGraph | simplifyOffline(ImmutableGraph g, int batchSize) | Returns a simplified (loopless and symmetric) graph using an offline transposition.
static ImmutableGraph | simplifyOffline(ImmutableGraph g, int batchSize, File tempDir) | Returns a simplified (loopless and symmetric) graph using an offline transposition.
static ImmutableGraph | simplifyOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) | Returns a simplified (loopless and symmetric) graph using an offline transposition.
static ImmutableGraph | symmetrizeOffline(ImmutableGraph g, int batchSize) | Returns a symmetrized graph using an offline transposition.
static ImmutableGraph | symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir) | Returns a symmetrized graph using an offline transposition.
static ImmutableGraph | symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) | Returns a symmetrized graph using an offline transposition.
static ImmutableSequentialGraph | transposeOffline(ImmutableGraph g, int batchSize) | Returns an immutable graph obtained by reversing all arcs in g, using an offline method.
static ImmutableSequentialGraph | transposeOffline(ImmutableGraph g, int batchSize, File tempDir) | Returns an immutable graph obtained by reversing all arcs in g, using an offline method.
static ImmutableSequentialGraph | transposeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) | Returns an immutable graph obtained by reversing all arcs in g, using an offline method.
static ArcLabelledImmutableGraph | transposeOffline(ArcLabelledImmutableGraph g, int batchSize) | Returns an arc-labelled immutable graph obtained by reversing all arcs in g, using an offline method.
static ArcLabelledImmutableGraph | transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir) | Returns an arc-labelled immutable graph obtained by reversing all arcs in g, using an offline method.
static ArcLabelledImmutableGraph | transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) | Returns an arc-labelled immutable graph obtained by reversing all arcs in g, using an offline method.
static ImmutableGraph | union(ImmutableGraph g0, ImmutableGraph g1) | Returns the union of two immutable graphs.
static ArcLabelledImmutableGraph | union(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelMergeStrategy labelMergeStrategy) | Returns the union of two arc-labelled immutable graphs.
-
Field Details
-
NO_LOOPS
public static final it.unimi.dsi.big.webgraph.Transform.NoLoops NO_LOOPS
A singleton providing an arc filter that rejects loops.
-
-
Method Details
-
filterArcs
public static ImmutableGraph filterArcs(ImmutableGraph graph, Transform.ArcFilter filter, ProgressLogger ignored)
Returns a graph with some arcs possibly stripped, according to the given filter.
Parameters:
graph - a graph.
filter - the filter (telling whether each arc should be kept or not).
ignored - a progress logger, which will be ignored.
Returns:
the filtered graph.
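The filtering contract described above can be illustrated with a minimal, library-free sketch. It does not use the WebGraph classes: the `ArcFilter` interface, its `accept(long, long)` method and the adjacency-array representation are stand-ins assumed for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ArcFilterSketch {
    /** Hypothetical stand-in for an arc filter: tells whether the arc i -> j should be kept. */
    interface ArcFilter {
        boolean accept(long i, long j);
    }

    /** A filter that rejects loops, in the spirit of NO_LOOPS. */
    static final ArcFilter NO_LOOPS = (i, j) -> i != j;

    /** Returns a copy of the adjacency lists that contains only the accepted arcs. */
    static long[][] filterArcs(long[][] successors, ArcFilter filter) {
        long[][] result = new long[successors.length][];
        for (int i = 0; i < successors.length; i++) {
            List<Long> kept = new ArrayList<>();
            for (long j : successors[i]) {
                if (filter.accept(i, j)) kept.add(j);
            }
            result[i] = kept.stream().mapToLong(Long::longValue).toArray();
        }
        return result;
    }

    public static void main(String[] args) {
        long[][] g = { { 0, 1 }, { 1, 2 }, { 0 } };
        System.out.println(Arrays.deepToString(filterArcs(g, NO_LOOPS)));
    }
}
```

The sketch builds the filtered lists eagerly; nothing in it says how the library itself materializes the result.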
-
filterArcs
public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter, ProgressLogger ignored)
Returns a labelled graph with some arcs possibly stripped, according to the given filter.
Parameters:
graph - a labelled graph.
filter - the filter (telling whether each arc should be kept or not).
ignored - a progress logger, which will be ignored.
Returns:
the filtered graph.
-
filterArcs
Returns a graph with some arcs possibly stripped, according to the given filter.
Parameters:
graph - a graph.
filter - the filter (telling whether each arc should be kept or not).
Returns:
the filtered graph.
-
filterArcs
public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter)
Returns a labelled graph with some arcs possibly stripped, according to the given filter.
Parameters:
graph - a labelled graph.
filter - the filter (telling whether each arc should be kept or not).
Returns:
the filtered graph.
-
symmetrizeOffline
Returns a symmetrized graph using an offline transposition.
Parameters:
g - the source graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
Returns:
the symmetrized graph.
Throws:
IOException
-
symmetrizeOffline
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
Returns a symmetrized graph using an offline transposition.
Parameters:
g - the source graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
Returns:
the symmetrized graph.
Throws:
IOException
-
symmetrizeOffline
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
Returns a symmetrized graph using an offline transposition.
The symmetrized graph is the union of a graph and of its transpose. This method will compute the transpose on the fly using transposeOffline(ImmutableGraph, int, File, ProgressLogger).
Parameters:
g - the source graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
pl - a progress logger, or null.
Returns:
the symmetrized graph.
Throws:
IOException
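As an illustration of the definition above (the union of a graph and of its transpose), here is a small in-memory sketch. It is not the offline algorithm used by the library: it assumes the whole graph fits in plain `int[][]` adjacency arrays, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class SymmetrizeSketch {
    /** Returns the union of the graph and of its transpose, as sorted successor arrays. */
    static int[][] symmetrize(int[][] successors) {
        int n = successors.length;
        List<TreeSet<Integer>> merged = new ArrayList<>();
        for (int i = 0; i < n; i++) merged.add(new TreeSet<>());
        for (int i = 0; i < n; i++) {
            for (int j : successors[i]) {
                merged.get(i).add(j); // the arc i -> j of the original graph
                merged.get(j).add(i); // the reversed arc j -> i, that is, an arc of the transpose
            }
        }
        int[][] result = new int[n][];
        for (int i = 0; i < n; i++) {
            result[i] = merged.get(i).stream().mapToInt(Integer::intValue).toArray();
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] g = { { 1 }, { 2 }, {} };
        System.out.println(Arrays.deepToString(symmetrize(g)));
    }
}
```

Loops are kept by this sketch; dropping the arcs with `i == j` as well would give the simplified graph described for simplifyOffline.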
-
simplify
Returns a simplified (loopless and symmetric) graph using the graph and its transpose.
Parameters:
g - the source graph.
t - the graph g transposed.
pl - a progress logger, or null.
Returns:
the simplified (loopless and symmetric) graph.
-
simplify
Returns a simplified (loopless and symmetric) graph using the graph and its transpose.
Parameters:
g - the source graph.
t - the graph g transposed.
Returns:
the simplified (loopless and symmetric) graph.
-
simplifyOffline
Returns a simplified (loopless and symmetric) graph using an offline transposition.
Parameters:
g - the source graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
Returns:
the simplified (loopless and symmetric) graph.
Throws:
IOException
-
simplifyOffline
public static ImmutableGraph simplifyOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
Returns a simplified (loopless and symmetric) graph using an offline transposition.
Parameters:
g - the source graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
Returns:
the simplified (loopless and symmetric) graph.
Throws:
IOException
-
simplifyOffline
public static ImmutableGraph simplifyOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
Returns a simplified (loopless and symmetric) graph using an offline transposition.
The simplified graph is the union of a graph and of its transpose, with the loops removed. This method will compute the transpose on the fly using transposeOffline(ImmutableGraph, int, File, ProgressLogger).
Parameters:
g - the source graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
pl - a progress logger, or null.
Returns:
the simplified (loopless and symmetric) graph.
Throws:
IOException
-
processBatch
public static int processBatch(int n, long[] source, long[] target, File tempDir, List<File> batches) throws IOException
Sorts the given source and target arrays w.r.t. the target and stores them in a temporary file.
Parameters:
n - the index of the last element to be sorted (exclusive).
source - the source array.
target - the target array.
tempDir - a temporary directory in which to store the sorted arrays, or null.
batches - a list of files to which the batch file will be added.
Returns:
the number of pairs in the batch (might be less than n because duplicates are eliminated).
Throws:
IOException
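The sort-and-deduplicate step can be sketched in memory as follows. This is only an illustration under stated assumptions: it leaves out the temporary file entirely, breaks ties on the source (the description above only says the sort is w.r.t. the target), and uses hypothetical names throughout.

```java
import java.util.Arrays;

final class BatchSortSketch {
    /**
     * Sorts the first n (source, target) pairs by target, then by source (the tie-break is an assumption),
     * compacts the distinct pairs at the beginning of the two arrays and returns how many there are.
     */
    static int sortAndDeduplicate(int n, long[] source, long[] target) {
        long[][] pairs = new long[n][];
        for (int i = 0; i < n; i++) pairs[i] = new long[] { source[i], target[i] };
        Arrays.sort(pairs, (a, b) -> a[1] != b[1] ? Long.compare(a[1], b[1]) : Long.compare(a[0], b[0]));
        int unique = 0;
        for (int i = 0; i < n; i++) {
            // After sorting, equal pairs are adjacent, so comparing with the previous pair is enough.
            if (i > 0 && pairs[i][0] == pairs[i - 1][0] && pairs[i][1] == pairs[i - 1][1]) continue;
            source[unique] = pairs[i][0];
            target[unique] = pairs[i][1];
            unique++;
        }
        return unique;
    }

    public static void main(String[] args) {
        long[] source = { 2, 0, 2, 1 };
        long[] target = { 1, 1, 1, 0 };
        int unique = sortAndDeduplicate(4, source, target);
        System.out.println(unique + " distinct pairs");
    }
}
```

The returned count can be smaller than n exactly when the input contains repeated pairs, which matches the return value documented above.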
-
transposeOffline
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize) throws IOException
Returns an immutable graph obtained by reversing all arcs in g, using an offline method.
Parameters:
g - an immutable graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
Returns:
an immutable, sequentially accessible graph obtained by transposing g.
Throws:
IOException
-
transposeOffline
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
Returns an immutable graph obtained by reversing all arcs in g, using an offline method.
Parameters:
g - an immutable graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
Returns:
an immutable, sequentially accessible graph obtained by transposing g.
Throws:
IOException
-
transposeOffline
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
Returns an immutable graph obtained by reversing all arcs in g, using an offline method.
This method creates a number of sorted batches on disk containing arcs represented by a pair of gap-compressed long integers ordered by target and returns an ImmutableGraph that can be accessed only using a node iterator. The node iterator merges the batches on the fly, providing a transposed graph. The files are marked with File.deleteOnExit(), so they should disappear when the JVM exits. An additional safety-net finaliser tries to delete the batches, too.
Note that each NodeIterator returned by the transpose requires opening all batches at the same time. The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case, another safety-net finaliser closes all files when the iterator is collected.
This method can process offline graphs.
Parameters:
g - an immutable graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
pl - a progress logger.
Returns:
an immutable, sequentially accessible graph obtained by transposing g.
Throws:
IOException
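The batch-and-merge scheme described above can be sketched in a few lines. This is a simplified illustration, not the library's code: the batches are in-memory arrays standing in for the gap-compressed files, arcs are `{source, target}` pairs, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class BatchMergeTransposeSketch {
    /** Order used inside the batches and during the merge: by target, then by source. */
    static final Comparator<long[]> BY_TARGET =
        (a, b) -> a[1] != b[1] ? Long.compare(a[1], b[1]) : Long.compare(a[0], b[0]);

    /** Splits the arcs into sorted batches of at most batchSize arcs (in-memory stand-ins for files). */
    static List<long[][]> makeBatches(long[][] arcs, int batchSize) {
        List<long[][]> batches = new ArrayList<>();
        for (int from = 0; from < arcs.length; from += batchSize) {
            long[][] batch = Arrays.copyOfRange(arcs, from, Math.min(arcs.length, from + batchSize));
            Arrays.sort(batch, BY_TARGET);
            batches.add(batch);
        }
        return batches;
    }

    /** Merges the batches on the fly and emits each reversed arc {target, source} once. */
    static List<long[]> mergeTransposed(List<long[][]> batches) {
        // Each queue entry is {batch index, position in the batch}, ordered by the arc it points to.
        PriorityQueue<int[]> queue = new PriorityQueue<>(
            (x, y) -> BY_TARGET.compare(batches.get(x[0])[x[1]], batches.get(y[0])[y[1]]));
        for (int b = 0; b < batches.size(); b++) {
            if (batches.get(b).length > 0) queue.add(new int[] { b, 0 });
        }
        List<long[]> transposed = new ArrayList<>();
        long[] last = null;
        while (!queue.isEmpty()) {
            int[] head = queue.poll();
            long[] arc = batches.get(head[0])[head[1]];
            if (last == null || BY_TARGET.compare(last, arc) != 0) {
                transposed.add(new long[] { arc[1], arc[0] });
                last = arc;
            }
            // Advance within the batch; an exhausted batch simply drops out of the queue.
            if (head[1] + 1 < batches.get(head[0]).length) queue.add(new int[] { head[0], head[1] + 1 });
        }
        return transposed;
    }

    public static void main(String[] args) {
        long[][] arcs = { { 0, 1 }, { 0, 2 }, { 1, 2 }, { 2, 0 } };
        for (long[] arc : mergeTransposed(makeBatches(arcs, 2))) {
            System.out.println(arc[0] + " -> " + arc[1]);
        }
    }
}
```

Because the merge needs the head of every batch, all batches are open at once, which mirrors the note above about each NodeIterator opening all batches at the same time.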
-
logBatches
-
mapOffline
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize) throws IOException
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array.
Parameters:
g - an immutable graph.
map - the transformation map.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
Returns:
an immutable, sequentially accessible graph obtained by transforming g.
Throws:
IOException
-
mapOffline
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir) throws IOException
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via a big array.
Parameters:
g - an immutable graph.
map - the transformation map.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
Returns:
an immutable, sequentially accessible graph obtained by transforming g.
Throws:
IOException
-
mapOffline
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, long[][] map, int batchSize, File tempDir, ProgressLogger pl) throws IOException
Remaps the graph nodes through a partial function specified via a big array, using an offline method.
More specifically, LongBigArrays.length(map)=g.numNodes(), and LongBigArrays.get(map, i) is the new name of node i, or -1 if the node should not be mapped. If some index appearing in map is larger than or equal to the number of nodes of g, the resulting graph is enlarged correspondingly.
Arcs are mapped in the obvious way; in other words, there is an arc from LongBigArrays.get(map, i) to LongBigArrays.get(map, j) (both nonnegative) in the transformed graph iff there was an arc from i to j in the original graph.
Note that if map is bijective, the returned graph is simply a permutation of the original graph. Otherwise, the returned graph is obtained by deleting nodes mapped to -1, quotienting nodes w.r.t. the equivalence relation induced by the fibres of map and renumbering the result, always according to map. See transposeOffline(ImmutableGraph, int, File, ProgressLogger) for implementation and performance-related details.
Parameters:
g - an immutable graph.
map - the transformation map.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
pl - a progress logger, or null.
Returns:
an immutable, sequentially accessible graph obtained by transforming g.
Throws:
IOException
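The mapping semantics described above (deletion of nodes mapped to -1, merging of nodes with the same image, enlargement when an image exceeds the node count) can be sketched in memory. This is an illustration only: it uses a plain `long[]` instead of a big array, it assumes the result never has fewer nodes than the source, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class MapNodesSketch {
    /** Remaps nodes through map, where map[i] is the new name of node i, or -1 to delete it. */
    static long[][] remap(long[][] successors, long[] map) {
        // Assumption: the result has as many nodes as the source, or more if map points past the end.
        long numNodes = successors.length;
        for (long image : map) numNodes = Math.max(numNodes, image + 1);
        List<TreeSet<Long>> mapped = new ArrayList<>();
        for (long x = 0; x < numNodes; x++) mapped.add(new TreeSet<>());
        for (int i = 0; i < successors.length; i++) {
            if (map[i] < 0) continue; // node i is deleted together with its outgoing arcs
            for (long j : successors[i]) {
                if (map[(int) j] < 0) continue; // arcs towards deleted nodes disappear
                mapped.get((int) map[i]).add(map[(int) j]); // nodes with the same image are merged
            }
        }
        long[][] result = new long[(int) numNodes][];
        for (int x = 0; x < result.length; x++) {
            result[x] = mapped.get(x).stream().mapToLong(Long::longValue).toArray();
        }
        return result;
    }

    public static void main(String[] args) {
        long[][] g = { { 1 }, { 2 }, { 0 } };
        System.out.println(Arrays.deepToString(remap(g, new long[] { 0, 0, -1 })));
    }
}
```

With a bijective map the sketch only renames nodes, which is the permutation case mentioned above.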
-
transposeOffline
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize) throws IOException
Returns an arc-labelled immutable graph obtained by reversing all arcs in g, using an offline method.
Parameters:
g - an immutable graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method, plus an additional FastByteArrayOutputStream needed to store all the labels for a batch.
Returns:
an immutable, sequentially accessible graph obtained by transposing g.
Throws:
IOException
-
transposeOffline
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir) throws IOException
Returns an arc-labelled immutable graph obtained by reversing all arcs in g, using an offline method.
Parameters:
g - an immutable graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method, plus an additional FastByteArrayOutputStream needed to store all the labels for a batch.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
Returns:
an immutable, sequentially accessible graph obtained by transposing g.
Throws:
IOException
-
transposeOffline
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
Returns an arc-labelled immutable graph obtained by reversing all arcs in g, using an offline method.
This method creates a number of sorted batches on disk containing arcs represented by a pair of long integers in DataInput format ordered by target and returns an ImmutableGraph that can be accessed only using a node iterator. The node iterator merges the batches on the fly, providing a transposed graph. The files are marked with File.deleteOnExit(), so they should disappear when the JVM exits. An additional safety-net finaliser tries to delete the batches, too. As far as labels are concerned, they are temporarily stored in an in-memory bit stream, which is permuted when it is stored on disk.
Note that each NodeIterator returned by the transpose requires opening all batches at the same time. The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case, another safety-net finaliser closes all files when the iterator is collected.
This method can process offline graphs. Note that no online method to transpose arc-labelled graphs is currently provided.
Parameters:
g - an immutable graph.
batchSize - the number of integers in a batch; two arrays of integers of this size will be allocated by this method, plus an additional FastByteArrayOutputStream needed to store all the labels for a batch.
tempDir - a temporary directory for the batches, or null for File.createTempFile(java.lang.String, java.lang.String)'s choice.
pl - a progress logger.
Returns:
an immutable, sequentially accessible graph obtained by transposing g.
Throws:
IOException
-
union
public static ArcLabelledImmutableGraph union(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelMergeStrategy labelMergeStrategy)
Returns the union of two arc-labelled immutable graphs.
The two arguments may differ in the number of nodes, in which case the resulting graph will be as large as the larger graph.
Parameters:
g0 - the first graph.
g1 - the second graph.
labelMergeStrategy - the strategy used to merge labels when the same arc is present in both graphs; if null, Labels.KEEP_FIRST_MERGE_STRATEGY is used.
Returns:
the union of the two graphs.
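A minimal sketch of a labelled union with a merge strategy follows. It is not based on the library's label classes: each node is a map from successor to an integer label, the strategy is a plain `BinaryOperator<Integer>`, and a null strategy keeps the label of the first graph, mimicking the default described above. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

final class LabelledUnionSketch {
    /** Union of two labelled graphs; node i of each graph maps successor -> label. */
    static List<Map<Integer, Integer>> union(List<Map<Integer, Integer>> g0,
                                             List<Map<Integer, Integer>> g1,
                                             BinaryOperator<Integer> merge) {
        // A null strategy keeps the label found in the first graph.
        BinaryOperator<Integer> strategy = merge != null ? merge : (first, second) -> first;
        int n = Math.max(g0.size(), g1.size()); // the result is as large as the larger graph
        List<Map<Integer, Integer>> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Map<Integer, Integer> arcs = new TreeMap<>();
            if (i < g0.size()) arcs.putAll(g0.get(i));
            if (i < g1.size()) {
                for (Map.Entry<Integer, Integer> e : g1.get(i).entrySet()) {
                    // The strategy is invoked only when the arc is present in both graphs.
                    arcs.merge(e.getKey(), e.getValue(), strategy);
                }
            }
            result.add(arcs);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<Integer, Integer>> g0 = List.of(Map.of(1, 10));
        List<Map<Integer, Integer>> g1 = List.of(Map.of(1, 20, 0, 5), Map.of(0, 7));
        System.out.println(union(g0, g1, null));
        System.out.println(union(g0, g1, Integer::sum));
    }
}
```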
-
union
Returns the union of two immutable graphs.
The two arguments may differ in the number of nodes, in which case the resulting graph will be as large as the larger graph.
Parameters:
g0 - the first graph.
g1 - the second graph.
Returns:
the union of the two graphs.
-
compose
Returns the composition (a.k.a. matrix product) of two immutable graphs.
The two arguments may differ in the number of nodes, in which case the resulting graph will be as large as the larger graph.
Parameters:
g0 - the first graph.
g1 - the second graph.
Returns:
the composition of the two graphs.
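Reading the composition as a Boolean matrix product, there is an arc from i to k in the result whenever some j has an arc from i to j in the first graph and an arc from j to k in the second. The sketch below illustrates that definition on plain `int[][]` adjacency arrays; it is not the library's implementation and its names are hypothetical.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class ComposeSketch {
    /** Returns the graph with an arc i -> k whenever i -> j is in g0 and j -> k is in g1 for some j. */
    static int[][] compose(int[][] g0, int[][] g1) {
        int n = Math.max(g0.length, g1.length); // the result is as large as the larger graph
        int[][] result = new int[n][];
        for (int i = 0; i < n; i++) {
            TreeSet<Integer> reachable = new TreeSet<>();
            if (i < g0.length) {
                for (int j : g0[i]) {
                    // Nodes that exist only in g0 have no successors in g1.
                    if (j < g1.length) {
                        for (int k : g1[j]) reachable.add(k);
                    }
                }
            }
            result[i] = reachable.stream().mapToInt(Integer::intValue).toArray();
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] g0 = { { 1 }, { 2 }, {} };
        int[][] g1 = { {}, { 0, 2 }, { 1 } };
        System.out.println(Arrays.deepToString(compose(g0, g1)));
    }
}
```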
-
compose
public static ArcLabelledImmutableGraph compose(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelSemiring strategy)
Returns the composition (a.k.a. matrix product) of two arc-labelled immutable graphs.
The two arguments may differ in the number of nodes, in which case the resulting graph will be as large as the larger graph.
Parameters:
g0 - the first graph.
g1 - the second graph.
strategy - a label semiring.
Returns:
the composition of the two graphs.
Implementation Specification:
This implementation requires outdegrees smaller than 2^32.
-
grayCodePermutation
Returns a permutation that would put the adjacency lists of the given graph in Gray-code order.
Gray codes list all sequences of n zeros and ones in such a way that adjacent sequences differ by exactly one bit. If we assign to each row of the adjacency matrix of a graph its index as a Gray code, we obtain a permutation that will bring similar rows closer together.
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the Gray-code permutation produced from a matrix that has been Gray-code sorted will not be, in general, the identity.
The important feature of Gray-code ordering is that it is completely endogenous (i.e., determined by the graph itself), in contrast to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated with each node).
Parameters:
g - an immutable graph.
Returns:
the permutation that would order the graph adjacency lists by Gray order (you can just pass it to mapOffline(ImmutableGraph, long[][], int, File, ProgressLogger)).
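One way to compare two adjacency-matrix rows by their index as Gray codes, without building the matrix, is sketched below. It relies on the standard Gray-to-binary conversion (each binary digit is the parity of the Gray digits up to that column) and on two assumptions that the text above does not state: column 0 is the most significant digit, and the returned array gives the new position of each node. The names are hypothetical.

```java
import java.util.Arrays;

final class GrayCodeOrderSketch {
    /** Compares two rows, given as sorted successor arrays, by their rank in Gray-code order. */
    static int compareGray(int[] a, int[] b) {
        int common = 0;
        while (common < a.length && common < b.length && a[common] == b[common]) common++;
        if (common == a.length && common == b.length) return 0;
        // Which row has a one in the first column where the two rows differ?
        boolean aHasOne = common < a.length && (common == b.length || a[common] < b[common]);
        // The binary digit in that column is the parity of the ones seen so far: after an even
        // number of shared ones the row with the extra one ranks higher, after an odd number lower.
        boolean evenPrefix = (common & 1) == 0;
        return aHasOne == evenPrefix ? 1 : -1;
    }

    /** Returns perm such that perm[x] is the position of node x once rows are sorted in Gray-code order. */
    static int[] grayCodePermutation(int[][] successors) {
        int n = successors.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, (x, y) -> compareGray(successors[x], successors[y]));
        int[] perm = new int[n];
        for (int pos = 0; pos < n; pos++) perm[order[pos]] = pos;
        return perm;
    }

    public static void main(String[] args) {
        int[][] g = { { 0, 1 }, {}, { 0 } };
        System.out.println(Arrays.toString(grayCodePermutation(g)));
    }
}
```

In the example, the rows 110, 000 and 100 have Gray-code ranks 4, 0 and 7, so node 1 comes first, then node 0, then node 2.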
-
randomPermutation
Returns a random permutation for a given graph.
Parameters:
g - an immutable graph.
seed - the seed for XoRoShiRo128PlusRandom.
Returns:
a random permutation for the given graph.
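A seeded random permutation can be produced with a Fisher-Yates shuffle, as in the sketch below. This is an illustration only: it uses `java.util.Random` as a stand-in for XoRoShiRo128PlusRandom, a plain `long[]` instead of a big array, and a hypothetical method that takes the number of nodes rather than a graph.

```java
import java.util.Arrays;
import java.util.Random;

final class RandomPermutationSketch {
    /** Returns a permutation of 0..numNodes-1 that is reproducible for a given seed. */
    static long[] randomPermutation(int numNodes, long seed) {
        long[] perm = new long[numNodes];
        for (int i = 0; i < numNodes; i++) perm[i] = i;
        Random random = new Random(seed); // stand-in generator, not the one named above
        // Fisher-Yates shuffle: swap each position with a random position not after it.
        for (int i = numNodes - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            long tmp = perm[i];
            perm[i] = perm[j];
            perm[j] = tmp;
        }
        return perm;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(randomPermutation(5, 42L)));
    }
}
```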
-
lexicographicalPermutation
Returns a permutation that would put the adjacency lists of the given graph in lexicographical order.
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the lexicographical permutation produced from a matrix that has been lexicographically sorted will not be, in general, the identity.
The important feature of lexicographical ordering is that it is completely endogenous (i.e., determined by the graph itself), in contrast to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated with each node).
Warning: rows are numbered from zero from the left. This means, for instance, that nodes with an arc towards node zero are lexicographically smaller than nodes without it.
Parameters:
g - an immutable graph.
Returns:
the permutation that would order the graph adjacency lists by lexicographical order (you can just pass it to mapOffline(ImmutableGraph, long[][], int)).
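The ordering implied by the warning above (a row with an arc towards an earlier node is smaller) can be sketched as a comparison of sorted successor arrays. This is one reading of the description, not the library's comparator; that the returned array gives the new position of each node is also an assumption, and the names are hypothetical.

```java
import java.util.Arrays;

final class LexicographicOrderSketch {
    /** Compares two rows, given as sorted successor arrays; the row with the earlier one is smaller. */
    static int compareRows(int[] a, int[] b) {
        int k = 0;
        while (k < a.length && k < b.length && a[k] == b[k]) k++;
        if (k == a.length && k == b.length) return 0;
        if (k == a.length) return 1;  // a has only zeros where b still has a one
        if (k == b.length) return -1; // b has only zeros where a still has a one
        return Integer.compare(a[k], b[k]);
    }

    /** Returns perm such that perm[x] is the position of node x once rows are sorted. */
    static int[] lexicographicalPermutation(int[][] successors) {
        int n = successors.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, (x, y) -> compareRows(successors[x], successors[y]));
        int[] perm = new int[n];
        for (int pos = 0; pos < n; pos++) perm[order[pos]] = pos;
        return perm;
    }

    public static void main(String[] args) {
        int[][] g = { { 1 }, { 0, 2 }, {} };
        System.out.println(Arrays.toString(lexicographicalPermutation(g)));
    }
}
```

In the example, node 1 is the only node with an arc towards node zero, so it comes first; the node without successors comes last.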
-
main
public static void main(String[] args) throws IOException, IllegalArgumentException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, ClassNotFoundException, JSAPException
-