Class BVGraph
 All Implemented Interfaces:
CompressionFlags
,FlyweightPrototype<ImmutableGraph>
,Serializable
public class BVGraph extends ImmutableGraph implements CompressionFlags, Serializable
This class provides a flexible and configurable way to store and access web graphs in a
compressed form. Its main method can load an ImmutableGraph
and compress it. The
resulting compressed BVGraph
is described by a graph file (with extension
.graph
), an offset file (with extension .offsets
) and a
property file (with extension .properties
). The latter, not surprisingly,
is a Java property file. Optionally, an offset biglist file (with extension
.obl
) can be created to load graphs faster. *
As a rule of thumb, random access is faster using successors(long)
, whereas while
iterating using a NodeIterator
it is better to use
NodeIterator.successorBigArray()
.
Parallel compression
Starting with version 3.6.0, this classes uses ImmutableGraph.splitNodeIterators(int)
to
compress a graph using parallel compression threads. The number of parallel threads can be set at
construction time, or using the property
"it.unimi.dsi.webgraph.threads" from the command line;
this approach is useful with classes such as Transform
.
Parallel compression requires the creation of possibly large temporary files. It might be
necessary to set the property java.io.tmpdir
to a suitable directory if you experience
diskfull errors during compression.
The Graph File
This class stores a graph as an bit stream. The bit stream format depends on a number of parameters and encodings that can be mixed orthogonally. The parameters are:
 the window size, a nonnegative integer;
 the maximum reference count, a positive integer (it is meaningful only when the window is nonzero);
 the minimum interval length, an integer larger than or equal to two, or 0, which is interpreted as infinity.
Successor Lists
The graph file is a sequence of successor lists, one for each node. The list of node x can be thought of as a sequence of natural numbers (even though, as we will explain later, this sequence is further coded suitably as a sequence of bits):
 The outdegree of the node; if it is zero, the list ends here.
 If the window size is not zero, the reference part, that is:
 a nonnegative integer, the reference, which never exceeds the window size; if the reference is r, the list of successors will be specified as a modified version of the list of successors of x−r; if r is 0, then the list of successors will be specified explicitly;
 if r is nonzero:
 a natural number b, the block count;
 a sequence of b natural numbers B_{1}, …, B_{b}, called the copyblock list; only the first number can be zero.
 Then comes the extra part, specifying some more entries that the list of successors
contains (or all of them, if r is zero), that is:
 If the minimum interval length is finite,
 an integer i, the interval count;
 a sequence of i pairs, whose first component is the left extreme of an interval, and whose second component is the length of the interval (i.e., the number of integers contained in the interval).
 Finally, the list of residuals, which contain all successors not specified by previous methods.
 If the minimum interval length is finite,
The above data should be interpreted as follows:
 The reference part, if present (i.e., if both the window size and the reference are strictly positive), specifies that part of the list of successors of node x−r should be copied; the successors of node x−r that should be copied are described in the copyblock list; more precisely, one should copy the first B_{1} entries of this list, discard the next B_{2}, copy the next B_{3} etc. (the last remaining elements of the list of successors will be copied if b is even, and discarded if b is odd).
 The extra part specifies additional successors (or all of them, if the reference part is
absent); the extra part is not present if the number of successors that are to be copied
according to the reference part already coincides with the outdegree of x; the
successors listed in the extra part are given in two forms:
 some of them are specified as belonging to (integer) intervals, if the minimum interval length is finite; the interval count indicates how many intervals, and the intervals themselves are listed as pairs (left extreme, length);
 the residuals are the remaining "scattered" successors.
How Successor Lists Are Coded
As we said before, the list of integers corresponding to each successor list should be coded into
a sequence of bits. This is (ideally) done in two phases: we first modify the sequence in a
suitable manner (as explained below) so to obtain another sequence of integers (some of them
might be negative). Then each single integer is coded, using a coding that can be specified as an
option; the integers that may be negative are first turned into natural numbers using
Fast.int2nat(long)
.
 The outdegree of the node is left unchanged, as well as the reference and the block count;
 all blocks are decremented by 1, except for the first one;
 the interval count is left unchanged;
 all interval lengths are decremented by the minimum interval length;
 the first left extreme is expressed as its difference from x (it will be negative if the first extreme is less than x); the remaining left extremes are expressed as their distance from the previous right extreme plus 2 (e.g., if the interval is [5..11] and the previous one was [1..3], then the left extreme 5 is expressed as 5(3+2)=55=0);
 the first residual is expressed as its difference from x (it will be negative if the first residual is less than x); the remaining residuals are expressed as decremented differences from the previous residual.
The Offset File
Since the graph is stored as a bit stream, we must have some way to know where each successor list starts. This information is stored in the offset file, which contains the bit offset of each successor list (in particular, the offset of the first successor list will be zero). As a commodity, the offset file contains an additional offset pointing just after the last successor list (providing, as a sideeffect, the actual bit length of the graph file). Each offset (except for the first one) is stored as a suitably coded difference from the previous offset.
The list of offsets can be additionally stored as a serialised
EliasFanoMonotoneLongBigList
using a suitable commandline option. If the serialised big
list is detected, it is loaded instead of parsing the offset list.
The Property File
This file contains selfexplaining entries that are necessary to correctly decode the graph and offset files, and moreover give some statistical information about the compressed graph (e.g., the number of bits per link).
nodes
 the number of nodes of the graph.
nodes
 the number of arcs of the graph.
version
 a version number.
graphclass
 the name of the class that should load this graph (
ImmutableGraph
convention). bitsperlink
 the number of bits per link (overall graph size in bits divided by the number of arcs).
bitspernode
 the number of bits per node (overall graph size in bits divided by the number of nodes).
compratio
 the ratio between the graph size and the informationtheoretical lower bound (the binary
logarithm of the number of subsets of size
arcs
out of a universe ofnodes
^{2} elements). compressionflags
 flags specifying the codes used for the components of the compression algorithm.
zetak
 if ζ codes are selected for residuals, the parameter k.
windowsize
 the window size.
maxref
 the maximum reference count.
minintervallength
 the minimum length of an interval.
avgdist
 the average distance of a reference.
avgref
 the average length of reference chains.
bitsfor*
 number of bits used by a specific compoenent of the algorithm (the sum is the number of bits used to store the graph).
avgbitsfor*
 number of bits used by a specific compoenent of the algorithm, divided by the number of nodes (the sum is the number of bits per node).
*arcs
 the number of arcs stored by each component of the algorithm (the sum is the number of arcs).
*expstats
 frequencies of the floor of the logarithm of successor gaps and residual gaps, separated by a
comma; the statistics include the gap between each node and its first successor, after it has
been passed through
Fast.int2nat(long)
, but discarding zeroes (which happen in very rare circumstance, and should be considered immaterial). *avg[log]gap
 the average of the gaps (or of their logarithm) of successors and residuals: note that this data is computed from the exponential statistics above, and thus it is necessarily approximate.
How The Graph File Is Loaded Into Memory
The natural way of using a graph file is to load it into a byte array and then index its bits
using the suitable offset. This class will use a byte array for graphs smaller than
Integer.MAX_VALUE
bytes, and a
FastMultiByteArrayInputStream
otherwise: in the latter case,
expect a significant slowdown (as an InputBitStream
can wrap directly a
byte array).
Offsets are loaded using an EliasFanoMonotoneLongBigList
, which occupies exponentially
less space than the graph itself (unless your graph is pathologically sparse). There is of course
a cost involved in accessing the list with respect to accessing an array of longs.
Note that by default the EliasFanoMonotoneLongBigList
instance is created from scratch
using the file of offsets. This is a long and tedious process, in particular with large graphs.
The main method of this class has an option that will generate such a list once for all and
serialise it in a file with extension .obl
. The list will be quickly deserialised if
its modification date is later than that of the offset file.
Not Loading the Graph File at All
For some applications (such as transposing a graph) it is not necessary to load the graph file in
memory. Since this class is able to enumerate the links of a graph without using random access,
it is possible not to load in memory any information at all, and obtain iterators that directly
read from the graph file. To obtain this effect, you must call
loadOffline(CharSequence)
.
Memory–Mapping a Graph
Another interesting alternative is memory mapping. When using
loadMapped(CharSequence)
, the graph will be mapped into memory, and the offsets
loaded. The graph will provide random access and behave as if it was loaded into memory, but of
course the access will be slower.
 See Also:
 Serialized Form

Nested Class Summary
Nested classes/interfaces inherited from class it.unimi.dsi.big.webgraph.ImmutableGraph
ImmutableGraph.LoadMethod

Field Summary
Fields Modifier and Type Field Description protected CharSequence
basename
The basename of the graph.static int
BLOCK_COUNT_DELTA
Flag: write block counts using δ coding.static int
BLOCK_COUNT_GAMMA
Flag: write block counts using γ coding (default).static int
BLOCK_COUNT_UNARY
Flag: write block counts using unary coding.protected int
blockCoding
The coding for copyblock lists.protected int
blockCountCoding
The coding for block counts.static int
BLOCKS_DELTA
Flag: write copyblock lists using δ coding.static int
BLOCKS_GAMMA
Flag: write copyblock lists using γ coding (default).static int
BVGRAPH_VERSION
This number classifies the present graph format.protected long
cachedNode
If notLong.MIN_VALUE
, the node whose degree is cached incachedOutdegree
.protected int
cachedOutdegree
IfcachedNode
is notLong.MIN_VALUE
, its cached outdegree.protected long
cachedPointer
IfcachedNode
is notLong.MIN_VALUE
, the position immediately after the coding of the outdegree ofcachedNode
.static int
DEFAULT_MAX_REF_COUNT
Default backward reference maximum length.static int
DEFAULT_MIN_INTERVAL_LENGTH
Default minimum interval length.static int
DEFAULT_WINDOW_SIZE
Default window size.static int
DEFAULT_ZETA_K
Default value of k.static String
GRAPH_EXTENSION
The standard extension for the graph bit stream.protected byte[]
graphMemory
The byte array storing the compressed graph, ifisMemory
is true andoffsetType
is not 1.protected FastMultiByteArrayInputStream
graphStream
The multibyte array input stream storing the compressed graph, ifisMemory
is false,isMapped
is false andoffsetType
is not 1.protected static int
INITIAL_SUCCESSOR_LIST_LENGTH
The initial length of an array that will contain a successor list.protected boolean
isMapped
WhenoffsetType
is not 1, whether this graph is directly loaded intographMemory
, or rather memorymapped.protected boolean
isMemory
WhenoffsetType
is not 1, whether this graph is directly loaded intographMemory
, or rather wrapped in aFastMultiByteArrayInputStream
specified bygraphStream
.protected long
m
The number of arcs of the graph.protected ByteBufferInputStream
mappedGraphStream
The memorymapped input stream storing the compressed graph, ifisMapped
is true.protected int
maxRefCount
The maximum reference count.protected int
minIntervalLength
The minimum interval length.protected long
n
The number of nodes of the graph.static int
NO_INTERVALS
A special value forminIntervalLength
interpreted as meaning that the minimum interval length is infinity.static int
OFFLINE
The offset step parameter corresponding to offline load.protected int
offsetCoding
The coding for offsets.protected LongBigList
offsets
This variable isnull
iffoffsetType
is zero or less (implying that offsets have not been loaded).static String
OFFSETS_BIG_LIST_EXTENSION
The standard extension for the cachedLongBigList
containing the graph offsets.static int
OFFSETS_DELTA
Flag: write offsets using δ coding.static String
OFFSETS_EXTENSION
The standard extension for the graphoffsets bit stream.static int
OFFSETS_GAMMA
Flag: write offsets using γ coding (default).protected int
offsetType
The offset type: 2 is memorymapping, 1 is normal randomaccess loading, 0 means that we do not want to load offsets at all, 1 that the we do not want even load the graph file.protected int
outdegreeCoding
The coding for outdegrees.static int
OUTDEGREES_DELTA
Flag: write outdegrees using δ coding.static String
OUTDEGREES_EXTENSION
The standard extension for the stream of node outdegrees.static int
OUTDEGREES_GAMMA
Flag: write outdegrees using γ coding (default).protected int
referenceCoding
The coding for references.static int
REFERENCES_DELTA
Flag: write references using δ coding.static int
REFERENCES_GAMMA
Flag: write references using γ coding.static int
REFERENCES_UNARY
Flag: write references using unary coding (default).protected int
residualCoding
The coding for residuals.static int
RESIDUALS_DELTA
Flag: write residuals using δ coding.static int
RESIDUALS_GAMMA
Flag: write residuals using γ coding.static int
RESIDUALS_GOLOMB
Flag: write residuals using Golomb coding.static int
RESIDUALS_NIBBLE
Flag: write residuals using variablelength nibble coding.static int
RESIDUALS_ZETA
Flag: write residuals using ζ_{k} coding (default).static int
SEQUENTIAL
The offset step parameter corresponding to sequential load.protected int
windowSize
The window size.protected int
zetaK
The value of k for ζ_{k} coding (for residuals).Fields inherited from class it.unimi.dsi.big.webgraph.ImmutableGraph
GRAPHCLASS_PROPERTY_KEY, NUMBER_OF_THREADS_PROPERTY, PROPERTIES_EXTENSION
Fields inherited from interface it.unimi.dsi.big.webgraph.CompressionFlags
CODING_NAME, DELTA, GAMMA, GOLOMB, NIBBLE, SKEWED_GOLOMB, UNARY, ZETA

Constructor Summary
Constructors Modifier Constructor Description protected
BVGraph()

Method Summary
Modifier and Type Method Description CharSequence
basename()
Returns a symbolic basename for this graph (optional operation).BVGraph
copy()
Returns a flyweight copy of this immutable graph.protected static int
intervalize(LongArrayList x, int minInterval, LongArrayList left, LongArrayList len, LongArrayList residuals)
This method tries to express an increasing sequence of natural numbersx
as a union of an increasing sequence of intervals and an increasing sequence of residual elements.static BVGraph
load(CharSequence basename)
Creates a newBVGraph
by loading a compressed graph file from disk to memory, with no progress logger and all offsets.static BVGraph
load(CharSequence basename, int offsetType)
Creates a newBVGraph
by loading a compressed graph file from disk to memory, with no progress logger.static BVGraph
load(CharSequence basename, int offsetType, ProgressLogger pl)
Creates a newBVGraph
by loading a compressed graph file from disk to memory.static BVGraph
load(CharSequence basename, ProgressLogger pl)
Creates a newBVGraph
by loading a compressed graph file from disk to memory, with all offsets.protected BVGraph
loadInternal(CharSequence basename, int offsetType, ProgressLogger pl)
Loads a compressed graph file from disk into this graph.static BVGraph
loadMapped(CharSequence basename)
Creates a newBVGraph
by memorymapping a graph file.static BVGraph
loadMapped(CharSequence basename, ProgressLogger pl)
Creates a newBVGraph
by memorymapping a graph file.static BVGraph
loadOffline(CharSequence basename)
Creates a newBVGraph
by loading just the metadata of a compressed graph file.static BVGraph
loadOffline(CharSequence basename, ProgressLogger pl)
Creates a newBVGraph
by loading just the metadata of a compressed graph file.static BVGraph
loadSequential(CharSequence basename)
Deprecated.static BVGraph
loadSequential(CharSequence basename, ProgressLogger pl)
Deprecated.static void
main(String[] args)
Reads an immutable graph and stores it as aBVGraph
.int
maxRefCount()
Returns the maximum reference count of this graph.NodeIterator
nodeIterator(long from)
This method returns a node iterator for scanning the graph sequentially, starting from the given node.long
numArcs()
Returns the number of arcs of this graph (optional operation).long
numNodes()
Returns the number of nodes of this graph.long
outdegree(long x)
Returns the outdegree of a node.boolean
randomAccess()
Checks whether this graph provides random access to successor lists.protected int
readBlock(InputBitStream ibs)
Reads a block from the given stream.protected int
readBlockCount(InputBitStream ibs)
Reads a block count from the given stream.protected long
readOffset(InputBitStream ibs)
Reads an offset difference from the given stream.protected int
readOutdegree(InputBitStream ibs)
Reads an outdegree from the given stream.protected int
readOutdegree(InputBitStream ibs, long offset)
Reads an outdegree from the given stream at a given offset.protected int
readReference(InputBitStream ibs)
Reads a reference from the given stream.protected long
readResidual(InputBitStream ibs)
Reads a residual from the given stream.static void
store(ImmutableGraph graph, CharSequence basename)
Writes the given graph using a given base name, without any progress logger and with all parameters set to their default values.static void
store(ImmutableGraph graph, CharSequence basename, int windowSize, int maxRefCount, int minIntervalLength, int zetaK, int flags)
Writes the given graph using a given base name, without any progress logger.static void
store(ImmutableGraph graph, CharSequence basename, int windowSize, int maxRefCount, int minIntervalLength, int zetaK, int flags, int numberOfThreads, ProgressLogger pl)
Writes the given graph using a given base name.static void
store(ImmutableGraph graph, CharSequence basename, int windowSize, int maxRefCount, int minIntervalLength, int zetaK, int flags, ProgressLogger pl)
Writes the given graph using a given base name.static void
store(ImmutableGraph graph, CharSequence basename, ProgressLogger pl)
Writes the given graph using a given base name, with all parameters set to their default values.LazyLongIterator
successors(long x)
Returns an iterator over the successors of a given node.protected LazyLongIterator
successors(long x, InputBitStream ibs, long[][] window, int[] outd)
Given anInputBitStream
wrapping a graph file, returns an iterator over the successors of a given nodex
.protected static void
updateBins(long currNode, long[] list, int length, long[] bin)
Updates a list of exponential bins using the gaps a given list of strinctly increasing integers.int
windowSize()
Returns the window size of this graph.protected int
writeBlock(OutputBitStream obs, int block)
Writes a block to the given stream.protected int
writeBlockCount(OutputBitStream obs, int count)
Writes a block count to the given stream.protected int
writeOffset(OutputBitStream obs, long x)
Writes an offset difference to the given stream.void
writeOffsets(OutputBitStream obs, ProgressLogger pl)
Write the offset file to a given bit stream.protected int
writeOutdegree(OutputBitStream obs, int d)
Writes an outdegree to the given stream.protected int
writeReference(OutputBitStream obs, int ref)
Writes a reference to the given stream.protected long
writeResidual(OutputBitStream obs, long residual)
Writes a residual to the given stream.Methods inherited from class it.unimi.dsi.big.webgraph.ImmutableGraph
equals, hasCopiableIterators, hashCode, intNumNodes, load, loadOnce, nodeIterator, outdegrees, splitNodeIterators, store, store, successorBigArray, toString, wrap, wrap

Field Details

SEQUENTIAL
public static final int SEQUENTIALThe offset step parameter corresponding to sequential load. See Also:
 Constant Field Values

OFFLINE
public static final int OFFLINEThe offset step parameter corresponding to offline load. See Also:
 Constant Field Values

GRAPH_EXTENSION
The standard extension for the graph bit stream. See Also:
 Constant Field Values

OFFSETS_EXTENSION
The standard extension for the graphoffsets bit stream. See Also:
 Constant Field Values

OFFSETS_BIG_LIST_EXTENSION
The standard extension for the cachedLongBigList
containing the graph offsets. See Also:
 Constant Field Values

OUTDEGREES_EXTENSION
The standard extension for the stream of node outdegrees. See Also:
 Constant Field Values

BVGRAPH_VERSION
public static final int BVGRAPH_VERSIONThis number classifies the present graph format. When new features require introducing binary incompatibilities, this number is bumped so to ensure that old classes do not try to read graphs they cannot understand. See Also:
 Constant Field Values

INITIAL_SUCCESSOR_LIST_LENGTH
protected static final int INITIAL_SUCCESSOR_LIST_LENGTHThe initial length of an array that will contain a successor list. See Also:
 Constant Field Values

NO_INTERVALS
public static final int NO_INTERVALSA special value forminIntervalLength
interpreted as meaning that the minimum interval length is infinity. See Also:
 Constant Field Values

basename
The basename of the graph. This may benull
, but trying to load the graph with an offset step of 1 will cause an exception. 
n
protected long nThe number of nodes of the graph. 
m
protected long mThe number of arcs of the graph. 
isMemory
protected boolean isMemoryWhenoffsetType
is not 1, whether this graph is directly loaded intographMemory
, or rather wrapped in aFastMultiByteArrayInputStream
specified bygraphStream
. 
isMapped
protected boolean isMappedWhenoffsetType
is not 1, whether this graph is directly loaded intographMemory
, or rather memorymapped. 
graphMemory
protected byte[] graphMemoryThe byte array storing the compressed graph, ifisMemory
is true andoffsetType
is not 1.This variable is loaded with a copy of the graph file, or with a rearrangement of the latter, depending on whether
offsetType
is smaller than or equal to one. IfoffsetType
is 1, this variable isnull
, and node iterators are generated by opening streams directly on the graph file. 
graphStream
The multibyte array input stream storing the compressed graph, ifisMemory
is false,isMapped
is false andoffsetType
is not 1.It is loaded with a copy of the graph file. If
offsetType
is 1, this variable isnull
, and node iterators are generated by opening streams directly on the graph file. 
mappedGraphStream
The memorymapped input stream storing the compressed graph, ifisMapped
is true.It is loaded with a copy of the graph file. If
offsetType
is 1, this variable isnull
, and node iterators are generated by opening streams directly on the graph file. 
offsets
This variable isnull
iffoffsetType
is zero or less (implying that offsets have not been loaded). Otherwise, it is an Elias–Fano monotone list containing the pointers of the bit streams of one eachoffsetType
nodes. 
offsetType
protected int offsetTypeThe offset type: 2 is memorymapping, 1 is normal randomaccess loading, 0 means that we do not want to load offsets at all, 1 that the we do not want even load the graph file. 
cachedNode
protected transient long cachedNodeIf notLong.MIN_VALUE
, the node whose degree is cached incachedOutdegree
. 
cachedOutdegree
protected int cachedOutdegreeIfcachedNode
is notLong.MIN_VALUE
, its cached outdegree. 
cachedPointer
protected long cachedPointerIfcachedNode
is notLong.MIN_VALUE
, the position immediately after the coding of the outdegree ofcachedNode
. 
maxRefCount
protected int maxRefCountThe maximum reference count. 
DEFAULT_MAX_REF_COUNT
public static final int DEFAULT_MAX_REF_COUNTDefault backward reference maximum length. See Also:
 Constant Field Values

windowSize
protected int windowSizeThe window size. Zero means no references. 
DEFAULT_WINDOW_SIZE
public static final int DEFAULT_WINDOW_SIZEDefault window size. See Also:
 Constant Field Values

minIntervalLength
protected int minIntervalLengthThe minimum interval length. 
DEFAULT_MIN_INTERVAL_LENGTH
public static final int DEFAULT_MIN_INTERVAL_LENGTHDefault minimum interval length. See Also:
 Constant Field Values

zetaK
protected int zetaKThe value of k for ζ_{k} coding (for residuals). 
DEFAULT_ZETA_K
public static final int DEFAULT_ZETA_KDefault value of k. See Also:
 Constant Field Values

OUTDEGREES_GAMMA
public static final int OUTDEGREES_GAMMAFlag: write outdegrees using γ coding (default). See Also:
 Constant Field Values

OUTDEGREES_DELTA
public static final int OUTDEGREES_DELTAFlag: write outdegrees using δ coding. See Also:
 Constant Field Values

BLOCKS_GAMMA
public static final int BLOCKS_GAMMAFlag: write copyblock lists using γ coding (default). See Also:
 Constant Field Values

BLOCKS_DELTA
public static final int BLOCKS_DELTAFlag: write copyblock lists using δ coding. See Also:
 Constant Field Values

RESIDUALS_GAMMA
public static final int RESIDUALS_GAMMAFlag: write residuals using γ coding. See Also:
 Constant Field Values

RESIDUALS_ZETA
public static final int RESIDUALS_ZETAFlag: write residuals using ζ_{k} coding (default). See Also:
 Constant Field Values

RESIDUALS_DELTA
public static final int RESIDUALS_DELTAFlag: write residuals using δ coding. See Also:
 Constant Field Values

RESIDUALS_NIBBLE
public static final int RESIDUALS_NIBBLEFlag: write residuals using variablelength nibble coding. See Also:
 Constant Field Values

RESIDUALS_GOLOMB
public static final int RESIDUALS_GOLOMBFlag: write residuals using Golomb coding. See Also:
 Constant Field Values

REFERENCES_GAMMA
public static final int REFERENCES_GAMMAFlag: write references using γ coding. See Also:
 Constant Field Values

REFERENCES_DELTA
public static final int REFERENCES_DELTAFlag: write references using δ coding. See Also:
 Constant Field Values

REFERENCES_UNARY
public static final int REFERENCES_UNARYFlag: write references using unary coding (default). See Also:
 Constant Field Values

BLOCK_COUNT_GAMMA
public static final int BLOCK_COUNT_GAMMAFlag: write block counts using γ coding (default). See Also:
 Constant Field Values

BLOCK_COUNT_DELTA
public static final int BLOCK_COUNT_DELTAFlag: write block counts using δ coding. See Also:
 Constant Field Values

BLOCK_COUNT_UNARY
public static final int BLOCK_COUNT_UNARYFlag: write block counts using unary coding. See Also:
 Constant Field Values

OFFSETS_GAMMA
public static final int OFFSETS_GAMMAFlag: write offsets using γ coding (default). See Also:
 Constant Field Values

OFFSETS_DELTA
public static final int OFFSETS_DELTAFlag: write offsets using δ coding. See Also:
 Constant Field Values

outdegreeCoding
protected int outdegreeCodingThe coding for outdegrees. By default, we use γ coding. 
blockCoding
protected int blockCodingThe coding for copyblock lists. By default, we use γ coding. 
residualCoding
protected int residualCodingThe coding for residuals. By default, we use ζ coding. 
referenceCoding
protected int referenceCodingThe coding for references. By default, we use unary coding. 
blockCountCoding
protected int blockCountCodingThe coding for block counts. By default, we use γ coding. 
offsetCoding
protected int offsetCodingThe coding for offsets. By default, we use γ coding.


Constructor Details

BVGraph
protected BVGraph()


Method Details

copy
Description copied from class:ImmutableGraph
Returns a flyweight copy of this immutable graph. Specified by:
copy
in interfaceFlyweightPrototype<ImmutableGraph>
 Specified by:
copy
in classImmutableGraph
 Returns:
 a flyweight copy of this immutable graph.
 See Also:
FlyweightPrototype

numNodes
public long numNodes()Description copied from class:ImmutableGraph
Returns the number of nodes of this graph.Albeit this method is not optional, it is allowed that this method throws an
UnsupportedOperationException
if this graph has never been entirely traversed using anode iterator
. This apparently bizarre behaviour is necessary to support implementations asArcListASCIIGraph
, which do not know the actual number of nodes until a traversal has been completed. Specified by:
numNodes
in classImmutableGraph
 Returns:
 the number of nodes.

numArcs
public long numArcs()Description copied from class:ImmutableGraph
Returns the number of arcs of this graph (optional operation). Overrides:
numArcs
in classImmutableGraph
 Returns:
 the number of arcs.

randomAccess
public boolean randomAccess()Description copied from class:ImmutableGraph
Checks whether this graph provides random access to successor lists. Specified by:
randomAccess
in classImmutableGraph
 Returns:
 true if this graph provides random access to successor lists.

basename
Description copied from class:ImmutableGraph
Returns a symbolic basename for this graph (optional operation).Implementors of this class may provide a basename (usually a pathname from which various files storing the graph are stemmed). This method is optional because it is sometimes unmeaningful (e.g., for oneoff anonymous classes).
 Overrides:
basename
in classImmutableGraph
 Returns:
 the basename.

maxRefCount
public int maxRefCount()Returns the maximum reference count of this graph. Returns:
 the maximum reference count.

windowSize
public int windowSize()Returns the window size of this graph. Returns:
 the window size.

readOffset
Reads an offset difference from the given stream. Parameters:
ibs
 an offsetfile input bit stream. Returns:
 the next offset difference.
 Throws:
IOException

writeOffset
Writes an offset difference to the given stream. Parameters:
obs
 an offsetfile output bit stream.x
 an offset difference to be stored in the stream. Returns:
 the number of bits written.
 Throws:
IOException

readOutdegree
Reads an outdegree from the given stream. Parameters:
ibs
 a graphfile input bit stream. Returns:
 the next outdegree.
 Throws:
IOException

readOutdegree
Reads an outdegree from the given stream at a given offset. Parameters:
ibs
 a graphfile input bit stream.offset
 the offset at which the stream must be positioned. Returns:
 the next outdegree.
 Throws:
IOException

writeOutdegree
Writes an outdegree to the given stream. Parameters:
obs
 a graphfile output bit stream.d
 an outdegree to be stored in the stream. Returns:
 the number of bits written.
 Throws:
IOException

readReference
Reads a reference from the given stream. Parameters:
ibs
 a graphfile input bit stream. Returns:
 the next reference.
 Throws:
IOException

writeReference
Writes a reference to the given stream. Parameters:
obs
 a graphfile output bit stream.ref
 the reference. Returns:
 the number of bits written.
 Throws:
IOException

readBlockCount
Reads a block count from the given stream. Parameters:
ibs
 a graphfile input bit stream. Returns:
 the next block count.
 Throws:
IOException

writeBlockCount
Writes a block count to the given stream. Parameters:
obs
 a graphfile output bit stream.count
 the block count. Returns:
 the number of written bits.
 Throws:
IOException

readBlock
Reads a block from the given stream. Parameters:
ibs
 a graphfile input bit stream. Returns:
 the next block.
 Throws:
IOException

writeBlock
Writes a block to the given stream. Parameters:
obs
 a graphfile output bit stream.block
 the block. Returns:
 the number of written bits.
 Throws:
IOException

readResidual
Reads a residual from the given stream. Parameters:
ibs
 a graphfile input bit stream. Returns:
 the next residual.
 Throws:
IOException

writeResidual
Writes a residual to the given stream. Parameters:
obs
 a graphfile output bit stream.residual
 the residual. Returns:
 the number of written bits.
 Throws:
IOException

outdegree
Description copied from class:ImmutableGraph
Returns the outdegree of a node. Specified by:
outdegree
in classImmutableGraph
 Parameters:
x
 a node. Returns:
 the outdegree of the given node.
 Throws:
IllegalStateException
 if called without offsets.

successors
Returns an iterator over the successors of a given node. Overrides:
successors
in classImmutableGraph
 Parameters:
x
 a node. Returns:
 an iterator over the successors of the node.

successors
protected LazyLongIterator successors(long x, InputBitStream ibs, long[][] window, int[] outd) throws IllegalStateExceptionGiven anInputBitStream
wrapping a graph file, returns an iterator over the successors of a given nodex
.This method can be used in two different ways:
 by providing a node and an input bit stream wrapping a graph file, it is possible to access the successor list of the node (provided that offsets have been loaded);
 by providing additional data, which essentially are used to keep some state about the graph, it is possible to perform an efficient sequential visit of all successor lists (even when no offsets were loaded).
This method may modify the offset and the outdegree caches if
window
isnull
. Parameters:
x
 a node.ibs
 an input bit stream wrapping a graph file. After this method returns, the state ofibs
is undefined: however, after the iterator returned is exhausted,ibs
will positioned just after the successor list ofx
.window
 eithernull
, or a double array with the following meaning:window[(xi) mod windowSize]
contains, for alli
between 1 (inclusive) andwindowSize
(exclusive), the list of successors of nodex
−i
. Ifwindow
is notnull
thenibs
must be positioned before the successor list ofx
. This parameter will not be modified.outd
 ifwindow
is notnull
, this is an array with as many elements aswindowSize
;outd[(xi) mod windowSize]
contains the outdegree of nodex
−i
fori
greater than 0; at the end, this will be true also fori
equal to 0. Returns:
 an iterator over the successors of
x
.  Throws:
IllegalStateException
 ifwindow
isnull
andoffsetType
is 0.

nodeIterator
This method returns a node iterator for scanning the graph sequentially, starting from the given node. It keeps track of a sliding window ofwindowSize()
previous successor lists to speed up the iteration of graphs with significant referentiation. Overrides:
nodeIterator
in classImmutableGraph
 Parameters:
from
 the node from which the iterator will iterate. Returns:
 a
NodeIterator
for accessing nodes and successors sequentially.

load
public static BVGraph load(CharSequence basename, int offsetType, ProgressLogger pl) throws IOExceptionCreates a newBVGraph
by loading a compressed graph file from disk to memory. Parameters:
basename
 the basename of the graph.offsetType
 the desired offset type (2 is memory mapping, 1 is normal randomaccess loading, 0 means that we do not want to load offsets at all, 1 that the we do not want even load the graph file).pl
 a progress logger used while loading the graph, ornull
. Returns:
 a
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while reading the graph.

load
Creates a newBVGraph
by loading a compressed graph file from disk to memory, with no progress logger. Parameters:
basename
 the basename of the graph.offsetType
 the desired offset type (2 is memory mapping, 1 is normal randomaccess loading, 0 means that we do not want to load offsets at all, 1 that the we do not want even load the graph file). Returns:
 a
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while reading the graph.

load
Creates a newBVGraph
by loading a compressed graph file from disk to memory, with no progress logger and all offsets. Parameters:
basename
 the basename of the graph. Returns:
 a
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while reading the graph.

load
Creates a newBVGraph
by loading a compressed graph file from disk to memory, with all offsets. Parameters:
basename
 the basename of the graph.pl
 a progress logger used while loading the graph, ornull
. Returns:
 a
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while reading the graph.

loadMapped
Creates a newBVGraph
by memorymapping a graph file. Parameters:
basename
 the basename of the graph.pl
 a progress logger used while loading the offsets, ornull
. Returns:
 an
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while memorymapping the graph or reading the offsets.

loadMapped
Creates a newBVGraph
by memorymapping a graph file. Parameters:
basename
 the basename of the graph. Returns:
 an
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while memorymapping the graph or reading the offsets.

loadSequential
@Deprecated public static BVGraph loadSequential(CharSequence basename, ProgressLogger pl) throws IOExceptionDeprecated.Creates a newBVGraph
by loading a compressed graph file from disk to memory, without offsets. Parameters:
basename
 the basename of the graph.pl
 a progress logger used while loading the graph, ornull
. Returns:
 a
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while reading the graph.

loadSequential
Deprecated.UseloadOffline(CharSequence)
orloadMapped(CharSequence)
instead.Creates a newBVGraph
by loading a compressed graph file from disk to memory, with no progress logger and without offsets. Parameters:
basename
 the basename of the graph. Returns:
 a
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while reading the graph.

loadOffline
Creates a newBVGraph
by loading just the metadata of a compressed graph file. Parameters:
basename
 the basename of the graph.pl
 a progress logger, ornull
. Returns:
 a
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while reading the metadata.

loadOffline
Creates a newBVGraph
by loading just the metadata of a compressed graph file. Parameters:
basename
 the basename of the graph. Returns:
 a
BVGraph
containing the specified graph.  Throws:
IOException
 if an I/O exception occurs while reading the metadata.

loadInternal
protected BVGraph loadInternal(CharSequence basename, int offsetType, ProgressLogger pl) throws IOExceptionLoads a compressed graph file from disk into this graph. Note that this method should be called only on a newly created graph. Parameters:
basename
 the basename of the graph.offsetType
 the desired offset type (2 is memorymapping, 1 is normal randomaccess loading, 0 means that we do not want to load offsets at all, 1 that the we do not want even load the graph file).pl
 a progress logger used while loading the graph, ornull
. Returns:
 this graph.
 Throws:
IOException
 if an I/O exception occurs while reading the graph.

intervalize
protected static int intervalize(LongArrayList x, int minInterval, LongArrayList left, LongArrayList len, LongArrayList residuals)This method tries to express an increasing sequence of natural numbersx
as a union of an increasing sequence of intervals and an increasing sequence of residual elements. More precisely, this intervalization works as follows: first, one looks atx
as a sequence of intervals (i.e., maximal sequences of consecutive elements); those intervals whose length is ≥minInterval
are stored in the listsleft
(the list of left extremes) andlen
(the list of lengths; the length of an integer interval is the number of integers in that interval). The remaining integers, called residuals are stored in theresidual
list.Note that the previous content of
left
,len
andresidual
is lost. Parameters:
x
 the list to be intervalized (an increasing list of natural numbers).minInterval
 the least length that a maximal sequence of consecutive elements must have in order for it to be considered as an interval.left
 the resulting list of left extremes of the intervals.len
 the resulting list of interval lengths.residuals
 the resulting list of residuals. Returns:
 the number of intervals.

store
public static void store(ImmutableGraph graph, CharSequence basename, int windowSize, int maxRefCount, int minIntervalLength, int zetaK, int flags, int numberOfThreads, ProgressLogger pl) throws IOExceptionWrites the given graph using a given base name. Parameters:
graph
 a graph to be compressed.basename
 a base name.windowSize
 the window size (1 for the default value).maxRefCount
 the maximum reference count (1 for the default value).minIntervalLength
 the minimum interval length (1 for the default value,NO_INTERVALS
to disable).zetaK
 the parameter used for residual ζcoding, if used (1 for the default value).flags
 the flag mask.numberOfThreads
 the number of threads to use; if 0 or negative, it will be replaced byRuntime.availableProcessors()
. Note that ifImmutableGraph.numNodes()
is not implemented bygraph
, the number of threads will be automatically set to one, possibly logging a warning.pl
 a progress logger to log the state of compression, ornull
if no logging is required. Throws:
IOException
 if some exception is raised while writing the graph.

store
public static void store(ImmutableGraph graph, CharSequence basename, int windowSize, int maxRefCount, int minIntervalLength, int zetaK, int flags, ProgressLogger pl) throws IOExceptionWrites the given graph using a given base name. Parameters:
graph
 a graph to be compressed.basename
 a base name.windowSize
 the window size (1 for the default value).maxRefCount
 the maximum reference count (1 for the default value).minIntervalLength
 the minimum interval length (1 for the default value,NO_INTERVALS
to disable).zetaK
 the parameter used for residual ζcoding, if used (1 for the default value).flags
 the flag mask.pl
 a progress logger to log the state of compression, ornull
if no logging is required. Throws:
IOException
 if some exception is raised while writing the graph.

store
public static void store(ImmutableGraph graph, CharSequence basename, int windowSize, int maxRefCount, int minIntervalLength, int zetaK, int flags) throws IOExceptionWrites the given graph using a given base name, without any progress logger. Parameters:
graph
 a graph to be compressed.basename
 a base name.windowSize
 the window size (1 for the default value).maxRefCount
 the maximum reference count (1 for the default value).minIntervalLength
 the minimum interval length (1 for the default value,NO_INTERVALS
to disable).zetaK
 the parameter used for residual ζcoding, if used (1 for the default value).flags
 the flag mask. Throws:
IOException
 if some exception is raised while writing the graph.

store
public static void store(ImmutableGraph graph, CharSequence basename, ProgressLogger pl) throws IOExceptionWrites the given graph using a given base name, with all parameters set to their default values. Parameters:
graph
 a graph to be compressed.basename
 a base name.pl
 a progress logger to log the state of compression, ornull
if no logging is required. Throws:
IOException
 if some exception is raised while writing the graph.

store
Writes the given graph using a given base name, without any progress logger and with all parameters set to their default values. Parameters:
graph
 a graph to be compressed.basename
 a base name. Throws:
IOException
 if some exception is raised while writing the graph.

updateBins
protected static void updateBins(long currNode, long[] list, int length, long[] bin)Updates a list of exponential bins using the gaps a given list of strinctly increasing integers. Parameters:
currNode
 the current node.list
 a strictly increasing list of integers.length
 the number of valid elements inlist
.bin
 the bins.

writeOffsets
Write the offset file to a given bit stream. Parameters:
obs
 the output bit stream to which offsets will be written.pl
 a progress logger, ornull
. Throws:
IOException

main
public static void main(String[] args) throws SecurityException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, IOException, JSAPException, ClassNotFoundException, InstantiationExceptionReads an immutable graph and stores it as aBVGraph
.

loadOffline(CharSequence)
orloadMapped(CharSequence)
instead.