Class ApproximateNeighbourhoodFunctions

java.lang.Object
it.unimi.dsi.webgraph.algo.ApproximateNeighbourhoodFunctions

public class ApproximateNeighbourhoodFunctions
extends Object
Static methods and objects that manipulate approximate neighbourhood functions.

A number of statistics that can be used with Jackknife are available, such as CDF, AVERAGE_DISTANCE, HARMONIC_DIAMETER and SPID.

  • Field Details

    • SPID

      public static Jackknife.Statistic SPID
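      A statistic that computes the spid (shortest-paths index of dispersion), that is, the variance-to-mean ratio of the distance distribution.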
    • AVERAGE_DISTANCE

      public static Jackknife.Statistic AVERAGE_DISTANCE
      A statistic that computes the average distance.
    • HARMONIC_DIAMETER

      public static Jackknife.Statistic HARMONIC_DIAMETER
      A statistic that computes the harmonic diameter.
    • EFFECTIVE_DIAMETER

      public static Jackknife.Statistic EFFECTIVE_DIAMETER
      A statistic that computes the effective diameter.
    • CDF

      public static Jackknife.Statistic CDF
      A statistic that divides all values of a sample (an approximate neighbourhood function) by the last value. Useful for moving from neighbourhood functions to cumulative distribution functions.
    • PMF

      public static Jackknife.Statistic PMF
      A statistic that computes the differences between consecutive elements of a sample (an approximate neighbourhood function) and divides them by the last value. Useful for moving from neighbourhood functions or cumulative distribution functions to probability mass functions.
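The CDF and PMF transformations described above can be illustrated on plain arrays. The following is a minimal sketch, not the library's implementation: the class `NeighbourhoodFunctionSketch` and its methods are hypothetical names, and treating the first PMF entry as the first value divided by the last value (a difference from an implicit zero) is an assumption.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of WebGraph: plain-array versions of the two transformations. */
final class NeighbourhoodFunctionSketch {
    /** Divides every value of a neighbourhood function by its last value. */
    static double[] cdf(double[] nf) {
        final double last = nf[nf.length - 1];
        final double[] result = new double[nf.length];
        for (int i = 0; i < nf.length; i++) result[i] = nf[i] / last;
        return result;
    }

    /** Differences between consecutive values, each divided by the last value. */
    static double[] pmf(double[] nf) {
        final double last = nf[nf.length - 1];
        final double[] result = new double[nf.length];
        // Assumption: the first entry is taken as a difference from zero.
        result[0] = nf[0] / last;
        for (int i = 1; i < nf.length; i++) result[i] = (nf[i] - nf[i - 1]) / last;
        return result;
    }

    public static void main(String[] args) {
        // A made-up neighbourhood function used only for illustration.
        final double[] nf = { 4, 10, 16, 16 };
        System.out.println(Arrays.toString(cdf(nf))); // [0.25, 0.625, 1.0, 1.0]
        System.out.println(Arrays.toString(pmf(nf))); // [0.25, 0.375, 0.375, 0.0]
    }
}
```

Because both transformations divide by the last value, applying `pmf` to a neighbourhood function or to the corresponding cumulative distribution function (whose last value is 1) gives the same result.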
  • Method Details

    • combine

      public static double[] combine(Iterable<double[]> anf)
      Combines several approximate neighbourhood functions for the same graph by averaging their values.

      Note that the standard deviation of the resulting approximate neighbourhood function is reduced by a factor equal to the square root of the number of samples (giving the standard error). However, if the cumulative distribution function has to be computed instead, calling this method and dividing all values by the last value is not the best approach, as it leads to a biased estimate. Rather, the samples should be combined using the jackknife and the CDF statistic.

      If you want to obtain estimates on the standard error of each data point, please consider using the jackknife with the identity statistic instead of this method.

      Parameters:
      anf - an iterable object returning arrays of doubles representing approximate neighbourhood functions.
      Returns:
      a combined approximate neighbourhood function.
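The averaging performed by combine can be pictured with a short stand-alone sketch. The class `CombineSketch` and its `average` method are hypothetical and do not use the library. The sketch assumes that every array is non-empty and that arrays of different lengths are treated as if extended with their last value; the documentation above does not specify how differing lengths are handled.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of WebGraph: pointwise average of several samples. */
final class CombineSketch {
    static double[] average(Iterable<double[]> anf) {
        int length = 0, n = 0;
        for (double[] a : anf) {
            length = Math.max(length, a.length);
            n++;
        }
        if (n == 0) throw new IllegalArgumentException("No samples");
        final double[] result = new double[length];
        for (double[] a : anf)
            // Assumption: past its end, a sample keeps its last value.
            for (int i = 0; i < length; i++) result[i] += a[Math.min(i, a.length - 1)];
        for (int i = 0; i < length; i++) result[i] /= n;
        return result;
    }

    public static void main(String[] args) {
        // Two made-up samples of different lengths.
        final List<double[]> samples = List.of(new double[] { 4, 10, 16 }, new double[] { 6, 12, 14, 18 });
        System.out.println(Arrays.toString(average(samples))); // [5.0, 11.0, 15.0, 17.0]
    }
}
```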
    • evenOut

      public static ObjectList<double[]> evenOut(Iterable<double[]> anf)
      Evens out several approximate neighbourhood functions for the same graph by extending them to the same length (by copying the last value). This is usually a preparatory step for the jackknife.
      Parameters:
      anf - an iterable object returning arrays of doubles representing approximate neighbourhood functions.
      Returns:
      a list containing the same approximate neighbourhood functions, extended to the same length.
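To make the role of evenOut as a preparatory step more concrete, here is a minimal sketch in plain Java. All names (`EvenOutSketch`, `evenOut`, `jackknife`) are hypothetical and independent of the library. The `jackknife` method is the textbook delete-one jackknife applied to the mean of the samples; it is not claimed to match the behaviour of the Jackknife class. It requires at least two samples of equal length, which is what evening out provides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical stand-alone sketch; none of these names belong to WebGraph. */
final class EvenOutSketch {
    /** Extends every (non-empty) array to the maximum length by repeating its last value. */
    static List<double[]> evenOut(Iterable<double[]> anf) {
        int length = 0;
        for (double[] a : anf) length = Math.max(length, a.length);
        final List<double[]> result = new ArrayList<>();
        for (double[] a : anf) {
            final double[] extended = Arrays.copyOf(a, length); // pads with zeros
            Arrays.fill(extended, a.length, length, a[a.length - 1]); // overwrite the padding
            result.add(extended);
        }
        return result;
    }

    /** Textbook delete-one jackknife estimate of statistic(mean of samples). */
    static double[] jackknife(List<double[]> samples, UnaryOperator<double[]> statistic) {
        final int n = samples.size(), l = samples.get(0).length;
        if (n < 2) throw new IllegalArgumentException("At least two samples are needed");
        final double[] sum = new double[l];
        for (double[] s : samples) for (int i = 0; i < l; i++) sum[i] += s[i];

        // Statistic evaluated on the mean of all samples.
        final double[] mean = new double[l];
        for (int i = 0; i < l; i++) mean[i] = sum[i] / n;
        final double[] full = statistic.apply(mean);

        // Average of the statistic over the n leave-one-out means.
        final double[] looAverage = new double[full.length];
        for (double[] s : samples) {
            final double[] looMean = new double[l];
            for (int i = 0; i < l; i++) looMean[i] = (sum[i] - s[i]) / (n - 1);
            final double[] t = statistic.apply(looMean);
            for (int i = 0; i < t.length; i++) looAverage[i] += t[i] / n;
        }

        // Bias-corrected estimate: n * full - (n - 1) * leave-one-out average.
        final double[] estimate = new double[full.length];
        for (int i = 0; i < full.length; i++) estimate[i] = n * full[i] - (n - 1) * looAverage[i];
        return estimate;
    }
}
```

With the identity statistic (`x -> x`) the estimate coincides, up to rounding, with the plain average of the evened-out samples; a nonlinear statistic such as division by the last value is where the bias correction makes a difference.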