band_hic_matrix Class#

The band_hic_matrix class is the core data structure in the BandHiC package, designed for efficient storage and manipulation of Hi-C contact matrices using a banded representation. It supports various operations similar to those in NumPy, while optimizing memory usage and performance.

Class Overview and Initialization#

class bandhic.band_hic_matrix(contacts, diag_num=1, mask_row_col=None, mask=None, dtype=None, default_value=0, band_data_input=False)[source]#

Symmetric banded matrix stored in upper-triangular format. This storage format is motivated by high-resolution Hi-C data characteristics:

  1. Symmetry of contact maps.

  2. Interaction frequency concentrated near the diagonal; long-range contacts are sparse (mostly zero).

  3. Contact frequency decays sharply with genomic distance.

By storing only the main and a fixed number of super-diagonals as columns of a band matrix (diagonal-major storage: diagonal k stored in column k), we drastically reduce memory usage while enabling random access to Hi-C contacts. Additionally, mask and mask_row_col arrays track invalid or masked contacts to support downstream analysis.

Operations on this band_hic_matrix are as simple as on a numpy.ndarray; users can ignore these storage details.

This class stores only the main diagonal and up to (diag_num - 1) super-diagonals, exploiting symmetry by mirroring values for lower-triangular access.

Parameters:
  • contacts (coo_array | coo_matrix | tuple | ndarray)

  • diag_num (int)

  • mask_row_col (ndarray | None)

  • mask (Tuple[ndarray, ndarray] | None)

  • dtype (type | None)

  • default_value (int | float)

  • band_data_input (bool)

shape#

Shape of the original full Hi-C contact matrix (bin_num, bin_num), regardless of internal band storage format.

Type:

tuple of int

dtype#

Data type of the matrix elements, compatible with numpy dtypes.

Type:

data-type

diag_num#

Number of diagonals stored.

Type:

int

bin_num#

Number of bins (rows/columns) of the Hi-C matrix.

Type:

int

data#

Array of shape (bin_num, diag_num) storing banded Hi-C data.

Type:

ndarray

mask#

Mask for individual invalid entries. Stored as a boolean ndarray of shape (bin_num, diag_num) with the same shape as data.

Type:

ndarray of bool or None

mask_row_col#

Mask for entire rows and corresponding columns, indicating invalid bins. Stored as a boolean ndarray of shape (bin_num,). For computational convenience, row/column masks are also applied to the mask array to track masked entries.

Type:

ndarray of bool or None

default_value#

Default value for out-of-band entries. Entries out of the banded region and not stored in the data array will be set to this value.

Type:

scalar

Examples

>>> import bandhic as bh
>>> import numpy as np
>>> mat = bh.band_hic_matrix(np.eye(4), diag_num=2)
>>> mat.shape
(4, 4)
__init__(contacts, diag_num=1, mask_row_col=None, mask=None, dtype=None, default_value=0, band_data_input=False)[source]#

Initialize a band_hic_matrix instance.

Parameters:
  • contacts ({coo_array, coo_matrix, tuple, ndarray}) – Input Hi-C data in COO format, tuple (data, (row, col)), full square array, or banded stored ndarray. For non-symmetric full arrays, only the upper-triangular part is used and the matrix is symmetrized. Full square arrays are not recommended for large matrices due to memory constraints.

  • diag_num (int, optional) – Number of diagonals to store. Must be >=1 and <= matrix dimension. Default is 1.

  • mask_row_col (ndarray of bool or indices, optional) –

    Mask for invalid rows/columns. Can be specified as:

    • A boolean array of shape (bin_num,) indicating which rows/columns to mask.

    • A list of indices to mask.

    Defaults to None (no masking).

  • mask (ndarray pair of (row_indices, col_indices), optional) –

    Mask for invalid matrix entries. Can be specified as:

    • A tuple of two ndarray (row_indices, col_indices) listing positions to mask.

    Defaults to None (no masking).

  • dtype (data-type, optional) – Desired numpy dtype; defaults to ‘contacts’ data dtype; compatible with numpy dtypes.

  • default_value (scalar, optional) – Default value for unstored out-of-band entries. Default is 0.

  • band_data_input (bool, optional) – If True, contacts is treated as precomputed band storage. Default is False.

Raises:

ValueError – If contacts type is invalid, diag_num out of range, or array shape invalid.

Return type:

None

Examples

Initialize from a SciPy COO matrix:

>>> import bandhic as bh
>>> import numpy as np
>>> from scipy.sparse import coo_matrix
>>> coo = coo_matrix(([1, 2, 3], ([0, 1, 2],[0, 1, 2])), shape=(3,3))
>>> mat1 = bh.band_hic_matrix(coo, diag_num=2)
>>> mat1.data.shape
(3, 2)

Initialize from a tuple (data, (row, col)):

>>> mat2 = bh.band_hic_matrix(([4, 5, 6], ([0, 1, 2],[2, 1, 0])), diag_num=1)
>>> mat2.data.shape
(3, 1)

Initialize from a full dense array, only upper-triangular part is stored, lower part is symmetrized:

>>> arr = np.arange(16).reshape(4,4)
>>> mat3 = bh.band_hic_matrix(arr, diag_num=3)
>>> mat3.data.shape
(4, 3)

Initialize with row/column mask, this masks entire rows and corresponding columns:

>>> mask = np.array([True, False, False, True])
>>> mat4 = bh.band_hic_matrix(arr, diag_num=2, mask_row_col=mask)
>>> mat4.mask_row_col
array([ True, False, False,  True])

mask_row_col is also supported as a list of indices:

>>> mat4 = bh.band_hic_matrix(arr, diag_num=2, mask_row_col=[0, 3])
>>> mat4.mask_row_col
array([ True, False, False,  True])

Initialize from precomputed banded storage:

>>> band = mat3.data.copy()
>>> mat5 = bh.band_hic_matrix(band, band_data_input=True)
>>> mat5.data.shape
(4, 3)

Masking Methods#

These methods allow for the application and manipulation of masks within the matrix. Masks can be used to exclude or highlight specific parts of the matrix during computations.

band_hic_matrix.init_mask()

Initialize mask for invalid entries based on matrix shape.

band_hic_matrix.add_mask(row_idx, col_idx)

Add mask entries for specified indices.

band_hic_matrix.get_mask()

Get current mask array.

band_hic_matrix.unmask([indices])

Remove mask entries for specified indices or clear all.

band_hic_matrix.drop_mask()

Clear the current mask by entry-level, but retain the row/column mask.

band_hic_matrix.add_mask_row_col(mask_row_col)

Mask entire rows and corresponding columns.

band_hic_matrix.get_mask_row_col()

Get current row/column mask.

band_hic_matrix.drop_mask_row_col()

Clear the current row/column mask.

band_hic_matrix.clear_mask()

Clear all masks (both entry-level and row/column-level).

band_hic_matrix.count_masked()

Count the number of masked entries in the banded matrix.

band_hic_matrix.count_unmasked()

Count the number of unmasked entries in the banded matrix.

band_hic_matrix.count_in_band_masked()

Count the number of masked entries in the in-band region.

band_hic_matrix.count_out_band_masked()

Count the number of masked entries in the out-of-band region.

band_hic_matrix.count_in_band()

Count the number of valid entries in the in-band region.

band_hic_matrix.count_out_band()

Count the number of valid entries in the out-of-band region.

Data Indexing and Modification#

The following methods provide functionality to access, modify, and index the matrix data. These are essential for manipulating individual elements or subsets of the matrix.

band_hic_matrix.__getitem__(index)

Retrieve matrix entries or submatrix using NumPy-like indexing.

band_hic_matrix.__setitem__(index, values)

Assign values to matrix entries using NumPy-like indexing.

band_hic_matrix.get_values(row_idx, col_idx)

Retrieve values considering mask.

band_hic_matrix.set_values(row_idx, col_idx, ...)

Set values at specified row and column indices.

band_hic_matrix.diag(k)

Retrieve the k-th diagonal from the matrix.

band_hic_matrix.set_diag(k, values)

Set values in the k-th diagonal of the matrix.

band_hic_matrix.extract_row(idx[, ...])

Extract stored, unmasked band values for a given row or column.

band_hic_matrix.__iter__()

Iterate over diagonals of the matrix.

band_hic_matrix.iterwindows(width[, step])

Iterate over the diagonals of the matrix with a specified window size.

band_hic_matrix.iterrows()

Iterate over the rows of the band_hic_matrix object.

band_hic_matrix.itercols()

Iterate over the columns of the band_hic_matrix object.

band_hic_matrix.filled([fill_value, copy])

Fill masked entries in data with default value.

band_hic_matrix.clip(min_val, max_val)

Clip data values to given range.

Data Reduction and Normalization#

These methods enable various data reduction operations, such as summing, averaging, or aggregating matrix values along specific axes or dimensions.

band_hic_matrix.min([axis])

Compute the minimum value in the matrix or along a given axis.

band_hic_matrix.max([axis])

Compute the maximum value in the matrix or along a given axis.

band_hic_matrix.sum([axis])

Compute the sum of the values in the matrix or along a given axis.

band_hic_matrix.mean([axis])

Compute the mean value of the matrix or along a given axis.

band_hic_matrix.std([axis])

Compute the standard deviation of the values in the matrix or along a given axis.

band_hic_matrix.var([axis])

Compute the variance of the values in the matrix or along a given axis.

band_hic_matrix.prod([axis])

Compute the product of the values in the matrix or along a given axis.

band_hic_matrix.ptp([axis])

Compute the peak-to-peak (maximum - minimum) value of the matrix or along a given axis.

band_hic_matrix.all([axis, banded_only])

Test whether all (or any) array elements along a given axis evaluate to True.

band_hic_matrix.any([axis, banded_only])

Test whether all (or any) array elements along a given axis evaluate to True.

band_hic_matrix.normalize([inplace])

Normalize each diagonal of the matrix to have zero mean and unit variance.

Vectorized Computation Methods#

These methods allow for the application of universal functions (ufuncs) to the matrix, enabling element-wise operations similar to those in NumPy.

band_hic_matrix.absolute(self, *args, **kwargs)

Perform element-wise 'absolute' operation.

band_hic_matrix.add(self, other, *args, **kwargs)

Perform element-wise 'add' operation with two inputs.

band_hic_matrix.arccos(self, *args, **kwargs)

Perform element-wise 'arccos' operation.

band_hic_matrix.arccosh(self, *args, **kwargs)

Perform element-wise 'arccosh' operation.

band_hic_matrix.arcsin(self, *args, **kwargs)

Perform element-wise 'arcsin' operation.

band_hic_matrix.arcsinh(self, *args, **kwargs)

Perform element-wise 'arcsinh' operation.

band_hic_matrix.arctan(self, *args, **kwargs)

Perform element-wise 'arctan' operation.

band_hic_matrix.arctan2(self, other, *args, ...)

Perform element-wise 'arctan2' operation with two inputs.

band_hic_matrix.arctanh(self, *args, **kwargs)

Perform element-wise 'arctanh' operation.

band_hic_matrix.bitwise_and(self, other, ...)

Perform element-wise 'bitwise_and' operation with two inputs.

band_hic_matrix.bitwise_or(self, other, ...)

Perform element-wise 'bitwise_or' operation with two inputs.

band_hic_matrix.bitwise_xor(self, other, ...)

Perform element-wise 'bitwise_xor' operation with two inputs.

band_hic_matrix.cbrt(self, *args, **kwargs)

Perform element-wise 'cbrt' operation.

band_hic_matrix.conj(self, *args, **kwargs)

Perform element-wise 'conj' operation.

band_hic_matrix.conjugate(self, *args, **kwargs)

Perform element-wise 'conjugate' operation.

band_hic_matrix.cos(self, *args, **kwargs)

Perform element-wise 'cos' operation.

band_hic_matrix.cosh(self, *args, **kwargs)

Perform element-wise 'cosh' operation.

band_hic_matrix.deg2rad(self, *args, **kwargs)

Perform element-wise 'deg2rad' operation.

band_hic_matrix.degrees(self, *args, **kwargs)

Perform element-wise 'degrees' operation.

band_hic_matrix.divide(self, other, *args, ...)

Perform element-wise 'divide' operation with two inputs.

band_hic_matrix.divmod(self, other, *args, ...)

Perform element-wise 'divmod' operation with two inputs.

band_hic_matrix.equal(self, other, *args, ...)

Perform element-wise 'equal' operation with two inputs.

band_hic_matrix.exp(self, *args, **kwargs)

Perform element-wise 'exp' operation.

band_hic_matrix.exp2(self, *args, **kwargs)

Perform element-wise 'exp2' operation.

band_hic_matrix.expm1(self, *args, **kwargs)

Perform element-wise 'expm1' operation.

band_hic_matrix.fabs(self, *args, **kwargs)

Perform element-wise 'fabs' operation.

band_hic_matrix.float_power(self, other, ...)

Perform element-wise 'float_power' operation with two inputs.

band_hic_matrix.floor_divide(self, other, ...)

Perform element-wise 'floor_divide' operation with two inputs.

band_hic_matrix.fmod(self, other, *args, ...)

Perform element-wise 'fmod' operation with two inputs.

band_hic_matrix.gcd(self, other, *args, **kwargs)

Perform element-wise 'gcd' operation with two inputs.

band_hic_matrix.greater(self, other, *args, ...)

Perform element-wise 'greater' operation with two inputs.

band_hic_matrix.greater_equal(self, other, ...)

Perform element-wise 'greater_equal' operation with two inputs.

band_hic_matrix.heaviside(self, other, ...)

Perform element-wise 'heaviside' operation with two inputs.

band_hic_matrix.hypot(self, other, *args, ...)

Perform element-wise 'hypot' operation with two inputs.

band_hic_matrix.invert(self, *args, **kwargs)

Perform element-wise 'invert' operation.

band_hic_matrix.lcm(self, other, *args, **kwargs)

Perform element-wise 'lcm' operation with two inputs.

band_hic_matrix.left_shift(self, other, ...)

Perform element-wise 'left_shift' operation with two inputs.

band_hic_matrix.less(self, other, *args, ...)

Perform element-wise 'less' operation with two inputs.

band_hic_matrix.less_equal(self, other, ...)

Perform element-wise 'less_equal' operation with two inputs.

band_hic_matrix.log(self, *args, **kwargs)

Perform element-wise 'log' operation.

band_hic_matrix.log1p(self, *args, **kwargs)

Perform element-wise 'log1p' operation.

band_hic_matrix.log2(self, *args, **kwargs)

Perform element-wise 'log2' operation.

band_hic_matrix.log10(self, *args, **kwargs)

Perform element-wise 'log10' operation.

band_hic_matrix.logaddexp(self, other, ...)

Perform element-wise 'logaddexp' operation with two inputs.

band_hic_matrix.logaddexp2(self, other, ...)

Perform element-wise 'logaddexp2' operation with two inputs.

band_hic_matrix.logical_and(self, other, ...)

Perform element-wise 'logical_and' operation with two inputs.

band_hic_matrix.logical_or(self, other, ...)

Perform element-wise 'logical_or' operation with two inputs.

band_hic_matrix.logical_xor(self, other, ...)

Perform element-wise 'logical_xor' operation with two inputs.

band_hic_matrix.maximum(self, other, *args, ...)

Perform element-wise 'maximum' operation with two inputs.

band_hic_matrix.minimum(self, other, *args, ...)

Perform element-wise 'minimum' operation with two inputs.

band_hic_matrix.mod(self, other, *args, **kwargs)

Perform element-wise 'mod' operation with two inputs.

band_hic_matrix.multiply(self, other, *args, ...)

Perform element-wise 'multiply' operation with two inputs.

band_hic_matrix.negative(self, *args, **kwargs)

Perform element-wise 'negative' operation.

band_hic_matrix.not_equal(self, other, ...)

Perform element-wise 'not_equal' operation with two inputs.

band_hic_matrix.positive(self, *args, **kwargs)

Perform element-wise 'positive' operation.

band_hic_matrix.power(self, other, *args, ...)

Perform element-wise 'power' operation with two inputs.

band_hic_matrix.rad2deg(self, *args, **kwargs)

Perform element-wise 'rad2deg' operation.

band_hic_matrix.radians(self, *args, **kwargs)

Perform element-wise 'radians' operation.

band_hic_matrix.reciprocal(self, *args, **kwargs)

Perform element-wise 'reciprocal' operation.

band_hic_matrix.remainder(self, other, ...)

Perform element-wise 'remainder' operation with two inputs.

band_hic_matrix.right_shift(self, other, ...)

Perform element-wise 'right_shift' operation with two inputs.

band_hic_matrix.rint(self, *args, **kwargs)

Perform element-wise 'rint' operation.

band_hic_matrix.sign(self, *args, **kwargs)

Perform element-wise 'sign' operation.

band_hic_matrix.sin(self, *args, **kwargs)

Perform element-wise 'sin' operation.

band_hic_matrix.sinh(self, *args, **kwargs)

Perform element-wise 'sinh' operation.

band_hic_matrix.sqrt(self, *args, **kwargs)

Perform element-wise 'sqrt' operation.

band_hic_matrix.square(self, *args, **kwargs)

Perform element-wise 'square' operation.

band_hic_matrix.subtract(self, other, *args, ...)

Perform element-wise 'subtract' operation with two inputs.

band_hic_matrix.tan(self, *args, **kwargs)

Perform element-wise 'tan' operation.

band_hic_matrix.tanh(self, *args, **kwargs)

Perform element-wise 'tanh' operation.

band_hic_matrix.true_divide(self, other, ...)

Perform element-wise 'true_divide' operation with two inputs.

Data Type Conversion and Representation#

This section includes methods for converting the matrix to different data types or representations, such as dense, sparse, or masked arrays.

band_hic_matrix.todense()

Convert the band matrix to a dense format.

band_hic_matrix.tocoo([drop_zeros])

Convert the matrix to COO format.

band_hic_matrix.tocsr()

Convert the matrix to CSR format.

band_hic_matrix.copy()

Deep copy the object.

band_hic_matrix.astype(dtype[, copy])

Cast data to new dtype.

band_hic_matrix.__repr__()

Return a string representation of the band_hic_matrix object.

band_hic_matrix.__str__()

Return a string representation of the band_hic_matrix object.

band_hic_matrix.__array__([copy])

Return the data as a NumPy array.

band_hic_matrix.__array_priority__()

Return the priority for array operations.

Other Methods#

This section includes additional methods that do not fall into the categories above but provide other useful operations for band_hic_matrix.

band_hic_matrix.memory_usage()

Compute memory usage of band_hic_matirx object.

band_hic_matrix.__len__()

Return the number of rows in the band_hic_matrix object.

band_hic_matrix.__hash__()

Return a hash value for the band_hic_matrix object.

band_hic_matrix.__bool__()

Truth value of the band_hic_matrix, following NumPy semantics.

band_hic_matrix.dump(filename)

Save the band_hic_matrix object to a file.