Tutorials#
This section provides detailed tutorials on how to use the BandHiC package effectively.
Prerequisites#
BandHiC can serve as an alternative to NumPy for managing and manipulating Hi-C data. It addresses the excessive memory usage that comes from storing dense contact matrices in NumPy's ndarray. BandHiC also supports masking operations similar to NumPy's ma.MaskedArray class, with enhancements tailored to Hi-C data.
Because the BandHiC interface mirrors NumPy, basic knowledge of NumPy is recommended. The NumPy documentation is available at https://numpy.org
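To make the memory argument concrete, the following back-of-the-envelope sketch compares a full dense matrix with a band that keeps only the first diag_num diagonals. The helper names, the chromosome size, and the assumption of one padded row per stored diagonal are illustrative only; the actual BandHiC layout and its overhead may differ.

```python
import numpy as np

def dense_bytes(n_bins, dtype=np.float64):
    """Bytes needed for a full n_bins x n_bins ndarray."""
    return n_bins * n_bins * np.dtype(dtype).itemsize

def band_bytes(n_bins, diag_num, dtype=np.float64):
    """Bytes needed when only diag_num diagonals are kept.

    Assumes every stored diagonal is padded to length n_bins
    (a hypothetical layout, used here only for the estimate).
    """
    return n_bins * diag_num * np.dtype(dtype).itemsize

# Hypothetical chromosome: 250 Mb at 10 kb resolution gives 25,000 bins.
n_bins = 250_000_000 // 10_000
dense = dense_bytes(n_bins)
band = band_bytes(n_bins, diag_num=200)
print(f"dense: {dense / 1e9:.1f} GB, band: {band / 1e6:.1f} MB")
```

Under these assumptions the band needs 125 times less memory than the dense array, and the ratio is simply n_bins / diag_num.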
Import bandhic package#
>>> import bandhic as bh
Initialize a band_hic_matrix object#
Initialize from a SciPy coo_matrix object:
>>> import bandhic as bh
>>> import numpy as np
>>> from scipy.sparse import coo_matrix
>>> coo = coo_matrix(([1, 2, 3], ([0, 1, 2],[0, 1, 2])), shape=(3,3))
>>> mat1 = bh.band_hic_matrix(coo, diag_num=2)
Initialize from a tuple (data, (row, col)):
>>> mat2 = bh.band_hic_matrix(([4, 5, 6], ([0, 1, 2],[2, 1, 0])), diag_num=1)
Initialize from a full dense array. Only the upper-triangular part is stored; the lower part is obtained by symmetry:
>>> arr = np.arange(16).reshape(4,4)
>>> mat3 = bh.band_hic_matrix(arr, diag_num=3)
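The idea of keeping only the upper-triangular diagonals can be sketched in plain NumPy. The functions to_band and band_get below are hypothetical helpers written for this illustration and are not part of BandHiC; the (n, diag_num) layout is an assumption and may not match the package's internal storage.

```python
import numpy as np

def to_band(dense, diag_num):
    """Copy diagonals 0..diag_num-1 of the upper triangle into an (n, diag_num) array.

    Column k holds diagonal k, so band[i, k] == dense[i, i + k].
    Slots that fall outside the matrix stay as zero padding.
    """
    n = dense.shape[0]
    band = np.zeros((n, diag_num), dtype=dense.dtype)
    for k in range(diag_num):
        band[: n - k, k] = np.diagonal(dense, offset=k)
    return band

def band_get(band, i, j, default=0):
    """Look up entry (i, j), mirroring the lower triangle onto the upper one."""
    if i > j:
        i, j = j, i
    k = j - i
    if k >= band.shape[1]:
        return default  # entries outside the band fall back to a default value
    return band[i, k]

arr = np.arange(16).reshape(4, 4)
band = to_band(arr, diag_num=3)
print(band_get(band, 1, 2), band_get(band, 2, 1), band_get(band, 0, 3))
```

In this sketch the lookup for (2, 1) returns the value stored for (1, 2), which is what "symmetrized" means here, and (0, 3) lies outside the three stored diagonals.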
Load or save a band_hic_matrix object#
>>> bh.save_npz('./sample.npz', mat)
>>> mat = bh.load_npz('./sample.npz')
Load from .hic file:
>>> mat = bh.straw_chr('sample.hic',
...                    'chr1',
...                    resolution=10000,
...                    diag_num=200
...                    )
Load from .mcool file:
>>> mat = bh.cooler_chr('sample.mcool',
...                     'chr1',
...                     diag_num=200,
...                     resolution=10000,
...                     )
Construct a band_hic_matrix object#
Create a band_hic_matrix object filled with zeros.
>>> mat1 = bh.zeros((5, 5), diag_num=3, dtype=float)
Create a band_hic_matrix object filled with ones.
>>> mat2 = bh.ones((5, 5), diag_num=3, dtype=float)
Create a band_hic_matrix object filled as an identity matrix.
>>> mat3 = bh.eye((5, 5), diag_num=3, dtype=float)
Create a band_hic_matrix object filled with a specified value.
>>> mat4 = bh.full((5, 5), fill_value=0.1, diag_num=3, dtype=float)
Create a band_hic_matrix object matching another matrix, filled with zeros.
>>> mat5 = bh.zeros_like(mat1, diag_num=3, dtype=float)
Create a band_hic_matrix object matching another matrix, filled with ones.
>>> mat6 = bh.ones_like(mat1, diag_num=3, dtype=float)
Create a band_hic_matrix object matching another matrix, filled as an identity matrix.
>>> mat7 = bh.eye_like(mat1, diag_num=3, dtype=float)
Create a band_hic_matrix object matching another matrix, filled with a specified value.
>>> mat8 = bh.full_like(mat1, fill_value=0.1, diag_num=3, dtype=float)
Indexing on band_hic_matrix#
First, we create a band_hic_matrix object:
>>> import numpy as np
>>> import bandhic as bh
>>> mat = bh.band_hic_matrix(np.arange(16).reshape(4,4), diag_num=2)
Single-element access (scalar)
>>> mat[1, 2]
6
Masked element returns masked
>>> mat2 = bh.band_hic_matrix(np.eye(4), dtype=int, diag_num=2, mask=([0],[1]))
>>> mat2[0, 1]
masked
Square submatrix via two-slice indexing returns band_hic_matrix
>>> sub = mat[1:3, 1:3]
>>> isinstance(sub, bh.band_hic_matrix)
True
Single-axis slice returns band_hic_matrix for square region
>>> sub2 = mat[0:2] # equivalent to mat[0:2, 0:2]
>>> isinstance(sub2, bh.band_hic_matrix)
True
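Note that this differs from plain NumPy, where arr[0:2] selects rows 0 and 1 with all columns. A dense NumPy sketch of the square-region behavior described above follows; square_block is a hypothetical helper used only for illustration.

```python
import numpy as np

def square_block(dense, region):
    """Return the diagonal block dense[region, region] for a slice `region`."""
    return dense[region, region]

arr = np.arange(16).reshape(4, 4)
# Build a symmetric matrix from the upper triangle, as a stand-in for Hi-C data.
sym = np.triu(arr) + np.triu(arr, k=1).T

sub = square_block(sym, slice(1, 3))
print(sub)
print(sym[1:3].shape)  # plain NumPy single-axis slice keeps every column
```

A diagonal block of a symmetric matrix is itself symmetric, which is presumably why such a slice can still be represented as a band matrix.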
Fancy indexing returns ndarray or MaskedArray
>>> arr = mat[[0,2,3], [1,2,0]]
>>> isinstance(arr, np.ndarray)
True
>>> mat.add_mask([0,1],[1,2]) # Add mask to some entries
>>> masked_arr = mat[[0,1], [1,2]]
>>> isinstance(masked_arr, np.ma.MaskedArray)
True
Boolean indexing with band_hic_matrix
>>> mat3 = bh.band_hic_matrix(np.eye(4), diag_num=2, mask=([0,1],[1,2]))
>>> bool_mask = mat3 > 0 # Create a boolean mask
>>> result = mat3[bool_mask] # Use boolean mask for indexing
>>> isinstance(result, np.ma.MaskedArray)
True
>>> result
masked_array(data=[1.0, 1.0, 1.0, 1.0],
             mask=[False, False, False, False],
       fill_value=0.0)
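For comparison, a similar selection can be written against a dense np.ma.MaskedArray. This is only a NumPy analogue of the example above, not BandHiC code; in particular, the call to filled(False) is a NumPy-side precaution so that masked positions are never selected.

```python
import numpy as np

dense = np.ma.masked_array(np.eye(4))
dense[0, 1] = np.ma.masked
dense[1, 2] = np.ma.masked

# The comparison keeps the mask; masked positions are treated as "not selected".
bool_mask = dense > 0
selected = dense[bool_mask.filled(False)]
print(type(selected).__name__, selected.count())
```

The result is a one-dimensional MaskedArray holding the four diagonal entries, none of which is masked.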
Masking#
Add item-wise mask:
>>> mat.add_mask([0, 1], [1, 2])
Add row/column mask:
>>> mask = np.array([True, False, False, False])  # one flag per bin (mat is 4x4)
>>> mat.add_mask_row_col(mask)
Remove the mask for specified indices:
>>> mat.unmask(([0], [1]))
Remove all item-wise masks and row/column masks:
>>> mat.unmask()
clear_mask() likewise removes all item-wise masks and row/column masks:
>>> mat.clear_mask()
Drop all item-wise masks but preserve all row/column masks:
>>> mat.drop_mask()
Drop all row/column masks:
>>> mat.drop_mask_row_col()
Indexing a masked band_hic_matrix returns an np.ma.MaskedArray object:
>>> mat.add_mask([0, 1], [1, 2])
>>> masked_arr = mat[[0,1], [1,2]]
>>> isinstance(masked_arr, np.ma.MaskedArray)
True
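The two kinds of mask can be pictured with a dense np.ma.MaskedArray. The sketch below is a NumPy analogue only, and it assumes that an item-wise mask also hides the mirrored entry because the matrix is symmetric; BandHiC's own bookkeeping may differ.

```python
import numpy as np

dense = np.ma.masked_array(np.arange(16, dtype=float).reshape(4, 4))

# Item-wise mask: hide entries (0, 1) and (1, 2) and their mirror images.
rows, cols = [0, 1], [1, 2]
dense[rows, cols] = np.ma.masked
dense[cols, rows] = np.ma.masked

# Row/column mask: one boolean flag per bin hides a whole row and column.
bad_bins = np.array([False, False, False, True])
dense[bad_bins, :] = np.ma.masked
dense[:, bad_bins] = np.ma.masked

picked = dense[[0, 1], [1, 2]]  # fancy indexing keeps the mask
print(type(picked).__name__, picked.mask.tolist(), dense.count())
```

A row/column mask is the natural choice for filtering out low-coverage bins, while item-wise masks suit individual outlier contacts.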
Universal functions (ufunc)#
Universal functions that BandHiC supports:
| Function | Description | Function | Description |
|---|---|---|---|
| absolute | Absolute value | add | Element-wise addition |
| arccos | Inverse cosine | arccosh | Inverse hyperbolic cosine |
| arcsin | Inverse sine | arcsinh | Inverse hyperbolic sine |
| arctan | Inverse tangent | arctan2 | Arctangent of y/x with quadrant |
| arctanh | Inverse hyperbolic tangent | bitwise_and | Element-wise bitwise AND |
| bitwise_or | Element-wise bitwise OR | bitwise_xor | Element-wise bitwise XOR |
| cbrt | Cube root | conj | Complex conjugate |
| conjugate | Alias for conj | cos | Cosine function |
| cosh | Hyperbolic cosine | deg2rad | Degrees to radians |
| degrees | Radians to degrees | divide | Element-wise division |
| divmod | Quotient and remainder | equal | Element-wise equality test |
| exp | Exponential | exp2 | Base-2 exponential |
| expm1 | exp(x) - 1 | fabs | Absolute value (float) |
| float_power | Floating-point power | floor_divide | Integer division (floor) |
| fmod | Modulo operation | gcd | Greatest common divisor |
| greater | Element-wise greater-than test | greater_equal | Greater-than or equal test |
| heaviside | Heaviside step function | hypot | Euclidean norm |
| invert | Bitwise inversion | lcm | Least common multiple |
| left_shift | Bitwise left shift | less | Element-wise less-than test |
| less_equal | Less-than or equal test | log | Natural logarithm |
| log1p | log(1 + x) | log2 | Base-2 logarithm |
| log10 | Base-10 logarithm | logaddexp | log(exp(x) + exp(y)) |
| logaddexp2 | Base-2 version of logaddexp | logical_and | Element-wise logical AND |
| logical_or | Element-wise logical OR | logical_xor | Element-wise logical XOR |
| maximum | Element-wise maximum | minimum | Element-wise minimum |
| mod | Remainder (modulo) | multiply | Element-wise multiplication |
| negative | Element-wise negation | not_equal | Element-wise inequality test |
| positive | Returns input unchanged | power | Raise to power |
| rad2deg | Radians to degrees | radians | Degrees to radians |
| reciprocal | Element-wise reciprocal | remainder | Modulo remainder |
| right_shift | Bitwise right shift | rint | Round to nearest integer |
| sign | Sign of input | sin | Sine function |
| sinh | Hyperbolic sine | sqrt | Square root |
| square | Square of input | subtract | Element-wise subtraction |
| tan | Tangent function | tanh | Hyperbolic tangent |
| true_divide | Division that returns float | | |
BandHiC supports these universal functions, and they can be used in the following four ways:
As methods of the band_hic_matrix object:
When two band_hic_matrix objects are involved, their shape and diag_num must match.
>>> mat3 = mat1.add(mat2)
>>> mat4 = mat1.less(mat2)
>>> mat5 = mat1.negative()
As functions of the BandHiC package:
>>> mat3 = bh.add(mat1, mat2)
>>> mat4 = bh.less(mat1, mat2)
>>> mat5 = bh.negative(mat1)
Using mathematical operators:
>>> mat3 = mat1 + mat2
>>> mat4 = mat1 < mat2
>>> mat5 = - mat1
Calling NumPy’s universal functions:
>>> mat3 = np.add(mat1, mat2)
>>> mat4 = np.less(mat1, mat2)
>>> mat5 = np.negative(mat1)
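One common way for a container class to accept NumPy's universal functions directly is the __array_ufunc__ protocol. The toy class below is a sketch of that protocol only; Band is a hypothetical name, and nothing here is taken from BandHiC's source, so the package may implement its dispatch differently.

```python
import numpy as np

class Band:
    """Toy container whose stored values live in `data` (not BandHiC's class)."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__":
            return NotImplemented
        # Unwrap Band operands to raw arrays, apply the ufunc, wrap the result.
        raw = [x.data if isinstance(x, Band) else x for x in inputs]
        return Band(ufunc(*raw, **kwargs))

    # Operators delegate to the same ufuncs, so every call style agrees.
    def __add__(self, other):
        return np.add(self, other)

    def __lt__(self, other):
        return np.less(self, other)

    def __neg__(self):
        return np.negative(self)

a = Band([[1.0, 2.0], [3.0, 4.0]])
b = Band([[0.5, 2.5], [3.0, 1.0]])

total = np.add(a, b)  # dispatched through Band.__array_ufunc__
same = a + b          # operator form, same code path
lower = a < b
print(total.data, lower.data, sep="\n")
```

Because the ufunc is applied to the stored values only, operating on two containers makes sense only when they describe the same region, which matches the requirement above that shape and diag_num agree.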
Other Array Functions#
| Function | Description |
|---|---|
| sum | Compute the sum of all elements along the specified axis |
| prod | Compute the product of all elements along the specified axis |
| min | Return the minimum value along the specified axis |
| max | Return the maximum value along the specified axis |
| mean | Compute the arithmetic mean along the specified axis |
| var | Compute the variance (average squared deviation) |
| std | Compute the standard deviation (square root of variance) |
| ptp | Compute the range (max - min) of values along the axis |
| all | Return True if all elements evaluate to True |
| any | Return True if any element evaluates to True |
| clip | Limit values to a specified min and max range |
BandHiC supports these functions, and they can be used in the following three ways:
As methods of the band_hic_matrix object:
Compute the sum of all elements including out-of-band values filled with default_value.
>>> result0 = mat1.sum()
Compute the sum of all elements along the row axis
>>> result1 = mat1.sum(axis=0)
>>> result1 = mat1.sum(axis='row')
Compute the sum of all elements along the diag axis
>>> result2 = mat1.sum(axis='diag')
Calling BandHiC's functions:
>>> result0 = bh.sum(mat1)
>>> result1 = bh.sum(mat1, axis=0)
>>> result2 = bh.sum(mat1, axis='diag')
Calling NumPy’s functions:
>>> result0 = np.sum(mat1)
>>> result1 = np.sum(mat1, axis=0)
>>> result2 = np.sum(mat1, axis='diag')
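As a dense reference for these reductions, the sketch below computes the same kinds of sums with plain NumPy. It assumes that axis='diag' yields one value per diagonal offset, which is an interpretation of the text above rather than a documented guarantee, and it does not model out-of-band entries filled with default_value.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)
# Symmetric stand-in for a Hi-C matrix, built from the upper triangle.
sym = np.triu(arr) + np.triu(arr, k=1).T

total = sym.sum()                 # all elements
row_sums = sym.sum(axis=0)        # one value per bin
# One value per diagonal offset k = 0, 1, 2, 3 of the upper triangle.
diag_sums = np.array([np.trace(sym, offset=k) for k in range(sym.shape[0])])

print(total, row_sums, diag_sums)
```

Per-diagonal sums are a common starting point for distance-decay (expected contact) profiles in Hi-C analysis.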