v0.0.34

@MaksimEkin MaksimEkin released this 07 Jan 01:20
ff683c8

Fast-tracking to v0.0.34 from v0.0.20

Enhancements

Pruning Support:

  • Enabled pruning in bnmf, wnmf, and nmf_recommender.
  • Added pruning of additional matrices, e.g., MASK, based on X.
  • Included pruned_cols and pruned_rows in saved outputs.

Matrix Factorization:

  • Introduced new submodule BNMFk under NMFk with nmf_method='bnmf'.
  • Added WEIGHT and MASK keys for WNMFk and BNMFk.
  • Implemented matrix deletion in subroutines to reduce memory consumption.
  • Added factor_thresholding parameter, which thresholds the NMFk factors to make them boolean. Options include:
    • coord_desc_thresh
    • WH_thresh
  • Introduced factor_thresholding_obj_params for configuring thresholding subroutines.
  • Added clustering_method parameter with options:
    • kmeans
    • bool or boolean (both are equivalent).
  • Introduced clustering_obj_params to configure clustering subroutines.
  • Added new perturbation type for boolean matrices: perturb_type='boolean' or perturb_type='bool'.
  • Updated examples to reflect new boolean-specific features.
  • Improved cross-platform path compatibility by building paths with os.path.join.
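
As a rough illustration of how the boolean-specific options listed above might be combined, the dictionary below gathers them as keyword arguments. Only the parameter names and option strings come from these notes; passing them together in one call and leaving the *_obj_params dictionaries empty are assumptions, so check the NMFk docstring for the accepted keys before relying on it.

```python
# Hypothetical keyword arguments for a boolean factorization run.
# Parameter names and option strings are taken from the release notes;
# the empty *_obj_params dicts are placeholders, not documented defaults.
bnmfk_params = {
    "nmf_method": "bnmf",                        # selects the BNMFk submodule
    "factor_thresholding": "coord_desc_thresh",  # or "WH_thresh"
    "factor_thresholding_obj_params": {},        # thresholding subroutine settings
    "clustering_method": "boolean",              # "bool" is equivalent; "kmeans" also listed
    "clustering_obj_params": {},                 # clustering subroutine settings
    "perturb_type": "boolean",                   # "bool" is equivalent
}
```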

Thresholding and Clustering:

  • Added factor_thresholding_H_regression with options:
    • otsu_thresh
    • coord_desc_thresh
    • kmeans_thresh
  • Default factor_thresholding_H_regression set to kmeans_thresh.
  • Default factor_thresholding set to otsu_thresh.
  • Introduced factor_thresholding_H_regression_obj_params to configure parameters.
  • Added K-means-based boolean thresholding for W and H matrices:
    • Clusters the values in each row of W and H into two groups; the boolean threshold is the midpoint between the two cluster centroids.
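
The K-means-based thresholding described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the T-ELF implementation: the function names are hypothetical, and the two-cluster step is a plain 1-D Lloyd iteration started from the row minimum and maximum, which is an assumption about initialization.

```python
import numpy as np

def two_means_threshold(values, n_iter=100):
    """Midpoint of the two centroids found by 1-D 2-means on `values`."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()  # initial centroids (assumed choice)
    if lo == hi:
        return lo  # constant row: no meaningful split exists
    for _ in range(n_iter):
        split = (lo + hi) / 2.0
        upper = values > split  # nearest-centroid assignment in 1-D
        new_lo, new_hi = values[~upper].mean(), values[upper].mean()
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0

def binarize_rows(M):
    """Threshold each row of M at its own two-means midpoint."""
    M = np.asarray(M, dtype=float)
    thresholds = np.array([two_means_threshold(row) for row in M])
    return M > thresholds[:, None]

W = np.array([[0.1, 0.2, 0.9, 1.0],
              [0.0, 0.05, 0.5, 0.55]])
W_bool = binarize_rows(W)
```

Each row gets its own threshold, so rows with different scales (as in the second row of W here) are still split into a low and a high group.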

Hardware and Device Management:

  • Added device parameter to NMFk for GPU management:
    • device=-1: Use all GPUs.
    • device=0: Use the GPU with ID 0.
    • device=[0,1,...]: Use a specific list of GPUs.
    • Negative values other than -1: Use (number of GPUs + device + 1).
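
One possible reading of the device rules above is sketched below. The helper resolve_devices is hypothetical and is not part of T-ELF. It assumes that a negative value selects a count of (number of GPUs + device + 1) devices and that the lowest-numbered GPU IDs are the ones chosen; neither detail is confirmed by these notes.

```python
def resolve_devices(device, n_gpus):
    """Return the GPU IDs selected by `device` (hypothetical helper)."""
    if isinstance(device, (list, tuple)):
        return list(device)      # explicit list of GPU IDs
    if device >= 0:
        return [device]          # a single GPU ID
    # Negative value: assumed to mean "use n_gpus + device + 1 GPUs",
    # so -1 selects all GPUs and -2 selects all but one.
    count = n_gpus + device + 1
    if count <= 0:
        raise ValueError(f"device={device} selects no GPUs out of {n_gpus}")
    return list(range(count))    # assumed: lowest-numbered IDs first
```

Under this reading, on a 4-GPU node device=-1 selects GPUs 0 through 3 and device=-2 selects GPUs 0 through 2.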

Hierarchical NMFk (HNMFk) Improvements:

  • Added new variables for nodes:
    • parent_node_factors_path
    • parent_node_k
    • factors_path
  • Enabled dynamic renaming of paths when loading HNMFk models from different directories.
  • Improved decomposition behavior:
    • Nodes with fewer samples than the sample threshold no longer decompose unnecessarily.
  • Added signature, centroid, and probabilities from parent nodes to child nodes.
  • Introduced graph iterator methods for navigating to specific nodes by name.
  • Updated node naming conventions to use ancestor-based indexing.

Result Storage:

  • Added W_all to saved outputs of NMFk.

Installation and Documentation

  • Migrated to a new installation system using pip and Poetry.
  • Added a post-installation script for simplifying setup on different systems.
  • Updated documentation for the new installation methods on Chicoma and Darwin.

Bug Fixes

  • Corrected HNMFk to return indices into the full dataset rather than indices of indices.
  • Corrected naming inconsistencies in pruning variables in NMFk.
  • Fixed error calculation to consider only known locations when masking is applied.
  • Resolved GPU transfer conflicts when using MASK.
  • Fixed default device parameter in NMFk to be -1 (use all devices).
  • Addressed issues in WNMFk and BNMFk examples.
  • Fixed checkpointing bugs:
    • Enabled checkpoint saving by default.
    • Resolved issues when loading an HNMFk model during an ongoing process.
  • Fixed scalar addition error with sparse matrices in kl_mu.
  • Resolved dependency conflicts with numpy and numba.
  • Updated HPC documentation for T-ELF installation.
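
To make the masked error fix listed above concrete, the sketch below computes a reconstruction error over known entries only. It is an illustration under stated assumptions, not the library's code: the function name is hypothetical, MASK is assumed to hold 1 at known locations and 0 elsewhere, and the relative Frobenius norm is assumed as the error measure.

```python
import numpy as np

def masked_relative_error(X, W, H, mask):
    """Relative Frobenius error of W @ H against X on known entries only."""
    mask = np.asarray(mask, dtype=bool)  # assumed convention: True/1 = known
    residual = (X - W @ H)[mask]         # unknown entries are ignored
    return np.linalg.norm(residual) / np.linalg.norm(X[mask])

W = np.array([[1.0], [2.0]])
H = np.array([[1.0, 2.0]])
X = W @ H
X[0, 0] = 100.0                          # corrupt an entry treated as unknown
mask = np.array([[0, 1],
                 [1, 1]])
err = masked_relative_error(X, W, H, mask)
```

Because the corrupted entry is masked out, it contributes nothing to the error; an unmasked error over all of X would be dominated by it.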