
v0.43.0

github-actions released this 08 Feb 18:02 · 7895 commits to main since this release

📦 Uncategorized

  • #4668: Yolov5 GS Demo Benchmarking
  • #0: uplift umd; pick up fix for n150 cluster
  • #3178: Fix for wormhole b0 reduce w
  • #4489: fixed bugs in the program caching of eltwise unary and eltwise binary. Updated bloom to use L1 memory config
  • #4821: Add cumsum op to tt_dnn (reference semantics sketched after this list)
  • Dispatch/Bandwidth tests
  • #4003: fixed test_eltwise_unary_op
  • Argmax and Argmin Support
  • #3212: softmax works after reduce fix of max, sum, etc. for WHB0
  • #0: (MINOR) Update version to v0.43.0
  • #4761: Add call to ttl repeat_interleave and also provide script for …
  • #4003: fixed the bug with printing the compile-time attributes
  • Support moreh arange
  • Remove skip_for_wormhole_b0 for test_moreh_softmax and test_moreh_softmin
  • #4541: remove unpad start at 0 limitation
  • Agrebenisan/restart cmd fix
  • Support moreh SGD
  • #0: Use fetch-depth: 0 instead of fetch-tags because otherwise git complains of commit SHA/tag conflict
  • #0: Add code owners for primary operations api binding
  • #4547: Add 2x2 window unit tests to ttnn maxpool
  • #4003: restructure ttnn
  • #4889: Change TileSlice printing to only print tile data
  • #4836: Add support for blocking conv activation in 2d systolic conv v…
  • #0: Update unicast cycles lower bound
  • #4904: Add support for 1d width sharded LN
  • #4941: Convert command header to struct for easier maintainability
  • #4823: enable sum_0 operation fails with low PCC [Wormhole,Grayskull]
  • Fix sharded buffers for one core in fast dispatch
  • #4906: global reduce sum, mean, max, min operations added
  • Revert "#4823: enable sum_0 operation fails with low PCC [Wormhole,GS]"
  • #0: Change codeowners from specific op binding files/dirs to all tt_lib bindings
  • #4003: split unary sweep into per op sweeps
  • #4232: added support for converting from numpy arrays to ttnn tensors. Borrow data whenever possible when converting from numpy/torch
  • Uplift AttnMatmul to support GroupAttnMatmul
  • Add watcher-specific CI tests
  • #4916: Add avg pool to ttnn
  • #0: Add a lock on DPRINT server raise/wait structures
  • #4967: added validation for input tensors
  • #4971: update documentation with a new doc hierarchy
  • #0: Leftover decorate_operation replacement for avg pool
  • #4899: fix the permute to operate on the intended shape
  • #4730: Add tt_lib.tensor.concat
  • Aliu/enqueue eth
  • #4003: Updating functional performance from changes in ttnn.permute w…
  • #4984: Remove dead OP_INFO and graph interpreter
  • #4878: initial commit to add Conv parameters to ttnn.preprocess_model_parameters
  • Update Program Hashes for Ops using Mem config
  • #4984: Remove unused dprint functionality
  • Aliu/ci fix
  • #4215: Add Argmax and Argmin Fallback
  • #4999: added input tensor validation to add, sub and mul operations.
  • Support for softmax row-major sharding and causal mask sharding
  • #0: provide API for where() to support scalar True/False branches
  • #5003: Update expected compile and runtimes for perf regression on VM
  • Revert "Update Program Hashes for Ops using Mem config"
  • #4931: add apis to get ethernet by socket ids
  • #4786: Add upsample_nearest2d functional stable diffusion
  • #4986: deploy docs only to main and enable devs to run docs build on different pages
  • Deploy ttnn sweeps results to docs
  • #4958: Move all python api unit tests to frequent in order to reduce SD pipeline length
  • #4999: Added input validation for ttnn.matmul and ttnn.linear. Add unit test for linear operation. Update input tensor validation in binary.py. Fix compute_output_shapes in bmm_op.cpp
  • #4620: Fix+improve bw test
  • #4852: Add unit tests for functional bloom
  • #5032: scalar argument versions for relops
  • #0: Add some README recommendations from MCW to clarify issue about access to internal workflows VM installation page
  • #4790: Implement GEGLU using ttnn for stable_diffusion model
  • #4999: Adding validation checks
  • #4791: Implement Feedforward sub-module using ttnn for stable_diffusi…
  • Npetrovic/bw ops sweeps
  • #4999: update documentation of ttnn operations to include the validation schema
  • #0: Remove model run from frequent_api_pipeline per @tt-rkim
  • Minor dprint/watcher cleanup
  • #4858: Add support for typecast
  • #0: Disable dprint tests because they're flaky at the moment
  • #4946: Add trig ops to ttnn
  • Nshanker/convs split by 2
  • #4946: Add inv trig ops to ttnn
  • #4003: fixed circular dependency in decorators
  • #5054: Removed asserts from conv op host code that are not required. …
  • #4003: fixed circular dependencies in ttnn
  • #4852: Fix CI pipeline by re-enabling functional bloom for causal LM
  • GroupNorm sharded support
  • #4972: is_sharded and memory_config are free from tensor
  • #0: eltwise ops/activation operator tracking for GS and WHB0
  • Aliu/fd tunneling pr
  • #4642: Converted 14 old cpp tests to use gtest, with capabilities to switch between FD/SD when possible
  • #4852: Add tests for functional ttnn bloom implementation.
  • #4003: correctly convert all parameters of torch module to ttnn parameters
  • #5082: Pow gradient calculation method differs from PyTorch
  • Argmax/Argmin support for channel, batch, and all dims
  • #4420: switch to shared_ptr
  • #4420: return shared_future from taskflow async wrapper
  • Minor DPrint fixes
  • #0: Enable/disable clearing L1 from env var
  • #4003: started moving ttnn operation to C++
  • #4003: Add script to help with finding issues that we need approval for
  • #5044: Adding support for optional output tensors
  • #4003: Adding the open flag to show only open PRs
  • #5048: Add CreateDevices and CloseDevices api to detail
  • decouple ClearProgramCache from CommandQueue
  • Conv fixes for padding input channels. Shallow conv fixes. Conv input/output autoformatting. Cleanup
  • Asarje/mp unpack tilize fused
  • Update CreateBuffer to return shared_ptr, and Enqueue R/W buffer to accept std::shared_ptr
  • #5137: Cleanups for newer Linux distro / toolchains
  • Revert "#5137: Cleanups for newer Linux distro / toolchains"
  • Revert "Update CreateBuffer to return shared_ptr, and Enqueue R/W buffer to accept std::shared_ptr"
  • #4793: Implement ResnetBlock2D using ttnn for stable_diffusion model
  • #4788: Implement Downsample2D using ttnn for stable_diffusion model
  • #4792: Implement CrossAttention sub-module using ttnn for stable_diff…
  • #4747: Reduce amount of samples in bert sweeps
  • #4789: Add upsample2d to functional_stable_diffusion model
  • #0: Add fix for lamb optimizer
  • #5057: Add relational ops support to TTNN
  • skip eth test suite on GS
  • #4003: updated ttnn.Tensor to be derived from ttl.tensor.Tensor
  • Asarje/shwetank upsample
  • #5082: power gradient is erroneous when the exponent is in the range (0, 1)
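
For readers unfamiliar with some of the newer ops above (cumsum #4821, repeat_interleave #4761, argmax/argmin, global reduces #4906), the sketch below shows only their generic PyTorch reference semantics, i.e. the kind of host-side reference such device ops are typically validated against. It is illustrative only and is not the tt_dnn/ttnn API; the torch calls are the sole assumption made here.

```python
import torch

# Host-side reference semantics, for illustration only (not the tt_dnn/ttnn API).
x = torch.arange(12, dtype=torch.float32).reshape(3, 4)

cumsum_w   = torch.cumsum(x, dim=-1)               # running sum along width (cf. #4821)
repeated   = torch.repeat_interleave(x, 2, dim=1)  # repeat each column twice (cf. #4761)
argmax_w   = torch.argmax(x, dim=-1)               # per-row index of the max (argmax support)
argmin_all = torch.argmin(x)                       # argmin over the flattened tensor ("all dims")
global_sum = torch.sum(x)                          # global reduce over all elements (cf. #4906)
```

The device implementations may impose layout, dtype, or sharding constraints that are not reflected in this reference-level sketch.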