Update README.md for v1.0.0 #1100
=============================
Automatic precision inference
=============================

The automatic precision inference (implemented in :py:class:`~hls4ml.model.optimizer.passes.infer_precision.InferPrecisionTypes`) attempts to infer the appropriate widths for a given precision.
It is initiated by setting a precision in the configuration to 'auto'. Functions like :py:class:`~hls4ml.utils.config.config_from_keras_model` and :py:class:`~hls4ml.utils.config.config_from_onnx_model`
automatically set most precisions to 'auto' if the ``'name'`` granularity is used.

.. note::
   It is recommended to pass the backend to the ``config_from_*`` functions so that they can properly extract all the configurable precisions.

The approach taken by the precision inference is to set the accumulator and other precisions so that they never truncate, using only the bit widths of the inputs (not the values). This is quite conservative,
especially in cases where post-training quantization is used, or if the bit widths were set fairly loosely. The recommended action in that case is to edit the configuration and explicitly set
some widths in it, potentially in an iterative process after seeing which precisions are set automatically. Another option, currently implemented in :py:class:`~hls4ml.utils.config.config_from_keras_model`,
is to pass a maximum bit width using the ``max_precision`` option. The automatic precision inference will then never set a bit width larger than the bit width, or an integer part larger than the integer part, of
the ``max_precision`` that is passed. (The bit width and integer part are treated separately.)
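The never-truncate rule can be illustrated with a short calculation (a simplified sketch only, not the actual hls4ml pass, which also handles signedness and other layer types): the accumulator of a dot product is sized from the input and weight bit widths alone.

```python
import math

def accum_precision(n_in, x_width, x_int, w_width, w_int):
    """Smallest fixed-point precision that never truncates the sum of
    n_in products of fixed<x_width, x_int> inputs and
    fixed<w_width, w_int> weights. Returns (total_width, integer_width).
    Simplified sketch only; this is not hls4ml's actual implementation."""
    # A product keeps all fraction bits of both operands
    frac_bits = (x_width - x_int) + (w_width - w_int)
    # Summing n_in terms can grow the integer part by ceil(log2(n_in)) bits
    int_bits = x_int + w_int + math.ceil(math.log2(n_in))
    return int_bits + frac_bits, int_bits
```

For example, 64 inputs of ``fixed<16,6>`` with ``fixed<8,3>`` weights would yield a ``fixed<30,15>`` accumulator under this rule; ``max_precision`` would then cap the total width and the integer width separately.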
       default_precision='fixed<16,6>',
       backend='Vitis')
This will include per-layer configuration based on the model. Including the backend is recommended because some configuration options depend on the backend. Note that the precisions at the
higher granularities usually default to 'auto', which means that ``hls4ml`` will try to set them automatically (see :ref:`Automatic precision inference`). Note that higher granularity settings take precedence
over model-level settings. See :py:class:`~hls4ml.utils.config.config_from_keras_model` for more information on the various options.

One can override specific values before using the configuration:
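For instance, since the generated configuration is a plain nested Python dictionary, entries can be overridden directly (the layer name ``'fc1'`` and the precision strings below are purely illustrative):

```python
# A trimmed-down stand-in for the dictionary returned by
# config_from_keras_model; layer name and precisions are illustrative.
config = {
    'Model': {'Precision': 'fixed<16,6>', 'ReuseFactor': 1},
    'LayerName': {
        'fc1': {'Precision': {'weight': 'auto', 'result': 'auto'}},
    },
}

# Override one layer's weight precision and the model-level reuse factor
config['LayerName']['fc1']['Precision']['weight'] = 'fixed<8,3>'
config['Model']['ReuseFactor'] = 4
```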
===========
Activations
===========

Most activations without extra parameters are represented with the ``Activation`` layer, and those with a single parameter (leaky ReLU, thresholded ReLU, ELU) as ``ParametrizedActivation``.
``PReLU`` has its own class because it has a parameter matrix (stored as a weight). The hard (piecewise linear) sigmoid and tanh functions are implemented in a ``HardActivation`` layer,
and ``Softmax`` has its own layer class.

Softmax has four implementations that the user can choose from by setting the ``implementation`` parameter:

* **latency**: Good latency, but somewhat high resource usage. It does not work well if there are many output classes.
* **stable**: Slower but with better accuracy, useful in scenarios where higher accuracy is needed.
* **legacy**: An older implementation with poor accuracy, but good performance. Usually the latency implementation is preferred.
* **argmax**: If you don't care about normalized outputs and only care about which one has the highest value, using argmax saves a lot of resources. This sets the highest value to 1, the others to 0.
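The ``argmax`` variant can be pictured in plain Python (a behavioural sketch only, not the generated HLS code): it skips the exponentials entirely and emits a one-hot vector.

```python
def argmax_softmax(logits):
    """Behavioural sketch of the 'argmax' softmax implementation:
    1 for the largest input, 0 for all others, no normalization."""
    winner = max(range(len(logits)), key=lambda i: logits[i])
    return [1 if i == winner else 0 for i in range(len(logits))]
```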
==================
Convolution Layers
==================

Standard convolutions
=====================

These are the standard 1D and 2D convolutions currently supported by hls4ml, and the fallback if there is no special pointwise implementation.

io_parallel
-----------

Parallel convolutions are for cases where the model needs to be small and fast, though synthesizability limits can be reached quickly. Also note that skip connections
are not supported in io_parallel.

For the Xilinx backends and Catapult, there is a very direct convolution implementation when using the ``Latency`` strategy. This is only for very small models because of the
high number of nested loops. The ``Resource`` strategy in all cases defaults to an algorithm using the *im2col* transformation. This generally supports larger models. The ``Quartus``,
``oneAPI``, and ``Catapult`` backends also implement a ``Winograd`` algorithm, selectable by setting the ``implementation`` to ``Winograd`` or ``combination``. Note that
the Winograd implementation is available for only a handful of filter size configurations, and it is less concerned with bit accuracy and overflow, but it can be faster.
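The *im2col* idea behind the ``Resource`` strategy can be sketched for the 1D case (illustrative Python, not the generated HLS code): each output position gathers the input window it sees, turning the convolution into a matrix multiply with the kernel.

```python
def im2col_1d(x, kernel_size):
    """Gather the sliding windows of x (valid padding, stride 1).
    Each row holds the inputs seen by one output position, so the
    convolution reduces to a matrix-vector product with the kernel."""
    n_out = len(x) - kernel_size + 1
    return [x[i:i + kernel_size] for i in range(n_out)]
```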
io_stream
---------

There are two main classes of io_stream implementations, ``LineBuffer`` and ``Encoded``. ``LineBuffer`` is always the default and generally produces marginally better results,
while ``Catapult`` and ``Vivado`` also implement ``Encoded``, selectable with the ``convImplementation`` configuration option. In all cases, the data is processed serially, one pixel
at a time, with a pixel containing an array of all the channel values for that pixel.
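The pixel-at-a-time layout can be illustrated as follows (a sketch of the data ordering only): an H x W x C image is streamed as H*W pixels, each carrying all C channel values.

```python
def to_pixel_stream(image):
    """Flatten an image indexed as image[h][w] -> list of C channel
    values into the serial pixel order described above (sketch of the
    io_stream data layout, not the HLS stream type itself)."""
    return [image[h][w]
            for h in range(len(image))
            for w in range(len(image[0]))]
```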
Depthwise convolutions
======================

Pointwise convolutions
======================
============
Dense Layers
============

One-dimensional Dense Layers
============================

One-dimensional dense layers implement a matrix multiply and bias add. The produced code is also used by other layers to implement the matrix multiplication.

io_parallel
-----------

All the backends implement a ``Resource`` implementation, which explicitly iterates over the reuse factor. There are different implementations depending on whether the reuse factor is
smaller or bigger than the input size. The two Xilinx backends and Catapult also implement a ``Latency`` implementation, which only uses the reuse factor in pragmas.
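The role of the reuse factor can be summarized with a quick calculation (a first-order estimate only; backends impose additional constraints on which reuse-factor values are valid): multiplications are time-multiplexed onto fewer physical multipliers as the reuse factor grows.

```python
import math

def dense_multipliers(n_in, n_out, reuse_factor):
    """First-order estimate: a dense layer performs n_in * n_out
    multiplications; each physical multiplier is reused reuse_factor
    times, trading latency for resources."""
    return math.ceil(n_in * n_out / reuse_factor)
```

For a 16-input, 8-output layer, a reuse factor of 4 cuts the multiplier estimate from 128 to 32, at the cost of taking roughly 4 times as many cycles.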
io_stream
---------

The io_stream implementation only wraps the io_parallel implementation with streams or pipes for communication. The data is still transferred in parallel.

Multi-dimensional Dense Layers
==============================

Multi-dimensional Dense layers are converted to pointwise convolutions and do not directly use the above implementation.
Dependencies
============

The ``hls4ml`` library requires Python 3.10 or later, and depends on a number of Python packages and external tools for synthesis and simulation. Python dependencies are automatically managed
by ``pip`` or ``conda``.
* `TensorFlow <https://pypi.org/project/tensorflow/>`_ (version 2.8 to 2.14) and `QKeras <https://pypi.org/project/qkeras/>`_ are required by the Keras converter. One may want to install newer versions of QKeras from GitHub. Newer versions of TensorFlow can be used, but QKeras and ``hls4ml`` do not currently support Keras v3.

* `ONNX <https://pypi.org/project/onnx/>`_ (version 1.4.0 and newer) is required by the ONNX converter.

* `PyTorch <https://pytorch.org/get-started>`_ package is optional. If not installed, the PyTorch converter will not be available.
Running C simulation from Python requires a C++11-compatible compiler. On Linux, a GCC C++ compiler ``g++`` is required. Any version from a recent
Linux should work. On MacOS, the *clang*-based ``g++`` is enough. For the oneAPI backend, one must have oneAPI installed, along with the FPGA compiler,
to run C/SYCL simulations.
To run FPGA synthesis, installation of the following tools is required:

* Xilinx Vivado HLS 2018.2 to 2020.1 for synthesis for Xilinx FPGAs using the ``Vivado`` backend.

* Vitis HLS 2022.2 or newer for synthesis for Xilinx FPGAs using the ``Vitis`` backend.

* Intel Quartus 20.1 to 21.4 for synthesis for Intel/Altera FPGAs using the ``Quartus`` backend.

* oneAPI 2024.1 to 2025.0 with the FPGA compiler and a recent Intel/Altera Quartus for Intel/Altera FPGAs using the ``oneAPI`` backend.

Catapult HLS 2024.1_1 or 2024.2 can be used to synthesize both for ASICs and FPGAs.
Quick Start
===========

If you want to configure your model further, check out our :doc:`Configuration <api/configuration>` page.
..
    Apart from our main API, we also support model conversion using a command line interface, check out our next section to find out more:

    Getting started with hls4ml CLI (deprecated)
    --------------------------------------------

    As an alternative to the recommended Python API, the command-line interface is provided via the ``hls4ml`` command.

    To follow this tutorial, you must first download our ``example-models`` repository:

    .. code-block:: bash

        git clone https://github.com/fastmachinelearning/example-models

    Alternatively, you can clone the ``hls4ml`` repository with submodules:

    .. code-block:: bash

        git clone --recurse-submodules https://github.com/fastmachinelearning/hls4ml

    The model files, along with other configuration parameters, are defined in the ``.yml`` files.
    Further information about ``.yml`` files can be found in the :doc:`Configuration <api/configuration>` page.

    In order to create an example HLS project, first go to ``example-models/`` from the main directory:

    .. code-block:: bash

        cd example-models/

    And use this command to translate a Keras model:

    .. code-block:: bash

        hls4ml convert -c keras-config.yml

    This will create a new HLS project directory with an implementation of a model from the ``example-models/keras/`` directory.
    To build the HLS project, do:

    .. code-block:: bash

        hls4ml build -p my-hls-test -a

    This will create a Vivado HLS project with your model implementation!

    **NOTE:** For the last step, you can alternatively do the following to build the HLS project:

    .. code-block:: bash

        cd my-hls-test
        vivado_hls -f build_prj.tcl

    ``vivado_hls`` can be controlled with:

    .. code-block:: bash

        vivado_hls -f build_prj.tcl "csim=1 synth=1 cosim=1 export=1 vsynth=1"

    Setting the additional parameters from ``1`` to ``0`` disables that step, but disabling ``synth`` also disables ``cosim`` and ``export``.

    Further help
    ^^^^^^^^^^^^

    * For further information about how to use ``hls4ml``\ , do: ``hls4ml --help`` or ``hls4ml -h``
    * If you need help for a particular ``command``\ , ``hls4ml command -h`` will show help for the requested ``command``
    * We provide detailed documentation for each of the commands in the :doc:`Command Help <advanced/command>` section

Existing examples
-----------------

* Examples of model files and weights can be found in the `example_models <https://github.com/fastmachinelearning/example-models>`_ directory.
* Training codes and examples of resources needed to train the models can be found in the `tutorial <https://github.com/fastmachinelearning/hls4ml-tutorial>`__.

Uninstalling
------------