doc: clarify conversions can be impacted by double-rounding #2361
base: main
A few comments suggested, please incorporate as you see fit, thanks!
@@ -99,7 +99,12 @@ No downconversions are allowed by default, but can be enabled using
the floating-point math controls described in @ref
dev_guide_attributes_fpmath_mode.

The \f$convert_{dst\_dt}\f$ conversion is not guaranteed to be correctly
Should this say that the result of the conversion is not guaranteed to be correctly rounded to the nearest value?
The \f$convert_{dst\_dt}\f$ conversion is not guaranteed to be correctly
rounded. In particular, some hardware platforms have no direct
conversion instructions from f32 to low-precision datatypes (e.g. fp8
Suggested change:
- conversion instructions from f32 to low-precision datatypes (e.g. fp8
+ conversion instructions to convert f32 data types to low-precision data types such as fp8
The \f$convert_{dst\_dt}\f$ conversion is not guaranteed to be correctly
rounded. In particular, some hardware platforms have no direct
conversion instructions from f32 to low-precision datatypes (e.g. fp8
or fp4), and will perform conversion through an intermediate datatype
Suggested change:
- or fp4), and will perform conversion through an intermediate datatype
+ or fp4, and will perform conversion through an intermediate data type
rounded. In particular, some hardware platforms have no direct
conversion instructions from f32 to low-precision datatypes (e.g. fp8
or fp4), and will perform conversion through an intermediate datatype
(e.g. f16/bf16), which may cause [double
Suggested change:
- (e.g. f16/bf16), which may cause [double
+ (for example, f16 or bf16), which may result in [double
Specify in the documentation that double rounding can occur when converting f32 to very low-precision data types (fp4/fp8).
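
For reviewers who want to see the effect concretely, here is a minimal sketch of double rounding, not the library's conversion code. It assumes round-to-nearest-even at each step, an fp8 layout with 3 mantissa bits (e4m3), and f16 as the intermediate type; the helper `round_to_mantissa` is hypothetical and ignores exponent range, subnormals, and saturation.

```python
import math


def round_to_mantissa(x: float, mantissa_bits: int) -> float:
    """Round x to `mantissa_bits` explicit mantissa bits, ties to even.

    Hypothetical helper for illustration only: exponent range,
    subnormals, and saturation are ignored.
    """
    if x == 0.0:
        return 0.0
    _, exp = math.frexp(x)             # x = m * 2**exp with 0.5 <= |m| < 1
    shift = mantissa_bits - (exp - 1)  # scale so one target ulp maps to 1.0
    scaled = math.ldexp(x, shift)      # exact: scaling by a power of two
    return math.ldexp(round(scaled), -shift)  # round() is ties-to-even


F16_BITS = 10  # IEEE half precision mantissa width
FP8_BITS = 3   # assumption: e4m3 layout

# Exactly representable in f32, and just above the midpoint between the
# two neighbouring 3-bit-mantissa values 1.0 and 1.125.
x = 1.0 + 2.0**-4 + 2.0**-12

direct = round_to_mantissa(x, FP8_BITS)
via_f16 = round_to_mantissa(round_to_mantissa(x, F16_BITS), FP8_BITS)

print(direct, via_f16)  # 1.125 1.0
```

The first rounding to f16 discards the `2**-12` term and lands exactly on the midpoint 1.0625, so the second rounding resolves a tie to the even neighbour 1.0, whereas a direct conversion correctly rounds up to 1.125.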