FP32 Tilize Tilizes into FP16, Resulting in Loss of Range
Introduction
In the realm of deep learning and numerical computations, the choice of data type and format can significantly impact the performance and accuracy of models. The ttnn library, a high-performance tensor library, provides various data formats and types to optimize computations. However, a recent issue has been identified where fp32 tilize operations result in a loss of range when tilizing into the fp16 data format. In this article, we will delve into the details of this issue, explore the possible causes, and discuss potential solutions.
Describe the Bug
The bug occurs when performing tilize operations on very small values. The input values get truncated to 0 due to the conversion from fp32 to fp16. This can be confirmed by examining the kernel unpacker data formats, which show a transition from 0 to 1 (representing fp32 to fp16):
#pragma once
constexpr std::int32_t unpack_src_format[32] = {
0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
};
constexpr std::int32_t unpack_dst_format[32] = {
1,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,1,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
};
Steps to Reproduce the Issue
To reproduce the issue, we can use the following example test:
from loguru import logger
import pytest
import torch
import ttnn


def test_tilize_with_padding_for_1D():
    torch.manual_seed(2005)
    shape = (64, 128)
    # Fill the tensor with a value below fp16's smallest normal number (~6.1e-05).
    input_a = torch.full(shape, 1.9908e-05, dtype=torch.float32)
    device = ttnn.open_device(device_id=0)
    # Move the fp32 tensor to the device in row-major layout, then tilize on the device.
    input_tensor = ttnn.from_torch(input_a, device=device, layout=ttnn.ROW_MAJOR_LAYOUT)
    input_tensor = ttnn.to_layout(input_tensor, ttnn.TILE_LAYOUT)
    output_tensor = ttnn.to_torch(input_tensor)
    print(output_tensor)  # every element prints as 0 instead of 1.9908e-05
    ttnn.close_device(device)
When running this test, the output tensor will be printed, and all values will be truncated to 0.
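To turn the reproduction into a failing check rather than a print, one could append an assertion against the original values; this is a minimal sketch of what a lossless round-trip would require, and it does not hold under the current behavior:
# The round-trip should return the original fp32 values, but under the current
# behavior every element comes back as 0, so this assertion fails.
assert torch.allclose(output_tensor, input_a)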
Expected Behavior
A possible solution to this issue is to tilize into TF32 instead of FP16. This would help preserve the range of smaller values and prevent the truncation to 0. However, this is relatively untested in ttnn, as the from_torch function always tilizes on the host.
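Since from_torch always tilizes on the host, one possible workaround (a sketch, not a fix for the device path) is to request TILE_LAYOUT directly in from_torch, so tilization happens on the host in full fp32 before the data reaches the device. This reuses input_a and device from the test above and assumes host-side tilization stays in fp32:
# Workaround sketch: tilize on the host via from_torch(..., layout=ttnn.TILE_LAYOUT)
# instead of calling ttnn.to_layout on the device tensor.
input_tensor = ttnn.from_torch(
    input_a,
    dtype=ttnn.float32,
    layout=ttnn.TILE_LAYOUT,  # host tilization; avoids the device fp32 -> fp16 unpack path
    device=device,
)
output_tensor = ttnn.to_torch(input_tensor)  # values should round-trip as 1.9908e-05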
Environment Information
When reporting or reproducing this issue, please include the following details:
- Operating System: [Insert operating system]
- ttnn Version: [Insert ttnn version]
- Device: [Insert device information]
- Compiler: [Insert compiler information]
Conclusion
In conclusion, the fp32 tilize operation into the fp16 data format results in a loss of range, causing small values to be truncated to 0. This issue can be reproduced using the provided example test. To address this issue, tilizing into TF32 could be a potential solution. However, further testing and validation are required to confirm the effectiveness of this approach. We encourage the community to contribute to the discussion and provide feedback on this issue.
Recommendations
- Upgrade to the latest ttnn version: Ensure that you are using the latest version of ttnn, as newer versions may have addressed this issue.
- Use TF32 for tilizing: Consider tilizing into TF32 instead of FP16 to preserve the range of smaller values.
- Provide feedback and contribute: Share your experiences and provide feedback on this issue. Contributions to the ttnn community are welcome and encouraged.
Q&A
The following questions and answers address common concerns related to the fp32 tilize issue described above.
Q: What is the cause of this issue?
A: The issue is caused by the conversion from fp32 to fp16. When tilizing into fp16, small values fall outside the range fp16 can represent and are truncated to 0.
Q: How can I reproduce this issue?
A: You can reproduce this issue by running the example test provided earlier in this article. The test creates a tensor with small values and performs a tilize operation into the fp16 data format.
Q: What are the implications of this issue?
A: The implications of this issue are that small values may be truncated to 0, which can affect the accuracy and performance of models using the ttnn library.
Q: Is there a solution to this issue?
A: Yes, one possible solution is to tilize into TF32 instead of FP16. This would help preserve the range of smaller values and prevent the truncation to 0.
Q: Why is tilizing into TF32 a good solution?
A: Tilizing into TF32 is a good candidate because TF32 keeps fp32's 8-bit exponent, giving it a much wider dynamic range than FP16. This preserves the magnitude of small values, even though TF32's mantissa precision is comparable to FP16's.
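As a rough illustration of why range, rather than mantissa precision, is the problem, compare the smallest normal magnitudes of the formats. The exponent widths used below are the standard ones (fp16: 5 bits; fp32 and TF32: 8 bits):
# Rough dynamic-range comparison using the smallest positive *normal* magnitudes.
fp16_min_normal = 2.0 ** -14        # ~6.10e-05 (5-bit exponent)
fp32_tf32_min_normal = 2.0 ** -126  # ~1.18e-38 (8-bit exponent, shared by fp32 and TF32)

value = 1.9908e-05
print(value >= fp16_min_normal)       # False: below fp16's normal range
print(value >= fp32_tf32_min_normal)  # True: well within fp32/TF32 range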
Q: Are there any other solutions to this issue?
A: Yes, other possible solutions include:
- Using a different data format: Consider using a different data format that is more suitable for your specific use case, such as bfloat16 (see the sketch after this list).
- Increasing the precision: Consider increasing the precision of your calculations to reduce the impact of truncation.
- Using a different library: Consider using a different library that does not have this issue.
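For example, one alternative format to consider (an assumption on our part, not something validated against this specific bug) is bfloat16, which keeps fp32's 8-bit exponent and therefore its range, at the cost of mantissa precision. This hypothetical variant reuses input_a and device from the reproduction test:
# Hypothetical variant of the reproduction using bfloat16, which shares fp32's 8-bit exponent.
# Small values keep their order of magnitude, though they lose mantissa precision.
input_tensor = ttnn.from_torch(
    input_a,
    dtype=ttnn.bfloat16,
    layout=ttnn.ROW_MAJOR_LAYOUT,
    device=device,
)
input_tensor = ttnn.to_layout(input_tensor, ttnn.TILE_LAYOUT)
output_tensor = ttnn.to_torch(input_tensor)  # expected to be ~1.99e-05, not 0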
Q: How can I contribute to the discussion and provide feedback?
A: You can contribute to the discussion and provide feedback by:
- Sharing your experiences: Share your experiences and feedback on this issue.
- Providing code examples: Provide code examples that demonstrate the issue and potential solutions.
- Participating in the community: Participate in the ttnn community and provide feedback on this issue.
By following the recommendations above and contributing to the discussion, we can work together to resolve this issue and improve the performance and accuracy of models using the ttnn library.