
Float16 type support #26

Closed

BenjaminPoulain opened this issue Sep 7, 2019 · 5 comments

Comments

@BenjaminPoulain

The current spec defines the tensor types "float32", "int32" and "8 bits quantized".

We should consider "float16" for a few reasons:

  • Certain hardware is significantly slower at handling float32 than float16. This includes, but is not limited to, GPUs.
  • Storing intermediate values in float16 can reduce memory footprint and memory bandwidth. This holds even if all operations are performed in float32 in the FPUs.
  • Supporting constant tensors in float16 would allow clients to halve the size of a model's parameters. This can reduce a model's size in memory and reduce network bandwidth when loading the page.

Implementations without native float16 support should be allowed to use float32 internally. This would help maximize performance on float32 hardware while keeping things simple for authors.

@huningxin
Contributor

Thanks for proposing this, @BenjaminPoulain. Adding "float16" support makes sense to me.

@anssiko
Member

anssiko commented Jan 9, 2020

Resolution from https://www.w3.org/2020/01/09-webmachinelearning-minutes.html#x04

Add "float16" to OperandType enum in WebNN API and define a way for the API to respond when float16 is not natively supported

@huningxin feel free to craft a PR for the group to review. Thanks!

@wchao1115
Collaborator

wchao1115 commented Jan 10, 2020

Agreed with @BenjaminPoulain on the stated benefits of supporting float16 in the API. Models with float16 weights and features have become increasingly popular in real-world use cases, especially on the GPU: Nvidia's Tensor Cores (Volta generation and newer), for example, are specifically float16-based. Float16 is also a primary native data type on most DSP-based AI accelerators, e.g. Intel Movidius.

That said, in the absence of native hardware support, internal emulation such as up-casting the data type on the fly could incur a severe performance penalty and sometimes unexpected computation errors. The API should be designed so that it can fail the operation with an "unsupported" error, giving the client an opportunity to fall back gracefully, either by switching to a different target compute device or by using a model variant that simply processes float32 weights and feature sets.

@huningxin
Contributor

> @huningxin feel free to craft a PR for the group to review. Thanks!

Put together #35. @BenjaminPoulain @wchao1115 @anssiko, please take a look. Thanks.

@BenjaminPoulain
Author

Closing. @huningxin resolved the issue in #35.
