
[op compatibility] conv2d #28

Closed
nsthorat opened this issue Sep 16, 2019 · 17 comments

@nsthorat

nsthorat commented Sep 16, 2019

This issue will track op compatibility resolution for conv2d.

Signature:
conv2d(x, filter, padding, strides, dilations)

Arguments:
x: 4-dim tensor with logical shape [N, H, W, in_channels]
filter: 4-dim tensor with logical shape [H, W, in_channels, out_channels]
padding: array of ints. Padding for the beginning and ending along each spatial axis.
strides: array of ints that have length 1, 2 or 4. The stride of the sliding window for each dimension of input.
dilations: array of ints of length 2: [dil_height, dil_width]. The dilation factor for each spatial dimension of the input.

Docstring:
Computes a 2-D convolution given 4-D input and filter tensors.

Notes:

  • There should be a single logical tensor layout, leaving transpose graph optimizations up to the browser (this is much more portable because the model author does not have to think about the hardware of the device).
  • The tensor layout should be NHWC because HTMLMediaElements are already in this format, and we want to spare the user from having to insert transposes.
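
For illustration only, here is a minimal sketch of what a call with this signature and NHWC layout might look like; the `nn` builder object and the `input`/`constant` helpers are assumptions made for the sketch, not part of the proposal:

```js
// Hypothetical usage sketch; only conv2d comes from the proposed signature,
// every other name is a placeholder.
// x: NHWC input, e.g. one 224x224 RGB frame.
const x = nn.input('x', {type: 'float32', dimensions: [1, 224, 224, 3]});
// filter: [H, W, in_channels, out_channels] = 3x3 kernels mapping 3 -> 16 channels.
const filterData = new Float32Array(3 * 3 * 3 * 16);
const filter = nn.constant({type: 'float32', dimensions: [3, 3, 3, 16]}, filterData);
// Padding of 1 on both sides of each spatial axis, unit strides and dilations.
const output = nn.conv2d(x, filter, [1, 1, 1, 1], [1, 1], [1, 1]);
// Expected logical output shape: [1, 224, 224, 16].
```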

To be discussed:

  • Compatibility with underlying APIs / hardware
@huningxin
Contributor

conv2d(x, filter, padding, strides)

Is the dilations parameter missing from the signature?

@nsthorat
Author

Fixed, thanks :)

@huningxin
Contributor

huningxin commented Oct 2, 2019

  • Compatibility with underlying APIs / hardware

The following table is based on the WebNN POC native mapping. It captures the mapping to NNAPI, MPS and DirectML and can be used as the basis for a native API compatibility study.

| WebNN | NNAPI | MPS | DML | BNNS | DNNL (MKL-DNN) |
| --- | --- | --- | --- | --- | --- |
| conv2d | ANEURALNETWORKS_CONV_2D | MPSCNNConvolution | DML_OPERATOR_CONVOLUTION | BNNSFilterCreateConvolutionLayer | convolution_forward |
| x: 4-dim tensor with logical shape [N, H, W, in_channels] | input[0]: 4-dim tensor of shape [N, H, W, in_channels] | sourceImage of MPSCNNKernel.encodeToCommandBuffer: 4-dim tensor in MPSImage described by MPSImageDescriptor {numberOfImages, featureChannels, width, height} | InputTensor of DML_CONVOLUTION_OPERATOR_DESC: 4-dim tensor in buffer resource of shape [N, in_channels, H, W] | in_desc: 4-dim tensor of shape [N, in_channels, H, W] | src_desc of convolution_forward::desc: 4-dim tensor of shape [N, in_channels, H, W] |
| filter: 4-dim tensor with logical shape [H, W, in_channels, out_channels] | inputs[1]: 4-dim tensor of shape [out_channels, H, W, in_channels] | weights of MPSCNNConvolutionDataSource: 4-dim tensor of shape [out_channels, H, W, in_channels] | FilterTensor of DML_CONVOLUTION_OPERATOR_DESC: 4-dim tensor of shape [out_channels, in_channels, H, W] | weights of BNNSConvolutionLayerParameters: 4-dim tensor of shape [out_channels, in_channels, H, W] | weights_desc of convolution_forward::desc: 4-dim tensor of shape [out_channels, in_channels, H, W] |
| padding: array of ints. Padding for the beginning and ending along each spatial axis. (explicit padding) | input[3:6]: int, padding to beginning_W, ending_W, beginning_H, ending_H | offset of MPSCNNKernel: MPSOffset {x, y, z}, int, offset (-padding) for beginning_W and beginning_H.<br>Open: padding for ending? | StartPadding of DML_CONVOLUTION_OPERATOR_DESC: unsigned int, padding to the start of the corresponding axis [beginning_H, beginning_W].<br>EndPadding of DML_CONVOLUTION_OPERATOR_DESC: unsigned int, padding to the end of the corresponding axis [ending_H, ending_W] | x_padding of BNNSConvolutionLayerParameters: unsigned int, beginning and ending padding of the width axis.<br>y_padding of BNNSConvolutionLayerParameters: unsigned int, beginning and ending padding of the height axis.<br>Open: only supports the same beginning and ending padding along each spatial axis. | padding_l of convolution_forward::desc: int64, [beginning_H, beginning_W].<br>padding_r of convolution_forward::desc: int64, [ending_H, ending_W] |
| strides: array of ints that have length 1, 2 or 4. The stride of the sliding window for each dimension of input. | input[7:8]: int, stride_W and stride_H.<br>Open: support strides of N and C? | strideInPixelsX and strideInPixelsY of MPSCNNConvolutionDescriptor: unsigned int, stride_W and stride_H.<br>Open: support strides of N and C? | Strides of DML_CONVOLUTION_OPERATOR_DESC: unsigned int, [stride_H, stride_W].<br>Open: support strides of N and C? | x_stride and y_stride of BNNSConvolutionLayerParameters: unsigned int, stride_W and stride_H.<br>Open: support strides of N and C? | strides of convolution_forward::desc: int64, stride_W, stride_H.<br>Open: support strides of N and C? |
| dilations: array of ints of length 2: [dil_height, dil_width]. The dilation factor for each spatial dimension of the input. | input[11:12]: int, dilation_width and dilation_height | dilationRateX and dilationRateY of MPSCNNConvolutionDescriptor: unsigned int, dilation_width and dilation_height | Dilations of DML_CONVOLUTION_OPERATOR_DESC: unsigned int, [dilation_height, dilation_width] | Open: not supported | dilates of convolution_forward::desc: int64, [dilation_width, dilation_height] |

According to the table, most features of the proposed conv2d op can be supported by NNAPI, MPS, DirectML, BNNS and DNNL, with some opens:

  1. dilation is not supported by BNNS.
  2. MPS and BNNS only support the same beginning and ending padding along each spatial axis.
  3. strides for the N and C dimensions are not supported by any of the APIs.
  4. int to unsigned int conversion for strides and dilations is required by MPS, BNNS and DML.
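
On open 4, one possible sketch of how an implementation could guard the signed-to-unsigned conversion (purely illustrative; the helper name is made up):

```js
// Sketch: reject non-positive strides/dilations before handing them to
// MPS/BNNS/DML, which take unsigned values. Not part of any proposal.
function requirePositiveInts(values, name) {
  for (const v of values) {
    if (!Number.isInteger(v) || v < 1) {
      throw new TypeError(`${name} must contain positive integers, got ${v}`);
    }
  }
  return Uint32Array.from(values);  // safe unsigned representation
}

const strides = requirePositiveInts([1, 1], 'strides');
const dilations = requirePositiveInts([1, 1], 'dilations');
```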

Another open question: NNAPI, MPS and DML all support fused bias and activation for conv2d. It looks like there are two options to support them:

  1. add 'bias' and 'activation' into the conv2d op
  2. add separate add and activation ops and leverage the underlying graph optimizer to generate the fused native op
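
As a rough illustration of option 2, the graph could be expressed with separate ops and left to the backend to fuse; the add/relu builder methods here are assumptions for the sketch, not existing API:

```js
// Option 2 sketch: keep conv2d minimal and express bias + activation as
// separate graph ops, relying on the underlying graph optimizer to emit a
// fused native convolution where the backend supports it.
const conv = nn.conv2d(x, filter, [1, 1, 1, 1], [1, 1], [1, 1]);
const bias = nn.constant({type: 'float32', dimensions: [16]}, new Float32Array(16));
const biased = nn.add(conv, bias);   // bias: 1-dim tensor of shape [out_channels]
const out = nn.relu(biased);         // fusion candidate: conv2d + add + relu
```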

Any thoughts?

@BenjaminPoulain

Thanks Ningxin.

Would it be useful to add BNNS and MKL to survey all the possible shapes?

I believe fusing bias and activation should be done by the optimizer. The fused ops are arbitrary. It will make the API cleaner.

@huningxin
Contributor

Would it be useful to add BNNS and MKL to survey all the possible shapes?

Good point, thanks Benjamin. BNNS and MKL-DNN (renamed to DNNL) can support CPU acceleration. It would be useful to survey them. I updated the table with BNNS and DNNL columns. The opens have been updated accordingly. Please take a look.

I believe fusing bias and activation should be done by the optimizer. The fused ops are arbitrary. It will make the API cleaner.

+1

@BenjaminPoulain

dilation is not supported by BNNS.

This should not be a problem.

@anssiko
Member

anssiko commented Oct 15, 2019

I think we're ready to formalize this op and add it into the spec.

Thanks @BenjaminPoulain for Apple's perspective. @nsthorat and @RafaelCintron please chime in if you have further feedback or design considerations from Google and Microsoft.

Unless we hear concerns, I'd ask @huningxin to craft a PR to add conv2d to the spec.

@anssiko
Member

anssiko commented Oct 31, 2019

I'll retract my previous comment on spec readiness. There are still opens that need to be addressed before this op can be formalized in the spec; quoting @huningxin's opens:

  1. dilation is not supported by BNNS.
  2. MPS and BNNS only support the same beginning and ending padding along each spatial axis.
  3. strides for the N and C dimensions are not supported by any of the APIs.
  4. int to unsigned int conversion for strides and dilations is required by MPS, BNNS and DML.

Suggestions welcome on how to overcome these four issues!

@BenjaminPoulain

I'd like to add a few open questions:

  1. Precision: Convolution has the same concerns as [op compatibility] matMul #27 (comment) - this could matter for compatibility.
  2. Input shape: The operation does not need a batch dimension to be valid ([N, H, W, in_channels] -> [H, W, in_channels]); I suspect most usages do not need N. When there are leading dimensions, I don't think they need to be limited to one (e.g. [A, B, ..., H, W, in_channels]); see the sketch after this list. The definition of matMul in [op compatibility] matMul #27 has this kind of flexibility.
  3. Should the dimensions be int or a Number -> ToLength()? (OperandDescriptor's dimension should match ECMAScript's TypedArray dimension #29)
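
A small sketch of the flexibility point 2 describes, i.e. treating the trailing [H, W, in_channels] dims as the convolution input and folding any leading dims into a batch (illustrative only, not a proposed behavior):

```js
// Accept rank >= 3 inputs ending in [..., H, W, in_channels] and fold the
// leading dimensions into N, so [H, W, C] -> [1, H, W, C] and
// [A, B, H, W, C] -> [A*B, H, W, C]. Illustration only.
function normalizeConv2dInputShape(dims) {
  if (dims.length < 3) {
    throw new TypeError('conv2d input needs at least [H, W, in_channels]');
  }
  const batch = dims.slice(0, -3).reduce((a, b) => a * b, 1);
  return [batch, ...dims.slice(-3)];
}
```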

huningxin added a commit to huningxin/webnn that referenced this issue Dec 9, 2019
Per 5 Dec CG meeting [1], move the conv2d compatibility table [2] to markdown.

[1] https://www.w3.org/2019/12/05-webmachinelearning-minutes.html#x02
[2] webmachinelearning#28 (comment)
@huningxin
Contributor

huningxin commented Dec 10, 2019

Per 5 Dec CG meeting, moved the conv2d native API compatibility table to op_compatibility/conv2d.md. Feel free to review and contribute. @RafaelCintron @walrusmcd

@wchao1115
Collaborator

@nsthorat Is supporting grouped convolution out of scope?

Grouped convolution is used in AlexNet; the filter tensor shape for DirectML will be [out_channels / group_count, group_count, in_channels / group_count, H, W] when group_count > 1.
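For example, applying that formula to an AlexNet-style grouped convolution (group_count = 2, 96 input channels, 256 output channels, 5×5 kernels) would give a filter shape of [128, 2, 48, 5, 5].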

And since the API is explicitly called 'conv2d', I'm assuming supporting 3-D volumetric convolution (with 'depth' input dimension) is out of scope.

@anssiko
Member

anssiko commented Jan 24, 2020

@wchao1115 asked:

@nsthorat Is supporting grouped convolution out of scope?

On the 23 Jan 2020 call we agreed to keep grouped convolution a separate op, compatibility to be assessed in its own issue.

@smilejames

Hi Ningxin @huningxin, I really appreciate the work WebNN has done.
Can you add Conv2d of Paddle-Lite to the table?
Conv2d: https://www.paddlepaddle.org.cn/documentation/docs/en/api/layers/conv2d.html

@huningxin
Contributor

Thanks for the info @smilejames

Can you add Conv2d of Paddle-Lite to the table?

I'd like to help. Let me take a look at the Paddle docs and make a PR. I'll ping you for review.

@anssiko
Member

anssiko commented Mar 26, 2020

@huningxin @wchao1115 I believe we can close this issue now. Please check if there are any opens remaining.

@wchao1115
Collaborator

@anssiko Ideally I would prefer that all the '?' entries in the table were definitively answered, e.g. striding/padding support in some APIs, but I would be OK closing this issue as-is. I assume people do look at the current spec and relate it to their respective API advocacy. Perfection is the enemy of progress!

@dontcallmedom
Contributor

closing issue as per discussion above
