A few use cases could benefit from exposing some packetization-level API to RTCRtpScriptTransform. #50 could be solved if the script knew the max size of an audio packet.
SPacket could be solved if:
- the sender transform knew the max size of a video packet and could provide, to the write side of the transform, information on how to split a given encoded frame;
- the receiver transform knew how the frame was assembled from packets.
Looking at libwebrtc, it seems that:
- exposing the max size of an audio packet is feasible, though the computation might be racy (the MTU might change, for instance, between the time JS uses the size and the time the packet is sent; this might already be an issue in today's implementations);
- exposing the max sizes of the first packet, middle packets, and last packet of a video frame is feasible (with the same raciness issue).
These sizes would typically be computed from the MTU, the RTP overhead, and the RTP header overhead, as currently done by packetizers.
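To make the arithmetic concrete, here is a minimal sketch of how such sizes could be derived. The overhead constants and the fixed cost charged to the first packet are illustrative assumptions, not values from any spec or from libwebrtc:

```javascript
// Hypothetical sketch: derive max payload sizes from the MTU and
// per-packet overheads. All constants below are illustrative defaults.
function computeMaxVideoPacketSizes(mtu, {
  ipUdpOverhead = 28,        // IPv4 (20) + UDP (8)
  rtpHeaderSize = 12,        // fixed RTP header, no CSRCs
  headerExtensionsSize = 8,  // varies per packet in practice
  srtpAuthTagSize = 10,      // SRTP authentication tag
} = {}) {
  const maxPayloadSize =
    mtu - ipUdpOverhead - rtpHeaderSize - headerExtensionsSize - srtpAuthTagSize;
  return {
    maxPayloadSize,
    // The first packet of a frame may carry an extra payload header
    // (e.g. an aggregation header); modeled here as a fixed 2-byte cost.
    maxFirstPayloadSize: maxPayloadSize - 2,
    maxLastPayloadSize: maxPayloadSize,
  };
}
```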
Getting or setting the packet-level boundaries of an encoded frame could also be done via an attribute on RTCEncodedVideoFrame.
Here is an example of how this API could look:
```webidl
dictionary MaxVideoPacketSizes {
    required unsigned long maxPayloadSize;
    unsigned long maxFirstPayloadSize;
    unsigned long maxLastPayloadSize;
};

// To compute max packet sizes
partial interface RTCRtpScriptTransformer {
    Promise<unsigned long> computeMaxAudioPacketSize(RTCEncodedAudioFrame frame);
    Promise<MaxVideoPacketSizes> computeMaxVideoPacketSizes(RTCEncodedVideoFrame frame);
};

// To read/write video packet boundaries
partial interface RTCEncodedVideoFrame {
    attribute sequence<unsigned long> payloadSizes;
};
```
@aboba, @alvestrand, is it too simplistic or too libwebrtc-implementation centric? Any idea about the feasibility or the potential race issues?
We would need the following info for exposing the RTP packetization for each frame:
```webidl
dictionary RTCRTPFragment {
    ArrayBuffer prefix;
    ArrayBufferView payload; // the corresponding fragment of the video frame payload
    ArrayBuffer suffix;
};

partial interface RTCEncodedVideoFrame {
    attribute sequence<RTCRTPFragment> payload;
};
```
Note that RTP packetization may require prepending some RTP header data to each packet (NAL headers in H.264, VP8 picture headers) and may skip some data in the payload (the NAL headers in H.264, for example). I am not aware of any packetization that requires appending data at the end, but it may be worth supporting.
Encrypting could be straightforward, as the transform could just put the whole encrypted buffer in the prefix (or suffix) and skip the payload completely.
What is a bit more complex is decryption: because it happens before depacketization, we don't have an assembled video frame, and we would need to pass the raw RTP packets instead (either just the payloads or the full RTP packet info).
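As a sketch of this fragment model, the final per-packet payload would be the concatenation prefix + payload slice + suffix; in the encryption case, the whole ciphertext would sit in `prefix` with an empty `payload`. The helper below is purely illustrative:

```javascript
// Hypothetical sketch: assemble one RTP packet payload from an
// RTCRTPFragment-like object (prefix + payload slice + suffix).
function assemblePacket({ prefix, payload, suffix }) {
  const out = new Uint8Array(prefix.length + payload.length + suffix.length);
  out.set(prefix, 0);
  out.set(payload, prefix.length);
  out.set(suffix, prefix.length + payload.length);
  return out;
}
```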
I like the idea of a prefix, which could also be used to address the A/V/data sync issue. You might want to make clear that the browser is expected to take care of RTP header/extension formatting (e.g. things like setting the Marker bit). The size of header extensions can vary with each packet, so the available payload size needs to be provided for each packet.
Closing this as OBE in this context, given that TPAC 2023 discussions favored developing a separate spec for RTPTransport, which would define a packet level interface as one of its work products.