generateKeyFrame takes a "rid" argument, but is invoked with "rids" #143

alvestrand · 2022-08-30T08:40:00Z

The generateKeyFrame algorithm takes a single RID, but the generateKeyFrame API function invokes it with a sequence of RIDs.

The rtcRtpScriptTransformer invokes it with a single RID.

This should be just a matter of wordsmithing; the obvious decision to be taken is behavior of generateKeyFrame([valid rid, invalid rid]).

fippo · 2022-08-30T08:44:42Z

The algorithm doesn't deal with the lack of rids either.
The IDL might need to specify a default of = [] but that might just be how it needs to be implemented in Chrome.

youennf · 2022-08-30T08:46:32Z

Oh right, I missed it in #132.
Should we align RTCRtpSender.generateKeyFrame to RTCRtpScriptTransformer.generateKeyFrame?

youennf · 2022-09-01T12:07:05Z

#146

alvestrand · 2022-09-09T18:34:47Z

(uplevel from PR comment)
there are arguments for making sender.generateKeyFrame take a list of RIDs. One of them is ensuring all the keyframes are sent at the same timestamp; another is dealing with encoders that can generate multiple keyframes from one encoder.

I'm not sure how valid those are for transformer.generateKeyFrame, however.

solmaks · 2022-09-29T06:38:16Z

(36c965e) removed ability to request keyframes on a sequence of RIDs. This imposes some unnecessary limitations, as now when applications need to request keyframes on N (N>1 && N<MAX_RIDS) RIDs, to guarantee video fluidity for a wide range of receivers, they can no longer do this optimally. With currently proposed API, application willing to request keyframes on 2 RIDs will need to either call into API 2 times (providing RID N and RID M as arguments), or 1 time (providing no arguments). Choosing between the 2 will require applications to know implementation details and limitations, which is not always possible. E.g. if user agent emits keyframes on all RIDs even when asked for a single RID, applications should prefer option 1. If user agent emits keyframes only on specific RIDs, application should prefer option 2. Having applications to implement such heuristic is something we can easily avoid, by offering a more flexible API that ALWAYS takes sequence of RIDs.

youennf · 2022-09-29T08:22:17Z

At last TPAC, the room feeling was that supporting N=1 or N=ALL_ACTIVE was good enough as it covered the envisioned use cases.
@solmaks, I assume you have a use case for N > 1 && N < ALL_ACTIVE.
Can you share it?

alvestrand · 2022-09-29T08:45:25Z

The resolution as recorded in https://www.w3.org/2022/09/13-webrtc-minutes.html#t07 was "go with proposal 3 without returning a timestamp", where proposal 3 was "providing no rid -> Trigger key frames for all rids".

So that would mean making the argument to requestKeyFrame optional.

solmaks · 2022-09-29T10:51:33Z

What I am after is optimal operation in cases where application would like to dynamically maximise video fidelity for every receiver group, by having tight control over the target bitrates of individual RIDs. A practical example:

2+ RIDs are configured for low fidelity encodings, that target low capability receivers, running on poor/lossy networks
1+ RIDs are configured for high fidelity encodings, that target high capability receivers, running on good networks
Target bitrates for all RIDs are dynamically configured based on the ever changing capabilities of receiver groups to maximise video fidelity across all N receivers.

With N growing big, the optimization exercise requires frequent changes both in target bitrates of every RID, but also in the number of receivers specific RID is targeted to serve. Consider a scenario that involves introducing receivers at a high pace (e.g. start of a larger conference). In this scenario there will be a number of receivers capable of consuming high fidelity encodings (due to their estimators having time to ramp up and stabilize), as well as continuous inflow of receivers capable of consuming different variants of low fidelity encodings (these groups will change a lot due to estimators ramping up and stabilizing around some values). Since applications would want to minimise the number of generated key frames on any encoding/RID, they would want to pace/aggregate the changes. In practice this would mean having multiple changes to apply in one go. In this example, most of the changes would target lower fidelity encodings (due to higher instability of these groups). Under these conditions it would be great to have ability not to penalise receivers consuming high fidelity stream(s). Hence the ask to not artificially limit the API surface.

solmaks · 2022-09-29T10:57:25Z

@alvestrand , do I read correctly that TPAC consensus was to:

Remove the timestamp altogether
Make argument optional, and have "ALL" semantic if it is missing
Keep a single RID argument

alvestrand · 2022-09-29T11:15:03Z

@solmaks that is how I read the minutes, and it fits with my memory of the meeting.
If compelling new arguments are raised, this can be revisited; the scenario you mention might be one such.

I do wonder - if the RID groups are fairly independent, is there really a need to synchronize the sending of keyframes?
If not, requestKeyframe(RID) would seem to be sufficient.

fippo · 2022-09-29T12:24:25Z

@alvestrand see my objection to the consensus on the list.

This isn't a new request, see here

some encoders (libvpx?) prefer synchronizing because it allows sharing some work, see here, not prohibiting this seems useful
while PLIs are per-ssrc they may come as part of a compound packet (servers may aggregate even though because of libwebrtc bugs currently few do probably) hence you need some kind of API for those. Aggregating into a single call to the encoder seems useful for this too.

We're essentially talking about

(where "enc" is an abstract encoder on the left, SEA is a "simulcast encoder adapter" which resolves rids or ssrcs to layers and "enc" is a concrete encoder implementation on the right) vs

vs, with some encoder implementation that understands what this simulcast thing is

Please note that there is heavy threadhopping on the left of the first "enc". Leaving how to fulfill the request for keyframes to the encoder implementation seems preferable.

aboba · 2022-09-29T14:33:43Z

There were multiple process problems during the discussion. There were more than 3 proposals in the slide deck, but only the first 3 were presented. Then the "resolution" was determined without polling all the participants in the discussion, so as to allow remote participants to participate.

aboba added the TPAC 2022 For discussion at TPAC label Sep 2, 2022

youennf changed the title ~~Editorial: generateKeyFrame takes a "rid" argument, but is invoked with "rids"~~ generateKeyFrame takes a "rid" argument, but is invoked with "rids" Sep 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

generateKeyFrame takes a "rid" argument, but is invoked with "rids" #143

generateKeyFrame takes a "rid" argument, but is invoked with "rids" #143

alvestrand commented Aug 30, 2022

fippo commented Aug 30, 2022

youennf commented Aug 30, 2022

youennf commented Sep 1, 2022

alvestrand commented Sep 9, 2022

solmaks commented Sep 29, 2022

youennf commented Sep 29, 2022

alvestrand commented Sep 29, 2022

solmaks commented Sep 29, 2022

solmaks commented Sep 29, 2022

alvestrand commented Sep 29, 2022

fippo commented Sep 29, 2022

aboba commented Sep 29, 2022

generateKeyFrame takes a "rid" argument, but is invoked with "rids" #143

generateKeyFrame takes a "rid" argument, but is invoked with "rids" #143

Comments

alvestrand commented Aug 30, 2022

fippo commented Aug 30, 2022

youennf commented Aug 30, 2022

youennf commented Sep 1, 2022

alvestrand commented Sep 9, 2022

solmaks commented Sep 29, 2022

youennf commented Sep 29, 2022

alvestrand commented Sep 29, 2022

solmaks commented Sep 29, 2022

solmaks commented Sep 29, 2022

alvestrand commented Sep 29, 2022

fippo commented Sep 29, 2022

aboba commented Sep 29, 2022