What should we do about hand input sources where "select" and "squeeze" are confusable? #997

Closed
Manishearth opened this issue Apr 4, 2020 · 18 comments · Fixed by immersive-web/webxr-input-profiles#170

@Manishearth
Contributor

Manishearth commented Apr 4, 2020

Related: immersive-web/webxr-input-profiles#105

On the Hololens 2, any form of fist is considered a squeeze event. An "air tap" (pinch) gesture is a select event. All air taps are also squeeze events.

The idea is that air taps are for faraway interaction, and squeezes are for near interaction. These usually don't overlap, so it's fine.

Unfortunately, in the webxr world, content doesn't necessarily use squeeze events this way. For example, in this threejs demo, the squeeze button (coupled with movements along the y axis) is how you change the brush size. Since all select events are also squeeze events, the brush size keeps changing as you paint, which isn't ideal.

Similarly, in Hubs, I think squeeze and select events both affect faraway objects, but differently (I'll update this with more info when I try this next).

Typical WebXR controllers are able to deal with distinct select and squeeze events, because these are just buttons. Applications seem to be written with this assumption in mind. If this isn't true for hand devices, we have a problem where webxr content will work badly on hand interaction devices.

We have a couple paths forward:

  • Suggest applications detect this case and be robust about it. This can be helped by attaching extra semantics to generic-hand-select-squeeze profiles and expecting frameworks to pick that up. Sadly there is no input profiles negotiation like openxr to make this easier.
  • Suggest Hololens UAs (and others with the same problem) mask off squeeze when select is held. This isn't great since now it's trickier to grasp nearby objects, and some applications may still need to detect this and be robust about it.
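
As a rough illustration of the first option (applications being robust about overlapping events), the sketch below ignores a squeeze on any input source that currently has a select in progress. `OverlapFilter` and its method names are hypothetical and not part of WebXR; the key type `K` stands in for whatever the application uses to identify an input source (for example the `inputSource` taken from the event).

```typescript
// Hypothetical helper (not a WebXR API): remembers which input sources
// currently have "select" held, so overlapping squeezes can be ignored.
class OverlapFilter<K> {
  private selecting = new Set<K>();

  // Call from a "selectstart" handler.
  onSelectStart(source: K): void {
    this.selecting.add(source);
  }

  // Call from a "selectend" handler.
  onSelectEnd(source: K): void {
    this.selecting.delete(source);
  }

  // Returns true when a squeeze on this source can be treated as independent,
  // false when it overlaps an in-progress select and is therefore ambiguous.
  shouldHandleSqueeze(source: K): boolean {
    return !this.selecting.has(source);
  }
}
```

This only helps when the select is reported before the squeeze; if a platform delivers the squeeze first, the application would need additional logic, such as undoing the squeeze action once the select arrives.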

cc @thetuvix @yl-msft @cabanier

@cabanier
Member

cabanier commented Apr 4, 2020

I wish I could help you out but @Artyom17 knows more about this.

@Manishearth
Contributor Author

/agenda we should discuss this if we get a chance

(sorry, forgot to agenda this earlier)

@probot-label probot-label bot added the agenda Request discussion in the next telecon/FTF label Apr 7, 2020
@Manishearth Manishearth added this to the Pre-CR milestone Apr 10, 2020
@toji
Member

toji commented Apr 10, 2020

Discussed this with @Manishearth earlier today. It seems like one of the problems that we're encountering is that, semantically, we'd like the squeeze event to represent "user is grabbing something" but existing experiences are using it basically as just another button. (Like the Three.js sample that uses squeeze to adjust brush size.) This seems to me like a potential argument against the idea proposed in immersive-web/webxr-gamepads-module#23 that hand-based input should have a gamepad, because it then becomes more reliable for the app to distinguish button-like behavior from gesture-like behavior and makes our recommended best practices for apps more concrete: If the behavior you want is button-like you should always look at gamepad signals, and have fallback UX or an error message if it's not available. If the behavior you want is gesture-like (mimicking a grabbing/moving motion) then you should rely on the events and we will normalize hand/controller interaction for you.
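
A minimal sketch of the "button-like" half of this recommendation, assuming the input source exposes a gamepad with the `xr-standard` mapping, where `buttons[1]` is the primary squeeze button. The `*Like` interfaces and `squeezeButtonPressed` are hypothetical names introduced so the example is self-contained; in a browser they would correspond to `XRInputSource.gamepad` and its buttons.

```typescript
// Minimal structural stand-ins for Gamepad / GamepadButton / XRInputSource.
interface ButtonLike { pressed: boolean; value: number; }
interface GamepadLike { mapping: string; buttons: ReadonlyArray<ButtonLike>; }
interface InputSourceLike { gamepad?: GamepadLike | null; }

// In the "xr-standard" mapping, index 1 is the primary squeeze button.
const SQUEEZE_BUTTON_INDEX = 1;

// Returns the button-like squeeze state, or null when no button-like signal
// is available and the app should fall back to other UX or show an error.
function squeezeButtonPressed(source: InputSourceLike): boolean | null {
  const gamepad = source.gamepad;
  if (!gamepad || gamepad.mapping !== "xr-standard") return null;
  const button = gamepad.buttons[SQUEEZE_BUTTON_INDEX];
  return button ? button.pressed : null;
}
```

Under this split, something like the brush-size control would poll `squeezeButtonPressed` each frame, while grab-style interactions would keep listening for the squeeze events.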

@Manishearth
Contributor Author

Yeah, if we can get some coordination with implementors it seems like using onsqueeze and .squeeze based on what the application needs would be nice. This might complicate framework code, though.

@AdaRoseCannon AdaRoseCannon removed the agenda Request discussion in the next telecon/FTF label Apr 14, 2020
@Manishearth
Contributor Author

Manishearth commented Apr 24, 2020

@RaananW @mrdoob what do y'all think about this?

Essentially, it seems like there are two ways of using select/squeeze: you can use select as a way to "select" faraway objects, and squeeze as a way to grab nearby ones. You can also treat either of these as "just another button".

This works fine for devices with physical button gamepads, but for AR devices like the Hololens 2 there's ambiguity -- a select and squeeze action can overlap, and it's up to the application to see if the user is grabbing nearby/faraway objects by seeing which objects are there.

@toji's solution is elegant: onselect and onsqueeze continue to be "select or grab objects", and if you intend to use the button as "just another button", you use source.gamepad.squeeze instead.

Unfortunately, it might not be very straightforward to expose such a dichotomy in frameworks.

@mrdoob

mrdoob commented Apr 25, 2020

I was following the logic that the squeeze event is for grabbing objects.

Say that there is a gun: you pick it up by holding the squeeze button, and while squeezing you can shoot with the select button. With that in mind, I thought using the squeeze event to change the brush size seemed fitting too.

In my opinion, the way Hololens is using it doesn't follow the original intent, and we may need to rethink all this if we expose that to webxr.

@RaananW

RaananW commented Apr 25, 2020

I tried avoiding the integration of those two events in Babylon.js and have only recently introduced them to the framework (explanation here). The main reason was that I find the semantics a bit misleading. As a developer I would expect the select event to be attached to the object being selected. From the session I would expect an ongesture event (for example), leaving the environments to decide what (in this case, hand) gesture correlates to each predefined "digital" gesture. I ended up using the select begin/end to simulate pointer down/up (and select as tap) behavior, to offer the user a single user-input API. As the session is exposed to the devs, they can decide themselves how to use the squeeze event.

I think a lot of developers will "misuse" the squeeze event, because it is the only other input available to them without introducing GUI. Take the paint example: there could have been a tiltbrush-style "hand-palette" that had a slider to change the brush size using the select event, but it was simpler (and honestly more intuitive) to use the squeeze gesture.

I agree with @mrdoob that the way it is done in Hololens doesn't entirely follow the original intent. If they say "pinch is select" and "fist is squeeze" (gesture-based) devs will be able to decide what to do in each case. In this case, hands are just like a motion controller, with two simple triggers. Both squeeze and select events when pinching can be confusing IMO.

I think it all comes down to semantics :-) I agree with @toji , I only find the naming misleading.

@Manishearth
Contributor Author

Manishearth commented Apr 27, 2020

Well, @toji is proposing a third way out, in which the squeeze button is for this kind of thing, whereas the squeeze event is potentially ambiguous and is only for "picking up" objects. This might be a pretty major change to expectations, though.

Another possibility is signalling this stuff through the input profile, and applications can ignore the squeeze event when ambiguity isn't okay.

I've asked the Microsoft folks to chime in.

It might be possible to make pinches select-only, and grabs squeeze-only, however it seems like for hand tracking there's ambiguity there.

This discussion does bring up a wider point: webxr attempts to follow device conventions, what do we do when the device conventions are sufficiently alien that author expectations are subverted?

(In this case, Hololens' expectations are mostly "alien" because so far we've not dealt with many hand-tracking devices, and my understanding is that the other one -- Oculus Quest -- is having similar issues)

@Manishearth
Contributor Author

@thetuvix would you have a moment to provide your perspective here?

@yl-msft

yl-msft commented Jun 4, 2020

After reading through the discussion above, IMHO the squeeze in Hololens hand interaction aligns well enough with the webxr concept and is similar to the squeeze button on the handle of most motion controllers. The user can trigger the action in any hand pose. It's the "select" in Hololens hand interaction that is controversial. This select is mapped to the "air tap" gesture: it must be done with the hand in a palm-down shape, and it triggers the squeeze at the same time, because the air tap is actually a squeeze gesture to grab something in front of your eye. This is where the ambiguity comes from.

If webxr input has an implicit contract that the action must be able to trigger regardless of hand pose and independent to other actions, it's understandable, since that's what most controller buttons would give. Then the Hololens hand interaction "squeeze" can fit into this contract, but "select" cannot.

If the above contract is necessary to maintain, then I suggest mapping Hololens hand interaction to a webxr profile with a single action, such as the "trigger" profile, and mapping squeeze to it.

If it's possible to introduce a new contract or a new input profile, then I suggest moving Hololens hand interaction to a different profile where the two actions are meant to be used differently (one for near-range manipulation and the other for far-range point and command) and where the two actions might sometimes overlap based on where the hand is.

@Manishearth
Contributor Author

If webxr input has an implicit contract that the action must be able to trigger regardless of hand pose and independent to other action

There is no such implicit contract, however it is recognized that the "select" action is more important than "squeeze", which is part of the issue here.

It is possible for us to introduce a new input profile, but that would still mean masking off the onsqueeze event if content keeps breaking because of it.

For example, we could introduce a generic-hand-overlappingsqueeze profile for this behavior, and expose select/squeeze buttons that content can choose to consume, but there would only be a select event. This is somewhat unfortunate because onsqueeze was in part suggested by @thetuvix so we would not have to deal with this.

@De-Panther

This comment has been minimized.

@Manishearth
Contributor Author

Manishearth commented Jul 10, 2020

@De-Panther hi, this set of concerns is not relevant to this issue. This issue is about how we can simultaneously expose select and squeeze on hands in a way that does not break content expecting them to be independent.

@Manishearth
Contributor Author

/agenda to discuss this with stakeholders present

@mrdoob @RaananW @thetuvix @cabanier @Artyom17 Would y'all be able to make it to next week's call (12PM PST Tuesday)? I'd like to discuss this with all of you present. @mrdoob and @RaananW if y'all don't have connection info for the call please email me (manish@mozilla.com) or otherwise contact a group member to share it with you.

A thing I do think is worth noting is that while "hand input sources only expose select" is suboptimal, it's not terrible: code might need to be tweaked to use select for local interaction as well.

@thetuvix @cabanier it would also be nice to know if it's at all possible for the platform to expose distinct select/squeeze gestures. I know this might go against platform conventions, but it's worth exploring.

@AdaRoseCannon AdaRoseCannon added the agenda Request discussion in the next telecon/FTF label Jul 13, 2020
@Manishearth
Contributor Author

Discussed in the call today. The general consensus was to add a "generic-hand-select" (generic-hand-trigger? there's no actual trigger) profile and use it in the input profiles repo.

We can add a further opt-in feature called hand-overlappingsqueeze (?) that enables the generic-hand-select-overlappingsqueeze profile which has an overlapping squeeze event. Content that's aware of this can handle it.

Another way of doing it is to expose a generic-hand-overlappingsqueeze that exposes the squeeze as a secondary button.
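
To illustrate how content might consume these profiles, here is a hedged sketch that inspects an input source's `profiles` list. `generic-hand-select` is the profile proposed above; `generic-hand-select-overlappingsqueeze` is the tentative name from this comment and may not match what is finally registered. `classifySqueeze` and `SqueezeKind` are hypothetical names, and treating every other profile as having an independent squeeze is an assumption made for the example.

```typescript
// Hypothetical classification of how squeeze behaves for an input source.
type SqueezeKind = "none" | "overlapping" | "independent";

// `profiles` is the list of profile strings reported for an input source.
function classifySqueeze(profiles: ReadonlyArray<string>): SqueezeKind {
  // Tentative profile name from the discussion; may change.
  if (profiles.indexOf("generic-hand-select-overlappingsqueeze") !== -1) {
    return "overlapping";
  }
  // Hands that expose only a select action.
  if (profiles.indexOf("generic-hand-select") !== -1) {
    return "none";
  }
  // Assumption: anything else behaves like a controller with distinct buttons.
  return "independent";
}
```

Content that gets `"overlapping"` could then decide whether to ignore squeeze entirely or to disambiguate it from select itself.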

@RaananW

RaananW commented Jul 20, 2020

Hi @Manishearth ,

Thanks for inviting me to the meeting! I was away for the week so only replying now -

I believe that we are talking about three different levels of support - platform, framework, and developers.

The confusion I had during the call (and still have right now) is the mix between those 3. Select/squeeze on the platform level is decided because the system has details about the objects the user interacts with.

WebXR, IMO, can’t really send select or squeeze as they are, unless they are predefined gestures that the developer can directly use (and know their definition), mainly because WebXR has no information about the scene itself (or the object the user currently interacts with).

It is also true that users of a specific framework expect the gestures they are used to on the platform to work in all experiences, but this will be hard to solve with WebXR itself, unless everyone agrees on gesture standards. If tap (like the Oculus Quest gesture) is select, and fist-squeeze is squeeze on each platform, the developer will be able to use those gestures when needed.

WebXR cannot differentiate between a fist-select and fist-squeeze. The developer / framework is the one responsible for choosing what to do with the current user interaction. If the difference between select and squeeze is the distance to the object, then it doesn't matter whether we send a select or squeeze.

On a framework level, we can allow developers to differentiate between close and far objects. What we need as a framework is to tell the developer what the user gestured - in this case, a generic-hand-select-squeeze (or generic-hand-tap-fist) will be great. This is similar to the generic-hand-overlappingsqueeze you mentioned.
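
As a sketch of what that framework-level decision could look like, the function below classifies an interaction as near or far purely from the distance between the hand and the targeted object. All names here are hypothetical, and the default threshold of 0.15 m is an arbitrary illustrative value, not something taken from any platform.

```typescript
// Simple 3D point; positions are assumed to be in meters in a shared space.
interface Vec3 { x: number; y: number; z: number; }

type Interaction = "near-grab" | "far-select";

// Euclidean distance between two points.
function distance(a: Vec3, b: Vec3): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Decides how to interpret a hand gesture based on how close the hand is to
// the object. nearThresholdMeters is a hypothetical tuning parameter.
function classifyInteraction(
  hand: Vec3,
  target: Vec3,
  nearThresholdMeters: number = 0.15,
): Interaction {
  return distance(hand, target) <= nearThresholdMeters ? "near-grab" : "far-select";
}
```

With this approach the framework needs scene knowledge (the target's position), which is exactly the information WebXR itself does not have.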

@AdaRoseCannon AdaRoseCannon removed the agenda Request discussion in the next telecon/FTF label Jul 20, 2020
@Manishearth
Contributor Author

@RaananW Thanks for the feedback!

On a framework level, we can allow developers to differentiate between close and far objects. What we need as a framework is to tell the developer what the user gestured - in this case, a generic-hand-select-squeeze (or generic-hand-tap-fist) will be great. This is similar to the generic-hand-overlappingsqueeze you mentioned.

Yeah, a crucial issue here is that if we expose an overlapping squeeze as just a "normal squeeze" button it will break content that expects the squeeze button to work like it does in handheld controllers. We need it to be opt-in, either by exposing an overlappingsqueeze profile that maps squeeze to an extra button, or by having an opt-in feature which exposes the overlapping squeeze profile.

@Manishearth
Contributor Author

Opened immersive-web/webxr-input-profiles#170 to add generic-hand-select, and immersive-web/webxr-input-profiles#171 to decide on overlappingsqueeze.

8 participants