Consider exposing interactionId to keypress, input, and other event types #132

Closed
zuoaoyuan opened this issue Feb 16, 2023 · 20 comments

@zuoaoyuan
Contributor

Event Timing has added an interactionId attribute to all entries, but for most entry types its value is always 0. Currently only [pointerdown, pointerup, click, keydown, keyup] entries get non-trivial values. But is this ideal?

I propose exposing interactionId to more entry types, for the following 3 reasons.

Pros:

  1. INP reflects user experience more accurately
    An entry with interactionId 0 could be the real INP candidate, i.e. have the longest duration among all entries from the same interaction. Because its interactionId is 0, it is not counted towards INP, so INP no longer reflects the real user experience.
    Keypress is one example: https://zuoaoyuan.github.io/key-event-demo/ shows that it can sometimes have a longer duration than both keydown and keyup.

  2. IME complications for keydown & keyup
    IMEs behave very differently across platforms. For example, on Windows, a single user input can produce one keydown but two keyups.
    To tackle this problem, during composition Chromium currently exposes interactionId only to input events. This is not ideal, but it is one step closer to calculating INP. We also plan to investigate exposing interactionId to more events during composition, since the input event alone won't capture the longest duration either.

  3. An interactionId of 0 confuses API users
    We expect interactionId to uniquely identify events from the same user interaction. Since every entry is a direct or indirect result of a user input, ideally they should all have valid interactionIds. When some entries have an interactionId of 0, API users cannot tell which interaction they belong to from the interactionId attribute alone, and usually have to deduce it from the entry's timestamp and arrival order instead. This is bad.
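
To illustrate the first point, here is a minimal sketch of how the longest duration per interaction gets picked when entries with interactionId 0 are skipped. The entry values are made up for illustration, and `longestPerInteraction` is a hypothetical helper, not part of the Event Timing API and not how any particular INP library computes the metric.

```javascript
// Hypothetical helper: for each interaction, keep the longest entry duration.
// Entries with interactionId 0 are skipped, as in the observer snippets that
// filter on `entry.interactionId`.
function longestPerInteraction(entries) {
  const longest = new Map();
  for (const { interactionId, duration } of entries) {
    if (!interactionId) continue; // 0 means "not attributed to an interaction"
    const current = longest.get(interactionId) ?? 0;
    longest.set(interactionId, Math.max(current, duration));
  }
  return longest;
}

// Made-up entries for one key press. The keypress entry is the slowest,
// but it carries interactionId 0.
const sampleEntries = [
  { name: 'keydown', interactionId: 101, duration: 24 },
  { name: 'keypress', interactionId: 0, duration: 96 },
  { name: 'keyup', interactionId: 101, duration: 16 },
];

const result = longestPerInteraction(sampleEntries);
console.log(result.get(101)); // 24: the 96ms keypress is not counted
```

With these sample values the interaction is reported as 24ms even though the user waited for the 96ms keypress.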

@Biki-das

is this what i should follow to sort of add new APIs?

document.addEventListener('keypress', function(event) {
  // Capture the interactionId when a keypress event occurs
  var interactionId = getInteractionId();

  // Add the interactionId to the event object
  event.interactionId = interactionId;

  // Call the existing function in the Performance Event Timing API to record the event
  performance.mark('keypress', {interactionId: interactionId});
});




@Biki-das

@zuoaoyuan what should I read before implementing this? There is the PerformanceEventTiming API and an interactionId which currently supports only a few events. Where should I look to learn more, so that I can extend the API to the other events?

@zuoaoyuan
Contributor Author

zuoaoyuan commented Feb 23, 2023

@Biki-das Glad to see you're interested in helping fix this issue.

"is this what i should follow to sort of add new APIs?"

Nice try but not what we need here:)

The interactionId attribute we need to populate here is on performance observer entries (example).

I assume you're from GSoC and looking for projects? If so, as a first step I would suggest copying and pasting the following JS snippet into the console on any website, then interacting with the page and watching the console logs, to get a feel for how the Event Timing API works:)

new PerformanceObserver(list => {
  for (const entry of list.getEntries()) {
    // Comment this out to show ALL entries.
    if (!entry.interactionId) continue;

    console.log(`[INP] duration: ${entry.duration}, type: ${entry.name}`, entry);
  }
}).observe({
  type: 'event',
  durationThreshold: 16, // Minimum supported by the spec.
  buffered: true
});

And the observer API spec can answer questions you have about the API itself.

I don't expect GSoC students to understand more than that for now, but if you would like to learn more:

  • The current W3C spec has a "computing interactionId" section which explains the current logic for computing interactionId.

  • Web.dev/INP is good learning material on INP, the crucial web vital involved in this issue.

  • The Pointer Interaction Doc documents the event timing state machine logic for pointer interactions. The first goal of this GSoC project would be to create a similar document for key interactions.

@Biki-das

Thanks a lot @zuoaoyuan for the quick response. I will dig into this tomorrow and learn more about it. And yes, I came from the GSoC ideas list page; this project is something I can learn a lot from and use to understand browser events in depth.

Looking forward to continuing this with you :-)

@Biki-das

@zuoaoyuan I tried playing with the snippet you shared, and I have a few questions while trying to understand it:

  1. entry.interactionId comes back as undefined when I try to access it, so which events actually have the interactionId property? I am somewhat confused by this one.
  2. As you pointed out in the issue, the interactionId attribute has been added but is only exposed to pointerdown, pointerup, click, keydown and keyup events. So basically our task would be to extend interactionId to the other event types as well?
  3. When I commented out the if (!entry.interactionId) continue; line I could see logs in my console, and some key events such as Tab and Alt also get logged! So are we looking to add interactionId to other key events?

@Biki-das

Biki-das commented Feb 24, 2023

😅 Tried to just do a very tiny demonstration of what the key interactions state machine might look like

+---------------------+
|        IDLE         |
+---------------------+
          | keydown
          v
+---------------------+
|      KEY DOWN       |
+---------------------+
          | keyup
          v
+---------------------+
|       KEY UP        |
+---------------------+
          | keydown
          v
+---------------------+
|     KEY REPEAT      |
+---------------------+

The state machine has four states: IDLE, KEY DOWN, KEY UP, and KEY REPEAT. Initially, the state machine is in the IDLE state. When a keydown event is detected, the state machine transitions to the KEY DOWN state. When a keyup event is detected, the state machine transitions to the KEY UP state. If another keydown event is detected while in the KEY DOWN state, the state machine transitions to the KEY REPEAT state. When a keyup event is detected while in the KEY REPEAT state, the state machine transitions back to the KEY DOWN state.

As you said, the first goal of this GSoC project would be to create a similar document for key interactions. What information should I gather, and what research can I do in the meantime, to prepare for that?

@zuoaoyuan
Contributor Author

@Biki-das Good work! I think the diagram you drew looks slightly different from what your comment described, but I understand the idea from your description. This looks like a nice first draft and we can iterate from it as our next step:

  • From what you've described, |KEY REPEAT| -- keyup --> |KEY DOWN| is one of the transitions. But |KEY DOWN| -- keyup --> |KEY UP| is another valid transition, which lets the state machine accept two keyups back to back. That doesn't seem right.
  • Keep in mind, in case you missed it: the whole idea of creating a state machine diagram is to visualize the logic of assigning the same interactionId to events resulting from the same user interaction. Its purpose is to help us better understand how the current implementation groups events into an interaction. Your diagram makes sense, but I can't easily tell which events share the same interactionId. We should try to make that clearer:)
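
To make the first bullet concrete, the transitions from the written description can be put into a small table and replayed. This is only a sketch of the draft state machine as described in the comment (not the ASCII diagram, and not Chromium's logic); `TRANSITIONS` and `run` are hypothetical names.

```javascript
// Transition table for the draft state machine as described in prose.
const TRANSITIONS = {
  IDLE: { keydown: 'KEY_DOWN' },
  KEY_DOWN: { keyup: 'KEY_UP', keydown: 'KEY_REPEAT' },
  KEY_REPEAT: { keyup: 'KEY_DOWN' },
  KEY_UP: {}, // the description lists no transitions out of KEY UP
};

// Hypothetical helper: returns the list of visited states, or null if an
// event has no transition from the current state.
function run(events, start = 'IDLE') {
  const visited = [start];
  let state = start;
  for (const event of events) {
    const next = TRANSITIONS[state][event];
    if (next === undefined) return null;
    state = next;
    visited.push(state);
  }
  return visited;
}

// Two keydowns followed by two keyups back to back are accepted:
// IDLE, KEY_DOWN, KEY_REPEAT, KEY_DOWN, KEY_UP
console.log(run(['keydown', 'keydown', 'keyup', 'keyup']));
```

The replay shows the issue raised above: the second keyup is accepted, but nothing in the states says which keydown it belongs to.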

Note: Event Timing currently only assigns interactionId to discrete interactions. That means continuous interactions like scroll and drag & drop won't get an interactionId. Key press and hold is similar (KEY REPEAT in your case), but we chop it up and treat it as multiple discrete interactions. You can try the keyboard event viewer to learn how the browser dispatches events for press & hold. Try the key event demo and check the 16ms checkbox to learn how Event Timing assigns interactionId (that's "no." in the demo) to entries resulting from key press & hold. SetKeyIdAndRecordLatency() is the current implementation of how Event Timing assigns interactionId to keyboard-related entries.

Re your questions in the earlier comment:

  1. My bet is that you were testing on a browser which doesn't support interactionId at the moment. Check browser compatibility. If so, try Chrome instead:)
  2. Yes, exactly. Currently other types also have an interactionId attribute, but its value will always be 0. Event Timing has some complications which may make it hard to assign interactionId to ALL other types, but we should at least try to expose it to keypress, and maybe a few more, to fix the known issues around computing INP.
  3. I believe every event comes directly or indirectly from a user interaction, so ideally every event should be assigned a valid interactionId. However, as I mentioned above, there are some existing complications in the Event Timing implementation, for example:
  • grouping problem - how to match different events to the same interactionId
  • id generation timing problem - when is the best time to generate a new interactionId? It can't be too early, as it may turn out to be a continuous interaction which shouldn't get an id, and we don't know that at the beginning. It can't be too late, as events that arrive earlier need to be buffered, which delays notifying event timing observers.
  • backward compatibility - while adding support for new event types, we still need to maintain the old event types on all platforms (different OSes and input devices can dispatch different events, and we should try to support all of them with a single state machine)
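
For answer 1, one quick way to check whether a browser exposes interactionId at all is to look for the attribute on the PerformanceEventTiming prototype. This is a small sketch; `supportsInteractionId` is a hypothetical helper name, and the scope object is passed in so the check can also be exercised outside a browser.

```javascript
// Hypothetical helper: true if the given global scope exposes
// PerformanceEventTiming with an interactionId attribute.
function supportsInteractionId(scope = globalThis) {
  const ctor = scope.PerformanceEventTiming;
  return typeof ctor === 'function' && 'interactionId' in ctor.prototype;
}

// In a browser console, call it with no arguments.
console.log(supportsInteractionId());
```

If this logs false, entry.interactionId will read as undefined in the observer snippet, which matches the behavior reported in question 1.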

Don't be scared at the moment, these investigations are all part of the GSoC project that we'll be going through together:)

@zuoaoyuan
Contributor Author

@Biki-das BTW, Mermaid could be a handy tool for generating state machine diagrams, and it's the tool I was using in those documents as well:)

@Biki-das

Sorry for the late response. I am researching the things you mentioned and making notes on them. Thank you for such helpful feedback on my queries :-)

@Biki-das

@Biki-das BTW, Mermaid could be a handy tool for generating state machine diagrams, and it's the tool I was using in those documents as well:)

OK, I was quite curious how you were drawing them. Got it! Thanks :-)

@Biki-das

[Screenshot: state machine diagram in the Mermaid Live Editor]

Tried to map a small state machine as you asked. TBH I'm quite overwhelmed, as there are lots of things to study to properly understand how we will implement everything.

@Biki-das

Biki-das commented Feb 27, 2023

@zuoaoyuan as you said, our next step would be to document the key event interactions as you have done for the pointer events. How should I move on? TBH, I'm feeling quite overwhelmed with such a load of information.

Found an interesting part in the W3C doc:

Note: the algorithm attempts to assign events to the corresponding interaction IDs. For keyboard events, a keydown triggers a new interaction ID, whereas a keyup has to match its ID with a previous keydown
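
The quoted note can be sketched roughly as follows. This is an illustration under stated assumptions, not the spec algorithm or Chromium's SetKeyIdAndRecordLatency(): ids are a plain counter, pending keydowns are tracked per key code, and `createKeyInteractionTracker` is a hypothetical name.

```javascript
// Sketch only: a keydown starts a new interaction id, and a keyup reuses the
// id of the pending keydown for the same key code. Composition, press & hold
// and other event types are not modeled here.
function createKeyInteractionTracker() {
  let nextId = 1;
  const pending = new Map(); // code -> id of the keydown still waiting for a keyup

  return function assign(type, code) {
    if (type === 'keydown') {
      const id = nextId++;
      pending.set(code, id);
      return id;
    }
    if (type === 'keyup') {
      const id = pending.get(code) ?? 0; // 0 when no keydown matches
      pending.delete(code);
      return id;
    }
    return 0; // other event types get no id in this sketch
  };
}

const assign = createKeyInteractionTracker();
console.log(assign('keydown', 'KeyA')); // 1
console.log(assign('keydown', 'KeyB')); // 2
console.log(assign('keyup', 'KeyA'));   // 1
console.log(assign('keyup', 'KeyB'));   // 2
console.log(assign('keyup', 'KeyC'));   // 0
```

Tracking pending keydowns per key is what lets two overlapping key presses end up as two separate interactions in this sketch.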

What skills would I need to at least dive into this project? It seems like I am missing something.

@zuoaoyuan
Contributor Author

@Biki-das Glad to see you mastered Mermaid so quickly! Great work, and don't overwhelm yourself of course! The diagram can be improved as you learn more and understand better as the program goes on.

Knowing C++ & JS would be the fundamental skills, and experience with creating web pages would be a plus, since it helps with testing the API changes. I don't think you're missing anything; it's just too much to learn in a few days. So don't overload yourself, and give yourself some time to digest:)

How did the code snippet go? Did you get entry.interactionId to print with values?

@Biki-das

@zuoaoyuan appreciate your encouragement. I am really diving deep to learn as much as I can. I played with the code snippet you shared, and I have to say the Event Timing API seems tricky to work with. Let me elaborate on my observations:

  1. Let's start with pointer events. Whenever a pointer interaction is registered, a particular sequence is followed: first pointerdown, then pointerup, then click. Together these make up a complete click. The interactionId is different for each successive click.
  2. Key events occur when I scroll the page with the space bar or the up/down arrow keys, and also whenever I type into an input field. The sequence of keyboard events is somewhat confusing, and I think that is what we are trying to clarify with the state machine diagrams. For example, when I type into an input box, sometimes there are keydowns in succession and a keyup in the middle. The same behavior shows up while scrolling.

These were my observations and I might be wrong about them, so I'm looking forward to your feedback. My first real goal is to understand this snippet well enough to move on to the bigger problems :-)

@zuoaoyuan
Contributor Author

These all look expected to me.

  • Keyboard scroll is a special case of scroll which we do measure.

  • Successive keydowns can happen if the user presses more than one key before releasing any of them. They can also appear if the keyup in between has a duration (rounded to the nearest 8ms) below the 16ms durationThreshold used here, in which case it gets filtered out.
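
The second bullet can be illustrated with a small sketch. The raw durations below are made up, `roundedDuration` and `visibleEntries` are hypothetical helpers, and the "greater than or equal to the threshold" comparison is an assumption about how the filter is applied.

```javascript
// Durations are reported rounded to the nearest 8ms.
function roundedDuration(rawMs) {
  return Math.round(rawMs / 8) * 8;
}

// Hypothetical filter: keep entries whose rounded duration meets the threshold.
function visibleEntries(entries, durationThreshold = 16) {
  return entries
    .map(e => ({ name: e.name, duration: roundedDuration(e.rawMs) }))
    .filter(e => e.duration >= durationThreshold);
}

// Made-up raw durations for typing two keys one after another.
const typed = [
  { name: 'keydown', rawMs: 41 },
  { name: 'keyup', rawMs: 6 },
  { name: 'keydown', rawMs: 30 },
  { name: 'keyup', rawMs: 19 },
];

// keydown, keydown, keyup: the first keyup rounds to 8ms and is dropped.
console.log(visibleEntries(typed).map(e => e.name));
```

So an observer with durationThreshold: 16 can see two keydowns in a row even though the user released the first key before pressing the second.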

@Biki-das

Biki-das commented Mar 3, 2023

It's been a while. What should the next step be? I've played with the API for a while now @zuoaoyuan. Should I start documenting the key interaction state machine?

@zuoaoyuan
Contributor Author

@Biki-das That would be a great next step in my opinion. Its main purpose is to help you get familiar with key events. It can be based either on the current implementation or on your ideal mental model. As you work on it, you'll learn from the specs, get an idea of how keyboard interactions dispatch different events, understand how Event Timing processes key events, and maybe even discover problems with the current implementation and come up with ideas for how we can expose interactionId to keypress and other events. All of the above will also be very helpful for writing a great project proposal:)

@daksh-goyal

Hi @zuoaoyuan, I was researching project ideas suggested by organizations for GSoC 2023 and came across this idea. Thanks to the previous discussion in this thread I was able to get some understanding of it, but there are a few things I'd like to discuss so I can decide whether I'll be able to make meaningful contributions to the project.

Is there a list of specific events that we're currently targeting? I wanted to see if I can come up with a reasonable design of the state diagram for these events to get a better understanding of the required additions/changes to the logic.

From what you've described, |KEY REPEAT| -- keyup --> |KEY DOWN| is one of the transition. But |KEY DOWN| -- keyup --> |KEY UP| is another valid transition, which makes the state machine possible to accept two keyups back to back, that doesn't seem right.

I noticed that for press-and-hold key events, we're breaking the input by time slices into multiple keydown events and finally a keyup event. Also, as you said:

Successive keydowns could possibly happen if user press more than one key before releasing any of them. Or it could also happen if the keyup in between has a duration (rounded to the nearest 8ms) less than the 16ms durationThreshold here, then it get filtered out.

Isn't this true for keyup events as well? Meaning that when I have three keys pressed on the keyboard, I could release two keys simultaneously, which should generate two back-to-back keyup events? This seems like a really big design decision to be made, and changing something might affect existing functionality. While I can see that the Event Timing API is still in its first draft, how difficult would it be to make such a decision at this stage of the API?

I was also skimming through the codebase to understand the already existing logic, and at line #335 it says that we ignore keydown/keyup events during a composition event. Per my understanding, this would lead to all those events not having an interactionId. I suppose we'd be trying to add interactionIds to these events too as part of the project?

Also, the project doc mentioned in the GSOC info card seems to be private. Could this please be made public?

Lastly, a few questions related to logistics:

  1. How many contributors are you planning to have for this particular task?
  2. I suppose we'd be expected to have machines that can serve as the required development environment for the project. I tried to compile the current version of the source code, but since I am running Linux Mint, which has the Snap store disabled, I wasn't able to get past that. Do you have any suggestions for working around this issue?
  3. I suspect it would take pretty long to compile Chromium from scratch on my machine. Do you think this might become a hindrance to my contributing to the project?

Apologies if this was a lot of text in a single comment. Looking forward to your reply! :)

@zuoaoyuan
Contributor Author

@daksh-goyal Thanks for your interest!

The W3C Event Timing spec lists the events that we're currently targeting, except that Chromium also exposes interactionId to input events during composition (which is currently not spec'd, and which we'll fix later).

Isn't this true for having keyup events as well? Meaning that when I have three keys pressed on the board, I could release two keys simultaneously which should generate two back-to-back keyup events?

Yes, and it currently does. See the following screenshot captured from the key event demo.
[Screenshot from the key event demo]

This seems like a really big design decision to be made. And changing something might affect existing functionality. While I could see that the event-timing api is still in its first draft, how difficult would it be to make such a decision at this stage of the API?

Not sure what change you're referring to here. The current implementation already handles multi-key user input, and each keydown/keyup pair is treated as a separate interaction (also demonstrated in the screenshot above).

we ignore keydown/keyup events during a composition event. Per my understanding, this would lead to all those events not having an interactionId.

Indeed. And yes, it would be ideal to expose interactionId to them as part of this project:)

Regarding the project doc, unfortunately it was created with my google.com account, so under company policy I can't directly make it public. Just request access and I'll grant it:)

This project is suitable for only one contributor. High-performance cloud machines will be granted based on previous experience, so hardware will never be a hindrance for GSoC students:)

@zuoaoyuan
Contributor Author

Closing this ticket as completed, since we've considered exposing interactionId and have seen no objections. Also, the discussion in this thread has drifted somewhat.

Creating separate issues to track each specific task item:
