Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default buffering of Event Timing Entries is based on durationThreshold: 104 -- even for Interactions. #124

Open
mmocny opened this issue Oct 11, 2022 · 6 comments
Assignees

Comments

@mmocny
Copy link
Contributor

mmocny commented Oct 11, 2022

Buffering of EventTiming entries, which buffers from page load until a PerformanceObserver is registered, is always based on a durationThreshold: 104.

When you register a new PerformanceObserver that asks for events with durationThreshold lower than that (i.e. durationThreshold: 40 to options), there cannot already be any such Entries stored in the buffer. It may be surprising to receive buffered entries >= 104, but none in the [40, 104) range.

A few options:

  • Change spec to buffer certain types of Events with a lower default threshold (i.e. all discrete events, or only those with interactionId)
  • Change spec to support "dynamic buffering". Start by buffering entries with any duration until buffer is full. Once buffer fills, filter out the fastest entries and bump the threshold. Keep doing so until threshold reaches 104ms, after which buffer just stays full.
@yoavweiss
Copy link
Contributor

Dynamic buffering is interesting! The main concern with buffering "too much" is around memory constraints, and can be a good solution to that. It'd require a bit more work on the UA side of things though, and would require that to happen even on pages where there will never be an EventTiming listener.

I'd be curious to hear from RUM collectors if this is a real issue they are bumping up against when collecting EventTiming entries.

@mmocny
Copy link
Contributor Author

mmocny commented Oct 12, 2022

True that buffing, and therefore dynamic buffering, is a cost that is paid on every page even if it ends up not used (by design). Filtering a fixed sized list on simple criteria, only when its full feels like a small constant amount of work, but its not free.

There were some proposals in the past to support a policy via header/meta tags I think to control overriding this behaviour... Anyway, I don't think dynamic buffering is ideal solution here anyway, because Event Timing reports on so many different event types, and the event counts are not evenly distributed (see below).

I'd be curious to hear from RUM collectors if this is a real issue they are bumping up against when collecting EventTiming entries.

@philipwalton the author of web-vitals.js library reported this as a real issue. Users of that library noticed it when trying to use it to measure INP. (The web-vitals library uses 40ms as the default durationThreshold and this issue means there results of measurement are dependant on the timing of when the library loads-- which PO buffering is meant to prevent.)

But yes, I would love more input here to see if this is a real issue, given that we still do have eventCounts for things under threshold.


Try this to show total eventCounts for any page:

console.table([...performance.eventCounts.entries()])

Inject early in page load to measure events >=40ms only:

let c = 0;
new PerformanceObserver(list => {
  for (let entry of list.getEntries()) {
    console.log(c++, entry.duration)
  }
}).observe({ type: 'event', durationThreshold: 40 });

Anecdote: On my desktop using a mouse pointer, after a mere few seconds on a popular news site I saw reported already 492 events which had duration over 40ms (due to loading main thread contention). All those events were continuous i.e. pointerenter/leave events or similar.

Dynamic buffering would help, but maybe having different default threshold for events with interactionId is better?

@philipwalton
Copy link
Member

philipwalton commented Oct 12, 2022

@philipwalton the author of web-vitals.js library reported this as a real issue.

My biggest gripe with the current behavior is that it's unexpected.

From an API design perspective, it's not great to have a configuration object where two of the options don't work together and the only way to discover that is to read the spec :)

I understand that the buffering concerns are important, and I'm not suggesting that browsers should buffer all entries as a solution to this. However, I do think there's a real issue here. And what makes matters worse is the issue is not immediately obvious, so a developer could spend a lot of time building and deploying some monitoring code only to realize much later that they have gaps when looking at the data they collect.

Some potential options to improve this:

  • Throw an error if buffered and durationThreshold are used together. This would mean developers who want both would have to create two POs, but at least they would be aware right away that there's a problem.
  • Remove the durationThreshold option and make it configurable via a meta tag / header (as Michal suggests).
  • Default to buffering all events, but limit the types of events that get buffered by default (or also make that configurable via a meta tag / header).

On that last point, do we know of any sites / RUM providers who are using Event Timing and looking at mouse/pointer events? I wonder if issues related to lots of continuous events would be better handled by a separate API (e.g. something like frame timing).

To comment on some of Michal's suggestions

A few options:

  • Change spec to buffer certain types of Events with a lower default threshold (i.e. all discrete events, or only those with interactionId)
  • Change spec to support "dynamic buffering". Start by buffering entries with any duration until buffer is full. Once buffer fills, filter out the fastest entries and bump the threshold. Keep doing so until threshold reaches 104ms, after which buffer just stays full.

I think both of these would do a lot to decrease the chances that anyone notices (or is negatively impacted by) this issue, but I don't think they address the root problem.

I wonder if we could use one of these suggestions along with something that makes the API less prone to misuse.

@nhelfman
Copy link

Just a small input on this. My feeling is that the "dynamic buffering" proposal can be confusing for developers, even if spec changes have very concrete and detailed definition of the behavior.

@yoavweiss
Copy link
Contributor

Alternatively, we currently buffer up to 150 entries before an observer is registered. If we lower the duration for these entries, we're likely to reach that limit sooner, but maybe that's enough for perf observers to be registered?

@mmocny - would you be able to add use counters for cases where perf observers with a buffered flag were registered after the event timing buffer was full? And then maybe experiment with lowering the threshold and seeing if that count increases for those users?

@mmocny
Copy link
Contributor Author

mmocny commented Oct 19, 2022

Sure, will file an issue.


I will say: my assumption is that this will depend entirely on the hardware used, and the number of continuous events fired.

You can try this on any page you use:

function getTotalEventCounts() {
    return Array.from(performance.eventCounts.values()).reduce((a,b) => a+b);
}
setInterval(() => console.log(getTotalEventCounts()), 1000)

@mmocny mmocny self-assigned this Feb 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants