Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add common guidance on recording errors on spans and metrics, clarify DB conventions #1716

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

lmolkova
Copy link
Contributor

@lmolkova lmolkova commented Dec 28, 2024

Fixes #1536

  • Document HTTP + DB approach for span status + description as generic for spans
  • Document error.type as generic error attribute on spans and metrics
  • Specify how to classify db.response.status_codes as failures
  • Update recommendation to record exceptions as a span events or log records

Important

We can and should implement strategy of reporting exceptions on local root spans in OTel SDK - it should not be an instrumentation concern.

Here's an OTEP on how to record exceptions on logs - it discusses the details of configurable exception recording strategy.
Since public facing log API is in development, we'll keep using span events for the time being.

Merge requirement checklist

@lmolkova lmolkova requested review from a team as code owners December 28, 2024 18:27
@lmolkova lmolkova requested a review from Copilot December 28, 2024 18:28

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 5 out of 20 changed files in this pull request and generated no comments.

Files not reviewed (15)
  • docs/database/redis.md: Evaluated as low risk
  • docs/database/cassandra.md: Evaluated as low risk
  • docs/database/cosmosdb.md: Evaluated as low risk
  • docs/rpc/rpc-spans.md: Evaluated as low risk
  • docs/messaging/messaging-spans.md: Evaluated as low risk
  • docs/cli/cli-spans.md: Evaluated as low risk
  • docs/http/http-spans.md: Evaluated as low risk
  • docs/general/recording-errors.md: Evaluated as low risk
  • docs/gen-ai/gen-ai-spans.md: Evaluated as low risk
  • docs/faas/faas-spans.md: Evaluated as low risk
  • docs/exceptions/README.md: Evaluated as low risk
  • docs/database/elasticsearch.md: Evaluated as low risk
  • docs/database/database-spans.md: Evaluated as low risk
  • docs/database/mongodb.md: Evaluated as low risk
  • docs/database/couchdb.md: Evaluated as low risk
@lmolkova lmolkova force-pushed the generic-recording-errors branch from 4aa1c75 to 4c1fdc3 Compare January 3, 2025 01:11
@lmolkova lmolkova force-pushed the generic-recording-errors branch from 4c1fdc3 to 456f377 Compare January 9, 2025 01:40

**Status**: [Development][DocumentStatus]

When instrumented operation fails with an exception, instrumentation SHOULD record
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When instrumented operation fails with an exception, instrumentation SHOULD record
When an instrumented operation fails with an exception, instrumentation SHOULD record

Refer to the [Recording errors](/docs/general/recording-errors.md) document for additional
details on how to record errors across different signals.

It's RECOMMENDED to use `Span.recordException` API or logging library API that takes exception instance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It's RECOMMENDED to use `Span.recordException` API or logging library API that takes exception instance
It's RECOMMENDED to use the `Span.recordException` API or logging library API that takes exception instance

It's NOT RECOMMENDED to record the same exception more than once.
It's NOT RECOMMENDED to record exceptions that are handled by the instrumented library.

For example, in this code-snippet, `ResourceNotFoundException` is handled and corresponding
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
For example, in this code-snippet, `ResourceNotFoundException` is handled and corresponding
For example, in this code-snippet, `ResourceAlreadyExistsException` is handled and the corresponding

Comment on lines +39 to +40
native instrumentation should not record it. Other exceptions, that are propagated
to the caller, should be recorded (or logged) once.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
native instrumentation should not record it. Other exceptions, that are propagated
to the caller, should be recorded (or logged) once.
native instrumentation should not record it. Exceptions which are propagated
to the caller should be recorded (or logged) once.


**Status**: [Development][DocumentStatus].

This document provides recommendations to semantic conventions and instrumentation authors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This document provides recommendations to semantic conventions and instrumentation authors
This document provides recommendations to semantic convention and instrumentation authors


## What constitutes an error

Operation SHOULD be considered as failed if any of the following is true:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Operation SHOULD be considered as failed if any of the following is true:
An operation SHOULD be considered as failed if any of the following is true:

> Instrumentations that have additional context about a specific request MAY use
> this context to set the span status more precisely.

Errors that were retried or handled allowing operation to complete gracefully SHOULD NOT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Errors that were retried or handled allowing operation to complete gracefully SHOULD NOT
Errors that were retried or handled (allowing an operation to complete gracefully) SHOULD NOT

@@ -0,0 +1,4 @@
change_type: enhancement
component: docs, db
note: Add common guidance on recording errors on spans and metrics, clarify DB conventions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
note: Add common guidance on recording errors on spans and metrics, clarify DB conventions.
note: Add common guidance for recording errors on spans and metrics, clarify DB conventions.

Comment on lines +120 to +122
Refer to the [Recording Errors](/docs/general/recording-errors.md) document for
general considerations on how to record span status.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe?

Suggested change
Refer to the [Recording Errors](/docs/general/recording-errors.md) document for
general considerations on how to record span status.
**Status**: [Development][DocumentStatus] - Refer to the [Recording Errors](/docs/general/recording-errors.md) document for
general considerations on how to record span status.

@@ -11,33 +11,6 @@ exceptions associated with spans.

<!-- toc -->

- [Recording an Exception](#recording-an-exception)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
3 participants