-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove baggage propagation from xray propagator #1178
base: main
Are you sure you want to change the base?
remove baggage propagation from xray propagator #1178
Conversation
Thanks @jamesmoessis raise the issue. Other than baggage length 8192 bytes problem, could you help me understand what is other problem? awsxraypropagator is a short term solution before x-ray fully migrate to w3c, we want to understand the customer impact. |
It's mainly the length problem. Perhaps also another problem is the lack of a proper specification for this header, and disparity between different OpenTelemetry implementations of it as a result. Particularly the issue is that Java AWS SDK v2 instrumentation hardcodes the use of this propagator, but also allows other propagators to be used in the case of SQS messaging. In the case of SQS, I'd prefer my W3C propagator to be handling the baggage, without risk of corruption from the xraypropagator.
The customer impact is that the xray propagator causes 400s to AWS and has the potential to corrupt baggage. I'm excited for when Amazon fully adopt W3C, but until then, the xraypropagator's handling of baggage is harmful, and it's specific to the Java implementation. For the short-term solution, there should be no baggage. |
@wangzlei any updates or thoughts? Are we happy causing 400s and corrupting baggage? It's a blocker for using AWS SDK instrumentation since the propagator is hardcoded into it. |
Not getting much from the code owners here, wonder if maintainers have any thoughts @jack-berg @trask? |
@jamesmoessis I added it to the agenda for the 4/11/24 Java SIG. Hopefully we can figure out how to proceed. |
Thankyou @jack-berg appreciate it. I would come to the SIG but unfortunately it's at 2am for me 😴 I'll keep an eye on the SIG meeting notes too |
c23766d
to
4dde536
Compare
} | ||
// Size is key/value pair, excludes delimiter. | ||
int size = key.length() + entry.getValue().length() + 1; | ||
if (baggageWrittenBytes + size > 256) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The primary incompatibility is that the limit of 256 characters in the Amazon Trace header. This is incompatible with the W3C baggage limit of 8192 bytes. Attempting to truncate the baggage to fit within this limit risks corrupting the entries and should not be attempted.
It looks like the code is only trying to include baggage entires if the entire entry can be included without exceeding 256 bytes.
When you talk about corrupting entries, are you referring to some entries being present while others not? Can you help me understand why this is problematic?
.append(samplingFlag); | ||
|
||
Baggage baggage = Baggage.fromContext(context); | ||
// Truncate baggage to 256 chars per X-Ray spec. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't seem to find a spec document for the xray spec format. If there is a document, and it does talk about including baggage like this, we should likely follow the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not aware of a formal specification as well, but I usually follow this reference: https://docs.aws.amazon.com/xray/latest/devguide/aws-xray.html#Tracing-header:~:text=sampling%20rules.-,Tracing%20header,-All%20requests%20are
It seems that there is no equivalent concept to baggage, at least in the reference above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reference to baggage here, so maybe its right to delete it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refer to the comment, using key "XRAY_HEADER_ADDITIONAL_FIELDS"
.append(samplingFlag); | ||
|
||
Baggage baggage = Baggage.fromContext(context); | ||
// Truncate baggage to 256 chars per X-Ray spec. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about this comment:
Regarding the length of the X-Ray header, I did not find a definition in X-Ray spec. However, based on some clues, it can be inferred that StepFunction likely has a contract with X-Ray, hence there is length checking in the StepFunction client API. So it is reasonable to truncate the baggage length to (256 - the length of the existing trace context).
My read of this is that including up to 256 baggage bytes is a bug that should be fixed, instead of removing baggage encoding altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. #563 is StepFunction receives a xray header over 256. AWS Lambda is working on a new encoding algorithm to reducing the xray header to be less than 256, then we can fix the bug by truncating xray header to be 256, not truncating baggage to 256.
discussed in today's SIG meeting:
|
@@ -241,9 +199,6 @@ private static <C> Context getContextFromHeader( | |||
if (spanContext.isValid()) { | |||
context = context.with(Span.wrap(spanContext)); | |||
} | |||
if (baggage != null) { | |||
context = context.with(baggage.build()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a new key "XRAY_HEADER_ADDITIONAL_FIELDS"
context = context.with(XRAY_HEADER_ADDITIONAL_FIELDS ,baggage.build());
FYI #1178 (comment)
Sorry folks, haven't had time to get back to this but I would like to as soon as I get the time. If someone else wants to pick this up and fix the problem, feel free. Otherwise I will try to find some time. |
Description:
This PR removes baggage propagation from the X-Ray propagator.
Existing Issue(s):
We've seen strong incompatibilities between the OpenTelemetry/W3C baggage specification, and that which X-Ray allows.
The primary incompatibility is that the limit of 256 characters in the Amazon Trace header. This is incompatible with the W3C baggage limit of 8192 bytes. Attempting to truncate the baggage to fit within this limit risks corrupting the entries and should not be attempted.
We've seen issues with baggage propagation in X-Ray, for example here: aws-observability/aws-otel-java-instrumentation#563
Additionally, other language's xray propagator do not attempt to attach baggage, for example:
Testing:
Removed tests related to baggage.