-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Self produce logs #72
Self produce logs #72
Conversation
Thanks for this @tmcqueen-materials. I agree with your first part choice because I am pretty sure we will use heartbeat a lot and detailed logging only when we have something to debug so keeping them separate seems right. In the second part, I agree with the approach for something centralized and consistent rather than each module or component having its own, context-specific logger (like logging.getLogger(name)). Context specific allows a lot of control but I would think it might miss things and certainly would be more work to keep consistent. Doing something else like creating a decoupled, dedicated warnings handler might separate concerns between warning and logging and allow flexibility to escalate warnings to exceptions in specific cases, but it would also increase complexity by introducing another layer of handling and doesn't seem necessary. In short, I think you've picked the pragmatic, effective solution. If there are more complex needs then thinking about a hierarchical or context-specific design might be worth it, but hopefully Kafka doesn't throw us a lot of problems! At least that's my two cents as the least development savvy person on this thread! |
@davidelbert : thanks. I guess the reason I was considering whether heartbeats+logs should be a single item is that a heartbeat message is really just a special type of log message (namely one that gives a timestamp, and associated status, e.g. number of bytes processed, etc). And OpenMSIStream already includes the capabilities to adjust logging level to ones desired level. But if you all are happy with two separate ones, I am happy. Thanks! |
@davidelbert @Xarthisius : The residual test issues here come from the fact that the CI is using the wrong config files, see #70 , as well as the other S3 access token issue. This should be ready to review. Responding to my question above about encryption of logs and heartbeats, the answer is: they are not. They are sent in the clear. This presents a confidentiality problem. But it also reveals a more serious issue: since the heartbeat producer/consumer python object is not the same as the one carrying actual data, when encountering issues such as confluentinc/librdkafka#4916 , heartbeats can continue even if the actual data carrying connection is broken (this is true with or without encryption). These can be fixed -- would you like that as part of this PR, or as a new PR? |
CircleCI project setup for this repo has
This is already a big PR, I'd vote for deferring that for now. |
@Xarthisius : Thanks. I will defer the encryption and heartbeat functionality fixes to a new PR. Maybe David can fix the CircleCI configuration setting you mention so we can get clean tests before merging this? |
One of the tests related to this PR timeouts and hangs:
|
@Xarthisius : Yes, that is because of the issue you identified in your not-yet-merged #70. If you look at the just previous test (which has one of the test topics deliberately wrong but works without the config file update of this PR), that test passes: https://app.circleci.com/pipelines/github/openmsi/openmsistream/702 . Edited to add: The technical detail: the test that is timing out is I was thinking this could be left as-is as #70 will eventually address the underlying problem. If you prefer, I could revert the last commit of this PR, so that |
We could merge my PR and rebase this one. It should solve it issue. It will require a nasty rebase though. |
@davidelbert : How would you like to proceed with this? The rebase of this one on top of #70 shouldn't be too bad, as they are essentially orthogonal changes, except a couple small places. |
@tmcqueen-materials and @Xarthisius , #70 is merged so you guys are set to rebase I think. |
…rieval added, and tests.
fix, and testing needed.
…rieval added, and tests.
fix, and testing needed.
…fer is increased in size to reasonably capture all log output.
Revert "Test supposition that failing tests is because of missing config file update." This reverts commit 78a073d.
13655c2
to
e7b23cd
Compare
did not apply the changes to the binary KafkaCrypto configs of testing_node_2). So regenerate, and while here, add some documentation of how to do it again in the future.
@davidelbert : This one is ready to merge. Thanks! |
@Xarthisius @davidelbert : This is a WIP set of modifications to OpenMSIStream to enable self-production of own logs. I am submitting it now as a draft to get feedback on the overall approach, and to start getting all the CI style feedback, etc.
The approach taken here consists of two parts:
(1) In the first part, plumbing for separate production of log messages to kafka is added. This is essentially copy&paste of heartbeats, but simpler because individual class overrides are not required. I decided to keep it separate, instead of a single combined heartbeat/logs production, since I can see cases where you would want one and not the other, or vice versa.
(2) In the second part, a single global, thread-safe, implementation of a loggings and warnings handler is plumbed in during initial package import. I am not a fan of the single global instance, but it is the only way to make sure we capture all warnings early on (as it involves single global state, see python warnings documentation), and it simplifies other things we might want to do later (e.g. enabling warnings output to go to the normal OpenMSI logging facilities).
One other thing: how does this all play with encrypted messages? Does anyone know? KafkaCrypto really needs a single KafkaCryptoStore backing object, e.g., while this naively looks like the heartbeat one might trample over the regular message production one?