-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pickup latest goal state on tenant certificate rotation + Avoid infin… #3273
Conversation
…ite loop when the tenant certificate is missing
@@ -61,14 +61,6 @@ class GoalStateProperties(object): | |||
All = RoleConfig | HostingEnv | SharedConfig | ExtensionsGoalState | Certificates | RemoteAccessInfo | |||
|
|||
|
|||
class GoalStateInconsistentError(ProtocolError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of checking the tenant cert at the GoalState level and raising an exception when there is a mismatch, now the callers of GoalState that are actually interested in this situation do the check and handle the error. The caller, in this case the main loop of the extension handler, can do a better job handling and reporting the error.
message = u"Fetching the goal state is still failing: {0}".format(textutil.format_exception(e)) | ||
logger.error(message) | ||
add_event(op=WALAEventOperation.FetchGoalState, is_success=False, message=message, log_event=False) | ||
# Report one single periodic error every 6 hours |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while working on these fixes, I noticed that these messages were being logged every 6 secs, rather than every 6 hours as was intended.
@@ -381,20 +380,6 @@ def http_get_handler(url, *_, **__): | |||
self.assertEqual(initial_timestamp, goal_state.extensions_goal_state.created_on_timestamp, "The timestamp of the updated goal state is incorrect") | |||
self.assertTrue(goal_state.extensions_goal_state.is_outdated, "The updated goal state should be marked as outdated") | |||
|
|||
def test_it_should_raise_when_the_tenant_certificate_is_missing(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed anymore
@@ -444,45 +429,6 @@ def test_it_should_download_certs_on_a_new_fabric_goal_state(self): | |||
goal_state.update() | |||
self.assertTrue(os.path.isfile(crt_path)) | |||
|
|||
def test_it_should_refresh_the_goal_state_when_it_is_inconsistent(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to test_update
@@ -506,39 +452,6 @@ def http_get_handler(url, *_, **__): | |||
self.assertEqual(0, len(goal_state.certs.summary), "Cert list should be empty") | |||
self.assertEqual(1, http_get_handler.certificate_requests, "There should have been exactly 1 requests for the goal state certificates") | |||
|
|||
def test_it_should_refresh_the_goal_state_when_it_is_fails_to_decrypt_cert(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test was not needed to begin with
if vm_settings is not None: | ||
most_recent = vm_settings if vm_settings.created_on_timestamp > extensions_config.created_on_timestamp else extensions_config | ||
else: | ||
most_recent = extensions_config | ||
elif goal_state_updated: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this elif condition the same as the first if condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks; with this change the elif is not needed (in fact, it is dead code.) Removing it.
azurelinuxagent/ga/update.py
Outdated
@@ -491,45 +496,93 @@ def _check_threads_running(self, all_thread_handlers): | |||
logger.warn("{0} thread died, restarting".format(thread_handler.get_thread_name())) | |||
thread_handler.start() | |||
|
|||
errors = 0 | |||
|
|||
def _try_update_goal_state(self, protocol): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only hit this logic when updating the goal state. Do we also need it when we initialize the goal state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, _initialize_goal_state also calls this method
azurelinuxagent/ga/update.py
Outdated
@@ -491,45 +496,93 @@ def _check_threads_running(self, all_thread_handlers): | |||
logger.warn("{0} thread died, restarting".format(thread_handler.get_thread_name())) | |||
thread_handler.start() | |||
|
|||
errors = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks; left-over debugging code. Removing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if change pushed, it's still there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks; updated
azurelinuxagent/ga/update.py
Outdated
@@ -491,45 +496,93 @@ def _check_threads_running(self, all_thread_handlers): | |||
logger.warn("{0} thread died, restarting".format(thread_handler.get_thread_name())) | |||
thread_handler.start() | |||
|
|||
errors = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if change pushed, it's still there
Fixes for a couple of bugs on the goal state that surfaced on recent incidents.
First, if there is a problem with the tenant certificate, UpdateHandler._initialize_goal_state() can go into an infinite loop. Second, during rotation of the tenant certificate, the Agent needs to pick up the latest extensions goal state.
For both issues, comments in the code provide more details.
The PR also includes a fix for a bug logging in UpdateHandler._try_update_goal_state().