fix(device)!: rollback and improve send/recv socket exception #304

Merged: 4 commits merged into main from rollback_socket on Sep 29, 2024

Conversation

Member

@wuwentao wuwentao commented Sep 24, 2024

Change details:

  1. Roll back some socket-related processing.
  2. Handle all socket exceptions in connect and the main loop; every exception MUST be raised to the main loop.
  3. Improve refresh_status to support both the initial protocol probe and a regular refresh.
  4. Improve connect to support both the initial connection (protocol probe) and reconnection.
  5. Add a reconnect option, and skip check_protocol in refresh_status when reconnecting.
  6. If a reconnect fails (for example, the device is powered off), return to the connect loop until the device is back online.
  7. Add more comments in the source code to explain these changes.
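
The flow described in the list above can be sketched roughly as follows. This is a simplified illustration, not the code from midealocal/device.py: FakeDevice, connect_loop, the simulated failures, and the local SocketException definition are hypothetical. Only the names connect, refresh_status, init, reconnect, and check_protocol come from this PR.

```python
class SocketException(Exception):
    """Stand-in for the library's socket error (hypothetical definition)."""


class FakeDevice:
    """Hypothetical device that fails a fixed number of connect attempts."""

    def __init__(self, failures_before_online: int) -> None:
        self._failures_left = failures_before_online
        self.connected = False
        self.calls = []

    def connect(self, init: bool = False, reconnect: bool = False) -> bool:
        # Simulate a powered-off device: fail until it comes back online.
        if self._failures_left > 0:
            self._failures_left -= 1
            self.connected = False
            return False
        self.connected = True
        if init or reconnect:
            # The protocol probe only runs on the initial connection.
            self.refresh_status(check_protocol=init)
        return True

    def refresh_status(self, check_protocol: bool = False) -> None:
        if not self.connected:
            # Errors are raised to the caller instead of being swallowed.
            raise SocketException("socket is closed")
        self.calls.append(f"refresh_status(check_protocol={check_protocol})")


def connect_loop(device: FakeDevice, max_attempts: int = 10) -> int:
    """Retry the initial connection until the device is online."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        if device.connect(init=True):
            break
    return attempts
```

In this sketch the initial connection probes the protocol, while a later reconnect refreshes status without probing again.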

Summary by CodeRabbit

  • New Features

    • Enhanced connection setup with options for initial configuration and reconnection.
    • Improved error handling for connection issues and device capabilities retrieval.
  • Bug Fixes

    • Refined exception handling for various error types to provide clearer logging.
  • Refactor

    • Restructured methods for better error management and connection recovery.

Contributor

coderabbitai bot commented Sep 24, 2024

Walkthrough

The pull request modifies the device.py file, particularly the connect, close_socket, and send_message_v2 methods, to improve connection handling and error management. It introduces init and reconnect parameters for enhanced flexibility, improves exception handling with specific logging, and restructures the control flow for better resilience against connection issues. The _recv_message method has been removed, with its functionality integrated into refresh_status. Overall, these changes aim to enhance the robustness of device communication.

Changes

File: midealocal/device.py
Change Summary:
  • Updated connect method to accept init and reconnect parameters for improved connection handling.
  • Enhanced error handling in authenticate, send_message_v2, and refresh_status methods.
  • Removed _recv_message method; its functionality is now part of refresh_status.
  • Restructured run method to include a connection retry loop.
  • Updated close_socket method to accept init parameter.

Possibly related PRs

Suggested labels

breaking change

Suggested reviewers

  • rokam
  • chemelli74

Poem

🐇 In the meadow, connections bloom,
With every hop, we banish gloom.
A socket here, a protocol there,
Robust and ready, we leap with flair!
Error logs now sing so bright,
In our device, all feels just right! 🌼


@github-actions github-actions bot added the "bug" (Something isn't working) label on Sep 24, 2024

codecov-commenter commented Sep 24, 2024


Codecov Report

Attention: Patch coverage is 6.93069% with 94 lines in your changes missing coverage. Please review.

Project coverage is 40.00%. Comparing base (e0e0633) to head (7a9a58a).

Files with missing lines    Patch %    Lines
midealocal/device.py        6.93%      94 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #304      +/-   ##
==========================================
+ Coverage   39.93%   40.00%   +0.07%     
==========================================
  Files          84       84              
  Lines        7535     7524      -11     
==========================================
+ Hits         3009     3010       +1     
+ Misses       4526     4514      -12     


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Outside diff range and nitpick comments (1)
midealocal/device.py (1)

248-254: Redundant logging before raising SocketException

In the authenticate method, when self._socket is None, you log a debug message and then raise a SocketException. Since the calling method (connect) already logs the exception, this logging might be redundant and could be removed to reduce clutter.

Consider removing the debug log before raising the exception:

 if not self._socket:
-    _LOGGER.debug(
-        "[%s] authenticate failure, device socket is none",
-        self._device_id,
-    )
     # Raise exception to connect loop
     raise SocketException
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between e0e0633 and 20488c7.

Files selected for processing (1)
  • midealocal/device.py (8 hunks)
Additional comments not posted (2)
midealocal/device.py (2)

645-654: Review exception handling for NoSupportedProtocol

In the run method, when a NoSupportedProtocol exception is caught, the loop continues without setting reconnect = True, unlike other exceptions. If this exception indicates a critical issue affecting connectivity, you may need to handle it similarly to other exceptions.

Please verify whether NoSupportedProtocol should trigger a reconnection attempt or if additional handling is required.


572-572: Ensure socket is closed when IP address changes

In the set_ip_address method, after updating the IP address, you call self.close_socket(init=True). Ensure that the socket is properly closed before attempting to reconnect to the new IP address to prevent any lingering connections.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
midealocal/device.py (4)

Line range hint 200-233: Improved connection handling and error management

The changes to the connect method significantly enhance error handling and provide more detailed logging for various exception types. The addition of the init parameter allows for a more comprehensive initial connection setup, including protocol checking and capabilities retrieval.

Consider adding a brief comment explaining the purpose of the init parameter for better code readability:

     def connect(self, init: bool = False) -> bool:
+        """
+        Connect to the device.
+
+        :param init: If True, perform initial setup including protocol check and capabilities retrieval.
+        :return: True if connection is successful, False otherwise.
+        """

284-327: Enhanced error handling and timeout management in send_message_v2

The improvements to send_message_v2 method are commendable. The addition of a socket check, more specific exception handling, and query-specific timeout settings enhance the robustness of the method.

Consider extracting the common logging pattern into a separate method to reduce code duplication:

def _log_send_error(self, error_type: str, error: Exception) -> None:
    _LOGGER.debug(
        "[%s] send_message_v2 %s: %s",
        self._device_id,
        error_type,
        error,
    )

Then, you can use it in the exception handlers:

except TimeoutError as e:
    self._log_send_error("timed out", e)
    raise

Line range hint 360-419: Improved message handling and error management in refresh_status

The changes to the refresh_status method significantly enhance its robustness. The addition of a loop for message reception and parsing ensures that all messages are properly handled. The improved exception handling provides better control over different error scenarios.

Consider adding a maximum retry count to prevent potential infinite loops in case of persistent errors:

MAX_RETRIES = 5
retry_count = 0
while True:
    if retry_count >= MAX_RETRIES:
        _LOGGER.warning("[%s] Max retries reached in refresh_status", self._device_id)
        break
    # ... existing code ...
    retry_count += 1

636-699: Greatly improved run loop with robust error handling and connection management

The changes to the run method significantly enhance its robustness and efficiency. The addition of a connection retry loop with exponential backoff is an excellent practice for handling network issues. The more specific exception handling provides better control over different error scenarios, and the added sleep prevents high CPU usage.

Consider extracting the exponential backoff logic into a separate method for better readability and potential reuse:

def _get_backoff_time(self, retries: int) -> int:
    return min(5 * (2 ** (retries - 1)), 600)

# In the run method:
sleep_time = self._get_backoff_time(connection_retries)

This would make the code more maintainable and easier to adjust if needed.
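
To make the resulting schedule concrete, the formula above can be evaluated on its own. The sketch below rewrites the suggested helper as a plain function (get_backoff_time without self is a simplification for illustration) and assumes retries is counted from 1, as it would be if the counter is incremented before the delay is computed.

```python
def get_backoff_time(retries: int) -> int:
    """Delay in seconds before retry number `retries` (counted from 1)."""
    return min(5 * (2 ** (retries - 1)), 600)


# Delays for the first nine consecutive failures.
delays = [get_backoff_time(n) for n in range(1, 10)]
print(delays)  # [5, 10, 20, 40, 80, 160, 320, 600, 600]
```

The delay doubles from 5 seconds and reaches the 600 second cap on the eighth failure. Note that calling the helper with retries=0 would return 2.5, a float, so the counter should be incremented before the call or the input clamped to at least 1.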

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 20488c7 and 19a2873.

📒 Files selected for processing (1)
  • midealocal/device.py (8 hunks)
🔇 Additional comments not posted (3)
midealocal/device.py (3)

248-253: Improved socket validation in authentication

The addition of a check for self._socket being None before attempting authentication is a good defensive programming practice. Raising a SocketException in this case ensures that the error is properly propagated and can be handled in the calling code.


579-583: Improved socket closure with initialization option

The addition of the init parameter to close_socket method allows for resetting the unsupported protocol list during initialization. This is useful for re-establishing device capabilities after reconnection.

Regarding the previous thread safety concern, it's important to ensure that access to self._unsupported_protocol is thread-safe if multiple threads can interact with the device. Consider using a threading lock when modifying this shared resource:

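#!/bin/bash
# Check for potential thread safety issues
# List every use of the shared attribute (the dot is escaped so it matches literally)
rg -n "self\._unsupported_protocol" midealocal/device.py
# Look for an existing lock; -F treats the pattern as a fixed string, not a regex
rg -nF "threading.Lock(" midealocal/device.py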

If no lock is found, consider adding one to ensure thread safety.
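
If a lock does turn out to be needed, one possible shape is sketched below. ProtocolTracker and its method names are hypothetical and are not part of midealocal; the sketch only shows the general pattern of guarding a shared list with threading.Lock.

```python
import threading


class ProtocolTracker:
    """Hypothetical holder for the unsupported-protocol list."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._unsupported_protocol = []

    def mark_unsupported(self, name: str) -> None:
        # The membership check and the append happen under one lock,
        # so two threads cannot both add the same entry.
        with self._lock:
            if name not in self._unsupported_protocol:
                self._unsupported_protocol.append(name)

    def reset(self) -> None:
        # Similar in spirit to close_socket(init=True) clearing the list.
        with self._lock:
            self._unsupported_protocol = []

    def snapshot(self) -> list:
        # Return a copy so callers never iterate over the shared list.
        with self._lock:
            return list(self._unsupported_protocol)
```

Whether this is necessary depends on whether the list is actually touched from more than one thread, which the search above is meant to establish.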


Line range hint 1-724: Overall significant improvements in error handling and connection management

The changes made to this file have greatly enhanced its robustness and reliability. The improvements in error handling, connection management, and the addition of exponential backoff for retries are particularly noteworthy. These changes should make the device communication more resilient to network issues and unexpected errors.

To further improve the code:

  1. Consider creating a custom logger for this class to reduce the repetition of [%s] and self._device_id in log messages.
  2. Look for opportunities to extract common error handling patterns into separate methods to reduce code duplication.
  3. Consider implementing a state machine pattern to manage the different states of the device (connecting, connected, disconnected, etc.) which could make the code more maintainable and easier to reason about.
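
For the first suggestion, the standard library's logging.LoggerAdapter is one way to attach the device id once instead of repeating it in every call. This is a minimal sketch: DeviceLoggerAdapter, the variable log, and the example id 12345 are made up for illustration.

```python
import logging


class DeviceLoggerAdapter(logging.LoggerAdapter):
    """Prefix every message with the device id stored in `extra`."""

    def process(self, msg, kwargs):
        # LoggerAdapter calls process() before delegating to the wrapped logger.
        return f"[{self.extra['device_id']}] {msg}", kwargs


_LOGGER = logging.getLogger(__name__)
log = DeviceLoggerAdapter(_LOGGER, {"device_id": 12345})

# Lazy %-style arguments still work as with a plain logger.
log.debug("send_message_v2 timed out: %s", "example error")
```

Call sites would then drop the leading "[%s]" and the self._device_id argument.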

To ensure that all error cases are properly handled, check the file for any bare except: clauses that do not name a specific exception type.

If any are found, consider replacing them with more specific exception handling.
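
One way to run such a check, offered as a minimal sketch rather than the reviewer's original script, is to walk the module's syntax tree with the standard ast module. The function name find_bare_excepts is hypothetical.

```python
import ast


def find_bare_excepts(source: str) -> list:
    """Return the line numbers of `except:` clauses that name no exception type."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


sample = (
    "try:\n"
    "    pass\n"
    "except:\n"
    "    pass\n"
    "try:\n"
    "    pass\n"
    "except OSError:\n"
    "    pass\n"
)
print(find_bare_excepts(sample))  # [3]
```

To check the real file, pass in the text of midealocal/device.py, for example via pathlib.Path("midealocal/device.py").read_text().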

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (5)
midealocal/device.py (5)

Line range hint 200-243: Improved connection handling and error management

The changes to the connect method significantly enhance the connection process by introducing different behaviors for initial connections and reconnections. The added error handling for various exceptions improves the robustness of the connection process.

Consider adding more detailed logging for the different connection scenarios (init, reconnect) to aid in debugging. For example:

 if reconnect or init:
+    _LOGGER.debug("[%s] Refreshing status (init: %s, reconnect: %s)", self._device_id, init, reconnect)
     self.refresh_status(check_protocol=init)
 if init:
+    _LOGGER.debug("[%s] Getting capabilities during init", self._device_id)
     self.get_capabilities()

287-339: Enhanced error handling and logging in send_message_v2

The improvements to the send_message_v2 method, including the check for self._socket being None and the more detailed exception handling, significantly enhance the robustness of the message sending process. The additional logging will aid in debugging.

Consider grouping the exception handling blocks together to improve code readability. For example:

 try:
     # ... existing code ...
-except TimeoutError:
-    # ... existing code ...
-except ConnectionResetError as e:
-    # ... existing code ...
-except OSError as e:
-    # ... existing code ...
-except Exception as e:
-    # ... existing code ...
+except (TimeoutError, ConnectionResetError, OSError) as e:
+    _LOGGER.debug(
+        "[%s] send_message_v2 error: %s",
+        self._device_id,
+        str(e),
+    )
+    raise
+except Exception as e:
+    _LOGGER.exception(
+        "[%s] send_message_v2 Unexpected socket error",
+        self._device_id,
+        exc_info=e,
+    )
+    raise

This change would reduce code duplication, but the grouped handler emits one generic message, so the distinct log text for each exception type would be lost.
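
One detail worth keeping in mind when grouping: in Python 3, TimeoutError and ConnectionResetError are both subclasses of OSError, so a tuple that lists all three behaves the same as except OSError alone. Keeping separate handlers, ordered from most specific to least specific, is what allows a different message per type. The classify helper below is a hypothetical illustration of that ordering.

```python
def classify(exc: Exception) -> str:
    """Return which handler catches `exc` when specific handlers come first."""
    try:
        raise exc
    except TimeoutError:
        return "timeout"
    except ConnectionResetError:
        return "reset"
    except OSError:
        # Catches every other OSError subclass, for example BrokenPipeError.
        return "os"
    except Exception:
        return "unexpected"


print(issubclass(TimeoutError, OSError))          # True
print(issubclass(ConnectionResetError, OSError))  # True
print(classify(BrokenPipeError()))                # os
```

If the OSError handler were listed first, the two more specific handlers would never run.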


Line range hint 376-446: Significantly improved refresh_status method

The refresh_status method has been greatly enhanced with more robust error handling, a loop for receiving and parsing messages, and improved handling for protocol checking and unsupported protocols. These changes should make the status refresh process more reliable and resilient to network issues.

Consider extracting the message receiving and parsing loop into a separate method to improve code readability. For example:

def _receive_and_parse_messages(self):
    while True:
        if not self._socket:
            raise SocketException("Device socket is None")
        msg = self._socket.recv(512)
        if len(msg) == 0:
            raise OSError("Empty message received.")
        result = self.parse_message(msg)
        if result == MessageResult.SUCCESS:
            break
        elif result == MessageResult.PADDING:
            continue
        else:
            raise ResponseException

This would make the refresh_status method cleaner and easier to understand.
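
The loop above can be exercised without a real device by swapping in a fake socket. Everything in this sketch is a stand-in defined locally for illustration: FakeSocket, the toy parse_message, the three-member MessageResult enum, and the ResponseException class are assumptions and do not reproduce the library's own definitions.

```python
from enum import Enum, auto


class MessageResult(Enum):
    SUCCESS = auto()
    PADDING = auto()
    ERROR = auto()


class ResponseException(Exception):
    """Stand-in for the library's response error."""


class FakeSocket:
    """Returns queued chunks, then b"" as a closed peer would."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def recv(self, bufsize: int) -> bytes:
        return self._chunks.pop(0) if self._chunks else b""


def parse_message(msg: bytes) -> MessageResult:
    # Toy parser: real parsing is protocol specific.
    if msg == b"pad":
        return MessageResult.PADDING
    if msg == b"ok":
        return MessageResult.SUCCESS
    return MessageResult.ERROR


def receive_until_success(sock) -> int:
    """Read until a message parses successfully; return the number of reads."""
    reads = 0
    while True:
        msg = sock.recv(512)
        reads += 1
        if len(msg) == 0:
            raise OSError("Empty message received.")
        result = parse_message(msg)
        if result == MessageResult.SUCCESS:
            return reads
        if result == MessageResult.PADDING:
            continue
        raise ResponseException


print(receive_until_success(FakeSocket([b"pad", b"pad", b"ok"])))  # 3
```

Padding messages are skipped, an empty read is treated as a closed connection, and any other parse result is raised to the caller.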


Line range hint 663-730: Greatly improved run method with robust error handling

The changes to the run method significantly enhance its robustness and efficiency. The addition of a connection retry loop with exponential backoff is an excellent practice for handling network issues. The improved exception handling and the small sleep time to prevent high CPU usage are also valuable improvements.

Consider extracting the connection retry logic into a separate method to improve code readability. For example:

def _retry_connection(self):
    connection_retries = 0
    while self._socket is None:
        _LOGGER.debug("[%s] Socket is None, try to connect", self._device_id)
        if self.connect(init=True) is False:
            self.close_socket(init=True)
            connection_retries += 1
            sleep_time = min(5 * (2 ** (connection_retries - 1)), 600)
            _LOGGER.warning(
                "[%s] Unable to connect, sleep %s seconds and retry",
                self._device_id,
                sleep_time,
            )
            time.sleep(sleep_time)
    return connection_retries

This would make the run method cleaner and easier to understand.


Line range hint 1-730: Overall significant improvements to device communication

The changes made to midealocal/device.py represent a substantial improvement in the reliability, robustness, and maintainability of the device communication code. Key improvements include:

  1. Enhanced error handling across multiple methods.
  2. Improved connection management with retry mechanisms.
  3. More detailed logging for better debugging.
  4. Refined status refresh and message parsing processes.

These changes should result in a more stable and resilient device communication system.

Consider the following architectural improvements for future iterations:

  1. Implement a state machine to manage device connection states more explicitly.
  2. Use asyncio for non-blocking I/O operations, which could improve overall performance.
  3. Implement a more comprehensive logging strategy, possibly using structured logging for easier analysis.
  4. Consider breaking down this large class into smaller, more focused classes to improve maintainability and testability.

These suggestions could further enhance the scalability and maintainability of the codebase.
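
As a rough idea of what the state machine suggestion could look like, here is a minimal sketch. ConnState, the transition table, and ConnectionStateMachine are all hypothetical; the states are taken from the examples named in the earlier review (connecting, connected, disconnected).

```python
from enum import Enum, auto


class ConnState(Enum):
    DISCONNECTED = auto()
    CONNECTING = auto()
    CONNECTED = auto()


# Allowed transitions; anything else is treated as a programming error.
_TRANSITIONS = {
    ConnState.DISCONNECTED: {ConnState.CONNECTING},
    ConnState.CONNECTING: {ConnState.CONNECTED, ConnState.DISCONNECTED},
    ConnState.CONNECTED: {ConnState.DISCONNECTED},
}


class ConnectionStateMachine:
    """Tracks the connection state and rejects invalid transitions."""

    def __init__(self) -> None:
        self.state = ConnState.DISCONNECTED

    def transition(self, new_state: ConnState) -> None:
        if new_state not in _TRANSITIONS[self.state]:
            raise ValueError(
                f"invalid transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
```

Making the transitions explicit would let the run loop decide what to do next from the current state rather than from a combination of flags and a None check on the socket.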

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f0b25dc and 7a9a58a.

📒 Files selected for processing (1)
  • midealocal/device.py (8 hunks)
🔇 Additional comments (3)
midealocal/device.py (3)

246-257: Improved error handling in authenticate method

The addition of a check for self._socket being None before attempting to send data is a good defensive programming practice. This prevents potential NoneType errors and raises a SocketException to be handled by the calling method.


352-365: Improved logging and refresh timing in build_send

The addition of logging for force refreshing after setting status enhances the visibility of the device's behavior. Updating the _previous_refresh time ensures that the next status refresh will occur at the appropriate interval after sending a set command.


Line range hint 606-621: Improved close_socket method with initialization option

The addition of the init parameter to reset the _unsupported_protocol list during initialization is a good improvement. This allows for re-establishing device capabilities after a reconnection.

Regarding the thread safety concern raised in a previous review, it's important to ensure that access to self._unsupported_protocol is thread-safe if multiple threads can interact with the device. Consider using a threading lock when modifying this shared resource. Here's a script to check for potential thread safety issues:

If the results show that _unsupported_protocol is accessed from multiple methods and threading is used, consider implementing a lock mechanism to ensure thread safety.

Contributor

@rokam rokam left a comment

It looks good to me, and this is another breaking change.

@rokam rokam changed the title from "fix(device): rollback and improve send/recv socket exception" to "fix(device)!: rollback and improve send/recv socket exception" on Sep 27, 2024
@github-actions github-actions bot added the "breaking change" (A change that is not backwards compatible) label on Sep 27, 2024
@rokam rokam merged commit 7083464 into main Sep 29, 2024
12 checks passed
@rokam rokam deleted the rollback_socket branch September 29, 2024 01:38
@rokam rokam mentioned this pull request Sep 29, 2024
Labels
  • breaking change: A change that is not backwards compatible
  • bug: Something isn't working