Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release v0.3 #103

Merged
merged 105 commits into from
Nov 18, 2024
Merged

Release v0.3 #103

merged 105 commits into from
Nov 18, 2024

Conversation

rlskoeser
Copy link
Member

@rlskoeser rlskoeser commented Nov 15, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced enhanced contributor recognition with a new CONTRIBUTORS.md section, showcasing individual contributions and a badge.
    • Added support for Extended Date/Time Format (EDTF) with a new Jupyter notebook demonstrating functionality.
    • Implemented a new BaseDateConverter class for flexible date conversions.
    • Added new Date and Timedelta classes for improved date and time handling.
    • Updated the README.md with expanded usage examples and clearer descriptions.
  • Improvements

    • Updated documentation with clearer contribution guidelines and enhanced README for better user understanding.
    • Transitioned to hatchling for package management, improving build and dependency management.
    • Enhanced the testing framework by switching to pytest and updating testing commands.
    • Improved the structure and organization of the documentation for better navigation.
  • Bug Fixes

    • Improved error handling and validation in date parsing logic to ensure robustness.
    • Fixed issues related to the handling of negative years and unspecified digits in date formats.
  • Tests

    • Expanded test coverage for new features and improved existing tests to ensure reliability.
    • Added comprehensive tests for the new Date, Timedelta, and BaseDateConverter classes.

rlskoeser and others added 30 commits June 6, 2024 12:22
Reorganize and improve documentation
Co-authored-by: Julia Damerow <[email protected]>
Co-authored-by: Julia Damerow <[email protected]>
Check notebooks in CI `check` workflow #76
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

coderabbitai bot commented Nov 15, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

The pull request includes comprehensive updates across multiple files, primarily focusing on enhancing the contributor recognition system, updating testing workflows, and restructuring documentation. Key changes involve modifying the .all-contributorsrc file to reflect updated contributor roles and introducing a new badge template. Additionally, workflow files have been updated to use Python 3.12 and implement new tools for code quality checks. Documentation files have been restructured for clarity, and new classes for date handling have been introduced in the codebase.

Changes

File Change Summary
.all-contributorsrc Updated "files" entry from "README.md" to "CONTRIBUTORS.md"; modified contributor entries for Cole Crawford, Robert Casties, and Julia Damerow; added "badgeTemplate" field.
.github/workflows/check.yml Updated Python version to 3.12; changed cache dependency path; replaced black with ruff for code checking; updated documentation build check to use sphinx-build.
.github/workflows/unit_tests.yml Updated Python versions in job matrix; modified installation command for test dependencies; switched from tox to pytest for running tests.
.gitignore Added entry to ignore Jupyter notebook checkpoint files (.ipynb_checkpoints/).
.pre-commit-config.yaml Removed black formatter; added ruff linter and mypy for type checking; included new hook for ruff formatting.
.pythonversion Created new file specifying Python version as 3.12.7.
CHANGELOG.md Updated to reflect new version 0.3 and significant changes, including support for numpy's datetime64.
CONTRIBUTING.md Clarified contribution guidelines, including branch naming conventions and submission processes.
CONTRIBUTORS.md Added a section for All Contributors, including a badge and detailed contributors table.
DEVELOPER_NOTES.md Introduced developer instructions for project setup and contribution.
README.md Restructured content, added logo, clarified library purpose, and enhanced example usage section.
docs/CONTRIBUTING.md Added directive to include content from the main CONTRIBUTING.md.
docs/CONTRIBUTORS.md Added directive to include content from the main CONTRIBUTORS.md.
docs/DEVELOPER_NOTES.md Added directive to include content from the main DEVELOPER_NOTES.md.
docs/LICENSE.md Added directive to include content from the main LICENSE.md.
docs/_static/custom.css Introduced new CSS rules for the sidebar in Sphinx documentation.
docs/_templates/sidebar_dhtech.html Added new HTML section for "powered by" attribution.
docs/conf.py Updated project name, copyright year, HTML theme, and extensions.
docs/index.rst Updated table of contents and reduced maximum depth.
examples/README.md Added section outlining example data and projects using "undate."
examples/notebooks/edtf-support.ipynb Created a notebook demonstrating EDTF support.
examples/notebooks/shxco_partial_date_durations.ipynb Updated logic for duration calculations and output formatting.
pyproject.toml Transitioned to hatchling build system; added project metadata and dependencies.
src/undate/__init__.py Updated imports and __all__ declaration.
src/undate/converters/base.py Introduced BaseDateConverter class for date conversion.
src/undate/converters/edtf/converter.py Added EDTFDateConverter class for handling EDTF formats.
src/undate/date.py Introduced Timedelta and Date classes for enhanced date handling.
tests/test_converters/test_base.py Added unit tests for BaseDateConverter.
tests/test_converters/test_edtf.py Introduced tests for EDTFDateConverter.
tests/test_date.py Added tests for Date and Timedelta classes.
tests/test_undate.py Enhanced tests for Undate and UndateInterval classes.

Possibly related issues

Possibly related PRs

Suggested reviewers

  • ColeDCrawford

Poem

🐰 In the garden where ideas bloom,
Contributors gather, dispelling gloom.
With badges bright and roles defined,
Together we craft, our paths aligned.
In every change, a hop, a cheer,
For undate's journey, we hold dear! 🌼

Warning

Rate limit exceeded

@rlskoeser has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 16 minutes and 12 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between e3c6f65 and 69eca57.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 26

🧹 Outside diff range and nitpick comments (52)
.pre-commit-config.yaml (1)

18-22: Enhance MyPy's type checking configuration

While MyPy is correctly set up, consider enabling stricter type checking for better code quality.

   - id: mypy
+    args: [--strict, --ignore-missing-imports]
     additional_dependencies: [numpy]

Also, consider creating a mypy.ini or updating pyproject.toml with MyPy configurations for more granular control.

.github/workflows/check.yml (3)

26-28: Consider enhancing Ruff configuration

While Ruff is a good choice, consider adding explicit formatting checks and specifying rules.

Consider enhancing the configuration:

      - name: Run ruff
-       run: ruff check
+       run: |
+         ruff check .
+         ruff format --check .

35-36: Consider expanding type checking coverage

The current configuration only checks the src directory. Consider including tests and using an explicit config file.

Consider enhancing the configuration:

      - name: Check types in python src directory; install needed types
-       run: mypy --install-types --non-interactive src
+       run: mypy --install-types --non-interactive src tests

38-40: Enhance notebook testing configuration

While adding notebook testing is great, consider making it more robust with timeouts and explicit directory specification.

Consider enhancing the configuration:

      - name: Check jupyter notebooks with treon
-       run: treon
+       run: treon examples/ --timeout=300
tests/test_converters/edtf/test_edtf_parser.py (2)

15-15: Consider adding boundary test cases for negative years

While the added test cases cover basic negative years and scientific notation, consider adding tests for:

  • Minimum supported year value
  • Transition cases around year 0
  • Large negative years without scientific notation

Also applies to: 17-18


Line range hint 43-54: Enhance test documentation and error coverage

The test file would benefit from:

  1. Adding docstrings explaining the test strategy and EDTF level coverage
  2. Expanding error_cases to include:
    • Invalid date combinations (e.g., "1984-04-31")
    • Malformed scientific notation
    • Invalid qualifiers
    • Incorrect interval formats

Would you like me to help generate additional error test cases and documentation?

.github/workflows/unit_tests.yml (1)

41-41: Consider adding pytest configuration

The switch from tox to direct pytest usage is good for simplicity, but consider adding a pytest configuration file to ensure consistent test behavior:

  • Test discovery patterns
  • Test ordering
  • Coverage settings

Would you like me to help create a pytest.ini or add the configuration to pyproject.toml?

Also applies to: 46-46

tests/test_converters/edtf/test_edtf_transformer.py (1)

36-40: Consider adding more edge cases for unspecified digits

The current test cases provide good coverage for unspecified digits, but consider adding these additional patterns for completeness:

  • Mixed 'X' and digit in day: "1984-12-X5"
  • All possible positions in month: "1984-X2", "1984-2X"
 # level 2 unspecified digits
 ("156X-12-25", Undate("156X", 12, 25)),
 ("XXXX-12-XX", Undate("XXXX", 12, "XX")),
 ("1XXX-XX", Undate("1XXX", "XX")),
 ("1984-1X", Undate(1984, "1X")),
+("1984-12-X5", Undate(1984, 12, "X5")),
+("1984-X2", Undate(1984, "X2")),
+("1984-2X", Undate(1984, "2X")),
.all-contributorsrc (1)

79-80: Consider adding alt text for accessibility

While the badge template is well-configured, consider adding alt text to improve accessibility. The current implementation might not be screen-reader friendly.

-  "badgeTemplate": "![All Contributors](https://img.shields.io/github/all-contributors/dh-tech/undate-python?color=ee8449&style=flat-square)"
+  "badgeTemplate": "![All Contributors - See the full list of project contributors](https://img.shields.io/github/all-contributors/dh-tech/undate-python?color=ee8449&style=flat-square)"
CHANGELOG.md (1)

21-21: Fix typo and consider adding context

There's a typo: "payloud" should be "payload". Consider adding context about the benefits that justify the increased size.

-Performance differences seem to be negligible, but it does increase payloud size.  The virtualenv for installing version 0.2 was 14MB; when installing the newer version with numpy, the virtualenv is 46MB (the numpy folder in site packages is 31MB on its own).
+Performance differences seem to be negligible, but it does increase payload size. The virtualenv for installing version 0.2 was 14MB; when installing the newer version with numpy, the virtualenv is 46MB (the numpy folder in site packages is 31MB on its own). This increase is justified by the extended year range support and improved date handling capabilities provided by numpy.
tests/test_converters/test_base.py (5)

1-7: Remove extra empty line

There's an unnecessary empty line at line 6. One empty line between imports and class definition is sufficient.

 import pytest
 from undate.converters.base import BaseDateConverter
-
 
 class TestBaseDateConverter:

8-18: Enhance test assertions with descriptive messages

Add descriptive messages to assertions to make test failures more informative.

-        assert isinstance(available_converters, dict)
+        assert isinstance(available_converters, dict), "available_converters should return a dictionary"
 
         # NOTE: import _after_ generating available formatters
         # so we can confirm it gets loaded
         from undate.converters.iso8601 import ISO8601DateFormat
 
-        assert ISO8601DateFormat.name in available_converters
-        assert available_converters[ISO8601DateFormat.name] == ISO8601DateFormat
+        assert ISO8601DateFormat.name in available_converters, f"Expected {ISO8601DateFormat.name} in available converters"
+        assert available_converters[ISO8601DateFormat.name] == ISO8601DateFormat, "Converter class mismatch"

24-31: Verify error messages in NotImplementedError tests

The tests should verify the specific error messages to ensure they provide helpful guidance to implementers.

     def test_parse_not_implemented(self):
-        with pytest.raises(NotImplementedError):
+        with pytest.raises(NotImplementedError, match="Subclasses must implement parse method"):
             BaseDateConverter().parse("foo bar baz")
 
     def test_parse_to_string(self):
-        with pytest.raises(NotImplementedError):
+        with pytest.raises(NotImplementedError, match="Subclasses must implement to_string method"):
             BaseDateConverter().to_string(1991)

33-51: Add docstring and strengthen assertions

The test would benefit from a docstring explaining its purpose and more specific assertions.

 def test_import_converters_import_only_once(caplog):
+    """
+    Test that the import_converters method:
+    1. Successfully imports converters on first call
+    2. Uses cache on subsequent calls
+    3. Logs appropriate messages
+    """
     # clear the cache, since any instantiation of an Undate
     # object anywhere in the test suite will populate it
     BaseDateConverter.import_converters.cache_clear()
 
     # run first, and confirm it runs and loads formatters
     with caplog.at_level(logging.DEBUG):
         import_count = BaseDateConverter.import_converters()
     # should import at least one thing (iso8601)
-    assert import_count >= 1
+    assert import_count >= 1, f"Expected at least one converter to be imported, got {import_count}"
     # should have log entry
-    assert "Loading converters" in caplog.text
+    assert "Loading converters" in caplog.text, "Expected 'Loading converters' message in logs"

1-64: Consider adding extensibility tests

The test suite would benefit from examples demonstrating how to properly extend BaseDateConverter. Consider adding a test case that implements a valid custom converter to serve as documentation for other developers.

Would you like me to provide an example test case that demonstrates implementing a custom converter?

examples/use-cases/shakespeare-and-company-project/README.md (3)

9-12: Add hyphens to compound adjectives for clarity.

The content is excellent, but for better readability, compound adjectives should be hyphenated.

-The dataset includes durations for subscriptions and borrow events in human readable format and number of days.
+The dataset includes durations for subscriptions and borrow events in human-readable format and number of days.
🧰 Tools
🪛 LanguageTool

[misspelling] ~11-~11: This word is normally spelled with a hyphen.
Context: ... for subscriptions and borrow events in human readable format and number of days. The [Shakes...

(EN_COMPOUNDS_HUMAN_READABLE)


13-15: Improve readability and conciseness.

The technical explanation is excellent, but consider these improvements:

  1. Add hyphens to compound adjectives
  2. Make the language more concise
-subscriptions in both human readable and machine readable form
+subscriptions in both human-readable and machine-readable form

-Our hope is that in the future, `undate` will make it possible
+In the future, `undate` will make it possible
🧰 Tools
🪛 LanguageTool

[misspelling] ~13-~13: This word is normally spelled with a hyphen.
Context: ...rowing events and subscriptions in both human readable and machine readable form (text and cou...

(EN_COMPOUNDS_HUMAN_READABLE)


[misspelling] ~13-~13: This word is normally spelled with a hyphen.
Context: ...ubscriptions in both human readable and machine readable form (text and count by days). Data a...

(EN_COMPOUNDS_MACHINE_READABLE)


[style] ~15-~15: This phrase might be unnecessarily wordy.
Context: ...koeser/shxco-missingdata-specreading)). Our hope is that in the future, undate will make ...

(MY_HOPE_IS)


22-22: Format DOI as a proper markdown link.

Convert the bare URL to a proper markdown link for better documentation practices.

-https://doi.org/10.34770/nz90-ym25
+[DOI: 10.34770/nz90-ym25](https://doi.org/10.34770/nz90-ym25)
🧰 Tools
🪛 Markdownlint

22-22: null
Bare URL used

(MD034, no-bare-urls)

tests/test_date.py (2)

54-63: Add edge cases to subtraction tests

While the basic subtraction cases are covered, consider adding tests for:

  • Subtracting dates with different precisions
  • Error cases (e.g., subtracting a Date from a Timedelta)
  • Boundary conditions (e.g., dates across year boundaries)
def test_subtract_edge_cases(self):
    # Different precisions
    date1 = Date(2024, 1, 1)
    date2 = Date(2024)
    with pytest.raises(ValueError):
        _ = date1 - date2
    
    # Invalid operation
    with pytest.raises(TypeError):
        _ = ONE_YEAR - date1
    
    # Year boundary
    assert (Date(2024, 1, 1) - Date(2023, 12, 31)).days == 1

1-2: Add pytest import and consider adding type hints

  1. Add explicit pytest import since this is a test file
  2. Consider adding type hints to improve code clarity and enable static type checking
 import numpy as np
+import pytest
 from undate.date import ONE_YEAR, Date, DatePrecision, Timedelta

Example with type hints:

def test_init_year_month(self) -> None:
    d: Date = Date(2010, 5)
    assert isinstance(d, Date)
pyproject.toml (2)

12-12: Consider adding version constraints for dependencies

While the dependencies are correctly specified, it's recommended to add version constraints for lark and numpy to ensure reproducible builds and avoid potential compatibility issues with future releases.

Example:

-dependencies = ["lark", "numpy"]
+dependencies = ["lark>=1.0.0", "numpy>=1.20.0"]

85-85: Fix typo in pytest marker description

There's a typo in the "first" marker description: "efore" should be "before".

-    "first : run marked tests efore all others",
+    "first : run marked tests before all others",
docs/conf.py (1)

88-89: Consider using alternative MyST extensions for links

The comment indicates that the "linkify" extension was not found. Consider using other MyST extensions for handling links, such as "colon_fence" or "dollarmath" if needed.

-# myst_enable_extensions = ["linkify"]  # disabling because not found
+myst_enable_extensions = [
+    "colon_fence",
+    "smartquotes",
+    "replacements"
+]
src/undate/converters/edtf/transformer.py (2)

67-74: Add input validation to prevent potential errors.

While the logic is correct, the method could be more robust:

  1. Add bounds checking for the year value
  2. Add type validation for the year token
  3. Handle potential IndexError when accessing items[1]

Consider this safer implementation:

 def year(self, items):
     # when the year is negative, there are two tokens
     if len(items) > 1 and items[0] == "-":
         # an anonymous token for the - and the integer year
+        if len(items) < 2:
+            raise ValueError("Expected year value after negative sign")
         year = items[1]
+        if not isinstance(year, (int, str)) or not str(year).isdigit():
+            raise ValueError(f"Invalid year value: {year}")
+        year = int(year)
+        if not (-9999 <= -year <= 9999):  # Adjust bounds as needed
+            raise ValueError(f"Year {-year} is out of valid range")
         return Tree(data="year", children=[-year])

+    if not items:
+        raise ValueError("Expected year value")
+    year = items[0]
+    if not isinstance(year, (int, str)) or not str(year).isdigit():
+        raise ValueError(f"Invalid year value: {year}")
+    year = int(year)
+    if not (-9999 <= year <= 9999):  # Adjust bounds as needed
+        raise ValueError(f"Year {year} is out of valid range")
     return Tree(data="year", children=[items[0]])

76-76: Fix indentation inconsistency.

The indentation on line 78 is misaligned with the method definition.

 def year_fivedigitsplus(self, items):
-    # strip off the leading Y and convert to integer
+    # strip off the leading Y and convert to integer

Also applies to: 78-78

tests/test_converters/test_edtf.py (3)

8-18: Consider parameterizing the test cases for better maintainability.

The test cases in test_parse_singledate could be parameterized using @pytest.mark.parametrize to make the tests more maintainable and easier to extend.

Here's a suggested refactor:

-    def test_parse_singledate(self):
-        assert EDTFDateConverter().parse("2002") == Undate(2002)
-        assert EDTFDateConverter().parse("1991-05") == Undate(1991, 5)
-        assert EDTFDateConverter().parse("1991-05-03") == Undate(1991, 5, 3)
-        # unknown dates are not strictly equal, but string comparison should match
-        assert str(EDTFDateConverter().parse("201X")) == str(Undate("201X"))
-        assert str(EDTFDateConverter().parse("2004-XX")) == str(Undate(2004, "XX"))
-        assert str(EDTFDateConverter().parse("XXXX-05-03")) == Undate(
-            month=5, day=3
-        ).format("EDTF")
+    @pytest.mark.parametrize(
+        "input_date,expected",
+        [
+            ("2002", Undate(2002)),
+            ("1991-05", Undate(1991, 5)),
+            ("1991-05-03", Undate(1991, 5, 3)),
+            ("201X", Undate("201X")),
+            ("2004-XX", Undate(2004, "XX")),
+            ("XXXX-05-03", Undate(month=5, day=3)),
+        ],
+    )
+    def test_parse_singledate(self, input_date, expected):
+        if "X" in input_date:
+            assert str(EDTFDateConverter().parse(input_date)) == str(expected)
+        else:
+            assert EDTFDateConverter().parse(input_date) == expected

32-44: Consider adding more edge cases for invalid inputs.

The current test cases cover basic invalid scenarios. Consider adding these additional edge cases:

  • Invalid separators (e.g., "1991/05-03")
  • Invalid characters in year/month/day positions
  • Malformed ranges
  • Leading/trailing whitespace

Here's a suggested addition:

     invalid_inputs = [
         "1991-13",  # invalid month
         "1991-12-32",  # invalid day
         "199A",  # invalid year format
         "",  # empty string
         None,  # None input
+        "1991/05-03",  # invalid separator
+        "199Z-05-03",  # invalid year character
+        "1991-0Z-03",  # invalid month character
+        "1991-05-0Z",  # invalid day character
+        "1991-05-03/",  # malformed range
+        " 1991-05-03",  # leading whitespace
+        "1991-05-03 ",  # trailing whitespace
     ]

70-70: Address the TODO comment about digit replacement.

The TODO comment suggests testing digit replacement functionality, but the requirement is unclear.

Would you like me to help:

  1. Implement the test case for digit replacement?
  2. Open a GitHub issue to track this requirement?
src/undate/converters/base.py (3)

33-39: Consider adding name uniqueness validation

While the docstring states that subclasses must define a unique name, there's no mechanism to enforce this uniqueness. Consider adding validation in the available_converters method to detect duplicate names.

 @classmethod
 def available_converters(cls) -> Dict[str, Type["BaseDateConverter"]]:
     # ensure undate converters are imported
     cls.import_converters()
-    return {c.name: c for c in cls.__subclasses__()}  # type: ignore
+    converters = {}
+    for converter in cls.__subclasses__():
+        if converter.name in converters:
+            raise ValueError(f"Duplicate converter name '{converter.name}' found")
+        converters[converter.name] = converter
+    return converters

78-85: Address unused variables and add error handling

  1. Rename unused variables as suggested by static analysis
  2. Add error handling for import failures to prevent silent failures
-        for importer, modname, ispkg in pkgutil.iter_modules(
+        for _importer, modname, _ispkg in pkgutil.iter_modules(
             converter_path, converter_prefix
         ):
             # import everything except the current file
             if not modname.endswith(".base"):
-                importlib.import_module(modname)
-                import_count += 1
+                try:
+                    importlib.import_module(modname)
+                    import_count += 1
+                except ImportError as e:
+                    logger.error(f"Failed to import converter module {modname}: {e}")
🧰 Tools
🪛 Ruff

78-78: Loop control variable importer not used within loop body

Rename unused importer to _importer

(B007)


78-78: Loop control variable ispkg not used within loop body

Rename unused ispkg to _ispkg

(B007)


88-95: Improve type safety

Consider improving type hints and removing the type ignore comment:

+from typing import TYPE_CHECKING

+if TYPE_CHECKING:
+    from typing import Type
+    from .types import BaseDateConverterType

     @classmethod
-    def available_converters(cls) -> Dict[str, Type["BaseDateConverter"]]:
+    def available_converters(cls) -> Dict[str, BaseDateConverterType]:
         """
         Dictionary of available converters keyed on name.
         """
         # ensure undate converters are imported
         cls.import_converters()
-        return {c.name: c for c in cls.__subclasses__()}  # type: ignore
+        return {c.name: c for c in cls.__subclasses__()}

Add a new file types.py:

from typing import Type, TypeVar
BaseDateConverterType = TypeVar('BaseDateConverterType', bound='BaseDateConverter')
CONTRIBUTING.md (3)

26-29: Fix typo in git flow instructions

There's a redundant phrase in the git flow instructions.

-If you have `git flow` installed, you can start, you can use:
+If you have `git flow` installed, you can use:

14-19: Enhance branching strategy documentation

The git flow explanation is good, but consider adding:

  1. A note about hotfix branches for critical bug fixes
  2. Guidelines for commit message format

Would you like me to propose additional content for these sections?

🧰 Tools
🪛 LanguageTool

[formatting] ~19-~19: Insert a comma after ‘cases’: “In most cases,”?
Context: ...that relates to the changes or feature In most cases branches should be created from the mos...

(IN_MOST_CASES_COMMA)


43-46: Improve grammar in feature request description

The sentence structure could be improved for better clarity.

-If you find a bug or can think a feature you would really like to see being implemented
+If you find a bug or can think of a feature you would really like to see implemented
🧰 Tools
🪛 LanguageTool

[uncategorized] ~44-~44: Possible missing preposition found.
Context: ...equests If you find a bug or can think a feature you would really like to see be...

(AI_EN_LECTOR_MISSING_PREPOSITION)

src/undate/converters/iso8601.py (6)

7-13: Consider formalizing the validation strategy

The comment raises important questions about validation approach. Consider implementing a strict validation mode (using regex) that can be toggled, allowing both permissive and strict parsing based on use case.

Would you like me to propose a validation implementation that supports both modes?


29-30: Handle edge case with ISO datetime strings

The TODO comments indicate missing handling for ISO dates with time components.

Would you like me to propose an implementation for handling ISO datetime strings? We could either:

  1. Strip the time component
  2. Raise a specific exception
  3. Add support for time precision

Line range hint 47-54: Fix type checking issue in _parse_single_date

The type ignore comment masks an underlying type compatibility issue that should be addressed.

-        return Undate(*date_parts)  # type: ignore
+        # Ensure date_parts has exactly 3 elements (year, month, day)
+        while len(date_parts) < 3:
+            date_parts.append(None)
+        return Undate(
+            year=date_parts[0],
+            month=date_parts[1],
+            day=date_parts[2]
+        )
🧰 Tools
🪛 Ruff

79-79: Loop control variable iso_format not used within loop body

Rename unused iso_format to _iso_format

(B007)


64-66: Document ISO8601 range representation limitations

The comment raises valid concerns about ISO8601 compliance for ranges. Consider:

  1. Documenting these limitations in the docstring
  2. Adding a strict mode that raises exceptions for non-compliant formats

77-78: Consider refactoring to use properties

The TODO suggests using year/month/day properties which would make the code cleaner and more maintainable.

Would you like me to propose a refactored implementation using properties?


Line range hint 95-96: Fix type checking in join operation

The type ignore comment masks an underlying type issue that should be addressed.

-        return "-".join(date_parts)  # type: ignore
+        # Filter out None values and convert to strings
+        valid_parts = [str(part) for part in date_parts if part is not None]
+        return "-".join(valid_parts)
README.md (2)

7-10: Consider using platform-agnostic warning format

The current warning uses GitHub-specific alert syntax (> [!WARNING]). Consider using a more universal format to ensure consistent rendering across different platforms:

-> [!WARNING]
-> This is pre-alpha software and is **NOT** feature complete! Use with caution.
-> Currently it only supports parsing and formatting dates in ISO8601 format and
-> some portions of EDTF (Extended Date Time Format).
+⚠️ **WARNING**
+
+This is pre-alpha software and is **NOT** feature complete! Use with caution.
+Currently it only supports parsing and formatting dates in ISO8601 format and
+some portions of EDTF (Extended Date Time Format).

34-45: Add type hints to code examples

Consider adding type hints to make the examples more educational and align with modern Python practices:

-from undate.undate import Undate
+from undate.undate import Undate

-november7 = Undate(2000, 11, 7)
+november7: Undate = Undate(2000, 11, 7)
examples/notebooks/edtf-support.ipynb (2)

63-81: Consider adding error handling test cases.

While the happy path tests are comprehensive, consider adding test cases for:

  • Invalid date strings
  • Out of range values
  • Malformed EDTF strings

Example test cases to add:

# Invalid date
with pytest.raises(ValueError):
    Undate.parse("1985-13-01", "EDTF")  # Invalid month

# Malformed EDTF
with pytest.raises(ValueError):
    Undate.parse("1985/04/12", "EDTF")  # Wrong separator

727-729: Track TODO comment for implementation.

The TODO comment indicates missing functionality for handling minimum and maximum year values.

Would you like me to:

  1. Generate the implementation for MIN_YEAR and MAX_YEAR constants?
  2. Create a GitHub issue to track this enhancement?
examples/notebooks/shxco_partial_date_durations.ipynb (1)

299-300: Consider adding error handling for file loading.

The dataset loading could benefit from error handling to gracefully handle cases where the file doesn't exist or is corrupted.

-events_df = pd.read_csv("../use-cases/shakespeare-and-company-project/SCoData_events_v1.2_2022-01.csv", low_memory=False)
+try:
+    events_df = pd.read_csv("../use-cases/shakespeare-and-company-project/SCoData_events_v1.2_2022-01.csv", low_memory=False)
+except FileNotFoundError:
+    raise FileNotFoundError("Shakespeare and Company dataset file not found. Please ensure it exists in the use-cases directory.")
+except pd.errors.EmptyDataError:
+    raise ValueError("The dataset file is empty.")
+except Exception as e:
+    raise Exception(f"Error loading dataset: {str(e)}")
src/undate/converters/edtf/converter.py (1)

84-84: Address the TODO comments for handling uncertain or approximate dates

There are TODO comments indicating that handling for uncertain or approximate dates needs to be implemented in the sections dealing with years, months, and days. Implementing this will ensure full support for EDTF specifications regarding uncertain or approximate dates.

Would you like assistance in implementing the handling for uncertain or approximate dates? I can help generate the necessary code changes.

Also applies to: 91-91, 98-98

src/undate/date.py (2)

35-35: Typo in comment: Remove extra space

There's an extra space between "single" and "year" in the comment on line 35.

Apply this diff to fix the typo:

-#: timedelta for a single  year (non-leap year)
+#: timedelta for a single year (non-leap year)

107-118: Generalize handling of timedelta units in __sub__ method

Currently, the __sub__ method only handles results with dtype "timedelta64[D]". If future code introduces dates with different time units (e.g., weeks or months), this method may not handle them correctly. Consider generalizing the condition to handle other possible timedelta units.

Apply this diff to generalize handling of timedelta units:

 def __sub__(self, other):
     # modify to conditionally return a timedelta object instead of a
     # Date object with dtype timedelta64[D] (default behavior)

     result = super().__sub__(other)
     # if the result has a timedelta type (i.e., date minus date = timedelta),
     # cast to local Timedelta object; otherwise, leave as is
     # (i.e., date minus timedelta = date)
-    if result.dtype == "timedelta64[D]":
+    if np.issubdtype(result.dtype, np.timedelta64):
         result = Timedelta(result)
     return result
tests/test_undate.py (1)

366-367: Remove redundant commented-out code

The commented-out assertion is redundant since the same assertion is active in the next line. Removing it can improve code readability.

src/undate/undate.py (3)

36-37: Consider limiting the allowable year range to practical values.

Using extremely large integers for MIN_ALLOWABLE_YEAR and MAX_ALLOWABLE_YEAR may lead to computational issues or unexpected behavior. Consider setting logical limits within the range of acceptable dates.


48-48: Address TODO: Add validation for string inputs.

The TODO comment suggests adding validation to ensure that if year, month, or day are strings, they must be of expected length. Implementing this validation will improve data integrity.

Would you like assistance in implementing this validation?


133-135: Address TODO: Handle February 29 with unknown year.

The TODO comment indicates that if the date is February 29 with an unknown year, the min/max years should be adjusted to known leap years. Implementing this special case will ensure accurate date ranges.

Would you like assistance in implementing this logic?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fe7c795 and 1b94c78.

⛔ Files ignored due to path filters (6)
  • _static/undate_logo.png is excluded by !**/*.png
  • docs/_static/dhtech-logo.png is excluded by !**/*.png
  • docs/_static/dhtech-logo.svg is excluded by !**/*.svg
  • docs/_static/logo.png is excluded by !**/*.png
  • docs/_static/undate_logo.png is excluded by !**/*.png
  • examples/use-cases/shakespeare-and-company-project/SCoData_events_v1.2_2022-01.csv is excluded by !**/*.csv
📒 Files selected for processing (57)
  • .all-contributorsrc (5 hunks)
  • .github/workflows/check.yml (1 hunks)
  • .github/workflows/unit_tests.yml (2 hunks)
  • .gitignore (1 hunks)
  • .pre-commit-config.yaml (2 hunks)
  • .pythonversion (1 hunks)
  • CHANGELOG.md (1 hunks)
  • CONTRIBUTING.md (1 hunks)
  • CONTRIBUTORS.md (1 hunks)
  • DEVELOPER_NOTES.md (1 hunks)
  • README.md (2 hunks)
  • docs/CONTRIBUTING.md (1 hunks)
  • docs/CONTRIBUTORS.md (1 hunks)
  • docs/DEVELOPER_NOTES.md (1 hunks)
  • docs/LICENSE.md (1 hunks)
  • docs/_static/custom.css (1 hunks)
  • docs/_templates/sidebar_dhtech.html (1 hunks)
  • docs/authors.rst (0 hunks)
  • docs/conf.py (3 hunks)
  • docs/index.rst (1 hunks)
  • docs/license.rst (0 hunks)
  • docs/readme.md (1 hunks)
  • docs/readme.rst (0 hunks)
  • docs/requirements.txt (0 hunks)
  • docs/undate.rst (0 hunks)
  • docs/undate/converters.rst (1 hunks)
  • docs/undate/core.rst (1 hunks)
  • docs/undate/index.rst (1 hunks)
  • examples/README.md (1 hunks)
  • examples/notebooks/edtf-support.ipynb (1 hunks)
  • examples/notebooks/shxco_partial_date_durations.ipynb (112 hunks)
  • examples/use-cases/shakespeare-and-company-project/README.md (1 hunks)
  • pyproject.toml (1 hunks)
  • pytest.ini (0 hunks)
  • setup.cfg (0 hunks)
  • setup.py (0 hunks)
  • src/undate/__init__.py (1 hunks)
  • src/undate/converters/__init__.py (1 hunks)
  • src/undate/converters/base.py (1 hunks)
  • src/undate/converters/edtf/__init__.py (1 hunks)
  • src/undate/converters/edtf/converter.py (1 hunks)
  • src/undate/converters/edtf/edtf.lark (2 hunks)
  • src/undate/converters/edtf/parser.py (0 hunks)
  • src/undate/converters/edtf/transformer.py (2 hunks)
  • src/undate/converters/iso8601.py (3 hunks)
  • src/undate/date.py (1 hunks)
  • src/undate/dateformat/__init__.py (0 hunks)
  • src/undate/dateformat/base.py (0 hunks)
  • src/undate/undate.py (14 hunks)
  • tests/test_converters/edtf/test_edtf_parser.py (3 hunks)
  • tests/test_converters/edtf/test_edtf_transformer.py (3 hunks)
  • tests/test_converters/test_base.py (1 hunks)
  • tests/test_converters/test_edtf.py (1 hunks)
  • tests/test_converters/test_iso8601.py (1 hunks)
  • tests/test_date.py (1 hunks)
  • tests/test_dateformat/test_base.py (0 hunks)
  • tests/test_undate.py (9 hunks)
💤 Files with no reviewable changes (12)
  • docs/authors.rst
  • docs/license.rst
  • docs/readme.rst
  • docs/requirements.txt
  • docs/undate.rst
  • pytest.ini
  • setup.cfg
  • setup.py
  • src/undate/converters/edtf/parser.py
  • src/undate/dateformat/init.py
  • src/undate/dateformat/base.py
  • tests/test_dateformat/test_base.py
✅ Files skipped from review due to trivial changes (17)
  • .gitignore
  • .pythonversion
  • docs/CONTRIBUTING.md
  • docs/CONTRIBUTORS.md
  • docs/DEVELOPER_NOTES.md
  • docs/LICENSE.md
  • docs/_static/custom.css
  • docs/_templates/sidebar_dhtech.html
  • docs/index.rst
  • docs/readme.md
  • docs/undate/converters.rst
  • docs/undate/core.rst
  • docs/undate/index.rst
  • examples/README.md
  • src/undate/converters/init.py
  • src/undate/converters/edtf/init.py
  • tests/test_converters/test_iso8601.py
🧰 Additional context used
🪛 actionlint
.github/workflows/check.yml

24-24: property "python-cache" is not defined in object type {}

(expression)

.github/workflows/unit_tests.yml

37-37: property "python-cache" is not defined in object type {}

(expression)

🪛 LanguageTool
CONTRIBUTING.md

[style] ~5-~5: Consider a different adverb to strengthen your wording.
Context: ... to this project. This is great! We are always looking for new contributors. If you ca...

(ALWAYS_CONSTANTLY)


[formatting] ~19-~19: Insert a comma after ‘cases’: “In most cases,”?
Context: ...that relates to the changes or feature In most cases branches should be created from the mos...

(IN_MOST_CASES_COMMA)


[uncategorized] ~37-~37: A comma might be missing here.
Context: ...b issue in your pull request comments. Ideally contributions should include documentat...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)


[typographical] ~38-~38: It seems that a comma is missing.
Context: ...oposed changes, but if that is a barrier please let us know when you submit a pull requ...

(IF_PLEASE_COMMA)


[uncategorized] ~44-~44: Possible missing preposition found.
Context: ...equests If you find a bug or can think a feature you would really like to see be...

(AI_EN_LECTOR_MISSING_PREPOSITION)

DEVELOPER_NOTES.md

[grammar] ~7-~7: The word “checkout” is a noun. The verb is spelled with a white space.
Context: ... you started. ## Setup ### Use git to checkout a copy of the repository ```sh git clon...

(NOUN_VERB_CONFUSION)


[style] ~50-~50: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... sh pip install -e ".[dev]" ### Install pre-commit hooks We use [pre-commit](h...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

examples/use-cases/shakespeare-and-company-project/README.md

[misspelling] ~11-~11: This word is normally spelled with a hyphen.
Context: ... for subscriptions and borrow events in human readable format and number of days. The [Shakes...

(EN_COMPOUNDS_HUMAN_READABLE)


[misspelling] ~13-~13: This word is normally spelled with a hyphen.
Context: ...rowing events and subscriptions in both human readable and machine readable form (text and cou...

(EN_COMPOUNDS_HUMAN_READABLE)


[misspelling] ~13-~13: This word is normally spelled with a hyphen.
Context: ...ubscriptions in both human readable and machine readable form (text and count by days). Data a...

(EN_COMPOUNDS_MACHINE_READABLE)


[style] ~15-~15: This phrase might be unnecessarily wordy.
Context: ...koeser/shxco-missingdata-specreading)). Our hope is that in the future, undate will make ...

(MY_HOPE_IS)

🪛 Markdownlint
CONTRIBUTORS.md

51-51: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


50-50: null
Link fragments should be valid

(MD051, link-fragments)

DEVELOPER_NOTES.md

37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

examples/use-cases/shakespeare-and-company-project/README.md

22-22: null
Bare URL used

(MD034, no-bare-urls)

🪛 Ruff
src/undate/converters/base.py

78-78: Loop control variable importer not used within loop body

Rename unused importer to _importer

(B007)


78-78: Loop control variable ispkg not used within loop body

Rename unused ispkg to _ispkg

(B007)

src/undate/converters/edtf/converter.py

43-45: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

src/undate/converters/iso8601.py

79-79: Loop control variable iso_format not used within loop body

Rename unused iso_format to _iso_format

(B007)

🔇 Additional comments (53)
src/undate/__init__.py (2)

6-6: LGTM! Well-structured public API declaration

The __all__ declaration correctly matches the imported symbols and explicitly defines the public API, following Python best practices.


3-4: Verify module structure and imports

The imports look correct based on the module structure described in the summary.

✅ Verification successful

Let me verify the class definitions with a different approach since ast-grep might have limitations with Python files.


Import statements are correctly structured and verified

All imported classes exist in their respective modules:

  • DatePrecision class is defined in src/undate/date.py
  • Undate and UndateInterval classes are defined in src/undate/undate.py
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the imported modules exist and contain the referenced classes

# Check if the modules exist
echo "Checking for module files..."
fd -t f "^(date|undate)\.py$" src/undate/

# Verify class definitions in the modules
echo "Checking for class definitions..."
ast-grep --pattern 'class DatePrecision' src/undate/date.py
ast-grep --pattern 'class Undate' src/undate/undate.py
ast-grep --pattern 'class UndateInterval' src/undate/undate.py

Length of output: 411


Script:

#!/bin/bash
# Check class definitions using ripgrep for better Python support
echo "Checking DatePrecision in date.py:"
rg "^class DatePrecision" -A 1 src/undate/date.py

echo -e "\nChecking Undate in undate.py:"
rg "^class Undate[^I]" -A 1 src/undate/undate.py

echo -e "\nChecking UndateInterval in undate.py:"
rg "^class UndateInterval" -A 1 src/undate/undate.py

Length of output: 671

.pre-commit-config.yaml (1)

Line range hint 1-22: Verify tool version consistency across configuration files

Let's ensure the tool versions and configurations are consistent across the project.

✅ Verification successful

Tool configurations are consistent and well-structured

The pre-commit configuration is properly aligned with the project's development dependencies and configurations:

  • Ruff and MyPy are correctly listed as dev dependencies in pyproject.toml
  • MyPy configuration in pre-commit matches the plugin configuration in pyproject.toml
  • The pre-commit hooks provide comprehensive linting and type checking coverage
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for version consistency across configuration files

# Check for Ruff and MyPy configurations in pyproject.toml
echo "Checking pyproject.toml for tool configurations..."
if [ -f "pyproject.toml" ]; then
    echo "=== Ruff Configuration ==="
    rg -A 5 '^\[tool\.ruff\]' pyproject.toml
    echo "=== MyPy Configuration ==="
    rg -A 5 '^\[tool\.mypy\]' pyproject.toml
fi

# Check for version specifications in requirements files
echo "Checking requirements files for tool versions..."
fd -e txt -e in requirements | while read -r file; do
    echo "=== $file ==="
    rg -i "ruff|mypy" "$file"
done

Length of output: 554


Script:

#!/bin/bash
# Let's check for any additional configuration files and requirements
# that might contain version specifications

# Check for any requirements files in the root or common locations
echo "=== Requirements files ==="
fd -e txt -e in '(requirements|deps|dev)' -d 2

# Look for pyproject.toml content
echo -e "\n=== pyproject.toml content ==="
if [ -f "pyproject.toml" ]; then
    cat pyproject.toml
fi

# Look for setup.py/setup.cfg for any version specifications
echo -e "\n=== Setup files ==="
if [ -f "setup.py" ]; then
    echo "setup.py content:"
    cat setup.py
fi
if [ -f "setup.cfg" ]; then
    echo "setup.cfg content:"
    cat setup.cfg
fi

Length of output: 2794

.github/workflows/check.yml (1)

16-21: LGTM: Python setup is well configured

The setup uses the latest stable Python version and action version with proper caching configuration.

tests/test_converters/edtf/test_edtf_parser.py (1)

2-2: LGTM: Import path updated correctly

The import path change from dateformat to converters aligns with the module restructuring.

.github/workflows/unit_tests.yml (2)

14-14: LGTM: Python 3.12 is a good choice for coverage reporting

Using Python 3.12 for coverage reporting is appropriate as it's the latest stable version.


33-33: LGTM: Updated cache dependency path

The change from setup.cfg to pyproject.toml aligns with modern Python packaging standards.

tests/test_converters/edtf/test_edtf_transformer.py (3)

2-3: LGTM: Import statements are correctly structured

The imports are appropriately organized and provide the necessary components for testing.


15-19: LGTM: Good coverage for negative year scenarios

The test cases appropriately cover both extended format (-Y17000002) and simple format (-1985) for negative years.


20-20: Verify the commented-out season test case

The commented test case "2001-21" for spring 2001 suggests that season support might be planned.

Let's check if there are any related issues or implementations:

✅ Verification successful

Season support is planned but not yet implemented

The commented test case is related to EDTF Level 1 season support (where 21 represents Spring), which is currently tracked in issue #73 but not yet implemented. This is an intentionally commented-out test case awaiting implementation.

  • Issue EDTF support - level 0, 1 and parts of level 2 #73 explicitly lists seasons support as an unfinished task: "values 21, 22, 23, 24 may be used used to signify 'Spring', 'Summer', 'Autumn', 'Winter'"
  • The test case follows EDTF specification format where "2001-21" represents Spring 2001
  • Other Level 1 features like negative years and unspecified digits are already implemented, but seasons remain a TODO item
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for season-related code or issues
echo "Searching for season-related code..."
rg -l "season|spring|summer|fall|winter|21|22|23|24" --type py

echo "Searching for related issues..."
gh issue list --search "season date"

Length of output: 565


Script:

#!/bin/bash
# Let's check the actual implementation and tests for season handling
echo "Checking EDTF parser implementation for seasons..."
rg -A 5 "season|21|22|23|24" src/undate/converters/edtf/parser.py

echo "\nChecking EDTF transformer tests..."
cat tests/test_converters/edtf/test_edtf_transformer.py

echo "\nChecking related issue details..."
gh issue view 73

Length of output: 5234

src/undate/converters/edtf/edtf.lark (3)

45-45: LGTM! Correct implementation of extended year format.

The pattern correctly implements the EDTF specification for years exceeding 4 digits, including negative years.

Let's verify the handling of edge cases:

#!/bin/bash
# Description: Check test coverage for extended year formats
# Look for test cases covering edge cases in extended year formats
rg -A 3 "test.*Y\d{5}" "tests/test_converters/edtf/"

37-42: Clarify EDTF level separation in grammar rules.

The comment mentions Level 2 support, but the grammar structure doesn't clearly separate different EDTF levels. Consider restructuring to make the level separation more explicit.

Let's verify the EDTF specification compliance for unspecified components:

#!/bin/bash
# Description: Verify EDTF level separation and unspecified component handling
# Look for level-specific test organization
rg -A 5 "test.*unspecified" "tests/test_converters/edtf/"

17-17: Consider optimizing the year rule pattern.

The current rule allows whitespace between the negative sign and the year number. Consider using a more precise pattern.

-year: INT | /\-/ INT
+year: /\-?\d+/

Let's verify the EDTF specification compliance:

.all-contributorsrc (2)

17-18: LGTM: Contributor role updates are well-structured

The additions of "ideas" and other roles for multiple contributors are well-organized and follow the all-contributors specification. These updates appropriately reflect the evolving contributions to the project.

Also applies to: 42-44, 56-57


3-3: Verify CONTRIBUTORS.md setup

The change to use CONTRIBUTORS.md as the target file is a good practice for maintainability. However, let's verify its implementation.

✅ Verification successful

Let me check the content of both files to ensure proper setup and references.


✓ CONTRIBUTORS.md setup is correctly implemented

The change to use CONTRIBUTORS.md as the target file is properly implemented:

  • CONTRIBUTORS.md exists and is properly set up with the expected content
  • README.md correctly references CONTRIBUTORS.md in the contributors badge
  • Both files maintain consistent badge sections
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify CONTRIBUTORS.md exists and is properly referenced

# Check if CONTRIBUTORS.md exists
if [ -f "CONTRIBUTORS.md" ]; then
    echo "✓ CONTRIBUTORS.md exists"
else
    echo "✗ CONTRIBUTORS.md is missing"
fi

# Check for any remaining references to the old location in README.md
rg -l "contributors" README.md

Length of output: 128


Script:

#!/bin/bash
# Check content of both files for contributor references
echo "=== README.md contributor references ==="
rg "contributor" README.md -A 2 -B 2

echo -e "\n=== CONTRIBUTORS.md content preview ==="
head -n 10 CONTRIBUTORS.md

Length of output: 1050

CHANGELOG.md (4)

9-11: Well documented EDTF improvements

Good practice to include a validation notebook for demonstrating the new EDTF support. The attribution to numpy.datetime64 for enabling the feature is clear.


12-16: Verify documentation structure and completeness

The documentation improvements span multiple areas, which is great. Let's verify the structure and cross-linking.

#!/bin/bash
# Description: Check documentation structure and cross-references

# Check if converter documentation exists
fd -e md -e rst "converter"

# Check if branching guidelines are properly linked
rg -i "branch.*guideline" "CONTRIBUTING.md"

# Verify Sphinx documentation structure
fd . "docs/" -e rst -e md

17-17: Verify tooling change documentation

The switch from black to ruff is a significant change in the development workflow.

#!/bin/bash
# Description: Verify tooling change documentation

# Check for ruff configuration
fd "ruff" -e toml -e ini

# Check if the change is documented in developer notes
rg -i "ruff.*black" "DEVELOPER_NOTES.md"

# Verify pre-commit configuration
rg "ruff" ".pre-commit-config.yaml"

5-8: Verify breaking changes documentation

The renaming of "formatters" to "converters" represents a breaking change that could impact existing code. While the change is documented here, we should ensure it's prominently highlighted in the upgrade guide.

✅ Verification successful

Let's search for documentation and code changes to verify the breaking changes and their documentation.


Let's examine the documentation and code references to understand how this change is documented.


Breaking changes are adequately documented

The renaming of "formatters" to "converters" is well documented and implemented consistently:

  • The change is clearly documented in CHANGELOG.md
  • README.md has been updated to use the new "converter" terminology
  • The change is reflected in the codebase structure with a dedicated converters module and documentation at docs/undate/converters.rst
  • The API provides backward compatibility through clear method names (format and parse)
  • Implementation shows a clean transition with no remnants of old formatter terminology in the core code
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if breaking changes are documented in upgrade guide

# Look for upgrade guide or migration documentation
fd -e md -e rst "upgrade|migration|breaking.changes"

# Check if the formatter to converter change is documented in README
rg -i "formatter.*converter" "README.md"

Length of output: 300


Script:

#!/bin/bash
# Search for documentation files in the root and docs directory
fd -e md -e rst . 

# Search for formatter/converter related changes in Python files
rg -i "formatter.*converter|converter.*formatter" -g "*.py"

# Search for any migration or upgrade related content
rg -i "migration|upgrade|breaking.change" -g "*.md" -g "*.rst"

# Look for the old formatter implementation
ast-grep --pattern 'class $_(Formatter|formatter)' 

# Look for the new converter implementation
ast-grep --pattern 'class $_(Converter|converter)'

Length of output: 807


Script:

#!/bin/bash
# Check the content of the converter documentation
cat docs/undate/converters.rst

# Look for any mentions of formatters in the codebase
rg -i "formatter" -g "*.py" -g "*.rst" -g "*.md" -A 2

# Look for any mentions of converters in the codebase
rg -i "converter" -g "*.py" -g "*.rst" -g "*.md" -A 2

# Check the content of core documentation
cat docs/undate/core.rst

Length of output: 21963

examples/use-cases/shakespeare-and-company-project/README.md (2)

1-4: LGTM!

The header section is well-structured and provides clear attribution.


5-8: LGTM!

Clear explanation of the dataset scope and standardized date formats.

tests/test_date.py (1)

15-38: LGTM! Well-structured initialization tests

The initialization tests provide good coverage across different precision levels and data types.

pyproject.toml (1)

2-3: LGTM: Build system configuration is appropriate

The transition to hatchling as the build backend is a good choice, as it's a modern, maintainable build system with good support for Python packaging standards.

docs/conf.py (4)

35-39: LGTM! Good choice using myst_parser

The switch from m2r2 to myst_parser is a good modernization. MyST Parser is the recommended choice for Markdown support in Sphinx documentation.


49-52: LGTM! Clear and explicit file type mapping

The new dictionary format for source_suffix provides a clearer mapping between file extensions and their processors. This configuration properly supports both reStructuredText and Markdown files.


78-86: Verify custom sidebar template existence

The configuration includes a custom sidebar template "sidebar_dhtech.html". Please ensure this template file exists in the appropriate location.

#!/bin/bash
# Check if the custom sidebar template exists
fd "sidebar_dhtech.html" docs/_templates/

20-21: ⚠️ Potential issue

Update release version to match v0.3

The release version is still set to "0.0.1.dev" while this PR is for Release v0.3. Please update the version number to maintain consistency.

-release = "0.0.1.dev"
+release = "0.3"

Likely invalid or redundant comment.

src/undate/converters/edtf/transformer.py (2)

1-1: LGTM! Import statement is well organized.

The imports are properly grouped and ordered.


67-89: Verify test coverage for negative year handling.

The changes to year handling are significant. Let's verify the test coverage:

tests/test_converters/test_edtf.py (3)

1-7: LGTM! Clean imports and class setup.

The imports are appropriate and the test class follows Python naming conventions.


50-69: LGTM! Comprehensive string conversion tests.

The test cases cover various scenarios including:

  • Year padding
  • Unknown components
  • Negative years
  • Error handling for empty dates

29-30: Address the commented question about EDTF support.

The comment raises an important question about EDTF support for dates with missing years. This should be investigated and either:

  1. Uncomment and fix the test if EDTF supports this format
  2. Remove the comment if confirmed that EDTF doesn't support this format
  3. Create a GitHub issue to track this investigation

Let's verify EDTF specification support:

src/undate/converters/base.py (1)

1-31: LGTM! Well-documented module with clear extension instructions.

The module documentation provides comprehensive guidance for implementing new converters, and the imports are well-organized.

CONTRIBUTING.md (3)

55-58: LGTM! Clear help and community information

The help section provides clear guidance and good community references.


8-8: Verify the Developer Notes reference

The documentation now refers to "Developer Notes" for setup instructions. Let's verify this file exists and contains the necessary information.

✅ Verification successful

The Developer Notes reference is valid and accurate

The DEVELOPER_NOTES.md file exists and contains comprehensive setup instructions, including:

  • Development environment setup with git-flow
  • Python installation and configuration
  • Package installation steps
  • Pre-commit hooks setup
  • Testing instructions
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if DEVELOPER_NOTES.md exists and contains setup instructions
if [ -f "DEVELOPER_NOTES.md" ]; then
    echo "DEVELOPER_NOTES.md exists"
    rg -i "setup|install|test" "DEVELOPER_NOTES.md"
else
    echo "DEVELOPER_NOTES.md not found"
fi

Length of output: 1610


52-52: Verify examples directory structure

Let's confirm the examples directory exists and is properly structured.

✅ Verification successful

Examples directory structure is valid and well-organized

The examples directory exists at the correct path and is properly structured with:

  • A clear README.md explaining the purpose
  • Jupyter notebooks with example code
  • Use cases from real projects

The path reference in the CONTRIBUTING.md is accurate and points to an active directory.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if examples directory exists and its structure
if [ -d "examples" ]; then
    echo "Examples directory exists"
    ls -la examples/
else
    echo "Examples directory not found"
fi

Length of output: 384


Script:

#!/bin/bash
# Let's check the content of examples/README.md to understand the directory structure
cat examples/README.md

# Also check if there are any example files in the notebooks and use-cases directories
echo -e "\nContents of notebooks directory:"
ls -la examples/notebooks/

echo -e "\nContents of use-cases directory:"
ls -la examples/use-cases/

Length of output: 1097

src/undate/converters/iso8601.py (1)

23-28: LGTM! Documentation improvements

The docstring clearly describes the supported formats and return types.

README.md (4)

21-23: LGTM!

The badges are well-organized and provide clear project status information.


28-32: LGTM!

The introduction effectively explains the library's purpose and target use cases.


48-155: LGTM!

The code examples are comprehensive and effectively demonstrate the library's features:

  • Initialization with different date precisions
  • String formatting
  • Duration calculations
  • Date comparisons
  • Interval handling
  • Parsing and formatting

159-167: LGTM!

The documentation section provides clear links to additional resources and properly references the license.

examples/notebooks/edtf-support.ipynb (2)

1-18: Well-structured documentation and clear introduction!

The notebook provides excellent documentation with:

  • Clear purpose and scope
  • Reference to the EDTF specification
  • Explicit mention of supported and unsupported features

696-767: Excellent test coverage for Level 2 EDTF features!

The implementation thoroughly covers:

  • Unspecified digits in various positions
  • Edge cases with multiple unspecified digits
  • Proper handling of earliest/latest calculations
examples/notebooks/shxco_partial_date_durations.ipynb (4)

Line range hint 325-338: LGTM! Clean and well-documented implementation.

The duration calculation implementation is clear and includes helpful comments explaining the off-by-one adjustment. The imports are correctly updated to use the new ISO8601DateFormat from the converters package.


2595-2595: LGTM! Python version update.

The Python version has been updated to 3.12.7, which is appropriate for using the latest features and security updates.


2540-2549: Well-documented historical issues and fixes.

The markdown comments effectively document previous issues and their resolutions:

  1. Edge case with unknown years and date wrapping
  2. Leap year calculation adjustment
  3. Data validation against source material

This documentation is valuable for future maintenance.


Line range hint 777-2566: Verify test coverage for edge cases.

The duration comparison tests cover a good range of cases, but consider adding explicit tests for:

  1. Leap year transitions
  2. Century transitions
  3. Invalid date combinations
src/undate/date.py (1)

40-121: Implementation of Date class is well-structured

The Date class effectively extends numpy.datetime64 to behave more like Python's built-in datetime.date. The methods and properties are well-defined, and the use of type hints enhances code readability. The class provides a convenient interface for date manipulation.

tests/test_undate.py (5)

134-150: Well-structured tests for the year property

The test cases effectively cover various scenarios, including numeric and string years, partially known years, and unset years. This enhances the coverage and reliability of the year property.


151-164: Comprehensive tests for the month property

The tests handle different scenarios such as known, partially known, and unknown months, ensuring robustness in the month property functionality.


165-184: Extensive tests for the day property

The test cases adequately cover various situations for the day property, including known days, partially unknown days, fully unknown days, and edge cases like months with fewer days. This improves the reliability of date handling.


191-192: LGTM

The test correctly asserts that comparing an Undate object with an unrelated type returns NotImplemented, which is appropriate behavior.


535-540: ⚠️ Potential issue

Verify the duration calculation for intervals spanning unknown years

The test calculates the duration between Undate(None, 6, 7) and Undate(None, 6, 6) and asserts it to be 365 days. However, the duration from June 7 to June 6 of the following year is typically 364 days, unless it's a leap year. Please verify the correctness of this test case.

@@ -1 +1,6 @@
__version__ = "0.3.0.dev0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Update version for release

The version string indicates a development version (0.3.0.dev0), but this PR is for Release v0.3. The version should be updated to 0.3.0 for the release.

-__version__ = "0.3.0.dev0"
+__version__ = "0.3.0"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
__version__ = "0.3.0.dev0"
__version__ = "0.3.0"

Comment on lines 3 to 8
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.3.4
hooks:
- id: black
- id: ruff
args: [ --select, I, --fix, --exit-non-zero-on-fix ]
- id: ruff-format
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider expanding Ruff's rule selection for better code quality coverage

The current configuration only enables import sorting rules (--select I). Consider enabling additional essential rule sets for comprehensive code quality checks.

   - id: ruff
-    args: [ --select, I, --fix, --exit-non-zero-on-fix ]
+    args: [ --select, E,F,W,I, --fix, --exit-non-zero-on-fix ]

This change would enable:

  • E/F rules: Python errors and syntax errors
  • W rules: Python warnings
  • I rules: Import sorting (currently enabled)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.3.4
hooks:
- id: black
- id: ruff
args: [ --select, I, --fix, --exit-non-zero-on-fix ]
- id: ruff-format
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.3.4
hooks:
- id: ruff
args: [ --select, E,F,W,I, --fix, --exit-non-zero-on-fix ]
- id: ruff-format

Comment on lines 22 to 24
- name: Install package with development dependencies
run: pip install -e ".[dev]"
if: steps.python-cache.outputs.cache-hit != 'true'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix the cache hit condition

The condition references an undefined cache output variable. The correct output variable should be from the setup-python action.

Apply this fix:

      - name: Install package with development dependencies
        run: pip install -e ".[dev]"
-       if: steps.python-cache.outputs.cache-hit != 'true'
+       if: steps.setup-python.outputs.cache-hit != 'true'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Install package with development dependencies
run: pip install -e ".[dev]"
if: steps.python-cache.outputs.cache-hit != 'true'
- name: Install package with development dependencies
run: pip install -e ".[dev]"
if: steps.setup-python.outputs.cache-hit != 'true'
🧰 Tools
🪛 actionlint

24-24: property "python-cache" is not defined in object type {}

(expression)

Comment on lines 31 to 32
- name: Check that documentation can be built
run: tox -e docs
run: sphinx-build docs docs/_build # --fail-on-warning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Remove the commented --fail-on-warning flag

The documentation setup is standard with no special warning configurations in either conf.py or Makefile. The git history shows this flag was added as a comment in a recent change from tox -e docs to direct sphinx-build usage. Since there's no indication of a need for strict warning handling in the configuration files, and the flag was deliberately commented out rather than enabled, it should be removed entirely to avoid confusion.

  • Remove the commented --fail-on-warning flag from .github/workflows/check.yml line 32
🔗 Analysis chain

Decide on documentation warning handling

The --fail-on-warning flag is commented out, suggesting uncertainty about the desired strictness level.

Let's check if there are any current documentation warnings:

Consider either:

  1. Removing the comment if warnings should be allowed
  2. Uncommenting the flag if warnings should fail the build
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for documentation warnings
sphinx-build -b html docs docs/_build 2>&1 | grep -i "warning:"

Length of output: 58


Script:

#!/bin/bash
# Let's first check if sphinx-build is available in the environment
which sphinx-build

# Let's also check the content of docs directory to understand the documentation setup
ls -la docs/

# Check if there's any configuration for sphinx
fd -t f "conf.py" docs/

# Look for any historical changes to the sphinx build command
git log -p -- .github/workflows/check.yml | grep -A 5 -B 5 "sphinx-build"

Length of output: 1777


Script:

#!/bin/bash
# Let's check the content of conf.py to see if there are any warning configurations
cat docs/conf.py

# Let's also check the Makefile to see if there are any warning-related configurations
cat docs/Makefile

Length of output: 3351

Comment on lines +34 to +39
# level 2 unspecified digits
"156X-12-25",
"XXXX-12-XX",
"1XXX-12",
"1XXX-XX",
"1984-1X",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error cases for invalid unspecified digit patterns

The new test cases cover valid patterns well, but consider adding error cases to error_cases for:

  • Invalid mixing of specified and unspecified digits (e.g., "19X4")
  • Incorrect position of X in dates
  • Invalid month/day combinations with X

Comment on lines +116 to +118
# TODO: handle leap day in an unknown year
# (currently causes an exception because min/max years are not leap years)
# Undate(None, 2, 29)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Address the TODO for handling leap day in unknown years

The TODO comment indicates that initializing Undate with day 29 in February and an unknown year causes an exception because the min/max years are not leap years. Implementing support for this scenario will enhance the robustness of date handling.

Would you like assistance in implementing this functionality or opening a GitHub issue to track this task?

Comment on lines +548 to +554

# duration is not supported for open-ended intervals
assert UndateInterval(Undate(2000), None).duration() == NotImplemented

# one year set and the other not currently raises not implemented error
with pytest.raises(NotImplementedError):
UndateInterval(Undate(2000), Undate()).duration()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ensure consistent handling of unsupported durations in open-ended intervals

Currently, UndateInterval(Undate(2000), None).duration() returns NotImplemented, while UndateInterval(Undate(2000), Undate()).duration() raises NotImplementedError. Consider standardizing the behavior to either always raise an exception or always return NotImplemented to maintain consistency.

Comment on lines 330 to 331
return f"{year:>04}"
# if value is unset but date precision is month or greater, return unknown month
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix incorrect zero-padding in date component formatting.

The current format specifiers in the year, month, and day properties use '>' alignment, which does not zero-pad numbers. Use :0>4 for year and :0>2 for month and day to correctly zero-pad on the left.

Apply this diff to fix the formatting:

-                return f"{year:>04}"
+                return f"{year:0>4}"

...

-                return f"{month:>02}"
+                return f"{month:0>2}"

...

-                return f"{day:>02}"
+                return f"{day:0>2}"

Also applies to: 343-344, 354-355

Comment on lines +44 to 47
converter: Optional[BaseDateConverter] = None,
label: Optional[str] = None,
):
# keep track of initial values and which values are known
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Changing formatter to converter may break backward compatibility.

Changing the parameter name from formatter to converter in the Undate constructor might break existing code that uses the formatter parameter. Consider maintaining backward compatibility by accepting both parameters, using formatter if converter is not provided, and issuing a deprecation warning.

You can apply this diff to handle both parameters and maintain backward compatibility:

 def __init__(
     self,
     year: Optional[Union[int, str]] = None,
     month: Optional[Union[int, str]] = None,
     day: Optional[Union[int, str]] = None,
-    converter: Optional[BaseDateConverter] = None,
+    converter: Optional[BaseDateConverter] = None,
+    formatter: Optional[BaseDateConverter] = None,
     label: Optional[str] = None,
 ):
+    if converter is None and formatter is not None:
+        converter = formatter
+        # Issue a deprecation warning
+        warnings.warn(
+            "The 'formatter' parameter is deprecated and will be removed in a future version. Please use 'converter' instead.",
+            DeprecationWarning,
+        )
+    elif converter is None:
+        # Existing logic to initialize converter

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +292 to +301
return all(
[
self.earliest <= other.earliest,
self.latest >= other.latest,
# is precision sufficient for comparing partially known dates?
# checking based on less precise /less granular time unit,
# e.g. a day or month could be contained in a year
# but not the reverse
self.precision < other.precision,
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Verify precision comparison logic in __contains__ method.

The current logic uses self.precision < other.precision, which may not correctly represent containment when comparing date precisions. Since a higher DatePrecision value indicates finer granularity (e.g., DAY > MONTH > YEAR), consider revising the comparison to self.precision > other.precision to ensure accurate containment checks.

Apply this diff to fix the precision comparison:

-                    self.precision < other.precision,
+                    self.precision > other.precision,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return all(
[
self.earliest <= other.earliest,
self.latest >= other.latest,
# is precision sufficient for comparing partially known dates?
# checking based on less precise /less granular time unit,
# e.g. a day or month could be contained in a year
# but not the reverse
self.precision < other.precision,
]
return all(
[
self.earliest <= other.earliest,
self.latest >= other.latest,
# is precision sufficient for comparing partially known dates?
# checking based on less precise /less granular time unit,
# e.g. a day or month could be contained in a year
# but not the reverse
self.precision > other.precision,
]

Includes several items suggested or flagged by @coderabbitai
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (15)
.github/workflows/check.yml (2)

26-28: Consider enhancing ruff configuration

While using ruff is a good choice, consider specifying explicit rules or configuration to ensure consistent checking across different environments.

-      - name: Run ruff
-        run: ruff check
+      - name: Run ruff
+        run: ruff check --select E,F,B,I,N,RUF --target-version py312

38-40: Consider enhancing notebook testing configuration

While adding treon for notebook testing is valuable, consider adding more specific configuration for better control and reliability.

-      - name: Check jupyter notebooks with treon
-        run: treon
+      - name: Check jupyter notebooks with treon
+        run: treon examples/ --timeout=300
src/undate/converters/edtf/edtf.lark (1)

44-48: Consider adding example patterns in comments.

The implementation correctly handles Y-prefixed years with optional negative signs. To improve maintainability, consider adding example patterns in the comments.

 // 'Y' may be used at the beginning of the date string to signify that 
 // the date is a year, when (and only when) the year exceeds four digits,
 // i.e. for years later than 9999 or earlier than -9999.
 // For negative years, the minus comes after the Y.
+// Examples: Y170000, Y-170000, Y10000, Y-10000
 year_fivedigitsplus: /Y-?\d{5,}/
pyproject.toml (2)

20-20: Consider removing or formalizing the maintainers comment

The comment about maintainers could be removed as it doesn't provide value, or better yet, formalized using the proper maintainers field if you want to explicitly document this information.


82-85: Fix typo in pytest marker description

There's a typo in the 'first' marker description:

-    "first : run marked tests efore all others",
+    "first : run marked tests before all others",
src/undate/converters/edtf/converter.py (2)

54-67: Consider code organization improvements

While the implementation is correct, consider these improvements:

  1. Move the note about open intervals vs unknown start/end to the method's docstring.
  2. Consider extracting the interval handling logic to a separate helper method for better maintainability.

Example refactor:

     def to_string(self, undate: Union[Undate, UndateInterval]) -> str:
         """
         Convert an :class:`~undate.undate.Undate` or
         :class:`~undate.undate.UndateInterval` to EDTF format.
+
+        Note:
+            The spec distinguishes between open intervals ("..") and
+            unknown start/end ("").
         """
         if isinstance(undate, Undate):
             return self._undate_to_string(undate)
         elif isinstance(undate, UndateInterval):
-            # NOTE: what is the difference between an open interval and unknown start/end?
-            # spec distinguishes between these, open is ".." but unknown is ""
-            start = self._undate_to_string(undate.earliest) if undate.earliest else ".."
-            end = self._undate_to_string(undate.latest) if undate.latest else ".."
-            return f"{start}/{end}"
+            return self._interval_to_string(undate)
+
+    def _interval_to_string(self, interval: UndateInterval) -> str:
+        start = self._undate_to_string(interval.earliest) if interval.earliest else ".."
+        end = self._undate_to_string(interval.latest) if interval.latest else ".."
+        return f"{start}/{end}"

68-99: Consider breaking down the method for better maintainability

The method handles multiple date precisions with similar patterns. Consider extracting each precision handling into separate helper methods to improve readability and maintainability.

Example refactor:

     def _undate_to_string(self, undate: Undate) -> str:
-        parts = []
+        return "-".join(filter(None, [
+            self._format_year(undate),
+            self._format_month(undate),
+            self._format_day(undate)
+        ]))
 
-        if undate.precision >= DatePrecision.YEAR:
-            year = self._convert_missing_digits(undate.year, undate.MISSING_DIGIT)
-            if year and len(year.lstrip("-")) > 4:
-                year = f"Y{year}"
-            parts.append(year or EDTF_UNSPECIFIED_DIGIT * 4)
+    def _format_year(self, undate: Undate) -> Optional[str]:
+        if undate.precision < DatePrecision.YEAR:
+            return None
+        year = self._convert_missing_digits(undate.year, undate.MISSING_DIGIT)
+        if year and len(year.lstrip("-")) > 4:
+            year = f"Y{year}"
+        return year or EDTF_UNSPECIFIED_DIGIT * 4
 
-        if undate.precision >= DatePrecision.MONTH:
-            parts.append(
-                self._convert_missing_digits(undate.month, undate.MISSING_DIGIT)
-                or EDTF_UNSPECIFIED_DIGIT * 2
-            )
+    def _format_month(self, undate: Undate) -> Optional[str]:
+        if undate.precision < DatePrecision.MONTH:
+            return None
+        return (self._convert_missing_digits(undate.month, undate.MISSING_DIGIT)
+                or EDTF_UNSPECIFIED_DIGIT * 2)
 
-        if undate.precision >= DatePrecision.DAY:
-            parts.append(
-                self._convert_missing_digits(undate.day, undate.MISSING_DIGIT)
-                or EDTF_UNSPECIFIED_DIGIT * 2
-            )
+    def _format_day(self, undate: Undate) -> Optional[str]:
+        if undate.precision < DatePrecision.DAY:
+            return None
+        return (self._convert_missing_digits(undate.day, undate.MISSING_DIGIT)
+                or EDTF_UNSPECIFIED_DIGIT * 2)
README.md (3)

50-51: Consider consistent formatting for code outputs

For better readability, consider using consistent formatting for all code outputs. Currently, some outputs use list representation while others use direct printing.

->>> [str(d) for d in [november7, november, year2k, november7_some_year]]
-['2000-11-07', '2000-11', '2000', '--11-07']
+>>> for d in [november7, november, year2k, november7_some_year]:
+...     print(str(d))
+...
+2000-11-07
+2000-11
+2000
+--11-07

Also applies to: 62-68


34-45: Consider adding type hints in examples

To improve documentation and IDE support, consider adding type hints to the example code:

-from undate.undate import Undate
+from undate.undate import Undate
+from typing import List

-november7 = Undate(2000, 11, 7)
+november7: Undate = Undate(2000, 11, 7)

29-30: Fix punctuation in the introduction

Add a comma after "Often" to improve readability.

-Often humanities and cultural data include imprecise or uncertain temporal information.
+Often, humanities and cultural data include imprecise or uncertain temporal information.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~29-~29: Possible missing comma found.
Context: ...ibution information. ## Example Usage Often humanities and cultural data include im...

(AI_HYDRA_LEO_MISSING_COMMA)

src/undate/undate.py (5)

31-37: Consider documenting the rationale for MIN_ALLOWABLE_YEAR adjustment

The code adds 1 year to avoid a leap year, but this might be confusing for future maintainers. Consider adding more detailed documentation about why this specific adjustment was chosen and its implications.

 # numpy datetime is stored as 64-bit integer, so min/max
 # depends on the time unit; assume days for now
 # See https://numpy.org/doc/stable/reference/arrays.datetime.html#datetime-units
 # It just so happens that int(2.5e16) is a leap year, which is a weird default,
 # so let's increase our lower bound by one year.
+# Note: Adding 1 year ensures we start with a non-leap year for consistent date calculations
 MIN_ALLOWABLE_YEAR = int(-2.5e16) + 1
 MAX_ALLOWABLE_YEAR = int(2.5e16)

177-196: Reduce code duplication in parse and format methods

The converter lookup logic is duplicated. Consider extracting it to a private helper method.

+    @classmethod
+    def _get_converter(cls, format):
+        """Get a converter instance for the specified format"""
+        converter_cls = BaseDateConverter.available_converters().get(format, None)
+        if not converter_cls:
+            raise ValueError(f"Unsupported format '{format}'")
+        return converter_cls()
+
     @classmethod
     def parse(cls, date_string, format) -> Union["Undate", "UndateInterval"]:
         """parse a string to an undate or undate interval using the specified format;
         for now, only supports named converters"""
-        converter_cls = BaseDateConverter.available_converters().get(format, None)
-        if converter_cls:
-            # NOTE: some parsers may return intervals; is that ok here?
-            return converter_cls().parse(date_string)
-
-        raise ValueError(f"Unsupported format '{format}'")
+        converter = cls._get_converter(format)
+        return converter.parse(date_string)

     def format(self, format) -> str:
         """format this undate as a string using the specified format;
         for now, only supports named converters"""
-        converter_cls = BaseDateConverter.available_converters().get(format, None)
-        if converter_cls:
-            # NOTE: some parsers may return intervals; is that ok here?
-            return converter_cls().to_string(self)
-
-        raise ValueError(f"Unsupported format '{format}'")
+        converter = self._get_converter(format)
+        return converter.to_string(self)

325-364: Add more documentation for date part properties

The properties would benefit from more detailed documentation, including examples of return values for different scenarios (known, unknown, partially known).

     @property
     def year(self) -> Optional[str]:
-        "year as string (minimum 4 characters), if year is known"
+        """Year as string (minimum 4 characters).
+        
+        Returns:
+            - Known year: "2024"
+            - Unknown year with year precision: "XXXX"
+            - No year component: None
+        """

385-387: Remove commented out code

Remove the commented out datetime.date code as it's no longer needed after the migration to the new Date class.

-                # latest = datetime.date(
-                #     self.earliest.year, self.latest.month, self.latest.day
-                # )

Line range hint 1-520: Consider splitting the file into smaller modules

The file has grown quite large and handles multiple concerns (date parsing, formatting, comparison, duration calculation). Consider splitting it into smaller, focused modules:

  • undate/core.py - Core Undate class
  • undate/interval.py - UndateInterval class
  • undate/formatting.py - Parsing and formatting logic

This would improve maintainability and make the code easier to test.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1b94c78 and 867c5ca.

📒 Files selected for processing (18)
  • .github/workflows/check.yml (1 hunks)
  • .github/workflows/unit_tests.yml (2 hunks)
  • .pre-commit-config.yaml (2 hunks)
  • CHANGELOG.md (1 hunks)
  • CITATION.cff (1 hunks)
  • README.md (2 hunks)
  • docs/conf.py (3 hunks)
  • examples/notebooks/README.md (1 hunks)
  • examples/notebooks/edtf-support.ipynb (1 hunks)
  • pyproject.toml (1 hunks)
  • src/undate/__init__.py (1 hunks)
  • src/undate/converters/edtf/converter.py (1 hunks)
  • src/undate/converters/edtf/edtf.lark (2 hunks)
  • src/undate/converters/edtf/transformer.py (2 hunks)
  • src/undate/undate.py (14 hunks)
  • tests/test_converters/edtf/test_edtf_parser.py (3 hunks)
  • tests/test_converters/edtf/test_edtf_transformer.py (3 hunks)
  • tests/test_converters/test_edtf.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • CITATION.cff
  • examples/notebooks/README.md
🚧 Files skipped from review as they are similar to previous changes (9)
  • .pre-commit-config.yaml
  • CHANGELOG.md
  • docs/conf.py
  • examples/notebooks/edtf-support.ipynb
  • src/undate/init.py
  • src/undate/converters/edtf/transformer.py
  • tests/test_converters/edtf/test_edtf_parser.py
  • tests/test_converters/edtf/test_edtf_transformer.py
  • tests/test_converters/test_edtf.py
🧰 Additional context used
🪛 actionlint
.github/workflows/check.yml

24-24: property "setup-python" is not defined in object type {}

(expression)

.github/workflows/unit_tests.yml

37-37: property "setup-python" is not defined in object type {}

(expression)

🪛 LanguageTool
README.md

[uncategorized] ~29-~29: Possible missing comma found.
Context: ...ibution information. ## Example Usage Often humanities and cultural data include im...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 Ruff
src/undate/converters/edtf/converter.py

43-45: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (23)
.github/workflows/check.yml (4)

16-21: LGTM! Modern Python setup with appropriate caching

The setup is well-configured with the latest stable Python version and proper caching using pyproject.toml.


22-24: LGTM! Cache condition is correctly configured

The installation step is properly configured with the correct cache hit condition referencing the setup-python step.

🧰 Tools
🪛 actionlint

24-24: property "setup-python" is not defined in object type {}

(expression)


34-36: LGTM! Type checking is well configured

The mypy configuration is appropriate with non-interactive type installation and focused src directory checking.


30-32: Verify the need for strict documentation warnings

The --fail-on-warning flag is now enabled, which will cause the workflow to fail on any documentation warnings. Please verify if this strict checking aligns with the project's documentation standards.

✅ Verification successful

Keep the --fail-on-warning flag for documentation quality

Based on the evidence gathered, keeping the --fail-on-warning flag is appropriate for this project:

  1. The project maintains high quality standards with multiple quality checks (Ruff, codecov, unit tests) as shown in README.md
  2. Documentation is prominently featured with ReadTheDocs integration and badges
  3. Recent commit history shows active documentation maintenance and improvements
  4. The CONTRIBUTING.md explicitly states documentation is an expected part of contributions
  5. No issues were found regarding documentation warnings being too strict

The strict documentation checking aligns with the project's demonstrated commitment to code and documentation quality.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check documentation configuration and current warning state

# Check sphinx configuration for warning handling
echo "Checking sphinx configuration..."
fd -t f "conf.py" docs/ -x cat {}

# Check if there are any current documentation warnings
echo "Checking for current warnings..."
sphinx-build -b html docs docs/_build 2>&1 | grep -i "warning:"

Length of output: 2856


Script:

#!/bin/bash
# Let's check for any documentation-related discussions or standards in project files

# Check README and other documentation files for any mentions of documentation standards
echo "Checking documentation standards..."
rg -i "documentation|sphinx|docs?" -A 3 -B 3 "README.md" "CONTRIBUTING.md" "docs/README.md" || true

# Check commit history for any relevant discussions about documentation warnings
echo "Checking commit history..."
git log --grep="sphinx\|documentation\|warning" --pretty=format:"%h %s" -n 10

# Check if there are any existing documentation issues
echo "Checking documentation issues..."
gh issue list --label "documentation" || true

Length of output: 4689

.github/workflows/unit_tests.yml (4)

14-14: LGTM! Good choice updating to Python 3.12

Using the latest stable Python version (3.12) for coverage reporting is a good practice.


21-21: Consider removing Python 3.13 from the test matrix

Python 3.13 is currently in alpha, and including it in the test matrix might lead to unstable builds. Key dependencies like numpy typically don't guarantee support for alpha Python versions.

Let's verify the current Python 3.13 compatibility status:

#!/bin/bash
# Check if the project's dependencies support Python 3.13
curl -s https://pypi.org/pypi/numpy/json | jq -r '.info.requires_python'
curl -s https://pypi.org/pypi/lark/json | jq -r '.info.requires_python'

# Check if there are any known issues with Python 3.13
gh api graphql -f query='
{
  repository(owner: "numpy", name: "numpy") {
    issues(first: 5, states: OPEN, labels: ["Python 3.13"]) {
      nodes {
        title
        url
      }
    }
  }
}'

33-33: LGTM! Good update to use pyproject.toml

Using pyproject.toml for dependency caching aligns with modern Python packaging standards.


41-46: Verify pytest configuration maintains all test scenarios

The switch from tox to direct pytest commands simplifies the workflow, but please ensure that all test scenarios from the tox configuration are preserved in pytest configuration.

Let's check for any tox-specific configurations that need to be migrated:

✅ Verification successful

Let me gather more information about the previous test configuration to ensure we haven't lost any important settings.


Based on the git history and test files, I can now provide a conclusive response about the pytest configuration changes.

Test configuration has been properly migrated from tox

The switch from tox to direct pytest commands maintains all essential test functionality:

  • The pytest configuration in pyproject.toml preserves the important settings:
    • Correct Python path (pythonpath = "src/")
    • Test markers for execution order
  • The workflow changes maintain the same test coverage by:
    • Running basic tests across all Python versions
    • Running coverage report specifically for one version
  • No test scenarios were lost in the migration as tox was primarily used as a test runner without additional test configurations (the removed tox config only specified basic pytest execution)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Look for tox configuration files and their content
fd "tox.ini|setup.cfg" --exec cat {}

# Look for pytest configuration in pyproject.toml
fd "pyproject.toml" --exec cat {} | grep -A 20 "\[tool.pytest"

Length of output: 314


Script:

#!/bin/bash
# Check git history for removed tox configuration
git log -p -- tox.ini setup.cfg

# Look for any pytest-related files
fd "conftest.py|pytest.ini" --exec cat {}

# Check test directory structure and any test configuration files
fd "test" --type d
fd "test" --type f -e py -e ini -e cfg --exec cat {}

Length of output: 42999

src/undate/converters/edtf/edtf.lark (1)

37-39: Implementation aligns with EDTF specification.

The addition of unspecified ~ 4 correctly implements fully unspecified years (XXXX) as per Level 2 of the EDTF specification.

Let's verify the grammar handles all unspecified year cases correctly:

✅ Verification successful

Grammar correctly handles all unspecified year patterns

The test coverage shows comprehensive verification of the grammar implementation:

  • Full year unspecification: XXXX (e.g. XXXX-12-31)
  • Partial year unspecification patterns:
    • Decade unspecified: 19XX, 20XX
    • Single digit unspecified: 201X, 156X
    • Multiple digits: 1XXX
  • All patterns are tested with various month/day combinations
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test coverage for unspecified year patterns
# Look for test cases covering both partial and fully unspecified years

rg -A 2 'XXXX|[0-9]XXX|[0-9]{2}XX|[0-9]{3}X' tests/

Length of output: 4707

pyproject.toml (4)

2-3: LGTM! Clean build system configuration.

The transition to hatchling as the build backend is a good choice, providing a modern and maintainable build system.


38-38: Remove Python 3.13 classifier

As noted in the previous review, Python 3.13 is not yet released and the project hasn't been tested against it. This classifier should be removed until compatibility is confirmed.


61-64: Remove circular dependencies

The following circular dependencies should be removed:

  • undate[docs] and undate[test] in dev dependencies
  • undate in all dependencies

These are redundant and could cause issues during development.


66-71: LGTM! Well-structured project URLs.

All essential URLs are provided and consistently use the 'main' branch.

src/undate/converters/edtf/converter.py (5)

1-13: LGTM! Well-organized imports and clear constant definition.

The imports are logically grouped and the constant is well-documented with proper type annotation.


15-28: LGTM! Well-documented class with clear scope.

The class is well-structured with comprehensive documentation that clearly defines its capabilities and limitations.


42-45: Enhance exception handling by preserving the original exception

When re-raising an exception within an except block, it's best practice to include the original exception to maintain the exception chain.

Apply this diff to improve the exception handling:

-        except UnexpectedCharacters:
+        except UnexpectedCharacters as err:
             raise ValueError(
                 "Parsing failed: '%s' is not a supported EDTF date format" % value
-            )
+            ) from err
🧰 Tools
🪛 Ruff

43-45: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


47-53: LGTM! Clean and focused helper method.

The method is well-typed and follows the single responsibility principle.


104-104: Correct exception message formatting

The ValueError raised should have a properly formatted message.

Apply this diff to fix the exception message:

-        raise ValueError("Failed to generate an EDTF string from %r", undate)
+        raise ValueError(f"Failed to generate an EDTF string from {undate!r}")
README.md (2)

21-23: LGTM! Comprehensive set of badges

The badges effectively communicate the project's status, quality metrics, and community engagement.


Line range hint 1-169: Excellent documentation update!

The README now provides:

  • Clear project status and limitations
  • Comprehensive examples covering major functionality
  • Well-structured sections for easy navigation
  • Proper references to external documentation

The changes significantly improve the documentation's quality and usability.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~29-~29: Possible missing comma found.
Context: ...ibution information. ## Example Usage Often humanities and cultural data include im...

(AI_HYDRA_LEO_MISSING_COMMA)

src/undate/undate.py (3)

44-44: Previous comment about backward compatibility is still valid

The change from formatter to converter parameter could break existing code.


Line range hint 151-168: LGTM: String representation logic is well-implemented

The implementation correctly handles partially known dates and uses proper zero-padding.


292-301: Previous comment about precision comparison logic is still valid

The precision comparison in __contains__ method needs to be reversed.

Comment on lines 36 to 37
pip install -e ".[test]"
if: steps.setup-python.outputs.cache-hit != 'true'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix undefined step ID in cache hit check

The condition uses steps.setup-python but this step ID is not defined in the workflow. The Python setup step doesn't have an id field defined.

Apply this fix:

-        if: steps.setup-python.outputs.cache-hit != 'true'
+        if: steps.cache.outputs.cache-hit != 'true'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pip install -e ".[test]"
if: steps.setup-python.outputs.cache-hit != 'true'
pip install -e ".[test]"
if: steps.cache.outputs.cache-hit != 'true'
🧰 Tools
🪛 actionlint

37-37: property "setup-python" is not defined in object type {}

(expression)

@@ -14,7 +14,7 @@

date: year | year "-" month | year "-" month "-" day

year: INT
year: INT | /\-/ INT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify the negative year pattern and prevent unintended spaces.

The current pattern INT | /\-/ INT could allow spaces between the minus sign and the number. Consider using a single regex pattern for better precision.

-year: INT | /\-/ INT
+year: /-?\d+/

Committable suggestion skipped: line range outside the PR's diff.


**undate** is a python library for working with uncertain or partially known dates.
![undate](_static/undate_logo.png)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Logo image path needs to be updated

The logo image exists but the path in the README needs to be corrected. The image is present in both _static/undate_logo.png and docs/_static/undate_logo.png, but for consistency with other documentation, it should reference the one in the docs/_static/ directory.

  • Update the image path in README.md from ![undate](_static/undate_logo.png) to ![undate](docs/_static/undate_logo.png)
🔗 Analysis chain

Verify the logo image path and accessibility

Please ensure that:

  1. The _static directory exists in the repository root
  2. The logo image file is present and correctly named
  3. The image is visible in the rendered README
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the logo image exists and is accessible
if [ -f "_static/undate_logo.png" ]; then
    echo "Logo image exists"
    file "_static/undate_logo.png"
else
    echo "Logo image not found at _static/undate_logo.png"
    # Check if it exists elsewhere
    fd -e png "undate_logo"
fi

Length of output: 184


Script:

#!/bin/bash
# Let's verify the existence and structure of _static directory
ls -la _static/ 2>/dev/null || echo "Directory _static does not exist"

# Check if the image exists in any other location
find . -type f -name "undate_logo.png"

# List all png files in the repository to see if the logo might be named differently
find . -type f -name "*.png"

Length of output: 412

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
src/undate/converters/edtf/edtf.lark (1)

44-48: LGTM! Clear documentation for extended year handling.

The documentation and implementation correctly handle extended year ranges with the Y prefix. The clarification about negative year placement is particularly helpful.

Consider adding an example in the comments:

 // For negative years, the minus comes after the Y.
+// Example: "Y-170000" represents the year -170000
README.md (1)

89-119: Consider adding explanatory comments for comparison behavior

The comparison examples would benefit from brief comments explaining why certain comparisons raise NotImplementedError. This would help users better understand the limitations and behavior of date comparisons.

Example enhancement:

 # When dates cannot be compared due to ambiguity or precision, comparison methods raise a NotImplementedError.
+# For example, when comparing a specific day (2020-11-07) with its containing month (2020-11),
+# the comparison is ambiguous since the day falls within the month's range.
 
 >>> november_2020 = Undate(2020, 11)
 >>> november7_2020 > november_2020
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 867c5ca and e3c6f65.

📒 Files selected for processing (3)
  • README.md (2 hunks)
  • pyproject.toml (1 hunks)
  • src/undate/converters/edtf/edtf.lark (2 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md

[uncategorized] ~29-~29: Possible missing comma found.
Context: ...ibution information. ## Example Usage Often humanities and cultural data include im...

(AI_HYDRA_LEO_MISSING_COMMA)

🔇 Additional comments (8)
src/undate/converters/edtf/edtf.lark (1)

17-17: LGTM! Improved year pattern implementation.

The simplified regex pattern properly handles negative years while preventing unintended spaces, as previously suggested.

pyproject.toml (4)

2-3: LGTM: Clean build system configuration

The transition to hatchling as the build backend is well-structured and follows modern Python packaging practices.


38-38: Remove Python 3.13 classifier

Including Python 3.13 in classifiers is premature as it's not yet released and there's no testing infrastructure for this version.


61-64: Remove circular dependencies

The following dependencies create circular references:

  • undate[docs] and undate[test] in dev dependencies
  • undate in all dependencies

66-71: LGTM: Well-structured project URLs

All essential repository links are provided with correct branch references.

README.md (3)

7-10: LGTM: Well-formatted warning section

The warning section effectively uses GitHub's alert syntax to highlight the pre-alpha status and feature limitations.


21-26: LGTM: Comprehensive badge coverage

The badges section effectively communicates project status, including code quality (Ruff), test coverage, documentation status, and contributor recognition.


161-165: LGTM: Clear documentation references

The documentation section effectively directs users to both the ReadTheDocs site and developer notes.

Comment on lines +37 to +39
// In Level 2, year may be completely unspecified.
unspecified: /X/
?year_unspecified: /\d+/ unspecified+
?year_unspecified: /\d+/ unspecified+ | unspecified ~ 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Implementation needs adjustment to match EDTF Level 2 specification

The current implementation allowing XXXX (completely unspecified year) appears to exceed the documented Level 2 support. Based on the evidence:

  • The notebook explicitly states: "The only part of L2 that undate currently supports is allowing an unspecified digit anywhere in the date"
  • Test cases show XXXX pattern being used, but this isn't documented as a supported Level 2 feature
  • The implementation should focus on the specified Level 2 feature of allowing 'X' anywhere within a component, as shown in example '156X-12-25'

Suggested actions:

  • Remove or document the rationale for supporting completely unspecified years (XXXX)
  • Update tests to focus on the documented Level 2 feature of allowing 'X' anywhere within components
🔗 Analysis chain

Verify EDTF Level 2 specification compliance.

The addition of unspecified ~ 4 for completely unspecified years (XXXX) appears to be a Level 2 feature. Please ensure this aligns with the EDTF specification and that appropriate tests are in place.


Let me check the test files to verify the implementation details and specification compliance.


Let me check the EDTF specification reference in the codebase to ensure complete verification.


Let me check the notebook that demonstrates EDTF support to verify Level 2 compliance.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for related test cases and Level 2 implementation
# Look for test files that verify XXXX pattern handling
rg -l "XXXX" tests/
# Check if we're implementing other Level 2 features
rg "Level 2" src/

Length of output: 282


Script:

#!/bin/bash
# Check test cases for XXXX pattern
rg -B2 -A2 "XXXX" tests/test_converters/test_edtf.py tests/test_converters/edtf/test_edtf_parser.py tests/test_converters/edtf/test_edtf_transformer.py

# Check for any Level 2 documentation or specification references
rg -l "edtf.*level.*2|level.*2.*edtf" -i .

Length of output: 1852


Script:

#!/bin/bash
# Look for EDTF specification references in documentation or comments
rg -i "edtf.*specification|specification.*edtf" .

# Check for README or documentation files that might contain EDTF spec details
fd -e md -e rst -e txt | xargs rg -i "edtf"

Length of output: 1719


Script:

#!/bin/bash
# Extract content from the EDTF support notebook
cat examples/notebooks/edtf-support.ipynb | jq -r '.cells[] | select(.source != null) | .source[]' | grep -A5 -B5 -i "level.*2\|unspecified.*year"

Length of output: 1253

pyproject.toml Outdated Show resolved Hide resolved
rlskoeser and others added 2 commits November 18, 2024 17:25
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@rlskoeser rlskoeser merged commit 2d8de7f into main Nov 18, 2024
6 checks passed
@rlskoeser rlskoeser deleted the release/0.3 branch November 18, 2024 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants