Use last element as end time #58

Merged
2 commits merged on Nov 1, 2024

Conversation

magnusuMET
Collaborator

@magnusuMET magnusuMET commented Nov 1, 2024

Added an additional fix for type comparisons

@magnusuMET magnusuMET requested a review from heikoklein November 1, 2024 13:15
Comment on lines 614 to +617
         end = datetime.min
         for s, e in self._start_include + self._startend_include + self._end_include:
             start = min(start, s)
-            end = max(end, s)
+            end = max(end, e)

Collaborator

@avaldebe avaldebe Nov 1, 2024

Is this the cleanest way to set the start/end?
Also, why does start depend on self._end_include, and end on self._start_include?

Maybe itertools.chain can improve the readability:

from itertools import chain
[...]
        if self._start_include or self._startend_include:
            start = min(s for s, e in chain(self._start_include, self._startend_include))
        else:
            start = datetime.max

        if self._end_include or self._startend_include:
            end = max(e for s, e in chain(self._end_include, self._startend_include))
        else:
            end = datetime.min
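
As a side note, the `if`/`else` branches in the suggestion above could be folded into the `default=` argument of `min` and `max`. The following is only a sketch of that idea: `envelope_bounds` is a hypothetical free function (not part of the project), and it mirrors the split semantics of this suggestion rather than the merged code.

```python
from datetime import datetime
from itertools import chain


def envelope_bounds(start_include, startend_include, end_include):
    """Hypothetical helper: smallest start and largest end of the given ranges.

    Each argument is assumed to be a list of (start, end) datetime tuples.
    """
    # default= is returned when the chained iterable is empty,
    # which replaces the explicit if/else branches.
    start = min(
        (s for s, _ in chain(start_include, startend_include)),
        default=datetime.max,
    )
    end = max(
        (e for _, e in chain(end_include, startend_include)),
        default=datetime.min,
    )
    return start, end
```

With all three lists empty, this returns `(datetime.max, datetime.min)`, the same sentinels as in the suggestion.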

Member

I don't think the chain-version is very readable here.

Collaborator

> I don't think the chain-version is very readable here.

Yes, at the cost of creating new lists, this could be written in a simpler way:

        if self._start_include or self._startend_include:
            start = min(s for s, e in self._start_include + self._startend_include)
        else:
            start = datetime.max

        if self._end_include or self._startend_include:
            end = max(e for s, e in self._end_include + self._startend_include)
        else:
            end = datetime.min

In any case, my question still stands:

> Also, why does start depend on self._end_include, and end on self._start_include?

Member

The (start, end) tuples are the time ranges of the filter, whereas start_include/end_include refer to the measurements' start time/end time.
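
To illustrate this point: since every tuple is a filter time range regardless of which list it is stored in, the envelope takes the minimum of all range starts and the maximum of all range ends across the three lists. The sketch below is a standalone approximation of the loop in the diff; the free function `envelope` and the `datetime.max` initial value for `start` are assumptions, since only part of the method is visible here.

```python
from datetime import datetime


def envelope(start_include, startend_include, end_include):
    """Sketch of the envelope loop from the diff, as a free function.

    Each argument is assumed to be a list of (start, end) datetime tuples.
    """
    start = datetime.max  # assumed initial value, not shown in the hunk
    end = datetime.min
    for s, e in start_include + startend_include + end_include:
        start = min(start, s)
        # The fix in this PR: use the range end `e`, not the range start `s`.
        end = max(end, e)
    return start, end
```

With the previous `end = max(end, s)`, the returned end would be the latest range start rather than the latest range end.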

@@ -579,16 +579,15 @@ def init_kwargs(self):
         }

     def _index_from_include_exclude(self, times1, times2, includes, excludes):
         idx = times1.astype("bool")
         if len(includes) == 0:

Collaborator

@avaldebe avaldebe Nov 1, 2024

Looks like includes is assumed to be an Iterable, in which case
if len(includes) == 0: is equivalent to if not includes:.

With this in mind, this section can be written as:

        idx = np.repeat(not includes, len(times1))
        for start, end in includes:
            idx |= (np.datetime64(start) <= times1) & (times2 <= np.datetime64(end))

Member

While idx = np.repeat(not includes, len(times1)) is very compact, the boolean value of includes is related to the boolean value of idx only through a chain of reasoning; e.g., we could just as well end up with idx = np.repeat(includes, ...) if we used idx differently.

Rather than having to add long comments, please use more explicit code instead.
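
One possible reading of "more explicit code" is sketched below. It is not the merged implementation: `index_from_includes` is a hypothetical free function, the `excludes` handling is left out, and it assumes (as the one-liner above implies) that an empty `includes` selects everything.

```python
import numpy as np


def index_from_includes(times1, times2, includes):
    """Hypothetical sketch: boolean mask of entries inside any include range.

    times1/times2 are assumed to be datetime64 arrays of equal length;
    includes is assumed to be a list of (start, end) datetime tuples.
    """
    if len(includes) == 0:
        # No include ranges given: keep every entry.
        return np.ones(len(times1), dtype=bool)

    # Start with nothing selected, then add each include range.
    idx = np.zeros(len(times1), dtype=bool)
    for start, end in includes:
        idx |= (np.datetime64(start) <= times1) & (times2 <= np.datetime64(end))
    return idx
```

The two initial values are spelled out separately, so the link between an empty `includes` and an all-`True` mask is visible without a comment explaining the negation.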

Member

@heikoklein heikoklein left a comment

Thanks, looks good.
Could you check whether we could improve the code by storing all datetimes internally as np.datetime64[s] directly?

@@ -614,7 +613,7 @@ def envelope(self) -> tuple[datetime, datetime]:
         end = datetime.min
         for s, e in self._start_include + self._startend_include + self._end_include:
             start = min(start, s)
-            end = max(end, s)
+            end = max(end, e)

Member

Thanks, it is great that you also included a test.

         for start, end in includes:
-            idx |= (start <= times1) & (times2 <= end)
+            idx |= (np.datetime64(start) <= times1) & (times2 <= np.datetime64(end))

Member

I'm not sure why the conversion to np.datetime64 is needed here. More specifically, if the operations need to be done as np.datetime64, then maybe _str_list_to_datetime_list should already do the conversion?

Collaborator Author

We should probably store the datetimes in NumPy format; I opened issue #59 for this.
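
A rough sketch of what such an up-front conversion might look like is given below. `to_datetime64_ranges` is a hypothetical helper, not an existing function in the project, and the choice of an `(n, 2)` array layout is an assumption made for illustration.

```python
from datetime import datetime

import numpy as np


def to_datetime64_ranges(ranges):
    """Hypothetical helper: convert (start, end) datetime tuples once.

    Returns an array of shape (n, 2) with dtype datetime64[s], so later
    comparisons need no per-iteration np.datetime64(...) calls.
    """
    # Go through microsecond precision first, then truncate to seconds.
    arr = np.array(ranges, dtype="datetime64[us]").astype("datetime64[s]")
    return arr.reshape(-1, 2)
```

Each row can then be unpacked as `start, end` and compared directly against datetime64 arrays, for example `(start <= times1) & (times2 <= end)`.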

@magnusuMET magnusuMET merged commit 923faeb into metno:main Nov 1, 2024
2 checks passed
@magnusuMET magnusuMET deleted the bugfix/endtime branch November 1, 2024 14:33