Use last element as end time #58
     end = datetime.min
     for s, e in self._start_include + self._startend_include + self._end_include:
         start = min(start, s)
-        end = max(end, s)
+        end = max(end, e)
Is this the cleanest way to set the `start`/`end`?
Also, why does `start` depend on `self._end_include` and `end` depend on `self._start_include`?
Maybe `itertools.chain` can improve the readability:
    from itertools import chain
    [...]
    if self._start_include or self._startend_include:
        start = min(s for s, e in chain(self._start_include, self._startend_include))
    else:
        start = datetime.max
    if self._end_include or self._startend_include:
        end = max(e for s, e in chain(self._end_include, self._startend_include))
    else:
        end = datetime.min
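As a side note on the suggestion above, the `if`/`else` branches could also be folded into the `default=` argument of the built-in `min`/`max`. This is only a sketch: `envelope_bounds` is a hypothetical stand-alone helper, not a function of the project, and it assumes each list holds `(start, end)` tuples of `datetime` objects.

```python
from datetime import datetime
from itertools import chain


def envelope_bounds(start_include, startend_include, end_include):
    """Hypothetical helper: return (start, end) with the same fallbacks as above."""
    # default= is returned when the iterable is empty, replacing the if/else.
    start = min(
        (s for s, _ in chain(start_include, startend_include)),
        default=datetime.max,
    )
    end = max(
        (e for _, e in chain(end_include, startend_include)),
        default=datetime.min,
    )
    return start, end


empty_bounds = envelope_bounds([], [], [])
some_bounds = envelope_bounds(
    [(datetime(2020, 1, 1), datetime(2020, 2, 1))],
    [(datetime(2020, 3, 1), datetime(2020, 4, 1))],
    [(datetime(2020, 5, 1), datetime(2020, 6, 1))],
)
```

Note that this keeps the reviewer's split of the lists between `start` and `end`; whether that split matches the intended semantics is exactly the open question in this thread.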
I don't think the chain-version is very readable here.
> I don't think the chain-version is very readable here.

Yes, at the cost of creating new lists, this could be written in a simpler way:
    if self._start_include or self._startend_include:
        start = min(s for s, e in self._start_include + self._startend_include)
    else:
        start = datetime.max
    if self._end_include or self._startend_include:
        end = max(e for s, e in self._end_include + self._startend_include)
    else:
        end = datetime.min
In any case, my question still stands:

> Also, why does `start` depend on `self._end_include` and `end` depend on `self._start_include`?
The (start, end) tuples are the time ranges of the filter, while start_include/end_include refer to the measurements' start time/end time.
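To illustrate the point above, here is a minimal sketch of the envelope computation, assuming every include list holds `(start, end)` range tuples as described. `envelope` is written as a hypothetical free function over a plain list, not the project's method, and the dates are made up for illustration.

```python
from datetime import datetime


def envelope(ranges):
    """Return the earliest start and latest end over all (start, end) ranges."""
    start, end = datetime.max, datetime.min
    for s, e in ranges:
        start = min(start, s)
        # The fix in this PR: use the range's end (e), not its start (s).
        end = max(end, e)
    return start, end


# Illustrative filter ranges (assumed values).
ranges = [
    (datetime(2020, 1, 1), datetime(2020, 6, 30)),
    (datetime(2020, 3, 1), datetime(2020, 12, 31)),
]
result = envelope(ranges)
```

With `max(end, s)` the end of the envelope would have been 2020-03-01, the latest start, rather than the latest end.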
@@ -579,16 +579,15 @@ def init_kwargs(self):
         }

     def _index_from_include_exclude(self, times1, times2, includes, excludes):
         idx = times1.astype("bool")
         if len(includes) == 0:
Looks like `includes` is assumed to be an `Iterable`, in which case `if len(includes) == 0:` is equivalent to `if not includes:`.
With this in mind, this section can be written as:
    idx = np.repeat(not includes, len(times1))
    for start, end in includes:
        idx |= (np.datetime64(start) <= times1) & (times2 <= np.datetime64(end))
While `idx = np.repeat(not includes, len(times1))` is written using very few characters, the boolean value of `includes` is related to the boolean value of `idx` only through a chain of actions; e.g. we could also end up with `idx = np.repeat(includes, ...)` if we used `idx` differently.
Rather than having to add long comments, please write more explicit code, even if it is longer.
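For comparison, a more explicit version might look like the sketch below. It assumes, following the one-liner suggested above, that an empty `includes` means "select everything"; `index_from_includes` is a hypothetical stand-alone function that ignores `excludes`, and the sample times are made up.

```python
from datetime import datetime

import numpy as np


def index_from_includes(times1, times2, includes):
    """Hypothetical helper: boolean mask of entries lying inside any include range."""
    if len(includes) == 0:
        # Assumption: no include ranges means nothing is filtered out.
        return np.ones(len(times1), dtype=bool)
    idx = np.zeros(len(times1), dtype=bool)
    for start, end in includes:
        # An entry matches when it starts and ends within the range.
        idx |= (np.datetime64(start) <= times1) & (times2 <= np.datetime64(end))
    return idx


times1 = np.array(
    ["2020-01-01T00:00:00", "2020-02-01T00:00:00", "2020-03-01T00:00:00"],
    dtype="datetime64[s]",
)
times2 = times1 + np.timedelta64(3600, "s")  # each measurement lasts one hour
mask = index_from_includes(
    times1, times2, [(datetime(2020, 1, 15), datetime(2020, 2, 15))]
)
mask_all = index_from_includes(times1, times2, [])
```

Here the initial value of the mask is spelled out in each branch, so the reader does not have to derive it from the truthiness of `includes`.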
Thanks, looks good.
Please check whether we could improve the code by internally storing all datetimes as np.datetime64[s] directly.
@@ -614,7 +613,7 @@ def envelope(self) -> tuple[datetime, datetime]:
         end = datetime.min
         for s, e in self._start_include + self._startend_include + self._end_include:
             start = min(start, s)
-            end = max(end, s)
+            end = max(end, e)
Thanks, it is great that you also included a test.
     for start, end in includes:
-        idx |= (start <= times1) & (times2 <= end)
+        idx |= (np.datetime64(start) <= times1) & (times2 <= np.datetime64(end))
I'm not sure why the conversion to np.datetime64 is needed here. More specifically, if the operations need to be done as np.datetime64, then maybe _str_list_to_datetime_list should already do the conversion?
We should probably store the datetimes in NumPy format; I opened issue #59 for this.
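One possible shape for that change is sketched below: convert each range once, up front, so the comparison loop no longer needs per-iteration `np.datetime64(...)` calls. `to_datetime64_pairs` is a hypothetical helper, not an existing function of the project, and second resolution is assumed from the `np.datetime64[s]` suggestion above.

```python
from datetime import datetime

import numpy as np


def to_datetime64_pairs(pairs):
    """Hypothetical helper: convert (start, end) datetimes to datetime64[s] once."""
    return [
        (
            np.datetime64(s).astype("datetime64[s]"),
            np.datetime64(e).astype("datetime64[s]"),
        )
        for s, e in pairs
    ]


includes = to_datetime64_pairs([(datetime(2020, 1, 15), datetime(2020, 2, 15))])
times = np.array(["2020-01-01T00:00:00", "2020-02-01T00:00:00"], dtype="datetime64[s]")

# The stored values can now be compared against the array directly.
start, end = includes[0]
inside = (start <= times) & (times <= end)
```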
Added an additional fix for type comparisons