Slow import times due to always importing optional packages #1460
Comments
That's not the intention of these imports. Instead, we need a list of types that Pint is not allowed to wrap; it's not about enabling optional features so much as ensuring that erroneous behavior cannot come about with whatever packages a user happens to have in their environment (e.g., we need to make sure that attempting to stick a pandas Series inside a Pint Quantity always fails). Having a "deny list" rather than an "allow list" was a decision made early on, given Pint's relatively high place in the generally agreed-upon type casting hierarchy, although it will likely be revisited once consensus is reached on pydata/duck-array-discussion#3. That being said, using type checks against the actual array types themselves still means those other libraries' slow import times also slow down Pint. This style of check has caused circular import issues in the past as well (see ...).
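To make the trade-off above concrete, here is a minimal sketch (illustrative only, not Pint's actual code; all names are made up) contrasting an eager isinstance-based deny list, which pays for pandas' import time at `import pint`, with a name-based check that never imports the library itself:

```python
# Eager deny list: import the optional library just to have classes available
# for isinstance checks. This is what makes `import pint` pay pandas' startup cost.
try:
    from pandas import DataFrame, Series
    _eager_upcast_types = (Series, DataFrame)
except ImportError:
    _eager_upcast_types = ()


def is_upcast_type_eager(other):
    return isinstance(other, _eager_upcast_types)


# Name-based deny list: compare fully qualified type names instead, so nothing
# is imported until the user actually hands Pint an object of that type.
_upcast_type_names = {"pandas.core.series.Series", "pandas.core.frame.DataFrame"}


def is_upcast_type_by_fqn(other):
    t = type(other)
    return f"{t.__module__}.{t.__qualname__}" in _upcast_type_names
```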
not really, besides pushing pydata/duck-array-discussion#3 forward. For now, the code for `fully_qualified_name` would be:

```python
def fully_qualified_name(obj):
    t = type(obj)
    module = t.__module__
    name = t.__qualname__
    if module is None or module in ("__builtin__", "builtins"):
        # builtins get no module prefix ("builtins" on Python 3, "__builtin__" on Python 2)
        return name
    return f"{module}.{name}"
```

and we'd define a …
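For illustration (not part of the original comment), the helper above would behave like this on an object the user has already created, assuming pandas happens to be installed in the user's environment:

```python
import pandas as pd  # imported by the *user's* code, never by Pint itself

s = pd.Series([1.0, 2.0, 3.0])
print(fully_qualified_name(s))  # -> "pandas.core.series.Series"
print(fully_qualified_name(3))  # -> "int" (builtins come back without a module prefix)
```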
I like @keewis' proposal. One improvement would be to "delay" the import until the fully qualified name matches. The rationale would be something like this (disregard the variable names):

```python
from importlib import import_module

# Maps a fully qualified name to the actual type (once imported), a callable
# that performs a more complex import, or None (import lazily from the name).
upcast_types = {"fully.qualified.name": None}


def is_upcast_type_by_name(other):
    fqn = fully_qualified_name(other)
    try:
        importer = upcast_types[fqn]
    except KeyError:
        return False
    if isinstance(importer, type):
        cls = importer  # the class itself was registered or already cached
    elif callable(importer):
        cls = importer()  # a registered function does a more complex import
    else:
        # None: import lazily from the fully qualified name itself.
        module_name, class_name = fqn.rsplit(".", 1)
        cls = getattr(import_module(module_name), class_name)
    upcast_types[fqn] = cls
    # This is to check we are importing the same thing and avoid weird problems.
    # Maybe instead of returning we should raise an error if this is False.
    return isinstance(other, cls)


def is_upcast_type(other):
    if type(other) in upcast_types.values():
        return True
    return is_upcast_type_by_name(other)
```

and in:

```python
# Meaning: "we know of its existence and we can handle it".
# Alternatively, we can also allow having a function that does a more complex import.
upcast_types["xarray.core.dataarray.DataArray"] = None
```
Hi, thanks to the developers and maintainers for creating and supporting this software.
I've noticed some slow (> 1 second) import times and used Tuna to isolate it as the Pandas import.
Here is Tuna's output for a script containing only `from pint import UnitRegistry`:

[Tuna profile screenshot]

It seems to come from here, the result of #959, in which pandas is imported if it's available, alongside some other packages, whether or not it's used. Note that I don't have `pint-pandas` installed, so I'm not trying to take advantage of that functionality. I assume the thinking there is "if it's not needed, the user shouldn't have it installed," but pandas is used by so many packages that it's hard to avoid it creeping in. Most of my use cases, e.g. tracking small amounts of data or something as simple as providing a custom unit registry, don't use pandas, so the ~1 second of import time is not a desired side effect.
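A quick way to check for this side effect (a rough sketch of my own, not the Tuna profiling mentioned above; whether pandas actually appears depends on the installed Pint version) is to time the import and then look at `sys.modules`:

```python
import sys
import time

start = time.perf_counter()
from pint import UnitRegistry  # noqa: F401

print(f"pint import took {time.perf_counter() - start:.2f} s")

# With the eager optional imports, pandas shows up here even though
# pint-pandas is not installed and pandas is never used afterwards.
print("pandas imported:", "pandas" in sys.modules)
```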