Fix the bug that caused `MAIN_GAIA_TABLE` in `astroquery.gaia` to not work #2153
astroquery/gaia/core.py
@@ -357,7 +369,9 @@ def get_datalinks(self, ids, verbose=False):
         return self.__gaiadata.get_datalinks(ids=ids, verbose=verbose)

     def __query_object(self, coordinate, radius=None, width=None, height=None,
-                       async_job=False, verbose=False, columns=[]):
+                       async_job=False, verbose=False, columns=[],
+                       table_name=None, ra_column_name=None,
We don't do this with the other modules, and I don't think the overall preference is to bloat the number of kwargs with configurable members. cc @keflavich
I agree. I see the motivation to have a configurable default table and row limit (:+1: for those), but not for the RA/Dec column name. If there's a specific reason to make that configurable (i.e., if the RA/Dec columns have version numbers in them), then that would be a good enough reason, though.
The main logic was to just use another instance for that, right? (Not that `gaia_xyz = Gaia()` works; it doesn't, unlike other modules.) If useful, we could even make it into the `__init__`, as we do in e.g. Vizier, to make it very convenient.
The only reason I added the `*_column_name` arguments to the `query_object` functions was that the `cone_search` functions already had them, and having a common set of arguments for both seemed like a good idea.
That being said I don't think those config options should exist at all. It seems to me that having the coordinate system as a config option and/or a keyword argument would be beneficial, but the names of the longitude and latitude coordinates should be automatically taken from some built-in dictionary instead of having to rely on the user to supply both. If this is an idea we wish to pursue further in some other pull request then we would have to deprecate the `*_column_name` config options, and adding more public keyword arguments in this pull request that would have to be deprecated in the future would be counterproductive.
The best thing to do here seems to be to undo the addition of the `*_column_name` arguments to the `query_object` functions.
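The built-in dictionary suggested above could look roughly like the sketch below. The mapping, its keys and the helper `coordinate_columns` are hypothetical and are not part of `astroquery.gaia`; the column names are only illustrative.

```python
# Hypothetical mapping from a coordinate system name to the pair of
# (longitude, latitude) column names. The names are illustrative only.
COORDINATE_COLUMNS = {
    "icrs": ("ra", "dec"),
    "galactic": ("l", "b"),
    "ecliptic": ("ecl_lon", "ecl_lat"),
}


def coordinate_columns(system="icrs"):
    """Return the (longitude, latitude) column names for a coordinate system."""
    try:
        return COORDINATE_COLUMNS[system.lower()]
    except KeyError:
        valid = ", ".join(sorted(COORDINATE_COLUMNS))
        raise ValueError(
            f"unknown coordinate system {system!r}; expected one of: {valid}"
        ) from None


lon_column, lat_column = coordinate_columns("galactic")
```

With something like this the user would pick one coordinate system, and the two column names could never get out of sync with each other.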
I agree with the idea that coordinate system selection should be by system, not pairs of columns.
Regarding configuration of the default table / row limit: those are, in many packages, accessible as both configuration items and per-instance items. The table selection is more like a base URL choice, which is always(?) configurable at the config level.
@astropy/coordinators - could someone add @eerovaher to the org, it's way overdue!
Before I update this pull request I need to make sure that I understand how the configuration system is supposed to work here. The If that is the case then I should remove all the new keyword arguments and undo the deprecation of the class attributes. Based on the discussion above we might want to actually deprecate some of the attributes, but that would be better done in a separate pull request because it's not really related to the issues that are being fixed here.
These are good questions. There are some things that are meant to be treated as class attributes or configuration items - things that you want to persist across a whole session and are more directly associated with a feature of the service, such as: the URL of the service, the timeout limit, etc. Other items, anything that should vary from one query to the next, should be left as keyword arguments. The line between these different classes of items is not sharp, though, and there can be arguments both ways for some of the attributes/kwargs. If there is ambiguity, it is acceptable to have function-specific kwargs that override the class attribute for that single call, though generally this is probably not needed. @bsipocz feel free to disagree with / suggest changes to this "policy". I am simply summarizing how it has been done (sometimes), not how it has to be.
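The split described above, with session-level settings as class attributes and an optional keyword argument that overrides one of them for a single call, can be sketched as follows. `ExampleService`, its attributes and `build_query` are hypothetical names used only for illustration; this is not astroquery code.

```python
class ExampleService:
    """Hypothetical query class; not an astroquery API."""

    # Session-level settings: they persist across calls and describe the service.
    URL = "https://example.org/tap"
    TIMEOUT = 60
    ROW_LIMIT = 50

    def build_query(self, table, row_limit=None):
        """Build a query string for ``table``.

        ``row_limit`` overrides ``self.ROW_LIMIT`` for this call only.
        """
        limit = self.ROW_LIMIT if row_limit is None else row_limit
        return f"SELECT TOP {limit} * FROM {table}"


service = ExampleService()
default_query = service.build_query("some_table")
single_call_query = service.build_query("some_table", row_limit=5)
```

The keyword argument leaves `service.ROW_LIMIT` untouched, so the next call without `row_limit` falls back to the session-level value again.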
@keflavich, based on your answer it would seem that the table name should be available as a class attribute and row limit should be available as a function argument.
Yes, though I would suggest including
What I meant was that the table name should be a class attribute in addition to being in
I don't really disagree with any of the above, but prefer to have consistency, or converge toward consistency among the modules. And if possible not bloat up the number of method kwargs with repeated ones (but am not against having things as class attributes that are modifiable either by config and class kwargs).
We always make the config items class attributes. e.g., in Vizier, it's set during While in SIMBAD it's a class attribute defined earlier... Yes, we should definitely make these consistent. I prefer the Vizier approach out of the two above, but I would defer to others if there's reason to prefer the direct attribute approach.
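The two styles being compared, a value set when an instance is created versus a class attribute defined directly in the class body, can be sketched roughly as below. Both classes and the `conf` stand-in are hypothetical; they do not reproduce the actual Vizier or SIMBAD code.

```python
from types import SimpleNamespace

# Stand-in for a module-level configuration object (hypothetical).
conf = SimpleNamespace(row_limit=50)


class InitStyle:
    """The config value is read each time an instance is created."""

    def __init__(self, row_limit=None):
        self.ROW_LIMIT = conf.row_limit if row_limit is None else row_limit


class AttributeStyle:
    """The config value is copied once, when the class body is executed."""

    ROW_LIMIT = conf.row_limit


# The configuration changes at runtime, after both classes are defined.
conf.row_limit = 100

late_instance = InitStyle()                 # sees the updated config value
custom_instance = InitStyle(row_limit=10)   # per-instance override
attribute_instance = AttributeStyle()       # still holds the value copied earlier
```

In this sketch the class-body copy does not follow later changes to `conf`, so a user of `AttributeStyle` would have to reassign `ROW_LIMIT` directly.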
I'll add, though, that the fact that this is the first time this has come up (or close to) suggests that these options are not often used and/or not confusing to users, so while consistency would be better, we do not seem to have a glaring UI problem.
For consistency's sake it does seem better to make the row limit available through It looks like the changes needed in the code are much smaller than I had thought, and in that case it becomes more difficult to justify dealing with all three issues here simultaneously. Should I open a separate pull request for addressing #1760 and only deal with the
What came up before is that the dr release attribute wasn't mutable for Gaia. But that module is not really usable the way we envisioned the rest with modified attributes/configs. So that's a clear bug. I think I have the slight preference to use class instances, as in the Simbad docs, but don't have a strong preference whether values are set at initialisation or mutable later as attributes. How about systematically checking what we're doing in different modules, but also pulling in more people to ask for their API opinions (e.g. AdrianD). And am sorry @eerovaher for derailing your bugfix PR.
Yeah, we should definitely have an issue summarizing where we stand and requesting feedback. And ditto, sorry @eerovaher for derailing!
9651942 to a736fd1
Codecov Report
@@            Coverage Diff             @@
##             main    #2153      +/-   ##
==========================================
+ Coverage   66.42%   66.44%   +0.01%
==========================================
  Files         418      418
  Lines       28026    28041      +15
==========================================
+ Hits        18616    18631      +15
  Misses       9410     9410
Currently querying the Gaia archive can return results from `gaiadr2.gaia_source` even if the user has specified that they wish to query a different table. This commit adds a regression test to see if the right table actually gets queried.
It is now possible to change the table Gaia queries are made from through the MAIN_GAIA_TABLE config item or class attribute at runtime. If the class attribute is specified then it will take precedence over the config item.
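The precedence described in this commit message can be illustrated with a minimal sketch. It assumes that an unset class attribute is represented by `None` and that the table name is resolved at call time; `ExampleGaia`, `table_to_query` and the `conf` stand-in are hypothetical, the table names other than `gaiadr2.gaia_source` are made up, and this is not the code from the pull request.

```python
from types import SimpleNamespace

# Stand-in for the module's configuration object (hypothetical).
conf = SimpleNamespace(MAIN_GAIA_TABLE="gaiadr2.gaia_source")


class ExampleGaia:
    """Hypothetical class; not the real GaiaClass."""

    # None means "no override": fall back to the config item at call time.
    MAIN_GAIA_TABLE = None

    def table_to_query(self):
        # Resolve the table when the query is made, not when the class is
        # defined, so that runtime changes to conf are picked up.
        return self.MAIN_GAIA_TABLE or conf.MAIN_GAIA_TABLE


gaia = ExampleGaia()
from_default = gaia.table_to_query()

# Changing the config item at runtime affects later queries.
conf.MAIN_GAIA_TABLE = "example_schema.table_a"
from_conf = gaia.table_to_query()

# Setting the class attribute takes precedence over the config item.
gaia.MAIN_GAIA_TABLE = "example_schema.table_b"
from_attribute = gaia.table_to_query()
```

A regression test in the spirit of the first commit would assert that the resolved table name, and therefore the generated query, follows these changes instead of always pointing at `gaiadr2.gaia_source`.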
a736fd1 to 561011f
I rebased the commits to solve a merge conflict in the change log.
@bsipocz, it would be nice to get these bugfixes merged.
Thanks @eerovaher!
This pull request has two approvals, so it should be good to merge now.
Hah, I meant to merge not just approve last night, sorry about that 😅
Instead of attempting to overhaul the configuration system in `astroquery.gaia`, this pull request has been scaled down to only fix #2093 and fix #2099. #1760 will be dealt with in a separate pull request.
The original pull request description is preserved below to provide context to the discussions that took place here.
ORIGINAL PULL REQUEST MESSAGE (OUTDATED!):
The `cone_search()` and `query_object()` families of functions in `astroquery.gaia` now properly use the `astroquery.gaia.conf` items `MAIN_GAIA_TABLE`, `MAIN_GAIA_TABLE_RA`, `MAIN_GAIA_TABLE_DEC` and `ROW_LIMIT`, meaning that these configuration options can be updated at runtime. Furthermore, those functions also accept optional keyword arguments that overrule the values from `conf`. Some functions already allowed for some of these arguments, and all new arguments use the same names as the existing ones. However, these names differ from the names used in `conf`, and renaming them might be desirable in the future.
The `GaiaClass` attributes that correspond to the previously listed `conf` items provide no value and have been deprecated.
There is still more that can be done. For example `VALID_DATALINK_RETRIEVAL_TYPES` should either not be defined in `conf` at all or it should be defined as an instance of `ConfigItem`. Another question not addressed here is if the config options should affect the `cross_match()` function. But this pull request is quite large already and it fixes #2093, #2099 and #1760.