Serialize Nones #999

Open. Wants to merge 20 commits into base: main

@flying-sheep (Member) commented Jun 7, 2023

Rendered release notes

@codecov (bot) commented Jun 7, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.63%. Comparing base (4bcd989) to head (230469c).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #999      +/-   ##
==========================================
- Coverage   87.08%   84.63%   -2.45%     
==========================================
  Files          40       40              
  Lines        6107     6116       +9     
==========================================
- Hits         5318     5176     -142     
- Misses        789      940     +151     
Files with missing lines Coverage Δ
src/anndata/_io/specs/methods.py 88.45% <100.00%> (-0.16%) ⬇️
src/anndata/_io/specs/registry.py 95.40% <ø> (-0.63%) ⬇️

... and 6 files with indirect coverage changes

@flying-sheep changed the title from "Save Nones" to "Serialize Nones" Jun 7, 2023
@flying-sheep flying-sheep marked this pull request as ready for review June 7, 2023 13:23
@ivirshup (Member)

As discussed, I don't want to change the on-disk format in the next release. So I'd hold off on this until at least 0.11.

@flying-sheep changed the milestone from 0.10.0 to 0.11.0 Jul 10, 2023
@flying-sheep (Member, Author)

Since 0.10 is released, we can document and then merge this, right?

@ivirshup (Member) commented Oct 6, 2023

I would like to have proposals for all disk format changes written up and discussed with other implementers before merging them to main.

@flying-sheep (Member, Author)

Sure, we can use this PR as reference implementation.

Which other implementers are there? Is there a channel with the relevant people on some communication medium?

@ivirshup changed the milestone from 0.11.0 to 0.12.0 Aug 8, 2024
@alam-shahul

Hi there, I feel that this is an important feature/bug fix. I was wondering if there are plans to merge this anytime soon, or if it's possible to help.

@flying-sheep flying-sheep changed the base branch from main to test-io-roundtrip October 1, 2024 14:19
Base automatically changed from test-io-roundtrip to main October 1, 2024 14:41
@flying-sheep flying-sheep requested review from ilan-gold and removed request for ivirshup December 12, 2024 12:09
@flying-sheep flying-sheep self-assigned this Dec 12, 2024
@ilan-gold (Contributor) left a comment

Two things:

  1. Should we put this feature behind a flag initially? I know we had spoken about that as a way forward before. However, in this case, I'm not really sure what we'd gain by doing that.
  2. This PR is simple enough that it can be merged without much comment. I think we should try to make it clear that this PR is planned for the 0.12 release and that we are looking for feedback (specifically from other languages).

Review threads: tests/test_readwrite.py (outdated, resolved), src/anndata/_io/specs/methods.py (resolved)
@flying-sheep (Member, Author) commented Jan 7, 2025

  1. Should we put this feature behind a flag initially? I know we had spoken about that as a way forward before. However, in this case, I'm not really sure what we'd gain by doing that.

I’m not super hot on the idea myself. The whole idea behind IOSpecs was that people get a nice error message that says: “this file has been written by a newer anndata version, so use that to load it”. Now we have both that and individual feature flags for e.g. nullable arrays. The intended purpose for the latter is that people see the feature but have to opt-in. But then again they already have to (less explicitly) opt in by using df.convert_dtypes() or dtype='string'.
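To make the "nice error message" idea concrete, here is a minimal sketch of version-aware dispatch. It is not anndata's actual registry code: the `IOSpec` dataclass, `KNOWN_SPECS`, `UnknownSpecError`, and `check_readable` below are hypothetical stand-ins, and the spec values are only example data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IOSpec:
    """Hypothetical stand-in for an on-disk element spec (not anndata's own class)."""

    encoding_type: str
    encoding_version: str


# Assumption: the set of specs that an imaginary older reader understands.
KNOWN_SPECS = frozenset({IOSpec("array", "0.2.0"), IOSpec("dataframe", "0.2.0")})


class UnknownSpecError(ValueError):
    """Raised when a file contains an element this reader cannot decode."""


def check_readable(spec: IOSpec) -> None:
    """Return silently for known specs, raise a descriptive error otherwise."""
    if spec in KNOWN_SPECS:
        return
    raise UnknownSpecError(
        f"No reader registered for encoding-type {spec.encoding_type!r} "
        f"version {spec.encoding_version!r}. The file may have been written "
        "by a newer version of the library; try upgrading to load it."
    )
```

With this shape, an older reader that meets an element type it has never seen fails with an actionable message instead of an obscure decoding error.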

Now that I’m thinking about it, an alternative possible solution could have been compatibility profiles for each AnnData version with new features: adata.save('foo.h5ad', compat='0.10') would e.g. crash if there are nullable arrays in the file. And to convert to an older version, one would create an env with a new anndata version, delete or modify the incompatible entries, and convert it.
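A rough sketch of how such a compatibility profile could be checked before writing. Everything here is hypothetical: the feature names, the version numbers in `FEATURE_INTRODUCED_IN`, and the functions `incompatible_features` and `check_compat` are placeholders for illustration, not part of anndata.

```python
from __future__ import annotations

# Assumption: placeholder mapping from feature name to the release that introduced it.
FEATURE_INTRODUCED_IN: dict[str, tuple[int, ...]] = {
    "nullable-array": (0, 11),
    "none-value": (0, 12),
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '0.10' into (0, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def incompatible_features(features: set[str], compat: str) -> list[str]:
    """List the used features that are newer than the requested compat version."""
    target = parse_version(compat)
    return sorted(f for f in features if FEATURE_INTRODUCED_IN[f] > target)


def check_compat(features: set[str], compat: str) -> None:
    """Raise if the object uses anything the target version cannot read."""
    bad = incompatible_features(features, compat)
    if bad:
        raise ValueError(
            f"Cannot write with compat={compat!r}: uses newer features {bad}"
        )
```

A writer would call `check_compat` with the features detected in the object, so the failure happens at write time rather than when someone with an older version tries to read the file.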

  2. This PR is simple enough that it can be merged without much comment. I think we should try to make it clear that this PR is planned for the 0.12 release and that we are looking for feedback (specifically from other languages).

I don’t understand the first half: With it being milestone 0.12, we don’t backport its changelog entry to the 0.11.x branch; that’s what makes this a 0.12 change.

What kind of feedback are we looking for?

@ilan-gold (Contributor)

What kind of feedback are we looking for?

More just going off of #673 (comment). Maybe we'll learn something but honestly, from what I can tell, this feature should not be a problem for the programming languages that have anndata support (from memory).

I’m not super hot on the idea myself.

Ok, I think since neither of us can come up with an idea, then maybe we just merge this PR and note to the community that this feature is coming, and if they want to try it out, it's on main. We will also have the integration tests to see what's up with other packages. So overall, I think a no-opt-in way forward is fine too.

Successfully merging this pull request may close these issues.

.write does not save None values
4 participants