Conversion test #102

Draft · wants to merge 52 commits into base: main

Commits (52)
9907435
changes for testing conversions locally
grg2rsr Oct 24, 2024
0c6b01b
some bugfixes to pass the processed-only checks
grg2rsr Oct 24, 2024
a161473
for local testing
grg2rsr Oct 25, 2024
caabeb6
read after write for raw ephys and video data added
grg2rsr Dec 10, 2024
3ab8de3
revision argument in all datainterfaces
grg2rsr Dec 11, 2024
58ccd21
cleanups
grg2rsr Dec 11, 2024
5e17eec
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 11, 2024
7999b4a
add signature to sorting interface
h-mayorquin Dec 13, 2024
dab27f0
Merge remote-tracking branch 'refs/remotes/origin/conversion_test' in…
h-mayorquin Dec 13, 2024
c58da11
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 13, 2024
6c80583
fix typing
h-mayorquin Dec 13, 2024
beae882
Merge remote-tracking branch 'refs/remotes/origin/conversion_test' in…
h-mayorquin Dec 13, 2024
9a1c01c
fix more typing errors
h-mayorquin Dec 13, 2024
2b9c4bf
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 13, 2024
5f9d77e
optional
h-mayorquin Dec 13, 2024
05d2958
integration of mine and hebertos changes
grg2rsr Dec 17, 2024
1ab3d11
Merge remote-tracking branch 'origin/conversion_test' into conversion…
grg2rsr Dec 17, 2024
e608b5d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 17, 2024
22feb6a
added automatic last revision to consistency checking
grg2rsr Dec 17, 2024
d4b80ef
Merge branch 'conversion_test' of github.com:catalystneuro/IBL-to-nwb…
grg2rsr Dec 17, 2024
76fc999
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 17, 2024
62d7a40
output path related fixes / cleanups
grg2rsr Dec 17, 2024
4202759
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 17, 2024
b640ee6
attempting to pass one to IblSortingInterface - fails currently by py…
grg2rsr Dec 18, 2024
6487907
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 18, 2024
45cbaca
one instantiation removed in IblSortingInterface, but requires hack i…
grg2rsr Dec 18, 2024
a522cb5
git mess fix
grg2rsr Dec 18, 2024
cff20a5
for heberto
grg2rsr Dec 18, 2024
86144df
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 18, 2024
72a7117
for sdsc
grg2rsr Dec 19, 2024
73899e5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 19, 2024
0c792fb
fix for one local on sdsc
grg2rsr Dec 19, 2024
d6984ce
Merge branch 'conversion_test' of github.com:catalystneuro/IBL-to-nwb…
grg2rsr Dec 19, 2024
cafbbd5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 19, 2024
9fad4ec
sdsc fix
grg2rsr Dec 19, 2024
86868dc
revisions bugfix
grg2rsr Dec 23, 2024
fcab8a9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 23, 2024
0b930d7
updates with revision hack: session_id = eid:revision
grg2rsr Jan 8, 2025
d7a5ec0
Merge branch 'conversion_test' of github.com:catalystneuro/IBL-to-nwb…
grg2rsr Jan 8, 2025
631dec9
ruff happiness
grg2rsr Jan 8, 2025
225a0f8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 8, 2025
497ba31
one scripts to convert both raw and processed, and also to to run on …
grg2rsr Jan 8, 2025
2da2f93
Merge branch 'conversion_test' of github.com:catalystneuro/IBL-to-nwb…
grg2rsr Jan 8, 2025
b72fd85
rest caching disabled
grg2rsr Jan 8, 2025
a8811a6
decompression and cleaup added
grg2rsr Jan 9, 2025
fed2679
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 9, 2025
b894d6a
decompression to scratch added
grg2rsr Jan 9, 2025
32aef9f
Merge branch 'conversion_test' of github.com:catalystneuro/IBL-to-nwb…
grg2rsr Jan 9, 2025
cf853a2
ruff
grg2rsr Jan 9, 2025
221f6b7
removing uuids from filenames of symbolic links
grg2rsr Jan 9, 2025
ca472a1
cleanup updated
grg2rsr Jan 9, 2025
b74359d
bugfix in symlink creater (uuid removal)
grg2rsr Jan 9, 2025
2 changes: 2 additions & 0 deletions .gitignore
@@ -134,3 +134,5 @@ dmypy.json
#misc
endpoint_schemas/
tests/
src/ibl_to_nwb/local/
.vscode
210 changes: 210 additions & 0 deletions src/ibl_to_nwb/_scripts/_convert_brainwide_map.py
@@ -0,0 +1,210 @@
import os
import shutil
import sys
from datetime import datetime
from pathlib import Path

import spikeglx

# if running on SDSC, use the OneSdsc, else the regular ONE
if "USE_SDSC_ONE" in os.environ:
    print("using SDSC ONE")
    from deploy.iblsdsc import OneSdsc as ONE
else:
    print("using regular ONE")
    from one.api import ONE

from ibl_to_nwb.converters import BrainwideMapConverter, IblSpikeGlxConverter
from ibl_to_nwb.datainterfaces import (
    BrainwideMapTrialsInterface,
    IblPoseEstimationInterface,
    IblSortingInterface,
    LickInterface,
    PupilTrackingInterface,
    RawVideoInterface,
    RoiMotionEnergyInterface,
    WheelInterface,
)


def create_symlinks(source_dir, target_dir, remove_uuid=True):
    """Replicate the tree under source_dir at target_dir in the form of symlinks."""
    for root, dirs, files in os.walk(source_dir):
        for dir in dirs:
            folder = target_dir / (Path(root) / dir).relative_to(source_dir)
            folder.mkdir(parents=True, exist_ok=True)

    for root, dirs, files in os.walk(source_dir):
        for file in files:
            source_file_path = Path(root) / file
            target_file_path = target_dir / source_file_path.relative_to(source_dir)
            if remove_uuid:
                parent, name = target_file_path.parent, target_file_path.name
                name_parts = name.split(".")
                # drop the dataset uuid, i.e. the second-to-last dot-separated part;
                # list.remove() would delete the first matching *value* and can hit the wrong part
                name_parts.pop(-2)
                target_file_path = parent / ".".join(name_parts)
            if not target_file_path.exists():
                target_file_path.symlink_to(source_file_path)
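
# Usage sketch (hypothetical paths): mirror a session's raw_ephys_data tree into a
# local scratch folder as symlinks, with uuids stripped from the filenames, e.g.
#   create_symlinks(Path("/data/<session>/raw_ephys_data"), Path("~/ibl_scratch/<eid>"))
# turns "..._t0.imec0.ap.<uuid>.cbin" into "..._t0.imec0.ap.cbin" at the target.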


def get_last_before(eid: str, one: ONE, revision: str):
    revisions = one.list_revisions(eid, revision="*")
    revisions = [datetime.strptime(revision, "%Y-%m-%d") for revision in revisions]
    revision = datetime.strptime(revision, "%Y-%m-%d")
    revisions = sorted(revisions)
    # index of the last revision that does not postdate the requested one
    ix = sum([not (rev > revision) for rev in revisions]) - 1
    return revisions[ix].strftime("%Y-%m-%d")
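
# Worked example (hypothetical dates): for available revisions ["2024-05-06",
# "2024-07-10", "2024-09-01"] and revision="2024-07-10", ix evaluates to 1 and
# the function returns "2024-07-10": the latest revision not after the cutoff.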


def convert(eid: str, one: ONE, data_interfaces: list, revision: str, mode: str):
    # Run conversion
    session_converter = BrainwideMapConverter(one=one, session=eid, data_interfaces=data_interfaces, verbose=True)
    metadata = session_converter.get_metadata()
    metadata["NWBFile"]["session_id"] = f"{eid}:{revision}"  # FIXME this hack has to go
    subject_id = metadata["Subject"]["subject_id"]

    # NOTE: relies on the module-level output_folder defined in the __main__ block below
    subject_folder_path = output_folder / f"sub-{subject_id}"
    subject_folder_path.mkdir(exist_ok=True)
    if mode == "raw":
        fname = f"sub-{subject_id}_ses-{eid}_desc-raw_ecephys+image.nwb"
    if mode == "processed":
        fname = f"sub-{subject_id}_ses-{eid}_desc-processed_behavior+ecephys.nwb"

    nwbfile_path = subject_folder_path / fname
    session_converter.run_conversion(
        nwbfile_path=nwbfile_path,
        metadata=metadata,
        overwrite=True,
    )
    return nwbfile_path
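
# Output files land in a DANDI-style layout (as derived from the code above):
#   <output_folder>/sub-<subject_id>/sub-<subject_id>_ses-<eid>_desc-<raw|processed>_*.nwb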


cleanup = False

if __name__ == "__main__":
    if len(sys.argv) == 1:
        eid = "caa5dddc-9290-4e27-9f5e-575ba3598614"
        mode = "raw"
    else:
        eid = sys.argv[1]
        mode = sys.argv[2]  # raw or processed

    print(eid)
    print(mode)
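
    # Example invocations (the first eid is the hard-coded default test session):
    #   python _convert_brainwide_map.py caa5dddc-9290-4e27-9f5e-575ba3598614 raw
    #   USE_SDSC_ONE=1 python _convert_brainwide_map.py <eid> processed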

    # path setup
    base_path = Path.home() / "ibl_scratch"
    output_folder = base_path / "nwbfiles"
    output_folder.mkdir(exist_ok=True, parents=True)
    local_scratch_folder = base_path / eid

    # common
    one_kwargs = dict(
        base_url="https://openalyx.internationalbrainlab.org",
        password="international",
        mode="remote",
    )

    # if not running on SDSC, add the cache folder explicitly
    if "USE_SDSC_ONE" in os.environ:
        one_kwargs["cache_rest"] = None  # disables rest caching (write permission errors on popeye)
    else:
        # Initialize IBL (ONE) client to download processed data for this session
        one_cache_folder_path = base_path / "ibl_conversion" / eid / "cache"
        one_kwargs["cache_dir"] = one_cache_folder_path

    # instantiate one
    one = ONE(**one_kwargs)

    # correct revision
    revision = get_last_before(eid=eid, one=one, revision="2024-07-10")

    # Initialize as many of each interface as we need across the streams
    data_interfaces = []

    if mode == "raw":
        # ephys
        session_folder = one.eid2path(eid)
        spikeglx_source_folder_path = session_folder / "raw_ephys_data"

        # create symlinks at local scratch
        create_symlinks(spikeglx_source_folder_path, local_scratch_folder)

        # check and decompress
        cbin_paths = []
        for root, dirs, files in os.walk(local_scratch_folder):
            for file in files:
                if file.endswith(".cbin"):
                    cbin_paths.append(Path(root) / file)
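
        # .cbin files are mtscomp-compressed SpikeGLX recordings; decompress any that
        # lack a .bin next to them (assumption: decompress_to_scratch writes the
        # decompressed .bin alongside the source symlink, hence the existence check)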

        for path in cbin_paths:
            if not path.with_suffix(".bin").exists():
                print(f"decompressing {path}")
                spikeglx.Reader(path).decompress_to_scratch()

        # Specify the path to the SpikeGLX files on the server but use ONE API for timestamps
        spikeglx_subconverter = IblSpikeGlxConverter(folder_path=spikeglx_source_folder_path, one=one, eid=eid)
        data_interfaces.append(spikeglx_subconverter)

        # video
        metadata_retrieval = BrainwideMapConverter(one=one, session=eid, data_interfaces=[], verbose=False)
        subject_id = metadata_retrieval.get_metadata()["Subject"]["subject_id"]

        pose_estimation_files = one.list_datasets(eid=eid, filename="*.dlc*")
        for pose_estimation_file in pose_estimation_files:
            camera_name = pose_estimation_file.replace("alf/_ibl_", "").replace(".dlc.pqt", "")

            video_interface = RawVideoInterface(
                nwbfiles_folder_path=output_folder,
                subject_id=subject_id,
                one=one,
                session=eid,
                camera_name=camera_name,
            )
            data_interfaces.append(video_interface)

    if mode == "processed":
        # These interfaces should always be present in source data
        data_interfaces.append(IblSortingInterface(one=one, session=eid, revision=revision))
        data_interfaces.append(BrainwideMapTrialsInterface(one=one, session=eid, revision=revision))
        data_interfaces.append(WheelInterface(one=one, session=eid, revision=revision))

        # These interfaces may not be present; check if they are before adding to list
        pose_estimation_files = one.list_datasets(eid=eid, filename="*.dlc*")
        for pose_estimation_file in pose_estimation_files:
            camera_name = pose_estimation_file.replace("alf/_ibl_", "").replace(".dlc.pqt", "")
            data_interfaces.append(
                IblPoseEstimationInterface(one=one, session=eid, camera_name=camera_name, revision=revision)
            )

        pupil_tracking_files = one.list_datasets(eid=eid, filename="*features*")
        for pupil_tracking_file in pupil_tracking_files:
            camera_name = pupil_tracking_file.replace("alf/_ibl_", "").replace(".features.pqt", "")
            data_interfaces.append(
                PupilTrackingInterface(one=one, session=eid, camera_name=camera_name, revision=revision)
            )

        roi_motion_energy_files = one.list_datasets(eid=eid, filename="*ROIMotionEnergy.npy*")
        for roi_motion_energy_file in roi_motion_energy_files:
            camera_name = roi_motion_energy_file.replace("alf/", "").replace(".ROIMotionEnergy.npy", "")
            data_interfaces.append(
                RoiMotionEnergyInterface(one=one, session=eid, camera_name=camera_name, revision=revision)
            )

        if one.list_datasets(eid=eid, collection="alf", filename="licks*"):
            data_interfaces.append(LickInterface(one=one, session=eid, revision=revision))

    # run the conversion
    nwbfile_path = convert(
        eid=eid,
        one=one,
        data_interfaces=data_interfaces,
        revision=revision,
        mode=mode,
    )

    # cleanup: unlink the symlinks first so the original session data is never touched,
    # then remove the scratch tree (including any decompressed .bin files)
    if cleanup:
        if mode == "raw":
            os.system(f"find {local_scratch_folder} -type l -exec unlink {{}} \\;")
            shutil.rmtree(local_scratch_folder)
106 changes: 106 additions & 0 deletions src/ibl_to_nwb/_scripts/_convert_brainwide_map_processed.py
@@ -0,0 +1,106 @@
import os
import sys
from datetime import datetime
from pathlib import Path

if "USE_SDSC_ONE" in os.envion:
from deploy.iblsdsc import OneSdsc as ONE
else:
from one.api import ONE

from ibl_to_nwb.converters import BrainwideMapConverter
from ibl_to_nwb.datainterfaces import (
BrainwideMapTrialsInterface,
IblPoseEstimationInterface,
IblSortingInterface,
LickInterface,
PupilTrackingInterface,
RoiMotionEnergyInterface,
WheelInterface,
)


def get_last_before(eid: str, one: ONE, revision: str):
    revisions = one.list_revisions(eid, revision="*")
    revisions = [datetime.strptime(revision, "%Y-%m-%d") for revision in revisions]
    revision = datetime.strptime(revision, "%Y-%m-%d")
    revisions = sorted(revisions)
    ix = sum([not (rev > revision) for rev in revisions]) - 1
    return revisions[ix].strftime("%Y-%m-%d")


def convert(eid: str, one: ONE, data_interfaces: list, revision: str):
    # Run conversion
    session_converter = BrainwideMapConverter(one=one, session=eid, data_interfaces=data_interfaces, verbose=True)
    metadata = session_converter.get_metadata()
    metadata["NWBFile"]["session_id"] = f"{eid}:{revision}"  # FIXME this hack has to go
    subject_id = metadata["Subject"]["subject_id"]

    subject_folder_path = output_folder / f"sub-{subject_id}"
    subject_folder_path.mkdir(exist_ok=True)
    fname = f"sub-{subject_id}_ses-{eid}_desc-processed.nwb"

    nwbfile_path = subject_folder_path / fname
    session_converter.run_conversion(
        nwbfile_path=nwbfile_path,
        metadata=metadata,
        overwrite=True,
    )
    return nwbfile_path


if __name__ == "__main__":
    if len(sys.argv) == 1:
        eid = "caa5dddc-9290-4e27-9f5e-575ba3598614"
    else:
        eid = sys.argv[1]

    # path setup
    base_path = Path.home() / "ibl_scratch"
    output_folder = base_path / "nwbfiles"
    output_folder.mkdir(exist_ok=True, parents=True)

    # Initialize IBL (ONE) client to download processed data for this session
    one_cache_folder_path = base_path / "ibl_conversion" / eid / "cache"
    one = ONE(
        base_url="https://openalyx.internationalbrainlab.org",
        password="international",
        mode="remote",
        # silent=True,
        cache_dir=one_cache_folder_path,
    )

    revision = get_last_before(eid=eid, one=one, revision="2024-07-10")

    # Initialize as many of each interface as we need across the streams
    data_interfaces = list()

    # These interfaces should always be present in source data
    data_interfaces.append(IblSortingInterface(one=one, session=eid, revision=revision))
    data_interfaces.append(BrainwideMapTrialsInterface(one=one, session=eid, revision=revision))
    data_interfaces.append(WheelInterface(one=one, session=eid, revision=revision))

    # These interfaces may not be present; check if they are before adding to list
    pose_estimation_files = one.list_datasets(eid=eid, filename="*.dlc*")
    for pose_estimation_file in pose_estimation_files:
        camera_name = pose_estimation_file.replace("alf/_ibl_", "").replace(".dlc.pqt", "")
        data_interfaces.append(
            IblPoseEstimationInterface(one=one, session=eid, camera_name=camera_name, revision=revision)
        )

    pupil_tracking_files = one.list_datasets(eid=eid, filename="*features*")
    for pupil_tracking_file in pupil_tracking_files:
        camera_name = pupil_tracking_file.replace("alf/_ibl_", "").replace(".features.pqt", "")
        data_interfaces.append(PupilTrackingInterface(one=one, session=eid, camera_name=camera_name, revision=revision))

    roi_motion_energy_files = one.list_datasets(eid=eid, filename="*ROIMotionEnergy.npy*")
    for roi_motion_energy_file in roi_motion_energy_files:
        camera_name = roi_motion_energy_file.replace("alf/", "").replace(".ROIMotionEnergy.npy", "")
        data_interfaces.append(
            RoiMotionEnergyInterface(one=one, session=eid, camera_name=camera_name, revision=revision)
        )

    if one.list_datasets(eid=eid, collection="alf", filename="licks*"):
        data_interfaces.append(LickInterface(one=one, session=eid, revision=revision))

    nwbfile_path = convert(eid=eid, one=one, data_interfaces=data_interfaces, revision=revision)