first commit

initial commit

initial commit

google colab link

update citation

update citation

update citation

update citation

Pre public (#5)

* initial commit

* notebook

* noteboob/google colab

* noteboob/google colab

* noteboob/google colab

* bug fixes

* depth vis

* readme update

* readme update

* bug fixes

* fancy readme

* fancy readme

* fix visualization

* datagen + shape pretraining

* readme update

* update readme

* readme+ datagen

* poster link

* data gen script

* datalinks in configs

link update readme

notebook torch version change

MIT license
zubair-irshad committed May 18, 2022
0 parents commit c4c8979
Showing 68 changed files with 501,111 additions and 0 deletions.
161 changes: 161 additions & 0 deletions README.md
@@ -0,0 +1,161 @@
# CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/centersnap-single-shot-multi-object-3d-shape/6d-pose-estimation-using-rgbd-on-camera25)](https://paperswithcode.com/sota/6d-pose-estimation-using-rgbd-on-camera25?p=centersnap-single-shot-multi-object-3d-shape)<img src="demo/Pytorch_logo.png" width="10%">

This repository is the PyTorch implementation of our paper:
<a href="https://www.tri.global/" target="_blank">
<img align="right" src="demo/tri-logo.png" width="20%"/>
</a>

**CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation**<br>
[__***Muhammad Zubair Irshad***__](https://zubairirshad.com), [Thomas Kollar](http://www.tkollar.com/site/), [Michael Laskey](https://www.linkedin.com/in/michael-laskey-4b087ba2/), [Kevin Stone](https://www.linkedin.com/in/kevin-stone-51171270/), [Zsolt Kira](https://faculty.cc.gatech.edu/~zk15/) <br>
International Conference on Robotics and Automation (ICRA), 2022<br>

[[Project Page](https://zubair-irshad.github.io/projects/CenterSnap.html)] [[arXiv](https://arxiv.org/abs/2203.01929)] [[PDF](https://arxiv.org/pdf/2203.01929.pdf)] [[Video](https://www.youtube.com/watch?v=Bg5vi6DSMdM)] [[Poster](https://zubair-irshad.github.io/projects/resources/Poster%7CCenterSnap%7CICRA2022.pdf)]

[![Explore CenterSnap in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zubair-irshad/CenterSnap/blob/master/notebook/explore_CenterSnap.ipynb)<br>


<p align="center">
<img src="demo/POSE_CS.gif" width="100%">
</p>

<p align="center">
<img src="demo/Method_CS.gif" width="100%">
</p>

## Citation

If you find this repository useful, please consider citing:

```
@inproceedings{irshad2022centersnap,
title={CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation},
author={Muhammad Zubair Irshad and Thomas Kollar and Michael Laskey and Kevin Stone and Zsolt Kira},
booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
year={2022},
url={https://arxiv.org/abs/2203.01929},
}
```

### Contents
<div class="toc">
<ul>
<li><a href="#-environment">💻 Environment</a></li>
<li><a href="#-dataset">📊 Dataset</a></li>
<li><a href="#-training-and-inference">✨ Training and Inference</a></li>
<li><a href="#-faq">📝 FAQ</a></li>
</ul>
</div>

## 💻 Environment

Create a Python 3.8 virtual environment and install the requirements:

```bash
cd $CenterSnap_Repo
conda create -y --prefix ./env python=3.8
conda activate ./env/
./env/bin/python -m pip install --upgrade pip
./env/bin/python -m pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
```
The code was built and tested on **CUDA 10.2**.

## 📊 Dataset

1. Download pre-processed dataset

We recommend downloading the preprocessed dataset to train and evaluate the CenterSnap model. Download and untar the [Synthetic](https://tri-robotics-public.s3.amazonaws.com/centersnap/CAMERA.tar.gz) (868GB) and [Real](https://tri-robotics-public.s3.amazonaws.com/centersnap/Real.tar.gz) (70GB) datasets. These files contain all the training and validation data you need to replicate our results.

```
cd $CenterSnap_Repo/data
wget https://tri-robotics-public.s3.amazonaws.com/centersnap/CAMERA.tar.gz
tar -xzvf CAMERA.tar.gz
wget https://tri-robotics-public.s3.amazonaws.com/centersnap/Real.tar.gz
tar -xzvf Real.tar.gz
```

The data directory structure should follow:

```
data
├── CAMERA
│   ├── train
│   └── val_subset
└── Real
    ├── train
    └── test
```

2. To prepare your own dataset, we provide additional scripts under [prepare_data](https://github.com/zubair-irshad/CenterSnap/tree/master/prepare_data).
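
Once the archives are extracted, the layout shown above can be checked before starting a long training run. The following is a minimal sketch, not part of the repository: the helper name `missing_dataset_dirs` and the constant `EXPECTED_SUBDIRS` are hypothetical, and the script assumes it is run from the repository root so that `data` resolves to the dataset folder.

```python
from pathlib import Path

# Sub-folders listed in the directory tree above.
EXPECTED_SUBDIRS = (
    "CAMERA/train",
    "CAMERA/val_subset",
    "Real/train",
    "Real/test",
)


def missing_dataset_dirs(data_root):
    """Return the expected sub-folders that do not exist under data_root."""
    root = Path(data_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs("data")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

An empty result only means the folders exist; it does not verify that the archives were extracted completely.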

## ✨ Training and Inference

1. Train on NOCS Synthetic (requires 13GB GPU memory):
```bash
./runner.sh net_train.py @configs/net_config.txt
```

Note that *runner.sh* is equivalent to using *python* to run the script. Additionally, it sets up the PYTHONPATH and the CenterSnap environment path automatically.

2. Finetune on NOCS Real Train (Note that good results can be obtained after finetuning on the Real train set for only a few epochs i.e. 1-5):
```bash
./runner.sh net_train.py @configs/net_config_real_resume.txt --checkpoint /path/to/best/checkpoint
```

3. Inference on a NOCS Real Test Subset

<p align="center">
<img src="demo/reconstruction.gif" width="100%">
</p>

Download a small NOCS Real subset from [[here](https://www.dropbox.com/s/yfenvre5fhx3oda/nocs_test_subset.tar.gz?dl=1)]

```bash
./runner.sh inference/inference_real.py @configs/net_config.txt --data_dir path_to_nocs_test_subset --checkpoint checkpoint_path_here
```

You should see the **visualizations** saved in ```results/CenterSnap```. Change the ```--output``` option in the config file to save them to a different folder.

4. Optional (Shape Auto-Encoder Pre-training)

We provide a pretrained model for the shape auto-encoder, to be used for data collection and inference. Although our codebase doesn't require training the shape auto-encoder separately, if you would like to do so, we provide additional scripts under **external/shape_pretraining**.
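
The `@configs/...` arguments in the commands of this section pass a whole file of options to the script, with one `--name=value` option per line. The sketch below shows how Python's standard `argparse` supports this through `fromfile_prefix_chars`. It is an illustration under assumptions: whether the repository's own parser is built this way is not confirmed here, and `build_parser`, `parse_with_config`, and the handful of options declared are hypothetical stand-ins for the real argument list.

```python
import argparse
import tempfile
from pathlib import Path


def build_parser():
    # fromfile_prefix_chars="@" makes argparse replace "@file" with the
    # arguments stored in that file, one argument per line.
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("--max_steps", type=int)
    parser.add_argument("--train_batch_size", type=int)
    parser.add_argument("--output", type=str)
    parser.add_argument("--checkpoint", type=str, default=None)
    return parser


def parse_with_config(config_text, extra_args=()):
    """Write config_text to a temporary file and parse it as '@file' plus extra_args."""
    with tempfile.TemporaryDirectory() as tmp:
        config_path = Path(tmp) / "demo_config.txt"
        config_path.write_text(config_text)
        return build_parser().parse_args([f"@{config_path}", *extra_args])


if __name__ == "__main__":
    demo = (
        "--max_steps=380000\n"
        "--train_batch_size=32\n"
        "--output=results/CenterSnap_TrainSynthetic\n"
    )
    args = parse_with_config(demo, ["--checkpoint", "/path/to/best/checkpoint"])
    print(args.max_steps, args.train_batch_size, args.output, args.checkpoint)
```

With a parser of this kind, options given on the command line after the `@file` argument take precedence over the values read from the file, which is how a flag such as `--checkpoint` can be combined with a config file.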


## 📝 FAQ

**1.** I am getting ```no cuda GPUs available``` while running Colab.

- Ans: Follow this instruction to activate GPUs in Colab:

```
Make sure that you have enabled the GPU under Runtime-> Change runtime type!
```

**2.** I am getting ```raise RuntimeError('received %d items of ancdata' %
RuntimeError: received 0 items of ancdata```

- Ans: Increase ulimit to 2048 or 8096 via ```ulimit -n 2048```
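
If changing the shell limit is inconvenient (for example inside a notebook), the open-file limit can also be raised for the current Python process with the standard `resource` module, which is available on Unix-like systems only. This is a minimal sketch: the helper name `raise_open_file_limit` is hypothetical, and it assumes the call is made before any data-loading workers are started.

```python
import resource


def raise_open_file_limit(target=2048):
    """Raise the soft RLIMIT_NOFILE towards target, never above the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed the hard limit
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, leave it unchanged
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    print("soft open-file limit:", raise_open_file_limit(2048))
```

Unlike `ulimit -n`, this only affects the running process and the children it starts afterwards.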

**3.** I am getting ``` RuntimeError: CUDA error: no kernel image is available for execution on the device``` or ``` You requested GPUs: [0] But your machine only has: [] ```

- Ans: Check that your PyTorch installation matches your CUDA installation. Try the following:


1. Install CUDA 10.2 and reinstall the packages from the same requirements.txt

2. Install the PyTorch build that matches your CUDA version, i.e. change these lines in requirements.txt

```
torch==1.7.1
torchvision==0.8.2
```
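
To see which combination is actually installed, the relevant values can be printed from Python. The sketch below uses only standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.device_count()`); the helper name `describe_torch_cuda` is hypothetical, and the function reports a missing installation instead of failing.

```python
import importlib


def describe_torch_cuda():
    """Collect the PyTorch / CUDA facts relevant to the errors above."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

A `built_for_cuda` value that differs from the CUDA version supported by your driver, or a `gpu_count` of 0, points to the mismatch described in this answer.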

**4.** I am seeing zero val metrics in ***wandb***
- Ans: Make sure you threshold the metrics. The metric logged by PyTorch Lightning's first validation check is very high, which makes all other metrics look like zero on the same plot. Manually threshold the plot in wandb to remove this outlier and see the actual metrics.

## Acknowledgments
* This code is built upon the implementation from [SimNet](https://github.com/ToyotaResearchInstitute/simnet)

## Licenses
* The source code is released under the [MIT license](https://opensource.org/licenses/MIT).
25 changes: 25 additions & 0 deletions configs/inference.txt
@@ -0,0 +1,25 @@
--checkpoint=../nocs_test_subset/checkpoint/centersnap_real.ckpt
--max_steps=380000
--model_file=models/panoptic_net.py
--model_name=res_fpn
--output=results/CenterSnap
--train_path=file://data/Real/train
--train_batch_size=32
--train_num_workers=10
--val_path=file://data/Real/test
--val_batch_size=32
--val_num_workers=10
--optim_learning_rate=0.0006
--optim_momentum=0.9
--optim_weight_decay=1e-4
--optim_poly_exp=0.9
--optim_warmup_epochs=1
--loss_seg_mult=1.0
--loss_depth_mult=1.0
--loss_vertex_mult=0.1
--loss_rotation_mult=0.1
--loss_heatmap_mult=100.0
--loss_latent_emb_mult=0.1
--loss_abs_pose_mult=0.1
--loss_z_centroid_mult=0.1
--wandb_name=NOCS_Inference_Real
24 changes: 24 additions & 0 deletions configs/net_config.txt
@@ -0,0 +1,24 @@
--max_steps=380000
--model_file=models/panoptic_net.py
--model_name=res_fpn
--output=results/CenterSnap_TrainSynthetic
--train_path=file://data/CAMERA/train
--train_batch_size=32
--train_num_workers=10
--val_path=file://data/CAMERA/val_subset
--val_batch_size=32
--val_num_workers=10
--optim_learning_rate=0.0006
--optim_momentum=0.9
--optim_weight_decay=1e-4
--optim_poly_exp=0.9
--optim_warmup_epochs=1
--loss_seg_mult=1.0
--loss_depth_mult=1.0
--loss_vertex_mult=0.1
--loss_rotation_mult=0.1
--loss_heatmap_mult=100.0
--loss_latent_emb_mult=0.1
--loss_abs_pose_mult=0.1
--loss_z_centroid_mult=0.1
--wandb_name=NOCS_Train_Synthetic
25 changes: 25 additions & 0 deletions configs/net_config_real_resume.txt
@@ -0,0 +1,25 @@
--max_steps=240000
--finetune_real=True
--model_file=models/panoptic_net.py
--model_name=res_fpn
--output=results/CenterSnap_FinetuneReal
--train_path=file://data/Real/train
--train_batch_size=32
--train_num_workers=5
--val_path=file://data/Real/test
--val_batch_size=32
--val_num_workers=5
--optim_learning_rate=0.0006
--optim_momentum=0.9
--optim_weight_decay=1e-4
--optim_poly_exp=0.9
--optim_warmup_epochs=1
--loss_seg_mult=1.0
--loss_depth_mult=1.0
--loss_vertex_mult=0.1
--loss_rotation_mult=0.1
--loss_heatmap_mult=100.0
--loss_latent_emb_mult=0.1
--loss_abs_pose_mult=0.1
--loss_z_centroid_mult=0.1
--wandb_name=NOCS_Real_Finetune
Binary file added demo/Method_CS.gif
Binary file added demo/POSE_CS.gif
Binary file added demo/Pytorch_logo.png
Binary file added demo/reconstruction.gif
Binary file added demo/tri-logo.png
3 changes: 3 additions & 0 deletions env/.gitignore
@@ -0,0 +1,3 @@
*
*/
!.gitignore
41 changes: 41 additions & 0 deletions external/shape_pretraining/README.md
@@ -0,0 +1,41 @@
## Shape Autoencoder Pre-training<br>
Shape pretraining code is adapted from [object-deformnet](https://github.com/mentian/object-deformnet).

### Install dependencies

```
conda activate ./env/
cd $CenterSnap_Repo
conda install -c bottler nvidiacub
conda install -c conda-forge -c fvcore -c iopath fvcore iopath
./env/bin/python -m pip install "git+https://github.com/facebookresearch/[email protected]"
```

### Dataset Preparation
1. Download [object models](http://download.cs.stanford.edu/orion/nocs/obj_models.zip) provided by [NOCS](https://github.com/hughw19/NOCS_CVPR2019)

2. Download NOCS [preprocess data](https://www.dropbox.com/s/8im9fzopo71h6yw/nocs_preprocess.tar.gz?dl=1)

Unzip and organize these files in $CenterSnap_Repo/data as follows:
```
data
├── obj_models
├── train
├── val
├── real_train
├── real_test
└── mug_meta.pkl
```

3. Prepare data:

```
./runner.sh external/shape_pretraining/shape_data.py --obj_model_dir /path/to/object-model/dir
```
This generates a file named ***ShapeNetCore_2048.h5*** in the ***obj_models*** folder.

4. Train shape auto-encoder:
```
# run from the repository root, like the data preparation step
./runner.sh external/shape_pretraining/train_ae.py --h5_file /path/to/h5_file
```
Binary file not shown.
38 changes: 38 additions & 0 deletions external/shape_pretraining/dataset/shape_dataset.py
@@ -0,0 +1,38 @@
import h5py
import numpy as np
import torch.utils.data as data


class ShapeDataset(data.Dataset):
    """Point-cloud shapes and category labels loaded from an HDF5 file."""

    def __init__(self, h5_file, mode, n_points=1024, augment=False):
        assert (mode == 'train' or mode == 'val'), 'Mode must be "train" or "val".'
        self.mode = mode
        self.n_points = n_points
        self.augment = augment
        # load data from h5py file
        with h5py.File(h5_file, 'r') as f:
            self.length = f[self.mode].attrs['len']
            self.data = f[self.mode]['data'][:]
            self.label = f[self.mode]['label'][:]
        # augmentation parameters
        self.sigma = 0.01
        self.clip = 0.02
        self.shift_range = 0.02

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        xyz = self.data[index]
        label = self.label[index] - 1    # data saved indexed from 1
        # randomly downsample (np.random.choice samples with replacement by default)
        np_data = xyz.shape[0]
        assert np_data >= self.n_points, 'Not enough points in shape.'
        idx = np.random.choice(np_data, self.n_points)
        # fancy indexing returns a copy, so the in-place updates below
        # do not modify self.data
        xyz = xyz[idx, :]
        # data augmentation: clipped Gaussian jitter per point, then one
        # uniform shift applied to the whole shape
        if self.augment:
            jitter = np.clip(self.sigma*np.random.randn(self.n_points, 3), -self.clip, self.clip)
            xyz[:, :3] += jitter
            shift = np.random.uniform(-self.shift_range, self.shift_range, (1, 3))
            xyz[:, :3] += shift
        return xyz, label
Binary file not shown.