
Build a docker image for openchatkit #40

Open
loklok-infi opened this issue Mar 14, 2023 · 10 comments · May be fixed by #50
@loklok-infi

Is your feature request related to a problem? Please describe.
A docker image might be easier for people to use.

Describe the solution you'd like
We could add a /docker folder or a simple Dockerfile to the repo, so people could build the image themselves. And maybe we could push the image to Docker Hub so they could just pull and test.

@csris
Contributor

csris commented Mar 14, 2023

Thanks for the feature request. This is a great idea. Will put it on the roadmap.

@rpj09

rpj09 commented Mar 14, 2023

Hey @Jonuknownothingsnow, I am new to open source and would be very happy to work on this idea under your guidance.

@kailust

kailust commented Mar 15, 2023

Dockerfile

# Base image
FROM ubuntu:20.04

# Set working directory
WORKDIR /app

# Update and install required packages
RUN apt-get update && \
    apt-get install -y git-lfs wget && \
    rm -rf /var/lib/apt/lists/*

# Download and install Miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda && \
    rm Miniconda3-latest-Linux-x86_64.sh

# Make conda available on PATH for the RUN steps below
ENV PATH=/opt/conda/bin:$PATH

# Set conda to automatically activate base environment on login
RUN echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
    echo "conda activate base" >> ~/.bashrc

# Create OpenChatKit environment
COPY environment.yml .
RUN conda env create -f environment.yml

# Install Git LFS
RUN git lfs install

# Copy OpenChatKit code
COPY . .

# Prepare GPT-NeoX-20B model (run inside the OpenChatKit conda env,
# since its dependencies are installed there rather than in base)
RUN conda run -n OpenChatKit python pretrained/GPT-NeoX-20B/prepare.py

# Set entrypoint to bash shell
ENTRYPOINT ["/bin/bash"]

Build the Docker image using the following command:

docker build -t openchatkit .

Run the Docker container using the following command:

docker run -it openchatkit

This will start a new bash shell in the container.
Activate the OpenChatKit environment by running the following command:

conda activate OpenChatKit

You should now be able to use the OpenChatKit code and run the prepare.py script.

@rpj09 rpj09 linked a pull request Mar 16, 2023 that will close this issue
@csris
Contributor

csris commented Mar 18, 2023

As I mentioned in the PR, both the pretrained model and datasets can be quite large.

$ du -sh data/* pretrained/GPT-NeoX-20B/
172G    data/OIG
238M    data/OIG-moderation
38G     data/wikipedia-3sentence-level-retrieval-index
39G     pretrained/GPT-NeoX-20B/

The Dockerfile above bakes the 39GB pretrained model into the image. In my opinion, it would be better to download the pretrained model into a bind mount when the container starts. The image would be much smaller and the bind mount persists the model across container restarts.
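
A minimal sketch of that approach (hypothetical, not from the repo; the marker file and directory layout are assumptions): a small entrypoint helper that downloads the model into the bind-mounted directory only when it hasn't been prepared yet.

```shell
#!/bin/sh
# Hypothetical entrypoint helper: fetch the pretrained model into a
# bind-mounted directory on first start only, so the image stays small
# and the download survives container restarts.
set -eu

prepare_model() {
    model_dir="$1"
    if [ -f "$model_dir/.prepared" ]; then
        echo "model already prepared, skipping download"
    else
        echo "downloading model into $model_dir"
        # In the real container this would be something like:
        #   python pretrained/GPT-NeoX-20B/prepare.py
        touch "$model_dir/.prepared"
    fi
}

# Demo against a temporary directory: first call downloads, second skips.
dir=$(mktemp -d)
prepare_model "$dir"
prepare_model "$dir"
```

With something like this as the image's ENTRYPOINT, a bind mount such as `docker run -v "$PWD/models:/app/pretrained/GPT-NeoX-20B" ...` keeps the ~39GB model out of the image and persists it across restarts.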

@rpj09

rpj09 commented Mar 19, 2023

> As I mentioned in the PR, both the pretrained model and datasets can be quite large.
>
> $ du -sh data/* pretrained/GPT-NeoX-20B/
> 172G    data/OIG
> 238M    data/OIG-moderation
> 38G     data/wikipedia-3sentence-level-retrieval-index
> 39G     pretrained/GPT-NeoX-20B/
>
> The Dockerfile above bakes the 39GB pretrained model into the image. In my opinion, it would be better to download the pretrained model into a bind mount when the container starts. The image would be much smaller and the bind mount persists the model across container restarts.

Sure, I will try to do it.

@xsanz

xsanz commented Apr 22, 2023

Hello,

I'm just starting with OpenChatKit. I was looking into using Docker and I found your Dockerfile.

One question: for training the model, I understand CUDA + NVIDIA GPUs are used if available?

If so, yesterday I found this, maybe useful: https://blog.roboflow.com/nvidia-docker-vscode-pytorch/
It looks like there are NVIDIA-accelerated containers (nvidia/cuda:11.0.3-base-ubuntu20.04) ready to be used.
(Note: because I use docker-compose, I had to update to version 1.28.0+ to configure the '--gpus all' parameter.)

Thank you
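
For reference, a hedged sketch of the commands involved (the image tag is an assumption; both commands require the NVIDIA Container Toolkit on the host and Docker 19.03+):

```
# Run the OpenChatKit container with all GPUs visible
docker run -it --gpus all openchatkit

# Quick check that the GPU is visible inside an NVIDIA CUDA base image
docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
```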

@xsanz

xsanz commented Apr 23, 2023

Hello there,

Just a note that I had issues building the environment with the netifaces package, which I solved by updating the environment.yml file from netifaces===0.11.0 to netifaces2==0.0.16.

Thank you

@orangetin
Member

> Hello there,
>
> Just a note that I had issues building the environment with the netifaces package, which I solved by updating the environment.yml file from netifaces===0.11.0 to netifaces2==0.0.16.
>
> Thank you

This can be fixed by installing gcc. On Ubuntu, you'd run sudo apt install gcc. That should fix your error!
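
For anyone building inside the Docker image from this thread, the equivalent fix (an assumption: the Ubuntu base image above) would be adding gcc to the apt-get step in the Dockerfile:

```
# Install a compiler so pip can build packages like netifaces from source
RUN apt-get update && \
    apt-get install -y gcc && \
    rm -rf /var/lib/apt/lists/*
```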

@orangetin
Member

> If so, yesterday I found this, maybe useful: https://blog.roboflow.com/nvidia-docker-vscode-pytorch/ It looks like there are NVIDIA-accelerated containers (nvidia/cuda:11.0.3-base-ubuntu20.04) ready to be used. (Note: because I use docker-compose, I had to update to version 1.28.0+ to configure the '--gpus all' parameter.)

Thanks for the great resource, @xsanz! I was able to get the model loaded onto the GPU in Docker using those instructions.

@mlaug

mlaug commented May 6, 2023

The conda binary was not found during my docker build. If anyone runs into this issue, the fix is to set the PATH correctly before trying to run conda:

ENV PATH=/opt/conda/bin/:$PATH
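
For placement: the ENV line needs to come after the Miniconda install step and before the first RUN that invokes conda. A sketch, assuming the layout of the Dockerfile earlier in this thread:

```
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda && \
    rm Miniconda3-latest-Linux-x86_64.sh

# Put conda on PATH for all subsequent RUN instructions
ENV PATH=/opt/conda/bin/:$PATH

RUN conda env create -f environment.yml
```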

7 participants