fix: Direct users to the CUDA metapackage guides for compilation #2416

Open · wants to merge 4 commits into `main`
`docs/maintainer/adding_pkgs.md`: 2 changes (1 addition, 1 deletion)

@@ -893,7 +893,7 @@
$ cd ~/staged-recipes
$ python build-locally.py <VARIANT>
```

- where `<VARIANT>` is one of the file names in the `.ci_support/` directory, e.g. `linux64`, `osx64`, and `linux64_cuda102`.
+ where `<VARIANT>` is one of the file names in the `.ci_support/` directory, e.g. `linux64`, `osx64`, and `linux64_cuda<version>`.

<a id="about"></a>

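As a concrete illustration of the `<VARIANT>` argument in the hunk above, here is a minimal sketch; the `linux64_cuda118` variant name is hypothetical, so substitute whatever currently appears under `.ci_support/`:

```shell-session
$ cd ~/staged-recipes
$ python build-locally.py linux64_cuda118   # hypothetical CUDA variant name
```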
`docs/maintainer/infrastructure.md`: 2 changes (1 addition, 1 deletion)

@@ -38,7 +38,7 @@ The YAML files included in `.ci_support` are minimal and not rendered like the o…
Instead, conda-build will take these and combine them with the pinnings from `conda-forge-pinning` at runtime.
Also note that `staged-recipes` only builds for x64. Support for additional architectures can only be added once a feedstock has been created.

- - Linux: `linux64.yaml` plus the CUDA (10.2, 11.0, 11.1 and 11.2) variants.
+ - Linux: `linux64.yaml` plus the CUDA variants.
- macOS: `osx64.yaml`.
- Windows: `win64.yaml`.
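To make that build matrix tangible, a hedged sketch of what the `.ci_support/` listing can look like; the exact CUDA variant file names change as versions are migrated, so treat these as illustrative:

```shell-session
$ ls staged-recipes/.ci_support/
linux64.yaml  linux64_cuda118.yaml  osx64.yaml  win64.yaml
```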

`docs/maintainer/knowledge_base.md`: 100 changes (5 additions, 95 deletions)

@@ -2006,73 +2006,11 @@ if you're using a `c_stdlib_version` of `2.28`, set it to `alma8`.
## CUDA builds

Although the provisioned CI machines do not feature a GPU, conda-forge does provide mechanisms
- to build CUDA-enabled packages. These mechanisms involve several packages:
-
- - `cudatoolkit`: The runtime libraries for the CUDA toolkit. This is what end-users will end
-   up installing next to your package.
- - `nvcc`: Nvidia's EULA does not allow the redistribution of compilers and drivers. Instead, we
-   provide a wrapper package that locates the CUDA installation in the system. The main role of this
-   package is to set some environment variables (`CUDA_HOME`, `CUDA_PATH`, `CFLAGS` and others),
-   as well as wrapping the real `nvcc` executable to set some extra command line arguments.
-
- In practice, to enable CUDA on your package, add `{{ compiler('cuda') }}` to the `build`
- section of your requirements and rerender. The matching `cudatoolkit` will be added to the `run`
- requirements automatically.
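For example, a minimal sketch of that workflow, assuming a standard feedstock checkout (`conda smithy rerender` is the usual rerendering entry point):

```shell-session
# meta.yaml (requirements/build) gains the line:  - {{ compiler('cuda') }}
# Then regenerate the CI configuration:
$ conda smithy rerender -c auto   # -c auto commits the rerendered files
```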

- On Linux, CMake users are required to use `${CMAKE_ARGS}` so CMake can find CUDA correctly. For example:
-
- ```shell-session
- mkdir build && cd build
- cmake ${CMAKE_ARGS} ${SRC_DIR}
- make
- ```
-
- :::note
-
- **How is CUDA provided at the system level?**
-
- - On Linux, Nvidia provides official Docker images, which we then
-   [adapt](https://github.com/conda-forge/docker-images) to conda-forge's needs.
- - On Windows, the compilers need to be installed for every CI run. This is done through the
-   [conda-forge-ci-setup](https://github.com/conda-forge/conda-forge-ci-setup-feedstock/) scripts.
-   Do note that the Nvidia executable won't install the drivers because no GPU is present in the machine.
-
- **How is cudatoolkit selected at install time?**
-
- Conda exposes the maximum CUDA version supported by the installed Nvidia drivers through a virtual package
- named `__cuda`. By default, `conda` will install the highest version available
- for the packages involved. To override this behaviour, you can define a `CONDA_OVERRIDE_CUDA` environment
- variable. More details in the
- [Conda docs](https://docs.conda.io/projects/conda/en/stable/user-guide/tasks/manage-virtual.html#overriding-detected-packages).
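A hedged sketch of both sides of that mechanism; the reported `__cuda` value and the package name are illustrative:

```shell-session
# Inspect what the installed driver reports (listed under "virtual packages"):
$ conda info | grep -i __cuda
__cuda=11.2=0
# Override the detected value while solving, e.g. on a GPU-less login node:
$ CONDA_OVERRIDE_CUDA="10.2" conda install your-gpu-package
```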

- Note that prior to v4.8.4, `__cuda` versions would not be part of the constraints, so you would always
- get the latest one, regardless of the supported CUDA version.
-
- If for some reason you want to install a specific version, you can use:
-
- ```default
- conda install your-gpu-package cudatoolkit=10.1
- ```
-
- :::
-
- <a id="testing-the-packages"></a>
-
- ### Testing the packages
-
- Since the CI machines do not feature a GPU, you won't be able to test the built packages as part
- of the conda recipe. That does not mean you can't test your package locally. To do so:
-
- 1. Enable the Azure artifacts for your feedstock (see [here](conda_forge_yml.mdx#azure)).
- 2. Include the test files and requirements in the recipe
-    [like this](https://github.com/conda-forge/cupy-feedstock/blob/a1e9cdf47775f90d3153a26913068c6df942d54b/recipe/meta.yaml#L51-L61).
- 3. Provide the test instructions. Take into account that the GPU tests will fail in the CI run,
-    so you need to ignore them to get the package built and uploaded as an artifact.
-    [Example](https://github.com/conda-forge/cupy-feedstock/blob/a1e9cdf47775f90d3153a26913068c6df942d54b/recipe/run_test.py).
- 4. Once you have downloaded the artifacts, you will be able to run:
-    ```default
-    conda build --test <pkg file>.tar.bz2
-    ```
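To make step 4 concrete, a hedged sketch of the local flow on a GPU machine; the artifact and package file names below are hypothetical:

```shell-session
# After downloading the Azure artifact zip for the build of interest:
$ unzip conda_artifacts.zip -d artifacts
$ conda build --test artifacts/linux-64/your-gpu-package-1.0-cuda112py39_0.tar.bz2
```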
+ to build CUDA-enabled packages.
+ See the [guide for maintainers of recipes that use CUDA](https://github.com/conda-forge/cuda-feedstock/blob/main/recipe/doc/recipe_guide.md)
+ for more information.
+ If a feedstock does need access to additional resources (like GPUs), please see the following
+ [section](#packages-that-require-a-gpu-or-long-running-builds).

<a id="common-problems-and-known-issues"></a>

@@ -2118,34 +2056,6 @@ burden on our CI resources. Only proceed if there's a known use case for the ext…
2. In your feedstock fork, create a new branch and place the migration file under `.ci_support/migrations`.
3. Open a PR and re-render. CUDA 9.2, 10.0 and 10.1 will appear in the CI checks now. Merge when ready!
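A hedged sketch of steps 2 and 3; the checkout path, branch name, and migration file name are hypothetical, and the migration YAML itself comes from the `conda-forge-pinning` migrators:

```shell-session
$ cd my-feedstock                     # hypothetical feedstock checkout
$ git checkout -b add-old-cuda-variants
$ mkdir -p .ci_support/migrations
$ cp /path/to/cuda102.yaml .ci_support/migrations/   # migration file name illustrative
$ git add .ci_support/migrations
$ git commit -m "Add CUDA 10.2 migration"
$ conda smithy rerender -c auto
```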

<a id="adding-support-for-a-new-cuda-version"></a>

### Adding support for a new CUDA version

Providing a new CUDA version involves five repositores:

- [cudatoolkit-feedstock](https://github.com/conda-forge/cudatoolkit-feedstock)
- [nvcc-feedstock](https://github.com/conda-forge/nvcc-feedstock)
- [conda-forge-pinning-feedstock](https://github.com/conda-forge/conda-forge-pinning-feedstock)
- [docker-images](https://github.com/conda-forge/docker-images) (Linux only)
- [conda-forge-ci-setup-feedstock](https://github.com/conda-forge/conda-forge-ci-setup-feedstock) (Windows only)

The steps involved are, roughly:

1. Add the `cudatoolkit` packages in `cudatoolkit-feedstock`.
2. Submit the version migrator to `conda-forge-pinning-feedstock`.
This will stay open during the following steps.
3. For Linux, add the corresponding Docker images at `docker-images`.
Copy the migration file manually to `.ci_support/migrations`.
This copy should not specify a timestamp. Comment it out and rerender.
4. For Windows, add the installer URLs and hashes to the `conda-forge-ci-setup`
[script](https://github.com/conda-forge/conda-forge-ci-setup-feedstock/blob/master/recipe/install_cuda.bat).
The migration file must also be manually copied here. Rerender.
5. Create the new `nvcc` packages for the new version. Again, manual
migration must be added. Rerender.
6. When everything else has been merged and testing has taken place,
consider merging the PR opened at step 2 now so it can apply to all the downstream feedstocks.

<a id="opengpuserver"></a>

<a id="packages-that-require-a-gpu-or-long-running-builds"></a>
`docs/user/faq.md`: 5 changes (2 additions, 3 deletions)

@@ -154,9 +154,8 @@ conda environment. This can be accomplished (for gcc) by passing `-sysroot=/` on…

## How can I compile CUDA (host or device) codes in my environment?

- Unfortunately, this is not possible with conda-forge's current infrastructure (`nvcc`, `cudatoolkit`, etc) if there is no local CUDA Toolkit installation. In particular, the `nvcc` package provided on conda-forge is a _wrapper package_ that exposes the actual `nvcc` compiler to our CI infrastructure in a `conda`-friendly way; it does not contain the full `nvcc` compiler toolchain. One of the reasons is that CUDA headers like `cuda.h`, `cuda_runtime.h`, etc, which are needed at compile time, are not redistributable according to NVIDIA's EULA. Likewise, the `cudatoolkit` package only contains CUDA runtime libraries and not the compiler toolchain.
-
- If you need to compile CUDA code, even if it involves only CUDA host APIs, you will still need a valid CUDA Toolkit installed locally and use it. Please refer to [NVCC's documentation](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) for the CUDA compiler usage and [CUDA Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) for general CUDA programming.
+ Beginning with CUDA 12.0, a full suite of CUDA packages is provided in conda-forge, including various metapackages that group components together.
+ These metapackages and their documentation are updated frequently, so for the most up-to-date recommendations and documentation, please read the [relevant guides provided in the `cuda` feedstock](https://github.com/conda-forge/cuda-feedstock/blob/main/recipe/README.md).
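For orientation, a hedged sketch of installing those metapackages; `cuda`, `cuda-nvcc`, and `cuda-version` exist on conda-forge, but the pinned version below is illustrative, so follow the feedstock guides for current recommendations:

```shell-session
# Full CUDA development suite for a chosen release:
$ conda install -c conda-forge cuda cuda-version=12.4
# Or only the compiler toolchain:
$ conda install -c conda-forge cuda-nvcc
```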

<a id="faq-abi-incompatibility"></a>

`docs/user/tipsandtricks.md`: 8 changes (4 additions, 4 deletions)

@@ -138,19 +138,19 @@ echo "CONDA_SUBDIR: $CONDA_SUBDIR" # Should print "CONDA_SUBDIR: osx-64"

## Installing CUDA-enabled packages like TensorFlow and PyTorch

- In conda-forge, some packages are available with GPU support. These packages not only take significantly longer to compile and build, but they also result in rather large binaries that users then download. As an effort to maximize accessibility for users with lower connection and/or storage bandwidth, there is an ongoing effort to limit installing packages compiled for GPUs unnecessarily on CPU-only machines by default. This is accomplished by adding a run dependency, `__cuda`, that detects if the local machine has a GPU. However, this introduces challenges to users who may prefer to still download and use GPU-enabled packages even on a non-GPU machine. For example, login nodes on HPCs often do not have GPUs and their compute counterparts with GPUs often do not have internet access. In this case, a user can override the default setting via the environment variable `CONDA_OVERRIDE_CUDA` to install GPU packages on the login node to be used later on the compute node. At the time of writing (February 2022), we have concluded this safe default behavior is best for most of conda-forge users, with an easy override option available and documented. Please let us know if you have thoughts on or issues with this.
+ In conda-forge, some packages are available with GPU support. These packages not only take significantly longer to compile and build, but they also result in rather large binaries that users then download. In an effort to maximize accessibility for users with lower connection and/or storage bandwidth, there is an ongoing effort to limit installing packages compiled for GPUs unnecessarily on CPU-only machines by default. This is accomplished by adding a [virtual package](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-virtual.html) run dependency, `__cuda`, that detects if the local machine has a GPU. However, this introduces challenges to users who may prefer to still download and use GPU-enabled packages even on a non-GPU machine. For example, login nodes on HPCs often do not have GPUs and their compute counterparts with GPUs often do not have internet access. In this case, a user can override the default setting via the environment variable `CONDA_OVERRIDE_CUDA` to install GPU packages on the login node to be used later on the compute node. At the time of writing (February 2022), we have concluded this safe default behavior is best for most conda-forge users, with an easy override option available and documented. Please let us know if you have thoughts on or issues with this.
**Member Author:** @bdice Link to virtual packages docs added here, as this is the first place that `__cuda` shows up.


In order to override the default behavior, a user can set the environment variable `CONDA_OVERRIDE_CUDA` like below to install TensorFlow with GPU support even on a machine with CPU only.

```shell-session
CONDA_OVERRIDE_CUDA="11.2" conda install "tensorflow==2.7.0=cuda112*" -c conda-forge
CONDA_OVERRIDE_CUDA="<CUDA version>" conda install tensorflow -c conda-forge
# OR
CONDA_OVERRIDE_CUDA="11.2" mamba install "tensorflow==2.7.0=cuda112*" -c conda-forge
CONDA_OVERRIDE_CUDA="<CUDA version>" mamba install tensorflow -c conda-forge
```

:::note

- You should select the cudatoolkit version most appropriate for your GPU; currently, we have "10.2", "11.0", "11.1", and "11.2" builds available, where the "11.2" builds are compatible with all cudatoolkits>=11.2. At the time of writing (Mar 2022), there seems to be a bug in how the CUDA builds are resolved by `mamba`, defaulting to `cudatoolkit==10.2`; thus, it is prudent to be as explicit as possible like above or by adding `cudatoolkit>=11.2` or similar to the line above.
+ See the [relevant CUDA user guides](https://github.com/conda-forge/cuda-feedstock/blob/main/recipe/README.md) for more information.

:::
