Cannot use of GPU with CUDA instead only CPU which is very slow #2083

Open
8 of 9 tasks
dadupriv opened this issue Sep 11, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@dadupriv

dadupriv commented Sep 11, 2024

Pre-check

  • I have searched the existing issues and none cover this bug.

Description

Windows OS:
All CUDA requirements are installed.
GCC/g++ 14
Running PrivateGPT, but it only uses the CPU, not the GPU.

CUDA:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.94                 Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090      WDDM  |   00000000:03:00.0 Off |                  N/A |
|  0%   37C    P8             21W /  350W |      47MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090      WDDM  |   00000000:04:00.0 Off |                  N/A |
|  0%   45C    P8             31W /  350W |     340MiB /  24576MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     14620    C+G   ...crosoft\Edge\Application\msedge.exe      N/A      |
|    1   N/A  N/A      9456    C+G   C:\Windows\explorer.exe                     N/A      |
|    1   N/A  N/A     10884    C+G   ...2txyewy\StartMenuExperienceHost.exe      N/A      |
|    1   N/A  N/A     12132    C+G   ....Search_cw5n1h2txyewy\SearchApp.exe      N/A      |
|    1   N/A  N/A     14668    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe      N/A      |
|    1   N/A  N/A     17180    C+G   ...am Files (x86)\VideoLAN\VLC\vlc.exe      N/A      |
|    1   N/A  N/A     18792    C+G   ...5n1h2txyewy\ShellExperienceHost.exe      N/A      |
+-----------------------------------------------------------------------------------------+

I searched for a fix but still cannot compile llama-cpp-python with CUDA support. The build fails as shown below.

Anaconda Powershell

PS C:\Users\XXXXX> $env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
Collecting llama-cpp-python
Downloading llama_cpp_python-0.2.90.tar.gz (63.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.8/63.8 MB 40.9 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting numpy==1.26.0
Downloading numpy-1.26.0-cp311-cp311-win_amd64.whl.metadata (61 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.1/61.1 kB 3.4 MB/s eta 0:00:00
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting diskcache>=5.6.1 (from llama-cpp-python)
Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Collecting jinja2>=2.11.3 (from llama-cpp-python)
Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
Downloading MarkupSafe-2.1.5-cp311-cp311-win_amd64.whl.metadata (3.1 kB)
Downloading numpy-1.26.0-cp311-cp311-win_amd64.whl (15.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.8/15.8 MB 38.6 MB/s eta 0:00:00
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 kB ? eta 0:00:00
Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.3/133.3 kB ? eta 0:00:00
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading MarkupSafe-2.1.5-cp311-cp311-win_amd64.whl (17 kB)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [31 lines of output]
*** scikit-build-core 0.10.6 using CMake 3.30.3 (wheel)
*** Configuring CMake...
2024-09-11 10:14:43,243 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
loading initial cache file C:\Users\nasdadu\AppData\Local\Temp\tmp2efzwb2l\build\CMakeInit.txt
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19045.
-- The C compiler identification is MSVC 19.35.32217.1
-- The CXX compiler identification is MSVC 19.35.32217.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/Users/nasdadu/pinokio/bin/miniconda/Library/bin/git.exe (found version "2.42.0.windows.1")
CMake Error at vendor/llama.cpp/CMakeLists.txt:95 (message):
LLAMA_CUBLAS is deprecated and will be removed in the future.

    Use GGML_CUDA instead

  Call Stack (most recent call first):
    vendor/llama.cpp/CMakeLists.txt:100 (llama_option_depr)


  -- Configuring incomplete, errors occurred!

  *** CMake configuration failed
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

[notice] A new release of pip is available: 23.3.1 -> 24.2
[notice] To update, run: python.exe -m pip install --upgrade pip

Steps to Reproduce

On Windows, run this command in PowerShell:
$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0

Expected Behavior

The output shows BLAS=1 and the GPU is used.

Actual Behavior

The output shows BLAS=0 and only the CPU is used.
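
The BLAS flag described above can be checked mechanically. The following is a minimal sketch, assuming the startup log contains a flag written as `BLAS=0`/`BLAS=1` (optionally with spaces around `=`); the helper name `blas_enabled` is hypothetical and not part of PrivateGPT or llama-cpp-python.

```python
import re
from typing import Optional


def blas_enabled(log_text: str) -> Optional[bool]:
    """Return True/False for the first BLAS flag in log_text, or None if absent.

    Assumption: the flag appears as "BLAS=1" or "BLAS = 1" somewhere in the log.
    The leading word boundary keeps names such as "CUBLAS" from matching.
    """
    match = re.search(r"\bBLAS\s*=\s*([01])\b", log_text)
    if match is None:
        return None
    return match.group(1) == "1"
```

Pass it the captured startup output of the application. A result of `False` corresponds to the CPU-only behavior reported here, and `None` means no flag was found in the text.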

Environment

Windows 10 (build 19045.4780), NVIDIA GeForce RTX 3090

Additional Information

No response

Version

No response

Setup Checklist

  • Confirm that you have followed the installation instructions in the project’s documentation.
  • Check that you are using the latest version of the project.
  • Verify disk space availability for model storage and data processing.
  • Ensure that you have the necessary permissions to run the project.

NVIDIA GPU Setup Checklist

  • Check that all CUDA dependencies are installed and are compatible with your GPU (refer to CUDA's documentation)
  • Ensure an NVIDIA GPU is installed and recognized by the system (run nvidia-smi to verify).
  • Ensure proper permissions are set for accessing GPU resources.
  • Docker users - Verify that the NVIDIA Container Toolkit is configured correctly (e.g. run sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi)
@dadupriv dadupriv added the bug Something isn't working label Sep 11, 2024
@dadupriv dadupriv changed the title [BUG] Cannot use of GPU with CUDA instead only CPU which es very slow [BUG] Cannot use of GPU with CUDA instead only CPU which is very slow Sep 11, 2024
@dadupriv dadupriv changed the title [BUG] Cannot use of GPU with CUDA instead only CPU which is very slow Cannot use of GPU with CUDA instead only CPU which is very slow Sep 11, 2024
@dadupriv
Author

$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
Solved! In the command above, replace -DLLAMA_CUBLAS=on with -DGGML_CUDA=on.
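
For scripts that still carry the old flag, the rename can be applied automatically. This is a minimal sketch: the only rename confirmed by the CMake error in this issue is `LLAMA_CUBLAS` to `GGML_CUDA`, and the helper name `migrate_cmake_args` and the mapping table are hypothetical.

```python
# Deprecated -> current CMake option names.
# Only this pair is confirmed by the CMake error quoted in the issue.
RENAMED_OPTIONS = {"LLAMA_CUBLAS": "GGML_CUDA"}


def migrate_cmake_args(cmake_args: str) -> str:
    """Rewrite deprecated -D<NAME>=<value> definitions to their new names."""
    migrated = []
    for token in cmake_args.split():
        if token.startswith("-D") and "=" in token:
            name, value = token[2:].split("=", 1)
            token = f"-D{RENAMED_OPTIONS.get(name, name)}={value}"
        migrated.append(token)
    return " ".join(migrated)


if __name__ == "__main__":
    print(migrate_cmake_args("-DLLAMA_CUBLAS=on"))  # prints -DGGML_CUDA=on
```

Note that the `-D` prefix is preserved. It is CMake's syntax for defining a cache variable, so the value assigned to CMAKE_ARGS must read `-DGGML_CUDA=on`, not `-GGML_CUDA=on`.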

@jveronese

Is there a certain way I need to launch this? I launch using https://github.com/zylon-ai/private-gpt/issues/2083 after running '$env:CMAKE_ARGS='-GGML_CUDA=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0' and it still uses my CPU instead of GPU

@jacooooooooool

Paste the entire line into the terminal and press Enter:

CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
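
The `VAR=value command` form above is POSIX-shell syntax and is not accepted by PowerShell, where the variable is set with `$env:CMAKE_ARGS=...` as in the original report. A shell-independent alternative is to set the variable from Python. This is a rough sketch, not a tested recipe for this project: the function names `build_install` and `run_install` are hypothetical, and it assumes the script is run with the interpreter of the environment you want to install into (for example via `poetry run python`).

```python
import os
import subprocess
import sys


def build_install(cmake_args: str = "-DGGML_CUDA=on"):
    """Return (command, env) for rebuilding llama-cpp-python from source."""
    # Copy the current environment and add CMAKE_ARGS for the build backend.
    env = dict(os.environ, CMAKE_ARGS=cmake_args)
    command = [
        sys.executable, "-m", "pip", "install", "llama-cpp-python",
        "--upgrade", "--force-reinstall", "--no-cache-dir",
    ]
    return command, env


def run_install(cmake_args: str = "-DGGML_CUDA=on") -> None:
    """Run the install; raises CalledProcessError if pip fails."""
    command, env = build_install(cmake_args)
    subprocess.run(command, env=env, check=True)
```

Calling `run_install()` performs the same reinstall as the one-liner, with the environment variable applied only to the pip subprocess.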
