An offline ChatGPT-like tool integrated into Visual Studio (GPU edition).
EdgeLlama can run your LLaMA models (Alpaca, Vicuna, and CodeLlama) directly on your PC without an internet connection.
Your data never leaves your PC, making it safe for use within your organization.
This GPU edition lets edge devices with a GPU run inference on the GPU instead of the CPU, so the PC remains responsive while EdgeLlama is performing inference.
This is a Visual Studio Professional port of llama.cpp, which aims to bring LLMs to edge devices.
This version of EdgeLlama runs on Intel-based CPUs with AVX2 support, or on NVIDIA GPUs via cuBLAS running on CUDA.
During the CUDA Toolkit installation, remember to select "custom install" and uncheck Nsight VSE, Nsight Systems, Nsight Compute, and Visual Studio Integration before proceeding. If this step is missed and the installation fails, you will need to uninstall the CUDA Toolkit before attempting to reinstall it.