
Get started with the Tesla P4, All-In-One play vGPU graphics card virtualization

Author: A big chinchilla who knows a little

First of all, I don't recommend the Tesla P4: it has no active cooling, no video output, and poor value for money (even the mining-leftover 40HX is a better deal). With cooling, the card cost me about 400 yuan (card 360 + bracket 30 + fan 17). The reason I bought it anyway is that I simply like the Tesla P4 (low power draw, compact, and to my eyes the best-looking graphics card), so it was half purchase, half collection, and a good excuse to play with vGPU virtualization on my All-In-One host.

This article covers setting up Tesla P4 graphics card virtualization in a PVE environment. The Tesla P4 is a professional card that supports vGPU natively; consumer NVIDIA cards need vGPU Unlock to do the same, which is outside the scope of this article:

1. Preparation: PVE settings

Enable IOMMU (PCIe passthrough) in PVE:

vim /etc/default/grub
# Modify the following setting
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
# For Intel CPUs, change it to:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# For AMD CPUs, change it to:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

Update GRUB:

update-grub           

Load the VFIO modules:

echo vfio >> /etc/modules
echo vfio_iommu_type1 >> /etc/modules
echo vfio_pci >> /etc/modules
echo vfio_virqfd >> /etc/modules           

Apply the changes to the initramfs:

update-initramfs -k all -u           

Blacklist the open-source graphics drivers so the host doesn't grab the card at boot:

# AMD GPUs
echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist amdgpu" >> /etc/modprobe.d/blacklist.conf 
# NVIDIA GPUs
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist nvidiafb" >> /etc/modprobe.d/blacklist.conf 
# Intel integrated graphics
echo "blacklist snd_hda_intel" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist snd_hda_codec_hdmi" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist i915" >> /etc/modprobe.d/blacklist.conf           

Update the initramfs again and reboot:

update-initramfs -k all -u
reboot

Ready to go!
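After the reboot, it's worth a quick sanity check that IOMMU is actually active and the blacklists took effect (the exact dmesg wording varies by platform):

# confirm that IOMMU/DMAR was enabled
dmesg | grep -e DMAR -e IOMMU
# confirm the Tesla P4 is visible and no open-source driver claimed it
lspci -nnk | grep -A 3 NVIDIA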

2. Install the graphics card host driver on PVE

Once the graphics card is installed, the driver can go on. Downloading the data-center driver from NVIDIA's official site is a hassle and requires registering an account, so you can look for repackaged driver bundles shared by others online. The vGPU driver comes in two parts: a host (server) driver and a guest (client) driver. The host driver goes on PVE. First, install the build dependencies for the driver:

apt install build-essential dkms mdevctl pve-headers-$(uname -r)           

Upload the driver to PVE over ssh and install it. If a graphics driver was installed before, uninstall it first:

# Uninstall any previously installed NVIDIA driver first
apt-get remove --purge nvidia-*
# cd into the directory containing the driver and install it
chmod +x NVIDIA-Linux-x86_64-535.54.06-vgpu-kvm.run
./NVIDIA-Linux-x86_64-535.54.06-vgpu-kvm.run --dkms
# Reboot PVE
reboot
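Once PVE is back up, you can confirm the host driver loaded and list the MDev (vGPU) profiles the card exposes; mdevctl was installed as a dependency earlier:

# the card should now show up under the vGPU host driver
nvidia-smi
# list the available mediated device types (GRID P4-1Q/2Q/4Q/...)
mdevctl types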

With that, the host driver is installed on PVE. Before installing the driver inside the virtual machines, set up the vGPU licensing service.

3. Set up the licensing service

The vGPU client needs a license to use the driver normally. Buying a commercial license makes no sense for an individual, so we take the self-hosted route: run a licensing service in Docker.

Find a Linux device, install Docker on it, note its LAN IP address, and then bring up the container in one of the following two ways:

3.1 makedie/fastapi-dls image

makedie/fastapi-dls is a modified build of the original collinwebdesigns/fastapi-dls image that doesn't require you to generate the OpenSSL certificates yourself, so it is the simpler option.

A single command brings up the container; remember to replace the IP with your actual LAN IP:

docker run -d -e DLS_URL=192.168.31.20 -e DLS_PORT=1020 -p 1020:443  makedie/fastapi-dls            
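If you want the container to come back automatically after a reboot, the same command with a name and a restart policy works too (just my preference, not required):

docker run -d --name fastapi-dls --restart unless-stopped -e DLS_URL=192.168.31.20 -e DLS_PORT=1020 -p 1020:443 makedie/fastapi-dls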

3.2 collinwebdesigns/fastapi-dls image (original)

collinwebdesigns/fastapi-dls is the original image. It requires you to generate the OpenSSL certificates yourself, so the server needs OpenSSL installed in addition to Docker:

apt install openssl           

Create the keys and certificate on the server:

WORKING_DIR=/opt/docker/fastapi-dls/cert
mkdir -p $WORKING_DIR
cd $WORKING_DIR
# create instance private and public keys for signing JWTs
openssl genrsa -out $WORKING_DIR/instance.private.pem 2048 
openssl rsa -in $WORKING_DIR/instance.private.pem -outform PEM -pubout -out $WORKING_DIR/instance.public.pem
# create ssl certificate for integrated webserver (uvicorn) - because clients rely on ssl
openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout  $WORKING_DIR/webserver.key -out $WORKING_DIR/webserver.crt           
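Optionally, sanity-check the self-signed certificate before wiring it into the container:

# print validity dates and subject of the webserver certificate
openssl x509 -in $WORKING_DIR/webserver.crt -noout -dates -subject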

Run the fastapi-dls Docker service (again, set DLS_URL to your own LAN IP):

docker run -e DLS_URL=192.168.31.128 -e DLS_PORT=1020 -p 1020:443 -v $WORKING_DIR:/app/cert collinwebdesigns/fastapi-dls:latest           

Visit https://192.168.31.20:1020 to check whether the licensing service is running correctly.
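If the machine you're checking from has no browser, a quick curl works just as well (the certificate is self-signed, hence --insecure):

curl --insecure -L https://192.168.31.20:1020/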

4. Install the vGPU driver and license in the virtual machine

First, add a PCI device to the virtual machine in PVE and select the corresponding graphics card.

Since I only plan to split the card into two vGPUs, I chose GRID P4-4Q (nvidia-65) as the MDev type.
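The same assignment can be made from the PVE shell instead of the web UI; a sketch assuming VM ID 100 and the card at PCI address 01:00.0 (adjust both to your setup):

# attach the Tesla P4 to VM 100 as a GRID P4-4Q mediated device
qm set 100 --hostpci0 01:00.0,mdev=nvidia-65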

Next, install the vGPU driver and get it licensed inside the virtual machine. Copy the driver into the VM (depending on the guest OS, SMB sharing or ssh transfer both work) and install it just like a normal desktop driver.

After installing the driver, obtain the vGPU license.

On Windows, run PowerShell as administrator and execute the following commands; remember to change the licensing server address (the same as the Docker service above):

# Download the license token
curl.exe --insecure -L -X GET https://192.168.31.20:1020/-/client-token -o "C:\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken\client_configuration_token_$($(Get-Date).tostring('dd-MM-yy-hh-mm-ss')).tok"
# Restart the NVIDIA service
Restart-Service NVDisplay.ContainerLocalSystem
# Check the license status
nvidia-smi.exe -q | Select-String License

On Linux, run the following commands as root:

# Download the license token
curl --insecure -L -X GET https://192.168.31.20:1020/-/client-token -o /etc/nvidia/ClientConfigToken/client_configuration_token_$(date '+%d-%m-%Y-%H-%M-%S').tok
# Restart the NVIDIA service
service nvidia-gridd restart
# Check the license status
nvidia-smi -q | grep License
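One note for the Linux client: the token directory normally ships with the GRID driver, but if curl complains that the path doesn't exist, create it first and re-run the download:

mkdir -p /etc/nvidia/ClientConfigToken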

After installing the driver on Windows, the card shows up normally in Task Manager; since I split the card in two, the VRAM shown here is 4 GB.

To make better use of the GPU, you can stream from the virtual machine with Sunshine + Moonlight.

HandBrake can use NVENC for encoding as normal, and CUDA installs without issues; aside from the lack of a physical video output, it behaves like any ordinary graphics card.
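For reference, the same thing can be done from the command line; a minimal HandBrakeCLI sketch (filenames are placeholders, and the build must include NVENC support):

# transcode with the P4's NVENC H.264 encoder
HandBrakeCLI -i input.mkv -o output.mp4 --encoder nvenc_h264 --quality 22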

Although this card has low power consumption, it still has to be modded for cooling. The cooling mod is simple: buy a plastic shroud bracket plus a small blower fan (the one I bought only spins at 4800 rpm, so it is fairly quiet). Since the seller didn't include screws, I fixed the fan in place with screen adhesive (which holds quite firmly). Because the Tesla P4's power draw is low, this kind of cooling is just barely enough; cards like the P40 with a TDP of 200 W or more need a more aggressive solution.

To get a sealed air duct, some people wrap tape around the joint between the card and the fan. I think that spoils the card's looks a bit, so I skipped the tape. Here's how it looks after the mod.

The front looks good; the back, not so much, especially since the seller also shipped a gray bracket, which is hard to praise. The blower can also be mounted with the outlet facing the rear, but I think this way looks better.

Here's the card installed in the machine.

Since the card's fan is plugged into a motherboard fan header, remember to set that header to full speed in the BIOS. My blower doesn't spin very fast, so even at full speed it is quiet, but the cooling headroom is also tight: about 45 °C at idle, and at full load it hits 90 °C and throttles, though it remains just about usable.

The following performance tests were run on a Windows VM under PVE, so compared with a bare-metal install there may be some performance loss.

The Tesla P4 supports video encoding and decoding, so I stream remotely with Sunshine + Moonlight and ran Time Spy directly over the stream.

The graphics score is only about 4000 points, well below the Tesla P4's usual result (typically around 4600-5000). If it weren't for the video encoder, this would be worse than an 80-yuan P106. But the next tests show that the Tesla P4's video transcoding is also underwhelming.

I transcoded the same video at the same resolution on the Tesla P4 and a 2060S; the Tesla P4 has a sixth-generation NVENC encoder, the 2060S a seventh-generation one. The Tesla P4 has two encoding chips while the 2060S has only one, yet the Tesla P4 managed only about 280 fps (one encoder, 50% load), meaning roughly 560 fps with both encoders fully loaded. The 2060S does about 700 fps on a single encoder.
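Numbers like these can be reproduced with a throwaway ffmpeg run that decodes on the GPU, encodes with NVENC, and discards the output while reporting speed; a rough sketch (the input file is a placeholder, and ffmpeg must be built with NVENC/NVDEC support):

# hardware decode + NVENC encode, discard output, report fps
ffmpeg -benchmark -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -c:v h264_nvenc -f null -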

Streaming handles online games without much trouble. Considering bandwidth and overall performance, I capped Moonlight at 60 fps, so LOL shows 60 frames.

Overall, the Tesla P4 is not a good value: its performance is only around GTX 1060 level, and its video transcoding is not strong either. For AI and gaming it sits in the same tier as the bargain-bin P106, which beats it outright on price. On top of that, the Tesla P4 needs a cooling mod, which puts it well behind the similarly priced 40HX, so I don't recommend buying it.

Finally, a note on the driver breaking after a kernel upgrade. Install the matching pve-headers before upgrading the kernel; if the driver is still broken after reinstalling it, roll back to the previous kernel:

# Show the running kernel
uname -a
# List installed kernels
dpkg --get-selections | grep kernel
# Pin the older kernel
proxmox-boot-tool kernel pin 5.13.19-6-pve
proxmox-boot-tool refresh
# Reboot
reboot
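Incidentally, before resorting to the kernel pin, it's worth checking whether the vGPU module simply failed to rebuild for the new kernel; since the host driver was installed with --dkms, dkms usually tells you:

# see which kernels the NVIDIA module has been built for
dkms status
# if the new kernel is missing, install its headers and re-run the installer
apt install pve-headers-$(uname -r)
./NVIDIA-Linux-x86_64-535.54.06-vgpu-kvm.run --dkms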

That should cover everything pretty thoroughly!