
How to choose a GPU for deep learning?

Author: Machine Vision Knowledge Recommendation Officer

Deep learning is a field with heavy compute demands, and to a large extent your choice of GPU determines your deep learning experience. Choosing the right GPU to buy is therefore an important decision. So how do you choose? This article pulls together the GPU selection criteria and benchmark information available online, in the hope that it can serve as a reference for your purchase decision.

1 What makes one GPU faster than another?

There are a few reliable performance indicators, drawn from practical experience, that serve as good rules of thumb. Here are the priorities for different deep learning architectures:

Convolutional networks and Transformers: Tensor Cores > FLOPs > Memory Bandwidth > 16-bit capability

Recurrent networks: Memory Bandwidth > 16-bit capability > Tensor Cores > FLOPs
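
To see why the priorities differ, you can estimate a workload's arithmetic intensity (FLOPs per byte of memory traffic) and compare it with the GPU's ratio of peak compute to memory bandwidth, the "ridge point" of a roofline model. The sketch below is a back-of-the-envelope illustration only; the RTX 2080 Ti spec numbers are approximate and the FLOP/byte counts are simplified:

    # Roofline-style check: is a layer compute-bound or memory-bound?
    PEAK_FP16_TFLOPS = 107.0   # RTX 2080 Ti Tensor Core peak, approximate
    MEM_BANDWIDTH_GBS = 616.0  # RTX 2080 Ti memory bandwidth, approximate

    # Ridge point: FLOPs per byte above which compute is the limit,
    # below which memory bandwidth is the limit.
    ridge = (PEAK_FP16_TFLOPS * 1e12) / (MEM_BANDWIDTH_GBS * 1e9)

    def bound_by(flops, bytes_moved):
        intensity = flops / bytes_moved          # FLOPs per byte
        return "compute (Tensor Cores / FLOPs)" if intensity > ridge else "memory bandwidth"

    # Large matrix multiply, typical of CNNs and transformers: high intensity.
    n = 4096
    print("matmul:", bound_by(2 * n**3, 3 * n * n * 2))   # three fp16 matrices

    # One LSTM step with a small batch: low intensity, bandwidth-bound.
    hidden, batch = 1024, 32
    params = 4 * hidden * (2 * hidden)           # 4 gates, input + recurrent weights
    print("LSTM step:", bound_by(2 * params * batch, params * 2))   # fp16 weights

This matches the priority lists above: big matrix multiplies saturate Tensor Cores, while recurrent steps mostly wait on memory.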

2 How to choose NVIDIA/AMD/Google

NVIDIA's standard libraries made it very easy to build the first deep learning libraries in CUDA. This early advantage, combined with NVIDIA's strong community support, means that if you use an NVIDIA GPU you can easily find help when something goes wrong. However, NVIDIA's policy now permits CUDA in data centers only on Tesla GPUs, not on GTX or RTX cards, even though Tesla offers no real advantage over GTX and RTX while costing up to 10 times as much.
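
As a first step on any NVIDIA card, it is worth confirming that CUDA is actually visible to your framework. A minimal check in PyTorch (any recent version) might look like this:

    import torch

    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))
        # Compute capability 7.0+ (Volta/Turing) is needed for Tensor Cores.
        print("Compute capability:", torch.cuda.get_device_capability(0))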

AMD GPUs are powerful but lack adequate software support. AMD GPUs do offer 16-bit compute, but they still fall short of the Tensor Cores in NVIDIA GPUs.

Google TPUs are highly cost-effective. Because the TPU has a sophisticated parallel infrastructure, the speed advantage over GPUs grows when multiple cloud TPUs (each roughly equivalent to 4 GPUs) are used. For now, TPUs are best suited to training convolutional neural networks.
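
For reference, connecting to a TPU from a notebook is straightforward. This is a minimal sketch assuming a Colab TPU runtime with TensorFlow 2.3+; the exact API names vary across TF versions:

    import tensorflow as tf

    # Locate and initialize the TPU attached to the Colab runtime.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)

    # Build the model inside the strategy scope so it is replicated
    # across the TPU cores.
    with strategy.scope():
        model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")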

3 Multi-GPU parallel acceleration

Convolutional and recurrent networks are very easy to parallelize, especially if you use only a single machine or up to 4 GPUs. Both TensorFlow and PyTorch also handle parallel recurrent networks well. However, fully connected networks, including transformers, usually scale poorly under data parallelism, so more advanced algorithms are needed for acceleration. If you plan to run on multiple GPUs, first try running on a single GPU and compare the two speeds, as in the sketch below. Since a single GPU can handle almost any task, features that mainly improve parallelism, such as the number of PCIe lanes, are not that important when buying multiple GPUs.
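
For a quick single-machine comparison, PyTorch's nn.DataParallel is the lowest-effort way to spread a batch across local GPUs (DistributedDataParallel is faster but needs more setup). A minimal sketch:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)   # splits each batch across all GPUs
    model = model.cuda()

    x = torch.randn(256, 512).cuda()     # the batch is divided among devices
    y = model(x)                         # outputs are gathered back on GPU 0
    print(y.shape, "computed on", torch.cuda.device_count(), "GPU(s)")

Running the same script with CUDA_VISIBLE_DEVICES=0 gives the single-GPU baseline for the comparison.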

4 Performance Evaluation

1) Cost-benefit evaluation from Tim Dettmers[1]

https://timdettmers.com/2019/04/03/which-gpu-for-deep-learning/


Normalized performance-per-cost numbers for convolutional networks (CNNs), recurrent networks (RNNs), and transformers (higher is better). The RTX 2060 is more than 5 times more cost-efficient than the Tesla V100. For short sequences of length less than 100, "Word RNN" denotes a biLSTM. Benchmarked with PyTorch 1.0.1 and CUDA 10.

As these data show, the RTX 2060 is more cost-efficient than the RTX 2070, RTX 2080, or RTX 2080 Ti. The reason is that the ability to do 16-bit computation with Tensor Cores is far more valuable than simply having a larger number of Tensor Cores.
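
In practice, that 16-bit capability only pays off if training actually runs in 16-bit. In modern PyTorch (1.6+, newer than the PyTorch 1.0.1 used in the benchmark above) this takes a few lines with torch.cuda.amp; a minimal sketch:

    import torch
    from torch.cuda.amp import autocast, GradScaler

    model = torch.nn.Linear(1024, 1024).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    scaler = GradScaler()                    # rescales loss to avoid fp16 underflow

    for _ in range(10):
        x = torch.randn(64, 1024, device="cuda")
        optimizer.zero_grad()
        with autocast():                     # matmuls run in fp16 on Tensor Cores
            loss = model(x).square().mean()
        scaler.scale(loss).backward()        # backward pass on the scaled loss
        scaler.step(optimizer)               # unscales gradients, then steps
        scaler.update()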

2) Reviews from Lambda [2,3]

https://lambdalabs.com/blog/best-gpu-tensorflow-2080-ti-vs-v100-vs-titan-v-vs-1080-ti-benchmark/

https://lambdalabs.com/blog/choosing-a-gpu-for-deep-learning/


Average GPU speedup divided by total system cost


GPU performance, measured in images processed per second


Training throughput of each GPU on image models, normalized to the Quadro RTX 8000
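
Throughput numbers like these are easy to reproduce roughly on your own card. A simplified sketch of an images-per-second measurement (model, batch size, and iteration counts are illustrative):

    import time
    import torch
    import torchvision.models as models

    model = models.resnet50().cuda().train()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    criterion = torch.nn.CrossEntropyLoss()
    x = torch.randn(64, 3, 224, 224, device="cuda")       # synthetic batch
    t = torch.randint(0, 1000, (64,), device="cuda")      # synthetic labels

    def step():
        optimizer.zero_grad()
        criterion(model(x), t).backward()
        optimizer.step()

    for _ in range(5):                                    # warm-up iterations
        step()
    torch.cuda.synchronize()

    start, iters = time.time(), 20
    for _ in range(iters):
        step()
    torch.cuda.synchronize()                              # wait for GPU to finish
    print("images/sec:", 64 * iters / (time.time() - start))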

3) "Online" GPU review from Zhihu @Aero[4]

https://www.zhihu.com/question/299434830/answer/1010987691

Google Colab is probably the most widely used; after all, it is free, and you can even choose a TPU.


Colab has since introduced a paid membership:


The free tier mostly hands out a K80, which is a little weak but fine for relatively simple models; with some luck you may get a T4, and the truly lucky occasionally land a P100.

If you pay, you are guaranteed a T4 or P100 for $10 a month, though this is said to be available in the United States only.

Since Colab is Google's service, you must first be able to reach Google, and your connection must be stable: if the session disconnects, you will likely have to retrain from scratch, so the overall experience from mainland China is not great.
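
The disconnect problem is manageable if you checkpoint to Google Drive regularly and resume from the last checkpoint. A minimal PyTorch sketch (the Drive path shown is hypothetical):

    import torch
    from google.colab import drive

    drive.mount('/content/drive')
    CKPT = '/content/drive/MyDrive/model_ckpt.pt'        # hypothetical path

    def save_ckpt(model, optimizer, epoch):
        # Call this every few epochs; Drive survives the session.
        torch.save({'epoch': epoch,
                    'model': model.state_dict(),
                    'optim': optimizer.state_dict()}, CKPT)

    def load_ckpt(model, optimizer):
        state = torch.load(CKPT)
        model.load_state_dict(state['model'])
        optimizer.load_state_dict(state['optim'])
        return state['epoch'] + 1                        # epoch to resume from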

Next is Baidu AI Studio:


Offering a free V100 is very generous. People used to install TensorFlow themselves, but that is no longer allowed: in testing, neither TensorFlow nor PyTorch can be installed, so you have to use PaddlePaddle. Users already accustomed to PaddlePaddle can choose this; it is not suitable for anyone else.


GPU supply seems tight, though: during the day there are constant peak-hour reminders, and capacity only really frees up after 22:00.

There is also vast.ai abroad:


5 Recommendations

1) Recommendations from Tim Dettmers

  • Best GPU overall: RTX 2070
  • Avoid: any Tesla, any Quadro, any Founders Edition, Titan RTX, Titan V, Titan XP
  • Cost-efficient but expensive: RTX 2070
  • Cost-efficient and cheap: RTX 2060, GTX 1060 (6GB)
  • On a tight budget: GTX 1060 (6GB)
  • On a very tight budget: GTX 1050 Ti (4GB); or CPU (prototyping) + AWS/TPU (training); or Colab
  • For Kaggle competitions: RTX 2070
  • For computer vision researchers: RTX 2080 Ti; an RTX Titan is recommended if training very large networks

2) Recommendations from Lambda

As of February 2020, the following GPUs can train all SOTA language and image models:

  • RTX 8000: 48 GB VRAM
  • RTX 6000: 24 GB VRAM
  • Titan RTX: 24 GB VRAM

Specific recommendations:

  • RTX 2060 (6 GB): ideal for exploring deep learning in your spare time.
  • RTX 2070 or 2080 (8 GB): suitable for deep learning professionals with a budget of 4-6k.
  • RTX 2080 Ti (11 GB): for deep learning professionals with a GPU budget of around 8-9k. The RTX 2080 Ti is about 40% faster than the RTX 2080.
  • Titan RTX and Quadro RTX 6000 (24 GB): suitable for researchers who use SOTA models widely but lack the budget for the RTX 8000.
  • Quadro RTX 8000 (48 GB): relatively expensive, but its performance makes it an excellent investment for the future.
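
A rough way to sanity-check these VRAM tiers against a given model: fp32 training with Adam costs about 16 bytes per parameter (weights, gradients, and two optimizer moments), before counting activations, which often dominate. An illustrative estimate:

    # Approximate training memory per parameter with fp32 + Adam:
    # 4 B weights + 4 B gradients + 8 B optimizer moments = 16 B.
    def training_vram_gb(n_params, bytes_per_param=16):
        return n_params * bytes_per_param / 1024**3

    # e.g. a 340M-parameter model (BERT-large scale):
    print(f"{training_vram_gb(340e6):.1f} GB before activations")

Activations scale with batch size and input size, which is why 24-48 GB cards are needed for SOTA models even when the parameters alone look affordable.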

References

[1] https://timdettmers.com/2019/04/03/which-gpu-for-deep-learning/

[2] https://lambdalabs.com/blog/best-gpu-tensorflow-2080-ti-vs-v100-vs-titan-v-vs-1080-ti-benchmark/

[3] https://lambdalabs.com/blog/choosing-a-gpu-for-deep-learning/

[4] https://www.zhihu.com/question/299434830/answer/1010987691
