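Overview
Using the program from Lesson 4, Section 2 of the deep learning course materials published by the Matsuo Lab, I compare training performance with and without a graphics card.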
https://github.com/matsuolab-edu/dl4us/blob/master/lesson4/lesson4_sec2_exercise.ipynb
Mac
I first ran this program on my Mac, and it took quite a while: roughly 400 s per epoch, or about 100 minutes for all 15 epochs.
2020-07-17 20:55:43.113435: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-17 20:55:43.179415: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fa5182d3ef0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-17 20:55:43.179445: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Epoch 1/15
307/307 - 399s - loss: 2.9520 - val_loss: 2.3440
Epoch 2/15
307/307 - 385s - loss: 2.1116 - val_loss: 1.9684
Epoch 3/15
307/307 - 396s - loss: 1.8424 - val_loss: 1.7958
Epoch 4/15
307/307 - 401s - loss: 1.6791 - val_loss: 1.6770
Epoch 5/15
307/307 - 400s - loss: 1.5544 - val_loss: 1.5807
Epoch 6/15
307/307 - 395s - loss: 1.4500 - val_loss: 1.5049
Epoch 7/15
307/307 - 398s - loss: 1.3558 - val_loss: 1.4410
Epoch 8/15
307/307 - 399s - loss: 1.2725 - val_loss: 1.3951
Epoch 9/15
307/307 - 416s - loss: 1.1990 - val_loss: 1.3568
Epoch 10/15
307/307 - 401s - loss: 1.1325 - val_loss: 1.3207
Epoch 11/15
307/307 - 394s - loss: 1.0734 - val_loss: 1.2963
Epoch 12/15
307/307 - 398s - loss: 1.0207 - val_loss: 1.2759
Epoch 13/15
307/307 - 399s - loss: 0.9728 - val_loss: 1.2641
Epoch 14/15
307/307 - 394s - loss: 0.9289 - val_loss: 1.2611
Epoch 15/15
307/307 - 409s - loss: 0.8879 - val_loss: 1.2476
Windows + GTX 750 Ti
I figured a Windows machine with a graphics card would be faster, so I ran the program there. Getting it to run was a hassle; among other things, I had to reinstall Python.
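The GeForce GTX 750 Ti was released in 2014, so it was about six years old at the time of writing, and it was relatively cheap even then (probably a bit under 20,000 yen?). It sat roughly where the 1660 Ti sits today.
Let's see how much it improves things.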
C:\Users\yoshi\dl4us\src\lesson4>python lesson_4_2.py
2020-07-18 15:35:07.640596: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-18 15:35:11.046229: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-18 15:35:11.070849: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: GeForce GTX 750 Ti computeCapability: 5.0
coreClock: 1.0845GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 80.47GiB/s
2020-07-18 15:35:11.077753: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-18 15:35:11.084651: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-18 15:35:11.090914: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-18 15:35:11.095087: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-18 15:35:11.102217: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-18 15:35:11.107127: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-18 15:35:11.111157: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
2020-07-18 15:35:11.115093: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-07-18 15:35:11.125803: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-18 15:35:11.137837: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x17083b55b00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-18 15:35:11.141513: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-07-18 15:35:11.145098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-18 15:35:11.148936: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]
Epoch 1/15
307/307 - 219s - loss: 2.9461 - val_loss: 2.3396
Epoch 2/15
307/307 - 222s - loss: 2.0982 - val_loss: 1.9578
Epoch 3/15
307/307 - 221s - loss: 1.8337 - val_loss: 1.7803
Epoch 4/15
307/307 - 222s - loss: 1.6725 - val_loss: 1.6652
Epoch 5/15
307/307 - 221s - loss: 1.5452 - val_loss: 1.5785
Epoch 6/15
307/307 - 218s - loss: 1.4379 - val_loss: 1.4995
Epoch 7/15
307/307 - 218s - loss: 1.3447 - val_loss: 1.4389
Epoch 8/15
307/307 - 217s - loss: 1.2615 - val_loss: 1.3937
Epoch 9/15
307/307 - 218s - loss: 1.1889 - val_loss: 1.3460
Epoch 10/15
307/307 - 218s - loss: 1.1243 - val_loss: 1.3225
Epoch 11/15
307/307 - 218s - loss: 1.0673 - val_loss: 1.2974
Epoch 12/15
307/307 - 217s - loss: 1.0151 - val_loss: 1.2804
Epoch 13/15
307/307 - 218s - loss: 0.9673 - val_loss: 1.2683
Epoch 14/15
307/307 - 217 - loss: 0.9224 - val_loss: 1.2566
Epoch 15/15
307/307 - 217s - loss: 0.8803 - val_loss: 1.2503
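The log above warns that cudnn64_7.dll could not be loaded and then prints "Skipping registering GPU devices...", so it seems worth checking whether TensorFlow actually sees the GPU before trusting these timings. Here is a minimal sketch, assuming TensorFlow 2.1 or later (where `tf.config.list_physical_devices` is available). `gpu_visible` is a hypothetical helper name of my own, not part of the course code.

```python
def gpu_visible():
    """Report whether TensorFlow has registered at least one GPU.

    Returns True or False, or None when TensorFlow is not installed
    in the current environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # An empty list means no GPU was registered and training runs on the CPU.
    gpus = tf.config.list_physical_devices("GPU")
    return len(gpus) > 0


if __name__ == "__main__":
    print("GPU visible to TensorFlow:", gpu_visible())
```

If this prints False on a machine that has an NVIDIA card, the missing library named in the warning (here cuDNN) is the first thing to look at.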
Conclusion
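The Windows run came out about 1.8x faster (roughly 219 s per epoch versus roughly 399 s on the Mac), which is less of a speedup than I expected. One caveat: the Windows log says cudnn64_7.dll could not be loaded and then prints "Skipping registering GPU devices...", so TensorFlow most likely ran on the CPU, and the gap probably reflects the two machines' CPUs rather than the GTX 750 Ti. A real GPU comparison would need cuDNN installed and a rerun. Either way, a better graphics card would be nice to have.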
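As a rough check on the speedup figure, the per-epoch times from the two logs can be averaged and compared. The sketch below simply copies the times out of the logs. `speedup` is a hypothetical helper name of my own.

```python
from statistics import mean

# Per-epoch wall-clock times in seconds, copied from the two training logs.
mac_seconds = [399, 385, 396, 401, 400, 395, 398, 399,
               416, 401, 394, 398, 399, 394, 409]
windows_seconds = [219, 222, 221, 222, 221, 218, 218, 217,
                   218, 218, 218, 217, 218, 217, 217]


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`,
    based on mean time per epoch."""
    return mean(baseline) / mean(candidate)


if __name__ == "__main__":
    print(f"Mac mean per epoch:     {mean(mac_seconds):.1f} s")
    print(f"Windows mean per epoch: {mean(windows_seconds):.1f} s")
    print(f"Speedup: {speedup(mac_seconds, windows_seconds):.2f}x")
```

With these numbers the ratio works out to about 1.8. Comparing means over all 15 epochs smooths out the occasional slow epoch, such as the 416 s one on the Mac.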
These days that would mean something like a 1660 Ti. I want a better graphics card. I wonder which would be the better pick, the 1660 Ti or the RTX 2060.