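Web Programming and Side Jobs That Even a Cat Can Understand

Engineering manager by day, web engineer on the side. I write about web development tips, side jobs, and everyday life.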

Comparing deep learning training time on a MacBook Pro 2018 and Windows 10 + GTX 750 Ti

Overview

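Using the program from chapter 4, section 2 of the Deep Learning course material published by the Matsuo Lab, I'll compare training speed with and without a graphics card.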

https://github.com/matsuolab-edu/dl4us/blob/master/lesson4/lesson4_sec2_exercise.ipynb

Mac

I first ran this program on my Mac, and each epoch took a good while (around 400 seconds).

2020-07-17 20:55:43.113435: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-17 20:55:43.179415: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fa5182d3ef0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-17 20:55:43.179445: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Epoch 1/15
307/307 - 399s - loss: 2.9520 - val_loss: 2.3440
Epoch 2/15
307/307 - 385s - loss: 2.1116 - val_loss: 1.9684
Epoch 3/15
307/307 - 396s - loss: 1.8424 - val_loss: 1.7958
Epoch 4/15
307/307 - 401s - loss: 1.6791 - val_loss: 1.6770
Epoch 5/15
307/307 - 400s - loss: 1.5544 - val_loss: 1.5807
Epoch 6/15
307/307 - 395s - loss: 1.4500 - val_loss: 1.5049
Epoch 7/15
307/307 - 398s - loss: 1.3558 - val_loss: 1.4410
Epoch 8/15
307/307 - 399s - loss: 1.2725 - val_loss: 1.3951
Epoch 9/15
307/307 - 416s - loss: 1.1990 - val_loss: 1.3568
Epoch 10/15
307/307 - 401s - loss: 1.1325 - val_loss: 1.3207
Epoch 11/15
307/307 - 394s - loss: 1.0734 - val_loss: 1.2963
Epoch 12/15
307/307 - 398s - loss: 1.0207 - val_loss: 1.2759
Epoch 13/15
307/307 - 399s - loss: 0.9728 - val_loss: 1.2641
Epoch 14/15
307/307 - 394s - loss: 0.9289 - val_loss: 1.2611
Epoch 15/15
307/307 - 409s - loss: 0.8879 - val_loss: 1.2476

Windows + GTX 750 Ti

I figured a Windows machine with a graphics card would be faster, so I ran it there too. Getting it to run was a hassle; among other things, I had to reinstall Python.

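The GeForce GTX 750 Ti is a graphics card from 2014, so it is about six years old now, and it was relatively cheap even back then (maybe a bit under 20,000 yen?). It's roughly what a 1660 Ti is today.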

Let's see how much of an improvement this gives.

C:\Users\yoshi\dl4us\src\lesson4>python lesson_4_2.py
2020-07-18 15:35:07.640596: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-18 15:35:11.046229: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-18 15:35:11.070849: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: GeForce GTX 750 Ti computeCapability: 5.0
coreClock: 1.0845GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 80.47GiB/s
2020-07-18 15:35:11.077753: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-18 15:35:11.084651: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-18 15:35:11.090914: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-18 15:35:11.095087: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-18 15:35:11.102217: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-18 15:35:11.107127: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-18 15:35:11.111157: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
2020-07-18 15:35:11.115093: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-07-18 15:35:11.125803: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-18 15:35:11.137837: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x17083b55b00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-18 15:35:11.141513: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-18 15:35:11.145098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-18 15:35:11.148936: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]
Epoch 1/15
307/307 - 219s - loss: 2.9461 - val_loss: 2.3396
Epoch 2/15
307/307 - 222s - loss: 2.0982 - val_loss: 1.9578
Epoch 3/15
307/307 - 221s - loss: 1.8337 - val_loss: 1.7803
Epoch 4/15
307/307 - 222s - loss: 1.6725 - val_loss: 1.6652
Epoch 5/15
307/307 - 221s - loss: 1.5452 - val_loss: 1.5785
Epoch 6/15
307/307 - 218s - loss: 1.4379 - val_loss: 1.4995
Epoch 7/15
307/307 - 218s - loss: 1.3447 - val_loss: 1.4389
Epoch 8/15
307/307 - 217s - loss: 1.2615 - val_loss: 1.3937
Epoch 9/15
307/307 - 218s - loss: 1.1889 - val_loss: 1.3460
Epoch 10/15
307/307 - 218s - loss: 1.1243 - val_loss: 1.3225
Epoch 11/15
307/307 - 218s - loss: 1.0673 - val_loss: 1.2974
Epoch 12/15
307/307 - 217s - loss: 1.0151 - val_loss: 1.2804
Epoch 13/15
307/307 - 218s - loss: 0.9673 - val_loss: 1.2683
Epoch 14/15
307/307 - 217s - loss: 0.9224 - val_loss: 1.2566
Epoch 15/15
307/307 - 217s - loss: 0.8803 - val_loss: 1.2503
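
One thing worth checking in a startup log like the one above is whether TensorFlow actually registered the GPU. The following is a minimal sketch and not part of the original exercise: the helper name `gpu_load_problems` is hypothetical, and it only does plain string matching on the two messages that appear in this log, so other TensorFlow versions may word them differently.

```python
import re

# Matches the dso_loader warning TensorFlow prints when a shared library is missing.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def gpu_load_problems(log_text):
    """Return (missing_libraries, gpu_skipped) found in a TensorFlow startup log.

    Hypothetical helper for illustration: it only looks for the two messages
    seen in the Windows log above.
    """
    missing = MISSING_LIB.findall(log_text)
    gpu_skipped = "Skipping registering GPU devices" in log_text
    return missing, gpu_skipped


# Two lines copied from the Windows log above (timestamps trimmed).
sample_log = """\
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
Skipping registering GPU devices...
"""

missing, gpu_skipped = gpu_load_problems(sample_log)
print("missing libraries:", missing)
print("GPU registration skipped:", gpu_skipped)
```

For this log it reports `cudnn64_7.dll` as missing and GPU registration as skipped, which suggests the training above ran on the CPU.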

Conclusion

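It got roughly 1.8x faster (about 219 s per epoch versus about 399 s on the Mac). That's less of a speedup than I expected. One caveat: the log above reports that cudnn64_7.dll was not found and says "Skipping registering GPU devices...", so this run most likely used the CPU and not the GTX 750 Ti. Installing cuDNN and rerunning would be needed for a real GPU comparison. Either way, I'd probably be better off buying a better graphics card.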
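
The ratio can be computed directly from the per-epoch times in the two logs. This is a small sketch with the seconds copied by hand from the logs; the variable and function names are my own, and epoch 14 of the Windows run is taken to be 217 seconds.

```python
from statistics import mean

# Per-epoch wall-clock times in seconds, copied from the two training logs.
mac_seconds = [399, 385, 396, 401, 400, 395, 398, 399, 416, 401, 394, 398, 399, 394, 409]
windows_seconds = [219, 222, 221, 222, 221, 218, 218, 217, 218, 218, 218, 217, 218, 217, 217]


def summarize(label, seconds):
    """Print the mean per-epoch time and the total training time for one machine."""
    total_minutes = sum(seconds) / 60
    print(f"{label}: mean {mean(seconds):.1f} s/epoch, total {total_minutes:.1f} min")


summarize("Mac", mac_seconds)
summarize("Windows", windows_seconds)

# Ratio of mean epoch times: how many times faster the Windows run was.
speedup = mean(mac_seconds) / mean(windows_seconds)
print(f"speedup: {speedup:.2f}x")
```

With these numbers the mean is about 399 s per epoch on the Mac and about 219 s on Windows, a ratio of roughly 1.8.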

These days that would be something like a 1660 Ti. I want a better graphics card. I wonder which is the better choice, the 1660 Ti or the RTX 2060.
