Building a TensorFlow/MNIST Environment
--------------------------------------------------------------------------------
・Preparation
・What to prepare
・Hardware
・PC
Diginnos / Prime PC
CPU : Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz
Clock : 3478.695 MHz
Cache Size : 6144 KB
CPU Cores : 4
Main Memory : 8 GB
Hard Disk : 500 GB
・GPU (used for parallel processing (GPGPU))
KUROUTOSHIKOU (玄人志向) NVIDIA GeForce GT 710 graphics board, 1 GB, GF-GT710-E1GB/HS
・Files
・Prepare a bootable USB from CentOS-8-x86_64-1905-boot.iso (CentOS 8 1905 or later).
・NVIDIA-Linux-x86_64-440.100.run
・cuda_10.1.243_418.87.00_linux.run
・cuda-repo-rhel8-10.2.89-1.x86_64.rpm
・libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm
・libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm
・libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm
・libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm
・libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm
--------------------------------------------------------------------------------
・Installation
・Boot the installer (T)
・Install CentOS 8
・Language: English (United States)
・DATE & TIME: Asia/Tokyo
・KEYBOARD: Japanese
・SOFTWARE SELECTION: Minimal Install
・KDUMP: disabled
・NETWORK & HOST NAME: Ethernet = ON; host name: ai01.mydomain
・[Begin Installation]
・ROOT PASSWORD
・USER CREATION // <-- devusr
・(Wait for the installation to finish)
・Reboot
・Static IP address (environment-dependent)
$ su -
# cd /etc/sysconfig/network-scripts
# cp ifcfg-enp5s0 _ifcfg-enp5s0_
# vi ifcfg-enp5s0
# diff _ifcfg-enp5s0_ ifcfg-enp5s0
4c4,7
< BOOTPROTO="dhcp"
---
> BOOTPROTO="static"
> IPADDR="192.168.3.201"
> NETMASK="255.255.255.0"
> GATEWAY="192.168.3.1"
#
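For reference, after the edit the changed stanza of ifcfg-enp5s0 reads as below. DNS1 is an optional extra (hypothetical value for this LAN) if name resolution should also go through the router:

```
BOOTPROTO="static"
IPADDR="192.168.3.201"
NETMASK="255.255.255.0"
GATEWAY="192.168.3.1"
DNS1="192.168.3.1"        # optional, hypothetical value
```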
・Reboot
# shutdown -r now
・Prepare the files
# mkdir ~/wrk
Place the required files here.
・NVIDIA-Linux-x86_64-440.100.run
・cuda_10.1.243_418.87.00_linux.run
・cuda-repo-rhel8-10.2.89-1.x86_64.rpm
・libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm
・libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm
・libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm
・libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm
・libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm
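To catch a corrupted download before it reaches an installer, you can record and verify SHA-256 checksums. A minimal sketch with a stand-in file (the real targets would be the .run and .rpm files listed above):

```shell
# Demo in a scratch directory; with the real installers, run this inside ~/wrk.
mkdir -p /tmp/wrk-demo && cd /tmp/wrk-demo
echo 'placeholder' > NVIDIA-demo.run      # stand-in for NVIDIA-Linux-x86_64-440.100.run
sha256sum NVIDIA-demo.run > SHA256SUMS    # record the checksum
sha256sum -c SHA256SUMS                   # verify: prints "NVIDIA-demo.run: OK"
```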
--------------------------------------------------------------------------------
・From here on, log in from a client machine over ssh and run the remaining steps remotely.
--------------------------------------------------------------------------------
# yum -y update
###
### reboot
###
# shutdown -r now
###
### Remi and PowerTools repositories
###
# yum -y install epel-release https://rpms.remirepo.net/enterprise/remi-release-8.rpm; \
yum config-manager --set-enabled PowerTools
###
### kernel-headers and tools
###
# yum -y install kernel-devel kernel-headers elfutils-libelf-devel zlib-devel gcc make cmake; \
yum -y groupinstall "Development Tools"; \
yum -y install protobuf-devel leveldb-devel snappy-devel opencv-devel boost-devel hdf5-devel; \
yum -y install gflags-devel glog-devel lmdb-devel; \
yum -y install atlas-devel; \
yum -y install python36 python3-devel; \
yum -y install tar wget net-tools zip unzip emacs nkf ImageMagick ImageMagick-devel git
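After the batch of yum installs, it is worth confirming that the core tools actually landed. A small sketch that prints a path for each tool found, or a MISSING line otherwise:

```shell
# Print the path of each required tool, or flag it as missing.
for t in gcc make cmake git python3 pip3; do
  command -v "$t" || echo "MISSING: $t"
done
```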
###
### Disable Nouveau kernel driver
###
# grub2-editenv - set "$(grub2-editenv - list | grep kernelopts) nouveau.modeset=0"
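The grub2-editenv line appends nouveau.modeset=0 to the existing kernelopts entry, so `grub2-editenv - list` afterwards should show something like the fragment below (the options before the flag depend on your install):

```
kernelopts=root=/dev/mapper/cl-root ro ... nouveau.modeset=0
```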
# shutdown -r now
###
### NVIDIA driver, CUDA, and TensorFlow
###
# sh ~/wrk/NVIDIA-Linux-x86_64-440.100.run
WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path '/usr/lib64/xorg/modules'; these paths were
not queryable from the system. If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility and the X.Org
SDK/development package for your distribution and reinstall the driver.
OK
Install NVIDIA's 32-bit compatibility libraries?
Yes
Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used
when you restart X? Any pre-existing X configuration file will be backed up.
No
Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 440.100) is now complete. Please update your xorg.conf file
as appropriate; see the file /usr/share/doc/NVIDIA_GLX-1.0/README.txt for details.
OK
# nvidia-smi
Mon Jul 6 11:49:46 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100 Driver Version: 440.100 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GT 710 Off | 00000000:01:00.0 N/A | N/A |
| 40% 51C P0 N/A / N/A | 0MiB / 978MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 Not Supported |
+-----------------------------------------------------------------------------+
#
# rpm --install ~/wrk/cuda-repo-rhel8-10.2.89-1.x86_64.rpm; \
yum -y install cuda
# nvidia-smi
Mon Jul 6 12:06:59 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06 Driver Version: 450.36.06 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GT 710 Off | 00000000:01:00.0 N/A | N/A |
| 40% 48C P0 N/A / N/A | 0MiB / 978MiB | N/A Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
#
# pip3 install --upgrade pip; \
pip3 install setuptools --upgrade; \
pip3 install --upgrade tensorflow; \
pip3 install --upgrade tensorflow-gpu; \
pip3 install ipython; \
rpm --install ~/wrk/libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm; \
rpm --install ~/wrk/libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm; \
rpm --install ~/wrk/libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm
###
### .bashrc
###
# cd
# vi .bashrc
# cat .bashrc
<<<
# .bashrc
# User specific aliases and functions
# alias rm='rm -i'
# alias cp='cp -i'
# alias mv='mv -i'
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
alias ls='ls'
alias ll='ls -la'
alias e='emacs'
alias delb='rm -f *~ .??*~'
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64/:$LD_LIBRARY_PATH"
>>>
# exit
### Log in again
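Once logged back in, the CUDA paths from .bashrc can be sanity-checked. A small sketch that prints the cuda entries, or a message if they are absent:

```shell
# List cuda-related entries on PATH and LD_LIBRARY_PATH, if any.
echo "$PATH" | tr ':' '\n' | grep cuda || echo 'cuda missing from PATH'
echo "$LD_LIBRARY_PATH" | tr ':' '\n' | grep cuda || echo 'cuda missing from LD_LIBRARY_PATH'
```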
# ipython
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
deep_learning = tf.constant('Deep Learning')
session = tf.Session()
session.run(deep_learning)
a = tf.constant(2)
b = tf.constant(3)
multiply = tf.multiply(a, b)
session.run(multiply)
##### The example above printed warnings that older CUDA libraries could not be found,
##### so we add the matching toolkits below.
#####
##### for libcudart.so.10.1
##### Any CUDA Toolkit version can be downloaded from:
##### https://developer.nvidia.com/cuda-toolkit-archive
#####
# sh ~/wrk/cuda_10.1.243_418.87.00_linux.run <-- Do you accept the above EULA? accept
Install only CUDA Toolkit 10.1 (deselect the other components).
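Each toolkit version installs side by side under /usr/local, and the cuda symlink points at whichever toolkit was installed last. The layout here ends up roughly as follows (exact version directories depend on what the repo package and runfile installed):

```
/usr/local/cuda-10.1/         <- from the runfile above
/usr/local/cuda-11.0/         <- from the cuda repo package
/usr/local/cuda -> cuda-...   <- symlink to the most recently installed toolkit
```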
#####
##### for libcudnn.so.7
##### Any cuDNN version can be downloaded from:
##### https://developer.nvidia.com/rdp/cudnn-archive
#####
# rpm --install ~/wrk/libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm; \
rpm --install ~/wrk/libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm
###
### Other packages
###
# pip3 install matplotlib; \
pip3 install keras; \
pip3 install pillow; \
pip3 install opencv-python; \
pip3 install pycuda; \
pip3 install chainer; \
pip3 install cupy
###
### MNIST
###
# cp -r /usr/src/cudnn_samples_v8/ $HOME; \
cd $HOME/cudnn_samples_v8/mnistCUDNN; \
make clean && make
# ./mnistCUDNN image=data/three_28x28.pgm | grep 'Result of classification'
Result of classification: 3
#
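Under the hood, mnistCUDNN runs the image through a small LeNet-style network and reports the index of the largest softmax output. A toy sketch of that final step, with made-up scores (hypothetical values, not the sample's real numbers):

```shell
python3 - <<'EOF'
import math
# Hypothetical raw scores for digits 0-9; index 3 dominates.
scores = [0.1, 0.2, 0.1, 5.0, 0.3, 0.1, 0.2, 0.1, 0.4, 0.2]
exps = [math.exp(s) for s in scores]
probs = [e / sum(exps) for e in exps]          # softmax
print("Result of classification:", probs.index(max(probs)))
EOF
```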
###
### TensorBoard
###
### --- server side
###
# yum -y reinstall python3-six
# firewall-cmd --zone=public --add-port=6006/tcp --permanent; \
firewall-cmd --reload
# cd
# mkdir logs
# tensorboard --logdir ./logs
###
### --- client side (use tunneling)
###
$ ssh -L 16006:localhost:6006 root@192.168.3.201
### Check http://localhost:16006
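The -L option maps a local port to a port on the server, so the client's browser reaches TensorBoard as if it ran locally:

```
browser -> localhost:16006 ==ssh tunnel==> 192.168.3.201:6006 -> tensorboard
```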
------------------------------------------------------------------------------------------------
end of line