Deep Learning Environment Setup / TensorFlow

Building a TensorFlow/MNIST Execution Environment
--------------------------------------------------------------------------------

・Preparation

 ・What to prepare

    ・Hardware
      ・PC
        Diginnos / Prime PC
          CPU         : Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz
          Clock       : 3478.695 MHz
          Cache Size  : 6144 KB
          CPU Cores   : 4
          Core Memory : 8 GB
          Hard Disk   : 500 GB
      ・GPU (used for parallel processing, i.e. GPGPU)
        Kuroutoshikou (玄人志向) NVIDIA GeForce GT 710 graphics board, 1 GB (GF-GT710-E1GB/HS)

    ・Files
      ・A bootable USB drive created from CentOS-8-x86_64-1905-boot.iso (CentOS 8 1905 or later)
      ・NVIDIA-Linux-x86_64-440.100.run
      ・cuda_10.1.243_418.87.00_linux.run
      ・cuda-repo-rhel8-10.2.89-1.x86_64.rpm
      ・libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm
      ・libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm
      ・libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm
      ・libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm
      ・libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm

--------------------------------------------------------------------------------

・Installation

 ・Boot from the installation USB (T)
 ・Install CentOS 8
  ・Language: English (United States)
  ・DATE & TIME: Asia/Tokyo
  ・KEYBOARD: Japanese
  ・SOFTWARE SELECTION: Minimal Install;
  ・KDUMP: disabled
  ・NETWORK & HOST NAME: Ethernet = ON; host name = ai01.mydomain
  ・[Begin Installation]

  ・ROOT PASSWORD
  ・USER CREATION       // <-- devusr

  ・(Wait for the installation to finish)

  ・Reboot

  ・Static IP address (environment-dependent)
      $ su -
      # cd /etc/sysconfig/network-scripts
      # cp ifcfg-enp5s0 _ifcfg-enp5s0_
      # vi ifcfg-enp5s0
      # diff _ifcfg-enp5s0_ ifcfg-enp5s0
      4c4,7
      < BOOTPROTO="dhcp"
      ---
      > BOOTPROTO="static"
      > IPADDR="192.168.3.201"
      > NETMASK="255.255.255.0"
      > GATEWAY="192.168.3.1"
      #

  ・Reboot
      # shutdown -r now

  ・Prepare the files
      # mkdir ~/wrk
      Place the required files listed below in this directory.
      ・NVIDIA-Linux-x86_64-440.100.run
      ・cuda_10.1.243_418.87.00_linux.run
      ・cuda-repo-rhel8-10.2.89-1.x86_64.rpm
      ・libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm
      ・libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm
      ・libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm
      ・libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm
      ・libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm

--------------------------------------------------------------------------------

・From this point on, log in to the server from a client machine over ssh and continue there.

--------------------------------------------------------------------------------

# yum -y update

###
### reboot
###
# shutdown -r now

###
### remi and PowerTools.
###
# yum -y install epel-release https://rpms.remirepo.net/enterprise/remi-release-8.rpm; \
yum config-manager --set-enabled PowerTools

###
### kernel-headers and tools
###
# yum -y install kernel-devel kernel-headers elfutils-libelf-devel zlib-devel gcc make cmake; \
yum -y groupinstall "Development Tools"; \
yum -y install protobuf-devel leveldb-devel snappy-devel opencv-devel boost-devel hdf5-devel; \
yum -y install gflags-devel glog-devel lmdb-devel; \
yum -y install atlas-devel; \
yum -y install python36 python3-devel; \
yum -y install tar wget net-tools zip unzip emacs nkf ImageMagick ImageMagick-devel git

###
### Disable Nouveau kernel driver
###
# grub2-editenv - set "$(grub2-editenv - list | grep kernelopts) nouveau.modeset=0"
# shutdown -r now

###
### NVIDIA driver, CUDA and TensorFlow
###
# sh ~/wrk/NVIDIA-Linux-x86_64-440.100.run

WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path '/usr/lib64/xorg/modules'; these paths were
           not queryable from the system.  If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility and the X.Org
           SDK/development package for your distribution and reinstall the driver.
                                                OK

Install NVIDIA's 32-bit compatibility libraries?
                                                Yes   

Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used
  when you restart X?  Any pre-existing X configuration file will be backed up.
                                                No

Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 440.100) is now complete.  Please update your xorg.conf file
  as appropriate; see the file /usr/share/doc/NVIDIA_GLX-1.0/README.txt for details.
                                                OK

# nvidia-smi
Mon Jul  6 11:49:46 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   51C    P0    N/A /  N/A |      0MiB /   978MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+
#

# rpm --install ~/wrk/cuda-repo-rhel8-10.2.89-1.x86_64.rpm; \
yum -y install cuda
# nvidia-smi
Mon Jul  6 12:06:59 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   48C    P0    N/A /  N/A |      0MiB /   978MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
#

# pip3 install --upgrade pip; \
pip3 install setuptools --upgrade; \
pip3 install --upgrade tensorflow; \
pip3 install --upgrade tensorflow-gpu; \
pip3 install ipython; \
rpm --install ~/wrk/libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm; \
rpm --install ~/wrk/libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm; \
rpm --install ~/wrk/libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm
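
#####
##### (Optional) Quick check of what pip actually installed. A minimal sketch, not part of the
##### original notes; tf.__version__ and tf.test.is_built_with_cuda() are standard TensorFlow APIs.
#####
# ipython
import tensorflow as tf
print(tf.__version__)                 # version pulled in by "pip3 install --upgrade tensorflow"
print(tf.test.is_built_with_cuda())   # True if this build was compiled against CUDA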

###
### .bashrc
###
# cd
# vi .bashrc
# cat .bashrc
<<<
# .bashrc

# User specific aliases and functions

# alias rm='rm -i'
# alias cp='cp -i'
# alias mv='mv -i'

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

alias ls='ls'
alias ll='ls -la'
alias e='emacs'
alias delb='rm -f *~ .??*~'

export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64/:$LD_LIBRARY_PATH"
>>>

# exit

### Log in again

# ipython

# A minimal TensorFlow check using the v1 compatibility API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
deep_learning = tf.constant('Deep Learning')
session = tf.Session()
session.run(deep_learning)      # -> b'Deep Learning'
a = tf.constant(2)
b = tf.constant(3)
multiply = tf.multiply(a, b)
session.run(multiply)           # -> 6
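
##### If the import above printed warnings about shared libraries that could not be loaded, the
##### exact names can be probed from the same ipython session. A minimal sketch, not part of the
##### original notes; the two names below are the ones addressed in the next step:
import ctypes
ctypes.CDLL('libcudart.so.10.1')   # CUDA 10.1 runtime; raises OSError while it is missing
ctypes.CDLL('libcudnn.so.7')       # cuDNN 7; raises OSError while it is missing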

##### In the run above, TensorFlow warned that older library versions (libcudart.so.10.1 and
##### libcudnn.so.7) could not be found, so install the matching toolkits as well.
#####
##### for libcudart.so.10.1
##### Any CUDA Toolkit version can be downloaded from
#####   https://developer.nvidia.com/cuda-toolkit-archive
#####
# sh ~/wrk/cuda_10.1.243_418.87.00_linux.run  <-- "Do you accept the above EULA?" -> accept
                                                  In the component selection, install only CUDA Toolkit 10.1
#####
##### for libcudnn.so.7
##### Any cuDNN version can be downloaded from
#####   https://developer.nvidia.com/rdp/cudnn-archive
#####
# rpm --install ~/wrk/libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm; \
rpm --install ~/wrk/libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm
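
#####
##### (Optional) Re-check that TensorFlow can now see the GPU. A minimal sketch, not part of the
##### original notes; tf.config.list_physical_devices and tf.debugging.set_log_device_placement
##### are standard TensorFlow 2.x APIs. An empty GPU list means some library is still not found.
#####
# ipython
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))   # expect a non-empty list once CUDA/cuDNN resolve
tf.debugging.set_log_device_placement(True)     # log which device executes each op
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
print(tf.matmul(a, b))                          # the placement log shows /device:GPU:0 when the GPU is usable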

###
### Other packages
###
# pip3 install matplotlib; \
pip3 install keras; \
pip3 install pillow; \
pip3 install opencv-python; \
pip3 install pycuda; \
pip3 install chainer; \
pip3 install cupy
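
#####
##### (Optional) Sanity check that the extra packages import and report their versions. A minimal
##### sketch, not part of the original notes; cupy assumes the CUDA paths set in .bashrc above are
##### in effect.
#####
# ipython
import matplotlib, keras, PIL, cv2, pycuda, chainer, cupy
for m in (matplotlib, keras, PIL, cv2, chainer, cupy):
    print(m.__name__, getattr(m, '__version__', 'unknown'))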

###
### MNIST
###
# cp -r /usr/src/cudnn_samples_v8/ $HOME; \
cd $HOME/cudnn_samples_v8/mnistCUDNN; \
make clean && make
# ./mnistCUDNN image=data/three_28x28.pgm | grep 'Result of classification'
Result of classification: 3
#
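
#####
##### The mnistCUDNN sample above only exercises cuDNN. For an end-to-end TensorFlow/MNIST check,
##### the following is a minimal sketch (not part of the original notes) using tf.keras and the
##### built-in MNIST loader; the first call downloads the dataset, so internet access is required.
#####
# ipython
import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A small fully connected classifier.
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)         # training time depends on whether the GPU is picked up
model.evaluate(x_test, y_test, verbose=2)     # test accuracy is typically around 0.97-0.98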

###
### TensorBoard
###
### --- server side
###
# yum -y reinstall python3-six
# firewall-cmd --zone=public --add-port=6006/tcp --permanent; \
firewall-cmd --reload
# cd
# mkdir logs
# tensorboard --logdir ./logs
###
### --- client side (use tunneling)
###
$ ssh -L 16006:localhost:6006 root@192.168.3.201
### Check http://localhost:16006
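
#####
##### TensorBoard shows nothing until event files exist under ./logs. A minimal sketch (not part
##### of the original notes) that writes a few scalar summaries there with the standard tf.summary
##### API; run it on the server, from root's home directory, in a second ssh session:
#####
# ipython
import math
import tensorflow as tf

# Write event files into the ./logs directory that "tensorboard --logdir ./logs" is watching.
writer = tf.summary.create_file_writer('./logs')
with writer.as_default():
    for step in range(100):
        tf.summary.scalar('demo/sine', math.sin(step / 10.0), step=step)
writer.flush()
##### Then reload http://localhost:16006 and the "demo/sine" curve appears under SCALARS.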

------------------------------------------------------------------------------------------------
end of line