TensorFlow/MNIST Environment Setup
--------------------------------------------------------------------------------
・Preparation
  ・What you need
    ・Hardware
      ・PC
        Diginnos / Prime PC
        CPU        : Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz
        Clock      : 3478.695 MHz
        Cache Size : 6144 KB
        CPU Cores  : 4 Core
        Memory     : 8 GB
        Hard Disk  : 500 GB
      ・GPU (used for parallel processing (GPGPU))
        Kuroutoshikou graphics board with NVIDIA GeForce GT 710, 1GB
        GF-GT710-E1GB/HS
    ・Files
      ・Bootable USB made from CentOS-8-x86_64-1905-boot.iso (CentOS 8 1905 or later)
      ・NVIDIA-Linux-x86_64-440.100.run
      ・cuda_10.1.243_418.87.00_linux.run
      ・cuda-repo-rhel8-10.2.89-1.x86_64.rpm
      ・libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm
      ・libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm
      ・libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm
      ・libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm
      ・libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm
--------------------------------------------------------------------------------
・Installation
  ・Boot the installer (T)
    ・Install CentOS 8
    ・Language: English (United States)
    ・DATE & TIME: Asia/Tokyo
    ・KEYBOARD: Japanese
    ・SOFTWARE SELECTION: Minimal Install
    ・KDUMP: disabled
    ・NETWORK & HOST NAME: Ethernet = ON; ai01.mydomain
    ・[Begin Installation]
    ・ROOT PASSWORD
    ・USER CREATION   // <-- devusr
    ・(wait for the installation to finish)
    ・Reboot
  ・Static IP (environment-dependent)
    $ su -
    # cd /etc/sysconfig/network-scripts
    # cp ifcfg-enp5s0 _ifcfg-enp5s0_
    # vi ifcfg-enp5s0
    # diff _ifcfg-enp5s0_ ifcfg-enp5s0
    4c4,7
    < BOOTPROTO="dhcp"
    ---
    > BOOTPROTO="static"
    > IPADDR="192.168.3.201"
    > NETMASK="255.255.255.0"
    > GATEWAY="192.168.3.1"
    #
  ・Reboot
    # shutdown -r now
  ・Prepare the files
    # mkdir ~/wrk
    Place the required files here:
    ・NVIDIA-Linux-x86_64-440.100.run
    ・cuda_10.1.243_418.87.00_linux.run
    ・cuda-repo-rhel8-10.2.89-1.x86_64.rpm
    ・libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm
    ・libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm
    ・libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm
    ・libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm
    ・libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm
--------------------------------------------------------------------------------
・From here on, log in from a client terminal via ssh and run the commands remotely.
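(Optional) Before continuing, the file list above can be double-checked with a short stdlib-only Python script; this is a sketch I am adding for convenience — the missing_files helper is not part of the original procedure.

```python
# Sketch: verify that the installer files listed above are present in ~/wrk.
# File names are taken from this document; adjust them to your downloads.
from pathlib import Path

REQUIRED = [
    "NVIDIA-Linux-x86_64-440.100.run",
    "cuda_10.1.243_418.87.00_linux.run",
    "cuda-repo-rhel8-10.2.89-1.x86_64.rpm",
    "libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm",
    "libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm",
    "libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm",
    "libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm",
    "libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm",
]

def missing_files(workdir):
    """Return the subset of REQUIRED that is not present under workdir."""
    d = Path(workdir)
    return [name for name in REQUIRED if not (d / name).is_file()]

if __name__ == "__main__":
    gone = missing_files(Path.home() / "wrk")
    if gone:
        print("Missing:", *gone, sep="\n  ")
    else:
        print("All installer files are in place.")
```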
--------------------------------------------------------------------------------
# yum -y update

###
### reboot
###
# shutdown -r now

###
### remi and PowerTools
###
# yum -y install epel-release https://rpms.remirepo.net/enterprise/remi-release-8.rpm; \
  yum config-manager --set-enabled PowerTools

###
### kernel-headers and tools
###
# yum -y install kernel-devel kernel-headers elfutils-libelf-devel zlib-devel gcc make cmake; \
  yum -y groupinstall "Development Tools"; \
  yum -y install protobuf-devel leveldb-devel snappy-devel opencv-devel boost-devel hdf5-devel; \
  yum -y install gflags-devel glog-devel lmdb-devel; \
  yum -y install atlas-devel; \
  yum -y install python36 python3-devel; \
  yum -y install tar wget net-tools zip unzip emacs nkf ImageMagick ImageMagick-devel git

###
### Disable Nouveau kernel driver
###
# grub2-editenv - set "$(grub2-editenv - list | grep kernelopts) nouveau.modeset=0"
# shutdown -r now

###
### NVIDIA, CUDA and TensorFlow
###
# sh ~/wrk/NVIDIA-Linux-x86_64-440.100.run

  WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64'
  and X module path '/usr/lib64/xorg/modules'; these paths were not queryable
  from the system. If X fails to find the NVIDIA X driver module, please
  install the `pkg-config` utility and the X.Org SDK/development package for
  your distribution and reinstall the driver.
    OK

  Install NVIDIA's 32-bit compatibility libraries?
    Yes

  Would you like to run the nvidia-xconfig utility to automatically update your
  X configuration file so that the NVIDIA X driver will be used when you
  restart X? Any pre-existing X configuration file will be backed up.
    No

  Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64
  (version: 440.100) is now complete. Please update your xorg.conf file as
  appropriate; see the file /usr/share/doc/NVIDIA_GLX-1.0/README.txt for
  details.
    OK

# nvidia-smi
Mon Jul  6 11:49:46 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   51C    P0    N/A /  N/A |      0MiB /   978MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+
#

# rpm --install ~/wrk/cuda-repo-rhel8-10.2.89-1.x86_64.rpm; \
  yum -y install cuda
# nvidia-smi
Mon Jul  6 12:06:59 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   48C    P0    N/A /  N/A |      0MiB /   978MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
#

# pip3 install --upgrade pip; \
  pip3 install setuptools --upgrade; \
  pip3 install --upgrade tensorflow; \
  pip3 install --upgrade tensorflow-gpu; \
  pip3 install ipython; \
  rpm --install ~/wrk/libcudnn8-8.0.1.13-1.cuda11.0.x86_64.rpm; \
  rpm --install ~/wrk/libcudnn8-devel-8.0.1.13-1.cuda11.0.x86_64.rpm; \
  rpm --install ~/wrk/libcudnn8-doc-8.0.1.13-1.cuda11.0.x86_64.rpm

###
### .bashrc
###
# cd
# vi .bashrc
# cat .bashrc
<<<
# .bashrc

# User specific aliases and functions

# alias rm='rm -i'
# alias cp='cp -i'
# alias mv='mv -i'

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

alias ls='ls'
alias ll='ls -la'
alias e='emacs'
alias delb='rm -f *~ .??*~'

export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64/:$LD_LIBRARY_PATH"
>>>
# exit

### Log in again
# ipython
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
deep_learning = tf.constant('Deep Learning')
session = tf.Session()
session.run(deep_learning)
a = tf.constant(2)
b = tf.constant(3)
multiply = tf.multiply(a, b)
session.run(multiply)

#####
##### The example above printed warnings about older shared libraries that
##### could not be found, so install the matching toolkits as well.
#####
##### for libcudart.so.10.1
##### Any CUDA Toolkit version can be downloaded from:
##### https://developer.nvidia.com/cuda-toolkit-archive
#####
# sh ~/wrk/cuda_10.1.243_418.87.00_linux.run
  <-- Do you accept the above EULA?
      accept
  Install only CUDA Toolkit 10.1 (deselect the other components)

#####
##### for libcudnn.so.7
##### Any cuDNN version can be downloaded from:
##### https://developer.nvidia.com/rdp/cudnn-archive
#####
# rpm --install ~/wrk/libcudnn7-7.6.4.38-1.cuda10.1.x86_64.rpm; \
  rpm --install ~/wrk/libcudnn7-devel-7.6.4.38-1.cuda10.1.x86_64.rpm

###
### Miscellaneous packages
###
# pip3 install matplotlib; \
  pip3 install keras; \
  pip3 install pillow; \
  pip3 install opencv-python; \
  pip3 install pycuda; \
  pip3 install chainer; \
  pip3 install cupy

###
### MNIST
###
# cp -r /usr/src/cudnn_samples_v8/ $HOME; \
  cd $HOME/cudnn_samples_v8/mnistCUDNN; \
  make clean && make
# ./mnistCUDNN image=data/three_28x28.pgm | grep 'Result of classification'
Result of classification: 3
#

###
### TensorBoard
###
### --- server side
###
# yum -y reinstall python3-six
# firewall-cmd --zone=public --add-port=6006/tcp --permanent; \
  firewall-cmd --reload
# cd
# mkdir logs
# tensorboard --logdir ./logs

###
### --- client side (use tunneling)
###
$ ssh -L 16006:localhost:6006 root@192.168.3.201
### Check http://localhost:16006
------------------------------------------------------------------------------------------------
end of line
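As a side note, the mnistCUDNN run above classifies data/three_28x28.pgm, a binary (P5) PGM image. Such files can be inspected with a stdlib-only Python sketch; the read_pgm helper below is illustrative and not part of the cuDNN sample.

```python
# Sketch: minimal reader for binary (P5) PGM images like the 28x28 MNIST
# digits shipped with the mnistCUDNN sample.
import re

def read_pgm(data: bytes):
    """Parse a binary P5 PGM; return (width, height, maxval, pixels)."""
    if not data.startswith(b"P5"):
        raise ValueError("not a binary PGM (P5) file")
    # Header: width, height, maxval as whitespace-separated integers;
    # '#' comment lines may appear between tokens.
    tokens, pos = [], 2
    while len(tokens) < 3:
        m = re.match(rb"\s*(?:#[^\n]*\n)*\s*(\d+)", data[pos:])
        if m is None:
            raise ValueError("malformed PGM header")
        tokens.append(int(m.group(1)))
        pos += m.end()
    width, height, maxval = tokens
    # Exactly one whitespace byte separates maxval from the raster data.
    pixels = data[pos + 1 : pos + 1 + width * height]
    return width, height, maxval, pixels
```

For a 28x28 digit image like three_28x28.pgm, this returns width 28, height 28, and 784 pixel bytes with intensities up to maxval.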