Getting waifu2x to run on CentOS 6.6

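(Update 2015/11/05) The title was wrong, so I fixed it.

It has been quite a while since my last update again... Drafts keep piling up without ever being turned into proper articles, so I will try to post more regularly from now on.

There is a library called waifu2x that upscales images at high quality (I don't really know how it works). I was asked to install it on a web server and get it running.

The quickest route is to launch an EC2 instance from the published public AMI, but we use CentOS and I wanted to match that, and there were also common components such as LDAP that I wanted to install. So I set everything up from scratch.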

The environment is as follows.

EC2 instance: g2.2xlarge
CentOS 6.6

I worked through it while following the Installation instructions.

Preparation

$ sudo yum install pciutils
$ sudo lspci | grep NVIDIA
00:03.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)

Installing Torch7

$ sudo yum install cmake curl readline-devel ncurses-devel \
  gcc-c++ gcc-gfortran git gnuplot unzip \
  libjpeg-turbo-devel libpng-devel \
  ImageMagick fftw-devel sox-devel sox qt-devel
$ curl -s https://raw.githubusercontent.com/torch/ezinstall/master/install-all | sudo bash

Installing https://raw.githubusercontent.com/torch/rocks/master/sundown-scm-1.rockspec...
Using https://raw.githubusercontent.com/torch/rocks/master/sundown-scm-1.rockspec... switching to 'build' mode
Initialized empty Git repository in /tmp/luarocks_sundown-scm-1-385/sundown-ffi/.git/
github.com[0: 192.30.252.129]: errno=Connection timed out
fatal: unable to connect a socket (Connection timed out)

Error: Failed cloning git repository.
Installing https://raw.githubusercontent.com/torch/rocks/master/sundown-scm-1.rockspec...
Using https://raw.githubusercontent.com/torch/rocks/master/sundown-scm-1.rockspec... switching to 'build' mode
Initialized empty Git repository in /tmp/luarocks_sundown-scm-1-4674/sundown-ffi/.git/
github.com[0: 192.30.252.129]: errno=Connection timed out
fatal: unable to connect a socket (Connection timed out)

Error: Failed cloning git repository.
Error. Exiting.
ERROR: Torch install returned an error. Installation may be incomplete.

An error occurred. git clone was trying to use the git protocol (port 9418), but that port is closed here, so the connection timed out. The fix is to have git rewrite git:// to https://.

# $HOME/.gitconfig
[url "https://"]
    insteadOf = git://
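
The same rewrite rule can also be set from the command line instead of editing the file by hand. The following is a minimal sketch: the helper names `set_git_https_rewrite` and `show_git_https_rewrite` are made up for illustration, and it assumes the rule has to live in the config of whichever user actually runs git (which may be root when the installer is piped into `sudo bash`).

```shell
# Hypothetical helpers (names are illustrative, not part of git or Torch).

# Write the git:// -> https:// rewrite rule into a git config file.
# The file defaults to the current user's ~/.gitconfig.
set_git_https_rewrite() {
  cfg="${1:-$HOME/.gitconfig}"
  git config --file "$cfg" url."https://".insteadOf git://
}

# Print the prefix that is currently rewritten to https://, if any.
show_git_https_rewrite() {
  cfg="${1:-$HOME/.gitconfig}"
  git config --file "$cfg" --get url."https://".insteadOf
}

# Usage:
#   set_git_https_rewrite
#   show_git_https_rewrite    # prints: git://
```

Reading the value back before re-running the installer is a cheap way to confirm that the rule landed in the file git will actually read.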

Run it again

$ curl -s https://raw.githubusercontent.com/torch/ezinstall/master/install-all | sudo bash

(...a long build follows...)

=> Torch7 has been installed successfully

  + Extra packages have been installed as well:
     $ luarocks list

  + To install more packages, do:
     $ luarocks search --all
     $ luarocks install PKG_NAME

  + Note: on MacOS, it's a good idea to install GCC 5 to enable OpenMP.
     You can do this by with brew
      $ brew install gcc --without-multilib
     type the following lines before running the installation script
      export CC=gcc-5
      export CXX=g++-5
     For installing cunn, you will need instead the default AppleClang compiler,
     which means you should open a new terminal (with unexported CC and CXX) and
      luarocks install cunn

  + packages installed:
    - sundown   :  ok
    - cwrap     :  ok
    - paths     :  ok
    - torch     :  ok
    - nn        :  ok
    - dok       :  ok
    - gnuplot   :  ok
    - qtlua     :  ok
    - qttorch   :  ok
    - lfs       :  ok
    - penlight  :  ok
    - sys       :  ok
    - xlua      :  ok
    - image     :  ok
    - optim     :  ok
    - cjson     :  ok
    - trepl     :  ok

INFO: Torch installed successfully.

$ which th
/usr/local/bin/th

$ th

  ______             __   |  Torch7
 /_  __/__  ________/ /   |  Scientific computing for Lua.
  / / / _ \/ __/ __/ _ \  |  Type ? for help
 /_/  \___/_/  \__/_//_/  |  https://github.com/torch
                          |  http://torch.ch

th> exit
Do you really want to exit ([y]/n)? y

Installing CUDA 7.0

This was the part that took the longest...

$ wget http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-7.0-28.x86_64.rpm
$ sudo rpm -ivh cuda-repo-rhel6-7.0-28.x86_64.rpm

# /etc/yum.repos.d/linuxtech.repo
[linuxtech]
name=LinuxTECH
baseurl=http://pkgrepo.linuxtech.net/el6/release/
enabled=1
gpgcheck=1
gpgkey=http://pkgrepo.linuxtech.net/el6/release/RPM-GPG-KEY-LinuxTECH.NET

$ sudo yum install libvdpau
$ sudo yum install cuda

Adding the shared libraries to the library path

# /etc/ld.so.conf
include ld.so.conf.d/*.conf

# /etc/ld.so.conf.d/cuda.conf
/usr/local/cuda-7.0/lib64

$ sudo ldconfig

$ sudo ldconfig -v | grep cuda
/usr/local/cuda-7.0/lib64:
        libcudart.so.7.0 -> libcudart.so.7.0.28
        libcuda.so.1 -> libcuda.so.346.46
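
For a provisioning script, the drop-in file can be written with a small helper rather than an editor. This is only a sketch: `write_ld_conf` is a hypothetical name, and the optional third argument exists so the function can be tried against a scratch directory before touching /etc.

```shell
# Hypothetical helper: register a library directory with the dynamic linker
# by writing a one-line drop-in file. conf_dir defaults to /etc/ld.so.conf.d.
write_ld_conf() {
  lib_dir="$1"
  name="$2"
  conf_dir="${3:-/etc/ld.so.conf.d}"
  printf '%s\n' "$lib_dir" > "$conf_dir/$name.conf"
}

# Usage (as root), followed by a cache rebuild:
#   write_ld_conf /usr/local/cuda-7.0/lib64 cuda
#   ldconfig
```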

Adding the CUDA commands to PATH

I want the path set for all users, including root, so I add it to /etc/bashrc.

# /etc/bashrc
# append at the end
export PATH=$PATH:/usr/local/cuda-7.0/bin

$ source ~/.bashrc

$ which nvcc
/usr/local/cuda-7.0/bin/nvcc
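
If this step is scripted and may run more than once, it helps to append the export line only when it is not already there. A minimal sketch, assuming a hypothetical helper named `append_line_once`:

```shell
# Hypothetical helper: append a line to a file only if that exact line is
# not already present, so repeated runs do not duplicate it.
append_line_once() {
  line="$1"
  file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root); single quotes keep $PATH unexpanded in the file:
#   append_line_once 'export PATH=$PATH:/usr/local/cuda-7.0/bin' /etc/bashrc
```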

Verifying the CUDA commands

$ /usr/local/cuda-7.0/bin/cuda-install-samples-7.0.sh .
$ cd NVIDIA_CUDA-7.0_Samples
$ make
$ ./1_Utilities/deviceQuery/deviceQuery
./1_Utilities/deviceQuery/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

FATAL: Module nvidia not found.
cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL

Argh. Apparently the NVIDIA driver is missing. Starting over.

$ sudo yum remove "nvidia*"
$ sudo yum remove "cuda*"

This time I will read the documentation and do it properly.

$ lspci | grep -i nvidia
00:03.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)

$ uname -m && cat /etc/*release
x86_64
CentOS release 6.6 (Final)
CentOS release 6.6 (Final)
CentOS release 6.6 (Final)

$ gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ wget http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-7.0-28.x86_64.rpm

$ md5sum cuda-repo-rhel6-7.0-28.x86_64.rpm
e24998c62a83e47a60c89b41914577f0  cuda-repo-rhel6-7.0-28.x86_64.rpm

$ curl -s http://developer.download.nvidia.com/compute/cuda/7_0/Prod/md5sum-7.0.txt | grep e24998c62a83e47a60c89b41914577f0
e24998c62a83e47a60c89b41914577f0  cuda-repo-rhel6-7.0-28.x86_64.rpm
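
The manual comparison above can be wrapped in a function that fails loudly on a mismatch. This is a sketch only; `verify_md5` is a hypothetical name, and the expected digest is whatever value the published checksum list gives for the file.

```shell
# Hypothetical helper: succeed only if the file's MD5 digest equals the
# expected hex string (for example, the value listed in md5sum-7.0.txt).
verify_md5() {
  file="$1"
  expected="$2"
  actual=$(md5sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Usage:
#   verify_md5 cuda-repo-rhel6-7.0-28.x86_64.rpm e24998c62a83e47a60c89b41914577f0 \
#     && echo "checksum OK" || echo "checksum MISMATCH"
```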

$ sudo /usr/bin/nvidia-uninstall
$ sudo rpm -ivh cuda-repo-rhel6-7.0-28.x86_64.rpm
$ sudo yum clean expire-cache
$ sudo yum install cuda

Looking at the install log, it was complaining that the kernel and kernel-devel versions differ.

DKMS: add completed.
Error! echo
Your kernel headers for kernel 2.6.32-504.12.2.el6.x86_64 cannot be found at
/lib/modules/2.6.32-504.12.2.el6.x86_64/build or /lib/modules/2.6.32-504.12.2.el6.x86_64/source.
Error! echo
Your kernel headers for kernel 2.6.32-504.12.2.el6.x86_64 cannot be found at
/lib/modules/2.6.32-504.12.2.el6.x86_64/build or /lib/modules/2.6.32-504.12.2.el6.x86_64/source.
warning: %post(nvidia-kmod-1:346.46-2.el6.x86_64) scriptlet failed, exit status 1

kernel-devel is pulled in as a dependency when CUDA is installed, but its version apparently differs from that of the running kernel. I will try aligning the kernel and kernel-devel versions.

Starting over once more.
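
Before installing CUDA, it is possible to check up front whether headers exist for the running kernel, using the same two paths that the DKMS error names. A minimal sketch, assuming a hypothetical helper `kernel_headers_present`; the second argument only exists so the check can be pointed at a test directory.

```shell
# Hypothetical helper: report whether kernel headers exist for a given
# kernel release. modules_root defaults to /lib/modules; the DKMS error
# above looked for the "build" or "source" entry under it.
# [ -e ] follows symlinks, so a dangling "build" link counts as missing.
kernel_headers_present() {
  release="${1:-$(uname -r)}"
  modules_root="${2:-/lib/modules}"
  [ -e "$modules_root/$release/build" ] || [ -e "$modules_root/$release/source" ]
}

# Usage:
#   kernel_headers_present && echo "headers found" \
#     || echo "install kernel-devel-$(uname -r) first"
```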

$ sudo yum remove "nvidia*"
$ sudo yum remove "cuda*"

$ sudo yum install kernel-devel-`uname -r`
Loaded plugins: fastestmirror
Setting up Install Process
Loading mirror speeds from cached hostfile
 * base: ftp.riken.jp
 * epel: ftp.iij.ad.jp
 * extras: ftp.riken.jp
 * updates: ftp.riken.jp
No package kernel-devel-2.6.32-504.12.2.el6.x86_64 available.
Error: Nothing to do

It is not available from the repositories, so I fetch the RPM and install it directly.

$ wget ftp://mirror.switch.ch/pool/4/mirror/scientificlinux/6.6/x86_64/updates/security/kernel-devel-2.6.32-504.12.2.el6.x86_64.rpm
$ sudo rpm -ivh kernel-devel-2.6.32-504.12.2.el6.x86_64.rpm

$ rpm -qa | grep kernel-
kernel-2.6.32-504.12.2.el6.x86_64
kernel-headers-2.6.32-504.12.2.el6.x86_64
kernel-firmware-2.6.32-504.12.2.el6.noarch
kernel-devel-2.6.32-504.12.2.el6.x86_64

The versions now match. Install CUDA once more.

$ sudo yum install cuda

Check that it actually works.

$ sudo modprobe nvidia
$ echo $?  # 0
# if this errors, reboot and try again

$ sudo modprobe nvidia-uvm
$ echo $?  # 0
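
Besides checking the exit status of modprobe, the loaded modules can be confirmed from /proc/modules. The sketch below assumes a hypothetical helper `module_loaded`; the optional second argument lets it read a saved copy of the module list instead of the live one.

```shell
# Hypothetical helper: check whether a kernel module is listed as loaded.
# Hyphens in module names appear as underscores in /proc/modules, so
# nvidia-uvm is listed there as nvidia_uvm.
module_loaded() {
  name=$(printf '%s' "$1" | tr '-' '_')
  modules_file="${2:-/proc/modules}"
  # The first field of each line is the module name; exit 0 only on a match.
  awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}

# Usage:
#   module_loaded nvidia && module_loaded nvidia-uvm && echo "driver modules loaded"
```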

$ nvidia-smi
Wed Sep  2 20:27:37 2015
+------------------------------------------------------+
| NVIDIA-SMI 346.46     Driver Version: 346.46         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520           Off  | 0000:00:03.0     Off |                  N/A |
| N/A   53C    P0     0W / 125W |     10MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

It works!!

$ /usr/local/cuda-7.0/bin/cuda-install-samples-7.0.sh .
$ cd NVIDIA_CUDA-7.0_Samples
$ make
$ ./1_Utilities/deviceQuery/deviceQuery
./1_Utilities/deviceQuery/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GRID K520"
  CUDA Driver Version / Runtime Version          7.0 / 7.0
  CUDA Capability Major/Minor version number:    3.0
  Total amount of global memory:                 4096 MBytes (4294770688 bytes)
  ( 8) Multiprocessors, (192) CUDA Cores/MP:     1536 CUDA Cores
  GPU Max Clock rate:                            797 MHz (0.80 GHz)
  Memory Clock rate:                             2500 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 524288 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 3
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.0, CUDA Runtime Version = 7.0, NumDevs = 1, Device0 = GRID K520
Result = PASS
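
When this check is part of an automated setup, the final line of the output is enough to decide whether to continue. A small sketch, with `device_query_passed` as a hypothetical helper name:

```shell
# Hypothetical helper: read deviceQuery output on stdin and succeed only
# if it contains the exact line "Result = PASS".
device_query_passed() {
  grep -qx 'Result = PASS'
}

# Usage:
#   ./1_Utilities/deviceQuery/deviceQuery | device_query_passed \
#     && echo "GPU is usable"
```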

This one passed too! That was a struggle...

Installing the Lua modules

$ sudo /usr/local/bin/luarocks install cutorch
$ sudo /usr/local/bin/luarocks install cunn

$ wget ftp://rpmfind.net/linux/epel/6/x86_64/GraphicsMagick-1.3.20-3.el6.x86_64.rpm
$ sudo rpm -ivh GraphicsMagick-1.3.20-3.el6.x86_64.rpm
$ wget ftp://195.220.108.108/linux/epel/6/x86_64/GraphicsMagick-devel-1.3.20-3.el6.x86_64.rpm
$ sudo rpm -ivh GraphicsMagick-devel-1.3.20-3.el6.x86_64.rpm
$ sudo /usr/local/bin/luarocks install graphicsmagick

Installing LuaJIT

$ lua -v
Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio

$ wget http://luajit.org/download/LuaJIT-2.0.4.tar.gz
$ tar zxvf LuaJIT-2.0.4.tar.gz
$ cd LuaJIT-2.0.4
$ make
$ sudo make install

That completes the installation.

Trying waifu2x

$ git clone https://github.com/nagadomi/waifu2x.git
$ cd waifu2x
$ th waifu2x.lua
images/miku_small(noise_scale).png: 35.13604593277 sec
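
To process more than the bundled sample image, a dry-run helper that only prints the commands can be handy. This sketch assumes the `-m`, `-i` and `-o` options described in the waifu2x README (they may differ between versions), and `print_waifu2x_cmds` is a hypothetical name. It does not quote paths, so it is only suitable for file names without spaces.

```shell
# Hypothetical helper: print (not run) one waifu2x upscaling command for
# each PNG in in_dir, writing results under out_dir with the same file name.
# Assumes the -m/-i/-o options from the waifu2x README.
print_waifu2x_cmds() {
  in_dir="$1"
  out_dir="$2"
  for f in "$in_dir"/*.png; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when no PNG exists
    printf 'th waifu2x.lua -m scale -i %s -o %s\n' "$f" "$out_dir/$(basename "$f")"
  done
}

# Usage: review the printed commands, then run them from the waifu2x checkout:
#   print_waifu2x_cmds /path/to/in /path/to/out | sh
```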

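It works!

Closing thoughts

I assumed this would be easy if I just followed the steps, but CUDA turned out to be tricky and gave me more trouble than expected. In fact, CUDA 7.5 was released recently, and installing via Yum now pulls in 7.5, which not only failed to work but even caused a kernel panic. I will write up the workaround separately.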
