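[WSL2 + Docker + GPU] Setting Up an Environment for Machine Learning: Virtual Environment Edition

Synopsis

To study machine learning I want to be able to spin up all sorts of environments, so in the previous posts I installed a recent Windows OS build and WSL.
This time I use Docker to create a virtual environment.

What appears in this post, and versions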

Docker Desktop for Windows 3.5.2
NGC pytorch:21.06-py3

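Steps

Windows setup: steps 1-3 (previous posts)
Virtual environment: steps 4-5 (this post)

1. Install Windows 10 build 20150 (or later)
2. Install NVIDIA Drivers for CUDA on WSL
3. Install the CUDA Toolkit
4. Install Docker Desktop for Windows
5. Run an NGC container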

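Install Docker Desktop for Windows

I have used it a lot on a Mac, but this is my first time on Windows.
After installing and rebooting, Docker Desktop starts on its own, so work through the tutorial.

You can think of a Docker container as a PC that already has everything installed and configured. The nice part is that if you mess something up, you can simply delete it.
Think of a Docker image as the blueprint for a container.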

www.docker.com

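You can also install Docker and CUDA directly on Linux

With Docker Desktop you don't need to, but it is worth knowing there is another way.
Honestly, learning Docker is not the goal here, so pick whichever is easier for you, the command line or the GUI.
So I'll skip the details.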

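Docker settings

I think it is easier to follow if you don't switch the UI to Japanese.
Almost everything was already configured out of the box, so there wasn't much to do.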

docs.docker.com

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

[Screenshot: output of the command above.] Apparently everything is fine if the output looks like this.

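Running an NGC (NVIDIA GPU Cloud) container

The catalog site lists a lot of options. Going by the rule that the recommended item sits in the top left, I picked PyTorch. I get the feeling that TensorFlow, right next to it, is the most popular, though.

There are also several categories, such as Container and Collection, but I will probably only ever use containers.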

ngc.nvidia.com

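First, create a folder.
When you start WSL you land somewhere like /mnt/c/Users/user, but running Docker against folders under /mnt is reportedly a bad idea, so I created a folder at /home/user/pytorch.
Putting a file such as test.txt in that pytorch folder makes it easy to check things later.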
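
A minimal sketch of the folder setup described above, assuming a POSIX shell inside WSL. The function name make_project_dir and the file contents are made up for illustration. Only the pytorch folder under the home directory and test.txt come from the text.

```shell
# Hypothetical helper: create a project folder containing test.txt.
# With no argument it uses $HOME/pytorch, which is in the Linux file system.
make_project_dir() {
  project_dir="${1:-$HOME/pytorch}"
  mkdir -p "$project_dir" || return 1
  # Placeholder contents; the file is only a marker to look for later.
  echo "hello from the host" > "$project_dir/test.txt" || return 1
  # Print the folder path so the caller can reuse it.
  echo "$project_dir"
}
```

Calling make_project_dir with no argument creates test.txt in the pytorch folder under your home directory and prints the folder path.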

Best practices

To get the best out of the file system performance when bind-mounting files, we recommend storing source code and other data that is bind-mounted into Linux containers (i.e., with docker run -v <host-path>:<container-path> ) in the Linux file system, rather than the Windows file system. You can also refer to the recommendation from Microsoft.

・ Linux containers only receive file change events (“inotify events”) if the original files are stored in the Linux filesystem. For example, some web development workflows rely on inotify events for automatic reloading when files have changed.

・ Performance is much higher when files are bind-mounted from the Linux filesystem, rather than remoted from the Windows host. Therefore avoid docker run -v /mnt/c/users:/users (where /mnt/c is mounted from Windows).

・ Instead, from a Linux shell use a command like docker run -v ~/my-project:/sources <my-image> where ~ is expanded by the Linux shell to $HOME.

If you have concerns about the size of the docker-desktop-data VHDX, or need to change it, take a look at the WSL tooling built into Windows.

If you have concerns about CPU or memory usage, you can configure limits on the memory, CPU, Swap size allocated to the WSL 2 utility VM.

To avoid any potential conflicts with using WSL 2 on Docker Desktop, you must uninstall any previous versions of Docker Engine and CLI installed directly through Linux distributions before installing Docker Desktop.
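
The advice above boils down to keeping bind-mount sources off the Windows drives. Here is a small check for that, assuming WSL's default of mounting Windows drives at /mnt/c, /mnt/d and so on. Both function names are hypothetical.

```shell
# Succeed if the path is on a Windows drive as WSL mounts it by default
# (/mnt/c, /mnt/d, ...). Assumes the default automount root of /mnt.
on_windows_mount() {
  case "$1" in
    /mnt/[a-z] | /mnt/[a-z]/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print advice for a directory you plan to pass to docker run -v.
check_mount_source() {
  if on_windows_mount "$1"; then
    echo "warning: $1 is on the Windows file system; prefer a folder under \$HOME"
  else
    echo "ok: $1 is not under /mnt/<drive>"
  fi
}
```

For example, check_mount_source /mnt/c/Users/user prints the warning, while a folder under the Linux home directory passes.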

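Copy, paste, and run the command.
While you wait, it is a good time to read up on how to use PyTorch.

Note: run docker commands while Docker Desktop is running. If it isn't running, you get an error saying the docker command was not found.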
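
To catch this before a long pull, a quick pre-flight check can help. This is only a sketch: have_cmd and docker_status are hypothetical helper names and the messages are my own wording. docker info is used simply because it fails when the CLI cannot reach the engine.

```shell
# Return success if the given command exists on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether the docker CLI is present and can reach the engine.
docker_status() {
  if ! have_cmd docker; then
    echo "docker command not found: start Docker Desktop first"
    return 1
  fi
  if ! docker info >/dev/null 2>&1; then
    echo "docker CLI found, but the engine is not reachable"
    return 1
  fi
  echo "docker is ready"
}
```

Run docker_status in the WSL shell before pulling the image.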

docker pull nvcr.io/nvidia/pytorch:21.06-py3
# ...
# Output
Status: Downloaded newer image for nvcr.io/nvidia/pytorch:21.06-py3
nvcr.io/nvidia/pytorch:21.06-py3

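[Screenshot: the image has been created as expected.]

What I changed from the copied command

local_dir: the pytorch folder created earlier
container_dir: /workspace looked like the working directory, so I put pytorch under it. (-v says which folder on your PC is mounted at which path in the container. It is a shared mount rather than a copy.)
--name pytorch: gives the container a name. Optional, but it makes things easier to read.
21.06: the version I picked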

docker run --gpus all -it --rm -v /home/user/pytorch:/workspace/pytorch --name pytorch nvcr.io/nvidia/pytorch:21.06-py3
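
To make the substituted parts explicit, the command can be assembled from its four variable pieces. This is just an illustration: build_run_cmd is a hypothetical helper that prints the command for review and does not run it.

```shell
# Hypothetical helper: print the docker run command for the given
# host folder, container path, container name, and NGC release (e.g. 21.06).
build_run_cmd() {
  echo "docker run --gpus all -it --rm -v $1:$2 --name $3 nvcr.io/nvidia/pytorch:$4-py3"
}

# Print the command with the values used in this post.
build_run_cmd "$HOME/pytorch" /workspace/pytorch pytorch 21.06
```

Printing rather than executing keeps the sketch safe to try. Paste the printed line into the shell once it looks right.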

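[Screenshot: the container has started. It feels like logging in to a PC with the environment already set up.]

pytorch/test.txt is there under /workspace, as expected.

Editing and saving test.txt on my PC is reflected in test.txt inside the container,
and vice versa.
From here on, I can write code on the host and run it in the container to process it on the GPU. So, for now, this completes the environment setup.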
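
To confirm that PyTorch inside the container actually sees the GPU, a one-liner like the following can be run from the container's shell. It assumes python and torch are available there, which should be the case in the NGC PyTorch image. The function name gpu_check is made up.

```shell
# Run this inside the container, not on the host.
# Prints whether PyTorch can use CUDA.
gpu_check() {
  python -c 'import torch; print("CUDA available:", torch.cuda.is_available())'
}
```

Call gpu_check at the container prompt. If it reports False, recheck the --gpus all flag and the WSL driver setup from the earlier steps.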

Note: everything lives in the container, so you don't even need to install Python on your own PC.
Your editor may show warnings, though.

[Screenshot: the container also shows up in Docker Desktop.]

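Wrapping up

I think this completes the environment setup for machine learning. Or rather, it was really preparation for using Docker.
It looks like I can carry on without being troubled by version management or dependencies. Even if I lose interest, deleting the Docker container and image puts everything back cleanly.
Virtualization is great.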
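
For reference, the cleanup could look like the sketch below. cleanup_cmds is a hypothetical helper that only prints the two commands so they can be reviewed first. The container name and image tag are the ones used in this post.

```shell
# Hypothetical helper: print (not run) the commands that remove
# a container and an image.
cleanup_cmds() {
  echo "docker rm -f $1"
  echo "docker rmi $2"
}

cleanup_cmds pytorch nvcr.io/nvidia/pytorch:21.06-py3
```

Because the container here was started with --rm, it is already removed when it exits, so usually only the docker rmi line is needed.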

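Bonus

Whether it is PyTorch or TensorFlow, I plan to learn them through their official tutorials.
For now I will type out the code from a book I bought, and then rewrite it in PyTorch.
There is still a long way to go.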
