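Install the NVIDIA CUDA driver on the host server first.

Image preparation

Download the image (either with docker pull or by downloading the .tar file directly), and make sure the selected ARCH is amd64.

If you downloaded the .tar file, import it with docker load -i.

Once the image has been downloaded, use sudo privileges to add the following to /etc/docker/daemon.json (create the file first if it does not exist):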

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
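If /etc/docker/daemon.json already exists, the runtime entry has to be merged into it rather than pasted over it. The following is a minimal sketch of that merge, not an official Docker tool; the function name add_nvidia_runtime and the sample "log-driver" setting are assumptions made for illustration.

```python
import copy
import json

# Runtime entry copied from the daemon.json snippet above.
NVIDIA_RUNTIME = {
    "path": "/usr/bin/nvidia-container-runtime",
    "runtimeArgs": [],
}


def add_nvidia_runtime(config: dict) -> dict:
    """Return a copy of a daemon.json mapping with the nvidia runtime added."""
    merged = copy.deepcopy(config)
    # Keep any runtimes that are already registered.
    runtimes = merged.setdefault("runtimes", {})
    runtimes["nvidia"] = copy.deepcopy(NVIDIA_RUNTIME)
    return merged


if __name__ == "__main__":
    # Hypothetical existing settings; a real file may hold other keys.
    existing = {"log-driver": "json-file"}
    print(json.dumps(add_nvidia_runtime(existing), indent=4))
```

The function works on an in-memory dict only, so reading and writing the real file (which needs root) stays a separate, explicit step.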

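Creating the container

Run:

docker run --restart=on-failure --runtime=nvidia --network host -it {{your image name}} /usr/bin/sh

Replace {{your image name}} with the image name, which docker image ls will show. Note that it must be the name plus the tag, written for example as nvidia/cuda:11.7.0-cudnn8-devel-ubuntu20.04.

After entering the container, run:

nvidia-smi

If nvidia-smi prints its GPU status table, you have successfully created a docker container that can run CUDA.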
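The name-plus-tag requirement is easy to get wrong, so here is a small sketch that assembles the docker run arguments used in this section and rejects an image reference without a tag. The helper name docker_run_command is hypothetical, and the tag check is a simplification that does not cover every valid image reference (digests, for instance).

```python
import shlex


def docker_run_command(image: str) -> list:
    """Build the docker run argument list from this section for a tagged image."""
    name, sep, tag = image.rpartition(":")
    # Require an explicit NAME:TAG; a "/" after the last ":" means the colon
    # belonged to a registry port, not to a tag.
    if not sep or not name or not tag or "/" in tag:
        raise ValueError(f"expected NAME:TAG, got {image!r}")
    return [
        "docker", "run",
        "--restart=on-failure",
        "--runtime=nvidia",
        "--network", "host",
        "-it", image,
        "/usr/bin/sh",
    ]


if __name__ == "__main__":
    cmd = docker_run_command("nvidia/cuda:11.7.0-cudnn8-devel-ubuntu20.04")
    # Print the command as a single shell-quoted line.
    print(shlex.join(cmd))
```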

Build preparation

Install the required software. First, of course, switch the apt sources to a mirror:

sed -i 's/archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list

Then install the software:

apt update
apt upgrade
apt update
apt-get install -y python3 python3-dev python3-setuptools gcc libtinfo-dev zlib1g-dev build-essential libedit-dev libxml2-dev libssl-dev unzip vim wget git

Install cmake-3.20

wget https://github.com/Kitware/CMake/releases/download/v3.20.0/cmake-3.20.0.tar.gz
tar -xvf cmake-3.20.0.tar.gz
cd cmake-3.20.0
./configure
make -j6
make install
cp bin/cmake /usr/bin/

When it finishes, run cmake -version to check that the installation succeeded.
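The version check can also be automated. The sketch below parses the first line that cmake prints and compares it with the 3.20.0 release installed above; the function names are illustrative, and the minimum is simply the version used in this note, not a requirement taken from TVM's documentation.

```python
import re


def parse_cmake_version(output: str) -> tuple:
    """Extract (major, minor, patch) from cmake's version output."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no cmake version found in output")
    return tuple(int(part) for part in match.groups())


def is_at_least(output: str, minimum=(3, 20, 0)) -> bool:
    """True if the reported version is at least `minimum`."""
    return parse_cmake_version(output) >= minimum


if __name__ == "__main__":
    sample = "cmake version 3.20.0\n\nCMake suite maintained and supported by Kitware (kitware.com/cmake).\n"
    print(parse_cmake_version(sample), is_at_least(sample))
```

On a machine with cmake installed, the text to parse could come from subprocess.run(["cmake", "--version"], capture_output=True, text=True).stdout.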

Install llvm

Run:

apt install lsb-release wget software-properties-common gnupg -y
wget https://apt.llvm.org/llvm.sh
chmod u+x llvm.sh
./llvm.sh 14 all

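Building `tvm`

Either clone from GitHub:

git clone --recursive https://github.com/apache/tvm tvm

Or download the .zip and copy it from the local machine into the docker container to unzip it.

On the local machine, run docker cp ~/tvm.zip {{container id}}:/root/, replacing {{container id}} with the id of the container, which docker ps -a will show.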

Then enter the docker container and run:
```
unzip tvm.zip
```

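This produces a tvm folder, which is then built.

Note: download the v0.1.0 version, otherwise it may not run properly.

unzip tvm.zip  # only needed if you downloaded the .zip and have not unpacked it yet
cd tvm
mkdir -p build
cd build
cp ../cmake/config.cmake .  # copy the default config into the build directory
vim config.cmake  # edit the configuration here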

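Make the following changes:

Change set(USE_CUDA OFF) to set(USE_CUDA ON) to enable the CUDA backend; to use another backend such as OpenGL, enable the corresponding option instead.

For LLVM, change set(USE_LLVM OFF) to set(USE_LLVM ON).

For IR debugging, set set(USE_RELAY_DEBUG ON) and also set the environment variable TVM_LOG_DEBUG:

export TVM_LOG_DEBUG="ir/transform.cc=1,relay/ir/transform.cc=1"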
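The TVM_LOG_DEBUG value above reads as a comma-separated list of file=level pairs. The sketch below splits such a string into a dictionary so that a typo is caught before a long build or run. It assumes only the shape visible in this example and is not TVM's own parser; the name parse_log_debug is made up.

```python
def parse_log_debug(spec: str) -> dict:
    """Split a 'file=level,file=level' string into {file: level}."""
    levels = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last "=" so the file path is kept whole.
        name, sep, level = entry.rpartition("=")
        if not sep or not name:
            raise ValueError(f"expected file=level, got {entry!r}")
        levels[name] = int(level)
    return levels


if __name__ == "__main__":
    print(parse_log_debug("ir/transform.cc=1,relay/ir/transform.cc=1"))
```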

Then reload the environment:

source ~/.bashrc
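The config.cmake edits described above can also be scripted instead of made by hand in vim. This is a rough sketch that rewrites a set(NAME value) line in the file's text; the function name set_cmake_option and the two-line sample are assumptions, and the sample is a trimmed stand-in for the real config.cmake, not a copy of it.

```python
import re


def set_cmake_option(text: str, name: str, value: str) -> str:
    """Rewrite `set(NAME <old>)` to `set(NAME value)` in config.cmake text."""
    pattern = re.compile(
        r"^([ \t]*set\(\s*" + re.escape(name) + r"\s+)[^\s)]+(\s*\))",
        re.MULTILINE,
    )
    # Lines starting with "#" are not matched, so commented examples stay as-is.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(2), text
    )
    if count == 0:
        raise KeyError(f"{name} not found")
    return new_text


if __name__ == "__main__":
    # Trimmed, hypothetical excerpt standing in for the real config.cmake.
    sample = "set(USE_CUDA OFF)\nset(USE_LLVM OFF)\n"
    for option in ("USE_CUDA", "USE_LLVM"):
        sample = set_cmake_option(sample, option, "ON")
    print(sample)
```

Raising KeyError when an option is missing makes a misspelled option name fail loudly instead of silently leaving the file unchanged.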

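Then, in the build folder, run:

cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j8

Wait for libtvm.so to finish building.

To set up the python package, set the PYTHONPATH environment variable to tell python where to find the library. For example, if tvm was cloned into /path/to/tvm, add the following lines to ~/.bashrc. After you pull new code and rebuild the project, the changes take effect immediately (there is no need to call setup again).

export TVM_HOME=/path/to/tvm
export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}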
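To make the effect of the two export lines concrete, here is a small sketch that computes the resulting PYTHONPATH string. The helper name prepend_tvm_python is hypothetical. Unlike the shell line, it also drops empty entries and avoids adding the same directory twice, which is a design choice of this sketch.

```python
def prepend_tvm_python(tvm_home: str, pythonpath: str = "") -> str:
    """Mirror `export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}`."""
    tvm_python = tvm_home.rstrip("/") + "/python"
    # Drop empty entries and any earlier copy of the same directory.
    entries = [p for p in pythonpath.split(":") if p and p != tvm_python]
    return ":".join([tvm_python] + entries)


if __name__ == "__main__":
    print(prepend_tvm_python("/path/to/tvm", "/opt/other/lib"))
```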

Then, in the tvm directory, run:

cd python; python3 setup.py install --user; cd ..

Bug
Installing scipy fails with an error demanding python 3.9 or newer, so some packages have to be installed manually:

apt install -y pip
pip uninstall numpy
pip install "numpy<=1.23" decorator attrs psutil 'xgboost<1.6.0' cloudpickle ml_dtypes scipy

Enable the C++ tests (note that this is done in the ~ directory):

git clone https://github.com/google/googletest
cd googletest
mkdir build
cd build
cmake -DBUILD_SHARED_LIBS=ON ..
make -j6
make install

Then, in the tvm directory, run:

make cpptest -j6

to build them. If no errors are reported, the build succeeded.

Testing

See the "Compile PyTorch Models" test.

Run:

pip install torch==1.7.0
pip install torchvision==0.8.1

Then create a file torch_test.py with the following content:

import time
import tvm
from tvm import relay
import numpy as np
from tvm.contrib.download import download_testdata
import torch
import torchvision
from scipy.special import softmax
# device = torch.device("cpu")
model_name = "resnet18"
model = getattr(torchvision.models, model_name)(pretrained=True)
model = model.eval()

# We grab the TorchScripted model via tracing
input_shape = [1, 3, 224, 224]
input_data = torch.randn(input_shape)
scripted_model = torch.jit.trace(model, input_data).eval()

from PIL import Image

img_url = "https://github.com/dmlc/mxnet.js/blob/main/data/cat.png?raw=true"
img_path = download_testdata(img_url, "cat.png", module="data")
print(img_path)
img = Image.open(img_path).resize((224, 224))

# Preprocess the image and convert to tensor
from torchvision import transforms


my_preprocess = transforms.Compose(
    [
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)
img = my_preprocess(img)
img = np.expand_dims(img, 0)

######################################################################
# Import the graph to Relay
# -------------------------
# Convert PyTorch graph to Relay graph. The input name can be arbitrary.
input_name = "input0"
shape_list = [(input_name, img.shape)]
mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)

######################################################################
# Relay Build
# -----------
# Compile the graph to llvm target with given input specification.
target = "llvm"
target_host = "llvm"
dev = tvm.cpu(0)
with tvm.transform.PassContext(opt_level=7):
    lib = relay.build(mod, target=target, target_host=target_host, params=params)

######################################################################
# Execute the portable graph on TVM
# ---------------------------------
# Now we can try deploying the compiled model on target.
from tvm.contrib import graph_executor

m = graph_executor.GraphModule(lib["default"](dev))

tvm_time_spent=[]
torch_time_spent=[]
n_warmup=5
n_time=10
# tvm_t0 = time.process_time()
for i in range(n_warmup+n_time):
    dtype = "float32"
    # Set inputs
    m.set_input(input_name, tvm.nd.array(img.astype(dtype)))
    tvm_t0 = time.time()
    # Execute
    m.run()
    # Get outputs
    tvm_output = m.get_output(0)
    tvm_time_spent.append(time.time() - tvm_t0)
# tvm_t1 = time.process_time()

#####################################################################
# Look up synset name
# -------------------
# Look up prediction top 1 index in 1000 class synset.
synset_url = "".join(
    [
        "https://raw.githubusercontent.com/Cadene/",
        "pretrained-models.pytorch/master/data/",
        "imagenet_synsets.txt",
    ]
)
synset_name = "imagenet_synsets.txt"
synset_path = download_testdata(synset_url, synset_name, module="data")
with open(synset_path) as f:
    synsets = f.readlines()

synsets = [x.strip() for x in synsets]
splits = [line.split(" ") for line in synsets]
key_to_classname = {spl[0]: " ".join(spl[1:]) for spl in splits}

class_url = "".join(
    [
        "https://raw.githubusercontent.com/Cadene/",
        "pretrained-models.pytorch/master/data/",
        "imagenet_classes.txt",
    ]
)
class_name = "imagenet_classes.txt"
class_path = download_testdata(class_url, class_name, module="data")
with open(class_path) as f:
    class_id_to_key = f.readlines()

class_id_to_key = [x.strip() for x in class_id_to_key]

# Get top-1 result for TVM
top1_tvm = np.argmax(tvm_output.asnumpy()[0])
tvm_class_key = class_id_to_key[top1_tvm]

# Convert input to PyTorch variable and get PyTorch result for comparison
# torch_t0 = time.process_time()
# torch.set_num_threads(1)
for i in range(n_warmup+n_time):
    with torch.no_grad():
        torch_img = torch.from_numpy(img)
        torch_t0 = time.time()
        output = model(torch_img)
        torch_time_spent.append(time.time() - torch_t0)
        # Get top-1 result for PyTorch
        top1_torch = np.argmax(output.numpy())
        torch_class_key = class_id_to_key[top1_torch]
# torch_t1 = time.process_time()

# tvm_time = tvm_t1 - tvm_t0
# torch_time = torch_t1 - torch_t0
tvm_time = np.mean(tvm_time_spent[n_warmup:]) * 1000
torch_time = np.mean(torch_time_spent[n_warmup:]) * 1000
tvm_output_prob = softmax(tvm_output.asnumpy())
output_prob = softmax(output.numpy())
print("Relay top-1 id: {}, class name: {}, class probability: {}".format(top1_tvm, key_to_classname[tvm_class_key], tvm_output_prob[0][top1_tvm]))
print("Torch top-1 id: {}, class name: {}, class probability: {}".format(top1_torch, key_to_classname[torch_class_key], output_prob[0][top1_torch]))
print('Relay time(ms): {:.3f}'.format(tvm_time))
print('Torch time(ms): {:.3f}'.format(torch_time))
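The script times both runtimes the same way: it runs n_warmup + n_time iterations, discards the warm-up samples, and reports the mean of the rest in milliseconds. That pattern can be pulled out into a helper, sketched below. The name benchmark_ms is hypothetical, and time.perf_counter is swapped in for the script's time.time because it is intended for measuring short intervals.

```python
import time
from statistics import mean


def benchmark_ms(fn, n_warmup=5, n_time=10):
    """Call fn n_warmup + n_time times; return the mean latency in
    milliseconds over the last n_time calls only."""
    samples = []
    for _ in range(n_warmup + n_time):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    # The first n_warmup samples absorb one-time costs such as cache warm-up.
    return mean(samples[n_warmup:]) * 1000


if __name__ == "__main__":
    # Stand-in workload; in the script above this would wrap m.run().
    print("{:.3f} ms".format(benchmark_ms(lambda: sum(range(100000)))))
```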

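Note that the documentation's version of the code includes import set_env.

set_env sets up a temporary Python environment for the current run. To use it, add the following function before running the code:

def set_env(num, current_path='.'):
    '''
    num is the level of the ancestor directory, counted up from the parent of current_path, that is used as the root
    '''
    import sys
    from pathlib import Path

    ROOT = Path(current_path).resolve().parents[num]
    sys.path.extend([str(ROOT/'src')]) # set up the `tvm_book` environment
    from tvm_book.config.env import set_tvm
    # set up the TVM environment (TVM_ROOT is not defined in this snippet and must be supplied)
    set_tvm(TVM_ROOT)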

set_tvm has to be configured by yourself to suit your machine.
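The num argument of set_env indexes Path.parents, which can be surprising because index 0 is already the parent directory, not the path itself. A short sketch with a made-up directory layout shows the mapping. PurePosixPath is used so the example needs no real files, whereas set_env resolves a real path first.

```python
from pathlib import PurePosixPath


def ancestor(path: str, num: int) -> str:
    """Return the ancestor `num` levels above the parent of an absolute path."""
    return str(PurePosixPath(path).parents[num])


if __name__ == "__main__":
    # Hypothetical layout, chosen only to illustrate the indexing.
    here = "/work/tvm_book/doc/chapter"
    for num in range(3):
        print(num, ancestor(here, num))
```

With this layout, num=1 would make ROOT equal to /work/tvm_book, so ROOT/'src' is /work/tvm_book/src.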

If you use CUDA, the code is:

import time
import tvm
from tvm import relay
import numpy as np
from tvm.contrib.download import download_testdata
import torch
import torchvision
from scipy.special import softmax
# device = torch.device("cpu")
model_name = "resnet18"
model = getattr(torchvision.models, model_name)(pretrained=True)
model = model.eval()

# We grab the TorchScripted model via tracing
input_shape = [1, 3, 224, 224]
input_data = torch.randn(input_shape)
scripted_model = torch.jit.trace(model, input_data).eval()

from PIL import Image

img_url = "https://github.com/dmlc/mxnet.js/blob/main/data/cat.png?raw=true"
img_path = download_testdata(img_url, "cat.png", module="data")
print(img_path)
img = Image.open(img_path).resize((224, 224))

# Preprocess the image and convert to tensor
from torchvision import transforms


my_preprocess = transforms.Compose(
    [
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)
img = my_preprocess(img)
img = np.expand_dims(img, 0)

######################################################################
# Import the graph to Relay
# -------------------------
# Convert PyTorch graph to Relay graph. The input name can be arbitrary.
input_name = "input0"
shape_list = [(input_name, img.shape)]
mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)

######################################################################
# Relay Build
# -----------
target = "cuda"
target_host = "llvm"
dev = tvm.gpu(0)
with tvm.transform.PassContext(opt_level=7):
    lib = relay.build(mod, target=target, target_host=target_host, params=params)

######################################################################
# Execute the portable graph on TVM
# ---------------------------------
# Now we can try deploying the compiled model on target.
from tvm.contrib import graph_executor

m = graph_executor.GraphModule(lib["default"](dev))

tvm_time_spent=[]
torch_time_spent=[]
n_warmup=5
n_time=10
# tvm_t0 = time.process_time()
for i in range(n_warmup+n_time):
    dtype = "float32"
    # Set inputs
    m.set_input(input_name, tvm.nd.array(img.astype(dtype)))
    tvm_t0 = time.time()
    # Execute
    m.run()
    # Get outputs
    tvm_output = m.get_output(0)
    tvm_time_spent.append(time.time() - tvm_t0)
# tvm_t1 = time.process_time()

#####################################################################
# Look up synset name
# -------------------
# Look up prediction top 1 index in 1000 class synset.
synset_url = "".join(
    [
        "https://raw.githubusercontent.com/Cadene/",
        "pretrained-models.pytorch/master/data/",
        "imagenet_synsets.txt",
    ]
)
synset_name = "imagenet_synsets.txt"
synset_path = download_testdata(synset_url, synset_name, module="data")
with open(synset_path) as f:
    synsets = f.readlines()

synsets = [x.strip() for x in synsets]
splits = [line.split(" ") for line in synsets]
key_to_classname = {spl[0]: " ".join(spl[1:]) for spl in splits}

class_url = "".join(
    [
        "https://raw.githubusercontent.com/Cadene/",
        "pretrained-models.pytorch/master/data/",
        "imagenet_classes.txt",
    ]
)
class_name = "imagenet_classes.txt"
class_path = download_testdata(class_url, class_name, module="data")
with open(class_path) as f:
    class_id_to_key = f.readlines()

class_id_to_key = [x.strip() for x in class_id_to_key]

# Get top-1 result for TVM
top1_tvm = np.argmax(tvm_output.asnumpy()[0])
tvm_class_key = class_id_to_key[top1_tvm]

# Convert input to PyTorch variable and get PyTorch result for comparison
# torch_t0 = time.process_time()
# torch.set_num_threads(1)
for i in range(n_warmup+n_time):
    with torch.no_grad():
        torch_img = torch.from_numpy(img)
        torch_t0 = time.time()
        output = model(torch_img)
        torch_time_spent.append(time.time() - torch_t0)
        # Get top-1 result for PyTorch
        top1_torch = np.argmax(output.numpy())
        torch_class_key = class_id_to_key[top1_torch]
# torch_t1 = time.process_time()

# tvm_time = tvm_t1 - tvm_t0
# torch_time = torch_t1 - torch_t0
tvm_time = np.mean(tvm_time_spent[n_warmup:]) * 1000
torch_time = np.mean(torch_time_spent[n_warmup:]) * 1000
tvm_output_prob = softmax(tvm_output.asnumpy())
output_prob = softmax(output.numpy())
print("Relay top-1 id: {}, class name: {}, class probability: {}".format(top1_tvm, key_to_classname[tvm_class_key], tvm_output_prob[0][top1_tvm]))
print("Torch top-1 id: {}, class name: {}, class probability: {}".format(top1_torch, key_to_classname[torch_class_key], output_prob[0][top1_torch]))
print('Relay time(ms): {:.3f}'.format(tvm_time))
print('Torch time(ms): {:.3f}'.format(torch_time))
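The final lines of the script turn the raw output scores into a top-1 index and a class probability. Below is a dependency-free sketch of that step, written in plain Python so it runs without numpy or scipy. The function names and the three-element score list are illustrative only; the script itself relies on np.argmax and scipy.special.softmax.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top1(logits):
    """Return (index, probability) of the highest-scoring class."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, probs[index]


if __name__ == "__main__":
    # Made-up scores for three classes.
    print(top1([1.0, 3.0, 2.0]))
```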

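Then simply run the code.

Running it prints a lot of log output, and I have not yet found a way to suppress it; the logs are probably a result of enabling USE_RELAY_DEBUG.

(Screenshot of a successful run.)

If a second run fails with an error such as Module Not Found: "tvm" is not found, reinstall the tvm python environment.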
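Before reinstalling, it can help to confirm whether Python can locate the tvm package at all. The following sketch uses the standard library's importlib.util.find_spec; the helper name is_importable is made up, and it is meant for top-level module names only.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """True if Python can locate the top-level module on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("tvm", "numpy"):
        print(name, "found" if is_importable(name) else "missing")
```

If tvm is reported missing, the cause is likely either an unset PYTHONPATH in the current shell or a missing python package install, which are the two setup steps described earlier.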
