Build a simple semantic segmentation inference image

Image input/output example

.
├── in
│   ├── annotations
│   ├── assets
│   ├── candidate-index.tsv
│   ├── config.yaml
│   ├── env.yaml
│   └── models
└── out
    ├── monitor.txt
    └── infer-result.json
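Before running real inference code against this layout, it can help to check that the input side is complete. The following is a minimal sketch, not part of the demo: `EXPECTED_INPUTS` and `missing_inputs` are hypothetical names, and the entry list is copied from the tree above.

```python
from pathlib import Path
from typing import List

# Entries copied from the `in` branch of the directory tree above.
EXPECTED_INPUTS = [
    "annotations",
    "assets",
    "candidate-index.tsv",
    "config.yaml",
    "env.yaml",
    "models",
]


def missing_inputs(root: Path) -> List[str]:
    """Return the expected entries that do not exist under <root>/in."""
    in_dir = root / "in"
    return [name for name in EXPECTED_INPUTS if not (in_dir / name).exists()]
```

Calling `missing_inputs(Path("."))` from the directory that holds `in` and `out` returns an empty list when every entry is present.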

Working directory

cd seg-semantic-demo-tmi

Provide the hyperparameter template file

An image that contains /img-man/infer-template.yaml declares that it supports inference.

img-man/infer-template.yaml

# infer template for your executor app
# after the image is built, this file should be at /img-man/infer-template.yaml
# the keys gpu_id, task_id, model_params_path, class_names, gpu_count are reserved

# gpu_id: '0'
# gpu_count: 1
# task_id: 'default-infer-task'
# model_params_path: []
# class_names: []

# just for test, remove this key in your own docker image
idle_seconds: 3  # idle seconds for each task
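The template value and the code default interact in a simple way: the demo's app/start.py reads each hyperparameter with `.get(key, default)`, so a key present in the template wins and a missing key falls back to the default in code. A minimal sketch of that lookup, assuming the parameters arrive as a plain dictionary (`TEMPLATE_PARAMS` and `read_infer_params` are hypothetical names; the defaults 60, False and 15 are the ones used in start.py):

```python
from typing import Any, Dict

# The only key set in the infer-template.yaml shown above.
TEMPLATE_PARAMS: Dict[str, Any] = {"idle_seconds": 3}


def read_infer_params(param: Dict[str, Any]) -> Dict[str, Any]:
    """Resolve hyperparameters: value from the template if present, else the code default."""
    return {
        "idle_seconds": param.get("idle_seconds", 60),
        "trigger_crash": param.get("trigger_crash", False),
        "seed": param.get("seed", 15),
    }
```

With the template above, `idle_seconds` resolves to 3 while `trigger_crash` and `seed` keep their code defaults.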

Dockerfile

# Create the /img-man directory in the image
RUN mkdir -p /img-man
# Copy every YAML file from the host's img-man directory into /img-man in the image
COPY img-man/*.yaml /img-man/

Provide the image manifest file

An object_type of 3 declares that the image supports semantic segmentation.

img-man/manifest.yaml

# 3 for semantic segmentation
"object_type": 3

The Dockerfile instruction COPY img-man/*.yaml /img-man/ copies manifest.yaml into the image's /img-man directory along with infer-template.yaml.

Provide the default startup script

Dockerfile

# Generate the startup script /usr/bin/start.sh
RUN echo "python /app/start.py" > /usr/bin/start.sh
# Make /usr/bin/start.sh the image's default startup command
CMD bash /usr/bin/start.sh

Implement the basic functionality

app/start.py

Source code in seg-semantic-demo-tmi/app/start.py

def _run_infer(cfg: edict) -> None:
    # use `cfg.param` to read the inference hyperparameters
    #   models are passed in under `cfg.ymir.input.models_dir`, as listed in model_params_path
    class_names = cfg.param.get('class_names')
    idle_seconds: float = cfg.param.get('idle_seconds', 60)
    trigger_crash: bool = cfg.param.get('trigger_crash', False)
    seed: int = cfg.param.get('seed', 15)
    # use `logging` or `print` to write log to console
    logging.info(f"infer config: {cfg.param}")

    # use `cfg.ymir.input.candidate_index_file` to read candidate dataset items
    #   note that annotations path will be empty str if there's no annotations in that dataset
    with open(cfg.ymir.input.candidate_index_file, 'r') as fp:
        lines = fp.readlines()

    valid_images = []
    invalid_images = []
    valid_image_count = 0
    for line in lines:
        if os.path.isfile(line.strip()):
            valid_image_count += 1
            valid_images.append(line.strip())
        else:
            invalid_images.append(line.strip())

    # use `monitor.write_monitor_logger` to write log to console and write task process percent to monitor.txt
    logging.info(f"assets count: {len(lines)}, valid: {valid_image_count}")
    monitor.write_monitor_logger(percent=0.2)

    _dummy_work(idle_seconds=idle_seconds, trigger_crash=trigger_crash)

    # write infer result
    random.seed(seed)
    results = []

    fake_mask_num = min(len(class_names), 10)
    for iter, img_file in enumerate(valid_images):
        img = cv2.imread(img_file, cv2.IMREAD_GRAYSCALE)
        mask = np.zeros(shape=img.shape[0:2], dtype=np.uint8)
        for idx in range(fake_mask_num):
            percent = 100 * idx / fake_mask_num
            value = np.percentile(img, percent)
            mask[img > value] = idx + 1

        results.append(dict(image=img_file, result=mask))

        # real-time monitor
        monitor.write_monitor_logger(percent=0.2 + 0.8 * iter / valid_image_count)

    coco_results = convert(cfg, results, True)
    rw.write_infer_result(infer_result=coco_results, algorithm='segmentation')

    # if task done, write 100% percent log
    logging.info('infer done')
    monitor.write_monitor_logger(percent=1.0)
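The fake mask in the loop above is built by thresholding the grayscale image at evenly spaced percentiles: each pass overwrites the pixels brighter than the next threshold with the next label, and pixels at the minimum value stay 0. The sketch below re-implements that step in pure Python so that it runs without NumPy or OpenCV. `percentile` and `fake_mask` are hypothetical helpers written for illustration; the interpolation is linear, which is NumPy's default for `np.percentile`.

```python
from typing import List


def percentile(values: List[float], percent: float) -> float:
    """Linearly interpolated percentile of a non-empty list."""
    ordered = sorted(values)
    rank = percent / 100 * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def fake_mask(pixels: List[List[int]], class_count: int) -> List[List[int]]:
    """Label each pixel by the highest percentile threshold it exceeds."""
    flat = [p for row in pixels for p in row]
    mask = [[0] * len(row) for row in pixels]
    for idx in range(class_count):
        threshold = percentile(flat, 100 * idx / class_count)
        for r, row in enumerate(pixels):
            for c, p in enumerate(row):
                if p > threshold:
                    mask[r][c] = idx + 1  # later thresholds overwrite earlier labels
    return mask


print(fake_mask([[0, 10], [20, 30]], class_count=2))  # [[0, 1], [2, 2]]
```

In the 2x2 example the thresholds are 0 (0th percentile) and 15 (50th percentile), so the darkest pixel stays 0, the pixel at 10 gets label 1, and the two brightest pixels get label 2.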

Write progress

# use `monitor.write_monitor_logger` to write log to console and write task process percent to monitor.txt
logging.info(f"assets count: {len(lines)}, valid: {valid_image_count}")
monitor.write_monitor_logger(percent=0.2)

# real-time monitor
monitor.write_monitor_logger(percent=0.2 + 0.8 * iter / valid_image_count)

# if task done, write 100% percent log
logging.info('infer done')
monitor.write_monitor_logger(percent=1.0)
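The three calls above split the task into a fixed 20% for setup and 80% spread evenly over the images. As a sketch of that arithmetic only (`infer_percent` is a hypothetical helper, not part of the demo or of the monitor module):

```python
def infer_percent(index: int, total: int, start: float = 0.2) -> float:
    """Progress reported while handling image `index` (0-based) out of `total` images."""
    if total <= 0:
        return 1.0  # nothing to process, so the task is effectively done
    return start + (1.0 - start) * index / total
```

Because the loop index is 0-based, the last value reported inside the loop stays below 1.0, which is why the code writes `percent=1.0` explicitly once the task is done.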

Write the result file

coco_results = convert(cfg, results, True)
rw.write_infer_result(infer_result=coco_results, algorithm='segmentation')

Result file format
- See coco-formats
- RLE is a mask encoding format; it can be generated with pycocotools

{
    "categories": [{"id": int, "name": str, "supercategory": str}],
    "images": [{"id": int, "file_name": str, "width": int, "height": int}],
    "annotations": [{"id": int, "image_id": int, "category_id": int, "segmentation": RLE}]
}
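To make the RLE field less opaque, here is a minimal sketch of the uncompressed form of COCO run-length encoding for a single binary mask. Pixels are visited column by column, and `counts` alternates runs of 0s and 1s, always starting with the number of 0s. Note that this is not what the demo writes: pycocotools produces the compressed variant, where `counts` is a string. `rle_encode` and `rle_decode` are hypothetical helper names.

```python
from typing import Any, Dict, List


def rle_encode(binary_mask: List[List[int]]) -> Dict[str, Any]:
    """Encode a binary mask as uncompressed COCO-style RLE (column-major order)."""
    height = len(binary_mask)
    width = len(binary_mask[0]) if height else 0
    counts: List[int] = []
    previous, run = 0, 0  # the first run always counts zeros, even if it is empty
    for col in range(width):
        for row in range(height):
            pixel = 1 if binary_mask[row][col] else 0
            if pixel != previous:
                counts.append(run)
                previous, run = pixel, 0
            run += 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


def rle_decode(rle: Dict[str, Any]) -> List[List[int]]:
    """Invert rle_encode."""
    height, width = rle["size"]
    flat: List[int] = []
    for i, run in enumerate(rle["counts"]):
        flat.extend([i % 2] * run)  # even positions are runs of 0, odd positions runs of 1
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]
```

As in the result file, `size` is `[height, width]`. For a multi-class label mask, one such encoding would be produced per class from the binary mask `label == class_id`.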

Result file example

{
    "categories":[
        {
            "id":1,
            "name":"dog",
            "supercategory":"none"
        },
        {
            "id":2,
            "name":"cat",
            "supercategory":"none"
        }
    ],
    "images":[
        {
            "id":1,
            "file_name":"5ec2163001ed53f2169c525ff2e5e5ec.jpg",
            "width":1280,
            "height":854
        }
    ],
    "annotations":[
        {
            "id":5,
            "image_id":1,
            "category_id":1,
            "segmentation":{
                "size":[
                    854,
                    1280
                ],
                "counts":"iZ83cj00k_QQ1"
            }
        }
    ]
}
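A result file in this shape can be sanity-checked before it is handed back: every annotation should point at an existing image and category, and the RLE `size` should equal the image's `[height, width]`. The following is a minimal sketch of such a check. `check_result` is a hypothetical helper, and the rules it enforces are inferred from the example above rather than taken from a specification.

```python
from typing import Any, Dict, List


def check_result(result: Dict[str, Any]) -> List[str]:
    """Return a list of consistency problems found in a COCO-style result dict."""
    problems: List[str] = []
    category_ids = {category["id"] for category in result["categories"]}
    images = {image["id"]: image for image in result["images"]}
    for annotation in result["annotations"]:
        label = f"annotation {annotation['id']}"
        if annotation["category_id"] not in category_ids:
            problems.append(f"{label}: unknown category_id {annotation['category_id']}")
        image = images.get(annotation["image_id"])
        if image is None:
            problems.append(f"{label}: unknown image_id {annotation['image_id']}")
        elif annotation["segmentation"]["size"] != [image["height"], image["width"]]:
            problems.append(f"{label}: segmentation size does not match image height/width")
    return problems
```

Loading out/infer-result.json with `json.load` and passing the result to `check_result` should return an empty list for the example shown above.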

Build the image demo/semantic_seg:infer

# a Dockerfile for a sample training / mining / infer executor

# FROM ubuntu:20.04
FROM python:3.8.16

ENV LANG=C.UTF-8

# Change mirror
RUN sed -i 's#http://archive.ubuntu.com#http://mirrors.ustc.edu.cn#g' /etc/apt/sources.list \
    && sed -i 's#http://security.ubuntu.com#http://mirrors.ustc.edu.cn#g' /etc/apt/sources.list

# Set timezone
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
    && echo 'Asia/Shanghai' >/etc/timezone

# Install linux package
RUN apt-get update && apt-get install -y gnupg2 git libglib2.0-0 \
    libgl1-mesa-glx libsm6 libxext6 libxrender-dev \
    build-essential ninja-build \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt /app/
# WORKDIR is not set yet at this point, so refer to the copied file by its full path
RUN pip3 install -r /app/requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

WORKDIR /app
# copy user code to WORKDIR
COPY ./app/*.py /app/

# copy user config template and manifest.yaml to /img-man
RUN mkdir -p /img-man
COPY img-man/*.yaml /img-man/

# view https://github.com/protocolbuffers/protobuf/issues/10051 for detail
ENV PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python

# entry point for your app
# the whole docker image will be started with `nvidia-docker run <other options> <docker-image-name>`
# and this command will run automatically

RUN echo "python /app/start.py" > /usr/bin/start.sh
CMD bash /usr/bin/start.sh

docker build -t demo/semantic_seg:infer -f Dockerfile .
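Once the image is built, a local smoke test needs the input and output directories mounted into the container. The sketch below only assembles the command line. It assumes the container sees them at /in and /out, which is an assumption based on the directory layout shown earlier. GPU options are omitted, and `docker_run_command` is a hypothetical helper.

```python
from typing import List


def docker_run_command(image: str, in_dir: str, out_dir: str) -> List[str]:
    """Build an argv list for running the image with local in/out directories mounted."""
    return [
        "docker", "run", "--rm",
        "-v", f"{in_dir}:/in",    # assumed mount point for inputs
        "-v", f"{out_dir}:/out",  # assumed mount point for outputs
        image,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Because the Dockerfile's CMD starts /usr/bin/start.sh, no command needs to follow the image name.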