BMF Data Convert Backend
Background
An all-in-one solution is needed when a video processing pipeline involves several dimensions at once, such as CPU/GPU devices, YUV420/NV12 or RGB24/RGB48 pixel formats, and AVFrame or Torch data structures. Within a framework, each module focuses only on its own task and data requirements, but things become complex when multiple modules work together, as in the super-resolution (SR) pipeline below:
Different modules have their own data requirements. The decode module outputs an FFmpeg AVFrame in YUV420 pixel format, located in CPU memory. Because it is hardware accelerated, the Trt SR module requires its input to be RGB24 torch data located in GPU memory. After SR, the output data needs to be encoded on the GPU, so the HW encode module must receive an AVFrame in NV12 pixel format located in GPU memory.
The backend aims to provide the following video data conversion capabilities:
- pixel format and color space
- device memory, between CPU and GPU
- different media types such as AVFrame, cv::Mat and torch
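To make the three dimensions concrete, the sketch below models the data requirements of the pipeline stages described above and lists which dimensions differ between neighbouring stages. It is illustrative only: `StageFormat` and `needed_conversions` are hypothetical names, not part of BMF, and the SR output is assumed to have the same format as the SR input.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class StageFormat:
    """Hypothetical description of the data a stage produces or expects."""
    media_type: str    # e.g. "AVFrame", "torch"
    pixel_format: str  # e.g. "YUV420P", "RGB24", "NV12"
    device: str        # "cpu" or "gpu"


def needed_conversions(src: StageFormat, dst: StageFormat) -> List[str]:
    """Return the names of the dimensions in which src and dst differ."""
    return [f.name for f in fields(StageFormat)
            if getattr(src, f.name) != getattr(dst, f.name)]


# Stage requirements taken from the pipeline description.
decode_out = StageFormat("AVFrame", "YUV420P", "cpu")
sr_io = StageFormat("torch", "RGB24", "gpu")      # assumed identical for input and output
encode_in = StageFormat("AVFrame", "NV12", "gpu")

print(needed_conversions(decode_out, sr_io))   # all three dimensions differ
print(needed_conversions(sr_io, encode_in))    # device already matches
```

Each hop in the pipeline needs a different mix of conversions, which is the case a single conversion entry point is meant to cover.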
Currently, the backend interface is under testing and demo integration. A fully unified, easy-to-use interface and examples for both C++ and Python will be released soon.
C++ Interface
/** @addtogroup bmf_backend_convert
* @{
* @arg src_vf: src VideoFrame
* @arg src_dp: src MediaDesc that describe src_vf's attributes
* @arg dst_dp: The desired MediaDesc of the converted VideoFrame.
* @} */
BMF_API VideoFrame bmf_convert(VideoFrame& src_vf, const MediaDesc &src_dp, const MediaDesc &dst_dp);
This interface allows the conversion of a source VideoFrame to a destination VideoFrame. If the media_type in the source MediaDesc is different from the media_type in the destination MediaDesc, it indicates that the conversion will involve the transformation between VideoFrame and a third-party data structure. Accessing the third-party data structure requires the use of the private_attach and private_get methods of the VideoFrame.
VideoFrame scale and colorspace conversion
Use bmf_convert to perform scaling and color space conversion (CSC):
MediaDesc dp;
dp.width(1920).pixel_format(hmp::PF_YUV420P);
auto rgbformat = hmp::PixelInfo(hmp::PF_RGB24);
auto src_vf = VideoFrame::make(640, 320, rgbformat);
auto dst_vf = bmf_convert(src_vf, MediaDesc{}, dp);
EXPECT_EQ(dst_vf.width(), 1920);
EXPECT_EQ(dst_vf.height(), 960);
EXPECT_EQ(dst_vf.frame().format(), hmp::PF_YUV420P);
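In the sample above only the target width is set, yet the expected height is 960, which is consistent with the source aspect ratio (640x320) being preserved. The arithmetic can be sketched as follows. `scaled_size` is a hypothetical helper, not a BMF function, and the rounding rule is an assumption.

```python
def scaled_size(src_w, src_h, dst_w=None, dst_h=None):
    """Derive the output size, keeping the aspect ratio for a missing dimension.

    Hypothetical helper for illustration; BMF's own rounding may differ.
    """
    if dst_w is None and dst_h is None:
        return src_w, src_h                      # nothing requested: keep size
    if dst_h is None:
        dst_h = round(src_h * dst_w / src_w)     # only width given
    elif dst_w is None:
        dst_w = round(src_w * dst_h / src_h)     # only height given
    return dst_w, dst_h


print(scaled_size(640, 320, dst_w=1920))
```

For the values in the sample, the scale factor is 1920 / 640 = 3, so the height becomes 320 * 3 = 960.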
Device memory transfer
The sample below shows how to transfer the input video frame to GPU memory.
MediaDesc dp;
dp.device(hmp::Device("gpu")); // or hmp::Device("cpu")
auto dst_vf = bmf_convert(src_vf, MediaDesc{}, dp);
Conversion between VideoFrame and third-party data structure
BMF supports conversion between VideoFrame and the following third-party structures:
- FFmpeg AVFrame
- OpenCV cv::Mat
- libtorch at::Tensor
Here, FFmpeg AVFrame is used as an example to illustrate the conversion. For the other types of conversion, refer to test_convert_backend.cpp.
VideoFrame to FFmpeg AVFrame
- include <bmf/sdk/av_convertor.h>; this registers AVConvertor for AVFrame
- set media_type in dst_dp to MediaType::kAVFrame
- call bmf_convert and check that the returned VideoFrame is valid
- use private_get<AVFrame> to get the AVFrame pointer from the returned VideoFrame
It is important to note that the lifecycle of the structure pointed to by the pointer obtained from private_get is managed by the VideoFrame to which it belongs.
Example code snippet:
#include <bmf/sdk/av_convertor.h>
MediaDesc dp;
dp.width(1920).height(1080).pixel_format(hmp::PF_RGB24).media_type(MediaType::kAVFrame);
auto yuvformat = hmp::PixelInfo(hmp::PF_YUV420P);
auto src_vf = VideoFrame::make(640, 320, yuvformat);
auto dst_vf = bmf_convert(src_vf, MediaDesc{}, dp);
EXPECT_TRUE(static_cast<bool>(dst_vf));
const AVFrame* frame = dst_vf.private_get<AVFrame>();
EXPECT_EQ(frame->width, 1920);
EXPECT_EQ(frame->height, 1080);
EXPECT_EQ(frame->format, AV_PIX_FMT_RGB24);
AVFrame to VideoFrame
- use private_attach to set the AVFrame as private_data of the VideoFrame
- set media_type in src_dp to MediaType::kAVFrame
- use bmf_convert to convert
- get the returned VideoFrame
Example code snippet:
VideoFrame src_with_avf;
src_with_avf.private_attach<AVFrame>(frame);
MediaDesc src_dp;
src_dp.pixel_format(hmp::PF_RGB24).media_type(MediaType::kAVFrame);
VideoFrame vf = bmf_convert(src_with_avf, src_dp, MediaDesc{});
Python Interface
A unified bmf_convert interface, matching the C++ one, will be designed and implemented later.
Scale and colorspace conversion
bmf.hml.hmp.img.rgb_to_yuv
bmf.hml.hmp.img.yuv_to_rgb
bmf.hml.hmp.img.yuv_to_yuv
bmf.hml.hmp.img.resize
bmf.hml.hmp.img.rotate
Sample code:
from bmf import *
import bmf.hml.hmp as mp
NV12 = mp.PixelInfo(mp.PixelFormat.kPF_NV12, mp.ColorSpace.kCS_BT470BG, mp.ColorRange.kCR_MPEG)
RGB = mp.PixelInfo(mp.PixelFormat.kPF_RGB24, mp.ColorSpace.kCS_BT709, mp.ColorRange.kCR_MPEG)
src_vf = pkt.get(VideoFrame)  # pkt: the input packet received by the module
out_frame = mp.Frame(src_vf.frame().width(), src_vf.frame().height(), NV12, device='cuda')
# write the converted NV12 planes into out_frame
mp.img.rgb_to_yuv(out_frame.data(), src_vf.frame().plane(0), NV12, mp.kNHWC)
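The PixelInfo objects above combine a pixel format with a color space (BT.470BG or BT.709) and a color range (MPEG, i.e. limited range). As an illustration of what those two parameters mean numerically, here is the standard per-pixel RGB to Y'CbCr formula in plain Python. It is not BMF's implementation; `rgb_to_yuv_limited` is a hypothetical helper name.

```python
def rgb_to_yuv_limited(r, g, b, kr, kb):
    """Convert one 8-bit RGB pixel to 8-bit limited-range (MPEG) Y'CbCr.

    kr and kb are the luma coefficients of the chosen color space.
    """
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    kg = 1.0 - kr - kb
    y = kr * rn + kg * gn + kb * bn              # luma in [0, 1]
    cb = (bn - y) / (2.0 * (1.0 - kb))           # chroma in [-0.5, 0.5]
    cr = (rn - y) / (2.0 * (1.0 - kr))
    # Limited range: luma occupies 16..235, chroma 16..240 centred on 128.
    return (round(16 + 219 * y), round(128 + 224 * cb), round(128 + 224 * cr))


BT601 = (0.299, 0.114)     # (Kr, Kb), also used by BT.470BG
BT709 = (0.2126, 0.0722)   # (Kr, Kb)

print(rgb_to_yuv_limited(255, 255, 255, *BT709))  # white
print(rgb_to_yuv_limited(255, 0, 0, *BT601))      # pure red
```

The color space selects the coefficients and the color range selects the output scaling, so the same RGB pixel maps to different YUV values under BT.601 and BT.709.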
Device memory transfer
Interfaces used:
- VideoFrame.frame().device(): gets the Device property
- VideoFrame.cuda(): moves the frame data to CUDA memory
- VideoFrame.cpu(): moves the frame data to CPU memory
Sample code:
from bmf import *
import bmf.hml.hmp as mp
vf = pkt.get(VideoFrame)
if vf.frame().device() == mp.Device('cpu'):
    vf = vf.cuda()
#...
Conversion between VideoFrame and third-party data structure
In the Python API, the following data types are supported:
- VideoFrame, the general video frame class in BMF; a VideoFrame includes a Frame as a member
- numpy
- torch
Interfaces used:
- bmf.hml.hmp.Frame.numpy: converts a BMF Frame to numpy; the Frame can be the one contained in a VideoFrame
- bmf.hml.hmp.Frame.from_numpy: converts numpy data into a BMF Frame
Sample code:
from bmf import *
import bmf.hml.hmp as mp
import numpy as np
npa = np.array(obj)  # obj: the source image data
rgb = mp.PixelInfo(mp.kPF_RGB24)
frame = mp.Frame(mp.from_numpy(npa), rgb)
vf = VideoFrame(frame)
#...video frame process
np0 = vf.frame().plane(0).numpy()
np1 = vf.frame().plane(1).numpy()
np2 = vf.frame().plane(2).numpy()
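How many planes can be read this way depends on the pixel format: a packed format such as RGB24 has a single plane, YUV420P has three, and NV12 has two. The sketch below lists the plane shapes for these standard layouts. `plane_shapes` is a hypothetical helper, and the (height, width, channels) ordering is an assumption that may not match BMF's tensor layout.

```python
def plane_shapes(pixel_format, width, height):
    """Return one (height, width, channels) tuple per plane.

    Hypothetical helper covering three common 8-bit layouts.
    """
    cw, ch = (width + 1) // 2, (height + 1) // 2   # 4:2:0 chroma size, rounded up
    if pixel_format == "RGB24":
        return [(height, width, 3)]                # one packed plane
    if pixel_format == "YUV420P":
        return [(height, width, 1), (ch, cw, 1), (ch, cw, 1)]  # Y, U, V
    if pixel_format == "NV12":
        return [(height, width, 1), (ch, cw, 2)]   # Y, interleaved UV
    raise ValueError("unsupported pixel format: " + pixel_format)


for fmt in ("RGB24", "YUV420P", "NV12"):
    print(fmt, plane_shapes(fmt, 1920, 1080))
```

Reading plane(1) and plane(2) therefore only makes sense once the frame is in a three-plane format such as YUV420P.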
The interfaces below require BMF to be compiled with Torch enabled; please refer to the “Install” section.
- bmf.hml.hmp.from_torch: converts a torch tensor into a BMF Frame
- bmf.hml.hmp.torch: converts a BMF Frame to a torch tensor
Sample code:
from bmf import *
import torch
vf = pkt.get(VideoFrame)
torch_data = vf.frame().data().torch()
#...torch process
from bmf import *
import bmf.hml.hmp as mp
import torch
vf = pkt.get(VideoFrame)
rgb = mp.PixelInfo(mp.kPF_RGB24)
torch_vf = torch.from_dlpack(vf.reformat(rgb).frame().plane(0))
#... process of torch
RGB = mp.PixelInfo(mp.PixelFormat.kPF_RGB24, mp.ColorSpace.kCS_BT709, mp.ColorRange.kCR_MPEG)
frame = mp.Frame(mp.from_torch(torch_vf.contiguous()), RGB)
bmf_vframe = VideoFrame(frame)
#...