Detailed Overview of NVENC Encoder API

Swagat Mohapatra
Senior Lead Engineer
GPU Multimedia SW
AGENDA
- Introduction to NVENC SDK
- Detailed Overview of NVENC API
- Advanced Topics
  - Rate Control Modes
  - Low Latency Encoding
BENEFITS OF HW BASED ENCODER
- Low power
- Low latency
- High performance
- Ease of programming
NVENC VIDEO ENCODING SOLUTIONS
- Fixed-function hardware (NVENC)
  - Entire encode pipeline implemented in hardware
  - ME (motion estimation), intra-prediction, mode decision, VLE (variable-length encoding)
  - High performance, low power
  - Kepler and later GPUs
- Proprietary software API (NVENC SDK)
  - Windows (NVENC-DirectX interop, NVENC-CUDA interop)
  - Linux (NVENC-CUDA interop)
- Can work in hybrid mode with ME on CUDA
NVENC SDK
- Available on NVIDIA developer zone
  - https://developer.nvidia.com/nvidia-video-codec-sdk
- .DLL/.so, interface header, documentation, sample apps
- Unified API for Windows and Linux
- Works on x86/x64
NVENC SDK

Version   Release      Features
SDK 1.0   May 2012     Windows support only, transcoding support
SDK 2.0   March 2013   Linux support, low latency encoder support
SDK 3.0   Sep 2013     Low latency encoding improvements, Reconfigure API
SDK 4.0   May 2014     Maxwell support, YUV444, lossless
NVENC STACK
[Diagram: the client application (initialize, configure, encode) calls the NVENC API. The NVENC driver, alongside the CUDA and DirectX drivers, configures the hardware. The NVENC firmware and hardware perform the encode and return the encoded bitstream.]
NVENC API FLOW
Steps through the NVENC SW SDK:
1. Opening encode session
2. Query encoder attributes
3. Query encoder presets
4. Initializing encoder
5. Allocate I/O resources
6. Encode frame
7. Reading output bitstream
8. Closing encoder session
OPENING ENCODE SESSION
1. Create DX/CUDA device
2. Load nvEncodeAPI.dll
3. Retrieve encoder API function pointers
4. Open the encoder session
OPENING ENCODE SESSION
- The NVENC SDK API shared library (DLL) is named nvEncodeAPI.dll.
- It has a single entry point, NvEncodeAPICreateInstance, which the application calls to retrieve the API function pointers.
- The NvEncOpenEncodeSessionEx API starts the encode session.
- The application must create a DX or CUDA device, which is passed to the NvEncOpenEncodeSessionEx API.
QUERY ENCODER ATTRIBUTES
HW ENCODER ATTRIBUTES

Attribute      GUID / capability                          Description
ENCODE GUID    NV_ENC_CODEC_H264_GUID                     H264/MPEG4 AVC
PROFILE GUID   NV_ENC_H264_PROFILE_BASELINE_GUID          H264 baseline profile
               NV_ENC_H264_PROFILE_HIGH_GUID              H264 high profile
               NV_ENC_H264_PROFILE_MAIN_GUID              H264 main profile
ENCODER CAPS   NV_ENC_CAPS_SUPPORTED_RATECONTROL_MODES
               NV_ENC_CAPS_SUPPORT_CABAC
               NV_ENC_CAPS_SUPPORT_BDIRECTMODE
               NV_ENC_CAPS_SUPPORT_STEREO_MVC
QUERY ENCODER ATTRIBUTES
1. Query the encoder codec GUID. If the codec is not supported, close the encoder session.
2. Query the profile GUID. If the profile is not supported, close the encoder session.
3. Query the HW caps. If the required encode capability is not supported, close the encoder session.
QUERY ENCODER ATTRIBUTES
- Query codec GUID
  - NvEncGetEncodeGUIDCount
  - NvEncGetEncodeGUIDs
- Query profile GUID
  - NvEncGetEncodeProfileGUIDCount
  - NvEncGetEncodeProfileGUIDs
- Query encode caps
  - NvEncGetEncodeCaps
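The codec, profile and capability checks can be sketched as a plain decision function. This is a minimal Python model, not the SDK interface: `SUPPORTED` and `check_encoder_support` are hypothetical names, the capability values are illustrative, and a real application would fill the table from the query APIs listed above.

```python
# Hypothetical capability table with illustrative values. A real application
# would fill it from NvEncGetEncodeGUIDs, NvEncGetEncodeProfileGUIDs and
# NvEncGetEncodeCaps.
SUPPORTED = {
    "NV_ENC_CODEC_H264_GUID": {
        "profiles": {
            "NV_ENC_H264_PROFILE_BASELINE_GUID",
            "NV_ENC_H264_PROFILE_MAIN_GUID",
            "NV_ENC_H264_PROFILE_HIGH_GUID",
        },
        "caps": {
            "NV_ENC_CAPS_SUPPORT_CABAC": 1,
            "NV_ENC_CAPS_SUPPORT_BDIRECTMODE": 1,
            "NV_ENC_CAPS_SUPPORT_STEREO_MVC": 0,
        },
    },
}


def check_encoder_support(supported, codec, profile, required_caps):
    """Return (ok, reason), testing codec, then profile, then each capability."""
    if codec not in supported:
        return False, "codec not supported"
    entry = supported[codec]
    if profile not in entry["profiles"]:
        return False, "profile not supported"
    for cap in required_caps:
        if not entry["caps"].get(cap, 0):
            return False, "capability not supported: " + cap
    return True, "ok"


ok, reason = check_encoder_support(
    SUPPORTED,
    "NV_ENC_CODEC_H264_GUID",
    "NV_ENC_H264_PROFILE_HIGH_GUID",
    ["NV_ENC_CAPS_SUPPORT_CABAC"],
)
print(ok, reason)
```

Any failed check corresponds to the "close encoder session" branch of the flow.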
QUERY ENCODER PRESETS
PRESET: HIGH QUALITY
  Encoder settings: B frames, CABAC, 8x8 transform, all intra modes, all inter modes*, VBR RC, GopLength 30
  Application: transcoding high bitrate

PRESET: HIGH PERFORMANCE
  Encoder settings: no B frames, CAVLC, P16x16, Intra16x16 and Intra4x4 modes, VBR, GopLength 30
  Application: multiple transcoding

PRESET: LOW LATENCY HQ
  Encoder settings: no B frames, CABAC, all intra modes, all inter modes, single frame VBV 2 pass, infinite GOP
  Application: cloud gaming, Miracast, video conferencing

PRESET: LOW LATENCY HP
  Encoder settings: no B frames, CABAC, all intra and inter modes, single frame VBV 2 pass, infinite GOP, smaller search range compared to LOW LATENCY HQ
  Application: cloud gaming, Miracast
ENCODER PRESETS
[Chart: 720p performance on NVIDIA GeForce GTX 650, in FPS (axis 0 to 350), for the LOW LATENCY HP, LOW LATENCY HQ, HP and HQ presets. Bar labels: 200 FPS, 100 FPS, 320 FPS, 240 FPS.]
ENCODER PRESETS
- Query encoder presets
  - NvEncGetEncodePresetCount
  - NvEncGetEncodePresets
- Get encoder preset settings
  - NvEncGetEncodePresetConfig
- Use the NvEncGetEncodeCaps API to query HW caps
ENCODER PRESETS
1. Query the preset GUIDs. If the preset is not supported, release the encoder.
2. Get the preset config.
3. Query the HW encoder caps.
4. Modify the NVENC preset settings.
5. Initialize the encoder.
INITIALIZING ENCODER
- The encoder is initialized with the NvEncInitializeEncoder API.
- Parameters used for initializing the encoder
  - NV_ENC_INITIALIZE_PARAMS
    - Basic encoder parameters common to all codecs.
  - NV_ENC_CONFIG
    - Optional advanced parameters for applications that want more control over the encoder; it also carries the codec-specific parameters.
  - NV_ENC_CONFIG_H264
INITIALIZING ENCODER
NV_ENC_INITIALIZE_PARAMS

Description                          Parameter name
Encode dimensions                    encodeWidth, encodeHeight
Codec                                encodeGUID
Preset                               presetGUID
Display aspect ratio                 darWidth, darHeight
Frame rate                           frameRateNum, frameRateDen
Async event-based signaling          enableEncodeAsync
Picture type decision                enablePTD
Low latency slice-based readback     enableSubFrameWrite
Slice offsets reporting              reportSliceOffsets
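The frame rate is given as a numerator and denominator rather than a floating-point value. The sketch below shows one way an application might derive the pair; `frame_rate_fields` is a hypothetical helper, and the 0.005 fps tolerance used to snap to the NTSC rates is an assumption of this example.

```python
from fractions import Fraction

# NTSC-family rates are exact ratios with a 1001 denominator.
NTSC_RATES = [Fraction(24000, 1001), Fraction(30000, 1001), Fraction(60000, 1001)]


def frame_rate_fields(fps):
    """Return (frameRateNum, frameRateDen) for a nominal frame rate."""
    for rate in NTSC_RATES:
        # Snap values such as 29.97 to the exact 30000/1001 ratio.
        if abs(float(rate) - fps) < 0.005:
            return rate.numerator, rate.denominator
    # Parse the decimal text so 12.5 becomes exactly 25/2.
    frac = Fraction(str(fps)).limit_denominator(1000)
    return frac.numerator, frac.denominator


print(frame_rate_fields(29.97))
print(frame_rate_fields(30))
```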
INITIALIZING ENCODER
NV_ENC_CONFIG

Description                                   Parameter name
Profile                                       profileGUID
GOP structure                                 gopLength, frameIntervalP
Rate control parameters                       rcParams
MV precision (Qpel/Hpel/Fpel)                 mvPrecision
Input frame structure                         frameFieldMode
H264 codec parameters (NV_ENC_CONFIG_H264)    encodeCodecConfig
INITIALIZING ENCODER
NV_ENC_CONFIG_H264

Description                          Parameter name
Key frame interval                   idrPeriod
VLE mode                             entropyCodingMode
Adaptive block transform (8x8)       adaptiveTransformMode
Disable deblocking flags             disableDeblockingFilterIDC
Slice parameters                     sliceMode, sliceModeData
H264 VUI parameters                  h264VUIParams
Bdirect mode                         bdirectMode
DPB size                             maxNumRefFrames
Intra refresh                        intraRefreshPeriod, intraRefreshCnt
ALLOCATE I/O RESOURCES
INPUT RESOURCES
- Two types of input resources
  - NVENC input buffers
  - Externally allocated DX/CUDA buffers mapped to NVENC
    - NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX
    - NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR
NVENC INPUT BUFFERS
- NVENC input buffers
  - Provide a simple interface to load input data from system memory.
  - Involve an expensive copy of the input from system memory to video memory through the NvEncLockInputBuffer API.

[Diagram: the CPU writes input from system memory to video memory over a slow PCIe transfer; NVENC reads it from video memory.]
NVENC INPUT BUFFERS
- NVENC input buffers are allocated and released using
  - NvEncCreateInputBuffer
    - Only NV_ENC_BUFFER_FORMAT_NV12_PL is supported
  - NvEncDestroyInputBuffer
- The application loads input data into NVENC input buffers using
  - NvEncLockInputBuffer
  - NvEncUnlockInputBuffer
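NV12 stores a full-resolution luma plane followed by one plane of interleaved U and V samples at half resolution in each direction, so an 8-bit frame needs 1.5 bytes per pixel. The helper below is a sketch for sizing the data to copy. `nv12_plane_sizes` is a hypothetical name, and treating the pitch as equal to the width is an assumption, since real buffers are often padded.

```python
def nv12_plane_sizes(width, height, pitch=None):
    """Return (luma_bytes, chroma_bytes, total_bytes) for an 8-bit NV12 frame.

    pitch is the row stride in bytes. It defaults to width, i.e. a tightly
    packed buffer, which is an assumption of this sketch.
    """
    if width % 2 or height % 2:
        raise ValueError("NV12 needs even width and height")
    if pitch is None:
        pitch = width
    if pitch < width:
        raise ValueError("pitch cannot be smaller than width")
    luma = pitch * height            # one Y byte per pixel
    chroma = pitch * (height // 2)   # interleaved U,V rows, half the height
    return luma, chroma, luma + chroma


print(nv12_plane_sizes(1280, 720))
```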
MAPPING DX / CUDA INPUT RESOURCES TO NVENC
- Mapping DX / CUDA buffers to NVENC
  - Direct mapping of the video memory buffer into the NVENC address space
  - Removes the expensive copy of system memory data to video memory
  - Much lower latency than the NVENC input buffer method

[Diagram: DX/CUDA and NVENC share the same buffer in video memory.]
MAPPING DX / CUDA INPUT RESOURCES TO NVENC
- Mapping DX / CUDA resources to NVENC
  - Provides DX/CUDA interoperability with NVENC
  - Create an NV12 buffer using the DX/CUDA API
  - Register the DX/CUDA resource with NVENC
    - NvEncRegisterResource
  - Map the DX/CUDA resource to NVENC before sending it for encoding
    - NvEncMapInputResource
  - Unmap the DX/CUDA resource once the frame has been encoded
    - NvEncUnMapInputResource
  - Unregister the DX/CUDA resource before destroying it
    - NvEncUnRegisterResource
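The register, map, unmap, unregister order can be modelled as a small state machine. This is an illustrative Python model only: `InputResource` is a hypothetical class that makes no NVENC calls and just rejects out-of-order operations.

```python
class InputResource:
    """Tracks whether a buffer is unregistered, registered or mapped."""

    def __init__(self):
        self.state = "unregistered"

    def _move(self, expected, new_state, action):
        # Refuse any transition that does not start from the expected state.
        if self.state != expected:
            raise RuntimeError("cannot %s while %s" % (action, self.state))
        self.state = new_state

    def register(self):
        self._move("unregistered", "registered", "register")

    def map(self):
        self._move("registered", "mapped", "map")

    def unmap(self):
        self._move("mapped", "registered", "unmap")

    def unregister(self):
        self._move("registered", "unregistered", "unregister")


res = InputResource()
res.register()
for _ in range(3):      # one map/unmap pair per encoded frame
    res.map()
    res.unmap()
res.unregister()
print(res.state)
```

Registering once and mapping per frame keeps the registration cost out of the encode loop.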
ALLOCATING OUTPUT BUFFERS
- Allocating the output bitstream buffer
  - NvEncCreateBitstreamBuffer
  - NvEncDestroyBitstreamBuffer
- Allocating the output buffer completion event (Windows only)
  - CreateEvent
  - NvEncRegisterAsyncEvent
  - NvEncUnregisterAsyncEvent
ENCODE FRAME
1. Find a free input and output buffer.
2. Call encode frame.
3. Check the encode frame status:
   - NEED_MORE_INPUT: return to step 1 and submit the next input.
   - SUCCESS: read the bitstream data.
   - FAIL: release the encoder.
ENCODE FRAME
- The NvEncEncodePicture API is used to submit input buffers for encoding.
- Input buffers can be submitted in either of two orders:
  - Display order: I B B P B B P
    - Reordering done by the NVENC SDK
  - Encode order: I P B B P B B
    - Reordering done by the application
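The reordering itself is simple to state: each B frame is held back until the next I or P frame has been emitted, because the B frame predicts from that later frame. The sketch below is a generic illustration of this rule, not the SDK's internal logic. `display_to_encode_order` is a hypothetical name, and emitting trailing B frames last is a simplification.

```python
def display_to_encode_order(display_types):
    """Reorder (index, type) pairs from display order to encode order."""
    encode_order = []
    pending_b = []
    for index, ptype in enumerate(display_types):
        if ptype == "B":
            # Hold B frames until their forward reference has been emitted.
            pending_b.append((index, ptype))
        else:
            encode_order.append((index, ptype))
            encode_order.extend(pending_b)
            pending_b = []
    # Simplification: B frames with no following I/P frame are emitted last.
    encode_order.extend(pending_b)
    return encode_order


order = display_to_encode_order("I B B P B B P".split())
print(" ".join(ptype for _, ptype in order))
print([index for index, _ in order])
```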
ENCODE FRAME
- An application submitting buffers in encode order must set
  - NV_ENC_PIC_PARAMS::pictureType
  - NV_ENC_PIC_PARAMS_H264::displayPOCSyntax
  - NV_ENC_PIC_PARAMS_H264::refPicFlag
  - NV_ENC_INITIALIZE_PARAMS::enablePTD to 0
- An application submitting buffers in display order must set
  - NV_ENC_CONFIG::gopLength
  - NV_ENC_CONFIG::frameIntervalP
  - NV_ENC_CONFIG_H264::idrPeriod
  - NV_ENC_INITIALIZE_PARAMS::enablePTD to 1
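With picture type decision enabled, the GOP parameters determine the picture types. The sketch below assumes that frameIntervalP is the distance between successive I/P frames (1 means no B frames, 3 means two B frames between them) and that gopLength is the distance between I frames. It ignores idrPeriod and GOP-boundary handling, and `picture_types` is a hypothetical helper.

```python
def picture_types(num_frames, gop_length, frame_interval_p):
    """Return display-order picture types for a simple closed pattern."""
    if gop_length < 1 or frame_interval_p < 1:
        raise ValueError("gop_length and frame_interval_p must be >= 1")
    types = []
    for n in range(num_frames):
        pos = n % gop_length            # position inside the current GOP
        if pos == 0:
            types.append("I")
        elif pos % frame_interval_p == 0:
            types.append("P")
        else:
            types.append("B")
    return types


print(" ".join(picture_types(7, 30, 3)))
print(" ".join(picture_types(4, 30, 1)))
```

With `gop_length=30` and `frame_interval_p=3` the first seven frames follow the I B B P B B P display-order pattern shown on the previous slide.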
READING OUTPUT BITSTREAM
- Reading the output buffer after encoding
  - NvEncLockBitstream
  - NvEncUnlockBitstream
- Encode completion notification
  - Synchronous: call NvEncLockBitstream with doNotWait set to 0.
  - Asynchronous: wait on the NVENC event registered with the NvEncRegisterAsyncEvent API. This requires NV_ENC_INITIALIZE_PARAMS::enableEncodeAsync set to 1.
READING OUTPUT BITSTREAM
- Slice-level readback
  - Call NvEncLockBitstream with doNotWait set to 1.
  - Set NV_ENC_INITIALIZE_PARAMS::enableSubFrameWrite to 1.
  - Poll and read data until NV_ENC_LOCK_BITSTREAM::hwEncodeStatus = 2.
  - The number of slices encoded so far is reported in NV_ENC_LOCK_BITSTREAM::numSlices.
- Slice offsets can also be reported
  - NV_ENC_INITIALIZE_PARAMS::reportSliceOffsets = 1
  - NV_ENC_LOCK_BITSTREAM::sliceOffsets[]
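Once the offsets are available, an application can cut the locked bitstream into per-slice chunks, for example to packetize each slice separately. The sketch assumes that each offset is the byte position where a slice starts within the bitstream buffer; the slides do not state this, so check it against the SDK header. `split_slices` is a hypothetical helper.

```python
def split_slices(bitstream, slice_offsets):
    """Split a bytes object into slices that start at the given byte offsets."""
    bounds = list(slice_offsets) + [len(bitstream)]
    # Each slice runs from its own offset up to the next offset (or the end).
    return [bitstream[start:end] for start, end in zip(bounds, bounds[1:])]


chunks = split_slices(b"AAAABBBCC", [0, 4, 7])
print(chunks)
```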
CLOSING ENCODER SESSION
1. Flush the encoder queue
2. Wait for the flush operation to complete
3. Release the encode I/O buffers
4. Unregister the output events
5. Release the NVENC SW encoder object
6. Release the DX/CUDA device
CLOSING ENCODER SESSION
- Flush the encoder queue: call NvEncEncodePicture with NULL input and output buffers
- Release the I/O buffers
  - NvEncDestroyInputBuffer
  - NvEncDestroyBitstreamBuffer
- Unregister the completion event
  - NvEncUnregisterAsyncEvent API
- Release the encoder with the NvEncDestroyEncoder API
NVENC RATE CONTROL MODES
- Rate control modes
  - NV_ENC_PARAMS_RC_CBR
  - NV_ENC_PARAMS_RC_VBR
  - NV_ENC_PARAMS_RC_2_PASS_QUALITY
  - NV_ENC_PARAMS_RC_2_PASS_FRAMESIZE_CAP
NVENC RATE CONTROL MODES
- NV_ENC_PARAMS_RC_CBR
  - Single-pass constant bitrate rate control mode
  - Constant bitrate does not mean constant frame size
  - Mostly used for media streaming with low end-to-end delay
- NV_ENC_PARAMS_RC_VBR
  - Single-pass variable bitrate mode
  - Bitrate varies according to frame complexity
  - Larger VBV size than CBR, which gives more flexibility in allocating bits
  - Mostly used for media storage
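A simplified leaky-bucket model shows why a constant bitrate still allows frames of different sizes, and why a larger VBV gives more freedom. This is a generic buffer model written for illustration, not NVENC's rate control algorithm and not the exact H.264 HRD. `vbv_check` and all numbers are assumptions of the example.

```python
def vbv_check(frame_bits, bitrate, fps, vbv_size):
    """Return (ok, trace) for a simplified leaky-bucket buffer model.

    Each frame adds its bits to the buffer, then the channel drains
    bitrate / fps bits before the next frame arrives. ok is False if the
    buffer ever holds more than vbv_size bits.
    """
    drain_per_frame = bitrate / fps
    fullness = 0.0
    ok = True
    trace = []
    for bits in frame_bits:
        fullness += bits
        if fullness > vbv_size:
            ok = False
        fullness = max(0.0, fullness - drain_per_frame)
        trace.append(fullness)
    return ok, trace


# Four frames that average 200000 bits, i.e. 6 Mbps at 30 fps.
frames = [300_000, 100_000, 250_000, 150_000]
print(vbv_check(frames, 6_000_000, 30, 400_000))  # two-frame buffer
print(vbv_check(frames, 6_000_000, 30, 200_000))  # single-frame buffer
```

The same frame sizes fit the two-frame buffer but overflow the single-frame one, which is the trade-off between bit-allocation flexibility and delay.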
NVENC RATE CONTROL MODES
- NV_ENC_PARAMS_RC_2_PASS_FRAMESIZE_CAP
  - Customized two-pass CBR for low latency applications
  - First-pass analysis without any frame lookahead
  - Reduces the banding effect that single-pass CBR causes in low bitrate streaming
  - Mostly used for low delay applications such as cloud gaming and Miracast
- NV_ENC_PARAMS_RC_2_PASS_QUALITY
  - Customized two-pass CBR for single frame VBV cases
  - Special handling of scene cuts and I frames
[Chart: NV_ENC_PARAMS_RC_2_PASS_FRAMESIZE_CAP. FrameSizeInBits and vbvsize, in bits (axis 0 to 200000), plotted against frame number (1 to 1951).]
[Chart: NV_ENC_PARAMS_RC_2_PASS_QUALITY. FrameSizeInBits and vbvsize, in bits (axis 0 to 300000), plotted against frame number (1 to 1933).]
LOW LATENCY ENCODING
- Ultra low latency encoder settings
- Dynamic bitrate change
- Dynamic resolution change
- Periodic intra refresh
- Reference picture invalidation
ULTRA LOW LATENCY ENCODER SETTINGS
- Preset
  - NV_ENC_PRESET_LOW_LATENCY_HQ_GUID
  - NV_ENC_PRESET_LOW_LATENCY_HP_GUID
  - B frames disabled
  - CABAC, 8x8 transform, all intra modes, all inter modes
- Rate control settings
  - NV_ENC_PARAMS_RC_2_PASS_QUALITY
  - NV_ENC_PARAMS_RC_2_PASS_FRAMESIZE_CAP
  - First-pass analysis
  - Infinite GOP
  - Single frame VBV
    - VBV size = VBV initial delay = bitrate / frame rate
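As a worked example of the single frame VBV rule, the helper below computes the per-frame budget and returns it under the rate control field names used later in this deck. It is a plain Python dict, not the SDK structure. `single_frame_vbv` is a hypothetical name and the 30 fps rate is an assumed value.

```python
def single_frame_vbv(bitrate_bps, frame_rate_num, frame_rate_den=1):
    """Size the VBV to hold exactly one average frame."""
    # bitrate / frame rate, with the frame rate given as num / den
    bits_per_frame = bitrate_bps * frame_rate_den // frame_rate_num
    return {
        "averageBitRate": bitrate_bps,
        "vbvBufferSize": bits_per_frame,
        "vbvInitialDelay": bits_per_frame,
    }


print(single_frame_vbv(8_000_000, 30))
print(single_frame_vbv(4_000_000, 30))
```

Halving the bitrate halves the buffer, so the VBV fields need to be updated together with the bitrate whenever it changes.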
ULTRA LOW LATENCY ENCODER SETTINGS
- Slice size in bytes
- Slice-level readback of the output bitstream
- Disable deblocking across slices
- Constrained intra prediction
DYNAMIC BITRATE CHANGE
- The NVENC SDK supports dynamic bitrate change within a GOP.
- NvEncReconfigureEncoder API
  - NV_ENC_RECONFIGURE_PARAMS::reInitEncodeParams
  - NV_ENC_CONFIG::rcParams
    - NV_ENC_RC_PARAMS::averageBitRate
    - NV_ENC_RC_PARAMS::maxBitRate
    - NV_ENC_RC_PARAMS::vbvBufferSize
    - NV_ENC_RC_PARAMS::vbvInitialDelay
DYNAMIC BITRATE CHANGE
[Chart: pictureSize. FrameSizeInBits (axis 0 to 300000) plotted against frame number (1 to 295). Bitrate = 8 Mbps for frame number < 100; bitrate = 4 Mbps for frame number > 100.]
DYNAMIC RESOLUTION CHANGE
- NV_ENC_INITIALIZE_PARAMS::maxEncodeWidth
- NV_ENC_INITIALIZE_PARAMS::maxEncodeHeight
- NvEncReconfigureEncoder API
  - NV_ENC_RECONFIGURE_PARAMS::reInitEncodeParams
  - NV_ENC_RECONFIGURE_PARAMS::resetEncoder
  - NV_ENC_RECONFIGURE_PARAMS::forceIdr
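A resolution change only makes sense within the maximum dimensions declared at initialization. The sketch below validates a request and builds the values an application might place in the reconfigure parameters. It returns a plain Python dict, not the SDK structure. `plan_resolution_change` is a hypothetical helper, and setting both resetEncoder and forceIdr whenever the size changes is an assumption of this example.

```python
def plan_resolution_change(current, new, maximum):
    """Validate a new (width, height) and return illustrative reconfigure values."""
    new_w, new_h = new
    max_w, max_h = maximum
    if not (0 < new_w <= max_w and 0 < new_h <= max_h):
        raise ValueError("requested size exceeds the maximum set at initialization")
    changed = tuple(new) != tuple(current)
    return {
        "encodeWidth": new_w,
        "encodeHeight": new_h,
        # Assumption: restart the stream with an IDR frame when the size changes.
        "resetEncoder": int(changed),
        "forceIdr": int(changed),
    }


print(plan_resolution_change((1920, 1080), (1280, 720), (1920, 1080)))
```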
PERIODIC INTRA REFRESH
- NV_ENC_CONFIG_H264::enableIntraRefresh
- NV_ENC_CONFIG_H264::intraRefreshCnt
- NV_ENC_CONFIG_H264::intraRefreshPeriod

[Diagram: frames N, N+1, N+2 and N+3. In each frame a band of intra MBs moves across the picture, leaving clean MBs behind it and dirty MBs ahead of it.]
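The refresh wave can be simulated in a few lines. The sketch assumes that intraRefreshCnt is the number of frames one refresh takes, that intraRefreshPeriod is the distance between the starts of successive refreshes, and that the intra band sweeps across macroblock columns. The sweep direction is a guess, and `wave_position` and `column_states` are hypothetical helpers.

```python
import math


def wave_position(frame_number, refresh_period, refresh_cnt):
    """Return the step within the current refresh wave, or None between waves."""
    pos = frame_number % refresh_period
    return pos if pos < refresh_cnt else None


def column_states(mb_columns, refresh_cnt, position):
    """Label each MB column: C (clean), I (intra this frame) or D (dirty)."""
    band = math.ceil(mb_columns / refresh_cnt)   # columns refreshed per frame
    start = position * band
    end = min(start + band, mb_columns)
    return "".join(
        "C" if col < start else "I" if col < end else "D"
        for col in range(mb_columns)
    )


for n in range(4):
    print("FRAME N+%d" % n, column_states(8, 4, wave_position(n, 30, 4)))
```

After intraRefreshCnt frames every column is clean, so the decoder recovers from earlier errors without a large IDR frame.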
REFERENCE PICTURE INVALIDATION
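- NV_ENC_CONFIG_H264::maxNumRefFrames
- NvEncInvalidateRefFrames API

[Diagram: the encoder keeps reference frames REF0, REF1, ... REF N and sends the stream over the network channel to the decoder; client feedback from the decoder returns to the encoder.]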
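The idea behind reference picture invalidation can be modelled with a small tracker. When the client reports a lost frame, that frame and everything encoded after it stop being trusted, and the next frame predicts from the newest reference that is still valid. This is an illustrative model under those assumptions, not the behaviour of NvEncInvalidateRefFrames itself. `ReferenceTracker` is a hypothetical class.

```python
class ReferenceTracker:
    """Keeps the frame ids that may still be used as references."""

    def __init__(self, max_num_ref_frames):
        self.max_num_ref_frames = max_num_ref_frames
        self.valid_refs = []            # oldest first, newest last

    def frame_encoded(self, frame_id):
        self.valid_refs.append(frame_id)
        if len(self.valid_refs) > self.max_num_ref_frames:
            self.valid_refs.pop(0)      # drop the oldest reference

    def invalidate_from(self, lost_frame_id):
        # Assumption: later frames predicted from the lost one are corrupt too.
        self.valid_refs = [f for f in self.valid_refs if f < lost_frame_id]

    def pick_reference(self):
        """Newest valid reference, or None when an IDR frame is needed."""
        return self.valid_refs[-1] if self.valid_refs else None


tracker = ReferenceTracker(max_num_ref_frames=4)
for frame_id in range(6):
    tracker.frame_encoded(frame_id)
tracker.invalidate_from(4)              # client feedback: frame 4 was lost
print(tracker.pick_reference())
```

A larger maxNumRefFrames leaves more old frames to fall back on before an IDR frame becomes unavoidable.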
QUESTIONS?