P2P video broadcast based on per-peer transcoding and its evaluation on PlanetLab

Naoki Shibata, †Keiichi Yasumoto, Masaaki Mori
Shiga University, †Nara Institute of Sci. and Tech.

Motivation
- Watching TV on various devices
  - Screen resolution of a mobile phone: from 96x64
  - Screen resolution of a plasma TV: up to 1920x1080
- A video delivery method for a wide variety of devices must cope with differences in:
  - Screen resolution
  - Computing power
  - Available bandwidth to the Internet
- Popularization of P2P video delivery
  - Joost
  - Zattoo

Overview of this presentation
- Improvement to our previously proposed video delivery method, MTcast
- Features of (previous) MTcast:
  - Video delivery method based on P2P video streaming
  - Serves requests for different video qualities
  - Scalable in the number of users
- New improvements:
  - Reduced backbone bandwidth for further scalability
  - Evaluation of performance on PlanetLab
    - Implementation in the Java language
    - Evaluation in the PlanetLab environment

Outline
- Background
- Related work
- MTcast overview
- Implementation overview
- Evaluation

Multiversion method
[Figure: the server stores pre-encoded versions such as 300k and 500k; a request for 200k is answered with the 300k version, a request for 400k with the 500k version.]
- Minimum delay
- No need for transcoding
- Low user satisfaction: # of served video qualities = # of versions
- High network load

G. Conklin, G. Greenbaum, K. Lillevold and A. Lippman: "Video Coding for Streaming Media Delivery on the Internet," IEEE Transactions on Circuits and Systems for Video Technology, 11(3), 2001.

Online transcoding method
[Figure: the server streams the 1000k original to proxies, which transcode it to 300k, 500k and 700k for the users.]
- Higher user satisfaction
- Additional cost for proxies
- # of qualities is restricted by the capacity of the proxies

S. Jacobs and A. Eleftheriadis: "Streaming Video Using Dynamic Rate Shaping and TCP Flow Control," Visual Communication and Image Representation Journal, 1998.

Layered multicast method
[Figure: the video is split into layers (base layer 200k, 2nd layer 300k, 3rd layer 500k); receivers subscribe to 200k, 200k+300k or 200k+300k+500k.]
- Low computation load on the server
- User satisfaction depends on the # of layers
- Limitation on the # of layers: decoding many layers causes high CPU usage

J. Liu, B. Li and Y.-Q. Zhang: "An End-to-End Adaptation Protocol for Layered Video Multicast Using Optimal Rate Allocation," IEEE Transactions on Multimedia, 2004.

Outline
- Background
- Related work
- MTcast overview
- Implementation overview
- Evaluation

Service provided by MTcast
- Network environment
  - Wide-area network across multiple domains
- Number of users
  - 500 to 100,000
- Kind of contents
  - Simultaneous broadcast of video, the same as TV broadcast
  - A new user can join and receive the video from the scene currently being broadcast
- Kind of requests by users
  - Each user can specify the bit rate of the video
  - We assume the resolution and frame rate are determined by the bit rate

Idea of MTcast
[Figure: the server holds the 1000k original; a node requesting 800k receives a stream transcoded down from 1000k, then itself transcodes to serve a node requesting 500k, which in turn transcodes to serve a node requesting 300k.]

Building transcode tree
- The transcode tree is the video delivery tree
[Figure: user nodes with bit rate requests (e.g., 2000k, 1920k, 1850k, 1830k, 1500k, 1200k, 800k, 300k) are sorted by requested bit rate.]

Building transcode tree
- Make groups of k user nodes from the top, where k is a constant decided for each video broadcast
- Each group is called a layer
- The minimum bit rate requested within a layer is the rate actually delivered to that layer (a sketch of this grouping follows below)
[Figure: sorted requests (2000k, 2000k, 1920k, 1850k, 1830k, 1800k, 1800k, 1780k, ...) are split into layers, with the delivered bit rate marked for each layer.]
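
To make the grouping concrete, here is a minimal Java sketch under our own naming (LayerBuilder and its methods are not taken from the prototype): the requests are sorted in descending order, cut into layers of k, and each layer is assigned the smallest rate requested inside it, so no node is sent a stream above the rate it asked for.

```java
import java.util.*;

// Minimal sketch of MTcast layer construction (class and method names are
// ours, not the prototype's).
public class LayerBuilder {

    // Sort the requested bit rates in descending order and cut the nodes
    // into consecutive groups ("layers") of k.
    public static List<int[]> buildLayers(int[] requestedKbps, int k) {
        int[] sorted = requestedKbps.clone();
        Arrays.sort(sorted);                       // ascending
        for (int i = 0, j = sorted.length - 1; i < j; i++, j--) {
            int tmp = sorted[i]; sorted[i] = sorted[j]; sorted[j] = tmp;
        }                                          // now descending
        List<int[]> layers = new ArrayList<>();
        for (int from = 0; from < sorted.length; from += k) {
            layers.add(Arrays.copyOfRange(sorted, from,
                                          Math.min(from + k, sorted.length)));
        }
        return layers;
    }

    // The rate actually delivered to a layer is the minimum request in it:
    // the last element, since each layer is sorted in descending order.
    public static int deliveredRate(int[] layer) {
        return layer[layer.length - 1];
    }
}
```

For example, with k = 3 and the requests 2000k, 2000k, 1920k, 1850k, 1830k, 1800k from the figure, the first layer would deliver 1920k and the second 1800k.
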
Building transcode tree
- Put each layer at the position of a node in a binary tree
- Fill the positions in depth-first-search order
- The result is a modified binary tree (see the sketch below)
[Figure: the video server at the root delivers 2000k; the lower-rate layers (1800k, 1500k, 1300k, 1100k, 900k, 700k) fill the tree below, each child layer receiving a rate no higher than its parent's.]
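
The slides do not spell out the exact placement rule, but one simple realization of "DFS order" is a preorder fill of a balanced binary tree with the layer rates in descending order. This keeps every parent's rate at least as high as its children's, so each layer only ever transcodes downward. A sketch with assumed names (the actual placement in MTcast may differ in detail):

```java
import java.util.*;

// Sketch of placing layers into the delivery tree (names are assumptions).
// layerRates must be sorted in descending order of delivered bit rate.
public class TreeBuilder {
    static class TreeNode {
        final int rateKbps;
        final List<TreeNode> children = new ArrayList<>();
        TreeNode(int rateKbps) { this.rateKbps = rateKbps; }
    }

    public static TreeNode build(int[] layerRates) {
        return buildSubtree(layerRates, 0, layerRates.length);
    }

    // Preorder fill: rates[lo] becomes this node, and the remaining rates
    // are split as evenly as possible between the two child subtrees.
    // Because the array is descending, a parent's rate always dominates
    // every rate in its subtrees.
    private static TreeNode buildSubtree(int[] rates, int lo, int hi) {
        TreeNode node = new TreeNode(rates[lo]);
        int rest = hi - lo - 1;
        if (rest > 0) {
            int leftSize = (rest + 1) / 2;
            node.children.add(buildSubtree(rates, lo + 1, lo + 1 + leftSize));
            if (rest > leftSize)
                node.children.add(buildSubtree(rates, lo + 1 + leftSize, hi));
        }
        return node;
    }
}
```
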
Advantages of building the tree in this manner
[Figure: the same delivery tree as on the previous slide, from 2000k at the root down to 700k.]
- Videos in many qualities can be served
  - Number of qualities = number of layers
- Each node is required to perform only one transcoding
- The video delivery delay is O(log(# of nodes))
- Tolerant to node failures

Recovery from node failure
- No increase in the number of transcodings on each node
- The degree of tolerance to node failure depends on:
  - The number of nodes in each layer: a layer with many nodes tolerates more failures
  - The available bandwidth on each node
- Buffered video data is played back during recovery
  - Users never notice node failures

Extension for real-world usage
- Each link is an overlay link
  - Traffic may go back and forth many times between ASes
  - Precious bandwidth between ASes is consumed
- Idea of the extension
  - Nodes in the same AS should be connected to each other with priority
  - The priority of a connection is decided according to hop count and available bandwidth (see the sketch below)
[Figure: nodes grouped by service providers A, B and C; connections stay inside each provider where possible.]
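
One natural way to realize this connection priority is to rank candidate upstream peers by physical hop count first and available bandwidth second, so that peers inside the same AS (few hops away) are tried before distant ones. The Java sketch below is purely illustrative; the class, its fields and how the metrics are measured are our assumptions:

```java
import java.util.*;

// Illustrative sketch only: rank candidate upstream nodes so that peers
// reachable in fewer physical hops come first, breaking ties by larger
// available bandwidth. Field and class names are assumptions.
public class CandidateRanker {
    static class Candidate {
        final String host;
        final int hopCount;          // e.g., measured with traceroute
        final long bandwidthKbps;    // estimated available bandwidth
        Candidate(String host, int hopCount, long bandwidthKbps) {
            this.host = host;
            this.hopCount = hopCount;
            this.bandwidthKbps = bandwidthKbps;
        }
    }

    public static void rank(List<Candidate> candidates) {
        candidates.sort(
            Comparator.comparingInt((Candidate c) -> c.hopCount)
                      .thenComparing(Comparator
                          .comparingLong((Candidate c) -> c.bandwidthKbps)
                          .reversed()));
    }
}
```
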
Outline
- Background
- Related work
- MTcast overview
- Implementation overview
- Evaluation

Design policy
- Usable on PlanetLab
- Usable in many similar projects
- Easily modifiable
- Good performance, if possible
- Why not use JMF (the Java Media Framework)?
  - It is not maintained
  - It introduces a huge buffering delay

Modular design
- We designed many classes for video delivery:
  - Transcoder
  - Transmitter and receiver to/from the network
  - Buffer
  - etc.
- Each node is a set of instances of these classes
- Each node instantiates these classes and connects the instances according to commands from a central server
- The behavior of each node can be flexibly changed by changing the commands from the central server (a sketch of this pipeline style follows below)
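
The pipeline style can be pictured as stages chained over streams. The following is only an illustrative sketch with an assumed Stage interface and a trivial pass-through stage; the real class names and the wiring protocol driven by the central server are not shown on the slides:

```java
import java.io.*;

// Illustrative sketch of the modular pipeline (interface and class names
// are our assumptions). Every stage reads from its upstream stream and
// writes to its downstream stream, so a node can be assembled by chaining
// instances in whatever order the central server commands.
interface Stage extends Runnable {
    void connect(InputStream upstream, OutputStream downstream);
}

// Example stage that copies data through unchanged; a transcoder stage
// would re-encode the buffer instead of forwarding it verbatim.
class PassThroughStage implements Stage {
    private InputStream in;
    private OutputStream out;

    public void connect(InputStream upstream, OutputStream downstream) {
        this.in = upstream;
        this.out = downstream;
    }

    public void run() {
        try (InputStream in = this.in; OutputStream out = this.out) {
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);   // a real stage transforms buf here
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

Stages can then be linked with java.io.PipedInputStream / PipedOutputStream pairs, one thread per stage, which matches the threading scheme described on the buffering-delay slide at the end of this deck.
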
Outline
- Background
- Related work
- MTcast overview
- Implementation overview
- Evaluation

Results of evaluation published in [9]
- Computation load of transcoding
  - Measured the computation load when video playback and transcoding are executed simultaneously
  - Measured on a desktop PC, a notebook PC and a PDA
  - Result: all processing can be performed in real time
- Computation load of building the transcode tree
  - 1.5 seconds of computation on a Pentium 4 at 2.4 GHz
  - Time complexity: O(n log n)
  - Network load: practical if the node computing the transcode tree has enough bandwidth
- User satisfaction
  - The satisfaction degree is defined according to [3]
  - Made a network with 6000 nodes using Inet 3.0
  - Satisfaction with our method was at least 6% higher than with the layered multicast method
  - Satisfaction improves as the number of nodes increases

Video quality degradation by transcoding
- Video quality may degrade after multiple transcodings
- We measured the PSNR when video is transcoded with our method
- We compared:
  - A video transcoded only once
  - A video transcoded multiple times

Effectiveness of bandwidth reduction (1/2)
- Compared the physical hop count in the transcode tree
  - By our method
  - By randomly selecting the nodes to connect
- Comparison by simulation
  - Number of user nodes: 1000
  - 333 nodes have bandwidth between 100 and 500 kbps
  - 333 nodes have bandwidth between 2 and 5 Mbps
  - 334 nodes have bandwidth between 10 and 20 Mbps
- Result of the simulation
  - Hop count with the random method: 4088
  - Hop count with our method: 3121
  - About a 24% reduction in hop count with our method (967 of 4088 hops)

Effectiveness of bandwidth reduction (2/2)
- Compared the physical hop count in the transcode tree
  - By our method
  - By randomly selecting the nodes to connect
- Comparison on PlanetLab
  - 20 user nodes in 7 countries
- Result
  - Random selection: hop counts of 343, 361 and 335
  - Our method: hop counts of 314, 280 and 277
  - A 16% reduction in hop count with our method

Time to start up the system on PlanetLab
- Measured the time from system start until the following events:
  - All nodes complete establishing their connections
  - All nodes receive the first byte of data
- Measurement on PlanetLab
  - 20 user nodes in 7 countries
  - Nodes are cascaded, not connected in a tree
- Result of evaluation (shown as a chart in the original slides)
- Observation
  - Most of the time is spent establishing connections
  - All operations are performed in parallel, so the observed time is that of the slowest node to establish its connection

Time to recover from node failure on PlanetLab
- Measured the following times after a node failure:
  - Time to establish a connection to a new node
  - Time to start receiving data from the new node
- Measurement on PlanetLab
  - 20 user nodes in 7 countries
  - Nodes are cascaded, not connected in a tree
- Result of evaluation (shown as a chart in the original slides)
- Observation
  - These are practical values
  - During the recovery time, buffered data is played back, so users never notice node failures

Conclusion
- Improved MTcast
  - Bandwidth usage between ASes is reduced
- Made a prototype system in Java
- Evaluated it on PlanetLab
- Ongoing work includes:
  - Serving requests with many parameters, including picture size, frame rate and audio channels
  - Further reduction of the bandwidth used between nodes

Thank you!
Any questions?
Techniques to reduce buffering delay
- Each of the instances described above corresponds to one thread
- Each thread repeatedly reads the output data of the upstream instance (as if reading a file), processes it, and writes it out
- In this implementation, the write method blocks until the downstream instance has read all of the data (a sketch follows below)
  - Without blocking, data would not be handed over until the thread is switched out
  - In the worst case, a buffering delay of (thread switching time x number of instances) would occur
- This design makes effective use of multi-core processors
- It also makes each class easy to implement
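
This blocking hand-off can be expressed directly with java.util.concurrent.SynchronousQueue, whose put() blocks until another thread take()s the element. The sketch below uses an assumed class name and is not the prototype's code:

```java
import java.util.concurrent.SynchronousQueue;

// Sketch of the blocking hand-off between two pipeline threads (class and
// method names are assumptions). SynchronousQueue has no capacity, so
// put() blocks until the downstream thread take()s the chunk, mirroring
// the write-blocks-until-read behavior described above.
public class BlockingPipe {
    private final SynchronousQueue<byte[]> channel = new SynchronousQueue<>();

    // Called by the upstream thread; blocks until the downstream reads.
    public void write(byte[] chunk) throws InterruptedException {
        channel.put(chunk);
    }

    // Called by the downstream thread; blocks until the upstream writes.
    public byte[] read() throws InterruptedException {
        return channel.take();
    }
}
```

Because the queue holds no buffer of its own, each chunk is handed over as soon as both threads are ready, so the end-to-end delay does not grow with the number of chained instances.
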
Techniques for evaluating on many nodes
- The state of PlanetLab nodes changes from moment to moment
  - Nodes suddenly go offline and then come back
- We automated checking which files are installed on each node and performing the installation
- For this purpose we prepared several scripts:
  - A script that copies a script to each node and executes it
    - This step is performed in parallel, one process per node
  - A script that performs the installation work on each node
- Since the work is done in parallel, it is fast (a sketch follows below)
  - Installing the Sun JRE on a few dozen nodes takes less than 30 minutes
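
The slides describe shell scripts, but the same parallel fan-out is easy to picture in Java. The host names, file names and the use of scp/ssh below are assumptions for illustration only, not the actual scripts:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the parallel-install idea: run the per-node
// install script on every PlanetLab node concurrently instead of one by
// one. Hosts and script names are placeholders.
public class ParallelInstall {
    public static void main(String[] args) throws Exception {
        List<String> hosts = List.of("planetlab1.example.org",
                                     "planetlab2.example.org");
        ExecutorService pool = Executors.newFixedThreadPool(hosts.size());
        for (String host : hosts) {
            pool.submit(() -> {
                try {
                    // copy the install script, then run it remotely
                    new ProcessBuilder("scp", "install.sh", host + ":")
                            .inheritIO().start().waitFor();
                    new ProcessBuilder("ssh", host, "sh", "install.sh")
                            .inheritIO().start().waitFor();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
    }
}
```
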