Improving Transport Design for WARP SDR Deployments
Krishna C. Garikipati, Kang G. Shin
SRIF '14
EECS Real Time Computing Laboratory

Outline
➢ Introduction
➢ Problem of Latency
➢ WARP Transport
➢ Proposed Design
➢ Evaluation
➢ Extensions

Software Defined Radios
Flexible radios that implement most of the hardware functions in software
Indispensable for wireless research
[Image: NI FlexRIO Software Defined Radio Bundle]

SDR Elements
Processing
• Host (GPP), FPGA, DSP
• Signal processing library
Radio
• Antennas
• ADC/DACs, transceivers
Other
• Clocking, transport, memory, user I/O, debug, …
[Image credits: GNURadio, WARP]

SDR Architecture
GPP-based
• Separate dedicated GPP for centralized processing
• Split functionality
• Requires external transport of radio samples
• Scalable
• E.g. USRP and GNURadio, WARP and WARPLab, SORA
Non-GPP based
• On-board dedicated or soft processor (standalone)
• Integrated
• Transport occurs through internal bus
• Not scalable
• E.g. USRP embedded series, WARP 802.11 reference designs

SDR Architecture
[Figure: transmit and receive chains. Transmitter: TX baseband processing on the GPP/host machine → IQ samples → write transport frames over PCIe/Ethernet/USB → FPGA board (ADC/DACs, converters) → baseband signal → up-conversion → wireless channel. Receiver: down-conversion → baseband signal → FPGA board (ADC/DACs, converters) → read transport frames over PCIe/Ethernet/USB → IQ samples → RX baseband processing on the GPP/host machine.]

SDR Example: WARP
• A popular SDR platform for wireless research and prototyping
• Self-contained with custom hardware and software designs
http://warpproject.org/
WARP hardware
• Virtex-6 FPGA with 2 dual-band RF transceivers with maximum 40 MHz bandwidth
• 12-bit ADC/DACs, Gigabit Ethernet ports, shared clocking, extensions
[Image: WARP v3 node]

SDR Example: WARPLab
• Flexible MATLAB-based framework for developing wireless applications with a large array of WARP nodes
• Supports rapid implementation of the PHY layer
• Utilizes WARPLab FPGA reference hardware designs, and reference code (C and MATLAB) for software design
[Figure: WARPLab architecture. Host processor labels: MATLAB, WARPLab reference M-code, UDP transport, Ethernet. WARP hardware labels: FPGA, MicroBlaze, WARPLab reference C-code, Ethernet driver, bus, IP cores (buffers, AGC, ...).]

SDR Example: WARPLab library
• MATLAB commands for configuration of WARP nodes
• Library modules: each is paired with a hardware and software design that runs on the node
• Transport module is responsible for message exchanges between host and WARP node over Ethernet
[Image credit: WARP]

Transport Latency
Definition
• For a given number of samples, the delay in reading (writing) radio samples from (to) host memory or userspace
[Figure: timeline of read function and timeline of write function across userspace, kernel, and target HW. Labels: request(), readIQ(), recvfrom(), writeIQ(), sendto(), ack, DMA transfer + serialization, Ethernet transfer, fixed-point conversion, floating-point conversion.]

Problem: Large WARP Deployments
MIMO technology
• Large number of antennas (SDRs)
• Centralized processing in CoMP, Massive MIMO, etc.
Other applications
• Indoor localization of wireless signals using an antenna array
[Images: Argos (64 antennas), 2012; antenna array]
WARPLab's transport latency increases linearly with the number of nodes, which makes it unsuitable for large deployments

Problem: Strict deadlines
Processing time in SDRs
• GPP processing time (RX and TX) + transport latency (receive and send)
Protocol requirements
• WiFi processing deadline < 16 μs
• LTE turnaround deadline ~ 3 ms
• Mobile channel measurements < 10 ms
Holy grail: meeting protocol deadlines in large SDR setups

Objective
✓ Improve transport performance of large SDR deployments using the WARPLab reference design
✓ Explore the implementation of accurate channel measurements and practical wireless systems such as LTE
Transport only: SDR acceleration on the GPP is a whole different story!
WARP Transport
Buffers
• 16-bit I and 16-bit Q samples
• Single buffer per RF chain
• Max buffer size = 32K samples (128 KB)
UDP protocol
• Fixed-size packetized (non-streaming)
• UDP sockets for maximum speed
• Sequence numbers, checksums, acks for reliable transfer
• Provision for timeouts
[Figure: baseband sampling → buffer → DMA transfer → Ethernet frames with transport, IP, and Ethernet headers]

WARP Transport
Ethernet
• Link speed = 1 Gbps (2x Ethernet ports)
• Xilnet library with DMA
• Support for jumbo frames (9 KB)
Transport code
• WARPLab reference M-code
• MEX implementation of UDP transport
• Single-thread instantiation
• Sequential read/write of buffers

WARP Transport: Testbed
• 16x WARPv3 boards
• HP ProCurve 6600 switch (48x 1GbE, 4x 10GbE)
• 32-core Intel(R) Xeon(R) E5-2660 CPU (HT enabled), 128 GB RAM
• Dual-port 10GbE card
• WARPLab 7.4, MATLAB 2012b
• Ubuntu 12.04 LTS

WARP Transport: Single node benchmarks (32K samples)
• Max. theoretical transfer rate on 1 Gbps link = 31.2 Msps

Function  Packet size  Line throughput  Line throughput  Calls      Transfer rate
          (bytes)      (Kpps)           (Mbps)           (per sec)  (Msps)
Read      1508         30.83            373.2            193.3      6.3
Read      9004         13.57            972.8            314.2      10.3
Write     1508         9.8              118.4            71.1       2.3
Write     9004         13.67            979.9            336.7      11.0

Transport is the processing bottleneck in WARPLab!
Callouts on the single node benchmark table: line rate saturation; less than max rate due to overheads.

WARP Transport: Multiple node benchmarks
• Total read latency averaged over 10^3 runs (negligible variance)
[Plot: total read latency (ms, 0 to 80) vs. number of WARP nodes (2 to 16), showing a linear increase. Series: 1464B, 32K; 1464B, 16K; 8960B, 32K; 8960B, 16K]

WARP Transport: Multiple node benchmarks
• Total write latency averaged over 10^3 runs
[Plot: total write latency (ms, 0 to 250) vs. number of WARP nodes (2 to 16), showing a linear increase. Series: 1464B, 32K; 1464B, 16K; 8960B, 32K; 8960B, 16K]

Proposed Design: Code Refactoring
• Standard C/C++ instead of MEX
• Standalone WARP driver
• Improved interface for further modifications
• Extensible to other WARPLab modules

void nodes_initialize(int* node_sock, int numNodes);
void readIQ(double complex* samples, int start_sample, int num_samples, int node_sock, int node_id, int buffer_id, int host_id);
void writeIQ(double complex* samples, int start_sample, int num_samples, int node_sock, int node_id, int buffer_id, int host_id);
void sendTrigger();
void nodes_disable(int* node_sock, int numNodes);
(Transport functions)

Proposed Design: Transport Parallelism
• Read/write calls are independent
• Multi-threaded implementation
• Utilize multi-core processor
• OpenMP API C/C++ extensions
[Figure: code organization. Labels: Applications (measure latency, RX processing, TX processing); multiWrite(), multiRead(); writeIQ(), readIQ(); warp_functions.c; sendTrigger(); warp_transport.c; nodesInit(), nodesDis(); WARPLab UDP Transport; …]

Proposed Design: Network Design
• Support combined transfer rate of multiple nodes
• High-capacity link at the host: 10 Gbps
• Switch is 1GbE/10GbE compliant
• Suitable for up to 10 WARP nodes (each node at line rate)
[Figure: host processor, 10 Gbps link, 1GbE/10GbE switch, 1 Gbps links to WARP nodes]

Proposed Design: Beyond 10 nodes
• Additional 10 Gbps link at the host
• Reduce queuing (congestion) delay
• Static routing between the two links for load balancing (host has two separate IP addresses)
[Figure: host processor, 10 Gbps link, 1GbE/10GbE switch, 1 Gbps links to WARP nodes]

Proposed Design
CWARP: https://github.com/gkchai/cwarp

Evaluation: Comparison with WARPLab
[Plot: total duration (ms, 0 to 50) vs. number of WARP nodes (2 to 16). Series: M write, M read, C write, C read. Annotation: latency reduction]

Evaluation: 32K samples
[Plot: total duration (ms, 1 to 3) vs. number of WARP nodes (2 to 16). Series: write, 10Gbps; read, 10Gbps; write, 2x10Gbps; read, 2x10Gbps. Annotations: queuing delay; reduction with additional 10 Gbps link]

Evaluation: 16K samples
[Plot: total duration (ms, 0.8 to 1.4) vs. number of WARP nodes (2 to 16). Series: write, 10Gbps; read, 10Gbps; write, 2x10Gbps; read, 2x10Gbps. Annotation: low-bandwidth LTE is possible?]

Signal Processing Libraries
Advantages of CWARP
• Fast!
• Readily built as shared libraries of existing SDR frameworks
• Can be compiled to be processor specific (thread libraries)
• Cross-SDR platform compatible

Research
Adaptive transport control
• Variable sample rate, quantization
• Effect of baseband sample compression (lossy or lossless)
• Study of network load
Mobility measurements
• Fine-grained evaluation of the wireless channel in large MIMO systems
• Moving away from trace-based evaluation of PHY protocols
[Figure: transport control elements: compression, prioritization, rate control]
[Figure: channel measurement timeline. Labels: Host PC, PC trigger, TX_NODES, WriteIQ, RX_NODES, Tx/Rx, ReadIQ; durations 700 μs, 20 μs, 100 μs, ≈650 μs; channel measurement period ≈ 0.8 ms]
K. C. Garikipati, K. G. Shin, “Measurement-Based Transmission Schemes for Network MIMO”, ACM MobiHoc 2014

Thank you