Capabilities, Costs ∝ Performance/Watt
[Figure: efficiency chart comparing FPGAs and ASICs ($). Source: Bob Brodersen, Berkeley Wireless group]

[Figure: server configurations, each built from a Xeon CPU and a NIC: with Search Acc. (FPGA); with Search Acc. (ASIC); with Search Acc. v2 (FPGA); with Math Accelerator. Annotations: "Wasted Power, Holds back SW" and "Wasted Power, One more thing that can break"]

http://www.globalfoundationservices.com/posts/2014/january/27/microsoft-contributes-cloud-server-specification-to-open-compute-project.aspx

Data Center Server (1U, ½ width)
• Two 8-core Xeon 2.1 GHz CPUs
• 64 GB DRAM
• 4 HDDs, 2 SSDs
• No cable attachments to server
FPGA board: Stratix V, 8 GB DDR3, PCIe Gen3 x8

[Figure: groups of FPGAs hosting different services: Web Search Pipeline, Math Acceleration Service, Physics Engine, Comp. Vision Service]

[Figure: FPGA block diagram, divided into Shell and Role. Shell: DDR3 Core 0 and DDR3 Core 1 (each attached to a 4 GB DDR3-1333 ECC SO-DIMM), Config Flash (RSU) with 256 Mb QSPI Config Flash, JTAG, LEDs, Temp Sensors, I2C, xcvr reconfig, SEU, x8 PCIe Core (to Host CPU), DMA Engine, Inter-FPGA Router, and North, South, East, and West SLIII links. Role: Application]

[Figure: Query, then Selection as a Service (SaaS) instances with their IFMs, then Selected Documents, then Ranking as a Service (RaaS) instances with their IFMs, then 10 blue links]

Selection-as-a-Service (SaaS)
- Find all docs that contain query terms
- Filter and select candidate documents for ranking

Ranking-as-a-Service (RaaS)
- Compute scores for how relevant each selected document is for the search query
- Sort the scores and return the results

{Query, Document}
Query: “FPGA Configuration”
NumberOfOccurrences_0 = 7
NumberOfOccurrences_1 = 4
NumberOfTuples_0_1 = 1
~4K Dynamic Features
~2K Synthetic Features
L2 Score

Free Form Expression (FFE)
FFE #1 = (2 * NumberOfOccurrences_0 + NumberOfOccurrences_1) / (2 * NumberOfTuples_0_1)
FFE #1 = 9

[Figure: feature extraction. Compressed Document arrives over PCIe, then Stream Preprocessing FSM, then Feature Gathering Network]
• 196 feature families
• 54 state machines
• 2.6K dynamic features extracted in less than 4 µs (~600 µs in SW)

[Figure: FFE processor. Control/Data Tokens, Distribution latches, Cluster 0 with Core 0 through Core 5, FST, Complex, Output]

8-Stage Pipeline
[Figure: a Document passes through FPGA 0 to FPGA 7. Stage labels: FE: Feature Extraction; FFE: Free-Form Expressions; Compute Score; Route to Head. A Document Scoring Request enters the pipeline and Return Score leaves it]

[Figure: RaaS Servers. Eight servers send requests to two 8-Stage Pipelines, each mapped onto FPGA 0 through FPGA 7]

1,632 Servers with FPGAs
Running Bing Page Ranking Service (~30,000 lines of C++)

But when will an FPGA handle my Bing Search?
• Bing is going into production with FPGAs

Top Row: Eric Peterson, Scott Hauck, Aaron Smith, Jan Gray, Adrian M. Caulfield, Phillip Yi Xiao, Michael Haselman, Doug Burger
Bottom Row: Joo-Young Kim, Stephen Heil, Derek Chiou, Sitaram Lanka, Andrew Putnam, Eric S. Chung
Not Pictured: Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth Gopal, Amir Hormati, James Larus, Simon Pope, Jason Thong

Huge thanks to our partners at
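The FFE #1 example from the Free Form Expression slide can be checked with a short sketch. It assumes the two parenthesized terms are the numerator and the denominator of a fraction, which is the reading that reproduces the stated result of 9 from the feature values given for the query. The `Features` type, the `ffe_1` function, and the zero-denominator check are hypothetical names and choices made for illustration; they are not taken from the Bing ranking code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Features:
    """Dynamic features for one {Query, Document} pair (names follow the slide)."""
    number_of_occurrences_0: int
    number_of_occurrences_1: int
    number_of_tuples_0_1: int


def ffe_1(f: Features) -> float:
    """Evaluate FFE #1, read as (2*Occ_0 + Occ_1) / (2*Tuples_0_1)."""
    denominator = 2 * f.number_of_tuples_0_1
    if denominator == 0:
        # Assumption: the slide does not say how a zero tuple count is handled.
        raise ValueError("NumberOfTuples_0_1 must be non-zero")
    return (2 * f.number_of_occurrences_0 + f.number_of_occurrences_1) / denominator


# Feature values from the slide for the query "FPGA Configuration".
features = Features(
    number_of_occurrences_0=7,
    number_of_occurrences_1=4,
    number_of_tuples_0_1=1,
)
print(ffe_1(features))  # 9.0
```

A synthetic feature like this is one of the roughly 2K the slides mention, each computed from the dynamic features before the final score is produced.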