Why render hidden objects? Cull them with a software depth-buffer rasterizer FTW!
Charumathi Chandrasekaran
Graphics Software Engineer
GDC 2013 www.intel.com/software/gdc
Be Bold. Define the Future of Software.

Agenda
• Algorithm overview
• Depth Buffer rasterization
• Depth testing
• Optimizations
• Performance results

Performance
• Occluder size threshold = 1.5
• Occludee size threshold = 0.01

| | No optimization | SSE Multi-threading + Frustum Culling | Multi-threading + Frustum Culling + Depth test Culling |
|---|---|---|---|
| Frame rate (fps) | 7.51 | 19.56 | 70.11 |
| Frame time (ms) | 133.15 | 51.12 | 14.26 |
| # of draw calls | 23279 | 7360 | 1831 |
| Objects rendered | 20802 | 6494 | 1557 |
| Occluders rasterized | - | - | 9 |
| Occludees Culled | - | - | 25468 |
| Depth rasterizer (ms) | - | - | 0.7 |
| Depth test (ms) | - | - | 0.67 |
| Total Cull Time (ms) | - | - | 1.37 |
| Gain | - | 2X+ | 9X+ |

Sample Screenshot
[Figure: sample screenshot]

Algorithm Overview
• Scene objects: occluders and occludees
• Occluders: Transform vertices to screen space → Bin triangles → Rasterize binned triangles to create depth buffer
• Occludees: Transform 8 vertices of AABBox to screen space → Rasterize AABBox triangles and depth test → Occluded?
 – yes: Do not render
 – no: Render

Occluders
[Figure: occluders]

Occludees
[Figure: occludees]

Software Depth Buffer Rasterization
• Transform the occluder vertices to screen space on the CPU
• Bin the triangles to the frame buffer tiles

Pixel Traversal
• Rasterize the pixels within each tile
• Use bounding box traversal
• Rasterize 2x2 blocks for SSE

Line Equation
$$f(x, y) = Ax + By + C = 0$$
$$f(x, y) = (y_0 - y_1)x + (x_1 - x_0)y + x_0 y_1 - x_1 y_0 = 0 \qquad (1)$$
$$A = y_0 - y_1, \qquad B = x_1 - x_0, \qquad C = x_0 y_1 - x_1 y_0$$
[Figure: line through (x0, y0) and (x1, y1) with normal (A, B) and a point p(x, y); one side of the line is +ve, the other is -ve]
• f(x, y) > 0: In
• f(x, y) = 0: on the line
• f(x, y) < 0: Out

Is pixel inside triangle?
• Triangle edge equations:
$$\begin{aligned}
\text{line}(P_0 P_1):&\quad A_0 x + B_0 y + C_0 = 0 \\
\text{line}(P_1 P_2):&\quad A_1 x + B_1 y + C_1 = 0 \\
\text{line}(P_2 P_0):&\quad A_2 x + B_2 y + C_2 = 0
\end{aligned} \qquad (2)$$
[Figure: triangle P0 P1 P2 with a point p(x, y) on the +ve side of all three edges]
• For ‘p’, when all 3 edge equations are >= 0, ‘p’ lies inside the triangle

Incremental pixel evaluation
$$f(x, y) = Ax + By + C = 0$$
$$f(x, y) = (y_0 - y_1)x + (x_1 - x_0)y + x_0 y_1 - x_1 y_0 = 0$$
• Compute A, B and C once
• Compute f(x, y) once
$$f(x+1, y) = A(x+1) + By + C = Ax + By + C + A \qquad (3)$$
$$f(x+1, y) = f(x, y) + A \qquad (4)$$
$$f(x, y+1) = f(x, y) + B \qquad (5)$$
[Figure: 2x2 pixel block (x, y), (x+1, y), (x, y+1), (x+1, y+1)]

Triangle Area
$$\text{Area} = \frac{1}{2} \begin{vmatrix} x_1 - x_0 & x_2 - x_1 \\ y_1 - y_0 & y_2 - y_1 \end{vmatrix} \qquad (4)$$
$$\text{Area} = \frac{1}{2} \left[ (y_0 - y_1)x_2 + (x_1 - x_0)y_2 + x_0 y_1 - x_1 y_0 \right] \qquad (5)$$
$$f(x, y) = (y_0 - y_1)x + (x_1 - x_0)y + x_0 y_1 - x_1 y_0 = 0 \qquad (6)$$
$$\text{Area} = \frac{1}{2} f(x, y) \text{ at } x = x_2,\ y = y_2$$
[Figure: triangle with vertices P0 (x0, y0), P1 (x1, y1), P2 (x2, y2)]

Cull back facing triangles
• Consider triangles T1 and T2:
$$\text{Area}(T_1) = \frac{1}{2} f(x, y) \text{ at } x = x_2,\ y = y_2$$
$$\text{Area}(T_2) = \frac{1}{2} f(x, y) \text{ at } x = x_2',\ y = y_2'$$
[Figure: triangles T1 (P0, P1, P2) and T2 (P0, P1, P2’) sharing the edge P0 P1]
• P2’ is outside the triangle.
• f(x, y) will evaluate to a negative value.
• Cull triangles with area <= 0

Depth computation using Barycentric coordinates
$$(\lambda_0, \lambda_1, \lambda_2) = \left( \frac{A_0}{A}, \frac{A_1}{A}, \frac{A_2}{A} \right) \qquad (7)$$
$$A = A_0 + A_1 + A_2, \qquad 0 \le (\lambda_0, \lambda_1, \lambda_2) \le 1$$
Here A is the triangle area and A_i is the area of the sub-triangle formed by p and the edge opposite P_i.
• Interpolate the depth at p from the depths at the triangle vertices
$$z_p = \lambda_0 z_0 + \lambda_1 z_1 + \lambda_2 z_2 \qquad (8)$$
[Figure: triangle P0 (x0, y0), P1 (x1, y1), P2 (x2, y2) with vertex depths z0, z1, z2; the point p splits it into sub-areas A0, A1, A2]

CPU Rasterized Depth Buffer
[Figure: CPU rasterized depth buffer]

Axis Aligned Bounding Box
• Use object space axis aligned bounding box (AABB)
• All occluders are treated as occludees
• Transform and rasterize the AABB triangles: of the 12 triangles, at most 6 are front facing

Depth Testing
• Depth test the rasterized AABB triangles against the CPU generated depth buffer.
• Assumption:
 – If the AABB is visible, the object inside may also be visible.
• AABB depth testing is conservative.
 – May have false positives (objects reported visible that are actually hidden)
• A clipper stage is not implemented.
• Objects clipped by the near clip plane are marked visible.

Find near plane clipped objects
• Use the homogeneous coordinate ‘w’ of the AABB vertices
• For objects in front of the camera, w > 1.0
• If any occludee BB vertex has w < 1.0
 – object is clipped by near clip plane
[Figure: view frustum with near plane and far plane; Wnear = 1.0]

Optimizations
• Binning
• Frustum Culling
• Vectorization with SSE
• Multithreading
• Pipelining
• Occluder Size Threshold
• Occludee Size Threshold

Occluder / Occludee size threshold
• Avoid processing occluder / occludee if their screen space size is too small
$$\tan\left(\tfrac{1}{2}\mathrm{FOV}\right) = \frac{r_1}{w_1}, \qquad h = r \qquad (9)$$
$$\text{if } \frac{r}{w \tan\left(\tfrac{1}{2}\mathrm{FOV}\right)} < \text{threshold value, then small} = \text{true} \qquad (10)$$
[Figure: view frustum with angle FOV and half-angle 1/2 FOV; objects of radius r1 and r2 at depths w1 and w2; label h = r]

Scene 1
[Figure: scene 1 screenshot]

Performance scene 1
• Occluder size threshold = 1.5
• Occludee size threshold = 0.01

| | No optimization | SSE Multi-threading + Frustum Culling | Multi-threading + Frustum Culling + Depth test Culling |
|---|---|---|---|
| Frame rate (fps) | 7.51 | 19.56 | 70.11 |
| Frame time (ms) | 133.15 | 51.12 | 14.26 |
| # of draw calls | 23279 | 7360 | 1831 |
| Objects rendered | 20802 | 6494 | 1557 |
| Occluders rasterized | - | - | 9 |
| Occludees Culled | - | - | 25468 |
| Depth rasterizer (ms) | - | - | 0.71 |
| Depth test (ms) | - | - | 0.67 |
| Total Cull Time (ms) | - | - | 1.38 |
| Gain | - | 2X+ | 9X+ |

Scene 2
[Figure: scene 2 screenshot]

Performance scene 2
• Occluder size threshold = 1.5
• Occludee size threshold = 0.01

| | No optimization | Multi-threading + Frustum Culling | Multi-threading + Frustum Culling + Depth test Culling |
|---|---|---|---|
| Frame rate (fps) | 9.07 | 9.22 | 11.67 |
| Frame time (ms) | 110.25 | 108.45 | 85.68 |
| # of draw calls | 19073 | 18893 | 14443 |
| Objects rendered | 16698 | 16518 | 12651 |
| Occluders rasterized | - | - | 11 |
| Occludees Culled | - | - | 14374 |
| Depth rasterizer (ms) | - | - | 1.05 |
| Depth test (ms) | - | - | 0.94 |
| Total Cull Time (ms) | - | - | 1.99 |
| Gain | - | 1.01X+ | 1.28X+ |

Future Work
• Compare against a non-binned rasterizer
• Experiment with smaller depth
buffer
• Vary number of tiles in the binned version
• Use AVX2 to vectorize rasterizer
• Implement DrawIndexedInstanced() call

Complete Solution
• Automatic occluder simplification
• Fixed memory and bounded CPU time
• Static and dynamic occluders and occludees
• Shadow caster culling
• Streaming

Acknowledgements
• Engineers:
 – Doug Mcnabb - Team Tech Lead
 – David Houlton - Sr Graphics Software Engineer
• Artist:
 – Glen Lewis
 – Project Offset artists
• Fabian Giesen
 – http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/

Questions ?
• Sample download
 – http://software.intel.com/gamecode
 – www.intel.com/software/gdc
• Charu Chandrasekaran
 – [email protected]
• Doug Mcnabb
 – [email protected]

Legal Disclaimers
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY RELATING TO SALE AND/OR USE OF INTEL PRODUCTS, INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT, OR OTHER INTELLECTUAL PROPERTY RIGHT.

Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.

Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights.
Intel may make changes to specifications, product descriptions, and plans at any time, without notice.

The Intel processor and/or chipset products referenced in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

All dates provided are subject to change without notice. All dates specified are target dates, are provided for planning purposes only and are subject to change.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

* Other names and brands may be claimed as the property of others.

Copyright © 2012, Intel Corporation. All rights reserved.
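
Sketch: scalar depth-buffer rasterization
The "Line Equation", "Is pixel inside triangle?", "Incremental pixel evaluation", "Cull back facing triangles" and "Depth computation using Barycentric coordinates" slides can be combined into one small routine. The following is a minimal scalar sketch in Python, not the sample's code: it leaves out binning, SSE 2x2 blocks and multithreading. The function names, the list-of-rows depth buffer, sampling at integer pixel coordinates, and the convention that a smaller z is nearer are all assumptions made for illustration.

```python
import math


def edge_coeffs(p0, p1):
    """Return (A, B, C) of f(x, y) = A*x + B*y + C for the edge p0 -> p1."""
    (x0, y0), (x1, y1) = p0, p1
    return (y0 - y1, x1 - x0, x0 * y1 - x1 * y0)


def rasterize_depth(tri, depth, width, height):
    """Rasterize one screen-space triangle [(x, y, z), ...] into `depth`.

    `depth` is a list of rows (hypothetical layout); smaller z is assumed
    to be nearer. Returns the number of depth values written.
    """
    (x0, y0, z0), (x1, y1, z1), (x2, y2, z2) = tri
    edges = (
        edge_coeffs((x0, y0), (x1, y1)),  # edge P0P1, opposite P2
        edge_coeffs((x1, y1), (x2, y2)),  # edge P1P2, opposite P0
        edge_coeffs((x2, y2), (x0, y0)),  # edge P2P0, opposite P1
    )

    # Twice the signed area is the P0P1 edge function evaluated at P2.
    a0, b0, c0 = edges[0]
    area2 = a0 * x2 + b0 * y2 + c0
    if area2 <= 0:
        return 0  # back-facing or degenerate triangle: cull

    # Bounding box traversal, clamped to the buffer.
    min_x = max(math.floor(min(x0, x1, x2)), 0)
    max_x = min(math.ceil(max(x0, x1, x2)), width - 1)
    min_y = max(math.floor(min(y0, y1, y2)), 0)
    max_y = min(math.ceil(max(y0, y1, y2)), height - 1)

    # Evaluate each edge function once, then step incrementally.
    row = [a * min_x + b * min_y + c for a, b, c in edges]
    written = 0
    for y in range(min_y, max_y + 1):
        f0, f1, f2 = row
        for x in range(min_x, max_x + 1):
            if f0 >= 0 and f1 >= 0 and f2 >= 0:  # inside all three edges
                # Barycentric weights are the edge values divided by area2;
                # the weight of P0 comes from the edge opposite P0 (P1P2).
                z = (f1 * z0 + f2 * z1 + f0 * z2) / area2
                if z < depth[y][x]:
                    depth[y][x] = z
                    written += 1
            # f(x + 1, y) = f(x, y) + A
            f0 += edges[0][0]
            f1 += edges[1][0]
            f2 += edges[2][0]
        # f(x, y + 1) = f(x, y) + B
        row = [f + e[1] for f, e in zip(row, edges)]
    return written
```

A caller would fill the buffer with a far value (for example 1.0), rasterize every occluder triangle, and then hand the buffer to the depth test. Swapping two vertices of a front-facing triangle flips the sign of `area2`, so the same triangle is then culled.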
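
Sketch: occludee checks
The "Find near plane clipped objects", "Occluder / Occludee size threshold" and "Depth Testing" slides describe three per-object checks. The sketch below illustrates them in Python under stated assumptions. All function names and parameters are hypothetical, and a smaller z is assumed to be nearer. The depth test here is a simplification: it checks a screen-space rectangle at a single nearest depth instead of rasterizing the AABB triangles as the slides describe, which is more conservative but shorter to show.

```python
import math


def is_too_small(radius, w, fov, threshold):
    """Size test in the spirit of equation (10): r / (w * tan(FOV / 2)) < threshold.

    Assumes w > 0, i.e. the object has already passed the near-plane check.
    """
    return radius / (w * math.tan(0.5 * fov)) < threshold


def clipped_by_near_plane(corner_ws, w_near=1.0):
    """True if any of the 8 AABB corners has w below the near-plane value."""
    return any(w < w_near for w in corner_ws)


def rect_is_occluded(depth, x_min, y_min, x_max, y_max, z_nearest):
    """Conservative rectangle test against a list-of-rows depth buffer.

    The rectangle counts as occluded only if, at every covered pixel, the
    buffer holds a strictly nearer depth than the rectangle's nearest depth.
    """
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            if z_nearest <= depth[y][x]:
                return False  # one visible pixel is enough: stop early
    return True


def should_render(corner_ws, rect, z_nearest, depth):
    """Decide visibility for one occludee.

    Objects clipped by the near plane are marked visible without a depth
    test, because no clipper stage is implemented.
    """
    if clipped_by_near_plane(corner_ws):
        return True
    return not rect_is_occluded(depth, *rect, z_nearest)
```

Because the test returns as soon as one pixel passes, visible objects are usually cheap to classify; only fully hidden objects pay for a scan of their whole footprint.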