WHITE PAPER
Intel® PTAS-iEN
Data Center Efficiency and Optimization
Data Center Management

Monitoring and Managing the Modern Data Center

Intel® Data Center Management (Intel® DCM) software and Intel® Power Thermal Aware Solution (Intel® PTAS), combined with Chunghwa Telecom's Intelligent Energy Network (iEN) data center solution, can reduce the power consumption of data center cooling equipment by up to 30% and improve overall data center operating efficiency.

With the rising demand for cloud services, businesses today require increasingly more compute capability from their data centers. This compute increase will most likely come from higher rack and room power densities. But an increase in a data center's business-critical IT equipment (servers, hubs, routers, wiring patch panels, and other network appliances), not to mention the infrastructure needed to keep these devices alive and protected, encroaches on another IT goal: to reduce long-term energy usage.

IT facilities use electric power for lights, security, backup power, and climate control to maintain temperature and humidity levels that minimize downtime due to heat issues. By benchmarking power consumption, you are comparing the power needed to run business-critical equipment with the power needed to maintain that equipment.

Challenges
• Reduce power consumption and costs. How to raise the IT and power density of data center racks to meet business growth and deliver better service is a top concern for many cloud service providers and a barrier to some companies' growth. For efficient and intelligent operation and for improved service, data centers must maximize available resources while reducing operating costs.
• Cool equipment intelligently and efficiently. Tailoring data center cooling needs on a case-by-case, server/rack/row basis has not been possible until now.

Solution
• Intel® DCM and Intel® PTAS.
The PTAS technology solution provides platform telemetry data, metrics, and analytics to enable real-time power and thermal monitoring and reporting, analytics, cooling, and compute control in a data center. (Examples include early hot-spot identification, server- and rack-level event monitoring, and advanced CRAC control strategies.)
• Chunghwa Telecom iEN Service. CHT's Intelligent Energy Network (iEN) data center solution decreases power usage and increases operating efficiency by combining intelligent management of your data center's infrastructure and power system (including air conditioning, lighting, security, and environmental monitoring) in one platform.

Benefits
• Up to 30% energy savings. Decrease energy consumption by reducing overcooling, while achieving a PUE of 1.52 to meet the LEED standard for green data center operation.
• Increased reliability of data center operation. Intel's platform telemetry data, metrics, and analytics solution enables real-time monitoring and energy efficiency management. Data center administrators get real-time information on cooling issues and heat distribution alarms.

"Working with Intel on this data center management proof-of-concept has shown us that we can deliver real-time level 3 PUE measurements on a fully operational data center. And not only can we monitor our data center's energy efficiency through the cloud, we can also intervene when necessary, redistributing resources and facility infrastructure according to need.

"We are optimistic about the feasibility of integrating real-time individual server data with a building energy management system to dynamically manage a facility's cooling infrastructure.
We look forward to validating this next step in Chunghwa Telecom's long-term plan to improve data center efficiency management."
— Ruey-Chang Hsu, Vice President of Network Department, Chunghwa Telecom

Benchmarking power usage

Before you can intelligently reduce your data center's energy consumption, you need to know what its current consumption is. Monitoring energy consumption is the first step to managing your data center's energy efficiency, and benchmarking helps you understand the existing level of efficiency in your data center.

Power Usage Effectiveness (PUE) and its reciprocal, Data Center Infrastructure Efficiency (DCIE), developed by the Green Grid consortium, are internationally accepted benchmarking standards that measure a data center's power usage for actual computing functions (as opposed to power consumed by lighting, cooling, and other overhead).

PUE = Total facility power / IT equipment power
DCIE = IT equipment power / Total facility power

A data center that operates at 1.5 PUE or lower is considered efficient (see Table 1).

Table 1 PUE/DCIE efficiency standards (Source: Green Grid)

PUE   DCIE   Efficiency
1.2   83%    Very efficient
1.5   67%    Efficient
2.0   50%    Average
2.5   40%    Inefficient
3.0   33%    Very inefficient

Identifying where power is lost is key to making your data center run more efficiently. There are three ways to measure PUE:
• Level 1. Measure at least twice a month from the data center's UPS (uninterruptible power supply).
• Level 2. Measure daily at the PDU (power distribution unit).
• Level 3. Measure continuously throughout the day, including data from the PDU and UPS.

Level 3 measurements are the most accurate and most useful, and they are easy to acquire with Intel® PTAS. Intel® PTAS gives you precise indicators to match supply with demand, comparing the power currently used for the IT equipment with the power used by the infrastructure that keeps that equipment cooled, powered, backed up, and protected.
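The PUE and DCIE formulas and the Table 1 bands can be expressed as a short calculation. This is an illustrative sketch, not code from Intel® DCM or iEN; the function names and the 912 kW / 600 kW example readings are assumptions for demonstration.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """PUE = total facility power / IT equipment power."""
    return total_facility_kw / it_equipment_kw

def dcie(total_facility_kw: float, it_equipment_kw: float) -> float:
    """DCIE = IT equipment power / total facility power (the reciprocal of PUE)."""
    return it_equipment_kw / total_facility_kw

def efficiency_rating(p: float) -> str:
    """Classify a PUE value against the Table 1 bands."""
    if p <= 1.2:
        return "Very efficient"
    if p <= 1.5:
        return "Efficient"
    if p <= 2.0:
        return "Average"
    if p <= 2.5:
        return "Inefficient"
    return "Very inefficient"

# Hypothetical example: a facility drawing 912 kW overall, 600 kW of it for IT gear.
print(pue(912, 600))             # 1.52 -- the PUE reported in this paper
print(round(dcie(912, 600), 2))  # 0.66
print(efficiency_rating(1.2))    # Very efficient
```

Note that by the Table 1 bands a PUE of 1.52 falls just outside the "Efficient" cutoff of 1.5, which is why continuous Level 3 measurement matters: small shifts around these thresholds are only visible with fine-grained data.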
By addressing inefficiencies at the rack level, you can optimize by row and eventually address the whole data center's efficiency, reducing power consumption and related energy costs (in both operating and capital expenses) and thereby extending the useful life of your data center.

Cooling needs

Because heat is a leading cause of downtime in data centers, rooms filled with racks of computers and other heat-producing IT equipment require a lot of energy to cool. Some experts claim a data center's infrastructure may be responsible for as much as 50% of its energy bill, with a good portion of that coming from cooling equipment. The energy required by this cooling equipment may come at the expense of actual compute power, so reducing the power fed to your cooling solution may allow greater utilization of your power resources for actual business.

The Rack Cooling Index (RCI)*, which monitors rack overtemperatures and undertemperatures, is another useful industry benchmark. In short, an RCI score of 100% indicates that temperatures did not exceed the acceptable highs (or lows). Anything above 90% is acceptable, and above 96% is considered good. A data center that maintains 100% for both highs and lows is in the "goldilocks zone": the optimal operating temperature range that is neither too hot nor too cold. Generally speaking, the goldilocks zone for data centers is between 65 and 80°F (18 and 27°C). Anything cooler is probably wasted energy, and anything warmer may result in equipment failures and more downtime (Figure 1).

But because IT professionals seldom have access to real-time controls to optimize power and temperature, many will overcool their data centers to meet peak or "worst case" conditions. Overcooling a data center wastes energy, but IT professionals rarely have the tools they need to cool their data centers wisely and economically.
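The RCI scoring described above can be sketched numerically. This follows the commonly cited formulation of the index; the recommended range (18-27°C) matches the goldilocks zone above, while the allowable range (15-32°C assumed here) is an illustrative assumption, and none of this is Intel® PTAS code.

```python
REC_LO, REC_HI = 18.0, 27.0  # recommended rack inlet temperature range (C)
ALL_LO, ALL_HI = 15.0, 32.0  # assumed allowable inlet temperature range (C)

def rci_hi(inlet_temps):
    """RCI(HI): 100% means no rack inlet exceeded the recommended maximum."""
    n = len(inlet_temps)
    over = sum(max(t - REC_HI, 0.0) for t in inlet_temps)
    return (1.0 - over / (n * (ALL_HI - REC_HI))) * 100.0

def rci_lo(inlet_temps):
    """RCI(LO): 100% means no rack inlet fell below the recommended minimum."""
    n = len(inlet_temps)
    under = sum(max(REC_LO - t, 0.0) for t in inlet_temps)
    return (1.0 - under / (n * (REC_LO - ALL_LO))) * 100.0

temps = [24.0, 26.0, 28.0, 25.0]  # one inlet is 1 C above the recommended max
print(rci_hi(temps))  # 95.0 -> acceptable, but below the 96% "good" mark
print(rci_lo(temps))  # 100.0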
Figure 1 Data center operating temperature ranges: below 65°F (18°C), overcooled (higher costs); 65-80°F (18-27°C), RCI = 100% (best operation, the "goldilocks" zone); 80-90°F (27-32°C), too hot (risk of failure)

Intel® PTAS and how it works

Intel® PTAS is Intel's DCIM solution with integrated platform telemetry and analytics to identify and address energy efficiency issues. It provides server-level power monitoring through Intel® DCM, calculates efficiency metrics, and develops 3D thermal maps with PUE Level 3 measurements. It also works out air conditioning control strategies and simulations, logging events for any abnormal behavior in a server, rack, or room.

One server acts as the Intel® DCM Server. It gathers information from the other servers, then sends this data through an API to the iEN-Box, and the two interact with each other in real time (Figure 2). For example, if the Intel® DCM Server were to notify the iEN-Box that a server was running hot, the iEN-Box, through its controlling devices, could increase the cooling for that rack location. iEN then builds thermal maps and efficiency metrics from Intel® DCM data. As a result, data center administrators can identify unused or idle servers for consolidation, avoid potential failures before they occur, and run the data center more efficiently.

Figure 2 Proof of concept (iEN-DCM) configuration: Intel® PTAS feature design and control strategies in the cloud (Argus, CHT OSS QoS Cloud); the iEN web UI and graphic control reached by the user over the Internet (POMIS); a management layer with iEN-Box master and slave units, a web service (API), and Intel® DCM on an Intel server; and controlling devices for video monitoring, backup systems, and access control. Monitored inputs include server inlet and outlet temperatures, power, performance indicators, air flow, etc.
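The DCM-to-iEN-Box interaction described above is essentially a telemetry-driven control loop. The sketch below illustrates that loop under stated assumptions: the class names, the `increase_cooling` method, and the 40°C outlet threshold are hypothetical stand-ins for the real web-service (API) calls, which the paper does not detail.

```python
from dataclasses import dataclass, field

@dataclass
class ServerReading:
    rack: str
    inlet_c: float
    outlet_c: float

@dataclass
class IenBox:
    """Stand-in for the iEN-Box that drives the controlling devices."""
    actions: list = field(default_factory=list)

    def increase_cooling(self, rack: str) -> None:
        self.actions.append(rack)
        print(f"iEN-Box: increasing cooling for rack {rack}")

def dcm_poll_cycle(readings, ien: IenBox, hot_outlet_c: float = 40.0) -> None:
    """One cycle: the DCM server checks telemetry and notifies the iEN-Box of hot racks."""
    for r in readings:
        if r.outlet_c > hot_outlet_c:
            ien.increase_cooling(r.rack)

box = IenBox()
dcm_poll_cycle(
    [ServerReading("J03", 28.0, 41.0), ServerReading("J04", 26.0, 33.0)],
    box,
)
# prints: iEN-Box: increasing cooling for rack J03
```

In the deployed system this cycle runs continuously, which is what allows cooling to track actual rack-level demand instead of a fixed worst-case setpoint.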
Results

Intel helped conduct a proof of concept at Chunghwa Telecom's 2,427-square-foot data center to evaluate Intel® DCM and Intel® PTAS working with CHT's iEN. The test involved an internal data center with separate hot and cool aisles, using 19 QCT and Intel servers1, with at least four units in each of four dedicated racks within a single row, to monitor and compare results across different locations. A twentieth server, acting as the Intel® DCM server, gathered data on itself and the other servers.

The administrator could log onto the Intel® DCM server through the cloud and get real-time information on each server's power usage, air flow, temperature, CPU utilization, etc. (Figure 3). In this example, with a CUPS threshold of 50, the system alerts you when any unit exceeds 50 CUPS2. The two-dimensional "floorplan" view of the data center shows server thermal distribution in real time. Clicking on a monitored rack or on an alarm icon displays a three-dimensional representation of the servers in the rack (Figure 4), with detailed readings and color-coded thermal indicators for each server.

So if the system exceeds a threshold, or an uneven temperature distribution occurs, Intel® PTAS provides color-coded visual warnings, notifies you of the event (by alert, e-mail, or SMS), and recommends corrective actions. With integrated CUPS telemetry and thermal metrics, Intel® PTAS balances compute load to correct thermal events.

1. Nine Quanta* QCT servers: one Intel® Xeon® E2600 v3 65 W CPU, one DIMM, and one 2.5 in. HDD. Ten Intel servers: two Intel® Xeon® DP E5-2680 130 W CPUs, one 200 GB SSD, and one 2.5 in. HDD.
2. CUPS = compute usage per second; a measurement of the amount of "useful" work a server is performing.

Figure 3 Real-time two-dimensional representation of power usage, air flow, temperature, and CPU utilization, with alarm legends for hot spots (Max Out Temp > 40) and high load (Max CUPS > 50) and color-coded CUPS and temperature (°C) scales

Spotlight on Chunghwa Telecom
Chunghwa Telecom (CHT) is Taiwan's leading telecom service provider. The company provides fixed-line, mobile, and Internet and data services to residential and business customers in Taiwan. Chunghwa Telecom is headquartered in Taipei, Taiwan. For more information, visit www.cht.com.tw/en.

For more information on Quanta Cloud Technology (QCT) products, featuring highly manageable and energy-efficient rackmount servers powered by Intel® PTAS technology, visit www.QuantaQCT.com.
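The threshold alerting shown in Figure 3 (a "Hot Spot" when a server's maximum outlet temperature exceeds 40°C, a "High Load" when its CUPS reading exceeds 50) can be sketched as a simple check over the telemetry. The data structure and function names here are illustrative assumptions, not the Intel® DCM SDK interface.

```python
HOT_SPOT_C = 40.0      # Figure 3 legend: Hot Spot (Max Out Temp > 40)
HIGH_LOAD_CUPS = 50.0  # Figure 3 legend: High Load (Max CUPS > 50)

def check_alerts(servers):
    """Return (server, alert) pairs for any unit crossing a threshold.

    `servers` is a list of (name, max_outlet_temp_c, cups) tuples.
    """
    alerts = []
    for name, out_temp_c, cups in servers:
        if out_temp_c > HOT_SPOT_C:
            alerts.append((name, "Hot Spot"))
        if cups > HIGH_LOAD_CUPS:
            alerts.append((name, "High Load"))
    return alerts

# Hypothetical readings loosely modeled on the Figure 3 floorplan values.
readings = [("J03", 37.0, 52.0), ("J04", 33.0, 0.0), ("J05", 41.0, 10.0)]
print(check_alerts(readings))  # [('J03', 'High Load'), ('J05', 'Hot Spot')]
```

In the real system these alerts also trigger the e-mail/SMS notifications and load-balancing actions described above, rather than just being reported.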
Figure 4 Three-dimensional representation of a specific rack (Rack_J03), showing each QCT and Intel server's inlet and outlet temperatures, airflow (CFM), CUPS, average power, and CPU power, with color-coded CUPS and temperature (°C) scales

Summary

Our tests showed that integrating CUPS data with thermal readings to balance server loads, using supply-side optimization (which allowed us to raise ambient temperatures), and using server sensors to control cooling equipment reduced cooling needs and resulted in savings. An administrator can choose the hottest, coldest, or average temperature to trigger cooling remedies. We found that using the hottest temperature as the control point (and switching from return-side to supply-side monitoring) improved AC efficiency by 10 to 15%.

Find a business solution that is right for your company. Contact your Intel representative or visit the Reference Room at intel.com/references.

Learn how the Intel® DCM SDK can help you address real-time power and thermal monitoring issues in your data center at software.intel.com/datacentermanager.
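The control-point choice from the summary (hottest, coldest, or average server temperature driving the cooling response) can be sketched as follows. The 27°C target matches the upper edge of the goldilocks zone; the function names and the simple setpoint-error output are illustrative assumptions, as the actual iEN control strategies are more involved.

```python
def control_point(inlet_temps, mode="hottest"):
    """Pick the temperature that drives the cooling decision."""
    if mode == "hottest":
        return max(inlet_temps)
    if mode == "coldest":
        return min(inlet_temps)
    return sum(inlet_temps) / len(inlet_temps)  # average

def ac_adjustment(inlet_temps, target_c=27.0, mode="hottest"):
    """Positive result = more cooling needed; negative = room to save energy."""
    return control_point(inlet_temps, mode) - target_c

temps = [24.0, 25.5, 28.0, 26.0]
print(ac_adjustment(temps))                   # 1.0 -> cool harder
print(ac_adjustment(temps, mode="average"))   # -1.125 -> average is below target
```

The contrast between the two outputs illustrates why the control-point choice matters: driving the AC from the average would relax cooling even while one rack runs above the recommended range, whereas the hottest-temperature control point (the strategy that improved AC efficiency by 10 to 15% in this test) protects the worst-placed server.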
Integrating Intel® PTAS, Intel® DCM Energy Director with Intel® Node Manager, and Chunghwa Telecom's iEN data center solution improved operating efficiency and reduced power consumption by up to 30%. Intel does not control or audit the design or implementation of third-party benchmark data or websites referenced in this document. Intel encourages all of its customers to visit the referenced websites or others where similar performance benchmark data is reported and confirm whether the referenced benchmark data is accurate and reflects performance of systems available for purchase. Copyright © 2014 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation in the United States and other countries. The Rack Cooling Index (RCI) is a registered trademark of ANCIS Incorporated. *Other names and brands may be claimed as the property of others. 330247-003EN