Web Browsing Quality of Experience: Best Practices for Accurate Monitoring
An Industry Whitepaper
Version 2.0

Contents

Executive Summary
Introduction to Web QoE
  The Three Key Challenges of Web QoE Monitoring
  The Three Main Components of Web QoE Monitoring
  The Cost of Slow Page Loads
Issues Surrounding Web Browsing QoE Monitoring
  What should be Measured?
  The Importance of Page Load Wait Time
A Method for Measuring Page Load Time
  How CSPs can Obtain Accurate Page Load Time Measurements
  A Database of Web Page Anatomy Profiles for Comparison
  Putting Browsing QoE Scores into Context to Generate Answers
  How it all Comes Together
Conclusion
  Summary of Web QoE Measurement Techniques
Additional Resources

Executive Summary

The ability to monitor web browsing quality of experience (QoE) is essential to determining when and where network conditions are contributing to an impaired user experience. Understanding the relationship between web QoE and network factors helps Communication Service Providers (CSPs) reveal the cause of problems and evaluate potential solutions.

In setting about the daunting task of obtaining accurate measurements of web browsing QoE, some immediate questions spring forth:

• What is the relevant data that should be collected?
• How closely can the subscriber’s actual perception be measured?
• How should measurements and collected data be processed?

Any viable web browsing QoE solution inevitably overlaps with the problem/opportunity that is “big data”: how can performance overhead be minimized at the data collection front end while the usefulness of collected data is maximized by distilling it into meaningful reports that provide clear insight? The goal is for CSPs to arm themselves with intelligence and viable paths to solutions in support of Customer Experience Management (CEM).

This paper presents the key issues and requirements for a viable web browsing QoE monitoring solution.

Web-browsing Quality of Experience Monitoring

Introduction to Web QoE

Since the earliest days of its widespread public adoption, Internet web browsing has been the most prominent and consistent activity linked to subscriber quality of experience (QoE). [1] It was the appearance of the first graphical web browsers that brought the Internet out of academia and into the public consciousness. Graphical web browsing made the Internet an indispensable part of our daily lives, and subscribers have always been extremely sensitive to poor page load times because, unlike money, products and many services, “time” is universally viewed as something that cannot be replaced. [2]

The ability to monitor web browsing QoE is therefore essential to determining when and where network conditions are contributing to an impaired user experience. Understanding the relationship between web QoE and network factors helps Communication Service Providers (CSPs) reveal the cause of problems and evaluate potential solutions. For example, web browsing QoE reporting can help a mobile operator decide whether to increase a cell’s transmit power to improve signal strength, or decrease it to reduce handovers from overlapping cells. [3] Decisions related to CDN placement, peering and transit cannot be made blindly, and confidence increases greatly with in-depth QoE assessments of various subscriber activities, including web browsing.

In setting about the daunting task of obtaining accurate measurements of web browsing QoE, some immediate questions spring forth:

• What is the relevant data to be collected?
• How closely can the subscriber’s actual perception be measured?
• How can measurements and collected data be processed into meaningful reports?

The challenge of web browsing QoE monitoring inevitably overlaps with the problem/opportunity that is “big data”: how can CSPs minimize performance overhead at the data collection front end while maximizing the usefulness of collected data by distilling it into meaningful reports that provide clear insight? The ultimate goal is for CSPs to arm themselves with knowledge of how their service is actually being perceived by the entire subscriber base, which presents valuable opportunities in support of Customer Experience Management (CEM).

The Three Key Challenges of Web QoE Monitoring

Three main issues currently challenge CSPs that are reaching for an accurate way to monitor the QoE of subscribers browsing web pages:

1. How can a CSP measure how long subscribers are waiting for pages to load, and can these measurements be taken without overwhelming the solution with an avalanche of data?
2. Once obtained, what does a CSP compare these wait time measurements against to determine the quality that subscribers have perceived while browsing the web?
3. With wait time measurements and a baseline for comparison obtained, how can the resulting browsing QoE scores be presented so that they actually answer the question “Why was browsing QoE so poor here?” without a mountain of analysis and comparison effort?

[1] One example of an early study from the late 1990s asks the question “Tired of having to make coffee while waiting for a homepage to download?” before making recommendations about how content providers can improve load times.
[2] A 2012 IEEE study examines the human psychology of wait times related to subscriber QoE and the challenges associated with accurately measuring browsing QoE.
[3] AT&T has released a study looking at web QoE challenges and metrics for mobile operators.

The Three Main Components of Web QoE Monitoring

This paper will show that the solution to these problems, one that offers an accurate view of subscriber web browsing QoE, can be found in the following three capabilities that exist today:

1. The ability to perform real-time traffic classification that measures Layer-7 HTTP page load time for a targeted cross-section of websites.
2. A machine learning function that profiles web pages at set intervals to maintain a current baseline of nominal page load performance for comparison.
3. A reporting analytics platform that automatically creates reports comparing subscriber browsing QoE scores to known page profiles and other network data (e.g., CDN performance, access network latency, peering and transit relationships, etc.) to provide insight leading to answers.

The Cost of Slow Page Loads

The most obvious impact of a slow web browsing experience is that it frustrates subscribers. As a direct result of poor web QoE, dissatisfied customers are likely to call technical support and complain, or they may churn to a different provider. The support cost to CSPs grows with the severity of web QoE degradation and the number of affected subscribers.
Armed with accurate data, CSPs can make confident CEM decisions while avoiding the cost of remediating unsatisfied subscribers.

Content providers also have a cost to bear. Google and Amazon have both independently released research results showing that latency on their websites has a direct impact on sales. Google, upon adding 0.5 seconds to page loads in order to deliver 30 results instead of 10, found that revenue dropped 20% as a result. [4] Similarly, Amazon conducted an experiment showing that with every 100 ms increase in latency, sales dropped by 1%. It is clear that subscribers are affected by latency, and that it does not have to be particularly high to have a heavy impact on subscriber QoE. In other words, long wait times are bad for everyone; there are no winners here.

[4] Interested readers can find the original research in User Preference and Search Engine Latency.

Issues Surrounding Web Browsing QoE Monitoring

The primary goal when measuring web browsing QoE is to arrive at a value that matches the subscriber’s perception as accurately as possible, but the path to achieving this is far from clear.

What should be Measured?

There are many factors that can affect page load performance, including:

• Device
• Web page content/complexity
• Subscriber count on the access resource
• Peering and transit configurations and changes
• CDN placement and routing
• External events (e.g., the release of a new mobile video application)
• Web server load
• Signal strength (mobile)
• Session handovers (mobile) [5]

Specific device types and their installed operating systems can vary widely in terms of performance across the access network, as shown by Figure 1.

Figure 1 – Access network round trip time for 14 different mobile devices

There have been recent attempts to approximate accurate measurements of web browsing QoE by collecting and correlating data associated with the above metrics and user behaviors analyzed offline.
Metrics such as “number of user clicks” for a particular page, or how often subscribers “abandon” a page load, have been used to infer web browsing QoE as part of a general study (not a solution) specific to 3G networks, with a reported accuracy of about 80%. [6]

From a CSP perspective, metrics such as the device latency shown by Figure 1 are all only indirectly related to the subscriber’s actual perception, and they provide a questionable indication of actual QoE based on best guesses and probability calculations. [7] This is because the most important metric of all, page load wait time, has been difficult for anyone, let alone CSPs, to obtain across the network and for each individual subscriber. Reporting systems that are built into network transport equipment do not provide the resolution required to measure page load time, since typically only flow data up to Layer-3 can be obtained. [8]

To complicate matters, Figure 2 shows how websites have evolved from serving relatively static objects, such as hypertext and images, to hosting rich media applications and third-party content such as advertising. In such cases various “parts” of the page must be fetched from multiple domains and servers. As a simple example, subscribers would expect www.google.com to load almost instantly, but the “satisfaction tolerance” increases for more complex pages with longer load times such as www.cnn.com.

[5] According to AT&T, inter-radio-access-technology (IRAT) handovers are the largest contributor to poor browsing experience in mobile networks (section 5.1 of this AT&T study).
[6] See section 1 of the AT&T study.
[7] For example, subscribers often abandon web page loads for reasons other than impatience.
[8] See section 1 of the AT&T study.
Figure 2 – Average number of objects fetched to load a web page (1995 – 2012) [9]

Despite the variability in terms of complexity, each web page has an optimum load time that can be measured objectively and used as a baseline to compare against deviations.

The Importance of Page Load Wait Time

As difficult as it may be to obtain, the time a subscriber must wait for a page to load is the central data point upon which all others depend in forming an objective judgement of web browsing QoE.

A 2012 IEEE study notes that “when it comes to web browsing, it has been widely recognized that in contrast to the domains of audio and video quality, where psychoacoustic and psycho-visual phenomena are dominant, end-user waiting time is the key determinant of QoE: the longer users have to wait for the web page to arrive (or transactions to complete), the more dissatisfied they tend to become with the service.” [10]

The goal of measuring web browsing QoE is therefore simplified by the fact that the only metric that matters to subscribers, and the key metric that opens the door to insight for CSPs, is how long it takes for a page to completely load.

The same study notes that subscribers do not perceive web browsing as a sequence of single isolated page retrieval events but as an immersive flow experience, so the popular phrase “surfing the web” comes as no surprise.

“This flow state is characterized by positive emotions (enjoyment) and focused attention and as a result, heightened human performance and engagement. The notion of flow implies that the quality of the web browsing experience is determined by the timings of multiple page-view events that occur over a certain time frame during which the user interacts with a website and forms a quality judgment. This has a dual influence on the relationship between waiting times and QoE: on the one hand, flow experiences cause users to ’lose their sense of time’, resulting in distorted time perception.
On the other hand, a sudden instance of overly long waiting time(s) abruptly ends the pleasant flow state and thus tends to be perceived particularly negatively.”

[9] This diagram is provided by WebSiteOptimization.com.
[10] See section 3.1 of the 2012 IEEE study examining the impact of web browsing wait times.

As such, accurately determining web browsing QoE requires a solution that can measure the time intervals between all of the subscriber/server events that occur during page loads, and then “call out” wait times that clearly interrupt the flow of a particular web browsing session. Tools such as the W3C Web Performance Working Group’s Navigation Timing [11] and Google’s Page Speed [12] allow web developers to test designs against latency thresholds, but what exists for a CSP with millions of subscribers?

A Method for Measuring Page Load Time

The key component of a functional solution is an entity that can interpret real-time network data to classify Layer-7 HTTP flows and, most critically, can measure the time between events.

How CSPs can Obtain Accurate Page Load Time Measurements

The data is available: tucked away in the headers of Layer-7 HTTP browsing information is a “start” and “end” indicator for each and every page load. [13] CSPs must have the ability to measure the time interval between the moment the page load event is initiated by the subscriber and the moment the page finishes loading. This passive approach is preferable to an active approach that injects regular measurements into the data stream because it is non-intrusive.

Moreover, the solution’s traffic classification engine must be able to associate every single one of the page load sub-events in the request/response interplay between subscriber and server with the overall event of loading the page. For scalability and optimum insight, it must also be able to classify flows by subscriber and not merely by IP or via aggregate clickstream analysis.
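The association of sub-events with an overall page load can be illustrated with a minimal sketch. This is not the method of any particular product: the record layout (`HttpTransaction`), the use of the HTTP Referer header to link an object to its page, and the rule that a `text/html` response starts a new page load are all assumptions made for illustration. Real traffic (iframes, redirects, missing Referer headers, encrypted sessions) would need more care.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class HttpTransaction:
    """One request/response pair observed on the wire (hypothetical layout)."""
    subscriber_id: str
    url: str
    referer: Optional[str]   # value of the HTTP Referer header, if any
    content_type: str        # value of the response Content-Type header
    request_time: float      # seconds, when the request was seen
    response_end: float      # seconds, when the last response byte was seen


@dataclass
class PageLoad:
    subscriber_id: str
    page_url: str
    start: float
    end: float
    object_count: int = 1

    @property
    def load_time(self) -> float:
        """Wait time from the initial request to the last object received."""
        return self.end - self.start


def measure_page_loads(transactions: List[HttpTransaction]) -> List[PageLoad]:
    """Group transactions into page loads, per subscriber.

    Assumed heuristic: an HTML response, or any request that cannot be
    linked to a known object through its Referer, starts a new page load.
    Every other request is attributed to the page that owns its referrer.
    """
    # (subscriber_id, object URL) -> the page load that object belongs to
    owner: Dict[Tuple[str, str], PageLoad] = {}
    loads: List[PageLoad] = []
    for tx in sorted(transactions, key=lambda t: t.request_time):
        parent = owner.get((tx.subscriber_id, tx.referer)) if tx.referer else None
        if tx.content_type.startswith("text/html") or parent is None:
            page = PageLoad(tx.subscriber_id, tx.url, tx.request_time, tx.response_end)
            loads.append(page)
        else:
            page = parent
            page.end = max(page.end, tx.response_end)
            page.object_count += 1
        owner[(tx.subscriber_id, tx.url)] = page
    return loads
```

Keying the lookup on the subscriber identifier rather than the IP address reflects the requirement above to classify by subscriber. A style sheet that in turn fetches a font is still counted against the original page, because each object inherits the page of its referrer.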
A Database of Web Page Anatomy Profiles for Comparison

As noted earlier, modern web pages are very complex: the subscriber’s initial click in a search engine or entry in a browser’s URL field sets off a flurry of component retrievals and network interactions that build the page. All web pages have a structural “anatomy” of object retrieval events associated with rendering the page in a browser. Figure 3 shows a partial view of the anatomy of object retrieval events that occur when a browser is “building” a complex web page requested by a subscriber.

Figure 3 – Partial list of objects fetched and load times for a complex web page

[11] Specifications can be viewed here.
[12] The developers page is located here.
[13] Such indicators are not available from encrypted browsing sessions.

A web browsing QoE solution must include an automated “machine learning” component, where an offline element maintains an up-to-date database of page profiles that are used as a baseline against which measured wait times can be compared. When wait times between page load events are compared against a database of “normal” page load times, the analytics platform can determine an overall QoE score for the browsing session, highlighting web browsing anomalies (poor QoE scores) down to the individual subscriber.

However, millions of pages are accessed by subscribers every day, and assessing QoE for each of the “long tail” pages is neither economically viable nor necessary. The ability to select the most meaningful pages on a per-CSP basis and measure the browsing quality for those pages provides the needed balance: specialized content that is relevant, without the need to provision a massive infrastructure. The solution also needs to account for changes in behavior, interest and seasonality.

This database of page anatomy profiles is stored on an offline element that updates itself at set intervals by automatically requesting each page in its stored list.
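The comparison of measured wait times against the profile database can be sketched as follows. The paper does not define a scoring formula, so everything here is an assumption made for illustration: the 1-to-5 scale, the rule of losing one point per doubling of the nominal load time, the `poor_threshold` value, and the representation of the profile database as a simple mapping from page URL to nominal load time in seconds.

```python
import math
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical profile database: page URL -> nominal load time in seconds.
PageProfiles = Dict[str, float]


def page_score(measured_s: float, nominal_s: float) -> float:
    """Score one page load on an assumed 1.0 (worst) to 5.0 (best) scale.

    At or below the nominal time the score is 5.0; each doubling of the
    wait beyond nominal costs one point, with a floor of 1.0.
    """
    if nominal_s <= 0:
        raise ValueError("nominal load time must be positive")
    if measured_s <= nominal_s:
        return 5.0
    return max(1.0, 5.0 - math.log2(measured_s / nominal_s))


def session_score(
    page_loads: Iterable[Tuple[str, float]],
    profiles: PageProfiles,
    poor_threshold: float = 3.0,
) -> Tuple[Optional[float], List[str]]:
    """Return (mean score, URLs scoring below poor_threshold) for a session.

    Pages without a profile (the long tail) are skipped rather than
    scored. The result is (None, []) if no profiled page was visited.
    """
    scored: List[Tuple[str, float]] = []
    for url, measured_s in page_loads:
        nominal_s = profiles.get(url)
        if nominal_s is None:
            continue
        scored.append((url, page_score(measured_s, nominal_s)))
    if not scored:
        return None, []
    overall = sum(score for _, score in scored) / len(scored)
    anomalies = [url for url, score in scored if score < poor_threshold]
    return overall, anomalies
```

Scoring the ratio of measured to nominal time, rather than the raw wait, keeps a heavy page from being penalized merely for being heavy. Returning the anomalous pages alongside the session average corresponds to calling out the individual wait times that interrupt an otherwise smooth session.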
The offline element literally loads the pages into a browser and records all objects that make up a particular page’s anatomy so that the reporting analytics platform can compare measured wait times against an accurate baseline. It is essential that the policy control configuration allows the data collection point to target only the information elements relevant to the database of page profiles; this reduces the overall performance impact and streamlines the data feed to the reporting analytics platform.

An important component that cannot be overlooked is the value of real-time wait-time measurement in support of traffic management QoS. With real-time measurement, wait times can be immediately compared against the profile database to generate a real-time QoE score that can trigger a specific traffic management action.

Putting Browsing QoE Scores into Context to Generate Answers

CSPs want to know when the problem is with the network and not the page itself: is it the device, the device OS, a particular web server, a particular subscriber plan, or something else like an unforeseen social event? Is it related to peering between a CSP and content providers or other CSPs, or could it be the routing architecture?

Once accurate page load times have been obtained and processed into reliable indications of poor QoE, the final component that answers “why” is found in the analytics platform’s ability to easily draw in other network data for comparison. Poor browsing QoE events can be compared against a myriad of factors such as device type, web server, geographic location, peering relationships, network packet latency (round trip time) and much more. One could imagine a report comparing poor browsing QoE scores to specific web servers or CDNs, or a report comparing page load times to BGP data to see if an upstream provider is frustrating subscribers.
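A report of this kind can be approximated with a simple grouping, sketched below under stated assumptions. The record fields (`load_time_s`, `cdn`, `device`) are hypothetical names, and ranking by median load time is one possible choice of statistic rather than a prescribed one.

```python
from collections import defaultdict
from statistics import median
from typing import Any, Dict, Iterable, List, Mapping, Tuple


def median_load_time_by(
    records: Iterable[Mapping[str, Any]],
    dimension: str,
) -> List[Tuple[str, float, int]]:
    """Rank the values of one dimension by median page load time.

    Each record is assumed to carry a "load_time_s" field plus the chosen
    dimension (for example "cdn" or "device"). Returns rows of
    (value, median load time in seconds, sample count), slowest first.
    """
    groups: Dict[str, List[float]] = defaultdict(list)
    for rec in records:
        groups[str(rec[dimension])].append(float(rec["load_time_s"]))
    rows = [(value, median(times), len(times)) for value, times in groups.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Running the same function over several dimensions in turn (device type, web server, CDN, upstream provider) gives a first indication of which factor stands out. The sample count is returned alongside the median so that a value observed only a handful of times is not over-interpreted.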
Any information element within the analytics data pool that correlates to a poor median page load time becomes a target for increased scrutiny from a CSP perspective. Such comparisons answer the pressing question of “why are these subscribers having a poor web browsing experience?”, revealing insightful correlations that lead directly to confident CEM actions that remedy a discovered issue.

How it all Comes Together

Figure 4 shows how all of the key components come together to provide CSPs with daily insight into web browsing QoE as actually perceived by subscribers, from yesterday all the way back to when the solution first went live.

Figure 4 – Key Components of a Web Browsing QoE Monitoring Solution

Conclusion

This paper has shown that CSPs have a clear path to attaining reliable and objective measurement of subscriber web browsing QoE, given the right capabilities:

1. The ability to perform real-time traffic classification that measures Layer-7 HTTP page load time for a targeted cross-section of websites.
2. The application of machine learning that profiles web pages offline to develop and maintain a current baseline of nominal performance for comparison.
3. A reporting analytics platform that automatically creates reports comparing subscriber browsing to known page profiles and other network data (e.g., CDN performance, access network latency, peering and transit relationships, etc.), where the final output is insight into the cause of poor browsing QoE.

Summary of Web QoE Measurement Techniques

The following table summarizes the requirements for determining web browsing QoE based on page load time:

Component | Description
Data plane measurement | Relevant metrics in the data plane, including page load times organized by subscriber, must be targeted to obtain the crucial data points without overwhelming the overall system with irrelevant aggregate data.
Offline processing | An offline element performs automated machine learning to maintain a current database of nominal page load times.
Report analysis | An analytics platform compares measured wait times to nominal load times to generate a QoE score. It draws in additional data points to provide the critical clues required to see the cause of poor, or good, web browsing QoE.

Additional Resources

See the Sandvine technology showcase Monitoring Web Browsing Quality of Experience.

Headquarters
Sandvine Incorporated ULC
Waterloo, Ontario, Canada
Phone: +1 519 880 2600

European Offices
Sandvine Limited
Basingstoke, UK
Phone: +44 0 1256 698021

Copyright ©2014 Sandvine Incorporated ULC. Sandvine and the Sandvine logo are registered trademarks of Sandvine Incorporated ULC. All rights reserved.