Dissecting the End-to-end Latency of Interactive Mobile Video Applications

Teemu Kämäräinen, Matti Siekkinen, Antti Ylä-Jääski
Department of Computer Science, Aalto University, Finland
firstname.lastname@aalto.fi

Wenxiao Zhang, Pan Hui
Systems and Media Lab, The Hong Kong University of Science and Technology, Hong Kong
wzhangal@cse.ust.hk, panhui@cse.ust.hk

ABSTRACT

In this paper we measure the step-wise latency in the pipeline of three kinds of interactive mobile video applications that are rapidly gaining popularity, namely Remote Graphics Rendering (RGR), of which we focus on mobile cloud gaming, Mobile Augmented Reality (MAR), and Mobile Virtual Reality (MVR). The applications differ from each other in the way the user interacts with them, i.e., video I/O and user controls, but they all share the fact that their user experience is highly sensitive to end-to-end latency. Long latency between a user control event and the corresponding display update renders an application unusable. Hence, understanding the nature and origins of the latency of these applications is of paramount importance. We show through extensive measurements that control input and display buffering have a substantial effect on the overall delay. Our results shed light on the latency bottlenecks and the maturity of the technology for a seamless user experience with these applications.

Keywords

latency; mobile device; cloud gaming; virtual reality; augmented reality

1. INTRODUCTION

In recent years, new ways to bring interactive multimedia to mobile devices have emerged. Examples include Remote Graphics Rendering (RGR), Mobile Augmented Reality (MAR), and Mobile Virtual Reality (MVR) applications. They all leverage the superior capability of cloud computing to render graphics and process video, but their I/O and user interactions differ. One of the biggest challenges with these applications is that their user experience may degrade dramatically if it takes even just a few hundred milliseconds to update the display after a control action by the user. Hence, understanding and optimizing the end-to-end latency in user interactions with these applications is critically important.

In what we call RGR applications, the user runs a thin client software that intercepts user control events and sends them to the cloud. The application logic is executed and the graphics rendered completely in the cloud, and the client typically receives back a video stream. Cloud gaming and full-cloud CAD are examples of RGR applications. In MAR applications, the input stream is usually a camera feed from the mobile device. This video is either streamed to a cloud or processed locally, depending on the application and device capabilities. Video processing recognizes and tracks features of interest, and additional (augmented) objects or information are drawn on the screen of the mobile device. MVR applications are usually used together with a headset to render a complete virtual world for the user. Head tracking using the sensors of the mobile phone, and possibly a remote controller, are the input to the application, and the mobile phone renders the resulting projection of the virtual world depending on the user's head movements.

The added delays of rendering, processing, encoding, decoding, and transmitting video through the network are key factors that affect the user experience. In addition, the different user interaction methods, i.e., touch screen, separate gamepad, or sensors, must be accounted for. To minimize the end-to-end latency, we must first precisely understand where in the pipeline it accumulates, and that is the objective of our work. Previous studies have attempted to quantify some of the latency components of such applications, but their methodology has been limited and most of them have not considered scenarios involving mobile devices. They have mostly used either timing hooks injected directly into the code or a high-speed camera [7, 13, 6]. Timing hooks are useful for measuring independent tasks occurring completely in the mobile device, whereas a high-speed camera can only capture the total delay from a control press to display update. A predictive approach has also been proposed by Cattan et al., but it requires separate calibration [5]. A more detailed and precise breakdown of delay components requires new methods.

In this paper we utilize a modified and extended version of the WALT Latency Timer [17] together with code injections to dissect the latency of these mobile applications into subcomponents. Similar measurement setups have previously been used to measure mobile phone display responsiveness [3, 11]. Our approach can also measure gyro, gamepad, and Bluetooth delay on top of traditional touch latency measurements. Our setup is also time-synchronized with the mobile phone, allowing a more precise division of latency components.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
HotMobile '16, February 21–22, 2017, Sonoma, California, USA
© 2016 ACM. ISBN 123-4567-24-567/08/06...$15.00
DOI: 10.475/123 4

arXiv:1611.08520v1 [cs.HC] 25 Nov 2016
Figure 1: Measurement setup for touch, gamepad, Ethernet and screen delay analysis.

Our results reveal where in the processing pipeline of these applications the major latency bottlenecks lie and how different control methods affect the end-to-end latency. Based on the results, we also discuss whether the technology is mature enough for a seamless user experience with these applications and highlight the most promising avenues for reducing latency.

2. MEASUREMENT SETUP

The measurement setup depicted in Figure 1 uses an Arduino-compatible board (Teensy LC) connected to the phone through a USB connection (1). The Teensy board is configured to act as a joystick in addition to providing a serial connection over the USB. This enables us to programmatically enter key presses on the mobile device. To simulate touch presses, we use a coin (2) attached to a relay (3) which, when activated, closes a connection loop to the human tester (4). This in turn enables us to precisely measure the time when a touch is initiated on the display.

In addition, two BPW34 photodiodes (5) capture the time when a frame has been updated on the display of the mobile device. A photodiode can sense the change from a dark frame to a more illuminated frame, for example from black to white. We utilize this property in the software to measure the display times of specific frames. The measurement board also has an Ethernet shield (6) attached for Internet connectivity. This is required to measure how long it takes to prepare and send a packet containing a control event from the mobile device towards a server. This time period can be calculated by directing the mobile device to send the control command directly back to the Teensy measurement board, which also initiated the command.
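The photodiode-based display timing described above boils down to detecting a brightness step in a sampled signal. The following sketch is our own illustration of that idea, not the actual measurement firmware; the sample values and threshold are made up:

```python
def detect_frame_update(samples, threshold):
    """Return the index of the first sample where brightness rises
    from below the threshold to at or above it, i.e. the moment a
    dark frame is replaced by a more illuminated one."""
    for i in range(1, len(samples)):
        if samples[i - 1] < threshold <= samples[i]:
            return i
    return None

# Simulated photodiode readings: dark frame, then a white frame
# appears at sample index 5 (one reading per ADC poll).
readings = [3, 4, 3, 4, 3, 210, 215, 214]
print(detect_frame_update(readings, threshold=100))  # -> 5
```

Multiplying the returned index by the ADC sampling period gives the display update time relative to the start of sampling.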

For the virtual reality application experiments, a reference gyro value is needed against which the responsiveness of the gyro sensor inside the mobile phone can be compared. The modified measurement setup is presented in Figure 2. Using the same Teensy board (A), we also attached an MPU-6050 gyro sensor (B) to the measurement setup. We recorded raw gyro values during the experiment to get an indication of movement as quickly as possible for a base reference value. The gyro sensor was attached to a plate (C) together with the mobile phone (D), so that when we moved the plate, the mobile phone and the gyro sensor moved simultaneously. The setup also includes a Bluetooth Low Energy module (E) for measuring the Bluetooth controller delay in an MVR application.

Without time synchronization, the Teensy board can time events that started and ended on the board itself. Likewise, the mobile device can time events occurring entirely within the mobile device. However, calculating the delay of a control command that was initiated on the Teensy side and received on the mobile device requires synchronizing the clocks of the two devices. This is possible with the WALT Latency Timer, which utilizes the fast USB connection of the Teensy board and an algorithm similar to NTP to sync the clocks of the devices.

Figure 2: Measurement setup for gyro and Bluetooth delay analysis.

Figure 3: Typical pipeline of a mobile cloud gaming scenario.
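The NTP-like synchronization works from timestamped exchanges over the USB link: the host timestamps a request and its response, the device timestamps its receipt, and the clock offset is estimated from the round-trip midpoint. A minimal sketch of one such exchange (our own simplification; the real WALT protocol performs repeated exchanges and keeps the best estimate):

```python
def estimate_offset(t_send, t_device, t_recv):
    """Classic NTP-style estimate: assuming the one-way delay is half
    the round-trip time, the device clock reading t_device corresponds
    to the midpoint of t_send and t_recv on the host clock."""
    midpoint = (t_send + t_recv) / 2.0
    return t_device - midpoint  # device_clock minus host_clock

# Host sends at 1000 us, device clock reads 5040 us on receipt,
# host sees the reply at 1080 us -> midpoint 1040 us.
offset = estimate_offset(1000, 5040, 1080)
print(offset)  # -> 4000.0: the device clock runs 4000 us ahead
```

With the offset known, a timestamp taken on one side can be translated to the other side's clock, which is what allows splitting a Teensy-to-phone event into subcomponents.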

We used three mobile devices in our experiments: the Samsung S4, the Samsung S7, and the Huawei Nexus 6P. The Samsung S4 was released in 2013, the Nexus 6P in 2015, and the Samsung S7 in 2016. All of the devices were top of the line in performance when released, so comparing them should give a hint of how the different delay components are developing and will develop in the future.

3. RGR: MOBILE CLOUD GAMING

Mobile cloud gaming is a prime example of an RGR application. It has very stringent requirements, since it needs both low latency and high bandwidth for a good quality of experience. We focus on the delays occurring on the client side, as network and server delays have been covered in depth in the existing literature. The complete pipeline of a typical cloud gaming scenario is presented in Figure 3. The measurement results for the cloud gaming use case are presented in Table 1. We use the GamingAnywhere open-source cloud gaming application in our measurements.

3.1 Control delay

Control delay is the delay between the user initiating a command, using the touch screen or by pressing a button on an external control device, and the operating system of the mobile device registering the command. The control device can also be integrated into the device itself, as is the case with, for example, Nvidia's Shield device. In this paper we measure both cases, as the measurement setup can simulate both user touch interactions and gamepad commands.

Table 1: Cloud gaming delay measurement results.

                           Samsung S4        Samsung S7
                           Avg.     SD       Avg.     SD
Touch to kernel (ms)       40.5     2.3      24.1     3.0
Gamepad to kernel (ms)      0.6     0.6       0.2     0.4
Kernel to callback (ms)     5.5     1.6       3.4     0.6
Callback to radio (ms)      9.1     2.7       1.6     0.8
Frame receive (ms)         10.5     5.8       9.6     4.5
Frame decode (ms)          20.4    11.6       8.3     1.1
Frame display (ms)         25.1     5.4      27.3     4.7

A capacitive touch screen is the most common method of controlling a mobile device. Processing the touch events does, however, incur significant delay. We measured a delay of over 40 ms on the older Samsung S4, while the newer Samsung S7 handles touch commands significantly faster, with an average delay of 24 ms.

The user's control commands can also be input through an external gamepad or an integrated hardware controller. The USB connection conveys user commands to the operating system considerably faster than the capacitive touch screen: our measurements show a negligible delay of under 1 ms on both tested devices. The operating system passes the control command on to the running application, which receives a callback after a short delay. We measured this delay to be approximately 6 ms on the Samsung Galaxy S4 and roughly 3 ms on the newer S7. The control command is sent to the cloud gaming server after the control has been registered in the cloud gaming application code. The Samsung S4 sends the commands in approximately 9 ms, while the S7 takes under 2 ms to send a control command to the network.
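Summing the per-component averages from Table 1 gives the client-side control path for each input method; a simple back-of-the-envelope tally (our own arithmetic on the S7 figures above):

```python
# Average per-component delays for the Samsung S7 (ms, from Table 1).
s7 = {"touch_to_kernel": 24.1, "gamepad_to_kernel": 0.2,
      "kernel_to_callback": 3.4, "callback_to_radio": 1.6}

# Touch path vs. gamepad path, from input event to the radio.
touch_path = s7["touch_to_kernel"] + s7["kernel_to_callback"] + s7["callback_to_radio"]
gamepad_path = s7["gamepad_to_kernel"] + s7["kernel_to_callback"] + s7["callback_to_radio"]

print(round(touch_path, 1))    # -> 29.1 ms with the touch screen
print(round(gamepad_path, 1))  # -> 5.2 ms with a USB gamepad
```

The gap between the two paths is almost entirely the touch-to-kernel component, which is why an external controller helps so much.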

3.2 Frame receive and decode

Android's decoder takes full frame buffers as input. The server, however, sends each frame in multiple network packets, which have to be buffered on the client side before being handed over to the decoder. We measured a delay of roughly 10 ms on both devices between the arrival of the first and last packet of a single frame.
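Conceptually, this buffering is just reassembly: the client collects packets until the frame is complete, and the buffering delay is the gap between the first and last packet's arrival. A sketch of the idea (our own illustration with a hypothetical packet format; GamingAnywhere's actual streaming code differs):

```python
def buffer_frame(packets, total):
    """Collect (arrival_ms, seq, payload) tuples until `total` packets
    of the frame have arrived; return the reassembled frame buffer and
    the first-to-last packet delay the decoder had to wait out."""
    seen = {}
    first = None
    for arrival, seq, payload in packets:
        if first is None:
            first = arrival
        seen[seq] = payload
        if len(seen) == total:
            data = b"".join(seen[i] for i in sorted(seen))
            return data, arrival - first
    return None, None  # frame incomplete

# Three packets of one frame, arriving slightly out of order.
packets = [(100, 0, b"AA"), (104, 2, b"CC"), (110, 1, b"BB")]
frame, wait = buffer_frame(packets, total=3)
print(frame, wait)  # -> b'AABBCC' 10, i.e. a 10 ms buffering delay
```

The ~10 ms we measured is exactly this first-to-last gap, and it is paid on every frame before decoding can even begin.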

Android has built-in video decoders for H.264 video, which are used by the GamingAnywhere application. We instrumented the code to measure the time between a frame being input to the decoder and the time the frame is decoded and ready to be displayed. With Full HD video (1920x1080), the S4 averaged a delay of 20 ms, while the S7 was considerably faster with a delay of 8 ms.

3.3 Frame display

Frame display time is the delay before a frame is visible on the screen after it has been handed over from the media decoder to the display buffer. The results show that this delay is not trivial and is actually one of the largest single components of the overall delay. The refresh rate of a mobile device's display is usually 60 Hz, which translates to roughly 17 ms between display updates. Android uses double buffering, which adds to the overall frame display time. Depending on how the decoded frames and the vsync of the display line up, the frame display time will be between 17 and 34 ms, with an average of 1.5 vsync periods (25 ms). The measurement results confirm this reasoning on both phones, as the display refresh rate is the same on both tested devices.
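The 1.5-vsync average follows directly from this model: a frame handed over at a random instant waits between 0 and one vsync period for the next refresh, and double buffering adds one full period on top. A small sketch of the calculation (our own illustration, assuming handover times uniformly distributed over the vsync period):

```python
VSYNC_MS = 1000 / 60  # ~16.7 ms between display refreshes at 60 Hz

def display_delay(offset_ms):
    """Delay until the frame is on screen, for a frame handed over
    `offset_ms` after the previous vsync: wait for the next vsync,
    then one extra period due to double buffering."""
    wait_for_vsync = VSYNC_MS - offset_ms
    return wait_for_vsync + VSYNC_MS

# Best case: handed over just before a vsync; worst case: just after.
print(round(display_delay(VSYNC_MS), 1))  # -> 16.7 ms (~1 vsync period)
print(round(display_delay(0.0), 1))       # -> 33.3 ms (~2 vsync periods)

# Uniformly distributed handover times average out to 1.5 periods.
avg = sum(display_delay(VSYNC_MS * i / 1000) for i in range(1000)) / 1000
print(round(avg, 1))  # -> 25.0 ms, matching the measured frame display time
```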

3.4 Network and server delay

In cloud gaming the game is rendered on a distant server, which adds network and server processing delay on top of the client-side delays. Even in optimal conditions the network delay is at least 20 ms and the server delay around 15 ms. These are naturally highly dependent on the location and processing power of the server infrastructure. We compare the overall latencies further in Section 6.

3.5 Summary

Two perhaps surprising components dominate the overall client-side delay: touch input processing and frame display. By using an external USB-connected controller, we can reduce the overall latency by over a third on both mobile phones. Frame encoding and decoding is usually regarded as the dominating factor in the cloud gaming pipeline. However, the decoding part at least seems to be getting faster and faster, while the frame display time still dominates on both generations of phones. This could, however, be mitigated with a method called scanline racing, which is introduced in the latest Android operating system (7.0) mainly for virtual reality applications. We measure this feature in Section 4.

4. MOBILE VIRTUAL REALITY

Virtual reality (VR) applications require a very low delay between user controls and visual feedback. VR applications are usually controlled by head movement and possibly a hand-held external controller; the mobile device itself is attached to a head-mounted device. In this section we measure these VR-specific control delays and show how the new asynchronous reprojection feature of the newest Android operating system can decrease the frame display delay. We use the simple Treasurehunt application from the Android VR SDK samples in our measurements.

4.1 Control delay

The industry standard for running virtual reality applications on mobile devices is to attach the device to a wearable headset. The built-in sensors of the mobile phone can be utilized to track the head movements of the user, and a handheld control device can additionally be used for further user interaction. In the latest VR platform guidelines by Google (Daydream) the control device is a special-purpose Bluetooth device called the Daydream controller. In our measurements we measure the delay of the built-in gyroscope, which is used for head tracking, and of an Arduino-compatible board with a Bluetooth Low Energy (BLE) connection to the mobile phone. The Arduino device mimics the Daydream controller and also provides a reference orientation. We address the gyro and Bluetooth delays separately, as the perceived delay depends on what the user is doing: the gyro sensor delay is perceived only in head tracking, while the Bluetooth delay is present in other control commands.

The orientation of the mobile device is obtained from the gyroscope sensor inside the device. A callback is triggered in the application code when the sensor reading has changed; the callback includes the sensor event and a timestamp of when the event happened. We used the raw values of the Arduino gyro setup explained in Section 2 as a reference. The average results of the gyro delay measurements can be seen in Table 2. On the Samsung S7, the timestamps generated for the sensor events lagged the reference gyro setup by 5.8 ms on average, and the callback fired on average 6.4 ms after the sensor timestamp. The Nexus 6P showed similar results, while the Samsung S4 had significantly more delay in its gyro values.

Table 2: Mobile device gyro sensor and Bluetooth delay.

                   Gyro delay (ms)     Bluetooth delay (ms)
Mobile device      Avg.     SD         Avg.     SD
Samsung S4         78.8     32.8       17.5     5.2
Samsung S7         12.2     4.1        22.0     4.9
Nexus 6P           10.2     3.2        28.2     6.5

Table 3: Frame draw and display delay results for the MVR application.

                     Frame draw (ms)     Frame display (ms)
Mobile device        Avg.     SD         Avg.     SD
Samsung S4            5.2     2.2        55.9     5.0
Samsung S7           13.2     2.2        58.3     1.7
Nexus 6P             12.3     6.8        67.5     1.3
Nexus 6P (async)     12.3     6.8        34.9     7.6
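The gyro delay numbers come from comparing the phone's sensor trace against the reference gyro trace. One way to estimate such a lag is to shift one trace against the other and pick the shift that aligns them best; a pure-Python sketch of the idea (our own illustration, not the paper's exact analysis; the traces and sampling rate are made up):

```python
def estimate_lag(reference, delayed, max_lag):
    """Return the shift (in samples) that best aligns `delayed`
    with `reference`, by minimising the sum of squared errors
    over the overlapping part of the traces."""
    best_lag, best_err = 0, float("inf")
    for lag in range(max_lag + 1):
        err = sum((a - b) ** 2 for a, b in zip(reference, delayed[lag:]))
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag

ref = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0]    # reference gyro trace (a head turn)
phone = [0, 0, 0, 0, 1, 5, 9, 5, 1, 0]  # same motion seen 2 samples late
lag = estimate_lag(ref, phone, max_lag=4)
print(lag)  # -> 2 samples; e.g. at 1 kHz sampling that is 2 ms of gyro delay
```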

The Bluetooth controller delay was measured from the sending of an update by the Teensy device to the callback in the Java application code. The preferred connection interval was set to the minimum defined in the BLE standard (7.5 ms). Interestingly, we observed from the Bluetooth communication logs that the Samsung S4 (Android 5.0) accepted this interval, while the Samsung S7 (Android 6.0) and Nexus 6P (Android 7.0) only accepted a slightly larger interval of 11.5 ms. This difference can be observed in the minimum delays recorded, and might be due to energy consumption optimizations in the later Android versions. The measured average delays are presented in Table 2.
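Because BLE data can only be exchanged at connection events, a command issued at a random time waits, on average, half a connection interval before it can even leave the controller. A quick model of that floor (our own approximation; it ignores stack processing and retransmissions):

```python
def avg_ble_wait(conn_interval_ms):
    """A command issued at a uniformly random time waits on average
    half a connection interval for the next connection event."""
    return conn_interval_ms / 2.0

print(avg_ble_wait(7.5))   # -> 3.75 ms average floor at the 7.5 ms interval (S4)
print(avg_ble_wait(11.5))  # -> 5.75 ms at the 11.5 ms interval (S7 / Nexus 6P)
```

This is one reason the devices forced to the larger 11.5 ms interval show higher minimum and average Bluetooth delays in Table 2.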

4.2 Frame draw and display

In contrast to RGR and MAR applications, in virtual reality applications the entire three-dimensional environment is rendered on the mobile device. The delay perceived by the user is the difference between the time of a head movement and the time when this information is reflected in a displayed frame.

We have already established the raw delay of the gyroscope values compared to the reference setup. The gyroscope information, however, still needs to be applied to a rendered view, and the frame needs to be displayed on the screen of the mobile device. We injected timing hooks into the sample VR application and timed how long it takes to render a single frame, using the setup presented in Section 2.

In the latest Android version (7.0) a mode called asynchronous reprojection is available on supported devices. The mode tries to lower the rendering latency by decoupling the rendering framerate from the display framerate. It enables scanline racing, where the image is rendered directly to the front buffer just before it is scanned out.

The display latency results measured with the three mobile phones are presented in Table 3. The Nexus 6P is, at the time of writing, the only available device capable of asynchronous reprojection; we measured its delay both with the mode enabled and disabled. The results show that enabling async reprojection significantly lowers the overall display latency. The baseline latency without the method enabled is, however, surprisingly high compared, for example, to the cloud gaming use case, where video rather than OpenGL graphics was rendered to the screen.

Table 4: Camera frame, frame draw and frame display delay results for the MAR application.

                Camera delay (ms)         Frame draw (ms)    Frame display (ms)
                (640x480 / 1440x1080)
Mobile device   Avg.         SD           Avg.    SD         Avg.    SD
Samsung S7      63.5/88.6    1.9/3.9      16.8    6.2        36.7    5.0
Nexus 6P        60.0/78.8    1.9/3.3      20.0    6.1        36.9    5.1

5. MOBILE AUGMENTED REALITY

Mobile Augmented Reality (MAR) is another application type requiring a low-latency pipeline for an acceptable QoE. AR applications differ from the RGR and MVR scenarios by using the camera as the main input, since the augmented content is added on top of a projection of the real world. We measured the delays with the ARToolKit sample app ARSimpleProj, which draws a cube on top of a marker.

5.1 Control delay

The initial delay present in any MAR application is the delay between the image sensor starting to capture an image and the capture result callback in the application code. We measured this delay on the Samsung S7 and Nexus 6P using the Camera2 API introduced in Android 5.0; the Samsung S4 uses an unspecified starting point in its camera timestamps and was left out of these measurements. We used the resolutions 1440x1080 and 640x480 with a 30 fps frame rate in all tests, as they were the highest and lowest resolutions supported by both phones. The results are presented in Table 4. Both tested devices performed similarly, with a delay of 60 to 90 ms depending on the resolution.

5.2 Frame draw and display

A typical MAR application combines the input frame from the camera with a rendered object, and the location of the object marker needs to be tracked for each frame. Table 4 shows the measured frame draw and display times. The frame draw time includes the search for the location of the marker in the camera frame and all the draw commands in the application code. The frame display time is the delay between the frame being drawn in the code and the frame being displayed on the screen of the mobile device.

5.3 Offloading example

The overall delay of a MAR application is highly dependent on the amount of processing needed to display the result, and some tasks have to be offloaded to an external server. For this use case we analyzed an application which plays a movie trailer when the mobile phone's camera is pointed at a movie poster. The trailer is overlaid according to the homography of the poster. The poster recognition and pose estimation are done on the server, and the mobile client keeps tracking and estimating the pose of the poster after receiving the result. For availability reasons we used the Xiaomi Mi 5 mobile phone for measuring the MAR offloading application delay; its performance should be similar to the Samsung S7.

The application offloads image recognition to an external server: every 60th frame is sent out, and the result is used to track the object on the mobile device. The delay from the application to the server and back was measured to be approximately 500 ms. The delay constantly visible to the user is, however, the local tracking time of

the object, which we measured to be 24 ms on average. This is in line with the sample application measurements.
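At 30 fps, sending every 60th frame means a recognition request leaves every two seconds, and with a ~500 ms round trip each server result arrives many frames stale; the local tracker bridges the gap in between. A back-of-the-envelope check of these figures (our own arithmetic based on the numbers above):

```python
FPS = 30
OFFLOAD_EVERY_N = 60     # every 60th frame is sent to the server
ROUND_TRIP_MS = 500.0    # measured application-to-server-and-back delay
LOCAL_TRACK_MS = 24.0    # measured local tracking time per frame

interval_s = OFFLOAD_EVERY_N / FPS        # time between recognition requests
stale_frames = ROUND_TRIP_MS / 1000 * FPS # frames rendered before a result lands

print(interval_s)    # -> 2.0 s between offloaded recognition requests
print(stale_frames)  # -> 15.0 frames drawn from local tracking alone
```

The 24 ms local tracking time, not the 500 ms round trip, is therefore the delay the user continuously perceives.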

6. DISCUSSION OF RESULTS

The measurements show that the two major components affecting the overall delay in latency-critical mobile video applications are control delay and frame display. This can be observed in the summary presented in Figure 4. The magnitude of the control delay is highly dependent on the type of input used by the application: a modern mobile device can send gamepad commands to a remote server in a matter of milliseconds, while an AR application can wait up to 90 ms just to get a frame from the camera for processing. A top-of-the-line mobile phone (Samsung S7) can process touch and Bluetooth events in roughly 20 to 30 ms, while gyro sensor events arrive faster, with an average delay of 12 ms. Touch screen delays seem to get lower with each mobile phone generation, and the gyro sensor delay is also very small on recent mobile phones. The camera feed to the application would, however, benefit from further optimization, perhaps at the cost of image quality.

Drawing and displaying a single frame takes approximately 25 ms in the RGR scenario, where a video stream is received and decoded. Rendering a scene takes longer: we measured a delay of almost 60 ms to display a VR scene for both eyes in a head-mounted setup. Using the latest features of the Android OS, this can be lowered to 35 ms.
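Stacking the measured averages gives a rough end-to-end estimate for the cloud gaming case. The sketch below sums the Samsung S7 components from Table 1 together with the optimal-case network and server delays quoted earlier (our own tally for illustration; real deployments vary):

```python
# Average client-side delays for the Samsung S7 (ms, from Table 1),
# using the gamepad as the input method.
client = {"gamepad_to_kernel": 0.2, "kernel_to_callback": 3.4,
          "callback_to_radio": 1.6, "frame_receive": 9.6,
          "frame_decode": 8.3, "frame_display": 27.3}
NETWORK_MS, SERVER_MS = 20.0, 15.0  # optimal-case figures quoted above

total = sum(client.values()) + NETWORK_MS + SERVER_MS
print(round(total, 1))  # -> 85.4 ms end-to-end with a gamepad, best case
```

Even in this best case, the client-side share (about 50 ms) exceeds the combined network and server delay, which is why the client-side components deserve attention.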

While a deep understanding of the impact of latency on user experience is still an open problem, previous research has discovered many things about human perception of latency with modern mobile technology. For example, Deber et al. recently characterized the Just Noticeable Difference (JND) and the impact of additional latency on task performance in direct and indirect user interaction with a touch device [12]. They found that the mean JND for a simple tapping task is 69 ms for direct and 96 ms for indirect touch, and that the JND is substantially shorter when performing a dragging task. In our target applications, mobile cloud gaming in particular, pressing virtual buttons on a touch screen can be argued to be a combination of the two. A comparison of these JND numbers to the results on touch, gamepad, and Bluetooth-based controls in Figure 4 reveals that only the VR case with Bluetooth-based control can reach an end-to-end latency below the JND limit of tapping latency in the best case. Notice that mobile VR with offloaded graphics rendering, in case rendering is too computationally demanding for the mobile device to perform, is essentially the RGR case, only with a different control delay.
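The JND comparison can be made concrete by checking an estimated end-to-end latency per scenario against the 69 ms direct-touch threshold. The scenario figures below are purely illustrative placeholders, not exact values read from Figure 4:

```python
TAP_JND_MS = 69  # mean direct-touch tapping JND from Deber et al.

# Hypothetical best-case end-to-end estimates (ms) for a modern phone;
# placeholder numbers chosen only to illustrate the comparison.
scenarios = {"RGR touch": 105, "RGR gamepad": 85,
             "MVR Bluetooth": 65, "MVR gyro": 70}

under_jnd = [name for name, ms in scenarios.items() if ms < TAP_JND_MS]
print(under_jnd)  # -> ['MVR Bluetooth']: only Bluetooth-controlled VR fits
```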

Lee et al. studied error rates in pointing tasks where a target appears within a limited time window for selection [18]. They found that variability in the timing of the pointing event on a touch screen lowers user performance in gaming compared to using physical keys, for example. One interesting observation from our results related to this variability is the effect of asynchronous reprojection: it lowers the latency on average but also increases its variance. However, further user studies are required to quantify its exact effect on user experience. We also note that using an external gamepad instead of a touch screen clearly reduces both the average latency and its variance.
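The mean-versus-variance trade-off of asynchronous reprojection can be illustrated with fabricated sample sets. The numbers below are purely hypothetical and chosen only to show the shape of the effect, not to reproduce our measurements:

```python
from statistics import mean, pstdev

# Hypothetical per-frame latency samples (ms): asynchronous
# reprojection lowers the mean latency but widens the spread.
without_reproj = [58, 60, 59, 61, 60]
with_reproj = [30, 50, 28, 52, 40]

print(mean(without_reproj), mean(with_reproj))           # 59.6 40
print(round(pstdev(without_reproj), 1),
      round(pstdev(with_reproj), 1))                     # 1.0 9.9
```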

Application-specific studies have also been performed. Among the RGR applications, cloud gaming has been investigated: subjective tests show a clear correlation between QoE and latency, but their relationship is difficult to quantify precisely [15, 20, 9, 22, 10]. Unfortunately, most of these studies have not measured the true end-to-end latency, which makes the results difficult to interpret and to compare to the latencies we have measured.

Figure 4: Summary of the measured delay scenarios. (Stacked delay components in ms: control delay, frame draw, receive & decode, server delay, network delay, and frame display, for RGR (touch), RGR (gamepad), MVR (Bluetooth), MVR (gyro), and AR on the S4, S7, and Nexus 6P.)

The acceptable delay for VR applications is also debatable. Previous research [16] has shown that the latency threshold is highly subjective: some users hardly notice a delay of 100 ms, while others can perceive delays down to 3-4 ms. The velocity of head movements also influences the tolerance of delay [2]. Some industry representatives have recommended a latency below 20 ms [21, 1]. Our measurements show that recent features of the Android operating system let the overall latency go under 50 ms on a compatible device. This indicates that there is still room for improvement in mobile device hardware before interaction with the user becomes seamless: more powerful GPUs and displays with higher refresh rates could reduce the end-to-end delay substantially. Another option is to offload the graphics rendering to an external server, but with current network technologies the added network delay might increase the latency even further.
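A simple budget check makes the gap to the 20 ms recommendation concrete. The component values below are assumptions chosen to land in the ballpark of our sub-50 ms result, not exact measured figures:

```python
# Illustrative motion-to-photon budget check against the ~20 ms
# industry recommendation [21, 1]. Component values are assumptions.
TARGET_MS = 20

def motion_to_photon_ms(sensor_ms, render_ms, display_ms):
    """Sum of the main on-device latency components."""
    return sensor_ms + render_ms + display_ms

total = motion_to_photon_ms(sensor_ms=12, render_ms=20, display_ms=17)
print(total, total <= TARGET_MS)  # 49 False: under 50 ms, above target
```

Even a device that reaches the sub-50 ms range measured here remains more than a factor of two above the recommended threshold, which is why faster GPUs and higher refresh rates matter.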

The quest for low latency has earlier focused mostly on shortening the network delay through novel architectures. For example, Satyanarayanan et al. presented Cloudlets, which offer computing power to mobile clients within one-hop latency [23]. A more incremental approach by Choy et al. uses the existing CDN infrastructure to offload computation from the mobile device [8]. However, as Figure 4 shows, network latency together with access delay is only part of the delay pipeline in latency-critical mobile video applications; processing delays occur on both the server and the client side, in software as well as in hardware. Jain et al. focus on balancing the network and computational delay against accuracy in mobile AR [14]. Lee et al. took a different path and developed a system that speculatively executes different possible scenarios of a cloud game in order to mask latency [19]. In a similar vein, Boos et al. built a system that aggressively precomputes and caches all possible images a VR user might encounter in order to achieve low latency and energy consumption [4]. These solutions are useful for minimizing all latency components except control and frame draw & display.

Finally, we point out that multiuser scenarios introduce additional latency components due to the geographic distance between users; our current study excludes those.

7. CONCLUSION
We presented a measurement methodology to study the latency within a mobile device and applied it to three different interactive mobile multimedia applications for which low latency is very important to the user experience. Our results demonstrate that the delays vary substantially between device models, applications, and input methods. Comparing our results to those obtained in user studies on the effect of latency, the technology does not yet appear mature enough for a completely seamless user experience.

8. REFERENCES

[1] M. Abrash. What VR could, should, and almost certainly will be within two years. http://media.steampowered.com/apps/abrashblog/Abrash%20Dev%20Days%202014.pdf.

[2] R. S. Allison, L. R. Harris, M. Jenkin, U. Jasiobedzka, and J. E. Zacher. Tolerance of temporal delay in virtual environments. In Proceedings of the Virtual Reality 2001 Conference (VR '01), pages 247–, Washington, DC, USA, 2001. IEEE Computer Society.

[3] J. Beyer, R. Varbelow, J.-N. Antons, and S. Zander. A method for feedback delay measurement using a low-cost Arduino microcontroller. In Proceedings of the 7th International Workshop on Quality of Multimedia Experience (QoMEX). IEEE, 2015.

[4] K. Boos, D. Chu, and E. Cuervo. FlashBack: Immersive virtual reality on mobile devices via rendering memoization. In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys '16, pages 291–304, New York, NY, USA, 2016. ACM.

[5] E. Cattan, A. Rochet-Capellan, and F. Berard. A predictive approach for an end-to-end touch-latency measurement. In Proceedings of the 2015 International Conference on Interactive Tabletops & Surfaces, ITS '15, pages 215–218, New York, NY, USA, 2015. ACM.

[6] C.-M. Chang, C.-H. Hsu, C.-F. Hsu, and K.-T. Chen. Performance measurements of virtual reality systems: Quantifying the timing and positioning accuracy. In Proceedings of the 2016 ACM on Multimedia Conference, MM '16, pages 655–659. ACM, 2016.

[7] K.-T. Chen, Y.-C. Chang, H.-J. Hsu, D.-Y. Chen, C.-Y. Huang, and C.-H. Hsu. On the quality of service of cloud gaming systems. IEEE Transactions on Multimedia, 16(2):480–495, Feb. 2014.

[8] S. Choy, B. Wong, G. Simon, and C. Rosenberg. A hybrid edge-cloud architecture for reducing on-demand gaming latency. Multimedia Systems, 20(5):503–519, 2014.

[9] M. Claypool and D. Finkel. The effects of latency on player performance in cloud-based games. In 2014 13th Annual Workshop on Network and Systems Support for Games, pages 1–6. IEEE, 2014.

[10] S. Clinch, J. Harkes, A. Friday, N. Davies, and M. Satyanarayanan. How close is close enough? Understanding the role of cloudlets in supporting display appropriation by mobile users. In 2012 IEEE International Conference on Pervasive Computing and Communications (PerCom), pages 122–127, March 2012.

[11] J. Deber, B. Araujo, R. Jota, C. Forlines, D. Leigh, S. Sanders, and D. Wigdor. Hammer Time!: A low-cost, high precision, high accuracy tool to measure the latency of touchscreen devices. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 2857–2868. ACM, 2016.

[12] J. Deber, R. Jota, C. Forlines, and D. Wigdor. How much faster is fast enough?: User perception of latency & latency improvements in direct and indirect touch. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pages 1827–1836. ACM, 2015.

[13] Z. Ivkovic, I. Stavness, C. Gutwin, and S. Sutcliffe. Quantifying and mitigating the negative effects of local latencies on aiming in 3D shooter games. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, CHI '15, pages 135–144, New York, NY, USA, 2015. ACM.

[14] P. Jain, J. Manweiler, and R. Roy Choudhury. OverLay: Practical mobile augmented reality. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys '15, pages 331–344, New York, NY, USA, 2015. ACM.

[15] M. Jarschel, D. Schlosser, S. Scheuring, and T. Hoßfeld. An evaluation of QoE in cloud gaming based on subjective tests. In 2011 Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS), pages 330–335. IEEE, 2011.

[16] J. J. Jerald. Scene-motion- and latency-perception thresholds for head-mounted displays. PhD thesis, University of North Carolina at Chapel Hill, 2010.

[17] M. Koudritsky. WALT Latency Timer. https://github.com/google/walt.

[18] B. Lee and A. Oulasvirta. Modelling error rates in temporal pointing. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, CHI '16, pages 1857–1868. ACM, 2016.

[19] K. Lee, D. Chu, E. Cuervo, J. Kopf, Y. Degtyarev, S. Grizan, A. Wolman, and J. Flinn. Outatime: Using speculation to enable low-latency continuous interaction for mobile cloud gaming. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys '15, pages 151–165, New York, NY, USA, 2015. ACM.

[20] Y.-T. Lee, K.-T. Chen, H.-I. Su, and C.-L. Lei. Are all games equally cloud-gaming-friendly? An electromyographic approach. In 2012 11th Annual Workshop on Network and Systems Support for Games (NetGames), pages 1–6. IEEE, 2012.

[21] OculusRift-Blog.com. John Carmack delivers some home truths on latency. http://oculusrift-blog.com/john-carmacks-message-of-latency/682/.

[22] K. Raaen, R. Eg, and C. Griwodz. Can gamers detect cloud delay? In Proceedings of the 13th Annual Workshop on Network and Systems Support for Games, page 6. IEEE Press, 2014.

[23] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies. The case for VM-based cloudlets in mobile computing. IEEE Pervasive Computing, 8(4):14–23, 2009.