Figure 8
This figure shows fairly strong gaze-viewport correlations. In each panel, the x-axis is a gaze measure and the y-axis is the corresponding viewport measure. Left panels show time measures in milliseconds, while right panels show time measures as a percentage of time on all results on the page.

Source publication
Conference Paper
Web Search has seen two big changes recently: rapid growth in mobile search traffic, and an increasing trend towards providing answer-like results for relatively simple information needs (e.g., [weather today]). Such results display the answer or relevant information on the search page itself without requiring a user to click. While clicks on orga...

Contexts in source publication

Context 1
... on result (in ms, %) decreases with result position, we find a surprising bump at positions 2 and 3 (significantly higher % time on the second result than on the first: t(528) = -2.2, p = 0.02; and higher % time on the third result than on the first: t(504) = -3.7, p < 0.001). The authors verified that this is not a bug but is indeed a feature of the mobile data. (For Page measures, p-values are computed using repeated-measures ANOVA; for Viewport and Gaze measures, the Wilcoxon rank-sum test is used.) One possible explanation for the bump at positions 2 and 3 is the presence of short scrolls on mobile phones. Figure 6 illustrates this with an example: unlike desktop, where the page up/down keys allow users to move from one page fold to another non-overlapping page fold, on mobile phones users often perform short scrolls that may render the second or third result visible across more viewports, and for a longer time, than the first result. It is possible that for navigational tasks, where users mostly click the first result (e.g., twitter), scrolling is unlikely and viewport time may decrease with position. This remains to be tested in a future study.

An obvious question is whether the bump at position 2 or 3 is an artifact of viewport data, or a real attention phenomenon that occurs with eye gaze too. The right panels in Figure 5 show gaze time on each result in milliseconds (top-right panel) and in % (bottom-right panel) as a function of result position (x-axis). Similar to viewport, we find a main effect of result position (rank) on time on result (F(9, 1720) = 15.1, p < 0.001) and a bump at position 2 (% time on result is significantly higher for the second result than for the first: t(343) = -2.3, p = 0.02). We believe this may be a function of scrolling too: due to the small screen size of phones, the second result may be only partially visible; to bring it fully into view, the user has to adjust the scroll distance by continuing to look at the second result until its bottom portion comes into view. This finding of non-monotonic attention decay with rank position may have implications for result ranking and for the design of a novel discount function (as opposed to MAP or NDCG [16]) that better reflects user experience in mobile search. We plan to investigate this question in future work.
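The excerpt stops short of proposing a concrete discount function, so the following is only a minimal sketch of the idea: a DCG-style metric in which the standard monotonic log discount is swapped for an empirically derived, possibly non-monotonic per-rank weight. The placeholder weights below are illustrative only, not estimates from the study.

```python
import numpy as np

def discounted_gain(relevances: np.ndarray, discount: np.ndarray) -> float:
    """DCG-style sum of per-rank relevance gains weighted by a discount vector.

    Standard DCG uses the monotonic discount 1/log2(rank + 1); an
    attention-derived discount could instead weight rank 2 more heavily,
    reflecting the bump observed in the gaze and viewport data.
    """
    k = len(relevances)
    return float(np.sum(np.asarray(relevances) * discount[:k]))

ranks = np.arange(1, 11)
log_discount = 1.0 / np.log2(ranks + 1)  # standard DCG discount, monotonic
# Hypothetical attention-shaped discount with a bump at rank 2 (placeholder
# values only; the study does not report per-rank attention weights):
attention_discount = np.array([0.20, 0.24, 0.16, 0.12, 0.08,
                               0.07, 0.05, 0.04, 0.02, 0.02])
```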
Figure 7 shows the attention distribution across all users and conditions in our study. The left panel shows a heatmap of gaze activity (note that the red hotspots of increased attention are clearly shifted to the top half of the screen). The right panel shows the distribution of eye fixations as a function of y position. The median fixated y position was 224 pixels, which is above the screen center (290 pixels). Thus we found that, on average, almost 70% of users' attention was focused on the top half of the phone screen, with little or no attention paid to the bottom third of the screen (only 14%). This trend was consistent on a per-user basis (20 of 24 users showed the preference for the top half of the screen). We hypothesize that weighting viewport measurements by this attention distribution may further improve gaze-viewport correlations.
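No weighting scheme is specified in the excerpt; as a sketch of this hypothesis, the snippet below down-weights viewport time logged at screen positions that rarely receive fixations, using a smooth prior loosely shaped by the reported statistics (median fixation at y = 224 px, screen center at 290 px, ~14% of attention in the bottom third). The function names, the Gaussian form, and its width are assumptions.

```python
import numpy as np

def fixation_prior(y: np.ndarray) -> np.ndarray:
    """Relative attention weight at vertical screen position y (pixels).

    Peaks near the reported median fixation position (224 px) and decays
    toward the bottom of the screen; the Gaussian shape and 150 px width
    are assumed for illustration, not taken from the paper.
    """
    return np.exp(-0.5 * ((np.asarray(y) - 224.0) / 150.0) ** 2)

def attention_weighted_time(times_ms: np.ndarray, result_y: np.ndarray) -> float:
    """Viewport time on a result, weighted by where it sat on the screen.

    times_ms: time the result was visible in each successive viewport.
    result_y: the result's vertical midpoint on screen in each viewport.
    """
    w = fixation_prior(result_y)
    return float(np.sum(np.asarray(times_ms) * w))
```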
We have already shown in the previous section that viewport metrics can signal the relevance of answer-like results and reflect the user's satisfaction with the search. In this section we investigate whether viewport data can serve an additional purpose: tracking user attention. To this end, we correlate result viewing time measured with eye tracking against viewport data. If a reasonably strong correlation between gaze time and viewport time exists, it implies that we can measure user attention at scale from viewport data alone. We analyze viewing time on a per-result basis, gathering all the data collected in the user study independent of experimental condition, relevance, result position and result type (traditional web results vs. answer-like results). We hypothesize that viewport time alone might be a poor proxy for the user's attention; to refine our measurements we therefore account for two factors, result exposure and viewport coverage, defined below.

Let $v$ denote a viewport. We explore different ways of computing viewport time on a result as a combination of the time the result was visible in the viewport ($t_v$) and two factors: how much of the result's area was visible to the user (result exposure, $e_v$) and how much of the viewport's real estate the result occupied (viewport coverage, $c_v$). Total viewport time on a result using all factors is computed as $\sum_{v=1}^{n} t_v \cdot c_v \cdot e_v$, where $n$ is the number of viewports. Table 4 reports the gaze-viewport correlations for combinations of the above factors. We denote the baseline approach, viewport time $= \sum_{v=1}^{n} t_v$, as C1. We find that the best combination among C1-C4 is C4 (C2 is close), which weights time by both result exposure and viewport coverage.
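As a concrete reading of these definitions, here is a minimal sketch of the four combinations computed from per-viewport logs. The record layout and function names are assumptions; only the C1 and C4 formulas are given explicitly in the excerpt, so the single-factor assignments for C2 and C3 are guesses.

```python
from dataclasses import dataclass

@dataclass
class ViewportSample:
    """One viewport in which a given result was (at least partially) visible."""
    t_ms: float      # time the result was visible in this viewport (t_v), ms
    exposure: float  # fraction of the result's area visible to the user (e_v)
    coverage: float  # fraction of the viewport's area the result occupied (c_v)

def viewport_time(samples: list[ViewportSample], variant: str = "C4") -> float:
    """Viewport time on a result under the four weighting variants.

    C1 is the unweighted baseline sum(t_v); C4 is sum(t_v * c_v * e_v).
    C2 and C3 each apply a single factor (which one is which is assumed
    here; the excerpt only says C2 performs close to C4).
    """
    if variant == "C1":
        return sum(s.t_ms for s in samples)
    if variant == "C2":
        return sum(s.t_ms * s.exposure for s in samples)
    if variant == "C3":
        return sum(s.t_ms * s.coverage for s in samples)
    if variant == "C4":
        return sum(s.t_ms * s.exposure * s.coverage for s in samples)
    raise ValueError(f"unknown variant: {variant}")
```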
The scatter plots in Figure 8 are generated using C4. Figure 8 (top-left panel) shows the scatter plot of viewport time on result vs. gaze time on result, both measured in milliseconds. Each data point in the scatter plot is a (user, query, condition, result) tuple. The correlations are reasonably strong (Pearson's r = 0.57; the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). Figure 8 (top-right panel) is similar, but shows a scatter plot of percent time on result (time on result / time on page) as measured by gaze (x-axis) and viewport (y-axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that the normalization (by time on all results on the page) helps adjust for the pace at which users read the page. For example, some users may quickly glance and skim through the results, while others may read carefully. In such cases the absolute time measure will vary a lot, while the percent time measure may be relatively stable. Since 3-4 results may be shown in the viewport simultaneously, the observed gaze-viewport correlation on a per-result basis (raw correlation of 0.69, binned correlation in deciles of 0.97) is high and suggests that we may reliably infer, from viewport data alone, how long a specific result was seen by the eye.

The middle and bottom panels in Figure 8 are similar to the top panels and show gaze-viewport correlations for other measures: time spent below the KG (mid-left panel) and percent time spent below the KG (i.e., time below KG / time on all results on the page; mid-right panel), measured using gaze (x-axis) and viewport (y-axis). Here too we find strong gaze-viewport correlations, and again the % time measures show higher correlations than the millisecond measures (time below KG: r = 0.71; % time below KG: r = 0.86). The bottom panels in Figure 8 show correlations for time below the IA (r = 0.59) and % time below the IA (r = 0.81). In all three pairs of panels, the percent time measures, which are normalized by time on page, show higher gaze-viewport correlations than the millisecond measures, for the reasons discussed earlier.
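The raw and decile-binned correlations reported above can be reproduced from per-result pairs roughly as follows; pandas and scipy are assumed tooling here, not named in the paper.

```python
import pandas as pd
from scipy.stats import pearsonr

def raw_and_binned_r(gaze, viewport, n_bins: int = 10):
    """Pearson r over raw (gaze, viewport) pairs and over decile-bin means.

    Each element of gaze/viewport corresponds to one (user, query,
    condition, result) tuple, as in the Figure 8 scatter plots; the
    binned value mirrors the blue decile line.
    """
    raw_r, _ = pearsonr(gaze, viewport)

    df = pd.DataFrame({"gaze": gaze, "viewport": viewport})
    # Decile bins on the gaze measure; duplicates="drop" guards against
    # ties producing fewer than n_bins distinct bin edges.
    df["bin"] = pd.qcut(df["gaze"], n_bins, labels=False, duplicates="drop")
    means = df.groupby("bin").mean()
    binned_r, _ = pearsonr(means["gaze"], means["viewport"])
    return raw_r, binned_r
```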
To our knowledge, this is the first quantitative mobile eye-tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. Prior work has focused on search behavior on desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways: small real estate, a variety of touch interactions (touch, scroll, swipe, zoom) and a tendency toward short queries. In this study we found that user attention behavior on mobile phones is indeed very different from that on desktops.

First, unlike desktop, where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones we found, surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop, where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (with no overlap between the results in different page folds), on mobile phones users tend to perform short, continuous scrolls that render the second and third results visible across more viewports, and hence for longer, than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. We think this is because the second result is often partially hidden; to bring it fully into view, the user has to scroll carefully (to avoid scrolling too much or too little) while continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than on the first. It is possible that in the absence of scrolling, viewport and gaze time on results (on mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC"), where the user often clicks the first result, may not require scrolling and may show higher viewport time on the first result than on the second. In our study, however, all tasks were non-navigational and often involved scrolling. An intriguing question that immediately follows is whether there is a more appropriate evaluation metric or rank-position discount that better reflects user experience on mobile phones than current evaluation metrics such as mean average precision or discounted cumulative gain.

The second finding that differs between mobile phones and desktop is that, unlike the Golden Triangle on desktop, where attention is focused on the top-left and decreases towards the bottom and right of the search result page, in our study on mobile phones we found that, on average, user attention is focused on the center and top half of the screen. This, together with the already strong gaze-viewport correlations (r = 0.7 for % time on a page element, as shown in Figure 8), suggests that by using appropriate weighting functions on viewport data we may identify which result the user is looking at with high confidence. In other words, this offers an opportunity, for the first time, to scalably and reliably measure user attention on mobile phones. Another possible direction for improving the accuracy of user attention measurements is to follow the work of Huang et al. [14] and Navalpakkam et al. [24], who advocate directly predicting user attention on the screen from user interactions. While the absence of cursor movements on mobile phones makes attention prediction more difficult, we hypothesize that features such as the smaller screen size and the time a user spends in a viewport without scrolling can be used to improve the accuracy of the "vanilla" approach that uses viewport time information only.

In addition to understanding searcher attention on mobile phones, we examined search satisfaction and its effect on viewport data. We systematically varied task relevance (whether the KG/IA contained the answer to the user's task), and found that users reported being less satisfied when the KG/IA was task-irrelevant than when it was relevant. We also identified viewport metrics that signal user dissatisfaction with answers: increased scrolling down the SERP and increased % time below the answer. We found that when the KG/IA is task-irrelevant, users read through it (expecting to find the answer) and, upon not finding the answer, continue to examine the results below, leading to increased scrolling down the page and increased time below the KG/IA (in milliseconds, and as a % of page time). These results suggest that we may auto-detect answer satisfaction at scale using viewport signals.
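The excerpt names these dissatisfaction signals but no decision rule; a hedged sketch of assembling them into per-query features (names and record layout assumed) might look like this, with any classification threshold left to be fit on labeled satisfaction data.

```python
def dissatisfaction_features(scroll_downs: int,
                             time_below_answer_ms: float,
                             time_on_page_ms: float) -> dict:
    """Per-query viewport signals the study associates with answer
    dissatisfaction: more scrolling down the SERP and more (%) time
    below the KG/IA. No thresholds are given in the excerpt; a real
    detector would be trained on labeled data.
    """
    pct_below = 100.0 * time_below_answer_ms / max(time_on_page_ms, 1.0)
    return {
        "scroll_downs": scroll_downs,
        "time_below_answer_ms": time_below_answer_ms,
        "pct_time_below_answer": pct_below,
    }
```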
We acknowledge several limitations of this study. First, we focused on tasks with information-seeking search intent and have not explored navigational search intent [3]. In our data we observed 2.51 viewport scrolls on average. We expect the amount of scrolling activity to be smaller for navigational searches, as the first result is often the destination site (e.g., queries like "BBC" or "Twitter"). In the absence of scrolling, we may find that attention strictly decreases with result position (unlike the bump at position 2 observed in this study). Second, in this study we fixed the mobile phone's position by mounting it to the eye tracker's stand. In real life, a user's attention on the phone can vary depending on whether they are moving, and on whether they are right-handed, left-handed, or perhaps interacting with the phone with both hands. Other factors such as demographics can also influence user behavior. For example, depending on the user's language, they may read information on the phone from left to right or vice versa. Age and experience with touch interfaces are widely recognized throughout the research community as important factors in touch interactions, and thus can affect user attention and search behavior. Third, examination habits on the mobile device may vary across users, as noted by [2]. While Figure 7 already shows a clear pattern (most users prefer focusing on the top half of the phone screen), it is possible that a few users may prefer the center or bottom of the screen. In future work we plan to address this limitation by exploring the possibility of adaptively weighting user attention based on current user actions, e.g., the direction of page examination (upward or downward).

Despite these limitations, our study offers the hope of accurately measuring, at scale, which result a user's eyes fixate on mobile phones. Future work will consider tablets (this study focused on mobile phones) and other devices, satisfaction with clickable results (including ads), and diverse user settings such as users who are moving or multi-tasking. We demonstrated, for the first time, that by tracking the browser viewport (the visible portion of the page), one can develop viewport metrics that are strongly correlated with user attention (eye gaze) and search satisfaction on mobile phones. Focusing on answer-like results in a controlled lab study, we found that increased scrolling past the answer and increased time below the answer can signal user dissatisfaction with answer results. We demonstrated strong gaze-viewport correlations on a per-result basis, and found that attention is, on average, focused on the top half of the phone, suggesting that we may infer the amount of attention received by a specific result (of the 3-4 results shown in the viewport) scalably and reliably using viewport data alone. Potential applications of this work include better estimation of result relevance and satisfaction in search, and could benefit other areas including advertising, web page design and optimization, and measuring engagement in social networking ...
Context 3
... of short scrolls on mobile phones. Figure 6 illustrates this with an example – unlike desktop where the page up down keys allow users to move from one page fold to another non-overlapping page fold, in mobile phones, users often tend to perform short scrolls that may render the second or third result visible across more viewports and for longer time than the first result. It is possible that for navigational tasks where For Page measures the p-values are computed using the repeated measures ANOVA; for Viewport and Gaze measures Wilcoxon rank sum test is used. users mostly click the first result (e.g., twitter), since scrolling is unlikely, we may observe that viewport time decreases with position. This remains to be tested in a future study. An obvious question is whether the bump at position 2 or 3 is an artifact of viewport data, or is a real attention phenomenon that occurs with eye gaze too. The right panels in figure 5 show gaze time on each result in milliseconds (top-right panel) and in % (bottom- right panel) as a function of result position (x axis). Similar to viewport, we find a main effect of result position or rank on time on result (F(9, 1720) = 15.1, p < 0.001) and a bump at position 2 (% time on result is significantly higher for second result than the first: t(343)=-2.3, p=0.02). We believe this may be a function of scrolling too – due to the small screen size in phones, the second result may only be partially visible; in order to bring it fully into view, the user has to adjust the scroll distance by continuing to look at the second result until it its bottom portion comes into view. This finding of non-monotonic attention decay with rank position may have implications for results ranking and design of a novel discount function (as opposed to MAP or NDCG[16]) that better reflects user experience in mobile search. We plan to investigate this question in the future work. Figure 7 shows the attention distribution across all users and conditions in our study. The left panel shows a heatmap of gaze activity (note that the red hotspots of increased attention are clearly shifted to the top half of the screen). The right panel shows a distribution of eye fixations as a function of y position. The median fixated y position was 224 pixels which is above the screen center (290 pixels). Thus, we found that on average, almost 70 % of the users’ attention was focused on the top half of the phone screen, with little or no attention paid to the bottom 1/3 portion of the screen (only 14%). This trend was consistent on a per user basis (20/24 users showed the preference for top half of the screen). We hypothesize that weighting viewport measurements by this attention distribution may further improve gaze viewport correlations. We have already shown in the previous section that viewport metrics can signal relevance of answer like results and reflect user’s satisfaction with the search. In this section we investigate whether viewport data can serve for an additional benefit – tracking user attention. To this end, we attempt to correlate result viewing time measured with the eye tracking and viewport data. If a reasonably strong correlation between gaze and viewport time exists, it implies that we can measure user attention at scale from viewport data alone. We analyze viewing time on per-result basis. We gather all the data collected in the user study independent of experimental condition, relevance, result position and result type (traditional web results vs. answer-like results). 
We hypothesize that viewport time alone might provide a poor proxy for the user’s attention, thus, in order to refine our measurements we account for two factors: result coverage and exposure defined below. Let v denote the viewport. We explore different ways of computing viewport time on result as a combination of the time the result was visible on the viewport ( t v ) and two factors: how much of the result area was visible to a user (result exposure , e v ) and how much of the viewport real estate did the result occupy (viewport coverage , c v ). Total viewport time on result using all factors is computed as n v =1 ( t v ∗ c v ∗ e v ) , where v can take values from [1 , n ] ( n is the number of viewports). Table 4 reports the gaze-viewport correlations for combinations of the above factors. We denote the baseline approach computing n viewport time = v =1 t v as C1. We find that the best combination among C1-C4 is C4 (C2 is close), which is weighted by result exposure and viewport coverage. The scatter plots in Figure 8 are generated using C4. Figure 8 (top-left panel) shows the scatter plot of viewport time on result vs. gaze time on result, both measured in milliseconds. Each data point in the scatter plot is a (user, query, condition, result) tuple. The correlations are reasonably strong (Pearson’s correlation r=0.57; the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). Figure 8 (top-right panel) is similar, but shows a scatter plot of percent time on result (time on result / time on page) as measured by gaze (x axis) and viewport (y axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that the normalization (by time on all results on the page) helps adjust for the pace at which users read the page. For example, some users may quickly glance and skim through the results, while others may read carefully. In such cases, the absolute time measure will vary a lot while the percent time measure may be relatively stable. Since 3-4 results may be shown on the viewport simultaneously, the observed gaze-viewport correlation on a per-result basis (raw correlation of 0.69, binned correlation in deciles of 0.97) is high and suggests that we may reliably infer how long a specific result was seen by the eye, from the viewport data alone. The middle and bottom panels in Figure 8 are similar to the top panel, and show gaze viewport correlations for other measures, such as time spent below KG (mid-left panel) and percent time spent below KG (i.e., time below KG / time on all results on the page, mid-right panel) measured using gaze (x axis) and viewport (y axis). Here too, we find strong gaze viewport correlations, and again, the % time measures show higher correlations than time in millisecond measures (time below KG: r = 0.71, %time below KG: r = 0.86). The bottom panel in Figure 8 shows correlations for time below IA (r = 0.59) and % time below IA (r = 0.81). In all three figures, we find that the percent time measures, that are normalized by time on page, show higher gaze-viewport correlations than time in millisecond measures, for reasons discussed earlier. To our knowledge, this is the first quantitative mobile eye tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. 
Prior work has focused on search behavior in desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways – small real estate, variety of touch interactions (touch, scroll, swipe, zoom) and tendency to perform short queries. In this study, we found that indeed, user attention behavior on mobile phones is very different from that on desktops. First, unlike desktop where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones, we found surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (no overlap between the results in different page folds), on mobile phones, users tend to perform short and continuous scrolls that render the second and third results visible across more viewports and hence longer than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. We think this is because the second result is often partially hidden, and to bring it fully into view, the user has to carefully scroll (to avoid scrolling too much or too little) by continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. It is possible that in the absence of scrolling, viewport and gaze time on results (in mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC") where the user often clicks the first result, may not require scrolling, and may show higher viewport time on the first than second result. In our study, however, all tasks were non-navigational, and often in- volved scrolling. An intriguing question that immediately follows is, whether there is a more appropriate evaluation metric or rank position discount that better reflects user experience on mobile phones than current evaluation metrics, such as mean average precision or discounted cumulative gain. The second finding which is different on mobile phones than desktop is that, unlike the Golden Triangle in desktop, where attention is focused on the top-left and decreases towards the bottom and right of the search result page, in our study on mobile phones, we found that on average, user attention is focused on the center and top half of the screen. This, together with the already strong gaze viewport correlations (r=0.7 for %time on a page element as shown in Fig 8) suggests that by using the appropriate weighting functions on viewport data, we may ...
Context 4
... on result (in ms, %) decreases with result position, we find a surprising bump at positions 2 and 3 (significantly higher % time on the second result than the first: t(528)=-2.2, p=0.02; and higher % time on the third result than the first: t(504)=-3.7, p < 0.001). Authors verified that this is not a bug and is indeed feature of the mobile data. One possible explanation for the bump at position 2 and 3 is the presence of short scrolls on mobile phones. Figure 6 illustrates this with an example – unlike desktop where the page up down keys allow users to move from one page fold to another non-overlapping page fold, in mobile phones, users often tend to perform short scrolls that may render the second or third result visible across more viewports and for longer time than the first result. It is possible that for navigational tasks where For Page measures the p-values are computed using the repeated measures ANOVA; for Viewport and Gaze measures Wilcoxon rank sum test is used. users mostly click the first result (e.g., twitter), since scrolling is unlikely, we may observe that viewport time decreases with position. This remains to be tested in a future study. An obvious question is whether the bump at position 2 or 3 is an artifact of viewport data, or is a real attention phenomenon that occurs with eye gaze too. The right panels in figure 5 show gaze time on each result in milliseconds (top-right panel) and in % (bottom- right panel) as a function of result position (x axis). Similar to viewport, we find a main effect of result position or rank on time on result (F(9, 1720) = 15.1, p < 0.001) and a bump at position 2 (% time on result is significantly higher for second result than the first: t(343)=-2.3, p=0.02). We believe this may be a function of scrolling too – due to the small screen size in phones, the second result may only be partially visible; in order to bring it fully into view, the user has to adjust the scroll distance by continuing to look at the second result until it its bottom portion comes into view. This finding of non-monotonic attention decay with rank position may have implications for results ranking and design of a novel discount function (as opposed to MAP or NDCG[16]) that better reflects user experience in mobile search. We plan to investigate this question in the future work. Figure 7 shows the attention distribution across all users and conditions in our study. The left panel shows a heatmap of gaze activity (note that the red hotspots of increased attention are clearly shifted to the top half of the screen). The right panel shows a distribution of eye fixations as a function of y position. The median fixated y position was 224 pixels which is above the screen center (290 pixels). Thus, we found that on average, almost 70 % of the users’ attention was focused on the top half of the phone screen, with little or no attention paid to the bottom 1/3 portion of the screen (only 14%). This trend was consistent on a per user basis (20/24 users showed the preference for top half of the screen). We hypothesize that weighting viewport measurements by this attention distribution may further improve gaze viewport correlations. We have already shown in the previous section that viewport metrics can signal relevance of answer like results and reflect user’s satisfaction with the search. In this section we investigate whether viewport data can serve for an additional benefit – tracking user attention. 
To this end, we attempt to correlate result viewing time measured with the eye tracking and viewport data. If a reasonably strong correlation between gaze and viewport time exists, it implies that we can measure user attention at scale from viewport data alone. We analyze viewing time on per-result basis. We gather all the data collected in the user study independent of experimental condition, relevance, result position and result type (traditional web results vs. answer-like results). We hypothesize that viewport time alone might provide a poor proxy for the user’s attention, thus, in order to refine our measurements we account for two factors: result coverage and exposure defined below. Let v denote the viewport. We explore different ways of computing viewport time on result as a combination of the time the result was visible on the viewport ( t v ) and two factors: how much of the result area was visible to a user (result exposure , e v ) and how much of the viewport real estate did the result occupy (viewport coverage , c v ). Total viewport time on result using all factors is computed as n v =1 ( t v ∗ c v ∗ e v ) , where v can take values from [1 , n ] ( n is the number of viewports). Table 4 reports the gaze-viewport correlations for combinations of the above factors. We denote the baseline approach computing n viewport time = v =1 t v as C1. We find that the best combination among C1-C4 is C4 (C2 is close), which is weighted by result exposure and viewport coverage. The scatter plots in Figure 8 are generated using C4. Figure 8 (top-left panel) shows the scatter plot of viewport time on result vs. gaze time on result, both measured in milliseconds. Each data point in the scatter plot is a (user, query, condition, result) tuple. The correlations are reasonably strong (Pearson’s correlation r=0.57; the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). Figure 8 (top-right panel) is similar, but shows a scatter plot of percent time on result (time on result / time on page) as measured by gaze (x axis) and viewport (y axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that the normalization (by time on all results on the page) helps adjust for the pace at which users read the page. For example, some users may quickly glance and skim through the results, while others may read carefully. In such cases, the absolute time measure will vary a lot while the percent time measure may be relatively stable. Since 3-4 results may be shown on the viewport simultaneously, the observed gaze-viewport correlation on a per-result basis (raw correlation of 0.69, binned correlation in deciles of 0.97) is high and suggests that we may reliably infer how long a specific result was seen by the eye, from the viewport data alone. The middle and bottom panels in Figure 8 are similar to the top panel, and show gaze viewport correlations for other measures, such as time spent below KG (mid-left panel) and percent time spent below KG (i.e., time below KG / time on all results on the page, mid-right panel) measured using gaze (x axis) and viewport (y axis). Here too, we find strong gaze viewport correlations, and again, the % time measures show higher correlations than time in millisecond measures (time below KG: r = 0.71, %time below KG: r = 0.86). 
The bottom panel in Figure 8 shows correlations for time below IA (r = 0.59) and % time below IA (r = 0.81). In all three figures, we find that the percent time measures, that are normalized by time on page, show higher gaze-viewport correlations than time in millisecond measures, for reasons discussed earlier. To our knowledge, this is the first quantitative mobile eye tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. Prior work has focused on search behavior in desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways – small real estate, variety of touch interactions (touch, scroll, swipe, zoom) and tendency to perform short queries. In this study, we found that indeed, user attention behavior on mobile phones is very different from that on desktops. First, unlike desktop where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones, we found surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (no overlap between the results in different page folds), on mobile phones, users tend to perform short and continuous scrolls that render the second and third results visible across more viewports and hence longer than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. We think this is because the second result is often partially hidden, and to bring it fully into view, the user has to carefully scroll (to avoid scrolling too much or too little) by continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. It is possible that in the absence of scrolling, viewport and gaze time on results (in mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC") where the user often clicks the first result, may not require scrolling, and may show higher viewport time on the first than second result. In our study, however, all tasks were non-navigational, and often in- volved scrolling. An intriguing question that immediately follows is, whether there is a more appropriate evaluation metric or rank position discount that better reflects user experience on mobile phones than current evaluation metrics, such as mean average precision or discounted cumulative gain. The second finding which is different on mobile phones than desktop is that, unlike the Golden Triangle in desktop, where attention is focused on the ...
Context 5
... at the second result until it its bottom portion comes into view. This finding of non-monotonic attention decay with rank position may have implications for results ranking and design of a novel discount function (as opposed to MAP or NDCG[16]) that better reflects user experience in mobile search. We plan to investigate this question in the future work. Figure 7 shows the attention distribution across all users and conditions in our study. The left panel shows a heatmap of gaze activity (note that the red hotspots of increased attention are clearly shifted to the top half of the screen). The right panel shows a distribution of eye fixations as a function of y position. The median fixated y position was 224 pixels which is above the screen center (290 pixels). Thus, we found that on average, almost 70 % of the users’ attention was focused on the top half of the phone screen, with little or no attention paid to the bottom 1/3 portion of the screen (only 14%). This trend was consistent on a per user basis (20/24 users showed the preference for top half of the screen). We hypothesize that weighting viewport measurements by this attention distribution may further improve gaze viewport correlations. We have already shown in the previous section that viewport metrics can signal relevance of answer like results and reflect user’s satisfaction with the search. In this section we investigate whether viewport data can serve for an additional benefit – tracking user attention. To this end, we attempt to correlate result viewing time measured with the eye tracking and viewport data. If a reasonably strong correlation between gaze and viewport time exists, it implies that we can measure user attention at scale from viewport data alone. We analyze viewing time on per-result basis. We gather all the data collected in the user study independent of experimental condition, relevance, result position and result type (traditional web results vs. answer-like results). We hypothesize that viewport time alone might provide a poor proxy for the user’s attention, thus, in order to refine our measurements we account for two factors: result coverage and exposure defined below. Let v denote the viewport. We explore different ways of computing viewport time on result as a combination of the time the result was visible on the viewport ( t v ) and two factors: how much of the result area was visible to a user (result exposure , e v ) and how much of the viewport real estate did the result occupy (viewport coverage , c v ). Total viewport time on result using all factors is computed as n v =1 ( t v ∗ c v ∗ e v ) , where v can take values from [1 , n ] ( n is the number of viewports). Table 4 reports the gaze-viewport correlations for combinations of the above factors. We denote the baseline approach computing n viewport time = v =1 t v as C1. We find that the best combination among C1-C4 is C4 (C2 is close), which is weighted by result exposure and viewport coverage. The scatter plots in Figure 8 are generated using C4. Figure 8 (top-left panel) shows the scatter plot of viewport time on result vs. gaze time on result, both measured in milliseconds. Each data point in the scatter plot is a (user, query, condition, result) tuple. The correlations are reasonably strong (Pearson’s correlation r=0.57; the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). 
Figure 8 (top-right panel) is similar, but shows a scatter plot of percent time on result (time on result / time on page) as measured by gaze (x axis) and viewport (y axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that the normalization (by time on all results on the page) helps adjust for the pace at which users read the page. For example, some users may quickly glance and skim through the results, while others may read carefully. In such cases, the absolute time measure will vary a lot while the percent time measure may be relatively stable. Since 3-4 results may be shown on the viewport simultaneously, the observed gaze-viewport correlation on a per-result basis (raw correlation of 0.69, binned correlation in deciles of 0.97) is high and suggests that we may reliably infer how long a specific result was seen by the eye, from the viewport data alone. The middle and bottom panels in Figure 8 are similar to the top panel, and show gaze viewport correlations for other measures, such as time spent below KG (mid-left panel) and percent time spent below KG (i.e., time below KG / time on all results on the page, mid-right panel) measured using gaze (x axis) and viewport (y axis). Here too, we find strong gaze viewport correlations, and again, the % time measures show higher correlations than time in millisecond measures (time below KG: r = 0.71, %time below KG: r = 0.86). The bottom panel in Figure 8 shows correlations for time below IA (r = 0.59) and % time below IA (r = 0.81). In all three figures, we find that the percent time measures, that are normalized by time on page, show higher gaze-viewport correlations than time in millisecond measures, for reasons discussed earlier. To our knowledge, this is the first quantitative mobile eye tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. Prior work has focused on search behavior in desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways – small real estate, variety of touch interactions (touch, scroll, swipe, zoom) and tendency to perform short queries. In this study, we found that indeed, user attention behavior on mobile phones is very different from that on desktops. First, unlike desktop where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones, we found surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (no overlap between the results in different page folds), on mobile phones, users tend to perform short and continuous scrolls that render the second and third results visible across more viewports and hence longer than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. 
We think this is because the second result is often partially hidden, and to bring it fully into view, the user has to carefully scroll (to avoid scrolling too much or too little) by continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. It is possible that in the absence of scrolling, viewport and gaze time on results (in mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC") where the user often clicks the first result, may not require scrolling, and may show higher viewport time on the first than second result. In our study, however, all tasks were non-navigational, and often in- volved scrolling. An intriguing question that immediately follows is, whether there is a more appropriate evaluation metric or rank position discount that better reflects user experience on mobile phones than current evaluation metrics, such as mean average precision or discounted cumulative gain. The second finding which is different on mobile phones than desktop is that, unlike the Golden Triangle in desktop, where attention is focused on the top-left and decreases towards the bottom and right of the search result page, in our study on mobile phones, we found that on average, user attention is focused on the center and top half of the screen. This, together with the already strong gaze viewport correlations (r=0.7 for %time on a page element as shown in Fig 8) suggests that by using the appropriate weighting functions on viewport data, we may identify which result the user is looking at, with high confidence. In other words, this offers an opportunity, for the first time, to scalably and reliably measure user attention on mobile phones. Another possible direction for improving accuracy of user attention measurements is to follow the work Huang et al. [14] and Navalpakkam et al. [24] that advocate to directly predict user attention on the screen from user interactions. While the absence of cursor movements in mobile phones makes attention prediction more difficult, we hypothesize that features of smaller screen size and the time user spends in the viewport without scrolling can be used to improve the accuracy of the “vanilla” approach that uses viewport time information only. In addition to understanding searcher attention on mobile phones, we examined search satisfaction and its effect on viewport data. We systematically varied task-relevance (whether the KG/IA contained the answer to the user’s task), and found that users reported less satisfied when the KG/IA was task-irrelevant than when it was relevant. We also identified viewport metrics that signal user dissatisfaction with answers – increased scrolling down the SERP and increased % time below the answer. We found that when the KG/IA is task-irrelevant, users read through it (expecting to find the answer) and upon not finding the answer, they continued to examine results below, leading to increased scrolling down the page, and increased time below KG/IA (in milliseconds, and as a % of page time). These results suggest that we may ...
Context 6
... the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). Figure 8 (top-right panel) is similar, but shows a scatter plot of percent time on result (time on result / time on page) as measured by gaze (x axis) and viewport (y axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that the normalization (by time on all results on the page) helps adjust for the pace at which users read the page. For example, some users may quickly glance and skim through the results, while others may read carefully. In such cases, the absolute time measure will vary a lot while the percent time measure may be relatively stable. Since 3-4 results may be shown on the viewport simultaneously, the observed gaze-viewport correlation on a per-result basis (raw correlation of 0.69, binned correlation in deciles of 0.97) is high and suggests that we may reliably infer how long a specific result was seen by the eye, from the viewport data alone. The middle and bottom panels in Figure 8 are similar to the top panel, and show gaze viewport correlations for other measures, such as time spent below KG (mid-left panel) and percent time spent below KG (i.e., time below KG / time on all results on the page, mid-right panel) measured using gaze (x axis) and viewport (y axis). Here too, we find strong gaze viewport correlations, and again, the % time measures show higher correlations than time in millisecond measures (time below KG: r = 0.71, %time below KG: r = 0.86). The bottom panel in Figure 8 shows correlations for time below IA (r = 0.59) and % time below IA (r = 0.81). In all three figures, we find that the percent time measures, that are normalized by time on page, show higher gaze-viewport correlations than time in millisecond measures, for reasons discussed earlier. To our knowledge, this is the first quantitative mobile eye tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. Prior work has focused on search behavior in desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways – small real estate, variety of touch interactions (touch, scroll, swipe, zoom) and tendency to perform short queries. In this study, we found that indeed, user attention behavior on mobile phones is very different from that on desktops. First, unlike desktop where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones, we found surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (no overlap between the results in different page folds), on mobile phones, users tend to perform short and continuous scrolls that render the second and third results visible across more viewports and hence longer than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. 
We think this is because the second result is often partially hidden, and to bring it fully into view, the user has to carefully scroll (to avoid scrolling too much or too little) by continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. It is possible that in the absence of scrolling, viewport and gaze time on results (in mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC") where the user often clicks the first result, may not require scrolling, and may show higher viewport time on the first than second result. In our study, however, all tasks were non-navigational, and often in- volved scrolling. An intriguing question that immediately follows is, whether there is a more appropriate evaluation metric or rank position discount that better reflects user experience on mobile phones than current evaluation metrics, such as mean average precision or discounted cumulative gain. The second finding which is different on mobile phones than desktop is that, unlike the Golden Triangle in desktop, where attention is focused on the top-left and decreases towards the bottom and right of the search result page, in our study on mobile phones, we found that on average, user attention is focused on the center and top half of the screen. This, together with the already strong gaze viewport correlations (r=0.7 for %time on a page element as shown in Fig 8) suggests that by using the appropriate weighting functions on viewport data, we may identify which result the user is looking at, with high confidence. In other words, this offers an opportunity, for the first time, to scalably and reliably measure user attention on mobile phones. Another possible direction for improving accuracy of user attention measurements is to follow the work Huang et al. [14] and Navalpakkam et al. [24] that advocate to directly predict user attention on the screen from user interactions. While the absence of cursor movements in mobile phones makes attention prediction more difficult, we hypothesize that features of smaller screen size and the time user spends in the viewport without scrolling can be used to improve the accuracy of the “vanilla” approach that uses viewport time information only. In addition to understanding searcher attention on mobile phones, we examined search satisfaction and its effect on viewport data. We systematically varied task-relevance (whether the KG/IA contained the answer to the user’s task), and found that users reported less satisfied when the KG/IA was task-irrelevant than when it was relevant. We also identified viewport metrics that signal user dissatisfaction with answers – increased scrolling down the SERP and increased % time below the answer. We found that when the KG/IA is task-irrelevant, users read through it (expecting to find the answer) and upon not finding the answer, they continued to examine results below, leading to increased scrolling down the page, and increased time below KG/IA (in milliseconds, and as a % of page time). These results suggest that we may auto-detect answer satisfaction at scale by using viewport signals. We acknowledge several limitations of this study. First, we focused on tasks with information seeking search intent and have not explored navigational search intent [3]. In our data we observed 2.51 viewport scrolls performed on average. 
We acknowledge several limitations of this study. First, we focused on tasks with an information-seeking search intent and have not explored navigational search intent [3]. In our data, we observed 2.51 viewport scrolls on average. We expect the amount of scrolling activity to be smaller for navigational searches, as the first result is often the destination site (e.g., queries like [BBC] or [Twitter]). In the absence of scrolling, we may find that attention strictly decreases with result position (unlike the bump at position 2 observed in this study). Second, in this study we fixed the mobile phone's position by mounting it to the eye tracker's stand. In real life, a user's attention on the phone can vary depending on whether s/he is moving, whether s/he is right-handed or left-handed, or whether s/he is interacting with the phone with both hands. Other factors such as demographics can also influence user behavior. For example, depending on the user's language, s/he may read information on the phone from left to right or vice versa. Age and experience with touch interfaces are widely recognized throughout the research community as important factors in touch interactions, and can thus affect user attention and search behavior. Third, examination habits on the mobile device may vary across users, as noted by [2]. While Figure 7 already shows a clear pattern (most users prefer focusing on the top half of the phone screen), it is possible that a few users may prefer the center or bottom of the screen. In future work we plan to address this limitation by exploring the possibility of adaptively weighting user attention based on current user actions, e.g., the direction of page examination (upward or downward). Despite these limitations, our study offers the hope of accurately measuring, at scale, which result a user's eyes fixate on mobile phones. Future work will consider tablets (this study focused on mobile phones) and other devices, satisfaction with clickable results (including ads), and diverse user settings such as users who are moving or multi-tasking.

We demonstrated, for the first time, that by tracking the browser viewport (the visible portion of the page), one can develop viewport metrics that are strongly correlated with user attention (eye gaze) and search satisfaction on mobile phones. Focusing on answer-like results in a controlled lab study, we found that increased scrolling past the answer and increased time below the answer can signal user dissatisfaction with answer results. We demonstrated strong gaze-viewport correlations on a per-result basis, and found that attention (on average) is focused on the top half of the phone, suggesting that we may infer the amount of attention received by a specific result (of the 3-4 results shown in the viewport) scalably and reliably using viewport data alone. Potential applications of this work include better estimation of result relevance and satisfaction in search, and could benefit other areas including advertising, web page design and optimization, and measuring engagement in social networking ...
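One way to operationalize the "appropriate weighting functions" idea above is to weight each result's visible pixels by a vertical attention prior peaked above the screen center, reflecting the fixation distribution reported in the study. The sketch below is hypothetical: the Gaussian form of the prior and its parameters are our assumptions, not a model from the paper.

```python
# Weight a result's viewport time by an assumed vertical attention prior.
# The prior is a Gaussian over screen rows, peaked above the screen centre
# (the study found the median fixated y position above the centre).
import numpy as np

def attention_weighted_time(viewports, screen_h=568, peak_frac=0.4, sigma_frac=0.25):
    """viewports: list of (dwell_ms, top_px, bottom_px) giving, for one result,
    its on-screen extent in each viewport; returns prior-weighted time."""
    ys = np.arange(screen_h)
    prior = np.exp(-0.5 * ((ys - peak_frac * screen_h) / (sigma_frac * screen_h)) ** 2)
    prior /= prior.sum()  # attention probability mass at each screen row

    total = 0.0
    for dwell_ms, top, bottom in viewports:
        top, bottom = max(0, int(top)), min(screen_h, int(bottom))
        if bottom > top:
            # Credit dwell time by the attention mass covering the result.
            total += dwell_ms * prior[top:bottom].sum()
    return total
```

Under such a weighting, a result pinned to the bottom third of the screen accrues little credited attention even if it stays visible for a long time, which matches the fixation heatmap described earlier.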

Citations

... NewsMoment logs its users' news-browsing and news-reading behaviors. Specifically, it tracks the position of the user's viewport in the news article, inspired by prior analytical work [47,48]. Each viewport is defined as the lines of content presented on the user's screen; the number of viewports therefore depends largely on both the screen size and the font size. ...
... Along with scanning, we deem unengaged reading to be a shallower reading mode that induces insufficient processing of news articles. Inspired by [47,48], Figure 5 presents how the four reading modes differed in terms of the participants' average dwell time at each viewport position. Unlike in the two deeper reading modes, the dwell time for scanning was close to zero until the participants reached certain points in the text that they desired to scrutinize. ...
Article
Full-text available
News notifications on smartphones provide a convenient way to stay informed, but their delivery timing can influence user engagement. Despite this, research on the impact of notification timing on reading behavior remains limited. Therefore, we developed NewsMoment, a news aggregation app that monitors user reading patterns and sends news notifications. Our experience sampling study with 46 NewsMoment users revealed four distinct reading modes: typical, comprehensive, scanning, and unengaged. Deep reading, encompassing the typical and comprehensive modes, more often occurred during self-initiated browsing rather than through pushed news. Interestingly, the shallow reading modes (unengaged and scanning) showed varying prevalence, associated triggers, and engagement, despite their similarities. Importantly, unengaged reading persisted regardless of users' perceived moment opportuneness, whereas scanning reading was more common during inopportune moments. These findings suggest that identifying opportune moments for news reading may primarily reduce scanning reading, without substantially impacting unengaged reading.
... For example, eye movement patterns have been used to recognise human activities such as reading or common office activities [18]. Implicit gaze interaction signals can be used to recognise and infer humans' latent behaviours, including measurement of users' preferences [145], attention [56,57,169,177], interests [150], individual stress [99], emotional states [173], and mental disorders such as schizophrenia and autism spectrum disorder [7,50,217]. For example, Deng et al. [37] have analysed the eye movements of drivers to predict their fixations and understand their attention allocation over scenes, and Pan et al. [182] have explored the use of eye-tracking technologies to predict drivers' lane-changing intention. ...
... GazeRoomLock [72] combines gaze and head pose for user authentication in VR applications. EyeVeri [224] applies signal processing and pattern matching techniques to explore conscious and ...

| Ref | Year | Sensor | Feature | Modality | Leverage | Application |
|---|---|---|---|---|---|---|
| [145] | 2014 | Tobii X60 | dwell | - | I | attention analysis |
| [206] | 2015 | modified prototype | gaze gesture | - | E | interface control |
| [193] | 2016 | Tobii EyeX | dwell | touch | I | interface control |
| [224] | 2016 | phone camera | gaze basic events | - | E | user authentication |
| [125] | 2016 | Facelab 5 | gaze basic events | - | I | website usability test |
| [150] | 2017 | phone camera | dwell | - | I | intention inference |
| [120] | 2017 | phone camera | gaze gesture | touch | E | user authentication |
| [261] | 2017 | Tobii EyeX | gaze basic events | touch | I | gaze adaptive UI |
| [267] | 2017 | phone camera | gaze gesture | - | E | gaze input |
| [121] | 2017 | external RGB cam | gaze gesture | touch | E | user authentication |
| [227] | 2018 | phone camera | gaze basic events | - | I | attention inference |
| [219] | 2019 | Tobii 4C | dwell | touch | E | text editing aids |
| [202] | 2019 | Tobii 4C | dwell | touch | E | text interface control |
| [167] | 2020 | phone camera | dwell | voice | E | map navigation |
| [220] | 2020 | phone camera | eye image | - | I | ocular exam |
| [239] | 2020 | phone camera | dwell | touch | E | cross-device control |
| [58] | 2020 | phone camera | dwell | eyelid | E | interface control |
| [112] | 2020 | phone camera | gaze basic events | - | I | attention analysis |
| [254] | 2021 | phone camera | gaze gesture | touch | E | gaze-assist input |
| [133] | 2021 | phone camera | dwell | hand motion | E | interface control |
| [180] | 2021 | Tobii X2 | gaze basic events | - | I | attention analysis |
| [153] | 2021 | phone camera | dwell | - | E | target selection |
| [123] | 2021 | Tobii 4C | dwell | voice | I | implicit note-taking |
| [122] | 2022 | external RGB cam | gaze gesture | touch | E | user authentication |
| [274] | 2022 | phone camera | dwell | voice | E | text correction |
| [10] | 2022 | external RGB cam | gaze & face | - | I | user privacy |
| [266] | 2022 | phone camera | eye image | - | I | holding posture detection |
| [275] | 2022 | external RGB cam | gaze basic events | * | E | gaze command definition |
| [103] | 2022 | Tobii X2 | gaze basic events | - | I | attention analysis |
| [107] | 2022 | SMI Glasses | gaze basic events | - | I | learning process of typing |
| [105] | 2022 | watch camera | face position | hand motion | E | spatial user interfaces |
| [175] | 2023 | phone camera | dwell, pursuit, gesture | - | E | gaze UI usability test |
| [147] | 2023 | phone camera | dwell, pursuit, gesture | - | I&E | gaze UI usability test |

Note: Feature is the gaze feature used for interaction; Modality is the other modality used alongside gaze; Leverage is the way gaze is leveraged (E = explicit, I = implicit); gaze basic events: fixation, saccade, smooth pursuit, etc.; *: including eyelids, mouth, and head. ...
Article
Full-text available
In recent years we have witnessed an increasing number of interactive systems on handheld mobile devices which utilise gaze as a single or complementary interaction modality. This trend is driven by the enhanced computational power of these devices, higher resolution and capacity of their cameras, and improved gaze estimation accuracy obtained from advanced machine learning techniques, especially in deep learning. As the literature is fast progressing, there is a pressing need to review the state of the art, delineate the boundary, and identify the key research challenges and opportunities in gaze estimation and interaction. This paper aims to serve this purpose by presenting an end-to-end holistic view in this area, from gaze capturing sensors, to gaze estimation workflows, to deep learning techniques, and to gaze interactive applications.
... Their satisfaction with rankings is often affected by complicated factors such as the relations between results, the local context of the request, the personal preferences of the users, etc. Thus, research has been conducted in IR to provide a better understanding of user behaviors in online systems, which has resulted in numerous user-centric evaluation metrics such as satisfaction scores [23,55] and engagement [52,54,102]. We refer to these as complex ranking metrics. ...
Preprint
Ranking is at the core of Information Retrieval. Classic ranking optimization studies often treat ranking as a sorting problem, with the assumption that the best ranking performance is achieved by ranking items according to their individual utility. Accordingly, numerous ranking metrics have been developed, and learning-to-rank algorithms designed to optimize these simple performance metrics have been widely used in modern IR systems. As applications evolve, however, people's needs in information retrieval have shifted from simply retrieving relevant documents to more advanced information services that satisfy their complex working and entertainment needs. Thus, more complicated and user-centric objectives, such as user satisfaction and engagement, have been adopted to evaluate modern IR systems today. Those objectives, unfortunately, are difficult to optimize under existing learning-to-rank frameworks, as they are subject to great variance and complicated structures that cannot be explicitly explained or formulated with math equations like those simple performance metrics. This leads to the following research question: how can we optimize result ranking for complex ranking metrics without knowing their internal structures? To address this question, we conduct a formal analysis of the limitations of existing ranking optimization techniques and describe three research tasks in Metric-agnostic Ranking Optimization. Through the discussion of potential solutions to these tasks, we hope to encourage more people to look into the problem of ranking optimization in complex search and recommendation scenarios.
... From the viewport data, it is possible to infer user attention on a document with performance comparable to eye-tracking [11,13,14,20]. This can be achieved by using the viewport scrolling data to infer the time spent on individual parts of the document, instead of just capturing the time a student spends on the entire document, as is done in time-on-task approaches. ...
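A minimal sketch of this idea, under an assumed log format (timestamped scroll offsets) and known section boundaries, might apportion reading time across document sections like this:

```python
# Apportion reading time to individual document sections from a scroll trace,
# instead of a single time-on-task value for the whole document.
def dwell_per_section(scroll_events, section_bounds, viewport_h):
    """scroll_events: time-ordered list of (t_ms, scroll_y);
    section_bounds: dict mapping section id -> (top_px, bottom_px)
    in document coordinates; viewport_h: viewport height in pixels."""
    dwell = {s: 0.0 for s in section_bounds}
    for i in range(len(scroll_events) - 1):
        t_ms, y = scroll_events[i]
        dt = scroll_events[i + 1][0] - t_ms  # time this viewport was shown
        for s, (top, bottom) in section_bounds.items():
            visible = max(0, min(bottom, y + viewport_h) - max(top, y))
            if visible > 0:
                # Credit time in proportion to the visible share of the section.
                dwell[s] += dt * visible / (bottom - top)
    return dwell
```

Weighting by the visible share, rather than crediting the full dwell to every partially visible section, mirrors the exposure factor used in the viewport-metric work cited here.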
... This process was repeated for the remaining 11 tasks, which were displayed to them in random order. Other researchers have also employed randomisation for condition allocation to minimise topic-ordering effects [15,36]. ...
Preprint
Full-text available
The Search Engine Results Page (SERP) has evolved significantly over the last two decades, moving away from the simple ten-blue-links paradigm to considerably more complex presentations that contain results from multiple verticals and granularities of textual information. Prior works have investigated how user interactions on the SERP are influenced by the presence or absence of heterogeneous content (e.g., images, videos, or news content), the layout of the SERP (list vs. grid layout), and task complexity. In this paper, we reproduce the user studies conducted in prior works, specifically those of Arguello et al. [4] and Siu and Chaparro [29], to explore to what extent the findings from research conducted five to ten years ago still hold today, as the average web user has become accustomed to SERPs with ever-increasing presentational complexity. To this end, we designed and ran a user study with four different SERP interfaces: (i) a heterogeneous grid; (ii) a heterogeneous list; (iii) a simple grid; and (iv) a simple list. We collected the interactions of 41 study participants over 12 search tasks for our analyses. We observed that SERP types and task complexity affect user interactions with search results. We also find evidence to support most (6 out of 8) observations from [4, 29], indicating that user interactions with different interfaces and on tasks of different complexity have remained mostly similar over time.
... Recently, studies have suggested viewport logging to assess users' attention on mobile phones at scale (Lagun et al. 2014). The viewport can be characterized as the "portion of the web page that is visible on the phone's screen at a given point in time" (Lagun et al. 2014, p. 2). ...
... The viewport changes and reveals previously hidden parts as the user interacts with the website (e.g., by scrolling the page), resulting in path data of the time that website elements (e.g., a search result, an ad) are visible to the user (e.g., Lagun et al. 2014). Research has successfully employed viewport data for studying active information retrieval in goal attainment (e.g., news articles; Grusky et al. 2017). ...
Article
Ad avoidance (e.g., “blinding out” digital ads) is a substantial problem for advertisers. Avoiding mobile banner ads differs from active ad avoidance in nonmobile (desktop) settings, because mobile phone users interact with ads to avoid them: (1) They classify new content at the bottom of their screens; if they see an ad, they (2) scroll so that it is out of the locus of attention and (3) position it at a peripheral location at the top of the screen while focusing their attention on the (non-ad) content in the screen center. Introducing viewport logging to marketing research, we capture granular ad-viewing patterns from users’ screens (i.e., viewports). While mobile users’ ad-viewing patterns are concave over the viewport (with more time at the periphery than in the screen center), viewing patterns on desktop computers are convex (most time in the screen center). Consequently, we show that the effect of viewing time on recall depends on the position of an ad in interaction with the device. An eye-tracking study and an experiment show that 43% to 46% of embedded mobile banner ads are likely to suffer from ad avoidance, and that ad recall is 6 to 7 percentage points lower on mobile phones (versus desktop).
... Han et al. [25] found that mobile touch interaction signals on the SERP were more effective than landing-page signals for predicting content relevance. Lagun et al. [26] showed that scrolling past search cards and spending more time on content below search cards are clear signals of non-relevance. Huang and Diriye [27] pointed out that changing viewport coordinates are more accurate than user touch coordinates for predicting content relevance in mobile search. ...
Preprint
Full-text available
Searcher struggle is important feedback to Web search engines. Existing Web search struggle detection methods rely on effort-based features to identify the struggling moments. Their underlying assumption is that the more effort a user spends, the more struggling the user may be. However, recent studies have suggested this simple association might be incorrect. This paper proposes a new feature modulation method for struggle detection and refers to the reversal theory in psychology. The reversal theory (RT) points out that instead of having a static personality trait, people constantly switch between opposite psychological states, complicating the relationship between the efforts they spend and the level of frustration they feel. Supported by the theory, our method modulates the effort-based features based on reversal theory’s bi-modal arousal model. Evaluations on week-long Web search logs confirm that the proposed method can statistically significantly improve state-of-the-art struggle detection methods.
... Besides, interaction data (e.g., mouse clicks and movements) have been collected and explored in laboratory user studies (Huang et al., 2011) and diary studies (Teevan et al., 2004; Chen et al., 2021). Eye-tracking, which can capture real-time eye movements effectively, is a favored technique for investigating user examination behavior and, therefore, has been utilized in various search scenarios (Hotchkiss et al., 2005; Lagun et al., 2014; Xie et al., 2017; Li et al., 2018; Zheng et al., 2020). The "Golden Triangle" pattern was observed on traditional ten-blue-link pages in web search (Hotchkiss et al., 2005). ...
Article
Modern search engine result pages (SERPs) become increasingly complex with heterogeneous information aggregated from various sources. In many cases, these SERPs also display results in the right rail besides the traditional left-rail result lists, which change the linear result list to a non-linear panel and might influence user search behavior patterns. While user behavior on the traditional ranked result list has been well studied in existing works, it still lacks a thorough investigation of the effects caused by the right-rail results, especially on complex SERPs. To shed light on this research question, we conducted a user study, which collected participants’ eye movements, detailed interaction behavioral logs, and feedback information. Based on the collected data, we analyze the influence of right-rail results on users’ examination patterns, search behavior, perceived workload, and satisfaction. We further construct a user model to predict users’ examination behavior on non-linear SERPs. Our work contributes to understanding the effects of the right-rail results on users’ interaction patterns, benefiting other related research, such as the evaluation and UI optimization of search systems.
... -Scroll: scrolling up and down a page without reading the content may be a signal of frustration and lack of confidence, and is a frequent action in mobile browsing [18]. Relatedly, [19] concluded that when users scroll frequently only to find irrelevant information during navigation, they are less satisfied. Furthermore, "the current scrolling method for a mobile device is both time-consuming and fatigue-prone" [20]. ...
Chapter
Full-text available
In recent years, the usage of mobile browsers has experienced astonishing growth. Nowadays, most citizens use their mobile phones instead of their laptops to surf the Web, owing to their immediate availability. Nevertheless, Web design is typically performed with laptop screen dimensions in mind, and websites are readjusted to mobile screen resolutions using Responsive Web Design. This conversion to smaller screen resolutions causes some drawbacks for mobile navigation. Web Augmentation is an effective methodology that allows end-users to customize third-party websites according to their needs. This technique can mitigate the problems caused by small screen resolutions.
... The study began with a warm-up task. The main tasks were shown in a random order to balance order effects [24]. Each participant spent about 1.5 hours completing the main tasks and received $18 for their involvement. ...
... Previous works [24,54] found a strong correlation between viewport data and user attention. We utilized the weighted viewport, which had the strongest correlation with user attention [24,54], to reduce presentation bias. ...

... The weighting factor is calculated as $w_{i,j} = h_{i,j}^2 / (h_i \cdot h_j)$, where $h_{i,j}$ is the visible height of section $j$ in the $i$-th viewport, $h_i$ is the height of the $i$-th viewport, and $h_j$ is the actual height of section $j$. ...
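Read as exposure times coverage, the reconstructed factor is straightforward to implement; the sketch below uses our own variable names and an assumed per-viewport log format.

```python
# Weighted viewport time for one section j: sum over viewports i of
# dwell_i * w_ij, with w_ij = h_ij^2 / (h_i * h_j), i.e. result exposure
# (h_ij / h_j) multiplied by viewport coverage (h_ij / h_i).
def weighted_viewport_time(viewports, section_h):
    """viewports: list of (dwell_ms, visible_h, viewport_h) tuples for one
    section, where visible_h = h_ij and viewport_h = h_i; section_h = h_j."""
    total = 0.0
    for dwell_ms, h_ij, h_i in viewports:
        w = (h_ij ** 2) / (h_i * section_h)  # exposure * coverage
        total += dwell_ms * w
    return total
```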