Figure 8 - uploaded by Dmitry Lagun
This figure shows fairly strong gaze-viewport correlations. In each panel, the x axis is a gaze measure and the y axis is the corresponding viewport measure. Left panels show time measures in milliseconds; right panels show time measures as a % of time on all results on the page.


Source publication
Conference Paper
Full-text available
Web Search has seen two big changes recently: rapid growth in mobile search traffic, and an increasing trend towards providing answer-like results for relatively simple information needs (e.g., [weather today]). Such results display the answer or relevant information on the search page itself without requiring a user to click. While clicks on orga...

Contexts in source publication

Context 1
... time on result (in ms and in %) generally decreases with result position, but we find a surprising bump at positions 2 and 3 (% time on the second result is significantly higher than on the first: t(528) = -2.2, p = 0.02; % time on the third result is also higher than on the first: t(504) = -3.7, p < 0.001). We verified that this is not a bug but a genuine feature of the mobile data. (For Page measures, p-values are computed with repeated-measures ANOVA; for Viewport and Gaze measures, the Wilcoxon rank-sum test is used.) One possible explanation for the bump is the prevalence of short scrolls on mobile phones. Figure 6 illustrates this with an example: unlike desktop, where the page up/down keys move the user from one page fold to another, non-overlapping fold, mobile users often perform short scrolls that render the second or third result visible across more viewports, and for longer, than the first result. For navigational tasks where users mostly click the first result (e.g., [twitter]), scrolling is unlikely and viewport time may well decrease with position; this remains to be tested in a future study.

An obvious question is whether the bump at positions 2 and 3 is an artifact of viewport data or a real attention phenomenon that shows up in eye gaze as well. The right panels of Figure 5 show gaze time on each result in milliseconds (top-right panel) and in % (bottom-right panel) as a function of result position (x axis). As with viewport time, we find a main effect of result position (rank) on time on result (F(9, 1720) = 15.1, p < 0.001) and a bump at position 2 (% time on the second result is significantly higher than on the first: t(343) = -2.3, p = 0.02). We believe this, too, is a function of scrolling: because of the small screen size of phones, the second result is often only partially visible, and to bring it fully into view the user must adjust the scroll distance while continuing to look at the result until its bottom portion comes into view. This non-monotonic decay of attention with rank position may have implications for result ranking and for the design of a novel discount function (as opposed to MAP or NDCG [16]) that better reflects user experience in mobile search. We plan to investigate this question in future work.
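As a concrete illustration of what such a discount might look like, the Python sketch below compares a standard logarithmic DCG discount with a hypothetical non-monotonic variant that boosts position 2; the bump factor and the gain function are illustrative assumptions, not values proposed in the paper.

import math

def dcg_with_discount(relevances, discount):
    # Generic DCG: sum of gain(rel) * discount(rank) over ranked results.
    return sum((2 ** rel - 1) * discount(rank)
               for rank, rel in enumerate(relevances, start=1))

def log_discount(rank):
    # Standard NDCG-style discount, monotonically decreasing in rank.
    return 1.0 / math.log2(rank + 1)

def mobile_discount(rank, bump=1.15):
    # Hypothetical mobile discount: boosts position 2 to mimic the
    # observed attention bump; the 1.15 factor is illustrative only.
    base = 1.0 / math.log2(rank + 1)
    return base * bump if rank == 2 else base

rels = [3, 2, 2, 1, 0]
print(dcg_with_discount(rels, log_discount))     # standard DCG
print(dcg_with_discount(rels, mobile_discount))  # mobile-weighted variant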
Figure 7 shows the attention distribution across all users and conditions in our study. The left panel shows a heatmap of gaze activity; note that the red hotspots of increased attention are clearly shifted toward the top half of the screen. The right panel shows the distribution of eye fixations as a function of y position. The median fixated y position was 224 pixels, above the screen center (290 pixels). On average, almost 70% of users' attention was focused on the top half of the phone screen, with little or no attention paid to the bottom third of the screen (only 14%). This trend was consistent on a per-user basis (20 of 24 users showed the preference for the top half of the screen). We hypothesize that weighting viewport measurements by this attention distribution may further improve gaze-viewport correlations.
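A minimal sketch of such weighting follows, assuming a coarse fixation-density profile over screen y; the bin boundaries, densities, and screen height are illustrative stand-ins, not the study's measured distribution.

import numpy as np

# Hypothetical fixation-density profile over screen y, peaked above center
# (the study reports ~70% of attention in the top half of the screen).
SCREEN_H = 580                                    # px, illustrative
y_bins = np.array([0, 145, 290, 435, SCREEN_H])   # quartiles of screen height
fix_density = np.array([0.25, 0.45, 0.16, 0.14])  # sums to 1; illustrative

def attention_weight(top, bottom):
    # Fraction of total fixation density falling in the vertical slice
    # [top, bottom) that a result occupies on screen.
    overlap = np.clip(np.minimum(bottom, y_bins[1:]) -
                      np.maximum(top, y_bins[:-1]), 0, None)
    return float(np.sum(fix_density * overlap / np.diff(y_bins)))

t_v = 1200.0  # ms the result was visible in this viewport
weighted_t_v = t_v * attention_weight(top=60, bottom=350)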
We have already shown in the previous section that viewport metrics can signal the relevance of answer-like results and reflect users' satisfaction with the search. In this section we investigate whether viewport data can serve an additional purpose: tracking user attention. To this end, we correlate result viewing time as measured by eye tracking with result viewing time as measured from viewport data. If a reasonably strong correlation between gaze and viewport time exists, it implies that we can measure user attention at scale from viewport data alone. We analyze viewing time on a per-result basis, pooling all the data collected in the user study regardless of experimental condition, relevance, result position, and result type (traditional web results vs. answer-like results). We hypothesize that viewport time alone may be a poor proxy for the user's attention, so to refine our measurements we account for two factors, result exposure and viewport coverage, defined as follows. Let v index viewports and let t_v be the time the result was visible in viewport v. Result exposure e_v is how much of the result's area was visible to the user, and viewport coverage c_v is how much of the viewport's real estate the result occupied. Total viewport time on a result using all factors is computed as $\sum_{v=1}^{n} t_v \cdot c_v \cdot e_v$, where n is the number of viewports. Table 4 reports the gaze-viewport correlations for combinations of the above factors. We denote the baseline, viewport time $= \sum_{v=1}^{n} t_v$, as C1. The best combination among C1-C4 is C4 (with C2 close behind), which weights viewport time by both result exposure and viewport coverage.
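A minimal sketch of how C1 and C4 might be computed from per-viewport observations; the ViewportObs record and its field names are assumptions, as the paper defines only t_v, e_v, and c_v.

from dataclasses import dataclass

@dataclass
class ViewportObs:
    # One viewport "frame" in which a given result was (partly) visible.
    t_ms: float      # t_v: time the result was visible in this viewport
    exposure: float  # e_v: fraction of the result's area that was visible
    coverage: float  # c_v: fraction of the viewport the result occupied

def viewport_time_c1(obs):
    # Baseline C1: raw sum of visible time.
    return sum(o.t_ms for o in obs)

def viewport_time_c4(obs):
    # C4: time weighted by both factors, i.e. sum over v of t_v * c_v * e_v.
    return sum(o.t_ms * o.coverage * o.exposure for o in obs)

obs = [ViewportObs(800, 0.4, 0.30),   # result partially below the fold
       ViewportObs(1500, 1.0, 0.35),  # fully visible after a short scroll
       ViewportObs(400, 0.7, 0.25)]   # scrolling past it
print(viewport_time_c1(obs))  # 2700.0 ms
print(viewport_time_c4(obs))  # 96 + 525 + 70 = 691.0 ms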
The scatter plots in Figure 8 are generated using C4. Figure 8 (top-left panel) plots viewport time on result against gaze time on result, both in milliseconds; each data point is a (user, query, condition, result) tuple. The correlation is reasonably strong (Pearson's r = 0.57; the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). Figure 8 (top-right panel) is similar, but shows percent time on result (time on result / time on all results on the page) as measured by gaze (x axis) and viewport (y axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that normalizing by time on all results on the page adjusts for the pace at which users read: some users quickly glance and skim through the results while others read carefully. Across such users, the absolute time measure varies widely while the percent time measure stays relatively stable.
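For clarity, here is a sketch of how a raw and a decile-binned Pearson correlation can be computed; the synthetic data merely stands in for the study's per-result measurements, and the binning follows the figure's description.

import numpy as np
import pandas as pd
from scipy import stats

# gaze_ms and viewport_ms would be per-(user, query, condition, result)
# viewing times; random data here stands in for the study's measurements.
rng = np.random.default_rng(0)
gaze_ms = rng.gamma(shape=2.0, scale=1500, size=500)
viewport_ms = gaze_ms * rng.normal(1.0, 0.35, size=500).clip(0.1)

raw_r, _ = stats.pearsonr(gaze_ms, viewport_ms)

# Decile binning: group points by deciles of the gaze measure and
# correlate the per-decile means, as for the blue line in Figure 8.
df = pd.DataFrame({"gaze": gaze_ms, "viewport": viewport_ms})
df["decile"] = pd.qcut(df["gaze"], 10, labels=False)
binned = df.groupby("decile")[["gaze", "viewport"]].mean()
binned_r, _ = stats.pearsonr(binned["gaze"], binned["viewport"])

print(f"raw r = {raw_r:.2f}, binned r = {binned_r:.2f}")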
Since 3-4 results may be shown in the viewport simultaneously, the observed per-result gaze-viewport correlation (raw r = 0.69, binned-by-decile r = 0.97) is high and suggests that we can reliably infer how long a specific result was seen by the eye from viewport data alone. The middle and bottom panels of Figure 8 are similar to the top panels and show gaze-viewport correlations for other measures: time spent below the KG (mid-left panel) and percent time spent below the KG (i.e., time below KG / time on all results on the page, mid-right panel), again measured by gaze (x axis) and viewport (y axis). Here too we find strong gaze-viewport correlations, and again the % time measures correlate more strongly than the millisecond measures (time below KG: r = 0.71; % time below KG: r = 0.86). The bottom panels of Figure 8 show the correlations for time below the IA (r = 0.59) and % time below the IA (r = 0.81). Across all three rows of Figure 8, the percent time measures, normalized by time on page, show higher gaze-viewport correlations than the millisecond measures, for the reasons discussed above.
To our knowledge, this is the first quantitative mobile eye-tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. Prior work has focused on search behavior on desktops. These studies report a Golden Triangle [23], in which searcher attention is concentrated near the top-left of the page and decreases toward the bottom and right of the SERP. It is not clear that attention behavior on desktop generalizes to mobile phones, which differ from desktops in several ways: small screen real estate, a variety of touch interactions (touch, scroll, swipe, zoom), and a tendency toward short queries. In this study we found that user attention behavior on mobile phones is indeed very different from that on desktops. First, unlike desktop, where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones we found, surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation is short scrolls. Whereas desktop searchers can use the page up/down keys to jump from one page fold to the next, with no overlap between the results in different folds, mobile users tend to perform short, continuous scrolls that keep the second and third results visible across more viewports, and hence for longer, than the first; Figure 6 illustrates this with an example. This bias toward the second position occurs in eye gaze too. We think this is because the second result is often partially hidden: to bring it fully into view, the user must scroll carefully (to avoid scrolling too much or too little) while continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. In the absence of scrolling, viewport and gaze time on mobile results may well decrease with position, as on desktop; navigational tasks ("BBC"), where the user typically clicks the first result, may not require scrolling and may show higher viewport time on the first result than the second. In our study, however, all tasks were non-navigational and often involved scrolling. An intriguing question that immediately follows is whether there is an evaluation metric or rank-position discount that reflects user experience on mobile phones better than current metrics such as mean average precision or discounted cumulative gain.
The second finding that differs between mobile phones and desktop concerns where attention falls on screen. Unlike the desktop Golden Triangle, where attention is focused at the top-left and decreases toward the bottom and right of the search results page, on mobile phones we found that, on average, user attention is focused on the center and top half of the screen. Together with the already strong gaze-viewport correlations (r = 0.7 for % time on a page element, as shown in Figure 8), this suggests that with appropriate weighting functions on viewport data we may identify which result the user is looking at, with high confidence. In other words, this offers an opportunity, for the first time, to measure user attention on mobile phones scalably and reliably. Another possible direction for improving the accuracy of attention measurement is to follow the work of Huang et al. [14] and Navalpakkam et al. [24], who advocate directly predicting user attention on screen from user interactions. Although the absence of cursor movements on mobile phones makes attention prediction more difficult, we hypothesize that features such as the smaller screen size and the time the user spends in a viewport without scrolling can improve on the "vanilla" approach that uses viewport time information only.
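A sketch of what such interaction features might look like, assuming viewport logs arrive as (timestamp_ms, scroll_y) samples; the event format and feature set are assumptions, not the cited papers' implementations.

def interaction_features(viewport_events):
    # viewport_events: list of (timestamp_ms, scroll_y) samples.
    ys = [y for _, y in viewport_events]
    # Dwell time accumulated while the scroll position did not change.
    dwells = [t2 - t1 for (t1, y1), (t2, y2) in
              zip(viewport_events, viewport_events[1:]) if y1 == y2]
    scroll_deltas = [abs(y2 - y1) for y1, y2 in zip(ys, ys[1:]) if y1 != y2]
    return {
        "total_dwell_no_scroll_ms": sum(dwells),
        "num_scrolls": len(scroll_deltas),
        "mean_scroll_delta_px": (sum(scroll_deltas) /
                                 max(1, len(scroll_deltas))),
    }

events = [(0, 0), (900, 0), (1000, 180), (2500, 180), (2600, 320)]
print(interaction_features(events))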
In addition to understanding searcher attention on mobile phones, we examined search satisfaction and its effect on viewport data. We systematically varied task relevance (whether the KG/IA contained the answer to the user's task) and found that users reported being less satisfied when the KG/IA was task-irrelevant than when it was relevant. We also identified viewport metrics that signal user dissatisfaction with answers: increased scrolling down the SERP and increased % time below the answer. When the KG/IA was task-irrelevant, users read through it (expecting to find the answer) and, upon not finding the answer, continued to examine the results below, leading to increased scrolling down the page and increased time below the KG/IA (in milliseconds and as a % of page time). These results suggest that we may be able to auto-detect answer satisfaction at scale using viewport signals.
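As an illustration of how these two signals could be combined into a scalable detector, here is a toy scoring function; the weights and normalization are purely illustrative, not fitted values from the study.

def dissatisfaction_score(scroll_count, pct_time_below_answer,
                          w_scroll=0.5, w_below=0.5):
    # Toy composite of the two viewport signals the study links to answer
    # dissatisfaction; weights and scaling are illustrative only.
    # Normalize scroll count against the study's average of 2.51 scrolls.
    scroll_signal = min(scroll_count / 2.51, 2.0) / 2.0
    return w_scroll * scroll_signal + w_below * pct_time_below_answer

# Heavy scrolling plus most time spent below the KG/IA scores high
# (likely dissatisfied with the answer):
print(dissatisfaction_score(scroll_count=5, pct_time_below_answer=0.8))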
We acknowledge several limitations of this study. First, we focused on tasks with information-seeking search intent and did not explore navigational intent [3]. In our data we observed 2.51 viewport scrolls on average; we expect scrolling activity to be lower for navigational searches, where the first result is often the destination site (e.g., queries like "BBC" or "Twitter"), and in the absence of scrolling we may find that attention strictly decreases with result position (unlike the bump at position 2 observed in this study). Second, we fixed the mobile phone's position by mounting it to the eye tracker's stand. In real life, a user's attention on the phone can vary with whether they are moving, whether they are right- or left-handed, and whether they interact with the phone using both hands. Demographic factors can also influence behavior: depending on the user's language, they may read the screen from left to right or right to left, and age and experience with touch interfaces are widely recognized throughout the research community as important factors in touch interaction, and thus can affect attention and search behavior. Third, examination habits on mobile devices vary across users, as noted by [2]. Although Figure 7 shows a clear pattern, with most users preferring the top half of the phone screen, a few users may prefer the center or bottom. In future work we plan to address this limitation by adaptively weighting user attention based on current user actions, e.g., the direction of page examination (upward or downward).
Despite these limitations, our study offers the prospect of accurately measuring, at scale, which result a user's eyes fixate on mobile phones. Future work will consider tablets (this study focused on phones) and other devices, satisfaction with clickable results (including ads), and more diverse settings, such as users who are moving or multi-tasking. We demonstrated, for the first time, that by tracking the browser viewport (the visible portion of the page) one can develop viewport metrics that are strongly correlated with user attention (eye gaze) and search satisfaction on mobile phones. Focusing on answer-like results in a controlled lab study, we found that increased scrolling past the answer and increased time below the answer can signal user dissatisfaction with answer results. We demonstrated strong gaze-viewport correlations on a per-result basis and found that attention is, on average, focused on the top half of the phone, suggesting that we may infer the amount of attention received by a specific result (of the 3-4 results shown in the viewport) scalably and reliably from viewport data alone. Potential applications of this work include better estimation of result relevance and satisfaction in search, and could benefit other areas, including advertising, web page design and optimization, and measuring engagement in social networking ...
Prior work has focused on search behavior in desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways – small real estate, variety of touch interactions (touch, scroll, swipe, zoom) and tendency to perform short queries. In this study, we found that indeed, user attention behavior on mobile phones is very different from that on desktops. First, unlike desktop where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones, we found surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (no overlap between the results in different page folds), on mobile phones, users tend to perform short and continuous scrolls that render the second and third results visible across more viewports and hence longer than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. We think this is because the second result is often partially hidden, and to bring it fully into view, the user has to carefully scroll (to avoid scrolling too much or too little) by continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. It is possible that in the absence of scrolling, viewport and gaze time on results (in mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC") where the user often clicks the first result, may not require scrolling, and may show higher viewport time on the first than second result. In our study, however, all tasks were non-navigational, and often in- volved scrolling. An intriguing question that immediately follows is, whether there is a more appropriate evaluation metric or rank position discount that better reflects user experience on mobile phones than current evaluation metrics, such as mean average precision or discounted cumulative gain. The second finding which is different on mobile phones than desktop is that, unlike the Golden Triangle in desktop, where attention is focused on the top-left and decreases towards the bottom and right of the search result page, in our study on mobile phones, we found that on average, user attention is focused on the center and top half of the screen. This, together with the already strong gaze viewport correlations (r=0.7 for %time on a page element as shown in Fig 8) suggests that by using the appropriate weighting functions on viewport data, we may ...
Context 4
... on result (in ms, %) decreases with result position, we find a surprising bump at positions 2 and 3 (significantly higher % time on the second result than the first: t(528)=-2.2, p=0.02; and higher % time on the third result than the first: t(504)=-3.7, p < 0.001). Authors verified that this is not a bug and is indeed feature of the mobile data. One possible explanation for the bump at position 2 and 3 is the presence of short scrolls on mobile phones. Figure 6 illustrates this with an example – unlike desktop where the page up down keys allow users to move from one page fold to another non-overlapping page fold, in mobile phones, users often tend to perform short scrolls that may render the second or third result visible across more viewports and for longer time than the first result. It is possible that for navigational tasks where For Page measures the p-values are computed using the repeated measures ANOVA; for Viewport and Gaze measures Wilcoxon rank sum test is used. users mostly click the first result (e.g., twitter), since scrolling is unlikely, we may observe that viewport time decreases with position. This remains to be tested in a future study. An obvious question is whether the bump at position 2 or 3 is an artifact of viewport data, or is a real attention phenomenon that occurs with eye gaze too. The right panels in figure 5 show gaze time on each result in milliseconds (top-right panel) and in % (bottom- right panel) as a function of result position (x axis). Similar to viewport, we find a main effect of result position or rank on time on result (F(9, 1720) = 15.1, p < 0.001) and a bump at position 2 (% time on result is significantly higher for second result than the first: t(343)=-2.3, p=0.02). We believe this may be a function of scrolling too – due to the small screen size in phones, the second result may only be partially visible; in order to bring it fully into view, the user has to adjust the scroll distance by continuing to look at the second result until it its bottom portion comes into view. This finding of non-monotonic attention decay with rank position may have implications for results ranking and design of a novel discount function (as opposed to MAP or NDCG[16]) that better reflects user experience in mobile search. We plan to investigate this question in the future work. Figure 7 shows the attention distribution across all users and conditions in our study. The left panel shows a heatmap of gaze activity (note that the red hotspots of increased attention are clearly shifted to the top half of the screen). The right panel shows a distribution of eye fixations as a function of y position. The median fixated y position was 224 pixels which is above the screen center (290 pixels). Thus, we found that on average, almost 70 % of the users’ attention was focused on the top half of the phone screen, with little or no attention paid to the bottom 1/3 portion of the screen (only 14%). This trend was consistent on a per user basis (20/24 users showed the preference for top half of the screen). We hypothesize that weighting viewport measurements by this attention distribution may further improve gaze viewport correlations. We have already shown in the previous section that viewport metrics can signal relevance of answer like results and reflect user’s satisfaction with the search. In this section we investigate whether viewport data can serve for an additional benefit – tracking user attention. 
To this end, we attempt to correlate result viewing time measured with the eye tracking and viewport data. If a reasonably strong correlation between gaze and viewport time exists, it implies that we can measure user attention at scale from viewport data alone. We analyze viewing time on per-result basis. We gather all the data collected in the user study independent of experimental condition, relevance, result position and result type (traditional web results vs. answer-like results). We hypothesize that viewport time alone might provide a poor proxy for the user’s attention, thus, in order to refine our measurements we account for two factors: result coverage and exposure defined below. Let v denote the viewport. We explore different ways of computing viewport time on result as a combination of the time the result was visible on the viewport ( t v ) and two factors: how much of the result area was visible to a user (result exposure , e v ) and how much of the viewport real estate did the result occupy (viewport coverage , c v ). Total viewport time on result using all factors is computed as n v =1 ( t v ∗ c v ∗ e v ) , where v can take values from [1 , n ] ( n is the number of viewports). Table 4 reports the gaze-viewport correlations for combinations of the above factors. We denote the baseline approach computing n viewport time = v =1 t v as C1. We find that the best combination among C1-C4 is C4 (C2 is close), which is weighted by result exposure and viewport coverage. The scatter plots in Figure 8 are generated using C4. Figure 8 (top-left panel) shows the scatter plot of viewport time on result vs. gaze time on result, both measured in milliseconds. Each data point in the scatter plot is a (user, query, condition, result) tuple. The correlations are reasonably strong (Pearson’s correlation r=0.57; the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). Figure 8 (top-right panel) is similar, but shows a scatter plot of percent time on result (time on result / time on page) as measured by gaze (x axis) and viewport (y axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that the normalization (by time on all results on the page) helps adjust for the pace at which users read the page. For example, some users may quickly glance and skim through the results, while others may read carefully. In such cases, the absolute time measure will vary a lot while the percent time measure may be relatively stable. Since 3-4 results may be shown on the viewport simultaneously, the observed gaze-viewport correlation on a per-result basis (raw correlation of 0.69, binned correlation in deciles of 0.97) is high and suggests that we may reliably infer how long a specific result was seen by the eye, from the viewport data alone. The middle and bottom panels in Figure 8 are similar to the top panel, and show gaze viewport correlations for other measures, such as time spent below KG (mid-left panel) and percent time spent below KG (i.e., time below KG / time on all results on the page, mid-right panel) measured using gaze (x axis) and viewport (y axis). Here too, we find strong gaze viewport correlations, and again, the % time measures show higher correlations than time in millisecond measures (time below KG: r = 0.71, %time below KG: r = 0.86). 
The bottom panel in Figure 8 shows correlations for time below IA (r = 0.59) and % time below IA (r = 0.81). In all three figures, we find that the percent time measures, that are normalized by time on page, show higher gaze-viewport correlations than time in millisecond measures, for reasons discussed earlier. To our knowledge, this is the first quantitative mobile eye tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. Prior work has focused on search behavior in desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways – small real estate, variety of touch interactions (touch, scroll, swipe, zoom) and tendency to perform short queries. In this study, we found that indeed, user attention behavior on mobile phones is very different from that on desktops. First, unlike desktop where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones, we found surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (no overlap between the results in different page folds), on mobile phones, users tend to perform short and continuous scrolls that render the second and third results visible across more viewports and hence longer than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. We think this is because the second result is often partially hidden, and to bring it fully into view, the user has to carefully scroll (to avoid scrolling too much or too little) by continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. It is possible that in the absence of scrolling, viewport and gaze time on results (in mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC") where the user often clicks the first result, may not require scrolling, and may show higher viewport time on the first than second result. In our study, however, all tasks were non-navigational, and often in- volved scrolling. An intriguing question that immediately follows is, whether there is a more appropriate evaluation metric or rank position discount that better reflects user experience on mobile phones than current evaluation metrics, such as mean average precision or discounted cumulative gain. The second finding which is different on mobile phones than desktop is that, unlike the Golden Triangle in desktop, where attention is focused on the ...
Context 5
... at the second result until it its bottom portion comes into view. This finding of non-monotonic attention decay with rank position may have implications for results ranking and design of a novel discount function (as opposed to MAP or NDCG[16]) that better reflects user experience in mobile search. We plan to investigate this question in the future work. Figure 7 shows the attention distribution across all users and conditions in our study. The left panel shows a heatmap of gaze activity (note that the red hotspots of increased attention are clearly shifted to the top half of the screen). The right panel shows a distribution of eye fixations as a function of y position. The median fixated y position was 224 pixels which is above the screen center (290 pixels). Thus, we found that on average, almost 70 % of the users’ attention was focused on the top half of the phone screen, with little or no attention paid to the bottom 1/3 portion of the screen (only 14%). This trend was consistent on a per user basis (20/24 users showed the preference for top half of the screen). We hypothesize that weighting viewport measurements by this attention distribution may further improve gaze viewport correlations. We have already shown in the previous section that viewport metrics can signal relevance of answer like results and reflect user’s satisfaction with the search. In this section we investigate whether viewport data can serve for an additional benefit – tracking user attention. To this end, we attempt to correlate result viewing time measured with the eye tracking and viewport data. If a reasonably strong correlation between gaze and viewport time exists, it implies that we can measure user attention at scale from viewport data alone. We analyze viewing time on per-result basis. We gather all the data collected in the user study independent of experimental condition, relevance, result position and result type (traditional web results vs. answer-like results). We hypothesize that viewport time alone might provide a poor proxy for the user’s attention, thus, in order to refine our measurements we account for two factors: result coverage and exposure defined below. Let v denote the viewport. We explore different ways of computing viewport time on result as a combination of the time the result was visible on the viewport ( t v ) and two factors: how much of the result area was visible to a user (result exposure , e v ) and how much of the viewport real estate did the result occupy (viewport coverage , c v ). Total viewport time on result using all factors is computed as n v =1 ( t v ∗ c v ∗ e v ) , where v can take values from [1 , n ] ( n is the number of viewports). Table 4 reports the gaze-viewport correlations for combinations of the above factors. We denote the baseline approach computing n viewport time = v =1 t v as C1. We find that the best combination among C1-C4 is C4 (C2 is close), which is weighted by result exposure and viewport coverage. The scatter plots in Figure 8 are generated using C4. Figure 8 (top-left panel) shows the scatter plot of viewport time on result vs. gaze time on result, both measured in milliseconds. Each data point in the scatter plot is a (user, query, condition, result) tuple. The correlations are reasonably strong (Pearson’s correlation r=0.57; the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). 
Figure 8 (top-right panel) is similar, but shows a scatter plot of percent time on result (time on result / time on page) as measured by gaze (x axis) and viewport (y axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that the normalization (by time on all results on the page) helps adjust for the pace at which users read the page. For example, some users may quickly glance and skim through the results, while others may read carefully. In such cases, the absolute time measure will vary a lot while the percent time measure may be relatively stable. Since 3-4 results may be shown on the viewport simultaneously, the observed gaze-viewport correlation on a per-result basis (raw correlation of 0.69, binned correlation in deciles of 0.97) is high and suggests that we may reliably infer how long a specific result was seen by the eye, from the viewport data alone. The middle and bottom panels in Figure 8 are similar to the top panel, and show gaze viewport correlations for other measures, such as time spent below KG (mid-left panel) and percent time spent below KG (i.e., time below KG / time on all results on the page, mid-right panel) measured using gaze (x axis) and viewport (y axis). Here too, we find strong gaze viewport correlations, and again, the % time measures show higher correlations than time in millisecond measures (time below KG: r = 0.71, %time below KG: r = 0.86). The bottom panel in Figure 8 shows correlations for time below IA (r = 0.59) and % time below IA (r = 0.81). In all three figures, we find that the percent time measures, that are normalized by time on page, show higher gaze-viewport correlations than time in millisecond measures, for reasons discussed earlier. To our knowledge, this is the first quantitative mobile eye tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. Prior work has focused on search behavior in desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways – small real estate, variety of touch interactions (touch, scroll, swipe, zoom) and tendency to perform short queries. In this study, we found that indeed, user attention behavior on mobile phones is very different from that on desktops. First, unlike desktop where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones, we found surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (no overlap between the results in different page folds), on mobile phones, users tend to perform short and continuous scrolls that render the second and third results visible across more viewports and hence longer than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. 
We think this is because the second result is often partially hidden, and to bring it fully into view, the user has to carefully scroll (to avoid scrolling too much or too little) by continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. It is possible that in the absence of scrolling, viewport and gaze time on results (in mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC") where the user often clicks the first result, may not require scrolling, and may show higher viewport time on the first than second result. In our study, however, all tasks were non-navigational, and often in- volved scrolling. An intriguing question that immediately follows is, whether there is a more appropriate evaluation metric or rank position discount that better reflects user experience on mobile phones than current evaluation metrics, such as mean average precision or discounted cumulative gain. The second finding which is different on mobile phones than desktop is that, unlike the Golden Triangle in desktop, where attention is focused on the top-left and decreases towards the bottom and right of the search result page, in our study on mobile phones, we found that on average, user attention is focused on the center and top half of the screen. This, together with the already strong gaze viewport correlations (r=0.7 for %time on a page element as shown in Fig 8) suggests that by using the appropriate weighting functions on viewport data, we may identify which result the user is looking at, with high confidence. In other words, this offers an opportunity, for the first time, to scalably and reliably measure user attention on mobile phones. Another possible direction for improving accuracy of user attention measurements is to follow the work Huang et al. [14] and Navalpakkam et al. [24] that advocate to directly predict user attention on the screen from user interactions. While the absence of cursor movements in mobile phones makes attention prediction more difficult, we hypothesize that features of smaller screen size and the time user spends in the viewport without scrolling can be used to improve the accuracy of the “vanilla” approach that uses viewport time information only. In addition to understanding searcher attention on mobile phones, we examined search satisfaction and its effect on viewport data. We systematically varied task-relevance (whether the KG/IA contained the answer to the user’s task), and found that users reported less satisfied when the KG/IA was task-irrelevant than when it was relevant. We also identified viewport metrics that signal user dissatisfaction with answers – increased scrolling down the SERP and increased % time below the answer. We found that when the KG/IA is task-irrelevant, users read through it (expecting to find the answer) and upon not finding the answer, they continued to examine results below, leading to increased scrolling down the page, and increased time below KG/IA (in milliseconds, and as a % of page time). These results suggest that we may ...
Context 6
... the blue line shows the metric values obtained by binning the data into deciles, binned r = 0.77). Figure 8 (top-right panel) is similar, but shows a scatter plot of percent time on result (time on result / time on page) as measured by gaze (x axis) and viewport (y axis). Interestingly, we found higher correlations using % time on result than absolute time on result in milliseconds (raw correlation: r = 0.69 vs. 0.57; binned correlation: r = 0.97 vs. 0.77). We suspect that the normalization (by time on all results on the page) helps adjust for the pace at which users read the page. For example, some users may quickly glance and skim through the results, while others may read carefully. In such cases, the absolute time measure will vary a lot while the percent time measure may be relatively stable. Since 3-4 results may be shown on the viewport simultaneously, the observed gaze-viewport correlation on a per-result basis (raw correlation of 0.69, binned correlation in deciles of 0.97) is high and suggests that we may reliably infer how long a specific result was seen by the eye, from the viewport data alone. The middle and bottom panels in Figure 8 are similar to the top panel, and show gaze viewport correlations for other measures, such as time spent below KG (mid-left panel) and percent time spent below KG (i.e., time below KG / time on all results on the page, mid-right panel) measured using gaze (x axis) and viewport (y axis). Here too, we find strong gaze viewport correlations, and again, the % time measures show higher correlations than time in millisecond measures (time below KG: r = 0.71, %time below KG: r = 0.86). The bottom panel in Figure 8 shows correlations for time below IA (r = 0.59) and % time below IA (r = 0.81). In all three figures, we find that the percent time measures, that are normalized by time on page, show higher gaze-viewport correlations than time in millisecond measures, for reasons discussed earlier. To our knowledge, this is the first quantitative mobile eye tracking study in search. As more traffic goes mobile, there is a need to better understand user attention and satisfaction on mobile devices. Prior work has focused on search behavior in desktops. These studies report a Golden Triangle [23], where searcher attention is focused near the top-left of the page and decreases as we go down or to the right on the SERP. It is not clear whether attention behavior on desktop will generalize to mobile phones, as they differ from desktops in several ways – small real estate, variety of touch interactions (touch, scroll, swipe, zoom) and tendency to perform short queries. In this study, we found that indeed, user attention behavior on mobile phones is very different from that on desktops. First, unlike desktop where engagement (both clicks and attention) has been widely reported to decrease from top to bottom positions [9, 10], on mobile phones, we found surprisingly, that the second result gets more viewport and gaze time than the first. The most likely explanation for this is short scrolls. Unlike desktop where searchers can use the page up/down keys on the keyboard to move from one page fold to the next (no overlap between the results in different page folds), on mobile phones, users tend to perform short and continuous scrolls that render the second and third results visible across more viewports and hence longer than the first. Figure 6 illustrates this with an example. This bias towards the second position occurs in eye gaze too. 
We think this is because the second result is often partially hidden, and to bring it fully into view, the user has to carefully scroll (to avoid scrolling too much or too little) by continuously looking at the result until it is fully visible, leading to longer gaze time on the second result than the first. It is possible that in the absence of scrolling, viewport and gaze time on results (in mobile phones) may decrease with position, similar to desktop. For example, navigational tasks ("BBC") where the user often clicks the first result, may not require scrolling, and may show higher viewport time on the first than second result. In our study, however, all tasks were non-navigational, and often in- volved scrolling. An intriguing question that immediately follows is, whether there is a more appropriate evaluation metric or rank position discount that better reflects user experience on mobile phones than current evaluation metrics, such as mean average precision or discounted cumulative gain. The second finding which is different on mobile phones than desktop is that, unlike the Golden Triangle in desktop, where attention is focused on the top-left and decreases towards the bottom and right of the search result page, in our study on mobile phones, we found that on average, user attention is focused on the center and top half of the screen. This, together with the already strong gaze viewport correlations (r=0.7 for %time on a page element as shown in Fig 8) suggests that by using the appropriate weighting functions on viewport data, we may identify which result the user is looking at, with high confidence. In other words, this offers an opportunity, for the first time, to scalably and reliably measure user attention on mobile phones. Another possible direction for improving accuracy of user attention measurements is to follow the work Huang et al. [14] and Navalpakkam et al. [24] that advocate to directly predict user attention on the screen from user interactions. While the absence of cursor movements in mobile phones makes attention prediction more difficult, we hypothesize that features of smaller screen size and the time user spends in the viewport without scrolling can be used to improve the accuracy of the “vanilla” approach that uses viewport time information only. In addition to understanding searcher attention on mobile phones, we examined search satisfaction and its effect on viewport data. We systematically varied task-relevance (whether the KG/IA contained the answer to the user’s task), and found that users reported less satisfied when the KG/IA was task-irrelevant than when it was relevant. We also identified viewport metrics that signal user dissatisfaction with answers – increased scrolling down the SERP and increased % time below the answer. We found that when the KG/IA is task-irrelevant, users read through it (expecting to find the answer) and upon not finding the answer, they continued to examine results below, leading to increased scrolling down the page, and increased time below KG/IA (in milliseconds, and as a % of page time). These results suggest that we may auto-detect answer satisfaction at scale by using viewport signals. We acknowledge several limitations of this study. First, we focused on tasks with information seeking search intent and have not explored navigational search intent [3]. In our data we observed 2.51 viewport scrolls performed on average. 
We expect scrolling activity to be lower for navigational searches, where the first result is often the destination site (e.g., queries like [BBC] or [Twitter]). In the absence of scrolling, we may find that attention strictly decreases with result position (unlike the bump at position 2 observed in this study). Second, in this study we fixed the mobile phone's position by mounting it on the eye tracker's stand. In real life, a user's attention on the phone can vary depending on whether they are moving, whether they are right- or left-handed, and whether they interact with the phone using both hands. Other factors, such as demographics, can also influence user behavior: depending on the user's language, for example, they may read information on the phone from left to right or from right to left. Age and experience with touch interfaces are widely recognized throughout the research community as important factors in touch interaction, and thus can affect user attention and search behavior. Third, examination habits on mobile devices may vary across users, as noted by [2]. While Figure 7 already shows a clear pattern (most users prefer to focus on the top half of the phone screen), it is possible that a few users prefer the center or bottom of the screen. In future work, we plan to address this limitation by exploring adaptive weighting of user attention based on the user's current actions, e.g., the direction of page examination (upward or downward). Despite these limitations, our study offers the hope of accurately measuring, at scale, which result users fixate on mobile phones. Future work will consider tablets (this study focused on mobile phones) and other devices, satisfaction with clickable results (including ads), and more diverse user settings, such as users who are moving or multi-tasking.

We demonstrated, for the first time, that by tracking the browser viewport (the visible portion of the page), one can develop viewport metrics that are strongly correlated with user attention (eye gaze) and search satisfaction on mobile phones. Focusing on answer-like results in a controlled lab study, we found that increased scrolling past the answer and increased time below the answer can signal user dissatisfaction with answer results. We demonstrated strong gaze-viewport correlations on a per-result basis, and found that attention is, on average, focused on the top half of the phone screen, suggesting that we may infer the amount of attention received by a specific result (of the 3-4 results shown in the viewport) scalably and reliably from viewport data alone. Potential applications of this work include better estimation of result relevance and satisfaction in search, and could benefit other areas including advertising, web page design and optimization, and measuring engagement in social networking ...

Citations

... This process was repeated for the remaining 11 tasks which were displayed to them in random order. Other researchers have also employed randomisation for condition allocation to minimise topic ordering effects [15,36]. ...
Preprint
Full-text available
The Search Engine Results Page (SERP) has evolved significantly over the last two decades, moving away from the simple ten blue links paradigm to considerably more complex presentations that contain results from multiple verticals and granularities of textual information. Prior works have investigated how user interactions on the SERP are influenced by the presence or absence of heterogeneous content (e.g., images, videos, or news content), the layout of the SERP (list vs. grid layout), and task complexity. In this paper, we reproduce the user studies conducted in prior works, specifically those of Arguello et al. [4] and Siu and Chaparro [29], to explore to what extent the findings from research conducted five to ten years ago still hold today, as the average web user has become accustomed to SERPs with ever-increasing presentational complexity. To this end, we designed and ran a user study with four different SERP interfaces: (i) a heterogeneous grid; (ii) a heterogeneous list; (iii) a simple grid; and (iv) a simple list. We collected the interactions of 41 study participants over 12 search tasks for our analyses. We observed that SERP types and task complexity affect user interactions with search results. We also find evidence to support most (6 out of 8) observations from [4, 29], indicating that user interactions with different interfaces, and on tasks of different complexity, have remained mostly similar over time.
... Recently, studies have suggested viewport logging to assess users' attention on mobile phones at scale (Lagun et al. 2014). The viewport can be characterized as the "portion of the web page that is visible on the phone's screen at a given point in time" (Lagun et al. 2014, p. 2). The viewport changes and reveals previously hidden parts as the user interacts with the website (e.g., by scrolling the page), resulting in path data of the time that website elements (e.g., a search result, an ad) are visible to the user (e.g., Lagun et al. 2014). Research has successfully employed viewport data for studying active information retrieval in goal attainment (e.g., news articles; Grusky et al. 2017). ...
Article
Ad avoidance (e.g., “blinding out” digital ads) is a substantial problem for advertisers. Avoiding mobile banner ads differs from active ad avoidance in nonmobile (desktop) settings, because mobile phone users interact with ads to avoid them: (1) They classify new content at the bottom of their screens; if they see an ad, they (2) scroll so that it is out of the locus of attention and (3) position it at a peripheral location at the top of the screen while focusing their attention on the (non-ad) content in the screen center. Introducing viewport logging to marketing research, we capture granular ad-viewing patterns from users’ screens (i.e., viewports). While mobile users’ ad-viewing patterns are concave over the viewport (with more time at the periphery than in the screen center), viewing patterns on desktop computers are convex (most time in the screen center). Consequently, we show that the effect of viewing time on recall depends on the position of an ad in interaction with the device. An eye-tracking study and an experiment show that 43% to 46% of embedded mobile banner ads are likely to suffer from ad avoidance, and that ad recall is 6 to 7 percentage points lower on mobile phones (versus desktop).
... Besides, interactive data (e.g., mouse clicks and movements) have been collected and explored in laboratory user studies (Huang et al. 2011) and diary studies (Teevan et al. 2004; Chen et al. 2021). Eye-tracking, which can effectively capture real-time eye movements, is a favored technique for investigating user examination behavior, and has therefore been utilized in various search scenarios (Hotchkiss et al. 2005; Lagun et al. 2014; Xie et al. 2017; Li et al. 2018; Zheng et al. 2020). The "Golden Triangle" pattern was observed on traditional ten-blue-link pages in web search (Hotchkiss et al. 2005). ...
Article
Modern search engine result pages (SERPs) become increasingly complex with heterogeneous information aggregated from various sources. In many cases, these SERPs also display results in the right rail besides the traditional left-rail result lists, which change the linear result list to a non-linear panel and might influence user search behavior patterns. While user behavior on the traditional ranked result list has been well studied in existing works, it still lacks a thorough investigation of the effects caused by the right-rail results, especially on complex SERPs. To shed light on this research question, we conducted a user study, which collected participants’ eye movements, detailed interaction behavioral logs, and feedback information. Based on the collected data, we analyze the influence of right-rail results on users’ examination patterns, search behavior, perceived workload, and satisfaction. We further construct a user model to predict users’ examination behavior on non-linear SERPs. Our work contributes to understanding the effects of the right-rail results on users’ interaction patterns, benefiting other related research, such as the evaluation and UI optimization of search systems.
... -Scroll: scrolling up and down a page without reading the content may be a signal of frustration and lack of confidence, and is a frequent action in mobile browsing [18]. In relation to this, [19] concluded that when users frequently scroll only to find irrelevant information during navigation, they are less satisfied. Furthermore, "the current scrolling method for a mobile device is both time-consuming and fatigue-prone" [20]. ...
Chapter
Full-text available
In recent years, the usage of mobile browsers has grown at an astonishing rate. Nowadays, most users surf the Web on their mobile phones instead of their laptops because of their immediate availability. Nevertheless, Web design is performed with laptop screen dimensions in mind, and websites are readjusted to mobile screen resolutions using Responsive Web Design. This conversion to smaller screen resolutions causes some drawbacks for mobile navigation. Web Augmentation is an effective methodology that allows end-users to customize third-party websites according to their needs. This technique can mitigate the problems caused by small screen resolutions.
... The study began with a warm-up task. The main tasks were shown in a random order to balance order effects [24]. Each participant spent about 1.5 hours completing the main tasks and received $18 for their involvement. ...
... Previous works [24,54] found a strong correlation between viewport time and user attention. We utilized the weighted viewport to reduce presentation bias, as it had the strongest correlation with user attention [24,54]. The weighting factor is calculated as $w_{i,j} = h_{i,j}^2 / (h_j \cdot h_i)$, where $h_{i,j}$ is the visible height of section $i$ in the $j$-th viewport, $h_j$ is the height of the $j$-th viewport, and $h_i$ is the actual height of section $i$. ...
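Reading the reconstructed factor as exposure times coverage (the same decomposition used for the C1-C4 variants discussed earlier in the paper), a minimal sketch follows. The variable names are ours, for illustration only, not the cited paper's code:

def viewport_weight(visible_h, viewport_h, section_h):
    # w = visible_h**2 / (viewport_h * section_h), i.e., exposure * coverage.
    if viewport_h <= 0 or section_h <= 0:
        return 0.0
    exposure = visible_h / section_h   # fraction of the section that is shown
    coverage = visible_h / viewport_h  # fraction of the screen the section fills
    return exposure * coverage

def weighted_viewport_time(observations):
    # observations: iterable of (dwell_ms, visible_h, viewport_h, section_h),
    # one tuple per viewport in which the section was (partially) visible.
    return sum(t * viewport_weight(v, vp, s) for t, v, vp, s in observations)

# A half-visible section counts far less than a fully visible one:
print(weighted_viewport_time([(3000, 200, 800, 400), (2000, 400, 800, 400)]))  # 1375.0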
... A large number of IR studies [63][64][65][66][67][68][69][70] have demonstrated that users of retrieval systems tend to pay attention mostly to top-ranked results. IR metrics, therefore, focus on rank-based comparisons of the retrieved result set R to an ideal ranking of documents, as determined by manual judgments or implicit feedback from user behaviour data. ...
Preprint
Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking of documents -- or short passages -- in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms -- such as a person's name or a product model number -- not seen during training, and avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, the retrieval involves extremely large collections -- such as the document index of a commercial Web search engine -- containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to efficiently retrieve from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives, besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks.
... We build upon extant research that established viewport tracking as a method for studying active information retrieval (e.g., Grusky et al. 2017; Lagun et al. 2014). We account for consumer heterogeneity in viewport data by employing hierarchical sequence clustering with dynamic time warping (Berndt and Clifford 1994) to identify clusters of trajectories across the viewport data of heterogeneous viewers. ...
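For illustration only (not the cited paper's code), clustering viewport trajectories with dynamic time warping and hierarchical linkage might look like the following sketch, where each trajectory is a sequence of scroll positions sampled over time:

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def dtw(a, b):
    # Classic O(len(a)*len(b)) dynamic-programming DTW with absolute-difference cost.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

trajectories = [
    [0, 0, 200, 600, 600],    # reads the top, then scrolls down
    [0, 100, 300, 500, 700],  # steady downward scrolling
    [0, 600, 600, 100, 0],    # scrolls down, then returns to the top
]
n = len(trajectories)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = dtw(trajectories[i], trajectories[j])

Z = linkage(squareform(dist), method="average")
print(fcluster(Z, t=2, criterion="maxclust"))  # one cluster label per trajectory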
... Viewport tracking might be an approach that helps to assess consumers' ad attention while accounting for consumer heterogeneity in ad viewing. Extant research has already employed data obtained from user viewports (see Table 1), but has so far focused exclusively on active information retrieval, either of content pieces from news websites (Grusky et al. 2017; Lagun and Lalmas 2016) or of the results of search engine rankings (Lagun et al. 2014; Lagun and Lalmas 2016). While most of this research offers a uniform assessment that does not differentiate between different types of viewing patterns on a website, Lagun and Lalmas (2016) differentiate between user groups. ...
... They establish that in active information retrieval processes, consumers differ in their progression over six different elements of a newspaper website (e.g., bottom, top, comments). Extant research partially compares viewport trajectories to the ground truth of actual viewing and fixation patterns obtained from an eye-tracking study (Lagun et al. 2014), but focuses on gaze time only, while actual fixations are preferable in an advertising context (Hoffman and Subramaniam 1995). ...
Conference Paper
Full-text available
Advertisers have to pay publishers for "viewable" ads, irrespective of whether users paid active attention. In this paper, we suggest that a granular analysis of users' viewing patterns can help us to progress beyond mere "viewability" and toward actual differentiation of whether a user has paid attention to an ad or not. To this end, we use individual viewport trajectories, which measure the sequence of locations and times at which an object (e.g., an ad) is visible on the display of a device (desktop or mobile). To validate our model and benchmark it against extant models, such as the "viewability" policy (50% threshold) model, we use data from an eye-tracking experiment. Findings confirm the improved model fit, highlight distinct viewing patterns in the data, and inform information processing on mobile phones. Consequently, implications are relevant to publishers, advertisers, and consumer researchers.
... Still, crowdsourcing allows for exploring a wider range of parameters in a more controlled manner as compared to in-the-wild large-scale studies. We collected self-reported ground-truth labels in a similar vein to previous work (Feild et al., 2010;Lagun et al., 2014b;Liu et al., 2015;Arapakis and Leiva, 2016) which also administered post-task questionnaires. To mitigate and discount low-quality responses, several preventive measures were put into practice, such as introducing test (gold-standard) questions to our tasks, selecting experienced contributors with high accuracy rates, and monitoring their task completion time, thus ensuring the internal validity of our experiment. ...
... First, existing works heavily rely on hand-crafted features to analyze eye movement patterns [3][4][5][6][7][8][9]. However, human visual behavior is highly heterogeneous across subjects [8,16,17], visual stimuli [14,15], and eye-tracking devices. For instance, eye movements involved in reading articles vary across subjects and reading materials (e.g., layout and salient content) [15]. ...
... Human visual behaviors are heterogeneous across subjects, visual stimuli, and eye-tracking devices. Researchers have shown that subject differences, including the subject's language proficiency [8], personal interests [16], and engagement [17] with the visual material, have a significant impact on visual behavior. In addition, eye movement patterns are also influenced by the visual stimuli [14,15]: gaze patterns differ across visual stimuli, even when the same subject is performing the same activity. ...
... The first study on a mobile-sized screen was performed by Kim et al. (2012) [49]. In seven papers, mobile versions or mobile devices were studied [49,51,52,56-59]. Kim et al. (2015) tested the mobile version, but not on a mobile device or smartphone [50]. ...
... Three papers tested granular parts of organic results like title, description, and URL separately [60][61][62]. In one paper, knowledge graphs were tested [57] and in one, image results were tested [63]. ...
... Heat maps represent all the participants. In some studies, each task was presented on a different heat map [60], while in other studies all participants were included on one map [38,57]. Showing averages on a heat map, especially if the study has fewer than 30 participants, could produce different results. ...
Article
Full-text available
This paper analyzes peer-reviewed empirical eye-tracking studies of behavior in web search engines. A framework is created to examine the effectiveness of eye-tracking by drawing on the results of, and discussions concerning, previous experiments. Based on a review of 56 papers on eye-tracking for search engines from 2004 to 2019, a 12-element matrix for the coding procedure is proposed. Content analysis shows that this matrix contains 12 common parts: search engine; apparatus; participants; interface; results; measures; scenario; tasks; language; presentation; research questions; and findings. The literature review covers results, the contexts of web searches, a description of participants in eye-tracking studies, and the types of studies performed on search engines. The paper examines the state of current research on the topic and points out gaps in the existing literature. The review indicates that behavior on search engines has changed over the years. Search engines' interfaces have been improved by adding many new functions, and users have moved from desktop searches to mobile searches. The findings of this review provide avenues for further studies as well as for the design of search engines.