Assessing gaze behavior during real-world tasks is difficult; dynamic bodies moving through dynamic worlds make gaze analysis difficult. Current approaches involve laborious coding of pupil positions. In settings where motion capture and mobile eye tracking are used concurrently in naturalistic tasks, it is critical that data collection be simple, efficient, and systematic. One solution is to combine eye tracking with motion capture to generate 3D gaze vectors. When combined with tracked or known object locations, 3D gaze vector generation can be automated. Here we use combined eye and motion capture and explore how linear regression models generate accurate 3D gaze vectors. We compare spatial accuracy of models derived from four short calibration routines across three pupil data inputs: the efficacy of calibration routines was assessed, a validation task requiring short fixations on task-relevant locations, and a naturalistic object interaction task to bridge the gap between laboratory and “in the wild” studies. Further, we generated and compared models using spherical and Cartesian coordinate systems and monocular (left or right) or binocular data. All calibration routines performed similarly, with the best performance (i.e., sub-centimeter errors) coming from the naturalistic task trials when the participant is looking at an object in front of them. We found that spherical coordinate systems generate the most accurate gaze vectors with no differences in accuracy when using monocular or binocular data. Overall, we recommend 1-min calibration routines using binocular pupil data combined with a spherical world coordinate system to produce the highest-quality gaze vectors.