
Image points in two images satisfy the epipolar constraint. However, not every set of points satisfying the epipolar constraint corresponds to a real geometry, because there may exist no cameras and scene points projecting to the given image points such that all image points have positive depth. Using Hartley's cheirality theory and previous work on oriented projective geometry, we give necessary and sufficient conditions for an image point set to correspond to a real geometry. For images from conventional cameras, this condition is simple and given in terms of epipolar lines and epipoles. Surprisingly, it is not sufficient for central panoramic cameras. Apart from giving insight into epipolar geometry, the applications include reducing the search space and ruling out impossible matches in stereo, and ruling out impossible solutions for a fundamental matrix computed from seven points.
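The two-view condition stated in terms of epipolar lines and epipoles can be sketched numerically. The following minimal numpy illustration (not the authors' code; the camera setup and function names are assumptions) tests the sign relating the epipolar line E·x₁ to the oriented line through the epipole and the matching point, for calibrated cameras P1 = [I|0], P2 = [R|t]:

```python
import numpy as np

def skew(v):
    """3x3 cross-product matrix: skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def oriented_epipolar_sign(E, e2, x1, x2):
    """Sign relating the epipolar line E @ x1 to the oriented line e2 x x2.
    For cameras P1 = [I|0], P2 = [R|t] (so the epipole e2 = t) and image
    points normalized to z = 1, the sign is +1 exactly when the scene point
    has depths of equal sign in both views (in particular, when it lies in
    front of both cameras), and -1 otherwise."""
    return np.sign(np.cross(e2, x2) @ (E @ x1))

R, t = np.eye(3), np.array([1.0, 0.0, -2.0])
E = skew(t) @ R

def proj(Rm, tv, X):
    x = Rm @ X[:3] + tv * X[3]
    return x / x[2]                      # normalize to z = 1

X_front = np.array([0.0, 0.0, 4.0, 1.0])   # in front of both cameras
X_mixed = np.array([0.0, 0.0, 1.0, 1.0])   # in front of cam 1, behind cam 2

s_front = oriented_epipolar_sign(E, t, proj(np.eye(3), np.zeros(3), X_front),
                                 proj(R, t, X_front))
s_mixed = oriented_epipolar_sign(E, t, proj(np.eye(3), np.zeros(3), X_mixed),
                                 proj(R, t, X_mixed))
```

Both points satisfy the unoriented epipolar constraint, but only the first passes the oriented test; this is the mechanism behind ruling out impossible matches.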


... TV or photographic) cameras are directional. Note that not every camera is central (i.e., its rays need not intersect in a single point) [5] and not every central camera is linear (i.e., the scene-to-image mapping need not be linear in homogeneous coordinates) [9]. ...

... Hence (see Table 1 in the full version of [8], and also [9]) ...

... In other words, oriented projective reconstructibility and directionality of one or both cameras are necessary and sufficient for the existence of a real scene and real cameras underlying the image correspondences [9]. Figure 4: Having two cameras, at least one of them directional, the directional camera center can always be separated from the scene points and the second camera center by a plane. ...

Well-known matching constraints for points and lines in multiple images are necessary but not sufficient conditions for the existence of real structure and cameras underlying the image correspondences. To obtain sufficient conditions, the following additional constraints must be imposed: positive scales, the existence of a plane at infinity not intersecting the scene, and the existence of handedness-preserving cameras. We present modifications of the well-known matching constraints and also some new constraints that take this additional knowledge into account. Not only conventional but also central panoramic cameras are naturally described. To achieve this, we have generalized and simplified Hartley's ch(e)irality theory by formulating it in the language of oriented projective geometry and Grassmann tensors.

... Werner and Pajdla built on this framework, deriving a theory of oriented matching constraints which enforce chirality [23]. In the case of two cameras, they gave a geometric interpretation of these constraints in the epipolar plane and suggested methods to use chirality for reducing the search space in stereo matching [22]. Werner further showed that such orientation constraints naturally give rise to combinatorial conditions on sets of images necessary for them to correspond to a true scene [20,21]. ...

... In [5], Hartley also characterizes the existence of a chiral reconstruction for two views in terms of a sign condition on the given projective reconstruction. Werner et al. also study the two-view case, considering both minimal and nonminimal configurations [21,22]. Nistér and Schafflitzky consider the minimal problem in the Euclidean case [15]. ...

... This observation that chirality can be used to clip epipolar lines was first made by Werner and Pajdla [22,Section 6]. They argue geometrically that chirality may be used to restrict the search space for stereo-matching from a full epipolar line to a segment of the line. ...

We introduce the chiral domain of an arrangement of cameras $\mathcal{A} = \{A_1, \dots, A_m\}$, which is the subset of $\mathbb{P}^3$ visible in $\mathcal{A}$. It generalizes the classical definition of chirality to include all of $\mathbb{P}^3$ and offers a unifying framework for studying multiview chirality. We give an algebraic description of the chiral domain which allows us to define and describe the chiral version of Triggs' joint image. We then use the chiral domain to re-derive and extend prior results on chirality due to Hartley.

... In Section 6 we show that when k > 4, point pairs in general position may not have a chiral reconstruction. Specific examples of this type with k = 5 were known to Werner [21], and there are close connections between our work and that of Werner [20,21,22,23]. We make two new contributions. ...

... , u_k) denote the pencil of lines joining e to each u_i. The following geometric characterization is well-known [12,17,21,22]. The points e_1, e_2 in the above theorem are the epipoles of the camera pair (A_1, A_2) in the reconstruction. ...

... , k}. A projective reconstruction which is not chiral can sometimes be transformed into a chiral reconstruction by a homography [1,11,22]. We recall the conditions under which this is possible. ...

A fundamental question in computer vision is whether a set of point pairs is the image of a scene that lies in front of two cameras. Such a scene, together with the cameras, is known as a chiral reconstruction of the point pairs. In this paper we provide a complete classification of k point pairs for which a chiral reconstruction exists. The existence of chiral reconstructions is equivalent to the non-emptiness of certain semialgebraic sets. For up to three point pairs, we prove that a chiral reconstruction always exists, while the set of five or more point pairs that do not have a chiral reconstruction is Zariski-dense. We show that for five generic point pairs, the chiral region is bounded by line segments in a Schläfli double six on a cubic surface with 27 real lines. Four point pairs have a chiral reconstruction unless they belong to two non-generic combinatorial types, in which case they may or may not have one.

... Werner et al. also consider the third question and answer it for two views in image space, considering both minimal and nonminimal configurations [15,16]. Nistér & Schafflitzky consider the minimal problem in the Euclidean case [10]. ...

... In the context of two cameras, [4] and [16] call a projective reconstruction of P a weak realization, and a chiral reconstruction a strong realization. In fact, while our definition of chiral reconstruction requires finite cameras, by allowing world points to be infinite, we extend the notion of a strong realization. ...

Given an arrangement of cameras $\mathcal{A} = \{A_1,\dots, A_m\}$, the chiral domain of $\mathcal{A}$ is the subset of $\mathbb{P}^3$ that lies in front of it. It is a generalization of the classical definition of chirality. We give an algebraic description of this set and use it to generalize Hartley's theory of chiral reconstruction to $m \ge 2$ views and derive a chiral version of Triggs' Joint Image.

... 2) are designed to enhance 3D perception and understanding [6, 7, 8, 11]. Fundamental studies of stereo panoramas can be found in [4, 5, 11, 17]. For applications see [8, 12, 15]. ...

The specification of image acquisition parameters needs to be in accordance with given constraints defined by application requirements, the architecture of the camera, and specifications of the targeted 3D scenes. The paper proposes an approach for specifying acquisition parameters at image acquisition time to ensure high-quality stereo panoramas. Our approach satisfies commonly demanded application requirements such as: proper scene composition in resultant images; adequate sampling at a particular scene distance; and desired stereo quality (i.e. depth level resolution) over a diversity of scenes of interest. Previous studies have paid great attention to how proposed stereo panorama imaging models/methods support the epipolar geometry constraint and system realizations. The image acquisition parameter assignment problem has not yet been dealt with in these studies. The lack of guidance in specifying image acquisition parameters affects the validity of results for any subsequent processes.

... Hartley [20] has built on oriented ideas to develop his theory of quasi-affine reconstruction and chirality (these will be briefly mentioned in Section 2.5). Werner and Pajdla [60,61] have described oriented matching constraints that are mathematically equivalent to the epipolar consistency constraints described in Section 2.4. ...

... This is the "strong realizability" condition of Werner and Pajdla [60,61]. ...

This article presents a novel method for computing the visual hull of a solid bounded by a smooth surface and observed by a finite set of cameras. The visual hull is the intersection of the visual cones formed by back-projecting the silhouettes found in the corresponding images. We characterize its surface as a generalized polyhedron whose faces are visual cone patches; edges are intersection curves between two viewing cones; and vertices are frontier points where the intersection of two cones is singular, or intersection points where triples of cones meet. We use the mathematical framework of oriented projective differential geometry to develop an image-based algorithm for computing the visual hull. This algorithm works in a weakly calibrated setting–-that is, it only requires projective camera matrices or, equivalently, fundamental matrices for each pair of cameras. The promise of the proposed algorithm is demonstrated with experiments on several challenging data sets and a comparison to another state-of-the-art method.

... Stereo panoramas have been found very useful in applications such as immersive technology, telepresence, robot navigation, and localization [3, 10, 4, 8, 2]. Traditionally the design of stereo panorama cameras is mainly concerned with epipolar geometry, optics optimization, or other realization/practical issues [3, 7, 5, 1, 6, 9]. This paper draws attention to two further criteria of stereo panorama camera design: controllability of pictorial/scene composition and of stereo acuity (depth levels) over certain dynamic 3D scene ranges. ...

Existing stereo panorama cameras do not allow controllability of pictorial/scene composition and stereo acuity (depth levels) over dynamic 3D scene ranges. We specify the design of such a camera allowing this type of flexibility. Previous approaches to stereo panorama camera design lack studies of this important aspect, while other design issues such as epipolar geometry, optics optimization, or realization-oriented approximations have been investigated. Without incorporating such controllability into stereo panorama camera design, poor quality of the produced stereo panoramas is foreseeable (e.g. incoherence, cardboard effect, diplopia, etc.). The paper proposes a solution to incorporate controllability into previously discussed (Ishiguro et al., 1992; Wei et al., 1999; Shum et al., 1999; Peleg et al., 2000) stereo panorama camera models. By using a stereo panorama camera configured with parameters designed according to our solution, the desired pictorial composition and stereo acuity in the resulting stereo panoramas can be ensured.

... In stereo vision it is possible that not all the points satisfying the epipolar constraints belong to a real structure. According to Hartley's cheirality theory, as cited by Werner and Pajdla [Werner and Pajdla, 2001], particular conditions, in terms of epipolar lines and epipoles, must be added in order to ensure a robust correspondence between images. In [Werner and Pajdla, 2001], the authors found that panoramic sensors need supplementary constraints in order to satisfy this correspondence. ...

... Therefore, they extended Hartley's theory for wide field of view images and expressed the necessary constraints in terms of image points, epipoles, the fundamental matrix and the inter-image homography. ...

This thesis presents an image-based method for computing the visual hull of an object bounded by a smooth surface and observed by a finite number of perspective cameras. The essential structure of the visual hull is projective: to compute an exact topological (combinatorial) description of its boundary, we do not need to know the Euclidean properties of the input cameras or of the scene. Unlike most existing visual hull computation methods, ours requires only a projective reconstruction of the camera matrices, or equivalently, the epipolar geometry between each pair of cameras in the scene. Starting with a rigorous theoretical framework of oriented projective geometry and projective differential geometry, we develop a suite of algorithms to construct the visual hull and associated data structures. The thesis discusses our implementation of the algorithms, and presents experimental results on synthetic and real data sets.

... Image points satisfying the epipolar constraint do not necessarily correspond to any real geometry [270]. Cheirality defines a 3D point to be visible when it is located in front of the camera [96] and accordingly has a positive depth [98,100] and a positive viewing direction [155]. ...

... Cheirality defines a 3D point to be visible when it is located in front of the camera [96] and accordingly has a positive depth [98,100] and a positive viewing direction [155]. These constraints can be applied to omnidirectional cameras [270] as well, which additionally use the projection itself as a cheirality constraint. ...
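The positive-depth notion used above can be made concrete with Hartley's signed-depth formula (Hartley & Zisserman, Ch. 6): for a camera P = [M | p4] and a homogeneous point X = (X, Y, Z, T)ᵀ with PX = (x, y, w)ᵀ, depth(X; P) = sign(det M) · w / (T · ||m³||), where m³ is the third row of M. A small numpy sketch (illustrative, not code from the cited works):

```python
import numpy as np

def signed_depth(P, X):
    """Signed depth of homogeneous point X w.r.t. camera P = [M | p4]:
    sign(det M) * w / (T * ||m3||), with w the third coordinate of P @ X
    and m3 the third row of M.  Positive iff X is in front of the camera;
    invariant to rescaling X by a positive factor."""
    M = P[:, :3]
    w = (P @ X)[2]
    return np.sign(np.linalg.det(M)) * w / (X[3] * np.linalg.norm(M[2]))

P = np.hstack([np.eye(3), np.zeros((3, 1))])                 # canonical camera [I | 0]
d_front = signed_depth(P, np.array([0.0, 0.0, 5.0, 1.0]))    # 5 units in front
d_behind = signed_depth(P, np.array([0.0, 0.0, -2.0, 1.0]))  # behind the camera
d_scaled = signed_depth(P, np.array([0.0, 0.0, 10.0, 2.0]))  # same point, rescaled
```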

This work aims at providing a novel camera motion estimation pipeline for large collections of unordered omnidirectional images. In order to keep the pipeline as general and flexible as possible, cameras are modelled as unit spheres, allowing any central camera type to be incorporated. For each camera an unprojection lookup is generated from intrinsics, which is called a P2S-map (Pixel-to-Sphere-map), mapping pixels to their corresponding positions on the unit sphere. Consequently the camera geometry becomes independent of the underlying projection model. The pipeline also generates P2S-maps from world map projections with fewer distortion effects, as known from cartography. Using P2S-maps from camera calibration and world map projection allows omnidirectional camera images to be converted to an appropriate world map projection in order to apply standard feature extraction and matching algorithms for data association. The proposed estimation pipeline combines the flexibility of SfM (Structure from Motion), which handles unordered image collections, with the efficiency of PGO (Pose Graph Optimization), which is used as back-end in graph-based Visual SLAM (Simultaneous Localization and Mapping) approaches to optimize camera poses from large image sequences. SfM uses BA (Bundle Adjustment) to jointly optimize camera poses (motion) and 3D feature locations (structure), which becomes computationally expensive for large-scale scenarios. By contrast, PGO solves for camera poses (motion) from measured transformations between cameras, keeping the optimization manageable. The proposed estimation algorithm combines both worlds. It obtains up-to-scale transformations between image pairs using two-view constraints, which are jointly scaled using trifocal constraints. A pose graph is generated from scaled two-view transformations and solved by PGO to obtain camera motion efficiently even for large image collections.
The obtained results can be used as input data to provide initial pose estimates for further 3D reconstruction purposes, e.g. to build a sparse structure from feature correspondences in an SfM or SLAM framework with further refinement via BA. The pipeline also incorporates fixed extrinsic constraints from multi-camera setups as well as depth information provided by RGBD sensors. The entire camera motion estimation pipeline does not need to generate a sparse 3D structure of the captured environment and is thus called SCME (Structureless Camera Motion Estimation).
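As a rough illustration of what a P2S-map is, the sketch below builds a pixel-to-unit-sphere lookup for an equirectangular panorama. The equirectangular model and the axis conventions here are assumptions chosen for illustration; the thesis derives such maps from calibrated intrinsics for arbitrary central camera models:

```python
import numpy as np

def p2s_map(width, height):
    """Pixel-to-Sphere lookup for an equirectangular panorama (an assumed
    model for illustration).  Returns an (H, W, 3) array of unit viewing
    directions, with x forward, y left, z up; pixel centers are sampled."""
    u = (np.arange(width) + 0.5) / width           # horizontal fraction in [0, 1)
    v = (np.arange(height) + 0.5) / height         # vertical fraction in [0, 1)
    lon = (u - 0.5) * 2.0 * np.pi                  # longitude in [-pi, pi)
    lat = (0.5 - v) * np.pi                        # latitude, +pi/2 at the top row
    lon, lat = np.meshgrid(lon, lat)               # (H, W) grids
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

dirs = p2s_map(8, 4)    # tiny lookup: every entry is a unit ray direction
```

Once such a lookup exists, feature handling no longer depends on the projection model, which is the point made in the abstract.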

... The cheirality constraint, first proposed by Hartley in [13], means that any point that lies in an image must lie in front of the camera producing that image, which is alternatively known as the positive depth constraint. Werner and Pajdla [14] give necessary and sufficient conditions for an image point set to correspond to any real imaging geometry. In this paper, we use the cheirality constraint to segment the epipolar line and identify the correct epipolar segment. ...

Identifying feature correspondence between two images is a fundamental procedure in three-dimensional computer vision. Usually the feature search space is confined by the epipolar line. Using the cheirality constraint, this paper finds that the feature search space can be restrained to one of two or three segments of the epipolar line that are defined by the epipole and a so-called virtual infinity point.
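A minimal numpy sketch of the segment endpoints for calibrated, normalized cameras P1 = [I|0], P2 = [R|t] (an illustrative reading of the abstract, not the paper's implementation): the epipolar segment for a point x1 runs from the epipole (the image of camera 1's center, reached as depth → 0) to the image of the ray's point at infinity, i.e. the "virtual infinity point".

```python
import numpy as np

def epipolar_segment(R, t, x1):
    """Endpoints in image 2 (normalized, z = 1) of the epipolar segment
    for a normalized point x1 in image 1, cameras P1 = [I|0], P2 = [R|t].
    Assumes neither endpoint is at infinity (nonzero z components)."""
    e2 = t / t[2]                    # epipole: image of camera 1's center
    v = R @ x1                       # direction of the 'virtual infinity point'
    return e2, v / v[2]

R, t = np.eye(3), np.array([1.0, 0.0, 1.0])
x1 = np.array([0.0, 0.0, 1.0])
e2, x_inf = epipolar_segment(R, t, x1)

# A scene point on the ray of x1, at depth 5, seen in camera 2:
X = 5.0 * x1
x2 = R @ X + t
x2 = x2 / x2[2]

# Its position along the segment from e2 to x_inf (should lie in (0, 1)):
d = x_inf - e2
s = (x2 - e2) @ d / (d @ d)
```

Any point in front of camera 1 projects between the two endpoints, so the stereo search can be clipped to that segment.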

... The locus of the corresponding point x̄ (resp. x) in the other image is the image of r (resp. r̄). Since writing the fundamental relation x̄ᵀQx = 0 induces loss of orientation information [25], the search curve (5) obtained from Q is a superset of the image of a line. The curve (5) is not a correspondence curve, since some points on it are not possible correspondences [16]. ...

We describe an algebraic constraint on corresponding image points in a perspective image and a circular panorama and provide a method to estimate it from noisy image measurements. Studying this combination of cameras is a step forward in localization and recognition, since a database of circular panoramas captures completely the appearance of objects and scenes, and perspective images are the simplest query images. The constraint gives a way to use a RANSAC-like algorithm for image matching. We introduce a general method to establish constraints between (non-central) images in the form of a bilinear function of the lifted coordinates of corresponding image points. We apply the method to obtain an algebraic constraint for a perspective image and a circular panorama. The algebraic constraints are interpreted geometrically and the constraints estimated from image data are used to auto-calibrate cameras and to compute a metric reconstruction of the scene observed. A synthetic experiment demonstrates that the proposed reconstruction method behaves favorably in the presence of image noise. As a proof of concept, the constraints are estimated from real images of indoor scenes and used to reconstruct positions of cameras and to compute a metric reconstruction of the scene.
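The idea of a bilinear constraint on lifted coordinates can be sketched generically: lift one image's points to degree-two monomials, stack one linear equation per correspondence, and read the constraint matrix off the nullspace, DLT-style. Everything below (the 3×6 constraint matrix A, the particular lifting, the synthetic data) is an illustrative assumption, not the paper's actual parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)

def lift(xbar):
    """Degree-two Veronese lifting of a 2D point (u, v)."""
    u, v = xbar
    return np.array([u * u, u * v, v * v, u, v, 1.0])

# Hypothetical bilinear constraint: x^T A lift(xbar) = 0 for every true
# correspondence (x: homogeneous perspective point, xbar: panorama point).
A_true = rng.standard_normal((3, 6))

rows = []
for _ in range(30):
    xbar = rng.standard_normal(2)
    w = A_true @ lift(xbar)                    # x must be orthogonal to this
    x = np.cross(w, rng.standard_normal(3))    # any vector orthogonal to w
    rows.append(np.kron(x, lift(xbar)))        # row . vec(A) == x^T A lift(xbar)
D = np.array(rows)

# DLT estimate: right singular vector of the smallest singular value.
A_est = np.linalg.svd(D)[2][-1].reshape(3, 6)
```

Each correspondence contributes one linear equation in the 18 entries of A, so 17 or more generic correspondences determine A up to scale.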

... The theoretical investigation of chirality was initiated by Hartley [Har98]; see also [HZ03,Chapter 21]. Further studies of this concept were for instance undertaken by Laveau and Faugeras [LF96], Werner and Pajdla [WP01a,WP01b,Wer03a,Wer03b], and Agarwal, Pryhuber, Sinn, and Thomas [APST22,PST22]. ...

In this survey article, we present interactions between algebraic geometry and computer vision, which have recently come under the header of Algebraic Vision. The subject has given new insights in multiple view geometry and its application to 3D scene reconstruction, and carried a host of novel problems and ideas back into algebraic geometry.

... The cheirality test [41] is widely used to discard some impossible depth configurations. This test discards minimal samples which imply negative depths for some triangulated points. ...

Since RANSAC, a great deal of research has been devoted to improving both its accuracy and run-time. Still, only a few methods aim at recognizing invalid minimal samples early, before the often expensive model estimation and quality calculation are done. To this end, we propose NeFSAC, an efficient algorithm for neural filtering of motion-inconsistent and poorly-conditioned minimal samples. We train NeFSAC to predict the probability of a minimal sample leading to an accurate relative pose, only based on the pixel coordinates of the image correspondences. Our neural filtering model learns typical motion patterns of samples which lead to unstable poses, and regularities in the possible motions to favour well-conditioned and likely-correct samples. The novel lightweight architecture implements the main invariants of minimal samples for pose estimation, and a novel training scheme addresses the problem of extreme class imbalance. NeFSAC can be plugged into any existing RANSAC-based pipeline. We integrate it into USAC and show that it consistently provides strong speed-ups even under extreme train-test domain gaps - for example, the model trained for the autonomous driving scenario works on PhotoTourism too. We tested NeFSAC on more than 100k image pairs from three publicly available real-world datasets and found that it leads to one order of magnitude speed-up, while often finding more accurate results than USAC alone. The source code is available at https://github.com/cavalli1234/NeFSAC.

... After that, the extrinsic parameters were extracted by essential matrix decomposition. To find the unique and proper solution for the rotation and translation parameters, they needed to apply the chirality check [25,26]. Therefore, their method is quite sensitive to errors in the estimated locations of head (or foot) correspondences in different camera views. ...

Extrinsic camera calibration is essential for any computer vision task in a camera network. Typically, researchers place a calibration object in the scene to calibrate all the cameras in a camera network. However, when installing cameras in the field, this approach can be costly and impractical, especially when recalibration is needed. This paper proposes a novel, accurate and fully automatic extrinsic calibration framework for camera networks with partially overlapping views. The proposed method considers the pedestrians in the observed scene as the calibration objects and analyzes the pedestrian tracks to obtain extrinsic parameters. Compared to the state of the art, the new method is fully automatic and robust in various environments. Our method detects human poses in the camera images and then models walking persons as vertical sticks. We apply a brute-force method to determine the correspondence between persons in multiple camera images. This information, along with the estimated 3D locations of the top and the bottom of the pedestrians, is then used to compute the extrinsic calibration matrices. We also propose a novel method to calibrate the camera network using only the top and centerline of the person when the bottom of the person is not visible in heavily occluded scenes. We verified the robustness of the method in different camera setups and for both single and multiple walking people. The results show that a triangulation error of a few centimeters can be obtained. Typically, it requires less than one minute of observing the walking people to reach this accuracy in controlled environments. It also takes just a few minutes to collect enough data for the calibration in uncontrolled environments. Our proposed method performs well in various situations such as multi-person scenes, occlusions, or even real intersections on the street.

... The definition of the essential matrix states that the rotation matrix R and the translation vector t of the camera relation can be recovered from E by using singular value decomposition (SVD). The four possible solutions can be evaluated using the cheirality constraint, as shown in [34] or [48]. This constraint requires that reconstructed point correspondences lie in front of both camera coordinate systems. ...
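The SVD-based decomposition and cheirality disambiguation described above can be sketched in a few lines of numpy, following the standard textbook recipe (Hartley & Zisserman, Sect. 9.6); the function names and the synthetic test pose are illustrative:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def decompose_E(E):
    """Four (R, t) candidates from an essential matrix via SVD."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    t = U[:, 2]
    return [(U @ W @ Vt, t), (U @ W @ Vt, -t),
            (U @ W.T @ Vt, t), (U @ W.T @ Vt, -t)]

def triangulate(R, t, x1, x2):
    """Linear (DLT) triangulation, P1 = [I|0], P2 = [R|t], points with z = 1."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t[:, None]])
    rows = np.vstack([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                      x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(rows)[2][-1]
    return X / X[3]

def pick_pose(E, x1, x2):
    """Resolve the four-fold ambiguity by cheirality: the triangulated
    point must have positive depth in both cameras."""
    for R, t in decompose_E(E):
        X = triangulate(R, t, x1, x2)
        if X[2] > 0 and (R @ X[:3] + t)[2] > 0:
            return R, t
    return None

# Ground-truth relative pose and one correspondence in front of both views.
th = 0.1
R_gt = np.array([[np.cos(th), 0.0, np.sin(th)],
                 [0.0, 1.0, 0.0],
                 [-np.sin(th), 0.0, np.cos(th)]])
t_gt = np.array([1.0, 0.0, 0.0])
E = skew(t_gt) @ R_gt

X = np.array([0.2, -0.1, 5.0])
x1 = X / X[2]
y = R_gt @ X + t_gt
x2 = y / y[2]

R_rec, t_rec = pick_pose(E, x1, x2)
```

Only one of the four candidates places the point in front of both cameras, which is exactly what the cheirality check exploits.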

The three-dimensional reconstruction of rigid scenes from monocular image streams is based on the prior calculation of the relative camera pose between at least two successive image frames. This egomotion estimation has not been solved satisfactorily by relying only on corresponding image features, such as points or lines, due to noise, critical motion patterns or special point configurations. This paper describes a framework for incorporating inertial measurements from gyroscopes, accelerometers and magnetometers to achieve an improved performance of the estimation of camera motion and scene structure in terms of accuracy, robustness and computational costs. The framework is designed as a dual-track system containing a visual and an inertial route.

... This tends to improve the consensus score more rapidly than is the case in "vanilla" RANSAC, and hence the condition for termination may be reached more quickly. Chum et al. [3, 4] and Werner and Pajdla [13] propose a cheirality test for the fundamental matrix, based on oriented projective geometry, that allows hypotheses that do not satisfy the oriented epipolar constraint to be rejected without further evaluation. Nister [8] proposes a radically different approach in which multiple hypotheses are scored in parallel, with the least promising hypotheses being dropped at successive stages. ...

The random sample consensus (RANSAC) algorithm, along with its many cousins such as MSAC and MLESAC, has become a standard choice for robust estimation in many computer vision problems. Recently, a raft of modifications to the basic RANSAC algorithm have been proposed aimed at improving its efficiency. Many of these optimizations work by reducing the number of hypotheses that need to be evaluated. This paper proposes a complementary strategy that aims to reduce the average amount of time spent computing the consensus score for each hypothesis. A simple statistical test is proposed that permits the scoring process to be terminated early, potentially yielding large computational savings. The proposed test is simple to implement, imposes negligible computational overhead, and is effective for any given size of data set. The approach is evaluated by estimation of the fundamental matrix for a large number of image pairs and is shown to offer a significant reduction in computational cost compared to recently proposed RANSAC modifications.
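The early-termination idea can be illustrated with a simplified bail-out test that uses a Hoeffding confidence bound in place of the paper's exact statistical test; the function and its parameters are illustrative assumptions, not the published method:

```python
import math

def score_with_bailout(errors, inlier_thresh, best_frac,
                       confidence=0.05, check_every=50):
    """Count inliers, aborting early when a Hoeffding upper confidence
    bound on the final inlier fraction drops below the best fraction seen
    so far.  Returns the inlier count, or None if rejected early.
    A simplified stand-in for the paper's statistical test."""
    n_total = len(errors)
    inliers = 0
    for n, e in enumerate(errors, start=1):
        if abs(e) < inlier_thresh:
            inliers += 1
        if n % check_every == 0 and n < n_total:
            bound = inliers / n + math.sqrt(
                math.log(1.0 / confidence) / (2.0 * n))
            if bound < best_frac:
                return None      # hypothesis cannot plausibly beat the best
    return inliers

# A bad hypothesis (5% inliers) is rejected early; a good one is scored fully.
bad = [0.0 if i % 20 == 0 else 10.0 for i in range(1000)]
good = [0.0] * 1000
bad_score = score_with_bailout(bad, 1.0, best_frac=0.5)
good_score = score_with_bailout(good, 1.0, best_frac=0.5)
```

The saving comes from never evaluating most residuals of hopeless hypotheses, which is the same goal the abstract describes.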

... Only two of the four solutions in Eq. (150) will place the celestial body in front of the camera. The process of checking to ensure that an observed object lies in front of (and not behind) the camera is sometimes called a cheirality test [45], [102]. To perform such a cheirality test, we simply need to see if the z-component of r C is positive, which occurs when ...

Images of a nearby celestial body collected by a camera on an exploration spacecraft contain a wealth of actionable information. This work considers how the apparent location of the observed body’s horizon in a digital image may be used to infer the relative position, attitude, or both. When the celestial body is a sphere, spheroid, or ellipsoid (as is the case for most large bodies in the Solar System), the projected horizon in an image is a conic—usually an ellipse at large distances and a hyperbola at small distances. This work develops non-iterative and analytically exact methods for every case (all combinations of unknown state parameters and quadric shapes), completely superseding older horizon-based methods that are iterative, approximate, or both. Some of the analytic methods presented in this work are new. Recognizing that these developments build on techniques that may be unfamiliar to many spacecraft navigators, this work is fashioned as a tutorial. Descriptive illustrations and numerical examples are provided to make concepts clear and to validate the proposed algorithms.

... The cheirality constraint, first proposed by Hartley in [13], means that any point that lies in an image must lie in front of the camera producing that image, which is alternatively known as the positive depth constraint. Werner and Pajdla [14] give necessary and sufficient conditions for an image point set to correspond to any real imaging geometry. Agarwal and Pryhuber [15] give an algebraic description of multiview cheirality. ...

... Then, they decompose the essential matrix to obtain the camera rotation and translation parameters. However, when decomposing the essential matrix, multiple triangulations are needed for the chirality check [31,32], which makes the method more prone to errors from incorrect correspondences between heads (or feet) in different camera views. Moreover, the method using the essential matrix will fail when pedestrians walk along a straight line, which occurs quite often in practice. ...

In this paper, we propose a novel extrinsic calibration method for camera networks by analyzing tracks of pedestrians. First of all, we extract the center lines of walking persons by detecting their heads and feet in the camera images. We propose an easy and accurate method to estimate the 3D positions of the head and feet w.r.t. a local camera coordinate system from these center lines. We also propose a RANSAC-based orthogonal Procrustes approach to compute relative extrinsic parameters connecting the coordinate systems of cameras in a pairwise fashion. Finally, we refine the extrinsic calibration matrices using a method that minimizes the reprojection error. While existing state-of-the-art calibration methods explore epipolar geometry and use image positions directly, the proposed method first computes 3D positions per camera and then fuses the data. This results in simpler computations and a more flexible and accurate calibration method. Another advantage of our method is that it can also handle the case of persons walking along straight lines, which cannot be handled by most of the existing state-of-the-art calibration methods since all head and feet positions are co-planar. This situation often happens in real life.
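The orthogonal Procrustes step at the core of the pairwise alignment (shown here without the RANSAC loop and reprojection refinement the paper adds) can be sketched as the standard Kabsch alignment of corresponding 3D point sets:

```python
import numpy as np

def procrustes_rt(src, dst):
    """Rigid (R, t) minimizing sum ||R @ src_i + t - dst_i||^2 via SVD
    (Kabsch / orthogonal Procrustes), with the reflection guard.
    src, dst: (N, 3) arrays of corresponding 3D points."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (dst - dst_mean).T @ (src - src_mean)      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))             # avoid improper rotations
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    return R, dst_mean - R @ src_mean

# Synthetic check: e.g. head/feet positions seen in camera A vs camera B.
rng = np.random.default_rng(1)
src = rng.standard_normal((20, 3))
th = np.pi / 6
R_gt = np.array([[np.cos(th), -np.sin(th), 0.0],
                 [np.sin(th), np.cos(th), 0.0],
                 [0.0, 0.0, 1.0]])
t_gt = np.array([0.5, -1.0, 2.0])
dst = src @ R_gt.T + t_gt

R_est, t_est = procrustes_rt(src, dst)
```

With noise-free corresponding points the ground-truth transform is recovered exactly; in practice the RANSAC wrapper rejects outlier tracks before this closed-form step.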

... With the spherical model, more can be done than with the standard perspective model. It is possible to obtain a stronger form of the epipolar constraint [94] and a constraint on five points in two images [92,93], and the epipolar geometry can be augmented with an orientation [19]. ...

Image-based modeling of urban environments is a key component of enabling outdoor, vision-based augmented reality applications. The images used for modeling may come from off-line efforts, or online user contributions. Panoramas have been used extensively in mapping cities and can be captured quickly by an end-user with a mobile phone. In this paper, we describe and evaluate a reconstruction pipeline for upright panoramas taken in an urban environment. We first describe how panoramas can be aligned to a common vertical orientation using vertical vanishing point detection, which we show to be robust for a range of inputs. The orientation sensors in modern cameras can also be used to correct the vertical orientation. Secondly, we introduce a pose estimation algorithm, which uses knowledge of a common vertical orientation as a simplifying constraint. This procedure is shown to reduce pose estimation error in comparison with the state of the art. Finally, we evaluate our reconstruction pipeline with several real-world examples.

Since RANSAC, a great deal of research has been devoted to improving both its accuracy and run-time. Still, only a few methods aim at recognizing invalid minimal samples early, before the often expensive model estimation and quality calculation are done. To this end, we propose NeFSAC, an efficient algorithm for neural filtering of motion-inconsistent and poorly-conditioned minimal samples. We train NeFSAC to predict the probability of a minimal sample leading to an accurate relative pose, only based on the pixel coordinates of the image correspondences. Our neural filtering model learns typical motion patterns of samples which lead to unstable poses, and regularities in the possible motions to favour well-conditioned and likely-correct samples. The novel lightweight architecture implements the main invariants of minimal samples for pose estimation, and a novel training scheme addresses the problem of extreme class imbalance. NeFSAC can be plugged into any existing RANSAC-based pipeline. We integrate it into USAC and show that it consistently provides strong speed-ups even under extreme train-test domain gaps – for example, the model trained for the autonomous driving scenario works on PhotoTourism too. We tested NeFSAC on more than 100 k image pairs from three publicly available real-world datasets and found that it leads to one order of magnitude speed-up, while often finding more accurate results than USAC alone. The source code is available at https://github.com/cavalli1234/NeFSAC.

Finding feature correspondences between a pair of stereo images is a key step in computer vision for 3D reconstruction and object recognition. In practice, a larger number of correct correspondences and a higher percentage of correct matches are beneficial. Previous research shows that the spatial distribution of correspondences is also very important, especially for fundamental matrix estimation. So far, no existing feature matching method has considered the spatial distribution of correspondences. In our research, we develop a new algorithm to find good correspondences in all three aspects mentioned, i.e., a larger number of correspondences, a higher percentage of correct correspondences, and a better spatial distribution of correspondences. Our method consists of two processes: an adaptive disparity smoothing filter that removes false correspondences based on the disparities of neighboring correspondences, and a matching exploration algorithm that finds more correspondences according to their spatial distribution, so that the correspondences are as uniformly distributed as possible in the images. To find correspondences correctly and efficiently, we incorporate the cheirality constraint under an epipole polar transformation together with the epipolar constraint to predict the potential location of the matching point. Experiments demonstrate that our method performs very well on both the number and the percentage of correct correspondences, and the obtained correspondences are also well distributed over the image space.

In this paper, we propose a novel extrinsic calibration method for camera networks based on a pedestrian who walks on a horizontal surface. Unlike existing methods which require both the feet and head of the person to be visible in all views, our method only assumes that the upper body of the person is visible, which is more realistic in occluded environments. Firstly, we propose a method to calculate the normal of the plane containing all head positions of a single pedestrian. We then propose an easy and accurate method to estimate the 3D positions of the head w.r.t. each local camera coordinate system. We apply orthogonal Procrustes analysis to the 3D head positions to compute relative extrinsic parameters connecting the coordinate systems of cameras in a pairwise fashion. Finally, we refine the extrinsic calibration matrices using a method which minimizes the reprojection error. Experimental results show that the proposed method provides an accurate estimation of the extrinsic parameters.

This article examines projectively-invariant local geometric properties of smooth curves and surfaces. Oriented projective differential geometry is proposed as a general framework for establishing such invariants and characterizing the local projective shape of surfaces and their outlines. It is applied to two problems: (1) the projective generalization of Koenderink’s famous characterization of convexities, concavities, and inflections of the apparent contours of solids bounded by smooth surfaces, and (2) the image-based construction of rim meshes, which provide a combinatorial description of the arrangement induced on the surface of an object by the contour generators associated with multiple cameras observing it.

This paper proposes an approach for solving the parameter determination problem for a stereoscopic panorama camera. Image acquisition parameters have to be calculated under given constraints defined by application requirements, the image acquisition model, and specifications of the targeted 3D scenes. Previous studies on stereoscopic panorama imaging, such as [IYT92, MB95b, WHK99b, PPB00, SKS99, HWK01, Sei01, WP01], pay great attention to how a proposed imaging approach supports a chosen area of application; the image acquisition parameter determination problem has not yet been dealt with in these studies. The lack of guidance in selecting image acquisition parameters affects the validity of results obtained in subsequent processes [WHK00]. Our approach to parameter determination satisfies commonly demanded requirements of 3D scene visualization/reconstruction applications: proper scene composition in the resultant images, adequate sampling at a particular scene distance, and the desired stereo quality (i.e., depth levels) over a diversity of scenes of interest. The paper details the models, constraints, and criteria used for solving the parameter determination problem. Some practical examples are given to demonstrate the use of the formulas derived. The study contributes to the design of stereoscopic panorama cameras as well as to manuals for on-site image acquisition. The results of our studies are also useful for camera calibration and pose estimation in stereoscopic panoramic imaging.

This paper addresses recent developments of circular line-scan imaging systems for applications in 3D scene visualization and/or reconstruction. Such an imaging system is characterized by rotating linear sensors, each capturing one image column at a time. This allows for accurate mappings onto a cylindrical image surface and very high image resolutions, at the cost of motion distortion in dynamic scenes. These images can be used, for example, for stereo visualization and 3D reconstruction in VR applications where extremely high image resolution is beneficial (for static scenes). The paper elaborates the basic geometry, the geometric analysis, and the design and control of imaging parameters to ensure high-quality 3D data acquisition.

It is well known that the epipolar geometry relating two uncalibrated images is determined by at least seven correspondences. If there are more than seven of them, their positions cannot be arbitrary if they are to be projections of any world points by any two cameras. Fewer than seven matches have been thought to be unconstrained. We show that there is a constraint even on five matches, i.e., that there exist forbidden configurations of five points in two images. The constraint is obtained by requiring orientation consistency: points on the wrong side of rays are not allowed. For allowed configurations, we show that the epipoles must lie in domains with piecewise-conic boundaries, and how to compute these domains. We present a concise algorithm for deciding whether a configuration is allowed or forbidden.
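Orientation consistency can be illustrated with the oriented form of the epipolar constraint: for image points with positive homogeneous scale, the line F x1 must agree in orientation with the line joining the second epipole to x2, with the same sign for every correspondence. A hedged NumPy sketch on a toy configuration (the matrices are illustrative, not taken from the paper):

```python
import numpy as np

def orientation_signs(F, X1, X2):
    """Per-correspondence sign relating F x1 to the oriented line e' x x2.
    A real camera pair requires equal signs (up to one global flip)."""
    _, _, Vt = np.linalg.svd(F.T)        # epipole e' in image 2: e'^T F = 0
    e2 = Vt[-1]
    return [np.sign(np.cross(e2, x2) @ (F @ x1))
            for x1, x2 in zip(X1.T, X2.T)]

def orientation_consistent(F, X1, X2):
    s = orientation_signs(F, X1, X2)
    return all(v == s[0] for v in s)

# Toy setup: identity rotation, translation t = (1, 0, 1), F = [t]_x,
# and three true correspondences as columns (homogeneous coordinates).
F = np.array([[0., -1., 0.], [1., 0., -1.], [0., 1., 0.]])
X1 = np.array([[0., 0.5, -0.25], [0., 0.5, 0.125], [1., 1., 1.]])
X2 = np.array([[1/3, 2/3, 0.], [0., 1/3, 0.1], [1., 1., 1.]])
assert orientation_consistent(F, X1, X2)    # allowed configuration
```

Reflecting one second-image point across the epipole along its epipolar line keeps the ordinary epipolar constraint satisfied but flips the sign, turning the configuration into a forbidden one.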

Traditionally, the acquisition of real-time panoramic images has been performed using lenses or mirrors coupled with standard image sensors, which yield distorted images. In this paper we present a system composed of a mirror and a log-polar sensor, which is able to provide directly understandable images.

The epipolar geometry of central panoramic catadioptric cameras is presented. Firstly, we show that central panoramic catadioptric cameras are obtained by combining (i) a conventional perspective camera with a hyperbolic mirror or (ii) an orthographic camera with a parabolic mirror. Secondly, a complete characterization of the epipolar geometry of central panoramic catadioptric cameras is given, since the hyperbolic and parabolic catadioptric cameras are the only single-mirror panoramic catadioptric cameras with a single viewpoint. It is shown that epipolar curves in the image become (i) general conics for the hyperbolic camera and (ii) ellipses or lines for the parabolic camera. It is advocated that the difference between estimating the epipolar geometry for conventional and for central panoramic cameras lies in image data normalization. A normalization suitable for panoramic cameras is proposed, and its effect is demonstrated for the hyperbolic camera.

We introduce the notion of oriented projective reconstruction (OPR). We show that, contrary to common belief, it is possible to obtain more than a projective reconstruction (PR) of a scene from uncalibrated real cameras, namely an OPR. This is enabled by the knowledge that a real camera sees only points in front of it. The defining property of OPR is that the plane which is at infinity in an underlying Euclidean reconstruction does not intersect the convex hull of the reconstructed points; this is generally not true for PR. Thus, OPR can be viewed as a step between affine reconstruction (where the plane at infinity projects to infinity) and PR (where the position of the plane at infinity is unconstrained). The important practical consequence is that OPR preserves the convex hull, so the reconstructed scene is "topologically correct" and can, e.g., be rendered with hidden surfaces removed correctly.
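The "in front of the camera" condition can be made concrete with Hartley's signed-depth formula: writing P X = w (x, y, 1)^T with X = (X, Y, Z, T)^T and M the left 3x3 block of P, depth(X; P) = sign(det M) w / (T ||m3||), where m3 is the third row of M. A small sketch assuming that formula (not this paper's implementation):

```python
import numpy as np

def signed_depth(P, X):
    """Hartley's signed depth of homogeneous point X = (X, Y, Z, T) w.r.t.
    camera P: positive iff the point lies in front of the camera."""
    M = P[:, :3]                      # left 3x3 block of P
    w = P[2] @ X                      # third coordinate of P X
    return np.sign(np.linalg.det(M)) * w / (X[3] * np.linalg.norm(M[2]))

P = np.hstack([np.eye(3), np.zeros((3, 1))])    # canonical camera [I | 0]
assert signed_depth(P, np.array([0., 0., 5., 1.])) > 0    # in front
assert signed_depth(P, np.array([0., 0., -5., 1.])) < 0   # behind
# The sign is invariant to rescaling X by a negative factor:
assert signed_depth(P, np.array([0., 0., 5., 1.]) * -2) > 0
```

Because the sign survives arbitrary rescaling of the homogeneous coordinates, it is a property of the reconstruction itself, which is what allows positivity of all depths to be imposed when upgrading a PR to an OPR.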

Conventional video cameras have limited fields of view which make them restrictive for certain applications in computational vision. A catadioptric sensor uses a combination of lenses and mirrors placed in a carefully arranged configuration to capture a much wider field of view. One important design goal for catadioptric sensors is choosing the shapes of the mirrors in a way that ensures that the complete catadioptric system has a single effective viewpoint. The reason a single viewpoint is so desirable is that it is a requirement for the generation of pure perspective images from the sensed images. In this paper, we derive the complete class of single-lens single-mirror catadioptric sensors that have a single viewpoint. We describe all of the solutions in detail, including the degenerate ones, with reference to many of the catadioptric systems that have been proposed in the literature. In addition, we derive a simple expression for the spatial resolution of a catadioptric sensor in terms of ...

We present an extension of the usual projective geometric framework for computer vision which can nicely take into account information that was previously unused, namely the fact that the pixels in an image correspond to points which lie in front of the camera. This framework, called oriented projective geometry, retains all the advantages of unoriented projective geometry, namely its simplicity for expressing the viewing geometry of a system of cameras, while extending its adequacy for modeling realistic situations.
We discuss the mathematical and practical issues raised by this new framework for a number of computer vision algorithms. We present different experiments where this new tool clearly helps.

Tomáš Werner and Tomáš Pajdla. Cheirality in epipolar geometry. Research Report CAK-340-03-1-2000-01 / CTU-CMP-2000-21, Center for Machine Perception, Czech Technical University, 2000.

Richard I. Hartley. Chirality. International Journal of Computer Vision (IJCV), 26(1):41–61, 1998.

R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press.
