The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences

Published by Copernicus GmbH

Online ISSN: 2194-9034


Print ISSN: 1682-1777


Table 1. Change Detection Result Comparison
Automatic 3D Change Detection Based on Optical Satellite Stereo Imagery
  • Conference Paper

January 2010








Pablo d’Angelo
When monitoring urban areas from space, change detection based on satellite images is one of the most heavily investigated topics. One major shortcoming of monitoring change in 2D is the lack of height change detection: only changes related to reflectance values or local texture can be detected, while changes in the vertical direction are completely ignored. In this paper we present a new 3D change detection approach. We focus on the detection of changes using Digital Surface Models (DSMs) which are generated from stereo imagery acquired at two different epochs. The so-called “difference image” method is adopted in this framework: the final DSM is subtracted from the initial one to obtain the height difference. Our approach proceeds in two steps. The first step reduces noise effects (registration noise, matching artifacts caused by the DSM generation procedure, etc.), while the second exploits the rectangular shape of buildings in order to provide an accurate change map for urban area monitoring. The method is tested, evaluated and compared with manual extraction results over the city centre of Munich, Germany.
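The difference-image step described above can be sketched as follows; the 3×3 median window and the 2.5 m height threshold are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def dsm_change_mask(dsm_t0, dsm_t1, min_height=2.5):
    """Difference-image 3D change detection: subtract the two epochs,
    suppress noise (registration errors, matching artifacts) with a 3x3
    median filter, then threshold the height differences."""
    diff = dsm_t1 - dsm_t0
    padded = np.pad(diff, 1, mode="edge")
    h, w = diff.shape
    windows = np.stack([padded[r:r + h, c:c + w]
                        for r in range(3) for c in range(3)])
    smoothed = np.median(windows, axis=0)
    # keep only differences large enough to be plausible height changes
    return np.abs(smoothed) >= min_height
```

A second, shape-based pass (the paper's building-rectangle step) would then filter this binary mask.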

Figure 3: Upper: Confidence image with detected blobs (red circles) and final SVM detection (green crosses). Lower: Original image with final detection.
Figure 4: Tracking of a group of cars on motorway A96 near Munich. Corresponding matches are marked by dashed lines. Note that the motorbike was not tracked because it was not detected (the detection classifier was not trained on two-wheeled vehicles).
Figure 5: The diagram shows the amount of data which is processed by the Ortho Module but not by the Traffic Processor in case of full traffic data extraction including all road categories and urban core scenarios.
Real-Time Image Processing for Road Traffic Data Extraction from Aerial Images
A world with growing individual traffic requires adequate solutions for traffic monitoring and guidance. Current ground-based approaches to traffic data collection may be barely sufficient for everyday life, but they fail in case of disasters and mass events. Therefore, a road traffic monitoring solution based on an airborne wide-area camera system is currently being developed by DLR. Here, we present a new image processing chain for real-time traffic data extraction from high-resolution aerial image sequences using automatic methods. This processing chain runs on a computer network as part of an operational sensor system for traffic monitoring onboard a DLR aircraft. It is capable of processing aerial images obtained at a frame rate of up to 3 Hz. The footprint area of the three viewing directions of an image exposure with three cameras is 4 x 1 km at a resolution of 20 cm (recorded at a flight height of 1500 m). The processing chain consists of a module for data readout from the cameras and for the synchronization of the images with the GPS/IMU navigation data (used for direct georeferencing), and a module for orthorectification of the images. Traffic data is extracted by a further module based on a priori knowledge, taken from a road database, of the approximate location of road axes in the georeferenced and orthorectified images. Vehicle detection is performed by a combination of AdaBoost using Haar-like features for pixel-wise classification and subsequent clustering by a Support Vector Machine based on a set of statistical features of the classified pixels. In order to obtain velocities, vehicle tracking is applied to consecutive images after performing vehicle detection on the first image of the burst. This is done by template matching along a search space aligned to the road axes, based on normalized cross-correlation in RGB colour space.
With this processing chain we are able to obtain accurate traffic data, with completeness and correctness both higher than 80%, at high actuality for varying and complex image scenes. The proposed processing chain is evaluated on a large number of images including inner-city scenes of Cologne and Munich, demonstrating the robustness of our approach in operational use.
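The tracking step, matching a vehicle template along the road axis by normalized cross-correlation, can be sketched roughly like this (a toy brute-force version; candidate generation along the road axis is a simplified assumption):

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized (RGB) patches."""
    a = patch.astype(float).ravel()
    b = template.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def track_along_axis(image, template, candidates):
    """Score candidate (row, col) positions along the road axis and return
    the best-matching position together with its correlation score."""
    h, w = template.shape[:2]
    scores = [ncc(image[r:r + h, c:c + w], template) for r, c in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]
```

In the real chain the candidate list would follow the road-database axis in the consecutive image of the burst.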

Figure 1: Cartosat-1 image showing the three 4 km×4 km test areas: Terrassa, Vacarisses, La Mola  
Figure 5: Evaluation of a sample area using the described approach  
DSM Accuracy Evaluation for the ISPRS Commission I Image Matching Benchmark

November 2014



For the remote sensing community, Working Group 4 of Commission I: Geometric and Radiometric Modeling of Optical Airborne and Spaceborne Sensors provides on its website a benchmark dataset for measuring and comparing the accuracy of dense stereo algorithms. The data provided consists of several optical spaceborne stereo images together with ground truth data produced by aerial laser scanning. In this paper we present our latest work on this benchmark, building upon previous work. As a first point, we noticed that providing the above-mentioned test data as geo-referenced satellite images together with their corresponding RPC camera model seems too high a burden for wide use by other researchers, as a considerable effort still has to be made to integrate the test data's camera model into a researcher's local stereo reconstruction framework. To bypass this problem, we now also provide additional rectified input images, which enable stereo algorithms to work out of the box without the need for implementing special camera models. Care was taken to minimize the errors resulting from the rectification transformation and the involved image resampling. We further improved the robustness of the evaluation method against errors in the orientation of the satellite images (with respect to the LiDAR ground truth). To this end we implemented a point cloud alignment of the DSM and the LiDAR reference points using an Iterative Closest Point (ICP) algorithm and an estimation of the best-fitting transformation. This way, we concentrate on the errors from the stereo reconstruction and make sure that the result is not biased by errors in the absolute orientation of the satellite images. The evaluation of the stereo algorithms is done by triangulating the resulting (filled) DSMs and computing for each LiDAR point the nearest Euclidean distance to the DSM surface. 
We implemented an adaptive triangulation method minimizing the second-order derivative of the surface in a local neighborhood, which captures the real surface more accurately than a fixed triangulation. As a further advantage, using our point-to-surface evaluation, we are also able to evaluate non-uniformly sampled DSMs, or triangulated 3D models in general. The latter is for example needed when evaluating building extraction and data reduction algorithms. As a practical example we compare results from three different matching methods applied to the data available within the benchmark datasets. These results are analyzed using the above-mentioned methodology and show advantages and disadvantages of the different methods, also depending on the land cover classes.
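The ICP alignment used to remove absolute-orientation bias can be illustrated with a minimal sketch: brute-force nearest neighbours plus the Kabsch best-fit transform (a stand-in; the benchmark's actual implementation and robustification are not described in the abstract):

```python
import numpy as np

def best_fit_transform(src, dst):
    """Kabsch: least-squares rotation R and translation t mapping src onto
    dst with known correspondences (the inner step of each ICP iteration)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    """Tiny ICP: brute-force nearest-neighbour correspondences + Kabsch."""
    cur = src.copy()
    for _ in range(iters):
        nn = dst[np.argmin(((cur[:, None] - dst[None]) ** 2).sum(-1), axis=1)]
        R, t = best_fit_transform(cur, nn)
        cur = cur @ R.T + t
    return cur
```

The same point-to-point machinery generalizes to the point-to-surface distances used for the final evaluation.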

Figure 2: A region of the classification map: (a) visible range multispectral image (bands 5,3,2), (b) fusion and classification by INFOFUSE (this region contains several of the 23 classes)
Figure 3: A region of the classification map (INFOFUSE): (a) visible range multispectral image (bands 5,3,2), (b) fusion and classification by INFOFUSE (this region contains several of the 23 classes) 
Classification accuracy increase using multisensor data fusion
The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, QuickBird, GeoEye-1, etc.), but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to confusion of materials such as different roofs, pavements, roads, etc. and therefore to wrong interpretation and use of classification products. Employment of hyperspectral data is another solution, but its low spatial resolution (compared to multispectral data) restricts its usage for many applications. A further improvement can be achieved by fusion of multisensor data, since this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution SAR and multispectral data fusion for automatic classification in urban areas. Single-polarization TerraSAR-X (SpotLight mode) and multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network). This framework allows a relevant way of combining multisource data following consensus theory. The classification is not influenced by the limitations of dimensionality, and the calculation complexity primarily depends on the step of dimensionality reduction. Fusion of single-polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allows different types of urban objects to be classified into predefined classes of interest with increased accuracy. 
The comparison to classification results of WorldView-2 multispectral data (8 spectral bands) is provided, and the numerical evaluation of the method against other established methods illustrates the advantage in classification accuracy for many classes such as buildings, low vegetation, sports objects, forest, roads, railroads, etc.
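The fission and clustering stages of a framework of this kind can be caricatured in a few lines; plain k-means and the per-pixel feature layout here are stand-in assumptions, not INFOFUSE's actual components:

```python
import numpy as np

def stack_features(sar, msi, dsm):
    """Per-pixel feature fission: stack SAR backscatter (HxW), multispectral
    bands (HxWxB) and DSM heights (HxW) into one (H*W) x (B+2) matrix."""
    h, w = sar.shape
    return np.column_stack([sar.ravel(), msi.reshape(h * w, -1), dsm.ravel()])

def kmeans(X, k, iters=50):
    """Plain k-means as a stand-in for the unsupervised clustering /
    dimensionality-reduction step; centres are seeded deterministically."""
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels
```

The cluster indices would then be aggregated (Bayesian or neural network) into the predefined classes of interest.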

Improving HySpex Sensor Co-registration Accuracy using BRISK and Sensor-model based RANSAC

November 2014



In this paper a method to improve the co-registration accuracy of two separate HySpex SWIR and VNIR cameras is proposed. The first step of the presented approach deals with the detection of point features from both scenes using the BRISK feature detector. After matching these features, the match coordinates in the VNIR scene are orthorectified and the resulting ground control points in the SWIR scene are filtered using a sensor-model based RANSAC. This implementation of RANSAC estimates the boresight angles of a scene by iteratively fitting the sensor-model to a subset of the matches. The boresight angles which can be applied to most of the remaining matches are then used to orthorectify the scene. Compared to previously used methods, the main advantages of this approach are the high robustness against outliers and the reduced runtime. The proposed methodology was evaluated using a test data set and it is shown in this work that the use of BRISK for feature detection followed by sensor-model based RANSAC significantly improves the co-registration accuracy of the imagery produced by the two HySpex sensors.
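The hypothesise-and-verify structure of a sensor-model based RANSAC can be illustrated on a simpler model; here a 2D translation between matched points plays the role of the boresight angles (an analogy, not the paper's implementation):

```python
import numpy as np

def ransac_translation(src, dst, iters=100, tol=0.5, seed=0):
    """RANSAC with a one-match minimal sample: hypothesise a translation
    from a single correspondence, count inliers, and refit on the largest
    consensus set -- the same iterate-fit-verify structure used when
    fitting a sensor model's boresight angles to match subsets."""
    rng = np.random.default_rng(seed)
    best_t, best_count = np.zeros(2), -1
    for _ in range(iters):
        i = rng.integers(len(src))
        t = dst[i] - src[i]                       # minimal-sample hypothesis
        inliers = np.linalg.norm(src + t - dst, axis=1) < tol
        if inliers.sum() > best_count:
            best_count = int(inliers.sum())
            best_t = (dst[inliers] - src[inliers]).mean(axis=0)
    return best_t, best_count
```

The consensus model is finally applied to all remaining matches, mirroring how the accepted boresight angles are used to orthorectify the scene.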

Figure 1: Workflow of proposed car detection method
Figure 4: 3K camera system 
Figure 5: Original 3K image sample 
Figure 6: Classification result without motion mask 
Motion component supported Boosted Classifier for car detection in aerial imagery
Research on automatic vehicle detection in aerial images has seen continual innovation and rising success for years. However, information has mostly been taken from a single image only. Our aim is to use the additional information offered by the temporal component, precisely the difference between the previous and the consecutive image. On closer inspection, the moving objects are mainly vehicles, and we therefore provide a method which is able to limit the search space of the detector to changed areas. The actual detector is built from HoG features which are composed and linearly weighted by AdaBoost. Finally, the method is tested on a motorway section including an exit and congested traffic near Munich, Germany.
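The linear weighting of weak classifiers by AdaBoost can be sketched with one-feature threshold stumps standing in for the HoG-based weak learners (a minimal, unoptimized illustration):

```python
import numpy as np

def adaboost_stumps(X, y, rounds=10):
    """Minimal AdaBoost with one-feature threshold stumps as weak
    classifiers, linearly weighted by their alpha; y must be in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    stumps = []
    for _ in range(rounds):
        best = None
        for j in range(d):                      # exhaustive stump search
            for thr in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] >= thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, s)
        err, j, thr, s = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = s * np.where(X[:, j] >= thr, 1, -1)
        w = w * np.exp(-alpha * y * pred)       # reweight misclassified samples
        w /= w.sum()
        stumps.append((alpha, j, thr, s))

    def classify(Z):
        score = sum(a * s * np.where(Z[:, j] >= t, 1, -1)
                    for a, j, t, s in stumps)
        return np.sign(score)

    return classify
```

In the paper's setting each stump would act on one HoG bin, and the motion mask restricts which windows are evaluated at all.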

Figure 11: Tracked objects filtered by CC-threshold 0.9 (100% correct)
Fast Vehicle Detection and Tracking in Aerial Image Bursts
Driven by the rising interest in traffic surveillance for simulations and decision management, many publications concentrate on automatic vehicle detection or tracking. Quantities and velocities of different car classes form the data basis for almost every traffic model. Especially during mass events or disasters, wide-area traffic monitoring on demand is needed, which can only be provided by airborne systems. This implies a massive amount of image information to be handled. In this paper we present a combination of vehicle detection and tracking which is adapted to the special restrictions on image size and flow but nevertheless yields reliable information about the traffic situation. By combining a set of modified edge filters it is possible to detect cars of different sizes and orientations with minimal computing effort, if some a priori information about the street network is used. The detected vehicles are tracked between two consecutive images by an algorithm using Singular Value Decomposition. Based on their distance and correlation, the features are assigned pairwise with respect to their global positioning among each other. By choosing only the best-correlating assignments it is possible to compute reliable values for the average velocities.
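An SVD-based pairwise assignment in the spirit of Scott and Longuet-Higgins, which matches the description above, might look like this (a sketch; the paper's exact formulation, including the correlation term, is assumed):

```python
import numpy as np

def svd_assign(p1, p2, sigma=5.0):
    """SVD-based pairwise assignment: build a Gaussian proximity matrix
    between the two point sets, replace its singular values by ones
    (orthogonalization), and accept mutually dominant entries as matches."""
    d2 = ((p1[:, None] - p2[None]) ** 2).sum(-1)
    G = np.exp(-d2 / (2.0 * sigma ** 2))
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    P = U @ Vt                         # closest "permutation-like" matrix
    matches = []
    for i in range(P.shape[0]):
        j = int(np.argmax(P[i]))
        if int(np.argmax(P[:, j])) == i:   # mutually dominant entry
            matches.append((i, j))
    return matches
```

Keeping only the mutually dominant (best-correlating) pairs mirrors the paper's selection of reliable assignments for velocity estimation.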

Evaluation of selected features for car detection in aerial images
The extraction of vehicles from aerial images provides a wide-area traffic situation within a short time. Applications for the gathered data are various and reach from smart routing in the case of congestion to usability validation of roads in the case of disasters. The challenge of the vehicle detection task is finding adequate features which are capable of separating cars from other objects, especially those that look similar. We present an experiment where selected features show their ability for car detection. Precisely, Haar-like and HoG features are utilized and passed to the AdaBoost algorithm for calculating the final detector. Afterwards the classifying power of the features is accurately analyzed and evaluated. The tests are carried out on aerial data from the inner city of Munich, Germany, and include small inner-city roads with rooftops close by, which raises the complexity.

Figure 1. Flow diagram of the proposed traffic parameter extraction method  
Traffic congestion parameter estimation in time series of airborne optical remote sensing images
In this paper we propose a new model-based traffic parameter estimation approach for congested situations in time series of airborne optical remote sensing data. The proposed approach is based on the combination of various techniques: change detection, image processing, and the incorporation of a priori information such as the road network, information about vehicles and roads, and finally a traffic model. The change detection in two images with a short time lag of several seconds is implemented using the multivariate alteration detection method, resulting in a change image where the moving vehicles on the roads are highlighted. Further, image processing techniques are applied to derive the vehicle density in the binarized change image. Finally, this estimated vehicle density is related to the vehicle density obtained by modelling the traffic flow for a road segment. The model is derived from a priori information about the vehicle sizes and road parameters, the road network, and the spacing between the vehicles. The modelled vehicle density is then directly related to the average vehicle velocity on the road segment, and thus information about the traffic situation can be derived. To confirm our idea and to validate the method, several flight campaigns with the DLR airborne experimental wide-angle optical 3K digital camera system operated on a Do-228 aircraft were performed. Experiments are performed to analyse the performance of the proposed traffic parameter estimation method for highways and main streets in cities. The estimated velocity profiles coincide qualitatively and quantitatively quite well with the reference measurements.
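A classical way to relate vehicle density to average velocity, which could serve as the traffic model mentioned above, is Greenshields' linear model (the parameter values below are illustrative assumptions, not the paper's calibration):

```python
def greenshields_velocity(density, v_free=130.0, k_jam=140.0):
    """Greenshields' linear model: velocity falls linearly from the
    free-flow speed (km/h) to zero as density (veh/km per lane)
    approaches the jam density."""
    k = min(max(density, 0.0), k_jam)   # clamp to the physical range
    return v_free * (1.0 - k / k_jam)
```

Inverting such a relation turns the image-derived vehicle density into the average velocity reported for the road segment.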

Fig 5. Architecture of the distributed airborne processing unit 
Fig 6.
Fig 7. Examples for road extraction (clipping from nadir images) and vehicle detection at free traffic. Upper panel shows line detections at a flight height of 1000 m, lower panel shows detected vehicles in the city area of Munich.
Near real time airborne monitoring system for disaster and traffic applications
A near real time airborne monitoring system for natural disasters, mass events, and large traffic disasters was developed over the last years at the German Aerospace Center (DLR). This system consists of an optical wide-angle camera system (3K system), a SAR sensor, an optical and microwave data downlink, an onboard processing unit, and a ground processing station with online data transmission to the DLR traffic and disaster portals. The development of the near real time processing chain from data acquisition to the ground station is still a very challenging task. In this paper, an overview of all relevant parts of the airborne optical mapping system is given, and selected system processes are addressed and described in more detail. The experiences made in the flight campaigns of the last years are summarized with focus on the image processing part, e.g. achieved georeferencing accuracies and the status of the traffic processors.

Figure 1: Imaging Geometry of SAR ATI System
Figure 9: SRTM/X-SAR amplitude (left) and coherence (right) of German motorway A9
Figure 11: Test cars with different along-track velocities on the runway in a nominally processed radar image (left) and a reference optical image (right).
Figure 12: Five radar images of a car driving with an along- track velocity of 37 km/h processed with FM rates adapted for 0, 20, 37, 50 and 70 km/h (clockwise). 
An Airborne SAR Experiment For Ground Moving Target Identification
With the launch of the advanced high resolution radar satellite TerraSAR-X in summer 2006, new possibilities open up for demonstrating traffic monitoring from space. At the German Aerospace Center, an automatic and operational traffic processor is being developed for the TerraSAR-X ground segment. This comprises the detection of moving objects in Synthetic Aperture Radar (SAR) images, their correct assignment to the road network, and the estimation of their velocities. An airborne SAR campaign with DLR's E-SAR sensor was flown in an Along-Track Interferometry (ATI) mode in April 2004 to investigate the effects of ground moving objects on SAR data and to acquire a data basis for algorithm development and validation for the traffic processor. Several vehicles with measured GPS tracks as well as vehicles on motorways with unknown velocities were imaged with the radar under different conditions. The paper provides an overview of the flight campaign and the experiment setups used. Several data takes with different platform headings were acquired over a test site including controlled cars on a runway and over other test sites including motorways. The across-track velocity was estimated by means of the ATI phase. The measurements for experimental and road vehicles are presented, including comparison with GPS and optical reference data. To optimize the detectability of moving objects and to facilitate accurate speed measurements, adapted SAR processing approaches were successfully applied to E-SAR and SRTM data to compensate for energy losses caused by a mismatched processed Doppler centroid (due to across-track motion) and by defocused impulse responses (due to along-track motion). Selected results and examples thereof are presented in the paper.
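The across-track velocity estimate from the ATI phase follows from the time lag between the two antennas; the sketch below uses assumed X-band and airborne-geometry values, and ignores sign conventions and phase wrapping:

```python
import math

def ati_radial_velocity(phase_rad, wavelength=0.031, baseline=0.6,
                        v_platform=90.0):
    """Across-track (radial) velocity from the ATI phase of a mover: the
    two antennas observe the scene with a time lag tau = B / v_p, giving
    phi = 4*pi*v_r*tau / lambda, hence v_r = phi*lambda*v_p / (4*pi*B).
    Wavelength, baseline and platform speed here are illustrative values."""
    return phase_rad * wavelength * v_platform / (4.0 * math.pi * baseline)
```

The same relation explains why fast movers alias: once phi exceeds pi, the velocity estimate wraps and must be disambiguated.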

Figure 3. Illustration of projection geometry with directions of the used coordinate frames. Thereafter, the parameters of interior and exterior orientation of the camera are known. Assuming a known interior camera geometry, for an observed point P the following equation describes the relation between image space and object space (Figure 3): P = C^m + dp^m = C^m + α · (P^m − C^m)   (1)
Figure 4. Georeference of the recorded image within the digital map
Figure 6. The expected road area based on the digital map information is masked out on the image.
To get accurate knowledge about the mapped roads, all street segments in a sufficient area around the recorded region have to be tested for intersection with the image. This information enables the aggregation of vehicle data from image sequences later on. The vehicle detection is done on the reduced image area from the previous phase. The pixel sizes of the expected vehicle classes are dynamically adapted to the current navigation data.
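Equation (1) in the figure caption above scales the viewing ray from the projection centre by a factor α; for a flat ground plane this reduces to a one-liner (a simplification of the actual georeferencing, which intersects the ray with terrain data):

```python
import numpy as np

def ground_point(cam_center, ray_dir, ground_z=0.0):
    """Direct georeferencing against a flat ground plane: choose the scale
    factor alpha in P = C + alpha * d so that the ray from the projection
    centre C along direction d reaches height ground_z."""
    alpha = (ground_z - cam_center[2]) / ray_dir[2]
    return cam_center + alpha * ray_dir
```

Here the ray direction is assumed to be already rotated into the mapping frame by the GPS/IMU attitude and boresight angles.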
Airborne camera experiments for traffic monitoring
Prediction of traffic, dynamic routing, off-board navigation and a standardisation of traffic flow parameters are the cornerstones of modern intelligent transport systems. The development of such systems requires intelligent data acquisition from different sensors and platforms. Due to their spatial and temporal flexibility, airborne sensors can provide useful information alongside existing systems, e.g. induction loops and vehicle probe data. DLR is involved in two projects proving the gain of using aerial images for traffic monitoring – “LUMOS” and “Eye in the sky”. For LUMOS an infrared camera system was used in combination with an inertial measurement unit (IMU) onboard an airplane. The project “Eye in the sky” provides an opportunity to evaluate the relevance of image data captured by a zeppelin and a helicopter. A high resolution digital camera and an inertial measurement unit mounted on an airborne platform were used to provide images and attitude data. In both projects, images were transmitted to a ground station, georeferenced and processed in order to extract user-relevant traffic information. The whole procedure is realized in real time. Within the projects a variety of different sensors and platforms were used. This allows a validation of several configurations, helping DLR to open up new perspectives for traffic monitoring in the future.

Fig 3 Illustration of 3-ray tie points for images without regarding boresight misalignment (red circles) and for images with the correct boresight angles (black dots)
Fig 4 Relation between attitude change and boresight angle accuracy
4.2.2 Estimation of boresight misalignment with a bundle adjustment using only 3-ray tie points
For the estimation of boresight misalignment, a bundle adjustment using the GPS/IMU measurements and automatically matched 3-ray tie points is conceived. Additional tie points between left/right looking images and the nadir images are introduced to stabilize the relative camera orientations (Fig 5). In the case of the 3K camera system, altogether nine boresight angles, three for each camera, must be estimated. Due to the tilted cameras, boresight angles of up to 35° are possible, which impedes commonly used approximations for boresight misalignment.
Fig 5 Proposed matching scheme (red zones): matched 3-ray tie points in three consecutive left, nadir, and right looking images as well as 2-ray tie points between left/right and nadir images.
Calibration of a Wide-Angle Digital Camera System for Near Real Time Scenarios
Near real time monitoring of natural disasters, mass events, and large traffic disasters with airborne SAR and optical sensors will be the focus of several research and development projects at the German Aerospace Center (DLR) in the next years. For these projects, new airborne camera systems are applied and tested. An important part of the sensor suite is the recently developed optical wide-angle 3K camera system (3K = “3Kopf”), which consists of three non-metric off-the-shelf cameras (Canon EOS 1Ds Mark II, 16 MPixel). The cameras are arranged in an array with one camera looking in nadir direction and two in oblique sideward directions, which leads to an increased FOV of up to 110° / 31° in across-track / flight direction. With this camera configuration, high resolution, colour, wide-area monitoring becomes feasible even at low flight altitudes, e.g. below the clouds. The camera system is coupled to a GPS/IMU navigation system, which enables direct georeferencing of the 3K optical images. The ability to acquire image sequences at up to 3 Hz broadens the spectrum of possible applications, in particular for traffic monitoring. In this paper, we present a concept of calibration and georeferencing which is adjusted to the requirements of a near real time monitoring task. The concept is based on straightforward georeferencing, using the GPS/IMU data to automatically estimate the unmeasured boresight angles. To achieve this without measuring ground control points (GCPs), we estimate the boresight angles on the fly, based on automatically matched 3-ray tie points in combination with GPS/IMU measurements. A prerequisite for obtaining robust results for the boresight angles is that the airplane attitude changes slightly during image acquisition; in this way singular solutions can be avoided. Additionally, we assume known and fixed parameters of interior orientation. 
The determination of the interior orientation is performed ground-based using a bundle adjustment of images from a calibration test field. The determination of the parameters of the interior orientation is repeated to check for systematic changes over time. The proposed georeferencing and calibration concept was tested with images acquired during three flight campaigns in 2006. To evaluate the accuracy obtained by direct georeferencing using the proposed estimation procedure for the boresight angles without GCPs, the data are compared with the results of a bundle adjustment using GCPs and the GPS/IMU information. Summarizing, the RMSE of direct georeferencing with/without GCPs is 1.0 m / 5.1 m in position and 0.5 m / 1.0 m in height, at image scales of 1:20,000. The accuracy without GCPs is regarded as acceptable for near real time applications. Additionally, it is shown that the parameters of the interior orientation remain stable over three repeated calibrations on a test field for all three cameras.

Figure 1. Diagram of spatial consistency assessment using phase congruency 
Multiresolution image fusion: Phase congruency for spatial consistency assessment
Multiresolution and multispectral image fusion (pan-sharpening) requires proper assessment of spectral consistency, but also of spatial consistency. Many fusion methods resulting in perfect spectral consistency may lack spatial consistency and vice versa; therefore a proper assessment of both spectral and spatial consistency is required. Up to now, only a few approaches were proposed for spatial consistency assessment, using edge map comparison calculated by gradient-like methods (Sobel or Laplace operators). Since image fusion may change the intensity and contrast of objects in the fused image, gradient methods may give disagreeing edge maps for the fused and reference (panchromatic) images. Unfortunately, this may lead to wrong conclusions on spatial consistency. In this paper we propose to use phase congruency for spatial consistency assessment. This measure is invariant to intensity and contrast change and allows assessing the spatial consistency of the fused image in a multiscale way. Several assessment tests on IKONOS data allowed comparing known assessment measures with the measure based on phase congruency. It is shown that the phase congruency measure follows the common trend of other widely used assessment measures and allows obtaining a confident assessment of spatial consistency.
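The gradient-based edge-map comparison that the paper improves upon can be sketched as follows; note that the Sobel magnitude depends on local contrast, which is exactly the sensitivity that motivates the contrast-invariant phase congruency measure:

```python
import numpy as np

def sobel_mag(img):
    """Gradient-magnitude edge map using 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    h, w = img.shape
    p = np.pad(img.astype(float), 1, mode="edge")
    gx = sum(kx[i, j] * p[i:i + h, j:j + w]
             for i in range(3) for j in range(3))
    gy = sum(kx.T[i, j] * p[i:i + h, j:j + w]
             for i in range(3) for j in range(3))
    return np.hypot(gx, gy)

def edge_consistency(fused, pan):
    """Correlation of the two edge maps; nonlinear intensity changes
    introduced by fusion alter the gradient maps and distort the score,
    the weakness that phase congruency avoids."""
    a = sobel_mag(fused).ravel()
    b = sobel_mag(pan).ravel()
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))
```

A phase congruency map, computed from local frequency components instead of gradients, would replace `sobel_mag` in this comparison.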

Quality Assessment of the TanDEM-X Global Digital Elevation Model
TanDEM-X is an innovative synthetic aperture radar (SAR) mission with the main goal to generate a global and homogeneous digital elevation model (DEM) of the Earth’s land masses. The final DEM product will reach a new dimension of detail with respect to resolution and quality. The absolute horizontal and vertical accuracy shall each be less than 10 m in a 90% confidence interval at a pixel spacing of 12 m. The relative vertical accuracy specification for the TanDEM-X mission foresees a 90% point-to-point error of 2 m (4 m) for areas with predominant terrain slopes smaller than 20% (greater than 20%) within a 1° longitude by 1° latitude cell. The global DEM is derived from interferometric SAR acquisitions performed by two radar satellites flying in close orbit formation. Interferometric performance parameters like the coherence between the two radar images have been monitored and evaluated throughout the mission. In a further step, over 500,000 single SAR scenes are interferometrically processed, calibrated, and mosaicked into a global DEM product which will be completely available in the second half of 2016. This paper presents an up-to-date quality status of the single interferometric acquisitions as well as of 50% of the final DEM. The overall DEM quality of these first products promises accuracies well within the specification, especially in terms of absolute height accuracy.
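The 90% confidence figures quoted above correspond to the 90th percentile of absolute height errors (often written LE90), which is straightforward to compute against reference heights:

```python
import numpy as np

def le90(dem_heights, ref_heights):
    """Absolute vertical accuracy in the DEM-spec sense: the 90th
    percentile of the absolute height differences (LE90)."""
    err = np.abs(np.asarray(dem_heights, float)
                 - np.asarray(ref_heights, float))
    return float(np.percentile(err, 90))
```

The relative (point-to-point) specification is evaluated analogously, but on height-difference pairs within each 1°×1° cell.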

Navigation assistance for ice-infested waters through automatic iceberg detection and ice classification based on TerraSAR-X imagery
Over the last three decades, the Arctic summer sea ice coverage has decreased significantly. This trend is expected to continue due to persistent climate change. Besides increased research efforts in this field, this phenomenon has also attracted attention from maritime end-users. To keep Arctic shipping routes safe, monitoring of icebergs and drift ice is crucial. Satellite-borne remote sensing, in particular Synthetic Aperture Radar (SAR), is ideally suited to this purpose. Wide coverage, high-frequency availability, and independence from daylight and cloud coverage are among the major advantages of this data source. We propose automated iceberg detection and sea ice classification algorithms based on TerraSAR-X imagery and their application for near real-time purposes. Operational data acquired during several cruises into ice-infested waters are discussed. We show how maritime users benefit from such value-added SAR-based products.

Figure 5. Lines (highlighted in colour) extracted from the simulated image generated from the whole DSM are overlapped on TSX image: (a) before matching, (b) after matching 
Figure 6. Lines (highlighted in colour) extracted from TSX image are matched to simulated image generated from the whole DSM 
Figure 7. TerraSAR-X image (cyan) overlapped with simulated SAR image (red) include: (a) all reflection level, (b) only double bounce reflection 
Figure 8. TerraSAR-X image (cyan) overlapped with reflection area (a) and shadow area (b) of single building (red) 
Automatic interpretation of high resolution SAR images: First results of SAR image simulation for single buildings
Due to its all-weather data acquisition capability, high resolution spaceborne Synthetic Aperture Radar (SAR) plays an important role in remote sensing applications like change detection. However, because of the complex geometric mapping of buildings in urban areas, SAR images are often hard to interpret. SAR simulation techniques ease the visual interpretation of SAR images, while fully automatic interpretation is still a challenge. This paper presents a method for supporting the interpretation of high resolution SAR images with simulated radar images using a LiDAR digital surface model (DSM). Line features are extracted from the simulated and real SAR images and used for matching. A single building model is generated from the DSM and used for building recognition in the SAR image. An application of the concept is presented for the city centre of Munich, where the comparison of the simulation to the TerraSAR-X data shows good similarity. Based on the results of simulation and matching, special features (e.g. double-bounce lines, shadow areas) can be automatically indicated in the SAR image.

Automatic crowd analysis from very high resolution satellite images

October 2011



Automatic detection of people crowds from images has recently become a very important research field, since it can provide crucial information, especially for police departments and crisis management teams. Due to the importance of the topic, many researchers have tried to solve this problem using street cameras. However, these cameras cannot be used to monitor very large outdoor public events. To address this, we propose a novel approach to detect crowds automatically from remotely sensed images, especially from very high resolution satellite images. To do so, we use a local-feature-based probabilistic framework. We extract local features from the colour components of the input image. In order to eliminate redundant local features coming from other objects in the scene, we apply a feature selection method. For feature selection purposes, we benefit from three different types of information: a digital elevation model (DEM) of the region, automatically generated from stereo satellite images; possible street segments, obtained by segmentation; and shadow information. After eliminating redundant local features, the remaining features are used to detect individual persons. Those local feature coordinates are also taken as observations of the probability density function (pdf) of the crowds to be estimated. Using an adaptive kernel density estimation method, we estimate the corresponding pdf, which gives us information about dense crowd and people locations. We test our algorithm using Worldview-2 satellite images over the cities of Cairo and Munich. In addition, we provide test results on airborne images for comparison of detection accuracy. Our experimental results indicate the possible usage of the proposed approach in real-life mass events.
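The density estimation step can be illustrated with a minimal sketch. The paper uses an *adaptive* kernel; the fixed-bandwidth Gaussian KDE below is a simplification, and the function name and parameters are illustrative.

```python
import numpy as np

def crowd_density(points, grid_x, grid_y, bandwidth=2.0):
    """Fixed-bandwidth Gaussian KDE over 2D local-feature coordinates.

    points         : (N, 2) array of feature positions (pdf observations)
    grid_x, grid_y : 1D arrays defining the evaluation grid
    Returns a (len(grid_y), len(grid_x)) density map; high values mark
    dense crowd locations.
    """
    gx, gy = np.meshgrid(grid_x, grid_y)
    density = np.zeros_like(gx, dtype=float)
    norm = 1.0 / (2.0 * np.pi * bandwidth**2 * len(points))
    for px, py in points:
        d2 = (gx - px) ** 2 + (gy - py) ** 2
        density += norm * np.exp(-0.5 * d2 / bandwidth**2)
    return density
```

An adaptive variant would shrink `bandwidth` where observations are dense, sharpening crowd boundaries.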

Automatic generation of digital terrain models from Cartosat-1 stereo images
This paper proposes a novel algorithm for automatic Digital Terrain Model (DTM) generation from high resolution CARTOSAT-1 satellite images that produces accurate and reliable results. It consists of two major steps: generation of Digital Surface Models (DSM) from CARTOSAT-1 stereo scenes, and hierarchical image filtering for DTM generation. High resolution stereo satellite imagery is well suited for the creation of DSM. A system for automated and operational DSM and orthoimage generation based on CARTOSAT-1 imagery is presented, with emphasis on fully automated georeferencing. It processes level-1 stereo scenes using the rational polynomial coefficients (RPC) universal sensor model. A novel, automatic georeferencing method is used to derive a high quality RPC correction from lower resolution reference datasets, such as Landsat ETM+ Geocover and the SRTM C-band DSM. Digital surface models are derived by dense stereo matching and forward intersection, with subsequent interpolation into a regular grid. In the second step, which is dedicated to DSM filtering, the DSM pixels are classified into ground and non-ground using an algorithm motivated by grayscale image reconstruction to suppress unwanted elevated pixels. In this method, non-ground regions, i.e. 3D objects as well as outliers (very low or very high elevated regions), are hierarchically separated from the ground regions. The generated DTM is qualitatively and quantitatively evaluated. Profiles in the image as well as a comparison of the derived DTM to the original data and ground truth data are presented. The ground truth data consist of a high quality DTM produced from aerial images, generated by the Institut Cartogràfic de Catalunya (ICC) in Barcelona, Spain. The evaluation results indicate that almost all non-ground objects, regardless of their size, are eliminated, and that the iterative approach gives good results in hilly as well as smooth residential areas.
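The grayscale-reconstruction idea behind the filtering step can be sketched as follows. This is a single-scale simplification of the paper's hierarchical method, implemented here with a hand-rolled 3×3 dilation; the parameters `h` (maximum expected object height relative to its surroundings) and `tol` are illustrative assumptions.

```python
import numpy as np

def dilate3x3(a):
    """Grayscale dilation with a flat 3x3 structuring element (edge-padded)."""
    p = np.pad(a, 1, mode="edge")
    out = a.copy()
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out = np.maximum(out, p[1 + di:1 + di + a.shape[0],
                                    1 + dj:1 + dj + a.shape[1]])
    return out

def reconstruct_by_dilation(seed, mask, max_iter=10000):
    """Morphological reconstruction: grow the seed under the mask until stable."""
    rec = np.minimum(seed, mask)
    for _ in range(max_iter):
        nxt = np.minimum(dilate3x3(rec), mask)
        if np.array_equal(nxt, rec):
            break
        rec = nxt
    return rec

def classify_ground(dsm, h=3.0, tol=0.5):
    """Label pixels as ground where the DSM stays close to the surface
    reconstructed from a marker lowered by h; elevated objects (buildings,
    trees) stick out of that surface and are labelled non-ground."""
    background = reconstruct_by_dilation(dsm - h, dsm)
    return (dsm - background) < tol  # True = ground
```

On real terrain, slopes and the range of object sizes require the hierarchical, multi-scale treatment described in the paper; this sketch only shows the core reconstruction operator.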

Figure 1: Cartosat-1 image showing the three test areas. 
Figure 2: Different subareas in the Terrassa area. Yellow: City; Blue: Industrial; Purple: Bridges; Orange: Residential; Green: Fields; Red: Changes. 
Figure 3: Histogram of errors for the Worldview-1 Vacarisses DSM, showing the deviation from the normal distribution. The red curve shows the expected counts from a normal distribution with mean and σ estimated from the error distribution using standard techniques. The mismatch arises from the heavy tails.
Figure 6: Oblique view on the Terrassa DEMs. The industrial area is located in the foreground of the image. Top to Bottom: 1st Pulse LIDAR (Reference), Worldview-1 DSM, Cartosat-1 DSM.
Semiglobal Matching Results on the ISPRS Stereo Matching Benchmark
Digital surface models can be efficiently generated with automatic image matching from optical stereo images. Working Group 4 of Commission I on "Geometric and Radiometric Modelling of Optical Spaceborne Sensors" provides a matching benchmark with several stereo data sets from high and very high resolution spaceborne stereo sensors. The selected regions are in Catalonia, Spain, and include three test areas, covering city areas, rural areas and forests in flat and moderately undulated terrain as well as steep mountainous terrain. In this paper, digital surface models (DSM) are derived from the Cartosat-1 and Worldview-1 datasets using Semiglobal Matching. The resulting DSM are evaluated against the first pulse returns of the LIDAR reference dataset provided by the Institut Cartogràfic de Catalunya (ICC), using robust accuracy measures.
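Robust accuracy measures for such DSM evaluations typically replace mean and σ with the median and the NMAD (normalised median absolute deviation), which tolerate the heavy-tailed error distributions shown in Figure 3. A minimal sketch (the exact measures used in the benchmark are not listed in this abstract):

```python
import numpy as np

def robust_dsm_stats(dsm, reference):
    """Median and NMAD of the height differences between a DSM and a
    reference surface. NMAD (= 1.4826 * MAD) matches the standard
    deviation for Gaussian errors while ignoring heavy-tailed outliers."""
    d = (dsm - reference).ravel()
    d = d[np.isfinite(d)]          # skip voids / no-data cells
    med = np.median(d)
    nmad = 1.4826 * np.median(np.abs(d - med))
    return med, nmad
```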

Satellite-based radar measurements for validation of high-resolution sea state forecast models in the German Bight
Remote sensing Synthetic Aperture Radar (SAR) data from the TerraSAR-X and TanDEM-X (TS-X and TD-X) satellites have been used for validation and verification of newly developed coastal forecast models in the German Bight of the North Sea. The empirical XWAVE algorithm for estimating significant wave height has been adapted for coastal application and implemented for NRT services. All available TS-X images in the German Bight collocated with buoy measurements (6 buoys) since 2013 were processed and analysed (a total of 46 scenes/passes with 184 StripMap images). Sea state estimated from series of TS-X images covers strips with a length of ~200 km and a width of 30 km over the German Bight, from the East Frisian Islands to the Danish coast. The comparisons with results of the wave prediction model show a number of local variations due to variability in bathymetry and wind fronts.
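The buoy collocation comparison described above is usually summarised with a handful of standard statistics. A minimal sketch, assuming collocated pairs of significant wave height (the metric names are conventional, not quoted from the paper):

```python
import numpy as np

def collocation_stats(hs_sar, hs_buoy):
    """Standard comparison statistics for collocated significant wave
    height estimates: bias, RMSE, and scatter index (RMSE normalised
    by the mean buoy value)."""
    hs_sar = np.asarray(hs_sar, dtype=float)
    hs_buoy = np.asarray(hs_buoy, dtype=float)
    diff = hs_sar - hs_buoy
    bias = diff.mean()
    rmse = np.sqrt((diff ** 2).mean())
    scatter_index = rmse / hs_buoy.mean()
    return bias, rmse, scatter_index
```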

Extracting Orthogonal Building Objects in Urban Areas from High Resolution Stereo Satellite Image Pairs

September 2007



Since the number of urban residents is rapidly increasing, especially in developing countries, relatively cheap and fast methods for modelling and mapping such cities are required. Besides the creation and updating of maps of sprawling cities, three-dimensional models are useful for simulation, monitoring and planning in case of catastrophic events like flooding, tsunamis or earthquakes. With the availability of very high resolution (VHR) stereo satellite data, investigations of large urban areas regarding their three-dimensional shape can be performed quickly and relatively cheaply in comparison to aerial photography, especially for cities in developing countries. Most of the methods currently used for the generation of city models depend on a large amount of interactive work, and mostly also on additional information such as building footprints. A method for fully automatic derivation of relatively coarse and simple models of urban structure is therefore of great use. In this paper, one approach for such automatic modelling and a processing chain is sketched, and the method used for modelling the buildings is described.

Figure 1. Perspective Camera Model 
Figure 3. Calibration Target 
A Unified Calibration Approach For Generic Cameras
The classic perspective projection is mostly used when calibrating a camera. Although this approach is well developed and often suitable, it is not necessarily adequate for modelling every camera system, such as fish-eye or catadioptric cameras. The perspective projection is not applicable when fields of view reach 180° and beyond. In this case, an appropriate model for the particular non-perspective camera has to be used. For an unknown camera system, a generic camera model is required. This paper discusses a variety of parametric and generic camera models. These models are subsequently validated using different camera systems. A unified approach for deriving initial parameter guesses for subsequent parameter optimisation is presented. Experimental results prove that generic camera models perform as accurately as a particular parametric model would. Furthermore, no prior knowledge of the camera system is needed.
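One widely used generic form writes the image radius as an odd polynomial in the incidence angle, which remains finite at and beyond 90° where the perspective model r = f·tan θ diverges. The sketch below illustrates this projection; it is a generic-model example, not necessarily the parameterisation used in the paper, and the coefficients are illustrative.

```python
import math

def project_generic(X, Y, Z, coeffs, cx, cy):
    """Project a 3D point in camera coordinates with a generic radial model.

    r(theta) = coeffs[0]*theta + coeffs[1]*theta**3 + ...  (odd polynomial
    in the angle theta from the optical axis), so points with theta >= 90
    degrees still map to finite image radii, unlike r = f*tan(theta).
    Returns pixel coordinates relative to the principal point (cx, cy).
    """
    theta = math.atan2(math.hypot(X, Y), Z)   # angle from the optical axis
    r = sum(c * theta ** (2 * i + 1) for i, c in enumerate(coeffs))
    phi = math.atan2(Y, X)                    # azimuth in the image plane
    return cx + r * math.cos(phi), cy + r * math.sin(phi)
```

With a single coefficient this reduces to the equidistant fish-eye model r = f·θ; adding higher-order terms lets the same code approximate perspective, stereographic, or measured lens behaviour, which is the appeal of the generic approach.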

Figure 2-1: CARTOSAT stereo pair, ICC reference DEM and distribution of GCP extracted from the 10 orthoimages of scale 1:5000 (GCP: +, reference DEM: small dashes, CatA: larger dashes, CatF: full line)
Figure 4-1: Deviations in pixel (factor 200 enlarged) of measured versus RPC image coordinates after affine transformation correction of RPC at 70 GCP (CatA-orthoimage names in row/col nomenclature of ICC)
Figure 6-1: Shifts (in pixel / factor 200 enlarged) derived via automatic image matching between the Aft/Fore orthoimages generated with new RPC and DSM derived from CARTOSAT-1 stereo pair (upper left part)
Figure 6-2: Shifts (in pixel / factor 200 enlarged) between the Aft/Fore orthoimages generated with new RPC but reference DEM (full stereo scene)
Stereo evaluation of CARTOSAT-1 data for French and Catalonian test sites
DLR's Remote Sensing Technology Institute has more than 20 years of experience in developing spaceborne stereo scanners (MEOSS, MOMS) and the corresponding stereo evaluation software systems. It takes part in the CARTOSAT-1 Scientific Assessment Program (C-SAP) as a principal investigator for a Catalonian test site (TS-10) and as co-investigator for a French test site (TS-5, Mausanne-les-Alpilles). Rational polynomial coefficients (RPC) are provided by the distributing Indian agency as a universal sensor model for each scene of the CARTOSAT-1 stereo pairs. These RPC have to be corrected via ground control points (GCP), as they are derived from orbit and attitude information with an accuracy far worse than the pixel size of 2.5 m (normally, post-facto pointing accuracy is in the range of a few hundred metres). GCP of sufficient (sub-pixel) accuracy are derived from high-resolution orthoimages of the Catalonian Survey Institute (ICC) and from JRC DGPS campaigns, respectively. The GCP are identified in the aft images of the stereo pairs because of their superior radiometric quality and then transferred to the fore images via local least squares matching. From these GCP, an affine correction to the delivered RPC is estimated. The residual vectors at the GCP are reduced to sub-pixel values. Because of zero-denominator problems with the original RPC in small regions of the stereo pairs, new RPC are estimated. Using the corrected RPC and mass tie points from automatic image matching, digital surface models (DSM) are derived from the CARTOSAT-1 stereo pairs via forward intersection and subsequent interpolation. These DSM are compared to the available, sufficiently accurate DEM/DSM provided by ICC and JRC. For the comparison, a 3D shift between the individual DSM is estimated via least squares adjustment. Height accuracies (in terms of 1σ) of 2.5-4 m are achieved.
The shifts between the orthoimages of the two looking directions, computed on the basis of the reference DEM and the computed DSM, are assessed via image matching and found to be in the sub-pixel range.
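The affine RPC correction described above amounts to a six-parameter fit in image space between RPC-projected GCP and their measured image positions. A minimal least-squares sketch (array shapes and function names are illustrative):

```python
import numpy as np

def fit_affine_correction(rpc_rowcol, measured_rowcol):
    """Estimate a 2D affine transform mapping RPC-predicted image
    coordinates onto measured GCP image coordinates, via linear least
    squares.

    Both inputs: (N, 2) arrays of (row, col), N >= 3 non-collinear GCP.
    Returns a (2, 3) matrix A with [row'; col'] = A @ [row, col, 1].
    """
    n = rpc_rowcol.shape[0]
    design = np.hstack([rpc_rowcol, np.ones((n, 1))])        # (N, 3)
    A, *_ = np.linalg.lstsq(design, measured_rowcol, rcond=None)
    return A.T                                               # (2, 3)

def apply_affine(A, rowcol):
    """Apply the (2, 3) affine correction to (N, 2) image coordinates."""
    n = rowcol.shape[0]
    return (A @ np.hstack([rowcol, np.ones((n, 1))]).T).T
```

After the fit, the residuals `apply_affine(A, rpc_rowcol) - measured_rowcol` are the sub-pixel vectors reported in the paper.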

Region based forest change detection from CARTOSAT-1 stereo imagery
Tree height is a fundamental parameter for describing forest conditions and changes. The latest developments in automatic Digital Surface Model (DSM) generation techniques allow new approaches to forest change detection from satellite stereo imagery. This paper shows how DSMs can support change detection in forest areas. A novel region based forest change detection method is proposed using single-channel CARTOSAT-1 stereo imagery. In the first step, DSMs from two dates are generated using automatic matching techniques. After co-registration and normalisation using LiDAR data, mean-shift segmentation is applied to the original pan images, and the images of both dates are classified into forest and non-forest areas by analysing their histograms and height differences. In the second step, a rough forest change detection map is generated based on the comparison of the two forest maps. Then the GLCM texture from the nDSM and the Cartosat-1 images of the resulting regions is analysed and compared, and the real changes are extracted by SVM-based classification.
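The rough change map of the second step can be sketched as a joint test on the classification labels and the normalised heights. This is a pixel-wise simplification of the region-based method; the 5 m threshold is an illustrative assumption, not a value from the paper.

```python
import numpy as np

def rough_forest_change(forest_t1, forest_t2, ndsm_t1, ndsm_t2,
                        min_height_diff=5.0):
    """Rough change map: pixels whose forest/non-forest label flipped AND
    whose canopy height changed by more than min_height_diff metres.

    forest_t1/t2 : boolean forest masks for the two dates
    ndsm_t1/t2   : normalised DSMs (heights above the LiDAR-derived terrain)
    """
    label_change = forest_t1 != forest_t2
    height_change = np.abs(ndsm_t2 - ndsm_t1) > min_height_diff
    return label_change & height_change
```

In the paper, the candidate regions from this step are then verified with GLCM texture features and an SVM classifier to separate real changes from matching artefacts.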
