Technical Report

Airborne hyperspectral data over Chikusei


Abstract

Airborne hyperspectral datasets were acquired by Hyperspec-VNIR-C (Headwall Photonics Inc.) over agricultural and urban areas in Chikusei, Ibaraki, Japan, on July 29, 2014, as one of the flight campaigns supported by KAKENHI 24360347. This technical report summarizes the experiment. The hyperspectral data and ground truth were made available to the scientific community.
Airborne hyperspectral data over Chikusei
Naoto Yokoya and Akira Iwasaki
E-mail: {yokoya, aiwasaki}
May 27, 2016
The Headwall Hyperspec-VNIR-C imaging sensor and a Canon EOS 5D Mark II (see Figure 1.1) were used for flight campaigns supported by the project "Multidimensional superresolution of remote sensing data via information fusion" (KAKENHI 24360347) to obtain hyperspectral and color images from the same platform. The two sensors were mounted on the same platform together with a GPS/IMU, as shown in Figure 1.2. The main specifications of the two imaging sensors are summarized in Table 1.1.
Hyperspec-VNIR-C comprises Headwall's original spectrograph and a PCO pixelfly qe CCD camera. The pixelfly qe detector has 1024 × 1392 pixels in the spectral and cross-track directions. The Hyperspec sensor covers the wavelength range from 363 nm to 1018 nm with a spectral resolution of 1.29 nm. The frame rate is 12 fps by default and can be increased to 23 fps with a 2x vertical binning mode, which records each frame at 512 × 696 pixels. To improve the along-track GSD, we used the binning mode and set the frame rate to 23 fps. The along-track GSD is then 2.66 m at a ground speed of 220 km/h, and the cross-track GSD is 0.97 m with the binning mode. In this case, the pixel aspect ratio is approximately 11:4.

Figure 1.1: (a) Hyperspec-VNIR-C (Headwall Photonics Inc.) and (b) EOS 5D Mark II (Canon).

Figure 1.2: Layout of (a) Hyperspec-VNIR-C, EOS 5D Mark II, and (b) IMU.
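The along-track figures follow directly from the ground speed and frame rate of a pushbroom sensor; a quick check using the values reported above:

```python
# Along-track GSD of a pushbroom sensor: ground distance covered per frame.
ground_speed_kmh = 220.0                              # average ground speed
ground_speed_ms = ground_speed_kmh * 1000.0 / 3600.0  # ~61.1 m/s

for fps, label in [(12, "no binning"), (23, "2x binning")]:
    gsd_along = ground_speed_ms / fps
    print(f"{label}: along-track GSD = {gsd_along:.2f} m")

# Pixel aspect ratio with binning: 2.66 m / 0.97 m, close to 11:4
aspect = (ground_speed_ms / 23) / 0.97
print(f"aspect ratio = {aspect:.2f} (11/4 = {11 / 4:.2f})")
```

Running this reproduces the 5.09 m (12 fps) and 2.66 m (23 fps) along-track GSDs of Table 1.1.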
Airborne hyperspectral datasets were acquired by the Hyperspec-VNIR-C imaging sensor over agricultural and urban areas in Chikusei, Ibaraki, Japan, on July 29, 2014, between 9:56 and 10:53 (UTC+9). The flightlines recorded by GPS are shown in Figure 2.1. There are thirteen flightlines parallel to the north-south direction and one flightline parallel to the east-west direction. The thirteen north-south flightlines were overlapped by approximately 35% of each swath to reduce BRDF effects in the mosaic data. The average
ground speed was 119 kt (220 km/h) and the average height of the sensor above ground was approximately 900 m. Therefore, the along-track and cross-track GSDs were 2.66 m and 0.97 m, respectively. The EOS 5D Mark II acquired a high-resolution color image every 2.2 s together with the hyperspectral data.

Table 1.1: Specifications of Hyperspec-VNIR-C and EOS 5D Mark II with a 900 m flight height and 220 km/h flight speed. (·) indicates specifications with binning.

Sensor                Hyperspec      EOS 5D Mark II
Size of detector      1024 × 1392    3744 × 5616
Size of pixel (µm)    6.45           6.4
Focal length (mm)     12             35
FOV (degree)          41.0           54.4
Frame rate (fps)      12 (23)        -
Cross-track GSD (m)   0.48 (0.97)    0.16
Along-track GSD (m)   5.09 (2.66)    0.16
Swath (m)             673            924
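The cross-track GSDs and swaths in Table 1.1 can likewise be reproduced from simple pinhole-camera geometry (pixel size, focal length, FOV, and flight height, all taken from the table):

```python
import math

def cross_track_gsd(pixel_um, focal_mm, height_m):
    # Ground footprint of one detector pixel at nadir: (p / f) * H
    return pixel_um * 1e-6 / (focal_mm * 1e-3) * height_m

def swath_width(fov_deg, height_m):
    # Ground width imaged across track: 2 * H * tan(FOV / 2)
    return 2.0 * height_m * math.tan(math.radians(fov_deg) / 2.0)

H = 900.0  # flight height above ground (m)
print(f"Hyperspec cross-track GSD: {cross_track_gsd(6.45, 12, H):.2f} m "
      f"({2 * cross_track_gsd(6.45, 12, H):.2f} m with 2x binning)")
print(f"EOS 5D cross-track GSD:    {cross_track_gsd(6.4, 35, H):.2f} m")
print(f"Hyperspec swath: {swath_width(41.0, H):.0f} m, "
      f"EOS 5D swath: {swath_width(54.4, H):.0f} m")
```

This recovers 0.48 m (0.97 m binned) and 0.16 m for the two sensors, and swaths matching the table to within a meter of rounding.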
Figure 2.1: Flightlines in Google Earth.
The Hyperspec sensor recorded 512 bands as 12-bit DN values. Spectral binning was performed to reduce the number of bands to 128 and increase the signal-to-noise ratio. The DN datasets were converted to reflectance through a series of data corrections and mosaicked into one entire image. The correction procedure comprises radiometric correction, geometric correction, atmospheric correction, and BRDF correction, and was performed on each flightline image.
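The report does not specify the spectral binning operator; a common choice, assumed here for illustration, is averaging each run of four adjacent bands (512 / 128 = 4), trading spectral resolution for SNR:

```python
import numpy as np

def spectral_binning(cube, factor=4):
    """Average groups of `factor` adjacent bands along the last axis.
    cube: (rows, cols, bands) array; bands must be divisible by factor."""
    rows, cols, bands = cube.shape
    assert bands % factor == 0
    return cube.reshape(rows, cols, bands // factor, factor).mean(axis=3)

# Toy cube standing in for a 512-band flightline image
cube = np.random.rand(4, 5, 512).astype(np.float32)
binned = spectral_binning(cube)
print(binned.shape)  # (4, 5, 128)
```

Averaging four bands of independent noise improves the SNR by roughly a factor of two, which is the usual motivation for this step.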
Radiometric correction: Gain and offset radiometric calibration coefficients were measured for all 512 bands in a set of precise laboratory experiments. The at-sensor radiance datasets were retrieved from the DN datasets using these calibration parameters.
Geometric correction: The position and orientation data of the sensor were recorded for each frame by the GPS/IMU. The at-sensor radiance datasets were geometrically corrected and rectified to the UTM 54N projection with a 2.5 m grid.
Atmospheric correction: We used the ATCOR-4 program [1], version 6.3, for atmospheric correction. The atmospheric type used was a mid-latitude summer atmosphere with a rural aerosol model.
BRDF correction: BRDF correction was performed by the ATCOR-4 program.
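Of these four steps, the radiometric one is simple enough to state explicitly: a per-band linear mapping from DN to at-sensor radiance. A minimal sketch (the gain and offset values below are illustrative placeholders, not the laboratory coefficients used for this campaign):

```python
import numpy as np

def dn_to_radiance(dn, gain, offset):
    """Per-band linear calibration: L = gain * DN + offset.
    dn: (rows, cols, bands) uint16 cube; gain, offset: (bands,) arrays."""
    return dn.astype(np.float64) * gain + offset

bands = 512
gain = np.full(bands, 0.01)   # hypothetical gain per DN, one value per band
offset = np.zeros(bands)      # hypothetical offsets
dn = np.random.randint(0, 4096, size=(3, 3, bands), dtype=np.uint16)  # 12-bit DN
radiance = dn_to_radiance(dn, gain, offset)
print(radiance.shape)  # (3, 3, 512)
```

The geometric, atmospheric, and BRDF steps depend on the GPS/IMU stream and the ATCOR-4 program [1] and are not sketched here.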
We used the thirteen north-to-south flightlines to make the mosaic data. The mosaicked entire scene consists of 2517 × 2335 pixels. The central point of the scene is located at coordinates N, 140.008380 E. The color composite mosaic image is shown in Figure 3.1(a). All the flightlines were mosaicked using smoothly weighted averaging in overlapped areas so that the edges of the flightlines are seamless. Finally, we applied spectral polishing to the mosaic data using the ATCOR-4 program to mitigate spectral spikes due to non-optimal atmospheric correction.
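The smoothly weighted averaging can be realized as distance-based feathering: each flightline's contribution to an overlap pixel is weighted by how far that pixel lies from the flightline's edge. A sketch of blending two co-registered strips, assuming NaN marks off-strip pixels (the actual mosaicking tool used for this campaign is not specified in the report):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def feather_blend(strips):
    """Blend co-registered strips (each (H, W), NaN outside coverage)
    with weights proportional to distance from each strip's edge."""
    weights, imgs = [], []
    for s in strips:
        valid = ~np.isnan(s)
        w = distance_transform_edt(valid)  # 0 at strip edges, grows inward
        weights.append(w)
        imgs.append(np.where(valid, s, 0.0))
    wsum = np.sum(weights, axis=0)
    out = np.sum([w * im for w, im in zip(weights, imgs)], axis=0)
    return np.where(wsum > 0, out / np.maximum(wsum, 1e-12), np.nan)

# Two toy strips overlapping in the middle columns
a = np.full((5, 8), np.nan); a[:, :5] = 1.0
b = np.full((5, 8), np.nan); b[:, 3:] = 3.0
m = feather_blend([a, b])
print(m[2])  # ramps smoothly from 1.0 to 3.0 across the overlap
```

In the overlap, each strip's weight falls to zero exactly at its own edge, so the blended values meet each strip's interior values without a visible seam.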
Ground truth of 19 classes was collected via a field survey and visual inspection of the high-resolution color images obtained by the EOS 5D Mark II. The 19 classes comprise water, three types of bare soil, seven types of vegetation, and eight types of man-made objects. The ground truth is shown in Figure 3.1(b), and the names and numbers of ground truth pixels are listed in Table 4.1. A classification map obtained by Rotation Forest [2] using the ground truth is shown in Figure 3.1(c). The hyperspectral data and ground truth were made available to the scientific community in the ENVI and MATLAB formats at http://park.itc.u-
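Rotation Forest is not part of standard libraries such as scikit-learn; the following is a compact sketch of its core idea as described in [2], with PCA rotations fitted on disjoint random feature subsets feeding an ensemble of decision trees (a simplification of the full algorithm, shown for illustration only):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

class SimpleRotationForest:
    """Minimal Rotation Forest sketch: each tree sees the data rotated by
    PCAs fitted on disjoint random feature subsets."""
    def __init__(self, n_trees=10, n_subsets=4, seed=0):
        self.n_trees, self.n_subsets = n_trees, n_subsets
        self.rng = np.random.default_rng(seed)
        self.models = []

    def _rotation(self, X):
        d = X.shape[1]
        subsets = np.array_split(self.rng.permutation(d), self.n_subsets)
        R = np.zeros((d, d))
        for idx in subsets:
            pca = PCA().fit(X[:, idx])
            R[np.ix_(idx, idx)] = pca.components_.T  # rotate subset in place
        return R

    def fit(self, X, y):
        for _ in range(self.n_trees):
            R = self._rotation(X)
            tree = DecisionTreeClassifier(random_state=0).fit(X @ R, y)
            self.models.append((R, tree))
        return self

    def predict(self, X):
        votes = np.stack([t.predict(X @ R) for R, t in self.models])
        # majority vote over the trees
        return np.apply_along_axis(
            lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)

# Toy use: two separable classes in a 16-band feature space
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 16)), rng.normal(3, 1, (50, 16))])
y = np.repeat([0, 1], 50)
clf = SimpleRotationForest().fit(X, y)
print((clf.predict(X) == y).mean())
```

Per-subset rotation preserves tree diversity while keeping each rotated feature interpretable as a principal component of a small band group, which is the property [2] exploits for hyperspectral data.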
[1] R. Richter and D. Schläpfer, “Atmospheric / topographic correction for airborne imagery: ATCOR-4 User Guide,” DLR IB 565-02/16, Wessling, Germany, 2016.
[2] J. Xia, P. Du, X. He, and J. Chanussot, “Hyperspectral remote sensing image classification based on rotation forest,” IEEE Geoscience and Remote Sensing Letters, vol. 11, no. 1, pp. 239–243, 2014.
Figure 3.1:
(a) Color composite image, (b) ground truth, (c) Rotation Forest classification map,
and (d) legend of 19 classes.
Table 4.1: Names and numbers of samples in the ground truth
No. Name Pixels
1 Water 2845
2 Bare soil (school) 2859
3 Bare soil (park) 286
4 Bare soil (farmland) 4852
5 Natural plants 4297
6 Weeds in farmland 1108
7 Forest 20516
8 Grass 6515
9 Rice field (grown) 13369
10 Rice field (first stage) 1268
11 Row crops 5961
12 Plastic house 2193
13 Manmade (non-dark) 1220
14 Manmade (dark) 7664
15 Manmade (blue) 431
16 Manmade (red) 222
17 Manmade grass 1040
18 Asphalt 801
19 Paved ground 145
... Source domain: The Chikusei scene [44] was acquired by the Headwall Hyperspec-VNIR-C imaging sensor in Chikusei, Ibaraki, Japan, on 29 July 2014, and has 128 bands in the spectral range from 363 to 1018 nm. As shown in the false-colour image and ground truth map in Figure 4, the scene consists of 2517 × 2335 pixels, and the ground sampling distance was 2.5 m. ...
... The authors gratefully acknowledge Space Application Laboratory, Department of Advanced Interdisciplinary Studies, the University of Tokyo for providing the hyperspectral Chikusei data [44]. Moreover, this work was supported by the National Natural Science Foundation of China under Grant 62161160336 and Grant 42030111. ...
Recently, deep learning has achieved considerable results in the hyperspectral image (HSI) classification. However, most available deep networks require ample and authentic samples to better train the models, which is expensive and inefficient in practical tasks. Existing few‐shot learning (FSL) methods generally ignore the potential relationships between non‐local spatial samples that would better represent the underlying features of HSI. To solve the above issues, a novel deep transformer and few‐shot learning (DT‐FSL) classification framework is proposed, attempting to realize fine‐grained classification of HSI with only a few‐shot instances. Specifically, the spatial attention and spectral query modules are introduced to overcome the constraint of the convolution kernel and consider the information between long‐distance location (non‐local) samples to reduce the uncertainty of classes. Next, the network is trained with episodes and task‐based learning strategies to learn a metric space, which can continuously enhance its modelling capability. Furthermore, the developed approach combines the advantages of domain adaptation to reduce the variation in inter‐domain distribution and realize distribution alignment. On three publicly available HSI data, extensive experiments have indicated that the proposed DT‐FSL yields better results concerning state‐of‐the‐art algorithms.
... 1) Source Domain: The source domain dataset utilizes the Chikusei dataset. The Chikusei dataset was gathered over agricultural and urban areas in Chikusei, Ibaraki, Japan by a Headwall Hyperspec-VNIR-C imaging sensor, on July 29, 2014 [58]. It comprises 128 spectral bands with a spectrum of 363-1018 nm, comprises 2517 × 2335 pixels in which each has a spatial resolution of 2.5 m and comprises 19 unique land-cover categories. ...
In cross-domain hyperspectral image (HSI) classification, the labeled samples of the target domain are very limited, and it is a worthy attention to obtain sufficient class information from the source domain to categorize the target domain classes (both the same and new unseen classes). This article investigates this problem by employing few-shot learning (FSL) in a meta-learning paradigm. However, most existing cross-domain FSL methods extract statistical features based on convolutional neural networks (CNNs), which typically only consider the local spatial information among features, while ignoring the global information. To make up for these shortcomings, this paper proposes novel convolutional transformer-based few-shot learning (CTFSL). Specifically, FSL is first performed in the classes of source and target domains simultaneously to build the consistent scenario. Then, a domain aligner is set up to map the source and target domains to the same dimensions. In addition, a convolutional transformer (CT) network is utilized to extract local-global features. Finally, a domain discriminator is executed subsequently that can not only reduce domain shift, but also distinguish from which domain a feature originates. Experiments on three widely used hyperspectral image datasets indicate that the proposed CTFSL method is superior to the state-of-the-art cross-domain FSL methods and several typical HSI classification methods in terms of classification accuracy.
... (4) Chikusei: The Chikusei dataset is an aerial hyperspectral dataset captured by the Headwall Hyperspec-VNIR-C sensor in Chikusei, Japan on July 29, 2014 [52]. This dataset contains 128 spectral bands with wavelengths ranging from 363 to 1018 nm. ...
Recently, deep learning methods, especially convolutional neural networks (CNNs), have achieved good performance for hyperspectral image (HSI) classification. However, due to limited training samples of HSIs and the high volume of trainable parameters in deep models, training deep CNN-based models is still a challenge. To address this issue, this study investigates contrastive learning (CL) as a pre-training strategy for HSI classification. Specifically, a supervised contrastive learning (SCL) framework, which pre-trains a feature encoder using an arbitrary number of positive and negative samples in a pair-wise optimization perspective, is proposed. Additionally, three techniques for better generalization in the case of limited training samples are explored in the proposed SCL framework. First, a spatial–spectral HSI data augmentation method, which is composed of multiscale and 3D random occlusion, is designed to generate diverse views for each HSI sample. Second, the features of the augmented views are stored in a queue during training, which enriches the positives and negatives in a mini-batch and thus leads to better convergence. Third, a multi-level similarity regularization method (MSR) combined with SCL (SCL–MSR) is proposed to regularize the similarities of the data pairs. After pre-training, a fully connected layer is combined with the pre-trained encoder to form a new network, which is then fine-tuned for final classification. The proposed methods (SCL and SCL–MSR) are evaluated on four widely used hyperspectral datasets: Indian Pines, Pavia University, Houston, and Chikusei. The experiment results show that the proposed SCL-based methods provide competitive classification accuracy compared to the state-of-the-art methods.
Recently, super-resolution (SR) tasks for single hyperspectral images have been extensively investigated and significant progress has been made by introducing advanced deep learning-based methods. However, hyperspectral image SR is still a challenging problem because of the numerous narrow and successive spectral bands of hyperspectral images. Existing methods adopt the group reconstruction mode to avoid the unbearable computational complexity brought by the high spectral dimensionality. Nevertheless, the group data lose the spectral responses in other ranges and preserve the information redundancy caused by continuous and similar spectrograms, thus containing too little information. In this paper, we propose a novel single hyperspectral image SR method named GSSR, which pioneers the exploration of tweaking spectral band sequence to improve the reconstruction effect. Specifically, we design the group shuffle that leverages interval sampling to produce new groups for separating adjacent and extremely similar bands. In this way, each group of data has more varied spectral responses and less redundant information. After the group shuffle, the spectral-spatial feature fusion block is employed to exploit the spectral-spatial features. To compensate for the adjustment of spectral order by the group shuffle, the local spectral continuity constraint module is subsequently appended to constrain the features for ensuring the spectral continuity. Experimental results on both natural and remote sensing hyperspectral images demonstrate that the proposed method achieves the best performance compared to the state-of-the-art methods.
Few-shot learning provides a new way to solve the problem of insufficient training samples in hyperspectral classification. It can implement reliable classification under several training samples by learning meta-knowledge from similar tasks. However, most existing works perform frequency statistics, which may suffer from the prevalent uncertainty in point estimates with limited training samples. To overcome this problem, we reconsider the hyperspectral image few-shot classification (HSI-FSC) task as a hierarchical probabilistic inference from a Bayesian view and provide a careful process of meta-learning probabilistic inference. We introduce a prototype vector for each class as latent variables and adopt distribution estimates for them to obtain their posterior distribution. The posterior of the prototype vectors is maximized by updating the parameters in the model via the prior distribution of HSI and labeled samples. The features of the query samples are matched with prototype vectors drawn from the posterior, thus a posterior predictive distribution over the labels of query samples can be inferred via an amortized Bayesian variational inference approach. Experimental results on four datasets demonstrate the effectiveness of our method. Especially given only 3-5 labeled samples, the method achieves noticeable upgrades of overall accuracy against competitive methods.
Hyperspectral images (HSIs) are used in many fields thanks to the high spectral resolution they provide. Classifying HSIs is a challenging process precisely because of this high spectral resolution. In this context, the performance of many machine learning (ML) algorithms has been investigated for HSI classification. In particular, many network architectures based on convolutional neural networks (CNNs), a sub-branch of deep learning, have been developed specifically for HSI classification. Because of the high cost of hyperspectral imaging systems, datasets are difficult to obtain. In recent years, the cost of new-generation hyperspectral imaging systems developed for manned and unmanned aerial vehicles (UAVs) has been steadily decreasing, making it possible to obtain cost-effective HSIs with high spatial resolution. This study aims to examine the performance of various ML algorithms in classifying HSIs of different spectral and spatial resolutions obtained from various platforms. To this end, the satellite-based HyRANK Loukia, airborne Chikusei, and UAV-based WHU-Hi HanChuan images were classified using support vector machines, random forest, and CNN algorithms. The highest overall accuracy values for the three datasets, 87.78%, 99.82%, and 96.89% respectively, were obtained by the CNN.
Hyperspectral image (HSI) has received considerable attention in the field of target detection due to its powerful ability to capture the spectral information of land covers, and plenty of detection algorithms have been explored. However, these methods generally leverage the difference between the spectrum of the target to be detected and the background spectrum to accomplish target detection, and so are susceptible to the problem of spectral variability. In this article, we propose a global-to-local hierarchical detection algorithm for HSI (G2LHTD). Firstly, extended morphological attribute profile (EMAP) is first used to model global spatial texture information from HSI. Subsequently, a diverse-direction constrained energy minimization (D²CEM) detector is developed to consider the spatial information within eight neighborhoods around each pixel in HSI, yielding comprehensive local spatial information. More substantially, to effectively discriminate the neighborhood information in diverse directions, we devise an adaptive neighborhood feature aggregation (ANFA) strategy, which will comprehensively evaluate the significance of neighborhood information in diverse directions. As a result, the spatial features of HSI can be comprehensively considered for hyperspectral target detection (HTD). Extensive experiments, conducted on four standard datasets, demonstrate the effectiveness of the proposed method. The codes of this work will be available at for the sake of reproducibility.
Deep convolutional neural networks (CNNs) have made great progress in the super-resolution (SR) of hyperspectral images (HSIs). However, most methods utilize convolution to explore local features, and global features are ignored. It is expected that combining non-local mechanism with CNN will improve the performance of HSI SR. This paper presents a multi-level progressive HSI SR network. The dense non-local and local block (DNLB) is constructed to combine local and global features, which are used to reconstruct super-resolution images at each level. Due to the high dimension of HSI, original non-local methods produce memory-expensive attention maps. We develop a non-local channel attention block to extract the global features of HSIs efficiently. Spatial-spectral gradient is injected in the non-local attention block to obtain better details. Furthermore, the progressive learning mode based multi-level network is proposed to reconstruct HSI with fine details. A number of experiments demonstrate that our method can reconstruct hyperspectral images more accurately than existing methods.
Although natural image super-resolution methods have achieved impressive performance, single hyperspectral image super-resolution still remains a challenge due to the high dimensionality. In recent years, many single hyperspectral image super-resolution methods adopted the group-convolution strategy to design the network for reducing the computational burden. However, these methods still process all spectral bands at once during the deep feature extraction and reconstruction, which increases the difficulty of fully exploring the inherent data characteristic of hyperspectral images. Moreover, the advanced group-based methods make insufficient exploitation of complementary information contained in different bands, resulting in limited reconstruction performance. In this paper, we propose a novel group-based single hyperspectral image super-resolution method termed GELIN to reconstruct high-resolution images in a group-by-group manner, which alleviates the difficulty of feature extraction and reconstruction for hyperspectral images. Specifically, a spatial-spectral embedding learning module is designed to extract rewarding spatial details and explore the correlations among spectra simultaneously. Considering the high similarity among different bands, a neighboring group integration module is proposed to fully exploit the complementary information contained in neighboring image groups to recover missing details in the target image group. Experimental results on both natural and remote sensing hyperspectral datasets demonstrate that the proposed method is superior to other state-of-the-art methods both visually and metrically.
In this letter, an ensemble learning approach, Rotation Forest, has been applied to hyperspectral remote sensing image classification for the first time. The framework of Rotation Forest is to project the original data into a new feature space using transformation methods for each base classifier (decision tree), then the base classifier can train in different new spaces for the purpose of encouraging both individual accuracy and diversity within the ensemble simultaneously. Principal component analysis (PCA), maximum noise fraction, independent component analysis, and local Fisher discriminant analysis are introduced as feature transformation algorithms in the original Rotation Forest. The performance of Rotation Forest was evaluated based on several criteria: different data sets, sensitivity to the number of training samples, ensemble size and the number of features in a subset. Experimental results revealed that Rotation Forest, especially with PCA transformation, could produce more accurate results than bagging, AdaBoost, and Random Forest. They indicate that Rotation Forests are promising approaches for generating classifier ensemble of hyperspectral remote sensing.