Graph learning is an effective dimensionality reduction (DR) approach for analyzing the intrinsic properties of high-dimensional data, and it has been widely used for DR of hyperspectral image (HSI) data. However, existing graph learning methods ignore the collaborative relationship between sample pairs. In this paper, a novel supervised spectral DR method called local constrained manifold structure collaborative preserving embedding (LMSCPE) is proposed for HSI classification. First, a novel local constrained collaborative representation (CR) model is designed based on CR theory, which obtains more effective collaborative coefficients to characterize the relationship between sample pairs. Then, an intraclass collaborative graph and an interclass collaborative graph are constructed to enhance the intraclass compactness and the interclass separability, and a local neighborhood graph is constructed to preserve the local neighborhood structure of the HSI. Finally, an optimal objective function is designed to obtain a discriminant projection matrix, from which discriminative features of the various land cover types can be extracted. LMSCPE can characterize the collaborative relationship between sample pairs and explore the intrinsic geometric structure of HSI. Experiments on three benchmark HSI data sets show that the proposed LMSCPE method is superior to state-of-the-art DR methods for HSI classification.
Hyperspectral imagery (HSI) captures reflectance values over a wide range of the electromagnetic spectrum for each pixel, so it can distinguish more subtle differences between land cover types than traditional multispectral imagery (MSI) [
Serving as a good tool for DR, graph learning methods have attracted increasing attention from researchers by mapping high-dimensional data into a lower-dimensional embedding space [
To overcome this drawback, researchers proposed a series of linear graph learning methods [
Recently, the graph learning methods based on sparse representation have achieved good classification performance [
To characterize the collaborative relationship and intrinsic structure of HSI data, we propose a novel supervised spectral DR method, termed local constrained manifold structure collaborative preserving embedding (LMSCPE), for HSI classification. The LMSCPE method makes full use of the collaborative relationship and local neighborhood information of HSI to extract discriminant features for classification. The main contributions of this paper are listed below: (1) Based on collaborative representation theory, we propose a novel local constrained CR model, which obtains more effective collaborative coefficients to characterize the relationship between sample pairs. (2) Following the graph embedding framework, an intraclass collaborative graph and an interclass collaborative graph are constructed to enhance the intraclass compactness and the interclass separability. (3) To preserve the local neighborhood structure of HSI, a local neighborhood graph is constructed by its
The paper organization is as follows:
For convenience, let us denote an HSI data set with
The graph embedding (GE) framework helps to redefine most DR algorithms in a unified framework; it characterizes the statistical and geometric properties of data by an intrinsic graph
The GE framework aims to represent a graph in a low-dimensional space that preserves as much graph property information as possible. The optimal objective function is given by a graph-preserving criterion:
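In matrix form, this pairwise criterion collapses to a Laplacian quadratic form, which is the quantity most GE-based methods actually optimize. A minimal numpy sketch of this identity (variable and function names are ours, not the paper's):

```python
import numpy as np

def graph_laplacian(W):
    """Laplacian L = D - W of a symmetric affinity matrix W."""
    D = np.diag(W.sum(axis=1))
    return D - W

def ge_objective(Y, W):
    """Graph-preserving criterion:
    sum_{i,j} ||y_i - y_j||^2 W_ij = 2 tr(Y^T L Y),
    where the rows of Y are the low-dimensional coordinates."""
    L = graph_laplacian(W)
    return 2.0 * np.trace(Y.T @ L @ Y)
```

Minimizing this quadratic form under a scale constraint on a penalty graph is what yields the eigenvalue formulations used throughout the rest of the paper.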
In the hyperspectral imagery community, many representative graph learning methods have been proposed to preserve the intrinsic structure information of HSI data; these methods often employ sparse representation coefficients to characterize the relationships between samples. However, sparse representation is an iterative procedure, which is usually computationally expensive and often yields a suboptimal solution. Therefore, collaborative representation (CR) has received considerable attention recently [
The basic idea of CR is that a query sample can be reconstructed by a set of training samples, and the reconstructive coefficients achieved by the
With some mathematical operations, the collaborative representation vector
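In the standard unregularized-vs-sparse trade-off, CR has the familiar ridge-regression closed form α = (XᵀX + λI)⁻¹Xᵀy, which is what makes it so much cheaper than iterative sparse coding. The sketch below illustrates that closed form (the parameter λ and the function name are illustrative):

```python
import numpy as np

def cr_coefficients(X, y, lam=0.01):
    """Closed-form collaborative representation of a query y over a dictionary X.

    X: (D, n) matrix whose columns are training samples.
    y: (D,) query sample.
    Solves min_a ||y - X a||^2 + lam ||a||^2, whose solution is
    a = (X^T X + lam I)^{-1} X^T y.
    """
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
```

Because the solution is a single linear solve, the coefficients for every query can be obtained with one precomputed matrix factorization, in contrast to the per-sample iterations of sparse representation.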
To characterize the discriminative properties and intrinsic structure of HSI data, a local constrained manifold structure collaborative preserving embedding (LMSCPE) method is proposed for DR. LMSCPE first designs a novel CR model to discover the collaborative relationship between samples that belong to the same class. Based on the collaborative representation coefficients, it constructs an intraclass collaborative graph and an interclass collaborative graph to characterize the intraclass compactness and the interclass separability. Then, it selects neighbors to construct a local neighborhood graph, which effectively preserves the local geometric structure of the HSI. After that, the collaborative graphs and the local neighborhood graph are incorporated to learn an effective projection matrix. LMSCPE can preserve the local neighborhood structure of HSI and enhance the discriminative power of the embedding features. The flowchart of the proposed LMSCPE method is shown in
Due to nonlinear optical effects as the spectrum is transmitted through the atmosphere, such as reflection, absorption, and dispersion, the spectral curves of pixels of the same category usually exhibit subtle differences, which affects the final classification performance of HSI [
In the proposed LCCR model, we incorporate locality-constrained terms into collaborative representation, and the minimization problem can be formulated as
In (
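To build intuition for how a locality constraint changes the closed form, the sketch below weights the ℓ2 penalty on each coefficient by the distance between the query and the corresponding atom, in the spirit of locality-constrained coding. This is a plausible instantiation for illustration only, not necessarily the exact penalty of the LCCR model above:

```python
import numpy as np

def lccr_coefficients(X, y, lam=0.01):
    """Illustrative locality-constrained CR (assumed variant, not
    necessarily the exact LMSCPE formulation): the l2 penalty on each
    coefficient is scaled by the distance between the query y and the
    corresponding dictionary atom, so distant atoms are discouraged
    from contributing.

    Solves min_a ||y - X a||^2 + lam ||diag(d) a||^2,
    with d_j = ||y - x_j||_2, giving
    a = (X^T X + lam diag(d)^2)^{-1} X^T y.
    """
    d = np.linalg.norm(X - y[:, None], axis=0)   # locality adaptor per atom
    G = X.T @ X + lam * np.diag(d ** 2)
    return np.linalg.solve(G, X.T @ y)
```

The key point is that the locality term only changes the diagonal of the regularizer, so the model keeps the closed-form, non-iterative character of plain CR.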
According to the LCCR model, we construct the intraclass collaborative graph and the interclass collaborative graph to characterize the intraclass compactness and the interclass separability. For the intraclass collaborative graph, the weight matrix is set to the collaborative coefficients between each point and the points from the same class. Denote the samples of the
Therefore, the intraclass collaborative representation coefficients of
With some mathematical operations, the collaborative representation coefficients
After obtaining the collaborative coefficients of each class, the intraclass weight matrix
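The assembly of the intraclass weight matrix can be sketched as follows: each sample is represented over the remaining samples of its own class (leave-one-out), and the resulting coefficients fill the corresponding row, with zeros for samples from other classes. Plain ℓ2-regularized CR is used here for simplicity; in LMSCPE the LCCR coefficients would be substituted:

```python
import numpy as np

def intraclass_weights(X, labels, lam=0.01):
    """Assemble an intraclass collaborative weight matrix W.

    X: (D, n) data matrix, columns are samples.
    labels: (n,) class labels.
    Row i of W holds the CR coefficients of x_i over the other
    samples of its class; entries for other classes stay zero.
    """
    n = X.shape[1]
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.where((labels == labels[i]) & (np.arange(n) != i))[0]
        Xi = X[:, idx]
        a = np.linalg.solve(Xi.T @ Xi + lam * np.eye(len(idx)), Xi.T @ X[:, i])
        W[i, idx] = a
    return W
```

The interclass weight matrix is built in the same way, except that each sample is represented over the interclass dictionary instead of its own class.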
For the interclass collaborative graph, the interclass dictionary
Then, the interclass collaborative representation coefficients of
With some mathematical operations, the collaborative representation coefficients
Considering that the CR coefficients can effectively characterize the similarity relationship between sample pairs, we adopt the coefficients to construct the interclass weight matrix
Therefore, based on the intraclass collaborative graph
Similarly, according to interclass collaborative graph
For the purpose of seeking an optimal projection, it is natural to minimize the intraclass compactness and maximize the interclass variance simultaneously, that is
Because the spectral curves of HSI pixels are easily affected by the external environment and imaging equipment, the actually obtained spectral curves of each category exhibit a certain degree of difference, which leads to degraded classification performance [
In local neighborhood graph
To enhance the aggregation of data in the local neighborhood structure, each point and its neighboring points are adopted to formulate the optimization problem in the low-dimensional embedding space
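A common way to realize such a local neighborhood graph is a symmetric k-nearest-neighbor graph with heat-kernel weights; the sketch below uses that standard construction (the exact neighbor selection and weighting in LMSCPE may differ):

```python
import numpy as np

def knn_heat_graph(X, k=5, t=1.0):
    """Symmetric k-nearest-neighbor graph with heat-kernel weights.

    X: (D, n) data matrix, columns are samples.
    Returns an (n, n) affinity matrix W with W_ij = exp(-||x_i - x_j||^2 / t)
    when x_j is among the k nearest neighbors of x_i (or vice versa).
    """
    n = X.shape[1]
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                    # skip the point itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    return np.maximum(W, W.T)                                # symmetrize
```

Minimizing the Laplacian quadratic form of this graph in the embedding space pulls each point toward its neighbors, which is exactly the aggregation effect described above.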
In addition, to separate the local graph structures of different classes as far as possible, the optimization problem between different classes can be designed as follows.
To explore the local neighborhood structure in the low-dimensional embedding space, we minimize the local neighborhood scatter matrix and maximize the total scatter matrix. Therefore, the optimal projection matrix
For high-dimensional data, the collaborative relationship and local neighborhood structure between pixel pairs should be discovered simultaneously. Therefore, based on the optimization problems of (
With the Lagrangian multiplier method, the solution of (
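Once the objective is written as a ratio of two quadratic forms, the Lagrangian condition reduces to a generalized eigenvalue problem Av = λBv, and the projection matrix stacks the leading eigenvectors. A sketch using scipy (the matrix names A and B are generic placeholders for the numerator and denominator scatter matrices, not the paper's notation):

```python
import numpy as np
from scipy.linalg import eigh

def solve_projection(A, B, dim):
    """Solve the generalized eigenproblem A v = lambda B v and return
    the projection matrix formed by the eigenvectors of the `dim`
    largest eigenvalues. Whether the largest or smallest eigenvalues
    are kept depends on how the trace-ratio objective is posed.

    A: symmetric matrix; B: symmetric positive definite matrix.
    """
    vals, vecs = eigh(A, B)                # generalized symmetric eigensolver
    order = np.argsort(vals)[::-1]         # descending eigenvalues
    return vecs[:, order[:dim]]
```

In practice a small ridge term is often added to B to guarantee positive definiteness when the number of training samples is small relative to the number of bands.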
In this section, three public HSI data sets are adopted to demonstrate the effectiveness of LMSCPE by comparing it with several state-of-the-art DR algorithms.
PaviaU data set: This data set was acquired by the ROSIS-03 sensor over the University of Pavia, Italy. The spatial size of the scene is 610 × 340 pixels with a spatial resolution of 1.3 m/pixel, and each pixel contains 115 spectral bands ranging from 0.43 to 0.86
LongKou data set: This data set was captured by a Headwall Nano-Hyperspec imaging sensor on a UAV platform in Longkou Town, Hubei Province, China. The full scene contains 550 × 400 pixels, and each pixel contains 270 spectral bands ranging from 400 to 1000 nm. The data set possesses 9 different land cover types, and the spatial resolution is about 0.463 m.
MUUFL data set: This data set was captured by the ITRES CASI-1500 sensor over the University of Southern Mississippi's Gulfpark Campus. The full scene contains 325 × 220 pixels with a spatial resolution of 1 m, and the number of spectral bands is 72. After removing 8 bands affected by noise, we adopt the remaining 64 spectral bands for classification.
In this section, the HSI data were randomly divided into training and test sets. The training set was adopted to learn a DR model, while the test set was used to verify the validity of the model. Then, we employed the nearest neighbor (NN) classifier for classification. Four performance indicators were used to evaluate the effectiveness of the algorithms: the classification accuracy of each class (CA), the overall classification accuracy (OA), the average classification accuracy (AA), and the kappa coefficient (KC) [
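These indicators can all be derived from the confusion matrix. For reference, a compact implementation of OA, AA, and KC using their standard definitions (this is our own sketch, not code from the paper):

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Compute OA, AA, and the kappa coefficient from label vectors."""
    classes = np.unique(y_true)
    n = len(y_true)
    # confusion matrix C[i, j]: samples of true class i predicted as class j
    C = np.zeros((len(classes), len(classes)))
    for t, p in zip(y_true, y_pred):
        C[np.searchsorted(classes, t), np.searchsorted(classes, p)] += 1
    oa = np.trace(C) / n                              # overall accuracy
    aa = np.mean(np.diag(C) / C.sum(axis=1))          # mean per-class accuracy
    pe = (C.sum(axis=1) @ C.sum(axis=0)) / n ** 2     # expected chance agreement
    kc = (oa - pe) / (1 - pe)                         # Cohen's kappa
    return oa, aa, kc
```

OA weights every test sample equally, while AA weights every class equally, which is why the two can diverge sharply on imbalanced scenes such as those used later in this section.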
In the experiments, we compared LMSCPE with several state-of-the-art DR algorithms, including principal component analysis (PCA), locality preserving projection (LPP), linear discriminant analysis (LDA), local Fisher discriminant analysis (LFDA), local geometric structure Fisher analysis (LGSFA), sparse graph-based discriminant analysis (SGDA), collaborative graph-based discriminant analysis (CGDA), and low-rank graph-based discriminant analysis (LGDA). Among them, PCA and LPP are unsupervised DR algorithms that only consider the spectral similarity between different pixels. The latter six algorithms are supervised DR algorithms, which simultaneously consider the spectral similarity and category information of pixels. In addition, the RAW method was included for comparison with the DR algorithms; it indicates that the test set was classified by the NN classifier directly, without any DR process.
To demonstrate the effectiveness of the proposed LMSCPE algorithm, the parameters of all the DR models were optimized to achieve higher classification accuracies. For LPP and LFDA, the number of neighbors was set to 7. For LGSFA, the numbers of intraclass and interclass neighbors were set as
To analyze the influence of the neighbor number
As shown in
To evaluate the classification performance of the proposed LMSCPE method under different regularization parameters of
From
In the experiments, the tradeoff parameter
According to
For the abovementioned DR methods, the value of the embedding dimension
From
To analyze the influence of training sample size on the classification performance,
As shown in
Considering that the land cover types of HSI suffer from sample imbalance in practical scenes, we constructed the training set by selecting a certain proportion of training samples from each class, and the remaining samples were adopted as the test set. In the experiments, we set the proportion to 1% for the PaviaU data set, 0.2% for the LongKou data set, and 1% for the MUUFL data set. For convenience of comparison, we list the CAs, OAs, AAs, and KCs of the different algorithms in
From
For the classification task, both classification accuracy and running time are important performance evaluation indicators. To analyze the computational complexity of LMSCPE, denote the number of neighbors of each sample as
To quantitatively evaluate the computational efficiency of the abovementioned DR methods, we report the running time of all DR algorithms in
As shown in
Hyperspectral images (HSIs) contain abundant spectral information that can accurately distinguish the subtle differences between different pixels. However, the high dimensionality of HSIs poses a huge challenge to land cover classification. Traditional graph learning algorithms cannot effectively characterize the collaborative relationship between sample pairs, which leads to degraded classification performance for HSI. In this paper, we designed a supervised spectral DR method termed local constrained manifold structure collaborative preserving embedding (LMSCPE) for HSI classification. LMSCPE first adopts a novel local constrained CR model to obtain more effective collaborative coefficients between sample pairs. Then, two collaborative graphs are constructed to enhance the intraclass compactness and the interclass separability, and a local neighborhood graph is constructed to explore the local neighborhood structure of HSI. After that, an optimal objective function is designed to obtain a discriminant projection matrix, from which the embedding features of different land cover types can be obtained. Experiments on the PaviaU, LongKou, and MUUFL hyperspectral data sets demonstrate that LMSCPE is superior to several state-of-the-art methods. However, spectral-based DR methods only consider the spectral information in HSI, which limits the final classification performance. Therefore, in future work, we will consider fusing spatial information into the DR model to further improve the classification accuracy of HSI.
All the authors made significant contributions to the manuscript. G.S. designed the DR algorithm, completed the corresponding experiments, and finished the first manuscript draft. F.L. analyzed the results and performed the validation work. Y.T. and Y.L. edited the first manuscript and finalized the manuscript for communication. All authors have read and agreed to the published version of the manuscript.
This work was supported in part by the National Natural Science Foundation of China under Grant 61801336 and Grant 62071340, in part by the Fundamental Research Funds for the Central Universities under Grant 2042020kf0013, and in part by the China Postdoctoral Science Foundation under Grant 2019M662717 and Grant 2020T130480.
Not applicable.
Not applicable.
The data presented in this study are available on request from the corresponding author.
The authors would like to thank the anonymous reviewers and associate editor for their valuable comments and suggestions to improve the quality of the paper.
The authors declare no conflict of interest.
The flowchart of the proposed local constrained manifold structure collaborative preserving embedding (LMSCPE) method.
PaviaU hyperspectral image. (
LongKou hyperspectral image. (
MUUFL hyperspectral image. (
Classification results with different values of the neighbor number
Classification results with different regularization parameters
Classification results of LMSCPE with different tradeoff parameter
Classification results with different embedding dimensions
Classification maps for the whole scene on the PaviaU data set. (
Classification maps for the whole scene on the LongKou data set. (
Classification maps for the whole scene on the MUUFL data set. (
Classification results of each method with different training sample sizes on the PaviaU dataset [OA ± STD (%)(KC)]. (OA—overall classification accuracy, STD—standard deviation, KC—kappa coefficient, and the best results of a column are marked in bold).
Method  20  30  40  50  60 

RAW  68.77 ± 1.74 (0.607)  70.48 ± 1.93 (0.628)  71.21 ± 2.04 (0.637)  72.79 ± 1.03 (0.655)  73.29 ± 0.76 (0.661) 
PCA  68.75 ± 1.73 (0.607)  70.44 ± 1.94 (0.628)  71.23 ± 1.97 (0.637)  72.77 ± 1.01 (0.655)  73.28 ± 0.79 (0.661) 
LPP  66.12 ± 0.91 (0.578)  70.91 ± 2.90 (0.634)  72.56 ± 1.48 (0.654)  74.58 ± 0.89 (0.677)  76.13 ± 1.05 (0.695) 
LDA  67.28 ± 4.54 (0.589)  72.53 ± 2.21 (0.652)  75.38 ± 0.85 (0.683)  77.36 ± 1.10 (0.709)  77.57 ± 1.56 (0.711) 
LFDA  61.88 ± 2.44 (0.525)  69.95 ± 2.84 (0.622)  74.03 ± 1.80 (0.670)  75.39 ± 1.02 (0.687)  77.42 ± 1.89 (0.711) 
LGSFA  65.08 ± 1.96 (0.560)  71.02 ± 1.71 (0.634)  73.01 ± 1.51 (0.657)  75.18 ± 3.02 (0.683)  75.60 ± 0.98 (0.686) 
SGDA  73.04 ± 1.19 (0.659)  75.62 ± 1.77 (0.690)  76.13 ± 2.33 (0.696)  78.43 ± 1.29 (0.723)  79.06 ± 1.15 (0.730) 
LGDA  71.55 ± 1.90 (0.642)  73.85 ± 1.60 (0.669)  75.30 ± 2.44 (0.682)  75.65 ± 0.93 (0.690)  77.25 ± 0.81 (0.709) 
CGDA  72.22 ± 2.60 (0.651)  75.20 ± 1.03 (0.686)  76.03 ± 2.62 (0.696)  77.20 ± 1.44 (0.709)  78.95 ± 1.47 (0.730) 
LMSCPE  …  …  …  …  …
Classification results of each method with different training sample sizes on the LongKou dataset [OA ± STD (%)(KC)]. (OA—overall classification accuracy, STD—standard deviation, KC—kappa coefficient, and the best results of a column are marked in bold).
Method  20  30  40  50  60 

RAW  79.39 ± 1.52 (0.740)  81.39 ± 0.68 (0.765)  82.20 ± 0.84 (0.775)  82.49 ± 0.78 (0.778)  83.53 ± 0.89 (0.791) 
PCA  79.38 ± 1.53 (0.740)  81.38 ± 0.68 (0.764)  82.18 ± 0.84 (0.774)  82.48 ± 0.80 (0.778)  83.52 ± 0.91 (0.791) 
LPP  66.70 ± 1.81 (0.592)  73.26 ± 0.39 (0.668)  76.66 ± 1.38 (0.708)  80.06 ± 1.69 (0.749)  82.45 ± 0.67 (0.777) 
LDA  83.94 ± 1.93 (0.795)  85.60 ± 0.74 (0.816)  87.73 ± 0.68 (0.843)  89.74 ± 1.06 (0.868)  91.16 ± 0.63 (0.886) 
LFDA  77.76 ± 2.43 (0.719)  83.43 ± 2.33 (0.789)  83.02 ± 0.55 (0.784)  89.47 ± 0.57 (0.865)  91.68 ± 0.47 (0.893) 
LGSFA  83.46 ± 0.66 (0.789)  83.54 ± 2.16 (0.790)  85.78 ± 1.13 (0.818)  90.12 ± 0.36 (0.873)  91.79 ± 0.40 (0.894) 
SGDA  87.47 ± 2.04 (0.839)  89.05 ± 0.65 (0.859)  89.93 ± 0.91 (0.870)  90.59 ± 0.65 (0.879)  91.62 ± 0.44 (0.892) 
LGDA  84.73 ± 2.62 (0.805)  85.02 ± 0.62 (0.809)  85.50 ± 1.23 (0.815)  85.79 ± 0.50 (0.819)  86.77 ± 0.53 (0.831) 
CGDA  88.03 ± 2.82 (0.847)  89.61 ± 0.94 (0.866)  89.92 ± 1.01 (0.870)  90.17 ± 0.72 (0.874)  90.99 ± 0.69 (0.884) 
LMSCPE  …  …  …  …  …
Classification results of each method with different training sample sizes on the MUUFL dataset [OA ± STD (%)(KC)]. (OA—overall classification accuracy, STD—standard deviation, KC—kappa coefficient, and the best results of a column are marked in bold).
Method  20  30  40  50  60 

RAW  68.72 ± 3.60 (0.610)  70.55 ± 1.02 (0.632)  71.41 ± 1.19 (0.641)  72.58 ± 1.36 (0.655)  72.70 ± 0.40 (0.656) 
PCA  68.71 ± 3.58 (0.610)  70.55 ± 1.02 (0.632)  71.42 ± 1.17 (0.642)  72.57 ± 1.32 (0.654)  72.74 ± 0.40 (0.657) 
LPP  63.08 ± 4.46 (0.545)  67.32 ± 1.70 (0.594)  68.86 ± 2.25 (0.612)  70.90 ± 1.29 (0.635)  72.16 ± 1.01 (0.649) 
LDA  62.23 ± 4.26 (0.532)  65.65 ± 1.70 (0.570)  65.76 ± 2.23 (0.571)  66.89 ± 0.77 (0.583)  67.04 ± 1.18 (0.586) 
LFDA  62.65 ± 4.03 (0.537)  65.68 ± 2.88 (0.574)  68.45 ± 2.09 (0.606)  70.89 ± 1.25 (0.636)  71.56 ± 0.74 (0.643) 
LGSFA  63.04 ± 4.68 (0.544)  67.32 ± 1.38 (0.593)  68.24 ± 2.50 (0.605)  69.97 ± 2.23 (0.625)  70.42 ± 0.73 (0.630) 
SGDA  70.03 ± 3.78 (0.625)  72.61 ± 1.10 (0.656)  73.32 ± 0.95 (0.664)  73.87 ± 1.36 (0.671)  74.89 ± 0.42 (0.682) 
LGDA  71.33 ± 3.75 (0.640)  72.71 ± 0.72 (0.658)  73.54 ± 0.86 (0.667)  74.11 ± 0.20 (0.673)  74.34 ± 1.19 (0.676) 
CGDA  71.12 ± 3.52 (0.638)  72.40 ± 0.98 (0.654)  73.04 ± 1.08 (0.661)  74.62 ± 1.33 (0.679)  74.71 ± 0.44 (0.680) 
LMSCPE  …  …  …  …  …
Classification results (%) of each type of land covers (t = 1%) with the nearest neighbor (NN) classifier on the PaviaU data set. (AA—average classification accuracy, OA—overall classification accuracy, KC—kappa coefficient, and the best results of a row are marked in bold).
Class  Land Covers  Training  Test  RAW  PCA  LPP  LDA  LFDA  LGSFA  SGDA  LGDA  CGDA  LMSCPE 

1  Asphalt  66  6565  84.54  84.42  84.74  84.49  69.72  84.04  85.35  84.78  88.26  …
2  Meadows  186  18,463  89.87  89.84  89.43  88.93  77.56  …  89.60  90.88  92.83  95.54
3  Gravel  21  2078  49.09  48.89  41.29  38.69  48.85  31.57  52.17  56.59  60.59  …
4  Trees  31  3033  75.70  75.77  79.59  86.25  …  88.16  75.37  78.37  78.57  85.23
5  Metal  13  1332  98.57  98.57  99.17  99.62  99.02  …  99.10  98.57  99.32  99.55
6  Soil  50  4979  54.25  54.35  53.99  55.79  66.98  38.24  56.92  61.54  66.94  …
7  Bitumen  13  1317  64.54  64.77  51.86  20.12  51.94  32.73  65.60  63.86  …  63.33
8  Bricks  37  3645  72.98  72.62  65.24  62.09  64.20  73.39  73.77  73.20  …  76.27
9  Shadows  10  937  …  …  99.79  88.26  …  98.40  …  99.15  99.36  99.79
AA  76.60  76.57  73.90  69.36  74.53  71.46  77.53  78.55  81.89  …
OA  80.09  80.05  78.76  77.56  73.99  80.28  80.66  81.97  84.95  …
KC  73.29  73.23  71.55  70.02  66.59  72.92  74.10  75.87  79.84  …
Classification results (%) of each type of land covers (t = 0.2%) with the NN classifier on the LongKou data set. (AA—average classification accuracy, OA—overall classification accuracy, KC—kappa coefficient, and the best results of a row are marked in bold).
Class  Land Covers  Training  Test  RAW  PCA  LPP  LDA  LFDA  LGSFA  SGDA  LGDA  CGDA  LMSCPE 

1  Corn  69  34,442  93.36  93.30  82.20  93.03  94.56  96.90  95.46  96.85  97.77  …
2  Cotton  17  8357  47.24  47.24  34.79  47.79  52.02  56.19  53.05  45.14  54.83  …
3  Sesame  10  3021  37.80  37.54  15.33  33.40  32.54  65.18  40.19  56.04  64.71  …
4  Broadleaf soybean  126  63,086  87.33  87.32  80.98  88.89  90.57  95.36  89.74  90.09  92.22  …
5  Narrowleaf soybean  10  4141  53.39  53.34  27.58  33.30  35.69  57.38  54.94  74.50  76.33  …
6  Rice  24  11,830  90.51  90.45  78.88  98.08  97.98  95.58  91.62  96.13  …  99.80
7  Water  134  66,922  99.93  99.93  99.95  99.92  99.91  …  99.93  99.94  99.91  99.91
8  Roads and houses  14  7110  75.09  75.08  60.03  62.57  67.33  79.54  76.17  83.36  …  85.23
9  Mixed weed  10  5219  28.68  28.66  29.70  60.80  57.14  58.59  34.68  50.89  59.11  …
AA  68.15  68.10  56.60  68.64  69.75  78.30  70.64  76.99  81.14  …
OA  87.67  87.65  81.30  88.47  89.52  92.84  89.33  90.91  92.78  …
KC  83.80  83.77  75.47  84.78  86.13  90.52  85.96  88.03  90.49  …
Classification results (%) of each type of land covers (t = 1%) with the NN classifier on the MUUFL data set. (AA—average classification accuracy, OA—overall classification accuracy, KC—kappa coefficient, and the best results of a row are marked in bold).
Class  Land Covers  Training  Test  RAW  PCA  LPP  LDA  LFDA  LGSFA  SGDA  LGDA  CGDA  LMSCPE 

1  Trees  465  22,781  88.98  89.02  89.41  90.06  81.71  89.83  89.06  89.65  89.81  …
2  Mostly grass  85  4185  67.78  67.68  65.65  60.82  67.64  65.27  67.14  …  69.43  70.43
3  Mixed ground surface  138  6744  62.64  62.70  65.14  69.03  53.00  64.60  63.09  65.49  66.84  …
4  Dirt/sand  37  1789  58.13  58.08  64.88  62.39  …  72.84  57.52  62.94  67.87  71.24
5  Road  134  6553  86.60  86.66  83.56  72.01  66.89  77.92  86.27  …  87.34  87.70
6  Water  10  456  77.19  76.97  73.90  27.41  53.51  49.78  76.75  …  78.29  76.75
7  Building shadow  45  2188  61.28  60.83  57.49  30.08  52.92  41.75  60.83  62.78  64.36  …
8  Buildings  125  6115  77.78  77.70  75.17  69.96  64.45  76.21  77.97  82.03  …  82.10
9  Sidewalk  28  1357  43.25  43.25  44.42  32.82  …  41.06  43.76  45.95  43.03  41.79
10  Yellow curb  10  173  47.40  46.82  43.35  78.03  …  82.08  47.98  57.23  53.76  70.52
11  Cloth panels  10  259  88.80  88.80  89.96  …  94.59  94.98  88.42  89.19  87.64  89.58
AA  69.08  68.96  68.45  62.51  68.05  68.76  68.98  72.30  71.96  …
OA  78.70  78.69  78.42  74.98  70.84  77.39  78.69  80.65  80.93  …
KC  72.06  72.04  71.65  66.61  62.44  70.05  72.04  74.64  74.97  …
Computational time (in seconds) of the different algorithms on the PaviaU, LongKou, and MUUFL data sets.
Dataset  PCA  LPP  LDA  LFDA  LGSFA  SGDA  LGDA  CGDA  LMSCPE 

PaviaU  0.025  0.031  0.013  0.037  0.292  0.659  3.009  0.315  2.650 
LongKou  0.014  0.124  0.023  0.040  0.616  0.308  1.655  0.112  0.887 
MUUFL  0.031  0.042  0.012  0.019  0.356  0.287  2.038  0.125  1.038 