Signal Processing 165 (2019) 186–196
Adaptive graph weighting for multi-view dimensionality reduction
Xinyi Xu^a, Yanhua Yang^c, Cheng Deng^a,*, Feiping Nie^b

a School of Electronic Engineering, Xidian University, Xi’an 710071, China
b OPTIMAL, Northwestern Polytechnical University, Xi’an 710072, China
c School of Computer Science and Technology, Xidian University, Xi’an 710071, China
Article history:
Received 8 January 2019
Revised 8 June 2019
Accepted 16 June 2019
Available online 24 June 2019

Keywords:
Multi-view learning
Adaptive graph weighting
Dimensionality reduction
Semi-supervised learning
Unsupervised learning

Abstract
Multi-view learning has become a flourishing topic in recent years since it can discover various informative structures with respect to disparate statistical properties. However, multi-view data fusion remains challenging when exploring a proper way to find shared yet complementary information. In this paper, we present an adaptive graph weighting scheme to conduct semi-supervised multi-view dimensionality reduction. Particularly, we construct a Laplacian graph for each view, and the final graph is approximately regarded as a centroid of these single-view graphs with different weights. Based on the learned graph, a simple yet effective linear regression function is employed to project data into a low-dimensional space. In addition, our proposed scheme can be well extended to an unsupervised version within a unified framework. Extensive experiments on varying benchmark datasets illustrate that our proposed scheme is superior to several state-of-the-art semi-supervised/unsupervised multi-view dimensionality reduction methods. Last but not least, we demonstrate that our proposed scheme provides a unified view to explain and understand a family of traditional schemes.
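To make the pipeline sketched in the abstract concrete, the following is a minimal NumPy sketch of the graph side: one affinity graph per view, fused into a single weighted "centroid" graph whose Laplacian would then drive the projection step. The Gaussian kNN construction, the function names, and the uniform weights are illustrative assumptions; in the paper the view weights are learned adaptively rather than fixed.

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_affinity(X, k=5, sigma=1.0):
    """Gaussian kNN affinity graph for one view; X is (n_samples, n_features).
    This kernel choice is an illustrative assumption, not the paper's rule."""
    D = cdist(X, X, metric="sqeuclidean")
    W = np.exp(-D / (2 * sigma ** 2))
    np.fill_diagonal(W, 0)
    # Keep only the k strongest neighbours per row, then symmetrize.
    weakest = np.argsort(W, axis=1)[:, :-k]
    np.put_along_axis(W, weakest, 0, axis=1)
    return (W + W.T) / 2

def weighted_centroid_graph(views, weights):
    """Fuse per-view graphs into one weighted 'centroid' graph and its
    unnormalized Laplacian. Here the weights are given; the paper learns them."""
    S = sum(w * knn_affinity(X) for w, X in zip(weights, views))
    L = np.diag(S.sum(axis=1)) - S
    return S, L

# Toy usage: two views of 100 samples; uniform weights stand in for the
# adaptively learned ones.
rng = np.random.default_rng(0)
views = [rng.normal(size=(100, 20)), rng.normal(size=(100, 30))]
S, L = weighted_centroid_graph(views, weights=[0.5, 0.5])
```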
view integration algorithm can choose the useful features and filter out the interfering ones.
c. As the number of labeled samples grows, the standard deviation shows a decreasing tendency in most cases, which demonstrates that more label information contributes to more stable performance.
6.3. Unsupervised experiments
When conducting the unsupervised multi-view dimensionality reduction experiments, we use the same (β, μ) as in the semi-supervised experiments and verify the performance over multiple feature dimensions: {10, 20, 30, 40, 50, 60, 70, 80, 90, 100}. Table 7 illustrates the results of the four approaches, where the numbers in brackets are the optimal dimensions for the four datasets. We can easily conclude that our proposed scheme achieves a significant performance boost over the other unsupervised dimensionality reduction approaches.
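For concreteness, the dimension sweep just described can be phrased as a small evaluation loop. In the sketch below, `learn_projection` and `cluster_accuracy` are hypothetical stand-ins for the paper's unsupervised pipeline and its accuracy metric; only the list of candidate dimensions comes from the text.

```python
import numpy as np

def sweep_dimensions(X, labels, learn_projection, cluster_accuracy,
                     dims=(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)):
    """Score the unsupervised pipeline at each target dimension and report
    the best one, mirroring the bracketed optimal dimensions in Table 7.
    `learn_projection(X, d)` is assumed to return a (d_in, d) projection
    matrix; `cluster_accuracy` is assumed to score clusters against labels."""
    scores = {}
    for d in dims:
        W = learn_projection(X, d)           # X: (n_samples, d_in)
        scores[d] = cluster_accuracy(X @ W, labels)
    best_dim = max(scores, key=scores.get)
    return scores, best_dim
```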
6.4. Hyperparameter analysis
Finally, we analyze the influence of the hyperparameters μ and β in the semi-supervised task. We set the number of labeled samples to 3 and vary one of the parameters in the range $\{10^{-2}, 10^{-1}, 10^{0}, 10^{1}, 10^{2}\}$ while fixing the other. Fig. 4 shows the performance tendency with respect to β and μ, in which the top row is for the unlabeled data and the second row is for the test data. In most situations, the bigger β, the higher the performance we can achieve, which demonstrates the importance of optimizing the label fitness. On the contrary, the recognition accuracy shows a negative tendency with respect to μ. We set {β, μ} = {10, 0.1}, {10, 1}, {10, 0.01}, {100, 0.1} for COIL, CMU, UMIST, and YALE-B respectively. From Fig. 4 we can see that the performance is not very stable with respect to various hyperparameters.
Fig. 4. The accuracy with varying β. The first four panels show the accuracy on the unlabeled data, while the last four panels show the accuracy on the unseen data, for four datasets: (a) CMU, (b) COIL, (c) UMIST, (d) YALE-B.
However, there exists a consistent rule throughout the eight experiments: a small μ cooperating with a large β leads to high accuracy. We can further fix a group of {β, μ} = {10, 0.1} for all the experiments, at the cost of sacrificing a small amount of accuracy. This phenomenon illustrates the relative importance between the dimensionality reduction term $\|X^{\top}W + \mathbf{1}b^{\top} - F\|_2^2$ and the regularization term $\|W\|_F^2$.
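The observations above can be restated as a small grid-search utility. The sketch below computes the two competing terms from the objective and runs the one-at-a-time sweep described in the text; `train_and_score` is a hypothetical stand-in for training the full model and reporting accuracy, the shapes are assumptions, and the default anchors follow the {β, μ} = {10, 0.1} recommendation above.

```python
import numpy as np

def objective_terms(X, W, b, F):
    """The two penalties whose balance the sweep probes: the regression term
    ||X^T W + 1 b^T - F||_2^2 and the regularizer ||W||_F^2.
    Assumed shapes: X is (d, n), W is (d, c), b is (c,), F is (n, c)."""
    fit = np.linalg.norm(X.T @ W + np.outer(np.ones(X.shape[1]), b) - F) ** 2
    reg = np.linalg.norm(W, "fro") ** 2
    return fit, reg

# One-at-a-time sweep matching the protocol in the text: vary one
# hyperparameter over {1e-2, ..., 1e2} while the other stays fixed.
GRID = [10.0 ** p for p in range(-2, 3)]

def sweep(train_and_score, fixed_beta=10.0, fixed_mu=0.1):
    acc_vs_beta = {b: train_and_score(beta=b, mu=fixed_mu) for b in GRID}
    acc_vs_mu = {m: train_and_score(beta=fixed_beta, mu=m) for m in GRID}
    return acc_vs_beta, acc_vs_mu
```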
7. Conclusion

In this paper, we proposed a unified and effective scheme to solve the multi-view dimensionality reduction issue under both semi-supervised and unsupervised scenarios. Particularly, an adaptively weighted multi-view graph is employed to integrate information from the various views, which can discover the persistent and complementary patterns. We further learn a linear regression projection for dimensionality reduction and penalize the discrepancy between the low-dimensional representation and the prediction vector to maximize their matching degree. In optimization, we combine multi-view fusion and dimensionality reduction with graph regularization, so that they are jointly optimized and cooperate with each other well. Comprehensive experiments on four benchmark databases clearly demonstrate that our proposed scheme outperforms existing approaches.

Declaration of Competing Interest

None.

Acknowledgments

Our work was supported in part by the National Natural Science Foundation of China under Grants 61572388 and 61703327, in part by the Key R&D Program-The Key Industry Innovation Chain of Shaanxi under Grants 2017ZDCXL-GY-05-04-02, 2017ZDCXL-GY-05-02 and 2018ZDXM-GY-176, and in part by the National Key R&D Program of China under Grant 2017YFE0104100.