Dictionary Learning for Photometric Redshift Estimation
Joana Frontera-Pons*, Florent Sureau*, Bruno Moraes†, Jérôme Bobin*, Filipe B. Abdalla†‡, Jean-Luc Starck*
* IRFU, CEA, Université Paris-Saclay, F-91191 Gif-sur-Yvette, France. Email: [email protected]
† Department of Physics & Astronomy, University College London, Gower Street, London WC1E 6BT, UK
‡ Department of Physics and Electronics, Rhodes University, PO Box 94, Grahamstown, 6140, South Africa
Abstract: Photometric redshift estimation and the assessment of the distance to an astronomical object play a key role in modern cosmology. We present in this article a new method for photometric redshift estimation that relies on sparse linear representations. The proposed algorithm is based on a sparse decomposition of rest-frame spectra in a learned dictionary. Additionally, it provides both an estimate of the redshift and the full-resolution spectrum from the observed photometry of a given galaxy. This technique has been evaluated on realistic simulated photometric measurements.
I. INTRODUCTION
Measuring the angular positions of galaxies to the required cosmological precision is easily achievable with an optical galaxy survey; measuring their radial positions, on the other hand, is one of the most challenging problems in modern observational cosmology. We infer those radial distances from the spectral energy distribution (SED) of each galaxy: due to the expansion of the Cosmos, galaxies are receding from us and their light is consequently redshifted, similar to a Doppler effect. These redshifts are directly related to the galaxies' distances, and by measuring them from the spectral characteristics of the received light, we can reconstruct their positions.

Here two different approaches need to be distinguished, each with its own characteristics, advantages and challenges. Measuring spectroscopic redshifts consists in observing the full SED of a galaxy and identifying features that allow a secure redshift determination. Galaxy spectra are a consequence of a series of relatively well-understood physical phenomena, mostly concerning the nuclear and chemical reactions inside stars and the types and ages of stellar populations within the galaxy in question (see [1] for a review). Atomic emission and absorption lines give rise to very distinct peaks and troughs in a galaxy SED, and the secure identification of the wavelength of such a feature can easily be translated into a shift compared to the known wavelength of the same transition observed in Earth's laboratories.

Photometric redshift measurements, on the other
hand, tryto reconstruct the redshift value out of only a handful
ofnumbers representing the integrated flux in broadband filters
.This is an ill-posed severely underdetermined inverse problemwhere
both redshift and spectra needs be estimated from afew photometric
measurements. Degeneracies abound, makingresults less precise and
possibly biased, but they circumventthe need of a spectrograph and
can also reach fainter magni-
tudes, as light is integrated in broad wavelength ranges.
Whilespectroscopic redshifts are more accurate than
photometricredshifts, their acquisition is time consuming and
limited toonly the brightest objects.Most of the techniques for
photometric redshift estimation are based either on empirical machine learning approaches or on template-fitting methods [2]. Some of the most popular codes take advantage of neural networks [3], [4] or regression trees [5], among others. Information other than flux, such as galaxy morphology or colors, can also be included in the redshift estimation to improve accuracy. However, the major drawback of these methods is that they have to be trained with a huge amount of representative labelled data for which the true redshift value needs to be perfectly known.

Another family of methods is based on template fitting. These methods match physically meaningful redshifted rest-frame templates (i.e. templates without redshift effects) to the observed spectrum, to obtain both the redshift and the best-fit template. These template spectra are constructed from theoretical libraries. The most widespread template-fitting code for photometric redshift estimation is LePHARE [6]. These techniques strongly rely on good template modelling and a deep understanding of realistic galaxy SEDs.

The main contributions of this article are:
• A new algorithm for photometric redshift estimation based on rest-frame templates learned from data using sparse dictionary learning; the complete spectrum of each galaxy is also recovered;
• The evaluation of the proposed scheme on realistic galaxy photometric simulations.
II. METHODOLOGY
Let us first consider the problem of recovering the full spectrum of a galaxy, $x \in \mathbb{R}^{w_s}$, from photometric measurements, $y \in \mathbb{R}^{w_p}$, where the vectors' dimensions satisfy $w_p \ll w_s$:

$$y = Hx + n \qquad (1)$$

where $H$ represents the filter throughputs and $n$ represents the noise. Hence, we seek to retrieve the original signal $x$ by solving this super-resolution task. This severely underdetermined, ill-posed inverse problem requires constraints on the spectra $x$ to be solved. We propose to model the spectra as a sparse linear combination of a few learned templates, then redshifted to a tested redshift value; the best approximation of the photometric data gives the estimated redshift. In the following, we first present how we build our learned rest-frame representation for galaxy spectra using sparse dictionary learning, then the sparse coding algorithm associated with the recovery of the spectra, and finally how we estimate the redshift.
A. Dictionary learning for rest-frame galaxy spectra
The proposed method relies on learning linear representations from rest-frame training data, and the spectra are approximated by a sparse decomposition, $x = D\alpha$. In this context, the dictionary $\hat{D} \in \mathbb{R}^{w_s \times n_a}$ with $n_a$ atoms is constructed from a training set $X \in \mathbb{R}^{w_s \times n_t}$. This training set is composed of $n_t$ examples arranged in columns, and the dictionary is obtained by solving the joint minimization problem:

$$\hat{D}, \hat{A} = \operatorname*{argmin}_{D \in \mathcal{D},\, A} \|X - DA\|_F^2 \quad \text{s.t.} \quad \forall i, \ \|\alpha_i\|_0 \le \tau \qquad (2)$$
where $\hat{A} \in \mathbb{R}^{n_a \times n_t}$ is the matrix of codes, whose columns $\{\alpha_i\}$ are the representations of the training examples. $\|\cdot\|_F$ denotes the Frobenius norm, $\|\cdot\|_0$ counts the number of non-zero entries of a vector, $\tau$ is the targeted sparsity degree, and $\mathcal{D}$ designates the set of dictionaries with atoms in the unit $\ell_2$ ball. Among the different approaches to solve (2), we use a technique based on the method of optimal directions detailed in [7]. This procedure alternates between sparse coding by orthogonal matching pursuit and dictionary updating. The sparsity degree specified in the sparse coding stage and the number of atoms in the dictionary are free parameters.
B. Sparse coding for rest-frame galaxy spectra
The original spectroscopic signal $x$ is then retrieved from the photometric signal $y$ by imposing sparsity on the learned representation $\alpha$. In addition to the sparsity constraint, positivity of the reconstructed spectra can also be enforced for a more constrained recovery. Although negative values of the spectra may lead to a better photometry reconstruction, such solutions are physically impossible. Therefore, we need to minimize:

$$\hat{\alpha} = \operatorname*{argmin}_{\alpha} \frac{1}{2}\|y - HD\alpha\|_2^2 + \lambda\|\alpha\|_1 + I_C(D\alpha) \qquad (3)$$
where $I_C$ denotes the indicator function of the set of spectra $C$, which enforces non-negativity of the light emitted by the galaxy. The regularization parameter $\lambda$ controls the trade-off between the reconstruction error and the sparsity-promoting term. The value of $\lambda$ has been automatically set to be proportional to the estimated noise level $\hat{\sigma}$, as detailed in [8].

To take into account the different constraints and the differentiable term in the cost function, the optimisation in (3) is performed with the Generalized Forward-Backward Splitting algorithm introduced in [9] and recalled in Algorithm 1. The prox operator associated with the $\ell_1$ norm corresponds to the soft-thresholding operator; the one associated with the indicator function has no closed-form expression but was computed with an inner FISTA algorithm on the dual problem, as detailed in [10].
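Algorithm 1 Generalized Forward-Backward Splitting
Initialization: $k = 0$, $t_1 = 0$, $t_2 = 0$, $\hat{\alpha}_0 = 0$ and $\lambda = 3\hat{\sigma}^2$
while not converged do
    $k = k + 1$
    $r = -\frac{1}{L} D^T H^T (y - HD\hat{\alpha}_{k-1})$
    $t_1 = t_1 + \mathrm{prox}_{\frac{\lambda}{L}\|\cdot\|_1}(2\hat{\alpha}_{k-1} - t_1 - r) - \hat{\alpha}_{k-1}$
    $t_2 = t_2 + \mathrm{prox}_{I_C(D\,\cdot)}(2\hat{\alpha}_{k-1} - t_2 - r) - \hat{\alpha}_{k-1}$
    $\hat{\alpha}_k = \frac{t_1 + t_2}{2}$
end while
return $\hat{\alpha}_k$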
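A compact NumPy sketch of the splitting scheme may help to see how the two prox operators interact. It follows the generalized forward-backward update of [9], in which the previous iterate is subtracted after each prox. Several points are assumptions made for illustration: the helper names, the toy filters and dictionary, the fixed iteration counts, and in particular the inner solver, where plain projected gradient on the dual is swapped in for the inner FISTA of [10].

```python
import numpy as np

def soft_threshold(v, thr):
    """Prox of thr * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

def project_nonneg_spectrum(v, D, n_inner=100):
    """Approximate projection of v onto {a : D a >= 0}.

    Dual problem: min over u >= 0 of 0.5 * ||v + D^T u||^2, with a = v + D^T u.
    Solved by projected gradient (a simpler stand-in for an inner FISTA).
    """
    step = 1.0 / max(np.linalg.norm(D, 2) ** 2, 1e-12)
    u = np.zeros(D.shape[0])
    for _ in range(n_inner):
        a = v + D.T @ u
        u = np.maximum(u - step * (D @ a), 0.0)   # gradient of the dual is D a
    return v + D.T @ u

def gfb_sparse_code(y, H, D, lam, n_iter=200):
    """Generalized forward-backward iterations for problem (3)."""
    M = H @ D
    L = max(np.linalg.norm(M, 2) ** 2, 1e-12)     # Lipschitz constant of the data-term gradient
    alpha = np.zeros(D.shape[1])
    t1 = np.zeros_like(alpha)
    t2 = np.zeros_like(alpha)
    for _ in range(n_iter):
        r = -(M.T @ (y - M @ alpha)) / L          # scaled gradient of 0.5 * ||y - M alpha||^2
        # Threshold lam / L as in Algorithm 1; with equal weights 1/2 in the
        # convention of [9], this corresponds to an effective l1 weight of lam / 2.
        t1 = t1 + soft_threshold(2 * alpha - t1 - r, lam / L) - alpha
        t2 = t2 + project_nonneg_spectrum(2 * alpha - t2 - r, D) - alpha
        alpha = 0.5 * (t1 + t2)
    return alpha

# Toy problem: non-negative atoms, six box filters, a 2-sparse ground truth.
rng = np.random.default_rng(1)
w_s, n_a, w_p = 48, 8, 6
D = np.abs(rng.standard_normal((w_s, n_a)))
D /= np.linalg.norm(D, axis=0, keepdims=True)
H = np.kron(np.eye(w_p), np.ones((1, w_s // w_p))) / (w_s // w_p)
alpha_true = np.zeros(n_a)
alpha_true[[1, 5]] = [1.0, 0.5]
y = H @ D @ alpha_true
alpha_hat = gfb_sparse_code(y, H, D, lam=1e-3)
residual = np.linalg.norm(y - H @ D @ alpha_hat)
```

The regularization weight `lam` is passed in explicitly here, so the noise-based rule for setting it is not part of the sketch.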
C. Photometric redshift algorithm
Similarly, we can decompose an observed spectrum $x_z$, at a certain redshift $z$, according to $x_z = D(z)\alpha(z)$. The value of $z$ is computed as the one providing the closest approximation of the observed photometric signal $y_z$.

More specifically, for every tested value of $z$, the dictionary $D$ originally built for rest-frame representations is redshifted to $D(z)$ and we solve an inverse problem like the one described in (3). Accordingly, we can write for every value of $z$:

$$\hat{\alpha}(z) = \operatorname*{argmin}_{\alpha(z)} \frac{1}{2}\|y - HD(z)\alpha(z)\|_2^2 + \lambda\|\alpha(z)\|_1 + I_C(D(z)\alpha(z)) \qquad (4)$$
and solve (4) with Algorithm 1 described above. Ultimately, the value of the redshift $z$ is obtained as the solution of the following problem:

$$\hat{z} = \operatorname*{argmin}_{z} \frac{\|y - HD(z)\hat{\alpha}(z)\|_2^2}{\|y\|_2^2} \qquad (5)$$
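As an illustration of how $D(z)$ and the criterion in (5) could be evaluated, the sketch below redshifts each atom by linear interpolation, using the relation $\lambda_{obs} = (1 + z)\lambda_{rest}$. The function names, the zero-padding outside the rest-frame wavelength range, and the omission of any flux rescaling with redshift (assumed to be absorbed by the coefficients) are assumptions of this example, not details given in the text.

```python
import numpy as np

def redshift_dictionary(D, wave_rest, wave_obs, z):
    """Evaluate every rest-frame atom at wave_obs / (1 + z) by linear interpolation."""
    wave_query = wave_obs / (1.0 + z)
    cols = [np.interp(wave_query, wave_rest, D[:, j], left=0.0, right=0.0)
            for j in range(D.shape[1])]
    return np.column_stack(cols)

def normalized_residual(y, H, D_z, alpha_z):
    """Criterion of (5): squared photometric residual divided by ||y||^2."""
    return np.sum((y - H @ D_z @ alpha_z) ** 2) / np.sum(y ** 2)

# Toy atom: a Gaussian bump centred at 4000 Angstrom in the rest frame.
wave = np.linspace(1000.0, 10000.0, 901)
D = np.exp(-0.5 * ((wave - 4000.0) / 100.0) ** 2)[:, None]
D_z = redshift_dictionary(D, wave, wave, z=0.5)
peak_wave = wave[np.argmax(D_z[:, 0])]          # the bump moves to 4000 * 1.5

H = np.abs(np.random.default_rng(0).standard_normal((6, wave.size)))
alpha = np.array([2.0])
y = H @ D_z @ alpha
res = normalized_residual(y, H, D_z, alpha)     # zero for noiseless, consistent data
```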
Solving problem (5) requires a fine sampling of the range of tested redshifts, which would require solving many instances of problem (4) and would be computationally extremely costly. To avoid this, we propose a coarse-to-fine strategy for redshift testing: we evaluate the approximation error on a hierarchical grid of $z$ values. In other words, the whole interval that encompasses all possible values of $z \in [z_{\min}, z_{\max}]$ is uniformly sampled with ten steps, and the minimum among these points, $\hat{z}_1$, is retained. Then, the explored interval is reduced around this minimum. The new interval is evenly re-sampled at ten points, yielding a new minimum. This process is repeated five times, allowing us to build a hierarchical grid for $z$. This method reduces the computational time while keeping a good resolution in terms of $z$, and is illustrated in the experimental section below.
D. Comparison with LePHARE
In order to assess its performance, the proposed algorithm is compared to the LePHARE code [6]. LePHARE is a template-based redshift estimation method. It starts from a library of spectroscopic templates built from a wide range of theoretical observations. It then applies observational corrections to the spectra and integrates them through the defined filter set. For each galaxy, LePHARE integrates all spectra in the library for several redshift test values and finds the combination of a spectrum and a redshift value that provides the best possible fit to the observed photometric data. In this way, each galaxy is assigned a best-fit template and a redshift value.
III. EXPERIMENTAL RESULTS
We present in this section the results obtained with simulated galaxy spectroscopy for the training stage and simulated photometry for testing the algorithms.
A. Simulations
In this section we present the data used in our studies. The first step is to define a master catalog for the analyses. We work with the COSMOSSNAP simulation pipeline [11] to generate a data set of simulated galaxy SEDs and corresponding photometric properties. The idea is to take real data as a basis, thereby ensuring that realistic relationships between galaxy type, color, size, redshift and SED are preserved. COSMOSSNAP uses the COSMOS photometric redshift catalog [12], generated from a combination of 30 bands from diverse astronomical surveys covering the full spectral range from the UV (GALEX), through the optical (Subaru) and all the way to infrared bands (CFHT, UKIRT, Spitzer). This data set is matched to Hubble ACS imaging data, to provide realistic size-magnitude distributions, employing weak-lensing-quality shape measurements [13]. Based on these properties, COSMOSSNAP chooses a spectral template from a predefined library such that the integrated fluxes through the 30 broadband filters above provide the best fit to the observations. Each galaxy therefore has a “true” redshift and its associated SED, and the distribution of types and redshifts follows the measured distribution in the COSMOS field. This catalog is the basis for all COSMOSSNAP simulations.
To generate realistic photometric properties, the first step is to integrate the best-fit spectral template through the set of broadband wavelength filters that will be used for a given galaxy survey. In practice, the full transmission curve includes not only filter effects, but also atmospheric transmission (in the case of ground observations), telescope optical effects and more. The full transmission curve is commonly referred to as the filter throughput (even though it is not only due to the filter itself). COSMOSSNAP takes a defined set of filter throughputs and calculates magnitudes and their corresponding errors for each galaxy in the catalogue. For the purposes of our analysis, we choose to reproduce closely the expected properties of the Large Synoptic Survey Telescope [14] (LSST). Fig. 1 shows the modelled throughputs [15] for our current band selection, represented by $H$ in the problem formulation. Therefore, the redshift value will need to be inferred only from these 6 available broadbands (commonly referred to as 'ugrizY'). At the end of the generation procedure, we have a realistic master galaxy catalogue with magnitudes, colors, shapes and redshifts for 538 000 galaxies on an effective 1.24 deg$^2$ region of the sky down to an i-band magnitude of 26.5. To further match the expected properties of the LSST Science sample, we limit our catalog to galaxies brighter than 25.3 and with signal-to-noise (S/N) > 10 in the i-band. Imposing these restrictions, we obtain a galaxy catalog with a realistic set of photometric properties, and best-fit spectral templates with realistic continuum and emission line properties. We now need to forward-model the observational process in the spectroscopic case in a manner consistent with expected observational conditions.
Fig. 1: LSST filter throughputs for the considered photometric scenario.
To obtain realistic spectral templates, we need to resample and integrate the best-fit SEDs. As given by the simulations, these SEDs are pure functional forms. At the end of the observational process, what we obtain is an integrated flux in logarithmic wavelength bins at a resolution of R. From the simulation run described above, we select two random subsets.
B. Dictionary Learning
Fig. 2: Example of the subtraction of high-frequency features for rest-frame spectra. The original spectrum is represented by a blue solid line and the information retained after emission line subtraction is displayed with black circles.
Fig. 3: Example of five atoms learned using dictionary learning and imposing a sparsity degree of 3 on rest-frame spectra.
Firstly, we chose a subset of noiseless low-redshift galaxies that have been blueshifted to $z = 0$ in order to form the training set. Hence, $X$ is composed of $n_t = 10000$ clean rest-frame example spectra covering the range [1250 Å, 10499 Å], with $w_s = 4258$. Moreover, high-frequency information has been removed from these rest-frame spectra through wavelet filtering, retaining four scales and keeping the baseline, as illustrated in Fig. 2. Finally, the dictionary $D$ is learned by specifying the desired sparsity degree $\tau = 3$ and the number of atoms of the dictionary $n_a = 40$. The code, developed in C++, was run for 100 iterations, which allowed for convergence of the dictionary estimate, measured as the variation of the averaged approximation error across iterations. Fig. 3 displays five atoms from the learned dictionary used from now on.
C. Redshift estimation
Secondly, the testing is performed on a different randomly selected subset. We have evaluated the algorithm on $n = 1000$ galaxies lying in a redshift range of $z \in [0, 1]$ and including only $w_p = 6$ photometric measurements for each galaxy.

Let us now discuss the results obtained for redshift estimation in the simulated catalogue. The considered strategy of building a hierarchical grid for testing the different $z$ values is illustrated in Fig. 4. The grid search starts by exploring the whole $z \in [0, 1]$ interval, and the approximation error as a function of the tested redshift is depicted in Fig. 4 (a). Then, the minimum is chosen and the considered interval is reduced in Fig. 4 (b). We repeat the process five times to achieve the desired resolution in $z$. The smoothness of the approximation curves as a function of redshift allows the hierarchical approach to attain the same minimum as a one-level grid with a much finer resolution, as shown in Fig. 5, while the computational time is significantly lower, which justifies the choice of our approach.
Fig. 6 displays the estimated redshift for all the galaxies in the test set with respect to their true redshift value. The performance of the method is quantified through the bias over the entire test set, $\langle \Delta z \rangle = \langle z_{est} - z_{true} \rangle = -0.004$, and the 68th percentile scatter, $\sigma_{68} = 0.0475$. Then, one can define the number of catastrophic failures as those galaxies falling outside $3\sigma_{68}$, yielding $\nu = 53$.
Finally, Fig. 7 shows the results on the simulated catalogue with LePHARE photometric redshift estimation. The corresponding bias is $\langle \Delta z \rangle = 0.0421$, the 68th percentile scatter is $\sigma_{68} = 0.0708$ and the number of catastrophic failures is $\nu = 22$.
It is important to point out two main differences with our algorithm. On the one hand, the templates used in the LePHARE code are theoretical while ours are derived directly from the data. On the other hand, while LePHARE fits a single template to each galaxy, the proposed method allows for a linear combination of more than one template, leading to greater flexibility and representational capacity.
IV. CONCLUSION
We have introduced a new method to compute redshifts from photometric data. The proposed algorithm recovers the full spectra of the galaxies from broadband photometry by solving a super-resolution problem. This estimation scheme has been analyzed on simulated galaxy spectra and compared to the classical LePHARE code.
Further developments will explore other representation approaches in which the emission lines are included. The performance will be compared to other photometric redshift estimation methods based on machine learning, such as ANNz2 [4]. Finally, we aim to investigate the performance of this algorithm on real photometric data.
ACKNOWLEDGMENT
This work is funded by the DEDALE project (contract no. 665044) and LENA (ERC StG no. 678282) within the H2020 Framework Program of the European Commission.
REFERENCES
[1] H. Mo, F. Van den Bosch, and S. White, Galaxy formation and evolution. Cambridge University Press, 2010.
[2] H. Hildebrandt, S. Arnouts, P. Capak, L. Moustakas, C. Wolf, F. B. Abdalla, R. Assef, M. Banerji, N. Benítez, G. Brammer et al., “Phat: Photo-z accuracy testing,” Astronomy & Astrophysics, vol. 523, p. A31, 2010.
[3] R. Tagliaferri, G. Longo, S. Andreon, S. Capozziello, C. Donalek, and G. Giordano, “Neural networks for photometric redshifts evaluation,” in Italian Workshop on Neural Nets. Springer, 2003, pp. 226–234.
[4] I. Sadeh, F. B. Abdalla, and O. Lahav, “Annz2: photometric redshift and probability distribution function estimation using machine learning,” Publications of the Astronomical Society of the Pacific, vol. 128, no. 968, p. 104502, 2016.
[5] A. Boselli, A panchromatic view of galaxies. John Wiley & Sons, 2012.
[6] S. Arnouts and O. Ilbert, “Lephare: Photometric analysis for redshift estimate,” Astrophysics Source Code Library, 2011.
[7] K. Engan, S. O. Aase, and J. H. Husoy, “Method of optimal directions for frame design,” in Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on, vol. 5. IEEE, 1999, pp. 2443–2446.
[8] D. L. Donoho and J. M. Johnstone, “Ideal spatial adaptation by wavelet shrinkage,” Biometrika, vol. 81, no. 3, pp. 425–455, 1994.
[9] H. Raguet, J. Fadili, and G. Peyré, “A generalized forward-backward splitting,” SIAM Journal on Imaging Sciences, vol. 6, no. 3, pp. 1199–1226, 2013.
Fig. 4: Different levels in the hierarchical grid for testing the values of z, panels (a) to (e). The whole z range is explored in (a) and the minimum is computed at each layer, reducing the considered interval up to the finest resolution in (e).
Fig. 5: One-level grid uniformly sampled at 100 steps between z = 0 and z = 1.
[10] J. Rapin, J. Bobin, A. Larue, and J.-L. Starck, “NMF with sparse regularizations in transformed domains,” SIAM Journal on Imaging Sciences, vol. 7, no. 4, pp. 2020–2047, 2014.
[11] S. Jouvel, J.-P. Kneib, O. Ilbert, G. Bernstein, S. Arnouts, T. Dahlen, A. Ealet, B. Milliard, H. Aussel, P. Capak et al., “Designing future dark energy space missions-i. building realistic galaxy spectro-photometric catalogs and their first applications,” Astronomy & Astrophysics, vol. 504, no. 2, pp. 359–371, 2009.
[12] O. Ilbert, P. Capak, M. Salvato, H. Aussel, H. McCracken, D. Sanders, N. Scoville, J. Kartaltepe, S. Arnouts, E. Le Floc’h et al., “Cosmos photometric redshifts with 30-bands for 2-deg2,” The Astrophysical Journal, vol. 690, no. 2, p. 1236, 2008.
[13] A. Leauthaud, R. Massey, J.-P. Kneib, J. Rhodes, D. E. Johnston et al., “Weak gravitational lensing with cosmos: galaxy selection and shape measurements,” The Astrophysical Journal Supplement Series, vol. 172, no. 1, p. 219, 2007.
[14] https://www.lsst.org/.
[15] https://github.com/lsst/throughputs.
Fig. 6: True vs estimated redshifts for the proposed dictionary learning photometric redshift estimation algorithm.
Fig. 7: True vs estimated redshifts from the benchmark LePHARE code.