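HSC Photometric Redshifts: Estimation with Machine Learning

B08a
○Atsushi Nishizawa (Nagoya University), Masayuki Tanaka (National Astronomical Observatory of Japan), HSC Collaboration
Astronomical Society of Japan, 2015 Spring Annual Meeting @ Osaka University
March 21, 2015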
Contents
❖ What is Machine Learning?
❖ Random Forest
  ✤ method and application to photo-z
❖ Results
  ✤ photo-z by Random Forest
  ✤ star/galaxy separation by Random Forest
❖ Summary
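What is Machine Learning?
❖ Prediction by learning patterns
❖ Originated in the 1950s
❖ First used for learning games (still used today for shogi, chess, etc.)
❖ Applied to photo-z since the mid-1990s
❖ Requires training data (a training set)
❖ Requirements on the training data depend on the algorithm
❖ Neural networks, k-d trees, random forests, nearest neighbors, Gaussian processes, …
❖ This talk uses random forest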
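Concept of Random Forest
Split the data with binary trees and grow many trees
[Figure: example decision tree. The root node tests i < 22.5 (y/n); deeper nodes test cuts such as g > 21.3 and r > 24.6; each leaf holds a redshift value, from z=0.01, z=0.02, … to z=5.99, z=6.00.]
Build decision trees using the training set
Generate many decision trees by bootstrapping = random forest
Random samples are drawn with photometric errors taken into account
Build a PDF from the many decision trees = photo-z
Redshifts can be estimated without assuming galaxy models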
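The steps on this slide can be sketched in a few lines of plain Python. This is a toy illustration, not the MLZ/TPZ code: the tree builder, the feature choice (i magnitude and g-i color), the toy redshift relation, and all parameter values (`max_depth`, `min_leaf`, `mag_err`, `n_trees`) are assumptions made for the example.

```python
import random
from statistics import fmean

def sum_sq(zs):
    """Sum of squared deviations from the mean."""
    m = fmean(zs)
    return sum((z - m) ** 2 for z in zs)

def build_tree(rows, depth=0, max_depth=3, min_leaf=5):
    """rows: list of (features, z). Returns a dict (node) or a float (leaf)."""
    best = None
    if depth < max_depth:
        for j in range(len(rows[0][0])):
            for cut in sorted({x[j] for x, _ in rows})[1:]:
                yes = [z for x, z in rows if x[j] < cut]
                no = [z for x, z in rows if x[j] >= cut]
                if min(len(yes), len(no)) < min_leaf:
                    continue
                cost = sum_sq(yes) + sum_sq(no)
                if best is None or cost < best[0]:
                    best = (cost, j, cut)
    if best is None:
        return fmean([z for _, z in rows])  # leaf: mean training redshift
    _, j, cut = best
    return {"feature": j, "cut": cut,
            "yes": build_tree([r for r in rows if r[0][j] < cut], depth + 1, max_depth, min_leaf),
            "no": build_tree([r for r in rows if r[0][j] >= cut], depth + 1, max_depth, min_leaf)}

def predict(tree, x):
    """Walk from the root to a leaf and return its redshift."""
    while isinstance(tree, dict):
        tree = tree["yes"] if x[tree["feature"]] < tree["cut"] else tree["no"]
    return tree

def random_forest(train, mag_err, n_trees=20, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        boot = [rng.choice(train) for _ in train]  # bootstrap: resample with replacement
        # perturb every feature by a Gaussian photometric error (assumed constant)
        noisy = [([v + rng.gauss(0.0, mag_err) for v in x], z) for x, z in boot]
        forest.append(build_tree(noisy))
    return forest

def photoz_pdf(forest, x, z_edges):
    """Histogram of the tree predictions, normalized to unit sum."""
    counts = [0] * (len(z_edges) - 1)
    for tree in forest:
        z = predict(tree, x)
        for k in range(len(counts)):
            if z_edges[k] <= z < z_edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

# toy training set (made-up relation): redshift grows with g-i color
rng = random.Random(1)
train = []
for _ in range(100):
    i_mag, color = rng.uniform(20.0, 25.0), rng.uniform(0.0, 2.0)
    train.append(([i_mag, color], max(0.0, 0.6 * color + rng.gauss(0.0, 0.05))))

forest = random_forest(train, mag_err=0.05)
z_edges = [0.1 * k for k in range(21)]
pdf = photoz_pdf(forest, [22.0, 1.5], z_edges)
z_mean = sum(p * 0.5 * (lo + hi) for p, lo, hi in zip(pdf, z_edges, z_edges[1:]))
```

Each tree sees a different bootstrap sample with different noise, so the spread of the per-tree predictions is what turns a single estimate into a PDF.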
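TPZ (Trees for PhotoZ)
Split criterion at each node: minimize the sum of variances (Carrasco Kind & Brunner 2013)
$$ S(T) = \sum_{m \in \mathrm{values}(M)} \sum_{i \in m} (z_i - \hat{z}_m)^2 $$
M: magnitude or color; z_i: redshift of training-set object i
Public code: MLZ (Carrasco Kind & Brunner) is used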
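As a minimal sketch of this criterion, the snippet below evaluates S for a binary split of one attribute and scans candidate thresholds for the minimum. It assumes that the branch prediction ẑ_m is the mean training redshift in branch m and that candidate cuts are midpoints between sorted attribute values; the function names and the toy numbers are hypothetical and are not taken from MLZ.

```python
from statistics import fmean

def split_cost(values, redshifts, threshold):
    """S for splitting one attribute (magnitude or color) at `threshold`."""
    cost = 0.0
    for below in (True, False):
        z_m = [z for v, z in zip(values, redshifts) if (v < threshold) == below]
        if not z_m:
            continue  # an empty branch contributes nothing
        z_hat = fmean(z_m)  # assumed branch prediction: mean training redshift
        cost += sum((z - z_hat) ** 2 for z in z_m)
    return cost

def best_threshold(values, redshifts):
    """Return (threshold, cost) minimizing S over midpoints of sorted values."""
    uniq = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
    return min(((t, split_cost(values, redshifts, t)) for t in candidates),
               key=lambda tc: tc[1])

# made-up toy training set: i-band magnitudes and reference redshifts
i_mag = [20.1, 20.8, 21.5, 23.0, 23.6, 24.2]
z_train = [0.10, 0.12, 0.15, 0.80, 0.90, 0.85]
threshold, cost = best_threshold(i_mag, z_train)
```

In this toy set the cut falls between the low-redshift bright objects and the high-redshift faint ones, because any other cut mixes the two groups in one branch and inflates the squared deviations.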
COSMOS data (ver. S14A_0b)
calibration data set:
• HSC data (S14A_0b) observed in the S14A semester
• COSMOS 30-band photo-z (with/without UltraVISTA) and zCOSMOS-bright, matched with HSC sources
• Truncated at i<25 mag, as the 30-band photo-z is not reliable beyond this (with respect to zCOSMOS-faint)
• 60,000 objects in total (stars/galaxies/AGN all included)
• Half of them are used for calibration and the rest for validation
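The magnitude cut and the half/half split can be sketched as follows. The catalog layout (a list of dicts with an `i_mag` key), the function name, and the random shuffling are assumptions for illustration; the slides do not say how the two halves were chosen.

```python
import random

def calibration_validation_split(catalog, i_max=25.0, seed=0):
    """Keep objects with i < i_max, then split them into two random halves."""
    kept = [obj for obj in catalog if obj["i_mag"] < i_max]
    rng = random.Random(seed)
    rng.shuffle(kept)  # random assignment is an assumption of this sketch
    half = len(kept) // 2
    return kept[:half], kept[half:]

# made-up toy catalog: i magnitudes from 20.0 to 25.4
catalog = [{"i_mag": 20.0 + 0.6 * k} for k in range(10)]
calibration, validation = calibration_validation_split(catalog)
```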
COSMOS data (ver. S14A_0b)
• HSC data (S14A_0b) observed in the S14A semester
• i-band selected objects (galaxies/stars/AGN not separated)
• Forced photometry on the images in the other bands
• Limiting magnitude: i ~ 25.5 mag (5σ)
• 100,000 objects in total
• Blended objects are excluded (90% of the detected objects are lost!)
• CModel (de Vaucouleurs + exponential) magnitudes are used
photo-z results : COSMOS (S14A_0b)
[Figure: HSC 5-band photo-z versus COSMOS 30-band photo-z, one panel for bright and one for faint galaxies]
bright galaxies (i<22.5): Δz = 0.069 ± 0.155
faint galaxies (i>22.5): Δz = 0.060 ± 0.284
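Numbers quoted as "offset ± scatter" can be computed as in the sketch below. The slide does not define the statistic, so the normalization by (1 + z_ref) and the use of mean and standard deviation are assumptions of this example, as are the function name and the toy catalog.

```python
from statistics import fmean, pstdev

def bias_and_scatter(z_photo, z_ref):
    """Mean and standard deviation of dz = (z_photo - z_ref) / (1 + z_ref)."""
    dz = [(p - r) / (1.0 + r) for p, r in zip(z_photo, z_ref)]
    return fmean(dz), pstdev(dz)

# made-up toy catalog: (i magnitude, 5-band photo-z, reference redshift)
catalog = [(21.0, 0.52, 0.50), (22.0, 0.95, 1.00),
           (23.5, 1.40, 1.00), (24.0, 0.30, 0.30)]
bright = [(p, r) for m, p, r in catalog if m < 22.5]
faint = [(p, r) for m, p, r in catalog if m >= 22.5]
bright_bias, bright_scatter = bias_and_scatter(*zip(*bright))
faint_bias, faint_scatter = bias_and_scatter(*zip(*faint))
```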
Clipping with PDF
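z_Conf clipping
$$ z_{\rm Conf} \equiv \int_{z_p - a}^{z_p + a} P(z)\,dz $$
z_Var clipping
$$ z_{\rm Var} \equiv \int P(z)\,\frac{(z - z_p)^2}{\sigma_{68}}\,dz $$
[Figure: photo-z comparison after clipping, one panel for bright and one for faint galaxies]
bright: Δz = 0.010 ± 0.067
faint: Δz = 0.001 ± 0.139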
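A minimal sketch of PDF-based clipping statistics on a uniform redshift grid follows. It computes z_Conf as the probability within z_p ± a, and the second moment of P(z) about z_p without any further normalization. The grid, the Gaussian toy PDF, the choice of z_p as the PDF mode, and the half-width a = 0.05 are all assumptions; the slide does not give the value of a.

```python
import math

def normalize(z_grid, pdf):
    """Scale the PDF so that its Riemann sum over the uniform grid is 1."""
    dz = z_grid[1] - z_grid[0]
    norm = sum(pdf) * dz
    return [p / norm for p in pdf]

def z_conf(z_grid, pdf, z_p, a):
    """Probability contained in [z_p - a, z_p + a] (simple Riemann sum)."""
    dz = z_grid[1] - z_grid[0]
    return sum(p * dz for z, p in zip(z_grid, pdf) if abs(z - z_p) <= a)

def second_moment(z_grid, pdf, z_p):
    """Second moment of P(z) about z_p, with no extra normalization."""
    dz = z_grid[1] - z_grid[0]
    return sum(p * (z - z_p) ** 2 * dz for z, p in zip(z_grid, pdf))

# toy PDF: Gaussian centered at z = 0.8 with width 0.05
z_grid = [0.001 * k for k in range(2001)]
pdf = normalize(z_grid, [math.exp(-0.5 * ((z - 0.8) / 0.05) ** 2) for z in z_grid])
z_p = z_grid[pdf.index(max(pdf))]  # point estimate taken as the PDF mode (assumption)
conf = z_conf(z_grid, pdf, z_p, a=0.05)
spread = second_moment(z_grid, pdf, z_p)
```

A narrow, single-peaked PDF gives a high z_Conf and a small second moment; objects failing a cut on either quantity would be clipped.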
ready for Weak Lensing?
- criterion for selecting background (bg) galaxies:
$$ \mathrm{CDF} \equiv P(z > z_0) = \int_{z_0}^{\infty} dz\,P(z) > 0.8 $$
or z_p > z_0 + …
contamination
- fraction of foreground (fg) galaxies in the bg subsample selected via photo-z (mean or CDF)
- may dilute the WL signal
completeness
- set by the fraction of bg galaxies that are lost, i.e. not selected as bg by photo-z (mean or CDF)
- may reduce the statistical power (shot noise)
[Figure: contamination and completeness as functions of cluster redshift z0; axis label: Δmean z]
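The CDF criterion and the two quality measures can be sketched as follows. Here completeness is computed as the fraction of true background galaxies that are kept (one minus the lost fraction). The discrete PDF, the per-object probabilities, and the reference redshifts are made-up numbers, and the function names are hypothetical.

```python
def prob_above(z_centers, probs, z0):
    """P(z > z0) for a discrete PDF given as bin centers and probabilities."""
    return sum(p for z, p in zip(z_centers, probs) if z > z0)

def contamination_and_completeness(selected, z_true, z0):
    """selected: list of bools; z_true: reference redshifts of the same objects."""
    n_selected = sum(selected)
    n_bg = sum(z > z0 for z in z_true)
    n_selected_bg = sum(s and z > z0 for s, z in zip(selected, z_true))
    contamination = (n_selected - n_selected_bg) / n_selected  # fg fraction among selected
    completeness = n_selected_bg / n_bg  # fraction of true bg galaxies kept
    return contamination, completeness

z0 = 0.5  # cluster redshift

# one object with a discrete PDF: selected because P(z > z0) > 0.8
p_single = prob_above([0.2, 0.4, 0.6, 0.8], [0.05, 0.10, 0.45, 0.40], z0)

# a small catalog with precomputed P(z > z0) and reference redshifts
p_bg = [0.95, 0.85, 0.60, 0.90, 0.10]
z_true = [0.9, 0.7, 0.8, 0.3, 0.2]
selected = [p > 0.8 for p in p_bg]
contamination, completeness = contamination_and_completeness(selected, z_true, z0)
```

Raising the probability threshold trades completeness for purity: fewer foreground galaxies leak in, but more true background galaxies are dropped.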
star galaxy separation?
star/galaxy separation
galaxy -> e_true = 1
star -> e_true = 0
Training set: based on COSMOS colors and ACS images (Ilbert et al. 2009; Leauthaud et al. 2007)
magnitudes, colors, and size
e = 0 or 1?
Quality assessment of the galaxy sample
• For the galaxy (sub)sample, contamination by stars is quite low: <1% for bright and <4% even for faint objects.
• Requiring e=1 loses 30% of the objects, but e>0.95 gives 100% completeness.
• e>0.95 seems optimal for picking up galaxies.
Quality assessment of the star sample
• For the star (sub)sample, contamination by galaxies is considerable: <1% for bright but <15% for faint objects.
• Even with e<0.2, completeness is only 70% (30%) for bright (faint) objects.
• Data (reduction) quality should be improved, e.g. the PSF measurement.
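The galaxy and star selections above can be evaluated with a sketch like the one below, which applies a cut on the predicted e and compares against the true labels (1 = galaxy, 0 = star). The thresholds e>0.95 and e<0.2 come from the slides; the function name and the toy scores are made up for illustration.

```python
def sample_quality(e_pred, e_true, select, target):
    """Contamination and completeness of the subsample where select(e) is True.

    target is the true label the subsample should contain (1 galaxy, 0 star).
    """
    picked = [t for e, t in zip(e_pred, e_true) if select(e)]
    n_target = sum(1 for t in e_true if t == target)
    n_good = sum(1 for t in picked if t == target)
    contamination = 1.0 - n_good / len(picked)
    completeness = n_good / n_target
    return contamination, completeness

# made-up predicted scores and true labels
e_pred = [1.00, 0.98, 0.97, 0.99, 0.60, 0.10, 0.05, 0.30]
e_true = [1, 1, 1, 1, 1, 0, 0, 0]

gal_contam, gal_compl = sample_quality(e_pred, e_true, lambda e: e > 0.95, target=1)
star_contam, star_compl = sample_quality(e_pred, e_true, lambda e: e < 0.2, target=0)
```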
real images of stars/galaxies
[Figure: HSC r-i-z composite images, ordered by i-band magnitude (20 to 25), in three rows: correctly identified stars, misidentified galaxies (stars in reality), and correctly identified galaxies. Some panels are marked "no image".]
summary
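• ML (machine learning) can be used for photo-z estimation
• ML does not depend on templates for galaxies/stars/AGN, so it is complementary to template-fitting methods
• photo-z accuracy is 15% (bright) and 28% (faint), but clipping badly behaved galaxies improves it to 7% (bright) and 14% (faint)
• Background galaxy selection by photo-z yields a pure sample (contamination <4%, Δz <5%) for clusters at z<1, at the cost of losing about 30% of the galaxies
• ML is also applied to star/galaxy separation
• Very powerful for galaxy selection (nearly all galaxies can be selected with a contamination rate of <1%)
• Powerful for selecting bright stars, but for faint stars the contamination by galaxies is 15%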