"The Maximum Likelihood Problem and Fitting the Sagittarius Dwarf Tidal Stream" Matthew Newby Astronomy Seminar RPI Oct. 22, 2009 1

Mar 30, 2015


Donald Neblett


Overview:

• Introduction
• The Sagittarius Stream
• SDSS
• Locating
• Maximum Likelihood
• Methods
• Differential Evolution
• Monte-Carlo Markov-Chain
• Gradient Descent
• Genetic Search
• Particle Swarm
• Revisit the Sagittarius Stream
• BOINC
• Overview
• Current and Future Work


Introduction

• Modern Astronomy – No longer staring through a telescope
• Automated Surveys produce large data sets
• Errors in measurements – statistical methods needed
• Fast and accurate computer routines are needed in order to analyze this information!

Images: NASA.gov, Wikimedia Commons

computer$ go faster_


The Sloan Digital Sky Survey (SDSS):

Image: sdss.org

• 230+ million objects
• 8,400 square degrees in the sky
• Large percentage of north galactic cap
• Very little data in galactic plane (too much dust)
• Several hundred thousand stars


The Sagittarius Dwarf Tidal Stream

• The Sagittarius Dwarf Galaxy is merging with the Milky Way
• The dwarf is being tidally disrupted by the Milky Way, creating long “tails.”

Mapping the Tidal Stream will:

• Provide information on the matter distribution in the Milky Way
• Provide constraints on the Galactic Halo

Image (above): [Ibata et al. 1997, AJ]
Image (left): David Martinez-Delgado (MPIA) & Gabriel Perez (IAC)


[Diagram: The Milky Way, with labels for the Halo, Bulge, Thin Disk, Thick Disk, Sun, Sagittarius Dwarf Galaxy, Tidal Stream, and Data Wedge. Scale: ~30 kiloparsecs (100,000 light-years)]


Data Stripe:

Stripe 82 (southern galactic cap)

F-turnoff stars on the H-R diagram

Image: Newberg & Yanny 2006, JoP Conference series (modified by N. Cole)


Cole, N.

Sag. Stream: Model

• Assume stream is a cylinder
• Radial drop-off given by a Gaussian Distribution
• 2 background parameters: r0, q
• 6 parameters per stream: ε, μ, r, θ, φ, σ

At least 8 parameters in the search – 8-dimensional solution space!

Background distribution:


Maximum Likelihood:

• Bayesian Method
• Must assume a “prior” – a model explaining the data
• Find the parameters that are the “most likely” in a data set, given the prior
• Law of large numbers
• Can assume that large data sets have normally distributed data points
• Find the probability that each data point lies in the given distribution
• Then you can get the likelihood:

$L(Q \mid D) = \prod_i \mathrm{DataPointProb}_i$
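A product of many small probabilities underflows quickly, so in practice it is common to maximize the sum of the log-probabilities instead. The following is a minimal sketch of that idea, assuming each data point's probability is a Gaussian density with parameters Q = (mu, sigma). The function names, the Gaussian model, and the sample data are illustrative assumptions, not the stream model from the talk.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Probability density of x under a normal distribution (assumed model)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(params, data):
    """Sum of log point probabilities, i.e. the log of the product L(Q|D)."""
    mu, sigma = params
    return sum(math.log(gaussian_pdf(x, mu, sigma)) for x in data)

# Hypothetical data set: five measurements scattered around 5.0.
data = [4.8, 5.1, 5.3, 4.9, 5.0]
near = log_likelihood((5.0, 0.2), data)  # parameters close to the data
far = log_likelihood((7.0, 0.2), data)   # parameters far from the data
```

Parameters that describe the data well give a larger log-likelihood, so `near` exceeds `far` here; the search algorithms on the following slides only need a function like `log_likelihood` to compare candidate parameter sets.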


Computational Algorithms Overview:

• Set up problem
• Parameter space: all allowed values of parameters
• Likelihood evaluator for given parameters
• Evaluation method – moves in parameter space in an efficient way
• End conditions: when change in best is below a limit, or a predefined number of iterations is reached.

Problems:
• Likelihood calculation is usually time-consuming
• Need to avoid local maxima – find global max

What is the best method?
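The pieces listed on this slide (parameter space, likelihood evaluator, a way of moving, end conditions) can be sketched with the simplest possible mover: random steps that are kept only when they improve the likelihood. This is an illustrative stand-in rather than one of the methods in the talk; `simple_search`, its arguments, and the fixed step scale are assumptions.

```python
import random

def simple_search(likelihood, bounds, max_iterations=10000, limit=1e-12, seed=0):
    """Random-step search showing the setup and both end conditions."""
    rng = random.Random(seed)
    # Parameter space: one (low, high) pair of allowed values per parameter.
    best = [rng.uniform(lo, hi) for lo, hi in bounds]
    best_value = likelihood(best)
    iterations = 0
    while iterations < max_iterations:      # end condition: iteration cap
        iterations += 1
        # Move: Gaussian step in every parameter, clamped to the bounds.
        trial = [min(hi, max(lo, b + rng.gauss(0.0, 0.1 * (hi - lo))))
                 for b, (lo, hi) in zip(best, bounds)]
        trial_value = likelihood(trial)
        if trial_value > best_value:
            change = trial_value - best_value
            best, best_value = trial, trial_value
            if change < limit:              # end condition: change in best
                break
    return best, best_value, iterations
```

Every pass through the loop costs one likelihood evaluation, which is why the choice of how to move through parameter space matters when the likelihood is expensive.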


Computational Methods:

“No Free Lunch” (David H. Wolpert, William G. Macready)

Poor Students: Rosencrantz, Ophelia, Guildenstern

• Only eats meat
• Vegetarian
• Low Carb Diet

Local Eateries, same menus, random prices: Burger Palace, Gourmet Salads, No Carbs at All

Prices differ by restaurant! Not everyone can eat cheaply! One restaurant cannot be the best solution for every person (problem)!

• One solution method (or algorithm) will not be ideal for all problems!
• Need to choose the best solution for the job at hand!


Conjugate Gradient Descent (CGD)

• Calculates the gradient of the surface for each parameter
• Moves towards best likelihood using a line search
• Conjugate gradient uses the gradient of the previous step to converge faster
• Requires many likelihood calculations per move
• Unfortunately, may end at local maxima
• Need to run from several different directions in order to find global best

[Figure: Gradient Descent, 1-dimensional case. Likelihood vs. Position, with labels for the location, the gradient, a local maximum, and the best solution]

The gradient, G:

L = likelihood function
Q = Parameter (i or j)
hi = step size for ith parameter
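The gradient formula itself did not survive in this transcript. One common way to estimate a gradient numerically from L, Q, and a per-parameter step size h_i is a central difference; the sketch below assumes that form, which may differ from the expression on the original slide. The function name is hypothetical.

```python
def numerical_gradient(likelihood, params, steps):
    """Central-difference estimate of dL/dQ_i, using step size h_i per parameter."""
    gradient = []
    for i, h in enumerate(steps):
        up = list(params)
        up[i] += h        # Q_i + h_i, other parameters unchanged
        down = list(params)
        down[i] -= h      # Q_i - h_i, other parameters unchanged
        gradient.append((likelihood(up) - likelihood(down)) / (2.0 * h))
    return gradient
```

This needs two likelihood evaluations per parameter, which illustrates why each move of the method is costly when the likelihood itself is slow to compute.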


Line Search

• Evaluates two points in direction of gradient: one a distance 1d away, the other 2d
• d is usually related to the gradient (slope)
• If the middle point is not at a better likelihood than the end points, d is doubled and the process repeated
• If the middle point is higher, then the middle point becomes the starting point for another CGD iteration
• Line Search causes the algorithm to reach the best likelihood efficiently

Line Search example (left):

The first search does not find a better likelihood for the middle point (yellow), so the distance is doubled. This time, the new middle point (red) has the best likelihood. The next iteration of CGD will start at this point.

[Figure labels: starting point, first middle point, first end point, next middle point, next end point]
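The doubling rule described above can be sketched as follows. This is a minimal illustration, assuming the search direction is already known; the cap on the number of doublings and the fallback to the best point seen are additions for safety, not part of the slide.

```python
def line_search(likelihood, start, direction, d, max_doublings=20):
    """Double d until the middle point beats both the start and the end point."""
    start_value = likelihood(start)
    best_point, best_value = list(start), start_value
    for _ in range(max_doublings):
        middle = [s + d * g for s, g in zip(start, direction)]
        end = [s + 2.0 * d * g for s, g in zip(start, direction)]
        middle_value = likelihood(middle)
        end_value = likelihood(end)
        if middle_value > start_value and middle_value > end_value:
            return middle                 # maximum is bracketed: stop here
        # Remember the best point seen in case no bracket is ever found.
        for point, value in ((middle, middle_value), (end, end_value)):
            if value > best_value:
                best_point, best_value = point, value
        d *= 2.0                          # middle not best: double the distance
    return best_point
```

For the one-dimensional likelihood `-(x - 3)**2`, starting at 0 with d = 0.75, the distance doubles twice before the middle point at x = 3 beats both of its neighbours.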


Monte-Carlo Markov-Chain (MCMC)

• A “random walk” method
• Samples parameter space well
• Automatically produces error distribution
• Easy to code
• Sensitive to running time and step size
• Never truly converges

Metropolis-Hastings:
• Take a step in each direction (parameter)
• Step size/direction is random, drawn from a normal distribution
• If the new location has a better likelihood, move to it
• If the new location has a worse likelihood, then there is a chance of moving to it

The trajectory of a 1000 step MCMC straight-line fit (top) and the distribution in b (bottom).
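The Metropolis-Hastings steps listed above can be sketched in a few lines. The version below works with log-likelihoods and accepts a worse location with probability L_new / L_old, the usual Metropolis rule for symmetric steps; that specific acceptance probability is an assumption, since the slide only says there is "a chance" of moving. Function and argument names are hypothetical.

```python
import math
import random

def metropolis(log_likelihood, start, step_sizes, n_steps, seed=0):
    """Random walk that always accepts better points and sometimes worse ones."""
    rng = random.Random(seed)
    current = list(start)
    current_ll = log_likelihood(current)
    chain = [list(current)]
    for _ in range(n_steps):
        # Step in every parameter, drawn from a normal distribution.
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, step_sizes)]
        proposal_ll = log_likelihood(proposal)
        # Better: always move. Worse: move with probability L_new / L_old.
        if (proposal_ll >= current_ll
                or rng.random() < math.exp(proposal_ll - current_ll)):
            current, current_ll = proposal, proposal_ll
        chain.append(list(current))
    return chain
```

The chain itself is the result: a histogram of the visited values of each parameter, after discarding the early steps, approximates that parameter's error distribution.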


Genetic Search

• Inspired by natural selection
• Start with multiple “individuals” (positions) in parameter space
• Evaluate likelihood for each individual
• Remove individuals with the worst likelihoods
• Replace the removed individuals with “children” of the remaining individuals (“parents”)
• Parents can be chosen randomly or from the best likelihoods
• Create children through crossover and mutation:
• Crossover: A child inherits the parameters of multiple parents, either by averaging the parents’ parameters or by inheriting select parameters from each parent
• Mutation: Replace a parameter with a new, randomly generated one
• Repeat until end conditions are met
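A minimal sketch of the loop described above, assuming random parent selection, crossover by averaging two parents, and a fixed number of generations as the end condition. The population size, number removed, and mutation rate are arbitrary illustrative values, and the function name is hypothetical.

```python
import random

def genetic_search(likelihood, bounds, population_size=30, n_generations=200,
                   n_removed=10, mutation_rate=0.1, seed=0):
    """Keep the best individuals and replace the worst with mutated children."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(population_size)]
    for _ in range(n_generations):
        # Evaluate and rank; drop the individuals with the worst likelihoods.
        population.sort(key=likelihood, reverse=True)
        survivors = population[:population_size - n_removed]
        children = []
        for _ in range(n_removed):
            mother, father = rng.sample(survivors, 2)   # random parents
            # Crossover: average the two parents' parameters.
            child = [0.5 * (m + f) for m, f in zip(mother, father)]
            # Mutation: occasionally replace a parameter with a random value.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = survivors + children
    return max(population, key=likelihood)
```

Because the best individuals always survive, the best likelihood in the population never gets worse from one generation to the next; mutation is what keeps the population from collapsing onto a single point too early.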


Differential Evolution

• An individual moves according to the weighted difference between the locations of two “parent” individuals
• If the new position has a worse likelihood, then the individual does not move
• Parents may be random or chosen from the population best
• Also, multiple pairs of parents may be used (averaging over the differences)

[Figure labels: Difference Vector; Change in position; X: No Change; (center is global best)]
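The move rule described on this slide can be sketched directly: each individual tries a step equal to a weighted difference of two randomly chosen parents and keeps it only if the likelihood improves. This sketch assumes random parents, a single difference pair, and a fixed weight of 0.5; the bounds are used only to place the initial population. All names are hypothetical.

```python
import random

def differential_evolution(likelihood, bounds, population_size=20,
                           n_generations=200, weight=0.5, seed=0):
    """Move each individual along a weighted parent difference if that helps."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(population_size)]
    values = [likelihood(p) for p in population]
    for _ in range(n_generations):
        for i in range(population_size):
            # Pick two distinct parents other than the individual itself.
            others = [j for j in range(population_size) if j != i]
            a, b = rng.sample(others, 2)
            trial = [x + weight * (pa - pb)
                     for x, pa, pb in zip(population[i], population[a], population[b])]
            trial_value = likelihood(trial)
            if trial_value > values[i]:     # worse likelihood: no change
                population[i], values[i] = trial, trial_value
    best = max(range(population_size), key=values.__getitem__)
    return population[best], values[best]
```

The step sizes shrink on their own as the population clusters, since the differences between parents get smaller as the individuals approach the same maximum.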


Particle-Swarm Optimization

• Physically Intuitive – based on animal behavior
• Particles have velocities
• “Forces” towards personal best, global best

[Figure: Parameter Space, with labels for a particle, its velocity, the personal best, the global best, and the pulls to personal best and to global best]

Position (x) change at step t:

w, c1, c2 are weighting parameters, p is personal best, g is global best, rand() is a random number
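The update equation is missing from this transcript. The symbols that remain (w, c1, c2, p, g, rand()) match the standard particle-swarm update, v ← w·v + c1·rand()·(p − x) + c2·rand()·(g − x) followed by x ← x + v, so the sketch below assumes that form. The parameter values w = 0.7 and c1 = c2 = 1.5 are common illustrative choices, not values from the talk.

```python
import random

def particle_swarm(likelihood, bounds, n_particles=20, n_steps=200,
                   w=0.7, c1=1.5, c2=1.5, seed=0):
    """Particles are pulled toward their personal best p and the global best g."""
    rng = random.Random(seed)
    n_dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * n_dim for _ in range(n_particles)]
    p = [list(xk) for xk in x]                      # personal best positions
    p_value = [likelihood(xk) for xk in x]
    k_best = max(range(n_particles), key=p_value.__getitem__)
    g, g_value = list(p[k_best]), p_value[k_best]   # global best position
    for _ in range(n_steps):
        for k in range(n_particles):
            for i in range(n_dim):
                # Inertia plus random pulls toward personal and global best.
                v[k][i] = (w * v[k][i]
                           + c1 * rng.random() * (p[k][i] - x[k][i])
                           + c2 * rng.random() * (g[i] - x[k][i]))
                x[k][i] += v[k][i]
            value = likelihood(x[k])
            if value > p_value[k]:
                p[k], p_value[k] = list(x[k]), value
                if value > g_value:
                    g, g_value = list(x[k]), value
    return g, g_value
```

A larger w lets particles coast past the current best and explore more widely, while larger c1 and c2 pull the swarm together faster.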


BOINC

Berkeley Open Infrastructure for Network Computing

Milkyway@home stats:

            Total     Active
Users       37,251    16,010
Hosts       79,023    25,101
Teams       1,410     922
Countries   163       124

Total Credit: 9,302,434,280
Recent average credit (RAC): 52,731,529
Average floating point operations per second: 527,315.3 GigaFLOPS / 527.315 TeraFLOPS

• Users volunteer spare processor / graphics card time to the project
• Massively parallel
• Graphics processor technology has created a large increase in processing power
• Milkyway@home is now the #2 ranked BOINC project
• You can help, too: http://milkyway.cs.rpi.edu/milkyway/


Separation: Stripe 82

[Plot panels: Sgr Stream Stars; Non-Sgr Stream Stars]


Conclusions:

• Modern astronomy produces large data sets

• The Maximum Likelihood method is ideal for analyzing this data

• Powerful computer algorithms exist to perform MLE

• Mapping the Sagittarius Stream is possible by using these methods


Credits

• The Sloan Digital Sky Survey
• BOINC.com
• Milkyway@home
• Prof. Heidi Newberg, Rensselaer Polytechnic Institute
• Nathan Cole, “Maximum Likelihood Fitting of Tidal Streams with Applications to the Sagittarius Dwarf Tidal Tails” (PhD Thesis, Rensselaer Polytechnic Institute, 2008)
• Travis Desell, “Aysnchronous [sic] Global Optimization for Massively Distributed Computing” (PhD candidacy document, 2009)
• Shakespeare, et al. “Hamlet”


3 stream search: