Page 1: CDCOL · 2017. 9. 26. · CDCOL A GEOSCIENCE DATA CUBE THAT MEETS COLOMBIAN NEEDS Christian Ariza-Porras, Germán Bravo, Mario Villamizar , Andrés Moreno, Harold Castro, Gustavo

CDCOL A GEOSCIENCE DATA CUBE THAT MEETS COLOMBIAN NEEDS

Christian Ariza-Porras, Germán Bravo, Mario Villamizar, Andrés Moreno, Harold Castro, Gustavo Galindo, Edersson Cabera, Saralux Valbuena, and Pilar Lozano

Page 2:

Dimensions: Latitude, Longitude

Page 3:

Dimensions: Time

[Figure: data cube with spatial axes x and y and a time axis t, with time steps from 2010 to 2016]

Page 4:

Dimensions: Spectral bands
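
The three "Dimensions" slides describe a single cube indexed by position, time, and spectral band. A minimal NumPy sketch of that idea follows. The axis order (time, band, y, x), the grid size, and the band names are assumptions made for illustration; the slides name only the dimensions themselves.

```python
import numpy as np

# Hypothetical axis labels: only the dimension names come from the slides.
years = list(range(2010, 2017))      # time axis, 2010..2016
bands = ["red", "nir", "swir1"]      # assumed band names
ny, nx = 4, 5                        # tiny latitude/longitude grid for illustration

# Random values stand in for reflectance data.
rng = np.random.default_rng(0)
cube = rng.random((len(years), len(bands), ny, nx))

# Fix a pixel and a band: a time series along the t axis.
nir_series = cube[:, bands.index("nir"), 2, 3]

# Fix a year and a band: a full spatial image over y and x.
red_2012 = cube[years.index(2012), bands.index("red")]
```

Slicing along a different axis answers a different question (how a pixel changes over time, or what a region looks like in one year) without reloading any data.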

Page 5:

Variety

Source
• Landsat
• MODIS
• IDEAM products
• Sentinel

Resolution
• Temporal
• Spatial
• Spectral

Page 6:

Problem

Analysts' time

Effort replication

Processing

Variety of sources and tools

Replicability

Processing power and storage

Developers are a scarce resource

Results can be reused only if they can be trusted

Page 7:

Traditional remote sensing product generation process

Source: Held, A. 2015. PowerPoint presentation, First Workshop Data Cube Colombia.

Page 8:

New Vision: Analysis-ready data

Analysis-ready data goes to the majority of end users, saving up to 80% of collective effort and costs.

Source: Held, A. 2015. PowerPoint presentation, First Workshop Data Cube Colombia.

Page 9:

CDCol Goals

• Data ownership
• Extensibility
• Lineage
• Replicability
• Standardization
• Reusability
• Complexity abstraction
• Ease of use
• Parallelization

Page 10:

Related Works

Page 11:

Background

This work

Page 12:

Solution Strategy

• Roles
• Bank of algorithms and results
• Web UI
• Parallelization strategy
• Bulk ingestion
• Training workshops

Page 13:

CDCol User Roles

System Administrator

Data Administrator

Developer

Analyst

Page 14:

Algorithms Life Cycle

Page 15:

Development

Complexity Abstraction

• Independent of datacube-core

• Automatic parallelization

• Well-known Python libraries

◦ NumPy

◦ xarray
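
As an illustration of the complexity-abstraction idea above, the sketch below shows an algorithm written purely against arrays, with no data cube API in sight. The function name and signature are hypothetical and are not taken from CDCol. It assumes the platform hands the algorithm its bands as plain arrays.

```python
import numpy as np

def ndvi(red, nir):
    """Normalized difference vegetation index, (nir - red) / (nir + red).

    Pixels where nir + red is zero are returned as NaN.
    """
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Toy 2x2 bands; in practice the arrays would come from the data cube.
result = ndvi([[0.1, 0.2], [0.0, 0.3]],
              [[0.5, 0.2], [0.0, 0.6]])
```

Because the function depends only on NumPy, it can be developed and tested on a laptop with synthetic arrays before being run against real data.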

Page 16:

Execution

Page 17:

CDCol Web UI

Empowers users to work on a large set of satellite images from any device

Reduces the learning curve

Authentication and role management

Page 18:

CDCol Demo

Page 19:

Parallelization Strategy

Automatic

By Tile

Generic Task

Celery
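
The slide names the ingredients (automatic, by tile, a generic task, Celery) without showing how they fit together. The sketch below is one possible reading, not CDCol's implementation. It uses the standard library's `ThreadPoolExecutor` as a stand-in for a Celery worker pool, and the 1x1 degree tile size, the function names, and the `count_pixels` placeholder algorithm are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_tiles(lat_min, lat_max, lon_min, lon_max):
    """List the assumed 1x1 degree tiles (lower-left corners) covering a bounding box."""
    return [(lat, lon)
            for lat in range(lat_min, lat_max)
            for lon in range(lon_min, lon_max)]

def generic_task(algorithm, tile, params):
    """Run any algorithm on one tile; the same wrapper serves every algorithm."""
    return tile, algorithm(tile, **params)

def run_by_tile(algorithm, bbox, params, workers=4):
    """Fan the generic task out over all tiles and collect results keyed by tile."""
    tiles = split_into_tiles(*bbox)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(generic_task, algorithm, t, params) for t in tiles]
        return dict(f.result() for f in futures)

def count_pixels(tile, pixels_per_degree):
    """Placeholder algorithm: number of pixels in one tile."""
    return pixels_per_degree ** 2

# Bounding box as (lat_min, lat_max, lon_min, lon_max) in whole degrees.
results = run_by_tile(count_pixels, (4, 6, -75, -73), {"pixels_per_degree": 10})
```

The algorithm author writes only `count_pixels`. Splitting the area and dispatching the work happen in the wrapper, which is what makes the parallelization automatic from the developer's point of view.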

Page 20:

Bulk Ingestion

Initial ingestion

15,854 scenes

Landsat 5, 7, and 8 (T1 Surface Reflectance products from USGS)

15 years

Page 21:

Training Workshops

Training and diffusion workshops are essential to the success of the data cube.

Developers

• Python fundamentals

• Multidimensional array manipulation in Python

Analysts

• Datacube workflow

Page 22:

CDCol Components

Page 23:

CDCol Components

OpenDatacube/datacube-core

Page 24:

CDCol Components

Page 25:

Results

Bank of algorithms

• Algorithms

◦ Temporal median composites

◦ NDVI

◦ Forest/non-forest classification

◦ Change detection using PCA

◦ WOfS (adapted)

Workshop participants developed their own algorithms

Repeatable results

A set of tools available to analysts

Time reduction (a task that used to take 72 hours can now be done in 12 hours, a sixfold speedup)
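
To make the first entry in the bank of algorithms concrete, here is a minimal sketch of a temporal median composite. It assumes the time axis comes first and that invalid pixels (for example, clouds) are already marked as NaN. The slides do not describe how CDCol masks pixels, so that convention is an assumption.

```python
import numpy as np

def temporal_median(stack):
    """Per-pixel median over the time axis (axis 0), ignoring NaN observations."""
    return np.nanmedian(np.asarray(stack, dtype=float), axis=0)

# Three acquisitions of a 1x2 image; NaN marks an invalid observation.
stack = np.array([
    [[0.2, np.nan]],
    [[0.4, 0.5]],
    [[0.9, 0.7]],
])
composite = temporal_median(stack)
```

The first pixel takes the middle of its three values, and the second takes the median of its two valid observations, so a single cloudy date does not leave a hole in the composite.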

Page 26:

Results

• 15 years of data (2000-2015)

• 30 m pixel resolution

• 342 scenes, Landsat 7/8

• Processing time: 2 h

Page 27:

Results

• 15 years of data (2000-2015)

• 30 m pixel resolution

• 466 images, Landsat 7/8

• Processing time: 2 min per year

• Map legend: Forest, Other land covers

Page 28:

Results

• 15 years of data (2000-2015)

• 30 m pixel resolution

• 45 images, Landsat 7

• Processing time: 20 min per period

Page 29:

Conclusions

Data ownership

• 15 years of curated images from different sources

Extensibility

• Developers can implement new algorithms with a low learning curve.

• Data administrators can add new images to the collection and create new data types to support new sources.

Lineage and Replicability

• Results are replicable because execution parameters and algorithm versions are logged.

Complexity abstraction

• Algorithms are independent of the data cube core API. Developers work only with multidimensional arrays, using well-established Python packages.

Ease of use

• Easy-to-use web user interface.

Parallelism

• Automatic parallelism by tile.
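
The lineage and replicability point above rests on recording what was run and with which inputs. The sketch below shows one way such a record could look. The class, its fields, and the hashing scheme are hypothetical and are not drawn from CDCol's code.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ExecutionRecord:
    """Hypothetical log entry: which algorithm version ran with which parameters."""
    algorithm: str
    version: str
    parameters: dict

    def fingerprint(self):
        # Sorting keys makes the hash independent of parameter order.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

a = ExecutionRecord("ndvi", "1.0", {"year": 2015, "tile": [4, -75]})
b = ExecutionRecord("ndvi", "1.0", {"tile": [4, -75], "year": 2015})
c = ExecutionRecord("ndvi", "1.1", {"year": 2015, "tile": [4, -75]})
```

Records `a` and `b` share a fingerprint because they describe the same execution, while `c` differs because the algorithm version changed. That distinction is what lets a stored result be traced back to, and regenerated from, its exact inputs.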

Page 30:

Future Work

Horizontal scaling

Algorithm-dependent parallelization schemes

Workflow management

New sensors

New algorithms

Training

Cloud-enabled CDCol

Page 31:

Acknowledgements

We thank Brian Killough from NASA, and Alfredo Delos Santos and Kayla Fox from the AMA team, for their support and fruitful discussions. We also thank the CEOS Australia group for its work and for sharing it with the world, and the Environmental Ministry for its financial support.

CDCol uses the NetCDF format (UCAR/Unidata) to store ingested data and results (http://doi.org/10.5065/D6H70CW6).