A Cyber-Physical OS for enabling Spatio-Temporal Coordination at Geo-distributed Scale Sandeep D’souza and Raj Rajkumar Carnegie Mellon University 1st Workshop on Next-Generation OS for Cyber-Physical Systems 2019

Sep 10, 2020

Transcript
Page 1

A Cyber-Physical OS for enabling Spatio-Temporal Coordination at Geo-distributed Scale

Sandeep D’souza and Raj Rajkumar Carnegie Mellon University

1st Workshop on Next-Generation OS for Cyber-Physical Systems 2019

Page 2

Cyber-Physical OS

● An OS which aids:
  ○ the development, deployment, scheduling and management
  ○ of cyber-physical applications over distributed infrastructure

Layered OS with user-space to kernel-space components

[Figure: layered OS diagram with the labels CPS Applications; Distributed Infrastructure (Sensors, Actuators, Compute, Network, Storage); System Services; Kernel Modules; Middleware; Device Drivers; Hypervisor]

Page 3

Coordination in Space and Time*

Accurate Time and Location key for Spatio-Temporal Coordination

*D’souza et al., HotCloud ’17

Page 4

Outline

● Motivation
● The Importance of Time & Localization
● Exposing Time & Location as First-Class Entities
● Realizing a Spatio-Temporal Cyber-Physical OS

Page 5

Part I: The Importance of Time & Localization

Page 6

A Shared Notion of Time

● Ordering of Events
● Coordinated Actions

A Shared Notion of Time is useful → Replace Communication with Local Computation*

*Liskov, Distributed Computing ’93
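A lease is one common example of replacing communication with local computation: once clocks are synchronized to within a known bound, a holder can decide locally whether its lease is still valid instead of asking the grantor. The sketch below is only an illustration under that assumption; `Lease`, `is_valid`, and `max_clock_error` are hypothetical names that do not appear in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    expiry: float           # expiry instant on the shared time base, in seconds
    max_clock_error: float  # assumed bound on the local clock's error, in seconds

    def is_valid(self, local_now: float) -> bool:
        # Conservative local check: give up the lease early by the clock-error
        # bound, so it is never used after the grantor considers it expired.
        return local_now + self.max_clock_error < self.expiry


lease = Lease(expiry=100.0, max_clock_error=0.5)
print(lease.is_valid(99.0))  # True
print(lease.is_valid(99.6))  # False
```

No message is exchanged at the time of the check; the cost is that the holder gives up `max_clock_error` seconds of usable lease time.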

Page 7

Accurate Localization

● Location-based sensing and actuation
● Low-latency proximal computation
● Safe Interaction between physical endpoints

A Shared Localization Frame-of-Reference is useful → Safe+Efficient Coordination at the Correct Location

Page 8

City-Scale Connected Vehicles

[Figure labels: Timestamped+Geotagged Sensor Data; Timestamped+Geotagged Actuation Commands]

Required: A Cyber-Physical OS with Time & Location as First-Class Entities

*D’souza et al., HotCloud ’17

Page 9

Coordinated Vehicles using TimeNet

● TimeNet: Cyber-Physical Internet
  ○ perfect time+geo stamping
● Dynamic Traffic Management
  ○ city-scale vehicular coordination
    ■ Timestamps+Location → event ordering
    ■ event ordering → coordination policy

Uncertain time and location estimates can violate safety constraints
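To illustrate why uncertain timestamps matter for event ordering, the sketch below sorts vehicle events by their timestamp estimates and flags pairs whose uncertainty intervals overlap, since their true order cannot be decided. It is a simplified illustration, not the TimeNet mechanism: the `Event` type, the symmetric uncertainty `eps`, and the sample values are assumptions, and location is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    vehicle: str
    t: float    # timestamp estimate, in seconds
    eps: float  # assumed symmetric timestamp uncertainty, in seconds


def order_events(events: List[Event]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Sort by timestamp estimate and report pairs whose order is ambiguous."""
    ordered = sorted(events, key=lambda e: e.t)
    ambiguous = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Overlapping intervals: the true order of a and b is undecidable.
            if a.t + a.eps >= b.t - b.eps:
                ambiguous.append((a.vehicle, b.vehicle))
    return [e.vehicle for e in ordered], ambiguous


events = [Event("B", 10.30, 0.05), Event("A", 10.00, 0.05), Event("C", 10.35, 0.05)]
order, ambiguous = order_events(events)
print(order)      # ['A', 'B', 'C']
print(ambiguous)  # [('B', 'C')]
```

A coordination policy built on this ordering would need a rule for the ambiguous pairs, for example treating them as simultaneous.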

Page 10

Part II: Exposing Time & Location as First-Class Entities

Page 11

Time & Location as First-Class Entities

● Access to OS-supported Time+Location abstractions
  ○ read the current time & location
  ○ observe and schedule events + computation
● Specify & Observe Timing+Localization Uncertainty
  ○ specifying safety-dictated uncertainty tolerances
    ■ allows the system to autonomously meet them
  ○ observing the delivered uncertainty
    ■ allows applications to adapt during failures

Exposing uncertainty metrics to applications → enables an autonomous system + adaptive applications

Page 12

Quality of Time (QoT)*

● Quantified
  ○ using clock parameters:
    ■ accuracy, precision, drift, ...
  ○ w.r.t. a reference clock (time)
● Each timestamp has bounds
  ○ Timestamp ∈ [t − ε_l, t + ε_h]

The end-to-end uncertainty in the notion of time delivered to an application by the system

*Anwar et al., RTSS ’16
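The timestamp bounds above can be sketched as a small value type. This is a minimal illustration, not the QoT stack's data structure: `UncertainTimestamp`, its field names, and the `meets` check are hypothetical, and nanosecond integers are assumed as the unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainTimestamp:
    t: int      # timestamp estimate, in nanoseconds
    eps_l: int  # uncertainty below the estimate, in nanoseconds
    eps_h: int  # uncertainty above the estimate, in nanoseconds

    @property
    def lower(self) -> int:
        return self.t - self.eps_l

    @property
    def upper(self) -> int:
        return self.t + self.eps_h

    def meets(self, tol_l: int, tol_h: int) -> bool:
        # The delivered bounds must be no wider than the tolerated ones.
        return self.eps_l <= tol_l and self.eps_h <= tol_h


ts = UncertainTimestamp(t=1_000_000, eps_l=40, eps_h=60)
print(ts.lower, ts.upper)  # 999960 1000060
print(ts.meets(50, 100))   # True
print(ts.meets(50, 50))    # False
```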

Page 13

Quality of Location (QoL)

● Quantified
  ○ using localization parameters:
    ■ accuracy, precision, drift, ...
  ○ w.r.t. a reference frame
● Each location estimate has bounds
  ○ Location ∈ {L ± Ε_h}
  ○ L is the location vector, and Ε the uncertainty vector

The uncertainty radius in a location estimate, with respect to a reference
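One way to use a location vector with a per-axis uncertainty vector is to compute the worst-case (smallest possible) distance between two endpoints. The sketch below treats each estimate as an axis-aligned box L ± Ε; that box interpretation, the function name `worst_case_distance`, and the sample coordinates are assumptions made for illustration.

```python
import math
from typing import Sequence


def worst_case_distance(l1: Sequence[float], e1: Sequence[float],
                        l2: Sequence[float], e2: Sequence[float]) -> float:
    """Smallest possible distance between two locations known only as L ± E."""
    # Per axis, the gap shrinks by both uncertainties and cannot go below zero.
    gaps = [max(0.0, abs(a - b) - (ea + eb))
            for a, ea, b, eb in zip(l1, e1, l2, e2)]
    return math.sqrt(sum(g * g for g in gaps))


# Two vehicles 10 m apart along x, with 1 m and 2 m of uncertainty on that axis.
print(worst_case_distance((0.0, 0.0), (1.0, 1.0), (10.0, 0.0), (2.0, 1.0)))  # 7.0
```

A result of 0.0 means the two uncertainty regions overlap, so a safe separation cannot be guaranteed from the estimates alone.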

Page 14

QoT & QoL-based Connected Vehicles

● QoT & QoL Requirements based on
  ○ safety requirements
  ○ coordination policy
● If uncertainty exceeds tolerable limit
  ○ coordination policy can adapt
  ○ Graceful Degradation:
    ■ Increase vehicular spacing
  ○ Safe Halt:
    ■ Instruct vehicles to stop

Synchronized Clocks + Localization → Scalable Spatio-Temporal Coordination
Quality of Time + Quality of Location → Fault Tolerance
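The adaptation steps above (normal operation, graceful degradation, safe halt) can be sketched as a mode selector driven by the delivered uncertainty. The thresholds are not from the talk: the `halt_factor` rule, the `Mode` names, and the sample numbers are hypothetical choices for illustration.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal spacing"
    DEGRADED = "increase vehicular spacing"
    HALT = "instruct vehicles to stop"


def select_mode(qot: float, qol: float, qot_tol: float, qol_tol: float,
                halt_factor: float = 2.0) -> Mode:
    """Pick a coordination mode from delivered vs. tolerated uncertainty.

    Assumed rule: degrade once either uncertainty exceeds its tolerance,
    halt once it exceeds halt_factor times the tolerance.
    """
    worst = max(qot / qot_tol, qol / qol_tol)
    if worst <= 1.0:
        return Mode.NORMAL
    if worst <= halt_factor:
        return Mode.DEGRADED
    return Mode.HALT


# Tolerances: 100 us of time uncertainty, 0.5 m of location uncertainty.
print(select_mode(50.0, 0.2, 100.0, 0.5))   # Mode.NORMAL
print(select_mode(150.0, 0.2, 100.0, 0.5))  # Mode.DEGRADED
print(select_mode(50.0, 2.0, 100.0, 0.5))   # Mode.HALT
```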

Page 15

The Relation Between QoT & QoL

● QoT & QoL Requirements
  ○ influenced by safety requirements
  ○ can be interdependent
● Fleet of connected autonomous vehicles
  ○ velocity & inter-vehicular spacing constraints
    ■ decided by safety requirements
  ○ decision-making system
    ■ higher localization accuracy (lower uncertainty)
    ■ allows less-accurate timestamps (higher uncertainty)

Relation between clock synchronization and localization → Quality of Time & Quality of Location requirements are interdependent
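A simple way to see the interdependence is a one-dimensional error budget: a timestamp error of ε_t on a vehicle moving at speed v translates into roughly v·ε_t of position error, which adds to the localization uncertainty. The model below is an assumption for illustration only, not the talk's formulation; `max_time_uncertainty` and the numbers are hypothetical.

```python
def max_time_uncertainty(margin_m: float, loc_uncertainty_m: float,
                         speed_mps: float) -> float:
    """Largest tolerable timestamp uncertainty (seconds) under the assumed
    budget: loc_uncertainty_m + speed_mps * time_uncertainty <= margin_m."""
    if speed_mps <= 0:
        raise ValueError("speed must be positive")
    return max(0.0, (margin_m - loc_uncertainty_m) / speed_mps)


# Same 1 m safety margin at 20 m/s, with two localization qualities.
coarse = max_time_uncertainty(margin_m=1.0, loc_uncertainty_m=0.5, speed_mps=20.0)
fine = max_time_uncertainty(margin_m=1.0, loc_uncertainty_m=0.2, speed_mps=20.0)
print(coarse, fine)   # roughly 0.025 s and 0.04 s
print(fine > coarse)  # True: better localization tolerates worse timestamps
```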

Page 16

Part III: Realizing a Spatio-Temporal Cyber-Physical OS

Page 17

Coordination in CPS - Challenges

● Scalability
  ○ Both numerical and geographical
● Autonomy & Fault Tolerance
  ○ adapt to application requirements
  ○ graceful degradation during adversity
● Ease of Programmability
  ○ coordination framework with APIs
● Ease of Deployment & Management
  ○ complex applications on heterogeneous infrastructure
● Security & Privacy
  ○ protect users and infrastructure

Need for a Cyber-Physical OS which meets all these requirements

Page 18

Enabling Spatio-Temporal Coordination

● Autonomous Time and Location Services
  ○ scale across numerous distributed nodes
  ○ adapt to application demands and faults

Expose time and localization as adaptive & autonomous services to cyber-physical applications

Page 19

Case Study: Time-as-a-Service (TaaS)

● Leverage mature open-source
  ○ protocols: NTP, PTP, ...
  ○ technologies: GPS, hardware timestamping
● Adapts to Application QoT Demands
  ○ tunable clock synchronization
  ○ probabilistic QoT-estimation mechanisms
● Autonomous & Fault-Tolerant
  ○ adapts to clock-sync failures
  ○ notifies apps if QoT degrades beyond spec

The same ideas can be extended to Localization-as-a-Service (LaaS) → adaptive layer leveraging mature localization technologies
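The "notifies apps if QoT degrades beyond spec" behaviour can be sketched as a monitor that compares each delivered QoT estimate against the application's requirement and invokes a callback only when the status changes. This is an illustrative sketch, not the TaaS implementation: `QoTMonitor`, `report`, and the callback signature are hypothetical.

```python
from typing import Callable, List, Tuple


class QoTMonitor:
    """Notify an application when delivered QoT crosses its required bound."""

    def __init__(self, required_qot_ns: int,
                 on_change: Callable[[bool, int], None]) -> None:
        self.required_qot_ns = required_qot_ns
        self.on_change = on_change
        self.healthy = True  # assume the requirement is met until told otherwise

    def report(self, delivered_qot_ns: int) -> None:
        healthy = delivered_qot_ns <= self.required_qot_ns
        if healthy != self.healthy:
            # Only transitions are reported, not every sample.
            self.healthy = healthy
            self.on_change(healthy, delivered_qot_ns)


log: List[Tuple[bool, int]] = []
monitor = QoTMonitor(1000, lambda ok, qot: log.append((ok, qot)))
for sample in (400, 900, 1500, 1800, 700):
    monitor.report(sample)
print(log)  # [(False, 1500), (True, 700)]
```

The same structure would apply to a location service by swapping the QoT samples for QoL samples.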

Page 20

Design 1: LAN-scale QoT Stack for Linux*

Support for ARM and x86 platforms + QEMU-KVM Virtual Machines^
open source, modular implementation, no change to the Linux kernel

*Anwar et al., RTSS ’16
^D’souza et al., RTAS ’18

Page 21

Design 2: Geo-Scale Quartz TaaS*

Micro-service architecture for providing Time-as-a-Service over the wide area → scales across embedded endpoints, to the edge and the cloud

*Developed in collaboration with Nutanix Inc.

Page 22

User-space vs Kernel-Space

● Implement general functionality as user-space micro-services
  ○ good for scalability, portability and security
  ○ easy to develop, deploy and manage
  ○ performance and accuracy can suffer
● Rely on kernel/driver support for high accuracy
  ○ great for utilizing specialized hardware
  ○ loss of portability, kernel-version/OS dependencies
  ○ security risks

Need a balance between placing functionality in user-space and kernel space → balance performance with scalability and ease-of-use

Page 23

Deploying Applications @ Scale

● CPS: Physical Access is Challenging
  ○ nodes in the real-world
● Heterogeneous Infrastructure
  ○ binary compatibility and dependencies
● Software Management Layer
  ○ Application+Services Lifecycle Management
  ○ Multi-tenant resource allocation
  ○ Infrastructure Management

Virtualization technologies like containerization in conjunction with orchestration technologies like Kubernetes can help in app deployment

Page 24

Ease of Development: API

● Timeline*: Virtual reference time base
● Coordinated actions on distributed components
  ○ all components bind to a common timeline
  ○ each specifying its required QoT
● Timeline-based API
  ○ specify uncertainty tolerances
  ○ read the current time with associated uncertainty
  ○ schedule events in space and time

[Figure labels: Timeline; 100 us, 100 us, 100 us; 10 ms]

Timelines can be extended to provide a common location reference to all the bound components, each specifying their desired QoL

*Anwar et al., RTSS ’16
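The timeline idea (components bind to a common time base, each states its required QoT, and reads return a time together with its uncertainty) can be sketched as follows. This is not the API of the QoT stack: the `Timeline` class, its method names, the injected clock, and the component names are all hypothetical, and scheduling in space and time is left out.

```python
import time
from typing import Callable, Dict, Tuple


class Timeline:
    """Hypothetical timeline: a shared time base that components bind to."""

    def __init__(self, name: str,
                 clock_ns: Callable[[], int] = time.monotonic_ns,
                 delivered_qot_ns: int = 0) -> None:
        self.name = name
        self._clock_ns = clock_ns
        self.delivered_qot_ns = delivered_qot_ns  # uncertainty currently achieved
        self._bindings: Dict[str, int] = {}

    def bind(self, component: str, required_qot_ns: int) -> None:
        # Each component states the uncertainty it can tolerate.
        self._bindings[component] = required_qot_ns

    def strictest_requirement_ns(self) -> int:
        # The service would need to synchronize at least this well.
        return min(self._bindings.values())

    def now(self) -> Tuple[int, int]:
        # Current time plus the uncertainty attached to it.
        return self._clock_ns(), self.delivered_qot_ns

    def satisfied(self) -> Dict[str, bool]:
        return {name: self.delivered_qot_ns <= req
                for name, req in self._bindings.items()}


# A fixed clock keeps the example deterministic.
tl = Timeline("demo", clock_ns=lambda: 5_000_000_000, delivered_qot_ns=250_000)
tl.bind("camera", 100_000)      # tolerates 100 us
tl.bind("logger", 10_000_000)   # tolerates 10 ms
print(tl.now())                       # (5000000000, 250000)
print(tl.strictest_requirement_ns())  # 100000
print(tl.satisfied())                 # {'camera': False, 'logger': True}
```

Extending the sketch to location would mean adding a required QoL per binding and returning a location estimate with its uncertainty alongside the time.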

Page 25

Summary

● Geo-Distributed Coordination in CPS
  ○ Time and Location are key
● A Cyber-Physical OS for Spatio-Temporal Coordination should
  ○ expose time and localization as a service to applications
  ○ expose uncertainty metrics to applications
● Key Objectives:
  ○ Scalability, Autonomy, Ease of Use, Fault-Tolerance, Security & Privacy
● Implementation Challenges
  ○ use of virtualization and orchestration technologies
    ■ overheads, low-level device access (sensors/actuators)
  ○ balance between kernel-level and user-space functionality
    ■ performance vs portability & scalability trade-offs
    ■ security risks due to kernel-level access