Libra: A Library OS for a JVM

Glenn Ammons, Jonathan Appavoo, Maria Butrico, Dilma Da Silva, David Grove, Kiyokuni Kawachiya, Orran Krieger, Bryan Rosenburg, Eric Van Hensbergen, Robert Wisniewski

T.J. Watson Research Center / Tokyo Research Lab

13 June 2007
13 June 2007 - VEE’07 Libra
IBM Research
Motivation: customized operating system support for …

Libra optimizations: file caching

The index and some raw data must be in memory
The Nutch query back-end relies on the OS buffer cache
Going to the control partition is expensive
Solution: cache files locally in Libra
Average lseek() & read() cost for the back-end:
– J9/Linux: 2.25 usec
– J9/Libra: 0.9 usec
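The caching idea can be sketched as follows. This is a minimal Python illustration, not Libra's actual C implementation; `remote_read` is a hypothetical stand-in for a fetch through the control partition:

```python
import io

class CachedFile:
    """Illustrative local file cache: fetch a file's contents once from the
    (expensive) control partition, then serve lseek()/read() from memory."""

    def __init__(self, path, remote_read):
        # remote_read(path) stands in for a read forwarded to the control partition
        self._buf = io.BytesIO(remote_read(path))

    def lseek(self, offset, whence=io.SEEK_SET):
        return self._buf.seek(offset, whence)

    def read(self, n):
        return self._buf.read(n)

# Example: one remote fetch, then purely local random access
fetches = []
def remote_read(path):
    fetches.append(path)
    return b"0123456789" * 100  # pretend this is index data

f = CachedFile("/index/segment0", remote_read)
f.lseek(10)
assert f.read(5) == b"01234"
f.lseek(0)
f.read(3)
assert fetches == ["/index/segment0"]  # control partition touched only once
```

Once the file is staged locally, every subsequent lseek()/read() pair is a memory operation, which is the effect behind the 2.25 usec vs. 0.9 usec numbers above.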
Libra optimizations: socket streaming

The Nutch query back-end is a streaming application
– Requests are buffered in the control partition
– Fetching them on demand adds latency
– Sending results ties up worker threads
Solution: stage socket data into/out of the Libra partition
– New requests are always available locally
– Results are sent asynchronously in batches
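The batching half of this optimization can be sketched as below; `transmit` is a hypothetical stand-in for the actual cross-partition socket send, and the batch size is arbitrary:

```python
class BatchingSender:
    """Illustrative result batching: instead of one cross-partition send per
    result (tying up a worker thread), results accumulate locally and are
    flushed in batches."""

    def __init__(self, transmit, batch_size=4):
        self.transmit = transmit
        self.batch_size = batch_size
        self.pending = []

    def send(self, result):            # called by worker threads; returns at once
        self.pending.append(result)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):                   # one cross-partition transfer per batch
        if self.pending:
            self.transmit(list(self.pending))
            self.pending.clear()

batches = []
s = BatchingSender(batches.append, batch_size=4)
for r in range(10):
    s.send(r)
s.flush()
assert batches == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]  # 3 transfers, not 10
```

The worker thread only appends to a local list, so it is free to pick up the next request immediately rather than blocking on the control partition.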
Nutch/Lucene Query performance

Single back-end server, 10 GB document set

Configuration                       Queries/second   Speedup
Default                             5.9              1.0×
File caching                        12.8             2.2×
File caching & socket streaming     16.0             2.7×
Performance evaluation

Platform:
– JS21 (PowerPC) blade in an IBM BladeCenter
– XenPPC hypervisor
– Partitions with 1 core and 1920 MB of memory

Ongoing and future work:
– Supporting the network directly in Libra
– Pursuing JVM optimizations:
  – “safe points” to support type-accurate garbage collection
  – Real-Time Java support
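The safe-point idea can be illustrated with a toy cooperative protocol; this is a hypothetical sketch of the concept, not J9's actual mechanism. Each thread polls a flag at points where its stack is type-accurate and parks there until the collector finishes:

```python
import threading

# Toy safe-point polling: the worker checks a flag at well-defined points where
# its state is scannable; the "collector" waits until the worker has parked.
gc_requested = threading.Event()
at_safepoint = threading.Barrier(2)   # one worker + the collector

def worker(n):
    total = 0
    for i in range(n):
        total += i
        if gc_requested.is_set():     # safe-point poll: a cheap flag check
            at_safepoint.wait()       # park with a known, type-accurate stack
            at_safepoint.wait()       # resume when the collector releases us
    return total

results = []
gc_requested.set()                    # request a collection before the worker runs
t = threading.Thread(target=lambda: results.append(worker(100000)))
t.start()
at_safepoint.wait()                   # collector: worker is parked; "scan" here
gc_requested.clear()
at_safepoint.wait()                   # release the worker
t.join()
assert results == [sum(range(100000))]
```

The point of the pattern is that the collector never observes a thread at an arbitrary instruction, only at poll sites where every stack slot's type is known.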
Why is it named Libra?
We chose the name “Libra” because our goal, to provide well-balanced services, aligns with the imagery associated with the constellation of the same name.
[Diagram: Nutch running on a JVM atop Libra, with only the chosen components (cache, stream I/O, stat) included, shown beside the constellation Libra.]
Only the necessary components can be chosen.
Stop!

The talk ended with the previous slide; everything that follows is backup material.
One-slide summary

Specializing the OS for an application is attractive
Prior attempts had trouble supporting legacy code
Our current approach leverages:
– A hypervisor
– A library OS that handles the target optimizations; a side-car partition handles the rest
– The 9P protocol for distributed resources
We did it: we run JVMs in a user domain partition without an OS
We explored performance-optimization opportunities
We profit from simplified management
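For context on the 9P piece of this design: 9P is a simple message-based file protocol from Plan 9, and every session starts with a version negotiation. The sketch below builds a standard 9P2000 Tversion message from its published wire format (little-endian size[4] type[1] tag[2] msize[4] version[s], where a string is a 2-byte length plus bytes); the msize value here is arbitrary:

```python
import struct

# 9P2000 Tversion, the first message a 9P client sends to a server.
TVERSION, NOTAG = 100, 0xFFFF

def tversion(msize=8192, version=b"9P2000"):
    body = struct.pack("<BHI", TVERSION, NOTAG, msize)   # type, tag, msize
    body += struct.pack("<H", len(version)) + version    # version[s]
    return struct.pack("<I", 4 + len(body)) + body       # size[4] prefix

msg = tversion()
assert len(msg) == 4 + 1 + 2 + 4 + 2 + 6 == 19
assert struct.unpack_from("<I", msg)[0] == 19  # size field covers whole message
assert msg[4] == 100                           # Tversion message type
```

Because every resource access reduces to messages like this, the library OS can forward file and device operations to whichever partition actually serves them.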
System architecture (cont.)

Using a hypervisor instead of an exokernel:
– The app can use privileged execution mode and privileged instructions
– A pared-down Linux can run as the libOS
Apps can be structured like OSes or like microkernels
Criticism of the virtualization approach:
– It focuses on the parts, with no general view
  • In Libra, the control partition provides the general view
J9/Libra implementation

J9’s portability layer covers threading and synchronization
Incremental approach:
– Started with the smallest J9: no JIT, no optional VM features, CLDC J2ME class libraries
– Started with dummy implementations of the port and thread libraries
– Added features as required by the workload
– Added features back into J9
Debugging
Currently:
• Running the full JIT and the largest set of IBM class libraries for this J9 version
• 50% of the remaining port-library “dummies” concern signal handling, shared memory regions, and sockets/networking
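The incremental bring-up strategy can be sketched as below. All names here are hypothetical illustrations, not J9's real port-library API: the port layer starts as a table of dummies that fail loudly, and entries are replaced one by one as the workload demands them.

```python
# Toy model of a port layer populated with failing "dummies".
def dummy(name):
    def stub(*args, **kwargs):
        raise NotImplementedError(f"port function {name!r} not yet implemented")
    return stub

port = {name: dummy(name) for name in
        ["file_open", "file_read", "sock_connect", "sig_install", "shmem_map"]}

# The workload needed file I/O first (class loading), so those dummies
# were replaced with real implementations:
port["file_open"] = lambda path: f"handle:{path}"
port["file_read"] = lambda handle, n: b"\x00" * n

assert port["file_open"]("/classes/Object.class") == "handle:/classes/Object.class"
assert len(port["file_read"]("h", 16)) == 16
try:
    port["sock_connect"]("gateway", 564)   # still a dummy, like signal/shmem
except NotImplementedError as e:
    assert "sock_connect" in str(e)
```

A loud failure pinpoints exactly which OS facility the workload needs next, which is what makes the feature-by-feature porting tractable.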
Libra internals

Memory management:
– Simple two-level management
– Heap allocated when requested
File system services:
– Needed for loading Java class files and providing I/O
– Mapped to the 9P protocol
Subset of the pthreads library:
– Designed for scalability, but currently supports only one processor per partition
– No preemptive time-slicing
Socket interface:
– Operations are forwarded to a gateway server
– From the outside, the app appears to be running on the system’s network server
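The non-preemptive threading model can be illustrated with a toy cooperative scheduler; this is a conceptual sketch, not Libra's pthreads code. With one processor and no time-slicing, a thread runs until it voluntarily yields, here modeled with generators:

```python
from collections import deque

def run(threads):
    """Round-robin cooperative scheduler: each thread runs until it yields;
    nothing can preempt it in between."""
    ready, trace = deque(threads), []
    while ready:
        t = ready.popleft()
        try:
            trace.append(next(t))     # run until the next voluntary yield
            ready.append(t)           # back of the ready queue
        except StopIteration:
            pass                      # thread finished
    return trace

def worker(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"           # a voluntary yield point (e.g. blocking I/O)

trace = run([worker("a", 2), worker("b", 2)])
assert trace == ["a:0", "b:0", "a:1", "b:1"]
```

The upside is that no locking is needed against preemption within a partition; the downside, noted above, is no time-slicing, so a thread that never yields starves the rest.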
Nutch query

[Diagram: a driver feeds queries to a front-end, which fans out to three back-ends, each holding its own data segment; throughput is measured at the driver.]
1. Front-end receives a query (“foo AND bar”)
2. Front-end sends the query to each back-end
3. Each back-end searches its data segment and ranks its results
4. Back-ends send their partial results to the front-end
5. Front-end selects the top results overall
[Diagram: the front-end returns the combined result set R to the driver.]
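The five-step flow can be sketched as a scatter/gather in Python. The scoring here is a toy term-count ranking, not Lucene's actual relevance model, and the function names are illustrative:

```python
import heapq

def backend_search(segment, query, k=3):
    """One back-end: rank this segment's documents by how many query
    terms they contain (toy stand-in for Lucene scoring)."""
    terms = query.lower().replace(" and ", " ").split()
    scored = [(sum(t in doc.lower() for t in terms), doc) for doc in segment]
    return heapq.nlargest(k, scored)

def frontend(query, segments, k=3):
    """Front-end: scatter the query to every back-end, gather the partial
    results, and keep the overall top k."""
    partials = [backend_search(seg, query, k) for seg in segments]   # scatter
    return heapq.nlargest(k, [hit for p in partials for hit in p])   # gather

segments = [["foo bar baz", "foo only"], ["bar only"], ["foo bar", "nothing"]]
top = frontend("foo AND bar", segments)
assert [doc for score, doc in top] == ["foo bar baz", "foo bar", "foo only"]
```

Because each back-end only ever touches its own data segment, the set-up scales by adding back-end partitions, and the front-end's merge cost grows only with k times the number of back-ends.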
File block I/O performance

Performance of forwarding standard read/write calls: a 128 MB file read with varying buffer sizes
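A measurement of this shape can be sketched as below. This is an illustrative harness, not the paper's benchmark: it uses a 1 MB temporary file instead of 128 MB so it runs quickly, and `buffering=0` makes each read() a separate call into the I/O path:

```python
import os, tempfile, time

def read_all(path, bufsize):
    """Read a file to EOF with a fixed buffer size; return (calls, seconds)."""
    calls, start = 0, time.perf_counter()
    with open(path, "rb", buffering=0) as f:   # unbuffered: one call per read()
        while f.read(bufsize):
            calls += 1
    return calls, time.perf_counter() - start

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1 << 20))             # 1 MB test file

results = {b: read_all(tmp.name, b) for b in (512, 4096, 65536)}
os.unlink(tmp.name)

for bufsize, (calls, secs) in results.items():
    print(f"bufsize={bufsize:6d}  read() calls={calls:5d}  total={secs:.4f}s")
```

The per-call overhead dominates at small buffer sizes, which is exactly where forwarding each read() across partitions (versus serving it from a local cache) would show the largest difference.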