Landing in the Right Nest: New Negotiation Features for Enterprise Environments

Post on 06-Jan-2016


Jason Stowe

New Features for Negotiation

Experience in Enterprise Environments

What is an Enterprise Environment?

Any Organization Using Condor with

Demanding Users

Organization = Groups of Demanding Users

Purchased Computer Capacity

Guaranteed Minimum Capacity

Need As Many as Possible

As Soon as they submit

Vanilla/Java Universe

Avoid Preemption

How do we ensure Resources land in the right Group’s Nest?

A valid definition of Enterprise Condor Users?

I started off as a Demanding User

Follow up to earlier work

Condor for Movies: 75+ Million Jobs

1000+ CPUs (Linux/OSX), 70+ TB storage

(Project that added AccountingGroups)

Web-based Management Tools, Consulting, and 24/7 Support

A Conversation with Miron

Bob Nordlund’s idea for Condor += Hooks

Configuration with Pipes

CONDOR_CONFIG = cat /opt/condor/condor_config |

(Condor 6.8)
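
The pipe form above hands configuration to a program instead of a static file. As a minimal sketch, assuming Condor reads the command's standard output as ordinary configuration text, a generator script could look like the following. The script itself, the `render_config` helper, and the quota values are hypothetical and only illustrate the idea.

```python
import sys

# Hypothetical per-group quotas; a real site might load these from a database.
QUOTAS = {"A": 50, "B": 50}


def render_config(quotas):
    """Return configuration text with one GROUP_QUOTA_<name> line per group."""
    lines = [f"GROUP_QUOTA_{name} = {slots}" for name, slots in sorted(quotas.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Whatever is printed here would be consumed as configuration by the pipe.
    sys.stdout.write(render_config(QUOTAS))
```

In the style of the slide, the script would take the place of `cat` in the piped command; the exact path and invocation are site-specific.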

Demanding Condor Uses for Banks/Insurance Companies => This year, new features

Negotiation Policies to Manage Number of Resources

For Groups and Users

What are the Requirements?

- Guaranteed Minimum Quota
- Fast Claiming of Quota
- Avoid Unnecessary Preemption

Three Common Ways

“Fair share” User Priority

PREEMPTION_REQUIREMENTS

Machine RANK

AccountingGroups GROUP_QUOTA

Generally these are a progression

Story of a Pool

100 Machines

A = 100

Fair-Share, User Priority

It Works! More Users…

100 Machines

A = 50 B = 50

condor_userprio -setfactor A 2
condor_userprio -setfactor B 2

PREEMPTION_REQUIREMENTS = RemoteUserPrio > SubmittorPrio

Works Well in Most cases
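
The 50/50 split above follows from simple arithmetic. The sketch below is a toy model, not Condor's negotiator code: it assumes each user's share of the pool is inversely proportional to an effective priority, taken here as real priority times priority factor. The function name and the starting real priority of 0.5 are assumptions made for illustration.

```python
def fair_shares(pool_size, real_prio, factor):
    """Split pool_size among users, inversely proportional to effective priority."""
    # Assumed model: effective priority = real priority * priority factor.
    effective = {user: real_prio[user] * factor[user] for user in real_prio}
    # A numerically smaller priority is better, so weight each user by its inverse.
    weight = {user: 1.0 / prio for user, prio in effective.items()}
    total = sum(weight.values())
    return {user: pool_size * w / total for user, w in weight.items()}


# Both users have the same factor (2), so the 100 machines split evenly.
shares = fair_shares(100, {"A": 0.5, "B": 0.5}, {"A": 2, "B": 2})
print(shares)  # {'A': 50.0, 'B': 50.0}
```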

Suppose A has all 100 machines, and B submits 100 jobs

User Priorities Cached at Beginning of Negotiation

And not updated…

PREEMPTION_REQUIREMENTS = RemoteUserPrio > SubmittorPrio

Standard Universe = No Problem (Preemption doesn’t lose work)

Problem: Vanilla or Java Universe (Work is lost!)
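
Why cached priorities cause extra preemption can be shown with a toy model. This is a sketch under stated assumptions, not the negotiator's actual algorithm: priority is assumed to equal current machine usage, and the `negotiate` function and its `cached` flag are hypothetical. A holds all machines and B has one idle job per machine.

```python
def negotiate(pool_size, cached):
    """Return (preemptions, final usage) for one toy negotiation cycle."""
    usage = {"A": pool_size, "B": 0}
    snapshot = dict(usage)  # values frozen at the start of the cycle
    preempted = 0
    for _ in range(pool_size):  # B tries to match one job per machine
        view = snapshot if cached else usage
        # Stand-in for RemoteUserPrio > SubmittorPrio: a larger number is a
        # worse priority, and priority is assumed to equal current usage.
        if view["A"] > view["B"]:
            usage["A"] -= 1
            usage["B"] += 1
            preempted += 1
    return preempted, usage


print(negotiate(100, cached=True))   # (100, {'A': 0, 'B': 100})
print(negotiate(100, cached=False))  # (50, {'A': 50, 'B': 50})
```

With the frozen snapshot the expression stays true for every machine, so B takes all 100 and A loses twice as much work as a fair split requires.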

Dampen these with NEGOTIATOR_MAX_TIME_PER_SUBMITTER

NEGOTIATOR_MAX_TIME_PER_PIESPIN

Slows matching rate, can lead to starvation

Time For RANK

RANK = Owner =?= "A"   (on 50 Machines)
RANK = Owner =?= "B"   (on 50 Machines)

Users get their “quota”

Tied to particular machines

[Diagram: two sets of 50 machines, one ranked for A (A = 50), one ranked for B (B = 50)]

Problem: Group A submits 100 jobs on Empty Pool

[Diagram: A = 50, B = 50; A jobs running on both sets of machines]

50 jobs Finish

[Diagram: A = 50, B = 50; each set of machines holds A jobs and empty machines]

Group B submits 100 jobs, Empty Machines get jobs

A Jobs on B Machines are preempted

[Diagram: A = 50, B = 50; B jobs on the empty machines and replacing A jobs on B's machines]

B Jobs on A Machines are preempted.

[Diagram: A = 50, B = 50; A jobs on A's machines, B jobs on B's machines]

Skip Preemption, Use Empty Machines?

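[Diagram, before: A = 50, B = 50; A jobs on part of each set, remaining machines empty]

[Diagram, after: A = 50, B = 50; A jobs keep running, B jobs fill the empty machines]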

Accounting Groups, GROUP_QUOTA

# New Machines = 200
GROUP_QUOTA_A = 50
GROUP_QUOTA_B = 50
GROUP_QUOTA_C = 50
GROUP_QUOTA_D = 50
GROUP_AUTOREGROUP = True

200 Machines

A = 50 B = 50 C = 50 D = 50

A, B Have 100 machines each, how does C get resources?
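
How A and B can end up with 100 machines each is easy to see in a toy allocation. The sketch assumes a two-pass model: every group is first served up to its quota, then auto-regroup hands any unclaimed machines to groups that still have idle jobs. The `allocate` function and the demand figures are hypothetical, and the model assumes the quotas sum to no more than the pool size.

```python
def allocate(machines, quota, demand):
    """Toy two-pass allocation: quota first, then share the surplus."""
    # Pass 1: each group gets min(quota, jobs it actually submitted).
    alloc = {g: min(quota[g], demand.get(g, 0)) for g in quota}
    free = machines - sum(alloc.values())
    # Pass 2 (auto-regroup stand-in): hand out leftovers one at a time.
    while free > 0:
        hungry = [g for g in sorted(quota) if demand.get(g, 0) > alloc[g]]
        if not hungry:
            break
        for g in hungry:
            if free == 0:
                break
            alloc[g] += 1
            free -= 1
    return alloc


quota = {"A": 50, "B": 50, "C": 50, "D": 50}
# Only A and B have submitted jobs so far, 200 each.
print(allocate(200, quota, {"A": 200, "B": 200}))
# {'A': 100, 'B': 100, 'C': 0, 'D': 0}
```

Once the pool is full in this state, a late-arriving group C has a quota of 50 but no free machines to claim.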

PREEMPTION_REQUIREMENTS Still has cache/preemption issues

We Need access to Up to Date Usage/Quota information

PREEMPTION_REQUIREMENTS

A Conversation with Todd

SubmitterUserPrio
SubmitterUserResourcesInUse

(RemoteUser as well)

SubmitterGroupQuota
SubmitterGroupResourcesInUse

(RemoteGroup as well)

With Great Power Comes Great Responsibility

IMPORTANT: Turn off caching (may slow down negotiation)

PREEMPTION_REQUIREMENTS_STABLE = False

PREEMPTION_RANK_STABLE = False

PREEMPTION_REQUIREMENTS = (SubmitterGroupResourcesInUse < SubmitterGroupQuota) && (RemoteGroupResourcesInUse > RemoteGroupQuota)

PREEMPTION_REQUIREMENTS_STABLE = False

RANK = 0
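
The effect of the quota-based expression can be traced with a small simulation. This is a sketch, not negotiator code: it assumes the usage figures are re-read after every match (the point of turning the caching off) and that the victim is the group furthest over quota, which is an arbitrary choice made for the example. The names `may_preempt` and `claim` are hypothetical.

```python
def may_preempt(sub_in_use, sub_quota, rem_in_use, rem_quota):
    """Mirror of the slide's expression: submitter under quota, remote over quota."""
    return sub_in_use < sub_quota and rem_in_use > rem_quota


def claim(submitter, idle_jobs, in_use, quota):
    """Match idle jobs one at a time, re-checking live usage before each match."""
    claimed = 0
    for _ in range(idle_jobs):
        candidates = [
            g for g in sorted(in_use)
            if g != submitter
            and may_preempt(in_use[submitter], quota[submitter], in_use[g], quota[g])
        ]
        if not candidates:
            break  # submitter reached quota, or nobody is over quota
        victim = max(candidates, key=lambda g: in_use[g] - quota[g])
        in_use[victim] -= 1
        in_use[submitter] += 1
        claimed += 1
    return claimed


quota = {"A": 50, "B": 50, "C": 50, "D": 50}
in_use = {"A": 100, "B": 100, "C": 0, "D": 0}
print(claim("C", 100, in_use, quota), in_use)
# 50 {'A': 75, 'B': 75, 'C': 50, 'D': 0}
```

C stops at exactly its quota of 50 even though it submitted 100 jobs, so A and B lose no more machines than the guarantee requires.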

Now we have everything needed!

Demanding Groups of Users

Getting Purchased Compute Capacity (Quota, not tied to machine)

Getting Guaranteed Minimum Capacity (GROUP_QUOTA)

Getting As Many as Possible

(Auto-Regroup)

Getting As Soon as they submit

(One Negotiation Cycle typically)

Avoids Preemption

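[Diagram, before: A = 50, B = 50; A jobs on part of each set, remaining machines empty]

[Diagram, after: A = 50, B = 50; A jobs keep running, B jobs fill the empty machines]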

condor_status?

It Works! (patched 6.8 and 6.9+)

Code & Condor Community Process

Where do we go from here?

What did we learn?

Wisconsin is working to make Negotiation/Scheduling more efficient in 6.9

In the Future: Allow us to Specify what we Account For per VM/Slot (KFLOPS)?

That’s just me…

Come to tonight’s Reception

Participate in the Community

Talk with Condor Team. Talk with other users.

Help the community continue to work well for everyone.

Thank you. Questions?

http://www.cyclecomputing.com

jstowe @ cyclecomputing.com
