Exploiting Data Parallelism in SELinux Using a Multicore Processor

Bodhisatta Barman Roy
National University of Singapore, Singapore

Arun Kalyanasundaram, Shrisha Rao
International Institute of Information Technology, Bangalore, India

Computer Society of India, CSI 2012, Kolkata, India

Motivation

• One of the major drawbacks of security:

– Reduction in efficiency

• Similarly, the performance overhead due to security features in software is considerable.

• However, with the proliferation of multicore processors, we can introduce parallelism in software security validations.

Goal

• Our aim is to optimize and evaluate the performance of SELinux (Security-Enhanced Linux) on a multicore processor.

– SELinux is a Linux operating system feature that provides fine-grained access control over system resources.

– We propose several techniques to introduce parallelism in the SELinux architecture.

– We evaluate our approach using a Cell Broadband Engine (CBE) multicore processor.

Background - SELinux

• SELinux implements the Mandatory Access Control (MAC) security paradigm.

• MAC operates on a set of rules to constrain a ‘process’ from performing an operation on a resource (e.g., a file).

• Each process/resource is assigned a label called a security context, which eases the task of writing security policy rules.

Background – SELinux Architecture

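[Figure: SELinux architecture, spanning user space and kernel space. A subject (process xyz) requests read access to an object (file xyz.txt). The request passes the Linux DAC check and then the LSM hooks, where SELinux MAC policy enforcement consults the Access Vector Cache (AVC). The security server makes the decision from the policy database and the security contexts; the outcome is either access or deny.]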

Identifying SELinux Performance Bottlenecks

• The decision to allow or deny an operation is a two-step process:

– Validation of the security contexts (SC) of the source (process) and target (resource).

– Determining the presence of a security policy rule corresponding to the requested operation.

• We found the validation step to be a major cause of performance overhead.
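The two-step decision can be illustrated with a minimal Python sketch. It is not the SELinux implementation: the `user:role:type` label layout, the `Policy` class, and rules keyed only on source type, target type, and operation (with no object class) are simplifying assumptions, and sets stand in for the linked lists that the real validation traverses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityContext:
    user: str
    role: str
    type: str


def parse_context(label: str) -> SecurityContext:
    """Split a 'user:role:type' label into its three components."""
    user, role, type_ = label.split(":")[:3]
    return SecurityContext(user, role, type_)


class Policy:
    """Hypothetical policy holder: valid components plus allow rules."""

    def __init__(self, users, roles, types, rules):
        self.users, self.roles, self.types = set(users), set(roles), set(types)
        self.rules = set(rules)  # (source type, target type, operation)

    def is_valid(self, sc: SecurityContext) -> bool:
        # Step 1: every component of the context must be known to the policy.
        return (sc.user in self.users and sc.role in self.roles
                and sc.type in self.types)

    def has_rule(self, source, target, operation) -> bool:
        # Step 2: a rule must exist for the requested operation.
        return (source.type, target.type, operation) in self.rules


def check_access(policy, source_label, target_label, operation) -> bool:
    source = parse_context(source_label)
    target = parse_context(target_label)
    if not (policy.is_valid(source) and policy.is_valid(target)):
        return False  # validation failed, no rule lookup needed
    return policy.has_rule(source, target, operation)
```

In this sketch the validation step runs on every request for both the source and the target, which is where a linear search over long lists would dominate the cost.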

Hardware Setup - CBE

• The CBE is a master-slave multicore processor consisting of one Power Processing Element (PPE) and eight Synergistic Processing Elements (SPEs).

• Execution on an SPE is initiated by the PPE, and data is transferred using DMA controllers.

• We used a Sony PlayStation 3 console powered by a CBE processor, with Yellow Dog Linux 6.1 installed.

Our Approach

• We implement a parallel search using the SIMD programming paradigm in the validation of security contexts (SC).

• Since the SC has three components, the validation requires traversing three linked list data structures.

• We use either 3 SPEs (3U) or 6 SPEs (6U) to perform the search, with one or two SPEs per component respectively.

• We also evaluate a busy wait strategy on the SPE, where the SPE is not freed between node lookups.
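The structure of the 3U and 6U configurations can be sketched in Python, with threads standing in for SPEs. This only illustrates how the work is divided; it is not SIMD code and gives no real speedup in Python. How the slides split one list between two SPEs is not stated, so the interleaved split below (one worker compares even-indexed nodes, the other odd-indexed nodes) is an assumption, as are all function names.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def build_list(values):
    """Build a singly linked list and return its head."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def search(head, wanted, start=0, step=1):
    """Walk the list, comparing only nodes at index start, start+step, ..."""
    index, node = 0, head
    while node is not None:
        if index % step == start and node.value == wanted:
            return True
        node = node.next
        index += 1
    return False


def validate(context, lists, workers_per_component=1):
    """Search all component lists concurrently.

    context and lists are dicts keyed by component name;
    workers_per_component=1 mirrors 3U, 2 mirrors 6U.
    """
    k = workers_per_component
    with ThreadPoolExecutor(max_workers=len(lists) * k) as pool:
        futures = {
            name: [pool.submit(search, lists[name], context[name], start, k)
                   for start in range(k)]
            for name in lists
        }
        # A component is valid if any of its workers found it;
        # the context is valid only if every component is valid.
        return all(any(f.result() for f in futures[name]) for name in futures)
```

For example, `validate(ctx, lists, 2)` launches six searches, two per component, and combines their results.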

Our Approach – Different Number of SPEs

SPE Busy Wait Loading Strategy

• Keep the SPE waiting until the data for the next node in the linked list is available.

• Pros

– Improves performance by eliminating load time.

• Cons

– Other processes that require the SPE may be blocked.

– Requires continuous polling on main memory, which impedes data access operations.
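The trade-off can be seen in a small Python simulation, where a thread stands in for an SPE and a shared `mailbox` attribute stands in for the main-memory location being polled. The class and its methods are hypothetical and written only for illustration; the real mechanism on the CBE is not shown in the slides.

```python
import threading


class BusyWaitWorker:
    """Stays resident and spins on a mailbox instead of being reloaded per node."""

    def __init__(self, wanted):
        self.wanted = wanted
        self.mailbox = None   # next node value, written by the producer
        self.found = False
        self.done = False
        self.polls = 0        # how many times the mailbox was polled
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while not self.done:
            value = self.mailbox
            self.polls += 1
            if value is None:
                continue               # busy wait: nothing to do yet
            if value == self.wanted:
                self.found = True
            self.mailbox = None        # signal that the node was consumed

    def feed(self, value):
        """Hand the next node value to the worker (value must not be None)."""
        while self.mailbox is not None:
            pass                       # wait until the previous node is consumed
        self.mailbox = value

    def close(self):
        """Wait for the last node to be consumed, stop the worker, return the result."""
        while self.mailbox is not None:
            pass
        self.done = True
        self._thread.join()
        return self.found
```

The worker is started once and never torn down between nodes, which mirrors the eliminated load time. The `polls` counter typically ends up far larger than the number of nodes fed, which mirrors the cost of continuous polling.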

Optimizing DMA Transfers for Matching Strings

• DMA double buffering for null-terminated strings.
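A minimal Python simulation of the idea follows. Because the length of a null-terminated string is not known in advance, the string is fetched in fixed-size chunks, and the transfer of the next chunk is issued before the current chunk is compared. The chunk size, the `dma_get` helper, and the use of a single-worker thread pool to imitate an asynchronous DMA engine are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK = 16  # bytes per simulated DMA transfer (assumed size)


def dma_get(memory: bytes, offset: int) -> bytes:
    """Stand-in for an asynchronous DMA transfer from main memory."""
    return memory[offset:offset + CHUNK]


def matches(memory: bytes, address: int, wanted: bytes) -> bool:
    """Compare the null-terminated string at `address` with `wanted`.

    `wanted` must not contain a zero byte.
    """
    with ThreadPoolExecutor(max_workers=1) as dma:
        offset = address
        pending = dma.submit(dma_get, memory, offset)
        compared = 0  # number of string bytes matched so far
        while True:
            current = pending.result()  # wait for the buffer being filled
            if not current:
                return False            # end of memory, no terminator found
            offset += len(current)
            # Start filling the other buffer before working on this one.
            pending = dma.submit(dma_get, memory, offset)
            for byte in current:
                if byte == 0:
                    return compared == len(wanted)
                if compared >= len(wanted) or byte != wanted[compared]:
                    return False
                compared += 1
```

The comparison stops at the first mismatch or at the terminator, so at most one prefetched chunk is wasted per lookup.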

Performance Measurement

• Evaluation is based on two configurable parameters:

– Number of rules in the security policy.

  • This determines the number of valid security contexts.

  • We evaluate with policies containing 0 to 4000 rules.

– Size of the Access Vector Cache (AVC).

  • Helps accurately measure the overhead due to the decision-making logic in the security server.

  • Two different AVC sizes: 512 entries (Optimal) and 1 entry (Minimal).
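Why the AVC size isolates the security server overhead can be sketched with a toy cache in Python. The class below is hypothetical and uses least-recently-used eviction for simplicity, which is an assumption and not a description of the kernel's AVC; `security_server` is any callable that performs the full decision.

```python
from collections import OrderedDict


class AccessVectorCache:
    """Toy decision cache placed in front of the security server."""

    def __init__(self, capacity, security_server):
        self.capacity = capacity
        self.security_server = security_server
        self.entries = OrderedDict()
        self.misses = 0  # number of times the security server was consulted

    def check(self, source, target, operation):
        key = (source, target, operation)
        if key in self.entries:
            self.entries.move_to_end(key)   # mark as recently used
            return self.entries[key]
        self.misses += 1
        decision = self.security_server(source, target, operation)
        self.entries[key] = decision
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        return decision
```

With a capacity of 1, alternating between two different requests misses every time, so nearly every check exercises the decision logic; with a capacity of 512 the same workload misses only on first use.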

Results: Single Core PPE Performance

• The increase in running time between 2500 and 4000 rules is about 64% with the optimal AVC and 112% with the minimal AVC.

• This establishes that security context validations are computationally intensive.

Results: Comparing Different Techniques

• Counter-intuitively, multicore performance is lower than single-core performance with the Optimal AVC size.

• However, with the Minimal AVC size and the busy wait strategy, there is an efficiency gain of up to 43%.

Conclusion

• The efficiency gain from optimizing security validations depends on the architecture of the software and the hardware platform.

• However, software applications designed for a uniprocessor system cannot be easily optimized for parallel computing.

• The problem is especially prominent in security-related applications, since the priority is robustness rather than efficiency.

Future Work

• One extension of our work is to apply the proposed techniques to other security features and applications, such as TOMOYO Linux and SMACK, and compare their performance.

• Evaluating our approach on different multicore architectures, such as GPGPUs, could give greater insight into its effectiveness.

• Analyze the proposed techniques on distributed platforms such as Beowulf clusters and grid networks.

Questions?
