Dec 27, 2015

FRANZ INC.

Optimizing User Code in Allegro CL 5.0

by Duane Rettig

– Introduction
– Optimization-related lisp architecture
– Undocumented tools in Allegro CL
– Optimization methodology
– Speed optimizations
– Space optimizations
– Speed vs space tradeoffs
– Lisp heap management

Optimization-related Lisp Architecture

• Static vs dynamic - Function dispatch

• Closure structure

• Foreign functions - entry-vec struct

• Disassemble extensions

Architecture: Static Vs Dynamic

• Pure static: absolute

• Relocatable

• Shared libraries

• Dynamic shared libraries

• Dynamic functions

Static Programs

• Absolute addresses

• Fast startup

• Fast running

• Large

• Not reconfigurable

[Figure: memory layout with Code and Data regions; labels: 0, sbrk]

Relocatable Programs

• Not tied to a base address

• Slightly longer startup times

• Fast running

• Large

• Not reconfigurable

[Figure: memory layout with Code, Data, and Reloc regions]

Programs that Use Shared Libraries

• Usually need relocation

• Smaller than non-shared libraries

• Faster startup times

• Medium speed; may start slow and gain speed after first use

• Not reconfigurable

[Figure: Main program linked with Lib 1, Lib 2, and Lib 3]

Programs that Use Dynamic Shared Libraries

• May be absolute or relocatable

• May be very small

• Very fast startup

• Medium speed, amortized over lib loading

• Reconfigurable

[Figure: Main program with dynamically loaded Lib 1, Lib 2, and Lib 3]

Programs that Dynamically Define Functions

• May be absolute or relocatable

• May be very small

• Very fast startup

• Medium speed, amortized over function definitions

• Extremely reconfigurable

[Figure: Main, Lisp lib, Lib 1, and Lib 2, plus a Heap that holds Functions]

Lisp Data Availability

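[Figure: layout of Lisp objects.
 Function object fields: T, Flgs, Size, Start, Hash, Name, Code, Formals, Aux1, Aux2, Locals, Const 0 … Const N.
 Codevector fields: T, Size, Entry, Instructions…, Return.
 Global table fields: Symbols, Lvalues, c-values, Nil, R/S funcs.
 Pointer labels: nil, func, glob table, function, codevector, pc]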

C Data Availability

[Figure: two shared libraries, lib1 and lib2, each with its own GOT and functions Func1, Func2, Func3; each of the six functions is drawn with an address and a GOT reference]

Registers

        Alpha    HP       X86      Mips    Sparc      68K      RS6000
GOT     r29      r27/r19  ebx      r28     l7         static   r2
Args    r16-r21  r26-r23  eax,edx  r4-r11  i/o 0-4,5  (d0-d3)  r3–r10
NIL     r14      r18      edi      r23     g4         d6       r15
Tramp   r9       r9       edi      r21     g4         a4       r21
Func    r10      r17      esi      r20     o5 / i5    a5       r13
Name    r15      r8       ebx      r30     g2         a3       r20
Count   r4       r4       ecx      r3      g3         d7       r16

Calling Sequence: Lisp

• Caller:
  – Store caller-saves registers
  – set up arguments and count
  – load name register
  – call trampoline *
  – restore caller-saves registers

• Callee:
  – establish stack
  – save function
  – Execute body
  – restore stack
  – restore caller’s function
  – return

Calling Sequence: C

• Caller:
  – Store caller-saves registers
  – set up args (no count)
  – store caller’s context
  – call function, function desc, or stub
  – restore caller’s context
  – restore caller-saves registers

• Callee:
  – setup callee’s context
  – establish stack
  – store callee-saves registers
  – Execute body
  – restore callee-saves registers
  – restore stack
  – return

Lisp’s Symbol Trampoline

• Required:
  – get function register from name register
  – get start address from function
  – jump to start

• Optional:
  – save argument registers
  – check for stack overflow
  – jump to call-count code
  – jump to single-step code
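
The required steps amount to a late-bound call: the callee is fetched from the name each time the call is made. The portable sketch below shows the visible consequence, not the trampoline itself. The function names are illustrative, and the notinline declamation is an assumption added here so that a file compiler may not bind the call early.

```lisp
;; Ask the compiler not to inline or early-bind calls to SCALE-IT.
(declaim (notinline scale-it))

(defun scale-it (x) (* 2 x))

;; CALL-BY-NAME calls SCALE-IT through its symbol.
(defun call-by-name (x) (list (scale-it x)))

(defparameter *before-redefinition* (call-by-name 3)) ; first definition

;; Redefine SCALE-IT. CALL-BY-NAME is not recompiled, yet it picks up
;; the new function because the callee is looked up from the name
;; on every call.
(defun scale-it (x) (+ 100 x))

(defparameter *after-redefinition* (call-by-name 3))
```

The price of this flexibility is the extra indirection on every call, which is what the trampoline steps above account for.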

Architecture: Closures

[Figure: closure layout, shown for two cases labeled External vec and Internal vec.
 Closure object fields: T, Flgs, Size, Start, Shared, Private (or Private 0, Private 1 … Private N).
 Separate vector header: T, Size.
 Shared function object fields: T, Flgs, Size, Start, Hash, Name, Code, Formals, Aux1, Aux2, Locals, Const 0 … Const N]

Architecture: Foreign Functions

•Entry-vec struct

(in-package :excl)

(defstruct (entry-vec
            (:type (vector excl::foreign (*)))
            (:constructor make-entry-vec-boa ()))
  name            ; entry point name
  (address 0)     ; jump address for foreign code
  (handle 0)      ; shared-lib handle
  (flags 0)       ; ep-* flags
  (alt-address 0) ; sometimes holds the real func addr
  )

Architecture: Foreign Functions

•Entry-vec flags

;; Entry-point constants:

(defconstant excl::ep-call-semidirect 1)    ; Real address stored in alt-address slot
(defconstant excl::ep-never-release 2)      ; Never release the heap
(defconstant excl::ep-always-release 4)     ; Always release the heap
(defconstant excl::ep-release-when-ok 8)    ; Release the heap unless without-interrupts

(defconstant excl::ep-tramp-calls #x70)     ; Make calls through special trampolines
(defconstant excl::ep-tramp-shift 4)

(defconstant excl::ep-variable-address #x100) ; Entry-point contains address of C var
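
The flag values read as a bit field: four single-bit flags, a three-bit trampoline field under the #x70 mask, and one more single bit. The sketch below decodes such a word in portable Lisp. The values are copied from the slide, but the +...+ constant names and the DECODE-EP-FLAGS helper are hypothetical, and combining the mask with the shift of 4 is an assumption the slide suggests but does not state.

```lisp
;; Values copied from the slide; the names are local to this sketch
;; and are not Allegro CL symbols.
(defconstant +ep-call-semidirect+  1)
(defconstant +ep-never-release+    2)
(defconstant +ep-always-release+   4)
(defconstant +ep-release-when-ok+  8)
(defconstant +ep-tramp-calls+      #x70)
(defconstant +ep-tramp-shift+      4)
(defconstant +ep-variable-address+ #x100)

(defun decode-ep-flags (flags)
  "Return keywords for the single-bit flags set in FLAGS, followed by
(:tramp n) when the trampoline field is non-zero."
  (let ((result '()))
    (when (logtest flags +ep-call-semidirect+)  (push :call-semidirect result))
    (when (logtest flags +ep-never-release+)    (push :never-release result))
    (when (logtest flags +ep-always-release+)   (push :always-release result))
    (when (logtest flags +ep-release-when-ok+)  (push :release-when-ok result))
    (when (logtest flags +ep-variable-address+) (push :variable-address result))
    ;; Assumed decoding: mask out the field, then shift it down.
    (let ((tramp (ash (logand flags +ep-tramp-calls+) (- +ep-tramp-shift+))))
      (unless (zerop tramp)
        (push (list :tramp tramp) result)))
    (nreverse result)))
```

For example, (decode-ep-flags #x121) returns (:call-semidirect :variable-address (:tramp 2)).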

Architecture: Foreign Functions

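[Figure: an entry vec with fields T, Size, Name, Address, Handle, Flags, Alt-address. Pointer targets shown: the string “foo”, the C function foo(), and the routines missing_entry_point, bind_and_call, and call_semidirect]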

Architecture: Foreign Functions

[Figure: the excl::.saved-entry-points. table, mapping the names “foo”, “bar”, and “bas” to entry vecs]

Architecture: Disassemble

– Extensions

• non-lisp names

• :absolute

• :addr-list

• :find-callee

• :find-pc

• :references-only

• :recurse

• :target-class

Disassembling non-lisp names

• A string representing a C entry point
  – Allows for viewing of non-lisp assembler code
  – Some instructions are interpreted automatically

(disassemble "qcons")
;; disassembly of #("qcons" 1074935746)

;; code start: #x401237c2:
   0: 8b 8f ff fd  movl ecx,[edi-513]   ; C_GSGC_NEWCONSLOC
      ff ff
   6: 3b 8f 03 fe  cmpl ecx,[edi-509]   ; C_GSGC_NEWCONSEND
      ff ff
  12: 0f 84 3c 1e  jz 7758              ; cons+0
      00 00
  18: 89 41 0f     movl [ecx+15],eax
  21: 89 c8        movl eax,ecx
  23: 89 50 13     movl [eax+19],edx
  26: 83 87 ff fd  addl [edi-513],$8    ; C_GSGC_NEWCONSLOC
      ff ff 08
  33: c3           ret

excl::*c-symbol-table* build:

• dirty (excl::*rebuild-c-symbol-table-p* is non-nil):
  – at lisp start
  – after load or unload of shared library

• rebuilt:
  – for disassemble of a string
  – for profiler analysis
  – for “:zoom :all t :verbose t” invocation

(inspect excl::*c-symbol-table*)
A simple T vector (3538) @ #x2039c352
   0-> cstruct (2) = #("unidentified" 0)
   1-> cstruct (2) = #("_init" 134514576)
   2-> cstruct (2) = #("strcpy" 134514600)
   3-> cstruct (2) = #("dlerror" 134514616)
   4-> cstruct (2) = #("getenv" 134514632)
   5-> cstruct (2) = #("fgets" 134514648)
   6-> cstruct (2) = #("perror" 134514664)
   7-> cstruct (2) = #("readlink" 134514680)
   8-> cstruct (2) = #("malloc" 134514696)
   9-> cstruct (2) = #("malloc" 134514696)
  10-> cstruct (2) = #("_lxstat" 134514712)
  11-> cstruct (2) = #("isspace" 134514728)
  12-> cstruct (2) = #("_xstat" 134514744)
  13-> cstruct (2) = #("__libc_init" 134514760)
  14-> cstruct (2) = #("strrchr" 134514776)
  15-> cstruct (2) = #("fprintf" 134514792)
  16-> cstruct (2) = #("fprintf" 134514792)
  17-> cstruct (2) = #("strcat" 134514808)
  18-> cstruct (2) = #("chdir" 134514824)
  19-> cstruct (2) = #("strncmp" 134514840)
  ...
  3537-> cstruct (2) = #("__bss_start" 1075102200)

(simple function for next examples)

USER(1): (defun foo (x) (list (bar x)))
FOO
USER(2): (compile 'foo)
Warning: While compiling these undefined functions were referenced: BAR.
FOO
NIL
NIL
USER(3):

(disassemble 'foo)
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR

;; code start: #x203dcddc:
   0: 55        pushl ebp
   1: 8b ec     movl ebp,esp
   3: 56        pushl esi
   4: 83 ec 24  subl esp,$36
   7: 83 f9 01  cmpl ecx,$1
  10: 74 02     jz 14
  12: cd 61     int $97             ; trap-argerr
  14: d0 7f a3  sarb [edi-93],$1    ; C_INTERRUPT
  17: 74 02     jz 21
  19: cd 64     int $100            ; trap-signal-hit
  21: 8b 5e 32  movl ebx,[esi+50]   ; BAR
  24: b1 01     movb cl,$1
  26: ff d7     call *edi
  28: 8b d7     movl edx,edi
  30: ff 57 2b  call *[edi+43]      ; QCONS
  33: c9        leave
  34: 8b 75 fc  movl esi,[ebp-4]
  37: c3        ret

Disassembling with absolute addresses

• :absolute
  – Allows debug at absolute addresses
  – Warning: addresses may not be in sync after gc, though per-disassemble consistency is maintained

(disassemble 'foo :absolute t)
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR
204cb5a4: 55        pushl ebp
204cb5a5: 8b ec     movl ebp,esp
204cb5a7: 56        pushl esi
204cb5a8: 83 ec 24  subl esp,$36
204cb5ab: 83 f9 01  cmpl ecx,$1
204cb5ae: 74 02     jz 0x204cb5b2
204cb5b0: cd 61     int $97             ; trap-argerr
204cb5b2: d0 7f a3  sarb [edi-93],$1    ; C_INTERRUPT
204cb5b5: 74 02     jz 0x204cb5b9
204cb5b7: cd 64     int $100            ; trap-signal-hit
204cb5b9: 8b 5e 32  movl ebx,[esi+50]   ; BAR
204cb5bc: b1 01     movb cl,$1
204cb5be: ff d7     call *edi
204cb5c0: 8b d7     movl edx,edi
204cb5c2: ff 57 2b  call *[edi+43]      ; QCONS
204cb5c5: c9        leave
204cb5c6: 8b 75 fc  movl esi,[ebp-4]
204cb5c9: c3        ret

Disassemble support for the profiler

• :addr-list
  – Marks a specific instruction
  – Allows for exact profiler hits to be recorded

(disassemble 'foo :addr-list -10)
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR

;; code start: #x204cb5a4:
               0: 55        pushl ebp
               1: 8b ec     movl ebp,esp
               3: 56        pushl esi
               4: 83 ec 24  subl esp,$36
               7: 83 f9 01  cmpl ecx,$1
 stopped -->  10: 74 02     jz 14
              12: cd 61     int $97             ; trap-argerr
              14: d0 7f a3  sarb [edi-93],$1    ; C_INTERRUPT
              17: 74 02     jz 21
              19: cd 64     int $100            ; trap-signal-hit
              21: 8b 5e 32  movl ebx,[esi+50]   ; BAR
              24: b1 01     movb cl,$1
              26: ff d7     call *edi
              28: 8b d7     movl edx,edi
              30: ff 57 2b  call *[edi+43]      ; QCONS
              33: c9        leave
              34: 8b 75 fc  movl esi,[ebp-4]
              37: c3        ret

(disassemble 'foo :addr-list '(11 (#x204cb5ae . 4) (#x204cb5b9 . 4) (#x204cb5c5 . 3)))
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR

;; code start: #x204cb5a4:
            0: 55        pushl ebp
            1: 8b ec     movl ebp,esp
            3: 56        pushl esi
            4: 83 ec 24  subl esp,$36
            7: 83 f9 01  cmpl ecx,$1
 4 (36%)   10: 74 02     jz 14
           12: cd 61     int $97             ; trap-argerr
           14: d0 7f a3  sarb [edi-93],$1    ; C_INTERRUPT
           17: 74 02     jz 21
           19: cd 64     int $100            ; trap-signal-hit
 4 (36%)   21: 8b 5e 32  movl ebx,[esi+50]   ; BAR
           24: b1 01     movb cl,$1
           26: ff d7     call *edi
           28: 8b d7     movl edx,edi
           30: ff 57 2b  call *[edi+43]      ; QCONS
 3 (27%)   33: c9        leave
           34: 8b 75 fc  movl esi,[ebp-4]
           37: c3        ret

Disassemble support for the debugger

• :find-callee
  – Returns information given a relative pc

• :find-pc
  – Returns information about instruction sequencing, or prints an instruction

• :references-only
  – Returns references from function or glob table

USER(22): (disassemble 'foo :find-callee 26)
BAR
:CONST
-1
USER(23): (disassemble 'foo :find-callee 28)
BAR
:CALL
0
USER(24): (disassemble 'foo)
;; disassembly of #<Function FOO>
 ...
  14: d0 7f a3  sarb [edi-93],$1    ; C_INTERRUPT
  17: 74 02     jz 21
  19: cd 64     int $100            ; trap-signal-hit
  21: 8b 5e 32  movl ebx,[esi+50]   ; BAR
  24: b1 01     movb cl,$1
  26: ff d7     call *edi
  28: 8b d7     movl edx,edi
  30: ff 57 2b  call *[edi+43]      ; QCONS
  33: c9        leave
  34: 8b 75 fc  movl esi,[ebp-4]
  37: c3        ret
USER(25):

USER(28): (disassemble 'foo :find-pc 14)
14
17
NIL
NIL
USER(29): (disassemble 'foo :find-pc 17)
17
19
21
:BCC
USER(30): (disassemble 'foo :find-pc '(:print 17))
  17: 74 02     jz 21
USER(31): (disassemble 'foo :find-pc '(:print 21))
  21: 8b 5e 32  movl ebx,[esi+50]   ; BAR
USER(32):

USER(26): (disassemble 'foo :references-only t)
(SYSTEM::QCONS BAR SYSTEM::C_INTERRUPT)
USER(27):

Miscellaneous Disassembler modes

• :recurse
  – Useful to control the amount of output

• :target-class
  – Used only in cross-porting

Undocumented Tools in Allegro CL

• excl::get-objects (att #42-1)

• excl::get-references (typo in your notes)

• excl::create-box/excl::box-value (att #42-2)

• excl::atomically
  – allows compiler to guarantee atomic body

• Autoloading facilities (described later)

Atomic forms

– Generally a form is atomic if it has
  • no interrupt-checks
  • no consing
  • no non-atomic forms or calls

– Use excl::atomically like progn; if it compiles, the body is atomic

– Atomic primcalls:
  • gsgc-setf-protect gsgc-set-protect fd-stack-real qcar qcdr

– Atomic calls:
  • error excl::.error excl::eq-hash-fcn excl::eql-not-eq excl::get_2op-atomic excl::sxhash-if-fast excl::symbol-hash-fcn

Optimization Methodology

• Get it right first

• Profile it
  – The time macro
  – The Allegro CL profiler

• Hit the high cost items
  – Implementations
  – Algorithms

Speed Optimizations

– Profiling
– Efficient compilation
– Immediate compilation
– Foreign function optimizations
– Hash tables
– CLOS optimizations
– Miscellaneous optimizations

Speed Optimizations: Profiling

• Always compile top-level test functions

• Example profile run (att #48-1)

• Do not use time macro with profiler

• Avoid simultaneous time/call-count profiles

• When using time macro, beware of new closures

Time macro: extra closures

(defun test-driver (n)
  (time (dotimes (i n) (test-it))))

•This driver is not as simple as it looks!
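
A common way to sidestep the problem is sketched below. It assumes that time may wrap its body in a closure handed to a timing routine, which the slide title suggests but does not show; (macroexpand-1 '(time (foo))) reveals what a given implementation does. The loop moves into its own compiled function, so the form given to time is a single call. TEST-IT here is a hypothetical stand-in for the code under test, and RUN-TESTS is a name introduced for this sketch.

```lisp
(defvar *test-calls* 0)

;; Hypothetical stand-in for the code being measured.
(defun test-it ()
  (incf *test-calls*))

;; The loop lives in a named function that can be compiled on its own.
(defun run-tests (n)
  (dotimes (i n)
    (test-it))
  n)

;; The form handed to TIME is now one function call.
(defun test-driver (n)
  (time (run-tests n)))

;; Compile everything involved, following the advice to always
;; compile top-level test functions.
(compile 'test-it)
(compile 'run-tests)
(compile 'test-driver)
```

time returns the values of the form it wraps, so (test-driver 1000) prints the timing report and returns 1000.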

Speed Optimizations: Efficient Compilation

• :explain

• excl::atomically

• excl:add-typep-transformer (att #50-1,2)

Speed Optimizations: Immediate Compilation

• Inlining and unboxing

• Immediate-args

• defun-immediate (att #51-1,2,3)

Speed Optimizations: Foreign Functions

• Call-direct (att #52-1,2)

• comp:list-call-direct-possibilities

USER(2): (pprint (comp:list-call-direct-possibilities))

(("arg types:"
  (:FOREIGN-ADDRESS :LISP FIXNUM INTEGER SINGLE-FLOAT DOUBLE-FLOAT
   :SINGLE-FLOAT-NO-PROTO SIMPLE-STRING CHARACTER))
 ("[also any one-dimensional simple-array is ok as an arg]")
 ("return types:"
  (SINGLE-FLOAT DOUBLE-FLOAT :SINGLE-FLOAT-FROM-DOUBLE :FIXNUM FIXNUM
   :MACHINE-INTEGER :INTEGER INTEGER :UNSIGNED-INTEGER :FOREIGN-ADDRESS
   :CHARACTER CHARACTER :LISP :VOID :BOOLEAN BOOLEAN))
 ("unboxed machine-integer return types:"
  (FIXNUM INTEGER :FIXNUM :INTEGER :MACHINE-INTEGER))
 ("bad return types for unboxing [some because they are keywords]:"
  (:FIXNUM :INTEGER :UNSIGNED-INTEGER :FOREIGN-ADDRESS :CHARACTER)))

Speed Optimizations: Hash Tables

• Make-hash-table extensions

• Rehash issues

• excl::*default-rehash-size*

• excl::*allocate-large-hash-table-vectors-in-old-space*

• Convert-to-internal-fspec (example use of weak-key, sans value ht)

Hash-table Architecture

[Figure: hash-table layout. Two vector headers (T, Size), a header with T and F fields, a Key… vector, and the Table]

Hash Table extensions

• :test extensions

• :values [t] (may be nil or :weak)

• :weak-keys [nil] (may be non-nil)

• :hash-function [nil] (or fboundp symbol)
  – Must return 16-bit value

Speed Optimizations: Hash Tables

• Make-hash-table extensions

• Rehash issues

• excl::*default-rehash-size*

• excl::*allocate-large-hash-table-vectors-in-old-space*

• Convert-to-internal-fspec
  – example of weak-key, sans value hash-table (att #57-1)
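
One portable way to limit rehashing is to size the table when it is created. The sketch below uses only the standard make-hash-table arguments; it does not use the Allegro CL extensions or variables listed above, whose exact signatures are not shown in these slides. BUILD-SQUARES-TABLE is a name invented for the example.

```lisp
(defun build-squares-table (n)
  "Create an EQL table sized for N entries up front, then fill it."
  (let ((table (make-hash-table :test 'eql
                                :size n               ; room for every entry
                                :rehash-size 2.0      ; double when growth is needed
                                :rehash-threshold 0.8)))
    ;; DOTIMES returns TABLE as its result form.
    (dotimes (i n table)
      (setf (gethash i table) (* i i)))))
```

The :size argument is a hint to the implementation, so this reduces rehashing rather than guaranteeing that none occurs.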

Speed Optimizations: CLOS Optimizations

• Discriminators (att #58-1)

• For accessors, stay with unique names

• Outside of methods, use generic functions; inside of methods which specialize on the class, use slot-value

• Stay with standard-method-combination

• Avoid methods on slot-value-using-class
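
The accessor and slot-value advice can be sketched in standard CLOS as follows. The POINT class and its function names are hypothetical, chosen only for illustration.

```lisp
(defclass point ()
  ;; Accessor names are unique to this class.
  ((x :initarg :x :accessor point-x)
   (y :initarg :y :accessor point-y)))

(defgeneric norm-squared (object))

;; Inside a method specialized on POINT, read slots with SLOT-VALUE.
(defmethod norm-squared ((p point))
  (+ (expt (slot-value p 'x) 2)
     (expt (slot-value p 'y) 2)))

;; Outside of methods, go through the generic accessors.
(defun describe-point (p)
  (format nil "(~a, ~a)" (point-x p) (point-y p)))
```

The sketch uses standard method combination and defines no methods on slot-value-using-class, in line with the last two bullets.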

Speed Optimizations: Misc Optimizations

• Applyn

• New array dimensions optimizations

• Know what functions are optimized:
  – compiler macros (use compiler-macro-function)
  – compiler transforms (att #59-1)
  – compiler inliners (att #59-2)

• Fixed-index (att #59-3,4)
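
Checking for a compiler macro takes one standard call. The sketch below defines a small compiler macro and then queries it; SQUARE and OPTIMIZED-P are hypothetical names. Compiler transforms and inliners are implementation-specific and are not covered by this check.

```lisp
(defun square (x) (* x x))

;; Fold the call at compile time when the argument is a literal number;
;; otherwise return the original form to decline the expansion.
(define-compiler-macro square (&whole form x)
  (if (numberp x)
      (* x x)
      form))

(defun optimized-p (name)
  "True if NAME has a compiler macro installed."
  (not (null (compiler-macro-function name))))
```

Here (optimized-p 'square) is true, and calling the compiler macro function directly on the form (square 4) with a null environment returns 16.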

Space Optimization Vs Consing Reduction

• Space has to do with overall size of the program; consing has to do with how much allocation/deallocation is occurring
  – The room function measures space
  – The time macro measures consing

Space Optimizations

• Tools for space measurement

• Closures

• Presto

• Autoloading

• Foreign functions space considerations

• The .Pll file; strings and codevectors

• Generate-application

Tools for Space Measurement

• (room t) and excl:print-type-counts

• excl::get-objects

• excl::get-references

• inspect
  – Use :raw mode for low-level inspection

Space Optimization: Using closures

• Using closures takes less space than expanding macros

• Closures are small themselves

• Closures can be precompiled in a file; late-binding can occur without run-time compilation
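
A minimal sketch of the contrast, with hypothetical names: the macro stamps out a fresh function body at every use, while the closure version compiles one body and creates a small object per captured value.

```lisp
;; Macro version: each use expands into its own copy of the body.
(defmacro define-scaler (name factor)
  `(defun ,name (x) (* ,factor x)))

;; Closure version: one compiled body, shared by every closure made
;; from it; each closure captures only its own FACTOR.
(defun make-scaler (factor)
  (lambda (x) (* factor x)))

(define-scaler double-it 2)

;; New behavior is bound at run time without calling the compiler.
(defparameter *triple-it* (make-scaler 3))
(defparameter *times-ten* (make-scaler 10))
```

Because MAKE-SCALER is compiled once in its file, creating *TRIPLE-IT* and *TIMES-TEN* later needs no run-time compilation, which is the late-binding point made in the last bullet.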

Space Optimization: Presto

• Fully automatic

• Good for large (> 10 Mb) programs

• Not useful for small (< 2 Mb) programs

• Beware of delivery
  – Use sys:presto-build-lib to build new bundle file

• Beware of use during development
  – Compile-file/load sequence can cause corruption

Space Optimizations: Autoloading

• Function (macro/generic-function)
  – excl::def-autoload-function
  – excl::autoload-it
  – excl::autoloadp

• Package
  – excl::*autoload-package-name-alist*

• Class
  – excl::*autoload-find-class-alist*

// (pprint excl::*autoload-package-name-alist*)
(("comp" . :COMPILER-S) ("compiler" . :COMPILER-S) ("cltl1" . :CLTL1)
 ("db" . :DEBUG) ("debug" . :DEBUG) ("debugger" . :DEBUG)
 ("ds" . :DEFSYS-S) ("defsys" . :DEFSYS-S) ("defsystem" . :DEFSYS-S)
 ("fla" . :FLAVORS) ("flavors" . :FLAVORS)
 ("ff" . :FOREIGN-S) ("foreign-functions" . :FOREIGN-S)
 ("inspect" . :INSPECT) ("lep" . :LEP)
 ("prof" . :PROF-S) ("profiler" . :PROF-S)
 ("acl-socket" . :SOCK-S) ("socket" . :SOCK-S) ("scm" . :SCM)
 ("multiprocessing" . :PROCESS-S) ("mp" . :PROCESS-S)
 ("xref" . :XREF-S) ("cross-reference" . :XREF-S))
NIL
// (pprint excl::*autoload-find-class-alist*)
((BROADCAST-STREAM . :STREAMA) (CONCATENATED-STREAM . :STREAMA)
 (ECHO-STREAM . :STREAMA) (SYNONYM-STREAM . :STREAMA)
 (TWO-WAY-STREAM . :STREAMA))
NIL
//

Space Optimizations: Foreign Functions

• Use ff:def-foreign-call!

• Avoid code that pulls in the compatibility packages
  – ffcompat
  – defctype

• Avoid requiring aclwffi (can’t do this yet on CG/IDE programs)

(pprint (macroexpand '(ff:def-foreign-call foo (x) :arg-checking nil)))

(PROGN
  (EVAL-WHEN (:COMPILE-TOPLEVEL)
    (EXCL::CHECK-LOCK-DEFINITIONS-COMPILE-TIME 'FOO 'FUNCTION
      'FOREIGN-FUNCTIONS:DEF-FOREIGN-CALL (FBOUNDP 'FOO))
    (PUSH 'FOO EXCL::.FUNCTIONS-DEFINED.))
  (EVAL-WHEN (COMPILE LOAD EVAL)
    (REMPROP 'FOO 'SYSTEM::DIRECT-FF-CALL))
  (SETF (FDEFINITION 'FOO)
        (EXCL::GET-FF-N-ARGS-CLOSURE
          (EXCL::DETERMINE-FOREIGN-ADDRESS '("foo" :LANGUAGE :C) 2 NIL)
          'INTEGER '(0)))
  (EXCL::.INV-FUNC_FORMALS (FBOUNDP 'FOO) '(X))
  (RECORD-SOURCE-FILE 'FOO)
  'FOO)

(pprint (macroexpand '(ff:def-foreign-call foo (x) :call-direct t)))

(PROGN
  (EVAL-WHEN (:COMPILE-TOPLEVEL)
    (EXCL::CHECK-LOCK-DEFINITIONS-COMPILE-TIME 'FOO 'FUNCTION
      'FOREIGN-FUNCTIONS:DEF-FOREIGN-CALL (FBOUNDP 'FOO))
    (PUSH 'FOO EXCL::.FUNCTIONS-DEFINED.))
  (EVAL-WHEN (COMPILE LOAD EVAL)
    (SETF (GET 'FOO 'SYSTEM::DIRECT-FF-CALL)
          (LIST '("foo" :LANGUAGE :C) T :C 'INTEGER '(INTEGER) '(:INT) 2)))
  (SETF (FDEFINITION 'FOO)
        (LET ((EXCL::F
               (NAMED-FUNCTION FOO
                 (LAMBDA (X)
                   (EXCL::CHECK-ARGS '(INTEGER) 'FOO X)
                   (SYSTEM::FF-FUNCALL
                     (LOAD-TIME-VALUE
                       (EXCL::DETERMINE-FOREIGN-ADDRESS '("foo" :LANGUAGE :C) 2 NIL))
                     0 X 'INTEGER)))))
          (EXCL::SET-FUNC_NAME EXCL::F 'FOO)
          EXCL::F))
  (RECORD-SOURCE-FILE 'FOO)
  'FOO)

Pass-type flags

USER(7): (arglist 'excl::determine-foreign-address)
(EXCL::NAME &OPTIONAL EXCL::FLAGS EXCL::METHOD-INDEX)
T
USER(8): (apropos "FF-PASS-")
FF::FF-PASS-TYPE-LISP value: 32
FF::FF-PASS-TYPE-SINGLE-FLOAT value: 4
FF::FF-PASS-TYPE-BY-REFERENCE value: 1
FF::FF-PASS-TYPE-PROTOTYPED value: 2

Space Optimizations: The .Pll File

• Formerly known as the .lso file

• Establishes pure (read-only) shared space

• Currently supports codevectors and strings

• lso.h (att #72-1)

Space Optimizations: Generate-application

• Keywords to specify as non-nil:
  – Keywords that start with “discard-”
  – :pll-file (but preferably :pure-files or :purify)
  – :presto (use only if helpful)

• Keywords to specify as nil:
  – Keywords that start with “include-” or “load-” or “record-”
  – :preserve-documentation-strings

Speed Vs Space Tradeoffs

• Inlining vs. Function calls

• comp:compile-format-strings-switch

• (comp:generate-inline-call-tests-switch)

• The typecase/case statement and the jump table

Typecase/case Jump Tables

• Typecase normally turns into cond

• If typecase or case has right conditions, turns into a jump table

• Jump table has constant execution time

• Jump table can get large (up to 256 entries)

• Example for x86 (att #76-1)
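
The slide does not spell out the "right conditions". The sketch below shows the kind of case form that is a natural jump-table candidate, on the assumption that a dense run of small integer keys and a declared argument type qualify. Whether a table is actually generated is up to the compiler; (disassemble 'opcode-name) shows the result. OPCODE-NAME and its keywords are hypothetical.

```lisp
(defun opcode-name (n)
  ;; A declared small-integer range and eight contiguous keys:
  ;; a compiler may index a table instead of testing each key in turn.
  (declare (type (integer 0 7) n)
           (optimize (speed 3) (safety 1)))
  (case n
    (0 :halt)
    (1 :load)
    (2 :store)
    (3 :add)
    (4 :sub)
    (5 :jump)
    (6 :branch)
    (7 :call)))
```

Compiled as a cond-style chain, the last key costs up to eight comparisons; compiled as a table, every key costs the same, at the price of one table entry per key.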

Lisp Heap Management

• New strategy for heaps in 5.0

• malloc/free vs aclmalloc/aclfree

• Garbage collection problems

• Cons Reduction

New Heap Strategy

• Dumplisp no longer fiddles with internal executable file formats

• Separate lisp-heap from “c-heap” (really aclmalloc space)

• Gaps are much less likely to occur

• Applications may use whatever malloc they choose

malloc/free vs aclmalloc/aclfree

• Warning! excl::malloc and excl::free are not links to malloc and free!

• Allocations and deallocations must not be mixed
  – Some o/s-supplied frees will segv
  – At best, memory leaks will occur

Common GC-related problems

• Too much paging
  – Newspace may be too large

• Too many scavenges
  – Newspace may be too small
  – Too much consing

• Global-GC takes too long
  – Be sure it is doing worthwhile work
  – Maybe turn it off!

Cons Reduction

• Profile it - space profiler

• Identify unnecessary consing

• Replace consing operations with non-consing

Profile it - The Trouble with Space Profiling

[Figure: two call-graph sketches side by side, labeled Time profile and Space profile, each containing main, app, and libs]

Identify unnecessary consing

• Application specific

• Sometimes characterized by many scavenges of extremely high efficiency

Replace consing operations with non-consing

• Sometimes caused by system functions
  – Find alternates
  – Complain to vendor!

• Use resourcing strategies

• Use stack-allocations where possible
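
A resourcing strategy keeps expensive objects on a free list and reuses them instead of allocating fresh ones. The sketch below is a minimal, single-threaded pool of fixed-size byte buffers. All names and the buffer size are hypothetical, and it is plain Common Lisp rather than any Allegro CL facility.

```lisp
(defconstant +buffer-size+ 4096)

(defvar *free-buffers* '()
  "Free list of buffers available for reuse.")

(defun acquire-buffer ()
  "Reuse a pooled buffer if one is available; allocate one otherwise."
  (or (pop *free-buffers*)
      (make-array +buffer-size+ :element-type '(unsigned-byte 8))))

(defun release-buffer (buffer)
  "Return BUFFER to the pool. Costs one cons cell, not a whole buffer."
  (push buffer *free-buffers*)
  nil)

(defmacro with-pooled-buffer ((var) &body body)
  "Bind VAR to a pooled buffer for BODY and release it afterwards."
  `(let ((,var (acquire-buffer)))
     (unwind-protect (progn ,@body)
       (release-buffer ,var))))
```

A buffer must not be used after it has been released, and a multi-threaded program would need to protect the free list with a lock.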

Stack allocations

• <something> can be
  – flet or labels
  – list, list*, cons, and certain simple make-array forms

(let ((x <something>))
  (declare (dynamic-extent x))
  ...

or

(<something> ((x ...))
  (declare (dynamic-extent x))
  ...

Close

• Questions

• Kudos

• Tomatoes