PHP internal architecture

Post on 15-Jan-2017

PHP Internal Architecture
Pluggable, Extendable, Useable

Architecture
PHP piece by piece

You should know the basics

All the puzzle pieces

PHP

Input/Output
• SAPI
• Streams

Engine
• Lexer
• Parser
• AST
• Compiler
• Executor

Extensions
• Zend Extensions
• Compiled In
• Loaded at startup
• Loaded at runtime

Running PHP

server makes request

SAPI talks to engine

engine runs

SAPI returns output to server

How other languages do this

Python (CPython)
• mod_python (embedded Python interpreter, deprecated)
• mod_wsgi (embedded or daemon): either embeds the interpreter much as mod_python did, or talks over Unix sockets to a separate Python interpreter that has a special library installed
• command line interpreter
• FastCGI/CGI (using a library in Python)

Ruby (MRI)
• also known as “CRuby”
• Matz’s Ruby Interpreter
• use Rack (library) to:
  • write/run a Ruby webserver
  • use another server in between with hooks to nginx/Apache (Unicorn, Passenger)
• use FastCGI/CGI

And still more..

Node.js
• Your app is your server
• This is a pain
• Write your own clustering or other neat features!!
• So you stick a process manager in front
• And you reverse proxy from Apache/nginx
• Or you use Passenger or some other server…

Perl
• Yes it still exists – shhh you in the back
• PSGI + Plack
• mod_perl
• mod_psgi

What makes PHP different?
• Shared nothing architecture by design
  • application lifecycle is per-request
  • no shared state natively
  • infinite horizontal scalability in the language itself
• HTTP is a first class citizen
  • You don’t need a library or framework
• SAPI is a first class citizen
  • Designed to have a server in front of it
  • No library necessary
• You don’t need a deployment tool to keep it all going

The answer to your question is

SAPI
Server API – the least understood feature in PHP

What is a SAPI?
• Tells a server how to talk to PHP via an API
  • Server API
  • Server Application Programming Interface
• “Server” is a bit broad as it means any type of input/output mechanism
• SAPIs handle:
  • input arguments
  • output, flushing, file descriptors, interruptions, system user info
  • input filtering and optionally headers, POST data, HTTP specific stuff
  • a stream for the request body
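The list above can be pictured as a small set of callbacks that the engine calls without knowing who is on the other end. What follows is a toy sketch in Python, not PHP's real C structures: `ToySapi`, `BufferSapi` and `run_request` are hypothetical names invented for the illustration.

```python
import io


class ToySapi:
    """Hypothetical interface: the I/O callbacks an engine needs from its host."""

    def read_body(self) -> bytes:
        raise NotImplementedError

    def send_header(self, name: str, value: str) -> None:
        raise NotImplementedError

    def write(self, data: bytes) -> None:
        raise NotImplementedError

    def flush(self) -> None:
        raise NotImplementedError


class BufferSapi(ToySapi):
    """An in-memory host, standing in for a web server or a console."""

    def __init__(self, body: bytes = b""):
        self._body = io.BytesIO(body)
        self.headers = []
        self.output = io.BytesIO()
        self.flushed = 0

    def read_body(self) -> bytes:
        return self._body.read()

    def send_header(self, name, value):
        self.headers.append((name, value))

    def write(self, data):
        self.output.write(data)

    def flush(self):
        self.flushed += 1


def run_request(sapi: ToySapi) -> None:
    """Stand-in for the engine: it only ever talks to the SAPI callbacks."""
    body = sapi.read_body()
    sapi.send_header("Content-Type", "text/plain")
    sapi.write(b"received " + str(len(body)).encode() + b" bytes\n")
    sapi.flush()


sapi = BufferSapi(b"name=php")
run_request(sapi)
print(sapi.output.getvalue().decode(), end="")
```

Swapping `BufferSapi` for a class that reads a socket or a console would leave `run_request` untouched, which is the point of having the interface.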

In the beginning
• CGI
  • Common Gateway Interface
  • Shim between web server and program

• Simple
• Stateless
• Good security with Linux tools

• Slow
• Local
• Programs can have too much access
• Memory use not transparent (thrash and die!)
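To make the "shim" concrete, here is a minimal sketch of what a CGI program does, written in Python for illustration. The server passes request data in environment variables such as `REQUEST_METHOD` and `QUERY_STRING`, and the program answers on standard output. The helper name `handle_cgi` and the faked environment are assumptions of the sketch.

```python
import io
import sys
from urllib.parse import parse_qs


def handle_cgi(environ, stdout):
    """Build one CGI response from the environment the web server set up."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # A CGI response is header lines, a blank line, then the body.
    stdout.write("Content-Type: text/plain\r\n")
    stdout.write("\r\n")
    stdout.write(f"{environ.get('REQUEST_METHOD', 'GET')} hello, {name}\n")


# The web server would start a fresh process per request and set these
# variables; here they are faked so the sketch runs on its own.
fake_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=php"}
out = io.StringIO()
handle_cgi(fake_env, out)
sys.stdout.write(out.getvalue())
```

One process per request is what makes this simple and stateless, and also what makes it slow.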

Then there was PHP in a Webserver
• mod_php (apache2handler)

• Run the language directly in the webserver, speaking to a webserver’s module api

• Can access all of apache’s stuff

• Webserver handles all the request stuff, no additional sockets/processes

• It works well

• Requires prefork MPM or thread safe PHP

• Eats all your memory and never gives it back to the system

• Makes apache children take more memory

CGI is slow: FastCGI to the rescue!
• Persistent processes but CGI mad style
• Biggest drawbacks?
  • “it’s old”
  • “I don’t like the protocol”
  • “it’s not maintained”
  • “other people say it’s not stable”
• Apache fcgi modules do kind of suck
• Nginx “just works”
• IIS8+ “just works”

php-fpm – Make FastCGI better
• FastCGI Process Manager
• Adds more features than traditional FastCGI
  • Better process management including graceful stop/start
  • Uid/gid/chroot/environment/port/ini configuration per worker
  • Better logging
  • Emergency restart
  • Accelerated upload support
  • Dynamic/static child spawning

CLI?
• Yes, in PHP the CLI is a SAPI
• (Did you know there’s a special Windows CLI that doesn’t pop a console window?)
• PHP “overloads” the CLI to have a command line webserver for easier development (even though it SHOULD be on its own)
• PHP did that because fighting with distros to always include the cli-server would have meant pain, and if you just grab php.exe the dev webserver is always available
• The CLI treats console STDIN/STDOUT as its request/response

php-embed
• A thin wrapper allowing PHP to be easily embedded via C
• Used for extensions in Node, Python, Ruby, and Perl to interact with PHP
• Corresponding extensions do exist for those languages embedded in PHP

phpdbg
• Wait – there’s a debugger SAPI?
• Yes, yes there is

litespeed
• It is a SAPI
• The server just went open source…
• I’ve never tried it, but they take care of the SAPI

Just connect to the app?
• Use a webserver to reverse proxy to a webserver built into a framework?
• Smart to use a webserver that has already solved the hard stuff
• But the app/web framework on top needs to deal with
  • HTTP keepalive?
  • Gzip with caching?
  • X-Forwarded-For? Logging? Issues
  • Load balancing and failover?
  • HTTPS and caching?
  • ulimit? Remember we’re opening up a bunch of sockets!

Well, PHP streams can do that

Streams
Input and Output beyond the SAPI

What is a Stream?
• Access input and output generically
• Can write and read linearly
• May or may not be seekable
• Comes in chunks of data
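These four properties are not specific to PHP. As an illustration only, here is the same idea in Python using an in-memory stream: the reader pulls fixed-size chunks in order and rewinds only if the stream says it can. The helper `read_chunks` and the chunk size are made up for the sketch.

```python
import io


def read_chunks(stream, chunk_size=4):
    """Yield data from any file-like object in fixed-size chunks."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


memory = io.BytesIO(b"hello streams")
chunks = list(read_chunks(memory))
print(chunks)

# Some streams can rewind, others (pipes, sockets) cannot.
if memory.seekable():
    memory.seek(0)
    first = memory.read(5)
    print(first)
```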

How PHP Streams Work

Stream Contexts

Stream Wrapper

Stream Filter

ALL IO

Definitions
• Socket
  • Bidirectional network stream that speaks a protocol
• Transport
  • Tells a network stream how to communicate
• Wrapper
  • Tells a stream how to handle specific protocols and encodings

Built in Socket Transports
• tcp
• udp
• unix
• udg
• SSL extension
  • ssl
  • sslv2
  • sslv3
  • tls

You can write your own streams!
• You can do a stream wrapper in userland and register it
• But you need an extension to register them if they have a transport
• Extensions with streams include ssh, bzip2, openssl
• I’d really like the curl stream back (not with the compile flag, but curl://)
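As a rough picture of how a wrapper and a filter fit together, here is a toy model in Python. It is not PHP's stream layer: the registry, `register_wrapper`, `open_stream` and the `var://` scheme are all invented for the sketch, and the filter runs over the whole payload rather than chunk by chunk to keep it short.

```python
import io

WRAPPERS = {}  # hypothetical registry: scheme -> function that opens a stream


def register_wrapper(scheme, opener):
    """Associate a URL scheme with the code that knows how to open it."""
    WRAPPERS[scheme] = opener


def open_stream(url, filters=()):
    """Pick a wrapper by scheme, then apply each filter to the data in order."""
    scheme, _, rest = url.partition("://")
    stream = WRAPPERS[scheme](rest)
    data = stream.read()
    for apply_filter in filters:
        data = apply_filter(data)
    return io.BytesIO(data)


# A made-up "var" wrapper that serves named in-memory values.
VARIABLES = {"greeting": b"hello from a wrapper"}
register_wrapper("var", lambda name: io.BytesIO(VARIABLES[name]))

# bytes.upper plays the role of a stream filter.
result = open_stream("var://greeting", filters=[bytes.upper]).read()
print(result)
```

The caller only ever sees a URL and a readable stream; which wrapper served it and which filters touched it stay hidden behind `open_stream`.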

Welcome to the Engine
Lexers and Parsers and Opcodes OH MY!

Lexer
• checks PHP’s spelling
• turns into tokens
• see token_get_all for what PHP sees
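For a feel of what "turns into tokens" means, here is a toy lexer for a tiny PHP-looking expression language, written in Python. The token names are hypothetical and far simpler than what `token_get_all` reports; the sketch shows only the technique of matching the source against a list of patterns from left to right.

```python
import re

# Hypothetical token names for a tiny expression language, not PHP's T_* set.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("VARIABLE", r"\$[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=;]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn source text into (kind, text) pairs, rejecting unknown characters."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("$total = 40 + 2;")
print(tokens)
```

The lexer checks only "spelling": it would happily tokenize `= = 40 $total`, and rejecting that is the parser's job.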

Parser + AST
• checks PHP’s grammar
• E_PARSE means “bad phpish”
• creates AST

Compiler
• Turns AST into Opcodes
• Allows for fancier grammar
• Opcodes can then be cached (opcache) skipping lex/parse/compile cycle
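The payoff of caching compiled code can be shown with a toy cache in Python. Python's built-in `compile` stands in for the lex/parse/compile cycle. Keying the cache on a hash of the source text is an assumption of this sketch and not a description of how opcache identifies scripts.

```python
import hashlib

CACHE = {}  # hypothetical in-memory cache: source hash -> compiled code object
stats = {"compiles": 0}


def compile_cached(source):
    """Compile source once; later calls with the same text reuse the result."""
    key = hashlib.sha256(source.encode()).hexdigest()
    if key not in CACHE:
        stats["compiles"] += 1  # the expensive lex/parse/compile step
        CACHE[key] = compile(source, "<script>", "eval")
    return CACHE[key]


# Three "requests" for the same script: only the first one pays for compilation.
results = [eval(compile_cached("6 * 7")) for _ in range(3)]
print(results, stats["compiles"])
```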

Opcodes
• dump with http://derickrethans.nl/projects.html
• machine readable language which the runtime understands

Engine (Virtual Machine)
• reads opcode
• does something
• zend extension can hook it!
• ???
• PROFIT
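The "reads opcode, does something" loop, and the idea of hooking it, fit in a few lines. In this Python sketch the standard `ast` module plays lexer and parser, a small compiler turns the tree into stack-machine opcodes, and an executor runs them. The opcode names (`PUSH`, `ADD`, `SUB`, `MUL`) and the `hook` parameter are invented for illustration and do not correspond to Zend opcodes or the real extension API.

```python
import ast

BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_node(node, ops):
    """Walk the AST and append stack-machine opcodes (made-up names) to ops."""
    if isinstance(node, ast.Constant):
        ops.append(("PUSH", node.value))
    elif isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        compile_node(node.left, ops)
        compile_node(node.right, ops)
        ops.append((BINARY_OPS[type(node.op)], None))
    else:
        raise SyntaxError(f"unsupported syntax: {type(node).__name__}")
    return ops


def execute(ops, hook=None):
    """Read each opcode, do something, and let an optional hook observe it."""
    stack = []
    for name, arg in ops:
        if hook is not None:
            hook(name, arg)
        if name == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append({"ADD": left + right, "SUB": left - right, "MUL": left * right}[name])
    return stack.pop()


tree = ast.parse("1 + 2 * 3", mode="eval")
opcodes = compile_node(tree.body, [])
trace = []
result = execute(opcodes, hook=lambda name, arg: trace.append(name))
print(opcodes)
print(result, trace)
```

The hook here only records opcode names, but the same seam is where a debugger or profiler would sit.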

Extensions
How a simple design pattern made PHP more useful

“When I say that PHP is a ball of nails, basically, PHP is just this piece of shit that you just put all the parts together and you throw it against the wall and it fucking sticks”

- Terry Chay

So what is an extension?
• Written in C or C++
• Compiled statically into the PHP binary or as a shared object (so/dylib/dll)
• Provides
  • Bindings to a C or C++ library
    • even embed other languages
  • Code in C instead of PHP (speed)
    • template engine
  • Alter engine functionality
    • debugging

So why an extension?
• add functionality from other languages (mainly C)
• speed
• to infinity and beyond!
  • intercept the engine
  • add debugging
  • add threading capability
  • the impossible (see: operator)

About Extensions
• Types
  • Zend Extension
  • PHP Module
• Sources
  • Core Built in
  • Core Default
  • Core
  • PECL
  • GitHub and Other 3rd Party

– “We need to foster a greater sense of community for people writing PHP extensions, […] Quite what this means hasn't been decided, although one of the major responsibilities is to spark up some community spirit, and that is the purpose of this email.”

- Wez Furlong, 2003

What is PECL?
• PHP Extension Community Library
• The place for people to find PHP extensions
• No GPL code – license should be PHP license compatible (LGPL is ok but not encouraged)
  • http://news.php.net/article.php?group=php.pecl.dev&article=5

PECL Advantages
• Code reviews
  • See https://wiki.php.net/internals/review_comments
• Help from other devs with internal API changes (if in PHP source control)
  • https://svn.php.net/viewvc?view=revision&revision=297236
• Advertising and individual release cycles
  • http://pecl.php.net/news/
• pecl command line integration
  • actually just integration with the PEAR installer (which supports binaries/compiling) and a unique pecl channel
• php.net documentation!

PECL Problems
• Has less oversight into code quality
  • peclqa?
  • not all source accessible
• no action taken for abandoned code
  • still has “siberia” modules mixed with “need a maintainer”
• never enough help
  • tests
  • bug triaging
  • maintainers
  • code reviews
  • docs!
• no composer integration
• Half the code in git, half in svn still, half… elsewhere …

“It’s really free as in pull request”

- me

My extension didn’t make it faster!
• PHP is usually not the real bottleneck
• Do full stack profiling and benchmarking to see if PHP is the real bottleneck
• If PHP IS the real bottleneck you’re awesome – and you need to be writing stuff in C or C++
• Most times your bottleneck is not PHP but I/O

What about other languages?
• Ruby gem
  • Will compile and install
• Node’s npm
  • Will compile and install
• Perl’s CPAN
  • Written in special “xs” language
  • Will compile and install
• Python
  • Mixed bag? Distutils can install or grab a binary

FFI
Talk C without compiling

What is FFI?
• Foreign Function Interface
• Most things written in C use libffi
• https://github.com/libffi/libffi
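To show what "talk C without compiling" looks like in practice, here is a short example using Python's `ctypes`, chosen because it ships with the language. It calls the C library's `strlen` directly. The sketch assumes a POSIX system such as Linux or macOS; on Windows the library lookup would need a different name.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If find_library returns None, CDLL(None)
# opens the running process itself, which already has libc loaded on POSIX.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello")
print(length)
```

No C compiler runs at any point: the signature is declared at runtime and the call is marshalled through the FFI layer.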

Who has FFI?
• Java calls it JNI
• HHVM calls it HNI
• Python calls it “ctypes” (do not ask, stupidest name ever)
• C# calls it P/Invoke
• Ruby calls it FFI
• Perl has Inline::C (a bit of a mess)
• PHP calls it…

FFI

Oh wait…
• PHP’s FFI is rather broken
• PHP’s FFI has no maintainer
• It needs some TLC
• There’s MFFI but it’s not done
  • https://github.com/mgdm/MFFI
• Are you interested and not afraid?

For the future?
• More SAPIs?
  • Websockets
  • PSR-7
  • Other ideas?
• Fix server-tests.php so we can test SAPIs
  • Only CGI and CLI are currently tested well
• More extensions
  • Guidelines for extensions
  • Better documentation
  • Builds + pickle + composer integration

About Me
http://emsmith.net
auroraeosrose@gmail.com
twitter - @auroraeosrose
IRC – freenode – auroraeosrose #phpmentoring
https://joind.in/talk/67433
