Parsing Strange: URL to SQL to HTML
Hal Stern
snowmanonfire.com
slideshare.net/freeholdhal
Headshot by Richard Stevens, http://dieselsweeties.com
© 2010 Hal Stern, Some Rights Reserved
Why Do You Care?
• Database performance = user experience
• A little database expertise goes a long way
• Taxonomies for more than sidebar lists
• Custom post types (!!)
• WordPress is a powerful CMS
  > Change default behaviors
  > Defy the common wisdom
  > Integrate other content sources/filters
WordCamp Boulder 2
Flow of Control
• Web server URL manipulation
  > Real file or permalink URL?
• URL to query variables
  > What to display? Tag? Post? Category?
• Query variables to SQL generation
  > How exactly to get that content?
• Template file selection
  > How will content be displayed?
• Content manipulation
Whose File Is This?
• User URL request passed to web server
• Web server checks .htaccess file
  > WP install root
  > Other .htaccess files may interfere
• Basic rewriting rules: if the requested file or directory doesn’t exist, start WordPress via index.php

    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /whereyouputWordPress/
    # Skip the rule when the request maps to a real file or directory
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    # Everything else goes to WordPress's front controller
    RewriteRule . /whereyouputWordPress/index.php [L]
    </IfModule>
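The two RewriteCond tests amount to a simple decision, sketched here in plain PHP. The function name `route_request` and the return labels are hypothetical, for illustration only; Apache makes this decision itself and never runs code like this.

```php
<?php
// Hypothetical model of the .htaccess logic: serve real files and
// directories directly, hand everything else to WordPress.
function route_request(string $docRoot, string $urlPath): string
{
    $candidate = rtrim($docRoot, '/') . '/' . ltrim($urlPath, '/');
    if (is_file($candidate) || is_dir($candidate)) {
        // A RewriteCond fails, so the rule is skipped and the
        // web server serves the file or directory itself.
        return 'static';
    }
    // No such file or directory: the request goes to index.php.
    return 'index.php';
}
```

A request for an image that does not exist therefore ends up in WordPress rather than producing the web server's own 404 page.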
Example Meta Fail: 404 Not Found
• Access broken image URLs for unintended results: no 404 pages!
  > Example: myblog/images/not-a-pic.jpg
• Web server can’t find the file, assumes it’s a permalink, hands it to WP
• WP can’t interpret it, so defaults to home
• Directory layout:
  > myblog/
  > myblog/wp-content (etc)
  > myblog/images
What Happens Before The Loop
• Parse URL into a query
• Set conditionals & select templates
• Execute the query & cache results
• Run the Loop:

    <?php if (have_posts()) :
        while (have_posts()) : the_post();
            // loop content
        endwhile;
    endif; ?>
Examining the Query String
• SQL passed to MySQL is in the WP_Query object’s request element
• Brute force: edit theme footer.php to see main loop’s query for displayed page

    <?php
    global $wp_query;
    echo "SQL for this page: ";
    echo $wp_query->request;
    echo "<br>";
    ?>
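A gentler variant of the footer trick is to emit the SQL as an HTML comment, and only for administrators, so visitors never see it. This is a minimal sketch: `format_query_debug` is a hypothetical helper name, and the choice of the `manage_options` capability is an assumption about who should see the output.

```php
<?php
// Format the main query's SQL as an HTML comment so it stays out of
// the visible page. format_query_debug() is a hypothetical helper.
function format_query_debug(string $sql): string
{
    // "-->" would end the comment early, so break up double hyphens.
    $safe = str_replace('--', '- -', $sql);
    return "<!-- SQL for this page: " . $safe . " -->\n";
}

// Register the hook only when WordPress is loaded, so this file can
// also be run on its own.
if (function_exists('add_action')) {
    add_action('wp_footer', function () {
        global $wp_query;
        if (current_user_can('manage_options')) {
            echo format_query_debug((string) $wp_query->request);
        }
    });
}
```

With this in a theme's functions.php, "View Source" on any page shows the query without editing footer.php.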
“Home Page” Query Deconstruction

    SELECT SQL_CALC_FOUND_ROWS wp_posts.*
    FROM wp_posts
    WHERE 1=1
      AND wp_posts.post_type = 'post'
      AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
    ORDER BY wp_posts.post_date DESC
    LIMIT 0, 10

• SELECT SQL_CALC_FOUND_ROWS wp_posts.*: get all fields from the posts table, and count the total matching rows even though LIMIT caps the rows returned
• WHERE: only get posts, and only those that are published or private to the user
• ORDER BY: sort the results by date in descending order
• LIMIT 0, 10: start with record 0 and return up to 10 results
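The LIMIT arithmetic generalizes to later pages of results. A minimal sketch, assuming page numbers start at 1 and 10 posts per page as in the query shown; `limit_clause` is a hypothetical helper, not a WordPress function.

```php
<?php
// Build "LIMIT offset, count" for a given page of results.
// Page numbers start at 1; anything lower is treated as page 1.
function limit_clause(int $paged, int $perPage = 10): string
{
    $offset = (max(1, $paged) - 1) * $perPage;
    return "LIMIT {$offset}, {$perPage}";
}
```

Page 1 gives the home page's `LIMIT 0, 10`; page 3 skips the first 20 rows.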
Query Parsing
• parse_request() method of the WP class extracts query variables from the URL
• Execute rewrite rules
  > Pick off ?p=67 style HTTP GET variables
  > Match permalink structure
  > Match keywords like “author” and “tag”
  > Match custom post type slugs
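Rule matching can be pictured as a table of regular expressions, each paired with a query string whose `$matches[n]` slots are filled from the URL. The sketch below is a simplified model, not WordPress's own code: the three rules and the `match_rewrite` function are illustrative assumptions, though the regex-to-query-string shape mirrors what the `rewrite_rules` option stores.

```php
<?php
// Toy rewrite table: regex => query string with $matches[n] slots.
// These three rules are illustrative, not a real site's rule set.
$rules = array(
    'tag/([^/]+)/?$'        => 'index.php?tag=$matches[1]',
    'author/([^/]+)/?$'     => 'index.php?author_name=$matches[1]',
    '([0-9]{4})/([^/]+)/?$' => 'index.php?year=$matches[1]&name=$matches[2]',
);

// Return the query variables for the first rule that matches $path,
// or an empty array when no rule matches.
function match_rewrite(array $rules, string $path): array
{
    $path = trim($path, '/');
    foreach ($rules as $regex => $target) {
        if (preg_match('#^' . $regex . '#', $path, $m)) {
            // Fill each $matches[n] slot with the captured text.
            $filled = preg_replace_callback(
                '/\$matches\[(\d+)\]/',
                function ($slot) use ($m) {
                    $i = (int) $slot[1];
                    return isset($m[$i]) ? urlencode($m[$i]) : '';
                },
                $target
            );
            $vars = array();
            parse_str((string) parse_url($filled, PHP_URL_QUERY), $vars);
            return $vars;
        }
    }
    return array();
}
```

Order matters: the first matching rule wins, which is why competing rules can shadow each other.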
Query Variables to SQL
• Query type: post by title, posts by category or tag, posts by date
• Variables for the query
  > Slug values for category/tags
  > Month/day numbers
  > Explicit variable values: ?p=67 for post_id
• post_type variable has been around for a while; custom post types (CPTs) fill in new values
Simple Title Slug Parsing
• Rewrite matches root of permalink, extracts tail of URL as a title slug
  > Example: /2010/premio-sausage

    SELECT wp_posts.*
    FROM wp_posts
    WHERE 1=1
      AND YEAR(wp_posts.post_date)='2010'
      AND wp_posts.post_name = 'premio-sausage'
      AND wp_posts.post_type = 'post'
    ORDER BY wp_posts.post_date DESC
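The step from query variables to a WHERE clause can be sketched as follows. This is a simplified model under stated assumptions: `build_where` is a hypothetical function, the `wp_` table prefix is assumed, and values are whitelisted rather than escaped the way WordPress does internally.

```php
<?php
// Hypothetical sketch: turn parsed query variables into WHERE
// fragments. Values are whitelisted instead of escaped.
function build_where(array $qv): string
{
    $where = '1=1';
    if (!empty($qv['year'])) {
        // Casting to int drops anything that is not a number.
        $where .= " AND YEAR(wp_posts.post_date)='" . (int) $qv['year'] . "'";
    }
    if (!empty($qv['name'])) {
        // Keep only characters that are legal in a slug.
        $slug = preg_replace('/[^a-z0-9_-]/', '', strtolower($qv['name']));
        $where .= " AND wp_posts.post_name = '" . $slug . "'";
    }
    $type = isset($qv['post_type']) ? $qv['post_type'] : 'post';
    $type = preg_replace('/[^a-z0-9_-]/', '', strtolower($type));
    return $where . " AND wp_posts.post_type = '" . $type . "'";
}
```

Passing `array('year' => '2010', 'name' => 'premio-sausage')` reproduces the WHERE clause of the query for /2010/premio-sausage.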
Graphs and JOIN Operations
• WordPress treats tags and categories as “terms”, mapped many-to-many (N:M) to posts
• Relational databases aren’t ideal for this
  > INNER JOIN builds intermediate tables on common key values
• Following a link in a social graph is equivalent to an INNER JOIN on tables of linked items
WordPress Taxonomy Tables
• Term relationships table maps N:1 terms to each post
• Term taxonomy maps slugs 1:N to taxonomies
• Term table has slugs for URL mapping
• Tables and key columns (object_id holds the post ID):
  > wp_posts: ID, …, post_date, …, post_content
  > wp_term_relationships: object_id, term_taxonomy_id
  > wp_term_taxonomy: term_taxonomy_id, term_id, taxonomy, description
  > wp_terms: term_id, name, slug
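The chain of keys across these tables can be walked by hand to see what a JOIN has to do. A minimal sketch with made-up rows; the IDs, the `post_ids_for_slug` function, and the array layout are all illustrative assumptions.

```php
<?php
// Toy rows standing in for the taxonomy tables (all values made up).
$terms = array(
    array('term_id' => 7, 'name' => 'Premio', 'slug' => 'premio'),
);
$term_taxonomy = array(
    array('term_taxonomy_id' => 70, 'term_id' => 7, 'taxonomy' => 'post_tag'),
);
$term_relationships = array(
    array('object_id' => 101, 'term_taxonomy_id' => 70),
    array('object_id' => 102, 'term_taxonomy_id' => 70),
    array('object_id' => 103, 'term_taxonomy_id' => 99),
);

// Follow slug -> term_id -> term_taxonomy_id -> object_id, the same
// chain of key matches that a series of INNER JOINs performs.
function post_ids_for_slug($slug, $taxonomy, $terms, $tt, $tr)
{
    $ids = array();
    foreach ($terms as $t) {
        if ($t['slug'] !== $slug) {
            continue;
        }
        foreach ($tt as $x) {
            if ($x['term_id'] !== $t['term_id'] || $x['taxonomy'] !== $taxonomy) {
                continue;
            }
            foreach ($tr as $r) {
                if ($r['term_taxonomy_id'] === $x['term_taxonomy_id']) {
                    $ids[] = $r['object_id'];
                }
            }
        }
    }
    return array_values(array_unique($ids));
}
```

The nested loops make the cost visible: every extra table multiplies the matching work unless the database has an index on the join key.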
Taxonomy Lookup
• Example: /tag/premio

    SELECT SQL_CALC_FOUND_ROWS wp_posts.*
    FROM wp_posts
    INNER JOIN wp_term_relationships
      ON (wp_posts.ID = wp_term_relationships.object_id)
    INNER JOIN wp_term_taxonomy
      ON (wp_term_relationships.term_taxonomy_id = wp_term_taxonomy.term_taxonomy_id)
    INNER JOIN wp_terms
      ON (wp_term_taxonomy.term_id = wp_terms.term_id)
    WHERE 1=1
      AND wp_term_taxonomy.taxonomy = 'post_tag'
      AND wp_terms.slug IN ('premio')
      AND wp_posts.post_type = 'post'
      AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
    GROUP BY wp_posts.ID
    ORDER BY wp_posts.post_date DESC
    LIMIT 0, 10
More on Canonical URLs
• Canonical URLs improve SEO
• WordPress is really good about generating 301 Redirects for non-standard URLs
• Example: URL doesn’t appear to match a permalink, so WordPress guesses the intended post
  > Use “LIKE title%” in WHERE clause
  > Matches “title” as initial substring with % wildcard
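Building the LIKE pattern safely means escaping any wildcard characters in the user-supplied text before appending the trailing %. A minimal sketch; `like_prefix_pattern` is a hypothetical name, not a WordPress function, and the result is still expected to be passed to the database as a bound or escaped string value.

```php
<?php
// Build the pattern for "post_name LIKE 'title%'". Escaping %, _ and
// backslash keeps user text from acting as wildcards; only the
// trailing % added here is a wildcard.
function like_prefix_pattern(string $title): string
{
    return addcslashes($title, '_%\\') . '%';
}
```

For the slug `premio` this yields `premio%`, which matches `premio-sausage` as well as `premio` itself.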
Modifying the Query
• Brute force isn’t necessarily good
  > Using query_posts() ignores all previous parsing, runs a new SQL query
• Filter query_vars
  > Change default parsing (convert any day to a week’s worth of posts, for example)
• Actions parse_query & parse_request
  > Access WP_Query object before execution
  > is_xx() conditionals are already set
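As an illustration of adjusting the query before it runs, rather than re-running it with query_posts(), the sketch below changes the sort order for category archives from a parse_query action. The `alphabetize_vars` function is a hypothetical helper, and sorting categories by title is just an example policy.

```php
<?php
// Hypothetical tweak: list category archives alphabetically by title.
// Works on a plain array so it can be tried without WordPress.
function alphabetize_vars(array $vars, bool $isCategory): array
{
    if ($isCategory) {
        $vars['orderby'] = 'title';
        $vars['order']   = 'ASC';
    }
    return $vars;
}

// With WordPress loaded, apply it before the SQL is generated. The
// parse_query action receives the WP_Query object, and its
// is_category flag is already set at this point.
if (function_exists('add_action')) {
    add_action('parse_query', function ($query) {
        $query->query_vars = alphabetize_vars(
            $query->query_vars,
            (bool) $query->is_category
        );
    });
}
```

Note that this hook fires for every WP_Query, including secondary loops and admin screens, so a real plugin would likely narrow the condition further.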
SQL Generation Filters
• posts_where
  > More explicit control over query variable to SQL grammar mapping
• posts_join
  > Add or modify JOINs for other graph-like relationships
• Many other filters
  > Change grouping of results
  > Change ordering of results
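A posts_where callback receives the WHERE fragment as a string and returns a modified one. The sketch below appends a date restriction; `recent_only_where` and the 30-day default are hypothetical, and the hard-coded `wp_posts` assumes the default table prefix (inside WordPress, `$wpdb->posts` is the safer reference).

```php
<?php
// Hypothetical posts_where callback: hide posts older than $days days.
// Assumes the default "wp_" table prefix.
function recent_only_where(string $where, int $days = 30): string
{
    return $where
        . ' AND wp_posts.post_date > DATE_SUB(NOW(), INTERVAL '
        . max(1, $days) . ' DAY)';
}

// Register only when WordPress is loaded. With the default of one
// accepted argument, the callback receives just the WHERE string.
if (function_exists('add_filter')) {
    add_filter('posts_where', 'recent_only_where');
}
```

Because the filter sees every query's WHERE clause, it is usually added just before the query of interest and removed right after with remove_filter().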
Custom Post Types
• Change SQL WHERE clause on post type
  > wp_posts.post_type='ebay'
• Add new rewrite rules for URL parsing similar to category & tag
  > Set slug in CPT registration array

    'rewrite' => array("slug" => "ebay"),

• Watch out for competing, overwritten or unflushed rewrite entries

    <?php
    echo "<pre>";
    print_r(get_option('rewrite_rules'));
    echo "</pre>";
    ?>
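For context, a fuller registration might look like the sketch below. Only the `rewrite` entry comes from the slide; the label and the other flags are illustrative choices, and `ebay_post_type_args` is a hypothetical helper name.

```php
<?php
// Arguments for a hypothetical "ebay" post type. Only 'rewrite' comes
// from the slide; the label and flags are illustrative choices.
function ebay_post_type_args(): array
{
    return array(
        'label'       => 'eBay Listings',
        'public'      => true,
        'has_archive' => true,
        'rewrite'     => array('slug' => 'ebay'),
    );
}

// Register on init, and only when WordPress is loaded.
if (function_exists('add_action')) {
    add_action('init', function () {
        register_post_type('ebay', ebay_post_type_args());
    });
}
```

After adding or changing the slug, the stored rewrite rules need regenerating: calling flush_rewrite_rules() once (for example on plugin activation) or re-saving the Permalinks settings page does that.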
Applications
• Stylized listings
  > Category sorted alphabetically
  > Use posts as listings of resources (jobs, clients, events): good CPT application
• Custom URL slugs
  > Add rewrite rules to match slug and set query variables
• Joining other social graphs
  > Suggested/related content
Template File Selection
• is_x() conditionals set in query parsing
• Used to drive template selection
  > is_tag() looks for tag-slug, tag-id, then tag
  > Full search hierarchy in Codex
• template_redirect action
  > Called in the template loader
  > Add actions to override defaults
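The search order for a tag archive can be modeled as a list of candidate file names, tried most specific first. A minimal sketch: both function names are hypothetical, and the archive.php and index.php fallbacks follow the Template Hierarchy page in the Codex rather than the slide itself.

```php
<?php
// Candidate templates for a tag archive, most specific first.
function tag_template_candidates(string $slug, int $id): array
{
    return array(
        "tag-{$slug}.php",
        "tag-{$id}.php",
        'tag.php',
        'archive.php',
        'index.php',
    );
}

// Pick the first candidate that exists in the theme directory,
// or null when none of them is present.
function pick_template(array $candidates, string $themeDir): ?string
{
    foreach ($candidates as $file) {
        if (is_file($themeDir . '/' . $file)) {
            return $file;
        }
    }
    return null;
}
```

Dropping a tag-premio.php file into the theme is therefore enough to give the “premio” tag its own layout, with no code changes.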
HTML Generation
• Done in the_post() method
• Raw content retrieved from MySQL
  > Shortcodes interpreted
  > CSS applied
• Some caching plugins generate and store HTML, so YMMV
Why Do You Care?
• User experience improvement
  > JOINs are expensive
  > Large table, repetitive SELECTs = slow
  > Running query once keeps cache warm
  > Category, permalink, title slug choices matter
• More CMS, less “blog”
  > Alphabetical sort
  > Adding taxonomy/social graph elements
Resources
• Core files where SQL stuff happens
  > query.php
  > post.php
  > canonical.php
  > rewrite.php
• Template loader search path
  > http://codex.wordpress.org/Template_Hierarchy
Contact
Hal Stern
[email protected]
@freeholdhal
snowmanonfire.com
facebook.com/hal.stern
slideshare.net/freeholdhal