Fun With SQL Joshua Tolley End Point Corporation Why and Why Not Joins CROSS JOIN INNER JOIN OUTER JOIN NATURAL JOIN Self Joins Other Useful Operations Subqueries Set Operations Common Operations Advanced Operations Common Table Expressions Window Functions Real, Live Queries Something Simple Something Fun Key Points Fun With SQL Joshua Tolley End Point Corporation October 5, 2009
Developers deal with SQL databases daily, yet very few speak SQL fluently. Even when an ORM hides most of SQL's details, SQL fluency will allow for increased performance and better troubleshooting capabilities. Given at UTOSC 2009
Transcript
Fun With SQL
Joshua Tolley
End Point Corporation
October 5, 2009

Why and Why Not
Joins
CROSS JOIN
INNER JOIN
OUTER JOIN
NATURAL JOIN
Self Joins
Other Useful Operations
Subqueries
Set Operations
Common Operations
Advanced Operations
Common Table Expressions
Window Functions
Real, Live Queries
Something Simple
Something Fun
Key Points
"The degree of normality in a database is inversely proportional to that of its DBA." - Anon, Twitter
Why Not Do Stuff in SQL
- Databases are harder to replicate, if you really need to scale out
  - Often, one complex SQL query is more efficient than several simple ones
  - Sometimes, indeed, it's useful to reduce the load on the database by moving logic into the application. Be careful doing this
  - cf. Premature Optimization
- More complex queries are harder to write and debug
  - True. But so is more complex programming.
- More complex queries are harder for the next guy to maintain
- Also, good DBAs are often more expensive than good programmers
  - These are both true. But complex programming is also hard for the next guy to maintain
  - Of all the reasons not to write fluent SQL, this is probably the most widely applicable
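To make the "one complex query versus several simple ones" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customer and orders tables, their columns, and the sample rows are hypothetical and do not come from the talk. Both functions return the same totals, but the second asks the database once instead of once per customer.

```python
import sqlite3


def make_db():
    """Build a small in-memory database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER, amount INTEGER);
        INSERT INTO customer VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
        INSERT INTO orders VALUES (1, 1, 10), (2, 1, 15), (3, 2, 7);
    """)
    return conn


def totals_many_queries(conn):
    """Several simple queries: one for the customers, then one per customer."""
    result = []
    customers = conn.execute(
        "SELECT id, name FROM customer ORDER BY id").fetchall()
    for cid, name in customers:
        (total,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?",
            (cid,)).fetchone()
        result.append((name, total))
    return result


def totals_one_query(conn):
    """One slightly more complex query: the join and grouping run in the database."""
    return conn.execute("""
        SELECT c.name, COALESCE(SUM(o.amount), 0)
        FROM customer c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """).fetchall()
```

Whether the single query is actually faster depends on the data and the database, so treat this as an illustration of the shape of the two approaches rather than a benchmark.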
Why do stuff in SQL?
- The database is more efficient than your application for processing big chunks of data
  - ...especially if your code is in an interpreted language
- The database is better tested than your application
  - Applications trying to do what SQL should be doing often get big and complex quickly
  - ...and also buggy quickly
- That's what the database is there for
  - SQL is designed to express relations and conditions on them. Your application's language isn't.
- A better understanding of SQL allows you to write queries that perform better
Why do stuff in SQL?
In short, the database exists to manage data, and your application exists to handle business logic. Write software accordingly.
Recursion
Recursive CTE to retrieve management hierarchy:

    WITH RECURSIVE t (id, managernames) AS (
        SELECT e.id, first || ' ' || last AS managernames
        FROM employee e
        WHERE manager IS NULL
      UNION ALL
        SELECT e.id, first || ' ' || last || ', ' || managernames AS managernames
        FROM employee e
        JOIN t ON (e.manager = t.id)
        WHERE manager IS NOT NULL
    )
    SELECT e.id, first || ' ' || last AS name, managernames
    FROM employee e
    JOIN t ON (e.id = t.id);
Recursion
...and get this...
     id |      name       |                   managernames
    ----+-----------------+---------------------------------------------------
      1 | john doe        | john doe
      2 | fred rogers     | fred rogers, john doe
      3 | speedy gonzales | speedy gonzales, john doe
      4 | carly fiorina   | carly fiorina, john doe
      5 | hans reiser     | hans reiser, fred rogers, john doe
      6 | johnny carson   | johnny carson, hans reiser, fred rogers, john doe
      7 | martha stewart  | martha stewart, speedy gonzales, john doe
    (7 rows)
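For readers who want to try the recursive query without a PostgreSQL server, the following is a rough sketch that runs an equivalent query in SQLite through Python's sqlite3 module. Several things here are assumptions rather than part of the talk: the sample rows are inferred from the output above, the table definition is invented, and the name columns are called first_name and last_name (instead of first and last) to stay clear of SQLite keywords.

```python
import sqlite3

# Hypothetical rows, inferred from the sample output: (id, first, last, manager id).
EMPLOYEES = [
    (1, "john", "doe", None),
    (2, "fred", "rogers", 1),
    (3, "speedy", "gonzales", 1),
    (4, "carly", "fiorina", 1),
    (5, "hans", "reiser", 2),
    (6, "johnny", "carson", 5),
    (7, "martha", "stewart", 3),
]

QUERY = """
WITH RECURSIVE t (id, managernames) AS (
    -- Anchor: employees with no manager start a chain with their own name.
    SELECT e.id, e.first_name || ' ' || e.last_name
    FROM employee e
    WHERE e.manager IS NULL
  UNION ALL
    -- Recursive step: prepend each report's name to the manager's chain.
    SELECT e.id, e.first_name || ' ' || e.last_name || ', ' || t.managernames
    FROM employee e
    JOIN t ON e.manager = t.id
)
SELECT e.id, e.first_name || ' ' || e.last_name AS name, t.managernames
FROM employee e
JOIN t ON e.id = t.id
ORDER BY e.id
"""


def management_chains():
    """Load the sample rows into an in-memory database and run the CTE."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, first_name TEXT, "
        "last_name TEXT, manager INTEGER)")
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", EMPLOYEES)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in management_chains():
        print(row)
```

The anchor member seeds the result with the top of the hierarchy, and each pass of the recursive member adds the next level down, which is why every chain ends with the same top-level name.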
Fractals in SQL
WITH RECURSIVE x(i) AS
(VALUES(0) UNION ALL SELECT i + 1 FROM x WHERE i < 101),
        TO_TIMESTAMP(start - start::INTEGER % 3600) AS start,
        TO_TIMESTAMP(stop - stop::INTEGER % 3600) AS stop
    FROM (
        SELECT id,
            TO_CHAR(NOW() - id * INTERVAL '1 HOUR',
                    'Dy Mon DD HH:MI AM') AS idname,
            EXTRACT(EPOCH FROM NOW() - id * INTERVAL '1 HOUR') AS start,
            EXTRACT(EPOCH FROM NOW() - (id - 1) * INTERVAL '1 HOUR') AS stop
        FROM (
            SELECT GENERATE_SERIES(1, 15) AS id
        ) f
    ) g
) h ON (slavecommit BETWEEN start AND stop)
GROUP BY id, idname
ORDER BY id DESC;
Something Fun
- The table contains replication data
  - Time of commit on master
  - Time of commit on slave
  - Number of rows replicated
- The user wants a graph of replication speed over time, given a user-determined range of time
Something Fun
We want to average replication times over a series of buckets. The first part of our query creates those buckets, based on generate_series(). Here we create buckets for 15 hours:

    SELECT
        id,
        TO_CHAR(NOW() - id * INTERVAL '1 HOUR',
                'Dy Mon DD HH:MI AM') AS idname,
        EXTRACT(EPOCH FROM NOW() - id * INTERVAL '1 HOUR') AS start,
        EXTRACT(EPOCH FROM NOW() - (id - 1) * INTERVAL '1 HOUR') AS stop
    FROM (
        SELECT GENERATE_SERIES(1, 15) AS id
    ) f

    ) ON (slavecommit BETWEEN start AND stop)
    GROUP BY id, idname
    ORDER BY id DESC;
Something Fun
...and get this:
     id |       idname        | avgtime | count
    ----+---------------------+---------+-------
     15 | Sat Mar 14 08:35 AM |     7.9 | 14219
     14 | Sat Mar 14 09:35 AM |     6.9 | 16444
     13 | Sat Mar 14 10:35 AM |     6.5 | 62100
     12 | Sat Mar 14 11:35 AM |     6.2 | 47349
     11 | Sat Mar 14 12:35 PM |       0 |     0
     10 | Sat Mar 14 01:35 PM |     4.6 | 21348
This is the average replication time and total replicated rows per hour. Note that this correctly returns zeroes when no rows are replicated, and still returns a value for that time slot. This saves some application-side processing.
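The zero-filled time slots come from joining the data against a generated set of buckets and wrapping the aggregates in COALESCE. The sketch below shows that idea in SQLite through Python's sqlite3 module, under several assumptions that are not from the talk: the table is named replication, commit times are stored as integer epoch seconds, the buckets are built with a recursive CTE because stock SQLite has no generate_series(), the "current time" is passed in as a parameter so the result is repeatable, and each bucket is half-open (start inclusive, stop exclusive) rather than using BETWEEN.

```python
import sqlite3

QUERY = """
WITH RECURSIVE bucket (id, start, stop) AS (
    -- Bucket 1 is the most recent hour; each later bucket is one hour older.
    SELECT 1, :now - 3600, :now
  UNION ALL
    SELECT id + 1, start - 3600, start FROM bucket WHERE id < :hours
)
SELECT b.id,
       COALESCE(ROUND(AVG(r.slavecommit - r.mastercommit), 1), 0) AS avgtime,
       COALESCE(SUM(r.total), 0) AS count
FROM bucket b
-- LEFT JOIN keeps every bucket, even when no replication row falls inside it.
LEFT JOIN replication r
       ON r.slavecommit >= b.start AND r.slavecommit < b.stop
GROUP BY b.id
ORDER BY b.id DESC
"""


def hourly_replication_stats(rows, now, hours):
    """rows: iterable of (mastercommit, slavecommit, total), times in epoch seconds."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE replication "
        "(mastercommit INTEGER, slavecommit INTEGER, total INTEGER)")
    conn.executemany("INSERT INTO replication VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY, {"now": now, "hours": hours}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(9000, 9004, 10), (9500, 9508, 30), (1000, 1005, 7)]
    for row in hourly_replication_stats(sample, now=10000, hours=3):
        print(row)
```

With the sample rows in the script, the middle bucket has no data: AVG and SUM see only the NULLs produced by the outer join, and COALESCE turns them into zeroes, so the application still receives one row per time slot.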
Something Fun
That query again:
SELECT
id, idname,
COALESCE(ROUND(AVG(synctime)::NUMERIC, 1), 0) AS avgtime,
COALESCE(SUM(total), 0) AS count
FROM (
SELECT slavecommit,
EXTRACT(EPOCH FROM slavecommit - mastercommit) AS synctime,