Sybase Blogs Questions

Sybase Techincal Interview Questions

Cursors and Triggers in SybaseHow triggers workTriggers are automatic. They work no matter what caused the data modification—a clerk’s data entry or an application action. A trigger is specific to one or more of the data modification operations (update, insert, and delete), and is executed once for each SQL statement.

For example, to prevent users from removing any publishing companies from the publishers table, you could use this trigger:

create trigger del_pubon publishersfor deleteasbeginrollback transactionprint “You cannot delete any publishers!”endThe next time someone tries to remove a row from the publishers table, the del_pub trigger cancels the deletion, rolls back the transaction, and prints a message.

A trigger “fires” only after the data modification statement has completed and Adaptive Server has checked for any datatype, rule, or integrity constraint violation. The trigger and the statement that fires it are treated as a single transaction that can be rolled back from within the trigger. If Adaptive Server detects a severe error, the entire transaction is rolled back.

Triggers are most useful in these situations:

* Triggers can cascade changes through related tables in the database. For example, a delete trigger on the title_id column of the titles table can delete matching rows in other tables, using the title_id column as a unique key to locating rows in titleauthor and roysched.* Triggers can disallow, or roll back, changes that would violate referential integrity, canceling the attempted data modification transaction. Such a trigger might go into effect when you try to insert a foreign key that does not match its primary key. For example, you could create an insert trigger on titleauthor that rolled back an insert if the new titleauthor.title_id value did not have a matching value in titles.title_id.* Triggers can enforce restrictions that are much more complex than those that are defined with rules. Unlike rules, triggers can reference columns or database objects. For example, a trigger can roll back updates that attempt to increase a book’s price by more than 1 percent of the advance.* Triggers can perform simple “what if” analyses. For example, a trigger can compare the state of a table before and after a data modification and take action based on that comparison.Sybase SQL commands that are not allowed in triggersSybase SQL statements that are not allowed in triggersSince triggers execute as part of a transaction, the following statements are not allowed in a trigger:

* All create commands, including create database, create table, create index, create procedure, create default, create rule, create trigger, and create view* All drop commands* alter table and alter database

* truncate table* grant and revoke* update statistics* reconfigure* load database and load transaction* disk init, disk mirror, disk refit, disk reinit, disk remirror, disk unmirror* select into

Listing object names and attributes

Examples below are formatted to run using the isql utility.

/* list all table names for current database */

select name from sysobjects where type = ‘U’

go

sp_tables

go

/* list all trigger names for current database */

select name from sysobjects where type = ‘T’

go

/* list all procedure names for current database */

select name from sysobjects where type = ‘P’

go

/* display column definitions and indexes for employee table */

sp_help employee

go

/* display spaced used for employee table */

sp_spaceused employee

go

/* display source code for proc_rtv_employee */

sp_helptext proc_rtv_employee

go

Clustered vs non-clustered indexes

Typically, a clustered index will be created on the primary key of a table, and non-clustered indexes are used where needed. Non-clustered indexes

Leaves are stored in b-tree

Lower overhead on inserts, vs clustered

Best for single key queries

Last page of index can become a ‘hot spot’

Clustered indexes

Records in table are sorted physically by key values

Only one clustered index per table

Higher overhead on inserts, if re-org on table is required

Best for queries requesting a range of records

Index must exist on same segment as table

Note! With “lock datapages” or “lock datarows” … clustered indexes are sorted physically only upon creation. After that, the indexes behave like non-clustered indexes.

Transact SQL: numeric functionsMathematic Functions

abs absolute value abs(-5) = 5ceiling next highest int ceiling(5.3) = 6floor next lowest int floor(5.7) = 5power exponential power(2,8)=256rand random number rand=0.315378 for exampleround round to n places round(5.6,0)=6 round(5.66,1)=5.7

sign -1,0,1 sign(-5)=-1

Transact SQL: string functionsplus sign (+) concatenation 'one'+'two'='onetwo'ascii char->ascii value ascii('A')=65char ascii->char char(65)='A'charindex similar to instring charindex('two','onetwothree')=4char_length length of string charlength('onetwo')=6lower lower case lower('ONE')='one'ltrim trim left blanks ltrim(' one')='one'replicate repeat chars replicate('-',8)='--------'reverse flip string reverse('salad')='dalas'right right chunk of string right('Chicago',2)='go'rtrim trim right blanks rtrim('test ')='test'space spaces space(5)=' 'str float->char str(5.6,12,2)=' 5.60'stuff insert chars within str stuff('onetwothree',4,3,'-----')='one-----three'substring get piece of string substring('sybase',1,2)='sy'upper upper case upper('one')='ONE'

Transact SQL: misc functionsconvert convert between data types convert(float,'5.50')=5.50suser_name() current login idgetdate() current date

Transact SQL: date/time functionsdatepart* get part of a date datepart(MM,'10/21/98')=10dateadd* manipulate a date dateadd(DD,10,'10/21/98')= 10/31/98getdate todays date and time getdate()=Nov 16 1998-2000 7:27PM

* date parts are MM,DD,YY,HH,MI,SS,MS

Transact SQL: Finding duplicate rows in a table

This example finds cargo records with have duplicate destination ids.

3> select cargo_id, dest_id

4> from routing t1

5> where

6> ( select count(*)

7> from routing t2

8> where t2.dest_id = t1.dest_id ) > 1

9>

10> go

Using Temporary Tables

Temp tables allow developers to create and scan tables within a stored procedure – and have the tables totally isolated from all other database connections. This is very valuable when results need to be processed several times within a loop, or when a complex result set is expected (like a crosstab). Note that temp table transactions are logged within tempdb (exception: select into create statements).

Temporary tables are created in the tempdb database. To create a temporary table, you must have create table permission in tempdb.create table permission defaults to the Database Owner.

The table exists until the current session ends or until its owner drops it using drop table.

Tables that are accessible only by the current Adaptive Server session or procedure

Temporary tables with names beginning with “#”

Temporary tables with names beginning with “#” that are created within stored procedures disappear when the procedure exits. A single procedure can:

Create a temporary table

Insert data into the table

Run queries on the table

Call other procedures that reference the table

Since the temporary table must exist in order to create procedures that reference it, here are the steps to follow:

1. Use create table to create the temporary table.

2. Create the procedures that access the temporary table, but do not create the procedure that creates the table.

3. Drop the temporary table.

4. Create the procedure that creates the table and calls the procedures created in step 2.

tables with names beginning with tempdb..

You can create temporary tables without the # prefix, using create table tempdb..tablename from inside a stored procedure. These tables do not disappear when the procedure completes, so they can be referenced by independent procedures. Follow the steps above to create these tables.

Warning!

Create temporary tables with the “tempdb..” prefix from inside a stored procedure only if you intend to share the table among users and sessions. Stored procedures that create and drop a temporary table should use the # prefix to avoid inadvertent sharing.

General rules on temporary tables

Temporary tables with names that begin with # are subject to the following restrictions:

You cannot create views on these tables.

You cannot associate triggers with these tables.

You cannot tell which session or procedure has created these tables.

These restrictions do not apply to shareable, temporary tables created in tempdb.

Rules that apply to both types of temporary tables:

You can associate rules, defaults, and indexes with temporary tables. Indexes created on a temporary table disappear when the temporary table disappears.

System procedures such as sp_help work on temporary tables only if you invoke them from tempdb.

You cannot use user-defined datatypes in temporary tables unless the datatypes exist in tempdb; that is, unless the datatypes have been explicitly created in tempdb since the last time Adaptive Server was restarted.

You do not have to set the select into/bulkcopy option on toselect into a temporary table.

What is difference between SQL & T-SQL?

SQL- set of sqls are submitted individually to the database server.

T-SQL- the batch program is written where in all commands are submitted to the server in a single go. usually batches are run overnight and and all inserts and updates happen and these batches are scheduled. where as sqls’ are to run separately…..

all sqls’ are put in a file and schedule them called –t-sql .. besides it offers some other commands too.

SQL is the Structured Query Language the ANSI/ISO Standard database language. SQL Server’s implementation of the language is called Transact-SQL (T-SQL).

What is the difference between char and varchar data types?

char is used for fixed length memory storage whereas varchar

is used for variable lenght memory storage.

Fox Example if we have char(10) then we can store 10 bit

value in it but if we store only 6 bit value in it then rest

of the 4 bit space is goes wasted. this limitation is

overcome by varchar.

In varchar if we use less space than define space the rest

of space is not wasted.

How triggers work

Triggers are automatic. They work no matter what caused the data modification—a clerk’s data entry or an application action. A trigger is specific to one or more of the data modification operations (update, insert, and delete), and is executed once for each SQL statement.

For example, to prevent users from removing any publishing companies from the publishers table, you could use this trigger:

create trigger del_pubon publishersfor deleteasbeginrollback transactionprint “You cannot delete any publishers!”end

The next time someone tries to remove a row from the publishers table, the del_pub trigger cancels the deletion, rolls back the transaction, and prints a message.

A trigger “fires” only after the data modification statement has completed and Adaptive Server has checked for any datatype, rule, or integrity constraint violation. The trigger and the statement that fires it are treated as a single transaction that can be rolled back from within the trigger. If Adaptive Server detects a severe error, the entire transaction is rolled back.

Triggers are most useful in these situations:

* Triggers can cascade changes through related tables in the database. For example, a delete trigger on the title_id column of the titles table can delete matching rows in other tables, using the title_id column as a unique key to locating rows in titleauthor and roysched.* Triggers can disallow, or roll back, changes that would violate referential integrity, canceling the attempted data modification transaction. Such a trigger might go into effect when you try to insert a foreign key that does not match its primary key. For example, you could create an insert trigger on titleauthor that rolled back an insert if the new titleauthor.title_id value did not have a matching value in titles.title_id.* Triggers can enforce restrictions that are much more complex than those that are defined with rules. Unlike rules, triggers can reference columns or database objects. For example, a trigger can roll back updates that attempt to increase a book’s price by more than 1 percent of the advance.* Triggers can perform simple “what if” analyses. For example, a trigger can compare the state of a table before and after a data modification and take action based on that comparison.

Sybase SQL commands that are not allowed in triggers

Sybase SQL statements that are not allowed in triggers

Since triggers execute as part of a transaction, the following statements are not allowed in a trigger:

* All create commands, including create database, create table, create index, create procedure, create default, create rule, create trigger, and create view

* All drop commands* alter table and alter database* truncate table* grant and revoke* update statistics* reconfigure* load database and load transaction* disk init, disk mirror, disk refit, disk reinit, disk remirror, disk unmirror* select into

Sybase Architecture

The Sybase ServerA Sybase server consists of:A) two processes, data server and backup server ;B) devices which house the databases; one database (master) contains system and configuration data ;C) a configuration file which contains the server attributes .

Memory ModelThe Sybase memory model consists of:A) the program area, which is where the dataserver executable is stored;B) the data cache, stores recently fetched pages from the database deviceC) the stored procedure cache, which contains optimized sql callsThe Sybase dataserver runs as a single process within the operating system; when multiple users are connected to the database, only one process is managed by the OS. Each Sybase database connection requires 40-60k of memory.The “total memory” configuration parameter determines the amount of memory allocated to the server. This memory is taken immediately upon startup, and does not increase.

Transaction ProcessingTransactions are written to the data cache, where they advance to the transaction log, and database device. When a rollback occurs, pages are discarded from the data cache. The transaction logs are used to restore data in event of a hardware failure. A checkpoint operation flushes all updated (committed) memory pages to their respective tables.Transaction logging is required for all databases; only image (blob) fields may be exempt.During an update transaction, the data page(s) containing the row(s) are locked. This will cause contention if the transaction is not efficiently written. Record locking can be turned on in certain cases, but this requires sizing the table structure with respect to the page size.

Backup ProceduresA “dump database” operation can be performed when the database is on-line or offline. Subsequent “dump transaction” commands need to be issued during the day, to ensure acceptable recovery windows.

Recovery ProceduresA “load database” command loads the designated database with the named dump file. Subsequent “load transaction” commands can then be issued to load multiple transaction dump files.

Security and Account SetupThe initial login shipped with Sybase is “sa” (system administrator). This login has the role “sa_role” which is the super-user, in Sybase terms.User logins are added at the server level, and then granted access to each database, as needed. Within each database, access to tables can be granted per application requirements. A user can also be aliased as “dbo”, which automatically grants them all rights within a database.

Database CreationDatabases are initialized with the “create database” command. It is not unusual for a Sybase server to contain many different databases. Tables are created within each database; users refer to tables by using ownername.tablename nomenclature. “Aliasing” users with the database eliminates the need for the prefix. Typically, a user will be aliased as “dbo” (database owner), which also gives the same result.A typical Sybase database will consist of six segments spread across various devices.

Data TypesSupported data types include integer, decimal, float, money, char, varchar, datetime, image, and text datatypes.Text and image datatypes are implemented via pointers within the physical record structure ; the field contents are stored in dedicated pages. As a result, each text or image field requires at least 2K of storage (on most platforms).For string data, the varchar type can be used for lengths up to 255; the text type can be used for longer field data.Datetime fields are stored as a number which is accurate to 1/300 of a second.Within a “create table” statement, a column can be flagged as an “identity” column, which causes it to be incremented automatically when rows are inserted.

Storage ConceptsTables are stored in segments; a segment is an area within a device, with a name and a size, that is allocated for a database. The transaction log is stored in its own segment, usually on a separate device.

Transact-SQLTransact-SQL is a robust programming language in which stored procedures can be written. The procedures are stored in a compiled format, which allows for faster execution of code. Cursors are supported for row by row processing. Temporary tables are supported, which allows customized, private work tables to be created for complex processes. Any number of result sets can be returned to calling applications via SELECT statements.

Performance and scalabilitySybase continues to break TPC benchmark records. A recent (11/98) test yielded 53,049.97 transactions per minute (tpmC) at a price/performance of $76 per tpmC: The TPC-C tests on the Sun Enterprise 6500 server were conducted with 24 UltraSPARC(TM) processors, 24GB of main memory, 34 Sun StorEdge A5000 arrays and the new 64-bit Solaris 7 operating environment. Sybase 11 scales from handheld devices to enterprise level servers.Coming soon: benchmarks on Solaris and Linux machines.

PricePrice per seat is average, compared to other vendors.Support is achieved by opening cases with the support team. Response is usually within 24 hours.

Management and Development Tools (for Windows)ISQL is the interactive query tool used with Sybase ; it is useful for entering queries and stored procedures.

Sybase Central is shipped with System 12. It offers a good interface for performing basic database tasks. The “best of breed” product in this category is DB-Artisan by Embarcadero Technologies.

For development, Sybase Inc. offers Powerbuilder, Powerdesigner, Power J and its “Studio” line products. Powerbuilder remains the most robust, straightforward, and practical choice for windows development, supporting many other RDBMs in addition to Sybase System 11.

Investigating Locks

I had a colleague come across a situation where an investigation of a performance issue revealed blocking in the database, but wasn’t sure how to investigate further.

The first step is usually to execute sp_who.

1> sp_who2> go

This will list all the tasks and what they are doing. Some of them may be waiting on locks being use by another task. This task will also be in the list, and will helpfully include the user and the machine where this task originated

We can get more information about these tasks from the sysprocesses table, using those spids from sp_who. Here we get to see the name of blocking process, and the process ID on wherever it is being executed.

1> select * from sysprocesses where spid = [spid]2> go

Before you get up and go to that machine and ask the user what he is doing with that application, you can execute sp_lock to get more information about the resource the task is blocking.

1> sp_lock2> go

The results of sp_lock will reveal what kind of locks, and what object is being locked.

From there we look at sysobjects with the objectid from sp_lock.

1> select * from sysobjects where id = [objectid]2> go

Now you know exactly:1. Which task is blocking which2. What those tasks are, including who is running them on which machine3. Which resource is being blocked

That should give you the information you need to start thinking about how to avoid this problem.

Joins – Interview

All subsequent explanations on join types in this article make use of the following two tables. The rows in these tables serve to illustrate the effect of different types of joins and join-predicates. In the following tables, Department.DepartmentID is the primary key, while Employee.DepartmentID is a foreign key.

Employee Table

LastName

DepartmentID

Rafferty 31

Jones 33

Steinberg 33

Robinson 34

Smith 34

John NULL

Department TableDepartmentID

DepartmentName

31 Sales

33 Engineering

34 Clerical

35 Marketing

Note: The “Marketing” Department currently has no listed employees. Also, employee “John” has not been assigned to any Department yet.

[edit] Inner join

An inner join is the most common join operation used in applications, and represents the default join-type. Inner join creates a new result table by combining column values of two tables (A and B) based upon the join-predicate. The query compares each row of A with each row of B to find all pairs of rows which satisfy the join-predicate. When the join-predicate is satisfied, column values for each matched pair of rows of A and B are combined into a result row. The result of the join can be defined as the outcome of first taking the Cartesian product (or cross-join) of all records in the tables (combining every record in table A with every record in table B) – then return all records which satisfy the join predicate. Actual SQL implementations normally use other approaches like a Hash join or a Sort-merge join where possible, since computing the Cartesian product is very inefficient.

SQL specifies two different syntactical ways to express joins: “explicit join notation” and “implicit join notation”.

The “explicit join notation” uses the JOIN keyword to specify the table to join, and the ON keyword to specify the predicates for the join, as in the following example:

SELECT *

FROM employee

INNER JOIN department

ON employee.DepartmentID = department.DepartmentID

The “implicit join notation” simply lists the tables for joining (in the FROM clause of the SELECT statement), using commas to separate them. Thus, it specifies a cross-join, and the WHERE clause may apply additional filter-predicates (which function comparably to the join-predicates in the explicit notation).

The following example shows a query which is equivalent to the one from the previous example, but this time written using the implicit join notation:

SELECT *

FROM employee, department

WHERE employee.DepartmentID = department.DepartmentID

The queries given in the examples above will join the Employee and Department tables using the DepartmentID column of both tables. Where the DepartmentID of these tables match (i.e. the join-predicate is satisfied), the query will combine the LastName, DepartmentID andDepartmentName columns from the two tables into a result row. Where the DepartmentID does not match, no result row is generated.

Thus the result of the execution of either of the two queries above will be:

Employee.LastName

Employee.DepartmentID

Department.DepartmentName

Department.DepartmentID

Robinson 34 Clerical 34

Jones 33 Engineering 33

Smith 34 Clerical 34

Steinberg 33 Engineering 33

Rafferty 31 Sales 31

Note: Programmers should take special care when joining tables on columns that can contain NULL values, since NULL will never match any other value (or even NULL itself), unless the join condition explicitly uses the IS NULL or IS NOT NULL predicates.

Notice that the employee “John” and the department “Marketing” do not appear in the query execution results. Neither of these has any matching records in the respective other table: “John” has no associated department, and no employee has the department ID 35. Thus, no information on John or on Marketing appears in the joined table. Depending on the desired results, this behavior may be a subtle bug. Outer joins may be used to avoid it.One can further classify inner joins as equi-joins, as natural joins, or as cross-joins (see below).

Equi-join

An equi-join, also known as an equijoin, is a specific type of comparator-based join, or theta join, that uses only equalitycomparisons in the join-predicate. Using other comparison operators (such as <) disqualifies a join as an equi-join. The query shown above has already provided an example of an equi-join:

SELECT *

FROM employee



SQL provides an optional shorthand notation for expressing equi-joins, by way of the USING construct (Feature ID F402):

SELECT *

FROM employee


USING (DepartmentID)

The USING construct is more than mere syntactic sugar, however, since the result set differs from the result set of the version with the explicit predicate. Specifically, any columns mentioned in the USING list will appear only once, with an unqualified name,

rather than once for each table in the join. In the above case, there will be a single DepartmentID column and no employee.DepartmentID or department.DepartmentID.

The USING clause is supported by MySQL, Oracle, PostgreSQL, SQLite, DB2/400 and Firebird in version 2.1 or higher.

Natural join

A natural join offers a further specialization of equi-joins. The join predicate arises implicitly by comparing all columns in both tables that have the same column-name in the joined tables. The resulting joined table contains only one column for each pair of equally-named columns….

The above sample query for inner joins can be expressed as a natural join in the following way:

SELECT *

FROM employee NATURAL JOIN department

As with the explicit USING clause, only one DepartmentID column occurs in the joined table, with no qualifier:

DepartmentID

Employee.LastName


34 Smith Clerical

33 Jones Engineering

34 Robinson Clerical

33 Steinberg Engineering

31 Rafferty Sales

With either a JOIN USING or NATURAL JOIN, the Oracle databaseimplementation of SQL will report a compile-time error if one of the equijoined columns is specified with a table name qualifier: “ORA-25154: column part of USING clause cannot have qualifier” or “ORA-25155: column used in NATURAL join cannot have qualifier”, respectively.

Cross join

A cross join, cartesian join or product provides the foundation upon which all types of inner joins operate. A cross join returns the cartesian product of the sets of records from the two joined tables. Thus, it equates to an inner join where the join-condition always

evaluates toTrue or where the join-condition is absent from the statement. In other words, a cross join combines every row in B with every row in A. The number of rows in the result set will be the number of rows in A times the number of rows in B.

Thus, if A and B are two sets, then the cross join is written as A × B.

The SQL code for a cross join lists the tables for joining (FROM), but does not include any filtering join-predicate.

Example of an explicit cross join:

SELECT *

FROM employee CROSS JOIN department

Example of an implicit cross join:

SELECT *

FROM employee, department;

Employee.LastName





Jones 33 Sales 31

Steinberg 33 Sales 31

Smith 34 Sales 31

Robinson 34 Sales 31

John NULL Sales 31

Rafferty 31 Engineering 33



Smith 34 Engineering 33

Robinson 34 Engineering 33

John NULL Engineering 33

Rafferty 31 Clerical 34

Jones 33 Clerical 34

Steinberg 33 Clerical 34



John NULL Clerical 34

Rafferty 31 Marketing 35

Jones 33 Marketing 35

Steinberg 33 Marketing 35

Smith 34 Marketing 35

Robinson 34 Marketing 35

John NULL Marketing 35

The cross join does not apply any predicate to filter records from the joined table. Programmers can further filter the results of a cross join by using a WHERE clause.

Outer joins

An outer join does not require each record in the two joined tables to have a matching record. The joined table retains each record—even if no other matching record exists. Outer joins subdivide further into left outer joins, right outer joins, and full outer joins, depending on which table(s) one retains the rows from (left, right, or both).

(In this case left and right refer to the two sides of the JOIN keyword.)

No implicit join-notation for outer joins exists in standard SQL.

Left outer join

The result of a left outer join (or simply left join) for table A and B always contains all records of the “left” table (A), even if the join-condition does not find any matching record in the “right” table (B). This means that if the ON clause matches 0 (zero) records in B, the join will still return a row in the result—but with NULL in each column from B. This means that a left outer join returns all the values from the left table, plus matched values from the right table (or NULL in case of no matching join predicate). If the left table returns one row and the right table returns more than one matching row for it, the values in the left table will be repeated for each distinct row on the right table.

For example, this allows us to find an employee’s department, but still shows the employee(s) even when their department does not exist (contrary to the inner-join example above, where employees in non-existent departments are excluded from the result).

Example of a left outer join, with the additional result row italicized:

SELECT *

FROM employee LEFT OUTER JOIN department


Employee.LastName








John NULL NULL NULL


Right outer joins

A right outer join (or right join) closely resembles a left outer join, except with the treatment of the tables reversed. Every row from the “right” table (B) will appear in the joined table at least once. If no matching row from the “left” table (A) exists, NULL will appear in columns from A for those records that have no match in B.

A right outer join returns all the values from the right table and matched values from the left table (NULL in case of no matching join predicate).

For example, this allows us to find each employee and his or her department, but still show departments that have no employees.

Example right outer join, with the additional result row italicized:

SELECT *

FROM employee RIGHT OUTER JOIN department


Employee.LastName









NULL NULL Marketing 35

In practice, explicit right outer joins are rarely used, since they can always be replaced with left outer joins (with the table order switched) and provide no additional functionality. The result above is produced also with a left outer join:

SELECT *

FROM department LEFT OUTER JOIN employee


Full outer join

A full outer join combines the results of both left and right outer joins. The joined table will contain all records from both tables, and fill in NULLs for missing matches on either side.

For example, this allows us to see each employee who is in a department and each department that has an employee, but also see each employee who is not part of a department and each department which doesn’t have an employee.

Example full outer join:

SELECT *

FROM employee

FULL OUTER JOIN department


Employee.LastName







John NULL NULL NULL



NULL NULL Marketing 35

Some database systems (like MySQL) do not support this functionality directly, but they can emulate it through the use of left and right outer joins and unions. The same example can appear as follows:

SELECT *

FROM employee

LEFT JOIN department


UNION

SELECT *

FROM employee

RIGHT JOIN department


WHERE employee.DepartmentID IS NULL

SQLite does not support right join, so outer join can be emulated as follows:

SELECT employee.*, department.*

FROM employee

LEFT JOIN department


UNION ALL

SELECT employee.*, department.*

FROM department

LEFT JOIN employee


WHERE employee.DepartmentID IS NULL

Self-join

A self-join is joining a table to itself.[2] This is best illustrated by the following example.

Example

A query to find all pairings of two employees in the same country is desired. If you had two separate tables for employees and a query which requested employees in the first table having the same country as employees in the second table, you could use a normal join operation to find the answer table. However, all the employee information is contained within a single large table. [3]

Considering a modified Employee table such as the following:

Employee TableEmployeeID

LastName

CountryDepartmentID

123 Rafferty Australia 31

124 Jones Australia 33

145 Steinberg Australia 33

201 RobinsonUnited States

34

305 Smith Germany 34

306 John Germany NULL

An example solution query could be as follows:

SELECT F.EmployeeID, F.LastName, S.EmployeeID, S.LastName, F.Country

FROM Employee F, Employee S

WHERE F.Country = S.Country

AND F.EmployeeID < S.EmployeeID

ORDER BY F.EmployeeID, S.EmployeeID;

Which results in the following table being generated.

Employee Table after Self-join by CountryEmployeeID

LastName

EmployeeID

LastName

Country

123 Rafferty 124 JonesAustralia

123 Rafferty 145 SteinbergAustralia

124 Jones 145 SteinbergAustralia

305 Smith 306 JohnGermany

For this example, note that:

F and S are aliases for the first and second copies of the employee table.

The condition F.Country = S.Country excludes pairings between employees in different countries. The example question only wanted pairs of employees in the same country.

The condition F.EmployeeID < S.EmployeeID excludes pairings where the EmployeeIDs are the same.

F.EmployeeID < S.EmployeeID also excludes duplicate pairings. Without it only the following less useful part of the table would be generated (for Germany only shown):

EmployeeID

LastName

EmployeeID

LastName

Country

305 Smith 305 SmithGermany

305 Smith 306 JohnGermany

306 John 305 SmithGermany

306 John 306 John German

y

Only one of the two middle pairings is needed to satisfy the original question, and the topmost and bottommost are of no interest at all in this example.

ybase Migration ASE 12.5 to 15.01 Database

This section describes the database object consideration, while migrating from Sybase 12.5 to 15. It involves measurement and analysis of the below mentioned database objects.

Sybase Adaptive Server version number

Total size of development(and/or production) database in MB

Number of tables

The number of tables will impact the estimate for the schema migration. The process should be for the most part automatic. If the schema migration is done manually, an estimate needs to be made for schema migration based upon the number of tables.

Size and usage of tables

The size and usage of tables will effect where tables will be placed (i.e. tablespaces).

Largest tables (number and size):

Most active tables (number and size):

Look up tables (number and size):

Number of views

Why: All views need to be graded on level of complexity (simple, medium, complex and very complex).

Complexity items: Items that cause a view to be more difficult to migrate include: system tables, Adaptive Server specific functions. ANSI standard joins.

Number of indexes

Why: If the schema migration is done manually, an estimate needs to be made for schema migration based upon the number of indexes.

Number of clustered and non-clustered indexes

Number of triggers

Why: All triggers need to be graded on level of complexity (simple, medium, complex and very complex).

Complexity items: Items that cause a trigger to be more difficult to migrate include: temporary tables, DDL, transaction control logic, use of inserted/deleted tables in selects or DML, system tables, and global variables.

Sybase Adaptive Server uses the “inserted” and “deleted” tables to make comparisons of old values to new values.

Total number of stored procedures

Why: Some manual coding will needed to be done. All stored procedures need to be graded on level of complexity (easy, average and complex).

1 Pre-Migration Steps

1.1 System Databases Tasks

Check DB-Size 12.5 and 15.0.

Check underlying databases which are installed while installing Sybase Servers. (like master ,model, tempdb, sybsystemprocs)

List all the Objects in the above mentioned System Database

Compare DB-Objects in the existing 12.5.X and 15.0.X architecture. Thereby making sure all the objects which are in 12.5.X are present in the 15.0.X

Check the amount of data being populated into the Tables in 12.5.X Servers i.e System databases

Take the BCP-Out of the data.

Compare the data in the System database in 15.0.X, if discrepancies found in the two versions, remove discrepancy manually by generating additional insert script if required.

Classify DB-Objects (Stored Procedures, Views, Triggers, Cursors, Tables, etc )

1.2 Client Database Tasks

Check DB-Size in the 12.5.X and 15.0.X

Lists number of users, roles, groups; access related information in the Source Server (12.5.3)

Create Script of users, roles, groups, access related information for the Target Server (15.0.X)

List all the Objects in the Source Server (including Tables, Stored Procedures, Views, Triggers, Cursors, Indexes, etc )

List the referential integrity constraints.

Create a script of the DDL Statements from the Source Server.

Check datatypes and modify if required as per new Target Server.

Check and Take Count of the data in the Source Server.

Take the BCP-OUT of the available data from the Source Server.

Classify DB-Objects (Stored Procedures, Views, Triggers, Cursors, Tables, etc )

Analyze respective DB-Objects into the Client Databases

1 Required SQL Changes

1.1 Build In System- Functions

One of the major enhancements in ASE 15 is support for semantically partitioned tables. This required some low level changes in ASE which are relevant even when semantic partitioning is not even used in ASE 15.

One of these changes has to do with the built in system functions retrieving information from the OAM page, such as rowcnt(). These functions always had to be called with either the doampg or ioampg column from the sysindexes table. Since these functions only provide information about the size of tables and indexes, it is quite unlikely that they would be part of business logic. Instead, the functions are typically only found in DBA tools, such as scripts to report on space usage and row counts of the largest tables. In ASE 15, these built in system functions should be changed to very similar named equivalents, as shown below.

The new functions also have a slightly different interface, but fortunately, things have gotten simpler to use since specifying the doampg/ioampg columns are no longer required. Instead, it is sufficient to specify the ID values for the database, object, index or partition. The following list shows the functions in ASE 12.x and the corresponding new functions in ASE 15:

Pre‐15: rowcnt ( doampg )

ASE 15: row_count ( dbid, objid [, ptnid ] )

Pre‐15: data_pgs( objid, { doampg | ioampg } )

ASE 15: data_pages ( dbid, objid [, indid [, ptnid ]] )

Pre‐15: ptn_data_pgs ( objid, ptnid )

ASE 15: data_pages ( dbid, objid [, indid [, ptnid ]] )

Pre‐15: reserved_pgs ( objid, { doampg | ioampg } )

ASE 15: reserved_pages ( dbid, objid [, indid [, ptnid ]] )

Pre‐15: used_pgs ( objid, doampg, ioampg )

ASE 15: used_pages ( dbid, objid [, indid [, ptnid ]] )

It should be noted that the pre‐15 functions still exist in ASE 15, but they will now always return a value of zero. This means that, in principle, existing SQL code calling the pre‐15 functions will keep running, but results are unlikely to be correct. Maintaining the pre‐15 functions also introduces the risk of dividing by zero, for example when calculating number of rows per page with these functions; this would lead to a run‐time error, aborting SQL execution.

In addition, for every invocation of the pre‐15 functions a warning like the following will be generated, thus making it unlikely that the pre‐15 functions will go unnoticed: The built-in function ‘rowcnt’ has been deprecated. Use the new builtin function ‘row_count’ instead.Deprecated function returns value of 0.

1.2 Group By Without Order By

The ANSI SQL standard specifies that the order of rows in the result set of a SQL query is undefined unless an order by clause is specified. In practice in ASE however, the result set order was often predictable and indeed reliable even without an order by clause, as it followed logically from the choice of indexes in the query plan.

However, over the years, new features were introduced that made this implicit result set ordering ever more unpredictable. Such features include parallel query processing (ASE 11.5) and DOL row forwarding (ASE 11.9).

In ASE 15, there is yet another case where the result set order was previously predictable without an order by, but this can no longer be relied on. This concerns the case of a query with a group by clause but no order by. In pre 15, the order of the resulting rows would be guaranteed to be in the sorting order of the columns specified in the group by clause; this was a side effect of the rather classic sorting method for the group by.

Pre15: note that results are in alphabetical order for ‘type’:

1> select count(*), type

2> from sysobjects group by type

3> go

type

———– —-

1 D

55 P

52 S

19 U

4 V

This has changed in ASE 15. Among the many query processing enhancements in ASE 15 are more modern sorting algorithms based on hashing techniques. These sorting methods are faster than the classic sorting of pre 15, but the result set is no longer automatically in the sort order of the group by columns.

ASE 15.0: note the rows are no longer alphabetically ordered on ‘type’:

1> select count(*), type

2> from sysobjects group by type

3> go

type

———– —-

1 V

55 S

3 U

12 P

Unfortunately, it appeared that some customers had been inadvertently relying on the guaranteed result set order for a group by without order by. To offer a short term solution, traceflag 450 was introduced in version 15.0 ESD#2: when this traceflag is enabled, ASE will revert to the less performing, pre 15 sorting algorithm for group by sorting, thus

restoring the expected result set ordering. Required SQL code changes when migrating to ASE 15

The real solution is to make sure that order by is specified for every query where the result set ordering matters.

1.3 Fetching from Cursor while modifying underlying database

Sybase has identified a scenario whereby a particular sequence of events may cause an application to behave functionally different in ASE 15 than in 12.5. This could potentially affect business applications.

For this scenario to potentially occur, at least all of the following must be true:

• The client application is fetching from a cursor.

• The query plan for the cursor’s select statement involves a worktable in ASE 12.5, for example as a result of a distinct, group by, order by or union operation in the query.

• While the application is fetching rows from the cursor, rows in the underlying tables are being modified by another application (or by the same application on a different connection); the affected rows would be reflected in the result set if the cursor would be opened and evaluated again.

If all of these conditions apply, then the possibility exists that the application could fetch different results from the cursor in ASE 15 than it does in 12.5. Note that it is not guaranteed that this interference will indeed occur this depends on the actual timing of the different actions.

The following is happening in 12.5:

• In ASE 12.5, distinct, group by, order by or union in the query can lead to a query plan with

a worktable, whereby the result set rows are all buffered in the worktable before the first row can be fetched. When fetching rows from the cursor, they are actually fetched from the worktable which at that point already holds the full result set.

• When, at the same time as the fetching goes on in 12.5, rows in the underlying database tables are modified by another application, then these changes will not be reflected in the rows being fetched from the cursor: all result set rows are already buffered in the worktable, so changes to data rows in the underlying table will not affect the cursor’s current result set anymore.

Solution

The solution for this problem in ASE 15 is one of the following:

Modify the cursor declaration to include the keyword ‘insensitive’, e.g.:

declare my_curs insensitive cursor

for

select distinct col1, col2

from my_tab

Sybase recommends this solution.

1.4 Pre-15 Partitioned Tables

alter table my_table unpartition

Contrary to pre‐15, this statement will no longer work if the table in question has any indexes. The new implementation of partitions in ASE 15 requires that all indexes be dropped before any changes to the partitioning scheme can be made. In contrast, this command worked fine in pre‐15 with out without existing indexes. Therefore, any existing SQL code performing such an unpartitioning operation will need to be modified to drop indexes first, and recreate them afterwards. This command is very unlikely to occur in business application logic, but will typically be found only in maintenance scripts or tools for defragmenting or recreating/rebalancing partitioned tables.

2 Performance Tuning

As per the analysis done so far to increase the performance of the existing SQL blocks or Stored Procedure in ASE 12.5.X will not require any change, the advanced features discussed below and underlying algorithms in the new ASE 15 architecture will take care of the Performance thereby increasing the throughput.

2.1 Overview

ASE 15 caters to increased data volume; demand for operational DSS workload and at the same time keeps pace with the performance requirement in terms of throughput and response time.

The ASE 15 introduces some key features like on-disk encryption, smart partitions and a new, patented query processing technology. These features have exhibited superior performance, reliability and a reduced total cost of ownership

2.2 Impact of New Features

2.2.1 New Generation Query Engine

Query optimization has been improved through the use of advanced techniques like pipelined parallelism, cost-based pruning, timeout mechanisms, and plan strategies for improved star-joins, hash-joins and large multi-way joins, coupled with the use of join histograms, partition elimination, and goals-based optimization. Below are some example queries to illustrate the impact of the new Query engine.

2.2.1.1 Query 1: Selection of a better Join strategy

The below query involves association of the data scattered across a handful of relational database tables i.e. a join operation.

Example

Improvement:

The query runs 90% faster on ASE 15 compared to 12.5. The total amount of I/O (pages read/written) in ASE 12.5 turns out to be 4 times more.

Reason:

1) ASE 15 query optimizer chooses a Merge Join strategy over Nested Loop Join, thereby reducing the I/O requirement.

2) ASE 12.5 needs to store the join result into a temporary table so as to group and sort the rows. ASE 15 avoids such materializations by using a pipelined model. To implement the GROUP BY and ORDER BY, ASE 15 does an inmemory hash-based grouping (group_hashing), followed by on-the-fly sorting. This reduces the I/O and crunches the time further.

2.2.1.2 Query 2: Avoiding unnecessary SORTING of huge result sets

Example

Improvement:

This simple looking query runs 60% faster on ASE 15 compared to ASE 12.5. The table of interest has 6 million entries (table size ~ 2 GB) which are grouped into approximately 1.5 million distinct groups in the result set. The total amount of I/O incurred by the query in ASE 12.5 is 50 times more than in ASE 15.

Reason:

1) ASE 15 has introduced a new operator named ‘group_sorted’ to exploit the inherent result ordering derived from lower level operations. For the above query, ASE 15 is intelligently able to avoid the re-sorting after having done a clustered index scan on the group-by attribute to begin with. Hence all it must do, after reading the rows, is to mark the group boundaries within the result. On the other hand, ASE 12.5 uses temporary tables (called worktables)

to store the result of the index scan. Thus 12.5 incurs greater amount of I/O due to additional storage and spends more time due to redundant sorting.

2.2.1.3 Query 3: Executing intra-query operations in parallel

Example

Improvement:

This query runs 90% faster on ASE 15 in an operational DSS (Decision Support Systems) configuration mode compared to its fastest possible execution in ASE 12.5. The total amount of I/O in ASE 12.5 is a whooping 500 times more.

2.2.2 Smart Partitions

Database tables can now be divided into smaller chunks of data called “partitions” that can be individually managed. These can be stationed on separate disks to facilitate parallel reads. Queries may run faster since the “smart” query optimizer of ASE 15 is able to bypass partitions that don’t contain relevant data. In addition an administrator can quickly run maintenance tasks on selected partitions, if need be, rather than touching the entire database. This saves a great deal of maintenance time.

Improvement:

With range partitioning, the above balance query runs more than 25% faster on ASE 15 compared to ASE 12.5. The I/O requirement in ASE 12.5 is twice as much.

2.2.3 Direct I/O

This is an Operating System feature that can be used during file creation time. However support for using this feature was unavailable in ASE versions prior to and including 12.5. The support is now available in ASE 15. Enabling the direct I/O option for a database device file allows the administrator to configure the Adaptive Server to transfer data, to be written to that file, directly to the storage medium and thereby bypass any file system buffering mechanism. This speeds up database writes immensely by making the file system I/O comparable to that of raw devices. It gives the performance benefit of a raw device but has the ease of use and manageability of a file system device.

We see a 35% gain in the transaction throughput of TPC-C using DIRECTIO feature in 15.0, versus using the DSYNC feature available in 12.5. Both the features guarantee that writes will make it to the disk.

2.2.4 Delayed Commit

The delayed commit feature introduced in ASE 15 allows to postpone the instant at which log records are flushed to disk. With delayed commit enabled for a database in ASE, the log records for transactions bound to that database are asynchronously written to the disk at commit time. Thus control returns back immediately to the client without waiting for the I/O to complete. This improves response time for transactions with extensive logging activity. Some of the stream processing applications can benefit a lot from this feature since large volume of incoming data needs to be processed fairly quickly.

Improvement:

We observe almost 70-90% gain in execution time of data inserts in ASE 15.0 compared to ASE 12.5. The percentage depends on how big each transaction is in terms of number of

update/insert operations it does. Larger the number of operations per transaction, higher will be the gain with delayed commit.

2.2.5 Computed Columns

Often applications repeat the same calculations over and over for the same report or query. Instead of storing the result in a table’s column, the user or application may store the formula itself in the column. This creates a virtual column that will be recalculated whenever the rows of the table are accessed. Such columns are called “computed columns”. They are basically defined by an expression, involving regular columns in the same row, or from functions, arithmetic operators, path names, etc. Their main advantage is reduced storage, faster inserts/deletes and above all ease of altering the column value by simply re-specifying the defining expression.

2.2.6 Scrollable Cursors

With large data sets, access can be cumbersome. ASE 15 introduces bi-directional scrollable cursors to make it convenient to work with large result sets because application can easily move backward and forward through a result set, one row at a time. This especially helps with Web applications which need to process large result sets and present the user with a restricted view of the result.

2.2.7 CONCLUSION

The above features give a fair idea of the enhanced performance of the Adaptive Server in its latest incarnation. Keeping in mind the query and workload characteristics at customer installations, the latest descendant of the ASE lineage, with its enhanced set of features, is able to tune itself to individual usage scenarios and scale better performance. The

sophisticated query execution process selects efficient operations, skips redundant ones, and exploits concurrent ones. Smart distribution of data on underlying storage provides faster data retrieval and updates.

And, greater manageability and ease when using the traditional filesystem for storage of data, coupled with strategies for faster transaction completion, ensure optimal performance in a data server.

3 Miscellaneous

3.1 Running Update Statistics in ASE 15

In ASE 15, having sufficient and accurate statistics on database tables is more important than it was in pre 15. Without proper statistics, application performance could suffer, though this would not impact correct functioning of applications.

What this means in practice is that it is recommended to run update index statistics rather than update statistics. Consequently, DBAs may need to update their maintenance scripts accordingly. When running update index statistics on large tables, it may be also be needed to use the with sampling clause to avoid overrunning tempdb or the procedure cache, and/or specifying an explicit step count with the clause using nnn values. While update statistics is typically found in maintenance scripts, in rare cases this might also occur in SQL code in business applications. If this occurs, it is typically when a temporary table has been populated with data, and update statistics is required to ensure the proper query plan is selected in subsequent processing. In such cases, it may well be required to change this to update index statistics as well. Again though, this would only have a performance effect rather than lead to incorrect application behaviour.

3.2 Optimistic recompilation & sp_recompile

The DBAs have routinely been running sp_recompile after running update_statistics, to ensure that the new statistics will be considered for future queries. As of ASE 15, sp_recompile is no longer needed once statistics are updated; as the sp_recompile functionality is now implicitly included in update…statistics. The same applied when creating a new index on a table: it is no longer needed to run sp_recompile.

However, it is not needed to change any of the maintenance scripts where sp_recompile

might occur, since executing sp_recompile in those cases is totally harmless.

Sybase Performance Tuning Interview Questions

Reporting: SQL Performance and Tuning

This is a list of some techniques used successfully at several different sites.

Getting Maximum Index Usage

1) Verify tables have had “update statistics” on them ;

Verify tables have had “sp_recompile” on them.

2) Verify any declared variables have the same data

type as their corresponding columns – this is a common

pitfall.

3) Force index usage as follows, with a hint:

from customer (index idx_customer2)

4) Use SET TABLE COUNT

Example: set table count 6

Then, compile the procedure, in the same session.

5) If temp tables are being used, put the temp table

creation statements in one procedure, and the

processing SQL in another procedure. This allows

the optimizer to form a query plan on the already

established tables.

Example:

proc_driver calls proc_create_temp_tables

then,

proc_driver calls proc_generate_report

General SQL Programming

- Plan for growth. Assume the driver table doubled or tripled in size; would

the report still function ?

- Avoid dumb comparisons in the where clause, like

where @emp_id > 0

- use “WHERE EXISTS ( )” rather than “WHERE NOT EXISTS”

- use “!=” rather than “<>”

- use “IS NOT NULL” rather than “<>NULL”

- use “IS NULL” rather than “=NULL”

- avoid distinct if possible ; see cursor loop option below

- use meaningful names for temp tables … don’t use #temp (lame)

Report Structure Approaches

1) Single query

Single query reports are rare – usually they involve getting a simple list

together.

- Don’t try to ‘shoehorn’ SQL into one statement. Shorter programs are

great for C or Perl applications, but this is not the case in SQL.

Think “Bigger is Better” (and more maintainable).

- Keep queries from using more than four tables if possible.

2) Cursor on driver table(s), with IF..THEN processing in loop

Using a cursor for complex reports almost always increases performance

when large tables and a lot of joins are involved.

- Keep cursor queries from using more than two tables if possible,

make sure this query performs well on its own.

- Try to have a unique key of some sort available within the tables involved.

Strange results have been known to occur when a cursor is scanning

rows that are exactly alike.

- Don’t use cursors for updating.

- Use IF statements for filtering results even further. In most cases:

A code construct like the one below is better than cramming the

logic in a where clause.

IF

BEGIN

IF and

…..

ELSE

….

END

3) Set processing without cursors

This technique should be attempted when even a cursor construct fails to

achieve the desired performance.

Basically, the driver query is re-run with each iteration of the loop.

Sample, with cursor:

declare cursor1 cursor for

select emp_id, last_name, salary

from employee

open cursor1

fetch cursor1 into @emp_id, @last_name, @salary

while (@@sqlstatus = 0)

begin

< processing >

fetch cursor1 into @emp_id, @last_name, @salary

end

close cursor1

Sample, with set processing:

select @emp_id = 0, @loop = 1

while (@loop > 0)

begin

set rowcount 1

select

@emp_id = emp_id,

@last_name = last_name,

@salary = salary

from employee

where emp_id > @emp_id

order by 1

select @loop = @@rowcount

set rowcount 0

if @loop > 0

begin

< processing >

end

end

Transaction Log Filling Up ?

If the transaction log is filling up, for tempdb or the main database, there

is likely something wrong with the report logic.

Things to check:

- Instead of repetitively updating each row, can the values be obtained

ahead of time, and then inserted with a single transaction ?

- Are the “joined” updates occuring on each row once ? When updating

using a join statement, make sure that the tables in question

are joined in a way that avoids duplicate rows. Try running the

SQL statement as a SELECT – check it out.

- Are you cramming 500,000 rows from a temp table into a db table ?

Try elminating the temp table.

- Create indexes on updated/inserted tables after the fact.

- Use “set rowcount” along with “waitfor delay” if log problems persist

*** A proper DBA will never extend the log segment, based on the needs of a

single process

Sybase Blogs Questions

Documents

violate referential

integrity

implicit join

delpub trigger

adaptive server

explicit join

data modification

roll back