Indexes
Objectives
After completing this module, you should be able to:
Define primary and secondary indexes and their purposes.
Distinguish between a primary index and a primary key.
Distinguish between a UPI and a NUPI.
Define a Partitioned Primary Index and its purpose.
Distinguish between a USI and a NUSI.
Explain the makeup of the Row ID and its role in row storage.
Describe the sequence of events for locating a row.
Explain the roles of the hashing algorithm and hash map in locating a row.
Describe the operation of full table scans in Teradata.
Indexes in Teradata
Indexes are used to access rows from a table
without having to search the whole table. In the Teradata RDBMS, an
index is made up of one or more columns in a table. Once Teradata
indexes are selected, they are maintained by the system. While
other vendors may require data partitioning or index maintenance,
these tasks are unnecessary with Teradata.
In the Teradata RDBMS, there are two types of indexes:
Primary Indexes define the way the data is distributed.
Primary Indexes and Secondary Indexes are used to locate the
data rows more efficiently than scanning the whole table.
You specify which column(s) are used as the Primary Index when
you create a table. Secondary Index column(s) can be specified when
you create a table or at any time during the life of the table.
Data Distribution
When the Primary Index for a table is well
chosen, the table rows are evenly distributed across the AMPs for
the best performance. The way to guarantee even distribution of
data is by choosing a Primary Index whose columns contain unique
values. The values do not have to be evenly spaced, or even "truly
random"; they just have to be unique to be evenly distributed.
The even distribution enables each AMP to be responsible for
only a subset of the rows in a table. If the data is evenly
distributed, the work is evenly divided among the AMPs so they can
work in parallel and complete their processing about the same time.
Even data distribution is critical to performance because it
optimizes the parallel access to the data.
Unevenly distributed data, also called "skewed data," causes
slower response time as the system waits for the AMP(s) with the
most data to finish their processing. The slowest AMP becomes a
bottleneck.
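The effect of skew can be illustrated with a small simulation. This is only a sketch under stated assumptions: SHA-256 stands in for Teradata's hashing algorithm, a simple modulo stands in for the hash map, and the 4-AMP system and the sample values are hypothetical.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # hypothetical system size

def amp_for(pi_value):
    """Toy stand-in for hashing plus hash map lookup: map a PI value to an AMP."""
    digest = hashlib.sha256(repr(pi_value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS

def rows_per_amp(pi_values):
    """Count how many rows each AMP would own for the given PI values."""
    counts = Counter(amp_for(v) for v in pi_values)
    return [counts.get(amp, 0) for amp in range(NUM_AMPS)]

unique_pi = list(range(100_000))                 # every row has a distinct PI value
skewed_pi = [10] * 90_000 + list(range(10_000))  # 90% of rows share one PI value

even = rows_per_amp(unique_pi)
skewed = rows_per_amp(skewed_pi)
print("unique PI values:", even)    # roughly equal counts on every AMP
print("skewed PI values:", skewed)  # one AMP holds at least 90,000 rows
```

In the skewed case, the AMP that owns the repeated value has far more rows to process than the others, so it finishes last and sets the response time for the whole request.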
When data is loaded into the Teradata RDBMS:
The system automatically distributes the data across the AMPs
based on row content (the Primary Index values).
The distribution is the same regardless of the data volume being
loaded. In other words, large tables are distributed the same way
as small tables.
Data is not distributed in any particular order. The automatic,
unordered distribution of data eliminates tasks for a Teradata DBA
that are necessary with some other relational database systems. The
DBA does not waste time on labor-intensive data maintenance tasks.
Some benefits of unordered data include:
Prior to loading the data, no initial data ordering or sorting
is necessary.
Once data is loaded, no data maintenance is necessary to
preserve the order.
SQL requests can be formulated without regard to the data
order.
A Teradata system provides high performance because it
distributes the data evenly across the AMPs for parallel
processing.
Question
Which of the following statements do you think are true about
data distribution and Teradata indexes? (Choose two answers.)
A. If a table has 103 rows but there are 4 AMPs in the system,
each AMP will not have exactly the same number of rows from that
table. However, if the Primary Index is chosen well, each AMP still
will contain some rows from that table.
B. The rows of a table are stored on a single disk for best
access performance.
C. Skewed data leads to poor performance in processing data
access requests.
D. Teradata RDBMS performance can be increased by
maintaining the indexes and conducting periodic data partitioning
and sorting.
Primary Index (PI)
A Primary Index is the mechanism for
assigning a data row to an AMP and a location on the AMP's disks. It
is also used to access rows without having to search the entire
table. You specify the column(s) that comprise the Primary Index
for a table when the table is created. For a given row, the Primary
Index value is the combination of the data values in the Primary
Index columns.
Choosing a Primary Index for a table is perhaps the most
critical decision a database designer makes, because this choice
affects both data distribution and access.
Primary Index Rules
The following rules govern how Primary Indexes in a Teradata system
must be defined and how they function:
Rule 1: One Primary Index per table.
Rule 2: A Primary Index value can be unique or non-unique.
Rule 3: The Primary Index value can be NULL.
Rule 4: The Primary Index value can be modified.
Rule 5: The Primary Index of a table cannot be modified.
Rule 6: A Primary Index has a limit of 16 columns.
Rule 1: One PI Per Table
Each table must have a Primary Index.
The Primary Index is the only way for the system to determine where
a row will be physically stored. While a Primary Index may be
composed of multiple columns, the table can have only one (single-
or multiple-column) Primary Index.
Rule 2: Unique or Non-Unique PI
There are two types of Primary
Index:
Unique Primary Index (UPI) - For a given row, the combination of
the data values in the columns of a Unique Primary Index is not
duplicated in other rows within the table. This uniqueness
guarantees uniform data distribution and direct access. For
example, in the case where old employee numbers are sometimes
recycled, the combination of the Last Name and Employee Number
columns would be a UPI.
Non-Unique Primary Index (NUPI) - For a given row, the
combination of the data values in the columns of a Non-Unique
Primary Index can be duplicated in other rows within the table. A
NUPI can cause skewed data, but in specific instances can still be
a good Primary Index choice. For example, either the Department
Number column or the Hire Date column might be a good choice for a
NUPI if you will be accessing the table most often via these
columns.
Rule 3: PI Can Be NULL
If the Primary Index is unique, you could
have one row with a null value. If you have multiple rows with a
null value, the Primary Index must be Non-Unique.
Rule 4: PI Value Can Be Modified
The Primary Index value can be
modified. In the table below, if Loretta Ryan changes departments,
the Primary Index value for her row changes.
When you update the index value in a row, Teradata re-hashes it
and redistributes the row to its new location based on its new
index value.
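The re-hash and redistribution step can be sketched as follows. This is a toy model, not Teradata's implementation: SHA-256 and a modulo stand in for the hashing algorithm and hash map, and the 4-AMP system, the `department` column, and the values 403 and 501 are hypothetical.

```python
import hashlib

NUM_AMPS = 4  # hypothetical system size

def amp_for(pi_value):
    """Toy stand-in for hashing plus hash map lookup."""
    digest = hashlib.sha256(repr(pi_value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS

amps = [[] for _ in range(NUM_AMPS)]   # rows held by each AMP

def insert(row, pi_col):
    amps[amp_for(row[pi_col])].append(row)

def update_pi_value(row, pi_col, new_value):
    """Change a row's PI value: remove it, re-hash, and store it at the new location."""
    amps[amp_for(row[pi_col])].remove(row)
    row[pi_col] = new_value
    amps[amp_for(new_value)].append(row)

ryan = {"last_name": "Ryan", "department": 403}   # hypothetical values
insert(ryan, "department")
update_pi_value(ryan, "department", 501)
```

The row is always stored on the AMP that its current Primary Index value hashes to, which is why frequently changing columns make poor Primary Index candidates.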
Rule 5: PI Cannot Be Modified
The Primary Index of a table
cannot be modified.
In the event that you need a new Primary Index, you must drop
the table, recreate it with the new Primary Index, and reload the
table.
In Teradata RDBMS V2R5, the ALTER TABLE statement allows you to
change the PI of a table if the table is empty.
Rule 6: PI Has 16-Column Limit
You can designate a Primary Index
that is composed of 1 to 16 columns.
In Teradata RDBMS V2R5, the maximum number of columns in an
index is increased to 64.
SQL Syntax for Creating a Primary Index
Every table must have a Primary Index, which is defined in the
CREATE TABLE statement in SQL.
If you do not specify a Primary Index in the CREATE TABLE
statement, the system will use the Primary Key as the Primary
Index. If a Primary Key has not been specified, the system will
choose the first unique column. If there are no unique columns, the
system will use the first column in the table and designate it as a
Non-Unique Primary Index.
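The fallback order described above can be written out as a small decision function. This is an illustrative sketch only; the function name and its parameters are hypothetical and do not correspond to any Teradata interface.

```python
def default_primary_index(columns, primary_key=None, unique_columns=()):
    """Return (index_columns, is_unique) following the fallback order in the text."""
    if primary_key:                   # 1. a declared Primary Key is used (unique)
        return list(primary_key), True
    for col in columns:               # 2. otherwise the first unique column
        if col in unique_columns:
            return [col], True
    return [columns[0]], False        # 3. otherwise the first column, as a NUPI

print(default_primary_index(["col_a", "col_b", "col_c"], primary_key=["col_b"]))
print(default_primary_index(["col_a", "col_b", "col_c"], unique_columns={"col_c"}))
print(default_primary_index(["col_a", "col_b", "col_c"]))
```

Relying on the last fallback is rarely a good idea, because the first column of a table is not necessarily unique or frequently used for access.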
Creating a Unique Primary Index
The SQL syntax to create a
Unique Primary Index is:
CREATE TABLE sample_1
(col_a INT
,col_b INT
,col_c INT)
UNIQUE PRIMARY INDEX (col_b);
Creating a Non-Unique Primary Index
The SQL syntax to create a Non-Unique Primary Index is:
CREATE TABLE sample_2
(col_x INT
,col_y INT
,col_z INT)
PRIMARY INDEX (col_x);
Modifying the Primary Index of a Table
As mentioned in the Primary Index rules, you cannot modify the
Primary Index of a table. In the event that you need a new Primary
Index, you must drop the table, recreate it with the new Primary
Index, and reload the table.
Data Mechanics of Primary Indexes
This section describes how
Primary Indexes are used in:
Data distribution
Data access
Distributing Rows to AMPs
Rows are distributed to AMPs during the
following operations:
Loading data into a table (one or more rows, using a data
loading utility)
Inserting or updating rows (one or more rows, using SQL)
Changing the system configuration (redistribution of data,
caused by reconfigurations to add or delete AMPs)
When loading data or inserting rows, the data being affected by
the load or insert is not available to other users until the
transaction is complete. During a reconfiguration, no data is
accessible to users until the system is operational in its new
configuration.
Row Distribution Process
The process the system uses for inserting a row on an AMP is
described below:
1. The system uses the Primary Index value in each row as input
to the hashing algorithm.
2. The output of the hashing algorithm is the row hash value (in
this example, 646).
3. The system looks at the hash map, which identifies the
specific AMP where the row should be stored (in this example, AMP
3).
4. The row is stored on the target AMP.
UPI: The system automatically checks for duplicate UPI values
when rows are loaded or inserted. If a row already exists with the
UPI value, the new row is not added.
NUPI: The system does not check for duplicate NUPI values. If a
row already exists with the NUPI value, the new row is added to the
same AMP.
Hash Map
A hash map is an array that associates hash bucket numbers with
specific AMPs. While it has a limited number of hash buckets, there
are enough hash buckets to minimize the number of hash collisions
(when the hashing algorithm calculates the same row hash value for
two different rows).
The hash map is a GDO (globally distributed object), which is a
file that is copied and distributed to every node in the system. If
an AMP is executing a request that requires information in a GDO,
it can access the copy of the GDO on its node.
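The row distribution steps and the hash map lookup can be sketched together. Several details here are assumptions made for illustration and are not stated in this module: SHA-256 stands in for the hashing algorithm, the row hash is treated as 32 bits whose high-order 16 bits select one of 65,536 buckets, and the system has 4 AMPs.

```python
import hashlib

NUM_BUCKETS = 65536   # assumed bucket count, for illustration only
NUM_AMPS = 4          # hypothetical system size

# Hash map: an array that associates each hash bucket number with an AMP.
HASH_MAP = [bucket % NUM_AMPS for bucket in range(NUM_BUCKETS)]

def row_hash(pi_value):
    """Toy 32-bit row hash (SHA-256 stands in for the real hashing algorithm)."""
    digest = hashlib.sha256(repr(pi_value).encode()).digest()
    return int.from_bytes(digest[:4], "big")

def target_amp(pi_value):
    bucket = row_hash(pi_value) >> 16   # assumption: high-order 16 bits pick the bucket
    return HASH_MAP[bucket]

class Table:
    def __init__(self, unique_pi):
        self.unique_pi = unique_pi
        self.amps = [[] for _ in range(NUM_AMPS)]   # rows stored per AMP

    def insert(self, pi_value, row):
        amp = target_amp(pi_value)
        if self.unique_pi and any(pi == pi_value for pi, _ in self.amps[amp]):
            return False          # UPI: value already exists, row is not added
        self.amps[amp].append((pi_value, row))
        return True               # NUPI: duplicates are stored on the same AMP
```

Because equal Primary Index values always produce the same row hash, the duplicate check for a UPI only ever needs to look at one AMP.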
Duplicate Row Hash Values
It is possible for the hashing
algorithm to end up with the same row hash value for two different
rows. There are two ways this could happen:
Duplicate NUPI values: If a Non-Unique Primary Index is used,
duplicate NUPI values will produce the same row hash value.
Hash synonym: Also called a hash collision, this occurs when the
hashing algorithm calculates an identical row hash value for two
different Primary Index values. Hash synonyms are very rare. When
using a Unique Primary Index, you will still get uniform data
distribution.
To differentiate each row in a table, every row is assigned a
unique Row ID. The Row ID is the combination of the row hash value
and a uniqueness value.
Row ID = Row Hash Value + Uniqueness Value
The uniqueness value is used to differentiate between rows whose
Primary Index values generate identical row hash values. In most
cases, only the row hash value portion of the Row ID is needed to
locate the row.
When each row is inserted, the AMP adds the row ID, stored as a
prefix of the row. The first row inserted with a particular row
hash value is assigned a uniqueness value of 1. The uniqueness
value is incremented by 1 for any additional rows inserted with the
same row hash value.
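The way uniqueness values are assigned can be sketched as follows. The `Amp` class and the sample row hash values are hypothetical; only the numbering rule (start at 1, increment by 1 per row hash) comes from the text.

```python
from collections import defaultdict

class Amp:
    """Toy AMP that prefixes each stored row with a Row ID."""
    def __init__(self):
        self._last_uniq = defaultdict(int)  # row hash -> last uniqueness value used
        self.rows = []                      # list of (row_id, row)

    def insert(self, row_hash, row):
        self._last_uniq[row_hash] += 1      # first row gets 1, then 2, 3, ...
        row_id = (row_hash, self._last_uniq[row_hash])
        self.rows.append((row_id, row))
        return row_id

amp = Amp()
print(amp.insert(646, "row A"))   # (646, 1)
print(amp.insert(646, "row B"))   # (646, 2)
print(amp.insert(912, "row C"))   # (912, 1)
```

The pair of row hash and uniqueness value is distinct for every row on the AMP, even when duplicate NUPI values or hash synonyms produce the same row hash.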
Duplicate Rows
A duplicate row is a row in a table whose column
values are identical to another row in the same table. In other
words, the entire row is the same, not just an index. Although
duplicate rows are not allowed in the relational model (because
every Primary Key must be unique), Teradata does allow duplicate
rows because the capability is a part of the ANSI standard.
Because duplicate rows are allowed in Teradata, how does this
affect the UPI, which, by definition, is unique? When you create a
table, the following definitions determine whether or not it can
contain duplicate rows:
MULTISET tables: May contain duplicate rows. Teradata will not
check for duplicate rows.
SET tables: The default. Teradata checks for and does not permit
duplicate rows. If a SET table is created with a Unique Primary
Index, the check for duplicate rows is replaced by a check for
duplicate index values.
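The checks applied on insert can be summarized in a short sketch. The function and its parameters are hypothetical; rows are modeled as dictionaries and the table as a list.

```python
def insert_row(table, row, kind="SET", upi_cols=None):
    """Append row (a dict) to table (a list) if the table's rules allow it."""
    if upi_cols:
        # With a UPI, only the index values are compared.
        key = tuple(row[c] for c in upi_cols)
        if any(tuple(r[c] for c in upi_cols) == key for r in table):
            return False
    elif kind == "SET" and row in table:
        # SET table without a UPI: the entire row is compared.
        return False
    table.append(row)   # MULTISET without a UPI: no duplicate check at all
    return True
```

Comparing only the index values is cheaper than comparing whole rows, which is why the UPI check replaces the duplicate-row check in a SET table.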
Accessing a Row With a Primary Index
When a user submits an SQL
request using the table name and Primary Index, the request becomes
a one-AMP operation, which is the most direct and efficient way for
the system to find a row. The process is explained below.
Hashing Process
1. The Primary Index value goes into the hashing
algorithm.
2. The output of the hashing algorithm is the row hash
value.
3. The hash map points to the specific AMP where the row
resides.
4. The PE sends the request directly to the identified AMP.
5. The AMP locates the row(s) on its vdisk.
6. The row data is sent over the BYNET to the PE, and the PE
sends the answer set on to the client application.
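The six steps above can be sketched as a one-AMP lookup. As before, this is a toy model: SHA-256 and a modulo stand in for the hashing algorithm and hash map, and the 4-AMP system and the employee rows are hypothetical.

```python
import hashlib

NUM_AMPS = 4  # hypothetical system size

def amp_for(pi_value):
    """Toy stand-in for the hashing algorithm plus hash map lookup."""
    digest = hashlib.sha256(repr(pi_value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS

# Each AMP holds a dict of PI value -> list of rows (its "vdisk").
amps = [dict() for _ in range(NUM_AMPS)]

def store(pi_value, row):
    amps[amp_for(pi_value)].setdefault(pi_value, []).append(row)

def select_by_pi(pi_value):
    """PE side: route the request to exactly one AMP and return its rows."""
    amp = amp_for(pi_value)               # steps 1-3: hash, then hash map
    rows = amps[amp].get(pi_value, [])    # steps 4-5: only this AMP searches
    return amp, rows                      # step 6: answer set goes back to the PE

for emp_id, name in [(1001, "Ryan"), (1002, "Adams"), (1003, "Smith")]:
    store(emp_id, {"employee_id": emp_id, "last_name": name})

print(select_by_pi(1002)[1])   # [{'employee_id': 1002, 'last_name': 'Adams'}]
```

No other AMP is consulted, which is what makes Primary Index access the most direct path to a row.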
Choosing a Unique or Non-Unique Primary Index
Criteria for
choosing a Primary Index include:
Uniqueness: A UPI guarantees even data distribution, so it is often
a good choice. A NUPI with few duplicate values could provide good
(if not perfectly uniform) distribution, and might meet the other
criteria better.
Use in value access: Retrievals, updates, and deletes that
specify the Primary Index are much faster than those that do not.
Because a Primary Index is a known access path to the data, it is
best to choose column(s) that will be frequently used for access.
For example, the following SQL statement would directly access a
row based on the equality WHERE clause:
SELECT * FROM employee WHERE employee_ID = 'ABC456789';
A NUPI may be a better choice if the access is based on another,
mostly unique column. For example, the table may be used by the
Mail Room to track package delivery. In that case, a column
containing room numbers or mail stops may not be unique if
employees share offices, but it may be a better choice for access.
Use in join access: SQL requests that use a JOIN statement
perform the best when the join is done on a Primary Index. Consider
Primary Key and Foreign Key columns as potential candidates for
Primary Indexes. For example, if the Employee table and the Payroll
table are related by the Employee ID column, then the Employee ID
column could be a good Primary Index choice for one or both of the
tables.
Non-volatile values: Look for columns where the values do not
change frequently. For example, in an Invoicing table, the
outstanding balance column for all customers probably has few
duplicates, but probably changes too frequently to make a good
Primary Index. A customer ID, statement number, or other more
stable columns may be better choices.
When choosing a Primary Index, try to find the column(s) that
best fit these criteria and the business need.
Questions
What do you think are key considerations in choosing a
Primary Index? (Choose three.)
A. Column(s) containing unique (or nearly unique) values for
uniform distribution.
B. Column(s) with values in sequential order for best load and
access performance.
C. Column(s) frequently used in queries to access data or to
join tables.
D. Column(s) with values that are stable (do not change
frequently), to minimize redistribution of table rows.
E. Column(s) with many duplicate values for redundancy.
Partitioned Primary Index
In Teradata RDBMS V2R5 there is a new
indexing mechanism called Partitioned Primary Index (PPI). PPI is
used to improve performance for large tables when you submit
queries that specify a range constraint. PPI allows you to reduce
the number of rows to be processed by using a new technique called
partition elimination. PPI will increase performance for
incremental data loads, deletes, and data access when working with
large tables with range constraints.
How Does PPI Work?
Data distribution with PPI is still based on
the Primary Index:
Primary Index -> Hash Value -> Determines which AMP gets the row
With PPI, the ORDER in which the rows are stored on the AMP is
affected. Using the traditional method, Non-Partitioned Primary
Index (NPPI), the rows are stored in row hash order.
4 AMPs with Orders Table Defined with NPPI
Using PPI, the rows are stored first by partition and then by
row hash. In our example, there are four partitions. Within the
partitions, the rows are stored in row hash order.
4 AMPs with Orders Table Defined with PPI on O_Date
Data Storage Using PPI
To store rows using PPI, specify
partitioning in the CREATE TABLE statement. The query will run
through the hashing algorithm as normal, and come out with the Base
Table ID, the Partition number(s), the Row Hash, and the Primary
Index values.
Access Without a PPI
Let's say you have a table with Store
information by Location and did not use a PPI. If you query on
Location 3 on this NPPI table, the entire table will be scanned to
find records for Location 3 (Full Table Scan).
QUERY: SELECT * FROM Employee_NPPI WHERE Location_Number = 3;
PLAN: ALL-AMPs - Full Table Scan
Access With a PPI
In the same example for a PPI table, you would
partition the table with as many Locations as you have (or will
soon have). Then if you query on Location 3, each AMP
will use partition elimination and each AMP only has to scan
partition 3 for the query. This query will run much faster than the
Full Table Scan in the previous example.
QUERY: SELECT * FROM Employee WHERE Location_Number = 3;
PLAN: ALL-AMPs - Single Partition Scan
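Partition elimination on a single AMP can be sketched as follows. This is a toy model with hypothetical names: the partition number is taken to be the location number, and the row hash values are made-up integers rather than real hash output.

```python
from collections import defaultdict

class AmpPPI:
    """Toy AMP that stores rows by partition first, then by row hash."""
    def __init__(self):
        self.partitions = defaultdict(list)   # partition number -> [(row_hash, row)]

    def insert(self, partition, row_hash, row):
        self.partitions[partition].append((row_hash, row))
        self.partitions[partition].sort(key=lambda pair: pair[0])  # row hash order

    def select_location(self, location, eliminate=True):
        """Return (matching rows, number of rows read on this AMP)."""
        if eliminate:
            # Partition elimination: read only the one relevant partition.
            candidates = self.partitions.get(location, [])
        else:
            # No elimination: every row on the AMP is read.
            candidates = [p for part in self.partitions.values() for p in part]
        matches = [row for _, row in candidates if row["location"] == location]
        return matches, len(candidates)

amp = AmpPPI()
for i in range(12):
    loc = i % 4 + 1                                   # four locations / partitions
    amp.insert(loc, (i * 37) % 1000, {"location": loc, "emp": i})

print(amp.select_location(3))                    # reads 3 rows
print(amp.select_location(3, eliminate=False))   # reads all 12 rows
```

Both calls return the same three rows; the difference is how many rows the AMP has to read to find them.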
Secondary Index (SI)
A Secondary Index is an alternate data
access path. It allows you to access the data without having to do
a full table scan. Secondary indexes do not affect how rows are
distributed among the AMPs.
You can drop and recreate secondary indexes dynamically, as they
are needed. Unlike Primary Indexes, Secondary Indexes are stored in
separate subtables, which take extra disk space and require
maintenance that the system handles automatically. Secondary
Indexes therefore do consume some system resources.
Question
In what instances would it be a good idea to define a secondary
index for a table? (This information will be covered in this
module, but here is a preview.)
1. The Primary Index exists for even data distribution and data
access, but a Secondary Index is defined to efficiently generate
monthly reports based on a different set of columns.
2. The Product table is accessed by the retailer (who accesses
data based on the retailer's product code column), and by a vendor
(who accesses the same data based on the vendor's product code
column).
3. The table already has a Unique Primary Index, but a second
column must also have unique values. The column is specified as a
Unique Secondary Index (USI) to enforce uniqueness on the second
column.
4. All of the above.
Secondary Index Rules
The following rules govern how Secondary Indexes must be defined
and how they function:
Rule 1: Secondary Indexes are optional.
Rule 2: Secondary Index values can be unique or non-unique.
Rule 3: Secondary Index values can be NULL.
Rule 4: Secondary Index values can be modified.
Rule 5: Secondary Indexes can be changed.
Rule 6: A Secondary Index has a limit of 16 columns.
Rule 1: Optional SI
While a Primary Index is required, a
Secondary Index is optional. If one path to the data is sufficient,
no Secondary Index need be defined.
You can define 0 to 32 Secondary Indexes on a table for multiple
data access paths. Different groups of users may want to access the
data in various ways. You can define a Secondary Index for each
heavily used access path.
Rule 2: Unique or Non-Unique SI
Like Primary Indexes, Secondary
Indexes can be unique or non-unique.
A Unique Secondary Index (USI) serves two possible purposes:
Enforces uniqueness in a column or group of columns. The
database will check USIs to see if the values are unique. For
example, if you have chosen different columns for the Primary Key
and Primary Index, you can make the Primary Key a USI to enforce
uniqueness on the Primary Key.
Speeds up access to a row. Accessing a row with a USI requires
one or two AMPs, which is less direct than a UPI (one AMP) access,
but more efficient than a full table scan.
A Non-Unique Secondary Index (NUSI) is usually specified to
prevent full table scans, in which every row of a table is read.
The Optimizer determines whether a full table scan or NUSI access
will be more efficient, then picks the best method. Accessing a row
with a NUSI requires all AMPs.
Rule 3: SI Can Be NULL
As with the Primary Index, the Secondary
Index column may contain NULL values.
Rule 4: SI Value Can Be Modified
The values in the Secondary
Index column may be modified as needed.
Rule 5: SI Can Be Changed
Secondary Indexes can be changed.
Secondary Indexes can be created and dropped dynamically as needed.
When the index is dropped, the system physically drops the subtable
that contained it.
Rule 6: SI Has 16-Column Limit
You can designate a Secondary
Index that is composed of 1 to 16 columns. To use the Secondary
Index below, the user would specify both Budget and Manager
Employee Number.
In Teradata RDBMS V2R5, the maximum number of columns in an
index is increased to 64.
Using Secondary Indexes
In the table below, users will be
accessing data based on the Department Name column. The values in
that column are unique, so it has been made a USI for efficient
access. In addition, the company wants reports on how many
departments each manager is responsible for, so the Manager
Employee Number can also be made a secondary index. It has
duplicate values, so it is a NUSI.
How Secondary Indexes Are Stored
Secondary indexes are stored in
index subtables. The subtables for USIs and NUSIs are distributed
differently:
USI: The Unique Secondary Indexes are hash distributed
separately from the data rows, based on their USI value. (As you
remember, the base table rows are distributed based on the Primary
Index value). The subtable row may be stored on the same AMP or a
different AMP than the base table row, depending on the hash
value.
NUSI: The Non-Unique Secondary Indexes are stored in subtables
on the same AMPs as their data rows. This reduces activity on the
BYNET and essentially makes NUSI queries an AMP-local operation -
the processing for the subtable and base table are done on the same
AMP. However, in all NUSI access requests, all AMPs are activated
because the non-unique value may be found on multiple AMPs.
Data Access Without a Primary Index
You can submit a request
without specifying a Primary Index and still access the data. The
following access methods do not use a Primary Index:
Unique Secondary Index (USI)
Non-Unique Secondary Index (NUSI)
Full Table Scan
Accessing Data with a USI
When a user submits an SQL request
using the table name and a Unique Secondary Index, the request
becomes a one- or two-AMP operation, as explained below.
USI Access
1. The SQL is submitted, specifying a USI (in this
case, a customer number of 56).
2. The hashing algorithm calculates a row hash value (in this
case, 602).
3. The hash map points to the AMP containing the subtable row
corresponding to the row hash value (in this case, AMP 2).
4. The subtable indicates where the base row resides (in this
case, row 778 on AMP 4).
5. The message goes back over the BYNET to the AMP with the row
and the AMP accesses the data row (in this case, AMP 4).
6. The row is sent over the BYNET to the PE, and the PE sends
the answer set on to the client application.
As shown in the example above, accessing data with a USI is
typically a two-AMP operation. However, it is possible that the
subtable row and base table row could end up being stored on the
same AMP, because both are hashed separately. If both were on the
same AMP, the USI request would be a one-AMP operation.
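The USI access path can be sketched as two hash lookups. This is a toy model under stated assumptions: SHA-256 and a modulo stand in for the hashing algorithm and hash map, the system has 4 AMPs, and the subtable stores a hypothetical (base AMP, PI value) pointer where the real subtable row identifies the base row.

```python
import hashlib

NUM_AMPS = 4  # hypothetical system size

def amp_for(value):
    """Toy stand-in for hashing plus hash map lookup."""
    digest = hashlib.sha256(repr(value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS

base = [dict() for _ in range(NUM_AMPS)]      # per AMP: PI value -> base row
usi_sub = [dict() for _ in range(NUM_AMPS)]   # per AMP: USI value -> (base AMP, PI value)

def insert(pi_value, usi_value, row):
    base_amp = amp_for(pi_value)               # base row is hashed on the Primary Index
    base[base_amp][pi_value] = row
    # The subtable row is hashed separately, on the USI value.
    usi_sub[amp_for(usi_value)][usi_value] = (base_amp, pi_value)

def select_by_usi(usi_value):
    """Return (row or None, set of AMPs touched)."""
    sub_amp = amp_for(usi_value)               # AMP holding the subtable row
    pointer = usi_sub[sub_amp].get(usi_value)
    if pointer is None:
        return None, {sub_amp}
    base_amp, pi_value = pointer
    return base[base_amp][pi_value], {sub_amp, base_amp}

insert("emp-7", 56, {"customer_number": 56, "name": "example"})
row, touched = select_by_usi(56)
print(row, "AMPs touched:", len(touched))
```

The set of AMPs touched has two members in the typical case and one member when the subtable row and the base row happen to hash to the same AMP.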
Accessing Data with a NUSI
When a user submits an SQL request
using the table name and a Non-Unique Secondary Index, the request
becomes an all-AMP operation, as explained below.
NUSI Access
1. The SQL is submitted, specifying a NUSI (in this
case, a last name of "Adams").
2. The hashing algorithm calculates a row hash value for the
NUSI (in this case, 567).
3. All AMPs are activated to find the hash value of the NUSI in
their index subtables. The AMPs whose subtables contain that value
become the participating AMPs in this request (in this case, AMP1
and AMP2). The other AMPs discard the message.
4. Each participating AMP locates the row IDs (row hash value
plus uniqueness value) of the base rows corresponding to the hash
value (in this case, the base rows corresponding to hash value 567
are 640, 222, and 115).
5. The participating AMPs access the base table rows, which are
located on the same AMP as the NUSI subtable (in this case, one row
from AMP 1 and two rows from AMP 2).
6. The qualifying rows are sent over the BYNET to the PE, and
the PE sends the answer set on to the client application (in this
case, three qualifying rows are returned).
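The NUSI access path can be sketched as an all-AMP request in which only some AMPs participate. This is a toy model: SHA-256 and a modulo stand in for the hashing algorithm and hash map, the 4-AMP system and sample rows are hypothetical, and the subtable holds PI values where the real subtable holds row IDs.

```python
import hashlib

NUM_AMPS = 4  # hypothetical system size

def amp_for(value):
    """Toy stand-in for hashing plus hash map lookup."""
    digest = hashlib.sha256(repr(value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS

base = [dict() for _ in range(NUM_AMPS)]       # per AMP: PI value -> base row
nusi_sub = [dict() for _ in range(NUM_AMPS)]   # per AMP: NUSI value -> [local PI values]

def insert(pi_value, row, nusi_col):
    amp = amp_for(pi_value)
    base[amp][pi_value] = row
    # The NUSI subtable entry lives on the same AMP as its base row.
    nusi_sub[amp].setdefault(row[nusi_col], []).append(pi_value)

def select_by_nusi(value):
    """Return (qualifying rows, list of participating AMPs)."""
    rows, participating = [], []
    for amp in range(NUM_AMPS):                 # every AMP receives the request
        local_keys = nusi_sub[amp].get(value)
        if not local_keys:
            continue                            # no match here: AMP discards the message
        participating.append(amp)
        rows.extend(base[amp][k] for k in local_keys)   # AMP-local base row access
    return rows, participating

for emp_id, name in [(1, "Adams"), (2, "Smith"), (3, "Adams"), (4, "Ryan"), (5, "Adams")]:
    insert(emp_id, {"employee_id": emp_id, "last_name": name}, "last_name")

rows, participating = select_by_nusi("Adams")
print(len(rows), "rows from AMPs", participating)
```

Each participating AMP resolves its subtable entries against base rows it already holds, so no row data moves between AMPs until the answer set is returned.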
Accessing Data Without Indexes
In Teradata, you can access data
on any column, whether that column is an index or not. You can ask
any question, of any data, at any time.
If the request does not use a defined index, Teradata does a
full table scan. A full table scan is another way to access data
without using Primary or Secondary Indexes. In evaluating an SQL
request, the Optimizer examines all possible access methods and
chooses the one it believes to be the most efficient.
While Secondary Indexes generally provide a more direct access
path, in some cases the Optimizer will choose a full table scan
because it is more efficient. A request could turn into a full
table scan when:
An SQL request searches on a NUSI column with many duplicates.
For example, if a request using last names in a Customer database
searched on the very prevalent "Smith" in the United States, then
the Optimizer may choose a full table scan to efficiently find all
the many matching rows in the result set.
An SQL request uses a non-equality WHERE clause on an index
column. For example, if a request searched an Employee database for
all employees whose annual salary is greater than $100,000, then a
full table scan would be used, even if the Salary column is an
index. In this example, the full table scan can be avoided by using
an equality WHERE clause on a defined index column.
An SQL request uses a range WHERE clause on an index column. For
example, if a request searched an Employee database for all
employees hired between January 2001 and June 2001, then a full
table scan would be used, even if the Hire_Date column is an
index.
For all requests, you must specify a value for each column in
the index or Teradata will do a full table scan. A full table scan
is an all-AMP operation, and each data row is accessed only once.
As long as the choice of Primary Index has caused the table rows to
distribute evenly across all of the AMPs, the parallel processing
of the AMPs working simultaneously can accomplish the full table
scan quickly.
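A full table scan can be sketched as every AMP scanning its own rows at the same time. In this illustration, Python threads stand in for AMPs, and the sample rows and the salary predicate are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_amp(rows, predicate):
    """One AMP reads each of its rows exactly once and keeps the matches."""
    return [row for row in rows if predicate(row)]

def full_table_scan(amps, predicate):
    """All-AMP operation: every AMP scans its own rows in parallel."""
    with ThreadPoolExecutor(max_workers=len(amps)) as pool:
        partial_results = list(pool.map(scan_amp, amps, [predicate] * len(amps)))
    # The partial results are merged into one answer set.
    return [row for part in partial_results for row in part]

amps = [
    [{"salary": 90000}, {"salary": 120000}],
    [{"salary": 150000}],
    [],
    [{"salary": 50000}],
]
result = full_table_scan(amps, lambda row: row["salary"] > 100000)
print(result)   # [{'salary': 120000}, {'salary': 150000}]
```

The elapsed time is governed by the AMP with the most rows, which is why even distribution matters for full table scans as much as for indexed access.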
While full table scans are impractical and even disallowed on
some commercial database systems, Teradata routinely permits ad hoc
queries with full table scans.
Summary of Keys and Indexes
Some fundamental differences between
Keys and Indexes are shown below:
Keys | Indexes
A relational modeling convention used in a logical data model. | A Teradata mechanism used in a physical database design.
Uniquely identify a row (Primary Key). | Used for row distribution (Primary Index).
Establish relationships between tables (Foreign Key). | Used for row access (Primary Index and Secondary Index).
While most commercial database systems use the Primary Key as a
way to retrieve data, a Teradata system does not. In a Teradata
system, you use the Primary Key only when designing a database, as
a mechanism for maintaining referential integrity according to
relational theory. The Teradata RDBMS itself does not require keys
in order to manage the data, and can function fully with no
awareness of Primary Keys.
The Teradata parallel architecture uses Primary Indexes to
distribute and access the data rows. A Primary Index is always
required when creating a Teradata table.
A Primary Index may include the same columns as the Primary Key,
but does not have to. In some cases, you may want the Primary Key
and Primary Index to be different. For example, a credit card
account number may be a good Primary Key, but customers may prefer
to use a different kind of identification to access their
accounts.
Rules for Keys and Indexes
A summary of the rules for keys (in
the relational model) and indexes (in the Teradata RDBMS) is shown
below.
Rule | Primary Key | Foreign Key | Primary Index | Secondary Index
1 | One PK | Multiple FKs | One PI | 0 to 32 SIs
2 | Unique values | Unique or non-unique | Unique or non-unique | Unique or non-unique
3 | No NULLs | NULLs allowed | NULLs allowed | NULLs allowed
4 | Values should not change | Values may be changed | Values may be changed (redistributes row) | Values may be changed
5 | Column should not change | Column may change | Column cannot be changed (drop and recreate table) | Index may be changed (drop and recreate index)
6 | No column limit | No column limit | 16-column limit | 16-column limit
7 | n/a | FK must exist as PK in the related table | n/a | n/a
Defining Primary and Foreign Keys in Teradata
Although Primary
Indexes are required and Primary Keys are not, you do have the
option to define a Primary Key or Foreign Key for any table. When
you define a Primary Key in a Teradata table, the RDBMS will
implement the specified column(s) as an index. Because a Primary
Key requires unique values, a defined Primary Key is implemented as
one of the following:
Unique Primary Index (if the DBA did not specify the Primary
Index in the CREATE TABLE statement)
Unique Secondary Index (if columns other than the Primary Index
are chosen)
When a Primary Key is defined in Teradata SQL and implemented as
an index, the rules that govern that type of index now apply to the
Primary Key. For example, in relational theory, there is no limit
to the number of columns in a Primary Key. However, if you specify
a Primary Key in Teradata SQL, the 16-column limit for indexes now
applies to that Primary Key.
In Teradata RDBMS V2R5, the maximum number of columns in an
index is increased to 64.
Questions
What provides uniform data distribution through the hashing
algorithm?
UPI
NUPI
Both UPI and NUPI
Neither UPI nor NUPI
The output from the hashing algorithm is the:
hash map
uniqueness value
row ID
row hash
Choose the appropriate answers from the drop-down boxes that
complete each sentence:
Accessing a row with a Unique Secondary Index (USI) typically
requires one/two/all AMP(s).
Accessing a row with a Non-Unique Secondary Index (NUSI)
requires one/two/all AMP(s).
A full table scan accesses one/two/all row(s).
Accessing a row with a Unique Primary Index (UPI) accesses
one/two/all row(s) on one AMP.
Accessing a row with a Non-Unique Primary Index (NUPI) accesses
multiple rows on one/two/all AMP(s).
The row ID helps the system to locate a row in case of a(n):
even distribution of rows.
Unique Primary Index.
multi-AMP request.
hash synonym.