CHAPTER 8 SORTING
【Learning Content】 BASIC CONCEPTS
1. Introduction and Notation
2. Insertion Sort
3. Selection Sort
4. Shell Sort
5. Lower Bounds
6. Divide-and-Conquer Sorting
7. Mergesort for Linked Lists
8. Quicksort for Contiguous Lists
9. Heaps and Heapsort
10. Review: Comparison of Methods
Definitions of Sorting
Problem description: Given a list of records $(R_0, R_1, \ldots, R_{n-1})$ where each $R_i$ has a key value $K_i$. There is a transitive ordering relation "<" on the keys such that for any two key values $K_i$ and $K_j$, exactly one of $K_i = K_j$, $K_i < K_j$, or $K_j < K_i$ holds. Sorting is to find a permutation $\sigma$ such that $K_{\sigma(i-1)} \le K_{\sigma(i)}$ for all $0 < i \le n-1$. The desired ordering is then $(R_{\sigma(0)}, R_{\sigma(1)}, \ldots, R_{\sigma(n-1)})$.
Remarks:
No single sorting technique is the best in all cases.
If there are identical key values, then $\sigma$ is not unique.
If $\sigma_s$ is a permutation that is not only sorted but also stable (that is, if $K_i = K_j$ for $i < j$, then $R_i$ precedes $R_j$ in the sorted list), then such a $\sigma_s$ is unique. A sorting method is stable if it generates $\sigma_s$.
An internal sort sorts the list entirely in main memory.
An external sort sorts the file piece by piece in main memory.
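The stability remark can be illustrated with a short sketch. It relies on `std::stable_sort` from the C++ standard library; the `Record` struct, its `tag` field, and the function name `sort_by_key` are assumptions made for this illustration and are not part of the slides.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type: a key plus a tag that marks the original order.
struct Record {
    int key;
    std::string tag;
};

// Sorts by key only. std::stable_sort guarantees that records with equal
// keys keep their relative input order, which is the stable permutation.
std::vector<Record> sort_by_key(std::vector<Record> records) {
    std::stable_sort(records.begin(), records.end(),
                     [](const Record& a, const Record& b) { return a.key < b.key; });
    return records;
}
```

For the input keys 25, 8, 25, the two records with key 25 come out in the same relative order in which they went in.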
Sortable_lists
template <class Record>
class Sortable_list: public List<Record> {
public:
   // Add prototypes for sorting methods here.
private:
   // Add prototypes for auxiliary functions here.
};
Insertion Sort
template <class Record>
void Sortable_list<Record>::insertion_sort()
/* Post: The entries of the Sortable_list have been rearranged so that
         the keys in all the entries are sorted into nondecreasing order.
   Uses: Methods for the class Record; the contiguous List implementation of Chapter 6 */
{
   int first_unsorted;   // position of first unsorted entry
   int position;         // searches sorted part of list
   Record current;       // holds the entry temporarily removed from list
   for (first_unsorted = 1; first_unsorted < count; first_unsorted++)
      if (entry[first_unsorted] < entry[first_unsorted - 1]) {
         position = first_unsorted;
         current = entry[first_unsorted];   // Pull unsorted entry out of the list.
         do {   // Shift all entries until the proper position is found.
            entry[position] = entry[position - 1];
            position--;   // position is empty.
         } while (position > 0 && entry[position - 1] > current);
         entry[position] = current;   // Drop the entry into the empty position.
      }
}
The pivot key is placed at the right position with respect to the sorted sub-list.
Insertion Sort Linked Version
Insertion Sort
A record $R_i$ is LOO (Left Out of Order) iff $K_i < \max_{0 \le j < i} \{K_j\}$.
If there are $k$ records that are LOO, then the worst case is
$$T_p = O(n) + O\big((n-1) + (n-2) + \cdots + (n-k)\big) = O\big((k+1)\,n\big).$$
Good for $k \ll n$ and $n$ not too large.
If $k = n$, $T_p = O(n^2)$.
Average $T_p = O(n^2)$.
Variations:
Replace the sequential search by binary search. The search is faster, but the records still have to be moved.
Use a linked list for representation. Insertion becomes simple, but we have to use sequential search.
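The first variation can be sketched as follows. This is a minimal illustration on a `std::vector<int>` rather than the `Sortable_list` class; the function name `binary_insertion_sort` is an assumption, and `std::upper_bound` stands in for a hand-written binary search.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Insertion sort that finds the insertion point by binary search.
// Comparisons drop to O(log i) per entry, but the moves stay O(i),
// so the worst case is still quadratic.
void binary_insertion_sort(std::vector<int>& entry) {
    for (std::size_t i = 1; i < entry.size(); ++i) {
        auto unsorted = entry.begin() + static_cast<std::ptrdiff_t>(i);
        int current = *unsorted;
        // First position in the sorted part whose key is greater than current;
        // inserting there keeps equal keys in their original order (stable).
        auto pos = std::upper_bound(entry.begin(), unsorted, current);
        // Shift [pos, unsorted) one slot to the right, then fill the gap.
        std::move_backward(pos, unsorted, unsorted + 1);
        *pos = current;
    }
}
```

The shift performed by `std::move_backward` is exactly the record movement that the binary search cannot avoid.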
Selection Sort
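Only the slide titles for this topic survive in the transcript. As a hedged illustration of the general technique, and not necessarily the version shown on the slides, selection sort can be sketched on a `std::vector<int>` like this; the function name `selection_sort` and the choice to select the largest key on each pass are assumptions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// On each pass, find the largest key in the unsorted prefix and swap it
// into the last unsorted position. Uses O(n^2) comparisons but at most
// n - 1 swaps.
void selection_sort(std::vector<int>& entry) {
    for (std::size_t unsorted = entry.size(); unsorted > 1; --unsorted) {
        std::size_t max_index = 0;
        for (std::size_t i = 1; i < unsorted; ++i)
            if (entry[max_index] < entry[i]) max_index = i;
        std::swap(entry[max_index], entry[unsorted - 1]);
    }
}
```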
Shell Sort
Shell sort is also known as diminishing increment sort. The basic idea is as follows. Given a sequence of n objects to be sorted, first choose an integer gap < n as the interval and divide the objects into gap subsequences, so that all objects that lie gap positions apart fall into the same subsequence. Apply straight insertion sort to each subsequence. Then shrink the interval, for example gap = gap/2, and repeat the subsequence division and sorting. Continue until gap == 1, at which point all objects are in a single sequence and one last insertion sort finishes the job.
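A minimal sketch of the gap-halving scheme on a `std::vector<int>`. The starting value gap = n/2 and the function name `shell_sort` are assumptions; the slides only require gap < n and suggest gap = gap/2 as one way to shrink it.

```cpp
#include <cstddef>
#include <vector>

// Shell sort with the gap sequence n/2, n/4, ..., 1.
void shell_sort(std::vector<int>& entry) {
    for (std::size_t gap = entry.size() / 2; gap > 0; gap /= 2) {
        // One gapped insertion sort pass: entries that lie gap apart form a
        // subsequence, and each subsequence is insertion-sorted in place.
        for (std::size_t i = gap; i < entry.size(); ++i) {
            int current = entry[i];
            std::size_t position = i;
            while (position >= gap && entry[position - gap] > current) {
                entry[position] = entry[position - gap];
                position -= gap;
            }
            entry[position] = current;
        }
    }
}
```

The final pass with gap == 1 is an ordinary insertion sort, which guarantees the result is sorted; the earlier passes only reduce how far entries have to move in that last pass.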
Lower Bounds of Sort
Optimal Sorting Time
【Theorem】 Any algorithm that sorts by comparisons only must have a worst-case computing time of $\Omega(n \log_2 n)$.
Proof:
[Figure: Decision tree for insertion sort on R0, R1, and R2. Each internal node compares a pair of keys (K0 with K1, K1 with K2, or K0 with K2) and branches on T or F; the six leaves (stop) are the permutations [0,1,2], [0,2,1], [2,0,1], [1,0,2], [1,2,0], and [2,1,0].]
When sorting $n$ distinct elements, there are $n!$ different possible results. Thus any decision tree must have at least $n!$ leaves.
If the height of the tree is $k$, then $n! \le 2^{k-1}$ (the number of leaves in a complete binary tree), so $k \ge \log_2 n! + 1$.
Since $n! \ge (n/2)^{n/2}$, we have $\log_2 n! \ge (n/2)\log_2(n/2) = \Theta(n \log_2 n)$.
Therefore $T_p = k \ge c \cdot n \log_2 n$.
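To make the bound concrete, the sketch below computes the smallest number of comparison levels a decision tree needs in order to have at least n! leaves, that is, $\lceil \log_2 n! \rceil$. The function name `min_worst_case_comparisons` is an assumption for illustration. The value counts comparisons along the longest path; the height k used in the proof counts nodes and is one larger.

```cpp
#include <cstdint>

// Smallest h with 2^h >= n!, i.e. ceil(log2(n!)).
// Valid for 0 <= n <= 20 only, because 21! does not fit in 64 bits.
int min_worst_case_comparisons(int n) {
    std::uint64_t factorial = 1;
    for (int i = 2; i <= n; ++i) factorial *= static_cast<std::uint64_t>(i);
    int h = 0;
    std::uint64_t leaves = 1;   // 2^h, the most leaves a tree with h comparison levels can have
    while (leaves < factorial) {
        leaves *= 2;
        ++h;
    }
    return h;
}
```

For n = 3 the result is 3, which matches the depth of the decision tree for R0, R1, and R2.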
Divide and Conquer Sorting
void Sortable_list::sort()
{
   if the list has length greater than 1 {
      partition the list into lowlist, highlist;
      lowlist.sort();
      highlist.sort();
      combine(lowlist, highlist);
   }
}
Merge Sort
Iterative Merge Sort
Sketch of the idea:
[Figure: iterative merge sort on list positions 0, 1, 2, 3, ..., n-4, n-3, n-2, n-1.]
[Figure: recursive merge sort on the keys 21 25 49 25* 16 08. The list is split in half repeatedly (recursion), and the sorted halves are merged on the way back (unwinding).]
Divide Linked List in Half
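A hedged sketch of mergesort on a singly linked list, showing one common way to divide the list in half: one pointer advances a single node for every two nodes advanced by another. The `Node` struct and the names `divide_from`, `merge_lists`, and `merge_sort` are assumptions for this illustration and may differ from the code on the slides.

```cpp
// Hypothetical singly linked node holding an integer key.
struct Node {
    int entry;
    Node* next;
};

// Cuts the list after its midpoint and returns the head of the second half.
Node* divide_from(Node* sub_list) {
    if (sub_list == nullptr) return nullptr;
    Node* midpoint = sub_list;
    Node* position = sub_list->next;   // moves two nodes per midpoint step
    while (position != nullptr) {
        position = position->next;
        if (position != nullptr) {
            midpoint = midpoint->next;
            position = position->next;
        }
    }
    Node* second_half = midpoint->next;
    midpoint->next = nullptr;
    return second_half;
}

// Merges two sorted lists. Taking from `first` on ties keeps the sort stable.
Node* merge_lists(Node* first, Node* second) {
    Node combined{0, nullptr};         // dummy head simplifies appending
    Node* last_sorted = &combined;
    while (first != nullptr && second != nullptr) {
        if (second->entry < first->entry) {
            last_sorted->next = second;
            second = second->next;
        } else {
            last_sorted->next = first;
            first = first->next;
        }
        last_sorted = last_sorted->next;
    }
    last_sorted->next = (first != nullptr) ? first : second;
    return combined.next;
}

// Recursive mergesort: divide, sort each half, merge.
Node* merge_sort(Node* sub_list) {
    if (sub_list == nullptr || sub_list->next == nullptr) return sub_list;
    Node* second_half = divide_from(sub_list);
    Node* sorted_first = merge_sort(sub_list);
    Node* sorted_second = merge_sort(second_half);
    return merge_lists(sorted_first, sorted_second);
}
```

No entries are copied; only the `next` links change, which is why the linked version needs no auxiliary array.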
Analysis of Mergesort
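Quick Sort: gives the best average Tp
Sketch of the idea: The pivot Ki is placed at the right position with respect to the whole list.
[Figure: partitioning R0, R1, R2, ..., Ri, ..., Rj, ..., Rn-2, Rn-1 with R0 as the pivot. Index i scans from the left for a key > K0 and index j scans from the right for a key < K0. While i < j, Ri and Rj are exchanged and the scan continues on the sublist Ri+1 ... Rj-1. When the indices cross, Rj marks the right position for R0, and exchanging R0 with Rj gives [ Rj R1 R2 ... Rj-1 ] R0 [ Ri ... Rn-2 Rn-1 ].]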
Smaller Universes
Partitioning the List
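A minimal sketch of quicksort with the first entry as pivot and two indices scanning toward each other. It works on a `std::vector<int>`; the function names and the exact scanning loop are assumptions and may differ from the partition code on the slides.

```cpp
#include <utility>
#include <vector>

// Sorts list[left..right] in place, using list[left] as the pivot.
void quick_sort(std::vector<int>& list, int left, int right) {
    if (left >= right) return;
    int pivot = list[left];
    int i = left;
    int j = right + 1;
    do {
        // i scans right for a key that is not smaller than the pivot.
        do { ++i; } while (i <= right && list[i] < pivot);
        // j scans left for a key that is not larger than the pivot;
        // it always stops at `left` at the latest.
        do { --j; } while (list[j] > pivot);
        if (i < j) std::swap(list[i], list[j]);
    } while (i < j);
    std::swap(list[left], list[j]);    // j is the right position for the pivot
    quick_sort(list, left, j - 1);
    quick_sort(list, j + 1, right);
}

// Convenience wrapper for the whole list.
void quick_sort(std::vector<int>& list) {
    quick_sort(list, 0, static_cast<int>(list.size()) - 1);
}
```

With this pivot choice an already sorted list produces one empty sublist at every level, which is the quadratic worst case.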
Quick Sort
Worst case: $T_p = O(n^2)$, for example if the list is already in sorted order.
Lucky case: the pivot splits the list into two halves, [ ... ] [ ... ].
$$\begin{aligned}
T(n) &= O(n) + 2\,T(n/2) \\
     &= O(n) + 2\,[\,O(n/2) + 2\,T(n/2^2)\,] = 2\,O(n) + 2^2\,T(n/2^2) \\
     &= \cdots \\
     &= k\,O(n) + 2^k\,T(n/2^k) \qquad (n/2^k = 1 \Rightarrow k = \log_2 n) \\
     &= O(n \log_2 n) + n\,T(1) \\
     &= O(n \log_2 n)
\end{aligned}$$
【Lemma 7.1】 Let $T_{avg}(n)$ be the expected time for quicksort to sort a file with $n$ records. Then there exists a constant $k$ such that $T_{avg}(n) \le k\,n \log_e n$ for $n \ge 2$.
Proof: By induction. See p.331~332.
Space: $S_{lucky} = O(\ln n)$; $S_{worst} = O(n)$.
Trees, Lists, and Heaps
If the root of the tree is in position 0 of the list, then the left and right children of the node in position k are in positions 2k + 1 and 2k + 2 of the list, respectively. If these positions are beyond the end of the list, then these children do not exist.
The Definition of Heaps
DEFINITION A heap is a list in which each entry contains a key, and, for all positions k in the list, the key at position k is at least as large as the keys in positions 2k + 1 and 2k + 2, provided these positions exist in the list.
Therefore, for heapsort, $T_p = O(n \ln n)$. Using only a fixed amount of additional storage, heapsort is slightly slower than merge sort with $O(n)$ additional space, but faster than merge sort with $O(1)$ additional space.
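A minimal heapsort sketch on a `std::vector<int>`, assuming the root sits at position 0 so that the children of position k are at 2k + 1 and 2k + 2. The names `sift_down` and `heap_sort` are assumptions for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Restores the heap property for the subtree rooted at `root`,
// looking only at entry[0 .. size-1].
void sift_down(std::vector<int>& entry, std::size_t root, std::size_t size) {
    for (;;) {
        std::size_t largest = root;
        std::size_t left = 2 * root + 1;
        std::size_t right = 2 * root + 2;
        if (left < size && entry[largest] < entry[left]) largest = left;
        if (right < size && entry[largest] < entry[right]) largest = right;
        if (largest == root) return;   // both children are no larger: done
        std::swap(entry[root], entry[largest]);
        root = largest;
    }
}

void heap_sort(std::vector<int>& entry) {
    std::size_t n = entry.size();
    // Build the heap bottom-up, starting from the last position with a child.
    for (std::size_t k = n / 2; k-- > 0; ) sift_down(entry, k, n);
    // Repeatedly move the largest key to the end and shrink the heap.
    for (std::size_t last = n; last > 1; --last) {
        std::swap(entry[0], entry[last - 1]);
        sift_down(entry, 0, last - 1);
    }
}
```

Apart from a few index variables the sort runs in place, which is the fixed amount of additional storage mentioned in the comparison with merge sort.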
Radix Sort: sorting records that have several keys
Suppose that the record $R_i$ has $r$ keys.
$K_i^j$ ::= the $j$-th key of record $R_i$
$K_i^0$ ::= the most significant key of record $R_i$
$K_i^{r-1}$ ::= the least significant key of record $R_i$
A list of records $R_0, \ldots, R_{n-1}$ is lexically sorted with respect to the keys $K^0, K^1, \ldots, K^{r-1}$ iff
$$(K_i^0, K_i^1, \ldots, K_i^{r-1}) \le (K_{i+1}^0, K_{i+1}^1, \ldots, K_{i+1}^{r-1}), \qquad 0 \le i < n-1.$$
That is, $K_i^0 = K_{i+1}^0, \ldots, K_i^l = K_{i+1}^l,\ K_i^{l+1} < K_{i+1}^{l+1}$ for some $l < r-1$, or the two records agree on all $r$ keys.
〖Example〗 A deck of cards sorted on 2 keys
$K^0$ [Suit]: ♣ < ♦ < ♥ < ♠
$K^1$ [Face value]: 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < J < Q < K < A
Sorting result: 2♣ ... A♣, 2♦ ... A♦, 2♥ ... A♥, 2♠ ... A♠
§8 Radix Sort
1. MSD (Most Significant Digit) Sort: most significant key first
Sort on $K^0$: for example, create 4 bins for the suits.
[Figure: the cards distributed into 4 bins, one per suit.]
Sort each bin independently (using insertion sort).
Stack the 4 bins.
2. LSD (Least Significant Digit) Sort: least significant key first
Sort on $K^1$: for example, create 13 bins for the face values.
[Figure: the cards distributed into 13 bins by face value, 2, 3, 4, 5, ..., A.]
Reform them into a single pile.
[Figure: the bins collected into a single pile.]
Create 4 bins and re-sort using any stable method.
Note: If the number of the least significant keys is $O(n)$, then a bin sort requires only $O(n)$ time, thus making it a very fast sorting technique.
Radix Sort
Remark: MSD or LSD can be used to sort on a single key if we interpret this key as a composite of several keys. For example, for $0 \le K \le 999$ we can break it into 3 keys: $K = K^0 \cdot 100 + K^1 \cdot 10 + K^2 \cdot 1$.
3. Radix Sort: decompose the (integer) key into digits using a radix $r$
〖Example〗
$K = 12$, $r = 10$: $K^0 = 1$, $K^1 = 2$
$K = 1100$, $r = 10$: $K^0 = K^1 = 1$, $K^2 = K^3 = 0$
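The decomposition of an integer key into digits can be sketched as below. The function name `to_digits` is an assumption for illustration; it returns the digits with the most significant first, matching the $K^0, K^1, \ldots$ numbering of the example.

```cpp
#include <algorithm>
#include <vector>

// Splits key into its digits in the given radix (radix >= 2),
// most significant digit first.
std::vector<unsigned> to_digits(unsigned key, unsigned radix) {
    std::vector<unsigned> digits;
    do {
        digits.push_back(key % radix);   // least significant digit first
        key /= radix;
    } while (key > 0);
    std::reverse(digits.begin(), digits.end());
    return digits;
}
```

Calling `to_digits(12, 10)` yields the digits 1, 2, and `to_digits(1100, 10)` yields 1, 1, 0, 0.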
The time used by radix sort is O(nk), where n is the number of items being sorted and k is the number of characters in a key.
The relative performance of radix sort to other methods will relate to the relative sizes of nk and n lg n; that is, of k and lg n.
If the keys are long but there are relatively few of them, then k is large and lg n relatively small, and other methods (such as mergesort) will outperform radix sort.
If k is small (the keys are short) and there are a large number of keys, then radix sort will be faster than any other method we have studied.
Analysis of Radix Sort
If each sort key has d digits, the "distribute" and "collect" steps must be repeated for d passes. Each pass distributes n objects and collects from radix queues, so the total time complexity is O(d(n + radix)).
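The d passes of distribution and collection can be sketched as an LSD radix sort on unsigned integer keys, with one queue per digit value. The function name `radix_sort` and its parameters are assumptions for illustration; `d` is the number of digits to process and must cover the largest key.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// LSD radix sort: d passes, each distributing n keys into `radix` queues
// and collecting them back in queue order, for O(d (n + radix)) time.
void radix_sort(std::vector<unsigned>& keys, unsigned radix, int d) {
    std::vector<std::queue<unsigned>> bins(radix);
    unsigned long long weight = 1;   // radix^pass, selects the current digit
    for (int pass = 0; pass < d; ++pass) {
        // Distribute: FIFO queues keep equal digits in arrival order,
        // which makes every pass stable.
        for (unsigned key : keys) bins[(key / weight) % radix].push(key);
        // Collect: empty the queues in digit order back into the list.
        std::size_t next = 0;
        for (auto& bin : bins) {
            while (!bin.empty()) {
                keys[next++] = bin.front();
                bin.pop();
            }
        }
        weight *= radix;
    }
}
```

Stability of each pass is what lets the order established by the less significant digits survive the later passes on more significant digits.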