Jul 15, 2015

Page 1: Map grep sort

Copyright 2014 Daina Pettit

map, grep, sort – slide 1

Streamlining and simplifying your Perl code using Map, Grep, and Sort

Daina Pettit


“Perl culture” sometimes gets shortened to “Perl cult”.*

Larry Wall

*Wall, Larry, Perl, the first postmodern computer language, Linux World [Conference], March 3, 1999

Overview
● What are map, grep, & sort and why should I care?
● map details
● grep details
● sort details
● Combining map, grep, & sort
● Advanced combinations
● Schwartzian Transform
● Orcish Maneuver
● Guttman-Rosler Transform
● Alternatives

What are they?

map, grep, & sort are iterator functions that operate on lists or arrays.

1. map performs an action on each element.

2. grep tests each element.

3. sort orders the elements.

General Form

All have similar forms.

@array = map  { exp } @list;
@array = grep { exp } @list;
@array = sort { exp } @list;

and

@array = map  exp, @list;
@array = grep exp, @list;
@array = sort      @list;
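
As a concrete illustration of the block form, here is a minimal sketch with a made-up list of numbers; the variable names are assumptions chosen for the example, not taken from the slides.

```perl
use strict;
use warnings;

my @list = ( 3, 11, 7, 20 );

my @doubled = map  { $_ * 2 }    @list;   # act on each element
my @small   = grep { $_ < 10 }   @list;   # keep elements that pass a test
my @ordered = sort { $a <=> $b } @list;   # order the elements numerically

print "@doubled\n";   # 6 22 14 40
print "@small\n";     # 3 7
print "@ordered\n";   # 3 7 11 20
```

All three take a list and return a list, which is what makes them interchangeable building blocks.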

General Form—code blocks

Damian Conway in Perl Best Practices* recommends:

“Always use a block with a map and grep”

This is a syntactic aid: the braces make it clear where the expression ends and the list begins, which helps prevent argument-grouping errors. Block enclosures do incur slightly more overhead than the expression form. Not much, but some.

*Conway, Damian, Perl Best Practices, O'Reilly Media, Sebastopol, CA, 2005, pp 169-170.

@array = map  { exp } @list;
@array = grep { exp } @list;

What is map?

Map is essentially a loop that processes a list, much like a foreach loop.

foreach $line ( @lines ) {
    $line = uc $line;
}

@lines = map uc, @lines;

@lines = map { uc } @lines;
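
To make the equivalence concrete, the following sketch runs both forms on a made-up three-element list (the sample data and variable names are assumptions for illustration).

```perl
use strict;
use warnings;

my @lines = ( 'one', 'two', 'three' );

# foreach: $line is an alias, so assigning to it changes the array in place.
my @in_place = @lines;
foreach my $line ( @in_place ) {
    $line = uc $line;
}

# map: builds and returns a new list; uc with no argument works on $_.
my @via_map = map { uc } @lines;

print "@in_place\n";   # ONE TWO THREE
print "@via_map\n";    # ONE TWO THREE
print "@lines\n";      # one two three (map left the source alone)
```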

Aside—foreach inside-out

The alternate single-line foreach (foreach written as a statement modifier) is as concise as map and slightly faster, but more cryptic.

@lines = map uc, @lines;

$_ = uc foreach @lines;

foreach ( @lines ) {
    $_ = uc;
}

What is the best way to use map?

● map is best for creating new lists.
● foreach is best for transforming a list.

@words = map { split } @lines;

foreach ( @lines ) {
    $_ = uc;
}
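
The two recommendations can be sketched side by side. The sample lines are invented, and split is given explicit arguments here only to keep the example unambiguous.

```perl
use strict;
use warnings;

my @lines = ( 'the quick brown', 'fox jumps' );

# map creates a new list: each line yields several words,
# so the result is longer than the input.
my @words = map { split ' ', $_ } @lines;

# foreach transforms the existing elements in place through the $_ alias.
$_ = uc $_ foreach @lines;

print scalar(@words), " words: @words\n";   # 5 words: the quick brown fox jumps
print "@lines\n";                           # THE QUICK BROWN FOX JUMPS
```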

Dumping out a hash alternatives

foreach ( sort keys %h ) {
    print "$_ => $h{$_}\n";
}

map {
    print "$_ => $h{$_}\n"
} sort keys %h;

print "$_ => $h{$_}\n"
    foreach sort keys %h;

map {} is list context

Damian Conway in Perl Even-Better Practices* recommends:

"Use explicitly scalar map expressions"

*Thoughtstream Pty Ltd, 2013 pp 10-11

@dates = map {
    localtime $_   # Wrong!
} @epoch_times;

@dates = map {
    scalar localtime $_
} @epoch_times;

@words = map {
    scalar split   # Wrong!
} @lines;

@words = map {
    split
} @lines;
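
The effect of context is easy to see by counting what comes back. This is a small sketch with two made-up epoch values; only the element counts are checked, because the formatted strings depend on the local time zone.

```perl
use strict;
use warnings;

my @epoch_times = ( 0, 86_400 );

# List context: localtime returns 9 fields for every timestamp.
my @fields = map { localtime $_ } @epoch_times;

# Scalar context: localtime returns one formatted string per timestamp.
my @dates = map { scalar localtime $_ } @epoch_times;

printf "%d fields, %d dates\n", scalar @fields, scalar @dates;   # 18 fields, 2 dates
```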

map {} confusion

How does perl know whether { 6 } is a code block or a partial hash? It has to guess. To force a hash, write +{ 6 } followed by a comma. The + is required or you will get a syntax error.

map +{ 6 }, @stuff; # hash
map  { 6 }  @stuff; # code block
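
A slightly more realistic sketch of the same distinction, using invented names: the first map returns anonymous hash references, the second is a code block that returns key/value pairs.

```perl
use strict;
use warnings;

my @names = ( 'ann', 'bob' );

# +{ ... } plus a comma: each element becomes an anonymous hash reference.
my @records = map +{ name => $_ }, @names;

# { ( ... ) } with no comma: a code block returning a key/value pair.
my %seen = map { ( $_ => 1 ) } @names;

print "$records[1]{name}\n";                 # bob
print join( ',', sort keys %seen ), "\n";    # ann,bob
```

Wrapping the pair in parentheses is one way to keep perl from guessing "hash" when you mean "block".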

Using map in void context
● Frowned upon.
● Incurs extra overhead on perls older than 5.8.1, which built the result list even when it was thrown away.

map {
    print "$_ => $h{$_}\n"
} sort keys %h;

Creating a hash in map

map { $age_of{$_} = -M } @files;

foreach ( @files ) {
    $age_of{$_} = -M;
}

$age_of{$_} = -M for @files;
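
Another common way to build a hash, not shown on the slide, is to have map return key/value pairs and assign the result to the hash. The sketch below substitutes length for -M (an assumption made so the example runs without any real files); %length_of and @words are made-up names.

```perl
use strict;
use warnings;

my @words = ( 'map', 'grep', 'sort' );

# Each pass returns a two-element list; the hash assignment pairs them up.
my %length_of = map { ( $_ => length $_ ) } @words;

print "$_ => $length_of{$_}\n" foreach sort keys %length_of;
```

This uses map's return value, so it avoids the void-context objection.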

Skipping in map
● Drop an item using an empty list.
● Do NOT use an explicit return.

@ones = map {
    $_ < 10 ? $_ : ();
} @numbers;

What is grep?
● Similar to Unix command-line utility grep
● Given a list, grep returns only certain items

@ones = map {
    $_ < 10 ? $_ : ();
} @numbers;

@ones = grep { $_ < 10 } @numbers;
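
The two forms above select the same elements. A small sketch with invented numbers shows this, along with grep's scalar-context behavior of returning a count.

```perl
use strict;
use warnings;

my @numbers = ( 4, 15, 9, 23, 1 );

my @via_map  = map  { $_ < 10 ? $_ : () } @numbers;   # skip with an empty list
my @via_grep = grep { $_ < 10 }           @numbers;   # same result, clearer intent

# In scalar context grep returns how many elements matched.
my $count = grep { $_ < 10 } @numbers;

print "@via_map\n";    # 4 9 1
print "@via_grep\n";   # 4 9 1
print "$count\n";      # 3
```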

Boolean Scalar Context
● Anywhere in perl where a true/false value is expected: if, while, and, or, not, &&, ||, !, etc.
● If evaluation results in 0, “0”, 0.0, “”, or undef, it is false. Everything else is true.

if (   0     ) {} # False
if ( 400     ) {} # True
if (  -1     ) {} # True
if ( "false" ) {} # True!
if ( "00"    ) {} # True!
undef $x;
if (  $x     ) {} # False
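
These rules can be checked directly by letting grep keep whatever is true. The list of test values is made up for the illustration; note that the string '0.0' is true even though the number 0.0 is false.

```perl
use strict;
use warnings;

my @values = ( 0, 400, -1, 'false', '00', '', '0', '0.0', undef );

# grep evaluates its block in boolean context, so only true values survive.
my @true = grep { $_ } @values;

print scalar(@true), " of ", scalar(@values), " values are true\n";   # 5 of 9 values are true
print join( '|', @true ), "\n";                                       # 400|-1|false|00|0.0
```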

Examples of grep
● Expression can be any valid perl expression.
● Expression is in scalar boolean context.

@ones = grep { $_ < 10 } @numbers;

@dirs = grep { -d } @files;

@no_dup = grep { ! $h{$_}++ } @old;

@errors = grep { /error/i } @log;

@true = grep { $_ } @all;
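
Two of these idioms, run on made-up data so the results are visible. The duplicate filter works because $h{$_}++ is false (0) the first time a value is seen and true afterwards.

```perl
use strict;
use warnings;

my @old = qw( b a b c a );
my %h;
my @no_dup = grep { !$h{$_}++ } @old;    # keep only the first sighting

my @log    = ( 'ok: started', 'ERROR: disk full', 'ok: done', 'Error: retry' );
my @errors = grep { /error/i } @log;     # case-insensitive match

print "@no_dup\n";                  # b a c
print scalar(@errors), " errors\n"; # 2 errors
```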

Sorting Basics

Sort can be called in three ways:

1. With no comparison directives

2. With a subroutine that returns comparison directives

3. With a code block (an anonymous subroutine) that returns comparison directives

@sorted = sort         @unsorted;
@sorted = sort   sub   @unsorted;
@sorted = sort { exp } @unsorted;

Sort needs a comparison value that tells it whether any two elements, $a and $b, are in order (negative, conventionally -1), the same (0), or out of order (positive, conventionally 1).

cmp and <=> conveniently provide this for string or numeric comparisons, respectively.

We don't have to use cmp and <=>. We just have to return a negative number, 0, or a positive number.

$a <=> $b

Basic ASCII-betical sort:

@sorted = sort @list;

Basic numeric sort:

@sorted = sort { $a <=> $b } @list;

The basic ASCII-betical sort is the same as an explicit cmp comparison:

@sorted = sort { $a cmp $b } @list;
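
The difference between the two matters as soon as the list holds numbers of different widths. A minimal sketch with made-up values:

```perl
use strict;
use warnings;

my @list = ( 10, 9, 100, 1 );

my @as_strings = sort @list;                  # compares character by character
my @as_numbers = sort { $a <=> $b } @list;    # compares numeric values

print "@as_strings\n";   # 1 10 100 9
print "@as_numbers\n";   # 1 9 10 100
```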

Sorting Basics--reverse

Reverse ASCII-betical sort:

@sorted = sort { $b cmp $a } @list;

Reverse numeric sort:

@sorted = sort { $b <=> $a } @list;

Or just use the reverse function:

@sorted = reverse sort @list;

Reverse numeric sort:

@sorted = reverse sort { $a <=> $b } @list;

Sorting Basics--subroutine

Using a subroutine instead of a code block

You can also use anonymous subroutines.

These subroutines cannot be recursive!

sub compare {
    uc ( $a ) cmp uc ( $b );
}

$comp = \&compare;

@sorted = sort $comp @list;
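
The three ways of supplying a comparison subroutine can be sketched together. The list contents and the names by_uppercase and $by_length are assumptions for this example.

```perl
use strict;
use warnings;

my @list = ( 'pear', 'Apple', 'banana', 'Cherry' );

# Named comparison subroutine; sort sets $a and $b before each call.
sub by_uppercase {
    uc($a) cmp uc($b);
}
my @by_name = sort by_uppercase @list;

# A reference to the same subroutine, held in a scalar.
my $comp   = \&by_uppercase;
my @by_ref = sort $comp @list;

# An anonymous subroutine: shortest first, ties broken by string order.
my $by_length = sub { length($a) <=> length($b) or $a cmp $b };
my @by_len    = sort $by_length @list;

print "@by_name\n";   # Apple banana Cherry pear
print "@by_len\n";    # pear Apple Cherry banana
```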

Complicated Sorting

You can sort on anything you can get to through $a and $b.

@sorted = sort {
    @array_a = split / /, $a;
    @array_b = split / /, $b;
    $array_a[5] cmp $array_b[5];
} @lines;

Sorting hash keys

@sorted_keys = sort keys %hash;

Sorting hash keys by value

@sorted_keys = sort {
    $hash{$a} cmp $hash{$b}
} keys %hash;
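
A runnable sketch of both sorts on a small invented hash. The values here are numbers, so this version uses <=> where the slide uses cmp for string values.

```perl
use strict;
use warnings;

my %hash = ( apple => 3, pear => 1, fig => 2 );

my @by_key   = sort keys %hash;
my @by_value = sort { $hash{$a} <=> $hash{$b} } keys %hash;

print "@by_key\n";     # apple fig pear
print "@by_value\n";   # pear fig apple
```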

We can sort on multiple keys, such as by year, then by month, then by day, even if the data is in mm-dd-yyyy format.

@sorted_dates = sort {
    ( $ma, $da, $ya ) = split /-/, $a;
    ( $mb, $db, $yb ) = split /-/, $b;
    $ya<=>$yb || $ma<=>$mb || $da<=>$db;
} @dates;
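
The same multi-key sort as a self-contained sketch with made-up dates. It declares the temporaries with my so it runs under use strict; the || chain moves on to the next key only when the previous comparison returns 0.

```perl
use strict;
use warnings;

my @dates = ( '12-25-2013', '01-15-2014', '01-02-2014', '07-04-2013' );

my @sorted_dates = sort {
    my ( $ma, $da, $ya ) = split /-/, $a;
    my ( $mb, $db, $yb ) = split /-/, $b;
    $ya <=> $yb || $ma <=> $mb || $da <=> $db;   # year, then month, then day
} @dates;

print "$_\n" foreach @sorted_dates;
```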

We don't have to always use the comparison operators. We can make up our own unique order.

@order = sort {
    return -1 if $a eq 'King' && $b ne 'King';
    return  1 if $a ne 'King' && $b eq 'King';
    return  0;
} @cards; # King first, the rest doesn't matter.
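
One possible way to extend a made-up order to every element is a lookup table of ranks, compared numerically. The %rank hash and the card names below are hypothetical, chosen only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical ranking: lower number sorts first.
my %rank  = ( King => 0, Queen => 1, Jack => 2 );
my @cards = qw( Jack King Queen King );

my @order = sort { $rank{$a} <=> $rank{$b} } @cards;

print "@order\n";   # King King Queen Jack
```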

Combinations

Since map, grep, and sort all take and return lists, you can chain them together.

@pics = map  { lc }
        grep { /\.jpe?g$/i }
        sort @list;
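
A chain like this is read from the bottom up: the list is sorted, then filtered, then transformed. The file names in this sketch are invented.

```perl
use strict;
use warnings;

my @list = ( 'b.JPG', 'notes.txt', 'a.jpeg', 'c.png', 'A.jpg' );

my @pics = map  { lc $_ }             # 3. lowercase the survivors
           grep { /\.jpe?g$/i }       # 2. keep only JPEG names
           sort @list;                # 1. ASCII-betical sort

print "@pics\n";   # a.jpg a.jpeg b.jpg
```

Because the sort runs before the lowercasing, 'A.jpg' sorts ahead of 'a.jpeg' here.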

Optimizing sort

Given a list of files, sort by the age of the files.

chomp ( @files = `ls -1` );

“file1” “file7” “a.out” “x.pl” “5.dat”

Sorts by name, but not by age.

@sorted = sort @files; # by name

Sorts by date, but slow for large data sets.

-M is called twice every time sort compares!

@sorted = sort {
    -M $a <=> -M $b
} @files;

We want to call -M once for each file, save the result, and use the saved value each time sort needs to compare.

Map will do this for us!

@order =
    map { [ $_, -M ] } @files;

“file1” “file7” “a.out” “x.pl” “5.dat”
  1.2     2.9     3.1     1.1    2.9

Then we want to sort based on just the date part.

@order =
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -M ] } @files;

“x.pl” “file1” “5.dat” “file7” “a.out”
 1.1     1.2     2.9     2.9     3.1

But now we need to get rid of the date part.

Now use map to extract just element 0 and we are back to the original list, sorted by date.

This is known as the Schwartzian Transform.*

*Perl idiom named for Randal Schwartz, author of Learning Perl; the name was coined by Tom Christiansen.

@order =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -M ] } @files;

“x.pl” “file1” “5.dat” “file7” “a.out”

Key points to remember for ST:
● map sort map idiom
● Use proper comparison
● Extract value to compare
● Everything else stays the same.

@order =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -M ] } @files;
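
To see the saving, the sketch below replaces -M with a hypothetical key function, age_of, that counts its own calls. The ages are made up (loosely following the figures above, with the tie broken so the result is deterministic), which lets the example run without real files.

```perl
use strict;
use warnings;

# Made-up ages in days, standing in for -M.
my %age = ( 'file1' => 1.2, 'file7' => 2.9, 'a.out' => 3.1,
            'x.pl'  => 1.1, '5.dat' => 2.8 );
my @files = ( 'file1', 'file7', 'a.out', 'x.pl', '5.dat' );

my $calls = 0;
sub age_of {                      # hypothetical "expensive" key function
    my ($file) = @_;
    $calls++;
    return $age{$file};
}

# Naive sort: age_of runs twice for every comparison.
my @naive       = sort { age_of($a) <=> age_of($b) } @files;
my $naive_calls = $calls;

# Schwartzian Transform: age_of runs once per element.
$calls = 0;
my @order =
    map  { $_->[0] }                  # 3. unwrap the original value
    sort { $a->[1] <=> $b->[1] }      # 2. compare the cached keys
    map  { [ $_, age_of($_) ] }       # 1. pair each value with its key
    @files;
my $st_calls = $calls;

print "@order\n";   # x.pl file1 5.dat file7 a.out
print "naive: $naive_calls calls, ST: $st_calls calls\n";
```

With five elements the difference is small; it grows with the list, since the number of comparisons grows faster than the number of elements.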

Optimizing sort—Orcish Maneuver*

Uses an “or” cache (in a hash) to remember values already computed: ||=

● Simpler than ST
● Almost as fast as ST
● Faster if list contains duplicates

*Term coined by Joseph Hall in Effective Perl Programming, Addison-Wesley Professional, Boston, MA, 1998.

@order = sort {
    ( $cache{$a} ||= -M $a ) <=>
    ( $cache{$b} ||= -M $b )
} @files;

Key points to remember for OM:
● Only sort
● Compute comparison data
● Use proper comparison operator
● Everything else stays the same.

@order = sort {
    ( $cache{$a} ||= -M $a ) <=>
    ( $cache{$b} ||= -M $b )
} @files;
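
A runnable sketch of the Orcish Maneuver, again with a hypothetical call-counting key function (age_of) and made-up ages in place of -M. The list deliberately contains duplicates to show the cache paying off.

```perl
use strict;
use warnings;

# Made-up ages in days, standing in for -M.
my %age   = ( 'file1' => 1.2, 'a.out' => 3.1, 'x.pl' => 1.1 );
my @files = ( 'file1', 'x.pl', 'file1', 'a.out', 'x.pl' );

my $calls = 0;
sub age_of {                      # hypothetical "expensive" key function
    my ($file) = @_;
    $calls++;
    return $age{$file};
}

my %cache;
my @order = sort {
    # ||= computes the key only when the cache holds no true value yet.
    # A key of 0 would be recomputed each time; //= (Perl 5.10+) avoids that.
    ( $cache{$a} ||= age_of($a) ) <=>
    ( $cache{$b} ||= age_of($b) )
} @files;

print "@order\n";                  # x.pl x.pl file1 file1 a.out
print "$calls key computations\n"; # 3 key computations
```

Five elements but only three distinct names, so the key function runs three times.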

Optimizing sort—Guttman-Rosler Transform*

This is a tweak on ST. It takes advantage of substr and sprintf being faster than array manipulation. It also uses the default string sort, which is slightly faster.

*A Fresh Look at Efficient Perl Sorting, Uri Guttman and Larry Rosler, approx. 1999.

@order =
    map { substr $_, 10 }
    sort
    map {
        m#(\d{4})/(\d+)/(\d+)#;
        sprintf "%d-%02d-%02d%s",
            $1, $2, $3, $_
    } @dates;

Faster than ST

Harder to code and less readable

Not suitable for all sorts

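
A self-contained sketch of the transform on made-up yyyy/m/d dates. It captures the match into lexical variables rather than $1, $2, $3, which is a stylistic choice of this example; the technique is the same: prepend a fixed-width key, sort as plain strings, then strip the key.

```perl
use strict;
use warnings;

my @dates = ( '2014/3/9', '2013/12/25', '2014/3/10', '2014/1/2' );

my @order =
    map  { substr $_, 10 }            # 3. strip the 10-character key
    sort                              # 2. plain string sort, no block
    map  {                            # 1. prepend a fixed-width key
        my ( $y, $m, $d ) = m{(\d{4})/(\d+)/(\d+)};
        sprintf '%04d-%02d-%02d%s', $y, $m, $d, $_;
    } @dates;

print "@order\n";   # 2013/12/25 2014/1/2 2014/3/9 2014/3/10
```

A plain sort of the original strings would put 2014/3/10 before 2014/3/9; the zero-padded key is what makes string order agree with date order.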

Further List & Sort Options

● List::Util: shuffle, reduce, any, first, max, min, ...
● List::MoreUtils: uniq, natatime, ...
● Sort::Key: May be faster than ST or GRT
● Sort::Naturally: Automatically sorts numeric when appropriate
● Sort::Maker: Internally uses OM, ST, or GRT.
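
As a taste of the first module, here is a short sketch using a few List::Util functions on invented numbers. Only functions that have long been part of List::Util are used.

```perl
use strict;
use warnings;
use List::Util qw( first max min sum shuffle );

my @numbers = ( 4, 15, 9, 23, 1 );

my $first_big = first { $_ > 10 } @numbers;   # first element passing the test
my $largest   = max @numbers;
my $smallest  = min @numbers;
my $total     = sum @numbers;
my @mixed     = shuffle @numbers;             # same elements, random order

print "first > 10: $first_big\n";   # first > 10: 15
print "max: $largest, min: $smallest, sum: $total\n";   # max: 23, min: 1, sum: 52
```

first stops at the first match, which a grep over the whole list would not.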

Q&A

Comments?

Questions?

Resources

http://www.perlmonks.org

http://www.cpan.org

http://www.hidemail.de/blog/perl_tutor.shtml

http://perldoc.perl.org/

http://www.stonehenge.com/writing.html

For profiling:

perldoc Devel::NYTProf

perldoc Benchmark