Perl Tutorial by Code Examples

2010-08-02

Perl Tips

Here are some tips that will help you read and write Perl better once you know them.

BEGIN

A BEGIN block is executed at compile time, as soon as Perl has finished parsing it.

BEGIN {
  # Statements you want to execute at compile time
}
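
To see the ordering, here is a small sketch. The @order array is only a hypothetical log for this example. The BEGIN block is written last, but it runs first.

```perl
use strict;
use warnings;

# @order is a hypothetical log of when each statement ran.
our @order;

push @order, 'run time';

# Written last, but executed first, while the file is still being compiled.
BEGIN { push @order, 'compile time' }

print join(', ', @order), "\n";   # prints "compile time, run time"
```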

require

require is similar to use, but require loads the module at run time and doesn't call the import method automatically.

use File::Basename 'basename';

# The above is the same as the following
BEGIN {
  require File::Basename;
  File::Basename->import('basename');
}

When you want to load a module dynamically, you can use require, but in general it is better to stick to use.
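
Dynamic loading could look like the following sketch, assuming the module name is only known at run time (here it is hard-coded). When require receives a string instead of a bare module name, it expects a file path, so the name has to be converted first.

```perl
use strict;
use warnings;

# The module name is only known at run time (hard-coded here for the example).
my $module = 'File::Basename';

# require with a string needs a file path: File::Basename becomes File/Basename.pm.
(my $file = "$module.pm") =~ s{::}{/}g;
require $file;

# import was not called, so use the fully qualified function name.
my $name = File::Basename::basename('/tmp/a.txt');
print "$name\n";   # prints "a.txt"
```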

local

local does not create a local variable. local saves the current value and restores it when the scope ends, so you can change the value temporarily. I don't know the reason, but local can't be used on lexical variables. You can use local only on package variables and on elements of hashes and arrays.

our $num = 1;
{
  # Change value temporarily
  local $num = 2;
}

# Back to 1
print $num;

my $scores = {math => 90, english => 100};
{
  # Change the value temporarily
  local $scores->{math} = 80;
}

# Back to 90
print $scores->{math};

In most cases you don't need local, because my covers almost every need. So think of my first.
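
One difference from my is that a value set by local is also seen by the subroutines called from that scope. The following is a minimal sketch. The names $level, current_level and with_local are made up for the example.

```perl
use strict;
use warnings;

our $level = 1;

sub current_level { return $level }

sub with_local {
  # The temporary value is visible to every subroutine called from this scope.
  local $level = 2;
  return current_level();
}

my $inside = with_local();      # 2
my $after  = current_level();   # 1, the old value was restored
print "$inside $after\n";       # prints "2 1"
```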

Comment for multiple lines

Perl doesn't have comment syntax for multiple lines. You can only use # for a single line. If you want to comment out multiple lines while debugging, you can use the documentation syntax (POD).

=pod

my $str = 'aaa';
my $foo = 'foo';

=cut

The part surrounded by "=pod" and "=cut" is interpreted as documentation, so it isn't executed.

If the sigil is different, the variable is different

Variables which have the same name but different sigils are different variables.

# Array
my @ids;

# Scalar 
my $ids;

The two variables above share the name "ids", but they are different variables.

Look up the meaning of a special variable

Special variables are difficult to search for with Google, and it is tedious to look for them in the official documentation. With the "-v" option of the perldoc command you can read the meaning of a special variable. Quote the variable name so that the shell doesn't expand it.

perldoc -v '$.'

Method call

In Perl, you call a method with "->".

# Method call
SomeClass->method('a', 'b');
$obj->method('a', 'b');

A method call differs from a function call in one point: the first argument is the value on the left side of "->". In the example above, the first argument of the method is the string "SomeClass" or $obj. "a" and "b" are the second and third arguments.

The receiving side looks like the following code.

# In the case of a class method
sub method {
  my ($class, $arg1, $arg2) = @_;
}

# In the case of an object method
sub method {
  my ($self, $arg1, $arg2) = @_;
}
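
Putting both kinds of method together, a complete class might look like this sketch. The Counter class and its methods are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

package Counter;

# Class method: the first argument is the class name 'Counter'.
sub new {
  my ($class, $start) = @_;
  my $self = {count => $start};
  return bless $self, $class;
}

# Object method: the first argument is the object itself.
sub add {
  my ($self, $num) = @_;
  $self->{count} += $num;
  return $self->{count};
}

package main;

my $counter = Counter->new(10);
my $total   = $counter->add(5);
print "$total\n";   # prints "15"
```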

Minimal perldoc guide

You can read the Perl documentation with the perldoc command.

perldoc perlfunc

You can also read the documentation of a module.

perldoc File::Basename

If you want to see the source code of a module, use the "-m" option.

perldoc -m File::Basename

If you want to know the path of a module, use the "-l" option.

perldoc -l File::Basename

If you want to read the documentation of a built-in function, use the "-f" option.

perldoc -f substr

The documentation is easier to read if you redirect it to a file.

perldoc File::Basename > File-Basename.txt

Anonymous subroutine

In Perl you can create an anonymous subroutine with "sub { }". Don't forget the semicolon after the closing brace.

my $twice = sub {
  my $num = shift;

  return $num * 2;
};

To execute the anonymous subroutine, write the following.

my $result = $twice->(5);
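
Because an anonymous subroutine is just a reference, it can be passed to another subroutine. Here is a minimal sketch. apply_to_all is a hypothetical helper, not a built-in function.

```perl
use strict;
use warnings;

my $twice = sub {
  my $num = shift;
  return $num * 2;
};

# apply_to_all is a hypothetical helper which receives a subroutine reference.
sub apply_to_all {
  my ($code, @nums) = @_;
  return map { $code->($_) } @nums;
}

my @doubled = apply_to_all($twice, 1, 2, 3);
print "@doubled\n";   # prints "2 4 6"
```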

Omit "{}" of dereference

You can omit the braces "{}" when you dereference a simple scalar variable.

my @array = @$array_ref;

# The above is the same as the following.
my @array = @{$array_ref};

If the reference comes from something other than a simple variable, "{}" is needed.

my @array = @{$var->nums};
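
The same rule applies to nested data. In this sketch $data is a made-up structure. The element $data->{nums} is an expression, so it needs braces, while a plain variable does not.

```perl
use strict;
use warnings;

my $data = {nums => [1, 2, 3]};

# $data->{nums} is an expression, not a plain variable, so the braces are required.
my @nums = @{$data->{nums}};

# A plain scalar variable can be dereferenced without braces.
my $nums_ref = $data->{nums};
my @copy     = @$nums_ref;

print scalar(@nums), " ", scalar(@copy), "\n";   # prints "3 3"
```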

See the documentation of file test operators

File test operators such as "-s" and "-f" are used often, but they are difficult to find in the documentation. You can read about them with the following command.

perldoc -f -X

Character replacing - tr

You will sometimes see tr. tr replaces characters one by one. The following code replaces "a" with "1", "b" with "2", and "c" with "3".

$str =~ tr/abc/123/;
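
tr also returns the number of characters it matched, so it is often used for counting. A small sketch with a sample string:

```perl
use strict;
use warnings;

my $str = 'abcabc';

# Replace on a copy so that the original string is kept.
(my $replaced = $str) =~ tr/abc/123/;

# With an empty replacement list nothing is changed; tr just returns the count.
my $count_a = ($str =~ tr/a//);

print "$replaced $count_a\n";   # prints "123123 2"
```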

See Perl's own configuration

Type the following command on the command line.

perl -V

Hash slice of hash reference

The syntax of a hash slice of a hash reference is difficult to read.

my ($key1, $key2, $key3) = @{$hash}{qw/key1 key2 key3/};

It may be better to write the assignments one by one.

my $key1 = $hash->{key1};
my $key2 = $hash->{key2};
my $key3 = $hash->{key3};

One usage of eval

eval { Statement; 1; } or die "Exception";

I sometimes see this code, but it is a little difficult to read. If the statement in {} succeeds, eval returns the value of the last executed statement, which is 1, so the right side of "or" isn't executed. If the statement dies, eval returns undef, so the right side is executed. The "1;" is needed because without it a successful statement whose value is false, such as 0, would also run the right side.

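I like the following code because it is easier to read. Be aware that the first form is more reliable: in Perl before 5.14, $@ can be cleared before you check it, for example by an eval inside a destructor.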

eval { Statement };
die "Exception" if $@;
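
The following sketch shows the three values involved. The error message is invented for the example.

```perl
use strict;
use warnings;

# eval returns undef when the block dies, and the message is stored in $@.
my $ok    = eval { die "something failed\n"; 1 };
my $error = $@;

# This block succeeds, but its last value is false.
# "eval { 0 } or die ..." would die by mistake; the trailing "1;" prevents that.
my $zero = eval { 0 };

print defined $ok ? "ok\n" : "failed: $error";   # prints "failed: something failed"
```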

Often used special variables

  • @_ - Subroutine arguments
  • @ARGV - Command line arguments
  • $. - Current line number of the file being read
  • $0 - Script name
  • $1 - The part matched by the first parentheses () of a regular expression. $2, $3 and so on work the same way.
  • $! - OS error message
  • $@ - Exception message when you catch exception by eval
  • $? - Child process status

Default argument of the shift function

When you don't give an argument to shift, it works on the command line arguments (@ARGV) at the top level and on the subroutine arguments (@_) inside a subroutine. A bare shift is used very often.

# Top level
my $num = shift;

# This is the same as
# my $num = shift @ARGV;

# In a subroutine
sub func {
  my $num = shift;

  # This is the same as
  # my $num = shift @_;
}

Exit status of an external process

When you run an external process with the system function, the exit status of the process is stored in the upper 8 bits of the special variable $?. If the command itself could not be started, $? is set to -1. You can check whether the exit status is success (0 means success).

my $command = "ls -l";
system $command;

if ($? == -1) {
  die "failed to execute: $!\n";
}
elsif ($? >> 8 != 0) {
  die "Return error status\n";
}
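
To look at the parts of $? separately, you can write something like this sketch. It uses $^X (the path of the running perl) as the child command only so that the example behaves the same everywhere. The exit status 3 is arbitrary.

```perl
use strict;
use warnings;

# $^X is the path of the running perl, used here as a portable child command.
system($^X, '-e', 'exit 3');

die "failed to execute: $!\n" if $? == -1;

my $exit_status = $? >> 8;    # upper 8 bits: the exit status of the child
my $signal      = $? & 127;   # lower 7 bits: the signal which killed the child, if any

print "exit status: $exit_status, signal: $signal\n";   # prints "exit status: 3, signal: 0"
```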

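Perl's ithread implementation is not good

Perl has threads, called ithreads (interpreter threads), but the implementation is not good. Every thread gets its own copy of the interpreter, so creating a thread is slow and uses a lot of memory, and data is not shared between threads unless you ask for it. Most CPAN modules are not written with thread safety in mind, so you will have a lot of trouble if you use ithreads. The documentation of the threads module itself discourages their use.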

Old Perl

You don't need to write in the old style yourself, but you will run into it when you read source code written by other people.

The vars pragma does almost the same job as an our declaration. Note that the sigil is part of the name you pass to it.

# Almost the same as "our $VAR;"
use vars '$VAR';

Operators which start with q

It is a little difficult to remember the operators which start with q, so I summarize them here.

You can use delimiters such as "//", "{}" and "||" around the content.

q is the single quote operator. It is the same as single quotes, except that you can write single quotes inside it without escaping.

my $str = q/aaa ''' bbb/;

qq is the double quote operator. It is the same as double quotes, except that you can write double quotes inside it without escaping. Variables are still expanded.

my $message = qq/aaa """ $str/;

qw is the word list operator. You can write a list of strings easily.

# Same as ('foo', 'bar', 'baz')
my @strs = qw/foo bar baz/;

qr creates a compiled regular expression which you can store in a variable.

my $regex = qr/^aaa.*bbb$/ms;
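
A compiled regular expression is handy when the same pattern is used in several places. The following is a minimal sketch with made-up input strings.

```perl
use strict;
use warnings;

# qr compiles the pattern; the result can be stored and reused.
my $word = qr/(\w+)/;

# A qr object can be interpolated into a larger pattern.
my ($first, $second) = 'hello world' =~ /^$word\s+$word$/;

# It can also be used directly on the right side of =~.
my $has_word = 'abc' =~ $word ? 1 : 0;

print "$first $second $has_word\n";   # prints "hello world 1"
```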

A module needs "1;" at the end

A module file must end with a true value. We usually write "1;", although any true value works.

package SomeModule;

# Implementation 

1;

Semicolon at the end of block is not necessary. return at the end of block is not necessary

The semicolon after the last statement of a block is not necessary.

sub func1 {
  my $arg = shift;

  # The last semicolon is not necessary
  return $arg
}

return at the end of a subroutine is not necessary either, because a subroutine returns the value of the last evaluated statement.

sub func1 {
  my $arg = shift;

  # return is not necessary here
  $arg;
}

Usually you should write both the semicolon and return, but you can omit them when you write a one-line subroutine. This looks clean to me.

# Such as constant value
sub CONST_VALUE { 3 }

Try a lot of Perl code

When I want to run an example, I prepare a simple file a.pl and try the example there. If you find it tedious to delete the old example before trying new code, you can use "__END__".

__END__ ends the script at that line. Perl ignores everything after it.

# a.pl (run it with "perl a.pl")
use strict;
use warnings;

print 'aaa';

__END__

print 'bbb';

How to keep the program from ending when you execute the final line in the debugger

If you are using the debugger and don't want the program to finish right after you step over the final line, you can put a meaningless "1;" at the end.

my $str = 'aaa';
print $str;

1;

Shortcut for reading files

In a throwaway script it is a little tedious to open files by hand. With the diamond operator alone you can read line by line from the files given as command line arguments, or from standard input when there are none.

# Command line
perl script.pl file1 file2

# Read line by line from the files
while (my $line = <>) {
  ...
}

This code is the same as the following, because Perl adds defined automatically in this form of while. So a final line which contains only "0" is also read.

while (defined(my $line = <>)) {
  ...
}
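
You can try this without a real command line. The sketch below assumes a temporary file created with File::Temp and puts its path into @ARGV by hand. The file contents are invented for the example.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

# Create a throwaway file whose last line is "0" without a trailing newline.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "first\n0";
close $fh or die "Cannot close $path: $!";

# Pretend the file was given on the command line.
@ARGV = ($path);

my @lines;
while (my $line = <>) {
  chomp $line;
  push @lines, $line;
}

print join(',', @lines), "\n";   # prints "first,0"
```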

Caution about variable expansion

If a double colon or an underscore follows the variable name in a string, variable expansion doesn't work as you expect, because Perl takes those characters as part of the variable name. In that case, use "{}" to mark the variable name explicitly.

my $name = 'aaa';
my $message = "${name}::aaa ${name}_ppp";

See symbol table

The symbol table is the data structure which holds the names of package variables and subroutines. The symbol table of each package is a hash whose name is the package name followed by "::".

# Symbol table
%main::
%CGI::
%File::Basename::
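
Because a symbol table is an ordinary hash, you can look into it with exists and keys. This is only a sketch. The Greeter package is a hypothetical example, and the exact list of keys depends on what the package contains.

```perl
use strict;
use warnings;

package Greeter;

our $VERSION = '0.01';
sub hello { return 'hello' }

package main;

# The symbol table of package Greeter is the hash %Greeter::
my $has_hello = exists $Greeter::{hello} ? 1 : 0;
my $has_bye   = exists $Greeter::{bye}   ? 1 : 0;

# List every name stored in the package.
print "$_\n" for sort keys %Greeter::;

print "$has_hello $has_bye\n";   # prints "1 0"
```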

Use the each function only in a while loop

The each function keeps an iterator inside the hash. So if you call each only once, the iterator moves forward and stays in that state.

# The iterator moves forward and stays there
my ($key, $value) = each %hash;

Calling the keys function resets the iterator, but relying on that is not a good way.

# The iterator is reset
keys %hash;

You should use the each function only in a while loop which runs to the end.

while (my ($key, $value) = each %hash) {
  ...
}
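
The trouble with a half-used iterator can be seen in the following sketch, which uses a small sample hash. The first loop is left early, so the second loop sees only the remaining pairs.

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);

# Leaving the loop early keeps the iterator in the middle of the hash.
while (my ($key, $value) = each %hash) {
  last;
}

# This loop continues from where the iterator stopped, so it sees only two pairs.
my $rest = 0;
while (my ($key, $value) = each %hash) {
  $rest++;
}

# The previous loop ran to the end, so the iterator starts from the beginning again.
my $all = 0;
while (my ($key, $value) = each %hash) {
  $all++;
}

print "$rest $all\n";   # prints "2 3"
```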

The file handle passed to the print function is not the first argument

This is a strange specification, but the file handle passed to the print function is not the first argument. There is no comma after the file handle.

# $fh is not the first argument
print $fh $output;

This is called indirect object syntax. It means the same as the following code.

$fh->print($output);
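
The three ways of writing it can be compared in this sketch. It writes to an in-memory file (a reference to the scalar $buffer) only so that the result is easy to check.

```perl
use strict;
use warnings;
use IO::Handle;

# Write to an in-memory file so that the result can be inspected.
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory file: $!";

print $fh "first\n";      # no comma after the file handle
print {$fh} "second\n";   # the block form makes the file handle stand out
$fh->print("third\n");    # method form

close $fh or die "Cannot close: $!";

print $buffer;   # prints the three lines
```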

On Windows, IO::Poll and select work well only with sockets

On Windows, non-blocking I/O works well only with sockets. It doesn't work well with files.

On Windows, the glob function doesn't work well if the argument contains a space

The glob function treats whitespace in its argument as a separator between patterns, so a path which contains a space, as is common on Windows, doesn't work well.

To make it work, surround the pattern with double quotes inside the string.

my @files = glob('"C:/Documents and Settings/*.*"');

When parentheses are needed in a subroutine call

If the subroutine is already declared, the parentheses aren't needed.

# The subroutine is already declared, so the parentheses aren't needed
sub func1 { print $_[0] }
func1 'aaaa';

The same is true of an imported function.

# The parentheses aren't needed for an imported function
use Carp 'croak';

croak 'aaa';

If the subroutine is declared after the statement which calls it, the parentheses are needed.

# The subroutine is declared after this statement, so the parentheses are needed
func1('aaa');
sub func1 { print 1 }
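
If you want to keep the definition at the bottom of the file and still call it without parentheses, a forward declaration works. A minimal sketch, where greet is a made-up subroutine:

```perl
use strict;
use warnings;

# Forward declaration: tells Perl that greet is a subroutine before it is used.
sub greet;

my $text = greet 'world';

sub greet { return "hello, $_[0]" }

print "$text\n";   # prints "hello, world"
```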

An anonymous subroutine exists from the start of the program to the end

An anonymous subroutine looks as if it is created dynamically, but its code is compiled when the program starts and is destroyed when the program ends. The following code creates a subroutine reference at run time, not the subroutine itself.

{
  # A subroutine reference is created, not the subroutine itself
  my $sub = sub {
    ...
  };
}

List context in regular expression

If you want to get the matched strings from a regular expression, you usually write the following code with an if statement.

if ($str =~ /Regular expression/) {
  my $match1 = $1;
  my $match2 = $2;
}

There is another style. You can get the matched strings by evaluating the regular expression in list context.

my ($match1, $match2) = $str =~ /Regular expression/;
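
With a concrete pattern it looks like the following sketch. The date string is sample data. When the match fails, the list is empty, so "or die" can catch it.

```perl
use strict;
use warnings;

my $str = '2010-08-02';

# In list context the match returns the captured strings ($1, $2, $3).
my ($year, $month, $day) = $str =~ /^(\d{4})-(\d{2})-(\d{2})$/
  or die "Unexpected date format: $str\n";

print "$year/$month/$day\n";   # prints "2010/08/02"
```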

Change the delimiter of a regular expression

Usually you write a regular expression like the following.

$str =~ /Regular expression/;

There is another way with the same meaning. You can use m.

$str =~ m/Regular expression/;

With m you can change the delimiter of the regular expression. This is useful, for example, when the regular expression contains many slashes.

$str =~ m#http://aaa\.com#;

The name of the ord function is an abbreviation of ordinal

The name of the ord function, which returns the code point of a character, is an abbreviation of ordinal. By the way, the chr function converts a code point to a character.

ord function <-> chr function
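
A short sketch of the round trip, using the ASCII letter "A":

```perl
use strict;
use warnings;

my $code = ord('A');            # 65
my $char = chr(65);             # 'A'
my $next = chr(ord('A') + 1);   # 'B'

print "$code $char $next\n";    # prints "65 A B"
```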

