PROGRAMMING THE WORLD WIDE WEB Chapter 4 The Basics of Perl

©

Chapter 4 The Basics of Perl

• 4.1 Origins and Uses of Perl

• 4.2 Scalars and Their Operations

• 4.3 Assignment Operators

• 4.4 Control Statements

• 4.5 Lists and Arrays

• 4.6 Hashes

• 4.7 References

• 4.8 Functions

©

4.1 Origins and Uses of Perl

• Perl began as a relatively small language with the modest purpose of including and expanding on the operations of the text-processing language awk and the system administration capabilities of the UNIX shell languages, initially sh.

• Perl was first released in 1987, after being developed

and implemented by Larry Wall.

• Its facilities for textual pattern matching are elaborate and powerful, which is one of the reasons for its widespread use in CGI programming.

• Perl is a language whose implementation is between compiled (to machine code) and interpreted. Perl programs are compiled to an intermediate form, which is then interpreted. This gives reasonable execution-time performance even though the code is interpreted.

©

4.2 Scalars and Their

Operations

• Perl has three categories of variables:

– scalars

– arrays

– hashes

• each identified by the first character of

their names

– $ for scalar variables

– @ for array variables

– % for hash variables

©

4.2 Scalars and Their

Operations

• Scalar variables can store three different kinds

of values:

– numbers

– character strings

– references (which are addresses)

• The numeric values stored in scalar variables are represented in double-precision floating-point form.

• Character strings are treated as scalar units in

Perl. They are not arrays of individual characters.

©

4.2 Scalars and Their

Operations

• 4.2.1 NUMERIC AND STRING LITERALS

– Numeric literals can have the forms of either integers or floating-point values.

72 7.2 72. 7E2 7e2 .7e2 7.e2 7.2E-2

– Integer literals can be written in hexadecimal by preceding their first digit with either 0x or 0X.

©

4.2 Scalars and Their

Operations

• 4.2.1 NUMERIC AND STRING LITERALS

– String literals can appear in two forms, depending on their delimiters: single quotes (') or double quotes (").

– Single-quoted string literals cannot include characters specified with escape sequences, such as \n.

– If a single quote must be embedded in a single-quoted string literal, it is preceded by a backslash, as in this example:

'You\'re the most freckly person I ever met'

©

4.2 Scalars and Their

Operations

• 4.2.1 NUMERIC AND STRING LITERALS

– If you want a string literal with the same characteristics as single-quoted strings, but you want to use a different delimiter, you simply precede the delimiter with q, as in this example:

q$I don't want to go, I won't go, I can't go!$

©

4.2 Scalars and Their

Operations

• 4.2.1 NUMERIC AND STRING LITERALS

– If the new delimiter is a parenthesis, a brace, a bracket, or a pointed bracket, the left element of the pair must be used on the left, and the right element must be used on the right:

q<I don't want to go, I won't go, I can't go!>

©

4.2 Scalars and Their

Operations

• Double-quoted string literals differ from single-quoted string literals in two ways:

– First, they can include escape sequences:

"completion % \t Yards \t Touchdowns \t Inter"

– Second, embedded variable names are interpolated into the string, which means that their values are substituted for their names:

"Jack is $age years old"

©

4.2 Scalars and Their

Operations

• A different delimiter can be specified for string literals with the characteristics of double-quoted strings by preceding the new delimiter with qq:

qq@"Why, I never!", said she.@

• The null string (one with no characters) can be denoted with either '' or "".
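The quoting forms described in 4.2.1 can be compared side by side. The following is a minimal sketch; the variable names ($age, $single, $double, $q_form, $qq_form) and the sample strings are illustrative assumptions, not taken from the book.

```perl
use strict;
use warnings;

my $age = 29;

# Single quotes: no interpolation, and \n stays as two characters.
my $single = 'Jack is $age years old\n';

# Double quotes: $age is interpolated and \n becomes a newline.
my $double = "Jack is $age years old\n";

# q and qq behave like single and double quotes with other delimiters.
my $q_form  = q<I don't want to go, I won't go!>;
my $qq_form = qq@"Why, I never!", said she (aged $age).@;

print $single, "\n";
print $double;
print $q_form, "\n";
print $qq_form, "\n";
```

Because q and qq accept other delimiters, quotes inside the string need no backslashes.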

©

4.2 Scalars and Their

Operations

• 4.2.2 SCALAR VARIABLES

• The names of all scalar variables, whether

predefined or programmer-defined, begin with

dollar signs($).

• Following the dollar sign, there must be at least

one letter. After that, there can be any number of

letters, digits, or underscore characters.

• There is no limit to the length of a variable name.

• The letters in a variable name are case-sensitive.

©

4.2 Scalars and Their

Operations

• 4.2.2 SCALAR VARIABLES

• If you want to embed a variable name in a

double-quoted string but do not want it

interpolated, just backslash the dollar sign,

and the variable name will no longer be

recognized as such:

"The variable with the result is \$answer"

©

4.2 Scalars and Their

Operations

• 4.2.2 SCALAR VARIABLES

• In Perl, variables are not explicitly declared.

• A scalar variable that has not been

assigned a value by the program has the

value undef.

– The numeric value of undef is 0;

– the string value of undef is the null string.

©

4.2 Scalars and Their

Operations

• 4.2.2 SCALAR VARIABLES

• Perl includes a large number of predefined variables, which we sometimes call implicit variables.

• The names of implicit scalar variables begin with dollar signs. The rest of the name of an implicit variable is often just one more special character, such as

– an underscore (_)

– a circumflex (^)

– or a backslash (\).

©

4.2 Scalars and Their

Operations

• 4.2.3 NUMERIC OPERATORS

– Most of Perl’s numeric operators are

borrowed from C.

– binary operators +, -, * , /, **

– Except under unusual circumstances, numeric

operations are done in double-precision

floating point.

• 5/2 evaluates to 2.5
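The numeric literal forms and operators above can be tried directly. This is a small sketch with illustrative variable names; the values in the comments follow from ordinary floating-point arithmetic.

```perl
use strict;
use warnings;

my $quotient = 5 / 2;      # floating-point division, not integer division
my $power    = 2 ** 10;    # ** is exponentiation
my $hex      = 0x1F;       # hexadecimal integer literal
my $sci      = 7.2E-2;     # same value as 0.072

print "5 / 2   = $quotient\n";   # 2.5
print "2 ** 10 = $power\n";      # 1024
print "0x1F    = $hex\n";        # 31
print "7.2E-2  = $sci\n";        # 0.072
```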

©

4.2 Scalars and Their

Operations

• 4.2.4 STRING OPERATORS

• Perl strings are not stored or treated as arrays of characters; rather, a string is a single unit.

• String catenation is specified with the operator denoted by a period.

$first = "Freddie";

$first . "Freeloader"

©

4.2 Scalars and Their

Operations

• 4.2.4 STRING OPERATORS

• The repetition operator is specified with an x.

"More! " x 3

©

4.2 Scalars and Their

Operations

• 4.2.5 STRING FUNCTIONS

• Functions and operators in Perl are very

closely related. In fact, in many cases you

can treat them interchangeably.

– doit x

– doit(x)

©

4.2 Scalars and Their

Operations

• 4.2.5 STRING FUNCTIONS

• The most commonly used string functions

are shown in Table 4.2. (Book page 92)

Name     Parameters                          Result
chomp    A string                            The string with any terminating newline characters removed
length   A string                            The number of characters in the string
lc       A string                            The string with all uppercase letters converted to lowercase
uc       A string                            The string with all lowercase letters converted to uppercase
hex      A string                            The decimal value of the hexadecimal number in the string
join     A character and a list of strings   The strings catenated together with the character inserted between them
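The string operators and the functions in Table 4.2 can be exercised together. The sketch below uses illustrative data and variable names; note that chomp changes its argument in place.

```perl
use strict;
use warnings;

my $first = "Freddie";
my $full  = $first . " Freeloader";      # catenation with the period operator
my $cheer = "More! " x 3;                # repetition with the x operator

my $line = "Perl basics\n";
chomp($line);                            # removes the trailing newline in place

my $len    = length($line);              # 11
my $upper  = uc($line);                  # PERL BASICS
my $lower  = lc($line);                  # perl basics
my $value  = hex("1F");                  # 31
my $joined = join(", ", "a", "b", "c");  # a, b, c

print "$full\n$cheer\n";
print "$len $upper $lower $value $joined\n";
```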

©

4.3 Assignment Operators

• Among the most fundamental constructs in

most programming languages are

assignment and the statements or

functions that provide keyboard input and

screen output.

©

4.3 Assignment Operators

• 4.3.1 ASSIGNMENT STATEMENTS

– assignment operator is =

$salary=47500;

– compound assignment operators are binary

operators with the simple assignment

operator catenated to their right side.

$sum+=$value;

– All Perl statements except those at the end of

blocks must be terminated by semicolons.

– Comments in Perl are specified using the

pound sign(#).

©

4.3 Assignment Operators

• 4.3.2 KEYBOARD INPUT

• All input and output in Perl is uniformly

thought of as file input and output.

• Files have external names but are

referenced in programs through internal

names, called filehandles.

• There are three predefined filehandles,

– STDIN

– STDOUT

– STDERR

©

4.3 Assignment Operators

• 4.3.2 KEYBOARD INPUT

• Strings are input from the keyboard with the line input operator, using STDIN as the filehandle:

$in_data = <STDIN>;

• In many cases, the newline character is not wanted, so the following idiom is common:

chomp($in_data = <STDIN>);

©

4.3 Assignment Operators

• 4.3.3 SCREEN OUTPUT

• The operand for print is one or more string literals, separated by commas.

print "This is pretty easy \n";

• The printf function from C is also available in Perl, including the format codes, such as %7d and %5s.

Page 95 example quadeval.pl
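As a brief illustration of print and the C-style format codes, the sketch below uses made-up values ($count, $name). It also calls sprintf, which is not mentioned on the slide; it accepts the same format codes as printf but returns the string instead of printing it.

```perl
use strict;
use warnings;

my ($count, $name) = (42, "Perl");

print "Count: ", $count, "\n";        # print takes a comma-separated list

# %7d right-justifies an integer in 7 columns; %5s does the same for a
# string in 5 columns.
printf("%7d|%5s|\n", $count, $name);

# sprintf builds the same text without printing it.
my $formatted = sprintf("%7d|%5s|", $count, $name);
print "$formatted\n";
```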

©

4.3 Assignment Operators

• 4.3.3 Screen Output

• To execute a Perl program:

– %perl filename.pl

• The line input operator can be used to get input from a file specified as a command-line argument.

%perl filename.pl data.dat

$input = <>;

©

4.4 Control Statements

• Perl has a powerful collection of statements for controlling the execution flow through its programs.

©

4.4 Control Statements

• 4.4.1 CONTROL EXPRESSIONS

• The expressions upon which statement control flow is based are either scalar-valued expressions, relational expressions, or compound expressions.

• If the value of a scalar-valued expression is a string, it is true unless it is either the empty string or the zero string "0".

• If the value is a number, it is true unless it is zero.

©

4.4 Control Statements

• Table 4.3 Relational Operators

Operation                     Numeric operands   String operands
Is equal to                   ==                 eq
Is not equal to               !=                 ne
Is less than                  <                  lt
Is greater than               >                  gt
Is less than or equal to      <=                 le
Is greater than or equal to   >=                 ge

©

4.4 Control Statements

• Table 4.4 Operator Precedence and Associativity (highest precedence first)

Operators                 Associativity
++ --                     Nonassociative
**                        Right
~ ! unary + and -         Right
* / % x                   Left
+ - .                     Left
<< >>                     Left
> < >= <= lt gt le ge     Nonassociative
== != eq ne               Nonassociative

©

4.4 Control Statements

• Table 4.4 Operator Precedence and Associativity (cont., highest precedence first)

Operators                     Associativity
&                             Left
| ^                           Left
&&                            Left
||                            Left
= += -= *= **= /= &= <<=      Right
>>= &&= |= ||= .= %= ^= x=
not                           Right
and                           Left
or xor                        Left

©

4.4 Control Statements

• Because assignment statements have

values (the value assigned to the left side

variable), they can be used as control

expressions.

• One common use of this is to use an

assignment statement that uses <STDIN>

as its right side.

while ($next = <STDIN>) {…}

©

4.4 Control Statements

• 4.4.2 SELECTION AND LOOP

STATEMENTS

• Control statements require some syntactic

container for sequences of statements

whose execution they are meant to control.

• A block is formed by putting the statements in braces. Blocks can have local variables, as you will see in Section 4.8.

• A control construct is a control statement

and the block whose execution it controls.

©

4.4 Control Statements

• 4.4.2 SELECTION AND LOOP

STATEMENTS

• Perl's if statement is similar to that of C. The only thing a bit different is that both the then clause and the else clause must always be blocks.

if ($a>10)

$b=$a*2; # Illegal - Not a block

©

4.4 Control Statements

• 4.4.2 SELECTION AND LOOP

STATEMENTS

• Perl also has an unless statement, which

is related to its if statement, except that

the inverse of the value of the control

expression is used.

unless ($sum>1000){

print "We are not finished yet!\n";

}
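The selection statements can be seen together in a short sketch. The threshold values and the $status variable are illustrative; elsif, which the slides do not mention, is Perl's spelling of "else if".

```perl
use strict;
use warnings;

my $sum = 750;
my $status;

# Every clause must be a block, even when it holds a single statement.
if ($sum > 1000) {
    $status = "finished";
} elsif ($sum > 500) {
    $status = "over halfway";
} else {
    $status = "just started";
}

# unless runs its block when the control expression is false.
unless ($sum > 1000) {
    print "We are not finished yet! ($status)\n";
}
```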

©

4.4 Control Statements

• 4.4.2 SELECTION AND LOOP STATEMENTS

• The Perl while and for statements are similar to those of C and its descendants. The bodies of both must be blocks.

while (control expression) {

#loop body statement(s)

}

for (initial expression; control expression; increment expression) {

#loop body statement(s)

}

©

4.4 Control Statements

• 4.4.2 SELECTION AND LOOP STATEMENTS

• Perl has no case or switch statement. However,

it has an operator, last, that can be used to build

such a construct, using a labeled block.

• The last operator transfers control out of the

block in which it appears.

• last can include an operand, which is a block

label.

• Blocks can be labeled by placing a name and a colon

before the opening brace.
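One way to build a switch-like construct from a labeled block and last is sketched below. The label SWITCH, the test strings, and the $count values are illustrative assumptions.

```perl
use strict;
use warnings;

my $input = "oranges";
my $count = 0;

SWITCH: {
    if ($input eq "apples")  { $count = 1; last SWITCH; }
    if ($input eq "oranges") { $count = 2; last SWITCH; }
    $count = -1;    # default case: reached only if nothing matched
}

print "count is $count\n";   # count is 2
```

Each last SWITCH leaves the labeled block immediately, so at most one case runs.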

©

4.4 Control Statements

• 4.4.2 SELECTION AND LOOP STATEMENTS

• The implicit variable $_ is frequently used in Perl programs, most often as the default parameter in a function call and as the default operand of an operator. It is also the default target of the line input operator.

• In the while construct of the example on page 101, $_ is used in three ways:

– as the target of the input operator

– as default parameter to print

– as the default operand of chomp

Page 101 example
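The three uses of $_ listed above can be seen in a loop like the following sketch. It is not the book's example: an in-memory filehandle ($in) stands in for STDIN so the code is self-contained, and the input words are made up.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for STDIN.
my $text = "silver\ngold\ncopper\n";
open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";

my $found = 0;
while (<$in>) {          # each line is read into $_
    chomp;               # chomp works on $_ by default
    print;               # print outputs $_ by default
    print "\n";
    $found = 1 if $_ eq "gold";
}
close($in);
```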

©

4.5 Lists and Arrays

• Arrays in Perl are more flexible than those

of other common languages. This is a

result of the late binding of the lengths of

arrays and the types of the array elements.

• Arrays store only scalar values, but that

includes numbers, strings, and references,

which provide for nested data structures.

©

4.5 Lists and Arrays

• 4.5.1 LIST LITERALS

• A list is an ordered sequence of scalar values.

• A list literal, which is a parenthesized list of scalar values, is the way a list value is specified in a program.

(3.1415926 * $radius, "circles", 17)

• The qw operator can be used on a sequence of unquoted strings to quote all of them:

qw(peaches apples pears kumquats)

©

4.5 Lists and Arrays

• 4.5.2 ARRAYS

• All array names begin with an at sign (@), which puts them in a namespace that is different from that of the scalar variable names.

• Arrays can be assigned list literals or other arrays.

@list = qw(boy girl dog cat);

@list2 = @list;   # the target array name here is illustrative

©

4.5 Lists and Arrays

• If an array is used in scalar context (a position in which a scalar value is required), the array's length (the number of elements that are in the array) is used.

$len = @list;

• A list literal that contains only scalar variable names can be the target of a list assignment:

($first, $middle, $last) = ("George", "Bernard", "Shaw");

©

4.5 Lists and Arrays

• All Perl array elements use integers as subscripts, and the lower bound subscript of every array is zero.

• Array elements are referenced through subscripts delimited by brackets ([ ]).

• A subscript can be any numeric-valued expression.

@list = (2, 4, 6, 8);

$second = $list[1];

©

4.5 Lists and Arrays

• The length of an array is dynamic:

@list = ("Monday", "Tuesday", "Wednesday", "Thursday");

$list[4] = "Friday";

• The last subscript of @list can be referenced as $#list. So, the length of @list is $#list + 1. The last subscript of an array can be assigned to set its length to whatever you want.

$#list = 999;
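The relationship between an array's length and its last subscript can be checked with a short sketch. The variable names $len, $last, and $new_len are illustrative.

```perl
use strict;
use warnings;

my @list = ("Monday", "Tuesday", "Wednesday", "Thursday");
$list[4] = "Friday";            # assigning past the end grows the array

my $len  = @list;               # scalar context yields the length: 5
my $last = $#list;              # last subscript: 4

$#list = 2;                     # shrinks the array to three elements
my $new_len = scalar(@list);    # 3

print "$len $last $new_len @list\n";
```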

©

4.5 Lists and Arrays

• Two different contexts of a variable name or expression exist:

– scalar

– list

• Some of Perl's operators force either scalar or list context on their operand.

scalar(@list)

©

4.5 Lists and Arrays

• The foreach statement is used to process the elements of an array.

foreach $value (@list) {

$value/=2;

}

Example each
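A point worth seeing in action is that the foreach loop variable is an alias for each array element, so assigning to it changes the array. A minimal sketch with illustrative data:

```perl
use strict;
use warnings;

my @list = (2, 4, 6, 8);
foreach my $value (@list) {
    $value /= 2;    # $value is an alias, so the array element itself changes
}
print "@list\n";    # 1 2 3 4
```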

©

4.5 Lists and Arrays

• 4.5.3 LIST OPERATORS

– unshift: takes two operands, an array and a scalar or list. The scalar or list is added to the beginning of the array.

unshift @first, @list;

unshift @first, $value;

– shift: removes and returns the first element of its given array operand.

$first = shift @list;

Example unshift

Example shift

©

4.5 Lists and Arrays

– pop: removes and returns the last element of its given array operand.

$last = pop @list;

– push: takes an array and a scalar or a list. The scalar or list is added to the high end of the array.

@list = (2, 4, 6);

push @list, (8, 10);

Example push

Example pop
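The four operators work on the two ends of an array, which the following sketch traces step by step. The data values are illustrative.

```perl
use strict;
use warnings;

my @list = (2, 4, 6);

push @list, (8, 10);         # @list is now (2, 4, 6, 8, 10)
my $last = pop @list;        # $last is 10
unshift @list, 0;            # @list is now (0, 2, 4, 6, 8)
my $first = shift @list;     # $first is 0

print "@list\n";             # 2 4 6 8
```

push and pop treat the high end of the array as a stack; shift and unshift do the same at the low end.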

©

4.5 Lists and Arrays

– split: is used to break strings into parts using a specified character as the basis for the split.

$stoogestring = "Curly Larry Moe";

@stooges = split / /, $stoogestring;

– sort: takes an array parameter and uses string comparison to sort the elements of the array into alphabetic order.

Example split

Example sort
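A short sketch of split and sort follows. The reordered stooge string and the numeric list are illustrative; the last line shows a consequence of sort's default string comparison.

```perl
use strict;
use warnings;

my $stoogestring = "Moe Curly Larry";
my @stooges = split / /, $stoogestring;    # ("Moe", "Curly", "Larry")
my @sorted  = sort @stooges;               # ("Curly", "Larry", "Moe")

# Because sort compares strings by default, numbers sort as text.
my @numbers = sort (10, 9, 100);           # (10, 100, 9)

print "@sorted\n";
print "@numbers\n";
```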

©

4.5 Lists and Arrays

– 4.5.4 AN EXAMPLE OF LISTS AND ARRAYS

– page 105-106

Example process_names.pl

©

4.6 Hashes

• Associative arrays are arrays in which each data

element is paired with a key, which is used to

find the data element.

• Perl associative arrays are called hashes.

• The two fundamental differences between

arrays and hashes :

– arrays use numeric subscripts to address specific elements; hashes use string values (the keys) for element addressing.

– the elements of an array are ordered, but in hashes they are not.

©

4.6 Hashes

• Names of hash variables begin with

percent signs(%).

• List literals are used to initialize hash variables.

• The symbols => can be used between a key and its associated data element, or value.

%kids_ages = ("John" => 31, "Genny" => 28, "Jake" => 15, "Darcie" => 13);

©

4.6 Hashes

• Arrays can be assigned to hashes, with the sensible semantics that the odd-subscripted elements of the array become the values in the hash and the even-subscripted elements become the keys.

• An individual value element of a hash can be

referenced by “subscripting” the hash name with

a key. Braces are used to specify the

subscripting operation.

$genny_age = $kids_ages{"Genny"};

©

4.6 Hashes

• New values are added to a hash by assigning the value of the new element to a reference to the key of the new element, as in this example:

$kids_ages{"Aidan"} = 0;

• An element is removed from a hash with the delete operator:

delete $kids_ages{"Genny"};

• You can determine whether an element is in a hash with the exists operator:

if (exists $kids_ages{"Freddie"}) …

©

4.6 Hashes

• The keys and values can be extracted into

arrays with the operators keys and values,

respectively:

foreach $child (keys %kids_ages) {

print "The age of $child is $kids_ages{$child}\n";

}

@ages = values %kids_ages;

print "All of the ages are: @ages \n";

Example hashes
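The hash operations from this section can be combined into one runnable sketch. It reuses the %kids_ages data from the slides; $has_freddie and $count are illustrative names, and the keys are sorted only to make the output order repeatable.

```perl
use strict;
use warnings;

my %kids_ages = ("John" => 31, "Genny" => 28, "Jake" => 15, "Darcie" => 13);

$kids_ages{"Aidan"} = 0;             # add a new element
delete $kids_ages{"Genny"};          # remove an element

my $has_freddie = exists $kids_ages{"Freddie"} ? "yes" : "no";

# Hash order is not defined, so sort the keys for repeatable output.
foreach my $child (sort keys %kids_ages) {
    print "The age of $child is $kids_ages{$child}\n";
}

my $count = keys %kids_ages;         # number of keys in scalar context
print "$count children, Freddie present: $has_freddie\n";
```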

©

4.6 Hashes

• Perl has a predefined hash named %ENV, which

stores operating system environment variables.

• Environment variables are used to store

information about the system on which Perl is

running.

• The environment variables and their respective

values in %ENV can be accessed by any Perl

program. In Chapter 5, we will make use of

environment variables.

©

4.7 References

• A reference is a scalar variable that references another variable or a literal.

$age = 42;

$ref_age = \$age;

@stooges = ("Curly", "Larry", "Moe");

$ref_stooges = \@stooges;

©

4.7 References

• A reference to a list literal is created by putting the literal value in brackets, as shown:

$ref_salaries = [42500, 29800, 50000, 35250];

©

4.7 References

• A reference to a hash literal is created by putting the literal value in braces:

$ref_age = {

'Curly' => 41,

'Larry' => 38,

'Moe' => 43,

};

©

4.7 References

• A reference (or a pointer) can specify two different values:

– its own, which is an address,

– or the value at that address. Using a reference (or pointer) variable to specify the latter is called dereferencing.

• There are two ways to dereference a reference:

– An extra dollar sign can be added to the beginning of the reference's name.

$$ref_stooges[3] = "Maxine";

– If the reference is to an array or hash, there is a second way to specify dereferencing, using the -> operator between the variable's name and its subscript.

$ref_stooges->[3] = "Maxine";
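Both dereferencing forms can be tried on the array and hash references from the preceding slides. In this sketch the extra name "Shemp" and the variable $moe_age are illustrative additions.

```perl
use strict;
use warnings;

my @stooges = ("Curly", "Larry", "Moe");
my $ref_stooges = \@stooges;

$$ref_stooges[3]  = "Maxine";    # dereference with an extra dollar sign
$ref_stooges->[4] = "Shemp";     # dereference with the -> operator

my $ref_age = { 'Curly' => 41, 'Larry' => 38, 'Moe' => 43 };
my $moe_age = $ref_age->{'Moe'};

print "@stooges\n";   # Curly Larry Moe Maxine Shemp
print "$moe_age\n";   # 43
```

Because $ref_stooges refers to @stooges itself, both assignments change the original array.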

©

4.8 Functions

• Subprograms are central to the usefulness

of any programming language. Perl’s

subprograms are all functions, as in its

heritage language, C. This section

describes the basics of Perl functions.

©

4.8 Functions

• 4.8.1 FUNDAMENTALS

• A function definition includes the function's header and a block of code that describes its actions.

• A function that returns a useful value is called in the position of an operand in an expression.

• A function that does not return an interesting value is called by a standalone statement, which consists of the function's name followed by a parenthesized list of the parameters being sent to the function.

©

4.8 Functions

• A function definition specifies the value to be returned in one of two ways, implicitly or explicitly.

• A function can have any number of calls to return, including none.

• If there are no calls to return in a function, its returned value is the value of the last expression evaluated in the function.

©

4.8 Functions

sub product1 {

return ($first * $second);

}

sub product2 {

$first * $second;

}

©

4.8 Functions

• 4.8.2 LOCAL VARIABLES

• Variables that are implicitly declared have global scope; that is, they are visible in the entire program.

• Variables are declared to have local scope in a function by including their names as parameters to the my function.

my $count=0;

©

4.8 Functions

• 4.8.2 LOCAL VARIABLES

• If more than one variable is declared by a

call to my, they must be placed in

parentheses, as in this example:

my ($count, $sum)=(0,0);

• Notice the use of the list assignment to initialize these local variables.

©

4.8 Functions

• 4.8.2 LOCAL VARIABLES

• If the name of a local variable conflicts with that of a global variable, the local variable is used.

• Perl includes a second kind of local

variables, which are declared with the

local reserved word.
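How a my variable hides a global of the same name can be seen in the sketch below. The function name tally is hypothetical, and the global is declared with our only so that the example runs under use strict; the slides' implicitly declared globals behave the same way in this respect.

```perl
use strict;
use warnings;

our $count = 100;          # a global (package) variable

sub tally {
    my $count = 0;         # local to tally; hides the global $count
    $count++ for 1 .. 3;
    return $count;
}

my $result = tally();
print "$result $count\n";  # 3 100
```

The global $count is untouched because every use of $count inside tally refers to the my variable.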

©

4.8 Functions

• 4.8.3 PARAMETERS

• The parameter values that appear in a call to a

function are called actual parameters.

• The parameter names used in the function,

which correspond to the actual parameters, are

called formal parameters.

• There are two common models of parameter transfers used in the linkage between a function and its caller:

– pass by value

– pass by reference

©

4.8 Functions

• 4.8.3 PARAMETERS

• All parameters are communicated through a special implicit array, @_.

sub plus10{

$_[0]+=10;

}

plus10($a);

©

4.8 Functions

• 4.8.3 PARAMETERS

• Pass-by-value parameters are

implemented by assigning the passed

values in @_ to local variables - for

example:

sub fun1{

my($a,$b)=@_;

++$a*++$b;

}
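The difference between working on @_ directly and copying it into local variables can be checked as follows. This sketch reuses plus10 and fun1 from the slides but renames fun1's locals to $x and $y (an assumption made here to avoid Perl's special sort variables $a and $b); the calling code is illustrative.

```perl
use strict;
use warnings;

sub plus10 {
    $_[0] += 10;            # @_ aliases the actual parameter
}

sub fun1 {
    my ($x, $y) = @_;       # copies: changes stay inside the function
    return ++$x * ++$y;
}

my $n = 5;
plus10($n);                 # $n is now 15

my ($p, $q) = (2, 3);
my $product = fun1($p, $q); # 3 * 4 = 12; $p and $q are unchanged

print "$n $product $p $q\n";   # 15 12 2 3
```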

©

4.8 Functions

• 4.8.3 PARAMETERS

• References to variables can be used as actual parameters, which provides pass-by-reference parameters.

sub sub1 {

my ($ref_len, $ref_list) = @_;

my $count;

for ($count = 0; $count < $$ref_len; $$ref_list[$count++]--) {}

}

sub1(\$len, \@mylist);
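A more explicit version of the same idea is sketched below. The function name decrement_all and the sample data are hypothetical; the decrement is moved into the loop body for readability.

```perl
use strict;
use warnings;

sub decrement_all {
    my ($ref_len, $ref_list) = @_;
    for (my $count = 0; $count < $$ref_len; $count++) {
        $$ref_list[$count]--;     # changes the caller's array
    }
}

my @mylist = (10, 20, 30);
my $len = @mylist;
decrement_all(\$len, \@mylist);
print "@mylist\n";    # 9 19 29
```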

©

4.8 Functions

• 4.8.4 An example

• Page 112- 113

Example tst_median.pl

©

Chapter 4 The basics of Perl

• 4.9 The pack and unpack function

• 4.10 Pattern Matching Using Regular

Expressions

• 4.11 File Input and Output

• 4.12 An Example

• 4.13 Summary

• 4.14 Review Questions

• 4.15 Exercise

©

4.9 The pack and unpack

function

• In some situations, values must be converted, either from numbers to text or vice versa.

• The pack and unpack functions are designed to perform these conversions.

• Both functions take two parameters: a template, which is a character string that specifies the particular conversion, and a list that specifies the data to be packed or unpacked.

©

4.9 The pack and unpack

function

• A description of all of the template

character codes can be found by typing

perldoc -f pack.

• The pack function can be used to build a

text string from an array of numeric values.

• The unpack function can be used to get the numeric codes of the characters in a string and make them the elements of an array, or the values of a list of scalar variables.

©

4.9 The pack and unpack

function

• These particular conversions (a text string to an array of numbers, and a list or array of numbers to a text string) are specified with the "C" template character code as the first parameter to pack or unpack.

• The "C" code specifies an unsigned character. If there is more than one character in the string, the conversion can be specified with either a string of "C"s or with "C" followed by a literal number giving the count you want.

$str[0] = 107;   # character code of 'k'

$str[1] = 99;    # character code of 'c'

$kc_str = pack("CC", @str);

@str2 = unpack("CC", $kc_str);
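The round trip between character codes and text can be checked with a short sketch. The names @codes and @back are illustrative; "C2" is the count form described above, and "C*" (every remaining value) is a standard template form that the slides do not mention.

```perl
use strict;
use warnings;

my @codes  = (107, 99);                 # character codes for 'k' and 'c'
my $kc_str = pack("C2", @codes);        # "kc"
my @back   = unpack("C*", $kc_str);     # (107, 99)

print "$kc_str @back\n";                # kc 107 99
```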

©

4.10 Pattern Matching Using

Regular Expressions

• One of the greatest strengths of Perl, particularly when compared with other common high-level programming languages, is its powerful facility for textual pattern matching.

• Patterns are specified in Perl in a form that

is based on regular expressions, which

were developed to define members of a

simple class of formal languages.

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

The pattern-matching operation is specified with an operator, m.

The pattern itself is delimited by slashes. If slashes are used to delimit the pattern, the m operator is not required.

The string against which the matching is attempted is by default in the implicit variable $_.

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

Within a pattern, "normal" characters match themselves. Normal means that they are not metacharacters, which are characters that have special meanings in some contexts in patterns.

\ | ( ) [ ] { } ^ $ * + ? .

Metacharacters can themselves be matched by being immediately preceded by a backslash.

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

How normal characters are used for pattern matching:

if (/rabbit/) {

print "The word 'rabbit' appears somewhere in $_ \n";

}

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

• The period matches any character except newline.

• /snow./

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

• It is often convenient to be able to specify

classes of characters rather than individual

characters. Such classes are defined by

placing the desired characters in brackets.

• [abc]

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

• Also, you could have the following character class, which matches any lowercase letter from 'a' to 'h':

• [a-h]

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

• If a circumflex character (^) is the first

character in a class, it inverts the specified

set .

• [^aeiou]

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

• In many cases, you'll want to repeat a character or character-class pattern. To repeat a pattern, a numeric quantifier, delimited by braces, is attached.

• /xy{4}z/

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

• Perl also includes three nonnumeric

quantifiers:

– asterisk(*): means zero or more repetitions,

– plus (+): means one or more repetitions,

– question mark (?): means one or none.

• /x*y+z?/

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

• Table 4.5 Predefined Character Classes

Name   Equivalent Pattern   Matches
\d     [0-9]                A digit
\D     [^0-9]               Not a digit
\w     [A-Za-z_0-9]         A word character (alphanumeric)
\W     [^A-Za-z_0-9]        Not a word character
\s     [ \r\t\n\f]          A whitespace character
\S     [^ \r\t\n\f]         Not a whitespace character

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.1 CHARACTER AND CHARACTER-CLASS PATTERNS

• There are two additional named patterns that are often useful.

– \b (boundary) matches the boundary position between a word character and a non-word character, in either order.

– \B is the opposite of \b; it matches a non-word boundary.
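The character classes, quantifiers, and named patterns of 4.10.1 can be tested against a few sample strings. In this sketch the strings and variable names are illustrative, and the binding operator =~ is used to apply each pattern to a specific string rather than to $_.

```perl
use strict;
use warnings;

my $four_ys  = ("xyyyyz"   =~ /xy{4}z/)       ? 1 : 0;   # 1: exactly four y's
my $three_ys = ("xyyyz"    =~ /xy{4}z/)       ? 1 : 0;   # 0: only three y's
my $quant    = ("yyz"      =~ /x*y+z?/)       ? 1 : 0;   # 1: x is optional
my $class    = ("snowy"    =~ /snow[^aeiou]/) ? 1 : 0;   # 1: y is not in the excluded set
my $digits   = ("room 101" =~ /\d\d\d/)       ? 1 : 0;   # 1: three digits in a row
my $word     = ("concat"   =~ /\bcat\b/)      ? 1 : 0;   # 0: cat is not a separate word

print "$four_ys $three_ys $quant $class $digits $word\n";   # 1 0 1 1 1 0
```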

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.2 BINDING OPERATORS

• Sometimes you’ll want to match a pattern

against a string that is not in $_. A string in

any scalar variable can be used by using

the binding operators =~ and !~.

• $string =~ /\d/; # true if $string contains a digit

• $string !~ /\d/; # true if $string contains no digit

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.3 ANCHORS

• It is frequently useful to be able to specify

that a pattern must match at a particular

position in the string.

• The most common example of this is

requiring a pattern to match at one specific

end of the string. A pattern is tied to a

string position with an anchor.

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.3 ANCHORS

• Two anchors tie a pattern to one end of the string:

– A pattern that must match at the beginning of the string begins with a circumflex (^) anchor.

/^pearl/

– A pattern that must match at the end of the string ends with a dollar sign ($) anchor.

/gold$/

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.4 PATTERN MODIFIERS

• Modifiers can be attached to patterns to change how they are used, thereby increasing their flexibility.

• The modifiers are specified as letters just after the right delimiter of the pattern.

– The i modifier makes the letters in the pattern match either uppercase or lowercase letters in the string.

/gold$/i

– The x modifier allows whitespace to appear in the pattern; the whitespace is ignored when matching. e.g., /gold$/x
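The anchors and the i and x modifiers can be compared on a few sample strings. The strings and variable names in this sketch are illustrative.

```perl
use strict;
use warnings;

my $at_start    = ("pearl diver" =~ /^pearl/)      ? 1 : 0;   # 1
my $not_start   = ("a pearl"     =~ /^pearl/)      ? 1 : 0;   # 0: not at the beginning
my $case_sens   = ("Pure GOLD"   =~ /gold$/)       ? 1 : 0;   # 0: case differs
my $case_insens = ("Pure GOLD"   =~ /gold$/i)      ? 1 : 0;   # 1: i ignores case
my $spaced      = ("Pure gold"   =~ / g o l d $/x) ? 1 : 0;   # 1: x ignores the spaces

print "$at_start $not_start $case_sens $case_insens $spaced\n";   # 1 0 0 1 1
```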

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.5 REMEMBERING MATCHES

• The part of the string that matched a part of the pattern can be saved in an implicit variable for later use. The part of the pattern whose match you want to save is placed in parentheses.

• The substring that matched the first

parenthesized part of the pattern is saved

in $1, the second in $2, and so forth.

Example page 118
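A minimal sketch of remembering matches follows. It is not the book's page 118 example; the date string and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $date = "4 July 1776";
my ($day, $month, $year) = ("", "", "");

# Each parenthesized part of the pattern is saved in $1, $2, $3.
if ($date =~ /(\d+) (\w+) (\d+)/) {
    ($day, $month, $year) = ($1, $2, $3);
}

print "$month $day, $year\n";   # July 4, 1776
```

Copying $1, $2, and $3 into named variables right away is a common precaution, since the next successful match overwrites them.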

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.6 SUBSTITUTIONS

• Sometimes the substring of a string that

matched a pattern must be replaced by

another string. Perl’s substitute operator is

designed to do exactly that.

• The typical form of the substitute operator is:

s/Pattern/New_string/

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.6 SUBSTITUTIONS

• The substitute operator can have two

modifiers, g and e. The g modifier tells the

substitute operator to find all matches in

the given string and replace all of them:

$_ = "Fred, Freddie, and Frederica were siblings";

s/Fre/Boy/g;

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.6 SUBSTITUTIONS

• The e modifier tells the substitute operator to evaluate New_string as Perl code and use the result as the replacement.

s/%([\dA-Fa-f])/pack("c", hex($1))/e;

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.6 SUBSTITUTIONS

• The i modifier can also be used with the

substitute operator, as in this code:

$_ = "Is it Rose, rose, or ROSE?";

s/Rose/rose/ig;
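The effect of the g, e, and i modifiers on substitution can be compared in one sketch. The variable names are illustrative, and the e example is a variation on the slide's pattern, assumed here to take two hexadecimal digits so that it decodes the sample text "%25".

```perl
use strict;
use warnings;

my $text = "Fred, Freddie, and Frederica were siblings";

(my $one = $text) =~ s/Fre/Boy/;      # only the first match is replaced
(my $all = $text) =~ s/Fre/Boy/g;     # g replaces every match

my $roses = "Is it Rose, rose, or ROSE?";
$roses =~ s/Rose/rose/ig;             # i ignores case while matching

# e evaluates the replacement as code: %25 becomes the character "%".
my $encoded = "50%25 off";
(my $decoded = $encoded) =~ s/%([\dA-Fa-f]{2})/pack("C", hex($1))/e;

print "$one\n$all\n$roses\n$decoded\n";
```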

©

4.10 Pattern Matching

Using Regular Expressions

• 4.10.7 THE TRANSLITERATE OPERATOR

• Perl has a transliterate operator, tr, which translates a character or character class to another character or character class, respectively.

• tr/;/:/;

• tr/A-Z/a-z/;

• tr/\,\.//;
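The tr operator can be tried on sample strings as sketched below; the strings and variable names are illustrative. One detail the slide does not state: tr returns the number of characters it matched, and with an empty replacement list and no modifier it leaves the string unchanged, so it can be used for counting.

```perl
use strict;
use warnings;

my $text = "Hello; World";
(my $colons = $text) =~ tr/;/:/;        # Hello: World
(my $lower  = $text) =~ tr/A-Z/a-z/;    # hello; world

my $csv   = "a,b.c,d";
my $count = ($csv =~ tr/,.//);          # counts commas and periods: 3

print "$colons\n$lower\n$count $csv\n";
```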

©

4.11 File Input and Output

• Files are referenced through program variables called filehandles, whose names do not begin with special characters.

• open(INDAT, "<temperatures");

• open(OUTDAT, ">averages");

©

4.11 File Input and Output

• Because open can fail, it is often used with

the die function, as in this example:

• open(INDAT, "<temperatures") or die "Error - unable to open temperatures $!";

The predefined variable, $!, has the value of the system variable, errno, which is useful for determining the reason open failed.

©

4.11 File Input and Output

• Table 4.6 File Use Specifications

Character(s)   Meaning
<              Input (the default)
>              Output, starting at the beginning of the file
>>             Output, starting at the end of the existing data on the file

©

4.11 File Input and Output

• One line of text can be written to a file with the print function.

print OUTDAT "The result is : $result \n";

• Notice that no comma is used between the filehandle and the string.

• Lines of text can be read from a file using the line input operator.

$next_line = <INDAT>;
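Writing a line to a file and reading it back can be sketched as follows. To keep the example self-contained, a temporary file from the core File::Temp module stands in for a named data file, and the three-argument form of open is used instead of the two-argument form shown on the slides; both choices are assumptions of this sketch.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for a real data file.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh);

my $result = 42;
open(OUTDAT, ">", $filename) or die "Error - unable to open $filename: $!";
print OUTDAT "The result is: $result\n";    # no comma after the filehandle
close(OUTDAT);

open(INDAT, "<", $filename) or die "Error - unable to open $filename: $!";
my $next_line = <INDAT>;                    # reads one line, newline included
close(INDAT);

chomp($next_line);
print "$next_line\n";    # The result is: 42
```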

©

4.11 File Input and Output

• Multiple lines can be read from a file with the read function.

• read(filehandle, buffer, length [, offset]);

$chars = read(ANIMALS, $buf, 255);

@lines = split /\n/, $buf;

©

4.12 An Example

• Page 123 -125

Example wages.pl

©

Chapter 4 The basics of Perl

• 4.13 Summary

• 4.14 Review Questions

• 4.15 Exercise