
Perl Scripting

Perl module management

find modules using @INC
$ find `perl -le 'print for grep {$_ ne q{.}and -d} @INC'` -name "*.pm"
$ find `perl -e 'print "@INC"'` -name '*.pm' -print  

perl -MCPAN -e shell
cpan[2]> o conf init /proxy/
Your ftp_proxy? [a.b.c.d] ftp://server:21
Your http_proxy? [a.b.c.d] http://server:8080
Your no_proxy? [] <== if left empty it will ask for username and password!!
cpan[3]> reload index # to test connection
cpan[3]> o conf commit

get modules from CPAN
$ perl -MCPAN -eshell
> m /FTP/
> o conf [init]
> o conf commit

# or use export http_proxy instead.
> o conf http_proxy ''
> o conf commit

check if module is installed (read: available)
$ perl -M<some::module> -e '1'

List of standard modules

Generic file editing

Tip: eat your daily "pie"

  -p loop 
  -i inplace_edit[with_backup] 
  -e code

perl -p -i[.backup-extension] -e 's#old_string#new_string#ig' file(s)

Perl column mode

Tip: "lane" will do it for you

  -l strips newlines on input, and adds them on output.
  -a enables auto-split of input into the @F array.
  -n loops over and does not print input.
  -e specifies Perl expressions.

  -F specifies the characters to split on with the -a option (default = spaces).
  -p loops over and prints input.

$ perl -F: -lane 'print $F[0]' /etc/passwd
$ perl -F: -lape '$_ = $F[0]' /etc/passwd

Nicely formed regex

Tip: For nice regex, use /x

$x =~ m/
  (.+?)         # non-greedy: up to the first colon
  :(.+)         # greedy: up to the last hyphen
  -(.+)         # the rest
/x;
print "_ $1 _ $2 _ $3 _ \n";

One liners

Converting text files between Mac OS, UNIX and Windows

EOL termination

  • UNIX: \n
  • Windows: \r\n
  • Mac (classic Mac OS, version 9 and earlier): \r

Windows to Unix
$ perl -p -e 's/\r$//' < winfile.txt > unixfile.txt

Unix to Windows 
$ perl -p -e 's/\n/\r\n/' < unixfile.txt > winfile.txt
$ perl -p -i -e 's/(?<!\r)\n/\r\n/' somefile.txt

Mac to Unix
$ perl -p -e 's/\r/\n/g'  < macfile.txt > unixfile.txt

Unix to Mac
$ perl -p -e 's/\n/\r/g' < unixfile.txt > macfile.txt

Reverse an entire text file (better than tail -r)
perl -e 'print reverse <>'

# Find the position of a string
grep %% f1 | perl -lne '$x=index($_,"%%");print $x;'

# insert "New" after column "2"
cat f1 | perl -pe 's/((?:\d\s){2})(.*)/$1New $2/'

# insert "New" at position 20
cat f1 | perl -pe 'substr($_,20,0) = "New"'

Altering record parsing

Perl uses the -0 option to allow changing the input record separator. Use -00 to operate in paragraph mode, and -0777 to treat the file as a single line. The paragraphs file contains the -0 documentation from perlrun, used in the following example:

$ perl -00 -ne 'print if /special/' paragraphs
The special value 00 will cause Perl to slurp files in paragraph
mode. The value 0777 will cause Perl to slurp files whole because
there is no legal byte with that value.

Reading the entire input file as a single record lets one substitution work across newlines, which would otherwise need a range operator. For example, an s///g expression can eliminate runs of blank lines:

$ cat input

$ perl -0777 -pe 's/\n+/\n/g' input


Scrapbook: mini perls or perls in shells

rm -f f1
vi f1
position=$(grep %% f1 | perl -lne '$x=index($_,"%%");print $x;')
cat f1 | perl -pe 'substr($_,'$position',0)="'"${stringtoinsert}"'";' | grep -v "%%"
rm f1

todo: write completely in Perl; cat file into new script; add feature to delete columns

Changing @INC - where Perl loads its modules

Where does Perl load modules from in its use and require statements? It loads them from directories in a special list called @INC, from files with a .pm extension in those directories. When Perl is installed, @INC is set to a list of directories that includes generic locations for its standard modules, some release-specific directories, and "." (the current directory); these are checked in order each time you do a use or require. Note that since Perl 5.26 "." is no longer in @INC by default.

Some ways to modify @INC

==> You can add to the list in @INC by using the -I command line option:
perl -I /Users/grahamellis/jan06 i2
says "run the perl program i2, additionally checking the jan06 directory for modules"

==> You can add to the list within your program by doing so in a BEGIN block prior to the use statements:
BEGIN {
        push @INC,"/Users/grahamellis/jan06";
}
use demo;
print "hello world";
Rather curiously, use calls are run at compile time not at run time ... but then so are BEGIN blocks ... so you put your manipulation of @INC into one of those to get it to happen early enough.

==> You can add to the beginning of the list by setting the PERL5LIB environment variable prior to running your program:
export PERL5LIB=/Users/grahamellis/jan06
and you can use a colon-separated list for that if you need more than one directory.

==> Edit the perl binary, but this is a dirty hack that does not allow you to add bytes (paths).

==> re-compile perl after "make distclean" but that implies you already compiled Perl yourself before.


Page last modified on January 19, 2016, at 02:49 AM