Using perl instead of awk in my one-liner
By Robin Smidsrød on Aug 29, 2011 | In Perl, Software Development
I was filtering some output in the shell today, and I reached for my trusty awk '{ print $1 }' to get the first part of a line separated by whitespace.
Then I started to think: why am I using awk for this when I know so much perl, and perl is supposed to be so good at text parsing?
I asked the question on #perl-help on irc.perl.org. A friendly user, daxim, started helping me out and quickly re-educated me on the -l, -a and -F parameters to perl:
$ perl --help
-F/pattern/ split() pattern for -a switch (//'s are optional)
-l[octal] enable line ending processing, specifies line terminator
-a autosplit mode with -n or -p (splits $_ into @F)
-n assume "while (<>) { ... }" loop around program
-p assume loop like -n but print line also, like sed
-e program one line of program (several -e's allowed, omit programfile)
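Since -a splits on whitespace by default (it does a split(' '), which also ignores leading whitespace, much like awk), we can skip -F when we're splitting on whitespace. Adding -l strips the line ending from each input line and appends "\n" to every print statement automatically. That makes the perl equivalent quite similar to the awk method (and easier to memorize):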
$ ls -l | perl -lane 'print $F[0]' # gives you the file modes only from ls
A nice side-effect of this is that perl has a very compact array slice syntax, so if you want to fetch multiple values, you can do something like this:
$ ls -l | perl -lane 'print @F[0,2..3]' # file modes and owner/group information
Just remember that the perl method (@F) starts counting at 0, while the awk method starts counting at 1.
10 comments
ls -l | awk '{ print $1, $3, $4 }'
Awk is worth knowing as well. My advice is don't dismiss awk like others dismiss perl.
Thanks for the feedback! I agree that each tool has its strengths, and I think awk and perl are useful for a lot of things. I'm guessing awk probably has a smaller memory footprint as well.
But I must admit that the use of awk shown above is probably the only one I use on a regular basis. Perl, as the Swiss Army chainsaw it is, I use for most other text parsing/processing I need.
ls -l | perl -lane 'print @F[0,2..3]'
...should probably include some inter-field delimiter on output, e.g.:
ls -l | perl -lane 'print join q( ), @F[0,2..3]'

perldoc perlvar shows that $LIST_SEPARATOR ($") is used to separate values in an array variable inside double-quoted strings. Its default value is a single space. This means you can use this construct to ensure you get a space:
ls -l | perl -lane 'print "@F[0,2..3]"'
ls -l | perl -lane '$sum+=$F[4];print "sum=$sum" if(eof)'
# print the total storage of the directory
As I compare them, perl -lane can beat both sed and awk. By the way, if you don't mind, can you tell me
how to translate this awk to a perl one-liner? I am a newbie at perl:
#!/bin/sh
for i in `cat $TMP/out#$d#$$ |awk '{print $4}' `
do
myloc=`grep $SITEX $TMPLOC/SITE_LOC | awk -F "|" '{print $2}' | sort -u `
cat filena#e#txt | grep $i | awk -v B=$myloc ' {printf#"%-8s %-20s %-25s %-10s %s %s %-30s\n",$1,$2,$3,$4,$5,$6,""B""#}' >> outfile.txt
done
hope you can help me
Kind Regards
Andrez
@Craig: Mostly because I don't know cut as intimately as I know perl. I actually had to look at the man page to be sure I understood it correctly.
@Andrez: Pulling that apart when it obviously relies on several unknown external variables is not easy. Maybe you should try to explain what it does in plain English instead of me trying to understand shell "golf".