Using perl instead of awk in my one-liner
By Robin Smidsrød on Aug 29, 2011 | In Perl, Software Development
I was filtering some output in the shell today, and I reached for my trusty awk '{ print $1 }' to get the first part of a line separated by whitespace.
Then I started to think: why am I using awk for this when I know so much perl, and perl is supposed to be so good at text parsing?
I asked the question on #perl-help on irc.perl.org. A friendly user, daxim, started helping me out and quickly re-educated me on the -l, -a and -F parameters to perl:
$ perl --help
  -F/pattern/       split() pattern for -a switch (//'s are optional)
  -l[octal]         enable line ending processing, specifies line terminator
  -a                autosplit mode with -n or -p (splits $_ into @F)
  -n                assume "while (<>) { ... }" loop around program
  -p                assume loop like -n but print line also, like sed
  -e program        one line of program (several -e's allowed, omit programfile)
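As -a splits on whitespace by default (it behaves like split ' ', so leading whitespace is ignored, just as in awk), we can skip -F when we're splitting on whitespace. Adding -l strips the newline from each input line and appends "\n" to every print statement. That makes the perl equivalent quite similar to the awk method (and easier to memorize):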
$ ls -l | perl -lane 'print $F[0]' # gives you the file modes only from ls
A nice side-effect of using perl is that it has a very compact array slice syntax, so if you want to fetch multiple values, you can do something like this:
$ ls -l | perl -lane 'print @F[0,2..3]' # file modes and owner/group information
Just remember that the perl method (@F) starts counting at 0, while the awk method starts counting at 1.
5 comments
ls -l | awk '{ print $1, $3, $4 }'
Awk is worth knowing as well. My advice is don't dismiss awk like others dismiss perl.
Thanks for the feedback! I agree that each tool has its strengths, and I think awk and perl are useful for a lot of things. I'm guessing awk probably has a smaller memory footprint as well.
But I must admit that this use of awk as shown above is probably the only one I use on a regular basis. Perl, as the Swiss Army chainsaw it is, I use for most other text parsing/processing I need.
ls -l | perl -lane 'print @F[0,2..3]'
...should probably include some inter-field delimiter on output, e.g.:
ls -l | perl -lane 'print join q( ), @F[0,2..3]'

perldoc perlvar shows that $LIST_SEPARATOR ($") is used to separate values in an array variable inside double-quoted strings. Its default value is a single space. This means you can use this construct to ensure you get a space:
ls -l | perl -lane 'print "@F[0,2..3]"'
ls -l | perl -lane '$sum+=$F[4];print "sum=$sum" if(eof)'
# print the total storage of the directory