
Unix Power Tools

22.7. lensort: Sort Lines by Length

A nice little script to sort lines from shortest to longest can be handy when you're writing and want to find your big words:

deroff Section 16.9, uniq Section 21.20

% deroff -w report | uniq -d | lensort

Once I used it to sort a list of pathnames:

find Section 9.1

% find adir -type f -print | lensort

The script uses awk (Section 20.10) to print each line's length, followed by the original line. Next, sort sorts the lengths numerically (Section 22.5). Then sed (Section 34.1) strips off the lengths and the spaces and prints the lines:

Go to http://examples.oreilly.com/upt3 for more information on: lensort

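#! /bin/sh
# Print each line's length, a space, and then the line itself
awk 'BEGIN { FS=RS }
{ print length, $0 }' "$@" |
# Sort the lines numerically on the leading length field
# (older versions of sort spelled this key as +0n -1)
sort -k1,1n |
# Remove the length and the space and print each line
sed 's/^[0-9][0-9]* //'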

(Some awks require a semicolon after the first curly bracket -- that is, { FS=RS };.)
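
To see what each stage contributes, it can help to run the pipeline on a few sample lines. The following is only a sketch: the function name lensort_sketch and the sample words are made up for illustration, and it assumes a POSIX sh, awk, sort, and sed. The first command shows the length-prefixed lines that awk hands to sort; the second runs all three stages.

```shell
#!/bin/sh
# lensort_sketch is a hypothetical name; it mirrors the three stages
# described above (prefix with length, sort numerically, strip prefix).
lensort_sketch() {
    awk '{ print length($0), $0 }' "$@" |   # stage 1: "LENGTH line"
    sort -k1,1n |                           # stage 2: order by LENGTH
    sed 's/^[0-9][0-9]* //'                 # stage 3: drop "LENGTH "
}

# Stage 1 alone: each line gains its length as a prefix.
printf '%s\n' 'pear' 'fig' 'banana' | awk '{ print length($0), $0 }'

# All three stages: lines come out shortest first.
printf '%s\n' 'pear' 'fig' 'banana' | lensort_sketch
```

The first command prints 4 pear, 3 fig, and 6 banana; the second prints fig, pear, and banana, one per line.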

Of course, you can also tackle this problem with Perl:

$ perl -lne '$l{$_}=length;END{for(sort{$l{$a}<=>$l{$b}}keys %l){print}}'

This one-line wonder has the side effect of eliminating duplicate lines. If this seems a bit terse, that's because it's meant to be "write-only": a bit of shell magic that you'd use to accomplish a short-term task. If you foresee needing this same procedure in the future, it's better to capture the magic in a script. Scripts also tend to be easier to understand, debug, and expand. The following script does the same thing as the one-liner but a bit more clearly:


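#!/usr/bin/perl

# Map each input line (minus its newline) to its length
my %lines;
while(my $curr_line = <STDIN>){
  chomp $curr_line;
  $lines{$curr_line} = length $curr_line;
}

# Print the lines, shortest first
for my $line (sort{ $lines{$a} <=> $lines{$b} } keys %lines){
  print $line, "\n";
}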

This script reads each line from standard input, removes the newline character, and builds an associative array that maps each whole line to its length in characters. After processing the whole file, the keys of the associative array are sorted in ascending numerical order by each key's value. It is then a simple matter to print each key, which is the line itself. More Perl tricks can be found in Chapter 41.

--JP and JJ


Copyright © 2003 O'Reilly & Associates. All rights reserved.