Linux command: Sort

From man page:

-f, --ignore-case
              fold lower case to upper case characters

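This means ‘-f’ sorts case-insensitively. But when we run ‘sort’ without any option, it already seems to sort case-insensitively. (In the transcripts below, the first group of letters is the input typed at the terminal and ended with Ctrl-D; the second group is what sort prints.)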

[root@centos ~]# sort
d
h
D
H
a

a
d
D
h
H

[root@centos ~]# sort -f
d
h
D
H
a

a
d
D
h
H

So what’s ‘-f’ for? Also, from the man page:

 ***  WARNING *** The locale specified by the environment affects sort order.  Set LC_ALL=C to get the traditional sort order that uses
       native byte values.

What does that mean? Let's set LC_ALL=C and sort again:

[root@centos ~]# export LC_ALL=C
[root@centos ~]# sort
d
h
D
H
a

D
H
a
d
h

[root@centos ~]# sort -f
d
h
D
H
a

a
D
d
H
h

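OK, so now we see the difference. With LC_ALL=C, sort compares raw byte values, so every uppercase letter comes before every lowercase one, and ‘-f’ visibly changes the result. By default, GNU sort follows the collation rules of the current locale rather than this traditional byte order, and in this system's locale those rules already place the upper- and lowercase forms of a letter next to each other.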

And if we want to go back to the system's default locale afterwards, remove the override:

[root@centos ~]# unset LC_ALL
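
If the byte order is only needed for one command, a common shell idiom is to put the assignment in front of that command, so nothing has to be exported or unset afterwards. A minimal sketch, feeding the same five letters through printf instead of typing them (the variable names are only for illustration):

```shell
# Same five letters as in the transcripts above, supplied non-interactively.
letters=$(printf 'd\nh\nD\nH\na\n')

# LC_ALL=C applies to this single sort invocation only.
plain=$(printf '%s\n' "$letters" | LC_ALL=C sort)
folded=$(printf '%s\n' "$letters" | LC_ALL=C sort -f)

printf '%s\n' "$plain"    # byte order: D H a d h (one per line)
echo
printf '%s\n' "$folded"   # case folded: a D d H h (one per line)
```

The shell's own LC_ALL is left untouched, so later commands keep using the default locale.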

Some other common options:

-r: reverse order

-n: sort numerically

-b: ignore leading blanks
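
To see what -n and -r change, here is a small sketch on three made-up numbers (the data and variable names are assumptions for illustration; LC_ALL=C is set so the text comparison is predictable):

```shell
# Three illustrative values, one per line.
sizes=$(printf '10\n9\n100\n')

# Default: compared as text, character by character.
as_text=$(printf '%s\n' "$sizes" | LC_ALL=C sort)
# -n: compared by numeric value.
as_number=$(printf '%s\n' "$sizes" | LC_ALL=C sort -n)
# -n combined with -r: numeric, largest first.
descending=$(printf '%s\n' "$sizes" | LC_ALL=C sort -nr)

printf '%s\n' "$as_text"      # 10 100 9 (one per line)
printf '%s\n' "$as_number"    # 9 10 100
printf '%s\n' "$descending"   # 100 10 9
```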

And ‘-k’ is really interesting. Check out these commands:

[root@centos ~]# ls -al | head
total 336
drwxr-x--- 14 root root  4096 Jan 16 11:54 .
drwxr-xr-x 23 root root  4096 May 11  2011 ..
-rw-------  1 root root  1291 Jan 14 03:04 anaconda-ks.cfg
-rw-------  1 root root  1291 Jan 14 02:45 anaconda-ks.cfg_new
-rw-------  1 root root  2124 Jan 16 07:22 .bash_history
-rw-r--r--  1 root root    24 Jan  6  2007 .bash_logout
-rw-r--r--  1 root root   191 Jan  6  2007 .bash_profile
-rw-r--r--  1 root root   176 Jan  6  2007 .bashrc
-rw-r--r--  1 root root   100 Jan  6  2007 .cshrc
[root@centos ~]# ls -al | sort -k5 | head
total 336
-rw-r--r--  1 root root     0 Jan 16 11:47 newpasswd
-rw-r--r--  1 root root   100 Jan  6  2007 .cshrc
-rw-------  1 root root  1291 Jan 14 02:45 anaconda-ks.cfg_new
-rw-------  1 root root  1291 Jan 14 03:04 anaconda-ks.cfg
-rw-r--r--  1 root root   129 Jan  6  2007 .tcshrc
-rw-r--r--  1 root root   176 Jan  6  2007 .bashrc
-rw-r--r--  1 root root  1919 Jan 16 11:54 new_password
-rw-r--r--  1 root root   191 Jan  6  2007 .bash_profile
-rw-------  1 root root  2124 Jan 16 07:22 .bash_history
[root@centos ~]# ls -al | sort -k5n | head
-rw-r--r--  1 root root     0 Jan 16 11:47 newpasswd
total 336
-rw-r--r--  1 root root    24 Jan  6  2007 .bash_logout
-rw-------  1 root root    26 Jan 12 03:10 .dmrc
-rw-r--r--  1 root root    31 Jan 14 05:40 wc
-rw-r--r--  1 root root    31 Jan 14 05:43 file
-rw-r--r--  1 root root    38 Jan 14 04:05 error_mesg
-rw-------  1 root root    41 Jan 16 12:44 .lesshst
-rw-r--r--  1 root root    81 Jan 12 03:10 .gtkrc-1.2-gnome2
-rw-r--r--  1 root root   100 Jan  6  2007 .cshrc

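What I wanted in the previous example was to sort the contents of the current directory by size, which is the 5th column. Plain -k5 compares that column as text, so 1291 lands before 129; adding n (-k5n) compares it numerically and gives the expected order.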
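
The same contrast can be reproduced without depending on what is in the current directory. The sketch below uses four rows adapted from the listing above, with the column padding collapsed to single spaces; the variable names are assumptions for illustration. It also writes the key as -k5,5 so that only the 5th column is compared, rather than everything from the 5th column to the end of the line.

```shell
# Four rows adapted from the listing above (padding collapsed to single spaces).
listing='-rw------- 1 root root 1291 Jan 14 03:04 anaconda-ks.cfg
-rw-r--r-- 1 root root 100 Jan 6 2007 .cshrc
-rw-r--r-- 1 root root 24 Jan 6 2007 .bash_logout
-rw-r--r-- 1 root root 191 Jan 6 2007 .bash_profile'

# Column 5 compared as text: "100" < "1291" < "191" < "24".
by_text=$(printf '%s\n' "$listing" | LC_ALL=C sort -k5,5)
# Column 5 compared as a number: 24 < 100 < 191 < 1291.
by_size=$(printf '%s\n' "$listing" | LC_ALL=C sort -k5,5n)

# Keep only the file names (last column) to make the order easy to read.
names_by_text=$(printf '%s\n' "$by_text" | awk '{print $NF}')
names_by_size=$(printf '%s\n' "$by_size" | awk '{print $NF}')

printf '%s\n' "$names_by_text"
echo
printf '%s\n' "$names_by_size"
```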

Now, let's look at the content of the passwd file:

[root@centos ~]# cat /etc/passwd | head
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
news:x:9:13:news:/etc/news:

I want to sort by the 5th column (the user ID info, also known as the comment or GECOS field). Let's see how -k combines with -t:

[root@centos ~]# cat /etc/passwd | sort -t: -k5 | head
adm:x:3:4:adm:/var/adm:/sbin/nologin
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin
avahi-autoipd:x:100:102:avahi-autoipd:/var/lib/avahi-autoipd:/sbin/nologin
avahi:x:70:70:Avahi daemon:/:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
distcache:x:94:94:Distcache:/:/sbin/nologin
dovecot:x:97:97:dovecot:/usr/libexec/dovecot:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin

We need -t because in the passwd file the column delimiter is ‘:’, not whitespace.
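
As a self-contained variation, the sketch below sorts four entries copied from the passwd listings above, once by the comment field and once by the numeric UID in column 3. Each key is restricted to a single column, and the f modifier folds case so the result does not depend on the locale. The variable names are assumptions for illustration.

```shell
# Four entries copied from the passwd listings above.
entries='root:x:0:0:root:/root:/bin/bash
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin'

# Column 5 only (comment field), ignoring case.
by_comment=$(printf '%s\n' "$entries" | LC_ALL=C sort -t: -k5,5f)
# Column 3 only (UID), compared numerically.
by_uid=$(printf '%s\n' "$entries" | LC_ALL=C sort -t: -k3,3n)

# Show just the login names (column 1) in each order.
logins_by_comment=$(printf '%s\n' "$by_comment" | cut -d: -f1)
logins_by_uid=$(printf '%s\n' "$by_uid" | cut -d: -f1)

printf '%s\n' "$logins_by_comment"   # adm nfsnobody apache root
echo
printf '%s\n' "$logins_by_uid"       # root adm apache nfsnobody
```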

And if you want to sort by multiple columns, one after the other, take a file of IP addresses like the one below:

[root@centos ~]# cat ip_address
192.168.0.25
127.0.0.12
192.168.0.1
127.0.0.3
127.0.0.6
192.168.0.5
[root@centos ~]# sort -t. -k1,1n -k4,4n ip_address
127.0.0.3
127.0.0.6
127.0.0.12
192.168.0.1
192.168.0.5
192.168.0.25

I’ve sorted by the 1st column, then by the 4th column. The -k1,1 notation may look confusing. The two numbers are the start and end positions of the key: -k1,1 means the key starts at the beginning of column 1 and ends at the end of column 1, so only column 1 is compared (a bare -k1 would extend the key to the end of the line). Positions can also point at characters inside a column: -k1.1,1.2 restricts the key to the 1st and 2nd characters of column 1. To see this in action, compare only the first character of the 4th column:

[root@centos ~]# sort -t. -k1,1n -k4.1,4.1n ip_address
127.0.0.12
127.0.0.3
127.0.0.6
192.168.0.1
192.168.0.25
192.168.0.5
[root@centos ~]# sort -t. -k1,1n -k4.1,4.1r ip_address
127.0.0.6
127.0.0.3
127.0.0.12
192.168.0.5
192.168.0.25
192.168.0.1

In the last command, the key on the 4th column (its first character only) is sorted in reverse order, while the 1st column keeps its ascending order.
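
The commands above use keys on the 1st and 4th columns only, which is enough for this file because the middle columns never differ within a group. For a more general list, one key per column is a reasonable approach. A minimal sketch, using the addresses from ip_address plus two made-up ones (192.168.10.2 and 192.168.2.7, added only to make the 3rd column matter):

```shell
# Addresses from ip_address above, plus two illustrative extras.
addresses='192.168.0.25
127.0.0.12
192.168.10.2
192.168.0.1
127.0.0.3
192.168.2.7
127.0.0.6
192.168.0.5'

# One numeric key per column, each restricted to that column.
sorted=$(printf '%s\n' "$addresses" | LC_ALL=C sort -t. -k1,1n -k2,2n -k3,3n -k4,4n)

printf '%s\n' "$sorted"
```

With only -k1,1n -k4,4n, 192.168.10.2 would be placed between 192.168.0.1 and 192.168.0.5, because the 3rd column would not be consulted before the 4th.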

SORT is always fun!

This lab is done on:

[root@centos ~]# lsb_release -a | grep Description ; echo "Linux kernel: `uname -r`"
Description: CentOS release 5.7 (Final)
Linux kernel: 2.6.18-274.18.1.el5

dongthao
