UCLA Academic Technology Services

Commonly Used Unix-like Commands


head [-n] [files]
Print the first few lines (ten by default) of one or more files. It is a shell command and must be issued at the shell prompt.
For usage, use head --help.

Example:

C:\wbin\examples>sh
$head unix_cmd.txt
agrep.exe
ansi2knr.exe
basename.exe
bc.exe
bison.exe
bunzip2.exe
bzip2.exe
bzip2recover.exe
cat.exe
chgrp.exe

Example: Output the first three lines of the ASCII data file fixed.txt to fixed_small.txt.

C:\wbin\examples>more fixed.txt
123456
234445
334566
456577
534345
676767
C:\wbin\examples>sh
$head -3 fixed.txt >fixed_small.txt
C:\wbin\examples>wc -l fixed_small.txt
      3 fixed_small.txt

tail [options] [file]
Print the last ten lines of the named file.
For usage, use tail --help.

Example:

C:\wbin\examples>tail unix_cmd.txt
uuencode.exe
wc.exe
wget.exe
wget.hlp
which.exe
whoami.exe
xargs.exe
yes.exe
zcat.exe
zip.exe
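
The head and tail examples above can be combined to pull out a block of lines from the middle of a file. The following is a minimal sketch, assuming a POSIX-style sh in which mktemp is available; it recreates fixed.txt in a scratch directory and uses the explicit -n option instead of the older -3 shorthand.

```shell
#!/bin/sh
# Work in a scratch directory so no existing file is overwritten.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Recreate the six-line fixed.txt used in the examples above.
printf '%s\n' 123456 234445 334566 456577 534345 676767 > fixed.txt

# First three lines, written with the explicit -n option.
first_three=$(head -n 3 fixed.txt)

# Last two lines.
last_two=$(tail -n 2 fixed.txt)

# Lines 3 through 4: keep the first four lines, then the last two of those.
middle=$(head -n 4 fixed.txt | tail -n 2)

printf '%s\n' "$first_three" "==" "$last_two" "==" "$middle"
```

The pipe in the last step works because head writes to standard output and tail reads from standard input when no file is named.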

wc [options] [files]
Print a character, word, and line count for files.
For usage, use wc --help.

Example 1: Count the number of rows in the file.

C:\wbin\examples>more fixed.txt
123456
234445
334566
456577
534345
676767
C:\wbin\examples>wc -l fixed.txt
      6 fixed.txt

Example 2: Print the length of the longest line.

E:\temp>more odd.txt
1 2 3 4 5 6 7
1 2 3 4 5 6 7 8 9 0
11 22 3 3 444 555 6 77 88
222 444 55
E:\temp>wc -L odd.txt
     25 odd.txt
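
When the counts are needed inside a script rather than on the screen, it helps to get the bare number without the file name. The sketch below assumes a POSIX-style sh with mktemp and awk available. The -L option is an extension that not every wc provides, so the longest line is measured with awk instead; this counts characters and does not expand tabs.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Recreate odd.txt from Example 2.
printf '%s\n' '1 2 3 4 5 6 7' '1 2 3 4 5 6 7 8 9 0' \
    '11 22 3 3 444 555 6 77 88' '222 444 55' > odd.txt

# Reading from standard input makes wc print the count without a file name.
# Some implementations pad the number with spaces, so tr strips them.
rows=$(wc -l < odd.txt | tr -d ' ')

# Stand-in for wc -L: remember the longest line seen so far.
longest=$(awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' odd.txt)

echo "rows=$rows longest=$longest"   # rows=4 longest=25
```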

cut options [files]
Select a list of columns or fields from one or more files. Option -c or -f must be specified.
For usage, use cut --help.

Example 1: Free format

C:\wbin\examples>more grade.txt
john b a
tina b a
C:\wbin\examples>cut -d" " -f2 grade.txt
b
b

Example 2: Fixed format

C:\wbin\examples>more fixed.txt
123456
234445
334566
456577
534345
676767
C:\wbin\examples>cut -c2-3 fixed.txt
23
34
34
56
34
76
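
Both -f and -c accept lists and ranges, so several columns can be selected in one pass. A short sketch, assuming a POSIX-style sh with mktemp; score.txt is recreated with the contents shown elsewhere on this page.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 'john 81 91' 'mark 82 93' 'tina 88 92' > score.txt

# Fields 1 and 3 of a space-delimited file; the delimiter is kept in the output.
name_and_last=$(cut -d ' ' -f 1,3 score.txt)

# Character positions 1-4 and 9-10 of each line (fixed-column layout).
by_column=$(cut -c 1-4,9-10 score.txt)

printf '%s\n' "$name_and_last" "==" "$by_column"
```

The first command prints john 91, mark 93, tina 92; the second prints john91, mark93, tina92, because character ranges are concatenated without a separator.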

join [options] file1 file2
Join the lines of sorted file1 and sorted file2 that share a common join field, by default the first field (merge two files).
For usage, use join --help.

Example 1:

C:\wbin\examples>more score.txt
john 81 91
mark 82 93
tina 88 92
C:\wbin\examples>more grade.txt
john b a
tina b a
C:\wbin\examples>join score.txt grade.txt > final.txt
C:\wbin\examples>more final.txt
john 81 91 b a
tina 88 92 b a

Example 2: Pair each score with its grade.

C:\wbin\examples>join -o 1.1 1.2 2.2 1.3 2.3 score.txt grade.txt
john 81 b 91 a
tina 88 b 92 a
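
The examples above work because score.txt and grade.txt are already sorted by name. The sketch below, which assumes a POSIX-style sh with mktemp, starts from deliberately unsorted copies (the file names score_raw.txt and grade_raw.txt are made up for illustration), sorts them first, and also shows the -a option for keeping unmatched lines.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Deliberately unsorted inputs (hypothetical file names).
printf '%s\n' 'tina 88 92' 'john 81 91' 'mark 82 93' > score_raw.txt
printf '%s\n' 'tina b a' 'john b a' > grade_raw.txt

# join expects both inputs sorted on the join field (field 1 by default).
sort -k 1,1 score_raw.txt > score.txt
sort -k 1,1 grade_raw.txt > grade.txt

# Default: only names present in both files.
matched=$(join score.txt grade.txt)

# -a 1 additionally keeps lines of the first file that have no partner.
with_unmatched=$(join -a 1 score.txt grade.txt)

printf '%s\n' "$matched" "==" "$with_unmatched"
```

With -a 1 the line for mark appears in the output as mark 82 93, without grade columns.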

paste [options] files
Merge corresponding lines of one or more files into vertical columns, separated by tab. 
For usage, use paste --help.

Example: 

C:\wbin\examples>more c1.txt
1
2
3
4
5
6
C:\wbin\examples>more c23.txt
23
34
34
56
34
76
C:\wbin\examples>paste c1.txt c23.txt
1       23
2       34
3       34
4       56
5       34
6       76
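
The tab separator can be replaced with -d, and -s turns a column into a single row. A brief sketch, assuming a POSIX-style sh with mktemp and shortened three-line versions of c1.txt and c23.txt:

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 1 2 3 > c1.txt
printf '%s\n' 23 34 34 > c23.txt

# Use a comma instead of the default tab between columns.
with_comma=$(paste -d ',' c1.txt c23.txt)

# -s pastes the lines of one file side by side (serial mode).
as_row=$(paste -s -d ' ' c1.txt)

printf '%s\n' "$with_comma" "==" "$as_row"
```

The first result is 1,23 then 2,34 then 3,34; the second is the single line 1 2 3.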

cat [options] [file] ...
Concatenate and write files.
For usage, use cat --help.

Example 1: Write a file to the screen (standard out).

C:\>cat file1.txt
a1 a2 a3
1  3  4
2  9  0
3 10  2
4 11  2

Example 2: Concatenating (stacking) two files.

C:\>cat file1.txt
a1 a2 a3
1  3  4
2  9  0
3 10  2
4 11  2
C:\>cat file2.txt
5 12  0
6  9  1
7  8  3
C:\>cat file1.txt file2.txt > whole.txt
C:\>cat whole.txt
a1 a2 a3
1  3  4
2  9  0
3 10  2
4 11  2
5 12  0
6  9  1
7  8  3

Example 3: Stacking multiple files with the same file extension

C:\>cat file1.txt
a1 a2 a3
1  3  4
2  9  0
3 10  2
4 11  2
C:\>cat file2.txt
5 12  0
6  9  1
7  8  3
C:\>cat file3.txt
9  12 7
10 12 0
11 23 34
C:\>cat *.txt
a1 a2 a3
1  3  4
2  9  0
3 10  2
4 11  2
5 12  0
6  9  1
7  8  3
9  12 7
10 12 0
11 23 34
C:\>cat *.txt > big
C:\>cat big
a1 a2 a3
1  3  4
2  9  0
3 10  2
4 11  2
5 12  0
6  9  1
7  8  3
9  12 7
10 12 0
11 23 34

Remark: Notice that the output file big does not have the extension .txt. If it did, it would itself match the wildcard *.txt, and cat could end up reading the very file it is writing, which can loop indefinitely.


sort [options] [files]
Sort the lines of the named files, typically in alphabetical order.
For usage, use sort --help.

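Example 1: Sort numerically on the first field with -n. The header line stays on top because its non-numeric first field compares as zero.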

beta (124) % cat hsbfew.txt
id female race ses schtyp prog read write math science socst
70 0 4 1 1 1 57 52 41 47 57
121 1 4 2 1 3 68 59 53 63 61
86 0 4 3 1 1 44 33 54 58 31
141 0 4 3 1 3 63 44 47 53 56
172 0 4 2 1 2 47 52 57 53 61
113 0 4 2 1 2 44 52 51 63 61
50 0 3 2 1 1 50 59 42 53 61
11 0 1 2 1 2 34 46 45 39 36
84 0 4 2 1 1 63 57 54 58 51
48 0 3 2 1 2 57 55 52 50 51
75 0 4 2 1 3 60 46 51 53 61
60 0 4 2 1 2 57 65 51 63 61
beta (125) % sort -n hsbfew.txt
id female race ses schtyp prog read write math science socst
11 0 1 2 1 2 34 46 45 39 36
48 0 3 2 1 2 57 55 52 50 51
50 0 3 2 1 1 50 59 42 53 61
60 0 4 2 1 2 57 65 51 63 61
70 0 4 1 1 1 57 52 41 47 57
75 0 4 2 1 3 60 46 51 53 61
84 0 4 2 1 1 63 57 54 58 51
86 0 4 3 1 1 44 33 54 58 31
113 0 4 2 1 2 44 52 51 63 61
121 1 4 2 1 3 68 59 53 63 61
141 0 4 3 1 3 63 44 47 53 56
172 0 4 2 1 2 47 52 57 53 61       
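
Sorting on a column other than the first, or in reverse order, can move the header line away from the top. One way around this is to set the header aside and sort only the data lines. The sketch below assumes a POSIX-style sh with mktemp; hsbmini.txt is a made-up three-column extract (id, female, read) of a few rows of hsbfew.txt.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical small extract: id, female, read.
printf '%s\n' 'id female read' '70 0 57' '121 1 68' '86 0 44' '11 0 34' > hsbmini.txt

# Print the header as is, then sort the remaining lines
# numerically on the third field only (-k 3,3n).
sorted=$(
    head -n 1 hsbmini.txt
    tail -n +2 hsbmini.txt | sort -k 3,3n
)

printf '%s\n' "$sorted"
```

Here tail -n +2 means "start at line 2", so it passes everything except the header to sort.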

fgrep [options] [pattern] [files]
Search one or more files for lines that match a literal text-string pattern. Because fgrep does not interpret regular expressions, it can be faster than grep.
For usage, use fgrep --help.

Example 1: Search for lines that contain "john". The -i option ignores the distinction between uppercase and lowercase.

C:\wbin\examples>more score.txt
john 81 91
mark 82 93
tina 88 92
C:\wbin\examples>fgrep "john" score.txt
john 81 91
C:\wbin\examples>fgrep -i "JOHN" score.txt
john 81 91
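
The difference between literal and regular-expression matching only shows up when the pattern contains a special character such as a period. The sketch below assumes a POSIX-style sh with mktemp and uses grep -F, the POSIX spelling of fgrep; dots.txt is a made-up file for illustration.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 'a.b 1' 'axb 2' 'A.B 3' > dots.txt

# Literal match: the period stands for itself.
literal=$(grep -F 'a.b' dots.txt)

# Regular expression: the period matches any single character.
regex=$(grep 'a.b' dots.txt)

# Literal match, ignoring case.
nocase=$(grep -F -i 'a.b' dots.txt)

printf '%s\n' "$literal" "==" "$regex" "==" "$nocase"
```

The literal search finds only a.b 1, the regular expression also finds axb 2, and the case-insensitive literal search finds a.b 1 and A.B 3.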

grep [options] regexp [files]
Search one or more files for lines that match a regular expression regexp.
For usage, use grep --help.

Example 1: Search for lines that contain a particular character.

C:\wbin\examples>more problem.txt
123456
23?445
334566
456x77
534345
676767
C:\wbin\examples>grep "?" problem.txt
23?445

Example 2: Search for lines that contain either of two characters, using a bracket expression.

C:\wbin\examples>grep "[x?]" problem.txt
23?445
456x77
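
When the offending characters are not known in advance, a negated bracket expression can flag every line that contains anything other than a digit. A sketch under the assumption of a POSIX-style sh with mktemp, using the same problem.txt:

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 123456 '23?445' 334566 456x77 534345 676767 > problem.txt

# -n prefixes each matching line with its line number.
bad_with_numbers=$(grep -n '[^0-9]' problem.txt)

# -c prints only the number of matching lines.
bad_count=$(grep -c '[^0-9]' problem.txt)

# -v inverts the match: keep only the lines made up entirely of digits.
clean=$(grep -v '[^0-9]' problem.txt)

printf '%s\n' "$bad_with_numbers" "count=$bad_count" "$clean"
```

The first command reports 2:23?445 and 4:456x77, which tells you where in the file to look.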

gawk [options] -f script_file input_file(s)
gawk [options] 'script' file(s)
A pattern-matching program for processing files, especially files organized into records and fields, such as data tables.
For usage, use gawk --help.

Example 1: Print the maximal record length of a file.

C:\wbin\examples>cat reclength.awk
BEGIN {len = 0}
{
  test = length($0)
  if ( test > len) len = test
}
END {print "The maximal record length is " len}
C:\wbin\examples>gawk -f reclength.awk score.txt
The maximal record length is 10

Example 2: Print the second column of a file. 

C:\wbin\examples>more score.txt
john 81 91
mark 82 93
tina 88 92
C:\wbin\examples>gawk  '{print $2}' score.txt
81
82
88

Example 3: Print the number of fields in each line.

C:\wbin\examples>more temp
     1  70 0 4 1 1 1 57 52 41 47 57
     2  121 1 4 2 1 3 68 59 53 63 61
     3  86 0 4 3 1 1 44 33 54 58 31
     4  141 0 4 3 1 3 63 44 47 53 56
     5  172 0 4 2 1 2 47 52 57 53 61
     6  113 0 4 2 1 2 44 52 51 63 61
     7  50 0 3 2 1 1 50 59 42 53 61
     8  11 0 1 2 1 2 34 46 45 39 36
     9  84 0 4 2 1 1 63 57 54 58 51
    10  48 0 3 2 1 2 57 55 52 50 51
    11  75 0 4 2 1 3 60 46 51 53 61
C:\wbin\examples>gawk '{print NF}'  temp
12
12
12
12
12
12
12
12
12
12
12

Example 4: Delete the first line of a file. The first line of test.txt holds the variable names. It is sometimes useful to delete the first line, or the first several lines.

E:\awk_stuff>more test.txt
a b c
1 2 3
2 3 5
3 5 7
1 2 3
2 3 4
4 5 5
2 4 6
E:\awk_stuff>sh
$ gawk 'NR >1' <test.txt > noname.txt
$ cat noname.txt
1 2 3
2 3 5
3 5 7
1 2 3
2 3 4
4 5 5
2 4 6
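
Beyond printing columns, the same field variables can be used in arithmetic. The following sketch assumes a POSIX-style sh with mktemp and calls the program as awk, since gawk may not be installed under that name on every system; the script text itself would be the same for gawk.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 'john 81 91' 'mark 82 93' 'tina 88 92' > score.txt

# Column totals: accumulate fields 2 and 3, print once at the end.
totals=$(awk '{ s2 += $2; s3 += $3 } END { print s2, s3 }' score.txt)

# Row totals: add the two scores on each line.
per_row=$(awk '{ print $1, $2 + $3 }' score.txt)

printf '%s\n' "$totals" "==" "$per_row"
```

The column totals are 251 and 276; the row totals are 172 for john, 175 for mark and 180 for tina.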

seq [options] [first [increment]] last
Generate a sequence of integers, with a user-selected increment.
For usage, use seq --help.

Example 1: Use the default for first (1) and increment (1).

C:\wbin>seq 5
1
2
3
4
5

Example 2: Count from 0 to 100 in steps of 10.

C:\wbin>seq 0 10 100
0
10
20
30
40
50
60
70
80
90
100
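
A few further uses of seq, sketched under the assumption that a seq with the -w option is installed (seq is not part of POSIX, though the GNU version provides it). The file names generated in the loop are purely illustrative.

```shell
#!/bin/sh
# -w pads with leading zeros so all numbers have equal width.
padded=$(seq -w 8 10)

# A negative increment counts down.
down=$(seq 10 -5 0)

# seq is often used to drive a loop, here to build numbered file names.
names=$(for i in $(seq 1 3); do echo "file$i.txt"; done)

printf '%s\n' "$padded" "==" "$down" "==" "$names"
```

The three results are 08 09 10, then 10 5 0, then file1.txt file2.txt file3.txt, each value on its own line.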

od [options] [file]
Octal dump; produce a dump (normally octal) of the named file.
For usage, use od --help.
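
A dump is a convenient way to see characters that do not show up on the screen, such as a carriage return at the end of a line. The sketch below assumes a POSIX-style sh and feeds od a short made-up string instead of a file; -c prints characters with escapes, and -An -tx1 prints plain hexadecimal bytes without the offset column.

```shell
#!/bin/sh
# Character dump: the carriage return appears as \r, the newline as \n.
dump=$(printf 'a,b\r\n' | od -c)
echo "$dump"

# Count the dump lines that show a carriage return.
has_cr=$(printf '%s\n' "$dump" | grep -c '\\r')

# Hexadecimal bytes only; tr removes the spacing that od inserts.
hex=$(printf 'a,b\r\n' | od -An -tx1 | tr -d ' \n')
echo "$hex"   # 612c620d0a
```

In the hexadecimal output, 0d is the carriage return and 0a is the newline.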


sum [option] file
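Calculate and print a checksum and the number of blocks for file. The block size depends on the implementation and algorithm; GNU sum, for instance, counts 1024-byte blocks by default and 512-byte blocks with -s.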
For usage, use sum --help.

Example:

beta (119) % sum hsb2.sas7bdat
27863    25 hsb2.sas7bdat 

fold [options] [files]
Break the lines of the named files so that they are no wider than the specified width.
For usage, use fold --help.

Example: 

C:\wbin\examples>more score.txt
john 81 91
mark 82 93
tina 88 92

C:\wbin\examples>fold -w 5 score.txt
john
81 91
mark
82 93
tina
88 92

dd [option=value] ...
Make a copy of an input file (if=) using the specified conditions, and send the results to the output file (of=), or to standard output if of= is not specified.
For usage, use dd --help.

Example: Convert an input file to all uppercase.

D:\temp>more score.txt
john 81 91
mark 82 93
tina 88 92
D:\temp>dd if=score.txt of=score_up.txt conv=ucase
0+1 records in
0+1 records out
D:\temp>more score_up.txt
JOHN 81 91
MARK 82 93
TINA 88 92

tr [options] string1 [string2]
Substitute each character in string1 with the corresponding character in string2, or delete the characters in string1.
For usage, use tr --help.

Example 1: Change lowercase to uppercase in a file:

D:\temp>more score.txt
john 81 91
mark 82 93
tina 88 92
D:\temp>tr '[a-z]' '[A-Z]' < score.txt > score1.txt
D:\temp>more score1.txt
JOHN 81 91
MARK 82 93
TINA 88 92

Example 2: Delete the ^M (carriage return) character at the end of each line:

Let's say we have a file called test.csv that has an extra character ^M at the end of each line as illustrated below.

18,307,130,3504,12,70,1,8,0^M
15,350,165,3693,12,70,1,8,0^M
18,318,150,3436,11,70,1,8,0^M
16,304,150,3433,12,70,1,8,0^M
17,302,140,3449,11,70,1,8,0^M
15,429,198,4341,10,70,1,8,0^M

We can use the tr command as follows.

tr -d "\015" <test.csv > test1.csv

The new file test1.csv will look like this:

18,307,130,3504,12,70,1,8,0
15,350,165,3693,12,70,1,8,0
18,318,150,3436,11,70,1,8,0
16,304,150,3433,12,70,1,8,0
17,302,140,3449,11,70,1,8,0
15,429,198,4341,10,70,1,8,0
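
To confirm that the carriage returns are really gone, the byte counts before and after can be compared: each line should lose exactly one byte. The sketch below assumes a POSIX-style sh with mktemp; crlf.csv and unix.csv are made-up two-line files, and the character classes [:lower:] and [:upper:] are shown as an alternative way of writing the case conversion from Example 1.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Two lines, each ending in carriage return plus newline (hypothetical data).
printf 'John,1\r\nTina,2\r\n' > crlf.csv

# '\r' is the same character as octal \015.
tr -d '\r' < crlf.csv > unix.csv

# Byte counts before and after; two lines, so two bytes fewer.
before=$(wc -c < crlf.csv | tr -d ' ')
after=$(wc -c < unix.csv | tr -d ' ')

# Case conversion written with character classes.
upper=$(tr '[:lower:]' '[:upper:]' < unix.csv)

echo "before=$before after=$after"   # before=16 after=14
printf '%s\n' "$upper"
```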

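for %variable in (set) do command [command-parameters]
Run a specified command for each file in a set of files. Unlike the commands above, for is built into the Windows command prompt (cmd.exe); it is not one of the Unix utilities.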
For usage, use for /?.

Example 1: Add the extension .txt to every file in a directory.

C:\temp\toshow>ls
myfile_1  myfile_2  myfile_3  myfile_4  myfile_5  myfile_6  myfile_7

C:\temp\toshow>for %v in (*) do rename %v %v.txt

C:\temp\toshow>ls
myfile_1.txt  myfile_3.txt  myfile_5.txt  myfile_7.txt
myfile_2.txt  myfile_4.txt  myfile_6.txt
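
Inside the sh prompt used elsewhere on this page, the same renaming can be written with the shell's own for loop. This is a minimal sketch, assuming a POSIX-style sh with mktemp, touch and mv available; it creates three empty files in a scratch directory so that nothing real is renamed.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Three empty files standing in for the ones listed above.
touch myfile_1 myfile_2 myfile_3

# The pattern is expanded once, before the loop starts,
# so files that have already been renamed are not visited again.
for f in myfile_*; do
    mv "$f" "$f.txt"
done

listing=$(ls)
printf '%s\n' "$listing"
```

Quoting "$f" keeps the loop safe for file names that contain spaces.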


Notes:
1) The Unix utilities for Windows can be downloaded from http://unxutils.sourceforge.net/
2) For more complete documentation, see http://www.gnu.org/manual/manual.html


These pages are Copyrighted (c) by UCLA Academic Technology Services