- 14 Useful Examples of Linux ‘sort’ Command – Part 1
- Linux sort command
- Overview
- Syntax
- Options
- Checking For Sorted Order
- Sorting Multiple Files Using The Output Of find
- Comparing Only Selected Fields Of Data
- Using sort And join Together
- Related commands
14 Useful Examples of Linux ‘sort’ Command – Part 1
sort is a Linux program that prints the lines of its input text files, or the concatenation of all the files given, in sorted order. By default, sort uses blank space as the field separator and the entire line as the sort key. Note that sort does not modify the input files; it only prints the sorted result, unless you redirect that output to a file.
This article takes a closer look at the Linux 'sort' command through 14 practical examples that show how to use it.
1. First, create a text file (tecmint.txt) on which to run the 'sort' examples. Our working directory is '/home/$USER/Desktop/tecmint'.
The '-e' option in the command below enables interpretation of backslash escapes, and \n tells echo to write each string on a new line.
2. Before we start with 'sort', let's look at the contents of the file.
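The commands for steps 1 and 2 are not reproduced here, so the following is only a minimal sketch. It uses printf as a portable stand-in for echo -e (how echo treats -e differs between shells), and the eight words are assumed sample data, not necessarily the original file.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)"

# Assumed sample data: eight words, one per line (\n ends each line).
printf 'computer\nmouse\nLAPTOP\ndata\nRedHat\nlaptop\ndebian\nlaptop\n' > tecmint.txt

# Step 2: inspect the file before sorting.
cat tecmint.txt
```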
3. Now sort the contents of the file using the following command.
Note: The above command doesn't actually sort the contents of the text file; it only shows the sorted output on the terminal.
4. Sort the contents of the file 'tecmint.txt', write the result to a file called 'sorted.txt', and verify the content using the cat command.
5. Now sort the contents of 'tecmint.txt' in reverse order using the '-r' switch and redirect the output to a file called 'reversesorted.txt'. Then check the contents of the newly created file.
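Steps 3 to 5 might look roughly like this, assuming a hypothetical eight-word tecmint.txt. LC_ALL=C is set only to make the ordering reproducible across systems; with it, uppercase letters sort before lowercase ones, which may differ from what your own locale produces.

```shell
cd "$(mktemp -d)"   # scratch directory
printf 'computer\nmouse\nLAPTOP\ndata\nRedHat\nlaptop\ndebian\nlaptop\n' > tecmint.txt
export LC_ALL=C     # byte-wise ordering, for reproducible results

# Step 3: print the sorted lines; tecmint.txt itself is left unchanged.
sort tecmint.txt

# Step 4: save the sorted output and verify it.
sort tecmint.txt > sorted.txt
cat sorted.txt

# Step 5: reverse order with -r, saved to another file.
sort -r tecmint.txt > reversesorted.txt
cat reversesorted.txt
```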
6. We are going to create a new file (lsl.txt) at the same location for more detailed examples and populate it with the output of 'ls -l' for your home directory.
Now we will see examples that sort the contents on the basis of another field rather than the default initial characters.
7. Sort the contents of the file 'lsl.txt' on the basis of the 2nd column (which holds the number of hard links).
Note: The '-n' option in the above example sorts the contents numerically. Use '-n' whenever you sort a file on a column that contains numerical values.
8. Sort the contents of the file 'lsl.txt' on the basis of the 9th column (the names of the files and folders, which is non-numeric).
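A sketch of steps 7 and 8 follows. To keep it reproducible, lsl.txt is filled with a hypothetical 'ls -l' style listing instead of real output, and LC_ALL=C is an added assumption that fixes the ordering.

```shell
cd "$(mktemp -d)"   # scratch directory

# Hypothetical ls -l style listing; real output differs per system.
cat > lsl.txt <<'EOF'
drwxr-xr-x 2 user user 4096 Jan 1 10:00 Music
-rw-r--r-- 1 user user 300 Jan 1 10:00 notes.txt
drwxr-xr-x 2 user user 4096 Jan 1 10:00 Desktop
-rw-r--r-- 1 user user 12 Jan 1 10:00 a.txt
EOF
export LC_ALL=C

# Step 7: numeric sort starting at field 2 (the link count).
sort -n -k2 lsl.txt

# Step 8: text sort starting at field 9 (the file name).
sort -k9 lsl.txt
```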
9. It is not always necessary to run the sort command on a file. We can pipe the output of another command directly into it on the terminal.
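As a sketch of this idea, the listing below is piped straight into sort and ordered by size (field 5 of 'ls -l'). The three file names and their sizes are made up for the illustration.

```shell
cd "$(mktemp -d)"   # scratch directory

# Three hypothetical files of 1, 10 and 100 bytes.
printf 'a' > small
printf '%010d' 0 > medium
printf '%0100d' 0 > large

# No intermediate file: ls writes to the pipe, sort reads from it.
# The "total" line has no field 5, so it counts as 0 and comes first.
ls -l | sort -n -k5
```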
10. Sort the text file tecmint.txt and remove duplicates. Check whether the duplicates have been removed.
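One way to do this is the -u (unique) option, sketched here on the same assumed sample words. Note that 'LAPTOP' and 'laptop' are different strings, so only the repeated lowercase 'laptop' is dropped.

```shell
cd "$(mktemp -d)"   # scratch directory
printf 'computer\nmouse\nLAPTOP\ndata\nRedHat\nlaptop\ndebian\nlaptop\n' > tecmint.txt

# -u keeps one copy of each distinct line; 8 input lines become 7.
LC_ALL=C sort -u tecmint.txt
```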
Rules so far (what we have observed):
- Lines starting with numbers come first in the list unless otherwise specified (-r).
- Lines starting with a lowercase letter come before lines starting with the same letter in uppercase unless otherwise specified (-r). This depends on the locale: under LC_ALL=C, all uppercase letters sort before lowercase ones.
- Contents are listed in dictionary (alphabetical) order unless otherwise specified (-r).
- By default, sort treats each line as a string and orders the lines alphabetically (numbers first; see rule 1) unless otherwise specified.
11. Create a third file 'lsla.txt' at the current location and populate it with the output of the 'ls -lA' command.
Those familiar with the 'ls' command know that 'ls -lA' = 'ls -l' + hidden files, so most of the contents of these two files will be the same.
12. Sort the contents of the two files on standard output in one go.
Notice the repetition of files and folders.
13. Now we can see how to sort, merge and remove duplicates from these two files.
Notice that duplicates have been omitted from the output. You can also save the result by redirecting the output to a new file.
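Steps 12 and 13 can be sketched as below. For brevity the two files hold only hypothetical file names rather than full 'ls -l' lines, and merged.txt is an assumed output name.

```shell
cd "$(mktemp -d)"   # scratch directory

# Hypothetical stand-ins for the two listings; lsla.txt adds one hidden file.
printf 'Desktop\nDocuments\nMusic\n' > lsl.txt
printf '.bashrc\nDesktop\nDocuments\nMusic\n' > lsla.txt
export LC_ALL=C

# Step 12: sort both files as one stream; shared lines appear twice (7 lines).
sort lsl.txt lsla.txt

# Step 13: add -u to drop the repeats (4 lines).
sort -u lsl.txt lsla.txt

# Save the merged, de-duplicated result.
sort -u lsl.txt lsla.txt > merged.txt
```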
14. We may also sort the contents of a file, or the output of a command, on more than one column. Sort the output of the 'ls -l' command on the basis of fields 2 and 5 (numeric) and 9 (non-numeric).
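One possible way to express a multi-column sort is to give each column its own -k key, as sketched here. This is not necessarily the command the original article used; the listing is hypothetical sample data.

```shell
cd "$(mktemp -d)"   # scratch directory

# Hypothetical ls -l style listing; real output differs per system.
cat > lsl.txt <<'EOF'
drwxr-xr-x 2 user user 4096 Jan 1 10:00 Music
-rw-r--r-- 1 user user 300 Jan 1 10:00 notes.txt
drwxr-xr-x 2 user user 4096 Jan 1 10:00 Desktop
-rw-r--r-- 1 user user 12 Jan 1 10:00 a.txt
EOF

# Key 1: field 2 only, numeric. Key 2: field 5 only, numeric.
# Key 3: field 9 only, as text. Later keys only break ties in earlier ones.
LC_ALL=C sort -k2,2n -k5,5n -k9,9 lsl.txt
```

Writing each key as F,F (for example -k2,2) limits the comparison to that single field; a bare -k2 would extend the key to the end of the line.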
That's all for now. In the next article we will cover a few more examples of the 'sort' command in detail.
Linux sort command
sort sorts the contents of a text file, line by line.
Overview
sort is a simple and very useful command which will rearrange the lines in a text file so that they are sorted, numerically and alphabetically. By default, the rules for sorting are:
- Lines starting with a number will appear before lines starting with a letter.
- Lines starting with a letter that appears earlier in the alphabet will appear before lines starting with a letter that appears later in the alphabet.
- Lines starting with a lowercase letter will appear before lines starting with the same letter in uppercase.
The rules for sorting can be changed according to the options you provide to the sort command; these are listed below.
Syntax
Options
-b, --ignore-leading-blanks | Ignore leading blanks. |
-d, --dictionary-order | Consider only blanks and alphanumeric characters. |
-f, --ignore-case | Fold lower case to upper case characters. |
-g, --general-numeric-sort | Compare according to general numerical value. |
-i, --ignore-nonprinting | Consider only printable characters. |
-M, --month-sort | Compare (unknown) |
Note
If you are using the join command in conjunction with sort, be aware that there is a known incompatibility between the two programs unless you define the locale. If you are using join and sort to process the same input, it is highly recommended that you set LC_ALL to C, which will standardize the localization used by all programs.
Checking For Sorted Order
If you just want to check to see if your input file is already sorted, use the -c option:
If your data is unsorted, you will receive an informational message reporting the line number of the first unsorted data, and what the unsorted data is:
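A minimal sketch of the check, using two made-up input files. The exit status is what a script would normally test: 0 means the input is already sorted, non-zero means it is not.

```shell
cd "$(mktemp -d)"   # scratch directory
printf 'apple\nbanana\ncherry\n' > sorted.txt
printf 'banana\napple\ncherry\n' > unsorted.txt
export LC_ALL=C

# Already sorted: sort -c prints nothing and exits with status 0.
sort -c sorted.txt && echo "sorted.txt is in order"

# Not sorted: sort -c reports the first out-of-order line on stderr
# and exits with a non-zero status, captured here.
unsorted_status=0
sort -c unsorted.txt || unsorted_status=$?
echo "sort -c exit status for unsorted.txt: $unsorted_status"
```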
Sorting Multiple Files Using The Output Of find
One useful way to sort data is to sort the contents of multiple files, using the output of the find command. The most reliable (and responsible) way to accomplish this is to have find produce a NUL-terminated file list as its output, and to pipe that output into sort using the --files0-from option.
Normally, find outputs one file on each line; in other words, it inserts a line break after each file name it outputs. For instance, let’s say we have three files named data1.txt, data2.txt, and data3.txt. find can generate a list of these files using the following command:
This command uses the question mark wildcard to match any file that has a single character after the word "data" in its name, ending in the extension ".txt". It produces the following output:
It would be nice if we could use this output to tell the sort command, "sort the data in any files found by find as if they were all one big file." The problem with the standard find output is that, even though it's easy for humans to read, it can cause problems for other programs that need to read it in. File names can include non-standard characters (even line breaks), so in some cases this format will be read incorrectly by another program.
The correct way to format find's output to be used as a file list for another program is to use the -print0 option when running find. This terminates each file name with the NUL character (ASCII character number zero), which is universally illegal to use in file names. This makes things easier for the program reading the file list, since it knows that any time it sees the NUL character, it can be sure it's at the end of a file name.
So, if we run the previous command with the -print0 option at the end, like this:
... it will produce the following output:
You can't see it, but after each file name is a NUL character. This character is non-printable, so it will not appear on your screen, but it's there, and any program you pipe this output to (sort, for example) will see it.
Be careful how you word the find command. It’s important to specify -print0 last; find needs this to be specified after the other options.
Okay, but how do we tell sort to read this file list and sort the contents of all those files?
One way to do it is to pipe the find output to sort, specifying the --files0-from option in the sort command and giving the file as a dash ("-"), which tells sort to read the file list from standard input. Here's what the command will look like:
... and it will output the sorted data of any files located by find that match the pattern data?.txt, as if they were all one file. This is a very powerful feature of sort; give it a try.
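The pipeline described above can be sketched as follows. The contents of the three data files are invented for the illustration, and --files0-from is assumed to be available (it is a GNU coreutils option).

```shell
cd "$(mktemp -d)"   # scratch directory, so find sees only these files

# Hypothetical contents for the three files named in the text.
printf 'pear\napple\n' > data1.txt
printf 'fig\n' > data2.txt
printf 'banana\ncherry\n' > data3.txt

# find emits NUL-terminated names; sort reads that list from stdin ("-")
# and sorts the lines of all listed files as one stream.
merged=$(find . -name 'data?.txt' -print0 | LC_ALL=C sort --files0-from=-)
printf '%s\n' "$merged"
```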
Comparing Only Selected Fields Of Data
Normally, sort decides how to sort lines based on the entire line: it compares every character from the first character in a line, to the last one.
If, on the other hand, you want sort to compare a limited subset of your data, you can specify which fields to compare using the -k option.
For instance, if you have an input file data.txt with the following data:
... and you sort it without any options, like this:
... you will receive the following output:
... as you can see, nothing was changed from the original data ordering, because the numbers at the beginning of each line were already sorted. However, if you want to sort based on the names, you can use the following command:
This command will sort on the second field and ignore the first. (The "k" in "-k" stands for "key": we are defining the "sorting key" used in the comparison.)
Fields are defined as anything separated by whitespace; in this case, an actual space character. Our command above will produce the following output:
... which is sorted by the second field, listing the lines alphabetically by name, and ignoring the numbers in the sorting process.
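A sketch of this comparison, with a hypothetical data.txt of numbered names standing in for the original listing:

```shell
cd "$(mktemp -d)"   # scratch directory

# Hypothetical input: a number, a space, then a name.
printf '01 Joe\n02 Marie\n03 Albert\n04 Dave\n' > data.txt
export LC_ALL=C

# No options: the leading numbers decide, so the order is unchanged.
sort data.txt

# -k 2: compare from the second field onward, so the names decide.
sort -k 2 data.txt
```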
You can also specify a more complex -k option. The complete positional argument looks like this:
... where POS1 is the starting field position, and POS2 is the ending field position. Each field position, in turn, is defined as:
... where F is the field number and C is the character within that field at which to begin the sort comparison.
So, let's say our input file data.txt contains the following data:
... we can sort by seniority if we specify the third field as the sort key:
... this produces the following output:
Or, we can ignore the first three characters of the third field, and sort solely based on title, ignoring seniority:
We can also specify where in the line to stop comparing. If we sort based on only the third-through-fifth characters of the third field of each line, like this:
... sort will see only the same thing on every line: ".De" ... and nothing else. As a result, sort will not see any differences in the lines, and the sorted output will be the same as the original file:
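The key forms discussed above are -k POS1,POS2, with each position written as F.C. The sketch below applies them to an assumed data.txt whose third field is a seniority prefix joined to a title. One assumption is added: -b is passed for the character-offset keys, because without -t or -b GNU sort counts characters in a field starting from the blank that precedes it, which would shift every offset by one.

```shell
cd "$(mktemp -d)"   # scratch directory

# Hypothetical input: number, name, seniority-dot-title.
cat > data.txt <<'EOF'
01 Joe Sr.Designer
02 Marie Jr.Developer
03 Albert Jr.Designer
04 Dave Sr.Developer
EOF
export LC_ALL=C

# Whole third field as the key: Jr. lines sort before Sr. lines.
sort -k 3 data.txt

# Start at character 4 of field 3: skips "Sr." / "Jr.", compares titles.
sort -b -k 3.4 data.txt

# Characters 3 to 5 of field 3 are ".De" on every line, so all keys tie
# and the full-line comparison leaves the original order in place.
sort -b -k 3.3,3.5 data.txt
```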
Using sort And join Together
sort can be especially useful when used in conjunction with the join command. Normally join will join the lines of any two files whose first fields match. Let's say you have two files, file1.txt and file2.txt. file1.txt contains the following text:
... and file2.txt contains the following:
If you'd like to sort these two files and join them, you can do so all in one command if you're using the bash command shell, like this:
Here, each sort command in parentheses is executed, and bash presents its output to join as if it were a file (process substitution): the first sorted output serves as join's first argument and the second as its second. join then joins the sorted contents of both files and gives results similar to the below results.
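A sketch of the combined command, with invented contents for file1.txt and file2.txt. It is wrapped in bash -c so that the <( ) syntax works even when the calling shell is not bash, and LC_ALL=C is set so that sort and join agree on ordering.

```shell
cd "$(mktemp -d)"   # scratch directory

# Hypothetical inputs that share a numeric first field, in unsorted order.
printf '3 tomato\n1 onion\n4 beet\n2 pepper\n' > file1.txt
printf '4 purple\n1 white\n2 green\n3 red\n' > file2.txt

# Each <(sort ...) is replaced by a file-like path holding that sorted
# output; join then matches the lines on their first field.
joined=$(LC_ALL=C bash -c 'join <(sort file1.txt) <(sort file2.txt)')
printf '%s\n' "$joined"
```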
Related commands
comm — Compare two sorted files line by line.
join — Join the lines of two files which share a common field of data.
uniq — Identify, and optionally filter out, repeated lines in a file.