In this chapter, we will discuss pipes and filters in Unix in detail. You can connect two commands together so that the output of one program becomes the input of the next. Two or more commands connected in this way form a pipe.
To make a pipe, put a vertical bar (|) on the command line between two commands.
A program that takes its input from another program, performs some operation on that input, and writes the result to the standard output is referred to as a filter.
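For example, the following pipelines connect standard utilities with the vertical bar; grep and wc act as filters here (the commands are illustrative sketches, not examples taken from a particular system):

$ who | sort
$ who | grep 'tty' | wc -l

The first pipeline sorts the list of logged-in users; the second counts how many of those lines mention a tty terminal.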
The grep Command
The grep command searches a file or files for lines that have a certain pattern. The syntax is −
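In its most general form (the pattern and file names are placeholders):

$ grep pattern file(s)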
The name «grep» comes from the ed (a Unix line editor) command g/re/p which means “globally search for a regular expression and print all lines containing it”.
A regular expression is some plain text (a word, for example), special characters used for pattern matching, or a combination of both.
The simplest use of grep is to look for a pattern consisting of a single word. It can be used in a pipe so that only those lines of the input files containing a given string are sent to the standard output. If you don’t give grep a filename to read, it reads its standard input; that’s the way all filter programs work −
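For example (ls -l and the month string Aug are only illustrative input):

$ ls -l | grep 'Aug'

Only the lines of the long listing that contain the string Aug reach the standard output.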
There are various options which you can use along with the grep command −
Option   Description
-v       Prints all lines that do not match the pattern.
-n       Prints the matched line and its line number.
-l       Prints only the names of files with matching lines (letter «l»).
-c       Prints only the count of matching lines.
-i       Matches either upper- or lowercase.
Let us now use a regular expression that tells grep to find lines with «carol», followed by zero or more other characters (abbreviated in a regular expression as «.*»), then followed by «Aug» −
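A sketch of such a search over a long directory listing:

$ ls -l | grep 'carol.*Aug'

Only lines that contain «carol» somewhere before «Aug» pass through the filter.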
Here, we are using the -i option to perform a case-insensitive search −
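For example:

$ ls -l | grep -i 'carol.*aug'

With -i, «Aug», «aug», and «AUG» all match.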
The sort Command
The sort command arranges lines of text alphabetically or numerically. The following example sorts the lines in the food file −
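Assuming a plain text file named food exists in the current directory (its contents are not reproduced in this text):

$ sort food

The sorted lines appear on the standard output; the file itself is left unchanged.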
The sort command arranges lines of text alphabetically by default. There are many options that control the sorting −
Option   Description
-n       Sorts numerically (example: 10 will sort after 2); ignores blanks and tabs.
-r       Reverses the order of the sort.
-f       Sorts upper- and lowercase together.
+x       Ignores the first x fields when sorting.
More than two commands may be linked up into a pipe. Taking a previous pipe example using grep, we can further sort the files modified in August by the order of size.
The following pipe consists of the commands ls, grep, and sort −
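A sketch of the pipeline (the +4n form is the historical sort syntax used in this chapter; modern GNU sort spells the same thing -k 5n):

$ ls -l | grep 'Aug' | sort +4n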
This pipe sorts all files in your directory modified in August by the order of size, and prints them on the terminal screen. The sort option +4n skips four fields (fields are separated by blanks) then sorts the lines in numeric order.
The pg and more Commands
A long output would normally zip by you on the screen, but if you run the text through more or use the pg command as a filter, the display stops once the screen is full of text.
Let’s assume that you have a long directory listing. To make it easier to read the sorted listing, pipe the output through more as follows −
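For example, extending the earlier pipeline:

$ ls -l | grep 'Aug' | sort +4n | more

more pauses after each screenful and waits for a keypress before continuing.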
The screen fills with one screenful of text, consisting of lines sorted by order of file size. At the bottom of the screen is the more prompt, where you can type a command to move through the sorted text.
Once you’re done with this screen, you can use any of the commands listed in the discussion of the more program.
Linux Fundamentals. Part IV. Pipes and Commands
Chapter 17. Filters
Commands that were designed to be used together with pipes are called filters. These filters are implemented as very simple programs, each of which performs one specific task extremely efficiently. Because of this, they can be used as building blocks for more complex constructions.
This chapter presents the most commonly used filters. By combining simple commands and filters by means of pipes, elegant solutions can be created.
The cat filter
The tee filter
The grep filter
The cut filter
The tr filter
The wc filter
The sort filter
The uniq filter
The comm filter
The od filter
The sed filter
Pipeline examples
The who | wc pipeline
The who | cut | sort pipeline
The grep | cut pipeline
Practice exercise: filters
1. Save a sorted list of the bash shell users to the file bashusers.txt.
2. Save a sorted list of the currently logged-in users to the file onlineusers.txt.
3. Produce a list of all file names in the /etc directory that contain the string conf.
4. Produce a list of all file names in the /etc directory that contain the string conf, regardless of character case.
5. Examine the output of the /sbin/ifconfig utility. Build a command that prints only the IP addresses and subnet masks.
6. Build a command that removes all non-alphabetic characters from a data stream.
7. Build a command that takes a file and prints each of its words on a separate line.
8. Build a command-line spell checker. (A dictionary can be found in the /usr/share/dict/ directory.)
Correct procedure for completing the practice exercise: filters
1. Save a sorted list of the bash shell users to the file bashusers.txt.
2. Save a sorted list of the currently logged-in users to the file onlineusers.txt.
3. Produce a list of all file names in the /etc directory that contain the string conf.
4. Produce a list of all file names in the /etc directory that contain the string conf, regardless of character case.
5. Examine the output of the /sbin/ifconfig utility. Build a command that prints only the IP addresses and subnet masks.
6. Build a command that removes all non-alphabetic characters from a data stream.
7. Build a command that takes a file and prints each of its words on a separate line.
8. Build a command-line spell checker. (A dictionary can be found in the /usr/share/dict/ directory.)
For question 8 you can also reuse the solution to question 6 to strip the non-alphabetic characters, and add the filter tr -s ' ' to squeeze extra space characters (see the sketch below).
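The answer commands themselves did not survive in this text, so the following one-liners are only a hedged reconstruction of typical solutions. They assume that shell users are listed in /etc/passwd, that who reports logged-in users, that ifconfig labels addresses with "inet", that the word list is /usr/share/dict/words, and that file.txt stands in for any input file; the last line also relies on bash process substitution.

# 1. sorted list of bash users
$ grep bash /etc/passwd | cut -d: -f1 | sort > bashusers.txt
# 2. sorted list of logged-in users
$ who | cut -d' ' -f1 | sort > onlineusers.txt
# 3. file names in /etc containing the string conf
$ ls /etc | grep conf
# 4. the same, ignoring case
$ ls /etc | grep -i conf
# 5. only the lines with IP addresses and subnet masks
$ /sbin/ifconfig | grep 'inet '
# 6. drop everything that is not a letter or whitespace
$ cat file.txt | tr -cd '[:alpha:][:space:]'
# 7. one word per line
$ cat file.txt | tr ' ' '\n'
# 8. a rough spell checker: print words not found in the dictionary
$ tr -cs '[:alpha:]' '\n' < file.txt | tr 'A-Z' 'a-z' | sort -u | comm -23 - <(tr 'A-Z' 'a-z' < /usr/share/dict/words | sort -u)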
12 Useful Commands For Filtering Text for Effective File Operations in Linux
In this article, we will review a number of command line tools that act as filters in Linux. A filter is a program that reads standard input, performs an operation upon it and writes the results to standard output.
For this reason, it can be used to process information in powerful ways such as restructuring output to generate useful reports, modifying text in files and many other system administration tasks.
With that said, below are some of the useful file or text filters in Linux.
1. Awk Command
Awk is a remarkable pattern scanning and processing language; it can be used to build useful filters in Linux. You can start using it by reading through our Awk series, Part 1 to Part 13.
Additionally, read through the awk man page for more info and usage options:
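A hedged sketch: the second command prints each user name and login shell from /etc/passwd, assuming the traditional colon-separated layout (field 1 is the user name, field 7 the shell):

$ man awk
$ awk -F':' '{ print $1, $7 }' /etc/passwd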
2. Sed Command
sed is a powerful stream editor for filtering and transforming text. We have already written two useful articles on sed that you can go through here:
The sed man page covers additional control options and instructions:
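For illustration, a minimal substitution filter (the file name and the Unix-to-Linux replacement are placeholders):

$ man sed
$ sed 's/Unix/Linux/g' file.txt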
3. Grep, Egrep, Fgrep, Rgrep Commands
These filters output lines matching a given pattern. They read lines from a file or standard input, and print all matching lines by default to standard output.
Note: The main program is grep; the variations are equivalent to running grep with specific options, as shown below (and they are still kept for backward compatibility):
Below are some basic grep commands:
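A hedged sketch of the equivalences and a few basic searches (/etc/passwd is used only as a convenient text file):

$ egrep pattern file      # same as grep -E (extended regular expressions)
$ fgrep pattern file      # same as grep -F (fixed strings)
$ rgrep pattern dir/      # same as grep -r (recursive), where provided
$ grep 'root' /etc/passwd
$ grep -i 'ROOT' /etc/passwd
$ grep -c 'bash' /etc/passwd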
4. head Command
head is used to display the first part of a file; it outputs the first 10 lines by default. You can use the -n num flag to specify the number of lines to be displayed:
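For example (/etc/passwd is just a convenient sample file):

$ head /etc/passwd
$ head -n 5 /etc/passwd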
Learn how to use the head command together with the tail and cat commands for effective usage in Linux.
5. tail Command
tail outputs the last parts (10 lines by default) of a file. Use the -n num switch to specify the number of lines to be displayed.
The command below will output the last 5 lines of the specified file:
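A matching sketch, again using /etc/passwd purely as a sample file:

$ tail -n 5 /etc/passwd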
Additionally, tail has a special option -f for watching changes in a file in real-time (especially log files).
The following command will enable you to monitor changes in the specified file:
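A sketch (the log path is illustrative and differs between distributions; press Ctrl+C to stop following):

$ tail -f /var/log/syslog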
Read through the tail man page for a complete list of usage options and instructions:
6. sort Command
sort is used to sort lines of a text file or from standard input.
Suppose we have a file named domains.list containing a handful of domain names, one per line.
You can run a simple sort command to sort the file content like so:
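A hedged sketch with made-up sample content (the original file listing is not preserved in this text):

$ cat domains.list
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
$ sort domains.list
linuxsay.com
news.tecmint.com
tecmint.com
tecmint.com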
You can use the sort command in many ways; go through some of the useful articles on sort as follows:
7. uniq Command
The uniq command is used to report or omit repeated lines; it filters lines from standard input and writes the outcome to standard output.
After running sort on an input stream, you can remove repeated lines with uniq as in the example below.
To indicate the number of occurrences of a line, use the -c option; to ignore differences in case while comparing, include the -i option:
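Continuing the made-up domains.list sketch from the sort section (add -i to ignore case while comparing):

$ sort domains.list | uniq
linuxsay.com
news.tecmint.com
tecmint.com
$ sort domains.list | uniq -c
      1 linuxsay.com
      1 news.tecmint.com
      2 tecmint.com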
Read through the uniq man page for further usage info and flags:
8. fmt Command
fmt is a simple optimal text formatter; it reformats paragraphs of the specified file and prints the result to the standard output.
Suppose the file domain-list.txt holds several domain names on one long line.
To reformat that content into a one-item-per-line list, run fmt with the -w switch, which defines the maximum line width:
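A hedged sketch (the single-line input is made up; -w 1 forces each word onto its own line):

$ cat domain-list.txt
tecmint.com news.tecmint.com linuxsay.com windowsmint.com
$ fmt -w 1 domain-list.txt
tecmint.com
news.tecmint.com
linuxsay.com
windowsmint.com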
9. pr Command
The pr command paginates text files or standard input for printing. For instance, on Debian systems you can list all installed packages with dpkg -l, as in the sketch below.
To organize the list in pages and columns ready for printing, pipe that output into pr as shown after the flag descriptions.
The flags used here are:
--columns defines the number of columns created in the output.
-l specifies page length (default is 66 lines).
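A hedged sketch of both steps (the package list naturally varies; 3 columns and a 20-line page are arbitrary values):

$ dpkg -l
$ dpkg -l | pr --columns 3 -l 20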
10. tr Command
This tool translates or deletes characters from standard input and writes results to standard output.
The syntax for using tr is tr SET1 SET2, with the text to be translated arriving on standard input.
Take a look at the examples below. In the first command, set1 ([:upper:]) describes the case of the input characters (all upper case) and set2 ([:lower:]) describes the case the resulting characters will have. The second example translates in the opposite direction, and the escape sequence \n simply ends the output with a new line:
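A hedged sketch (the sample strings are made up):

$ echo 'WWW.TECMINT.COM' | tr '[:upper:]' '[:lower:]'
www.tecmint.com
$ printf 'tecmint\n' | tr '[:lower:]' '[:upper:]'
TECMINT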
11. more Command
The more command is a useful file perusal filter, designed originally for viewing text on a terminal one screenful at a time. It shows file content in a page-like format, where users can press [Enter] to view more of it.
You can use it to view large files like so:
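For example (dmesg, the kernel ring buffer, is just a convenient source of long output):

$ dmesg | more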
12. less Command
less works much like the more command above, but it offers extra features (such as scrolling backwards) and it is a little faster with large files.
Use it in the same way as more:
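For example (/etc/services is just a conveniently large text file present on most systems):

$ dmesg | less
$ less /etc/services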
Learn Why ‘less’ is Faster Than ‘more’ Command for effective file navigation in Linux.
That's all for now. Do let us know of any useful command line tools not mentioned here that act as text filters in Linux, via the comment section below.
Filters in Linux
Filters are programs that take plain text (either stored in a file or produced by another program) as standard input, transform it into a meaningful format, and then return it as standard output. Linux has a number of filters. Some of the most commonly used ones are explained below:
1. cat: Displays the text of the file line by line.
Syntax:
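A minimal sketch, assuming a small made-up file named pets.txt that is reused in the examples that follow:

$ cat <path/to/file>
$ cat pets.txt
dog
cat
dog
cat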
2. head: Displays the first n lines of the specified text files. If the number of lines is not specified, it prints the first 10 lines by default.
Syntax:
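With the placeholder file from above (-n sets how many lines to show):

$ head -n 2 pets.txt
dog
cat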
3. tail: It works the same way as head, just from the other end of the file: instead of the first n lines, tail returns the last n lines.
Syntax:
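Again with the placeholder file:

$ tail -n 2 pets.txt
dog
cat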
4. sort: Sorts the lines alphabetically by default, but there are many options available to modify the sorting mechanism. Be sure to check out the man page to see everything it can do.
Syntax:
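A minimal sketch:

$ sort pets.txt
cat
cat
dog
dog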
5. uniq: Removes duplicate lines. uniq has a limitation: it can only remove consecutive duplicate lines (although this can be fixed by the use of piping). Assume we have the following data.
Syntax:
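A minimal sketch, reusing the made-up pets.txt data:

$ cat pets.txt
dog
cat
dog
cat
$ uniq pets.txt
dog
cat
dog
cat
$ sort pets.txt | uniq
cat
dog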
You can see that applying uniq doesn’t remove any duplicate lines, because uniq only removes duplicate lines which are together.
When applying uniq to sorted data, it removes the duplicate lines because, after sorting data, duplicate lines come together.
6. wc: The wc command gives the number of lines, words, and characters in the data.
Syntax:
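A minimal sketch against the same placeholder file (the counts refer to that made-up content):

$ wc pets.txt
 4  4 16 pets.txt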
In the example above, wc gives 4 outputs:
number of lines
number of words
number of characters
path
7. grep: grep is used to search for particular information in a text file.
Syntax and the two ways in which we can implement grep are shown in the sketch below.
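A minimal sketch (the search term and file are placeholders):

$ grep <pattern> <filename>
$ grep 'dog' pets.txt
dog
dog
$ cat pets.txt | grep 'dog'
dog
dog

The first form reads the file directly; the second receives the same text on standard input through a pipe.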
8. tac: tac is just the reverse of cat and works the same way, i.e., instead of printing lines 1 through n, it prints lines n through 1.
Syntax:
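Again with the placeholder file:

$ tac pets.txt
cat
dog
cat
dog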
9. sed: sed stands for stream editor. It allows us to apply search-and-replace operations on our data effectively. sed is quite an advanced filter and all its options can be seen on its man page.
Syntax:
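A minimal sketch, assuming a made-up file cartoon.txt that contains the word Scooby:

$ cat cartoon.txt
Scooby is a dog
$ sed 's/Scooby/Scrapy/g' cartoon.txt
Scrapy is a dog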
The expression we have used above is very basic and is of the form ‘s/search/replace/g’
In the example above, we can clearly see that Scooby is replaced by Scrapy.
10. nl: nl is used to number the lines of our text data.
Syntax:
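Again with the placeholder file:

$ nl pets.txt
     1  dog
     2  cat
     3  dog
     4  cat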
It can clearly be seen in the example above that the lines have been numbered.