What grep is and how to use it
This note was prompted by two kinds of posts that have been popping up on Habr lately: "interesting unix commands" and "how I hired a programmer". The commands described there are interesting in places, but rarely useful in practice, and it turns out that we do not really know how to use the tools that actually are useful.
A short lyrical digression:
About three years ago I was asked to interview candidates for a unix sysadmin position. On the two largest freelance marketplaces of the time, eight candidates responded to the vacancy, two of whom were in the top 5 of those marketplaces' rankings. I never require admins to know configs by heart, and I believe any needed software can be learned if there is a willingness to read, logic in one's actions, and the ability to use the system's tools properly. So, to begin with, the candidates were given two small tasks, roughly like these:
- add a cron job that runs at every even hour and at 3 o'clock;
- print the processor information from the file /var/run/dmesg.boot.
To my surprise, none of the candidates handled both questions. Two of them did not even know that grep existed.
So... Summer... Friday... Before the barbecue, let's talk a little about grep.
Knowing the local audience, and to head off needless insinuations, I should note that everything below holds for
This matters because of
First, how we usually grep files.
Using cat:
But why? You can also do it this way:
Or like this (I hate this construct):
For some reason we count the selected lines with wc:
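The habits listed above can be compared side by side. This is only a sketch: the temporary file is a hypothetical stand-in for /var/run/dmesg.boot, and its contents are invented so the commands run anywhere.

```shell
# Hypothetical stand-in for /var/run/dmesg.boot (contents are made up).
log=$(mktemp)
printf '%s\n' 'CPU: sample processor' 'WARNING: first' \
    'real memory = 1024' 'WARNING: second' > "$log"

# Three ways to feed the same file to grep; all give the same result.
via_cat=$(cat "$log" | grep 'CPU:')    # extra process, no benefit
direct=$(grep 'CPU:' "$log")           # grep opens the file itself
redirected=$(grep 'CPU:' < "$log")     # shell redirection to stdin

# Counting matches: piping to wc works, but -c does it in one step.
count_wc=$(grep 'WARNING' "$log" | wc -l)
count_c=$(grep -c 'WARNING' "$log")

printf '%s\n' "$direct" "$count_c"
rm -f "$log"
```

Passing the file name directly also lets grep report which file a match came from once several files are searched.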
Let's make a small test file:
And start searching:
The -w option matches whole words only:
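A minimal sketch of the difference -w makes, assuming a small sample file whose contents are invented for illustration:

```shell
# Sample data (hypothetical): "seven" alone and embedded in longer words.
f=$(mktemp)
printf '%s\n' 'seven eight' 'seventeen' 'twenty seven' 'twentyseven' > "$f"

plain=$(grep -c 'seven' "$f")   # counts every line containing the substring
whole=$(grep -w 'seven' "$f")   # only lines where "seven" is a whole word

printf '%s\n' "$whole"
rm -f "$f"
```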
And what if you need to match the beginning or the end of a word?
Or words standing at the beginning or end of a line?
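Both questions can be sketched with anchors. The \< and \> word anchors are an extension (supported by GNU grep and, as far as I know, by current BSD grep) rather than something POSIX guarantees; ^ and $ are standard. The sample file is invented.

```shell
f=$(mktemp)
printf '%s\n' 'seven eight' 'seventeen' 'twenty seven' 'twentyseven' > "$f"

word_start=$(grep '\<seven' "$f")   # a word begins with "seven"
word_end=$(grep 'seven\>' "$f")     # a word ends with "seven"
line_start=$(grep '^seven' "$f")    # the line begins with "seven"
line_end=$(grep 'seven$' "$f")      # the line ends with "seven"

printf '%s\n' "$word_start"
rm -f "$f"
```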
Want to see the lines around the match?
Only below or only above?
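Context output is controlled by -C (both sides), -A (after), and -B (before). A short sketch on an invented five-line file:

```shell
f=$(mktemp)
printf '%s\n' one two three four five > "$f"

around=$(grep -C 1 'three' "$f")   # one line of context on each side
after=$(grep -A 1 'three' "$f")    # the match plus one line below
before=$(grep -B 1 'three' "$f")   # the match plus one line above

printf '%s\n' "$around"
rm -f "$f"
```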
We can also do this
And the reverse, excluding these
Of course, grep also supports the other basic quantifiers, metacharacters, and other delights of regular expressions
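As one illustration of those basics, here is a sketch with a bracket expression, its negated form, and the * quantifier. The choice of these three features and the sample data are mine, not necessarily what the stripped examples showed.

```shell
f=$(mktemp)
printf '%s\n' twenty1 twenty3 twenty5 twenty7 twenty > "$f"

in_set=$(grep 'twenty[1-4]' "$f")         # next character is 1, 2, 3 or 4
not_in_set=$(grep 'twenty[^1-4]' "$f")    # next character exists and is not 1-4
any_digits=$(grep -c 'twenty[0-9]*$' "$f")  # zero or more digits, then end of line

printf '%s\n' "$in_set" "$not_in_set"
rm -f "$f"
```

Note that the negated set still needs a character to match, so the bare "twenty" line is not selected by it.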
A couple of practical examples:
Select only the lines with an IP:
It works, but this is nicer:
Shall we drop the commented line?
And now select only the IPs themselves
What a nuisance... The commented line is back. This is due to the way patterns are processed. What to do? Like this:
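One way this pitfall can arise, sketched under assumptions: the file is a made-up resolv.conf-style sample, the pattern is a deliberately loose IPv4 matcher (it does not validate octet ranges), and I am assuming the comment filter was placed after the -o stage.

```shell
# Hypothetical resolv.conf-like sample.
conf=$(mktemp)
printf '%s\n' '# nameserver 10.0.0.1' 'search example.com' \
    'nameserver 192.168.1.1' 'nameserver 8.8.8.8' > "$conf"

ip='[0-9]{1,3}(\.[0-9]{1,3}){3}'   # loose IPv4 pattern, ERE syntax

with_comment=$(grep -cE "$ip" "$conf")                  # comment line included
no_comment=$(grep -E "$ip" "$conf" | grep -cv '#')      # comment line dropped

# -o prints only the matched text, so the "#" is already gone
# by the time the second grep looks at the line.
wrong=$(grep -oE "$ip" "$conf" | grep -v '#')

# Filter the comments first, then extract the addresses.
right=$(grep -v '#' "$conf" | grep -oE "$ip")

printf '%s\n' "$right"
rm -f "$conf"
```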
Let's pause here on inverting the search with the -v switch
Suppose we need to run "ps -afx | grep ttyv"
All would be well, but we don't need the line "48798 2 S+ 0:00.00 grep ttyv". Let's use -v
An ugly construct? Let's do a little trick:
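A common form of such a trick is to wrap one letter of the pattern in brackets, as in ps -afx | grep '[t]tyv'. The sketch below simulates the ps output with invented text so it is reproducible; the process lines are hypothetical.

```shell
# Simulated ps output when running: ps -afx | grep ttyv
ps_plain='  123 v0  Is+  0:00.00 /usr/libexec/getty Pc ttyv0
48798  2  S+   0:00.00 grep ttyv'

# Simulated ps output when running: ps -afx | grep '[t]tyv'
ps_trick='  123 v0  Is+  0:00.00 /usr/libexec/getty Pc ttyv0
48798  2  S+   0:00.00 grep [t]tyv'

# The -v way: a second grep removes grep's own entry.
with_v=$(printf '%s\n' "$ps_plain" | grep 'ttyv' | grep -v 'grep')

# The bracket way: the regex [t]tyv matches "ttyv", but the literal
# text "[t]tyv" in grep's own command line does not contain "ttyv".
with_trick=$(printf '%s\n' "$ps_trick" | grep '[t]tyv')

printf '%s\n' "$with_trick"
```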
Also, don't forget about | (OR)
and the same thing, done differently:
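Two portable spellings of OR, as a sketch on invented data: alternation in extended syntax with -E, and several -e patterns. GNU grep also accepts \| in basic syntax, but that is an extension.

```shell
f=$(mktemp)
printf '%s\n' one two three four > "$f"

either_ere=$(grep -E 'one|three' "$f")        # alternation, extended regex
either_e=$(grep -e 'one' -e 'three' "$f")     # two separate patterns

printf '%s\n' "$either_ere"
rm -f "$f"
```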
While many remember that grep supports regular expressions, POSIX character classes tend to be forgotten, and they are sometimes handy too.
Select the lines with uppercase characters:
Hard to see what was found? Let's highlight it:
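A sketch of both steps on invented data. The [[:upper:]] class is POSIX; --color is not part of POSIX, so treat it as an assumption that your grep supports it (GNU and BSD grep do).

```shell
f=$(mktemp)
printf '%s\n' 'all lowercase' '#comment UP' 'twenty seven' > "$f"

upper=$(grep '[[:upper:]]' "$f")   # lines with at least one uppercase letter

# --color=always forces highlighting even when output is not a terminal;
# interactively, --color=auto is the usual choice.
highlighted=$(grep --color=always '[[:upper:]]' "$f")

printf '%s\n' "$upper"
rm -f "$f"
```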
And a couple more tricks as a teaser.
The first is rather academic. In some 15 years I have never used it:
From our test file we need to select the lines containing six or seven or eight:
So far it's simple:
And now only the lines in which six or seven or eight occurs more than once. This feature is called backreferences
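A sketch of both steps, with an invented test file. Backreferences inside -E patterns are an extension rather than POSIX, though GNU and BSD grep accept them; \1 repeats whatever the first group matched.

```shell
f=$(mktemp)
printf '%s\n' 'one two three' 'seven eight one eight three' \
    'sixteen seventeen eighteen seven' 'sixteen seventeen eighteen' > "$f"

# Step 1: any line containing six, seven or eight.
any=$(grep -cE '(six|seven|eight)' "$f")

# Step 2: lines where the same alternative appears again later on.
repeated=$(grep -E '(six|seven|eight).*\1' "$f")

printf '%s\n' "$repeated"
rm -f "$f"
```

The last line of the sample contains all three words, but each only once, so it is not selected.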
The second trick is far more useful. We need to print the lines in which 504 is bounded by tabs on both sides.
Oh, how PCRE support is missed here...
POSIX classes don't save us:
The [CTRL+V][TAB] construct comes to the rescue:
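In a script, where typing [CTRL+V][TAB] is not practical, one equivalent is to put a literal tab into a variable with printf. This is my substitution for the interactive keystroke, shown on invented data:

```shell
f=$(mktemp)
printf 'one 504 one\none\t504\tone\none\t504 one\n' > "$f"

# [[:space:]] also matches ordinary spaces, so it is too loose.
loose=$(grep -c '[[:space:]]504[[:space:]]' "$f")

# A literal tab character, same as typing Ctrl+V Tab at the prompt.
tab=$(printf '\t')
strict=$(grep -c "${tab}504${tab}" "$f")

printf '%s\n' "$loose" "$strict"
rm -f "$f"
```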
What else have I left out? Naturally, grep can search in files and directories, and, naturally, recursively. Let's find the code in the sources where Intel permits the use of third-party SFP modules. Whether it is spelled allow_unsupported_sfp or unsupported_allow_sfp, I don't remember. Never mind, that's grep's problem:
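One way to do this is a recursive search for one half of the name, filtered by the other half. The directory tree and file names below are hypothetical stand-ins for a real source tree:

```shell
# Hypothetical miniature source tree.
src=$(mktemp -d)
mkdir -p "$src/dev"
printf '%s\n' 'static int allow_unsupported_sfp = 0;' > "$src/dev/driver.c"
printf '%s\n' 'int allow_other = 1;' > "$src/dev/other.c"

# -r descends into directories, -n adds line numbers; the second grep
# narrows the hits without needing to know the exact word order.
hits=$(grep -rn 'allow' "$src" | grep 'unsupp')

printf '%s\n' "$hits"
rm -rf "$src"
```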
I hope I haven't worn you out. And this was only the tip of the grep iceberg. Enjoy your reading, and here's to my appetite at the barbecue!
And happy grepping!
Understanding The "grep" Command In Mac OS X
Part XV of this series.
October 4th, 2002
I don’t know why Dudley keeps trying to find himself, I found him years ago.
— Peter Cook
This series is designed to help you learn more about the Mac OS X command line. If you have any questions about what you read here, check out the earlier columns, write back in the comments below, or join us in the Hardcore X! forum.
In the previous column, we learned about regular expressions, and how to use them to search for text in vi. Having such a text-searching tool for the command line would be a valuable addition to Unix; naturally, such a tool exists. It is called grep, and it is the subject of today’s column.
grep allows you to search through your entire system for content within files and, when a file listing is piped into it, for the names of files as well. This is similar to the way Sherlock used to work before Sherlock 3, and the way "Find" works today in Jaguar’s GUI. When you need to find a string of text on your system from the command line, grep is the way to do it. Now, on to how to use it.
The grep command will take a regular expression, as well as a list of files. It will then search through the files and, for each line that is matched by the regular expression, print the line. (Supposedly, the name grep comes from the ed command g/RE/p, or "global/regular expression/print", which does the same thing within the editor. I can neither confirm nor deny this.) If there are no files indicated, grep will read from standard input. Therefore, you can do things like filtering the output of another command through grep, to give a more flexible search.
Notice that the regular expression, .es.*, was enclosed in double quotes. Without them, the shell tries to interpret the pattern itself before grep sees it. [Note: I think that this is because the asterisk and/or period will confuse the tcsh command line, which tries to use them as metacharacters, so you need the quotes. On the other hand, if you want to anchor the regular expression to the end of a line with a dollar sign, it interprets this as a variable $" and chokes. tcsh is quirky with regular expressions, and I haven’t quite figured out everything with it. I know from experience that the Korn shell, ksh, does not suffer from this. On the other hand, ksh is not the default shell, so there y’are.]
You also need quotes if you have spaces in the regular expression. The difference between grep the file and grep "the " file is that the former will match any occurrence of t-h-e, whereas the latter will match only for t-h-e-space. This means that the former will match "I was there" but the latter will not. Remember that the command line ignores extra spaces, collapsing many into one, unless the spaces are quoted.
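Reading from standard input and the effect of quoting can be sketched as follows. The file names and sample text are invented, and the sketch uses single quotes in a Bourne-style shell rather than the tcsh setup the column describes.

```shell
# grep with no file argument filters whatever arrives on standard input.
names=$(printf '%s\n' notes.txt main.c results.csv | grep '.es.*')

# A quoted trailing space is part of the pattern.
f=$(mktemp)
printf '%s\n' 'I was there' 'the end' > "$f"
loose=$(grep -c 'the' "$f")     # matches "there" and "the end"
strict=$(grep -c 'the ' "$f")   # matches only "the end"

printf '%s\n' "$names" "$loose" "$strict"
rm -f "$f"
```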
As you might expect, grep takes the standard regular expression characters of ., *, ^, $, \, and [ ]. Thus, to count the number of blank lines in a file, do:
Thus, we can see that grep ^$ testfile will print all three blank lines. We can use wc and the pipe, |, to build our own tool to count blank lines. Neat, huh?
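The blank-line counter described above might look like this. The contents of the test file are invented, chosen to contain three blank lines:

```shell
testfile=$(mktemp)
printf 'first\n\nsecond\n\n\nthird\n' > "$testfile"

# ^$ matches a line with nothing between its start and its end.
blank_wc=$(grep '^$' "$testfile" | wc -l)

# The same count without the pipe, using grep's own -c option.
blank_c=$(grep -c '^$' "$testfile")

printf '%s\n' "$blank_c"
rm -f "$testfile"
```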
In some Unixes (Unices?), there were two versions of grep, grep and egrep, whose primary difference was that each had slightly different additions to the basic regular expression syntax. In Darwin, and therefore in Mac OS X, the syntaxes (syntaces?) are combined, and using either command will get you the same as using the other. Thus, you can bounce back and forth between them like so many yo-yos (yo-yi?)[*]
One set of regular expression characters available in grep is the \{ \} pair. This allows you to search for a range of occurrences. Suppose you want to look for "to", followed by three to nine characters, followed by an "a". This can be done with the expression to.\{3,9\}a. Again, the quotes are needed here. If you want to match exactly 3, the regular expression is to.\{3\}a. Normally, the \{ \} pair is only available in grep, but in Darwin and Mac OS X it is also available in egrep.
grep’s regular expression syntax is expanded in Mac OS X to include features not seen in the standard definition of grep. In other words, grep will let you do searches that greps on other Unices won’t. For example, you can use the \< \> pair to denote the beginnings and endings of words, just like in vi.
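A sketch of the interval and word-boundary syntax in basic regular expressions. The sample words are invented, and the \< \> anchors are assumed to be available as an extension in the grep you are running.

```shell
f=$(mktemp)
printf '%s\n' 'toga' 'toyota' 'to the area' > "$f"

# "to", then three to nine of any character, then "a".
range=$(grep 'to.\{3,9\}a' "$f")

# "to", then exactly three characters, then "a".
exact=$(grep 'to.\{3\}a' "$f")

# \< and \> pin the match to the start and end of a word.
word=$(printf '%s\n' 'I was there' 'the end' | grep '\<the\>')

printf '%s\n' "$range"
rm -f "$f"
```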
We have seen that the asterisk (*) is used to denote "any number of the thing preceding me." In grep, the plus sign, +, can be used to denote "at least one of the thing preceding me." So, while the regular expression th*e will match te, the, thhe, ..., the regular expression th+e will match the, thhe, thhhe, .... So you can see that h+ is the same as hh*. The plus sign is often used in other utilities’ regular expressions, but is not part of grep on most other systems. Make a note of it, there will be a quiz later.
Another bonus freebie that is thrown our way is the question mark, ?, unless you are British and over 35, in which case it is "a mark of interrogation." grep uses this in regular expressions to denote "zero or one occurrence of the thing before me", or "an optional [whatever is before me]." Therefore, the expression lie?d will match either lied or lid.
Finally, the vertical bar, |, can be used for either/or matching, just like in, you guessed it, vi.
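The three operators can be tried on a small invented word list. The sketch passes -E explicitly, on the assumption that your grep (GNU grep, for instance) only treats unescaped +, ? and | as operators in extended syntax; the ^ and $ anchors are added so each pattern must cover the whole word.

```shell
words=$(printf '%s\n' te the thhe lid lied liked cat dog)

star=$(printf '%s\n' "$words" | grep '^th*e$')             # zero or more h
plus=$(printf '%s\n' "$words" | grep -E '^th+e$')          # one or more h
optional=$(printf '%s\n' "$words" | grep -E '^lie?d$')     # optional e
either=$(printf '%s\n' "$words" | grep -E '^(cat|dog)$')   # either/or

printf '%s\n' "$plus"
```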
grep can take several options; you can see them all via the man page, of course, but I’ve found that the most useful ones are (remember that this works in the grep option format):
-c: "count the lines". Instead of printing all the matched lines, -c merely prints a count of matched lines for each file. Thus the wc trick isn’t needed for one file. (If you pass in a list of files, though, ...)
-e PATTERN: "expression starts here." Using -e will tell grep "What follows is the pattern with which to search." This is very useful when your pattern starts with a '-'. Otherwise, the command line might think that your expression is an option and get confused.
-f FILE: "file holds the expression". -f allows you to store a pattern in a file and tell grep "Yo, use this." I’ve mostly used this when writing scripts that will use the same pattern repeatedly. That way, if I have to change it later, I only have to change it in one place.
-i: "ignore case". -i forces grep to ignore the distinction between uppercase and lowercase. Imagine you need to find matches in a file which may have come from Windows (include shudder here). Now imagine writing a long string of paired uppercase and lowercase letters, on and on. Just use -i instead and save yourself time and pain.
-l: "list files". Instead of printing the matched lines, when you use the -l option, grep will just print a list of the files which contain the expression. This is mostly used when you are searching a directory with a lot of files or when you just want to know which files need (processing, editing, etc).
-n: "number". -n means that before each line of output, grep will print its line number within the file.
-v: "invert". -v instructs grep to print only those lines that don’t match the expression.
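The options above can be exercised together in one sketch. The directory, file names, and log lines are all hypothetical:

```shell
dir=$(mktemp -d)
printf '%s\n' 'Error: disk full' '-verbose flag set' 'all good' > "$dir/a.log"
printf '%s\n' 'nothing here' > "$dir/b.log"
printf '%s\n' 'disk' > "$dir/pattern.txt"

count=$(grep -c 'good' "$dir/a.log")                   # -c: number of matching lines
dash=$(grep -e '-verbose' "$dir/a.log")                # -e: pattern starting with "-"
from_file=$(grep -f "$dir/pattern.txt" "$dir/a.log")   # -f: pattern read from a file
nocase=$(grep -i 'error' "$dir/a.log")                 # -i: case-insensitive
files=$(grep -l 'good' "$dir"/*.log)                   # -l: names of matching files
numbered=$(grep -n 'good' "$dir/a.log")                # -n: prefix with line number
inverted=$(grep -v 'good' "$dir/a.log")                # -v: lines that do not match

printf '%s\n' "$numbered"
rm -rf "$dir"
```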
As you can see, grep is a very powerful tool. It can be used to quickly search files and to filter output on the command line. It does have a couple limitations, though. First, it is no speed demon. Building those regular expressions and parsing a lot of text in a flexible way takes resources, and that takes time. (Admittedly, these days, that isn’t much of an issue, but still, there it is.) Second, consider the following: you are working away, happy as a clam, and the boss says "Cyprian", if your name is Cyprian, "I just got a call from marketing, we need to change the search in all those voodoo scripts you wrote, and we need it in ten minutes."
Now, you know and I know that you can look for the expression and search for it using \ after \ after \. But my lord, and your duke for that matter, who the heck would want to? Do you realize what you would have to look for, and heaven forbid you should make the slightest mistake. If you’re like me, and I know I am, you’d think "Now dash it, there must be an easier way. Surely, in all the history of Unix, someone has had to face just such an emergency and written a grep-like tool to deal with this. Like that Cyprian chap, maybe." Well, Cyprian has come through. It’s called fgrep (often expanded as "fast grep", though the f is also read as "fixed-string"), and it works a lot like grep except it doesn’t take a regular expression.
Where you would normally place a regular expression, just put in a literal string. Originally it was used to be a fast alternative to grep by trading the power and flexibility of regular expressions for speed. As quick as computers are these days, that isn’t an issue, but if you want to find something that contains a literal period or a literal asterisk, it’s the bee’s knees.
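A sketch of literal matching on invented data. It uses grep -F, the POSIX spelling of fgrep, because some newer grep releases print a deprecation warning for the fgrep name.

```shell
f=$(mktemp)
printf '%s\n' 'version 1.5' 'version 115' 'a*b' 'aab' > "$f"

# As a regular expression, the dot matches any character.
regex_dot=$(grep -c '1.5' "$f")

# With -F the pattern is a plain string: only a real period matches.
fixed_dot=$(grep -F -c '1.5' "$f")

# A literal asterisk needs no escaping either.
fixed_star=$(grep -F 'a*b' "$f")

printf '%s\n' "$regex_dot" "$fixed_dot" "$fixed_star"
rm -f "$f"
```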
[*] This joke was borrowed at great embarrassment from Shelley Berman. All young whippersnappers are advised to ask their parents or grandparents.
You are encouraged to send Richard your comments, or to post them below.
Richard Burton is a longtime Unix programmer and a handsome brute. He spends his spare time yelling at the television during Colts and Pacers games, writing politically incorrect short stories, and trying to shoot the neighbor’s cat (not really) nesting in his garage. He can be seen running roughshod over the TMO forums under the alias tbone1.
© All information presented on this site is copyrighted by The Mac Observer except where otherwise noted. No portion of this site may be copied without express written consent. Other sites are invited to link to any aspect of this site provided that all content is presented in its original form and is not placed within another .