- How to count lines in a file in UNIX/Linux
- Using “wc -l”
- Using awk
- Using sed
- Using grep
- Some more commands
- How do I count the number of rows and columns in a file using bash?
- 13 Answers
- Count lines in large files
- 14 Answers
- How to count lines in a document?
- 27 Answers
- wc -l does not count lines.
- POSIX-compliant solution
- Grep Count Lines If a String / Word Matches on Linux or Unix System
- Grep Count Lines If a String / Word Matches
- Summing up
How to count lines in a file in UNIX/Linux
Question: I have a file on my Linux system containing a lot of lines. How do I count the total number of lines in the file?
Using “wc -l”
There are several ways to count lines in a file, but one of the easiest and most widely used is “wc -l”. The wc utility displays the number of lines, words, and bytes contained in each input file (or standard input, if no file is specified) to the standard output.
So consider a sample file, say file.txt, with a few lines in it:
1. The “wc -l” command, when run on this file, outputs the line count along with the filename.
2. To omit the filename from the result, redirect the file into wc's standard input.
3. You can always provide another command's output to wc using a pipe, as in the examples shown just below.
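A sketch of those three invocations, assuming the sample file is named file.txt and has 15 lines (both the name and the count are illustrative):
$ wc -l file.txt
15 file.txt
$ wc -l < file.txt
15
$ cat file.txt | wc -l
15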
You can have any command here instead of cat. The output from any command can be piped to wc to count the lines in that output.
Using awk
If you want to use awk to find the line count, use the awk command below:
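awk keeps a running record count in its built-in variable NR; printing it from the END block gives the total number of lines:
$ awk 'END {print NR}' file.txt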
Using sed
Use the following sed syntax to find the line count using GNU sed:
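The = command prints the current line number, -n suppresses the normal output, and the $ address applies = only to the last line, so only the total is printed:
$ sed -n '$=' file.txt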
Using grep
Our good old friend grep can also be used to count the number of lines in a file. These examples are just to show that there are multiple ways to count lines without using “wc -l”, but if asked, I would always use “wc -l” instead of these options, as it is by far the easiest to remember.
With GNU grep, you can use the following syntax:
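The empty pattern matches every line, so the match count equals the line count:
$ grep -c '' file.txt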
Here is another version of the grep command to find the line count:
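This variant anchors on the start of each line instead:
$ grep -c '^' file.txt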
Some more commands
Along with the above commands, it's good to know some rarely used commands to find the line count in a file.
1. Use the nl command (line numbering filter) to get each line numbered. The syntax for the command is:
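Note that by default nl numbers only non-empty lines, so pass -b a (number all body lines) if blank lines should count too:
$ nl -b a file.txt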
This is not so direct a way to get the line count, but you can use awk or sed to extract the count from the last numbered line. For example:
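Taking the number off the last line with tail and awk:
$ nl -b a file.txt | tail -n 1 | awk '{print $1}'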
2. You can also use vi and vim with the command ":set number" to display a number on each line. If the file is very big, press "Shift+G" to jump to the last line and read off the line count.
3. Use the cat command with the -n switch to get each line numbered. Again, you can get the line count from the last line.
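For example, showing only the last numbered line:
$ cat -n file.txt | tail -n 1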
4. You can also use Perl one-liners to find the line count:
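Perl's special variable $. holds the current input line number, so printing it from an END block yields the total:
$ perl -ne 'END { print "$.\n" }' file.txt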
How do I count the number of rows and columns in a file using bash?
Say I have a large file with many rows and many columns. I’d like to find out how many rows and columns I have using bash.
13 Answers
Columns: awk '{print NF}' file | sort -nu | tail -n 1
Use head -n 1 for lowest column count, tail -n 1 for highest column count.
Rows: cat file | wc -l, or wc -l < file for the UUOC (Useless Use Of Cat) crowd.
Alternatively, to count columns, count the separators between columns. I find this to be a good balance of brevity and ease of remembering. Of course, this won't work if your data include the column separator itself.
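For example, with single-space-separated columns (file is a placeholder name):
$ head -n 1 file | grep -o ' ' | wc -l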
This uses head -n1 to grab the first line of the file, grep -o to output each space it finds on a new line, and wc -l to count the number of lines.
EDIT: As Gaurav Tuli points out below, I forgot to mention you have to mentally add 1 to the result, or otherwise script this math.
If your file is big but you are certain that the number of columns remains the same for each row (and you have no heading), you can use, for example, the following to find the number of columns, where FILE is your file name:
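A sketch assuming whitespace-separated fields:
$ head -n 1 FILE | awk '{print NF}'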
To find the number of lines, 'wc -l FILE' will work.
A little twist to kirill_igum's answer: you can easily count the number of columns of any particular row you want, which is why I came to this question, even though the question asks about the whole file. (If your file has the same number of columns in each line, this also still works, of course):
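A sketch for a tab-separated file (file is a placeholder name):
$ head -2 file | tail -1 | tr '\t' '\n' | wc -l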
Gives the number of columns of row 2. Replace 2 with 55, for example, to get it for row 55.
The code above works if your file is separated by tabs, as we tell tr. If your file has another separator, say commas, you can still count your "columns" using the same trick by simply changing the separator character '\t' to ',':
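$ head -2 file | tail -1 | tr ',' '\n' | wc -l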
You can also use pure bash. Note that for very large files (gigabytes in size) you should use awk or wc instead, but for files of a few MB the performance should still be manageable.
If counting the number of columns in the first line is enough, try the following:
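A minimal pure-bash sketch, assuming whitespace-separated fields and FILE as a placeholder name; read splits the first line into an array, and the array's element count is the column count:
$ read -r -a cols < FILE; echo "${#cols[@]}"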
Count lines in large files
I commonly work with text files of around 20 GB in size, and I find myself counting the number of lines in a given file very often.
The way I do it now is just cat fname | wc -l, and it takes very long. Is there a solution that would be much faster?
I work in a high-performance cluster with Hadoop installed. I was wondering if a map-reduce approach could help.
I'd like the solution to be as simple as a one-line run, like the wc -l solution, but I'm not sure how feasible that is.
14 Answers
Try: sed -n '$=' filename
Also, cat is unnecessary: wc -l filename is enough in your present approach.
Your limiting speed factor is the I/O speed of your storage device, so changing between simple newline/pattern-counting programs won't help: the execution-speed difference between those programs is likely to be dwarfed by your much slower disk/storage.
But if you have the same file copied across disks/devices, or the file is distributed among those disks, you can certainly perform the operation in parallel. I don't know specifically about Hadoop, but assuming you can read a 10 GB file from 4 different locations, you can run 4 different line-counting processes, each on one part of the file, and sum their results up:
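A sketch of what those four parallel jobs could look like, assuming the file is named bigfile (the block-size and count arithmetic is explained just below; 2>/dev/null merely silences dd's transfer statistics):
$ dd if=bigfile bs=4k count=655360 2>/dev/null | wc -l &
$ dd if=bigfile bs=4k skip=655360 count=655360 2>/dev/null | wc -l &
$ dd if=bigfile bs=4k skip=1310720 count=655360 2>/dev/null | wc -l &
$ dd if=bigfile bs=4k skip=1966080 count=655360 2>/dev/null | wc -l &
$ wait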
Notice the & at the end of each command line, so all will run in parallel; dd works like cat here, but allows us to specify how many bytes to read (count * bs bytes) and how many to skip at the beginning of the input (skip * bs bytes). It works in blocks, hence the need to specify bs as the block size. In this example, I've partitioned the 10 GB file into 4 equal chunks of 4 KB * 655360 = 2684354560 bytes = 2.5 GB, one given to each job; you may want to set up a script to do this for you based on the size of the file and the number of parallel jobs you will run. You also need to sum the results of the executions, which I haven't done above for lack of shell-scripting ability.
If your filesystem is smart enough to split a big file among many devices, like a RAID or a distributed filesystem, and to automatically parallelize I/O requests that can be parallelized, you can do such a split, running many parallel jobs but using the same file path, and you may still get some speed gain.
EDIT: Another idea that occurred to me: if the lines inside the file all have the same size, you can get the exact number of lines by dividing the size of the file by the size of a line, both in bytes. You can do this almost instantaneously in a single job. If you only have the mean size and don't care about the exact line count but want an estimate, you can do this same operation and get a satisfactory result much faster than the exact operation.
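A sketch of that division, assuming GNU stat and a fixed line length of 100 bytes including the trailing newline (both the file name and the line size are illustrative):
$ echo $(( $(stat -c %s bigfile) / 100 ))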
How to count lines in a document?
I have a file full of lines like these, and I want to know how many lines I actually have.
Is there a way to count them all using Linux commands?
27 Answers
This will output the number of lines in the file, along with its name:
$ wc -l file.txt
Or, to omit the filename from the result, use input redirection:
$ wc -l < file.txt
You can also pipe data to wc:
$ cat file.txt | wc -l
To count all lines, use:
$ grep -c '' file.txt
To filter and count only lines matching a pattern, use:
$ grep -c 'pattern' file.txt
Or use -v to invert the match and count the lines that do not match:
$ grep -vc 'pattern' file.txt
See the grep man page to take a look at the -e, -i and -x arguments.
There are many ways. Using wc is one. Another:
sed -n '$=' file (GNU sed)
The tool wc is the "word counter" in UNIX and UNIX-like operating systems, but you can also use it to count lines in a file by adding the -l option.
wc -l foo will count the number of lines in foo. You can also pipe output from a program like this: ls -l | wc -l, which will tell you how many files are in the current directory (plus one).
If you want to check the total line count of all the files in a directory, you can use find and wc:
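One possible combination (a sketch: concatenate every regular file under the current directory and count the lines of the combined stream):
$ find . -type f -exec cat {} + | wc -l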
wc -l does not count lines.
Yes, this answer may be a bit late to the party, but I haven't seen anyone document a more robust solution in the answers yet.
Contrary to popular belief, POSIX does not require files to end with a newline character at all. Yes, the definition of a POSIX 3.206 Line is as follows:
A sequence of zero or more non-<newline> characters plus a terminating <newline> character.
However, what many people are not aware of is that POSIX also defines POSIX 3.195 Incomplete Line as:
A sequence of one or more non-<newline> characters at the end of the file.
Hence, files without a trailing LF are perfectly POSIX-compliant.
If you choose not to support both kinds of line (terminated and unterminated), your program is not POSIX-compliant.
As an example, let's have a look at a file containing two lines, one of which ends in a newline and one of which does not.
No matter how the file ends, I'm sure you would agree that there are two lines in each case. You figured that out by looking at how many lines have been started, not by looking at how many lines have been terminated. In other words, as per POSIX, these two files both have the same number of lines:
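You can create two such files yourself with printf; the first ends with a newline, the second does not:
$ printf 'foo\nbar\n' > complete.txt
$ printf 'foo\nbar' > incomplete.txt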
The man page is relatively clear about wc counting newlines, with a newline just being a 0x0a character:
-l, --lines print the newline counts
Hence, wc doesn't even attempt to count what you might call a "line". Using wc to count lines can very well lead to miscounts, depending on whether the last line of your input file is terminated.
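Using the two sample files from above, the miscount is easy to demonstrate; wc sees only one newline in the second file, even though it contains two POSIX lines:
$ wc -l complete.txt incomplete.txt
2 complete.txt
1 incomplete.txt
3 total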
POSIX-compliant solution
You can use grep to count lines just as in the example above. This solution is both more robust and precise, and it supports all the different flavors of what a line in your file could be:
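A sketch using GNU grep and the sample files from above; matching the start-of-line anchor counts complete and incomplete lines alike:
$ grep -c '^' complete.txt
2
$ grep -c '^' incomplete.txt
2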
Grep Count Lines If a String / Word Matches on Linux or Unix System
Grep Count Lines If a String / Word Matches
The syntax is as follows on Linux or Unix-like systems:
grep -c 'word-to-search' fileNameHere
For example, search for the word 'vivek' in /etc/passwd and count the lines if the word matches:
$ grep -c vivek /etc/passwd
OR
$ grep -w -c vivek /etc/passwd
Sample output, indicating that the word 'vivek' was found one time:
1
However, with the -v or --invert-match option it will count the non-matching lines. Enter:
$ grep -v -c vivek /etc/passwd
Sample outputs:
- -c : Display only a count of selected lines per FILE
- -v : Select non-matching lines
- --invert-match : Same as above.
Using the grep command to count how many times the words 'vivek' and 'root' are found in the /etc/passwd file on Linux or Unix.
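A sketch of those two checks side by side (note that grep -c counts matching lines, not total occurrences of a word):
$ grep -w -c vivek /etc/passwd
$ grep -w -c root /etc/passwd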
Summing up
We can easily suppress normal grep or egrep command output by passing the -c option. Instead, it will print a count of matching lines for each input file. I would urge you to read the grep command man page to get additional information by typing the following man command:
man grep
man egrep
grep --help
egrep --help