How to Split a File in Linux

Contents
  1. 9 Useful Examples of the Split Command in Linux
  2. Examples of Split command in Linux
  3. 1. Split files into multiple files
  4. 2. Split files into multiple files with specific line numbers
  5. 3. Split the files into n number of files
  6. 4. Split files with custom name prefix
  7. 5. Split and Specify Suffix Length
  8. 6. Split with numeric order suffix
  9. 7. Append hex suffixes to split files
  10. 8. Split files into multiple files of specific size
  11. 9. Split files into multiple files of ‘At Most’ size n with
  12. Bonus Tip: Rejoining split files
  13. How to Split Large Text File into Smaller Files in Linux
  14. Split
  15. Split Examples
  16. Split a file into multiple pieces by default usage
  17. Split the file, based upon the number of lines
  18. Split a large file into 500MB files
  19. Split a large file into 200MB files with the given prefix
  20. Split the file and name it with numbers
  21. Csplit
  22. Csplit Examples
  23. Split files based on the number of lines
  24. Split files using regular expressions
  25. Split files with the given prefix
  26. Split a file by suppressing a line that matches the input pattern
  27. Customize the number of digits in the output files names
  28. Forcing csplit to save the output file in case of error
  29. Wrapping up
  30. 11 Useful Split Command Examples for Linux Systems
  31. Example: 1) Split File into Pieces
  32. Example: 2) Split Command with verbose option
  33. Example: 3) Split files with customize line numbers (-l)
  34. Example: 4) Split files with file size using option -b
  35. Example: 5) Create Split files with numeric suffix instead of alphabetic (-d)
  36. Example: 6) Split file with Customize Suffix
  37. Example: 7) Generate n chunks output files with split command (-n)
  38. Example: 8) Prevent Zero Size Split output files with option (-e)
  39. Example: 9) Create Split output files of customize suffix length (-a option)
  40. Example: 10) Split ISO file and merge it into a single file.
  41. Example: 11) Verify the Integrity of Merge file using md5sum utility

9 Useful Examples of the Split Command in Linux

To help you learn about the split command I am using a relatively large text file containing 17170 lines and 1.4 MB in size. You can download a copy of this file from the GitHub link.

Note that I will not directly display output in these examples because of the large file sizes. I will use the ll and wc commands to highlight file changes.

I advise you to have a quick look at the wc command to understand the output of the split command examples.

Examples of Split command in Linux

This is the syntax of the Split command:

Let’s see how to use it to split files in Linux.

1. Split files into multiple files

By default, the split command starts a new file every 1000 lines. If no prefix is specified, it uses ‘x’. The letters that follow enumerate the files, so xaa comes first, then xab, and so on.

Let’s split the sample log file:

If you use the ls command, you can see multiple new files in your directory.

You can use wc to quickly check the line counts after splitting.

Remember from earlier that we saw our initial file had 17,170 lines. So we can see our program has done as expected by creating 18 new files. 17 of them are filled with 1000 lines each, and the last one has the remaining 170 lines.
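A minimal sketch of this default run, assuming GNU coreutils and a stand-in file generated with seq (the name sample.log and its contents are placeholders, not the author's log file):

```shell
# Work in a scratch directory so the pieces do not clutter anything else.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the sample log: 17170 numbered lines (hypothetical content).
seq 1 17170 > sample.log

# Default split: 1000 lines per piece, prefix "x", suffixes aa, ab, ...
split sample.log

ls x*       # xaa through xar
wc -l x*    # 17 pieces of 1000 lines, then xar with the remaining 170
```

Because the pieces start with x and sample.log does not, the x* glob picks up only the split output.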

Another way to see what is happening is to run the command with the verbose option. If you’re unfamiliar with verbose, you are missing out! It provides more detailed feedback about what your system is doing, and it is available with many commands.

You can see what’s going on with your command on the display:
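As a hedged sketch, assuming GNU split and the same kind of seq-generated stand-in file (sample.log and split_report.txt are hypothetical names), the verbose run could look like this:

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > sample.log   # hypothetical stand-in file

# --verbose prints one diagnostic line for each output file it creates.
# Saving the report to a file makes it easy to inspect afterwards.
split --verbose sample.log > split_report.txt

head -n 3 split_report.txt   # the first few "creating file" messages
```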

2. Split files into multiple files with specific line numbers

You might not want files of 1000 lines each. You can change this behavior with the -l option.

When this is added, you can now specify how many lines you want in each of the new files.

As you can guess, now the split files have 500 lines each, except the last one.

Now you have many more files, but with half as many lines in each one.
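A small sketch of the 500-line split, assuming GNU split and a seq-generated stand-in with the same line count as the article's file (sample.log is a placeholder name):

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > sample.log   # hypothetical stand-in file

# -l sets the number of lines per output file.
split -l 500 sample.log

ls x* | wc -l    # 35 pieces instead of 18
wc -l xaa xbi    # the first piece has 500 lines, the last one 170
```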

3. Split the files into n number of files

The -n option makes splitting into a designated number of pieces or chunks easy. You can assign how many files you want by adding an integer value after -n.

Now you can see that there are 15 new files.
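The 15-piece split can be sketched as follows, assuming GNU split (the -n option is a GNU extension) and a hypothetical stand-in file. Note that plain -n divides the file by bytes, so a piece may end in the middle of a line; GNU split also accepts -n l/15 to keep lines whole.

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > sample.log   # hypothetical stand-in file

# Ask for exactly 15 output files of roughly equal byte size.
split -n 15 sample.log

ls x* | wc -l    # 15

# The pieces still add up to the original file.
cat x* | cmp - sample.log && echo "pieces rejoin to the original"
```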

4. Split files with custom name prefix

What if you want to use split but keep the original name of your file, or make a new name altogether, instead of using ‘x’?

You may remember seeing the prefix as part of the syntax described in the beginning of the article. You can write your own custom file name after the source file.

Here are the split files with names starting with the given prefix.
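A minimal sketch, assuming a stand-in file called sample.log and the hypothetical prefix sample.log_ (any string works as a prefix):

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > sample.log   # hypothetical stand-in file

# The second argument replaces the default "x" prefix.
split sample.log sample.log_

ls sample.log_*    # sample.log_aa, sample.log_ab, ...
```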

5. Split and Specify Suffix Length

Split uses a default suffix length of 2 [aa, ab, etc.]. This grows automatically as the number of files increases, but you can also set it manually. Let’s say you want your files to be named something like someSeparatedLogFiles.log_aaaab.

How can you do this? The option -a allows us to specify the length of the suffix.

And here are the split files:
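One way to get names of that shape, sketched with a stand-in input file (sample.log is hypothetical; the prefix is taken from the name mentioned above):

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > sample.log   # hypothetical stand-in file

# -a 5 asks for five-character suffixes: aaaaa, aaaab, aaaac, ...
split -a 5 sample.log someSeparatedLogFiles.log_

ls someSeparatedLogFiles.log_*
```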

6. Split with numeric order suffix

Up to this point, you have seen your files separated using different letter combinations. Personally, I find it much easier to distinguish files using numbers.

Let’s keep the suffix length from the previous example, but change the alphabetical organization to numeric with the option -d .

So now you will have split files with numerical suffixes.
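A sketch of the numeric variant, assuming GNU split (-d is a GNU extension) and the same hypothetical stand-in file and prefix as before:

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > sample.log   # hypothetical stand-in file

# -d switches the suffix to digits; -a 5 keeps the five-character width.
split -a 5 -d sample.log someSeparatedLogFiles.log_

ls someSeparatedLogFiles.log_*    # ..._00000 through ..._00017
```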

7. Append hex suffixes to split files

Another option for suffix creation is the built-in hex suffix, which counts in hexadecimal using the digits 0-9 and the letters a-f.

For this example, I will combine a few things I’ve already shown you. I will split the file using my own prefix. I chose an underscore for readability purposes.

I used the -x option to create a hex suffix. Then I split our file into 50 chunks and gave the suffix a length of 6.

And here is the outcome of the above command:
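A hedged sketch of such a command, assuming a reasonably recent GNU split (older releases may lack -x) and using the hypothetical prefix hexpart_ on a stand-in file:

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > sample.log   # hypothetical stand-in file

# -x: hexadecimal suffixes, -n 50: fifty chunks, -a 6: six-character suffix.
split -x -n 50 -a 6 sample.log hexpart_

ls hexpart_* | head -n 12    # ..._000009 is followed by ..._00000a
```

The fifty chunks are numbered 0 to 49, so the last file ends in 000031 (49 in hexadecimal).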

8. Split files into multiple files of specific size

It’s also possible to use file size to break up files in split. Maybe you need to send a large file over a size-capped network as efficiently as possible. You can specify the exact size for your requirements.

The syntax can get a little tricky as we continue to add options. So, I will explain how the -b command works before showing the example.

When you want to create files of a specific size, use the -b option. You can then write nK[B], nM[B], nG[B], where n is the value of your file size. K means kibibytes (units of 1024 bytes), M mebibytes, G gibibytes, and so on. With the B added, KB means kilobytes (units of 1000 bytes), MB megabytes, etc.

It may look like there is a lot going on, but it’s not that complex when you break it down. You have specified the source file, our destination filename prefix, a numeric suffix, and separation by file size of 128kB.

Here are the split files:

You can verify the result with the ‘wc’ command.
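A sketch of a size-based split, assuming GNU split. The input is a seq-generated stand-in of roughly 1.3 MB, the prefix sized_ is hypothetical, and 128K here means 128 × 1024 bytes:

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 200000 > sample.log   # hypothetical stand-in, about 1.3 MB

# -b 128K: every piece except possibly the last is exactly 131072 bytes.
# -d: numeric suffixes.
split -d -b 128K sample.log sized_

ls -l sized_*
wc -c sized_*    # byte counts per piece, plus a total
```

Unlike -l, the -b option cuts at byte boundaries, so a piece can end partway through a line.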

9. Split files into multiple files of ‘At Most’ size n with

If you wanted to split files into roughly the same size, but preserve the line structure, this might be the best choice for you. With -C , you can specify a maximum size. Then the program will automatically split the files based on complete lines.

You can see in the output that the first split file is nearly 1MB in size, whereas the rest of the file is in the second file.
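A minimal sketch of the -C behavior, assuming GNU split and a hypothetical stand-in file of about 1.3 MB (the prefix line_ is also a placeholder):

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 200000 > sample.log   # hypothetical stand-in, about 1.3 MB

# -C 1M: put at most 1 MiB of complete lines into each piece.
split -C 1M sample.log line_

ls -l line_*          # one piece just under 1 MiB, one with the remainder
tail -n 1 line_aa     # the first piece ends on a complete line
```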

Bonus Tip: Rejoining split files

This isn’t a split command, but it might be helpful for new users.

You can use another command to rejoin those files and create a replica of our complete document. The cat command is short for concatenate which is just a fancy word that means “join items together”. Since all of the files begin with the letter ‘x’, the asterisk will apply the command to any files that begin with that letter.

As you can see, our recreated file is the same size as our original.

Our formatting (including the number of lines) is preserved in the file created.
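The round trip can be sketched like this, with hypothetical file names (sample.log for the original, rejoined.log for the copy):

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > sample.log   # hypothetical stand-in file

split sample.log              # creates xaa, xab, ...
cat x* > rejoined.log         # the shell expands x* in alphabetical order

ls -l sample.log rejoined.log    # same size
wc -l sample.log rejoined.log    # same number of lines
cmp sample.log rejoined.log && echo "identical"
```

The glob works here because the suffixes sort alphabetically in the same order split wrote them.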

If you’re new to Linux, I hope this tutorial helped you in understanding the split command. If you are more experienced tell us your favorite way to use split in the comments below!

How to Split Large Text File into Smaller Files in Linux

Linux has several utilities for breaking large files into smaller ones. split and csplit are two popular commands for this purpose. They can break up big log files, and even archive files, into smaller pieces, so that the pieces fit on smaller storage media such as USB drives. Splitting can also help with network transfers, since the pieces can be sent in parallel, which is often faster than sending one large file.

In this tutorial, I’ll explain more on how to use these split and csplit utilities to break-down large files in Linux.

Split

To split large files into smaller files, we can use this command utility in Linux.

You can replace filename with the name of the large file you wish to split, and prefix with the name you wish to give the small output files. You can exclude [options], or replace it with either of the following:

The split command will give each output file it creates the name prefix with an extension tacked to the end that indicates its order. By default, the split command adds aa to the first output file, proceeding through the alphabet to zz for subsequent files. By default, most systems use x as the prefix.

Split Examples

Split command splits the file into n lines per file and names the files as PREFIXaa, PREFIXab, PREFIXac, and so on. By default the PREFIX is x , and the number of lines is 1000 lines per file.

Split a file into multiple pieces by default usage

My log file, named system log, has 1099 lines. Let’s see what happens to it after splitting it with this command.

The command splits the log file into two files xaa and xab, with the first one having 1000 lines and dumps the leftover in the second file.

Split the file, based upon the number of lines

We can split the file into multiple pieces based on the number of lines using -l option. Here, I’m splitting my system log file with 1099 lines into smaller files with 200 lines each. Let’s see the commands for the same:

You can see that the command has split my log file into five smaller files with 200 lines each, plus a sixth file with the leftover 99 lines.
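A small sketch of that split, using a seq-generated 1099-line stand-in (system.log is a hypothetical name for the log file):

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 1099 > system.log   # hypothetical 1099-line stand-in

# 200 lines per piece: 5 full pieces plus one with the remainder.
split -l 200 system.log

wc -l x*    # xaa to xae hold 200 lines each, xaf holds 99
```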

Split a large file into 500MB files

You can use the option -b to specify the size of each piece. Please see this command, which I used for splitting my 1GB Apache log file into two 500MB files.

Split a large file into 200MB files with the given prefix

You can use the option -b to specify the 200M file size and the required prefix as the second argument. Please see the command which I used to split my 1GB Apache log to 200MB files with a prefix named split.log below:

In this example, you can see that my log files are broken down into 200MB files with my required prefix.

Split the file and name it with numbers

You can use the option -d to name the files with numeric suffixes such as 00, 01, 02 and so on, instead of aa, ab, ac. Please see the command below, which I used to split my 1GB Apache log into 200MB files with a prefix named log, with -d switching the suffix from letters to numbers:
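A scaled-down sketch of the same idea, assuming GNU split. To keep it quick to run, it uses a 1 MiB stand-in file and 200K pieces; for a real 1GB log you would write 200M instead. The file name access.log and its contents are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Build a 1 MiB stand-in for the Apache log (hypothetical content).
yes 'GET /index.html 200' | head -c 1048576 > access.log

# -b 200K: size of each piece, -d: numeric suffixes, "log": the prefix.
split -d -b 200K access.log log

ls -l log*    # log00 through log05
```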

You can see the manual page of split command using the command man split to see more information.

Csplit

Csplit is another command utility which divides single files into multiple files determined by context lines.

The files created by csplit normally have names of the form

xxnumber
where number is a two digit decimal number which begins at zero and it increments by one for each new file that csplit creates.

csplit also displays the size, in bytes, of each file that it creates as output.

Csplit Examples

By default, the files that csplit produces in output have ‘xx’ as the prefix and the numbers produced in the output are the byte count for the files the command produced.

Split files based on the number of lines

I have a file which contains 8 lines of domain names, and my requirement is to split that file at the fourth line. This can be done by passing ‘4’ as a command-line argument after the command and file name.

By passing 4 as a command-line argument, this command splits our domainslist file at the 4th line. The numbers produced in the output are the byte count for the files the command produced. Apparently, two files were produced in the output, namely xx00 and xx01.
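A minimal sketch of this split, assuming GNU csplit and an eight-line stand-in for domainslist (the domain names are placeholders):

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 8-line stand-in for the domainslist file.
printf '%s\n' example1.com example2.com example3.com example4.com \
              example5.com example6.com example7.com example8.com > domainslist

# Split at line 4; csplit prints the byte count of each piece it writes.
csplit domainslist 4

wc -l xx00 xx01    # xx00 holds lines 1-3, xx01 holds lines 4-8
```

Line 4 itself becomes the first line of the second file.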

Split files using regular expressions

We can use regular expressions with the csplit command, and any pattern can be followed by a repeat count in braces. For example, in the previous case, if you want the command to repeat the pattern one more time, then you can do this using the following command:

In this case, we can get three output files.
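One command that produces three files this way, sketched under the assumption that the repeat count was written as {1} after the line number (GNU csplit, hypothetical stand-in data):

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 8-line stand-in for the domainslist file.
printf '%s\n' example1.com example2.com example3.com example4.com \
              example5.com example6.com example7.com example8.com > domainslist

# Split at line 4, then repeat once more (at line 8).
csplit domainslist 4 '{1}'

wc -l xx*    # xx00: 3 lines, xx01: 4 lines, xx02: 1 line
```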

You can use the asterisk wildcard {*} to tell csplit to repeat your split as many times as possible.
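As an illustration of {*} combined with a regular expression, here is a sketch that splits a small file at every line starting with "== ". The file sections.txt and its contents are invented for the example; GNU csplit is assumed.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical input: an intro line followed by three marked sections.
printf '%s\n' 'intro' '== one' 'a' '== two' 'b' '== three' 'c' > sections.txt

# Start a new file at each line matching the regex, as often as it matches.
csplit sections.txt '/^== /' '{*}'

head xx*    # xx00 holds the intro, xx01 to xx03 hold one section each
```

Quoting '{*}' keeps the shell from trying to expand it before csplit sees it.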

Split files with the given prefix

By default, csplit splits files and gives the output files the prefix xx. If you want, you can change that default prefix with the -f option followed by the prefix you require.

For example, the following command will produce files having ‘domain’ as prefix.
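A short sketch with the prefix domain, assuming GNU csplit and hypothetical stand-in data:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 8-line stand-in for the domainslist file.
printf '%s\n' example1.com example2.com example3.com example4.com \
              example5.com example6.com example7.com example8.com > domainslist

# -f sets the output prefix; the two-digit counter is appended to it.
csplit -f domain domainslist 4

ls domain0*    # domain00 domain01
```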

Split a file by suppressing a line that matches the input pattern

The csplit command provides an option to suppress lines that match the input pattern. The option in question is --suppress-matched .

For example, the following command splits our file at line 4 (xx00 will contain up to line 3, while xx01 will contain the rest of the lines, excluding line 4).
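A sketch of that behavior, assuming a GNU csplit recent enough to support --suppress-matched, with hypothetical stand-in data:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 8-line stand-in for the domainslist file.
printf '%s\n' example1.com example2.com example3.com example4.com \
              example5.com example6.com example7.com example8.com > domainslist

# Split at line 4 and drop the boundary line itself.
csplit --suppress-matched domainslist 4

cat xx00    # lines 1-3
cat xx01    # lines 5-8; example4.com appears in neither file
```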

Customize the number of digits in the output files names

By default, the number of digits that follow the prefix in the output file name is 2. We can use the -n option to customize this number. For example, if you want names like xx001, pass the number of digits to the option, as in -n 3 below:
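A minimal sketch with three digits, assuming GNU csplit and hypothetical stand-in data:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 8-line stand-in for the domainslist file.
printf '%s\n' example1.com example2.com example3.com example4.com \
              example5.com example6.com example7.com example8.com > domainslist

# -n 3: use three digits after the prefix instead of two.
csplit -n 3 domainslist 4

ls xx*    # xx000 xx001
```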

Forcing csplit to save the output file in case of error

By default, csplit removes the output files it has created if an error occurs. However, you can keep them by using the -k option. The following example shows the difference between running the command with and without -k. In the first run, the command is meant to split our file ‘domainslist’ at line 3 and then repeat that split twice more. But since our source file has only eight lines, the final repetition runs out of lines and csplit stops with an out-of-range error. Hence, no output files are left behind.

But when we executed the same command with this option -k, the output files were not deleted. Please see the result below:
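The two runs can be sketched as follows, assuming GNU csplit, hypothetical stand-in data, and a repeat count written as {2}. The variable name leftover_without_k is made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 8-line stand-in for the domainslist file.
printf '%s\n' example1.com example2.com example3.com example4.com \
              example5.com example6.com example7.com example8.com > domainslist

# Without -k: the error makes csplit delete everything it wrote.
csplit domainslist 3 '{2}' || echo "csplit reported an error"
leftover_without_k=$(find . -name 'xx*' | wc -l)
echo "pieces left without -k: $leftover_without_k"

# With -k: the same error occurs, but the pieces written so far are kept.
csplit -k domainslist 3 '{2}' || echo "csplit reported an error again"
ls xx*
```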

You can check the man page for this tool using man csplit to get more information about this.

Wrapping up

These command-line utilities may not be needed by a Linux user on a daily basis, but they are important utilities that will help you in your server administration. I hope this article explained all the basic options and uses of these tools. Please post your valuable comments and suggestions on this.

11 Useful Split Command Examples for Linux Systems

As the name suggests, the ‘split’ command is used to split or break a file into pieces on Linux and UNIX systems. When we split a large file with the split command, each output file holds 1000 lines by default, and the default prefix is ‘x’.

In this article we will discuss 11 useful split command examples for Linux users. Apart from this, we will also discuss how split files can be merged or reassembled into a single file. Syntax for the split command:

Some of the important options of the split command are shown below:

Example: 1) Split File into Pieces

Let’s assume we have a file named tuxlab.txt. Use the split command below to break it into pieces.

As we can see in the above output, ‘tuxlab.txt’ is split into two pieces named ‘xaa’ and ‘xab’.

Example: 2) Split Command with verbose option

We can run the split command in verbose mode with the option ‘--verbose’. An example is shown below:

Example: 3) Split files with customize line numbers (-l)

Let’s suppose we want to split a file with a custom number of lines, say a maximum of 200 lines per file.

To achieve this, use ‘-l’ option in split command.

Verify the lines of each file using below command

Example: 4) Split files with file size using option -b

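Using the split command we can split a file by size. Use the following syntax to split files with the size given in bytes, KB, MB or GB (a plain number after -b means bytes):

# split -b nK // n is the numeric value, in units of 1024 bytes
# split -b nM // n is the numeric value, in units of 1024 × 1024 bytes
# split -b nG // n is the numeric value, in units of 1024 × 1024 × 1024 bytes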

Split file based on bytes:

Split file based on KB:

Split file based on MB:

Split file based on GB:

Example: 5) Create Split files with numeric suffix instead of alphabetic (-d)

In the above examples we have seen that split command output files are created with alphabetic suffixes like xaa, xab….. xan. Use the ‘-d’ option with the split command to create split output files with numeric suffixes like x00, x01, … x0n

Example: 6) Split file with Customize Suffix

With the split command we can create split output files with a customized suffix.

Syntax:

Example: 7) Generate n chunks output files with split command (-n)

Let’s suppose we want to split an iso file into 4 chunk output files. Use the ‘-n’ option with the split command to set the number of split output files.

Verify the split output files using the ll command.

Example: 8) Prevent Zero Size Split output files with option (-e)

When we split a small file into a large number of chunk files, zero-size split output files can be created. To avoid zero-size split output files, use the option ‘-e’.
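A small sketch of the difference, assuming GNU split. It splits a 3-byte file (tiny.txt, a made-up name) into 5 chunks, once without -e and once with it, in two hypothetical directories:

```shell
workdir=$(mktemp -d)
cd "$workdir"

printf 'ab\n' > tiny.txt   # hypothetical 3-byte file
mkdir plain elided

# Without -e, split still creates all 5 files, so some are empty.
( cd plain && split -n 5 ../tiny.txt )

# With -e, files that would be empty are not created.
( cd elided && split -n 5 -e ../tiny.txt )

ls -l plain elided
```

How the three bytes are distributed over the non-empty pieces can vary between coreutils versions, so the listing may differ slightly from system to system.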

Example: 9) Create Split output files of customize suffix length (-a option)

Let’s suppose we want to split an iso file and where size of each split output file is 500MB and suffix length is to be 3. Use the following split command:

Example: 10) Split ISO file and merge it into a single file.

Let’s suppose we have a Windows Server ISO file of size 4.2 GB and we are unable to scp this file to remote server because of its size.

To resolve this type of issue, we can split the ISO into n pieces, copy these pieces to the remote server, and on the remote server merge them into a single file using the cat command.

View the split output files using ll command,

Now scp these files to remote server and merge these files into a single using cat command

Example: 11) Verify the Integrity of Merge file using md5sum utility

As per Example 10, once the split output files are merged into a single file, we can check the integrity of the original and merged files with the md5sum utility. An example is shown below:

As per the above output, it is confirmed that integrity is maintained, and we can also say the split files were successfully restored to a single file.
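The whole split, merge and verify cycle can be sketched locally as below. This is only an illustration under stated assumptions: a 1 MiB file of random bytes stands in for the ISO, all file names are hypothetical, and the copy to the remote server is left as a comment.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# 1 MiB of random bytes standing in for the ISO (hypothetical file name).
head -c 1048576 /dev/urandom > server.iso

# Split into 300 KiB pieces with a descriptive prefix.
split -b 300K server.iso server_iso_part_

# ... copy the server_iso_part_* files to the remote server here, e.g. with scp ...

# Merge the pieces back into one file.
cat server_iso_part_* > server_merged.iso

# Matching checksums mean the merged file is identical to the original.
md5sum server.iso server_merged.iso
```

For a real multi-gigabyte ISO you would pick a much larger piece size, for example with the M or G units.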

That’s all from this article. If you like these examples, then please share your valuable feedback and comments in the comments section below.
