Linux tar exclude from

To avoid operating on files whose names match a particular pattern, use the ‘--exclude’ or ‘--exclude-from’ options.

‘--exclude=pattern’

Causes tar to ignore files that match the pattern.

The ‘--exclude=pattern’ option prevents any file or member whose name matches the shell wildcard (pattern) from being operated on. For example, to create an archive with all the contents of the directory ‘src’ except for files whose names end in ‘.o’, use the command ‘tar -cf src.tar --exclude='*.o' src’.

You may give multiple ‘--exclude’ options.
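
As a minimal sketch of the two points above, assuming GNU tar and a scratch directory whose file names are made up for illustration, the following archives a small ‘src’ tree with two ‘--exclude’ options:

```shell
# Build a throwaway tree; every name here is illustrative, not from the manual.
work=$(mktemp -d)
mkdir -p "$work/src/lib"
touch "$work/src/main.c" "$work/src/main.o" "$work/src/lib/util.o" "$work/src/notes.bak"

# Quote each pattern so the shell hands it to tar unexpanded.
tar -C "$work" -cf "$work/src.tar" --exclude='*.o' --exclude='*.bak' src

# List the members that actually went into the archive.
listing=$(tar -tf "$work/src.tar")
printf '%s\n' "$listing"
```

Quoting matters here: an unquoted `*.o` could be expanded by the shell against the current directory before tar ever sees it.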

‘--exclude-from=file’ ‘-X file’

Causes tar to ignore files that match the patterns listed in file.

Use the ‘--exclude-from’ option to read a list of patterns, one per line, from file; tar will ignore files matching those patterns. Thus if tar is called as ‘tar -c -X foo .’ and the file ‘foo’ contains a single line ‘*.o’, no files whose names end in ‘.o’ will be added to the archive.

Notice that lines from file are read verbatim. One of the frequent errors is leaving some extra whitespace after a file name, which is difficult to catch using text editors.

However, empty lines are OK.
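
A small sketch of this workflow, assuming GNU tar; the pattern file is called ‘foo’ as in the text, while the tree and the extra ‘*.tmp’ pattern are invented for illustration. Because stray trailing whitespace is hard to see in an editor, the sketch checks the pattern file with grep before using it:

```shell
work=$(mktemp -d)
mkdir -p "$work/src"
touch "$work/src/main.c" "$work/src/main.o" "$work/src/core.tmp"

# One pattern per line; the empty line in the middle is allowed.
printf '%s\n' '*.o' '' '*.tmp' > "$work/foo"

# Print (with line numbers) any line that ends in whitespace.
if grep -n '[[:space:]]$' "$work/foo"; then
    echo "warning: trailing whitespace in pattern file" >&2
fi

tar -C "$work" -cf "$work/src.tar" -X "$work/foo" src
listing=$(tar -tf "$work/src.tar")
printf '%s\n' "$listing"
```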

When archiving directories that are under some version control system (VCS), it is often convenient to read exclusion patterns from this VCS’ ignore files (e.g. ‘.cvsignore’, ‘.gitignore’, etc.). The following options provide such a possibility:

‘--exclude-vcs-ignores’

Before archiving a directory, see if it contains any of the following files: ‘.cvsignore’, ‘.gitignore’, ‘.bzrignore’, or ‘.hgignore’. If so, read ignore patterns from these files.

The patterns are treated much as the corresponding VCS would treat them, i.e.:

‘.cvsignore’

Contains shell-style globbing patterns that apply only to the directory where this file resides. No comments are allowed in the file. Empty lines are ignored.

‘.gitignore’

Contains shell-style globbing patterns. Applies to the directory where ‘.gitignore’ is located and all its subdirectories.

Any line beginning with a ‘#’ is a comment. Backslash escapes the comment character.

‘.bzrignore’

Contains shell globbing patterns and regular expressions (if prefixed with ‘RE:’). Patterns affect the directory and all its subdirectories.

Any line beginning with a ‘#’ is a comment.

‘.hgignore’

Contains POSIX regular expressions. The line ‘syntax: glob’ switches to shell globbing patterns. The line ‘syntax: regexp’ switches back. Comments begin with a ‘#’. Patterns affect the directory and all its subdirectories.

‘--exclude-ignore=file’

Before dumping a directory, tar checks if it contains file. If so, exclusion patterns are read from this file. The patterns affect only the directory itself.

‘--exclude-ignore-recursive=file’

Same as ‘--exclude-ignore’, except that the patterns read affect both the directory where file resides and all its subdirectories.
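
As a hedged illustration of a per-directory ignore file, the sketch below assumes a GNU tar recent enough to provide ‘--exclude-ignore’; the file name ‘.tarignore’ and the tree are hypothetical choices, not names tar prescribes:

```shell
work=$(mktemp -d)
mkdir -p "$work/proj"
touch "$work/proj/keep.txt" "$work/proj/scratch.tmp"

# Hypothetical ignore file: tar reads it only because we name it below.
printf '%s\n' '*.tmp' > "$work/proj/.tarignore"

tar -C "$work" -cf "$work/proj.tar" --exclude-ignore=.tarignore proj
listing=$(tar -tf "$work/proj.tar")
printf '%s\n' "$listing"
```

Swapping in ‘--exclude-ignore-recursive=.tarignore’ would, per the description above, extend the same patterns to subdirectories of ‘proj’.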

‘--exclude-vcs’

Exclude files and directories used by the following version control systems: ‘CVS’, ‘RCS’, ‘SCCS’, ‘SVN’, ‘Arch’, ‘Bazaar’, ‘Mercurial’, and ‘Darcs’.

As of version 1.34, the following files are excluded:

‘--exclude-backups’

Exclude backup and lock files. This option causes exclusion of files that match the following shell globbing patterns: ‘.#*’, ‘*~’, and ‘#*#’.

When creating an archive, the ‘--exclude-caches’ option family causes tar to exclude all directories that contain a cache directory tag. A cache directory tag is a short file with the well-known name ‘CACHEDIR.TAG’ and having a standard header specified in http://www.brynosaurus.com/cachedir/spec.html. Various applications write cache directory tags into directories they use to hold regenerable, non-precious data, so that such data can be more easily excluded from backups.

There are three ‘exclude-caches’ options, each providing different exclusion semantics:

‘--exclude-caches’

Do not archive the contents of the directory, but archive the directory itself and the ‘CACHEDIR.TAG’ file.

‘--exclude-caches-under’

Do not archive the contents of the directory, nor the ‘CACHEDIR.TAG’ file; archive only the directory itself.

‘--exclude-caches-all’

Omit directories containing a ‘CACHEDIR.TAG’ file entirely.
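
A minimal sketch of the first variant, assuming GNU tar; the ‘home’ tree is invented for illustration. The tag file has to begin with the signature line defined by the cache directory tagging specification, otherwise tar does not treat it as a tag:

```shell
work=$(mktemp -d)
mkdir -p "$work/home/cache" "$work/home/docs"
touch "$work/home/docs/report.txt" "$work/home/cache/blob.bin"

# The first line must be exactly this signature for the tag to be recognised.
printf 'Signature: 8a477f597d28d172789f06886806bc55\n' > "$work/home/cache/CACHEDIR.TAG"

# Keeps home/cache/ and its CACHEDIR.TAG, but skips the other contents.
tar -C "$work" -cf "$work/home.tar" --exclude-caches home
listing=$(tar -tf "$work/home.tar")
printf '%s\n' "$listing"
```

tar reports the skipped directory on standard error; the archive listing is what confirms which members were kept.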

Another option family, ‘--exclude-tag’, provides a generalization of this concept. It takes a single argument, a file name to look for. Any directory that contains this file will be excluded from the dump. Similarly to ‘exclude-caches’, there are three options in this option family:

‘--exclude-tag=file’

Do not dump the contents of the directory, but dump the directory itself and the file.

‘--exclude-tag-under=file’

Do not dump the contents of the directory, nor the file; archive only the directory itself.

‘--exclude-tag-all=file’

Omit directories containing file entirely.

Multiple ‘--exclude-tag*’ options can be given.

For example, given this directory:

The ‘--exclude-tag’ option will produce the following:

Both the ‘dir/folk’ directory and its tagfile are preserved in the archive; however, the rest of the files in this directory are not.

Now, using the ‘--exclude-tag-under’ option will exclude ‘tagfile’ from the dump, while still preserving the directory itself, as shown in this example:

Finally, using ‘--exclude-tag-all’ omits the ‘dir/folk’ directory entirely:
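
The three variants can be compared side by side with a sketch like the following. It assumes GNU tar and a layout matching the description (a ‘dir’ directory whose ‘folk’ subdirectory holds ‘tagfile’); the other file names are placeholders chosen for illustration:

```shell
work=$(mktemp -d)
mkdir -p "$work/dir/folk"
touch "$work/dir/blues" "$work/dir/jazz"
touch "$work/dir/folk/tagfile" "$work/dir/folk/sanjuan" "$work/dir/folk/trote"

# Same tree, three archives, one per option in the family.
tar -C "$work" -cf "$work/tag.tar"   --exclude-tag=tagfile       dir
tar -C "$work" -cf "$work/under.tar" --exclude-tag-under=tagfile dir
tar -C "$work" -cf "$work/all.tar"   --exclude-tag-all=tagfile   dir

tag_list=$(tar -tf "$work/tag.tar")
under_list=$(tar -tf "$work/under.tar")
all_list=$(tar -tf "$work/all.tar")

for name in tag under all; do
    echo "== $name.tar =="
    tar -tf "$work/$name.tar"
done
```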

This document was generated on March 24, 2021 using texi2html 5.0.

[Linux]: How to exclude a directory when using the “tar” shell command

In day-to-day work, administrators need to perform regular backups of their Linux servers. As an administrator myself, I recommend “tar”, the simplest and best tool for the job. A backup does not have to include every file and folder! Sometimes we need to exclude directories such as template caches, log files, temporary files, gallery directories, etc. In this article we will see how to exclude certain directories, and even certain patterns.

1. tar --exclude “directory”

Note: When excluding directories, make sure NOT to put a trailing slash (/) at the end of the directory name.

I wasted a lot of time figuring this out, so save yourself the trouble and follow the good practice to get the work done quickly.

Bad Practice :

Good Practice :
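
A minimal sketch of the good practice, assuming GNU tar and a made-up ‘site’ tree: the directory name is normalised with the shell's `${var%/}` expansion so that it never reaches tar with a trailing slash.

```shell
work=$(mktemp -d)
mkdir -p "$work/site/cache" "$work/site/pages"
touch "$work/site/cache/page.cache" "$work/site/pages/index.html"

exclude_dir='site/cache/'        # as it might be typed, with a trailing slash
exclude_dir=${exclude_dir%/}     # strip one trailing slash, leaving site/cache

tar -C "$work" -czf "$work/site.tar.gz" --exclude="$exclude_dir" site
listing=$(tar -tzf "$work/site.tar.gz")
printf '%s\n' "$listing"
```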

2. tar --exclude multiple directories

To exclude multiple directories, you can either pass each directory in its own --exclude option or list the directories separated by commas and enclosed in curly braces { }.

Method 1 :

Method 2 :
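
Both methods can be sketched as follows, assuming GNU tar and an invented ‘site’ tree. Note that the brace form is expanded by the shell, not by tar, so the sketch runs Method 2 through bash explicitly (this assumes bash is installed):

```shell
work=$(mktemp -d)
mkdir -p "$work/site/cache" "$work/site/logs" "$work/site/pages"
touch "$work/site/cache/c.dat" "$work/site/logs/access.log" "$work/site/pages/index.html"

# Method 1: one --exclude per directory (works in any POSIX shell).
tar -C "$work" -czf "$work/m1.tar.gz" --exclude='site/cache' --exclude='site/logs' site

# Method 2: bash turns --exclude={a,b} into --exclude=a --exclude=b
# before tar runs, so the braces must stay unquoted.
bash -c 'tar -C "$1" -czf "$1/m2.tar.gz" --exclude={site/cache,site/logs} site' _ "$work"

m1=$(tar -tzf "$work/m1.tar.gz")
m2=$(tar -tzf "$work/m2.tar.gz")
printf '%s\n' "$m1"
```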

3. tar --exclude directories from a file

List all the directories to be excluded in a file and pass this list to tar to exclude them from the archive.

Method 1 :

Method 2 :

exclude_directory.txt Contains :

4. tar --exclude certain patterns

Sometimes files matching the same pattern are scattered across different folders, and we do not want any of them in the archive. Here is how to exclude a particular pattern.

Excluding directory when creating a .tar.gz file

I have a /public_html/ folder, in that folder there’s a /tmp/ folder that has like 70gb of files I don’t really need.

Now I am trying to create a .tar.gz of /public_html/ excluding /tmp/

This is the command I ran:

The tar is still being created, and by doing an ls -sh I can see that MyBackup.tar.gz already has about 30gb, and I know for sure that /public_html/ without /tmp/ doesn’t have more than 1GB of files.

What did I do wrong?

10 Answers

Try removing the last / at the end of the directory path to exclude

Try moving the --exclude to before the include.

Yes, remove the trailing / and (at least in Ubuntu 11.04) make sure all the paths given are either relative or absolute. You can’t mix absolute and relative paths in the same command.

will not exclude logs directory but

The correct command to exclude a directory from compression is:

Make sure to put --exclude before the source and destination items.

You can also check the contents of the tar.gz file without unpacking it:
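
For instance, a minimal sketch assuming GNU tar and a scratch copy of the layout from the question (‘public_html’ with a ‘tmp’ subdirectory; the individual file names are invented): `tar -tzf` lists the members of a gzip-compressed archive without extracting anything.

```shell
work=$(mktemp -d)
mkdir -p "$work/public_html/tmp" "$work/public_html/css"
touch "$work/public_html/index.html" "$work/public_html/css/site.css" "$work/public_html/tmp/upload.bin"

tar -C "$work" -czf "$work/MyBackup.tar.gz" --exclude='public_html/tmp' public_html

# -t lists members, -z reads through gzip, -f names the archive.
tar -tzf "$work/MyBackup.tar.gz"

# Count entries under the excluded path; 0 means the exclude took effect.
excluded_hits=$(tar -tzf "$work/MyBackup.tar.gz" | grep -c '^public_html/tmp' || true)
echo "entries under public_html/tmp: $excluded_hits"
```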

You can also exclude more than one directory using only one --exclude. Like this example:

In --exclude= the directory name must not end with /, and the option must come between MyBackup.tar.gz and /home/user/public_html/

tar -pczf MyBackup.tar.gz --exclude /path/to/exclude --exclude '/another/path/to/exclude/*' /path/to/include/ /another/path/to/include/*

Tested in Ubuntu 19.10.

  1. The = after exclude is optional. You can use = instead of space after keyword exclude if you like.
  2. Parameter exclude must be placed before the source.
  3. The difference between using a folder name (as in the 1st exclude) and the * (as in the 2nd) is that the 2nd one will include an empty folder in the package, but the 1st will not.

tar --exclude doesn’t exclude. Why?

I have this very simple line in a bash script which executes successfully (i.e. producing the _data.tar file), except that it doesn’t exclude the sub-directories it is told to exclude via the --exclude option:

Instead, it produces a _data.tar file that contains everything under /data, including the files in the subdirectories I wanted to exclude.

Any idea why? and how to fix this?

Update I implemented my observations based on the link provided in the first answer below (top level dir first, no whitespace after last exclude):

But that didn’t help. All "excluded" sub-directories are present in the resulting _data.tar file.

This is puzzling. Whether this is a bug in current tar (GNU tar 1.23, on a CentOS 6.2, Linux 2.6.32) or "extreme sensitivity" of tar to whitespace and other easy-to-miss typos, I consider this a bug. For now.

This is horrible: I tried the insight suggested below (no trailing /* ) and it still doesn’t work in the production script:

I can’t see any difference between what I tried and what @Richard Perrin tried, except for the quotes and 2 spaces instead of 1. I am going to try this (must wait for the nightly script to run as the directory to be backed up is huge) and report back.

I am beginning to think that all these tar --exclude sensitivities aren’t tar’s but something in my environment, but then what could that be?

It worked! The last variation tried (no single quotes and a single space instead of a double space between the --exclude options) tested working. Weird but accepting.

Unbelievable! It turns out that an older version of tar (1.15.1) would only exclude if the top-level dir is last on the command line. This is the exact opposite of what version 1.23 requires. FYI.

Shell command to tar directory excluding certain files/folders

Is there a simple shell command/script that supports excluding certain files/folders from being archived?

I have a directory that needs to be archived, with a subdirectory that has a number of very large files I do not need to back up.

Not quite solutions:

The tar --exclude=PATTERN command matches the given pattern and excludes those files, but I need specific files & folders to be ignored (full file path), otherwise valid files might be excluded.

I could also use the find command to create a list of files and exclude the ones I don’t want to archive and pass the list to tar, but that only works for a small number of files. I have tens of thousands.

I’m beginning to think the only solution is to create a file with a list of files/folders to be excluded, then use rsync with --exclude-from=file to copy all the files to a tmp directory, and then use tar to archive that directory.

Can anybody think of a better/more efficient solution?

EDIT: Charles Ma’s solution works well. The big gotcha is that the --exclude='./folder' MUST be at the beginning of the tar command. Full command (cd first, so backup is relative to that directory):
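
A sketch of that shape of command, assuming GNU tar; the ‘project’ tree and the archive name are invented for illustration. The exclude is written with the same ‘./’ prefix that the member names get when ‘.’ is the source:

```shell
work=$(mktemp -d)
mkdir -p "$work/project/folder" "$work/project/src"
touch "$work/project/folder/huge.bin" "$work/project/src/app.py"

# Run in a subshell so the caller's working directory is left alone.
# The archive is written outside the tree that is being archived.
(
    cd "$work/project" &&
    tar --exclude='./folder' -czf "$work/backup.tar.gz" .
)

listing=$(tar -tzf "$work/backup.tar.gz")
printf '%s\n' "$listing"
```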

28 Answers

You can have multiple exclude options for tar so

etc. will work. Make sure to put --exclude before the source and destination items.

You can exclude directories with --exclude for tar.

If you want to archive everything except /usr you can use:

In your case perhaps something like

Possible options to exclude files/directories from backup using tar:

Exclude files using multiple patterns

Exclude files using an exclude file filled with a list of patterns

Exclude files using tags by placing a tag file in any directory that should be skipped

old question with many answers, but I found that none were quite clear enough for me, so I would like to add my try.

if you have the following structure

with following file/folders

so, you want to make a tar file that contains everything inside /home/ftp/mysite (to move the site to a new server), but file3 is just junk, and everything in folder3 is also not needed, so we will skip those two.

we use the format

where c = create, z = zip, v = verbose (you can see the files as they are added, useful to make sure none of the files you exclude are being added), and f = file.

so, my command would look like this

note that the excluded files/folders are relative to the root of your tar (I have tried the full path here, relative to /, but I cannot make that work).

hope this will help someone (and me next time I google it)

You can use standard "ant notation" to exclude directories by relative path.
This works for me and excludes any .git or node_modules directories:

This exclude pattern handles filename suffixes like png or mp3 as well as directory names like .git and node_modules.

I’ve experienced that, at least with the Cygwin version of tar I’m using ("CYGWIN_NT-5.1 1.7.17(0.262/5/3) 2012-10-19 14:39 i686 Cygwin" on a Windows XP Home Edition SP3 machine), the order of options is important.

While this construction worked for me:

that one didn’t work:

This is despite tar --help showing the following:

So the second command should also work, but apparently that is not the case.

I found this somewhere else so I won’t take credit, but it worked better than any of the solutions above for my mac specific issues (even though this is closed):

For Mac OSX I had to do

tar -zcv --exclude='folder' -f theOutputTarFile.tar folderToTar

Note the -f after the --exclude=

For those who have issues with it, some versions of tar would only work properly without the './' in the exclude value.

Command syntax that works:

These will not work:

After reading all these good answers for different versions and having solved the problem for myself, I think there are some very small details that are very important, and unusual in general GNU/Linux use, that aren’t stressed enough and deserve more than comments.

So I’m not going to try to answer the question for every case, but instead try to record where to look when things don’t work.

IT IS VERY IMPORTANT TO NOTICE:

  1. THE ORDER OF THE OPTIONS MATTERS: putting the --exclude before the file option and the directories to back up is not the same as putting it after them. This was unexpected, at least to me, because in my experience the order of options usually doesn’t matter in GNU/Linux commands.
  2. Different tar versions expect these options in a different order: for instance, @Andrew’s answer indicates that in GNU tar v1.26 and 1.28 the excludes come last, whereas in my case, with GNU tar 1.29, it’s the other way around.
  3. THE TRAILING SLASHES MATTER: at least in GNU tar 1.29, there shouldn’t be any.

In my case, for GNU tar 1.29 on Debian stretch, the command that worked was

The quotes didn’t matter, it worked with or without them.

I hope this will be useful to someone.

If you are trying to exclude Version Control System (VCS) files, tar already supports two interesting options for it! 🙂

  1. Option: --exclude-vcs

This option excludes files and directories used by the following version control systems: CVS, RCS, SCCS, SVN, Arch, Bazaar, Mercurial, and Darcs.

As of version 1.32, the following files are excluded:

  • CVS/ , and everything under it
  • RCS/ , and everything under it
  • SCCS/ , and everything under it
  • .git/ , and everything under it
  • .gitignore
  • .gitmodules
  • .gitattributes
  • .cvsignore
  • .svn/ , and everything under it
  • .arch-ids/ , and everything under it
  • {arch}/ , and everything under it
  • =RELEASE-ID
  • =meta-update
  • =update
  • .bzr
  • .bzrignore
  • .bzrtags
  • .hg
  • .hgignore
  • .hgtags
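
As a quick, hedged illustration of --exclude-vcs (assuming GNU tar; the ‘proj’ tree is made up), the following archives a directory that contains Git metadata and then lists what was kept:

```shell
work=$(mktemp -d)
mkdir -p "$work/proj/.git" "$work/proj/src"
touch "$work/proj/.git/config" "$work/proj/.gitignore" "$work/proj/src/main.c"

# Drop version control metadata such as .git/ and .gitignore.
tar -C "$work" -cf "$work/proj.tar" --exclude-vcs proj

listing=$(tar -tf "$work/proj.tar")
printf '%s\n' "$listing"
```
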
  2. Option: --exclude-vcs-ignores

When archiving directories that are under some version control system (VCS), it is often convenient to read exclusion patterns from this VCS’ ignore files (e.g. .cvsignore, .gitignore, etc.). This option provides such a possibility.

Before archiving a directory, see if it contains any of the following files: .cvsignore, .gitignore, .bzrignore, or .hgignore. If so, read ignore patterns from these files.

The patterns are treated much as the corresponding VCS would treat them, i.e.:

.cvsignore

Contains shell-style globbing patterns that apply only to the directory where this file resides. No comments are allowed in the file. Empty lines are ignored.

.gitignore

Contains shell-style globbing patterns. Applies to the directory where .gitignore is located and all its subdirectories.

Any line beginning with a # is a comment. Backslash escapes the comment character.

.bzrignore

Contains shell globbing patterns and regular expressions (if prefixed with RE:). Patterns affect the directory and all its subdirectories.

Any line beginning with a # is a comment.

.hgignore

Contains POSIX regular expressions. The line syntax: glob switches to shell globbing patterns. The line syntax: regexp switches back. Comments begin with a #. Patterns affect the directory and all its subdirectories.
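
A small sketch of --exclude-vcs-ignores with a .gitignore, assuming a GNU tar recent enough to provide the option; the ‘proj’ tree and its contents are invented. Only a simple glob is used here, since tar's handling approximates rather than reimplements Git's rules:

```shell
work=$(mktemp -d)
mkdir -p "$work/proj"
touch "$work/proj/main.c" "$work/proj/build.log"

# A comment line plus one simple glob pattern.
printf '%s\n' '# build output' '*.log' > "$work/proj/.gitignore"

tar -C "$work" -cf "$work/proj.tar" --exclude-vcs-ignores proj
listing=$(tar -tf "$work/proj.tar")
printf '%s\n' "$listing"
```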
