What does --strip-components -C mean in tar?
https://unix.stackexchange.com/questions/535772/what-does-strip-components-c-mean-in-tar
I want to make sure I understand the following code:
    tar xvzf FILE --exclude-from=exclude.me --strip-components 1 -C DESTINATION
which was posted in this answer.
From man tar:
    --strip-components=NUMBER
        strip NUMBER leading components from file names on extraction
    -C, --directory=DIR
        change to directory DIR
I didn’t understand the manual explanation for --strip-components.
About -C, I understood that it means something like “put stripped components in a noted directory.”
What does --strip-components -C mean?
Tags: directory, tar, path
Asked Aug 15, 2019 at 17:14 by user149572; edited Mar 29, 2020 at 2:39 by G-Man Says ‘Reinstate Monica’
Comment from ctrl-alt-delor (Mar 29, 2020 at 19:53): May be a duplicate: unix.stackexchange.com/q/575261/4778
4 Answers
The fragment of the manpage you included in your question comes from the manpage for GNU tar. GNU is a software project that prefers info manuals over manpages. In fact, the tar manpage was added to the GNU tar source tree only in 2014, and it is still just a reference, not a full-blown manual with examples. You can open the full info manual with info tar; it is also available online. It contains several examples of --strip-components usage; the relevant fragments are:
--strip-components=number
Strip given number of leading components from file names before extraction.
For example, if archive `archive.tar' contained `some/file/name', then running
    tar --extract --file archive.tar --strip-components=2
would extract this file to file `name'.
and:
--strip-components=number
Strip given number of leading components from file names before extraction.
For example, suppose you have archived whole `/usr' hierarchy to a tar archive named `usr.tar'. Among other files, this archive contains `usr/include/stdlib.h', which you wish to extract to the current working directory. To do so, you type:
    $ tar -xf usr.tar --strip=2 usr/include/stdlib.h
The option `--strip=2' instructs tar to strip the two leading components (`usr/' and `include/') off the file name.
That said, there are other implementations of tar out there. For example, the FreeBSD tar manpage explains this option differently:
--strip-components count
Remove the specified number of leading path elements. Pathnames with fewer elements will be silently skipped. Note that the pathname is edited after checking inclusion/exclusion patterns but before security checks.
In other words, you should understand a Unix path as a sequence of elements separated by / (unless there is only one /).
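The FreeBSD remark about pathnames with fewer elements is worth trying out. Here is a minimal sketch, assuming a tar that supports --strip-components (GNU tar and bsdtar both do); the names top.txt, dir/inner.txt, src and out are made up for illustration:

```shell
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"

# Build a tiny archive: one file with a single path element,
# and one file with two path elements.
mkdir -p src/dir out
touch src/top.txt src/dir/inner.txt
(cd src && tar -cf ../demo.tar top.txt dir/inner.txt)

# Extract into out/ while stripping one leading element.
(cd out && tar -xf ../demo.tar --strip-components=1)

# dir/inner.txt loses "dir/" and is written as inner.txt.
# top.txt has nothing left after stripping, so it is not extracted.
ls out
```

So a strip count that is larger than the depth of some members makes those members quietly disappear from the result, which is easy to miss with large archives.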
Here is my own example (other examples are available in the info manual I linked to above):
Let’s create a new directory structure:
    mkdir -p a/b/c
Path a/b/c is composed of 3 elements: a, b, and c.
Create an empty file in this directory and put it into a .tar archive:
    $ touch a/b/c/FILE
    $ tar -cf archive.tar a/b/c/FILE
FILE is the 4th element of the a/b/c/FILE path.
List the contents of archive.tar:
    $ tar tf archive.tar
    a/b/c/FILE
You can now extract archive.tar with --strip-components and an argument that tells it how many path elements to remove from a/b/c/FILE when it is extracted. Remove the original a directory:
    rm -r a
Extract with --strip-components=1; only a has not been recreated:
    $ tar xf archive.tar --strip-components=1
    $ ls -Al
    total 16
    -rw-r--r-- 1 ja users 10240 Mar 26 15:41 archive.tar
    drwxr-xr-x 3 ja users  4096 Mar 26 15:43 b
    $ tree b
    b
    └── c
        └── FILE

    1 directory, 1 file
With --strip-components=2 you see that 2 elements, a/b, have not been recreated:
    $ rm -r b
    $ tar xf archive.tar --strip-components=2
    $ ls -Al
    total 16
    -rw-r--r-- 1 ja users 10240 Mar 26 15:41 archive.tar
    drwxr-xr-x 2 ja users  4096 Mar 26 15:46 c
    $ tree c
    c
    └── FILE

    0 directories, 1 file
With --strip-components=3, the 3 elements a/b/c have not been recreated, and we get FILE directly in the directory in which we ran tar:
    $ rm -r c
    $ tar xf archive.tar --strip-components=3
    $ ls -Al
    total 12
    -rw-r--r-- 1 ja users     0 Mar 26 15:39 FILE
    -rw-r--r-- 1 ja users 10240 Mar 26 15:41 archive.tar
The -C option tells tar to change to a given directory before running the requested operation, whether that is extracting or archiving. In this comment you asked:
Asking tar to do cd: why cd? I mean to ask, why it’s not just mv?
Why do you think that mv is better? To what directory would you like tar to extract the archive first:
/tmp - what if it’s missing or full?
"$TMPDIR" - what if it’s unset, missing or full?
current directory - what if the user has no w permission, just r and x?
what if the temporary directory, whatever it is, already contained files with the same names as those in the tar archive, and extracting would overwrite them?
what if the temporary directory, whatever it is, were on a filesystem without Unix semantics, so that all information about ownership, executable bits etc. would be lost?
Also note that -C is a common change-directory option in other programs as well; Git and make are the first that come to my mind.
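To make the "cd, not mv" point concrete, here is a small sketch comparing tar -C with changing directory yourself before extracting. It reuses the a/b/c/FILE layout from above; dest1 and dest2 are hypothetical destination directories:

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
cd "$work"

# Same layout as in the example above.
mkdir -p a/b/c dest1 dest2
touch a/b/c/FILE
tar -cf archive.tar a/b/c/FILE

# Variant 1: let tar change directory itself before extracting.
tar -xf archive.tar -C dest1

# Variant 2: change directory in a subshell, then extract.
# The archive now has to be addressed relative to dest2 (or by absolute path).
(cd dest2 && tar -xf ../archive.tar)

# Compare the two destinations; diff prints nothing if the trees match.
diff -r dest1 dest2 && echo "dest1 and dest2 match"
```

In both variants the files are written straight into their final location; nothing is extracted somewhere else first and moved afterwards.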
Answered Mar 26, 2020 at 15:12, edited Mar 27, 2020 at 8:58, by Arkadiusz Drabczyk
The --strip-components option is for modifying the filenames of extracted files.
If you have a filename foo/bar/baz, then with --strip-components 1 the extracted file would be named bar/baz.
The -C option just means “change directory”. If you use -C /some/other/place you are effectively asking tar to cd /some/other/place before extracting files. This generally means that files would be extracted relative to /some/other/place.
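Putting the two options together gives the kind of command asked about in the question. This is only a sketch: the project-1.0 directory, its files and the archive name project.tar.gz are invented, and it assumes gzip support is available to tar:

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
cd "$work"

# Hypothetical archive that wraps everything in one top-level directory.
mkdir -p project-1.0/src
touch project-1.0/README project-1.0/src/main.c
tar -czf project.tar.gz project-1.0

# -C does not create the destination, so it has to exist beforehand.
mkdir DESTINATION

# Drop the leading "project-1.0/" element from every name and
# extract the result relative to DESTINATION.
tar -xzf project.tar.gz --strip-components 1 -C DESTINATION

# Show what ended up in the destination.
find DESTINATION | sort
```

The files land in DESTINATION/README and DESTINATION/src/main.c, without the project-1.0 wrapper directory.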
Answered Aug 15, 2019 at 17:29 by larsks
Comments:
larsks (Aug 15, 2019 at 19:48): --strip-components doesn’t have any impact over what gets extracted. The only thing it does is modify filenames.
user149572 (Mar 26, 2020 at 9:38): I now re read the answer and I think I still miss, or at least become confused, about what is being done; I now offer bounty which I offer, in part, for editing and helping others in similar situation and myself, to ensure understanding.
user149572 (Mar 26, 2020 at 9:50): Asking tar to do cd: why cd? I mean to ask, why it’s not just mv?
To supplement the good answer by larsks, to try to help clear up the confusion around the -C option:
The tar man page states for the -C option:
    -C, --directory=DIR
        Change to DIR before performing any operations. This option is order-sensitive, i.e. it affects all options that follow.
So it is not like mv; it literally tells tar to change to a different working directory. Also note that the order matters. So:
    tar r -f archive.tar -C /path file1 file2
would append files file1 and file2 at the location /path to archive.tar, whilst
    tar r -f archive.tar file1 file2 -C /path file3 file4
would append file1 and file2 from the current working directory and then file3 and file4 from the /path location.
--strip-components and -C are independent, unrelated tar options. The first affects the directory structure of the archived files once they are extracted; -C sets the working directory that tar uses to resolve files that are external to the archive.
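The order sensitivity can be checked with a throwaway layout. In this sketch the directory other stands in for /path, and the archive is created with c rather than appended to with r, purely to keep the example short:

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
cd "$work"

# file1 and file2 live in the current directory, file3 and file4 in "other".
mkdir other
touch file1 file2 other/file3 other/file4

# Names before -C are looked up in the current directory;
# names after -C are looked up in "$work/other".
tar -cf archive.tar file1 file2 -C "$work/other" file3 file4

# List the member names that were stored.
tar -tf archive.tar
```

All four files are stored under their bare names, since each name is resolved relative to whichever directory was in effect when tar reached it on the command line.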
Answered Mar 26, 2020 at 11:09, edited Mar 26, 2020 at 12:23, by Time4Tea
I’m not sure what I can say by way of explanation that hasn’t been covered by the other answers. Instead, I’ll try to help you with examples.
Arkadiusz Drabczyk gives the example from the GNU tar manual where your archive includes a large directory tree (e.g., all of /usr):
    $ tar -cf usr.tar /usr
and you want to drill down into the archive and extract a file from a low-level subdirectory:
    $ tar -xf usr.tar --strip=2 usr/include/stdlib.h
What the manual doesn’t spell out is that, if you say
    $ tar -xf usr.tar usr/include/stdlib.h
then tar will create a usr directory in the current directory, and an include subdirectory in usr, and then extract stdlib.h into ./usr/include. If that’s the result you want, then fine; use that command. But, if you want stdlib.h in the current directory, you’ll need to
    $ mv usr/include/stdlib.h .
and then probably rmdir -p usr/include. It’s doable, but it’s tedious. --strip=2 saves you a lot of typing.
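The difference between the two extractions can be seen side by side on a miniature stand-in for /usr. This is a sketch with made-up contents; plain and stripped are hypothetical output directories, and the full spelling --strip-components=2 is used instead of the GNU abbreviation --strip=2 so that it is not tied to GNU tar:

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
cd "$work"

# Small stand-in for the /usr hierarchy (contents are invented).
mkdir -p usr/include usr/bin plain stripped
touch usr/include/stdlib.h usr/include/stdio.h usr/bin/tool
tar -cf usr.tar usr

# Without stripping: the directories leading to the file are recreated.
tar -xf usr.tar -C plain usr/include/stdlib.h

# With two elements stripped: only the file itself is written.
tar -xf usr.tar -C stripped --strip-components=2 usr/include/stdlib.h

# Show both results.
find plain stripped | sort
```

Note that the member is still named by its full path inside the archive (usr/include/stdlib.h); the stripping only changes the name it is written under.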
-C can also be used with the c (lower-case C) option (creating an archive), and that might be easier to understand. Suppose John, Kate, Larry and Mary are collaborating on writing a book; they are writing Chapters 1, 2, 3 and 4, respectively. John and Kate (only) are using the same computer, and they want to archive what they’ve done so far to send to the other authors. You could do either of the following:
(cd to some suitable directory)
    $ tar cf chap12.tar /home/john/chap1 /home/kate/chap2
or
    $ cd /home
    $ tar cf /tmp/chap12.tar john/chap1 kate/chap2
But these create an archive that encodes the john/ and kate/ directory paths. If you do that, that would be a reason for Larry and Mary to use --strip=2 or --strip=1 when they extract the archive. If you want to save them that bother (and/or keep your names out of the archive; e.g., for privacy reasons), then you can
(cd to some suitable directory)
    $ tar cf chap12.tar -C /home/john chap1 -C /home/kate chap2
and then the archive will show just the chap1 and chap2 directories.
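If you would like to try this without touching a real /home, the following sketch rebuilds the situation under a temporary directory. The home/john and home/kate directories and the draft.txt files are placeholders, and the archive names with-names.tar and without-names.tar are invented for the comparison:

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
cd "$work"

# Stand-ins for /home/john and /home/kate.
mkdir -p home/john/chap1 home/kate/chap2
touch home/john/chap1/draft.txt home/kate/chap2/draft.txt

# Paths given relative to "home" keep the john/ and kate/ prefixes.
(cd home && tar -cf "$work/with-names.tar" john/chap1 kate/chap2)

# With -C, each directory change applies to the names that follow it.
# Absolute paths avoid any dependence on the earlier -C.
tar -cf without-names.tar -C "$work/home/john" chap1 -C "$work/home/kate" chap2

# Compare the stored member names.
tar -tf with-names.tar
tar -tf without-names.tar
```

The first listing carries the john/ and kate/ prefixes; the second starts at chap1/ and chap2/, so the recipients have nothing to strip.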
Answered Mar 28, 2020 at 17:41 by G-Man Says ‘Reinstate Monica’