Hmmm.. this one has got me scratching my head. I have a process that actually starts from inittab and does all manner of different tasks, including running a little shell script now and then. That shell script runs various programs on a set of files; the customer wants to keep archives of those files just in case something goes wrong.
What actually goes on in that script doesn't matter because that all works fine. It copies files, slices and dices, creates new files, ftp's things here and there.. it all works. But add just a simple tar command to it and things get weird.. very weird.
To simplify testing, I tried just tarring a specific directory:
tar cvf /home/xyz/tmp/testtar.tar /home/xyz/sn.archives/200803071503
Here's what you get if you try to read the file later:
$ tar tvf /home/xyz/tmp/testtar.tar
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers
tar: Error exit delayed from previous errors
Now, if you use the same tar command at the command line, everything is fine - no problems. You can put the same command in cron, too: no issue. The problem isn't tar (or who it's run by). The file that the script produces is obviously different from the one produced at the command line or by cron:
$ ls -l tmp/test*tar
-rw-rw-r-- 1 xyz xyz 2539520 Mar 7 18:42 tmp/test2tar.tar
-rw-r--r-- 1 root root 2540402 Mar 7 18:37 tmp/testtar.tar
$ file tmp/test*tar
tmp/test2tar.tar: POSIX tar archive
tmp/testtar.tar: data
Now look at this:
$ od -c tmp/test2tar.tar | head
0000000 h o m e / d r s / s n . a r c h
0000020 i v e s / 2 0 0 8 0 3 0 7 1 5 0
0000040 3 / \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000060 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
0000140 \0 \0 \0 \0 0 0 0 0 7 5 5 \0 0 0 0 0
0000160 0 0 0 \0 0 0 0 0 0 0 0 \0 0 0 0 0
0000200 0 0 0 0 0 0 0 \0 1 0 7 6 4 2 5 4
0000220 5 2 3 \0 0 1 4 7 4 3 \0 5 \0 \0 \0
0000240 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
$ od -c tmp/testtar.tar | head
0000000 / h o m e / d r s / s n . a r c
0000020 h i v e s / 2 0 0 8 0 3 0 7 1 5
0000040 0 3 / \n / h o m e / d r s / s n
0000060 . a r c h i v e s / 2 0 0 8 0 3
0000100 0 7 1 5 0 3 / s n . 0 1 3 8 . t
0000120 x t \n h o m e / d r s / s n . a
0000140 r c h i v e s / 2 0 0 8 0 3 0 7
0000160 1 5 0 3 / \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000200 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
The directory name is duplicated in the header of the file produced from the script.. and overwrites the header info! No wonder it cannot be read back..
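A quicker way to spot this kind of damage than reading od output: a POSIX or GNU tar header carries the magic string "ustar" at byte offset 257 of its first block, so anything written ahead of the header pushes that string out of place. The sketch below only checks for the magic. The function name looks_like_tar is made up for illustration, and it assumes a ustar-style archive (what GNU tar writes by default); it says nothing about whether the rest of the archive is intact.

```shell
#!/bin/sh
# looks_like_tar FILE  (hypothetical helper name)
# Succeeds if the 5 bytes at offset 257 are "ustar", the magic
# field of a POSIX/GNU tar header. Old V7-format archives have no
# magic and would fail this check even when they are fine.
looks_like_tar() {
    magic=$(dd if="$1" bs=1 skip=257 count=5 2>/dev/null | tr -d '\0')
    [ "$magic" = "ustar" ]
}
```

Something like `looks_like_tar /home/xyz/tmp/testtar.tar || echo "header is damaged"` could then go in the script right after the tar.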
Next question: when does the corruption happen? I added this right after the "tar":
tar tvf /home/xyz/tmp/testtar.tar > testtar.tar.read 2>&1
And yes, it's immediately corrupt.
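For a script that runs unattended, the same read-back can also report tar's exit status, so the failure is noticed without anyone having to open the output file. A minimal sketch, assuming a tar that exits non-zero when it cannot read an archive (GNU tar does); check_archive is a made-up name:

```shell
#!/bin/sh
# check_archive ARCHIVE LOGFILE  (hypothetical helper name)
# Lists the archive, saves everything tar prints to LOGFILE, and
# returns tar's exit status so the calling script can react.
check_archive() {
    tar tvf "$1" > "$2" 2>&1
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "archive $1 failed to read back (tar exit status $status)" >> "$2"
    fi
    return "$status"
}
```

In the script that would be called as `check_archive /home/xyz/tmp/testtar.tar testtar.tar.read`, with whatever error handling the caller wants hung off the return value.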
Soooo.. right now my brain is at a dead stop on this one.. any ideas will be entertained graciously.

Fri Mar 7 19:23:34 2008: Subject: TonyLawrence
I just thought to add a "set" to the script.. I wonder if tar is getting confused by some environment variable?
Fri Mar 7 19:39:37 2008: Subject: TonyLawrence
Nope, don't think so.
Though there is this that I don't have in a login environment:
POSIXLY_CORRECT=y
but that doesn't change anything..
Fri Mar 7 20:13:07 2008: Subject: TonyLawrence
Trying tar cvof now..
Nope.. no change
Sat Mar 8 03:46:55 2008: Subject: jtimberman
These kinds of inconsistencies and corruption with tar's file format are the exact reason why the BRU[1] program was written.
http://www.bru.com/
Sat Mar 8 11:46:37 2008: Subject: TonyLawrence
No, I have to disagree with that. There has to be something more basic going on here.
However, that does get my brain out of a rut: why not try cpio?
Sat Mar 8 12:39:31 2008: Subject: TonyLawrence
Well, cpio is fine, which means tar is "acting up".. I'd like to figure out why just out of curiosity, but I'll use cpio for the task.
Sat Mar 8 14:46:01 2008: Subject: anonymous
Are any other tar processes still alive?
Is there any other process that has your archive file opened?
Sat Mar 8 15:38:09 2008: Subject: strace PhilBurchill
http://burchill.net
Have you tried putting an "strace" on it so you can see what system calls etc are being used?
Sat Mar 8 16:14:43 2008: Subject: BigDumbDinosaur
http://bcstechnology.net
It almost looks as though tar when run from the script is somehow resetting the write pointer to the archive file. Weird! Don't have any immediate answer, except that there must be some obscure environment issue going on. Either that or a shared library used by tar from the CL is not being used when run from a script. BTW, which Linux distro is this?
Sat Mar 8 18:42:13 2008: Subject: TonyLawrence
No, there are no other processes - if that were the issue, cpio would be corrupted also, but it is not. It's something extra or lacking in the environment.. I don't think an strace is going to help me (though it's worth trying, yes).. One other thing - if you search Google, you'll see that a lot of other folks have had similar problems. This happens to be a RedHat (and fairly new, too), but the Google results are all over distros.. now most of those can be explained easily enough by other factors (ftp'd a file without setting "bin", that kind of thing), but I bet some of them have the same cause as this - whatever it is.
I will try the trace - and of course compare it to one run in the login environment.
Sat Mar 8 19:17:36 2008: Subject: TonyLawrence
I've put a diff of the traces at http://aplawrence.com/Linux/tar-problem.txt and the full trace from the erroring tar at http://aplawrence.com/Linux/trace-tar.txt
Just started looking at it now.. the first execve is the one run from the init script..
First observation is that it is getting an error.. hmmm.. but no, that's just because it couldn't write stdout - duh, why did I say "v"?..
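If that failed write to stdout turns out to matter, one thing worth trying is to give tar an explicit place to put the "v" listing instead of relying on whatever the process started from inittab hands down. This is only a sketch of that experiment, not a confirmed fix; archive_with_log is a made-up name, and dropping the "v" altogether would be the other obvious test.

```shell
#!/bin/sh
# archive_with_log ARCHIVE DIRECTORY LOGFILE  (hypothetical helper name)
# Runs the same "tar cvf", but points stdout and stderr at LOGFILE
# so the listing does not depend on whatever descriptors the
# parent process left open or closed.
archive_with_log() {
    tar cvf "$1" "$2" > "$3" 2>&1
}
```

For the test case above, that would be `archive_with_log /home/xyz/tmp/testtar.tar /home/xyz/sn.archives/200803071503 /home/xyz/tmp/testtar.log`, followed by the usual `tar tvf` to see whether the archive reads back.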