I ran a find command whose output is logged with tee and then processed with xargs. By accident I forgot to add xargs in the second pipe, which led me to this question.
The example:
% tree
├── a.sh
└── home
└── localdir
├── abc_3
├── abc_6
├── mydir_1
├── mydir_2
└── mydir_3
7 directories, 1 file
and the content of a.sh is:
% cat a.sh
#!/bin/bash
LOG="/tmp/abc.log"
find home/localdir -name "mydir*" -type d -print | tee $LOG | echo
If I add a second pipe with some command, such as echo or ls, writing the log occasionally fails.
Here are some examples from running ./a.sh many times:
% bash -x ./a.sh; cat /tmp/abc.log // this tee failed
+ LOG=/tmp/abc.log
+ find home/localdir -name 'mydir*' -type d -print
+ tee /tmp/abc.log
+ echo
% bash -x ./a.sh; cat /tmp/abc.log // this tee ok
+ LOG=/tmp/abc.log
+ find home/localdir -name 'mydir*' -type d -print
+ tee /tmp/abc.log
+ echo
home/localdir/mydir_2 // this is cat /tmp/abc.log output
home/localdir/mydir_3
home/localdir/mydir_1
Why is it that if I add a second pipe with some command (and forget xargs), the tee command occasionally fails?
The problem is that, by default, tee exits when a write to a pipe fails. So, consider:
find home/localdir -name "mydir*" -type d -print | tee $LOG | echo
If echo completes first, the pipe breaks and tee exits. The timing, though, is imprecise. Every command in the pipeline runs in a separate subshell, so they all run concurrently. Also, there are the vagaries of buffering. So, sometimes the log file is written before tee exits and sometimes it isn't.
For clarity, let's consider a simpler pipeline:
$ seq 10 | tee abc.log | true; declare -p PIPESTATUS; cat abc.log
declare -a PIPESTATUS='([0]="0" [1]="0" [2]="0")'
$ seq 10 | tee abc.log | true; declare -p PIPESTATUS; cat abc.log
declare -a PIPESTATUS='([0]="0" [1]="141" [2]="0")'
In the first execution, each process in the pipeline exits with a success status and the log file is written. In the second execution of the same command, tee fails with exit code 141 and the log file is not written.
I used true in place of echo to illustrate the point that there is nothing special here about echo. The problem exists for any command that follows tee that might reject input.
Documentation
Recent versions of tee have an option to control this exit-on-pipe-failure behavior. From man tee in coreutils-8.25:
--output-error[=MODE]
set behavior on write error. See MODE below
The possibilities for MODE are:
MODE determines behavior with write errors on the outputs:
'warn' diagnose errors writing to any output
'warn-nopipe'
diagnose errors writing to any output not a pipe
'exit' exit on error writing to any output
'exit-nopipe'
exit on error writing to any output not a pipe
The default MODE for the -p option is 'warn-nopipe'. The default
operation when --output-error is not specified, is to exit immediately
on error writing to a pipe, and diagnose errors writing to non pipe
outputs.
As you can see, the default behavior is "to exit immediately on error writing to a pipe". Thus, if the attempt to write to the process that follows tee fails before tee has written the log file, tee exits without writing the log file.
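Two ways around this can be sketched as follows. The first makes the last stage drain its input, so tee never writes to a closed pipe. The second uses the `--output-error` option quoted above, and is guarded because it only exists in newer GNU tee. The variable names (`log`, `lines_drain`, `lines_opt`) are hypothetical, and `seq 10` stands in for the find command.

```shell
#!/bin/bash
# Sketch: keep the log complete even though the last stage ignores its input.
# log, lines_drain and lines_opt are illustrative names.
log=$(mktemp)

# Option 1: have the last stage read and discard stdin before doing its own
# work, so the pipe stays open until tee has finished.
seq 10 | tee "$log" | { cat >/dev/null; echo "done"; }
lines_drain=$(wc -l < "$log" | tr -d ' ')

# Option 2: ask tee not to treat a broken pipe as fatal. This assumes a tee
# that supports --output-error, so check for it first.
if tee --help 2>/dev/null </dev/null | grep -q -- '--output-error'; then
    seq 10 | tee --output-error=warn-nopipe "$log" | true
    lines_opt=$(wc -l < "$log" | tr -d ' ')
fi

echo "lines logged with draining reader: $lines_drain"
echo "lines logged with --output-error:  ${lines_opt:-not supported by this tee}"
rm -f "$log"
```

Draining works with any tee implementation. The option is more direct where it is available.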
Right, piping from tee to something that exits early (not dependent on reading the input from tee in your case) will cause intermittent errors.
For a summary of this gotcha see:
http://www.pixelbeat.org/docs/coreutils-gotchas.html#tee
I debugged the tee source code, but I'm not familiar with C on Linux, so I may have made mistakes.
tee belongs to the coreutils package, in src/tee.c.
First, it sets buffering with:
setvbuf (stdout, NULL, _IONBF, 0); // for standard output
setvbuf (descriptors[i], NULL, _IONBF, 0); // for file descriptor
So is it unbuffered?
Second, tee puts stdout first in its descriptors array and writes to each descriptor in a for loop:
/* In the array of NFILES + 1 descriptors, make
   the first one correspond to standard output. */
descriptors[0] = stdout;
files[0] = _("standard output");
setvbuf (stdout, NULL, _IONBF, 0);

for (i = 0; i <= nfiles; i++) {
  if (descriptors[i]
      && fwrite (buffer, bytes_read, 1, descriptors[i]) != 1) { // failed!!!
    error (0, errno, "%s", files[i]);
    descriptors[i] = NULL;
    ok = false;
  }
}
For example, with tee a.log, descriptors[0] is stdout and descriptors[1] is a.log.
As @John1024 said, the commands in a pipeline run in parallel (which I had misunderstood before). The second command in the pipe, such as echo, ls, or true, does not read its input, so it does not "wait" for it. If it runs faster, it closes the read end of the pipe before tee writes to the write end. In the code above, the write on the commented line then fails, and tee does not go on to write to the file descriptor.
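The ordering described above can also be forced rather than left to chance. In this sketch, the producer is delayed so that the last command has certainly exited before tee receives any data. The `sleep 1` delay and the names `log`, `status` and `log_size` are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: force the "reader exits first" ordering instead of waiting for the race.
# log, status and log_size are illustrative names.
log=$(mktemp)

# The producer sleeps before writing, so `true` has already exited and closed
# the read end of the pipe by the time tee gets any data to write.
{ sleep 1; seq 10; } | tee "$log" | true
status=("${PIPESTATUS[@]}")   # copy immediately; the next command overwrites it

log_size=$(wc -c < "$log" | tr -d ' ')
echo "tee exit status: ${status[1]}"
echo "bytes in log:    $log_size"
rm -f "$log"
```

With a tee that writes to stdout before the files, as in the source excerpt above, this would typically report exit status 141 and an empty log. Other tee implementations, or a shell started with SIGPIPE ignored, may behave differently.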
Supplement:
The strace output shows tee being killed by SIGPIPE:
write(1, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 21) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=22649, si_uid=1000} ---
+++ killed by SIGPIPE +++