ptrace problem - confounded, dazed and confused at the inconsistencies
Shachar Shemesh
shachar at shemesh.biz
Wed Oct 27 21:23:26 IST 2010
On 27/10/10 20:53, Valery Reznic wrote:
> OK, you were warned :)
>
Yes, I was. Still....
>> How can two programs do the same thing on the same system,
>> and yet get such different results?
>>
> Let's take 'read' syscall.
> read(10, ....)
>
>
I am not talking about a single syscall that behaves differently. I'm
talking about tracing the entire history of the process from the time it
performed "execve".
If you want, I uploaded the two logs to
http://fakeroot-ng.lingnu.com/files/clone-traces.tgz. Both were produced
by strace. The first is strace tracing strace tracing vi. The second is
strace attached to the fakeroot-ng daemon while the daemon is monitoring
a shell, and the shell is then used to run "vi". This second log
continues until I kill the daemon to release it from the deadlock.
Both the inner strace and fakeroot-ng were instructed to issue logs of
what they find, and you can find this log in "write" calls throughout
the logs. The logs are, of course, not identical, but I failed to find
any difference that should matter.
Since not all of the logs are interesting, the interesting part of each
starts where the process writes that it detected an execve of
/usr/bin/vi, and ends a couple of lines after the first time the word
"clone" appears after that point. In the trace-strace log, you can see
that after releasing the process to perform the clone (PTRACE_SYSCALL),
the inner strace performs wait4 twice and gets two notifications, one
for the parent thread (3299) and one for the child thread (3300).
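To make the expected behaviour concrete, here is a minimal sketch of a
tracer that should get wait notifications for both the parent and the
clone child. It is not strace's or fakeroot-ng's code. The helper names
(run_tracee, cloned_task, count_reported_pids) are made up for the
example, and it assumes a Linux system where the process is allowed to
use ptrace and where PTRACE_O_TRACECLONE is available.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Body of the cloned task (hypothetical): exit immediately. */
static int cloned_task(void *arg) { (void)arg; return 0; }

/* Tracee: ask to be traced, stop, then create one clone child. */
static void run_tracee(void)
{
    if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
        _exit(1);                    /* tracing not permitted here */
    raise(SIGSTOP);                  /* give the tracer time to set options */
    char *stack = malloc(65536);
    if (!stack)
        _exit(1);
    /* flags == 0: no exit signal, so wait treats this as a "clone" child */
    pid_t t = clone(cloned_task, stack + 65536, 0, NULL);
    if (t > 0)
        waitpid(t, NULL, __WALL);
    _exit(0);
}

/* Returns the number of distinct pids reported by wait,
 * or -1 if tracing could not be set up. */
int count_reported_pids(void)
{
    pid_t seen[8];
    int nseen = 0, status, i;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0)
        run_tracee();

    /* First notification: the tracee's own SIGSTOP. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)(PTRACE_O_TRACECLONE | PTRACE_O_TRACEFORK)) < 0) {
        kill(child, SIGKILL);
        waitpid(child, &status, 0);
        return -1;
    }
    ptrace(PTRACE_CONT, child, NULL, NULL);

    for (;;) {
        /* __WALL: report every tracee, whatever its exit signal is. */
        pid_t pid = waitpid(-1, &status, __WALL);
        int known = 0, sig;

        if (pid < 0)
            break;                   /* nothing left to wait for */
        for (i = 0; i < nseen; i++)
            if (seen[i] == pid)
                known = 1;
        if (!known && nseen < 8)
            seen[nseen++] = pid;
        if (!WIFSTOPPED(status))
            continue;                /* the task exited or was killed */
        sig = WSTOPSIG(status);
        /* Swallow ptrace event stops and the new child's initial SIGSTOP. */
        if (sig == SIGSTOP || (sig == SIGTRAP && (status >> 16) != 0))
            sig = 0;
        ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig);
    }
    return nseen;
}
```

With tracing permitted, I would expect count_reported_pids() to return
2, which is the pattern seen in the trace-strace log.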
In the fakeroot log, you can see the clone(RETURN) log message,
identifying the child thread 3885 being created, but all of the waits
performed only report the parent thread, 3884. This goes on until wait
returns with "nothing more to report", and pselect hangs in futile wait
for the signal to arrive.
It is not in this trace, but had I sent a non-lethal signal, you would
see the wait repeated, again saying there is nothing to report, followed
by another hang. In essence, as far as I can tell, the difference in the
waits should not have happened, as the system calls were treated the same.
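For readers who have not seen the pattern, the wait-then-pselect loop
described above looks roughly like the sketch below. This is an
illustration only, not the fakeroot-ng source. The names on_sigchld and
reap_after_pselect and the 5 second timeout are assumptions made for
the example (the real daemon waits without a timeout, which is why it
hangs).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* The handler only needs to exist so that SIGCHLD interrupts pselect. */
static void on_sigchld(int sig) { (void)sig; }

/* Returns the number of children reaped after pselect was woken by
 * SIGCHLD, or -1 on setup failure or if pselect timed out instead. */
int reap_after_pselect(void)
{
    struct sigaction sa = {0}, old_sa;
    sigset_t block, orig, during;
    struct timespec timeout = {5, 0};
    int rc, woke, reaped = 0, status;
    pid_t child;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGCHLD, &sa, &old_sa);

    /* Block SIGCHLD so it stays pending until pselect unblocks it. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &orig);

    child = fork();
    if (child == 0)
        _exit(0);
    if (child < 0) {
        sigprocmask(SIG_SETMASK, &orig, NULL);
        sigaction(SIGCHLD, &old_sa, NULL);
        return -1;
    }

    /* Sleep until a signal arrives; SIGCHLD is deliverable only here. */
    during = orig;
    sigdelset(&during, SIGCHLD);
    rc = pselect(0, NULL, NULL, NULL, &timeout, &during);
    woke = (rc < 0 && errno == EINTR);

    /* Drain notifications until wait has "nothing more to report". */
    while (waitpid(-1, &status, WNOHANG | __WALL) > 0)
        reaped++;

    sigprocmask(SIG_SETMASK, &orig, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return woke ? reaped : -1;
}
```

If the notification for a task never becomes visible to the wait call,
a loop of this shape goes straight back to sleep in pselect, which
matches the hang I am seeing.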
> I suspect there is something like this in your case.
>
You have everything you need in order to prove your suspicion. In fact,
strace is easily installable from your nearest repository, as are vi
and bash. Many will also carry fakeroot-ng, but if not, feel free to
pull the latest SVN image and compile it yourself.
> Maybe there is something that strace does and fakeroot-ng doesn't?
>
I'm sure there is. I just can't figure out what it is. The strace code
does not appear to handle this case any differently from, say, using
clone to create a new process (which is a case that works flawlessly in
fakeroot-ng).
> Setting some flag(s) to clone? Calling some system call that affects wait behaviour?
>
Same flags to clone in both cases (vi sets the same flags, and both
strace and fakeroot-ng change them to the same different flags).
I'm not aware of any settings that globally affect wait's behavior.
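The only wait-side knobs I know of are per-call option flags rather
than global settings. As a reference point, and not as a diagnosis of
the problem above, the sketch below shows how a plain waitpid and a
waitpid with __WALL treat an untraced child that was created by clone
with no exit signal. The names cloned_task and probe_wait_flags are
hypothetical, and the example assumes Linux with glibc.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Body of the cloned task (hypothetical): exit immediately. */
static int cloned_task(void *arg) { (void)arg; return 0; }

/* Returns a bit mask, or -1 if the child could not be created:
 *   bit 0: waitpid without __WALL failed with ECHILD
 *   bit 1: waitpid with __WALL reaped the child */
int probe_wait_flags(void)
{
    int result = 0, status;
    pid_t pid;
    char *stack = malloc(65536);

    if (!stack)
        return -1;
    /* flags == 0: separate address space, no signal sent on exit */
    pid = clone(cloned_task, stack + 65536, 0, NULL);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    /* Default options only consider children whose exit signal is SIGCHLD. */
    if (waitpid(pid, &status, 0) < 0 && errno == ECHILD)
        result |= 1;

    /* __WALL considers every child regardless of its exit signal. */
    if (waitpid(pid, &status, __WALL) == pid)
        result |= 2;

    free(stack);
    return result;
}
```

Both bits should come back set here, so the wait options are at least
worth comparing between the two logs.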
Shachar
--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com