[BUG] More on t5562 hangs randomly in subtests 6, 8 and 13 in 2.21.0-rc0
To
git@vger.kernel.org
Cc
max@max630.net
From
Randall S. Becker
Date
2019-02-11 16:59:00 UTC
Hi All,

I have localized the hang in t5562 (previous thread) to the
invoke-with-content-length.pl script. At least on NonStop, the perl process
hangs waiting for close($out) to complete, whether the close is explicit or
implicit (if the call is removed). The trace for the perl process shows it
hung at Perl_io_close (according to the platform's trace, anyway). My
interpretation is that the reading process is still around but is no longer
reading on that pipe. If any of the processes hanging around are killed, the
structure unwinds. However, when some of the tests are run, git-http-backend
remains running after subtest 6 and/or 8 runs even if that subtest does not
hang. The presence of other git-http-backend processes seems to interfere
with subsequent tests, and if you run tests individually, subtests 6, 8, and
13 consistently pass. Strangely, if a bunch of print statements that write
explicitly to another terminal are added, the test works consistently, so
this sounds like either a race condition or flushes not being handled
consistently, although the code appears to handle the latter case.
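
For anyone who wants to see the suspected mechanism in isolation: in Perl,
close() on a filehandle from a piped open waits for the child to exit, so a
reader that stops reading but stays alive would block the writer there. The
sketch below is not taken from the test suite. It is a rough Python analogue
with hypothetical stand-in children (READS_ALL, IGNORES_STDIN) and a
hypothetical helper (close_and_wait), and it uses a timeout where the perl
script would block.

```python
import subprocess
import sys


def close_and_wait(child_code: str, body: bytes, timeout: float) -> str:
    """Write body to a child's stdin, close the pipe, then wait with a timeout.

    Returns "exited" if the child finished in time, otherwise kills it and
    returns "still running". A blocking wait here would hang instead.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdin=subprocess.PIPE,
    )
    try:
        child.stdin.write(body)
        child.stdin.close()  # flushes and closes our end of the pipe
    except BrokenPipeError:
        pass  # child went away before consuming the body
    try:
        child.wait(timeout=timeout)
        return "exited"
    except subprocess.TimeoutExpired:
        child.kill()
        child.wait()  # reap the killed child
        return "still running"


# Hypothetical stand-ins for the reading process (assumptions, not git code).
READS_ALL = "import sys; sys.stdin.buffer.read()"  # reads to EOF, then exits
IGNORES_STDIN = "import time; time.sleep(30)"      # alive, never reads stdin

if __name__ == "__main__":
    print(close_and_wait(READS_ALL, b"x" * 1024, timeout=10.0))
    print(close_and_wait(IGNORES_STDIN, b"x" * 1024, timeout=0.5))
```

The second call is the case I suspect we are in: the child is still alive
but is not reading, so waiting on it never returns unless something kills it.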

Simply killing old git-http-backend and/or perl processes does not make a
difference, so the race may involve test contents, but I can't make that
determination. There is no correlation with system load.

That's as far as I have been able to analyze the situation at this stage.
I've CC'd the author to see whether there might be some perspective that can
come in here to help out.

This test has broken our CI process for git on NonStop because of the hang,
so it's rather important to us to get this resolved before the official
2.21.0 release.

Still hoping for help on this issue,
Randall