From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Bruce Momjian <bruce(at)momjian(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Improving connection scalability: GetSnapshotData() |
Date: | 2020-08-16 19:00:12 |
Message-ID: | 20200816190012.nqzmtiaju6ndckb2@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2020-08-16 14:30:24 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > 690 successful runs later, it didn't trigger for me :(. Seems pretty
> > clear that there's another variable than pure chance, otherwise it seems
> > like that number of runs should have hit the issue, given the number of
> > bf hits vs bf runs.
>
> It seems entirely likely that there's a timing component in this, for
> instance autovacuum coming along at just the right time. It's not too
> surprising that some machines would be more prone to show that than
> others. (Note peripatus is FreeBSD, which we've already learned has
> significantly different kernel scheduler behavior than Linux.)
Yea. Interestingly there was a reproduction on Linux since the initial
reports you'd dug up:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=butterflyfish&dt=2020-08-15%2019%3A54%3A53
but that's likely a virtualized environment, so I guess the host
scheduler behaviour could play a similar role.
I'll run a few iterations with rr's chaos mode too, which tries to
randomize scheduling decisions...
I noticed that it's quite hard to actually hit the hot tuple path I
mentioned earlier on my machine. It would probably be good to have a test
that hits it more reliably. But I'm not immediately seeing how we could
force the necessary serialization.
Greetings,
Andres Freund