 |
C++Talk.NET C++ language newsgroups
|
| View previous topic :: View next topic |
| Author |
Message |
Andreas Trauer Guest
|
Posted: Mon Aug 09, 2004 8:23 pm Post subject: Possible solution to the DCL problem (Scott Meyers, Andrei A |
|
|
Hi,
I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".
Now I was posted a code fraction that seems to give a good solution to
the problem for C++:
A * A::instancePtr = 0; // static class member
bool A::fullyInitialized = false; // static class member
....
A * A::instance() // static class member
{
if (!fullyInitialized)
{
Lock L;
if (instancePtr == 0)
{
instancePtr = new A;
}
else // !!! this line is the important one
{
fullyInitialized = true;
}
}
return instancePtr;
}
Unless the fullyInitialized flag is set, all threads have to go
through the Lock -> strong serialization, no reordering possible.
When the second thread enters the locked section, it is ensured that
the first thread has fully initialized the instance of A. Therefore it
is safe at that time to open the shortcut around the Lock, thus saving
resources.
Now the question: Do you find a multithreading/multiprocessor problem
in there, which could arise with highly optimizing C++ compilers /
-linkers.
I think it is not even necessary to use volatile variables here.
Of course it takes at least two lock instantiations instead of one,
but who cares...
Andreas
(answers per E-Mail may get lost in the spam filters)
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Joe Seigh Guest
|
Posted: Mon Aug 09, 2004 9:10 pm Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Andreas Trauer wrote:
| Quote: |
Hi,
I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".
(snip)
Now the question: Do you find a multithreading/multiprocessor problem
in there, which could arise with highly optimizing C++ compilers /
-linkers.
I think it is not even necessary to use volatile variables here.
Of course it takes at least two lock instantiations instead of one,
but who cares...
|
No, it won't work.
See http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
Joe Seigh
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Andreas Trauer Guest
|
Posted: Tue Aug 10, 2004 6:37 pm Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Joe Seigh wrote:
a) The referenced page deals with Java, not C++.
b) The proposed solution is not mentioned on that page.
So why shouldn't it work?
Please be more specific.
Thanks
Andreas
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Alexander Terekhov Guest
|
|
| Back to top |
|
 |
Joe Seigh Guest
|
Posted: Wed Aug 11, 2004 10:46 am Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Andreas Trauer wrote:
| Quote: |
Joe Seigh wrote:
Andreas Trauer wrote:
Now the question: Do you find a multithreading/multiprocessor problem
in there, which could arise with highly optimizing C++ compilers /
-linkers.
No, it won't work.
See http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
a) The referenced page deals with Java, not C++.
b) The proposed solution is not mentioned on that page.
So why shouldn't it work?
Please be more specific.
|
On platforms with a relaxed memory model, both loads and stores can be out of order.
This means you need a memory barrier or something to impose ordering both on the
stores (writer side) and on the loads (reader side). On some faulty DCL implementations,
peope assume that you cannot load data before you load a pointer to that data. It's
called a dependent load. This is not true of the Alpha processor as Alexander has
pointed out. On current Intel Pentium processors it is true but is not architected in the
memory model, so you'd have to be alert for the possibility of dependent loads not working
on a future Pentium processor.
So basically you are assuming behavior that is known not to be portable. And what is
portable, mutex locks, doesn't say anything about multi-threaded behavior when you
do *not* use locks. Undefined in other words.
Joe Seigh
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Tokyo Tomy Guest
|
Posted: Wed Aug 11, 2004 10:56 am Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Andreas Trauer <andreas.trauer (AT) siemens (DOT) com> wrote
| Quote: | Hi,
I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".
...
|
Excuse me for not replaying to Andreas Trauer's inquiry, but let me
comment on the DCLP_note( referred to the note hereafter) .
Double checks in the pattern is a set of two checks and not a single
check. The example given in the note try to do with the first check
only and without the second check. I think that the example is
ill-formed as a sample of DLCP. It seems that the note misunderstands
the purpose of DLCP and I don't agree with the conclusion:
| Quote: | "There is no portable way to implement DCLP in C++".
|
In the double-checked locking patterns(DCLP), there are two checks:
1.the first check; low cost and imperfect check for locking state,
usually by using an ordinary flag,
2. the second check; high cost and perfect check for locking state to
acquire a lock, by utilizing an atomic locking mechanism supported by
platform.
Even if the first check is successful, the second check is mandatory.
So, what is the purpose of DLCP? If the first check fails, most
likely the second check fails. The first check failure advise the
thread to do other works instead of going to the second check and
doing nothing long but waiting for lock acquisition, and to come back
to the first check again later.
The example given in the note is a good and deeply analyzed example
that shows the second check is mandatory.
Of course, I welcome any proposals by Andreas Trauer or others, which
expand application of DCLP by making the second check not always
mandatory in C++.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
kanze@gabi-soft.fr Guest
|
Posted: Wed Aug 11, 2004 10:56 am Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Andreas Trauer <andreas.trauer (AT) siemens (DOT) com> wrote
| Quote: | I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".
|
Apparently, you didn't read it completely, because the problem doesn't
only exist because the C++ standard allows reordering of C++ statements.
The problem exists because modern hardware also reorders writes and
reads, independantly of the order they occur in the generated code. On
most processors today, in order to ensure order, you need some sort of
memory barrier instruction. (Exactly what varies from processor to
processor, and sometimes even from one model to the next of the same
processor.)
| Quote: | Now I was posted a code fraction that seems to give a good solution to
the problem for C++:
A * A::instancePtr = 0; // static class member
bool A::fullyInitialized = false; // static class member
...
A * A::instance() // static class member
{
if (!fullyInitialized)
{
Lock L;
if (instancePtr == 0)
{
instancePtr = new A;
}
else // !!! this line is the important one
{
fullyInitialized = true;
}
}
return instancePtr;
}
Unless the fullyInitialized flag is set, all threads have to go
through the Lock -> strong serialization, no reordering possible.
When the second thread enters the locked section, it is ensured that
the first thread has fully initialized the instance of A. Therefore it
is safe at that time to open the shortcut around the Lock, thus saving
resources.
|
And when the third thread comes along and reads fullyInitialized as
true, it may use stale values for instancePtr and whatever is in A,
because it won't be synchronized. (I believe that a variant using a
thread local "fullyInitialized" may be safe. It ensures that *all*
threads acquire the lock at least once.)
--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
johnchx Guest
|
Posted: Wed Aug 11, 2004 10:58 am Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Alexander Terekhov <terekhov (AT) web (DOT) de> wrote
I'll try to explain what Alexander is driving at -- or at least my
understanding of what his point is.
Certain architectures effectively require programs to maintain cache
coherency "by hand" -- that is, if CPU0 caches the contents of an
address, and CPU1 writes to that address, CPU0 will still "see" only
the cached value...until a cache coherency instruction is executed on
CPU0 to "synch up" with main memory. (Actually, if I understand the
description of the Alpha, CPU1 will queue a notification to CPU0 that
it has performed a write, and CPU0 will eventually notice...but at any
point in time, CPU0 may have one or more cache invalidation
notifications "waiting to be noticed.")
This is, IIRC, *not* how things work on the i386. In i386-land, cache
coherency is handled more or less by magic, and user code doesn't have
to worry about it.
But there's more to life than Intel.....
So, to apply this to the OP's code:
Assume that our instance is fully initialized by threads running on
CPU1. So, the ctor has been run, instancePtr is properly initialized,
some other thread on CPU1 has set fullyInitialized = true, and all's
right with the world. Further, assume that CPU1 has executed a memory
barrier, so its writes to memory are fully visible to any CPU on the
system that cares to look.
Now imagine a thread on CPU0 executes A::instance(), and assume that
instancePtr and fullyInitialized happen to be cached on CPU0.
Suppose, further, that CPU0 has processed the notification that
fullyInitialized has been changed, but that it has *not* processed the
notification of the change in instancePtr. The new value for
fullyInitialized will then be read from memory: it's true. So CPU0
skips the further testing and returns the *old* value for instancePtr,
i.e. 0.
Worse still, it is possible for CPU0 to obtain a correct value for
instancePtr, but still "see" an outdated cached version of the value
stored there...i.e. *instancePtr might refer to arbitary uninitialized
bytes.
I *think* that's the point, anyway. :-)
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
David Olsen Guest
|
Posted: Wed Aug 11, 2004 10:59 am Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Andreas Trauer wrote:
| Quote: | I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".
|
The real problem with DCLP, which makes not work in any language, is not
that the compiler may reorder the statements, but that other threads
running on a different processor may see the memory writes in a
different order which results in accessing a variable that hasn't been
initialized.
Your "fix" does not solve that problem.
| Quote: | Now I was posted a code fraction that seems to give a good solution to
the problem for C++:
A * A::instancePtr = 0; // static class member
bool A::fullyInitialized = false; // static class member
A * A::instance() // static class member
{
if (!fullyInitialized)
{
Lock L;
if (instancePtr == 0)
{
instancePtr = new A;
}
else // !!! this line is the important one
{
fullyInitialized = true;
}
}
return instancePtr;
}
|
If you have only two threads, and each thread calls A::instance()
exactly once, then this will work (I think). But it breaks down in the
general case.
Assume two threads running on two different processors. Thread 1 calls
A::instance(), acquires the lock, creates a new object, and sets
A::instancePtr. Thread 1 then calls A::instance() again a very short
time later, acquires the lock again, and sets A::fullyInitialized to
true. Then Thread 2 calls A::instance(). Because thread 2 has never
acquired the lock, its memory hasn't been synchronized with thread 1.
Therefore, it may see none, all, or a subset of the memory writes done
by thread 1. If thread 2 sees all the memory writes, then everything is
fine and A::instance() correctly returns the singleton object without
acquiring the lock. If thread 2 doesn't see the write to
A::fullyInitialized, then it acquires the lock, its memory gets
synchronized, and everything is fine. The problem comes if thread 2
sees the write to A::fullyInitialized but not some of the other writes.
In that case A::instance() either returns a null pointer (if thread 2
didn't see the write to A::instancePtr) or it returns an unitialized
object (if thread 2 didn't see the memory writes that initialized the
singleton object).
You also have problems if three different threads all call A::instance()
at about the same time, but I'll leave it to you to work out all the
details.
If multiple threads are accessing a piece of memory and at least one
thread is writing to that memory, all accesses to that memory must be
properly synchronized. There is no way around that rule.
--
David Olsen
[email]qg4h9ykc5m (AT) yahoo (DOT) com[/email]
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Alexander Terekhov Guest
|
Posted: Wed Aug 11, 2004 7:40 pm Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
johnchx wrote:
[...]
| Quote: | This is, IIRC, *not* how things work on the i386. In i386-land, cache
coherency is handled more or less by magic, and user code doesn't have
to worry about it.
|
Wintel will of course ensure that you can run legacy binaries on
future hardware (with the help of the OS or whatever) not worrying
about more relaxed memory model a la "lazy release consistency"; I
mean things like MTRR/MSRs on newer IA32 hardware and the mapping of
IA32 legacy memory model to Itanium model:
http://groups.google.com/groups?selm=4065BB8C.F6ED94CE%40web.de
(Subject: Re: How to make an update immediately available to all CPUs)
However, if you're concerned with "general" portability to future
Intel IA32 (with AMD's 64 bit extensions, of course ) hardware,
their message is crystal clear:
http://groups.google.com/groups?selm=3C3C9B63.DFDC9920%40web.de
(Subject: Re: Multi-Processor Concurrency Problem)
| Quote: | But there's more to life than Intel.....
|
Yeah.
http://groups.google.com/groups?selm=40E57DD8.1C8E2FC2%40web.de
(Subject: Re: acquiring a new value/ without any mutexes)
See also ("plan9 problem" on IA32):
http://groups.google.com/groups?selm=40A28B51.93C0F6A0%40web.de
(Subject: Re: Q about "visibilty")
regards,
alexander.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Scott Meyers Guest
|
Posted: Thu Aug 12, 2004 12:10 pm Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
On 9 Aug 2004 16:23:35 -0400, Andreas Trauer wrote:
| Quote: | I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same
|
This is ONE cause of problems. The notes also discuss the memory visibility
issues that most other posters have brought up. (That material is on slides
24-32).
Scott
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Alexander Terekhov Guest
|
Posted: Fri Aug 13, 2004 4:44 pm Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Scott Meyers wrote:
| Quote: |
On 9 Aug 2004 16:23:35 -0400, Andreas Trauer wrote:
I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same
This is ONE cause of problems. The notes also discuss the memory visibility
issues that most other posters have brought up. (That material is on slides
24-32).
|
Not bad up to Pg. 22. "Multiprocessors, Cache Coherency, and
Memory Barriers" part is somewhat screwed. For example, Pg. 34
and Pg. 35:
Keyboard* temp = pInstance;
Perform acquire;
...
(that notation sucks, BTW) is not really the same (with
respect to reordering) as
Keyboard* temp = pInstance;
Lock L1(args); // acquire
...
because the later can be transformed to
Lock L1(args); // acquire
Keyboard* temp = pInstance;
...
While it does stress the point of Pg. 36, the difference is
quite significant and it can really hurt you in some other
context. Beware.
regards,
alexander.
P.S. atomic<> is the way to go.
--
http://www.google.com/groups?selm=40867BB7.784FEBFE%40web.de
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Scott Meyers Guest
|
Posted: Sat Aug 14, 2004 10:29 am Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
On 13 Aug 2004 12:44:11 -0400, Alexander Terekhov wrote:
| Quote: | Keyboard* temp = pInstance;
Perform acquire;
...
(that notation sucks, BTW) is not really the same (with
respect to reordering) as
Keyboard* temp = pInstance;
Lock L1(args); // acquire
...
because the later can be transformed to
Lock L1(args); // acquire
Keyboard* temp = pInstance;
...
|
Do you mean that compilers will perform code motion of a statement involving a
nonlocal like pInstance across whatever happens inside the Lock constructor, a
constructor that might well make a system call whose internals are unknown to
the compiler and therefore could, in theory, modify pInstance? Do you mean
that compilers will move reads across an acquire? If so, what is your
reasoning for the validity or likely occurrence of such code motion? If not
what do you mean?
| Quote: | While it does stress the point of Pg. 36, the difference is
quite significant and it can really hurt you in some other
context. Beware.
|
Can you please elaborate on this?
| Quote: | P.S. atomic<> is the way to go.
|
What is atomic<>?
Scott
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Alexander Terekhov Guest
|
Posted: Sun Aug 15, 2004 10:53 am Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
Scott Meyers wrote: ...
I'll post a reply on Monday to comp.programming.threads (we're going
to drift off c.l.c++.m topic ).
regards,
alexander.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Scott Meyers Guest
|
Posted: Mon Aug 16, 2004 10:44 am Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr |
|
|
On 15 Aug 2004 06:53:52 -0400, Alexander Terekhov wrote:
| Quote: | I'll post a reply on Monday to comp.programming.threads (we're going
to drift off c.l.c++.m topic ).
|
From my perspective, the topic is compiler optimizations, and I
encourage you to cross-post to this newsgroup as long as that topic
is part of the thread. When it's not, the moderators will doubtless
let us know.
Scott
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|