C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Possible solution to the DCL problem (Scott Meyers, Andrei A
Goto page 1, 2, 3, 4  Next
 
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated)
View previous topic :: View next topic  
Author Message
Andreas Trauer
Guest





PostPosted: Mon Aug 09, 2004 8:23 pm    Post subject: Possible solution to the DCL problem (Scott Meyers, Andrei A Reply with quote



Hi,

I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".

Now I was posted a code fraction that seems to give a good solution to
the problem for C++:

A * A::instancePtr = 0; // static class member
bool A::fullyInitialized = false; // static class member

....

A * A::instance() // static class member
{
if (!fullyInitialized)
{
Lock L;
if (instancePtr == 0)
{
instancePtr = new A;
}
else // !!! this line is the important one
{
fullyInitialized = true;
}
}
return instancePtr;
}

Unless the fullyInitialized flag is set, all threads have to go
through the Lock -> strong serialization, no reordering possible.
When the second thread enters the locked section, it is ensured that
the first thread has fully initialized the instance of A. Therefore it
is safe at that time to open the shortcut around the Lock, thus saving
resources.

Now the question: Do you find a multithreading/multiprocessor problem
in there, which could arise with highly optimizing C++ compilers /
-linkers.

I think it is not even necessary to use volatile variables here.

Of course it takes at least two lock instantiations instead of one,
but who cares...


Andreas

(answers per E-Mail may get lost in the spam filters)

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
Back to top
Joe Seigh
Guest





PostPosted: Mon Aug 09, 2004 9:10 pm    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote





Andreas Trauer wrote:
Quote:

Hi,

I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".

(snip)
Now the question: Do you find a multithreading/multiprocessor problem
in there, which could arise with highly optimizing C++ compilers /
-linkers.

I think it is not even necessary to use volatile variables here.

Of course it takes at least two lock instantiations instead of one,
but who cares...


No, it won't work.

See http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

Joe Seigh

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Andreas Trauer
Guest





PostPosted: Tue Aug 10, 2004 6:37 pm    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote



Joe Seigh wrote:
Quote:
Andreas Trauer wrote:

Now the question: Do you find a multithreading/multiprocessor problem
in there, which could arise with highly optimizing C++ compilers /
-linkers.

No, it won't work.

See http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

a) The referenced page deals with Java, not C++.
b) The proposed solution is not mentioned on that page.

So why shouldn't it work?

Please be more specific.

Thanks
Andreas





[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Alexander Terekhov
Guest





PostPosted: Tue Aug 10, 2004 7:16 pm    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote


Andreas Trauer wrote:
[...]
Quote:
Please be more specific.

http://www.cs.umd.edu/~pugh/java/memoryModel/AlphaReordering.html

regards,
alexander.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Joe Seigh
Guest





PostPosted: Wed Aug 11, 2004 10:46 am    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote



Andreas Trauer wrote:
Quote:

Joe Seigh wrote:
Andreas Trauer wrote:

Now the question: Do you find a multithreading/multiprocessor problem
in there, which could arise with highly optimizing C++ compilers /
-linkers.

No, it won't work.

See http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

a) The referenced page deals with Java, not C++.
b) The proposed solution is not mentioned on that page.

So why shouldn't it work?

Please be more specific.


On platforms with a relaxed memory model, both loads and stores can be out of order.
This means you need a memory barrier or something to impose ordering both on the
stores (writer side) and on the loads (reader side). On some faulty DCL implementations,
peope assume that you cannot load data before you load a pointer to that data. It's
called a dependent load. This is not true of the Alpha processor as Alexander has
pointed out. On current Intel Pentium processors it is true but is not architected in the
memory model, so you'd have to be alert for the possibility of dependent loads not working
on a future Pentium processor.

So basically you are assuming behavior that is known not to be portable. And what is
portable, mutex locks, doesn't say anything about multi-threaded behavior when you
do *not* use locks. Undefined in other words.

Joe Seigh

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Tokyo Tomy
Guest





PostPosted: Wed Aug 11, 2004 10:56 am    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote

Andreas Trauer <andreas.trauer (AT) siemens (DOT) com> wrote

Quote:
Hi,

I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".

...


Excuse me for not replaying to Andreas Trauer's inquiry, but let me
comment on the DCLP_note( referred to the note hereafter) .

Double checks in the pattern is a set of two checks and not a single
check. The example given in the note try to do with the first check
only and without the second check. I think that the example is
ill-formed as a sample of DLCP. It seems that the note misunderstands
the purpose of DLCP and I don't agree with the conclusion:
Quote:
"There is no portable way to implement DCLP in C++".

In the double-checked locking patterns(DCLP), there are two checks:
1.the first check; low cost and imperfect check for locking state,
usually by using an ordinary flag,
2. the second check; high cost and perfect check for locking state to
acquire a lock, by utilizing an atomic locking mechanism supported by
platform.

Even if the first check is successful, the second check is mandatory.
So, what is the purpose of DLCP? If the first check fails, most
likely the second check fails. The first check failure advise the
thread to do other works instead of going to the second check and
doing nothing long but waiting for lock acquisition, and to come back
to the first check again later.

The example given in the note is a good and deeply analyzed example
that shows the second check is mandatory.

Of course, I welcome any proposals by Andreas Trauer or others, which
expand application of DCLP by making the second check not always
mandatory in C++.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
kanze@gabi-soft.fr
Guest





PostPosted: Wed Aug 11, 2004 10:56 am    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote

Andreas Trauer <andreas.trauer (AT) siemens (DOT) com> wrote


Quote:
I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".

Apparently, you didn't read it completely, because the problem doesn't
only exist because the C++ standard allows reordering of C++ statements.
The problem exists because modern hardware also reorders writes and
reads, independantly of the order they occur in the generated code. On
most processors today, in order to ensure order, you need some sort of
memory barrier instruction. (Exactly what varies from processor to
processor, and sometimes even from one model to the next of the same
processor.)

Quote:
Now I was posted a code fraction that seems to give a good solution to
the problem for C++:

A * A::instancePtr = 0; // static class member
bool A::fullyInitialized = false; // static class member

...

A * A::instance() // static class member
{
if (!fullyInitialized)
{
Lock L;
if (instancePtr == 0)
{
instancePtr = new A;
}
else // !!! this line is the important one
{
fullyInitialized = true;
}
}
return instancePtr;
}

Unless the fullyInitialized flag is set, all threads have to go
through the Lock -> strong serialization, no reordering possible.

When the second thread enters the locked section, it is ensured that
the first thread has fully initialized the instance of A. Therefore it
is safe at that time to open the shortcut around the Lock, thus saving
resources.

And when the third thread comes along and reads fullyInitialized as
true, it may use stale values for instancePtr and whatever is in A,
because it won't be synchronized. (I believe that a variant using a
thread local "fullyInitialized" may be safe. It ensures that *all*
threads acquire the lock at least once.)

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
johnchx
Guest





PostPosted: Wed Aug 11, 2004 10:58 am    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote

Alexander Terekhov <terekhov (AT) web (DOT) de> wrote
Quote:
Andreas Trauer wrote:
[...]
Please be more specific.

http://www.cs.umd.edu/~pugh/java/memoryModel/AlphaReordering.html


I'll try to explain what Alexander is driving at -- or at least my
understanding of what his point is.

Certain architectures effectively require programs to maintain cache
coherency "by hand" -- that is, if CPU0 caches the contents of an
address, and CPU1 writes to that address, CPU0 will still "see" only
the cached value...until a cache coherency instruction is executed on
CPU0 to "synch up" with main memory. (Actually, if I understand the
description of the Alpha, CPU1 will queue a notification to CPU0 that
it has performed a write, and CPU0 will eventually notice...but at any
point in time, CPU0 may have one or more cache invalidation
notifications "waiting to be noticed.")

This is, IIRC, *not* how things work on the i386. In i386-land, cache
coherency is handled more or less by magic, and user code doesn't have
to worry about it.
But there's more to life than Intel.....

So, to apply this to the OP's code:

Assume that our instance is fully initialized by threads running on
CPU1. So, the ctor has been run, instancePtr is properly initialized,
some other thread on CPU1 has set fullyInitialized = true, and all's
right with the world. Further, assume that CPU1 has executed a memory
barrier, so its writes to memory are fully visible to any CPU on the
system that cares to look.

Now imagine a thread on CPU0 executes A::instance(), and assume that
instancePtr and fullyInitialized happen to be cached on CPU0.
Suppose, further, that CPU0 has processed the notification that
fullyInitialized has been changed, but that it has *not* processed the
notification of the change in instancePtr. The new value for
fullyInitialized will then be read from memory: it's true. So CPU0
skips the further testing and returns the *old* value for instancePtr,
i.e. 0.

Worse still, it is possible for CPU0 to obtain a correct value for
instancePtr, but still "see" an outdated cached version of the value
stored there...i.e. *instancePtr might refer to arbitary uninitialized
bytes.

I *think* that's the point, anyway. :-)

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
David Olsen
Guest





PostPosted: Wed Aug 11, 2004 10:59 am    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote

Andreas Trauer wrote:
Quote:
I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same - with the bad side
effect that the "observable behavior" in C++ is not multi-threading
aware and this finally leads to Scott's conclusion: "There is no
portable way to implement DCLP in C++".

The real problem with DCLP, which makes not work in any language, is not
that the compiler may reorder the statements, but that other threads
running on a different processor may see the memory writes in a
different order which results in accessing a variable that hasn't been
initialized.

Your "fix" does not solve that problem.

Quote:
Now I was posted a code fraction that seems to give a good solution to
the problem for C++:

A * A::instancePtr = 0; // static class member
bool A::fullyInitialized = false; // static class member

A * A::instance() // static class member
{
if (!fullyInitialized)
{
Lock L;
if (instancePtr == 0)
{
instancePtr = new A;
}
else // !!! this line is the important one
{
fullyInitialized = true;
}
}
return instancePtr;
}

If you have only two threads, and each thread calls A::instance()
exactly once, then this will work (I think). But it breaks down in the
general case.

Assume two threads running on two different processors. Thread 1 calls
A::instance(), acquires the lock, creates a new object, and sets
A::instancePtr. Thread 1 then calls A::instance() again a very short
time later, acquires the lock again, and sets A::fullyInitialized to
true. Then Thread 2 calls A::instance(). Because thread 2 has never
acquired the lock, its memory hasn't been synchronized with thread 1.
Therefore, it may see none, all, or a subset of the memory writes done
by thread 1. If thread 2 sees all the memory writes, then everything is
fine and A::instance() correctly returns the singleton object without
acquiring the lock. If thread 2 doesn't see the write to
A::fullyInitialized, then it acquires the lock, its memory gets
synchronized, and everything is fine. The problem comes if thread 2
sees the write to A::fullyInitialized but not some of the other writes.
In that case A::instance() either returns a null pointer (if thread 2
didn't see the write to A::instancePtr) or it returns an unitialized
object (if thread 2 didn't see the memory writes that initialized the
singleton object).

You also have problems if three different threads all call A::instance()
at about the same time, but I'll leave it to you to work out all the
details.

If multiple threads are accessing a piece of memory and at least one
thread is writing to that memory, all accesses to that memory must be
properly synchronized. There is no way around that rule.

--
David Olsen
[email]qg4h9ykc5m (AT) yahoo (DOT) com[/email]


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Alexander Terekhov
Guest





PostPosted: Wed Aug 11, 2004 7:40 pm    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote


johnchx wrote:
[...]
Quote:
This is, IIRC, *not* how things work on the i386. In i386-land, cache
coherency is handled more or less by magic, and user code doesn't have
to worry about it.

Wintel will of course ensure that you can run legacy binaries on
future hardware (with the help of the OS or whatever) not worrying
about more relaxed memory model a la "lazy release consistency"; I
mean things like MTRR/MSRs on newer IA32 hardware and the mapping of
IA32 legacy memory model to Itanium model:

http://groups.google.com/groups?selm=4065BB8C.F6ED94CE%40web.de
(Subject: Re: How to make an update immediately available to all CPUs)

However, if you're concerned with "general" portability to future
Intel IA32 (with AMD's 64 bit extensions, of course Wink ) hardware,
their message is crystal clear:

http://groups.google.com/groups?selm=3C3C9B63.DFDC9920%40web.de
(Subject: Re: Multi-Processor Concurrency Problem)

Quote:
But there's more to life than Intel.....

Yeah.

http://groups.google.com/groups?selm=40E57DD8.1C8E2FC2%40web.de
(Subject: Re: acquiring a new value/ without any mutexes)

See also ("plan9 problem" on IA32):

http://groups.google.com/groups?selm=40A28B51.93C0F6A0%40web.de
(Subject: Re: Q about "visibilty")

regards,
alexander.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Scott Meyers
Guest





PostPosted: Thu Aug 12, 2004 12:10 pm    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote

On 9 Aug 2004 16:23:35 -0400, Andreas Trauer wrote:
Quote:
I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same

This is ONE cause of problems. The notes also discuss the memory visibility
issues that most other posters have brought up. (That material is on slides
24-32).

Scott

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Alexander Terekhov
Guest





PostPosted: Fri Aug 13, 2004 4:44 pm    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote


Scott Meyers wrote:
Quote:

On 9 Aug 2004 16:23:35 -0400, Andreas Trauer wrote:
I have read the notes on the problems with double-checked locking
patterns(DCLP) by Scott Meyers
([url]http://www.nwcpp.org/Downloads/2004/DCLP_notes.pdf)[/url]. The problem
exists, because the C++ standard allows reordering of C++ statements,
as long as the observable behaviour stays the same

This is ONE cause of problems. The notes also discuss the memory visibility
issues that most other posters have brought up. (That material is on slides
24-32).

Not bad up to Pg. 22. "Multiprocessors, Cache Coherency, and
Memory Barriers" part is somewhat screwed. For example, Pg. 34
and Pg. 35:

Keyboard* temp = pInstance;
Perform acquire;
...

(that notation sucks, BTW) is not really the same (with
respect to reordering) as

Keyboard* temp = pInstance;
Lock L1(args); // acquire
...

because the later can be transformed to

Lock L1(args); // acquire
Keyboard* temp = pInstance;
...

While it does stress the point of Pg. 36, the difference is
quite significant and it can really hurt you in some other
context. Beware.

regards,
alexander.

P.S. atomic<> is the way to go.

--
http://www.google.com/groups?selm=40867BB7.784FEBFE%40web.de

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Scott Meyers
Guest





PostPosted: Sat Aug 14, 2004 10:29 am    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote

On 13 Aug 2004 12:44:11 -0400, Alexander Terekhov wrote:
Quote:
Keyboard* temp = pInstance;
Perform acquire;
...

(that notation sucks, BTW) is not really the same (with
respect to reordering) as

Keyboard* temp = pInstance;
Lock L1(args); // acquire
...

because the later can be transformed to

Lock L1(args); // acquire
Keyboard* temp = pInstance;
...

Do you mean that compilers will perform code motion of a statement involving a
nonlocal like pInstance across whatever happens inside the Lock constructor, a
constructor that might well make a system call whose internals are unknown to
the compiler and therefore could, in theory, modify pInstance? Do you mean
that compilers will move reads across an acquire? If so, what is your
reasoning for the validity or likely occurrence of such code motion? If not
what do you mean?

Quote:
While it does stress the point of Pg. 36, the difference is
quite significant and it can really hurt you in some other
context. Beware.

Can you please elaborate on this?

Quote:
P.S. atomic<> is the way to go.

What is atomic<>?

Scott

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Alexander Terekhov
Guest





PostPosted: Sun Aug 15, 2004 10:53 am    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote


Scott Meyers wrote: ...

I'll post a reply on Monday to comp.programming.threads (we're going
to drift off c.l.c++.m topic Wink ).

regards,
alexander.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
Back to top
Scott Meyers
Guest





PostPosted: Mon Aug 16, 2004 10:44 am    Post subject: Re: Possible solution to the DCL problem (Scott Meyers, Andr Reply with quote

On 15 Aug 2004 06:53:52 -0400, Alexander Terekhov wrote:
Quote:
I'll post a reply on Monday to comp.programming.threads (we're going
to drift off c.l.c++.m topic Wink ).

From my perspective, the topic is compiler optimizations, and I
encourage you to cross-post to this newsgroup as long as that topic
is part of the thread. When it's not, the moderators will doubtless
let us know.

Scott

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Display posts from previous:   
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated) All times are GMT
Goto page 1, 2, 3, 4  Next
Page 1 of 4

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.