C++Talk.NET C++ language newsgroups
Greg Hickman Guest
Posted: Fri Jul 02, 2004 5:40 am Post subject: Splicing/Concatenation and Undefined Behavior
Why do 2.1(2) and 2.1(4) say that undefined behavior occurs if a character
sequence that matches the syntax of a universal character name results from
splicing physical source lines or token concatenation? Does this mean it's
possible to construct a well-formed program that unwittingly contains such
undefined behavior? If so, what might it look like and what can we do to
prevent it?
Thanks,
Greg
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]

James Kuyper Guest
Posted: Fri Jul 02, 2004 7:47 pm Post subject: Re: Splicing/Concatenation and Undefined Behavior
Greg Hickman <greg.hickman (AT) lmco (DOT) com> wrote
| Quote: | Why do 2.1(2) and 2.1(4) say that undefined behavior occurs if a character
sequence that matches the syntax of a universal character name results from
splicing physical source lines or token concatenation? Does this mean it's
possible to construct a well-formed program that unwittingly contains such
undefined behavior? If so, what might it look like and what can we do to
prevent it?
|
// Splice occurs as described in 2.1p2:
int hello\u03\
88 = 0;
// Concatenation occurs as described in 2.1p4:
#define STR(a,b) a##b
#define STRING(a,b) STR(a,b)
int STRING(world\u03,89) = 1;
Note: in contrast, the following code would be no problem:
int hello\u0388 = 0;
int world\u0399 = 1;
To avoid the problem, just be careful about using token concatenation
and escaped newlines.

Greg Hickman Guest
Posted: Fri Jul 02, 2004 8:46 pm Post subject: Re: Splicing/Concatenation and Undefined Behavior
"James Kuyper" <kuyper (AT) wizard (DOT) net> wrote
| Quote: |
// Splice occurs as described in 2.1p2:
int hello\u03\
88 = 0;
// Concatenation occurs as described in 2.1p4:
#define STR(a,b) a##b
#define STRING(a,b) STR(a,b)
int STRING(world\u03,89) = 1;
Note: in contrast, the following code would be no problem:
int hello\u0388 = 0;
int world\u0399 = 1;
To avoid the problem, just be careful about using token concatenation
and escaped newlines.
|
I thought these might be the kinds of scenarios being described in the
standard, but wanted to be sure. It isn't readily apparent to me why they
can lead to undefined behavior.
Thanks,
Greg

James Kuyper Guest
Posted: Sun Jul 04, 2004 7:22 am Post subject: Re: Splicing/Concatenation and Undefined Behavior
greg.hickman (AT) lmco (DOT) com (Greg Hickman) wrote in message news:<cc4enu$lcq3 (AT) cui1 (DOT) lmms.lmco.com>...
| Quote: | "James Kuyper" <kuyper (AT) wizard (DOT) net> wrote in message
news:8b42afac.0407020400.300d52d8 (AT) posting (DOT) google.com...
// Splice occurs as described in 2.1p2:
int hello\u03\
88 = 0;
// Concatenation occurs as described in 2.1p4:
#define STR(a,b) a##b
#define STRING(a,b) STR(a,b)
int STRING(world\u03,89) = 1;
Note: in contrast, the following code would be no problem:
int hello\u0388 = 0;
int world\u0399 = 1;
To avoid the problem, just be careful about using token concatenation
and escaped newlines.
I thought these might be the kinds of scenarios being described in the
standard, but wanted to be sure. It isn't readily apparent to me why they
can lead to undefined behavior.
|
Giving such splices undefined behavior relieves the implementation of
any responsibility for checking whether the results of splices contain
UCNs. On an implementation that takes advantage of that fact, the most
likely form of undefined behavior would be a failure to recognize that
a spliced-together identifier matches one that never needed splicing.
Thus, it wouldn't recognize the following two statements as containing
the same identifier:
int hello\u0388;
hello\u03\
88 = 1;
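As a rough illustration of that mismatch, here is a toy C++ model. It is only a sketch and is not taken from any real compiler: the helper names splice and recognize_ucns are hypothetical, UCN recognition is modeled by rewriting \uXXXX as <U+XXXX>, and escaped backslashes and \U forms are ignored. The only point is that the result depends on whether UCNs are recognized before or after splicing.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Toy model of line splicing: delete every backslash-newline pair.
std::string splice(const std::string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '\\' && i + 1 < s.size() && s[i + 1] == '\n') {
            ++i;  // skip the newline as well
            continue;
        }
        out += s[i];
    }
    return out;
}

// Toy model of UCN recognition: rewrite each \uXXXX as <U+XXXX>.
// (Assumption: no escaped backslashes, no \UXXXXXXXX forms.)
std::string recognize_ucns(const std::string& s) {
    std::string out;
    std::size_t i = 0;
    while (i < s.size()) {
        bool ucn = s[i] == '\\' && i + 5 < s.size() && s[i + 1] == 'u';
        for (std::size_t k = 2; ucn && k <= 5; ++k) {
            ucn = std::isxdigit(static_cast<unsigned char>(s[i + k])) != 0;
        }
        if (ucn) {
            out += "<U+" + s.substr(i + 2, 4) + ">";
            i += 6;
        } else {
            out += s[i++];
        }
    }
    return out;
}
```

With the spliced spelling "hello\u03\" + newline + "88", recognize_ucns(splice(src)) yields hello<U+0388>, the same as the unspliced hello\u0388. In the other order, splice(recognize_ucns(src)) leaves the raw text hello\u0388 with no UCN recognized, so a toy implementation working that way would treat the two spellings as different identifiers.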

Ben Hutchings Guest
Posted: Mon Jul 05, 2004 10:03 pm Post subject: Re: Splicing/Concatenation and Undefined Behavior
Greg Hickman wrote:
| Quote: | Why do 2.1(2) and 2.1(4) say that undefined behavior occurs if a character
sequence that matches the syntax of a universal character name results from
splicing physical source lines or token concatenation?
|
The justification I see is that this allows implementations to
interpret universal-character-names at whatever stage is most
convenient. This may not be the actual reason for the decision.
| Quote: | Does this mean it's possible to construct a well-formed program that
unwittingly contains such undefined behavior? If so, what might it
look like and what can we do to prevent it?
|
It is possible to contrive a program with undefined behaviour due to
2.1(2):
int \u\
0100 = 0;
As for 2.1(4), I must admit I have found it useful to construct UCNs
by token concatenation in a macro in order to work with an
implementation that does not support UCNs (VC++ 6), for which the
macro was defined differently. If you only need to support more
standard compilers then I do not see that this would be necessary.
I think it should only take a little self-discipline to ensure that
whenever you type \u you also type a full 4 hex digits after it. A
search through your source files with the (Perl-style) regexp:
(^|[^\\])(\\\\)*\\(\\\nu|u.{0,3}[^0-9A-Fa-f])
should reveal any deviation from this rule, unless you use trigraphs,
in which case you would need a slightly more complex and rather longer
regex.
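For anyone who would rather not rely on a regex, a similar check can be sketched as a small C++ function. This is only an illustration under stated assumptions: the name spliced_ucn_lines is hypothetical, it looks only at the 4-digit \u form, it ignores trigraphs and raw string literals, and it reports only UCN-like sequences that come into existence through a backslash-newline splice (the 2.1(2) case), not ones produced by token concatenation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the 1-based physical line numbers on which
// a \uXXXX sequence starts that is only formed by backslash-newline splicing.
std::vector<int> spliced_ucn_lines(const std::string& src) {
    struct Ch { char c; int line; bool after_splice; };
    std::vector<Ch> text;  // source with splices removed, plus bookkeeping
    int line = 1;
    bool pending = false;  // true if a splice was just removed
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;
            ++line;
            pending = true;
            continue;
        }
        text.push_back({src[i], line, pending});
        pending = false;
        if (src[i] == '\n') ++line;
    }

    std::vector<int> hits;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i].c != '\\') continue;
        std::size_t j = i;  // find the end of this run of backslashes
        while (j < text.size() && text[j].c == '\\') ++j;
        bool odd_run = (j - i) % 2 == 1;  // last backslash is not escaped
        if (odd_run && j + 4 < text.size() && text[j].c == 'u') {
            bool hex = true;
            bool spliced = text[j].after_splice;
            for (std::size_t k = 1; k <= 4; ++k) {
                hex = hex && std::isxdigit(
                          static_cast<unsigned char>(text[j + k].c)) != 0;
                spliced = spliced || text[j + k].after_splice;
            }
            if (hex && spliced) hits.push_back(text[j - 1].line);
        }
        i = j - 1;  // continue scanning after the backslash run
    }
    return hits;
}
```

Feeding it the contents of a file containing the two-line "int \u\" / "0100 = 0;" example should report the line where the backslash before u appears, while an ordinary unspliced \u0100 is not reported.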