C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 

Unsinged types
Seungbeom Kim
Posted: Fri Jul 20, 2012 7:11 pm    Post subject: Re: Unsinged types



On 2012-07-19 15:47, Ben Pfaff wrote:
Quote:

Many quantities are naturally unsigned; for example, counts and
sizes. These quantities are most naturally modeled with unsigned
types.

Things that you assume should be non-negative often turn out to be
not always so. For example, though no one thinks of a negative day
of the month, the tm_* members of struct tm are signed, because you
sometimes need to be able to represent out-of-range values.

In addition, "unsigned" in C does not only mean non-negativity,
but also implies modulo arithmetic; you can't even get the difference
between two unsigned values in the most natural way (a - b). And the
unsignedness is contagious, so (unsigned_value > -1) may not be true
and ((unsigned)negative_value > 100) may be true to your surprise.
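
A minimal C sketch of the two surprises described above. The function names are hypothetical, chosen only for illustration; the first function spells out the conversion that a mixed comparison such as (s < u) performs implicitly.

```c
#include <stdbool.h>

/* What `s < u` does when s is int and u is unsigned: the usual
   arithmetic conversions turn s into unsigned before comparing,
   so -1 becomes UINT_MAX. */
bool converted_less(int s, unsigned u)
{
    return (unsigned)s < u;
}

/* Sign-aware comparison: settle the negative case first, and only
   then compare in unsigned. */
bool safe_less(int s, unsigned u)
{
    return s < 0 || (unsigned)s < u;
}

/* The second surprise: a negative value converted to unsigned
   compares greater than 100. */
bool negative_exceeds_100(int negative_value)
{
    return (unsigned)negative_value > 100u;
}
```

With these definitions, converted_less(-1, 5u) is false while safe_less(-1, 5u) is true, and negative_exceeds_100(-1) is true.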

For these reasons, I had the feeling that unsigned was dangerous and
should be limited to values which bitwise operations were intended for
(e.g. bitmasks) or values on nominal/ordinal scales (e.g. character
codes, TCP port numbers).

The reality is, however, that everyone thinks differently, similar
debates are endless, and that unsigned is widely used for other things
(e.g. counts and sizes) even in the standard library, so you have to
live with unsigned being everywhere, and learn to be careful when
mixing signed and unsigned.

--
Seungbeom Kim

Eric Sosman
Posted: Fri Jul 20, 2012 7:41 pm    Post subject: Re: Unsinged types



On 7/20/2012 5:11 PM, Seungbeom Kim wrote:
Quote:
On 2012-07-19 15:47, Ben Pfaff wrote:

Many quantities are naturally unsigned; for example, counts and
sizes. These quantities are most naturally modeled with unsigned
types.

Things that you assume should be non-negative often turn out to be
not always so. For example, though no one thinks of a negative day
of the month, the tm_* members of struct tm are signed, because you
sometimes need to be able to represent out-of-range values.

It's also a way to help with some calculations. For example,
you can start from a given date, subtract fourteen from tm_mday,
re-normalize, and easily find out "What date was a fortnight
before January 5?"
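
The fortnight calculation can be sketched in C roughly as follows. The helper name days_before is hypothetical, and the post does not name a year, so any year passed in is an assumption of the caller. The sketch relies on mktime() re-normalizing out-of-range struct tm fields, which the C standard specifies.

```c
#include <time.h>

/* Returns the calendar date `days` days before the given date.
   `year` is the full year (e.g. 2012) and `month` is 1-12. */
struct tm days_before(int year, int month, int day, int days)
{
    struct tm t = {0};
    t.tm_year = year - 1900;
    t.tm_mon = month - 1;
    t.tm_mday = day - days;  /* may go negative; tm_mday is a signed int */
    t.tm_hour = 12;          /* midday, to stay clear of DST transitions */
    t.tm_isdst = -1;         /* let mktime work out whether DST applies */
    mktime(&t);              /* re-normalizes the out-of-range fields */
    return t;
}
```

For example, days_before(2012, 1, 5, 14) yields tm_mday 22, tm_mon 11 (December) and tm_year 111 (2011).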

Quote:
In addition, "unsigned" in C does not only mean non-negativity,
but also implies modulo arithmetic; you can't even get the difference
between two unsigned values in the most natural way (a - b).

You can't do that with signed integers, either. And in the
cases where naive subtraction doesn't work, you don't just get a
possibly surprising but predictable outcome: You get undefined
behavior. (Ever seen someone write a qsort() comparator that just
subtracts two `int' keys and returns the difference? Ever seen
the chaos that can result?)
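
For contrast with the subtracting comparator, here is a sketch of one common overflow-free alternative. The names cmp_int and sort_ints are made up for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Overflow-free comparator for qsort(). Returning (a - b) instead
   would overflow, which is undefined behavior, whenever the two keys
   are far apart (for example a large negative and a large positive). */
int cmp_int(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);   /* yields -1, 0, or 1 */
}

/* Convenience wrapper: sorts n ints in ascending order. */
void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_int);
}
```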

Quote:
And the
unsignedness is contagious, so (unsigned_value > -1) may not be true
and ((unsigned)negative_value > 100) may be true to your surprise.

Oh, come on! You might just as well complain about double-ness
being "contagious."

Besides, why blame the surprises on the unsigned operand? They
don't arise from either one of the operands in isolation, but from
the combination of the two -- so the signed operand is every bit as
much to blame as the unsigned. If you blame one, you should blame
the other equally.[*]

[*] Okay, that doesn't always happen in real life: Doheny was
acquitted of offering the bribe that Fall was convicted of taking.
But Roaring Twenties jurisprudence is a poor model for programming!

Quote:
For these reasons, I had the feeling that unsigned was dangerous and
should be limited to values which bitwise operations were intended for
(e.g. bitmasks) or values on nominal/ordinal scales (e.g. character
codes, TCP port numbers).

The reality is, however, that everyone thinks differently, similar
debates are endless, and that unsigned is widely used for other things
(e.g. counts and sizes) even in the standard library, so you have to
live with unsigned being everywhere, and learn to be careful when
mixing signed and unsigned.

... or when mixing signed integer with long double complex, or
when mixing unsigned long with pointer-to-pointer-to-T, or ... In
fact, your recommendation to "be careful when..." can be improved
by deleting "when" and everything after it. Just be careful, okay?

--
Eric Sosman
esosman@ieee-dot-org.invalid

Bill Leary
Posted: Fri Jul 20, 2012 10:21 pm    Post subject: Re: Unsinged types



"Jase Schick" wrote in message news:ju9p6k$rg0$1 (AT) speranza (DOT) aioe.org...
Quote:
Hi Does C still need unsigned types? My preferred language Java manages
perfectly well without them. Do many people ever use unsigned types
nowadays and if so why? In a 64-bit world, the extra range is rarely
worth the hassle, it seems to me.

((assuming "Unsinged" == "unsigned"))

In my work (BIOS and Embedded systems) yes, we use unsigned data all the
time. In fact, far more often than signed.

As for "... Java manages perfectly well without them." we've recently
encountered some trouble trying to work with binary data in Java because it
keeps doing interesting things with bytes (8 bits) and words (of 16 bits)
when the most significant bit is set.
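
The post does not say what the "interesting things" were. Assuming they were sign extension on widening (Java's byte and short are signed), the effect can be reproduced in C with the fixed-width types. The function names below are hypothetical.

```c
#include <stdint.h>

/* Widening a byte held in a signed 8-bit type copies the sign bit
   into the upper bits (sign extension). */
uint32_t widen_signed(int8_t b)
{
    return (uint32_t)(int32_t)b;
}

/* Widening the same bit pattern held in an unsigned 8-bit type
   leaves the upper bits zero. */
uint32_t widen_unsigned(uint8_t b)
{
    return (uint32_t)b;
}

/* The usual workaround when only a signed byte is available:
   mask off the extended bits after widening. */
uint32_t widen_masked(int8_t b)
{
    return (uint32_t)b & 0xFFu;
}
```

Here widen_signed(-128) gives 0xFFFFFF80, whereas widen_unsigned(0x80) and widen_masked(-128) both give 0x80.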

- Bill

Gordon Burditt
Posted: Fri Jul 20, 2012 10:42 pm    Post subject: Re: Unsinged types

Quote:
Hi Does C still need unsigned types?

Yes. Among other things, it's difficult to do multi-precision
math with signed integers, given the properties that C gives it.
The modulo-arithmetic properties of unsigned integers are very
useful if you need, say, multi-megabit integers.
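
As a rough illustration of why the modulo behavior helps, here is a sketch of multi-limb addition. The name bignum_add and the least-significant-limb-first layout are assumptions made for the example, not taken from any particular library.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds two n-limb numbers stored least-significant limb first.
   Writes the n-limb sum to `sum` and returns the final carry (0 or 1). */
uint32_t bignum_add(uint32_t *sum, const uint32_t *a, const uint32_t *b,
                    size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t s = a[i] + carry;  /* stored result wraps modulo 2^32 */
        uint32_t c1 = s < carry;    /* it wrapped iff it got smaller */
        s += b[i];
        uint32_t c2 = s < b[i];     /* same test for the second add */
        sum[i] = s;
        carry = c1 | c2;            /* at most one of the two can be set */
    }
    return carry;
}
```

The carry detection depends on the wrapped result being well defined; the same test written with signed limbs would run into overflow, which is undefined behavior.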

Quote:
Java manages perfectly well without
them.

It's my bet that C needs unsigned types *to implement Java*. And
no, I'm not trying to nitpick about the type of sizeof() or the
argument of malloc(). I'm thinking largely about the BigInteger
types in Java that you need to be able to do math with.

In the openjdk 7u4 distribution, I see 1998 lines in *.c files
and 955 lines in *.cpp files containing 'unsigned'.

Quote:
Do many people ever use unsigned types nowadays and if so why? In a
64-bit world, the extra range is rarely worth the hassle, it seems to me.

Everyone doesn't live in a 64-bit world. Some of the embedded processor
programmers haven't gotten past the 16-bit world yet.


Try keeping anything secret using RSA encryption with only 64-bit
keys. Even 1024 bits is marginal.

Robert Wessel
Posted: Sat Jul 21, 2012 3:38 am    Post subject: Re: Unsinged types

On Fri, 20 Jul 2012 19:42:27 -0500, gordonb.lo985 (AT) burditt (DOT) org (Gordon
Burditt) wrote:

Quote:
Hi Does C still need unsigned types?

Yes. Among other things, it's difficult to do multi-precision
math with signed integers, given the properties that C gives it.
The modulo-arithmetic properties of unsigned integers are very
useful if you need, say, multi-megabit integers.

Java manages perfectly well without
them.

It's my bet that C needs unsigned types *to implement Java*. And
no, I'm not trying to nitpick about the type of sizeof() or the
argument of malloc(). I'm thinking largely about the BigInteger
types in Java that you need to be able to do math with.

In the openjdk 7u4 distribution, I see 1998 lines in *.c files
and 955 lines in *.cpp files containing 'unsigned'.

Do many people ever use unsigned types nowadays and if so why? In a
64-bit world, the extra range is rarely worth the hassle, it seems to me.

Everyone doesn't live in a 64-bit world. Some of the embedded processor
programmers haven't gotten past the 16-bit world yet.


Heck, many haven't gotten *to* the 16-bit world yet.

Malcolm McLean
Posted: Sat Jul 21, 2012 6:31 am    Post subject: Re: Unsinged types

On Friday, 2 July 2010 22:54:28 UTC+1, Eric Sosman wrote:
Quote:
On 7/2/2010 4:45 PM, Jase Schick wrote:
Yes. The unsigned type I personally use most frequently is
size_t, but sometimes other situations arise where I want to
represent quantities that are necessarily non-negative.

That's one of my main beefs with size_t.


Often quantities are necessarily non-negative. So a negative number can be used to represent an error value. It's also a garbage detector. A random sequence of bits has a 50% chance of being a negative number. So assert(Nemployees >= 0) is practically certain to trigger after a few cycles of being passed random numbers. You can sanity test an unsigned, assert(Nemployees < 1000000), but that often leads to other issues.

Then numbers can be inherently non-negative, but intermediates may be negative.

Malcolm McLean
Posted: Sat Jul 21, 2012 6:48 am    Post subject: Re: Unsinged types

On Friday, 20 July 2012 10:00:55 UTC+1, Andrew Cooper wrote:
Quote:

A common but subtle cause of security bugs is to use a signed index into
an array rather than an unsigned one, then have said index based on user
input.

What the vast majority of programmers don't understand is that the
signed vs unsigned is not about 'what's the largest number I can
represent', but with the different semantics that the two types have.
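
One way to read the quoted point about indices, sketched in C. TABLE_SIZE and both function names are hypothetical, invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 16

/* Signed index: both bounds need checking. Forgetting the lower
   bound lets a negative, user-supplied index through. */
bool index_ok_signed(int i)
{
    return i >= 0 && i < TABLE_SIZE;
}

/* Unsigned index: a "negative" input has already wrapped to a huge
   value, so the single upper-bound check rejects it. */
bool index_ok_unsigned(size_t i)
{
    return i < TABLE_SIZE;
}
```

Both functions reject -1 (passed as (size_t)-1 in the unsigned case); the difference is only in how many checks the author has to remember.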

The problem is it's a plugs and adapters system.


Consider this

void getcursorposition(unsigned int *x, unsigned int *y)

an x, y index into a raster is necessarily unsigned, right?

Now we want to draw an octagon around the cursor. So we'll build it on top of a function void drawpolygon(). What signature would you give drawpolygon, and how would you write this code?

Ben Bacarisse
Posted: Sat Jul 21, 2012 12:24 pm    Post subject: Re: Unsinged types

Malcolm McLean <malcolm.mclean5 (AT) btinternet (DOT) com> writes:

Quote:
On Friday, 20 July 2012 10:00:55 UTC+1, Andrew Cooper wrote:

A common but subtle cause of security bugs is to use a signed index into
an array rather than an unsigned one, then have said index based on user
input.

What the vast majority of programmers don't understand is that the
signed vs unsigned is not about 'what's the largest number I can
represent', but with the different semantics that the two types have.

The problem is it's a plugs and adapters system.

Consider this

void getcursorposition(unsigned int *x, unsigned int *y)

an x, y index into a raster is necessarily unsigned, right?

Yes, but not a cursor position. I would be very unhappy with an API
that conflated these two concepts because, as your example shows, it's
natural to represent positions as signed quantities. (In fact I'd want
a position to be represented as some kind of "point" but that's another
matter.)

Quote:
Now we want to draw an octagon around the cursor. So we'll build it on
top of a function void drawpolygon(). What signature would you give
drawpolygon, and how would you write this code?

struct point { int x, y; };

void drawpolygon(struct point *pt, size_t np, bool closed);

struct point octagon[8];
unsigned int cx, cy;
getcursorposition(&cx, &cy);
for (int v = 0; v < 8; v++) {
    octagon[v].x = cx;
    octagon[v].y = cy;
    move_point_polar(&octagon[v], 360/8 * v, radius);
}
drawpolygon(octagon, 8, true);

Having another prototype for getcursorposition would not make very much
difference, though I'd probably "correct" the API like this:

static inline void getcursorposition_as_point(struct point *pt)
{
    // Why is there not a function to do this already?
    unsigned int x, y;
    getcursorposition(&x, &y);
    pt->x = x;
    pt->y = y;
}

struct point center, octagon[8];
getcursorposition_as_point(&center);
for (int v = 0; v < 8; v++) {
    octagon[v] = center;
    move_point_polar(&octagon[v], 360/8 * v, radius);
}
drawpolygon(octagon, 8, true);

How would you write it with a "better" prototype for getcursorposition?

--
Ben.

Ben Bacarisse
Posted: Sat Jul 21, 2012 2:48 pm    Post subject: Re: Unsinged types

Malcolm McLean <malcolm.mclean5 (AT) btinternet (DOT) com> writes:

Quote:
On Saturday, 21 July 2012 15:24:38 UTC+1, Ben Bacarisse wrote:
Malcolm McLean <malcolm.mclean5 (AT) btinternet (DOT) com> writes:

struct point { int x, y; };

void drawpolygon(struct point *pt, size_t np, bool closed);

unsigned int cx, cy;

snip code

octagon[v].x = cx;
octagon[v].y = cy;

On many compilers this line will generate a warning. Not unreasonably,
because unsigned int can go up to 4 billion, signed int only to 2
billion. The compiler has no way of knowing that a cursor position of
3 billion pixels is completely ridiculous.

You can of course simply add a cast. But once you start doing that
you're working against the language instead of with it.

That's why the API is wrong. You did not comment on my other solution
which is to fix the API with a point-filling version. The use of a cast
in such a function is not "working against the language" it's using the
language to fix a dubious API.

"You can't do this neatly in C" posts should include all the "rules" and
all the language features that you want to arbitrarily exclude: "no
warnings from most compilers and no casts, please". That will have the
advantage that I probably won't reply to them.

Quote:
If getcursorposition writes to unsigned, there's no nice way of
writing drawpolygon(). You can define an interface that takes
unsigneds, because every drawable polygon will necessarily be
described by unsigned x, y positions. But then you need horrible code
in the caller to adjust the polygon to take care of the corner cases. Or
you can say that negative points within the polygon are legitimate but
undrawable. That's the sane solution. But then you no longer want any
unsigned co-ordinates cluttering up the code.

These arguments are about code that no one has written, or should write.
It's a straw man. Why not just show how much simpler your code is than
mine when getcursorposition uses int *s rather than unsigned int *s?
That will make it clear just how much damage the using of unsigned has
introduced.

--
Ben.

BartC
Posted: Sat Jul 21, 2012 3:27 pm    Post subject: Re: Unsinged types

"Malcolm McLean" <malcolm.mclean5 (AT) btinternet (DOT) com> wrote in message
news:5c390903-75bd-477d-a422-592f60a17d19 (AT) googlegroups (DOT) com...
Quote:
On Saturday, 21 July 2012 15:24:38 UTC+1, Ben Bacarisse wrote:
Malcolm McLean <malcolm.mclean5 (AT) btinternet (DOT) com> writes:

struct point { int x, y; };

void drawpolygon(struct point *pt, size_t np, bool closed);

unsigned int cx, cy;

snip code

octagon[v].x = cx;
octagon[v].y = cy;

On many compilers this line will generate a warning. Not unreasonably,
because unsigned int can go up to 4 billion, signed int only to 2 billion.
The compiler has no way of knowing that a cursor position of 3 billion
pixels is completely ridiculous.

Not completely. A 1 pixel x 3 billion pixel black & white image only uses
375MB.

That would need 32-bit unsigned, or 64-bit signed, to address. (But I've
lost track of whether you're arguing for or against the use of unsigned
quantities.)
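
The arithmetic behind the 375MB figure can be checked with a small C sketch. The function names are hypothetical, and the one-bit-per-pixel layout with no row padding is an assumption; real image formats often pad each row.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes needed for a 1-bit-per-pixel image, rows packed with no padding. */
uint64_t bitmap_bytes(uint64_t width, uint64_t height)
{
    return (width * height + 7) / 8;  /* round up to whole bytes */
}

/* True if a pixel coordinate fits in a signed 32-bit integer. */
bool fits_in_int32(uint64_t coord)
{
    return coord <= (uint64_t)INT32_MAX;
}

/* True if a pixel coordinate fits in an unsigned 32-bit integer. */
bool fits_in_uint32(uint64_t coord)
{
    return coord <= (uint64_t)UINT32_MAX;
}
```

A 1 x 3,000,000,000 image gives 375,000,000 bytes, and 3,000,000,000 fits in a 32-bit unsigned value but not in a 32-bit signed one.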

--
Bartc

Malcolm McLean
Posted: Sat Jul 21, 2012 3:53 pm    Post subject: Re: Unsinged types

On Saturday, 21 July 2012 15:24:38 UTC+1, Ben Bacarisse wrote:
Quote:
Malcolm McLean <malcolm.mclean5 (AT) btinternet (DOT) com> writes:

struct point { int x, y; };

void drawpolygon(struct point *pt, size_t np, bool closed);

unsigned int cx, cy;

snip code

octagon[v].x = cx;
octagon[v].y = cy;

On many compilers this line will generate a warning. Not unreasonably, because unsigned int can go up to 4 billion, signed int only to 2 billion. The compiler has no way of knowing that a cursor position of 3 billion pixels is completely ridiculous.


You can of course simply add a cast. But once you start doing that you're working against the language instead of with it.

If getcursorposition writes to unsigned, there's no nice way of writing drawpolygon(). You can define an interface that takes unsigneds, because every drawable polygon will necessarily be described by unsigned x, y positions. But then you need horrible code in the caller to adjust the polygon to take care of the corner cases. Or you can say that negative points within the polygon are legitimate but undrawable. That's the sane solution. But then you no longer want any unsigned co-ordinates cluttering up the code.

Malcolm McLean
Posted: Sat Jul 21, 2012 6:22 pm    Post subject: Re: Unsinged types

On Saturday, 21 July 2012 17:48:51 UTC+1, Ben Bacarisse wrote:
Quote:
Malcolm McLean malcolm.mclean5 (AT) btinternet (DOT) com writes:

These are arguments are about code that no one has, or should, write.
It's a straw man. Why not just show how much simpler your code is than
mine when getcursorposition uses int *s rather than unsigned int *s?
That will make it clear just how much damage the using of unsigned has
introduced.

void drawoctogonroundcursor(void)
{
    int octx[8];
    int octy[8];
    int cx, cy;
    int d = 3; /* this gives the size of the octagon step */

    getcursorposition(&cx, &cy);
    octx[0] = cx-d; octy[0] = cy-2*d;
    octx[1] = cx+d; octy[1] = cy-2*d;
    octx[2] = cx+2*d; octy[2] = cy-d;
    octx[3] = cx+2*d; octy[3] = cy+d;
    octx[4] = cx+d; octy[4] = cy+2*d;
    octx[5] = cx-d; octy[5] = cy+2*d;
    octx[6] = cx-2*d; octy[6] = cy+d;
    octx[7] = cx-2*d; octy[7] = cy-d;

    drawpolygon(octx, octy, 8);
}

There's no messing about. We can focus completely on the drawing logic, which I might have got wrong.

Malcolm McLean
Posted: Sat Jul 21, 2012 6:37 pm    Post subject: Re: Unsinged types

On Saturday, 21 July 2012 18:27:33 UTC+1, Bart wrote:
Quote:
"Malcolm McLean" <malcolm.mclean5 (AT) btinternet (DOT) com> wrote in message
news:5c390903-75bd-477d-a422-592f60a17d19 (AT) googlegroups (DOT) com...
The compiler has no way of knowing that a cursor position of 3 billion
pixels is completely ridiculous.

Not completely. A 1 pixel x 3 billion pixel black & white image only uses
375MB.

That's why int should be 64 bits on 64 bit machines.


You could technically have a 3 billion x 1 pixel image on a 32 bit machine, but you'll only have one of them in memory at any one time, and it can't be a coloured image. But a lot of operating systems won't allow such a big array. It's annoying that signed int doesn't handle this case, but the situation is rare enough that we can live with it.

Most people with a need to handle such images will move to 64 bit operating systems with more memory. If int is 64 bits, the code will work without any need for a rewrite.

Seungbeom Kim
Posted: Sat Jul 21, 2012 7:07 pm    Post subject: Re: Unsinged types

On 2012-07-20 14:41, Eric Sosman wrote:
Quote:
On 7/20/2012 5:11 PM, Seungbeom Kim wrote:

Things that you assume should be non-negative often turn out to be
not always so. For example, though no one thinks of a negative day
of the month, the tm_* members of struct tm are signed, because you
sometimes need to be able to represent out-of-range values.

It's also a way to help with some calculations. For example,
you can start from a given date, subtract fourteen from tm_mday,
re-normalize, and easily find out "What date was a fortnight
before January 5?"

Exactly.

Quote:
In addition, "unsigned" in C does not only mean non-negativity,
but also implies modulo arithmetic; you can't even get the difference
between two unsigned values in the most natural way (a - b).

You can't do that with signed integers, either.

Of course, signed integers are not without their limits (and the design
limits should lie safely within the environmental limits).

However, *most* of the values we deal with are in small ranges that are
not too far from zero; ranges near INT_MAX, for example, but far from
zero are uncommon, if any, and they may need an upgrade to a larger
integer type anyway.

Given that, getting around in signed is usually safe, as you're near
the center, far from the edges, of the field. Doing that in unsigned,
however, is like playing near the edge and risks falling off.


                           ,- usually here
                        *****
signed    [-------+-------+-------+-------]XXXX DANGER
          |               |               |
      -INT_MAX            0            INT_MAX         UINT_MAX
                          |               |               |
unsigned       DANGER XXXX[-------+-------+-------+-------]


Quote:
And in the cases where naive subtraction doesn't work, you don't
just get a possibly surprising but predictable outcome: You get
undefined behavior.

True. But (2U - 5U) yielding 4294967293U is not very useful, no matter
how predictable it is, and failing to take that into account is a bug,
just as getting undefined behavior from signed arithmetic overflow is.
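
A small C sketch of the (2U - 5U) case and two common ways around it. The function names are hypothetical; uint32_t is used so that the wrapped value matches the 4294967293 quoted above regardless of how wide unsigned int is.

```c
#include <stdint.h>

/* Plain unsigned subtraction: well defined, but wraps modulo 2^32. */
uint32_t wrapped_diff(uint32_t a, uint32_t b)
{
    return a - b;
}

/* Magnitude of the difference, with no wraparound. */
uint32_t abs_diff(uint32_t a, uint32_t b)
{
    return a > b ? a - b : b - a;
}

/* Signed difference, computed in a type wide enough to hold it. */
int64_t signed_diff(uint32_t a, uint32_t b)
{
    return (int64_t)a - (int64_t)b;
}
```

With a = 2 and b = 5, wrapped_diff gives 4294967293, abs_diff gives 3, and signed_diff gives -3.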

Quote:
And the
unsignedness is contagious, so (unsigned_value > -1) may not be true
and ((unsigned)negative_value > 100) may be true to your surprise.

Oh, come on! You might just as well complain about double-ness
being "contagious."

Their "contagiousnesses" per se may be similar, but their effects are
different; conversion from integer to double doesn't (usually) change
the value being converted, or the result of the comparison at least,
but conversion from signed to unsigned often does, often in a very
surprising way, which is exactly my point in the paragraph above.

Quote:
Besides, why blame the surprises on the unsigned operand? They
don't arise from either one of the operands in isolation, but from
the combination of the two -- so the signed operand is every bit as
much to blame as the unsigned. If you blame one, you should blame
the other equally.[*]

[*] Okay, that doesn't always happen in real life: Doheny was
acquitted of offering the bribe that Fall was convicted of taking.
But Roaring Twenties jurisprudence is a poor model for programming!

Sorry, I don't understand the footnote.

Anyway, you're right that the surprises arise from the combination.
If your values are usually closer to the upper limit of signed (say,
INT_MAX) than to zero, then signed gets in the way more often and may
deserve more "blame." If, on the other hand, they are usually closer
to the lower limit of unsigned, i.e. zero, then unsigned more often
gets in the way. If both cases happen equally, you may want to blame
them equally. In most of the cases I encounter, they don't.
Your mileage may vary, though.

Quote:
The reality is, however, that everyone thinks differently, similar
debates are endless, and that unsigned is widely used for other things
(e.g. counts and sizes) even in the standard library, so you have to
live with unsigned being everywhere, and learn to be careful when
mixing signed and unsigned.

... or when mixing signed integer with long double complex, or
when mixing unsigned long with pointer-to-pointer-to-T, or ... In
fact, your recommendation to "be careful when..." can be improved
by deleting "when" and everything after it. Just be careful, okay?

It's not a unanimous improvement, as it makes my statement more general
and more "vacuously true." Maybe I should have said "be *more* careful"
to be clearer, but I won't delete the when-clause. I don't object to
your being careful in everything, though.

--
Seungbeom Kim

BartC
Posted: Sat Jul 21, 2012 7:26 pm    Post subject: Re: Unsinged types

"Seungbeom Kim" <musiphil (AT) bawi (DOT) org> wrote in message
news:juf5mm$mr$1 (AT) usenet (DOT) stanford.edu...
Quote:
On 2012-07-20 14:41, Eric Sosman wrote:
On 7/20/2012 5:11 PM, Seungbeom Kim wrote:

In addition, "unsigned" in C does not only mean non-negativity,
but also implies modulo arithmetic; you can't even get the difference
between two unsigned values in the most natural way (a - b).

You can't do that with signed integers, either.

Of course, signed integers are not without their limits (and the design
limits should lie safely within the environmental limits).

However, *most* of the values we deal with are in small ranges that are
not too far from zero; ranges near INT_MAX, for example, but far from
zero are uncommon, if any, and they may need an upgrade to a larger
integer type anyway.

Given that, getting around in signed is usually safe, as you're near
the center, far from the edges, of the field. Doing that in unsigned,
however, is like playing near the edge and risks falling off.


                           ,- usually here
                        *****
signed    [-------+-------+-------+-------]XXXX DANGER
          |               |               |
      -INT_MAX            0            INT_MAX         UINT_MAX
                          |               |               |
unsigned       DANGER XXXX[-------+-------+-------+-------]


And in the cases where naive subtraction doesn't work, you don't
just get a possibly surprising but predictable outcome: You get
undefined behavior.

True. But (2U - 5U) yielding 4294967293U is not very useful, no matter
how predictable it is

You've put the point across much better than I've ever been able to, in many
more posts (-1<5 is true, -1<5u is false, etc.)

But, the language is apparently always right. Even if it's due to 'existing
practice'.

--
bartc