C++ Primitives

Many of you already know how to write C++, but you may not yet have a good grasp of what happens under the hood. Our goal in this section is to explore that, so that we are better equipped to handle more advanced implementation work. We'll start with integers and the integer-like data types.

Base Systems and Binary Numbers

To understand base systems, please read this article: http://betterexplained.com/articles/numbers-and-bases/.

Bits, Bytes, and Limits of Binary Numbers

If you've read the above link, you should have an idea of how binary works. Let's explore what that means for computers. A bit is a single digit in binary; the word itself is short for "binary digit".

If you are allowed to write a decimal number with at most three digits, what's the largest number you can represent? 999. With eight digits, we can count up to 99,999,999. It's easy to see that to maximize the number, we just fill every position with 9's.

Now let’s look at bits. If we have one bit, we can count 0 and 1. If we have two bits, we can count: 0, 1, 10, 11. And if we have 3 bits, we can count: 0, 1, 10, 11, 100, 101, 110, 111 -- a maximum of 111. In doing this, we quickly notice a pattern. If we have $n$ bits, the maximum number we can count is $n$ 1’s.

Can we find a formula to convert this strange binary number of $n$ 1's into something more familiar? It's always good to look at examples. Take $11111$, which is $5$ 1's. If we add $1$ to that, the binary number ticks over and we end up with $100000$. Because of how the base system works, that is clearly $2^5$, so the number we were looking for is $1$ less than that: $2^5 - 1$. The same argument works for any number of bits, so with $n$ bits we can represent a maximum of $2^n-1$.

Let's apply that knowledge now. One byte is $8$ bits. On most modern platforms, an unsigned int is $4$ bytes. Thus an unsigned int has 32 bits, and it can represent a maximum of $2^{32} - 1$, or $4294967295$. If you don't believe me, try running the following C++ code:

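  #include <iostream>
  using namespace std;

  int main() {
      // 4294967295 is 2^32 - 1, the largest value of a 32-bit unsigned int.
      unsigned int x = 4294967295u;
      cout << "The value of x is " << x << endl;

      // Unsigned arithmetic wraps around, so adding 1 to the maximum gives 0.
      x++;
      cout << "The value of x is " << x << endl;
  }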

Using the same arguments, we can then figure out the following table:

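| Primitive Type | Number of Bytes | Maximum (Formula) | Range in Decimal |
|---|---|---|---|
| unsigned char | 1 | $2^{8} - 1$ | 0 to 255 |
| unsigned short | 2 | $2^{16} - 1$ | 0 to 65535 |
| unsigned int, unsigned long | 4 | $2^{32} - 1$ | 0 to 4294967295 (about 4 billion) |
| unsigned long long | 8 | $2^{64} - 1$ | 0 to 18446744073709551615 (20 digits) |

These sizes are typical rather than guaranteed. In particular, unsigned long is 8 bytes on most 64-bit Linux and macOS systems.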

Now you might be wondering why char is on that table when char is supposed to be for characters and not numbers. We'll talk about the answer to that question later on, but keep it in mind. As for unsigned long long and unsigned int, it's not that important to memorize the actual numbers. Just remember an approximation of the maximum value each one can store.

Negative Numbers

Now with signed numbers, things start getting a bit weird. We’ll explain how it works for 8-bit numbers, because they are shorter.

In signed 8-bit numbers, 0 to 127 (the lower half of the 256 possible bit patterns) are represented exactly as before. After 127 (01111111) comes -128 (10000000), -127 (10000001), -126 (10000010), ..., and so on until -1 (11111111). This scheme is called two's complement.

For those more familiar with modular arithmetic, the reason we use two's complement is that each signed number is congruent to its unsigned counterpart modulo $2^n$. In this scheme, we simply give both of them the same binary representation. For those who aren't familiar with modular arithmetic, forget everything I said and just come back to this later. ^_^

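| Primitive Type | Number of Bytes | Range in Decimal |
|---|---|---|
| char | 1 | -128 to 127 |
| short | 2 | -32768 to 32767 |
| int, long | 4 | -2147483648 to 2147483647 (about 2 billion) |
| long long | 8 | -9223372036854775808 to 9223372036854775807 (19 digits) |

Whether plain char is signed depends on the platform (signed char always is), and long is 8 bytes on most 64-bit Linux and macOS systems.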