Let’s talk about how (most) computers represent negative numbers. You know, the ones to the left of zero on the number line? The representation most computers use is called two’s complement.
Take the integer quantity fourteen, and think about how we represent it as a string of digits. In decimal, it’s 14. In binary, it’s 1110. Why? Remember that just as in base ten, where
14 = 1 * 10^1 + 4 * 10^0
We have, using a base of two,
14 = 1 * 2^3 + 1 * 2^2 + 1 * 2^1 + 0 * 2^0
But what is a good way to represent negative fourteen?
Wouldn’t it be great if a computer just understood -1110? The catch is, we can use only ones and zeros; using the minus sign “-” is out of the question.
We know that to represent all the non-negative integers up to fifteen, we need a string of four bits, which gives us a distinct string for each of the sixteen (2^4) distinct integers. If we want to represent the corresponding negative integers as well, we’ll have to use another bit. That doubles the number of distinct binary strings we have available, from sixteen to thirty-two.
Whatever representation we come up with for negative integers, it’s probably best if we don’t screw up normal binary arithmetic. We need to leave all integers from 0 to 15 mapped to 00000 to 01111 in binary just as they are. Remember in base ten arithmetic we “carry” if a column sums to over nine (e.g., 07 + 03 = 10). Similarly, in base two, we carry any column summing to over one. So 00001 + 00001 = 00010; 00010 + 00001 = 00011; 00011 + 00001 = 00100; etc. This locks down all the numbers up to 01111.
What’s left to do is to assign binary representations to negative integers. The remaining binary strings at our disposal are the five digit strings ranging from 10000 to 11111. Note that all of them have one as their highest order bit. So adding two binary integers whose sum is greater than 01111 will definitely overflow into negative territory!
Now, how to do this mapping?
For a well-behaved system of (five-digit) binary addition with negative numbers, we turn to modular arithmetic, aka “clock arithmetic.” For our five-bit example in which we have a set of thirty-two binary strings, the appropriate modulus to choose is thirty-two (i.e., 2^5 or 100000). This is the midnight of our hypothetical clock on which we do our math. For example, we have that 11111 + 1 = 100000 ≡ 0. We say that 100000 is equivalent to 0. In this system of addition, which has the algebraic properties of a group (really, a ring – more on that later), every element has an inverse. This inverse (aka complement) under addition is, by definition, the negative. So with this system, we can pose the question: what is X such that 01110 + X is equivalent to 0? The answer can be nothing but the negative of 01110!
We can solve 01110 + X = 100000 (≡ 0) using only the binary arithmetic we already know. We see that:
100000 (≡ 0) = 11111 + 1 = 01110 + 10001 + 1 = 01110 + 10010
100000 (≡ 0) = 11111 + 1 = 10010 + 01101 + 1 = 10010 + 01110
So it makes sense to say 10010 is the negative of 01110, and vice versa. That is, 10010 represents -14! Note, however, that this is true only in binary arithmetic with modulus thirty-two. With a word size larger or smaller than five bits, we’d wind up with different negative string representations.
The equations above demonstrate the well-known rule to form the representation of a negative binary integer in two’s complement: flip the bits and add one. This operation is known as “taking the two’s complement.”
Defining negative numbers this way allows us to define subtraction in terms of addition, so we can reuse our adder hardware circuitry for subtraction. (And lest you ask about multiplication, we’re all good there too, as our chosen system is a homomorphic image of the ring of integers!)
Note that the largest (in absolute terms) negative number we can represent using our five bits is -16, whereas the largest positive number is 15. This is because while every number from 0 to 15 has a corresponding negative, zero is its own negative. So there’s an extra negative number floating around which, strangely enough, is also its own negative.
Why is this system called Two’s Complement? Because adding a positive binary integer and its complement (negative) gives 2^w (w being the number of bits needed). You can think analogously of complementary angles in Euclidean geometry, whose measures sum up to ninety degrees. Note that for most computers today, w is 32 or 64, corresponding to the native word size of the machine; it used to be 16 or maybe even 8, if you were coding for an Intel 8008 chip or something.
If you need to expand the width of a number whose word size is w to size w + k, you can do something called sign extension. This means left-padding the binary string with k copies of the sign bit. This is intuitive for positive numbers: 01110 = 001110 = 0001110, etc. For negative numbers, we can see this as follows. Imagine a binary string of length w that encodes a negative number and call it X. We know its two’s complement is X’ + 1, where X’ is just the string X with its bits flipped. As X + X’ = 2^w – 1 = 11..1 for all lengths w, whenever the two’s complement of X is left-padded with a zero (its sign bit), X must be left-padded with a one (its sign bit).
Another signed binary arithmetic system called Ones’ Complement is so named because in this system, a number and its complement together sum to 1…11; that is, one repeated w times. The rule to this latter system goes: flip the bits (don’t add one!) Unfortunately, this system suffers not least from having two distinct representations of the identity element, zero. I can only imagine the nightmare that is multiplication.
Just for fun, what might would a Ten’s Complement system look like? Suppose we have decimal strings of length two to work with, and we aren’t allowed to use a minus sign. By analogous logic, we could form negative k using the formula 10^2 – k. In which case, we’d represent negative fourteen as 86!