Foundations · #1 of 13

Types, Values, and the Cost of a Copy

What `int x = 5;` actually does, and why `int& y = x;` is a different thing entirely

A type is three promises

When you write int x;, the compiler reserves space for an int and remembers three things:

Size

N bytes

sizeof(int) bytes get set aside on the stack. Usually 4 on a 64-bit system, but the standard only guarantees ≥16 bits. Use int32_t from <cstdint> if you need an exact width.

Representation

bit pattern

The bit pattern in those bytes is read as a two’s-complement integer (guaranteed by the standard since C++20). Operations like +, <, and >> are defined for that representation.

Operations

overload set

A type also fixes the overload set the compiler looks up when you write x + y. For int that’s typically a single hardware ADD instruction; for std::string it may allocate, copy, and return a new string.

#include <iostream>
#include <cstdint>

int main() {
    int        a = 5;
    int32_t    b = 5;     // exactly 32 bits, always
    unsigned   c = 5;     // same width as int, but no negative values
    long long  d = 5;     // ≥ 64 bits, signed

    std::cout << "sizeof(a) = " << sizeof(a) << "\n";
    std::cout << "sizeof(b) = " << sizeof(b) << "\n";
    std::cout << "sizeof(c) = " << sizeof(c) << "\n";
    std::cout << "sizeof(d) = " << sizeof(d) << "\n";
    return 0;
}

Four integer types, but only two distinct widths on a typical 64-bit platform. The compiler picks the instructions to match each one.

expected output
sizeof(a) = 4
sizeof(b) = 4
sizeof(c) = 4
sizeof(d) = 8
Or run locally
g++ -std=c++23 -O2 snippet.cpp && ./a.out

The lesson: don’t reach for int reflexively. If you’re storing the number of bytes in a file, use std::size_t. If you’re storing a millisecond timestamp, use int64_t. If you’re storing a single ASCII character, use char. The type is the documentation.

Copy is the default

This is the part Python and Java programmers get wrong for weeks. In C++, assigning one variable to another copies by default:

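If you're coming from Python or JavaScript, where a name is a reference to an object, you might guess that `y = x` makes y an alias for x and predict 99/99. (Even in those languages, plain reassignment of y leaves x alone; aliasing only shows up when you mutate a shared object.) What does C++ actually print, and why?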
#include <iostream>

int main() {
    int x = 5;
    int y = x;   // copy: y now holds its own 5
    y = 99;      // changes y, not x

    std::cout << "x = " << x << "\n";
    std::cout << "y = " << y << "\n";
    return 0;
}

Two boxes, each starting with its own independent five. Assigning 99 to y cannot change x, so this prints x = 5 and y = 99.

To make y, the compiler needs at most a couple of mov instructions (load x, store y), and an optimizing build may keep both values in registers or fold them into constants. There’s no indirection, no allocator, no garbage collector keeping score. The cost is exactly the cost of moving the bytes.

For an int, that’s cheap. For a std::vector<double> with a million entries, the same syntax allocates a million doubles’ worth of new memory and copies all of it. Same syntax, wildly different cost. Reading C++ fluently means knowing the cost without looking it up.

When you don’t want a copy: references

A reference is a name for an existing object. Write int& y = x; and y is another name for x — same bytes, same address. Mutating y mutates x. There is no y object on its own.

#include <iostream>

int main() {
    int  x = 5;
    int& y = x;   // y IS x, just spelled differently
    y = 99;       // changes the one and only x

    std::cout << "x = " << x << "\n";  // 99
    std::cout << "y = " << y << "\n";  // 99
    std::cout << "&x = " << &x << "\n";
    std::cout << "&y = " << &y << "\n"; // same address!
    return 0;
}

A reference is not a pointer. It's a second name for the same box.

expected output
x = 99
y = 99
&x = 0x7ffd2a8b3c14
&y = 0x7ffd2a8b3c14
# (addresses will differ on your machine — the point is they match)

Three facts about references, the first two enforced at compile time:

Must be initialized

no default

int& z; is a compile error. References cannot be null.

Cannot be re-seated

bound for life

Once y refers to x, y = w; does not make y refer to w — it copies w’s value into x.

Same size as the address

zero overhead

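When a reference needs storage at all, it is usually implemented as a non-null, non-rebindable pointer, and the optimizer often removes it entirely. At the language level it’s a name, not a pointer: sizeof(int&) reports sizeof(int).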

References are how C++ gives you “pass by reference” without writing pointer dereferences in your source code.

const is a contract, not a hint

const int x = 5; declares a value that cannot be modified after initialization. The compiler will reject x = 6; at compile time. So far, mild.

The interesting use is const T& — a reference that promises not to mutate the thing it refers to. This is the canonical “pass a big object cheaply without letting the callee change it” parameter type:

#include <string>

void print(const std::string& s);   // cheap to call, can't modify s
void mutate(std::string& s);        // cheap to call, may modify s
void take(std::string s);           // gets its own copy; use when you'll keep it

Read those three signatures as contracts. print is saying “I won’t change this.” mutate is saying “I might.” take is saying “I want my own copy.” The caller’s mental model is set by the type, not the function name or its docs.

Value categories, briefly

Every C++ expression has a type (we covered that) and a value category. The category answers a question: does this expression refer to an object you could take the address of, or does it produce a temporary that’s about to evaporate?

lvalue

has an address

x, arr[3], *ptr. Can sit on the left of an assignment (unless it is const). Binds to T&.

prvalue

pure rvalue

A literal 5, or the result of f() returning by value: a temporary that hasn’t been bound to anything yet. Binds to T&& (and to const T&).

xvalue

eXpiring value

An object whose resources may be reused, like the result of std::move(x). Binds to T&& just like a prvalue, which is what lets a move constructor or move assignment steal its guts.

Those are the three primary categories; the standard also groups them into glvalues (lvalue or xvalue) and rvalues (xvalue or prvalue). You will not write code that mentions value categories. You will write code whose behavior changes based on them, because the standard’s overload resolution rules use them to pick which constructor or assignment operator to call. We come back to this in lesson 06.

auto is the compiler doing your typing for you

auto x = 5; declares x as int, because 5 is an int literal. auto y = some_function(); declares y with whatever the function returns. The type is still 100% determined at compile time — auto is not Python’s dynamic typing. It’s a request to the compiler: you tell me.

#include <iostream>
#include <vector>
#include <typeinfo>

int main() {
    auto a = 5;             // int
    auto b = 5.0;           // double
    auto c = "hi";          // const char*
    auto d = std::vector<int>{1, 2, 3};  // std::vector<int>

    std::cout << "a: " << typeid(a).name() << "\n";
    std::cout << "b: " << typeid(b).name() << "\n";
    std::cout << "c: " << typeid(c).name() << "\n";
    std::cout << "d: " << typeid(d).name() << " (size " << d.size() << ")\n";
    return 0;
}

typeid names are implementation-defined; GCC and Clang print mangled names, and pretty-printing them is a lesson 09 topic. The point is just: auto is concrete.

expected output
a: i
b: d
c: PKc
d: St6vectorIiSaIiEE (size 3)
# (mangled names; "i" = int, "d" = double, "PKc" = pointer-to-const-char,
#  the rest is std::vector<int, std::allocator<int>>)

Three flavors worth knowing now:

value

drops const + ref

auto x = …; always gives you a fresh value. The most common form; also the one that silently copies if you forgot the &.

reference

mutate in place

auto& x = …; keeps the reference. Use in range-for over a container of big objects: for (auto& row : matrix) ….

read-only view

cheap + safe

const auto& x = …; — when you only need to look. Default for “loop over a vector of strings” and friends.

If you don’t say which, you get a copy.

What’s next

Lesson 02 covers control flow and functions — the boring-looking surface that’s actually where modern C++ gets to be fun (lambdas, trailing return types, constexpr doing arithmetic at compile time). Then lesson 03 opens the box on stack vs heap, with diagrams.