Types, Values, and the Cost of a Copy
What `int x = 5;` actually does, and why `int& y = x;` is a different thing entirely
A type is three promises
When you write int x;, the compiler reserves space for an int and
remembers three things:
Size
N bytes: sizeof(int) bytes get set aside on the stack. Usually 4 on a 64-bit
system, but the standard only guarantees ≥16 bits. Use int32_t from
<cstdint> if you need an exact width.
Representation
bit pattern: the bits in those bytes are read as a two’s-complement integer.
Operations like +, <, and >> are defined for that representation.
Operations
overload set: the type determines which operator overloads the compiler looks up when you
write x + y. For int it’s a hardware ADD instruction; for
std::string it allocates, copies, and returns.
#include <iostream>
#include <cstdint>

int main() {
    int a = 5;
    int32_t b = 5;     // exactly 32 bits, always
    unsigned c = 5;    // same width as int, but no negative values
    long long d = 5;   // ≥ 64 bits, signed

    std::cout << "sizeof(a) = " << sizeof(a) << "\n";
    std::cout << "sizeof(b) = " << sizeof(b) << "\n";
    std::cout << "sizeof(c) = " << sizeof(c) << "\n";
    std::cout << "sizeof(d) = " << sizeof(d) << "\n";
    return 0;
}
Four integer types, two distinct widths on this platform. The compiler picks the instructions to match.
sizeof(a) = 4
sizeof(b) = 4
sizeof(c) = 4
sizeof(d) = 8
To run locally:
g++ -std=c++23 -O2 snippet.cpp && ./a.out
The lesson: don’t reach for int reflexively. If you’re storing the number
of bytes in a file, use std::size_t. If you’re storing a millisecond
timestamp, use int64_t. If you’re storing a single ASCII character, use
char. The type is the documentation.
Copy is the default
This is the part Python and Java programmers get wrong for weeks. In C++, assigning one variable to another copies by default:
#include <iostream>

int main() {
    int x = 5;
    int y = x;   // copy: y now holds its own 5
    y = 99;      // changes y, not x

    std::cout << "x = " << x << "\n";
    std::cout << "y = " << y << "\n";
    return 0;
}
Two boxes. Two independent fives. Mutating one cannot change the other.
x = 5
y = 99
Without optimization, the compiler emits a couple of mov instructions to make y; at -O2 it may fold the copy away entirely. There’s no
indirection, no allocator, no garbage collector keeping score. The whole
thing happens in nanoseconds and the cost is exactly the cost of the bytes.
For an int, that’s cheap. For a std::vector<double> with a million
entries, the same syntax allocates a million doubles’ worth of new memory
and copies all of it. Same syntax, wildly different cost. Reading C++
fluently means knowing the cost without looking it up.
When you don’t want a copy: references
A reference is a name for an existing object. Write int& y = x; and y
is another name for x — same bytes, same address. Mutating y mutates
x. There is no y object on its own.
#include <iostream>

int main() {
    int x = 5;
    int& y = x;   // y IS x, just spelled differently
    y = 99;       // changes the one and only x

    std::cout << "x = " << x << "\n";    // 99
    std::cout << "y = " << y << "\n";    // 99
    std::cout << "&x = " << &x << "\n";
    std::cout << "&y = " << &y << "\n";  // same address!
    return 0;
}
A reference is not a pointer. It's a second name for the same box.
x = 99
y = 99
&x = 0x7ffd2a8b3c14
&y = 0x7ffd2a8b3c14
# (addresses will differ on your machine; the point is they match)
Three rules about references that the compiler enforces for you:
Must be initialized
no default: int& z; is a compile error. References cannot be null.
Cannot be re-seated
bound for life: once y refers to x, y = w; does not make y refer to w. Instead, it
copies w’s value into x.
Same size as the address
zero overhead: usually implemented as a non-null, non-rebindable pointer, but at the language level it’s a name, not a pointer.
References are how C++ gives you “pass by reference” without paying for pointer dereferences in your source code.
const is a contract, not a hint
A const int x = 5; is a value you cannot modify through that name. The
compiler will reject x = 6; at compile time. So far, mild.
The interesting use is const T& — a reference that promises not to mutate
the thing it refers to. This is the canonical “pass a big object cheaply
without letting the callee change it” parameter type:
#include <string>
void print(const std::string& s); // cheap to call, can't modify s
void mutate(std::string& s);      // cheap to call, may modify s
void take(std::string s);         // copies; rare unless you'll keep it
Read those three signatures as contracts. print is saying “I won’t
change this.” mutate is saying “I might.” take is saying “I want my
own copy.” The caller’s mental model is set by the type, not the function
name or its docs.
Value categories, briefly
Every C++ expression has a type (we covered that) and a value category. The category answers a question: does this expression refer to an object you could take the address of, or does it produce a temporary that’s about to evaporate?
lvalue
has an address: x, arr[3], *ptr. Can sit on the left of an assignment. Binds to
T&.
prvalue
pure rvalue: a literal 5, or the result of f() returning by value; a temporary
that hasn’t been bound to anything yet. Binds to T&& (and to const T&).
xvalue
eXpiring value: an object about to die, like the result of std::move(x). Treated
like a prvalue for overload resolution but lets you steal its guts.
That’s the whole taxonomy. You will not write code that mentions value categories. You will write code whose behavior changes based on them, and the standard’s overload resolution rules use them to pick which constructor or assignment operator to call. We come back to this in lesson 06.
auto is the compiler doing your typing for you
auto x = 5; declares x as int, because 5 is an int literal. auto y = some_function(); declares y with whatever the function returns. The
type is still 100% determined at compile time — auto is not Python’s
dynamic typing. It’s a request to the compiler: you tell me.
#include <iostream>
#include <vector>
#include <typeinfo>

int main() {
    auto a = 5;                          // int
    auto b = 5.0;                        // double
    auto c = "hi";                       // const char*
    auto d = std::vector<int>{1, 2, 3};  // std::vector<int>

    std::cout << "a: " << typeid(a).name() << "\n";
    std::cout << "b: " << typeid(b).name() << "\n";
    std::cout << "c: " << typeid(c).name() << "\n";
    std::cout << "d: " << typeid(d).name() << " (size " << d.size() << ")\n";
    return 0;
}
typeid names are mangled — pretty-printing them is a lesson 09 topic. The point is just: auto is concrete.
a: i
b: d
c: PKc
d: St6vectorIiSaIiEE (size 3)
# (mangled names; "i" = int, "d" = double, "PKc" = pointer-to-const-char,
# the rest is std::vector<int, std::allocator<int>>)
Three flavors worth knowing now:
value
drops const + ref: auto x = …; always gives you a fresh value. The most common form;
also the one that copies if you forgot.
reference
mutate in place: auto& x = …; keeps the reference. Use in range-for over a container of big objects:
for (auto& row : matrix) ….
read-only view
cheap + safe: const auto& x = …; for when you only need to look. Default for
“loop over a vector of strings” and friends.
If you don’t say which, you get a copy.
What’s next
Lesson 02 covers control flow and functions — the boring-looking surface
that’s actually where modern C++ gets to be fun (lambdas, trailing return
types, constexpr doing arithmetic at compile time). Then lesson 03 opens
the box on stack vs heap, with diagrams.