1. Source code (Go)
package main

import "fmt"

func factorial(n int) int {
	r := 1
	for i := 2; i <= n; i++ {
		r *= i
	}
	return r
}

func main() {
	n := 10
	result := factorial(n)
	fmt.Println(result)
}
Expected output:
3628800 ✅
2. Build step (go build)
- The Go compiler translates `.go` source into machine code directly (no separate assembler step like C).
- The linker links against the Go runtime (no external libc needed; Go has its own `fmt`, `os`, `syscall`, `runtime`, etc.).
- The result is a statically linked ELF binary, unless `cgo` is involved, either explicitly or through packages such as `net` that can use it by default.
So compared to C: the binary is fatter because it embeds the runtime, garbage collector, scheduler, and type metadata.
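One way to check the "static ELF" claim on a given binary is to look for a dynamic loader segment and imported shared libraries. The sketch below uses the standard `debug/elf` package; the `elfInfo` type, the `inspectELF` helper, and the default path `./fact` are illustrative names, not part of the original walkthrough.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// elfInfo summarizes how an ELF binary was linked (hypothetical helper type).
type elfInfo struct {
	HasInterp bool     // true if a PT_INTERP segment names a dynamic loader
	Libs      []string // shared libraries the binary asks for (DT_NEEDED)
}

// inspectELF opens the ELF file at path and reports its linkage.
func inspectELF(path string) (elfInfo, error) {
	f, err := elf.Open(path)
	if err != nil {
		return elfInfo{}, err
	}
	defer f.Close()

	var info elfInfo
	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			info.HasInterp = true
		}
	}
	// A binary without a dynamic section has no imported libraries;
	// any error here is treated as "none" for the purposes of this sketch.
	if libs, err := f.ImportedLibraries(); err == nil {
		info.Libs = libs
	}
	return info, nil
}

func main() {
	path := "./fact" // assumed name of the built binary
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	info, err := inspectELF(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "inspect:", err)
		os.Exit(1)
	}
	fmt.Printf("dynamic loader: %v, shared libs: %v\n", info.HasInterp, info.Libs)
}
```

For a pure-Go build of the factorial program, you would expect no dynamic loader and an empty library list; a cgo build would normally show both.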
3. Program start (process creation)
When you run ./fact:
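- Shell → `execve()` syscall (same as C).
- The kernel maps the ELF segments into memory (text, data) and sets up the initial stack.
- The entry point is not `main` directly, but the Go runtime startup code (`_rt0_amd64_linux` on Linux/amd64, which leads into `runtime.rt0_go`).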
4. Go runtime startup
Flow (simplified):
_rt0_amd64_linux (assembly entry stub; the name varies by OS and architecture)
-> runtime·rt0_go (sets up argc/argv/envp from stack)
-> runtime·args()
-> runtime·osinit() (queries CPU count and page size)
-> runtime·schedinit() (scheduler init, memory allocator, GC init)
-> newproc(runtime.main) (create goroutine for runtime.main, which runs init functions and then calls main.main)
-> runtime·mstart() (start machine, scheduler loop)
So your main.main runs inside a goroutine, scheduled by Go’s M:N scheduler (goroutines multiplexed onto OS threads).
This is the first big difference from C:
👉 In C, main() is just a function on the process’ initial thread stack.
👉 In Go, main.main is wrapped into a goroutine, with its own tiny growable stack, scheduled by Go runtime.
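The goroutine wrapping can be observed from inside the program. The following is a minimal sketch using `runtime.Stack` and `runtime.NumGoroutine`; the helper names `goroutineHeader` and `runsInGoroutine` are made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// goroutineHeader returns the first line of the calling goroutine's
// stack trace, which has the form "goroutine <id> [<state>]:".
func goroutineHeader() string {
	buf := make([]byte, 1024)
	n := runtime.Stack(buf, false) // false: only the current goroutine
	line, _, _ := strings.Cut(string(buf[:n]), "\n")
	return line
}

// runsInGoroutine reports whether the caller's stack trace starts with a
// goroutine header, which is the case for all ordinary Go code.
func runsInGoroutine() bool {
	return strings.HasPrefix(goroutineHeader(), "goroutine ")
}

func main() {
	// When called from main.main this typically reports goroutine 1.
	fmt.Println(goroutineHeader())
	fmt.Println("user goroutines alive:", runtime.NumGoroutine())
}
```

There is no equivalent header for a C `main`, because there the function runs directly on the initial thread with no scheduler-owned wrapper around it.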
5. Memory layout (Go process under Linux)
The high-level layout is much the same as in C (because the kernel decides this), but the runtime carves things up differently:
0x00400000 -> .text (Go program + runtime)
              .rodata (string literals, type metadata, itabs, method tables)
              .data / .bss (global vars, runtime state)
              heap (managed by Go runtime, not libc malloc!)
              goroutine stacks (allocated dynamically, mmap’d)
              ...
0x7fff.... -> OS thread stacks (one per M)
              TLS, signal stacks
Key parts:
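- Go heap: Managed by the Go runtime and GC. Values created with `new`, `make`, or composite literals go here when escape analysis decides they must outlive their stack frame.
- Goroutine stacks: Start small (2 KB in modern Go) and grow/shrink dynamically. Since Go 1.3 these are contiguous stacks, not split (segmented) stacks: when a stack runs out of room, the runtime allocates a larger one and copies the frames over.
- Global data: Includes the runtime’s tables for types, GC metadata, scheduler state, etc.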
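Stack growth is easy to exercise: a deep recursion on a freshly started goroutine only works because the runtime keeps enlarging that goroutine's stack. This is a small sketch; `recurse` and `deepCallInNewGoroutine` are illustrative names, and the depth of 100000 is an arbitrary choice.

```go
package main

import "fmt"

// recurse descends depth levels and returns depth. Each frame keeps a small
// buffer so that every call consumes a noticeable amount of stack.
func recurse(depth int) int {
	var pad [64]byte
	pad[depth%len(pad)] = 1
	if depth == 0 {
		return int(pad[0]) - 1 // pad[0] was just set to 1, so this is 0
	}
	return recurse(depth-1) + int(pad[depth%len(pad)])
}

// deepCallInNewGoroutine runs recurse on a new goroutine, whose stack
// starts at only a few kilobytes and has to grow many times.
func deepCallInNewGoroutine(depth int) int {
	result := make(chan int)
	go func() { result <- recurse(depth) }()
	return <-result
}

func main() {
	fmt.Println(deepCallInNewGoroutine(100000)) // prints 100000
}
```

The recursion needs several megabytes of stack in total, far more than the initial allocation, yet no stack size had to be configured up front.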
6. Execution of factorial in Go
- When `main.main` runs, the Go runtime schedules it on an OS thread (M).
- Arguments (`n`) are passed according to the Go ABI:
  - In the register-based ABI (Go 1.17+ on amd64), integer arguments go in registers in the order `AX, BX, CX, DI, SI, R8, R9, R10, R11`, with spill slots on the stack. This is Go's own order, not the SysV one (`DI, SI, DX, CX, R8, R9`).
- Locals (`r` and `i`) typically live in registers, unless escape analysis forces them to the heap (not in this case).
- The return value (`int`) comes back in `AX`.
So factorial(10) executes much like the C version at the machine-code level, but the stack is not a large fixed region (8 MB by default for a thread on Linux). It is a tiny growable goroutine stack: if more space is needed, the Go runtime transparently allocates a bigger stack and copies the frames over.
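To look at the generated machine code yourself, it helps to stop the compiler from inlining `factorial` into `main`. The variant below adds a `//go:noinline` directive for that purpose; this is an addition for inspection only, not part of the original program. The disassembly can then be viewed with, for example, `go build -gcflags=-S` or `go tool objdump -s main.factorial ./fact` (binary name assumed).

```go
package main

import "fmt"

// factorial computes n! iteratively. The noinline directive keeps it a real
// call so it shows up as main.factorial in the disassembly.
//
//go:noinline
func factorial(n int) int {
	r := 1
	for i := 2; i <= n; i++ {
		r *= i
	}
	return r
}

func main() {
	fmt.Println(factorial(10)) // 3628800
	// With a 64-bit int, 20! is the largest factorial that still fits;
	// 21! silently wraps around.
	fmt.Println(factorial(20)) // 2432902008176640000 on 64-bit platforms
}
```

Note that `int` arithmetic in Go wraps on overflow without any runtime error, so the function is only meaningful for small `n`.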
7. fmt.Println (runtime + syscalls)
This is where the Go standard library and runtime do the heavy lifting:
- `fmt.Println` calls `fmt.Fprintln(os.Stdout, ...)`, which formats the arguments into a buffer and then calls `Write` on the writer. (`fmt.Fprintf` is not involved.)
- The writer here is `os.Stdout` (an `*os.File` wrapping FD 1).
- That ultimately calls `syscall.Write`, which makes a raw `write(1, buf, len)` syscall to the kernel.
So the actual printing mechanism is the same as C (kernel syscall write), but the path is Go code (standard library plus runtime), not libc’s printf.
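The layering can be made concrete by writing the same line at three levels. The sketch below assumes Linux (or another Unix where `syscall.Write` takes an integer file descriptor); `writeThreeWays` and `captureThreeWays` are hypothetical helpers, and short writes are not handled.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strconv"
	"syscall"
)

// writeThreeWays writes n plus a newline to f three times, each time one
// layer closer to the kernel.
func writeThreeWays(f *os.File, n int) error {
	// 1. fmt: formats the value into a buffer, then calls f.Write.
	if _, err := fmt.Fprintln(f, n); err != nil {
		return err
	}
	line := []byte(strconv.Itoa(n) + "\n")
	// 2. os.File: the method fmt ends up calling.
	if _, err := f.Write(line); err != nil {
		return err
	}
	// 3. syscall: a thin wrapper around the write system call.
	if _, err := syscall.Write(int(f.Fd()), line); err != nil {
		return err
	}
	return nil
}

// captureThreeWays runs writeThreeWays against a pipe and returns the bytes
// that came out the other end.
func captureThreeWays(n int) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	if err := writeThreeWays(w, n); err != nil {
		w.Close()
		return "", err
	}
	if err := w.Close(); err != nil {
		return "", err
	}
	out, err := io.ReadAll(r)
	return string(out), err
}

func main() {
	if err := writeThreeWays(os.Stdout, 3628800); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

All three calls produce identical bytes on the file descriptor; they differ only in how much Go code runs before the syscall.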
8. Go scheduler (M:N)
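Goroutines are user-space (green) threads. The scheduler works with three kinds of object:
- G = goroutine (has stack, PC, status).
- M = machine (OS thread executing Go code).
- P = processor (a scheduling context; there are GOMAXPROCS of them, each attached to an M while it runs Go code and holding a local run queue).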
Execution flow:
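- On startup, the runtime creates a `G` that runs `runtime.main`, which in turn calls `main.main`.
- The scheduler picks an `M` (OS thread), assigns it a `P`, and runs that `G`.
- If more goroutines exist, they are queued on per-P run queues, with work stealing for load balance.
- When a goroutine makes a blocking syscall (like `read` or `write` on a regular file), its M blocks in the kernel along with it; the runtime hands the P to another M so the remaining goroutines keep running. Network I/O goes through the netpoller instead, which parks only the goroutine.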
This means your single main.main G runs on one thread, but if you launched many goroutines, the runtime transparently multiplexes them across threads.
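As a small illustration of that multiplexing, the sketch below starts thousands of goroutines and lets the runtime spread them over however many Ps are available. The function name `sumWithGoroutines` and the count of 10000 are arbitrary choices for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// sumWithGoroutines starts one goroutine per number in 1..n and adds each
// number to a shared counter atomically.
func sumWithGoroutines(n int) int64 {
	var total int64
	var wg sync.WaitGroup
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			atomic.AddInt64(&total, v)
		}(int64(i))
	}
	wg.Wait() // block until every goroutine has finished
	return total
}

func main() {
	fmt.Println("Ps (GOMAXPROCS):", runtime.GOMAXPROCS(0))
	fmt.Println("sum:", sumWithGoroutines(10000)) // sum: 50005000
}
```

Ten thousand goroutines are cheap here because each starts with a small stack and they share a handful of OS threads; ten thousand pthreads would be a very different proposition.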
9. Garbage collection impact
Barely visible in this program (factorial itself makes no heap allocations), but:
- The Go runtime’s GC runs concurrently with the program.
- The managed heap is divided into spans and arenas.
- The compiler emits write barriers on pointer stores; they are active only while the GC is marking, and let it track references that change mid-cycle.
- Goroutine stacks are scanned as GC roots and may be shrunk (and therefore moved) during GC.
C has none of this: memory management is manual.
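GC activity can be observed through `runtime.MemStats`. The following is a rough sketch that allocates some garbage, forces a collection with `runtime.GC`, and counts completed cycles; the `sink` variable and `allocateAndCollect` helper are illustrative, and the sizes are arbitrary.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink is package-level so the buffers assigned to it must live on the heap.
var sink []byte

// allocateAndCollect allocates count buffers of size bytes, drops them,
// forces a collection, and returns how many GC cycles completed meanwhile.
func allocateAndCollect(count, size int) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < count; i++ {
		sink = make([]byte, size) // each assignment orphans the previous buffer
	}
	sink = nil
	runtime.GC() // blocks until a full collection has finished
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	cycles := allocateAndCollect(1000, 1<<20)
	fmt.Println("GC cycles during the call:", cycles)
}
```

The exact count depends on the Go version and `GOGC` setting, but it is at least one because of the explicit `runtime.GC` call.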
10. Syscalls and process lifecycle
Syscalls you’d see if you strace ./fact:
- `execve` (launch binary)
- `mmap` (for stacks, heap arena, runtime structures)
- `write` (from `fmt.Println`)
- `exit_group` (when the program exits)
Go runtime hides all this, but it’s happening under the hood.
11. Comparison: C vs Go path
| Stage | C | Go |
|---|---|---|
| Entry point | _start → __libc_start_main | _rt0_amd64_linux → runtime.rt0_go |
| main runs | On process main thread stack | As a goroutine (small growable stack) |
| Memory mgmt | malloc/free (libc, brk/mmap) | Go runtime heap + GC |
| Printing | printf (libc → write syscall) | fmt.Println (Go runtime → write) |
| Threads | pthreads (1:1 with OS threads) | goroutines (M:N scheduler) |
| Stack | Fixed size per thread (8 MB default on Linux) | Starts ~2KB, grows/shrinks as needed |
12. Exit
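- After `main.main` returns, `runtime.main` calls `exit(0)`, which issues the `exit_group` syscall. It does not wait for other goroutines: any that are still running are terminated with the process.
- `runtime.goexit` is the path taken when an ordinary goroutine's function returns, not when `main.main` does.
- The kernel tears down the process like in C.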
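Because the process ends as soon as `main.main` returns, any goroutine whose work must complete has to be waited for explicitly. The sketch below contrasts the two cases inside one process; `runWorker` is a hypothetical helper and the 50 ms sleep stands in for real work.

```go
package main

import (
	"fmt"
	"time"
)

// runWorker starts a background goroutine and reports whether it had
// finished by the time runWorker returns. With wait set, it blocks on the
// done channel first.
func runWorker(wait bool) bool {
	done := make(chan struct{})
	go func() {
		time.Sleep(50 * time.Millisecond) // simulated work
		close(done)
	}()
	if wait {
		<-done
		return true
	}
	// Non-blocking check: the worker is almost certainly still sleeping.
	select {
	case <-done:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println("finished without waiting:", runWorker(false))
	fmt.Println("finished with waiting:", runWorker(true))
}
```

If `main` returned right after the non-waiting call, the unfinished worker would simply be cut off when the runtime exits; a channel or `sync.WaitGroup` is the usual way to prevent that.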