x86-64 Assembly
Implementation patterns, named pitfalls, and the autograder cases that catch them in Assembly coursework.
Low-Level and Architecture Language
Annotated x86-64, ARM, MIPS, and RISC-V solutions, with a stack-frame diagram on every function and a pipeline-timing table on every hazard analysis. The single biggest deduction on a computer architecture assignment is a callee-saved register clobbered without a matching push and pop, the exact failure mode our tutors annotate inline. Verified CS graduates with ISA-level depth, from $20 per task, 14-hour average turnaround.
Topics covered
Five-stage pipeline (IF / ID / EX / MEM / WB) with hazard detection, forwarding paths, and stall-cycle counting in computer architecture labs.
AArch64 calling convention, NEON SIMD intrinsics, interrupt handlers, and bare-metal memory-mapped IO for microcontroller assignments.
Full overview
Assembly maps one source line to one machine instruction (pseudo-instructions and macros aside), so it is the language a computer architecture course, a systems course, and a reverse-engineering assignment reach for when the goal is the hardware-software interface itself. Four dialects cover most undergraduate work. An x86-64 assignment grades the System V AMD64 ABI on Linux and macOS: argument registers, callee-saved versus caller-saved, the 16-byte stack alignment before a call, and the red zone in leaf functions.
A MIPS assignment runs in the MARS or SPIM simulator and grades the classic 5-stage pipeline (IF, ID, EX, MEM, WB), data hazards, forwarding, the branch delay slot, and CPI calculation. An ARM assignment splits by architecture: AArch64 work targets the 64-bit mobile register file, shifted-register operands, and conditional select, while 32-bit ARM and Thumb work covers the barrel shifter, conditional execution, Thumb encodings, and bare-metal Cortex-M interrupt handlers on embedded targets. A RISC-V assignment grades the RV32I or RV64I base set plus the M, A, F, and D extensions on the Spike or QEMU simulator.
Two assignment types cut across all four. A reverse-engineering assignment hands you a binary with no source and asks you to reconstruct control flow with objdump, GDB, radare2, and Ghidra, recognizing the patterns and calling conventions a compiler leaves behind. A binary-exploitation lab teaches buffer-overflow and return-oriented-programming attacks along with the defenses against them, scripted with pwntools against a hardened target.
Our assembly tutors comment every instruction at the architectural level: which register it reads, which it writes, which flags it sets, and why it appears at that point. A stack-frame diagram accompanies every function. A pipeline timing diagram with forwarding paths and stall cycles accompanies every hazard analysis.
The CSHH bench for Assembly combines verified CS graduates who have ISA-level depth in x86-64 and ARM64 performance work with tutors who carry instruction-count and Big-O reasoning into the disassembly.
Where Students Get Stuck
Overwriting rbx, rbp, or r12-r15 without saving them first, or returning with rsp moved, corrupts the caller's register state on return, and the crash can surface several call sites later. We add the push/pop pair at function entry and exit and document which registers the function touches.
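As a minimal sketch of that push/pop discipline, the routine below keeps three values alive across two calls. The name apply_twice and its signature are hypothetical, chosen only for illustration; the syntax is NASM and the convention is assumed to be System V AMD64.

```nasm
section .text
global apply_twice

; Hypothetical: long apply_twice(long (*f)(long), long x)
; Returns f(x) + f(x + 1).
apply_twice:
    push rbx                ; save callee-saved rbx
    push r12                ; save callee-saved r12
    push r13                ; save callee-saved r13; three pushes leave rsp 16-byte aligned
    mov  rbx, rdi           ; rbx = f, survives the calls
    mov  r12, rsi           ; r12 = x, survives the calls
    mov  rdi, r12           ; first argument = x
    call rbx                ; rax = f(x)
    mov  r13, rax           ; r13 = f(x), survives the second call
    lea  rdi, [r12 + 1]     ; first argument = x + 1
    call rbx                ; rax = f(x + 1)
    add  rax, r13           ; rax = f(x + 1) + f(x)
    pop  r13                ; restore in reverse order of the pushes
    pop  r12
    pop  rbx
    ret
```

Every register the function writes and must preserve appears exactly once in the prologue and once, in reverse order, in the epilogue.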
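The 128 bytes below rsp (the red zone) are scratch space in leaf functions only. Any call instruction invalidates them, because the return address is pushed there and the callee is free to use the rest; a raw syscall instruction leaves the user stack alone, but a call to a libc wrapper does not. We allocate proper stack space with sub rsp, N before any call instruction.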
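The contrast can be sketched with two small routines, assuming NASM syntax and the System V AMD64 ABI. The names leaf_demo, helper, and nonleaf_demo are hypothetical and exist only for this illustration.

```nasm
section .text
global leaf_demo
global nonleaf_demo

; Hypothetical: long leaf_demo(long x), returns x*x + x.
; Leaf function: it makes no call, so the red zone is usable.
leaf_demo:
    mov  [rsp - 8], rdi     ; spill x below rsp, no sub rsp needed
    imul rdi, rdi           ; rdi = x * x
    add  rdi, [rsp - 8]     ; rdi = x*x + x
    mov  rax, rdi
    ret

; Hypothetical stand-in callee: long helper(long x), returns 2*x.
helper:
    lea  rax, [rdi + rdi]
    ret

; Hypothetical: long nonleaf_demo(long x), returns helper(x) + x.
nonleaf_demo:
    sub  rsp, 24            ; reserve real stack space; 24 also leaves rsp 16-byte aligned
    mov  [rsp + 8], rdi     ; the local sits above rsp, so the call cannot overwrite it
    call helper             ; pushes the return address just below rsp
    add  rax, [rsp + 8]     ; rax = 2*x + x
    add  rsp, 24            ; release the frame
    ret
```

Had nonleaf_demo stored x at [rsp - 8] instead, the call would have written the return address over it.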
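The System V AMD64 ABI requires rsp to be 16-byte aligned at every call instruction, and a misaligned stack typically segfaults inside libc when the callee runs an aligned SSE or AVX access such as movaps on a stack slot. Because a function is entered with rsp 8 bytes off alignment (the return address was just pushed), we add a sub rsp, 8 (or equivalent) to realign before the call.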
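A minimal sketch of the realignment, assuming NASM on Linux and a non-PIE link such as nasm -f elf64 followed by gcc -no-pie. The labels fmt and value are illustrative; printf is the real libc function.

```nasm
extern printf
global main

section .rodata
fmt:   db "value = %f", 10, 0
value: dq 2.5

section .text
main:
    sub   rsp, 8            ; entry: rsp % 16 == 8; after this, rsp % 16 == 0
    lea   rdi, [rel fmt]    ; first integer argument: format string
    movsd xmm0, [rel value] ; first floating-point argument
    mov   eax, 1            ; variadic call: al = number of vector registers used
    call  printf            ; rsp is 16-byte aligned at this call
    xor   eax, eax          ; return 0 from main
    add   rsp, 8            ; undo the realignment
    ret
```

Removing the sub/add pair leaves rsp misaligned at the call, which is the condition the paragraph above describes.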
A RAW data hazard needs forwarding or a stall; WAW and WAR hazards do not arise in a single-issue, in-order pipeline, where registers are read and written in program order. We draw the 5-stage diagram and label every hazard with its type and the required mitigation.
MIPS runs the instruction after a branch unconditionally, so a nop wastes the slot and a misplaced instruction corrupts the result. We move a useful instruction from before the branch, one the branch condition does not depend on, into the delay slot.
GAS in its default AT&T syntax reads source-then-destination, and NASM (Intel syntax) reads destination-then-source. A register pair copied in the wrong direction produces plausible output that fails on the first edge case. We pin the syntax to the assignment and verify operand order in GDB.
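To make the difference concrete, the sketch below is written in NASM (Intel order) with the AT&T equivalent of each line in the comment. The label order_demo is hypothetical.

```nasm
section .text
global order_demo

; NASM / Intel: destination first    ; GAS / AT&T: source first
order_demo:
    mov  rax, rbx           ; movq %rbx, %rax      copies rbx into rax
    sub  rax, rcx           ; subq %rcx, %rax      rax = rax - rcx
    mov  rdx, [rdi + 8]     ; movq 8(%rdi), %rdx   loads 8 bytes from rdi + 8
    mov  dword [rsi], 5     ; movl $5, (%rsi)      stores a 32-bit immediate
    ret
```

Reading the first line with the wrong convention swaps the direction of the copy, which is the mistake the GDB check is meant to catch.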
How we work
Every instruction is annotated with its purpose, the register contents it changes, and the high-level equivalent. A stack-frame diagram per function shows parameter passing, callee-saved preservation, local-variable layout, and the return address. A pipeline timing diagram with forwarding paths and stall cycles accompanies every hazard analysis.
We assemble and test on the target: NASM or GAS plus GDB and objdump for x86-64, MARS or SPIM for MIPS, QEMU for ARM and RISC-V, Spike for the RISC-V golden reference. We write Intel or AT&T syntax to match the assignment.
Step 1: read the ISA reference section the assignment depends on before writing a line.
Step 2: draft the assembly on paper, name every register usage, and label every basic block.
Step 3: assemble with the required tool (NASM, GAS, MARS, or the simulator).
Step 4: run under GDB with set disassembly-flavor intel (or att) and step instruction by instruction, watching the flags and the stack pointer.
Step 5: validate against the autograder format before delivery.
What you receive
Every Assembly delivery ships with the .s or .asm source files in the directory layout your assignment expects, a Makefile or build script matching the autograder format your brief specifies (a NASM or GAS build, a MARS or SPIM project, or a QEMU run target), a SOLUTION.md with the design rationale and an instruction-count or CPI analysis per function where it applies, and a CHECKLIST.md mapping each rubric item to where the code satisfies it. The bundle adds a stack-frame diagram (ASCII or rendered) for every function, a pipeline timing table for every hazard-analysis question, and a 5-bullet oral-defense brief covering the 3 questions a grader is most likely to ask about your register usage or calling convention.
Assignment Types
Functions written to the System V AMD64 ABI with argument registers, callee-saved preservation, 16-byte stack alignment, and the red zone, delivered with a stack-frame diagram. Named pitfall: a callee-saved register written without a push and pop, which corrupts the caller's register state and can crash several call sites later; we add the matching push/pop and document every touched register.
MARS or SPIM programs plus 5-stage pipeline analysis: data hazards, forwarding paths, stall bubbles, the branch delay slot, and CPI calculation with a full timing diagram. Named pitfall: the delay-slot instruction executing whether or not the branch is taken; a nop there is safe but wastes the slot, so we move a useful instruction from before the branch into it.
ARM register-file work on AArch64 (shifted-register operands, conditional select) and on 32-bit ARM and Thumb (the barrel shifter, conditional execution, Thumb encodings, and bare-metal Cortex-M interrupt handlers), tested under QEMU. Named pitfall: a NEON load with vld1q_f32 from a buffer that is less aligned than the generated code assumes, which can fault or misbehave on older Cortex-A cores; the fix is an alignas(16) buffer.
RV32I and RV64I base instruction sets plus the M, A, F, and D extensions, run on the Spike golden-reference simulator or QEMU with the riscv64-unknown-elf-gcc toolchain. Named pitfall: building with -march=rv64imafdc but running on a simulator configured rv64imc, which raises an illegal-instruction trap on the float ops; we match the -march flag to the simulator ISA.
Reconstructing control flow from a binary with no source using objdump, GDB, radare2, and Ghidra: calling-convention recognition, compiler-pattern identification, and a P-code or disassembly view. Named pitfall: a misidentified function boundary in Ghidra that hides the real entry point; we mark it manually with Create Function and re-run analysis.
Intel SSE and AVX intrinsics via immintrin.h and ARM NEON via arm_neon.h for matrix-multiply and image-convolution kernels, with alignment requirements and 4 to 8x speedups over scalar code documented. Named pitfall: an AVX-to-SSE transition that can cost around 70 cycles on some Intel cores when the upper halves of the ymm registers are left dirty; we add vzeroupper at function boundaries.
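The vzeroupper placement can be sketched in a short routine, assuming NASM syntax, the System V AMD64 ABI, and a CPU with AVX. The name add8 and its signature are hypothetical.

```nasm
section .text
global add8

; Hypothetical: void add8(float *dst, const float *a, const float *b)
; Adds eight packed floats: dst[i] = a[i] + b[i] for i in 0..7.
add8:
    vmovups ymm0, [rsi]         ; unaligned load of a[0..7]
    vaddps  ymm0, ymm0, [rdx]   ; ymm0 = a + b, element-wise
    vmovups [rdi], ymm0         ; unaligned store to dst[0..7]
    vzeroupper                  ; clear the upper ymm halves before returning to code that may use SSE
    ret
```

The unaligned vmovups forms are used so the sketch makes no assumption about how the caller allocated the buffers.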
A 512-byte MBR boot loader transitioning from 16-bit real mode to 32-bit protected mode: zeroing segment registers, setting the stack, loading the GDT with lgdt, and the far jump after setting CR0 bit 0, tested in QEMU. Named pitfall: a missing 0x55 0xAA boot signature at offset 510, which makes the firmware skip the sector silently; we verify the magic bytes in the linker output.
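A minimal sketch of such a boot sector follows, assuming NASM flat-binary output (nasm -f bin) and a BIOS that loads the sector at 0x7C00. The label names are illustrative, and the sketch deliberately omits A20 handling and loading further sectors from disk.

```nasm
bits 16
org 0x7C00

start:
    cli                         ; no interrupts during the mode switch
    xor  ax, ax
    mov  ds, ax                 ; zero the segment registers
    mov  es, ax
    mov  ss, ax
    mov  sp, 0x7C00             ; stack grows down from the load address
    lgdt [gdt_desc]             ; load the GDT base and limit
    mov  eax, cr0
    or   eax, 1                 ; set CR0 bit 0 (protection enable)
    mov  cr0, eax
    jmp  0x08:pm_entry          ; far jump reloads CS with the code selector

bits 32
pm_entry:
    mov  ax, 0x10               ; data selector
    mov  ds, ax
    mov  es, ax
    mov  ss, ax
    mov  esp, 0x7C00
.hang:
    hlt
    jmp  .hang

gdt:
    dq 0                        ; null descriptor
    dq 0x00CF9A000000FFFF       ; flat 32-bit code segment, selector 0x08
    dq 0x00CF92000000FFFF       ; flat 32-bit data segment, selector 0x10
gdt_desc:
    dw gdt_desc - gdt - 1       ; limit = table size - 1
    dd gdt                      ; linear base address of the table

times 510 - ($ - $$) db 0       ; pad so the signature lands at offset 510
dw 0xAA55                       ; stored little-endian as bytes 0x55 0xAA
```

A hex dump of the last two bytes of the 512-byte image should read 55 aa; if the times line is missing or the code exceeds 510 bytes, the signature ends up at the wrong offset.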
Advanced Topics
RAW, WAR, and WAW data hazards, control hazards, structural hazards, forwarding paths, stall bubbles, and CPI calculation with full 5-stage timing diagrams.
Analyzing binaries without source: calling convention recognition, compiler pattern identification, and control flow reconstruction with objdump, GDB, radare2, and Ghidra.
SSE, AVX2, AVX-512 on x86, NEON on ARM, RVV on RISC-V. Vector register usage, alignment requirements, and 4 to 8x performance improvements in numerical computing.
System V AMD64 ABI for Linux and macOS, Microsoft x64 for Windows, AArch64 PCS for ARM, and the RISC-V ELF psABI. Argument registers, caller-saved and callee-saved, stack alignment, red zone, with stack frame diagrams.
Sample Output
; x86-64 Fibonacci (NASM syntax, System V AMD64 ABI)
section .text
global fibonacci
fibonacci:
    cmp rdi, 1
    jle .base
    push rbx            ; callee-saved, must preserve; also aligns rsp to 16 for the calls
    mov rbx, rdi        ; rbx = n
    dec rdi
    call fibonacci      ; rax = fib(n-1)
    lea rdi, [rbx - 2]  ; rdi = n-2
    mov rbx, rax        ; n is no longer needed, so rbx now holds fib(n-1)
    call fibonacci      ; rax = fib(n-2)
    add rax, rbx        ; rax = fib(n-2) + fib(n-1)
    pop rbx             ; restore caller's rbx
    ret
.base:
    mov rax, rdi        ; fib(0) = 0, fib(1) = 1
    ret
Tools & Environment
Sample Projects
Proper stack frames with the rbp prologue, callee-saved register preservation (rbx pushed and popped), and an instruction-count analysis with cycle-count estimation.
strlen, strcmp, strcpy, and strcat with null termination, boundary checking, and syscall I/O, with the service number loaded into $v0, in the MARS simulator.
Bubble sort and insertion sort with AArch64 conditional select, shifted-register operands for fast multiplication by constants, and memory-mapped UART output under QEMU.
Integer arithmetic with string-to-int and int-to-string conversion routines. Handles negative numbers via two's complement and overflow detection with the OF flag.
Tutors who cover this language
BS CS
620+ assignments completed
PhD CS
1,200+ assignments completed
Submit your assignment and get matched with a verified Assembly tutor. Anonymous handles, encrypted upload, files auto-delete 30 days after delivery.