x86-64 assembly
System V AMD64 calling convention, SSE/AVX vector instructions.
Tutor Profile
BS Computer Science from Purdue University. Specializes in x86-64 assembly and ARM64 assembly.
About the Tutor
James finished his CS bachelor's at Purdue, then spent five years writing low-level performance code for a kernel-security shop where the day job was reading Intel and AMD optimization manuals end to end, writing assembly probes that measured pipeline behavior at single-cycle resolution, and reproducing the silicon errata vendors themselves had documented. None of that lands on a typical undergraduate transcript. All of it lands on the kind of architecture assignments students get stuck on. Seven years into tutoring and 620+ CSHH assignments later, he still teaches the same way: from the instruction set up, not from the high-level language down.
His tutoring is heavy on x86-64 and ARM64 assembly because those are the two architectures students actually encounter. He covers RISC-V for courses that teach it. The recurring student frustration is calling-convention bugs: a function that compiles and links but corrupts the stack, returns garbage, or segfaults the moment another function is called. The bug is almost always a register-preservation violation. The student wrote a function that clobbers rbx or r12 (callee-saved under System V AMD64) without pushing them first, the caller relied on those registers still holding their values, and the corruption surfaces three calls later. James traces these by reading the disassembly with the ABI doc open. Students learn to do the same.
A representative case from last semester. A systems-course student submitted a hand-written assembly implementation of memcpy that passed the functional tests but mysteriously broke the test harness on the second invocation. The function preserved the right callee-saved registers, returned the right value, and handled alignment correctly. The bug was that the student had used the red zone (the 128 bytes below rsp under System V AMD64) for a temporary buffer. That is legal in a leaf function, because the ABI guarantees that signal and interrupt handlers will not touch those bytes, but invalid once the function calls anything else: the call pushes a return address into the red zone and the callee's frame overwrites the rest. James found the bug by stepping through with gdb and noticing rsp had not been adjusted before a downstream call. The fix was three instructions to allocate proper stack space. The student went from a failing grade to full credit on the lab and, for the first time, understood why the red zone exists.
On the architecture side his teaching priority is the memory hierarchy. Most students learn that L1 is fast and DRAM is slow without ever measuring the gap. James walks them through a rough latency model: about 4 cycles for an L1 hit, 12 for L2, 40 for L3, and 200 for DRAM on a modern Intel chip (ballpark figures that vary by microarchitecture). Then they implement matrix multiplication two ways: a naive triple loop that misses cache constantly and a blocked version that respects the L1 working set. The measured speedup is usually 4 to 10x on the same algorithm. Students who experience this once stop writing code that ignores the cache.
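The naive and blocked versions described above can be sketched in C roughly as follows. This is a minimal illustration, not James's actual exercise: the function names, the row-major `double` layout, and the 32-element tile edge are all assumptions. The tile size is picked so that three 32x32 tiles of doubles (24 KiB) fit in a typical 32 KiB L1 data cache. The input values are small integers so both versions produce bit-identical results and can be compared exactly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Naive i-j-k triple loop. The inner loop walks b down a column, so
   consecutive reads are n * sizeof(double) bytes apart. */
static void matmul_naive(const double *a, const double *b, double *c, int n)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

static int min_int(int x, int y) { return x < y ? x : y; }

/* Blocked (tiled) version: works on bs x bs tiles so the data touched by
   the three inner loops stays cache-resident. Handles n not divisible by bs. */
static void matmul_blocked(const double *a, const double *b, double *c,
                           int n, int bs)
{
    memset(c, 0, (size_t)n * (size_t)n * sizeof *c);
    for (int ii = 0; ii < n; ii += bs)
        for (int kk = 0; kk < n; kk += bs)
            for (int jj = 0; jj < n; jj += bs)
                for (int i = ii; i < min_int(ii + bs, n); i++)
                    for (int k = kk; k < min_int(kk + bs, n); k++) {
                        double aik = a[i * n + k];
                        /* Unit-stride walk along rows of b and c. */
                        for (int j = jj; j < min_int(jj + bs, n); j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
}

/* Hypothetical helper: runs both versions on the same inputs and returns the
   largest absolute difference between their outputs (-1.0 if malloc fails). */
double compare_matmuls(int n, int bs)
{
    assert(n > 0 && bs > 0);
    size_t count = (size_t)n * (size_t)n;
    double *a = malloc(count * sizeof *a);
    double *b = malloc(count * sizeof *b);
    double *c1 = malloc(count * sizeof *c1);
    double *c2 = malloc(count * sizeof *c2);
    double worst = -1.0;

    if (a && b && c1 && c2) {
        for (size_t i = 0; i < count; i++) {
            a[i] = (double)((int)(i % 7) - 3);   /* small integers: exact sums */
            b[i] = (double)((int)(i % 5) - 2);
        }
        matmul_naive(a, b, c1, n);
        matmul_blocked(a, b, c2, n, bs);
        worst = 0.0;
        for (size_t i = 0; i < count; i++) {
            double d = c1[i] - c2[i];
            if (d < 0.0) d = -d;
            if (d > worst) worst = d;
        }
    }
    free(a); free(b); free(c1); free(c2);
    return worst;
}
```

To reproduce the timing comparison, call each multiply from `main` on a matrix large enough to spill out of L2 and bracket each call with `clock()` from `<time.h>`. The speedup you see depends on the cache sizes of the machine and on compiler flags, so treat the tile size as a parameter to tune, not a constant.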
His CSHH workflow is methodical. When a brief arrives, he reads the ISA spec section relevant to the assignment first, drafts the assembly by hand on paper, then types it in and runs it under gdb with set disassembly-flavor intel or set disassembly-flavor att, depending on which x86 syntax the course uses. Every instruction in the delivered solution has a comment explaining what it does at the architectural level: which register it reads, which it writes, which flags it sets, why it appears at that point in the code. A "good" student question for James is one where the student can show the disassembly of their compiled code and point at the specific instruction sequence they do not understand. With that, the lesson starts at the actual confusion instead of three layers above it. The pset gets done. The mental model also gets built, and that one carries forward to the next assignment.
Documented Specialties
System V AMD64 calling convention, SSE/AVX vector instructions.
AArch64 calling convention, NEON intrinsics.
RV32I, RV64I, M and F extensions.
layout asm, stepi, info registers, x/Nxw.
L1/L2/L3/DRAM, TLB, cache blocking.
James handles pipeline hazards (data, control, structural) and microarchitectural state as a recurring CSHH workload, with documented patterns and reference solutions.
Sample Reviewed Code
A representative snippet from the diagnostic playbook James runs on incoming CSHH assignments in this language.
# System V AMD64 callee-saved set: rbx, rbp, r12-r15, rsp.
# Push every callee-saved register you touch. Pop in reverse order.
# CMU 15-213 attacklab + bomblab top deduction: missing this prologue.
        .text
        .globl  my_function
my_function:
        pushq   %rbp                # save caller's frame pointer
        movq    %rsp, %rbp          # establish our own
        pushq   %rbx                # we will use rbx below
        pushq   %r12                # and r12
        subq    $32, %rsp           # 32B local frame (NOT the red zone -
                                    # we call printf below, red zone is invalid);
                                    # rsp is 16-byte aligned here, as a call requires
        # ... function body uses rbx, r12 freely ...
        xorl    %eax, %eax          # variadic call: %al = vector registers used
                                    # (0, assuming no floating-point arguments)
        call    printf@PLT
        addq    $32, %rsp           # tear down local frame
        popq    %r12                # restore in reverse push order
        popq    %rbx
        popq    %rbp
        ret
Coverage Map
Subjects
Course Matches
CS50 introduces computational thinking across 10 weeks plus a final project, using 4 languages in sequence: Scratch (week 0), C (weeks 1 through 5), Python (weeks 6 through 7),...
10 recurring assignments covered
Get help with CS50
MIT 6.006 introduces algorithms across 13 weeks with 26 lectures, 13 recitations, and 7 problem sets. The Spring 2020 redesign by Erik Demaine, Jason Ku, and Justin Solomon...
8 recurring assignments covered
Get help with 6.006
CS61B teaches data structures and Java software engineering across 14 weeks under Josh Hug (since 2017) and Justin Yokota (recent semesters). Lectures cover lists, sets, maps,...
8 recurring assignments covered
Get help with CS61B
FAQ
More Named Tutors
Four tutors keep public profiles. The rest of the bench stays off the public site so student-tutor matches stay confidential.
PhD CS
1,200+ assignments completed
MS CS
980+ assignments completed
Active Bench
Behind the four named profiles is a wider matching bench. Submissions auto-route by subject, language, and timezone. The public profiles cover the most-requested specializations; the rest of the roster stays unpublished so student-tutor pairings stay private.
Get matched to a tutor
Submit your assignment with James in mind. We will route the request to the best-fit tutor based on subject, language, and current load. Average first reply inside 30 minutes during business hours.
Submit for James