
Tutor Profile

Dr. Sarah Chen

PhD in Computer Science from Georgia Tech. Specializes in graph algorithms and dynamic programming.

1,200+ Assignments Delivered: across CSHH and prior tutoring
10 Years Tutoring: since first paid teaching role
2 Languages Covered: Python, Java
7 Documented Specialties: each with a diagnostic playbook

About the Tutor

About Sarah

Sarah finished her CS PhD at Georgia Tech with a thesis on graph algorithms for sparse neural-network training. The work sat at the intersection she still tutors from: classical algorithmic thinking applied to PyTorch and JAX pipelines that students cannot otherwise debug. Ten years and 1,200+ assignments later, across CSHH and prior tutoring, she still opens every session the same way. Read the problem statement aloud. Identify the input contract. State the expected output type. Only then look at the code.

CS tutoring is the work she chose over a faculty post. The decision had a specific cause. During the third year of her PhD she ran a weekly peer-tutoring slot for undergraduates working through the Bellman-Ford material in their algorithms class. One student arrived with a 400-line solution that timed out on the staff autograder. The correct answer was 12 lines plus a comment explaining why V-1 rounds of relaxation suffice, which gives the O(VE) bound. The student had been working alone for nine days. That hour changed what she wanted her career to look like. She finished the PhD because the thesis was already drafted, but the academic-job applications never went out.

Her tutoring focuses on the why behind every algorithmic choice. Why memoize the recursion top-down before rewriting it bottom-up. Why a binary heap is the right default for Dijkstra, and why a Fibonacci heap is asymptotically faster only when the graph is dense. Why the reference implementation in CLRS uses an array indexed from 1 instead of 0. Students who work with her for a semester learn to defend their algorithmic decisions in code review, which is the actual skill the curriculum is testing. The grade follows. The understanding is what the grade rewards.
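To make the Dijkstra point concrete, here is a minimal sketch built on Python's heapq module, which is a binary heap. The function name, the adjacency-dict input format, and the sample graph are illustrative assumptions, not taken from Sarah's playbook. With a binary heap this version runs in O((V + E) log V) and assumes non-negative edge weights.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source, assuming non-negative edge weights.

    graph: dict mapping every vertex to a list of (neighbor, weight) pairs.
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]                 # entries are (tentative distance, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                  # stale entry: a shorter path to u was already found
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


# Hypothetical sample graph for illustration only.
graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
print(dijkstra(graph, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```

Pushing a fresh entry instead of decreasing a key in place keeps the code short. The stale-entry check is what makes that shortcut safe.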

On the ML side her diagnostic playbook is opinionated. PyTorch autograd issues are usually a missing .detach() on a tensor still attached to the computation graph, or a stale optimizer state that survived a model.load_state_dict() call. A vanishing gradient is almost never the optimizer; it is the loss surface and the initialization scheme. She works through these the same way she works through a graph traversal bug: print the shapes, print the requires_grad flags, isolate one minibatch, and reproduce before patching. Students who learn this loop stop guessing. They start narrowing the search space the way a working ML engineer does.

Her sessions are most useful when the student has already tried something. A specific case from last term: a machine-learning student had a softmax classifier that trained to 92% on the training set and 31% on the validation set, classic overfitting on the surface. Sarah pulled the data loader and found the actual bug. The validation split had been generated before the dataset was normalized, so the validation pixels were in [0, 255] while the training pixels were in [0, 1]. The model was learning the right function on the wrong input distribution. A four-line fix to the data pipeline. The student saw it once and never made the same mistake again.
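A bug like the one above can be caught with a quick range check on each split before training. The sketch below is a plain-Python illustration under stated assumptions: the helper names (normalize, value_range, ranges_agree), the factor threshold, and the tiny pixel lists are all hypothetical, and the student's actual pipeline is not shown in the source.

```python
def normalize(samples, scale=255.0):
    """Scale raw 8-bit pixel values into [0, 1]."""
    return [[x / scale for x in sample] for sample in samples]


def value_range(samples):
    """Return (min, max) over every value in a list of flat pixel lists."""
    flat = [x for sample in samples for x in sample]
    return min(flat), max(flat)


def ranges_agree(train, val, factor=10.0):
    """Heuristic check: the two splits' maxima differ by at most `factor`."""
    _, train_max = value_range(train)
    _, val_max = value_range(val)
    hi, lo = max(train_max, val_max), min(train_max, val_max)
    return lo > 0 and hi / lo <= factor


# Hypothetical raw data: four tiny "images" with 8-bit pixel values.
raw = [[0, 128, 255], [64, 32, 200], [10, 20, 30], [255, 0, 90]]

# Buggy order: split first, then normalize only the training part.
buggy_train, buggy_val = normalize(raw[:3]), raw[3:]

# Fixed order: normalize once, then split.
scaled = normalize(raw)
fixed_train, fixed_val = scaled[:3], scaled[3:]

print(value_range(buggy_train), value_range(buggy_val))  # (0.0, 1.0) (0, 255)
print(ranges_agree(buggy_train, buggy_val))              # False
print(ranges_agree(fixed_train, fixed_val))              # True
```

The check is deliberately crude. Its only job is to fail loudly when two splits live on different scales.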

Her CSHH workflow is consistent. The brief arrives, she reads the rubric twice, then drafts an outline of the solution before writing code. She files inline comments that name the invariant each loop maintains, attaches a Big-O analysis as a separate markdown block, and includes pytest cases covering the autograder edge inputs she has seen flagged before. A "good" student question, in her view, is one where the student has already tried something and can show the failing output. That tells her where understanding stops and where the next session should start. She has a personal rule against delivering anything she could not defend at a thesis committee, and that bar tends to produce work the autograder accepts on the first submission.
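As an illustration of what an inline comment that names a loop invariant can look like, here is a small sketch. The binary-search example and the function name lower_bound are assumptions chosen for brevity, not one of Sarah's reference solutions.

```python
def lower_bound(items, target):
    """Index of the first element >= target in a sorted list.

    Returns len(items) if every element is smaller. Runs in O(log n).
    """
    lo, hi = 0, len(items)
    # Invariant: every element of items[:lo] is < target,
    #            every element of items[hi:] is >= target.
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1   # items[:mid + 1] are all < target, invariant kept
        else:
            hi = mid       # items[mid:] are all >= target, invariant kept
    # Here lo == hi, so the invariant pins the answer to this index.
    return lo


print(lower_bound([1, 3, 3, 7], 3))  # 1
print(lower_bound([], 5))            # 0
```

Stating the invariant before the loop makes each branch checkable on its own: a reviewer only has to confirm that the assignment preserves it.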

Documented Specialties

What Sarah Specializes In

graph algorithms

BFS, DFS, Dijkstra, Bellman-Ford, MST.

dynamic programming

Top-down memoization, bottom-up tabulation.

PyTorch autograd debugging

Sarah handles PyTorch autograd debugging as a recurring CSHH workload, with documented patterns and reference solutions.

neural-network training instability

Sarah handles neural-network training instability as a recurring CSHH workload, with documented patterns and reference solutions.

Big-O complexity analysis

Sarah handles Big-O complexity analysis as a recurring CSHH workload, with documented patterns and reference solutions.

CS161 / 6.006 problem set methodology

Sarah handles CS161 / 6.006 problem set methodology as a recurring CSHH workload, with documented patterns and reference solutions.

Sample Reviewed Code

Code Sarah Has Reviewed

A representative snippet from Sarah's workflow, pulled from the diagnostic playbook she runs on incoming CSHH assignments in Python.

Python bellman_ford.py

          
def bellman_ford(graph, source):
    """Single-source shortest paths. O(VE), handles negative edges.
    Returns dist[v], or raises on negative cycle. CS161 / 6.006."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    for _ in range(len(graph) - 1):       # V-1 relaxation passes
        for u in graph:
            for v, w in graph[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    for u in graph:                       # one extra pass = cycle check
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                raise ValueError('negative cycle reachable from source')
    return dist

Course Matches

Courses Sarah Specializes In

6.006 Massachusetts Institute of Technology

MIT 6.006: Introduction to Algorithms

MIT 6.006 introduces algorithms across 13 weeks with 26 lectures, 13 recitations, and 7 problem sets. The Spring 2020 redesign by Erik Demaine, Jason Ku, and Justin Solomon...

8 recurring assignments covered


FAQ

Frequently Asked Questions

Why do you ask students to read the problem aloud before touching code?
Most pset failures trace to a misread input contract, not a wrong algorithm. Reading aloud surfaces ambiguity: signed vs unsigned, 0-indexed vs 1-indexed, in-place vs copy-return. Five minutes of contract reading saves two hours of debugging on the wrong solution.
How do you debug a PyTorch model that trains fine on CPU but produces NaN on GPU?
Three checks in order. First, mixed precision: if the forward pass runs under autocast() and any part of the loss is computed in fp16 rather than float32, certain ops (softmax over long sequences, log of small probabilities) can overflow or underflow. Second, in-place ops on tensors still attached to the graph: x += y can overwrite a value autograd still needs for the backward pass. PyTorch normally raises a RuntimeError for this on any device, so it is quick to rule out. Third, dataloader workers: pin_memory plus a buggy collate_fn can produce silent corruption that shows up only on GPU, because the CPU path bypasses the pinned-memory copy.
What is your stance on using ChatGPT for algorithms homework?
Useful for explaining a concept you have already partly grasped. Useless for learning to derive the algorithm yourself. The skill being tested in CS161 is not "write Bellman-Ford"; it is "prove your relaxation order terminates". An LLM will write the code and skip the proof. Students who submit the code without the proof get 60% and lose the actual lesson.
How do you decide between recursion, top-down DP, and bottom-up DP?
Start recursive to confirm correctness on small inputs. Add memoization the moment the same subproblem is computed twice. Convert to bottom-up only if recursion-depth limits or call-stack overhead becomes the bottleneck. Most CS161 DP problems land at top-down memoized; bottom-up is a performance refactor, not a correctness step.
What is a "good" question to bring to a tutoring session?
One where the student has tried something concrete and can show the failing output. "I implemented BFS but it returns the wrong distance on graph X" is a good question. "I do not understand graphs" is a question that needs to be decomposed first. The decomposition is part of the lesson.
Do you provide pytest cases with every solution?
Yes. Every full-solution delivery includes pytest cases covering the rubric inputs plus three edge cases I have seen flagged on past Gradescope submissions for that course: empty input, single-element input, and the input size where O(n^2) starts to time out.
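The recursion, memoization, and bottom-up progression from the DP answer above, together with pytest-style edge cases like those in the last answer, can be sketched as follows. The minimum-coin-change problem, the function names, and the test inputs are illustrative assumptions, not material from a specific course. The sketch assumes all coin values are positive integers.

```python
from functools import lru_cache


def min_coins_recursive(coins, amount):
    """Step 1, plain recursion: correct but exponential. None if unreachable."""
    if amount == 0:
        return 0
    best = None
    for c in coins:
        if c <= amount:
            sub = min_coins_recursive(coins, amount - c)
            if sub is not None and (best is None or sub + 1 < best):
                best = sub + 1
    return best


def min_coins_memo(coins, amount):
    """Step 2, top-down: same recursion, each subproblem solved once."""
    coins = tuple(coins)

    @lru_cache(maxsize=None)
    def solve(remaining):
        if remaining == 0:
            return 0
        best = None
        for c in coins:
            if c <= remaining:
                sub = solve(remaining - c)
                if sub is not None and (best is None or sub + 1 < best):
                    best = sub + 1
        return best

    return solve(amount)


def min_coins_bottom_up(coins, amount):
    """Step 3, bottom-up: no recursion, so no call-stack depth limit."""
    inf = float("inf")
    table = [0] + [inf] * amount          # table[t] = fewest coins summing to t
    for total in range(1, amount + 1):
        for c in coins:
            if c <= total and table[total - c] + 1 < table[total]:
                table[total] = table[total - c] + 1
    return None if table[amount] == inf else table[amount]


# pytest-style cases: plain functions with assert, collected by name.
def test_empty_input():
    assert min_coins_memo([], 0) == 0
    assert min_coins_memo([], 5) is None


def test_single_element_input():
    assert min_coins_memo([3], 9) == 3
    assert min_coins_memo([3], 10) is None


def test_versions_agree_on_small_inputs():
    for amount in range(15):
        expected = min_coins_recursive([1, 5, 6], amount)
        assert min_coins_memo([1, 5, 6], amount) == expected
        assert min_coins_bottom_up([1, 5, 6], amount) == expected
```

The recursive version serves as the oracle on small inputs. The bottom-up version is the one that survives large amounts, since the memoized version recurses once per unit of amount when a coin of value 1 is present and can hit Python's recursion limit.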

Work With Sarah?

Submit your assignment with Sarah in mind. We will route the request to the best-fit tutor based on subject, language, and current load. Average first reply inside 30 minutes during business hours.
