Projects

Over the last quarter, I took a GPU programming class. For the final project, I worked in a group of 4 on accelerating GPT-2 inference using CUDA. Specifically, we found a pure C, CPU-only implementation of GPT-2 inference online and accelerated it with CUDA. To maximize the educational value of the project, we settled on writing custom CUDA kernels for every operation rather than deferring some of them to existing libraries. We also set (and achieved) the goal of making the implementation end-to-end: per token, the data is transferred from host to device once, the entire inference process runs on the GPU, and only then is the result transferred back to the host (a sketch of this pattern follows below).
2024-06-15
1 min read
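
To give a concrete flavor of the "custom kernels, end-to-end" approach, here is a minimal, self-contained sketch (illustrative only, not code from the project): a tanh-approximation GELU kernel, like the one inside GPT-2's MLP blocks, applied to a buffer that is copied to the GPU once and copied back only after all on-device work is done.

```cuda
#include <cuda_runtime.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

// Tanh-approximation GELU, applied elementwise to a device-resident buffer.
__global__ void gelu_kernel(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = x[i];
        float cube = 0.044715f * v * v * v;
        x[i] = 0.5f * v * (1.0f + tanhf(0.7978845608f * (v + cube)));  // 0.79788... = sqrt(2/pi)
    }
}

int main(void) {
    const int n = 1 << 20;  // illustrative size, not a real activation shape
    float* h_x = (float*)malloc(n * sizeof(float));
    for (int i = 0; i < n; i++) h_x[i] = (float)i / n - 0.5f;

    float* d_x;
    cudaMalloc(&d_x, n * sizeof(float));

    // One host-to-device transfer...
    cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);

    // ...then all work stays on the GPU (a real forward pass chains many such
    // kernels: matmuls, layernorms, attention, softmax, ...)...
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    gelu_kernel<<<blocks, threads>>>(d_x, n);

    // ...and one device-to-host transfer at the end.
    cudaMemcpy(h_x, d_x, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("x[0] = %f\n", h_x[0]);

    cudaFree(d_x);
    free(h_x);
    return 0;
}
```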
Over the last quarter, I took an advanced-topics-in-ML class on LLMs for reasoning. For the final project, I did research on activation steering in a group of 4. We worked on various extensions of Contrastive Activation Addition (Panickssery et al., 2024, arXiv:2312.06681). Contrastive Activation Addition computes "behavior vectors" by averaging the difference in residual stream activations between pairs of positive and negative examples of a particular behavior. The LLM can then be "steered" to exhibit more or less of that behavior by adding the behavior vector, scaled by a coefficient, to the model's residual stream at inference time (see the sketch below).
2024-06-12
1 min read
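
In the notation I would use to summarize the method above (symbols are mine, not necessarily the paper's): if $a_\ell^+(x_i)$ and $a_\ell^-(x_i)$ are the residual stream activations at layer $\ell$ on the positive and negative example of the $i$-th contrast pair, the behavior vector is

$$v_\ell = \frac{1}{N}\sum_{i=1}^{N}\Big(a_\ell^+(x_i) - a_\ell^-(x_i)\Big),$$

and at inference time the residual stream $h_\ell$ becomes $h_\ell + \alpha\, v_\ell$, where the sign and magnitude of the coefficient $\alpha$ control how much more or less of the behavior the model exhibits.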
Last quarter I took Software Design, my first real software engineering class and my first formal course in a low-level programming language (C). The course consists of 6 weeks of guided work building a physics engine, followed by 4 weeks of unstructured work in which students build a game. Throughout the course, students work in groups of 4. You can play the final product of the course, the game I made, here.
2023-06-18
1 min read