
Vansh Tibrewal
I like diving into hard problems. Caltech CS. AI research and software engineering. If you’re interested in working with me or hiring me, you can check out my resume here.
Experiences
Incoming Quantitative Research Engineering Intern
Citadel Global Quantitative Strategies · New York City
June 2025
Researcher
January 2024 - Present
- Developing methods for improved self-supervised learning for action recognition, using SAM-2 segmentation to refocus the model and overcome the object-centric bias of self-supervised vision foundation models.
- Applying these methods to identify statistically significant differences in the behavior of sheep that have undergone neurosurgery.
Researcher
November 2022 - Present
- Training foundation models for computational fluid dynamics by pre-training on complex geometries from graphics data.
- Designing a better drone propeller by modeling the airflow around arbitrary propeller shapes and solving the inverse problem, using the architectures below.
- Developing multi-scale, graph-based Fourier neural operator architectures for modeling fluid flow on arbitrary geometries (see the sketch after this list).
- Investigating physics-informed Fourier neural operator architectures for modeling PDEs with discontinuities.
- Used a 3D vectorized Fourier neural operator to model the MHD equations governing plasma evolution in tokamak nuclear fusion reactors.
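The Fourier neural operator work above builds on spectral convolution layers. As a rough illustration only (not the multi-scale, graph-based architectures from this research), here is a minimal 1D spectral convolution in PyTorch: transform to Fourier space, keep the lowest frequency modes, mix channels with learned complex weights, and transform back.

```python
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """One Fourier layer: FFT, truncate to the lowest modes, apply learned
    complex weights per mode, then inverse FFT back to physical space."""

    def __init__(self, in_channels: int, out_channels: int, modes: int):
        super().__init__()
        self.modes = modes
        scale = 1.0 / (in_channels * out_channels)
        self.weight = nn.Parameter(
            scale * torch.randn(in_channels, out_channels, modes, dtype=torch.cfloat)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, n_points), samples of a function on a 1D grid
        x_ft = torch.fft.rfft(x)                         # (batch, in, n//2 + 1)
        out_ft = torch.zeros(x.shape[0], self.weight.shape[1], x_ft.shape[-1],
                             dtype=torch.cfloat, device=x.device)
        # keep only the lowest `modes` frequencies and mix channels there
        out_ft[:, :, :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[:, :, :self.modes], self.weight
        )
        return torch.fft.irfft(out_ft, n=x.shape[-1])    # back to physical space

layer = SpectralConv1d(in_channels=1, out_channels=1, modes=16)
u = torch.randn(8, 1, 256)      # a batch of 1D fields
print(layer(u).shape)           # torch.Size([8, 1, 256])
```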
Research Assistant
July 2020 - July 2022
- Co-authored a research paper, published in a Taylor & Francis journal, on the impact of COVID-19 on excess deaths from various causes.
- Implemented the Holt-Winters algorithm and ARIMA models in Python (sketch after this list).
- Extended an SEIR model and presented it to the National Science Foundation.
- Applied ecological regression to a novel dataset, implemented in R.
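For context, the two baseline forecasters mentioned above can be set up in a few lines with statsmodels. This is a hedged sketch, not the paper's code: the file name, series, forecast horizon, and ARIMA order are illustrative placeholders.

```python
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.arima.model import ARIMA

# weekly_deaths: a pandas Series of pre-pandemic weekly deaths, indexed by date
# (placeholder file; the real data came from official mortality statistics)
weekly_deaths = pd.read_csv("weekly_deaths.csv", index_col=0, parse_dates=True).squeeze()

# Holt-Winters: additive trend + yearly seasonality (52 weeks)
hw = ExponentialSmoothing(weekly_deaths, trend="add", seasonal="add",
                          seasonal_periods=52).fit()
hw_forecast = hw.forecast(26)          # expected deaths for the next 26 weeks

# ARIMA baseline with an illustrative (p, d, q) order
arima = ARIMA(weekly_deaths, order=(2, 1, 2)).fit()
arima_forecast = arima.forecast(26)

# Excess deaths = observed deaths minus the model's expected deaths over the same window
```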
Software Engineering Intern
AuthBase · Startup
June 2021 - August 2021
Trained a GPT-2 model on cybersecurity blogs to create a chatbot that provides comprehensive information on the latest security vulnerabilities. Reported to the CTO of a startup applying deep learning to cybersecurity.
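A rough sketch of what fine-tuning GPT-2 on a folder of scraped blog posts looks like with Hugging Face transformers; the internship code is not public, and the file paths and hyperparameters here are assumptions for illustration.

```python
from datasets import load_dataset
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

tok = GPT2TokenizerFast.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# "security_blogs/*.txt" is a placeholder for the scraped blog corpus
blogs = load_dataset("text", data_files={"train": "security_blogs/*.txt"})
def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=512)
train = blogs["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-security", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=train,
    # mlm=False gives causal-LM labels (next-token prediction)
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```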
Programmer
December 2020 - July 2021
Worked in a team of 14 to build a robot for the FTC “Ultimate Goal” competition. Ranked 1st in the FTC UK Remote Qualifying Tournament. Mentored team members in Java programming.
Programmer
May 2021 - June 2021
Built, trained, and deployed a chatbot to answer visitor queries on a non-profit’s website that receives 5,000+ monthly visitors, leading to an 80% decrease in email queries with no change in the number of donors or volunteers.
ML Engineer
June 2020 - May 2021
- Used data augmentation and transfer learning to diagnose COVID-19 from CT scans with 97% precision.
- Implemented a Naive Bayes classifier to conduct sentiment analysis of tweets (sketch below).
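A minimal sketch of a Naive Bayes sentiment classifier using scikit-learn; the tweets and labels below are illustrative placeholders, not the real dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# placeholder data: 1 = positive sentiment, 0 = negative sentiment
tweets = ["I love this!", "Worst day ever", "So happy right now", "This is terrible"]
labels = [1, 0, 1, 0]

# bag-of-words features feeding a multinomial Naive Bayes model
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(tweets, labels)
print(clf.predict(["what a great launch"]))   # -> [1]
```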
Software Engineering Intern
Kentropy Technologies
August 2020 - November 2020
- Developed EdTech contact-tracing game to educate kids about the nature of COVID-19.
- Used Java for agent-based modeling of COVID-19 spread.
Projects
Over the last quarter, I took a GPU programming class. For the final project of that class, I worked in a group of 4 on accelerating GPT-2 inference using CUDA. Specifically, we found a pure C, CPU-only implementation of GPT-2 inference online and accelerated it using CUDA.
We settled on writing custom CUDA kernels for every operation, rather than deferring some of them to existing libraries, to maximize the educational value of the project. We also set (and achieved) the goal of making it end-to-end: the data is transferred from host to device only once, the entire inference process runs on the GPU, and the result is transferred back to the host once per token.
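The actual project was plain C/CUDA with hand-written kernels for every GPT-2 op. As a hedged illustration of the end-to-end idea, here is a minimal Numba-CUDA sketch in Python: copy the activations to the GPU once, run a custom kernel there, and only copy back at the end (the GELU kernel is just a stand-in for the real set of kernels).

```python
import math
import numpy as np
from numba import cuda

@cuda.jit
def gelu_kernel(x, out):
    i = cuda.grid(1)                      # global thread index
    if i < x.shape[0]:
        v = x[i]
        # tanh approximation of GELU, as used in GPT-2
        out[i] = 0.5 * v * (1.0 + math.tanh(0.7978845608 * (v + 0.044715 * v * v * v)))

h_x = np.random.randn(1 << 20).astype(np.float32)
d_x = cuda.to_device(h_x)                 # single host-to-device transfer
d_out = cuda.device_array_like(d_x)       # output stays resident on the GPU

threads = 256
blocks = (h_x.size + threads - 1) // threads
gelu_kernel[blocks, threads](d_x, d_out)  # launch the custom kernel

h_out = d_out.copy_to_host()              # single device-to-host transfer
```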
2024-06-15
1 min read
Over the last quarter, I took an advanced-topics-in-ML class on LLMs for reasoning. For the final project of that class, I did research on activation steering in a group of 4. We worked on various extensions of Contrastive Activation Addition (Panickssery et al., 2024, arXiv:2312.06681).
Contrastive Activation Addition computes “behavior vectors” by averaging the difference in residual-stream activations between pairs of positive and negative examples of a particular behavior. The LLM can then be “steered” to exhibit more or less of the chosen behavior by adding the behavior vector, scaled by a coefficient, to the model’s residual stream at inference time.
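Here is a simplified sketch of that recipe using GPT-2 and Hugging Face transformers. It is not our project code: the paper works with larger chat models, and the layer index, coefficient, prompts, and the choice of the last token position are assumptions made for illustration.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
LAYER, ALPHA = 6, 4.0   # illustrative layer and steering coefficient

def last_token_activation(prompt: str) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0, -1]        # residual stream at LAYER, last token

# Behavior vector = mean(positive activations) - mean(negative activations)
pos = ["I love helping people.", "Being kind matters to me."]
neg = ["I don't care about others.", "Helping people is a waste of time."]
vec = torch.stack([last_token_activation(p) for p in pos]).mean(0) \
    - torch.stack([last_token_activation(n) for n in neg]).mean(0)

# Steer at inference time by adding the scaled vector to that layer's output.
def steer(module, inputs, output):
    return (output[0] + ALPHA * vec,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steer)
ids = tok("My view on other people is", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=20)[0]))
handle.remove()
```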
2024-06-12
1 min read
Last quarter I took Software Design, my first real software engineering class and my first formal course in a low-level programming language (C). The course consists of 6 weeks of guided work building a physics engine, followed by 4 weeks of unstructured work in which students build a game. Throughout the course, students work in groups of 4.
You can play the final product of the course, the game I made, here.
2023-06-18
1 min read
Last week I had a lot of fun visualizing various pathfinding algorithms, so I decided to follow that up with a visualizer for another classic type of algorithm in CS: sorting algorithms.
You can find the code for the final version here. Below, I will showcase what algorithms I covered and what the visualizations looked like, in this order:
- Bubble Sort
- Merge Sort
- Quick Sort
- Heap Sort

In addition, the user can choose between 10 and 100 items to sort.
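One common way to structure such a visualizer, and roughly the pattern behind mine, is to write each sort as a generator that yields the array after every swap, with the UI redrawing the bars on each yield. The snippet below is a small sketch of that idea (bubble sort only, printing frames instead of drawing bars), not the project’s actual code.

```python
import random

def bubble_sort_steps(items):
    """Yield a snapshot of the list after every swap, one frame per yield."""
    a = list(items)
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                yield list(a)

data = random.sample(range(1, 101), 10)   # the user picks between 10 and 100 items
for frame in bubble_sort_steps(data):
    print(frame)                           # the real visualizer draws bars instead
```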
2021-10-22
1 min read