
Vansh Tibrewal
I like diving into hard problems. Caltech CS. AI research and software engineering. If you’re interested in working with me or hiring me, you can check out my resume here.
Experiences
Incoming Quantitative Research Engineering Intern
Citadel Global Quantitative Strategies · New York City
June 2025
Researcher
January 2024 - Present
- Developing methods for improved self-supervised learning for action recognition, using SAM-2 segmentation to refocus the model and overcome the object-centric bias of self-supervised vision foundation models.
- Applying these methods to identify statistically significant differences in the behavior of sheep that have undergone neurosurgery.
Researcher
November 2022 - Present
- Training foundation models for computational fluid dynamics by pre-training on complex geometries from graphics data.
- Designing a better drone propeller by modeling the airflow around arbitrary propeller shapes and solving the inverse problem, using the architectures below.
- Developing multi-scale, graph-based Fourier neural operator architectures for modeling fluid flow on arbitrary geometries (see the sketch after this list).
- Investigating physics-informed Fourier neural operator architectures for modeling PDEs with discontinuities.
- Used a 3D vectorized Fourier neural operator to model the MHD equations governing plasma evolution in tokamak nuclear fusion reactors.
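The Fourier neural operator work above builds on spectral convolution layers. As a rough illustration only (not the multi-scale, graph-based architectures from this research), here is a minimal 1D spectral convolution in PyTorch: transform to Fourier space, keep the lowest frequency modes, mix channels with learned complex weights, and transform back.

```python
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """One Fourier layer: FFT, truncate to the lowest modes, apply learned
    complex weights per mode, then inverse FFT back to physical space."""

    def __init__(self, in_channels: int, out_channels: int, modes: int):
        super().__init__()
        self.modes = modes
        scale = 1.0 / (in_channels * out_channels)
        self.weight = nn.Parameter(
            scale * torch.randn(in_channels, out_channels, modes, dtype=torch.cfloat)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, n_points), samples of a function on a 1D grid
        x_ft = torch.fft.rfft(x)                         # (batch, in, n//2 + 1)
        out_ft = torch.zeros(x.shape[0], self.weight.shape[1], x_ft.shape[-1],
                             dtype=torch.cfloat, device=x.device)
        # keep only the lowest `modes` frequencies and mix channels there
        out_ft[:, :, :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[:, :, :self.modes], self.weight
        )
        return torch.fft.irfft(out_ft, n=x.shape[-1])    # back to physical space

layer = SpectralConv1d(in_channels=1, out_channels=1, modes=16)
u = torch.randn(8, 1, 256)      # a batch of 1D fields
print(layer(u).shape)           # torch.Size([8, 1, 256])
```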
Research Assistant
July 2020 - July 2022
- Co-authored a research paper, published in a Taylor & Francis journal, on the impact of COVID-19 on excess deaths from various causes.
- Implemented the Holt-Winters algorithm and ARIMA models in Python (sketch after this list).
- Extended an SEIR model and presented it to the National Science Foundation.
- Applied ecological regression to a novel dataset, implemented in R.
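For context, the two baseline forecasters mentioned above can be set up in a few lines with statsmodels. This is a hedged sketch, not the paper's code: the file name, series, forecast horizon, and ARIMA order are illustrative placeholders.

```python
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.arima.model import ARIMA

# weekly_deaths: a pandas Series of pre-pandemic weekly deaths, indexed by date
# (placeholder file; the real data came from official mortality statistics)
weekly_deaths = pd.read_csv("weekly_deaths.csv", index_col=0, parse_dates=True).squeeze()

# Holt-Winters: additive trend + yearly seasonality (52 weeks)
hw = ExponentialSmoothing(weekly_deaths, trend="add", seasonal="add",
                          seasonal_periods=52).fit()
hw_forecast = hw.forecast(26)          # expected deaths for the next 26 weeks

# ARIMA baseline with an illustrative (p, d, q) order
arima = ARIMA(weekly_deaths, order=(2, 1, 2)).fit()
arima_forecast = arima.forecast(26)

# Excess deaths = observed deaths minus the model's expected deaths over the same window
```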
Software Engineering Intern
AuthBase · Startup
June 2021 - August 2021
Trained a GPT-2 model on cybersecurity blogs to create a chatbot that provides comprehensive information on the latest security vulnerabilities. Reported to the CTO of a startup applying deep learning to cybersecurity.
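A rough sketch of what fine-tuning GPT-2 on a folder of scraped blog posts looks like with Hugging Face transformers; the internship code is not public, and the file paths and hyperparameters here are assumptions for illustration.

```python
from datasets import load_dataset
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

tok = GPT2TokenizerFast.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# "security_blogs/*.txt" is a placeholder for the scraped blog corpus
blogs = load_dataset("text", data_files={"train": "security_blogs/*.txt"})
def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=512)
train = blogs["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-security", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=train,
    # mlm=False gives causal-LM labels (next-token prediction)
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```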
Programmer
December 2020 - July 2021
Worked in a team of 14 to build a robot for the FTC “Ultimate Goal” competition. Ranked 1st in the FTC UK Remote Qualifying Tournament. Mentored team members in Java programming.
Programmer
May 2021 - June 2021
Built, trained, and deployed a chatbot to answer visitor queries on a non-profit’s website that receives 5,000+ monthly visitors, leading to an 80% decrease in email queries with no change in the number of donors or volunteers.
ML Engineer
June 2020 - May 2021
- Used data augmentation and transfer learning to diagnose COVID-19 from CT scans with 97% precision.
- Implemented a Naive Bayes classifier to conduct sentiment analysis of tweets (sketch below).
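A minimal sketch of a Naive Bayes sentiment classifier using scikit-learn; the tweets and labels below are illustrative placeholders, not the real dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# placeholder data: 1 = positive sentiment, 0 = negative sentiment
tweets = ["I love this!", "Worst day ever", "So happy right now", "This is terrible"]
labels = [1, 0, 1, 0]

# bag-of-words features feeding a multinomial Naive Bayes model
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(tweets, labels)
print(clf.predict(["what a great launch"]))   # -> [1]
```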
Software Engineering Intern
Kentropy Technologies
August 2020 - November 2020
- Developed EdTech contact-tracing game to educate kids about the nature of COVID-19.
- Used Java for agent-based modeling of COVID-19 spread.
Projects
Over the last quarter, I took a GPU programming class. For the final project of that class, I worked in a group of 4 on accelerating GPT-2 inference using CUDA. Specifically, we found a pure C, CPU-only implementation of GPT-2 inference online and accelerated it using CUDA.
We settled on writing custom CUDA kernels for every operation, rather than deferring some of them to existing libraries, to maximize the educational value of the project. We also set (and achieved) the goal of making it end-to-end: the data is transferred from host to device only once, the entire inference process runs on the GPU, and the result is transferred back to the host once per token.
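The actual project was plain C/CUDA with hand-written kernels for every GPT-2 op. As a hedged illustration of the end-to-end idea, here is a minimal Numba-CUDA sketch in Python: copy the activations to the GPU once, run a custom kernel there, and only copy back at the end (the GELU kernel is just a stand-in for the real set of kernels).

```python
import math
import numpy as np
from numba import cuda

@cuda.jit
def gelu_kernel(x, out):
    i = cuda.grid(1)                      # global thread index
    if i < x.shape[0]:
        v = x[i]
        # tanh approximation of GELU, as used in GPT-2
        out[i] = 0.5 * v * (1.0 + math.tanh(0.7978845608 * (v + 0.044715 * v * v * v)))

h_x = np.random.randn(1 << 20).astype(np.float32)
d_x = cuda.to_device(h_x)                 # single host-to-device transfer
d_out = cuda.device_array_like(d_x)       # output stays resident on the GPU

threads = 256
blocks = (h_x.size + threads - 1) // threads
gelu_kernel[blocks, threads](d_x, d_out)  # launch the custom kernel

h_out = d_out.copy_to_host()              # single device-to-host transfer
```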
2024-06-15
1 min read
Over the last quarter, I took an advanced-topics-in-ML class on LLMs for reasoning. For the final project of that class, I did research on activation steering in a group of 4. We worked on various extensions of Contrastive Activation Addition (Panickssery et al., 2024, arXiv:2312.06681).
Contrastive Activation Addition computes “behavior vectors” by averaging the difference in residual-stream activations between pairs of positive and negative examples of a particular behavior. The LLM can then be “steered” to exhibit more or less of the chosen behavior by adding the behavior vector, scaled by a coefficient, to the model’s residual stream at inference time.
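Here is a simplified sketch of that recipe using GPT-2 and Hugging Face transformers. It is not our project code: the paper works with larger chat models, and the layer index, coefficient, prompts, and the choice of the last token position are assumptions made for illustration.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
LAYER, ALPHA = 6, 4.0   # illustrative layer and steering coefficient

def last_token_activation(prompt: str) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0, -1]        # residual stream at LAYER, last token

# Behavior vector = mean(positive activations) - mean(negative activations)
pos = ["I love helping people.", "Being kind matters to me."]
neg = ["I don't care about others.", "Helping people is a waste of time."]
vec = torch.stack([last_token_activation(p) for p in pos]).mean(0) \
    - torch.stack([last_token_activation(n) for n in neg]).mean(0)

# Steer at inference time by adding the scaled vector to that layer's output.
def steer(module, inputs, output):
    return (output[0] + ALPHA * vec,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steer)
ids = tok("My view on other people is", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=20)[0]))
handle.remove()
```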
2024-06-12
1 min read
Last quarter I took Software Design, my first real software engineering class and my first formal course in a low-level programming language (C). The course consists of 6 weeks of guided work building a physics engine, followed by 4 weeks of unstructured work in which students build a game. Throughout the course, students work in groups of 4.
You can play the final product of the course, the game I made, here.
2023-06-18
1 min read
Last week I had a lot of fun visualizing various pathfinding algorithms, so I decided to follow that up with a visualizer for another classic type of algorithm in CS: sorting algorithms.
You can find the code for the final version here. Below, I will showcase what algorithms I covered and what the visualizations looked like, in this order:
- Bubble Sort
- Merge Sort
- Quick Sort
- Heap Sort

In addition, the user can choose between 10 and 100 items to sort.
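One common way to structure such a visualizer, and roughly the pattern behind mine, is to write each sort as a generator that yields the array after every swap, with the UI redrawing the bars on each yield. The snippet below is a small sketch of that idea (bubble sort only, printing frames instead of drawing bars), not the project’s actual code.

```python
import random

def bubble_sort_steps(items):
    """Yield a snapshot of the list after every swap, one frame per yield."""
    a = list(items)
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                yield list(a)

data = random.sample(range(1, 101), 10)   # the user picks between 10 and 100 items
for frame in bubble_sort_steps(data):
    print(frame)                           # the real visualizer draws bars instead
```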
2021-10-22
1 min read