Projects
A collection of projects I've worked on, from full-stack applications to open-source tools. Each project represents a unique challenge and learning experience.
High-Performance GPU Convolution Kernel
Optimized a CNN forward-pass CUDA kernel, cutting execution time from 13 s to 30 ms (roughly a 430x speedup) by eliminating memory bottlenecks. Achieved 90% SM occupancy and 83.8% FMA utilization through spatial coarsening and register-level optimization.
Sharded Paxos Key-Value Store
Engineered a highly available, sharded distributed key-value store that uses the Paxos consensus protocol for state replication, preserving linearizability under network partitions and simulated node crashes.
User-Level Threading & Synchronization Library
Built a preemptive user-level threading library from scratch, implementing custom CPU context switching, register state save and restore, and synchronization primitives (mutexes, condition variables) that guard shared state against race conditions.
An AI agent system designed to make the open-source contribution process seamless. Built a custom Scout MCP server to map repositories, analyze developer patterns, and match developers with optimal contribution opportunities.
An intelligent study platform that automatically generates custom flashcards and practice exams from uploaded course materials, utilizing language models to extract key concepts and streamline the active recall process.
Perfect Pong
A hardware-software integration project that uses a Raspberry Pi and a 1-D gantry system to catch moving objects in real time. Engineered a computer vision pipeline using OpenCV with HSV color masking and a custom one-shot prediction algorithm.
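The prediction step is not described in detail here, so the following is only an illustrative sketch of how a single-shot intercept estimate could work. It assumes the object moves at constant velocity and that the vision pipeline has already produced two timestamped centroids. The names `Observation`, `predict_intercept`, `gantry_y`, `x_min`, and `x_max` are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Observation:
    """One detected centroid (hypothetical structure for illustration)."""
    t: float  # timestamp in seconds
    x: float  # position along the gantry axis
    y: float  # position perpendicular to the gantry axis


def predict_intercept(
    a: Observation,
    b: Observation,
    gantry_y: float,
    x_min: float,
    x_max: float,
) -> Optional[Tuple[float, float]]:
    """Extrapolate the line through two observations to y == gantry_y.

    Assumes constant velocity (no bounces, no drag). Returns
    (x_target, seconds_until_arrival), or None when the object is not
    heading toward the gantry line.
    """
    dt = b.t - a.t
    if dt <= 0:
        return None  # observations must be in chronological order

    vx = (b.x - a.x) / dt
    vy = (b.y - a.y) / dt
    remaining = gantry_y - b.y

    # The object must be moving toward the gantry line to be catchable.
    if vy == 0 or remaining / vy < 0:
        return None

    eta = remaining / vy
    x_target = b.x + vx * eta

    # Clamp to the physical travel range of the 1-D gantry.
    return min(max(x_target, x_min), x_max), eta


first = Observation(t=0.0, x=100.0, y=0.0)
second = Observation(t=0.5, x=110.0, y=20.0)
print(predict_intercept(first, second, gantry_y=100.0, x_min=0.0, x_max=640.0))
```

Because the estimate needs only one pair of detections, the gantry can start moving as soon as the object is seen twice; a real system would likely need to account for measurement noise and actuator travel time.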