Notes on software, systems, and the science of making machines learn

I have written code in quiet hours and in restless ones. In industry I built pipelines that moved data across oceans of machines. In research I followed questions about memory, energy, and the hidden cost of intelligence. At Procter & Gamble I shaped systems that saved time and money. At Cornell and UIUC I asked how hardware breathes when pressed by algorithms.

Now, in Atlanta, the work continues. At Georgia Tech, I contribute to Vajra, a next-generation LLM serving engine. This is the work of the last mile: taking a trained model and building the machinery to serve its intelligence to the world. My part is in the choreography of it all, in the scheduling and benchmarking infrastructure that manages the torrent of questions and measures the machine’s response.

I return, again and again, to the border where software and systems meet. There is the elegance of an algorithm, the stubbornness of a cache, the silence of a GPU waiting for data, and the long patience of models that learn only when given the chance to see more of the world.