I'm a Master's student at CMU's School of Computer Science (Fall 2025), focusing on AI safety, alignment, and reinforcement learning. Over the past year, I've gone from being new to reinforcement learning to leading hands-on research on RL-based post-training for large language models. I treat research as a sustained effort: I would rather spend years making real progress on hard problems than chase incremental advances.
At CMU, I work closely with faculty across reasoning, exploration, and multi-agent learning.
Previously, I developed a bi-level hierarchical RL framework that jointly trains solver models and process reward models, improving both reasoning robustness and sample efficiency. At CMU, I matured this work into a paper, Textual Actor Critic Beyond Training, which is currently under submission to ICML 2026.