projects
independent things i've built outside of work.
ai copilot for tallyprime. talk to your tally accounting data in natural language via whatsapp, web, or voice. touchless document automation — photos and pdfs become tally entries via ocr. gst/tds compliance toolkit, live tally connectivity via secure desktop agent. multi-tenant architecture with rbac and audit logs. built at discoverminds.ai.
reverse-engineered google lens image search from scratch. upload an image, get visually similar results via trained neural networks. systematically evaluated 7 architectures, including resnet, efficientnet, clip, siamese networks, triplet networks, and autoencoders. indexed with locality-sensitive hashing (lsh).
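a minimal sketch of how lsh indexing over image embeddings can work, using random-hyperplane hashing for cosine similarity. this is an illustration under assumptions, not the project's actual code: the `LSHIndex` class, its parameters, and the toy 4-dimensional vectors are hypothetical, and the hashing scheme used in the project may differ.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (as normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        int(sum(p * v for p, v in zip(plane, vec)) >= 0.0) for plane in planes
    )


class LSHIndex:
    """Hypothetical single-table LSH index; vectors with equal signatures share a bucket."""

    def __init__(self, dim, num_bits=8, seed=0):
        self.planes = make_hyperplanes(num_bits, dim, seed)
        self.buckets = defaultdict(list)

    def add(self, key, vec):
        self.buckets[signature(vec, self.planes)].append(key)

    def query(self, vec):
        # Return candidate keys from the matching bucket (may be empty).
        return list(self.buckets.get(signature(vec, self.planes), []))


# Toy usage: embeddings would normally come from a trained network.
index = LSHIndex(dim=4, num_bits=8, seed=1)
index.add("cat.jpg", [0.2, -0.5, 0.9, 0.1])
candidates = index.query([0.4, -1.0, 1.8, 0.2])  # same direction, larger norm
```

vectors pointing in similar directions tend to land in the same bucket, so a query only compares against a small candidate set instead of the full collection. real systems usually use several hash tables and re-rank candidates by exact distance.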
fine-tuned openai whisper large v2 for hindi speech recognition over 200 epochs on the google fleurs dataset. deployed on both cpu (huggingface) and gpu (cerebrium, a10). production-ready api with endpoint definitions and sample audio testing utilities.
hiring challenge for sarvam ai that led to a research intern offer at iit madras. semantic chunking to extract and align meaningful audio-text pairs from video content. eda on new testament multilingual audio/text datasets to surface insights for stt/tts optimization.
extensive investigation into gemma and bloom language models for cross-lingual applications. text generation with beam search, top-k sampling, and top-p (nucleus) sampling. quantization via bitsandbytes for efficient loading of large model variants. implemented svcca (google deepmind) on intermediate layer embeddings with pca + t-sne similarity analysis.
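to make the sampling strategies concrete, here is a small library-free sketch of how top-k and top-p filtering reshape a next-token distribution before sampling. the function names and the toy distribution are hypothetical and chosen for illustration; the project itself presumably relied on a generation library rather than code like this.

```python
def renormalize(probs, keep):
    """Zero out every index not in keep and rescale the rest to sum to 1."""
    kept = set(keep)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose total mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, keep)


# Toy next-token distribution over a 4-token vocabulary.
probs = [0.5, 0.3, 0.15, 0.05]
after_top_k = top_k_filter(probs, k=2)    # only the two most likely tokens survive
after_top_p = top_p_filter(probs, p=0.9)  # the least likely token is dropped
```

top-k fixes the number of candidate tokens, while top-p adapts it to how peaked the distribution is: a confident model keeps few tokens, an uncertain one keeps many.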
bias evaluation benchmark for multilingual llms, built at acm india summer school & paradox, iit madras. incremental work inspired by pushpak bhattacharyya's research on bias evaluation across languages. evaluated latest multilingual models for fairness and bias — delivered without internet access during the school. team project with sudipto ghosh, sayak chowdhury, and sanjaykumar rathod.