SannidhyaDas/README.md

Hi there, I'm Sannidhya!


Medium · LinkedIn · Gmail · Kaggle · Google Scholar · X · Linktree

I’m an M.Sc. Data Science student with a strong foundation in Statistics, passionate about building intelligent systems that create real impact. With hands-on experience in ML, DL, and Generative AI, I love writing clean code and, more importantly, explaining it in a way that makes sense to others.

I specialize in:

  1. Predictive Modeling & ML Pipelines
  2. Multimodal Systems (Text, Image, Audio)
  3. LLMs, RAG, and GenAI Applications

I work in Python, R, and SQL, with expertise in Scikit-learn, XGBoost, Hugging Face Transformers, and tools like Whisper and CLIP. My academic training in Statistics helps me apply complex analytical concepts in real-world ML workflows. I'm currently building projects that blend theory with practice, turning raw data into strategic insights. I thrive on learning, teaching, and collaborating with teams that push the boundaries of AI.

Let’s build something intelligent together.


🧰 Toolbox

Python · R · NumPy · Pandas · Seaborn · Matplotlib · Oracle · MySQL · PostgreSQL · Scikit-learn · Keras · Kaggle · PyTorch · OpenCV · Apache Spark · Hugging Face · Git · Docker · FastAPI · TensorFlow · LangChain · Streamlit · Power BI · Tableau · MongoDB

XGBoost · LightGBM · CatBoost


Research & Publications

My research lies at the intersection of Natural Language Processing, Multimodal Learning, and Mental Health AI, with a focus on weak supervision, representation learning, and foundation-model adaptation under data scarcity.

Identifying Severity of Depression in Forum Posts
Zafar Sarif, Sannidhya Das, Abhishek Das, Md Fahin Parvej, Dipankar Das · 📘 RANLP 2025 (Workshop on NLP & Language Models for Digital Humanities)

🔗 Paper: https://acl-bg.org/proceedings/2025/LM4DH%202025/pdf/2025.lm4dh-1.12.pdf

  • Proposed a two-stage weakly supervised framework for depression severity classification without annotated training data.
  • Used BART-MNLI for zero-shot pseudo-label generation and DistilBERT fine-tuning for multi-class prediction.
  • Demonstrated the effectiveness of zero-shot learning + weak supervision for low-resource mental health NLP tasks.
  • Results: 92% internal accuracy; 28.9% accuracy on the official blind test set.
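
As a rough illustration of the first stage described above, the sketch below shows how zero-shot scores can be turned into pseudo-labels for later fine-tuning. It is not the paper's code: the label names, the `min_conf` threshold, and the `toy_scorer` stand-in are all assumptions made for the example. In practice the scorer would wrap a real zero-shot model (for instance, the Hugging Face `zero-shot-classification` pipeline with a BART-MNLI checkpoint), and the kept pairs would become the training set for the DistilBERT classifier in stage two.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical severity labels; the paper's actual label set may differ.
LABELS = ["minimal", "mild", "moderate", "severe"]

# A scorer maps (text, candidate labels) to a score per label.
Scorer = Callable[[str, List[str]], Dict[str, float]]


def pseudo_label(
    posts: List[str],
    scorer: Scorer,
    labels: List[str] = LABELS,
    min_conf: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Tuple[str, str]]:
    """Stage 1: keep (post, label) pairs whose top score is at least min_conf."""
    kept = []
    for post in posts:
        scores = scorer(post, labels)
        best = max(scores, key=scores.get)
        if scores[best] >= min_conf:
            kept.append((post, best))
    return kept


def toy_scorer(text: str, labels: List[str]) -> Dict[str, float]:
    """Toy stand-in for a zero-shot NLI model, so the sketch runs offline."""
    cues = {"severe": "hopeless", "mild": "tired"}  # made-up cue words
    hit = next(
        (lab for lab, cue in cues.items() if lab in labels and cue in text.lower()),
        None,
    )
    if hit is None:
        # No cue found: spread the score uniformly (low confidence).
        return {lab: 1.0 / len(labels) for lab in labels}
    rest = 0.3 / (len(labels) - 1)
    return {lab: (0.7 if lab == hit else rest) for lab in labels}


posts = ["I feel hopeless every day", "A bit tired lately", "Went for a walk"]
train_pairs = pseudo_label(posts, toy_scorer)
# Stage 2 (not shown): fine-tune a classifier such as DistilBERT on train_pairs.
```

Filtering on confidence is one common way to limit label noise in weak supervision; whether and how the paper filters its pseudo-labels is not stated here.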

Keywords: NLP · Weak Supervision · Mental Health AI · Transformers

From Voice to Vision: A Multimodal Approach to Speech Emotion Recognition
Sannidhya Das, Dipanjan Saha, Subharthi Ray, Sainik Kumar Mahata, Dipankar Das · 📘 SPELLL 2025 (Accepted)

  • Reframed Speech Emotion Recognition using spectrograms as visual surrogates for emotional states.
  • Introduced class-wise PCA to preserve emotion-discriminative acoustic representations.
  • Performed systematic unimodal and multimodal evaluations on the MELD dataset.
  • Explored CLIP + Whisper for modality-aware audio–text fusion under class imbalance.
  • Results: 0.4975 Macro F1 (text+audio); 0.38 Macro F1 (CLIP–Whisper).
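
For readers unfamiliar with the metric reported above, Macro F1 is the unweighted mean of per-class F1 scores, so rare classes count as much as frequent ones. The minimal sketch below computes it in plain Python; the labels and predictions are toy values chosen for illustration, not data from the paper.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every label seen in either sequence."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances scores 0 by convention.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy imbalanced example: a model that always predicts the majority class.
y_true = ["neutral"] * 8 + ["joy"] * 2
y_pred = ["neutral"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this toy case accuracy is 0.8 while Macro F1 is about 0.44, because the missed minority class contributes an F1 of zero. That gap is why Macro F1 is a common choice when classes are imbalanced.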

Keywords: Multimodal Learning · SER · CLIP · Whisper · Affective Computing


Interests:
Agentic Solutions · RAG-LLM Pipelines · Multimodal Representation Learning · Weakly Supervised Learning · Affective Computing · Foundation Models · Speech · NLP


📝 Want to Know More About Me?

Check out my resume here: Resume

Pinned

  1. Prompt2Sql (Public)

    Prompt2SQL is an AI-powered query generation tool that transforms natural language prompts into executable SQL queries using Large Language Models (LLMs). It bridges the gap between non-technical u…

    Jupyter Notebook 1

  2. talk2pdf-multilingual (Public)

    talk2pdf is an AI-powered application that enables seamless, multilingual voice and text interaction with your PDFs. It combines advanced retrieval-augmented generation (RAG), Gemini AI, and speech…

    Python 1

  3. VerbaVista-AI (Public)

    VerbaVista is an LLM-powered Streamlit app that transforms any YouTube video into structured English notes and an interactive chatbot. It automatically fetches, translates, chunks, and embeds trans…

    Python 1

  4. MoSPI-GOIStats-2025 (Public)

    📊 Code & Report Repository (Rank 17): MoSPI Data Visualization Hackathon 2025, Participant: Sannidhya Das

    R

  5. redBus-DataDecode_rank21_Sol (Public)

    Code repository of the Rank 21 solution for the redBus Data Decode Hackathon 2025.

    Jupyter Notebook

  6. COMSYS-5 (Public)

    Forked from Soham-Chaudhuri/COMSYS-5

    🔗 Code Repository: COMSYS Hackathon-5 (2025), Team: The Attention Seekers

    Jupyter Notebook