
Showing 1–50 of 530 results for author: Shashank

  1. arXiv:2510.00922  [pdf, ps, other]

    cs.AI

    On Discovering Algorithms for Adversarial Imitation Learning

    Authors: Shashank Reddy Chirra, Jayden Teoh, Praveen Paruchuri, Pradeep Varakantham

    Abstract: Adversarial Imitation Learning (AIL) methods, while effective in settings with limited expert demonstrations, are often considered unstable. These approaches typically decompose into two components: Density Ratio (DR) estimation $\frac{\rho_E}{\rho_\pi}$, where a discriminator estimates the relative occupancy of state-action pairs under the policy versus the expert; and Reward Assignment (RA), where this…

    Submitted 1 October, 2025; originally announced October 2025.

  2. arXiv:2509.24863  [pdf, ps, other]

    cs.CV

    Vision At Night: Exploring Biologically Inspired Preprocessing For Improved Robustness Via Color And Contrast Transformations

    Authors: Lorena Stracke, Lia Nimmermann, Shashank Agnihotri, Margret Keuper, Volker Blanz

    Abstract: Inspired by the human visual system's mechanisms for contrast enhancement and color-opponency, we explore biologically motivated input preprocessing for robust semantic segmentation. By applying Difference-of-Gaussians (DoG) filtering to RGB, grayscale, and opponent-color channels, we enhance local contrast without modifying model architecture or training. Evaluations on Cityscapes, ACDC, and Dark…

    Submitted 29 September, 2025; originally announced September 2025.

    Comments: Accepted at the ICCV 2025 Workshop on Responsible Imaging

  3. arXiv:2509.23815  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    A Multi-Camera Vision-Based Approach for Fine-Grained Assembly Quality Control

    Authors: Ali Nazeri, Shashank Mishra, Achim Wagner, Martin Ruskowski, Didier Stricker, Jason Rambach

    Abstract: Quality control is a critical aspect of manufacturing, particularly in ensuring the proper assembly of small components in production lines. Existing solutions often rely on single-view imaging or manual inspection, which are prone to errors due to occlusions, restricted perspectives, or lighting inconsistencies. These limitations require the installation of additional inspection stations, which c…

    Submitted 28 September, 2025; originally announced September 2025.

    Comments: 6 pages, 3 figures. Accepted for presentation at EUSIPCO 2025 (European Signal Processing Conference)

    MSC Class: 68T45 ACM Class: I.4.8; I.4.1; I.2.10

  4. arXiv:2509.22631  [pdf, ps, other]

    cs.CV cs.CL

    LABELING COPILOT: A Deep Research Agent for Automated Data Curation in Computer Vision

    Authors: Debargha Ganguly, Sumit Kumar, Ishwar Balappanawar, Weicong Chen, Shashank Kambhatla, Srinivasan Iyengar, Shivkumar Kalyanaraman, Ponnurangam Kumaraguru, Vipin Chaudhary

    Abstract: Curating high-quality, domain-specific datasets is a major bottleneck for deploying robust vision systems, requiring complex trade-offs between data quality, diversity, and cost when researching vast, unlabeled data lakes. We introduce Labeling Copilot, the first data curation deep research agent for computer vision. A central orchestrator agent, powered by a large multimodal language model, uses…

    Submitted 26 September, 2025; originally announced September 2025.

  5. arXiv:2509.22448  [pdf, ps, other]

    cs.CV

    $γ$-Quant: Towards Learnable Quantization for Low-bit Pattern Recognition

    Authors: Mishal Fatima, Shashank Agnihotri, Marius Bock, Kanchana Vaishnavi Gandikota, Kristof Van Laerhoven, Michael Moeller, Margret Keuper

    Abstract: Most pattern recognition models are developed on pre-processed data. In computer vision, for instance, RGB images processed through image signal processing (ISP) pipelines designed to cater to human perception are the most frequent input to image analysis networks. However, many modern vision tasks operate without a human in the loop, raising the question of whether such pre-processing is optima…

    Submitted 26 September, 2025; originally announced September 2025.

    Comments: Accepted at DAGM GCPR 2025

  6. arXiv:2509.14608  [pdf, ps, other]

    cs.CR cs.AI

    Enterprise AI Must Enforce Participant-Aware Access Control

    Authors: Shashank Shreedhar Bhatt, Tanmay Rajore, Khushboo Aggarwal, Ganesh Ananthanarayanan, Ranveer Chandra, Nishanth Chandran, Suyash Choudhury, Divya Gupta, Emre Kiciman, Sumit Kumar Pandey, Srinath Setty, Rahul Sharma, Teijia Zhao

    Abstract: Large language models (LLMs) are increasingly deployed in enterprise settings where they interact with multiple users and are trained or fine-tuned on sensitive internal data. While fine-tuning enhances performance by internalizing domain knowledge, it also introduces a critical security risk: leakage of confidential training data to unauthorized users. These risks are exacerbated when LLMs are co…

    Submitted 18 September, 2025; originally announced September 2025.

  7. arXiv:2509.13348  [pdf, ps, other]

    cs.CY cs.HC

    Towards an AI-Augmented Textbook

    Authors: LearnLM Team, Google, :, Alicia Martín, Amir Globerson, Amy Wang, Anirudh Shekhawat, Anna Iurchenko, Anisha Choudhury, Avinatan Hassidim, Ayça Çakmakli, Ayelet Shasha Evron, Charlie Yang, Courtney Heldreth, Diana Akrong, Gal Elidan, Hairong Mu, Ian Li, Ido Cohen, Katherine Chou, Komal Singh, Lev Borovoi, Lidan Hackmon, Lior Belinsky, Michael Fink, et al. (12 additional authors not shown)

    Abstract: Textbooks are a cornerstone of education, but they have a fundamental limitation: they are a one-size-fits-all medium. Any new material or alternative representation requires arduous human effort, so that textbooks cannot be adapted in a scalable manner. We present an approach for transforming and augmenting textbooks using generative AI, adding layers of multiple representations and personalizati…

    Submitted 30 September, 2025; v1 submitted 13 September, 2025; originally announced September 2025.

  8. arXiv:2509.09622  [pdf]

    cs.IR

    AskDoc -- Identifying Hidden Healthcare Disparities

    Authors: Shashank Gupta

    Abstract: The objective of this study is to understand online Ask the Doctor (AtD) medical advice services through AskDoc, a Reddit community that serves as a public AtD platform, and to study whether such platforms mirror existing hurdles and partiality in healthcare across various demographic groups. We downloaded data from January 2020 to May 2022 from AskDoc, a subreddit, and created regular express…

    Submitted 11 September, 2025; originally announced September 2025.

  9. Mitigating Clinician Information Overload: Generative AI for Integrated EHR and RPM Data Analysis

    Authors: Ankit Shetgaonkar, Dipen Pradhan, Lakshit Arora, Sanjay Surendranath Girija, Shashank Kapoor, Aman Raj

    Abstract: Generative Artificial Intelligence (GenAI), particularly Large Language Models (LLMs), offers powerful capabilities for interpreting the complex data landscape in healthcare. In this paper, we present a comprehensive overview of the capabilities, requirements and applications of GenAI for deriving clinical insights and improving clinical efficiency. We first provide some background on the forms and…

    Submitted 26 August, 2025; originally announced September 2025.

    Comments: Accepted at IEEE COMPSAC 2025

    Journal ref: 2025 IEEE 49th Annual Computers, Software, and Applications Conference (COMPSAC)

  10. arXiv:2508.21693  [pdf, ps, other]

    cs.CV cs.AI cs.CL cs.LG

    Why Stop at Words? Unveiling the Bigger Picture through Line-Level OCR

    Authors: Shashank Vempati, Nishit Anand, Gaurav Talebailkar, Arpan Garai, Chetan Arora

    Abstract: Conventional optical character recognition (OCR) techniques segmented each character and then recognized it. This made them prone to errors in character segmentation and left them without the context needed to exploit language models. Advances in sequence-to-sequence translation in the last decade led to modern techniques that first detect words and then input one word at a time to a model to directly output full words a…

    Submitted 29 August, 2025; originally announced August 2025.

    Comments: 11 pages. Project Website: https://nishitanand.github.io/line-level-ocr-website

  11. arXiv:2508.20453  [pdf, ps, other]

    cs.CL

    MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

    Authors: Zhenting Wang, Qi Chang, Hemani Patel, Shashank Biju, Cheng-En Wu, Quan Liu, Aolin Ding, Alireza Rezazadeh, Ankit Shah, Yujia Bao, Eugene Siow

    Abstract: We introduce MCP-Bench, a benchmark for evaluating large language models (LLMs) on realistic, multi-step tasks that demand tool use, cross-tool coordination, precise parameter control, and planning/reasoning for solving tasks. Built on the Model Context Protocol (MCP), MCP-Bench connects LLMs to 28 representative live MCP servers spanning 250 tools across domains such as finance, traveling, scient…

    Submitted 28 August, 2025; originally announced August 2025.

  12. arXiv:2508.12548  [pdf, ps, other]

    cs.IT cs.DS

    Algorithmic Improvements to List Decoding of Folded Reed-Solomon Codes

    Authors: Vikrant Ashvinkumar, Mursalin Habib, Shashank Srivastava

    Abstract: Folded Reed-Solomon (FRS) codes are a well-studied family of codes, known for achieving list decoding capacity. In this work, we give improved deterministic and randomized algorithms for list decoding FRS codes of rate $R$ up to radius $1-R-\varepsilon$. We present a deterministic decoder that runs in near-linear time $\widetilde{O}_{\varepsilon}(n)$, improving upon the best-known runtime…

    Submitted 20 August, 2025; v1 submitted 17 August, 2025; originally announced August 2025.

    Comments: Fixed an error in Observation 2.15

  13. arXiv:2508.11502  [pdf, ps, other]

    cs.CV

    AIM: Amending Inherent Interpretability via Self-Supervised Masking

    Authors: Eyad Alshami, Shashank Agnihotri, Bernt Schiele, Margret Keuper

    Abstract: It has been observed that deep neural networks (DNNs) often use both genuine and spurious features. In this work, we propose "Amending Inherent Interpretability via Self-Supervised Masking" (AIM), a simple yet interestingly effective method that promotes the network's utilization of genuine features over spurious alternatives without requiring additional annotations. In particular, AIM uses…

    Submitted 15 August, 2025; originally announced August 2025.

    Comments: Accepted at International Conference on Computer Vision (ICCV) 2025

  14. arXiv:2508.10948  [pdf, ps, other]

    cs.LG cs.AI

    Apriel-Nemotron-15B-Thinker

    Authors: Shruthan Radhakrishna, Soham Parikh, Gopal Sarda, Anil Turkkan, Quaizar Vohra, Raymond Li, Dhruv Jhamb, Kelechi Ogueji, Aanjaneya Shukla, Oluwanifemi Bamgbose, Toby Liang, Luke Kumar, Oleksiy Ostapenko, Shiva Krishna Reddy Malay, Aman Tiwari, Tara Bogavelli, Vikas Yadav, Jash Mehta, Saloni Mittal, Akshay Kalkunte, Pulkit Pattnaik, Khalil Slimi, Anirudh Sreeram, Jishnu Nair, Akintunde Oladipo, et al. (10 additional authors not shown)

    Abstract: While large language models (LLMs) have achieved remarkable reasoning capabilities across domains like code, math and other enterprise tasks, their significant memory and computational costs often preclude their use in practical enterprise settings. To this end, we introduce Apriel-Nemotron-15B-Thinker, a 15-billion parameter model in the ServiceNow Apriel SLM series that achieves performance agai…

    Submitted 13 August, 2025; originally announced August 2025.

  15. arXiv:2507.17948  [pdf, ps, other]

    cs.IR cs.AI

    VERIRAG: Healthcare Claim Verification via Statistical Audit in Retrieval-Augmented Generation

    Authors: Shubham Mohole, Hongjun Choi, Shusen Liu, Christine Klymko, Shashank Kushwaha, Derek Shi, Wesam Sakla, Sainyam Galhotra, Ruben Glatt

    Abstract: Retrieval-augmented generation (RAG) systems are increasingly adopted in clinical decision support, yet they remain methodologically blind: they retrieve evidence but cannot vet its scientific quality. A paper claiming "Antioxidant proteins decreased after alloferon treatment" and a rigorous multi-laboratory replication study will be treated as equally credible, even if the former lacked scientific…

    Submitted 23 July, 2025; originally announced July 2025.

  16. arXiv:2507.16761  [pdf, ps, other]

    cs.CV cs.LG

    Faithful, Interpretable Chest X-ray Diagnosis with Anti-Aliased B-cos Networks

    Authors: Marcel Kleinmann, Shashank Agnihotri, Margret Keuper

    Abstract: Faithfulness and interpretability are essential for deploying deep neural networks (DNNs) in safety-critical domains such as medical imaging. B-cos networks offer a promising solution by replacing standard linear layers with a weight-input alignment mechanism, producing inherently interpretable, class-specific explanations without post-hoc methods. While maintaining diagnostic performance competit…

    Submitted 24 July, 2025; v1 submitted 22 July, 2025; originally announced July 2025.

  17. arXiv:2507.15576  [pdf, ps, other]

    cs.CL cs.CV

    Smart Eyes for Silent Threats: VLMs and In-Context Learning for THz Imaging

    Authors: Nicolas Poggi, Shashank Agnihotri, Margret Keuper

    Abstract: Terahertz (THz) imaging enables non-invasive analysis for applications such as security screening and material classification, but effective image classification remains challenging due to limited annotations, low resolution, and visual ambiguity. We introduce In-Context Learning (ICL) with Vision-Language Models (VLMs) as a flexible, interpretable alternative that requires no fine-tuning. Using a…

    Submitted 21 July, 2025; originally announced July 2025.

  18. arXiv:2507.10564  [pdf, ps, other]

    cs.LG cs.AI eess.SP stat.ML

    Tool-to-Tool Matching Analysis Based Difference Score Computation Methods for Semiconductor Manufacturing

    Authors: Sameera Bharadwaja H., Siddhrath Jandial, Shashank S. Agashe, Rajesh Kumar Reddy Moore, Youngkwan Kim

    Abstract: We consider the problem of tool-to-tool matching (TTTM), also called chamber matching, in the context of semiconductor manufacturing equipment. Traditional TTTM approaches utilize static configuration data or depend on a golden reference, which are difficult to obtain in a commercial manufacturing line. Further, existing methods do not extend very well to a heterogeneous setting, where equipment…

    Submitted 7 July, 2025; originally announced July 2025.

  19. arXiv:2507.09001  [pdf, ps, other]

    cond-mat.mtrl-sci cond-mat.dis-nn cs.LG physics.comp-ph quant-ph

    Surprisingly High Redundancy in Electronic Structure Data

    Authors: Sazzad Hossain, Ponkrshnan Thiagarajan, Shashank Pathrudkar, Stephanie Taylor, Abhijeet S. Gangan, Amartya S. Banerjee, Susanta Ghosh

    Abstract: Machine Learning (ML) models for electronic structure rely on large datasets generated through expensive Kohn-Sham Density Functional Theory simulations. This study reveals a surprisingly high level of redundancy in such datasets across various material systems, including molecules, simple metals, and complex alloys. Our findings challenge the prevailing assumption that large, exhaustive datasets…

    Submitted 11 July, 2025; originally announced July 2025.

  20. arXiv:2507.08836  [pdf, ps, other]

    cs.LG cs.PF

    Accuracy and Consumption analysis from a compressed model by CompactifAI from Multiverse Computing

    Authors: Damien Fovet, Shashank Chamoli, Sarah Oury, Srishti Singhal

    Abstract: This study evaluates the performance of a compression method, called CompactifAI, developed by Multiverse Computing, applied to the large language model Llama 3.1 8B. The evaluation focused on model efficiency (in terms of energy consumption) and accuracy, using the frameworks Codecarbon and Ragas, respectively. A comparison was performed between the model co…

    Submitted 7 July, 2025; originally announced July 2025.

  21. arXiv:2507.06261  [pdf, ps, other]

    cs.CL cs.AI

    Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

    Authors: Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, Luke Marris, Sam Petulla, Colin Gaffney, Asaf Aharoni, Nathan Lintz, Tiago Cardal Pais, Henrik Jacobsson, Idan Szpektor, Nan-Jiang Jiang, Krishna Haridasan, Ahmed Omran, Nikunj Saunshi, Dara Bahri, Gaurav Mishra, Eric Chu, et al. (3284 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal unde…

    Submitted 22 July, 2025; v1 submitted 7 July, 2025; originally announced July 2025.

    Comments: 72 pages, 17 figures

  22. arXiv:2507.05577  [pdf, ps, other]

    cs.IR cs.CL cs.LG

    Beyond Retrieval: Ensembling Cross-Encoders and GPT Rerankers with LLMs for Biomedical QA

    Authors: Shashank Verma, Fengyi Jiang, Xiangning Xue

    Abstract: Biomedical semantic question answering rooted in information retrieval can play a crucial role in keeping up to date with vast, rapidly evolving and ever-growing biomedical literature. A robust system can help researchers, healthcare professionals and even layman users access relevant knowledge grounded in evidence. The BioASQ 2025 Task13b Challenge serves as an important benchmark, offering a com…

    Submitted 7 July, 2025; originally announced July 2025.

    Comments: Paper submitted to CLEF 2025 CEUR-WS

  23. arXiv:2506.23014  [pdf, ps, other]

    cs.SE cs.AI

    Generating Privacy Stories From Software Documentation

    Authors: Wilder Baldwin, Shashank Chintakuntla, Shreyah Parajuli, Ali Pourghasemi, Ryan Shanz, Sepideh Ghanavati

    Abstract: Research shows that analysts and developers consider privacy as a security concept or as an afterthought, which may lead to non-compliance and violation of users' privacy. Most current approaches, however, focus on extracting legal requirements from the regulations and evaluating the compliance of software and processes with them. In this paper, we develop a novel approach based on chain-of-though…

    Submitted 28 June, 2025; originally announced June 2025.

    Comments: Accepted to RENext!'25 at the 33rd IEEE International Requirements Engineering 2025 conference

  24. Towards Two-Stage Counterfactual Learning to Rank

    Authors: Shashank Gupta, Yiming Liao, Maarten de Rijke

    Abstract: Counterfactual learning to rank (CLTR) aims to learn a ranking policy from user interactions while correcting for the inherent biases in interaction data, such as position bias. Existing CLTR methods assume a single ranking policy that selects a top-K ranking from the entire document candidate set. In real-world applications, the candidate document set is on the order of millions, making a single-st…

    Submitted 12 July, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

    Comments: Accepted at ICTIR 2025 (co-located with SIGIR 2025)

  25. arXiv:2506.19035  [pdf, ps, other]

    cs.LG

    Failure Modes of Time Series Interpretability Algorithms for Critical Care Applications and Potential Solutions

    Authors: Shashank Yadav, Vignesh Subbian

    Abstract: Interpretability plays a vital role in aligning and deploying deep learning models in critical care, especially in constantly evolving conditions that influence patient survival. However, common interpretability algorithms face unique challenges when applied to dynamic prediction tasks, where patient trajectories evolve over time. Gradient, Occlusion, and Permutation-based methods often struggle w…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 13 pages, 10 figures, Accepted at the AMIA Annual Symposium 2025. The final version will appear in the official proceedings

  26. arXiv:2506.17180  [pdf, ps, other]

    cs.CL

    CLEAR-3K: Assessing Causal Explanatory Capabilities in Language Models

    Authors: Naiming Liu, Richard Baraniuk, Shashank Sonkar

    Abstract: We introduce CLEAR-3K, a dataset of 3,000 assertion-reasoning questions designed to evaluate whether language models can determine if one statement causally explains another. Each question presents an assertion-reason pair and challenges language models to distinguish between semantic relatedness and genuine causal explanatory relationships. Through comprehensive evaluation of 21 state-of-the-art la…

    Submitted 20 June, 2025; originally announced June 2025.

  27. arXiv:2506.15524  [pdf, ps, other]

    cs.CV

    NTIRE 2025 Image Shadow Removal Challenge Report

    Authors: Florin-Alexandru Vasluianu, Tim Seizinger, Zhuyun Zhou, Cailian Chen, Zongwei Wu, Radu Timofte, Mingjia Li, Jin Hu, Hainuo Wang, Hengxing Liu, Jiarui Wang, Qiming Hu, Xiaojie Guo, Xin Lu, Jiarong Yang, Yuanfei Bao, Anya Hu, Zihao Fan, Kunyu Wang, Jie Xiao, Xi Wang, Xueyang Fu, Zheng-Jun Zha, Yu-Fan Lin, Chia-Ming Lee, et al. (57 additional authors not shown)

    Abstract: This work examines the findings of the NTIRE 2025 Shadow Removal Challenge. A total of 306 participants have registered, with 17 teams successfully submitting their solutions during the final evaluation phase. Following the last two editions, this challenge had two evaluation tracks: one focusing on reconstruction fidelity and the other on visual perception through a user study. Both tracks were e…

    Submitted 18 June, 2025; originally announced June 2025.

  28. arXiv:2506.11586  [pdf, other]

    cs.CR cs.LG

    SecONNds: Secure Outsourced Neural Network Inference on ImageNet

    Authors: Shashank Balla

    Abstract: The widespread adoption of outsourced neural network inference presents significant privacy challenges, as sensitive user data is processed on untrusted remote servers. Secure inference offers a privacy-preserving solution, but existing frameworks suffer from high computational overhead and communication costs, rendering them impractical for real-world deployment. We introduce SecONNds, a non-intr…

    Submitted 13 June, 2025; originally announced June 2025.

  29. arXiv:2506.10231  [pdf, ps, other]

    cs.CL

    Classifying Unreliable Narrators with Large Language Models

    Authors: Anneliese Brei, Katharine Henry, Abhisheik Sharma, Shashank Srivastava, Snigdha Chaturvedi

    Abstract: Often when we interact with a first-person account of events, we consider whether or not the narrator, the primary speaker of the text, is reliable. In this paper, we propose using computational methods to identify unreliable narrators, i.e. those who unintentionally misrepresent information. Borrowing literary theory from narratology to define different types of unreliable narrators based on a va…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: ACL 2025

  30. arXiv:2506.09661  [pdf, ps, other]

    eess.IV cs.CV q-bio.TO

    A Cytology Dataset for Early Detection of Oral Squamous Cell Carcinoma

    Authors: Garima Jain, Sanghamitra Pati, Mona Duggal, Amit Sethi, Abhijeet Patil, Gururaj Malekar, Nilesh Kowe, Jitender Kumar, Jatin Kashyap, Divyajeet Rout, Deepali, Hitesh, Nishi Halduniya, Sharat Kumar, Heena Tabassum, Rupinder Singh Dhaliwal, Sucheta Devi Khuraijam, Sushma Khuraijam, Sharmila Laishram, Simmi Kharb, Sunita Singh, K. Swaminadtan, Ranjana Solanki, Deepika Hemranjani, Shashank Nath Singh, et al. (12 additional authors not shown)

    Abstract: Oral squamous cell carcinoma (OSCC) is a major global health burden, particularly in several regions across Asia, Africa, and South America, where it accounts for a significant proportion of cancer cases. Early detection dramatically improves outcomes, with stage I cancers achieving up to 90 percent survival. However, traditional diagnosis based on histopathology has limited accessibility in low-res…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 7 pages, 2 figures

  31. arXiv:2506.06790  [pdf, ps, other]

    quant-ph cs.NE

    Adam assisted Fully informed Particle Swarm Optimization (Adam-FIPSO) based Parameter Prediction for the Quantum Approximate Optimization Algorithm (QAOA)

    Authors: Shashank Sanjay Bhat, Peiyong Wang, Udaya Parampalli

    Abstract: The Quantum Approximate Optimization Algorithm (QAOA) is a prominent variational algorithm used for solving combinatorial optimization problems such as the Max-Cut problem. A key challenge in QAOA lies in efficiently identifying suitable parameters (gamma, beta) that lead to high-quality solutions. In this paper, we propose a framework that combines Fully Informed Particle Swarm Optimization (FIPS…

    Submitted 6 August, 2025; v1 submitted 7 June, 2025; originally announced June 2025.

  32. arXiv:2506.06699  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    MarginSel: Max-Margin Demonstration Selection for LLMs

    Authors: Rajeev Bhatt Ambati, James Lester, Shashank Srivastava, Snigdha Chaturvedi

    Abstract: Large Language Models (LLMs) excel at few-shot learning via in-context learning (ICL). However, the effectiveness of ICL is often sensitive to the selection and ordering of demonstration examples. To address this, we present MarginSel: Max-Margin Demonstration Selection for LLMs, a two-step method that selects hard demonstration examples for the ICL prompt, adapting to each test instance. Our appr…

    Submitted 7 June, 2025; originally announced June 2025.

  33. arXiv:2506.00316  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Active Learning via Regression Beyond Realizability

    Authors: Atul Ganju, Shashaank Aiyer, Ved Sriraman, Karthik Sridharan

    Abstract: We present a new active learning framework for multiclass classification based on surrogate risk minimization that operates beyond the standard realizability assumption. Existing surrogate-based active learning algorithms crucially rely on realizability (the assumption that the optimal surrogate predictor lies within the model class), limiting their applicability in pr…

    Submitted 30 May, 2025; originally announced June 2025.

  34. arXiv:2505.24477  [pdf, ps, other]

    cs.CY cs.AI cs.LG

    Evaluating Gemini in an arena for learning

    Authors: LearnLM Team, Abhinit Modi, Aditya Srikanth Veerubhotla, Aliya Rysbek, Andrea Huber, Ankit Anand, Avishkar Bhoopchand, Brett Wiltshire, Daniel Gillick, Daniel Kasenberg, Eleni Sgouritsa, Gal Elidan, Hengrui Liu, Holger Winnemoeller, Irina Jurenka, James Cohan, Jennifer She, Julia Wilkowski, Kaiz Alarakyia, Kevin R. McKee, Komal Singh, Lisa Wang, Markus Kunesch, Miruna Pîslar, Niv Efron, et al. (12 additional authors not shown)

    Abstract: Artificial intelligence (AI) is poised to transform education, but the research community lacks a robust, general benchmark to evaluate AI models for learning. To assess state-of-the-art support for educational use cases, we ran an "arena for learning" where educators and pedagogy experts conduct blind, head-to-head, multi-turn comparisons of leading AI models. In particular, $N = 189$ educators d…

    Submitted 30 May, 2025; originally announced May 2025.

  35. arXiv:2505.21410  [pdf, ps, other]

    cs.AI cs.LG cs.RO

    MRSD: Multi-Resolution Skill Discovery for HRL Agents

    Authors: Shashank Sharma, Janina Hoffmann, Vinay Namboodiri

    Abstract: Hierarchical reinforcement learning (HRL) relies on abstract skills to solve long-horizon tasks efficiently. While existing skill discovery methods learn these skills automatically, they are limited to a single skill per task. In contrast, humans learn and use both fine-grained and coarse motor skills simultaneously. Inspired by human motor control, we propose Multi-Resolution Skill Discovery (MR…

    Submitted 27 May, 2025; originally announced May 2025.

  36. arXiv:2505.18681  [pdf, ps, other]

    cs.DC

    EvoSort: A Genetic-Algorithm-Based Adaptive Parallel Sorting Framework for Large-Scale High Performance Computing

    Authors: Shashank Raj, Kalyanmoy Deb

    Abstract: In today's era of big data, sorting enormous datasets is a major challenge. We present EvoSort, an adaptive parallel sorting framework that employs a Genetic Algorithm (GA) to automatically discover and refine critical parameters, including insertion sort and fallback thresholds, tile size, and mergesort vs Least Significant Digit (LSD) radix sort. EvoSort integrates parallel sorting primitives an…

    Submitted 24 May, 2025; originally announced May 2025.

    Comments: Under review at the International Journal of Parallel, Emergent and Distributed Systems

  37. arXiv:2505.18015  [pdf, ps, other]

    cs.CV cs.LG

    SemSegBench & DetecBench: Benchmarking Reliability and Generalization Beyond Classification

    Authors: Shashank Agnihotri, David Schader, Jonas Jakubassa, Nico Sharei, Simon Kral, Mehmet Ege Kaçar, Ruben Weber, Margret Keuper

    Abstract: Reliability and generalization in deep learning are predominantly studied in the context of image classification. Yet, real-world applications in safety-critical domains involve a broader set of semantic tasks, such as semantic segmentation and object detection, which come with a diverse set of dedicated model architectures. To facilitate research towards robust model design in segmentation and de…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: First seven listed authors have equal contribution. GitHub: https://github.com/shashankskagnihotri/benchmarking_reliability_generalization. arXiv admin note: text overlap with arXiv:2505.05091

  38. arXiv:2505.16741  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Meta-reinforcement learning with minimum attention

    Authors: Pilhwa Lee, Shashank Gupta

    Abstract: Minimum attention, first proposed by Brockett, applies the least action principle to changes of control with respect to state and time. The involved regularization is highly relevant in emulating biological control, such as motor learning. We apply minimum attention in reinforcement learning (RL) as part of the rewards and investigate its connection to meta-learning and stabilization. Specifically,…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 10 pages, 7 figures

  39. arXiv:2505.15505  [pdf, other]

    eess.IV cs.CV

    Deep Learning Enabled Segmentation, Classification and Risk Assessment of Cervical Cancer

    Authors: Abdul Samad Shaik, Shashaank Mattur Aswatha, Rahul Jashvantbhai Pandya

    Abstract: Cervical cancer, the fourth leading cause of cancer in women globally, requires early detection through Pap smear tests to identify precancerous changes and prevent disease progression. In this study, we performed a focused analysis by segmenting the cellular boundaries and drawing bounding boxes to isolate the cancer cells. A novel Deep Learning (DL) architecture, the "Multi-Resolution Fusion De…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: 11 pages, 10 figures

  40. arXiv:2505.14983  [pdf, ps, other]

    cs.AI cs.HC cs.RO

    Toward Informed AV Decision-Making: Computational Model of Well-being and Trust in Mobility

    Authors: Zahra Zahedi, Shashank Mehrotra, Teruhisa Misu, Kumar Akash

    Abstract: For future human-autonomous vehicle (AV) interactions to be effective and smooth, human-aware systems that analyze and align human needs with automation decisions are essential. Achieving this requires systems that account for human cognitive states. We present a novel computational model in the form of a Dynamic Bayesian Network (DBN) that infers the cognitive states of both AV users and other ro…

    Submitted 20 May, 2025; originally announced May 2025.

  41. arXiv:2505.09368  [pdf, ps, other]

    cs.CV cs.LG

    RobustSpring: Benchmarking Robustness to Image Corruptions for Optical Flow, Scene Flow and Stereo

    Authors: Jenny Schmalfuss, Victor Oei, Lukas Mehl, Madlen Bartsch, Shashank Agnihotri, Margret Keuper, Andrés Bruhn

    Abstract: Standard benchmarks for optical flow, scene flow, and stereo vision algorithms generally focus on model accuracy rather than robustness to image corruptions like noise or rain. Hence, the resilience of models to such real-world perturbations is largely unquantified. To address this, we present RobustSpring, a comprehensive dataset and benchmark for evaluating robustness to image corruptions for op…

    Submitted 14 May, 2025; originally announced May 2025.

  42. AI and Generative AI Transforming Disaster Management: A Survey of Damage Assessment and Response Techniques

    Authors: Aman Raj, Lakshit Arora, Sanjay Surendranath Girija, Shashank Kapoor, Dipen Pradhan, Ankit Shetgaonkar

    Abstract: Natural disasters, including earthquakes, wildfires and cyclones, pose a huge risk to human lives as well as infrastructure assets. An effective response to disaster depends on the ability to rapidly and efficiently assess the intensity of damage. Artificial Intelligence (AI) and Generative Artificial Intelligence (GenAI) present a breakthrough solution, capable of combining knowledge from multip…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: Accepted in IEEE COMPSAC 2025

    Journal ref: 2025 IEEE 49th Annual Computers, Software, and Applications Conference (COMPSAC)

  43. Opportunities and Applications of GenAI in Smart Cities: A User-Centric Survey

    Authors: Ankit Shetgaonkar, Dipen Pradhan, Lakshit Arora, Sanjay Surendranath Girija, Shashank Kapoor, Aman Raj

    Abstract: The proliferation of IoT in cities, combined with Digital Twins, creates a rich data foundation for Smart Cities aimed at improving urban life and operations. Generative AI (GenAI) significantly enhances this potential, moving beyond traditional AI analytics and predictions by processing multimodal content and generating novel outputs like text and simulations. Using specialized or foundational mo…

    Submitted 14 August, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

    Comments: Accepted in IEEE COINS 2025

    Journal ref: 2025 IEEE International Conference on Omni-layer Intelligent Systems (COINS)

  44. Explainable Artificial Intelligence Techniques for Software Development Lifecycle: A Phase-specific Survey

    Authors: Lakshit Arora, Sanjay Surendranath Girija, Shashank Kapoor, Aman Raj, Dipen Pradhan, Ankit Shetgaonkar

    Abstract: Artificial Intelligence (AI) is rapidly expanding and integrating more into daily life to automate tasks, guide decision making, and enhance efficiency. However, complex AI models, which make decisions without providing clear explanations (known as the "black-box problem"), currently restrict trust and widespread adoption of AI. Explainable Artificial Intelligence (XAI) has emerged to address the…

    Submitted 11 May, 2025; originally announced May 2025.

    Comments: Accepted to IEEE COMPSAC 2025

    Journal ref: 2025 IEEE 49th Annual Computers, Software, and Applications Conference (COMPSAC)

  45. arXiv:2505.06638  [pdf, other]

    cs.RO

    3D Characterization of Smoke Plume Dispersion Using Multi-View Drone Swarm

    Authors: Nikil Krishnakumar, Shashank Sharma, Srijan Kumar Pal, Jiarong Hong

    Abstract: This study presents an advanced multi-view drone swarm imaging system for the three-dimensional characterization of smoke plume dispersion dynamics. The system comprises a manager drone and four worker drones, each equipped with high-resolution cameras and precise GPS modules. The manager drone uses image feedback to autonomously detect and position itself above the plume, then commands the worker…

    Submitted 10 May, 2025; originally announced May 2025.

    Comments: 10 pages, 8 figures

  46. arXiv:2505.05091  [pdf, ps, other]

    cs.CV cs.LG

    DispBench: Benchmarking Disparity Estimation to Synthetic Corruptions

    Authors: Shashank Agnihotri, Amaan Ansari, Annika Dackermann, Fabian Rösch, Margret Keuper

    Abstract: Deep learning (DL) has surpassed human performance on standard benchmarks, driving its widespread adoption in computer vision tasks. One such task is disparity estimation, estimating the disparity between matching pixels in stereo image pairs, which is crucial for safety-critical applications like medical surgeries and autonomous navigation. However, DL-based disparity estimation methods are highl…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: Accepted at CVPR 2025 Workshop on Synthetic Data for Computer Vision

  47. arXiv:2505.05043  [pdf, other]

    cs.CV

    xTrace: A Facial Expressive Behaviour Analysis Tool for Continuous Affect Recognition

    Authors: Mani Kumar Tellamekala, Shashank Jaiswal, Thomas Smith, Timur Alamev, Gary McKeown, Anthony Brown, Michel Valstar

    Abstract: Recognising expressive behaviours in face videos is a long-standing challenge in Affective Computing. Despite significant advancements in recent years, it still remains a challenge to build a robust and reliable system for naturalistic and in-the-wild facial expressive behaviour analysis in real time. This paper addresses two key challenges in building such a system: (1). The paucity of large-scal…

    Submitted 8 May, 2025; originally announced May 2025.

  48. arXiv:2505.04835  [pdf, ps, other]

    cs.CV

    Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions?

    Authors: Shashank Agnihotri, David Schader, Nico Sharei, Mehmet Ege Kaçar, Margret Keuper

    Abstract: Deep learning (DL) models are widely used in real-world applications but remain vulnerable to distribution shifts, especially due to weather and lighting changes. Collecting diverse real-world data for testing the robustness of DL models is resource-intensive, making synthetic corruptions an attractive alternative for robustness testing. However, are synthetic corruptions a reliable proxy for real…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: Accepted at CVPR 2025 Workshop on Synthetic Data for Computer Vision

  49. arXiv:2505.03771  [pdf, other]

    cs.AR cs.MA

    OneDSE: A Unified Microprocessor Metric Prediction and Design Space Exploration Framework

    Authors: Ritik Raj, Akshat Ramachandran, Jeff Nye, Shashank Nemawarkar, Tushar Krishna

    Abstract: With the diminishing returns of Moore's Law scaling and as power constraints become more impactful, processor designs rely on architectural innovation to achieve differentiating performance. Innovation complexity has increased the design space of modern high-performance processors. This work offers an efficient and novel design space exploration (DSE) solution to these challenges of modern CPU desig…

    Submitted 29 April, 2025; originally announced May 2025.

  50. Adversarial Attacks in Multimodal Systems: A Practitioner's Survey

    Authors: Shashank Kapoor, Sanjay Surendranath Girija, Lakshit Arora, Dipen Pradhan, Ankit Shetgaonkar, Aman Raj

    Abstract: The introduction of multimodal models is a huge step forward in Artificial Intelligence. A single model is trained to understand multiple modalities: text, image, video, and audio. Open-source multimodal models have made these breakthroughs more accessible. However, considering the vast landscape of adversarial attacks across these modalities, these models also inherit vulnerabilities of all the m…

    Submitted 5 May, 2025; originally announced May 2025.

    Comments: Accepted in IEEE COMPSAC 2025

    Journal ref: 2025 IEEE 49th Annual Computers, Software, and Applications Conference (COMPSAC)