Updated on 2024.09.24

Table of Contents
  1. <a href=#peft>PEFT</a>
  2. <a href=#text-to-image-generation>Text-to-Image Generation</a>
  3. <a href=#vision-language-models>Vision-Language Models</a>
  4. <a href=#generative-weight-space-modeling>Generative Weight Space Modeling</a>
  5. <a href=#data-distillation>Data Distillation</a>
  6. <a href=#schrodinger-bridge>Schrodinger Bridge</a>

PEFT

Publish Date Title Authors PDF Code
2024-09-17 THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models Mengfei Liang et.al. 2409.11353 null
2024-09-17 LPT++: Efficient Training on Mixture of Long-tailed Experts Bowen Dong et.al. 2409.11323 null
2024-09-17 Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models Divij Gupta et.al. 2409.11302 null
2024-09-18 Propulsion: Steering LLM with Tiny Fine-Tuning Md Kowsher et.al. 2409.10927 link
2024-09-16 From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs Navya Jain et.al. 2409.10245 null
2024-09-14 COMFORT: A Continual Fine-Tuning Framework for Foundation Models Targeted at Consumer Healthcare Chia-Hao Li et.al. 2409.09549 null
2024-09-14 Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models Alireza Salemi et.al. 2409.09510 link
2024-09-13 Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights Dixi Yao et.al. 2409.08482 null
2024-09-12 Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation? Kerem Cekmeceli et.al. 2409.07960 link
2024-09-11 Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region Muhammad Akhtar Munir et.al. 2409.07585 link
2024-09-10 Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts Assefa Seyoum Wahd et.al. 2409.06821 null
2024-09-11 Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models Yao Shu et.al. 2409.06277 link
2024-09-09 SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values Chengwei Sun et.al. 2409.05926 null
2024-09-10 Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment Zhixian Zhao et.al. 2409.05015 null
2024-09-06 Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning Xinyue Liu et.al. 2409.04574 null
2024-09-04 iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation Hayeon Jo et.al. 2409.02838 null
2024-09-04 Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs Ruoyu Wang et.al. 2409.02686 null
2024-09-04 Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA Shuangyi Chen et.al. 2409.02346 null
2024-09-02 Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning Chongjie Si et.al. 2409.01035 link
2024-08-28 3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability Baohao Liao et.al. 2409.00119 null
2024-08-21 SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models Yang Cao et.al. 2409.00055 link
2024-08-30 MoRe Fine-Tuning with 10x Fewer Parameters Wenxuan Tan et.al. 2408.17383 link
2024-09-02 Instant Adversarial Purification with Adversarial Consistency Distillation Chun Tong Lei et.al. 2408.17064 null
2024-08-28 Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization Léo Hemamou et.al. 2408.15801 null
2024-08-27 GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs Maxim Zhelnin et.al. 2408.15300 link
2024-08-27 Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training Xingliang Lei et.al. 2408.15011 null
2024-08-27 CVPT: Cross-Attention help Visual Prompt Tuning adapt visual task Lingyun Huang et.al. 2408.14961 link
2024-08-27 Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models Aradhye Agarwal et.al. 2408.14470 link
2024-08-24 Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings Sagar Srinivas Sakhinana et.al. 2408.13622 null
2024-08-21 Positional Prompt Tuning for Efficient 3D Representation Learning Shaochen Zhang et.al. 2408.11567 link
2024-08-20 Pluto and Charon: A Time and Memory Efficient Collaborative Edge AI Framework for Personal LLMs Fine-Tuning Bei Ouyang et.al. 2408.10746 null
2024-08-20 TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning Bin Wang et.al. 2408.10688 link
2024-08-19 TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition Tianwei Lin et.al. 2408.09856 link
2024-08-16 Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models Vladimir Araujo et.al. 2408.09053 null
2024-08-14 KIND: Knowledge Integration and Diversion in Diffusion Models Yucheng Xie et.al. 2408.07337 null
2024-08-30 TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning Yujie Feng et.al. 2408.05200 link
2024-08-08 Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models Yupeng Chang et.al. 2408.04556 link
2024-08-06 SARA: Singular-Value Based Adaptive Low-Rank Adaption Jihao Gu et.al. 2408.03290 null
2024-08-06 Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi Pranita Deshmukh et.al. 2408.03172 null
2024-08-03 TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks Yang Yu et.al. 2408.01835 link
2024-08-02 MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts Lin Ning et.al. 2408.01505 null
2024-08-02 Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs Afia Anjum et.al. 2408.01008 null
2024-07-31 A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation Mothilal Asokan et.al. 2407.21739 null
2024-07-28 Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models Jifeng Wang et.al. 2407.19564 link
2024-07-24 Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective Jingren Liu et.al. 2407.17120 null
2024-07-22 Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders Laura Niss et.al. 2407.15731 null
2024-07-21 Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization Jiajun Hu et.al. 2407.15085 null
2024-07-16 InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification Yujia Hu et.al. 2407.12882 link
2024-07-18 Turning Generative Models Degenerate: The Power of Data Poisoning Attacks Shuli Jiang et.al. 2407.12281 null
2024-07-16 Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification Naif Alkhunaizi et.al. 2407.11573 null
2024-07-16 An efficient framework based on large foundation model for cervical cytopathology whole slide image screening Jialong Huang et.al. 2407.11486 link
2024-07-10 RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Xijie Huang et.al. 2407.08044 link
2024-07-10 ROSA: Random Subspace Adaptation for Efficient Fine-Tuning Marawan Gamal Abdel Hameed et.al. 2407.07802 link
2024-07-10 Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction Yumin Kim et.al. 2407.07517 null
2024-07-09 Reprogramming Distillation for Medical Foundation Models Yuhang Zhou et.al. 2407.06504 null
2024-07-07 See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition Chongjie Si et.al. 2407.05417 link
2024-07-16 LoRA-GA: Low-Rank Adaptation with Gradient Approximation Shaowen Wang et.al. 2407.05000 link
2024-07-05 GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning Aleksander Ficek et.al. 2407.04528 null
2024-07-04 Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models Vorakit Vorakitphan et.al. 2407.04050 link
2024-07-04 ASteISR: Adapting Single Image Super-resolution Pre-trained Model for Efficient Stereo Image Super-resolution Yuanbo Zhou et.al. 2407.03598 null
2024-07-03 Knowledge Composition using Task Vectors with Learned Anisotropic Scaling Frederic Z. Zhang et.al. 2407.02880 link
2024-07-03 Exploring the Capabilities of LLMs for Code Change Related Tasks Lishui Fan et.al. 2407.02824 link
2024-07-02 FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs Haodong Chen et.al. 2407.02157 null
2024-07-02 CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications Yupeng Cao et.al. 2407.01953 null
2024-07-05 Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models Zihan Wang et.al. 2407.01906 link
2024-07-01 A Fingerprint for Large Language Models Zhiguang Yang et.al. 2407.01235 null
2024-07-02 Embedded Prompt Tuning: Towards Enhanced Calibration of Pretrained Models for Medical Images Wenqiang Zu et.al. 2407.01003 link
2024-06-25 Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning Arijit Sehanobish et.al. 2406.17740 null
2024-06-19 Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks Liangxin Qian et.al. 2406.13602 null
2024-06-19 Sparse High Rank Adapters Kartikeya Bhardwaj et.al. 2406.13175 null
2024-06-18 Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates Cristian Meo et.al. 2406.13046 null
2024-06-18 Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation Branislav Pecher et.al. 2406.12471 null
2024-06-17 A Semantic-based Layer Freezing Approach to Efficient Fine-Tuning of Language Models Jian Gu et.al. 2406.11753 null
2024-06-16 ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts Samar Khanna et.al. 2406.10973 null
2024-06-16 ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation Yurun Song et.al. 2406.10785 null
2024-06-16 RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning Haoyu Wang et.al. 2406.10777 null
2024-06-15 Benchmarking Children’s ASR with Supervised and Self-supervised Speech Foundation Models Ruchao Fan et.al. 2406.10507 link
2024-06-15 Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts Zhaoxuan Tan et.al. 2406.10471 null
2024-06-13 Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models Lukas Thede et.al. 2406.09384 null
2024-06-12 Exploring Fact Memorization and Style Imitation in LLMs Using QLoRA: An Experimental Study and Quality Assessment Methods Eugene Vyborov et.al. 2406.08582 null
2024-06-12 The Impact of Initialization on LoRA Finetuning Dynamics Soufiane Hayou et.al. 2406.08447 null
2024-06-20 Low-Rank Quantization-Aware Training for LLMs Yelysei Bondarenko et.al. 2406.06385 link
2024-06-10 A Parameter-efficient Language Extension Framework for Multilingual ASR Wei Liu et.al. 2406.06329 null
2024-06-09 A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair Guochang Li et.al. 2406.05639 link
2024-06-07 Efficient Differentially Private Fine-Tuning of Diffusion Models Jing Liu et.al. 2406.05257 null
2024-06-07 CorDA: Context-Oriented Decomposition Adaptation of Large Language Models Yibo Yang et.al. 2406.05223 null
2024-06-07 An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models Xiongtao Zhou et.al. 2406.05130 null
2024-06-07 MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter Jitai Hao et.al. 2406.04984 link
2024-06-06 Time Sensitive Knowledge Editing through Efficient Finetuning Xiou Ge et.al. 2406.04496 link
2024-06-06 VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation Prashanth Vijayaraghavan et.al. 2406.04379 null
2024-06-10 Hypernetworks for Personalizing ASR to Atypical Speech Max Müller-Eberstein et.al. 2406.04240 null
2024-06-06 Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning Naibin Gu et.al. 2406.03792 link
2024-06-05 Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need Martin Wistuba et.al. 2406.03216 null
2024-06-06 Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision Minglei Li et.al. 2406.03051 null
2024-05-31 Mamba State-Space Models Can Be Strong Downstream Learners John T. Halloran et.al. 2406.00209 null
2024-05-30 ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections Massimo Bini et.al. 2405.20271 link
2024-05-30 SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors Vijay Lingam et.al. 2405.19597 link
2024-05-29 MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection Raman Dutt et.al. 2405.19458 null
2024-05-29 MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning Junjie Wang et.al. 2405.18897 null
2024-05-29 Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation Zelin Peng et.al. 2405.18840 null
2024-06-01 Low-Rank Few-Shot Adaptation of Vision-Language Models Maxime Zanella et.al. 2405.18541 null
2024-05-28 Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning Renzhi Wang et.al. 2405.18292 null
2024-05-28 VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections Roy Miles et.al. 2405.17991 null
2024-05-28 Sparsity- and Hybridity-Inspired Visual Parameter-Efficient Fine-Tuning for Medical Diagnosis Mingyuan Liu et.al. 2405.17877 null
2024-05-27 LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters Klaudia Bałazy et.al. 2405.17604 link
2024-05-23 EMR-Merging: Tuning-Free High-Performance Model Merging Chenyu Huang et.al. 2405.17461 null
2024-05-28 DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution Yulong Mao et.al. 2405.17357 link
2024-05-27 Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning Runqian Wang et.al. 2405.17258 null
2024-05-30 Sparse Matrix in Large Language Model Fine-tuning Haoze He et.al. 2405.15525 null
2024-05-24 Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation Abhinav Jain et.al. 2405.15282 link
2024-05-27 VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks Yang Li et.al. 2405.15179 link
2024-05-23 Bitune: Bidirectional Instruction-Tuning Dawid J. Kopiczko et.al. 2405.14862 null
2024-05-23 Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference Ting Liu et.al. 2405.14700 link
2024-05-22 Spectral Adapter: Fine-Tuning in Spectral Space Fangzhao Zhang et.al. 2405.13952 null
2024-05-24 MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models Jingwei Xu et.al. 2405.13053 link
2024-05-20 FeTT: Continual Class Incremental Learning via Feature Transformation Tuning Sunyuan Qiang et.al. 2405.11822 null
2024-05-21 HARIS: Human-Like Attention for Reference Image Segmentation Mengxi Zhang et.al. 2405.10707 null
2024-05-28 DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation Jie Xu et.al. 2405.06368 null
2024-05-09 Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection Bhawesh Kumar et.al. 2405.06093 null
2024-05-09 Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning Shibo Jie et.al. 2405.05615 link
2024-05-07 Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning Karim Galliamov et.al. 2405.04126 link
2024-05-04 Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning Jing Xu et.al. 2405.02596 link
2024-03-16 Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R Amirreza Esmaeili et.al. 2405.01553 null
2024-05-02 NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Gerald Shen et.al. 2405.01481 link
2024-04-29 LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Justin Zhao et.al. 2405.00732 link
2024-05-01 Investigating Automatic Scoring and Feedback using Large Language Models Gloria Ashiya Katuka et.al. 2405.00602 null
2024-05-01 MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model Rajat Sahay et.al. 2405.00293 null
2024-04-30 SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models Samir Arora et.al. 2405.00201 null
2024-05-23 HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning Chunlin Tian et.al. 2404.19245 link
2024-05-25 FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition Yuxuan Yan et.al. 2404.18848 null
2024-04-25 Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models Jiawei Chen et.al. 2404.16385 null
2024-05-23 MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts Dengchun Li et.al. 2404.15159 link
2024-04-22 ColA: Collaborative Adaptation with Gradient Learning Enmao Diao et.al. 2404.13844 link
2024-04-23 Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications Charith Chandra Sai Balne et.al. 2404.13506 null
2024-04-18 SKIP: Skill-Localized Prompt Tuning for Inference Speed Boost-Up Nakyeong Yang et.al. 2404.11916 null
2024-04-16 Shears: Unstructured Sparsity with Neural Low-rank Adapter Search J. Pablo Muñoz et.al. 2404.10934 link
2024-04-16 Exact and Efficient Unlearning for Large Language Model-based Recommendation Zhiyu Hu et.al. 2404.10327 null
2024-04-15 LoRA Dropout as a Sparsity Regularizer for Overfitting Control Yang Lin et.al. 2404.09610 null
2024-04-21 Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in LLMs Ahmed Agiza et.al. 2404.08699 link
2024-04-08 Certified PEFTSmoothing: Parameter-Efficient Fine-Tuning with Randomized Smoothing Chengyan Fu et.al. 2404.05350 null
2024-04-08 DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model Chao Gao et.al. 2404.05182 null
2024-04-12 Q-PEFT: Query-dependent Parameter Efficient Fine-tuning for Text Reranking with Large Language Models Zhiyuan Peng et.al. 2404.04522 null
2024-04-05 Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation Tong Su et.al. 2404.04212 null
2024-05-22 ReFT: Representation Finetuning for Language Models Zhengxuan Wu et.al. 2404.03592 link
2024-06-11 Personalized LLM Response Generation with Parameterized Memory Injection Kai Zhang et.al. 2404.03565 null
2024-06-20 Eigenpruning: an Interpretability-Inspired PEFT Method Tomás Vergara-Browne et.al. 2404.03147 link
2024-05-28 PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models Fanxu Meng et.al. 2404.02948 link
2024-04-03 Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data Parth Patwa et.al. 2404.02422 null
2024-04-11 IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT Junchen Fu et.al. 2404.02059 link
2024-03-31 Query-driven Relevant Paragraph Extraction from Legal Judgments T. Y. S. S Santosh et.al. 2404.00595 null
2024-03-30 Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune your model unless you have access to GPT-4 Aryo Pradipta Gema et.al. 2404.00484 link
2024-04-03 InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning Yan-Shuo Liang et.al. 2404.00228 link
2024-03-27 Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation Mateusz Klimaszewski et.al. 2403.18804 link
2024-03-26 The Unreasonable Ineffectiveness of the Deeper Layers Andrey Gromov et.al. 2403.17887 null
2024-04-15 ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models Zequan Liu et.al. 2403.16187 null
2024-03-22 KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation Xindi Luo et.al. 2403.14950 link
2024-03-22 A Single Linear Layer Yields Task-Adapted Low-Rank Matrices Hwichan Kim et.al. 2403.14946 null
2024-03-21 AutoRE: Document-Level Relation Extraction with Large Language Models Xue Lilong et.al. 2403.14888 link
2024-04-29 Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey Zeyu Han et.al. 2403.14608 null
2024-03-20 Harnessing Large Language Models for Text-Rich Sequential Recommendation Zhi Zheng et.al. 2403.13325 link
2024-04-16 AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models Zeyu Liu et.al. 2403.13269 null
2024-03-18 Improving LoRA in Privacy-preserving Federated Learning Youbang Sun et.al. 2403.12313 null
2024-03-18 Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation Wangbo Zhao et.al. 2403.11808 link
2024-03-18 Let’s Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model Haoyun Xu et.al. 2403.11621 null
2024-03-19 JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning Anique Tahir et.al. 2403.11366 link
2024-03-14 Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Tingyu Qu et.al. 2403.09377 link
2024-03-14 PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation Yizhe Xiong et.al. 2403.09192 link
2024-03-13 Data-oriented Dynamic Fine-tuning Parameter Selection Strategy for FISH Mask based Efficient Fine-tuning Ming Dong et.al. 2403.08484 null

Text-to-Image Generation

Publish Date Title Authors PDF Code
2024-09-18 Massively Multi-Person 3D Human Motion Forecasting with Scene Context Felix B Mueller et.al. 2409.12189 null
2024-09-18 MoRAG – Multi-Fusion Retrieval Augmented Generation for Human Motion Kalakonda Sai Shashank et.al. 2409.12140 null
2024-09-18 Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models EverestAI et.al. 2409.12139 null
2024-09-18 Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance Jaehoon Joo et.al. 2409.12099 null
2024-09-19 Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval Warren Jouanneau et.al. 2409.12097 null
2024-09-18 Design of Ligand-Binding Proteins with Atomic Flow Matching Junqi Liu et.al. 2409.12080 null
2024-09-18 Denoising diffusion models for high-resolution microscopy image restoration Pamela Osuna-Vargas et.al. 2409.12078 null
2024-09-19 Using Large Language Models to Generate Clinical Trial Tables and Figures Yumeng Yang et.al. 2409.12046 null
2024-09-18 LEMON: Localized Editing with Mesh Optimization and Neural Shaders Furkan Mert Algan et.al. 2409.12024 null
2024-09-18 Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization Zhi Chen et.al. 2409.12020 null
2024-09-18 Towards Global Localization using Multi-Modal Object-Instance Re-Identification Aneesh Chavan et.al. 2409.12002 null
2024-09-18 Tracking Any Point with Frame-Event Fusion Network at High Frame Rate Jiaxiong Liu et.al. 2409.11953 null
2024-09-18 Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models Lorenzo Mandelli et.al. 2409.11920 null
2024-09-18 AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots Zhaxizhuoma et.al. 2409.11905 null
2024-09-18 Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation Dimitrios Christodoulou et.al. 2409.11904 null
2024-09-17 Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion Zhenwei Wang et.al. 2409.11406 null
2024-09-17 Teaching dark matter simulations to speak the halo language Shivam Pandey et.al. 2409.11401 null
2024-09-17 Ultrasound Image Enhancement with the Variance of Diffusion Models Yuxin Zhang et.al. 2409.11380 link
2024-09-17 OSV: One Step is Enough for High-Quality Image to Video Generation Xiaofeng Mao et.al. 2409.11367 null
2024-09-17 Ping! Your Food is Ready: Comparing Different Notification Techniques in 3D AR Cooking Environment Aditya Raikwar et.al. 2409.11357 null
2024-09-17 Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Gonzalo Martin Garcia et.al. 2409.11355 link
2024-09-17 OmniGen: Unified Image Generation Shitao Xiao et.al. 2409.11340 link
2024-09-17 fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction Jianxiong Gao et.al. 2409.11315 null
2024-09-17 SpMis: An Investigation of Synthetic Spoken Misinformation Detection Peizhuo Liu et.al. 2409.11308 null
2024-09-17 Measurement of top-quark pair production in association with charm quarks in proton-proton collisions at $\sqrt{s}=13$ TeV with the ATLAS detector ATLAS Collaboration et.al. 2409.11305 null
2024-09-17 NirvaWave: An Accurate and Efficient Near Field Wave Propagation Simulator for 6G and Beyond Vahid Yazdnian et.al. 2409.11293 null
2024-09-17 DroneDiffusion: Robust Quadrotor Dynamics Learning with Diffusion Models Avirup Das et.al. 2409.11292 null
2024-09-17 Neural Networks for Vehicle Routing Problem László Kovács et.al. 2409.11290 null
2024-09-17 Attacking Slicing Network via Side-channel Reinforcement Learning Attack Wei Shao et.al. 2409.11258 null
2024-09-17 Learning Source Disentanglement in Neural Audio Codec Xiaoyu Bie et.al. 2409.11228 null
2024-09-16 Pennsieve - A Collaborative Platform for Translational Neuroscience and Beyond Zack Goldblum et.al. 2409.10509 null
2024-09-16 Torres funerarias chullpa en el valle del río Lauca: un primer análisis arqueoastronómico Alejandro Gangui et.al. 2409.10497 null
2024-09-16 Incorporating Classifier-Free Guidance in Diffusion Model-Based Recommendation Noah Buchanan et.al. 2409.10494 null
2024-09-16 SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing Qi Qian et.al. 2409.10476 null
2024-09-16 MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion Lehong Wu et.al. 2409.10473 null
2024-09-16 Signed Graph Autoencoder for Explainable and Polarization-Aware Network Embeddings Nikolaos Nakis et.al. 2409.10452 null
2024-09-16 Mamba-ST: State Space Model for Efficient Style Transfer Filippo Botti et.al. 2409.10385 link
2024-09-16 2D or not 2D: How Does the Dimensionality of Gesture Representation Affect 3D Co-Speech Gesture Generation? Téo Guichoux et.al. 2409.10357 null
2024-09-16 Taming Diffusion Models for Image Restoration: A Review Ziwei Luo et.al. 2409.10353 null
2024-09-16 MEGS: Morphological Evaluation of Galactic Structure Ufuk Çakır et.al. 2409.10346 link
2024-09-16 VAE-QWGAN: Improving Quantum GANs for High Resolution Image Generation Aaron Mark Thomas et.al. 2409.10339 null
2024-09-16 Research and Design of a Financial Intelligent Risk Control Platform Based on Big Data Analysis and Deep Machine Learning Shuochen Bi et.al. 2409.10331 null
2024-09-16 Fairness, not Emotion, Drives Socioeconomic Decision Making Rudra Mukhopadhyay et.al. 2409.10322 null
2024-09-16 On Synthetic Texture Datasets: Challenges, Creation, and Curation Blaine Hoak et.al. 2409.10297 null
2024-09-16 DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis Fa-Ting Hong et.al. 2409.10281 null
2024-09-13 Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation Qingwen Bu et.al. 2409.09016 link
2024-09-13 A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis Yohan Poirier-Ginter et.al. 2409.08947 null
2024-09-13 Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions Zahra Ashktorab et.al. 2409.08937 null
2024-09-13 Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation Guojun Liang et.al. 2409.08917 link
2024-09-13 Gaussian is All You Need: A Unified Framework for Solving Inverse Problems via Diffusion Posterior Sampling Nebiyou Yismaw et.al. 2409.08906 null
2024-09-13 Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control Carles Domingo-Enrich et.al. 2409.08861 null
2024-09-13 The Line-Based Dial-a-Ride Problem Kendra Reiter et.al. 2409.08860 null
2024-09-13 InstantDrag: Improving Interactivity in Drag-based Image Editing Joonghyuk Shin et.al. 2409.08857 null
2024-09-13 DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s) Yun Su Jeong et.al. 2409.08850 null
2024-09-13 Development of a Compton Imager Setup Anuraag Arya et.al. 2409.08822 null
2024-09-13 LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment Huan Zhang et.al. 2409.08795 null
2024-09-13 What You Say = What You Want? Teaching Humans to Articulate Requirements for LLMs Qianou Ma et.al. 2409.08775 null
2024-09-13 A Hybrid Meta-Learning and Multi-Armed Bandit Approach for Context-Specific Multi-Objective Recommendation Optimization Tiago Cunha et.al. 2409.08752 null
2024-09-13 Adaptive Sampling for Continuous Group Equivariant Neural Networks Berfin Inal et.al. 2409.08741 null
2024-09-13 DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset Jiawei Du et.al. 2409.08731 null
2024-09-12 DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors Thomas Hanwen Zhu et.al. 2409.08278 null
2024-09-12 Hand-Object Interaction Pretraining from Videos Himanshu Gaurav Singh et.al. 2409.08273 null
2024-09-12 Click2Mask: Local Editing with Dynamic Mask Generation Omer Regev et.al. 2409.08272 null
2024-09-12 DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer Runjia Li et.al. 2409.08271 null
2024-09-12 Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation Samanta Rodriguez et.al. 2409.08269 null
2024-09-12 Improving Text-guided Object Inpainting with Semantic Pre-inpainting Yifu Chen et.al. 2409.08260 link
2024-09-12 Improving Virtual Try-On with Garment-focused Diffusion Models Siqi Wan et.al. 2409.08258 null
2024-09-12 LoRID: Low-Rank Iterative Diffusion for Adversarial Purification Geigh Zollicoffer et.al. 2409.08255 null
2024-09-12 Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding Hongyu Li et.al. 2409.08251 null
2024-09-12 IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation Yinwei Wu et.al. 2409.08240 null
2024-09-12 Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources Alisia Lupidi et.al. 2409.08239 null
2024-09-12 LT3SD: Latent Trees for 3D Scene Diffusion Quan Meng et.al. 2409.08215 null
2024-09-12 VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis Hao Chen et.al. 2409.08207 null
2024-09-12 High-Frequency Anti-DreamBooth: Robust Defense Against Image Synthesis Takuto Onikubo et.al. 2409.08167 null
2024-09-12 MagicStyle: Portrait Stylization Based on Reference Image Zhaoli Deng et.al. 2409.08156 null
2024-09-11 DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation Haibo Yang et.al. 2409.07454 null
2024-09-11 Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models Haibo Yang et.al. 2409.07452 link
2024-09-11 FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process Yang Luo et.al. 2409.07451 null
2024-09-11 Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging Yunzhen Wang et.al. 2409.07417 null
2024-09-11 Extracting TCPIP Headers at High Speed for the Anonymized Network Traffic Graph Challenge Zhaoyang Han et.al. 2409.07374 null
2024-09-11 Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination Daniel Zhang-Li et.al. 2409.07372 null
2024-09-11 Event-based Mosaicing Bundle Adjustment Shuang Guo et.al. 2409.07365 link
2024-09-11 Training-Free Guidance for Discrete Diffusion Models for Molecular Generation Thomas J. Kerby et.al. 2409.07359 null
2024-09-11 Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching Eugenio Chisari et.al. 2409.07343 null
2024-09-11 Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models Fengzhe Zhang et.al. 2409.07323 null
2024-09-11 Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding Ronald Katende et.al. 2409.07310 null
2024-09-11 Exploring User-level Gradient Inversion with a Diffusion Prior Zhuohang Li et.al. 2409.07291 null
2024-09-11 CCFExp: Facial Image Synthesis with Cycle Cross-Fusion Diffusion Model for Facial Paralysis Individuals Weixiang Gao et.al. 2409.07271 link
2024-09-11 Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models Sanoojan Baliah et.al. 2409.07269 link
2024-09-11 EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion Jian Zhang et.al. 2409.07255 null
2024-09-10 Technical Report of Mobile Manipulator Robot for Industrial Environments Erfan Amoozad Khalili et.al. 2409.06693 null
2024-09-10 SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation Teng Hu et.al. 2409.06633 null
2024-09-10 MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification Phu Pham et.al. 2409.06620 null
2024-09-10 A Primer on Variational Inference for Physics-Informed Deep Generative Modelling Alex Glyn-Davies et.al. 2409.06560 null
2024-09-10 From LIMA to DeepLIMA: following a new path of interoperability Victor Bocharov et.al. 2409.06550 null
2024-09-10 Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models Xin Jing et.al. 2409.06451 null
2024-09-10 Prompt2Fashion: An automatically generated fashion dataset Georgia Argyro et.al. 2409.06442 link
2024-09-10 Fast nonparametric inference of network backbones for graph sparsification Alec Kirkley et.al. 2409.06417 link
2024-09-10 Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition Junzheng Zhang et.al. 2409.06371 null
2024-09-10 What happens to diffusion model likelihood when your model is conditional? Mattias Cross et.al. 2409.06364 null
2024-09-10 DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement Jia-Wei Liao et.al. 2409.06355 null
2024-09-10 Improving Conditional Level Generation using Automated Validation in Match-3 Games Monica Villanueva Aylagas et.al. 2409.06349 null
2024-09-10 Foragax: An Agent Based Modelling framework based on JAX Siddharth Chaturvedi et.al. 2409.06345 link
2024-09-10 G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer Jinzhi Zhang et.al. 2409.06322 null
2024-09-10 Learning Augmentation Policies from A Model Zoo for Time Series Forecasting Haochen Yuan et.al. 2409.06282 null
2024-09-09 Fast Generation of Custom Floating-Point Spatial Filters on FPGAs Nelson Campos et.al. 2409.05837 null
2024-09-09 Enhancing Preference-based Linear Bandits via Human Response Time Shen Li et.al. 2409.05798 null
2024-09-09 Predicting Critical Heat Flux with Uncertainty Quantification and Domain Generalization Using Conditional Variational Autoencoders and Deep Neural Networks Farah Alsafadi et.al. 2409.05790 null
2024-09-09 Vector Quantized Diffusion Model Based Speech Bandwidth Extension Yuan Fang et.al. 2409.05784 null
2024-09-09 AS-Speech: Adaptive Style For Speech Synthesis Zhipeng Li et.al. 2409.05730 null
2024-09-09 pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning Jiahao Lai et.al. 2409.05701 null
2024-09-09 Citizen-Led Personalization of User Interfaces: Investigating How People Customize Interfaces for Themselves and Others Sérgio Alves et.al. 2409.05696 null
2024-09-09 Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models Aakash Sen Sharma et.al. 2409.05668 null
2024-09-09 Forward KL Regularized Preference Optimization for Aligning Diffusion Policies Zhao Shan et.al. 2409.05622 null
2024-09-09 CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization Nan Chen et.al. 2409.05606 null
2024-09-09 Latent 3D Brain MRI Counterfactual Wei Peng et.al. 2409.05585 null
2024-09-09 Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation Muraleekrishna Gopinathan et.al. 2409.05583 link
2024-09-09 Design and Implementation of TAO DAQ System Shuihan Zhang et.al. 2409.05522 null
2024-09-09 A Taxonomy of Miscompressions: Preparing Image Forensics for Neural Compression Nora Hofer et.al. 2409.05490 null
2024-09-09 DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation Wei Wu et.al. 2409.05463 null
2024-09-06 VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Yecheng Wu et.al. 2409.04429 null
2024-09-06 Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques Davide Clode da Silva et.al. 2409.04424 null
2024-09-06 Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation Zhuoyan Luo et.al. 2409.04410 null
2024-09-06 Enhancing Skin Lesion Diagnosis with Ensemble Learning Xiaoyi Liu et.al. 2409.04381 null
2024-09-06 How Fair is Your Diffusion Recommender Model? Daniele Malitesta et.al. 2409.04339 null
2024-09-06 Random effects estimation in a fractional diffusion model based on continuous observations Nesrine Chebli et.al. 2409.04331 null
2024-09-06 Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models Yuxiao Huang et.al. 2409.04270 null
2024-09-06 An overview of domain-specific foundation model: key technologies, applications and challenges Haolong Chen et.al. 2409.04267 null
2024-09-06 UniDet3D: Multi-dataset Indoor 3D Object Detection Maksim Kolodiazhnyi et.al. 2409.04234 link
2024-09-06 Generative Modelling via Quantile Regression Johannes Schmidt-Hieber et.al. 2409.04231 null
2024-09-06 Breaking the Brownian Barrier: Models and Manifestations of Molecular Diffusion in Complex Fluids Harish Srinivasan et.al. 2409.04199 null
2024-09-06 GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers Lorenza Prospero et.al. 2409.04196 null
2024-09-06 Subsampling of Correlated Graph Signals Rishabh Ravi et.al. 2409.04107 null
2024-09-06 Estimation of service value parameters for a queue with unobserved balking Daniel Podorojnyi et.al. 2409.04090 null
2024-09-06 D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection Kentaro Hirahara et.al. 2409.04060 null
2024-09-05 Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding Yunze Man et.al. 2409.03757 link
2024-09-05 WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild Yuntian Deng et.al. 2409.03753 null
2024-09-05 ArtiFade: Learning to Generate High-quality Subject from Blemished Images Shuya Yang et.al. 2409.03745 null
2024-09-06 RAG based Question-Answering for Contextual Response Prediction System Sriram Veturi et.al. 2409.03708 null
2024-09-05 RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images Benzhi Wang et.al. 2409.03644 null
2024-09-05 DiffEVC: Any-to-Any Emotion Voice Conversion with Expressive Guidance Hsing-Hang Chou et.al. 2409.03636 null
2024-09-05 Generalizing Linear Graphs and Bond Graph Models with Hetero-functional Graphs for System-of-Systems Engineering Applications Ehsanoddin Ghorbanichemazkati et.al. 2409.03630 null
2024-09-05 TCDiff: Triple Condition Diffusion Model with 3D Constraints for Stylizing Synthetic Faces Bernardo Biesseck et.al. 2409.03600 link
2024-09-05 DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture Qianlong Xiang et.al. 2409.03550 null
2024-09-05 Euclid preparation. Simulations and nonlinearities beyond ΛCDM. 2. Results from non-standard simulations Euclid Collaboration et.al. 2409.03523 null
2024-09-05 Blended Latent Diffusion under Attention Control for Real-World Video Editing Deyin Liu et.al. 2409.03514 null
2024-09-05 Physical Modelling of Piano Sound Haifan Xie et.al. 2409.03481 null
2024-09-05 Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration Pei Wang et.al. 2409.03455 null
2024-09-05 Rx Strategist: Prescription Verification using LLM Agents System Phuc Phan Van et.al. 2409.03440 null
2024-09-05 KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale Wei Gao et.al. 2409.03439 null
2024-09-04 HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts Xinyu Liu et.al. 2409.02919 link
2024-09-04 Latent Watermarking of Audio Generative Models Robin San Roman et.al. 2409.02915 null
2024-09-04 Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling Kaiwen Zheng et.al. 2409.02908 null
2024-09-04 Configurable Foundation Models: Building LLMs from a Modular Perspective Chaojun Xiao et.al. 2409.02877 null
2024-09-04 Look Into the LITE in Deep Learning for Time Series Classification Ali Ismail-Fawaz et.al. 2409.02869 link
2024-09-04 Building a Scalable, Effective, and Steerable Search and Ranking Platform Marjan Celikik et.al. 2409.02856 null
2024-09-04 Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models Zhibin Liu et.al. 2409.02851 link
2024-09-04 Anomaly Detection in Offshore Open Radio Access Network Using Long Short-Term Memory Models on a Novel Artificial Intelligence-Driven Cloud-Native Data Platform Abdelrahim Ahmad et.al. 2409.02849 null
2024-09-04 Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model Tornike Karchkhadze et.al. 2409.02845 null
2024-09-04 SNNAX – Spiking Neural Networks in JAX Jamie Lohoff et.al. 2409.02842 null
2024-09-04 Experimental Framework for Generating Reliable Ground Truth for Laryngeal Spatial Segmentation Tasks Hamzeh Ghasemzadeh et.al. 2409.02809 null
2024-09-04 Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL Mohammad Reshadati et.al. 2409.02711 null
2024-09-04 Rethinking HTG Evaluation: Bridging Generation and Recognition Konstantina Nikolaidou et.al. 2409.02683 link
2024-09-04 Introduction to Machine Learning Laurent Younes et.al. 2409.02668 null
2024-09-04 Creating Domain-Specific Translation Memories for Machine Translation Fine-tuning: The TRENCARD Bilingual Cardiology Corpus Gokhan Dogru et.al. 2409.02667 null
2024-08-30 Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes Li Zhang et.al. 2408.17421 link
2024-08-30 Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain Francesca Grasso et.al. 2408.17362 link
2024-08-30 Subspace Diffusion Posterior Sampling for Travel-Time Tomography Xiang Cao et.al. 2408.17333 null
2024-08-30 Structuring a Training Strategy to Robustify Perception Models with Realistic Image Augmentations Ahmed Hammam et.al. 2408.17311 null
2024-08-30 Leveraging Deep Generative Model For Computational Protein Design And Optimization Boqiao Lai et.al. 2408.17241 null
2024-08-30 Towards Symbolic XAI – Explanation Through Human Understandable Logical Relationships Between Features Thomas Schnake et.al. 2408.17198 null
2024-09-02 Leveraging Blockchain and ANFIS for Optimal Supply Chain Management Amirfarhad Farhadi et.al. 2408.17161 null
2024-08-30 Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning Xiaoye Qu et.al. 2408.17150 link
2024-08-30 Flow Matching for Optimal Reaction Coordinates of Biomolecular System Mingyuan Zhang et.al. 2408.17139 link
2024-08-30 Temporal and Interactive Modeling for Efficient Human-Human Motion Generation Yabiao Wang et.al. 2408.17135 null
2024-09-02 RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance Avideep Mukherjee et.al. 2408.17095 null
2024-08-30 FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition Chen Hu et.al. 2408.17090 link
2024-08-30 Approximately Invertible Neural Network for Learned Image Compression Yanbo Gao et.al. 2408.17073 null
2024-09-02 Instant Adversarial Purification with Adversarial Consistency Distillation Chun Tong Lei et.al. 2408.17064 null
2024-08-30 Text-to-Image Generation Via Energy-Based CLIP Roy Ganz et.al. 2408.17046 null
2024-08-29 ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model Fangfu Liu et.al. 2408.16767 null
2024-08-29 CSGO: Content-Style Composition in Text-to-Image Generation Peng Xing et.al. 2408.16766 null
2024-08-29 A Score-Based Density Formula, with Applications in Diffusion Generative Models Gen Li et.al. 2408.16765 null
2024-08-29 UV-free Texture Generation with Denoising and Geodesic Heat Diffusions Simone Foti et.al. 2408.16762 link
2024-08-29 One-Shot Learning Meets Depth Diffusion in Multi-Object Videos Anisha Jain et.al. 2408.16704 null
2024-08-29 VMC: A Grammar for Visualizing Statistical Model Checks Ziyang Guo et.al. 2408.16702 null
2024-08-29 GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models Moreno D’Incà et.al. 2408.16700 link
2024-08-29 Optimization Models for the Quadratic Traveling Salesperson Problem Yuxiao Chen et.al. 2408.16680 null
2024-08-29 DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving Yongjie Fu et.al. 2408.16647 null
2024-08-29 RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model Zhuan Shi et.al. 2408.16634 null
2024-08-28 TEDRA: Text-based Editing of Dynamic and Photoreal Actors Basavaraj Sunagad et.al. 2408.15995 null
2024-08-28 Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation Shengyuan Zhang et.al. 2408.15991 link
2024-08-28 Thoughtseeds: Evolutionary Priors, Nested Markov Blankets, and the Emergence of Embodied Cognition Prakash Chandra Kavi et.al. 2408.15982 null
2024-08-28 Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems Ibrahim K. Ozaslan et.al. 2408.15969 null
2024-08-28 MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets Dominic Phillips et.al. 2408.15905 null
2024-08-28 Gen-Swarms: Adapting Deep Generative Models to Swarms of Drones Carlos Plou et.al. 2408.15899 null
2024-08-28 Airfoil Diffusion: Denoising Diffusion Model For Conditional Airfoil Generation Reid Graves et.al. 2408.15898 link
2024-08-28 Disentangled Diffusion Autoencoder for Harmonization of Multi-site Neuroimaging Data Ayodeji Ijishakin et.al. 2408.15890 null
2024-08-29 Recent Decade’s Power Outage Data Reveals the Increasing Vulnerability of U.S. Power Infrastructure Bo Li et.al. 2408.15882 null
2024-08-28 GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model Yongjie Fu et.al. 2408.15868 null
2024-08-27 GenRec: Unifying Video Generation and Recognition with Diffusion Models Zejia Weng et.al. 2408.15241 null
2024-08-27 Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation Xiaojuan Wang et.al. 2408.15239 null
2024-08-27 Simulation of Stochastic Discrete Dislocation Dynamics in Ductile Vs Brittle Materials Santosh Chhetri et.al. 2408.15157 null
2024-08-27 How transformers learn structured data: insights from hierarchical filtering Jerome Garnier-Brun et.al. 2408.15138 null
2024-08-27 DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays Yiran Sun et.al. 2408.15118 link
2024-08-27 Data-Driven Nonlinear Deformation Design of 3D-Printable Shells Samuel Silverman et.al. 2408.15097 link
2024-08-27 Constrained Diffusion Models via Dual Training Shervin Khalafi et.al. 2408.15094 null
2024-08-27 LN-Gen: Rectal Lymph Nodes Generation via Anatomical Features Weidong Guo et.al. 2408.14977 null
2024-08-27 MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer Shurong Yang et.al. 2408.14975 null
2024-08-27 Integrated Bundling and Pricing of Unique Items Maxime Bouscary et.al. 2408.14913 null
2024-08-26 K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences Zhikai Li et.al. 2408.14468 null
2024-08-26 Uncovering Knowledge Gaps in Radiology Report Generation Models through Knowledge Graphs Xiaoman Zhang et.al. 2408.14397 link
2024-08-26 Reprogramming Foundational Large Language Models (LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning Sakhinana Sagar Srinivas et.al. 2408.14387 null
2024-08-26 GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy Peiyan Li et.al. 2408.14368 link
2024-08-27 Foundation Models for Music: A Survey Yinghao Ma et.al. 2408.14340 link
2024-08-26 Automated Machine Learning in Insurance Panyi Dong et.al. 2408.14331 link
2024-08-26 LLM-3D Print: Large Language Models To Monitor and Control 3D Printing Yayati Jadhav et.al. 2408.14307 null
2024-08-26 Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes Chao Chen et.al. 2408.14279 null
2024-08-26 Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach Vittoriano Muttillo et.al. 2408.14259 null
2024-08-27 Text3DAug – Prompted Instance Augmentation for LiDAR Perception Laurenz Reichardt et.al. 2408.14253 link
2024-08-23 How Diffusion Models Learn to Factorize and Compose Qiyao Liang et.al. 2408.13256 null
2024-08-23 Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption Sakhinana Sagar Srinivas et.al. 2408.13248 null
2024-08-23 CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities Tao Wu et.al. 2408.13239 null
2024-08-23 Social Welfare Maximization for Federated Learning with Network Effects Xiang Li et.al. 2408.13223 null
2024-08-23 Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews Dineth Jayakody et.al. 2408.13202 null
2024-08-23 IFH: a Diffusion Framework for Flexible Design of Graph Generative Models Samuel Cognolato et.al. 2408.13194 link
2024-08-23 Deep Learning for Lung Disease Classification Using Transfer Learning and a Customized CNN Architecture with Attention Xiaoyi Liu et.al. 2408.13180 null
2024-08-26 Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation Bonan Li et.al. 2408.13149 null
2024-08-23 Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning Jihwan Oh et.al. 2408.13092 null
2024-08-23 General Intelligent Imaging and Uncertainty Quantification by Deterministic Diffusion Model Weiru Fan et.al. 2408.13061 null
2024-08-22 xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations Can Qin et.al. 2408.12590 null
2024-08-22 ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation Lujia Zhong et.al. 2408.12561 link
2024-08-22 Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Jinheng Xie et.al. 2408.12528 null
2024-08-22 FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing Jue Wang et.al. 2408.12429 link
2024-08-22 Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification Sudi Murindanyi et.al. 2408.12426 null
2024-08-22 4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Motion Alignment Kaihui Cheng et.al. 2408.12419 null
2024-08-22 CODE: Confident Ordinary Differential Editing Bastien van Delft et.al. 2408.12418 link
2024-08-22 Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures Ce Liu et.al. 2408.12413 null
2024-08-22 A Stable Polygamy Approach to Spectrum Access with Channel Reuse Dan Ben Ami et.al. 2408.12402 null
2024-08-22 Multi-Style Facial Sketch Synthesis through Masked Generative Modeling Bowen Sun et.al. 2408.12400 null
2024-08-21 Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models Chun-Yen Shih et.al. 2408.11810 null
2024-08-21 ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation Shiqi Yang et.al. 2408.11805 null
2024-08-21 DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework Zhifei Xie et.al. 2408.11788 null
2024-08-21 Timeline and Boundary Guided Diffusion Network for Video Shadow Detection Haipeng Zhou et.al. 2408.11785 link
2024-08-21 Sum of Squares Circuits Lorenzo Loconte et.al. 2408.11778 null
2024-08-21 Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards Omar Erak et.al. 2408.11775 link
2024-08-21 D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models M. Forlini et.al. 2408.11761 null
2024-08-21 JieHua Paintings Style Feature Extracting Model using Stable Diffusion with ControlNet Yujia Gu et.al. 2408.11744 null
2024-08-21 Enhancing Cross-Modal Medical Image Segmentation through Compositionality Aniek Eijpe et.al. 2408.11733 link
2024-08-21 AI-assisted Automated Short Answer Grading of Handwritten University Level Mathematics Exams Tianyi Liu et.al. 2408.11728 null
2024-08-20 Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research Sreyoshi Bhaduri et.al. 2408.11043 null
2024-08-20 Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Chunting Zhou et.al. 2408.11039 null
2024-08-20 Full Detector Simulation of a Projective Dual-Readout Segmented Crystal Electromagnetic Calorimeter with Precision Timing Wonyong Chung et.al. 2408.11027 null
2024-08-20 MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning Haoning Wu et.al. 2408.11001 link
2024-08-20 GreediRIS: Scalable Influence Maximization using Distributed Streaming Maximum Cover Reet Barik et.al. 2408.10982 null
2024-08-21 Assortment Optimization Under History-Dependent Effects Taotao He et.al. 2408.10967 null
2024-08-20 Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling Jaideep Pathak et.al. 2408.10958 null
2024-08-20 SysBench: Can Large Language Models Follow System Messages? Yanzhao Qin et.al. 2408.10943 link
2024-08-20 A Closer Look at Data Augmentation Strategies for Finetuning-Based Low/Few-Shot Object Detection Vladislav Li et.al. 2408.10940 null
2024-08-20 Large Point-to-Gaussian Model for Image-to-3D Generation Longfei Lu et.al. 2408.10935 null
2024-08-19 MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model Minghua Liu et.al. 2408.10198 null
2024-08-19 SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views Chao Xu et.al. 2408.10195 null
2024-08-19 Customizing Language Models with Instance-wise LoRA for Sequential Recommendation Xiaoyu Kong et.al. 2408.10159 null
2024-08-19 Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language Manjil Karki et.al. 2408.10128 null
2024-08-19 Learning Precise Affordances from Egocentric Videos for Robotic Manipulation Gen Li et.al. 2408.10123 null
2024-08-19 Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision Zhijun Jia et.al. 2408.10096 null
2024-08-19 Stacked Intelligent Metasurfaces for Integrated Sensing and Communications Haoxian Niu et.al. 2408.10043 null
2024-08-19 General Impedance Modeling for Modular Multilevel Converter with Grid-forming and Grid-following Control Chu Sun et.al. 2408.10017 null
2024-08-19 Uniting contrastive and generative learning for event sequences models Aleksandr Yugay et.al. 2408.09995 null
2024-08-19 Multi-layer diffusion model of photovoltaic installations Tomasz Weron et.al. 2408.09904 null
2024-08-16 Automated High-throughput Organic Crystal Structure Prediction via Population-based Sampling Qiang Zhu et.al. 2408.08843 link
2024-08-16 PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future Guangyi Wang et.al. 2408.08822 null
2024-08-16 A Unified Automata-Theoretic Approach to LTLf Modulo Theories (Extended Version) Marco Faella et.al. 2408.08817 null
2024-08-16 EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics Chenwei Wan et.al. 2408.08782 link
2024-08-16 Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion Sanchayan Vivekananthan et.al. 2408.08751 null
2024-08-16 The Blessing of Strategic Customers in Personalized Pricing Zhi Chen et.al. 2408.08738 null
2024-08-16 ChatZero: Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language Yongkang Liu et.al. 2408.08724 null
2024-08-16 An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation Peiming Guo et.al. 2408.08650 null
2024-08-16 Modeling the Neonatal Brain Development Using Implicit Neural Representations Florentin Bieder et.al. 2408.08647 link
2024-08-16 Sampling effects on Lasso estimation of drift functions in high-dimensional diffusion processes Chiara Amorino et.al. 2408.08638 null
2024-08-15 Understanding the Local Geometry of Generative Model Manifolds Ahmed Imtiaz Humayun et.al. 2408.08307 null
2024-08-15 Accelerated Image-Aware Generative Diffusion Modeling Tanmay Asthana et.al. 2408.08306 null
2024-08-15 Marker or Markerless? Mode-Switchable Optical Tactile Sensing for Diverse Robot Tasks Ni Ou et.al. 2408.08276 null
2024-08-15 mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis Dae-young Kim et.al. 2408.08261 null
2024-08-15 Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding Xiner Li et.al. 2408.08252 link
2024-08-15 Picosecond laser pulses for quantum dot-microcavity based single photon generation by cascaded electro-optic modulation of a narrow-linewidth laser Mio Poortvliet et.al. 2408.08213 null
2024-08-15 Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion Adi Haviv et.al. 2408.08184 null
2024-08-15 Impact of Comprehensive Data Preprocessing on Predictive Modelling of COVID-19 Mortality Sangita Das et.al. 2408.08142 link
2024-08-15 Decoding Memes: A Comparative Study of Machine Learning Models for Template Identification Levente Murgás et.al. 2408.08126 link
2024-08-15 When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding Pingping Zhang et.al. 2408.08093 null
2024-08-14 Detecting Near-Duplicate Face Images Sudipta Banerjee et.al. 2408.07689 link
2024-08-14 Composing Automatic Differentiation with Custom Derivatives of Higher-Order Functions Sam Estep et.al. 2408.07683 null
2024-08-14 Drug Discovery SMILES-to-Pharmacokinetics Diffusion Models with Deep Molecular Understanding Bing Hu et.al. 2408.07636 null
2024-08-14 Anisotropic Diffusion Model of Communication in 2D Biofilm Yanahan Paramalingam et.al. 2408.07626 null
2024-08-14 Neural Quantum States and Peaked Molecular Wave Functions: Curse or Blessing? Aleksei Malyshev et.al. 2408.07625 null
2024-08-14 MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials Yan Chen et.al. 2408.07608 null
2024-08-14 PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation Sang-Hoon Lee et.al. 2408.07547 link
2024-08-14 New Curriculum, New Chance – Retrieval Augmented Generation for Lesson Planning in Ugandan Secondary Schools. Prototype Quality Evaluation Simon Kloker et.al. 2408.07542 null
2024-08-14 DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model Erez Yosef et.al. 2408.07541 null
2024-08-14 Towards Real-time Video Compressive Sensing on Mobile Devices Miao Cao et.al. 2408.07530 link
2024-08-13 Imagen 3 Imagen-Team-Google et.al. 2408.07009 null
2024-08-13 Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models Cheng Chen et.al. 2408.06995 null
2024-08-13 DCMSA: Multi-Head Self-Attention Mechanism Based on Deformable Convolution For Seismic Data Denoising Wang Mingwei et.al. 2408.06963 null
2024-08-13 Neural Speech and Audio Coding Minje Kim et.al. 2408.06954 null
2024-08-13 Diffusion Model for Slate Recommendation Federico Tomasi et.al. 2408.06883 null
2024-08-13 Efficient Search for Customized Activation Functions with Gradient Descent Lukas Strack et.al. 2408.06820 link
2024-08-13 Enhancing Diabetic Retinopathy Diagnosis: A Lightweight CNN Architecture for Efficient Exudate Detection in Retinal Fundus Images Mujadded Al Rabbani Alif et.al. 2408.06784 null
2024-08-13 Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective Ouxiang Li et.al. 2408.06741 link
2024-08-13 DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion Yujia Wu et.al. 2408.06740 null
2024-08-13 Multimodal Analysis of White Blood Cell Differentiation in Acute Myeloid Leukemia Patients using a β-Variational Autoencoder Gizem Mert et.al. 2408.06720 null
2024-08-12 The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Chris Lu et.al. 2408.06292 link
2024-08-12 Open-Source Molecular Processing Pipeline for Generating Molecules Shreyas V et.al. 2408.06261 null
2024-08-12 3D Reconstruction of Protein Structures from Multi-view AFM Images using Neural Radiance Fields (NeRFs) Jaydeep Rade et.al. 2408.06244 null
2024-08-12 Cislunar Constellation Design for Space Situational Awareness with Time-Expanded Facility Location Problem Yuri Shimane et.al. 2408.06238 null
2024-08-12 Novel View Synthesis from a Single Image with Pretrained Diffusion Guidance Taewon Kang et.al. 2408.06157 null
2024-08-12 LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library Tianhao Yu et.al. 2408.06150 null
2024-08-12 Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models Ioannis Romanelis et.al. 2408.06145 link
2024-08-12 Med42-v2: A Suite of Clinical LLMs Clément Christophe et.al. 2408.06142 null
2024-08-12 Five Pitfalls When Assessing Synthetic Medical Images with Reference Metrics Melanie Dohmen et.al. 2408.06075 null
2024-08-12 CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Zhuoyi Yang et.al. 2408.06072 link
2024-08-09 Multi-Garment Customized Model Generation Yichen Liu et.al. 2408.05206 null
2024-08-09 TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning Yujie Feng et.al. 2408.05200 link
2024-08-09 Cell Morphology-Guided Small Molecule Generation with GFlowNets Stephen Zhewen Lu et.al. 2408.05196 link
2024-08-09 Lithography-free patterning of chalcogenide materials for integrated photonic devices Zhen Hu et.al. 2408.05099 null
2024-08-09 Social contagion under hybrid interactions Xincheng Shu et.al. 2408.05050 null
2024-08-09 Infrared Beam-shaping on Demand via Tailored Geometric Phase Metasurfaces employing the Plasmonic Phase-Change Material In3SbTe2 Lukas Conrads et.al. 2408.05044 null
2024-08-09 Collaborative Static-Dynamic Teaching: A Semi-Supervised Framework for Stripe-Like Space Target Detection Zijian Zhu et.al. 2408.05029 null
2024-08-09 Retrieval-augmented code completion for local projects using large language models Marko Hostnik et.al. 2408.05026 null
2024-08-09 DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow Hangyu Li et.al. 2408.05008 null
2024-08-09 Pay Attention To Mean Fields For Point Cloud Generation Benno Käch et.al. 2408.04997 link
2024-08-08 Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics Ruining Li et.al. 2408.04631 null
2024-08-08 Transformer Explainer: Interactive Learning of Text-Generative Models Aeree Cho et.al. 2408.04619 null
2024-08-08 Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User’s Casual Sketches Yongzhi Xu et.al. 2408.04567 null
2024-08-08 Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models Yupeng Chang et.al. 2408.04556 link
2024-08-08 On the Asymptotic Convergence of Subgraph Generated Models Xinchen Xu et.al. 2408.04541 null
2024-08-08 AExGym: Benchmarks and Environments for Adaptive Experimentation Jimmy Wang et.al. 2408.04531 null
2024-08-08 NFDI4Health workflow and service for synthetic data generation, assessment and risk management Sobhan Moazemi et.al. 2408.04478 null
2024-08-08 Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations Julen Urain et.al. 2408.04380 null
2024-08-08 Making sense of AI systems development Mateusz Dolata et.al. 2408.04311 null
2024-08-08 AI-Driven Chatbot for Intrusion Detection in Edge Networks: Enhancing Cybersecurity with Ethical User Consent Mugheez Asif et.al. 2408.04281 null
2024-08-07 Prospects for using drones to test formation-flying CubeSat concepts, and other astronomical applications John D. Monnier et.al. 2408.03911 null
2024-08-07 Hate Speech Detection and Classification in Amharic Text with Deep Learning Samuel Minale Gashe et.al. 2408.03849 null
2024-08-07 WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models Prannaya Gupta et.al. 2408.03837 link
2024-08-07 A broken duet: multistable dynamics of dyadic interactions Johan Medrano et.al. 2408.03809 null
2024-08-07 Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning Martin Moder et.al. 2408.03807 link
2024-08-07 Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model Guoqing Zhu et.al. 2408.03748 link
2024-08-07 Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction Benjamin Matthias Ruppik et.al. 2408.03706 null
2024-08-07 Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling Zilyu Ye et.al. 2408.03695 null
2024-08-07 Unsupervised Detection of Fetal Brain Anomalies using Denoising Diffusion Models Markus Ditlev Sjøgren Olsen et.al. 2408.03654 null
2024-08-07 Goal-oriented Semantic Communication for the Metaverse Application Zhe Wang et.al. 2408.03646 null
2024-08-06 MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation Xiaofeng Mao et.al. 2408.03312 null
2024-08-06 IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts Ciara Rowles et.al. 2408.03209 null
2024-08-06 Personalizing Federated Instrument Segmentation with Visual Trait Priors in Robotic Surgery Jialang Xu et.al. 2408.03208 null
2024-08-06 An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion Xingguang Yan et.al. 2408.03178 null
2024-08-06 Iterative CT Reconstruction via Latent Variable Optimization of Shallow Diffusion Models Sho Ozaki et.al. 2408.03156 null
2024-08-06 Enhancing Twitter Bot Detection via Multimodal Invariant Representations Jibing Gong et.al. 2408.03096 null
2024-08-06 Analysis of Argument Structure Constructions in a Deep Recurrent Language Model Pegah Ramezani et.al. 2408.03062 null
2024-08-06 OpenOmni: A Collaborative Open Source Tool for Building Future-Ready Multimodal Conversational Agents Qiang Sun et.al. 2408.03047 link
2024-08-06 Targeted Visual Prompting for Medical Visual Question Answering Sergio Tascon-Morales et.al. 2408.03043 null
2024-08-06 Training-Free Condition Video Diffusion Models for single frame Spatial-Semantic Echocardiogram Synthesis Van Phi Nguyen et.al. 2408.03035 link
2024-08-05 Command-line Obfuscation Detection using Small Language Models Vojtech Outrata et.al. 2408.02637 null
2024-08-05 VidGen-1M: A Large-Scale Dataset for Text-to-video Generation Zhiyu Tan et.al. 2408.02629 null
2024-08-05 YOWOv3: An Efficient and Generalized Framework for Human Action Detection and Recognition Duc Manh Nguyen Dang et.al. 2408.02623 link
2024-08-05 LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba Yunxiang Fu et.al. 2408.02615 link
2024-08-05 MetaParticles: Computationally engineered nanomaterials with tunable and responsive properties Massimiliano Paesani et.al. 2408.02564 null
2024-08-05 Fairness and Bias Mitigation in Computer Vision: A Survey Sepehr Dehdashtian et.al. 2408.02464 null
2024-08-05 TGS: Trajectory Generation and Selection using Vision Language Models in Mapless Outdoor Environments Daeun Song et.al. 2408.02454 null
2024-08-05 Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models Zi Liang et.al. 2408.02416 link
2024-08-05 Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models Tongtong Feng et.al. 2408.02408 null
2024-08-05 A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models Vanni Zavarella et.al. 2408.02377 null
2024-08-02 Conditional LoRA Parameter Generation Xiaolong Jin et.al. 2408.01415 null
2024-08-02 Autoencoders in Function Space Justin Bunker et.al. 2408.01362 link
2024-08-02 MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code Kaiwen Ning et.al. 2408.01354 link
2024-08-02 TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling Dong Huo et.al. 2408.01291 null
2024-08-02 A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness Lutao Jiang et.al. 2408.01269 null
2024-08-02 Exchange control in a MOS double quantum dot made using a 300 mm wafer process Jacob F. Chittock-Wood et.al. 2408.01241 null
2024-08-02 CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models Kushal Kumar Jain et.al. 2408.01233 null
2024-08-02 Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion Ke Li et.al. 2408.01225 link
2024-08-02 PSP-GEN: Stochastic inversion of the Process-Structure-Property chain in materials design through deep, generative probabilistic modeling Yaohua Zang et.al. 2408.01114 null
2024-08-02 Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding Danbinaerin Han et.al. 2408.01096 null
2024-08-01 Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation Yixiao Wang et.al. 2408.00766 null
2024-08-01 Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention Susung Hong et.al. 2408.00760 null
2024-08-01 DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency Jovan Stojkovic et.al. 2408.00741 null
2024-08-01 TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models Gilad Deutch et.al. 2408.00735 null
2024-08-01 A Natural Language Processing Framework for Hotel Recommendation Based on Users’ Text Reviews Lavrentia Aravani et.al. 2408.00716 null
2024-08-02 Reinforcement Learning applied to Insurance Portfolio Pursuit Edward James Young et.al. 2408.00713 link
2024-08-01 MotionFix: Text-Driven 3D Human Motion Editing Nikos Athanasiou et.al. 2408.00712 null
2024-08-01 Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function Matias Oscar Volman Stern et.al. 2408.00707 null
2024-08-01 AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models Daqin Luo et.al. 2408.00665 link
2024-08-01 Privacy-preserving datasets by capturing feature distributions with Conditional VAEs Francesco Di Salvo et.al. 2408.00639 link
2024-07-31 Detecting, Explaining, and Mitigating Memorization in Diffusion Models Yuxin Wen et.al. 2407.21720 link
2024-07-31 Tora: Trajectory-oriented Diffusion Transformer for Video Generation Zhenghao Zhang et.al. 2407.21705 null
2024-07-31 Generative Diffusion Model for Seismic Imaging Improvement of Sparsely Acquired Data and Uncertainty Quantification Xingchen Shi et.al. 2407.21683 null
2024-07-31 Quality Control for Radiology Report Generation Models via Auxiliary Auditing Components Hermione Warr et.al. 2407.21638 null
2024-07-31 LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows Lukas Teufelberger et.al. 2407.21593 null
2024-07-31 Long-term investment and energy procurement risk management under uncertainty for an electrolytic green hydrogen producer Owen Palmer et.al. 2407.21574 null
2024-07-31 Conditioned Prompt-Optimization for Continual Deepfake Detection Francesco Laiti et.al. 2407.21554 link
2024-07-31 CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment Akira Kasuga et.al. 2407.21553 null
2024-07-31 Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation Junxuan Yu et.al. 2407.21490 null
2024-07-31 Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends Giuliano Martinelli et.al. 2407.21489 link
2024-07-30 Matting by Generation Zhixiang Wang et.al. 2407.21017 null
2024-07-30 Add-SD: Rational Generation without Manual Reference Lingfeng Yang et.al. 2407.21016 link
2024-07-30 Integrating Agent-Based and Compartmental Models for Infectious Disease Modeling: A Novel Hybrid Approach Inan Bostanci et.al. 2407.20993 null
2024-07-30 MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions Xiaowei Chi et.al. 2407.20962 link
2024-07-30 Mitigating calibration errors from mutual coupling with time-domain filtering of 21 cm cosmological radio observations N. Charles et.al. 2407.20923 null
2024-07-30 Impact of Geographical Separation on Spectrum Sharing Markets Kangle Mu et.al. 2407.20909 null
2024-07-30 Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering Yanpeng Zhao et.al. 2407.20908 link
2024-07-30 Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks Yunfeng Diao et.al. 2407.20836 null
2024-07-30 Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning Norman Di Palo et.al. 2407.20798 null
2024-07-30 SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models Zheng Liu et.al. 2407.20756 link
2024-07-29 Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing Ekaterina Iakovleva et.al. 2407.20232 null
2024-07-29 LatentArtiFusion: An Effective and Efficient Histological Artifacts Restoration Framework Zhenqi He et.al. 2407.20172 link
2024-07-29 Diffusion Feedback Helps CLIP See Better Wenxuan Wang et.al. 2407.20171 link
2024-07-29 DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models Jing Yang et.al. 2407.20141 null
2024-07-29 Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning Liyuan Mao et.al. 2407.20109 null
2024-07-29 On the significance of parameters and the projective level in the Choice and Collection axioms Vladimir Kanovei et.al. 2407.20098 null
2024-07-29 Generative Diffusion Model Bootstraps Zero-shot Classification of Fetal Ultrasound Images In Underrepresented African Populations Fangyijie Wang et.al. 2407.20072 link
2024-07-29 ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning Delyan Boychev et.al. 2407.20020 link
2024-07-29 Reproducibility Study of “ITI-GEN: Inclusive Text-to-Image Generation” Daniel Gallo Fernández et.al. 2407.19996 null
2024-07-29 HeadsetOff: Enabling Photorealistic Video Conferencing on Economical VR Headsets Yili Jin et.al. 2407.19988 null
2024-07-26 Generative Adversarial Networks for Imputing Sparse Learning Performance Liang Zhang et.al. 2407.18875 null
2024-07-26 Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment Yuze Zheng et.al. 2407.18854 null
2024-07-26 Scalable Group Choreography via Variational Phase Manifold Learning Nhat Le et.al. 2407.18839 null
2024-07-26 Revision of calcium and scandium abundances in Am stars based on NLTE calculations and comparison with diffusion stellar evolution models L. I. Mashonkina et.al. 2407.18736 null
2024-07-26 BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation Peng Hao et.al. 2407.18715 null
2024-07-26 Q-gen: A Parameterized Quantum Circuit Generator Yikai Mao et.al. 2407.18697 link
2024-07-26 Adversarial Robustification via Text-to-Image Diffusion Models Daewon Choi et.al. 2407.18658 link
2024-07-26 Robust VAEs via Generating Process of Noise Augmented Data Hiroo Irobe et.al. 2407.18632 null
2024-07-26 Denoising Lévy Probabilistic Models Dario Shariatian et.al. 2407.18609 null
2024-07-26 How To Segment in 3D Using 2D Models: Automated 3D Segmentation of Prostate Cancer Metastatic Lesions on PET Volumes Using Multi-Angle Maximum Intensity Projections and Diffusion Models Amirhosein Toosi et.al. 2407.18555 null
2024-07-25 RegionDrag: Fast Region-Based Image Editing with Diffusion Models Jingyi Lu et.al. 2407.18247 null
2024-07-25 VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads Orest Kupyn et.al. 2407.18245 null
2024-07-25 CodedVO: Coded Visual Odometry Sachin Shah et.al. 2407.18240 null
2024-07-25 SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits Yanyue Xie et.al. 2407.18209 null
2024-07-25 Test2VA: Reusing GUI Test Cases for Voice Assistant Features Development in Mobile Applications Garrett Weaver et.al. 2407.18155 null
2024-07-25 Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images Roberto Di Via et.al. 2407.18125 null
2024-07-25 Keypoint Promptable Re-Identification Vladimir Somers et.al. 2407.18112 link
2024-07-25 SSTD: Stripe-Like Space Target Detection using Single-Point Supervision Zijian Zhu et.al. 2407.18097 null
2024-07-25 Cross-Observatory Coordination with tilepy: A Novel Tool for Observations of Multi-Messenger Transient Events Monica Seglar-Arroyo et.al. 2407.18076 null
2024-07-25 AttentionHand: Text-driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild Junho Park et.al. 2407.18034 link
2024-07-24 SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency Yiming Xie et.al. 2407.17470 null
2024-07-24 BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social Ujun Jeong et.al. 2407.17451 null
2024-07-24 ProvenanceWidgets: A Library of UI Control Elements to Track and Dynamically Overlay Analytic Provenance Arpit Narechania et.al. 2407.17431 link
2024-07-24 CDDIP: Constrained Diffusion-Driven Deep Image Prior for Seismic Image Reconstruction Paul Goyes-Peñafiel et.al. 2407.17402 link
2024-07-24 Cosmic ray susceptibility of the Terahertz Intensity Mapper detector arrays Lun-Jun Liu et.al. 2407.17381 null
2024-07-24 ViPer: Visual Personalization of Generative Models via Individual Preference Learning Sogand Salehi et.al. 2407.17365 null
2024-07-24 Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching Yuyang Ding et.al. 2407.17349 link
2024-07-24 Quantum nonlocal modulation cancellation with distributed clocks Stephen D. Chapman et.al. 2407.17330 null
2024-07-25 Enhanced Deep Learning Methodologies and MRI Selection Techniques for Dementia Diagnosis in the Elderly Population Nikolaos Ntampakis et.al. 2407.17324 null
2024-07-24 Edge-Cloud Continuum Orchestration of Critical Services: A Smart-City Approach Rodrigo Rosmaninho et.al. 2407.17314 null
2024-07-23 Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions Fabio Tosi et.al. 2407.16698 link
2024-07-23 From Imitation to Refinement – Residual RL for Precise Visual Assembly Lars Ankile et.al. 2407.16677 null
2024-07-23 RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent Huiyu Xu et.al. 2407.16667 null
2024-07-23 MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence Canyu Zhao et.al. 2407.16655 null
2024-07-23 Unveiling and Mitigating Bias in Audio Visual Segmentation Peiwen Sun et.al. 2407.16638 null
2024-07-23 Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses Haojun Yu et.al. 2407.16634 null
2024-07-23 GenRec: A Flexible Data Generator for Recommendations Erica Coppolillo et.al. 2407.16594 null
2024-07-23 COALA: A Practical and Vision-Centric Federated Learning Platform Weiming Zhuang et.al. 2407.16560 link
2024-07-23 DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models Zhenyu Xie et.al. 2407.16511 null
2024-07-23 qMRI Diffusor: Quantitative T1 Mapping of the Brain using a Denoising Diffusion Probabilistic Model Shishuai Wang et.al. 2407.16477 null
2024-07-22 Artist: Aesthetically Controllable Text-Driven Stylization without Training Ruixiang Jiang et.al. 2407.15842 link
2024-07-23 A Large-scale Benchmark Dataset for Commuting Origin-destination Matrix Generation Can Rong et.al. 2407.15823 link
2024-07-22 Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget Vikash Sehwag et.al. 2407.15811 null
2024-07-22 Quantum Computing for Phonon Scattering Effects on Thermal Conductivity Xiangjun Tan et.al. 2407.15808 null
2024-07-22 Enhancing Mass Customization Manufacturing: Multiobjective Metaheuristic Algorithms for flow shop Production in Smart Industry Diego Rossit et.al. 2407.15802 null
2024-07-22 Diffusion Model Based Resource Allocation Strategy in Ultra-Reliable Wireless Networked Control Systems Amirhassan Babazadeh Darabi et.al. 2407.15784 null
2024-07-22 A Hamilton-Jacobi approach to road-field reaction-diffusion models Christopher Henderson et.al. 2407.15760 null
2024-07-22 Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond Silvio Galesso et.al. 2407.15739 link
2024-07-22 DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design Zhi Hao Luo et.al. 2407.15723 link
2024-07-22 Estimating Probability Densities with Transformer and Denoising Diffusion Henry W. Leung et.al. 2407.15703 link
2024-07-19 DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks Sarah Jabbour et.al. 2407.14509 null
2024-07-19 On Pre-training of Multimodal Language Models Customized for Chart Understanding Wan-Cyuan Fan et.al. 2407.14506 null
2024-07-19 T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation Kaiyue Sun et.al. 2407.14505 link
2024-07-19 M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models Seunggeun Chi et.al. 2407.14502 null
2024-07-19 A Precision Cryogenic Positioning Stage for Detector Dithering and Flexure Compensation Stephen A. Smee et.al. 2407.14493 null
2024-07-19 Contrastive Learning with Counterfactual Explanations for Radiology Report Generation Mingjie Li et.al. 2407.14474 null
2024-07-19 Describe Data to get Science-Data-Ready Tooling: Awkward as a Target for Kaitai Struct YAML Manasvi Goyal et.al. 2407.14461 null
2024-07-19 Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model Seonghui Min et.al. 2407.14434 null
2024-07-19 Controllable and Efficient Multi-Class Pathology Nuclei Data Augmentation using Text-Conditioned Diffusion Models Hyun-Jic Oh et.al. 2407.14426 null
2024-07-19 GLAudio Listens to the Sound of the Graph Aurelio Sulser et.al. 2407.14387 link
2024-07-18 LogoSticker: Inserting Logos into Diffusion Models for Customized Generation Mingkang Zhu et.al. 2407.13752 null
2024-07-18 Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review Masatoshi Uehara et.al. 2407.13734 link
2024-07-18 Shaded Route Planning Using Active Segmentation and Identification of Satellite Images Longchao Da et.al. 2407.13689 null
2024-07-18 PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers Songlin Li et.al. 2407.13677 null
2024-07-18 MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis Ziming Zhong et.al. 2407.13675 link
2024-07-18 Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models Xiaoyu Zhu et.al. 2407.13642 null
2024-07-18 Training-free Composite Scene Generation for Layout-to-Image Synthesis Jiaqi Liu et.al. 2407.13609 link
2024-07-18 EnergyDiff: Universal Time-Series Energy Data Generation using Diffusion Models Nan Lin et.al. 2407.13538 null
2024-07-18 VeriQR: A Robustness Verification Tool for Quantum Machine Learning Models Yanling Lin et.al. 2407.13533 null
2024-07-18 All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models Charumathi Badrinath et.al. 2407.13449 link
2024-07-17 SMooDi: Stylized Motion Diffusion Model Lei Zhong et.al. 2407.12783 null
2024-07-17 VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control Sherwin Bahmani et.al. 2407.12781 null
2024-07-17 Hallucination Index: An Image Quality Metric for Generative Reconstruction Models Matthew Tivnan et.al. 2407.12780 null
2024-07-17 GroundUp: Rapid Sketch-Based 3D City Massing Gizem Esra Unlu et.al. 2407.12739 null
2024-07-17 EchoSight: Advancing Visual-Language Models with Wiki Knowledge Yibin Yan et.al. 2407.12735 null
2024-07-17 NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model Zhongqun Zhang et.al. 2407.12727 null
2024-07-17 An Evaluation of Continual Learning for Advanced Node Semiconductor Defect Inspection Amit Prasad et.al. 2407.12724 null
2024-07-17 Unlocking planetesimal magnetic field histories: a refined, versatile model for thermal evolution and dynamo generation Hannah R. Sanderson et.al. 2407.12721 null
2024-07-17 SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow Yuanzhi Zhu et.al. 2407.12718 link
2024-07-17 Teleoperation in Robot-assisted MIS with Adaptive RCM via Admittance Control Ehsan Nasiri et.al. 2407.12711 null
2024-07-16 Efficient Training with Denoised Neural Weights Yifan Gong et.al. 2407.11966 null
2024-07-16 UrbanWorld: An Urban World Model for 3D City Generation Yu Shang et.al. 2407.11965 null
2024-07-16 Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design Leo Klarner et.al. 2407.11942 link
2024-07-16 Code Documentation and Analysis to Secure Software Development Paul Attie et.al. 2407.11934 null
2024-07-16 Global Optimisation of Black-Box Functions with Generative Models in the Wasserstein Space Tigran Ramazyan et.al. 2407.11917 link
2024-07-16 Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data Tim Elsner et.al. 2407.11913 null
2024-07-16 Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development Daoyuan Chen et.al. 2407.11784 link
2024-07-16 Diffusion-driven self-assembly of emerin nanodomains at the nuclear envelope Carlos D. Alas et.al. 2407.11758 null
2024-07-16 Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen Alessandro Palma et.al. 2407.11734 link
2024-07-16 Theoretical Insights into CycleGAN: Analyzing Approximation and Estimation Errors in Unpaired Data Generation Luwei Sun et.al. 2407.11678 null
2024-07-15 Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion Yongyuan Liang et.al. 2407.10973 null
2024-07-15 Fast Matrix Multiplications for Lookup Table-Quantized LLMs Han Guo et.al. 2407.10960 link
2024-07-15 InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models Nirat Saini et.al. 2407.10958 null
2024-07-16 DataDream: Few-shot Guided Dataset Generation Jae Myung Kim et.al. 2407.10910 link
2024-07-15 Optical Diffusion Models for Image Generation Ilker Oguz et.al. 2407.10897 null
2024-07-15 R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection Zheyuan Zhou et.al. 2407.10862 null
2024-07-15 Physics-Inspired Generative Models in Medical Imaging: A Review Dennis Hein et.al. 2407.10856 null
2024-07-15 Inferring dark energy properties from the scale factor parametrisation Upala Mukhopadhayay et.al. 2407.10845 null
2024-07-15 MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration Yulin Ren et.al. 2407.10833 null
2024-07-15 Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation Tu Vu et.al. 2407.10817 null
2024-07-12 StyleSplat: 3D Object Style Transfer with Gaussian Splatting Sahil Jain et.al. 2407.09473 null
2024-07-12 FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3 Georgios Makridis et.al. 2407.09467 null
2024-07-12 The $μ\mathcal{G}$ Language for Programming Graph Neural Networks Matteo Belenchia et.al. 2407.09441 null
2024-07-12 Graph Neural Network Causal Explanation via Neural Causal Models Arman Behnam et.al. 2407.09378 link
2024-07-12 Computationally Efficient Estimation of Large Probit Models Patrick Ding et.al. 2407.09371 null
2024-07-12 Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text Lucio La Cava et.al. 2407.09364 null
2024-07-15 Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees Alexia Jolicoeur-Martineau et.al. 2407.09357 link
2024-07-12 PID: Physics-Informed Diffusion Model for Infrared Image Generation Fangyuan Mao et.al. 2407.09299 link
2024-07-12 Learning Distances from Data with Normalizing Flows and Score Matching Peter Sorrenson et.al. 2407.09297 null
2024-07-12 Surgical Text-to-Image Generation Chinedu Innocent Nwoye et.al. 2407.09230 null
2024-07-11 Video Diffusion Alignment via Reward Gradients Mihir Prabhudesai et.al. 2407.08737 link
2024-07-11 Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models Zhening Xing et.al. 2407.08701 null
2024-07-11 FAR-Trans: An Investment Dataset for Financial Asset Recommendation Javier Sanz-Cruzado et.al. 2407.08692 null
2024-07-11 Scattering transforms on the sphere, application to large scale structure modelling Louise Mousset et.al. 2407.08687 null
2024-07-11 CAD-Prompted Generative Models: A Pathway to Feasible and Novel Engineering Designs Leah Chong et.al. 2407.08675 null
2024-07-11 Still-Moving: Customized Video Generation without Customized Video Data Hila Chefer et.al. 2407.08674 null
2024-07-11 Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density Shuangqi Li et.al. 2407.08659 null
2024-07-11 Adaptive Smooth Non-Stationary Bandits Joe Suk et.al. 2407.08654 null
2024-07-11 Fine-Tuning Stable Diffusion XL for Stylistic Icon Generation: A Comparison of Caption Size Youssef Sultan et.al. 2407.08513 null
2024-07-11 Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Model Yuxing Tian et.al. 2407.08500 null
2024-07-10 Generative Image as Action Models Mohit Shridhar et.al. 2407.07875 link
2024-07-10 Dynamical Measure Transport and Neural PDE Solvers for Sampling Jingtong Sun et.al. 2407.07873 null
2024-07-10 Controlling Space and Time with Diffusion Models Daniel Watson et.al. 2407.07860 null
2024-07-10 Generic Numerical Analysis of Stochastic Reaction Diffusion Model with applications in excitable media Yahya Alnashri et.al. 2407.07834 null
2024-07-10 Universal and non-universal signatures in the scaling functions of critical variables Gianluca Teza et.al. 2407.07782 null
2024-07-10 Towards Human-Like Driving: Active Inference in Autonomous Vehicle Control Elahe Delavari et.al. 2407.07684 null
2024-07-10 VEnhancer: Generative Space-Time Enhancement for Video Generation Jingwen He et.al. 2407.07667 null
2024-07-10 A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry Martin Lindström et.al. 2407.07664 link
2024-07-10 The heterogeneous impact of the EU-Canada agreement with causal machine learning Lionel Fontagné et.al. 2407.07652 null
2024-07-11 MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis Wanggui He et.al. 2407.07614 link
2024-07-09 ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction Shaozhe Hao et.al. 2407.07077 link
2024-07-09 Latent Space Imaging Matheus Souza et.al. 2407.07052 null
2024-07-09 Generative models of astrophysical fields with scattering transforms on the sphere Louise Mousset et.al. 2407.07007 link
2024-07-10 PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods Yiying Wang et.al. 2407.06985 link
2024-07-09 Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach Taolin Zhang et.al. 2407.06964 null
2024-07-09 RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models Bowen Zhang et.al. 2407.06938 null
2024-07-09 HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance Guian Fang et.al. 2407.06937 link
2024-07-09 Fine-grained large-scale content recommendations for MSX sellers Manpreet Singh et.al. 2407.06910 null
2024-07-09 Enhanced Battery Degradation-Aware Scheduling for Distribution Network with Electric Vehicle Load Vijay Babu Pamshetti et.al. 2407.06857 null
2024-07-09 A reaction-diffusion model for relapsing-remitting multiple sclerosis with a treatment term Romina Travaglini et.al. 2407.06802 null
2024-07-08 Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images Zhangyang Qi et.al. 2407.06191 null
2024-07-08 CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation Xinying Guo et.al. 2407.06188 null
2024-07-08 JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation Yu Zeng et.al. 2407.06187 null
2024-07-08 The Tug-of-War Between Deepfake Generation and Detection Hannah Lee et.al. 2407.06174 null
2024-07-08 ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation Ethan Chern et.al. 2407.06135 link
2024-07-08 Structured Generations: Using Hierarchical Clusters to guide Diffusion Models Jorge da Silva Goncalves et.al. 2407.06124 link
2024-07-08 PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models Jinhua Zhang et.al. 2407.06109 link
2024-07-08 Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation Xinyu Bai et.al. 2407.06095 null
2024-07-08 Assessing Cardiomegaly in Dogs Using a Simple CNN Model Nikhil Deekonda et.al. 2407.06092 null
2024-07-08 Layered Diffusion Model for One-Shot High Resolution Text-to-Image Synthesis Emaad Khwaja et.al. 2407.06079 null
2024-07-05 RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation Yuxuan Kuang et.al. 2407.04689 link
2024-07-05 Thermal and mechanical study of a parametrised cryostat model for optical characterisation of upcoming CMB experiments Thomas J. L. J. Gascard et.al. 2407.04613 link
2024-07-08 PartCraft: Crafting Creative Objects by Parts Kam Woh Ng et.al. 2407.04604 link
2024-07-05 Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates Ryotaro Okabe et.al. 2407.04557 null
2024-07-05 Unified continuous-time q-learning for mean-field game and mean-field control problems Xiaoli Wei et.al. 2407.04521 null
2024-07-08 Speed-accuracy trade-off for the diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport Kotaro Ikeda et.al. 2407.04495 null
2024-07-05 PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation Yinghua Yao et.al. 2407.04493 null
2024-07-05 Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model Duy M. H. Nguyen et.al. 2407.04489 null
2024-07-05 Leveraging Graph Structures to Detect Hallucinations in Large Language Models Noa Nonkes et.al. 2407.04485 link
2024-07-05 VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing Shang Liu et.al. 2407.04461 null
2024-07-03 DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents Yilun Xu et.al. 2407.03300 link
2024-07-03 Improved Noise Schedule for Diffusion Training Tiankai Hang et.al. 2407.03297 null
2024-07-03 Anomaly-based Framework for Detecting Power Overloading Cyberattacks in Smart Grid AMI Abdelaziz Amara Korba et.al. 2407.03264 null
2024-07-03 SOS! Soft Prompt Attack Against Open-Source Large Language Models Ziqing Yang et.al. 2407.03160 null
2024-07-04 Spatio-Temporal Adaptive Diffusion Models for EEG Super-Resolution in Epilepsy Diagnosis Tong Zhou et.al. 2407.03089 null
2024-07-03 Artificial Inductive Bias for Synthetic Tabular Data Generation in Data-Scarce Scenarios Patricia A. Apellániz et.al. 2407.03080 null
2024-07-03 Electromagnetic Property Sensing Based on Diffusion Model in ISAC System Yuhua Jiang et.al. 2407.03075 null
2024-07-03 Semantic-Aware Power Allocation for Generative Semantic Communications with Foundation Models Chunmei Xu et.al. 2407.03050 null
2024-07-03 SlerpFace: Face Template Protection via Spherical Linear Interpolation Zhizhou Zhong et.al. 2407.03043 null
2024-07-03 An Organism Starts with a Single Pix-Cell: A Neural Cellular Diffusion for High-Resolution Image Synthesis Marawan Elbatel et.al. 2407.03018 link
2024-07-02 Magic Insert: Style-Aware Drag-and-Drop Nataniel Ruiz et.al. 2407.02489 null
2024-07-02 Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models Fei Shen et.al. 2407.02482 link
2024-07-02 A Pattern Language for Machine Learning Tasks Benjamin Rodatz et.al. 2407.02424 null
2024-07-02 GCF: Graph Convolutional Networks for Facial Expression Recognition Hozaifa Kassab et.al. 2407.02361 null
2024-07-02 MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space Yihong Tang et.al. 2407.02345 null
2024-07-02 Choice-based time slot management in attended home delivery Dorsa Abdolhamidi et.al. 2407.02339 null
2024-07-02 Mining Constraints from Reference Process Models for Detecting Best-Practice Violations in Event Log Adrian Rebmann et.al. 2407.02336 link
2024-07-02 A tactical time slot management problem under mixed logit demand Dorsa Abdolhamidi et.al. 2407.02308 null
2024-07-02 Renard: A Modular Pipeline for Extracting Character Networks from Narrative Texts Arthur Amalvy et.al. 2407.02284 null
2024-07-03 Federated Distillation for Medical Image Classification: Towards Trustworthy Computer-Aided Diagnosis Sufen Ren et.al. 2407.02261 null
2024-06-28 Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language Yicheng Chen et.al. 2406.20085 null
2024-06-28 The hybrid Josephson rhombus: A superconducting element with tailored current-phase relation L. Banszerus et.al. 2406.20082 null
2024-06-28 HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model Hieu T. Nguyen et.al. 2406.20077 null
2024-06-28 Modeling and LQR Control of Insect Sized Flapping Wing Robot Daksh Dhingra et.al. 2406.20061 null
2024-06-28 Neural Differentiable Modeling with Diffusion-Based Super-resolution for Two-Dimensional Spatiotemporal Turbulence Xiantao Fan et.al. 2406.20047 null
2024-06-28 Electrostatics-based particle sampling and approximate inference Yongchao Huang et.al. 2406.20044 link
2024-06-28 HAITCH: A Framework for Distortion and Motion Correction in Fetal Multi-Shell Diffusion-Weighted MRI Haykel Snoussi et.al. 2406.20042 null
2024-06-28 Concept Lens: Visually Analyzing the Consistency of Semantic Manipulation in GANs Sangwon Jeong et.al. 2406.19987 null
2024-07-01 Text2Robot: Evolutionary Robot Design from Text Descriptions Ryan P. Ringel et.al. 2406.19963 null
2024-06-28 Kolmogorov-Smirnov GAN Maciej Falkiewicz et.al. 2406.19948 link
2024-06-27 Looking 3D: Anomaly Detection with 2D-3D Alignment Ankan Bhunia et.al. 2406.19393 link
2024-06-27 Taming Data and Transformers for Audio Generation Moayed Haji-Ali et.al. 2406.19388 null
2024-06-27 Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space Core Francisco Park et.al. 2406.19370 null
2024-06-27 Accelerating Multiphase Flow Simulations with Denoising Diffusion Model Driven Initializations Jaehong Chung et.al. 2406.19333 null
2024-06-27 Subtractive Training for Music Stem Insertion using Latent Diffusion Models Ivan Villa-Renteria et.al. 2406.19328 null
2024-06-27 Efficient World Models with Context-Aware Tokenization Vincent Micheli et.al. 2406.19320 link
2024-06-27 PNeRV: A Polynomial Neural Representation for Videos Sonam Gupta et.al. 2406.19299 null
2024-06-27 Compositional Image Decomposition with Diffusion Models Jocelin Su et.al. 2406.19298 null
2024-06-27 BISeizuRe: BERT-Inspired Seizure Data Representation to Improve Epilepsy Monitoring Luca Benfenati et.al. 2406.19189 null
2024-06-27 On Pólya-Young urn models and growth processes Markus Kuba et.al. 2406.19110 null
2024-06-26 MatchTime: Towards Automatic Soccer Game Commentary Generation Jiayuan Rao et.al. 2406.18530 link
2024-06-26 MultiDiff: Consistent Novel View Synthesis from a Single Image Norman Müller et.al. 2406.18524 null
2024-06-26 Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration Kang Liao et.al. 2406.18516 link
2024-06-26 DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance Younghyun Kim et.al. 2406.18459 link
2024-06-26 Cascading Large Language Models for Salient Event Graph Generation Xingwei Tan et.al. 2406.18449 link
2024-06-26 Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling Abril Corona-Figueroa et.al. 2406.18422 link
2024-06-26 Towards diffusion models for large-scale sea-ice modelling Tobias Sebastian Finn et.al. 2406.18417 null
2024-06-27 Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process Tianyu Lin et.al. 2406.18361 link
2024-06-26 Molecular Diffusion Models with Virtual Receptors Matan Halfon et.al. 2406.18330 null
2024-06-27 Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems Italo Luis da Silva et.al. 2406.18245 link
2024-06-25 DiffusionPDE: Generative PDE-Solving Under Partial Observation Jiahe Huang et.al. 2406.17763 link
2024-06-25 MotionBooth: Motion-Aware Customized Text-to-Video Generation Jianzong Wu et.al. 2406.17758 null
2024-06-25 Accelerating Clinical Evidence Synthesis with Large Language Models Zifeng Wang et.al. 2406.17755 null
2024-06-25 Extensions of Panjer’s recursion for mixed compound distributions Spyridon M. Tzaninis et.al. 2406.17726 null
2024-06-25 PANDA: A self-driving lab for studying electrodeposited polymer films Harley Quinn et.al. 2406.17725 null
2024-06-25 Unified Auto-Encoding with Masked Diffusion Philippe Hansen-Estruch et.al. 2406.17688 link
2024-06-25 LaTable: Towards Large Tabular Models Boris van Breugel et.al. 2406.17673 null
2024-06-26 SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond Marco Comunità et.al. 2406.17672 null
2024-06-25 Banishing LLM Hallucinations Requires Rethinking Generalization Johnny Li et.al. 2406.17642 null
2024-06-25 The experience of humans’ and robots’ mutual (im)politeness in enacted service scenarios: An empirical study Victor Kaptelinin et.al. 2406.17641 null
2024-06-24 FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models Haonan Qiu et.al. 2406.16863 link
2024-06-24 Dreamitate: Real-World Visuomotor Policy Learning via Video Generation Junbang Liang et.al. 2406.16862 null
2024-06-24 DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Yuang Peng et.al. 2406.16855 link
2024-06-24 USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations Mounika Marreddy et.al. 2406.16833 null
2024-06-24 General Binding Affinity Guidance for Diffusion Models in Structure-Based Drug Design Yue Jian et.al. 2406.16821 null
2024-06-24 ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians Yufei Liu et.al. 2406.16815 null
2024-06-24 Conformal time series decomposition with component-wise exchangeability Derck W. E. Prinzhorn et.al. 2406.16766 link
2024-06-24 Inferring stochastic low-rank recurrent neural networks from neural data Matthijs Pals et.al. 2406.16749 link
2024-06-24 Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image Jinkun Hao et.al. 2406.16710 null
2024-06-24 Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling Min-Seop Kwak et.al. 2406.16695 null
2024-06-21 Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild Nadav Orzech et.al. 2406.15331 null
2024-06-21 Rethinking Remote Sensing Change Detection With A Mask View Xiaowen Ma et.al. 2406.15320 link
2024-06-21 You Only Acquire Sparse-channel (YOAS): A Unified Framework for Dense-channel EEG Generation Hongyu Chen et.al. 2406.15269 null
2024-06-21 Evaluating Diversity in Automatic Poetry Generation Yanran Chen et.al. 2406.15267 null
2024-06-21 Fingerprint Membership and Identity Inference Against Generative Adversarial Networks Saverio Cavasin et.al. 2406.15253 null
2024-06-21 MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation Xuan He et.al. 2406.15252 null
2024-06-21 Unsupervised Bayesian Generation of Synthetic CT from CBCT Using Patient-Specific Score-Based Prior Junbo Peng et.al. 2406.15219 null
2024-06-21 Sound and Fury, Signifying Nothing? Impact of Data Breach Disclosure Laws Muhammad Zia Hydari et.al. 2406.15215 null
2024-06-21 Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors Ali Naseh et.al. 2406.15213 null
2024-06-21 Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms Santiago Berrezueta-Guzman et.al. 2406.15198 null
2024-06-20 A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models Xincheng Shuai et.al. 2406.14555 link
2024-06-21 Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation Eyal Michaeli et.al. 2406.14551 link
2024-06-20 Consistency Models Made Easy Zhengyang Geng et.al. 2406.14548 link
2024-06-20 IRASim: Learning Interactive Real-Robot Action Simulators Fangqi Zhu et.al. 2406.14540 null
2024-06-20 Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps Nikita Starodubcev et.al. 2406.14539 null
2024-06-20 Fantastic Copyrighted Beasts and How (Not) to Generate Them Luxi He et.al. 2406.14526 null
2024-06-20 Photoacoustic methane detection assisted by a gas-filled anti-resonant hollow-core fiber laser Cuiling Zhang et.al. 2406.14521 null
2024-06-20 V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data Rotem Shalev-Arkushin et.al. 2406.14510 null
2024-06-20 CodeRAG-Bench: Can Retrieval Augment Code Generation? Zora Zhiruo Wang et.al. 2406.14497 link
2024-06-20 SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset Josef Dai et.al. 2406.14477 link
2024-06-20 CollaFuse: Collaborative Diffusion Models Simeon Allmendinger et.al. 2406.14429 link
2024-06-20 Active Diffusion Subsampling Oisin Nolan et.al. 2406.14388 link
2024-06-20 Multicoloured Hardcore Model: Fast Mixing and Queueing Sam Olesker-Taylor et.al. 2406.14376 null
2024-06-20 FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability Md Fahim Sikder et.al. 2406.14281 link
2024-06-20 In Tree Structure Should Sentence Be Generated Yaguang Li et.al. 2406.14189 link
2024-06-20 CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation Tingwei Liu et.al. 2406.14186 link
2024-06-20 Tractable Equilibrium Computation in Markov Games through Risk Aversion Eric Mazumdar et.al. 2406.14156 null
2024-06-20 ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning Zhongjie Duan et.al. 2406.14130 link
2024-06-20 Dye4AI: Assuring Data Boundary on Generative AI Services Shu Wang et.al. 2406.14114 null
2024-06-20 HeartBeat: Towards Controllable Echocardiography Video Synthesis with Multimodal Conditions-Guided Diffusion Models Xinrui Zhou et.al. 2406.14098 null
2024-06-20 Bridging bulk and surface: An interacting particle system towards the field-road diffusion model Matthieu Alfaro et.al. 2406.14093 null
2024-06-20 A Practical Diffusion Path for Sampling Omar Chehab et.al. 2406.14040 null
2024-06-20 Leveraging eBPF and AI for Ransomware Nose Out Arjun Sekar et.al. 2406.14020 null
2024-06-20 Feature Fusion Based on Mutual-Cross-Attention Mechanism for EEG Emotion Recognition Yimin Zhao et.al. 2406.14014 link
2024-06-20 Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs Mahammed Kamruzzaman et.al. 2406.13993 null
2024-06-20 The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging Georgi Ganev et.al. 2406.13985 link
2024-06-20 Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning Tingyi Lin et.al. 2406.13977 null
2024-06-20 Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models Yuan Zhong et.al. 2406.13942 null
2024-06-20 EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations Jie Ren et.al. 2406.13933 null
2024-06-20 Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions Hamdireza Rouzegar et.al. 2406.13903 null
2024-06-19 INFusion: Diffusion Regularized Implicit Neural Representations for 2D and 3D accelerated MRI reconstruction Yamin Arefeen et.al. 2406.13895 null
2024-06-19 Open Generative Large Language Models for Galician Pablo Gamallo et.al. 2406.13893 null
2024-06-19 StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation Davit Abrahamyan et.al. 2406.13840 link
2024-06-19 RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design Rishabh Anand et.al. 2406.13839 link
2024-06-19 COAC: Cross-layer Optimization of Accelerator Configurability for Efficient CNN Processing Steven Colleman et.al. 2406.13752 null
2024-06-19 GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation Baiqi Li et.al. 2406.13743 link
2024-06-19 Tree-Sliced Wasserstein Distance on a System of Lines Viet-Hoang Tran et.al. 2406.13725 null
2024-06-19 Hitchhiker’s guide on Energy-Based Models: a comprehensive review on the relation with other generative models, sampling and statistical physics Davide Carbone et.al. 2406.13661 null
2024-06-19 Towards Minimal Targeted Updates of Language Models with Targeted Negative Training Lily H. Zhang et.al. 2406.13660 link
2024-06-19 Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics Weitong Zhang et.al. 2406.13652 null
2024-06-19 On AI-Inspired UI-Design Jialiang Wei et.al. 2406.13631 null
2024-06-19 Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy Elena Tomasi et.al. 2406.13627 link
2024-06-19 Enhance the Image: Super Resolution using Artificial Intelligence in MRI Ziyu Li et.al. 2406.13625 null
2024-06-19 Generative Modeling by Minimizing the Wasserstein-2 Loss Yu-Jui Huang et.al. 2406.13619 null
2024-06-19 Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks Liangxin Qian et.al. 2406.13602 null
2024-06-19 ModSec-Learn: Boosting ModSecurity with Machine Learning Christian Scano et.al. 2406.13547 link
2024-06-19 Towards Cyber Threat Intelligence for the IoT Alfonso Iacovazzi et.al. 2406.13543 null
2024-06-19 Image Distillation for Safe Data Sharing in Histopathology Zhe Li et.al. 2406.13536 link
2024-06-19 Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement Chenda Li et.al. 2406.13471 null
2024-06-19 Unifying nonlinearly constrained nonconvex optimization Charlie Vanaret et.al. 2406.13454 link
2024-06-19 Federating to Grow Transformers with Constrained Resources without Model Sharing Shikun Shen et.al. 2406.13450 null
2024-06-19 Multi-messenger modeling of the Monogem pulsar halo Youyou Li et.al. 2406.13426 null
2024-06-19 Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images Haruo Fujiwara et.al. 2406.13393 null
2024-06-19 Effective Edge-wise Representation Learning in Edge-Attributed Bipartite Graphs Hewen Wang et.al. 2406.13369 null
2024-06-19 Situational Instructions Database: Task Guidance in Dynamic Environments Muhammad Saif Ullah Khan et.al. 2406.13302 link
2024-06-19 ARDuP: Active Region Video Diffusion for Universal Policies Shuaiyi Huang et.al. 2406.13301 null
2024-06-19 AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models Ken Chen et.al. 2406.13272 null
2024-06-19 Self-Supervised Diffusion Model for 3-D Seismic Data Reconstruction Xinyang Wang et.al. 2406.13252 null
2024-06-19 Optimizing Inventory Management through Multiobjective Reverse Logistics with Environmental Impact I. B. Wadhawan et.al. 2406.13226 null
2024-06-19 Neural Residual Diffusion Models for Deep Scalable Vision Generation Zhiyuan Ma et.al. 2406.13215 null
2024-06-19 Surgical Triplet Recognition via Diffusion Model Daochang Liu et.al. 2406.13210 null
2024-06-19 Diffusion Model-based FOD Restoration from High Distortion in dMRI Shuo Huang et.al. 2406.13209 null
2024-06-19 Toward Structure Fairness in Dynamic Graph Embedding: A Trend-aware Dual Debiasing Approach Yicong Li et.al. 2406.13201 link
2024-06-19 Synthetic Context Generation for Question Generation Naiming Liu et.al. 2406.13188 null
2024-06-19 Conditional score-based diffusion models for solving inverse problems in mechanics Agnimitra Dasgupta et.al. 2406.13154 null
2024-06-19 von Mises Quasi-Processes for Bayesian Circular Regression Yarden Cohen et.al. 2406.13151 null
2024-06-19 MCAD: Multi-modal Conditioned Adversarial Diffusion Model for High-Quality PET Image Reconstruction Jiaqi Cui et.al. 2406.13150 null
2024-06-19 GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement Hao Wang et.al. 2406.13136 null
2024-06-19 Thruster-Assisted Incline Walking Kaushik Venkatesh Krishnamurthy et.al. 2406.13118 null
2024-06-18 Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models Paul Henderson et.al. 2406.13099 null
2024-06-18 RITA: A Real-time Interactive Talking Avatars Framework Wuxinlin Cheng et.al. 2406.13093 null
2024-06-18 PIPPIN: Generating variable length full events from partons Guillaume Quétant et.al. 2406.13074 link
2024-06-18 MaskPure: Improving Defense Against Text Adversaries with Stochastic Purification Harrison Gietz et.al. 2406.13066 link
2024-06-18 Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach Zilin Bian et.al. 2406.13038 null
2024-06-18 Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities Matthew T. C. Li et.al. 2406.13036 null
2024-06-18 Data Plagiarism Index: Characterizing the Privacy Risk of Data-Copying in Tabular Generative Models Joshua Ward et.al. 2406.13012 null
2024-06-18 Synergizing Foundation Models and Federated Learning: A Survey Shenghui Li et.al. 2406.12844 null
2024-06-18 Evaluating the design space of diffusion-based generative models Yuqing Wang et.al. 2406.12839 null
2024-06-18 Neural Approximate Mirror Maps for Constrained Diffusion Models Berthy T. Feng et.al. 2406.12816 null
2024-06-19 AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation Xinyu Hou et.al. 2406.12805 link
2024-06-18 Extracting Training Data from Unconditional Diffusion Models Yunhao Chen et.al. 2406.12752 null
2024-06-18 Useful stochastic bounds in time-varying queues with service and patience times having general joint distribution Shreehari Anand Bodas et.al. 2406.12745 null
2024-06-18 SUPER: Selfie Undistortion and Head Pose Editing with Identity Preservation Polina Karpikova et.al. 2406.12700 null
2024-06-18 Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation Miseul Kim et.al. 2406.12688 null
2024-06-18 GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models Yongtao Ge et.al. 2406.12671 link
2024-06-18 Research and Implementation of Data Enhancement Techniques for Graph Neural Networks Jingzhao Gu et.al. 2406.12640 null
2024-06-18 News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation Andreea Iana et.al. 2406.12634 link
2024-06-18 Learning Diffusion at Lightspeed Antonio Terpin et.al. 2406.12616 null
2024-06-18 Unmasking the Veil: An Investigation into Concept Ablation for Privacy and Copyright Protection in Images Shivank Garg et.al. 2406.12592 link
2024-06-18 Behavior-Dependent Linear Recurrent Units for Efficient Sequential Recommendation Chengkai Liu et.al. 2406.12580 link
2024-06-18 Training Diffusion Models with Federated Learning Matthijs de Goede et.al. 2406.12575 null
2024-06-18 P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts Yuhao Dan et.al. 2406.12548 null
2024-06-18 Structured Detection for Simultaneous Super-Resolution and Optical Sectioning in Laser Scanning Microscopy Alessandro Zunino et.al. 2406.12542 link
2024-06-18 Variational Distillation of Diffusion Policies into Mixture of Experts Hongyi Zhou et.al. 2406.12538 null
2024-06-18 HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors Panwang Pan et.al. 2406.12459 link
2024-06-18 Planning Using Schrödinger Bridge Diffusion Models Adarsh Srivastava et.al. 2406.12458 link
2024-06-18 Deep Temporal Deaggregation: Large-Scale Spatio-Temporal Generative Models David Bergström et.al. 2406.12423 null
2024-06-18 ROVER: RTL Optimization via Verified E-Graph Rewriting Samuel Coward et.al. 2406.12421 null
2024-06-18 TADM: Temporally-Aware Diffusion Model for Neurodegenerative Progression on Brain MRI Mattia Litrico et.al. 2406.12411 null
2024-06-18 SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions Yuexiong Ding et.al. 2406.12395 null

Vision-Language Models

Publish Date Title Authors PDF Code
2024-09-18 Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution Peng Wang et.al. 2409.12191 link
2024-09-18 All-in-one foundational models learning across quantum chemical levels Yuxinxin Chen et.al. 2409.12015 link
2024-09-18 LMMCoDrive: Cooperative Driving with Large Multimodal Model Haichao Liu et.al. 2409.11981 null
2024-09-16 MusicLIME: Explainable Multimodal Music Understanding Theodoros Sotirou et.al. 2409.10496 link
2024-09-19 IRIS: Interactive Responsive Intelligent Segmentation for 3D Affordance Analysis Meng Chu et.al. 2409.10078 null
2024-09-16 AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing Huawei Ji et.al. 2409.10016 link
2024-09-14 Keypoints-Integrated Instruction-Following Data Generation for Enhanced Human Pose Understanding in Multimodal Models Dewen Zhang et.al. 2409.09306 null
2024-09-13 Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing Minh-Duc Vu et.al. 2409.08885 null
2024-09-13 A Multimodal Approach for Fluid Overload Prediction: Integrating Lung Ultrasound and Clinical Data Tianqi Yang et.al. 2409.08790 null
2024-09-13 Dynamics of Collective Group Affect: Group-level Annotations and the Multimodal Modeling of Convergence and Divergence Navin Raj Prabhu et.al. 2409.08578 null
2024-09-13 A Comprehensive Survey on Deep Multimodal Learning with Missing Modality Renjie Wu et.al. 2409.07825 null
2024-09-12 Top-down Activity Representation Learning for Video Question Answering Yanan Wang et.al. 2409.07748 null
2024-09-11 What to align in multimodal contrastive learning? Benoit Dufumier et.al. 2409.07402 null
2024-09-11 MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis Hanyu Jiang et.al. 2409.07129 null
2024-09-11 FSMDet: Vision-guided feature diffusion for fully sparse 3D detector Tianran Liu et.al. 2409.06945 null
2024-09-16 Scaling Law Hypothesis for Multimodal Model Qingyun Sun et.al. 2409.06754 null
2024-09-10 Multiclass Arrhythmia Classification using Smartwatch Photoplethysmography Signals Collected in Real-life Settings Dong Han et.al. 2409.06147 null
2024-09-11 A Survey of Multimodal Composite Editing and Retrieval Suyan Li et.al. 2409.05405 link
2024-09-05 Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis Xianbing Zhao et.al. 2409.04473 null
2024-09-06 Generating Faithful and Salient Text from Multimodal Data Tahsina Hashem et.al. 2409.03961 link
2024-09-06 CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models Wentao Liu et.al. 2409.02834 null
2024-09-10 MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark Xiang Yue et.al. 2409.02813 null
2024-09-04 Understanding eGFR Trajectories and Kidney Function Decline via Large Multimodal Models Chih-Yuan Li et.al. 2409.02530 null
2024-09-03 Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models Bin Fu et.al. 2409.01560 null
2024-09-03 Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition Yaozong Gan et.al. 2409.01534 null
2024-09-02 Towards General Industrial Intelligence: A Survey on IIoT-Enhanced Continual Large Models Jiao Chen et.al. 2409.01207 null
2024-09-02 Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information Yi Chen et.al. 2409.01179 null
2024-08-31 Comparative Analysis of Modality Fusion Approaches for Audio-Visual Person Identification and Verification Aref Farhadipour et.al. 2409.00562 null
2024-08-30 UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios Baichuan Zhou et.al. 2408.17267 null
2024-08-29 Seeking the Sufficiency and Necessity Causal Features in Multimodal Representation Learning Boyu Chen et.al. 2408.16577 null
2024-08-29 Toward Robust Early Detection of Alzheimer’s Disease via an Integrated Multimodal Learning Approach Yifei Chen et.al. 2408.16343 link
2024-08-28 Meta-Learn Unimodal Signals with Weak Supervision for Multimodal Sentiment Analysis Sijie Mai et.al. 2408.16029 null
2024-08-28 ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation Tiantian Feng et.al. 2408.15803 null
2024-08-28 Visual Prompt Engineering for Medical Vision Language Models in Radiology Stefan Denner et.al. 2408.15802 null
2024-08-27 X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation Hanjia Lyu et.al. 2408.15172 null
2024-08-27 The Benefits of Balance: From Information Projections to Variance Reduction Lang Liu et.al. 2408.15065 null
2024-08-27 NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework Shuangchen Zhao et.al. 2408.14950 null
2024-08-26 MMR: Evaluating Reading Ability of Large Multimodal Models Jian Chen et.al. 2408.14594 null
2024-09-03 Foundation Models for Music: A Survey Yinghao Ma et.al. 2408.14340 link
2024-08-26 LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models Qihang Ge et.al. 2408.14008 null
2024-08-27 Quantum Multimodal Contrastive Learning Framework Chi-Sheng Chen et.al. 2408.13919 null
2024-08-25 Tangram: A Challenging Benchmark for Geometric Element Recognizing Jiamin Tang et.al. 2408.13854 null
2024-08-25 Multimodal Ensemble with Conditional Feature Fusion for Dysgraphia Diagnosis in Children from Handwriting Samples Jayakanth Kunhoth et.al. 2408.13754 null
2024-08-24 Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models Sakhinana Sagar Srinivas et.al. 2408.13621 null
2024-08-23 Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption Sakhinana Sagar Srinivas et.al. 2408.13248 null
2024-08-23 Indoor scene recognition from images under visual corruptions Willams de Lima Costa et.al. 2408.13029 null
2024-08-23 Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition Cam-Van Thi Nguyen et.al. 2408.12895 null
2024-08-23 Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey Qika Lin et.al. 2408.12880 link
2024-08-22 Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models Jean Park et.al. 2408.12763 null
2024-08-22 Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization Luyao Cheng et.al. 2408.12102 null
2024-08-22 Mental-Perceiver: Audio-Textual Multimodal Learning for Mental Health Assessment Jinghui Qin et.al. 2408.12088 null
2024-08-21 GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models Jonathan Roberts et.al. 2408.11817 null
2024-08-21 D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models M. Forlini et.al. 2408.11761 null
2024-08-21 UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation Xiangyu Zhao et.al. 2408.11305 link
2024-08-21 BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation Haotian Peng et.al. 2408.11281 link
2024-08-20 Exploring the use of Generative AI to Support Automated Just-in-Time Programming for Visual Scene Displays Cynthia Zastudil et.al. 2408.11137 null
2024-08-21 SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition Zebang Cheng et.al. 2408.10500 link
2024-08-19 Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting Yun-Da Tsai et.al. 2408.09798 null
2024-08-19 Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation Yunxin Li et.al. 2408.09787 link
2024-08-18 PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding Dawei Dai et.al. 2408.09530 link
2024-08-17 Measuring Visual Sycophancy in Multimodal Models Jaehyuk Lim et.al. 2408.09111 null
2024-08-16 AdaRank: Disagreement Based Module Rank Prediction for Low-rank Adaptation Yihe Dong et.al. 2408.09015 link
2024-08-16 xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Le Xue et.al. 2408.08872 null
2024-08-16 Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs Jinming Liu et.al. 2408.08575 null
2024-08-15 LLaVA-Surg: Towards Multimodal Surgical Assistant via Structured Surgical Video Learning Jiajie Li et.al. 2408.07981 null
2024-08-15 MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark Minxuan Zhou et.al. 2408.07543 link
2024-08-14 Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach Muhammad Saad Saeed et.al. 2408.07445 null
2024-08-14 Robust Semi-supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration Xiaogen Zhon et.al. 2408.07341 link
2024-08-14 Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion Peiyuan Chen et.al. 2408.07303 null
2024-08-13 PathInsight: Instruction Tuning of Multimodal Datasets and Models for Intelligence Assisted Diagnosis in Histopathology Xiaomin Wu et.al. 2408.07037 null
2024-08-13 EditScribe: Non-Visual Image Editing with Natural Language Verification Loops Ruei-Che Chang et.al. 2408.06632 null
2024-08-13 CROME: Cross-Modal Adapters for Efficient Multimodal LLM Sayna Ebrahimi et.al. 2408.06610 null
2024-08-13 Prioritizing Modalities: Flexible Importance Scheduling in Federated Multimodal Learning Jieming Bian et.al. 2408.06549 null
2024-08-12 VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents Xiao Liu et.al. 2408.06327 link
2024-08-11 HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes Xuanyu Su et.al. 2408.05794 null
2024-08-08 Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles using LLMs and LMMs Aliki Anagnostopoulou et.al. 2408.04331 null
2024-08-06 LLaVA-OneVision: Easy Visual Task Transfer Bo Li et.al. 2408.03326 link
2024-08-06 Multitask and Multimodal Neural Tuning for Large Models Hao Sun et.al. 2408.03001 null
2024-08-06 Body of Her: A Preliminary Study on End-to-End Humanoid Agent Tenglong Ao et.al. 2408.02879 null
2024-08-04 Distribution-Level Memory Recall for Continual Learning: Preserving Knowledge and Avoiding Confusion Shaoxu Cheng et.al. 2408.02695 null
2024-08-02 A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications Valerio Guarrasi et.al. 2408.02686 null
2024-08-05 REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models Agneet Chatterjee et.al. 2408.02231 null
2024-08-04 CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization Xiang He et.al. 2408.01952 link
2024-08-02 MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models Benno Weck et.al. 2408.01337 link
2024-08-05 Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions Jin Gao et.al. 2408.01091 link
2024-08-02 GraphAge: Unleashing the power of Graph Neural Network to Decode Epigenetic Aging Saleh Sakib Ahmed et.al. 2408.00984 link
2024-08-01 MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities Weihao Yu et.al. 2408.00765 link
2024-08-01 GalleryGPT: Analyzing Paintings with Large Multimodal Models Yi Bin et.al. 2408.00491 link
2024-08-01 Everything We Hear: Towards Tackling Misinformation in Podcasts Sachin Pathiyan Cherumanal et.al. 2408.00292 null
2024-08-01 OmniParser for Pure Vision Based GUI Agent Yadong Lu et.al. 2408.00203 null
2024-07-30 Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection Jinfa Huang et.al. 2407.21004 null
2024-07-30 HyperMM: Robust Multimodal Learning with Varying-sized Inputs Hava Chaptoukaev et.al. 2407.20768 null
2024-07-30 Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos Dhruv Verma et.al. 2407.20642 link
2024-07-29 Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter Chao Liu et.al. 2407.19981 null
2024-07-29 ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2 Wenjun Huang et.al. 2407.19832 null
2024-08-02 XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training Biao Wu et.al. 2407.19546 link
2024-07-28 Detached and Interactive Multimodal Learning Yunfeng Fan et.al. 2407.19514 link
2024-07-27 Data Processing Techniques for Modern Multimodal Models Yinheng Li et.al. 2407.19180 null
2024-07-26 MangaUB: A Manga Understanding Benchmark for Large Multimodal Models Hikaru Ikuta et.al. 2407.19034 null
2024-07-26 Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment Yuze Zheng et.al. 2407.18854 null
2024-07-26 ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema Fei Wang et.al. 2407.18716 null
2024-07-25 Sparse vs Contiguous Adversarial Pixel Perturbations in Multimodal Models: An Empirical Analysis Cristian-Alexandru Botocan et.al. 2407.18251 link
2024-07-25 $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs Vlad Sobal et.al. 2407.18134 null
2024-07-25 Cross-Vendor Reproducibility of Radiomics-based Machine Learning Models for Computer-aided Diagnosis Jatin Chaudhary et.al. 2407.18060 null
2024-07-25 What does Kiki look like? Cross-modal associations between speech sounds and visual shapes in vision-and-language models Tessa Verhoef et.al. 2407.17974 null
2024-07-25 Shapley Value-based Contrastive Alignment for Multimodal Information Extraction Wen Luo et.al. 2407.17854 null
2024-07-25 Enhancing Model Performance: Another Approach to Vision-Language Instruction Tuning Vedanshu et.al. 2407.17813 null
2024-07-25 KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models Eunice Yiu et.al. 2407.17773 link
2024-07-24 Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles Zuoyin Tang et.al. 2407.17211 null
2024-07-23 Chameleon: Images Are What You Need For Multimodal Learning Robust To Missing Modalities Muhammad Irzam Liaqat et.al. 2407.16243 null
2024-07-22 LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding Haoning Wu et.al. 2407.15754 link
2024-07-22 Resource-Efficient Federated Multimodal Learning via Layer-wise and Progressive Training Ye Lin Tun et.al. 2407.15426 null
2024-07-21 VideoGameBunny: Towards vision assistants for video games Mohammad Reza Taesiri et.al. 2407.15295 null
2024-07-22 Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer’s Disease classification Lisa Anita De Santi et.al. 2407.14277 link
2024-07-18 Visual Haystacks: Answering Harder Questions About Sets of Images Tsung-Han Wu et.al. 2407.13766 link
2024-07-17 Text- and Feature-based Models for Compound Multimodal Emotion Recognition in the Wild Nicolas Richet et.al. 2407.12927 link
2024-07-16 ChatBCG: Can AI Read Your Slide Deck? Nikita Singh et.al. 2407.12875 null
2024-07-17 LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Kaichen Zhang et.al. 2407.12772 link
2024-07-17 Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models Donggeun Kim et.al. 2407.12616 null
2024-07-17 E5-V: Universal Embeddings with Multimodal Large Language Models Ting Jiang et.al. 2407.12580 link
2024-07-16 FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models Pengxiang Li et.al. 2407.11522 null
2024-07-16 COMET: “Cone of experience” enhanced large multimodal model for mathematical problem generation Sannyuya Liu et.al. 2407.11315 null
2024-07-15 OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models Zijian Zhou et.al. 2407.11213 null
2024-07-15 FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries Yuqi Jiang et.al. 2407.10810 null
2024-07-15 Scaling 3D Reasoning with LMMs to Large Robot Mission Environments Using Datagraphs W. J. Meijer et.al. 2407.10743 null
2024-07-16 Qwen2 Technical Report An Yang et.al. 2407.10671 link
2024-07-15 How and where does CLIP process negation? Vincent Quantmeyer et.al. 2407.10488 null
2024-07-12 Diagnosing and Re-learning for Balanced Multimodal Learning Yake Wei et.al. 2407.09705 link
2024-07-12 Unifying Sequences, Structures, and Descriptions for Any-to-Any Protein Generation with the Large Multimodal Model HelixProtX Zhiyuan Chen et.al. 2407.09274 link
2024-07-12 DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training Chen Xin et.al. 2407.09174 link
2024-07-11 Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design Jingyi Xie et.al. 2407.08882 null
2024-07-10 RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Xijie Huang et.al. 2407.08044 link
2024-07-10 LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models Feng Li et.al. 2407.07895 link
2024-07-11 InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior Chenguo Lin et.al. 2407.07580 null
2024-07-10 Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model Wenqi Zhang et.al. 2407.07053 link
2024-07-08 ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation Ethan Chern et.al. 2407.06135 link
2024-07-07 Multimodal Language Models for Domain-Specific Procedural Video Summarization Nafisa Hussain et.al. 2407.05419 null
2024-07-07 Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition Zirun Guo et.al. 2407.05374 link
2024-07-06 Enhance the Robustness of Text-Centric Multimodal Alignments Ting-Yu Yen et.al. 2407.05036 null
2024-07-06 Completed Feature Disentanglement Learning for Multimodal MRIs Analysis Tianling Liu et.al. 2407.04916 null
2024-07-06 MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension Zekun Li et.al. 2407.04903 link
2024-07-05 VCoME: Verbal Video Composition with Multimodal Editing Effects Weibo Gong et.al. 2407.04697 null
2024-07-05 Multimodal Classification via Modal-Aware Interactive Enhancement Qing-Yuan Jiang et.al. 2407.04587 null
2024-07-05 Robust Multimodal Learning via Representation Decoupling Shicai Wei et.al. 2407.04458 null
2024-07-05 Smart Vision-Language Reasoners Denisa Roberts et.al. 2407.04212 link
2024-07-04 Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks Amit Parekh et.al. 2407.03967 link
2024-07-04 ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities Julie Mordacq et.al. 2407.03836 link
2024-07-04 M$\mathbf{5}$ – A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks Florian Schneider et.al. 2407.03791 null
2024-07-03 HEMM: Holistic Evaluation of Multimodal Foundation Models Paul Pu Liang et.al. 2407.03418 link
2024-07-02 Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties Srivathsan Badrinarayanan et.al. 2407.03380 link
2024-07-02 Understanding Alignment in Multimodal LLMs: A Comprehensive Study Elmira Amirloo et.al. 2407.02477 null
2024-07-02 Synthetic Multimodal Question Generation Ian Wu et.al. 2407.02233 null
2024-07-02 Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models Anjishnu Mukherjee et.al. 2407.02067 link
2024-07-01 Empathic Grounding: Explorations using Multimodal Interaction and Large Language Models with Conversational Agents Mehdi Arjmand et.al. 2407.01824 link
2024-07-01 We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Runqi Qiao et.al. 2407.01284 link
2024-07-01 Unaligning Everything: Or Aligning Any Text to Any Image in Multimodal Models Shaeke Salman et.al. 2407.01157 null
2024-06-29 AI-powered multimodal modeling of personalized hemodynamics in aortic stenosis Caglar Ozturk et.al. 2407.00535 null
2024-06-29 MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation Jinsheng Huang et.al. 2407.00468 link
2024-06-29 How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models Jaeyoung Lee et.al. 2407.00369 null
2024-06-28 PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration Yuxuan Sun et.al. 2407.00203 null
2024-06-28 EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model Yuxuan Zhang et.al. 2406.20076 link
2024-06-28 InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding Kirolos Ataallah et.al. 2406.19875 link
2024-06-28 MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis Jun-Yan He et.al. 2406.19859 null
2024-06-28 MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment Jihao Liu et.al. 2406.19736 link
2024-06-28 Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction Akash Awasthi et.al. 2406.19686 null
2024-06-28 SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs Xin Su et.al. 2406.19593 null
2024-06-27 OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding Tao Zhang et.al. 2406.19389 null
2024-06-28 FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts Shubhankar Singh et.al. 2406.19237 null
2024-06-27 RAVEN: Multitask Retrieval Augmented Vision-Language Learning Varun Nagaraj Rao et.al. 2406.19150 null
2024-06-27 DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming Jiaxin Zhang et.al. 2406.19101 null
2024-06-27 Fairness and Bias in Multimodal AI: A Survey Tosin Adewumi et.al. 2406.19097 null
2024-06-27 MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation Sanggeon Yun et.al. 2406.18815 null
2024-06-26 MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data William Berman et.al. 2406.18790 null
2024-06-26 S3: A Simple Strong Sample-effective Multimodal Dialog System Elisei Rykov et.al. 2406.18305 link
2024-06-26 EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models Chun-Chieh Liao et.al. 2406.18087 null
2024-06-26 Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs Uttaran Bhattacharya et.al. 2406.18068 null
2024-06-25 Human-centered In-building Embodied Delivery Benchmark Zhuoqun Xu et.al. 2406.17898 link
2024-06-25 InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation Jinbin Huang et.al. 2406.17838 null
2024-06-25 Data curation via joint example selection further accelerates multimodal learning Talfan Evans et.al. 2406.17711 null
2024-06-25 Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights Hao Yang et.al. 2406.17430 null
2024-06-24 At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models Dimitrios Tanoglidis et.al. 2406.17057 null
2024-06-24 Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models Jierun Chen et.al. 2406.16866 link
2024-06-24 Long Context Transfer from Language to Vision Peiyuan Zhang et.al. 2406.16852 link
2024-06-24 QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds Ye Wang et.al. 2406.16578 null
2024-06-21 Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning Brandon Huang et.al. 2406.15334 null
2024-06-21 Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models Jiayu Wang et.al. 2406.14852 null
2024-06-20 Evaluating vision-capable chatbots in interpreting kinematics graphs: a comparative study of free and subscription-based models Giulia Polverini et.al. 2406.14685 null
2024-06-20 Revealing Vision-Language Integration in the Brain with Multimodal Networks Vighnesh Subramaniam et.al. 2406.14481 link
2024-06-25 iWISDM: Assessing instruction following in multimodal models at scale Xiaoxuan Lei et.al. 2406.14343 link
2024-06-20 Two Giraffes in a Dirt Field: Using Game Play to Investigate Situation Modelling in Large Multimodal Models Sherzod Hakimov et.al. 2406.14035 null
2024-06-20 Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning Yupei Zhang et.al. 2406.13979 link
2024-06-20 PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents Junjie Wang et.al. 2406.13923 null
2024-06-19 Through the Theory of Mind’s Eye: Reading Minds with Multimodal Video Large Language Models Zhawnen Chen et.al. 2406.13763 null
2024-06-19 GUI Action Narrator: Where and When Did That Action Take Place? Qinchen Wu et.al. 2406.13719 null
2024-06-19 Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor Veedant Jain et.al. 2406.13564 null
2024-06-19 VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models Haowen Hou et.al. 2406.13362 link
2024-06-19 Learnable In-Context Vector for Visual Question Answering Yingzhe Peng et.al. 2406.13185 null
2024-06-18 Synergizing Foundation Models and Federated Learning: A Survey Shenghui Li et.al. 2406.12844 null
2024-06-18 OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI Zhen Huang et.al. 2406.12753 link
2024-06-18 Disturbing Image Detection Using LMM-Elicited Emotion Embeddings Maria Tzelepi et.al. 2406.12668 null
2024-06-18 Automatic benchmarking of large multimodal models via iterative experiment programming Alessandro Conti et.al. 2406.12321 link
2024-06-18 Language and Multimodal Models in Sports: A Survey of Datasets and Applications Haotian Xia et.al. 2406.12252 null
2024-06-17 VideoLLM-online: Online Video Large Language Model for Streaming Video Joya Chen et.al. 2406.11816 null
2024-06-17 LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning Dantong Niu et.al. 2406.11815 null
2024-06-17 Multimodal Learning To Improve Segmentation With Intraoperative CBCT & Preoperative CT Maximilian E. Tschuchnig et.al. 2406.11650 null
2024-06-17 Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment Chao Wen et.al. 2406.11334 null
2024-06-17 VideoVista: A Versatile Benchmark for Video Understanding and Reasoning Yunxin Li et.al. 2406.11303 null
2024-06-17 i-SRT: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective Judgment Daechul Ahn et.al. 2406.11280 link
2024-06-17 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens Anas Awadalla et.al. 2406.11271 link
2024-06-17 Generative Visual Instruction Tuning Jefferson Hernandez et.al. 2406.11262 link
2024-06-17 Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective Yang Chen et.al. 2406.11249 null
2024-06-16 Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies Hung-Ting Su et.al. 2406.10923 null
2024-06-15 Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model Lu Xu et.al. 2406.10484 null
2024-06-12 MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases Rithesh Murthy et.al. 2406.10290 null
2024-06-14 VideoGUI: A Benchmark for GUI Automation from Instructional Videos Kevin Qinghong Lin et.al. 2406.10227 null
2024-06-14 ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation Chufan Shi et.al. 2406.09961 link
2024-06-14 BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval Imanol Miranda et.al. 2406.09952 link
2024-06-13 VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding Muhammad Maaz et.al. 2406.09418 link
2024-06-13 Explore the Limits of Omni-modal Pretraining at Scale Yiyuan Zhang et.al. 2406.09412 link
2024-06-14 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities Roman Bachmann et.al. 2406.09406 null
2024-06-13 Yo’LLaVA: Your Personalized Language and Vision Assistant Thao Nguyen et.al. 2406.09400 null
2024-06-13 CMC-Bench: Towards a New Paradigm of Visual Signal Compression Chunyi Li et.al. 2406.09356 link
2024-06-13 Comparison Visual Instruction Tuning Wei Lin et.al. 2406.09240 null
2024-06-13 Zoom and Shift are All You Need Jiahao Qin et.al. 2406.08866 null
2024-06-11 Embedding-based Multimodal Learning on Pan-Squamous Cell Carcinomas for Improved Survival Outcomes Asim Waqas et.al. 2406.08521 null
2024-06-14 Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models Yi-Fan Zhang et.al. 2406.08487 link
2024-06-13 OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text Qingyun Li et.al. 2406.08418 link
2024-06-12 A Concept-Based Explainability Framework for Large Multimodal Models Jayneel Parekh et.al. 2406.08074 null
2024-06-12 LVBench: An Extreme Long Video Understanding Benchmark Weihan Wang et.al. 2406.08035 link
2024-06-11 Cognitive Insights Across Languages: Enhancing Multimodal Interview Analysis David Ortiz-Perez et.al. 2406.07542 link
2024-06-11 Understanding Visual Concepts Across Models Brandon Trabucco et.al. 2406.07506 link
2024-06-11 Unified Modeling Enhanced Multimodal Learning for Precision Neuro-Oncology Huahui Yi et.al. 2406.07078 link
2024-06-14 BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification June-Woo Kim et.al. 2406.06786 link
2024-06-10 Vript: A Video Is Worth Thousands of Words Dongjie Yang et.al. 2406.06040 link
2024-06-10 FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model Yebin Lee et.al. 2406.06004 link
2024-06-10 CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark David Romero et.al. 2406.05967 null
2024-06-09 Stealthy Targeted Backdoor Attacks against Image Captioning Wenshu Fan et.al. 2406.05874 link
2024-06-09 F-LMM: Grounding Frozen Large Multimodal Models Size Wu et.al. 2406.05821 link
2024-06-08 Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities Sai Munikoti et.al. 2406.05496 null
2024-06-07 Semantic Segmentation on VSPW Dataset through Masked Video Consistency Chen Liang et.al. 2406.04979 null
2024-06-07 Predictive Dynamic Fusion Bing Cao et.al. 2406.04802 link
2024-06-07 MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description Cong Yang et.al. 2406.04716 link
2024-06-07 AICoderEval: Improving AI Domain Code Generation of Large Language Models Yinghui Xia et.al. 2406.04712 null
2024-06-06 GenAI Arena: An Open Evaluation Platform for Generative Models Dongfu Jiang et.al. 2406.04485 null
2024-06-06 MAIRA-2: Grounded Radiology Report Generation Shruthi Bannur et.al. 2406.04449 link
2024-06-06 DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs Lingchen Meng et.al. 2406.04334 null
2024-06-06 BLSP-Emo: Towards Empathetic Large Speech-Language Models Chen Wang et.al. 2406.03872 link
2024-06-05 Identification of Stone Deterioration Patterns with Large Multimodal Models Daniele Corradetti et.al. 2406.03207 link
2024-06-05 Exploiting LMM-based knowledge for image classification tasks Maria Tzelepi et.al. 2406.03071 null
2024-06-02 Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications David Restrepo et.al. 2406.02601 null
2024-06-04 Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning Alex Jinpeng Wang et.al. 2406.02547 link
2024-06-04 Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization Yunpeng Zhao et.al. 2406.01987 null
2024-06-03 Automatic Fused Multimodal Deep Learning for Plant Identification Alfreds Lapkovskis et.al. 2406.01455 link
2024-06-05 Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data Zhusi Zhong et.al. 2406.01302 null
2024-06-03 Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model Kezhen Chen et.al. 2406.00977 link
2024-06-02 Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient Zechu Li et.al. 2406.00681 null
2024-06-04 StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond Pengyuan Lyu et.al. 2405.21013 null
2024-05-31 Don’t Buy it! Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models A. Bavaresco et.al. 2405.20846 link
2024-06-17 Ovis: Structural Embedding Alignment for Multimodal Large Language Model Shiyin Lu et.al. 2405.20797 link
2024-05-31 Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning Yang Chen et.al. 2405.20606 link
2024-05-30 Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA Qianqi Yan et.al. 2405.20421 link
2024-05-30 Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use Franz Louis Cesista et.al. 2405.20245 null
2024-05-31 Visual Attention Analysis in Online Learning Miriam Navarro et.al. 2405.20091 null
2024-05-30 MM-Lego: Modular Biomedical Multimodal Models with Minimal Fine-Tuning Konstantin Hemker et.al. 2405.19950 null
2024-05-30 Instruction-Guided Visual Masking Jinliang Zheng et.al. 2405.19783 link
2024-05-29 Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining Blake R. Duschatko et.al. 2405.19386 null
2024-06-09 LLMs Meet Multimodal Generation and Editing: A Survey Yingqing He et.al. 2405.19334 link
2024-05-29 Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare Hanwei Zhu et.al. 2405.19298 link
2024-05-31 Benchmarking and Improving Detail Image Caption Hongyuan Dong et.al. 2405.19092 link
2024-05-29 Topological Perspectives on Optimal Multimodal Embedding Spaces Abdul Aziz A. B et.al. 2405.18867 null
2024-05-29 Exploring Exotic Decays of the Higgs Boson to Multi-Photons at the LHC via Multimodal Learning Approaches A. Hammad et.al. 2405.18834 null
2024-05-28 The Evolution of Multimodal Model Architectures Shakti N. Wadekar et.al. 2405.17927 null
2024-05-28 Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment Xin Xiao et.al. 2405.17871 link
2024-05-28 Full-Stack Allreduce on Multi-Rail Networks Enda Yu et.al. 2405.17870 null
2024-05-28 MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance Yake Wei et.al. 2405.17730 link
2024-05-27 Matryoshka Multimodal Models Mu Cai et.al. 2405.17430 null
2024-05-27 XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser Xianfu Cheng et.al. 2405.17336 null
2024-05-28 LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding Haoyu Zhao et.al. 2405.17104 null
2024-05-27 Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning Zihua Zhao et.al. 2405.16996 link
2024-05-27 Multilingual Diversity Improves Vision-Language Representations Thao Nguyen et.al. 2405.16915 null
2024-05-26 Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs Mustafa Shukor et.al. 2405.16700 link
2024-05-25 How Well Do Deep Learning Models Capture Human Concepts? The Case of the Typicality Effect Siddhartha K. Vemuri et.al. 2405.16128 null
2024-05-24 ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models Chunjiang Ge et.al. 2405.15738 link
2024-05-24 Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models Yongsheng Yu et.al. 2405.15687 null
2024-05-24 M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models Hongyu Wang et.al. 2405.15638 link
2024-05-24 DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception Run Luo et.al. 2405.15232 link
2024-05-24 Shopping Queries Image Dataset (SQID): An Image-Enriched ESCI Dataset for Exploring Multimodal Learning in Product Search Marie Al Ghossein et.al. 2405.15190 link

Generative Weight Space Modeling

Publish Date Title Authors PDF Code    
2024-09-18 Monomial Matrix Group Equivariant Neural Functional Networks Hoang V. Tran et.al. 2409.11697 null    
2024-09-17 Existence of an extremal function of Sobolev critical embedding with an $\alpha$-homogeneous weight Petr Gurka et.al. 2409.11193 null
2024-09-16 Inferring stellar parameters and their uncertainties from high-resolution spectroscopy using invertible neural networks Nils Candebat et.al. 2409.10621 null    
2024-09-13 Non-unitary Wightman CFTs and non-unitary vertex algebras Sebastiano Carpi et.al. 2409.08454 null    
2024-09-12 Global well-posedness and scattering in weighted space for nonlinear Schrödinger equations below the Strauss exponent without gauge-invariance Masaki Kawamoto et.al. 2409.08432 null    
2024-09-09 Fast gradient-free optimization of excitations in variational quantum eigensolvers Jonas Jäger et.al. 2409.05939 null    
2024-09-06 SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields Yuze Wang et.al. 2409.04482 null    
2024-09-04 Federated Quantum-Train with Batched Parameter Generation Chen-Yu Liu et.al. 2409.02763 null    
2024-09-16 Regret Analysis for Randomized Gaussian Process Upper Confidence Bound Shion Takeno et.al. 2409.00979 null    
2024-08-30 Abstracted Gaussian Prototypes for One-Shot Concept Learning Chelsea Zou et.al. 2408.17251 link    
2024-08-23 Emergence of global receptive fields capturing multipartite quantum correlations Oleg M. Sotnikov et.al. 2408.13033 null    
2024-08-22 Action of $\mathfrak{osp}(1|2n)$ on polynomials tensor $\mathbb{C}^{0|2n}$ Dwight Anderson Williams II et.al. 2408.12324 null
2024-08-19 Unimodal sequences and mixed false theta functions Kevin Allen et.al. 2408.09789 null    
2024-08-16 Onsager-Machlup functional for stochastic lattice dynamical systems driven by time-varying noise Xinze Zhang et.al. 2408.08465 null    
2024-08-10 Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks Yoav Gelberg et.al. 2408.05496 null    
2024-08-09 Quasilinear parabolic equations with superlinear nonlinearities in critical spaces Bogdan-Vasile Matioc et.al. 2408.05067 null    
2024-08-08 A framework for generalizing toric inequalities for holographic entanglement entropy Ning Bao et.al. 2408.04741 null    
2024-08-07 Counterfactuals and Uncertainty-Based Explainable Paradigm for the Automated Detection and Segmentation of Renal Cysts in Computed Tomography Images: A Multi-Center Study Zohaib Salahuddin et.al. 2408.03789 null    
2024-08-05 BOTS-LM: Training Large Language Models for Setswana Nathan Brown et.al. 2408.02239 null    
2024-08-02 Conditional LoRA Parameter Generation Xiaolong Jin et.al. 2408.01415 null    
2024-08-01 Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization Róisín Luo et.al. 2408.00923 null    
2024-07-31 Semantic Codebook Learning for Dynamic Recommendation Models Zheqi Lv et.al. 2408.00123 null    
2024-07-29 Tensor product weight modules over the affine-Virasoro algebra Qiu-Fan Chen et.al. 2407.19844 null    
2024-07-24 Generalized Hilbert operators acting on weighted spaces of holomorphic functions with sup-norms María J. Beltrán-Meneu et.al. 2407.17646 null    
2024-07-24 Generalized Ordinal Priority Approach for Multi-Attribute Decision-Making under Incomplete Preference Information Renlong Wang et.al. 2407.17099 null    
2024-07-22 WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation Zirui Shao et.al. 2407.15502 link    
2024-07-18 FSP-Laplace: Function-Space Priors for the Laplace Approximation in Bayesian Deep Learning Tristan Cinquin et.al. 2407.13711 null    
2024-07-19 Parameter Generation of Quantum Approximate Optimization Algorithm with Diffusion Model Fanxu Meng et.al. 2407.12242 null    
2024-07-24 Effect Heterogeneity with Earth Observation in Randomized Controlled Trials: Exploring the Role of Data, Model, and Evaluation Metric Choice Connor T. Jerzak et.al. 2407.11674 link    
2024-07-15 Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion Yongyuan Liang et.al. 2407.10973 null    
2024-07-16 The well-posedness of generalized nonlinear wave equations on the lattice graph Bobo Hua et.al. 2407.09815 null    
2024-07-15 Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization Jinlong Li et.al. 2407.08374 null    
2024-07-09 Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic Ruochen Jin et.al. 2407.07089 link    
2024-07-04 Recovering Initial States in Semilinear Parabolic Problems from Time-Averages Lina Sophie Schmitz et.al. 2407.03829 null    
2024-07-01 A quantum deformation of the ${\mathcal N}=2$ superconformal algebra H. Awata et.al. 2407.00901 null    
2024-06-24 WARP: On the Benefits of Weight Averaged Rewarded Policies Alexandre Ramé et.al. 2406.16768 null    
2024-06-24 Improving robustness to corruptions with multiplicative weight perturbations Trung Trinh et.al. 2406.16540 null    
2024-06-21 Determination of certain mod $p$ Galois representations using local constancy Abhik Ganguli et.al. 2406.15600 null    
2024-06-21 Elliptic analysis on collapsing gravitational instantons modelled using the Gibbons-Hawking ansatz Willem Adriaan Salm et.al. 2406.15008 null    
2024-06-20 MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization Zhaozhe Hu et.al. 2406.14259 link    
2024-06-18 From Instance Training to Instruction Learning: Task Adapters Generation from Instructions Huanxuan Liao et.al. 2406.12382 link    
2024-06-17 Kaniadakis entropy in extreme gravitational and cosmological environments: a review on the state-of-the-art and future prospects Giuseppe Gaetano Luciano et.al. 2406.11373 null    
2024-06-16 Analysis and approximation of elliptic problems with Uhlenbeck structure in convex polytopes Tadele Mengesha et.al. 2406.10762 null    
2024-06-14 Towards Scalable and Versatile Weight Space Learning Konstantin Schürholt et.al. 2406.09997 link    
2024-06-13 Interpreting the Weight Space of Customized Diffusion Models Amil Dravid et.al. 2406.09413 link    
2024-06-12 Diffusion Soup: Model Merging for Text-to-Image Diffusion Models Benjamin Biggs et.al. 2406.08431 null    
2024-06-24 Cartan monopoles Andrei Smilga et.al. 2406.06042 null    
2024-06-08 Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models Minho Park et.al. 2406.05432 null    
2024-06-06 Regularized KL-Divergence for Well-Defined Function-Space Variational Inference in Bayesian neural networks Tristan Cinquin et.al. 2406.04317 null    
2024-06-06 A characterization of $(μ,ν)$-dichotomies via admissibility Lucas Backes et.al. 2406.04126 null
2024-06-05 Reproducing Kernel Thesis of Hankel Operators on Weighted Hardy Spaces Ana Čolović et.al. 2406.03106 null    
2024-05-21 Backpropogation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration Wei Ji et.al. 2406.01601 null    
2024-05-29 Thermodynamics of the most generalized form of Holographic Dark Energy and some particular cases with Corrected Entropies Sanghati Saha et.al. 2405.20783 null    
2024-06-20 The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof Derek Lim et.al. 2405.20231 null    
2024-05-28 Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography Jie Liu et.al. 2405.18356 link    
2024-05-28 $C^2M^3$: Cycle-Consistent Multi-Model Merging Donato Crisostomi et.al. 2405.17897 link
2024-05-27 Smoothing effects and extinction in finite time for fractional fast diffusions on Riemannian manifolds Elvise Berchio et.al. 2405.17126 null    
2024-05-31 FedSheafHN: Personalized Federated Learning on Graph-structured Data Wenfei Liang et.al. 2405.16056 null    
2024-05-27 HyperInterval: Hypernetwork approach to training weight interval regions in continual learning Patryk Krukowski et.al. 2405.15444 link    
2024-05-23 Scalable Optimization in the Modular Norm Tim Large et.al. 2405.14813 link    
2024-06-16 A refined Weyl character formula for comodules on $\operatorname{GL}_{2,A}$ Helge Øystein Maakestad et.al. 2405.09210 null    
2024-05-13 Localizing Task Information for Improved Model Merging and Compression Ke Wang et.al. 2405.07813 link    
2024-05-13 $α$VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning Rafael Kourdis et.al. 2405.07769 null
2024-05-12 Approximation by a new sequence of operators involving Laguerre polynomials Kapil Kumar et.al. 2405.07228 null    
2024-05-06 Swarm intelligence for full Stokes dynamic imaging reconstruction of interferometric data Alejandro Mus et.al. 2405.03330 null    
2024-05-04 Large Deviation Principles of Invariant Measures of Stochastic Reaction-Diffusion Lattice Systems Bixiang Wang et.al. 2405.02720 null    
2024-05-03 The Immersed Inextensible Interface Problem in 2D Stokes Flow Eduardo García-Juárez et.al. 2405.02446 null    
2024-05-02 Customizing Text-to-Image Models with a Single Image Pair Maxwell Jones et.al. 2405.01536 null    
2024-04-25 Robust Fine-tuning for Pre-trained 3D Point Cloud Models Zhibo Zhang et.al. 2404.16422 null    
2024-04-23 The Geometry of the Set of Equivalent Linear Neural Networks Jonathan Richard Shewchuk et.al. 2404.14855 null    
2024-04-24 Nonexistence of solutions to parabolic problems with a potential on weighted graphs Dario D. Monticelli et.al. 2404.12058 null    
2024-04-17 On the relaxation to equilibrium of a quantum oscillator interacting with a radiation field Pierre-A. Vuillermot et.al. 2404.11329 null    
2024-04-15 Higher-curvature gravity in AdS$_3$, holographic $c$-theorems and black hole microstates Mariano Chernicoff et.al. 2404.10128 null
2024-04-16 Asymptotic-preserving approximations for stochastic incompressible viscous fluids and SPDEs on graph Jianbo Cui et.al. 2404.09168 null    
2024-04-09 Perspective on Physical Interpretations of Rényi Entropy in Statistical Mechanics Misaki Ozawa et.al. 2404.06436 null    
2024-04-09 A gluing construction of singular solutions for a fully non-linear equation in conformal geometry María Fernanda Espinal et.al. 2404.05965 null    
2024-04-05 Dissipative Euler flows originating from circular vortex filaments Francisco Gancedo et.al. 2404.04250 null    
2024-04-05 Macdonald characters from a new formula for Macdonald polynomials Houcine Ben Dali et.al. 2404.03904 null    
2024-04-04 Fundamental inequalities for the iterated Fourier-cosine convolution with Gaussian weight and its application Nguyen Thi Hong Phuong et.al. 2404.03609 null    
2024-03-29 Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World Bowen Lei et.al. 2403.20047 link    
2024-03-28 Model Stock: All we need is just a few fine-tuned models Dong-Hwan Jang et.al. 2403.19522 link    
2024-03-26 A location Invariant Statistic-Based Consistent Estimation Method for Three-Parameter Generalized Exponential Distribution Kiran Prajapat et.al. 2403.17609 null    
2024-06-03 FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis Santosh Sanjeev et.al. 2403.13341 link    
2024-06-18 Learning Useful Representations of Recurrent Neural Network Weight Matrices Vincent Herrmann et.al. 2403.11998 link    
2024-03-16 Function-space Parameterization of Neural Networks for Sequential Learning Aidan Scannell et.al. 2403.10929 link    
2024-03-14 Imprints of Barrow-Tsallis Cosmology in Primordial Gravitational Waves Petr Jizba et.al. 2403.09797 null    
2024-03-14 Eigenvariety for partially classical Hilbert modular forms Mladen Dimitrov et.al. 2403.09784 null    
2024-03-12 The solenoidal Heisenberg Virasoro algebra and its simple weight modules Boujemaa Agrebaoui et.al. 2403.07381 null    
2024-03-10 FrameQuant: Flexible Low-Bit Quantization for Transformers Harshavardhan Adepu et.al. 2403.06082 link    
2024-03-06 The solenoidal Virasoro algebra and its simple weight modules Boujemaa Agrebaoui et.al. 2403.03753 null    
2024-03-05 Tensor Decomposition-based Time Varying Channel Estimation for mmWave MIMO-OFDM Systems Ruizhe Wang et.al. 2403.02942 null    
2024-03-05 Neural Redshift: Random Networks are not Random Functions Damien Teney et.al. 2403.02241 null    
2024-03-04 Tiny fluctuations of the averaging process around its degenerate steady state Federico Sau et.al. 2403.02032 null    
2024-03-15 Training-Free Pretrained Model Merging Zhengqi Xu et.al. 2403.01753 link    
2024-04-22 HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances Supreeth Narasimhaswamy et.al. 2403.01693 null    
2024-03-13 TOOLVERIFIER: Generalization to New Tools via Self-Verification Dheeraj Mekala et.al. 2402.14158 link    
2024-02-21 Computing Tangent Spaces to Eigenvarieties James Rawson et.al. 2402.13799 null    
2024-05-28 Neural Network Parameter Diffusion Kai Wang et.al. 2402.13144 link    
2024-02-19 Exponential attractors for a nonlocal delayed reaction-diffusion equation on an unbounded domain Wenjie Hu et.al. 2402.11856 null    
2024-02-18 Discrete Neural Algorithmic Reasoning Gleb Rodionov et.al. 2402.11628 link    
2024-02-17 Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes Jeremiah Hauth et.al. 2402.11179 null    
2024-06-06 Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning Tuc Nguyen et.al. 2402.10639 null    
2024-02-14 TAI-GAN: A Temporally and Anatomically Informed Generative Adversarial Network for early-to-late frame conversion in dynamic cardiac PET inter-frame motion correction Xueqi Guo et.al. 2402.09567 null    
2024-02-14 The cohomology of $p$-adic Deligne-Lusztig schemes of Coxeter type Alexander B. Ivanov et.al. 2402.09017 null
2024-02-09 The Asymptotic Structure of Cosmological Integrals Paolo Benincasa et.al. 2402.06558 null    
2024-02-07 Universal Neural Functionals Allan Zhou et.al. 2402.05232 link    
2024-02-06 Maximal regularity and optimal control for a non-local Cahn-Hilliard tumour growth model Matteo Fornoni et.al. 2402.04204 null    
2024-02-06 Improved Generalization of Weight Space Networks via Augmentations Aviv Shamsian et.al. 2402.04081 null    
2024-02-02 Training-time Neuron Alignment through Permutation Subspace for Improving Linear Mode Connectivity and Model Fusion Zexi Li et.al. 2402.01342 null    
2024-02-01 Understanding Neural Network Systems for Image Analysis using Vector Spaces and Inverse Maps Rebecca Pattichis et.al. 2402.00261 null    
2024-01-26 Do deep neural networks utilize the weight space efficiently? Onur Can Koyun et.al. 2401.16438 null    
2024-01-22 On strong growth conditions for weighted spaces of entire functions Gerhard Schindl et.al. 2401.14330 null    
2024-01-24 Task structure and nonlinearity jointly determine learned representational geometry Matteo Alleman et.al. 2401.13558 null    
2024-01-25 Sparse Domination of Singular Bilinear Forms on Non-Homogeneous spaces Paco Villarroya et.al. 2401.13130 null    
2024-01-22 WARM: On the Benefits of Weight Averaged Reward Models Alexandre Ramé et.al. 2401.12187 null    
2024-01-17 Cesàro operators associated with Borel measures acting on weighted spaces of holomorphic functions with sup-norm Maria José Beltrán Meneu et.al. 2401.09406 null    
2024-01-15 Singular fractal dimension at periodicity cascades in parameters spaces Carlos E. P. Abreu et.al. 2401.07648 null    
2024-01-17 Computing Fringe Presentations of Multigraded Persistence Modules Fabian Lenzen et.al. 2401.06008 null    
2024-01-10 Grimoire is All You Need for Enhancing Large Language Models Ding Chen et.al. 2401.03385 link    
2024-03-26 Artificial Intelligence for Operations Research: Revolutionizing the Operations Research Process Zhenan Fan et.al. 2401.03244 null    
2023-12-31 A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry Tim Z. Xiao et.al. 2401.00611 link    
2023-12-28 Fractional non-homogeneous counting process Nick Laskin et.al. 2312.17389 null    
2023-12-28 Some unimodal sequences of Kronecker coefficients Alimzhan Amanov et.al. 2312.17054 null    
2023-12-24 The Vlasov-Maxwell-Boltzmann/Landau system with polynomial perturbation near Maxwellian Chuqi Cao et.al. 2312.15510 null    
2023-12-22 Emage: Non-Autoregressive Text-to-Image Generation Zhangyin Feng et.al. 2312.14988 null    
2023-12-21 Hypercyclic shifts on lattice graphs Anton Baranov et.al. 2312.13934 null    
2023-12-21 Scattering for 2d semi-relativistic Hartree equations with short range potential Changhun Yang et.al. 2312.13606 null    
2023-12-21 Entropic Inflation in Presence of Scalar Field Sergei D. Odintsov et.al. 2312.13587 null    
2023-12-30 Time is Encoded in the Weights of Finetuned Language Models Kai Nylund et.al. 2312.13401 link    
2023-12-14 Efficient momentum space approach to superconductivity in quasiperiodic systems Mao Yoshii et.al. 2312.09124 null    
2023-12-13 Best one-sided algebraic approximation by average modulus Raheam A. Al-Saphory et.al. 2312.08407 null    
2023-12-19 Well-Posedness of Quasilinear Parabolic Equations in Time-Weighted Spaces Bogdan Matioc et.al. 2312.07974 null    
2023-12-12 Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models Arnav Chavan et.al. 2312.07046 link    
2023-12-11 Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks MohammadReza Davari et.al. 2312.06795 null    
2023-12-08 Stoichiometry preservation and generalization of Bilger mixture fraction for non-premixed combustion with differential molecular diffusion Haifeng Wang et.al. 2312.05204 null    
2023-12-01 New polyconvolution product for Fourier-cosine and Laplace integral operators and their applications Trinh Tuan et.al. 2312.00764 null    
2023-11-30 Modelling Einstein cluster using Einasto profile Ritwik Acharyya et.al. 2311.18622 null    
2023-11-27 Extraction of the microscopic properties of quasi-particles using deep neural networks Olga Soloveva et.al. 2311.15984 null    
2024-01-24 Deep Latent Force Models: ODE-based Process Convolutions for Bayesian Deep Learning Thomas Baldwin-McDonald et.al. 2311.14828 null    

Data Distillation

Publish Date Title Authors PDF Code
2024-09-18 Applications of Knowledge Distillation in Remote Sensing: A Survey Yassine Himeur et.al. 2409.12111 null
2024-09-18 Data Efficient Acoustic Scene Classification using Teacher-Informed Confusing Class Instruction Jin Jie Sean Yeo et.al. 2409.11964 null
2024-09-18 Distillation-free Scaling of Large SSMs for Images and Videos Hamid Suleman et.al. 2409.11867 null
2024-09-18 EFCM: Efficient Fine-tuning on Compressed Models for deployment of large models in medical image analysis Shaojie Li et.al. 2409.11817 null
2024-09-18 Efficient Low-Resolution Face Recognition via Bridge Distillation Shiming Ge et.al. 2409.11786 null
2024-09-18 RUIE: Retrieval-based Unified Information Extraction using Large Language Model Xincheng Liao et.al. 2409.11673 null
2024-09-17 Time-Series Forecasting, Knowledge Distillation, and Refinement within a Multimodal PDE Foundation Model Derek Jollie et.al. 2409.11609 link
2024-09-17 Unleashing the Potential of Mamba: Boosting a LiDAR 3D Sparse Detector by Using Cross-Model Knowledge Distillation Rui Yu et.al. 2409.11018 null
2024-09-17 Single-stage TTS with Masked Audio Token Modeling and Semantic Knowledge Distillation Gerard I. Gállego et.al. 2409.11003 null
2024-09-16 Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning Amin Karimi Monsefi et.al. 2409.10362 null
2024-09-16 Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference Huy-Dung Nguyen et.al. 2409.10095 null
2024-09-14 Effective Pre-Training of Audio Transformers for Sound Event Detection Florian Schmid et.al. 2409.09546 link
2024-09-14 Integrated Multi-Level Knowledge Distillation for Enhanced Speaker Verification Wenhao Yang et.al. 2409.09389 null
2024-09-14 Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility Xiaoyu Liu et.al. 2409.09357 null
2024-09-13 Exploring System-Heterogeneous Federated Learning with Dynamic Model Selection Dixi Yao et.al. 2409.08858 null
2024-09-13 AWF: Adaptive Weight Fusion for Enhanced Class Incremental Semantic Segmentation Zechao Sun et.al. 2409.08516 null
2024-09-12 DiReDi: Distillation and Reverse Distillation for AIoT Applications Chen Sun et.al. 2409.08308 null
2024-09-12 Ruri: Japanese General Text Embeddings Hayato Tsukagoshi et.al. 2409.07737 null
2024-09-12 Learn from Balance: Rectifying Knowledge Transfer for Long-Tailed Scenarios Xinlei Huang et.al. 2409.07694 null
2024-09-11 DS-ViT: Dual-Stream Vision Transformer for Cross-Task Distillation in Alzheimer’s Early Diagnosis Ke Chen et.al. 2409.07584 null
2024-09-11 EchoDFKD: Data-Free Knowledge Distillation for Cardiac Ultrasound Segmentation using Synthetic Data Grégoire Petit et.al. 2409.07566 null
2024-09-11 Enhancing CTC-Based Visual Speech Recognition Hendrik Laux et.al. 2409.07210 null
2024-09-11 A Continual and Incremental Learning Approach for TinyML On-device Training Using Dataset Distillation and Model Size Adaption Marcus Rüb et.al. 2409.07114 null
2024-09-16 Privacy-Preserving Federated Learning with Consistency via Knowledge Distillation Using Conditional Generator Kangyang Luo et.al. 2409.06955 null
2024-09-10 Applied Federated Model Personalisation in the Industrial Domain: A Comparative Study Ilias Siniosoglou et.al. 2409.06904 null
2024-09-10 EasyST: A Simple Framework for Spatio-Temporal Prediction Jiabin Tang et.al. 2409.06748 link
2024-09-10 Knowledge Distillation via Query Selection for Detection Transformer Yi Liu et.al. 2409.06443 null
2024-09-10 Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition Junzheng Zhang et.al. 2409.06371 null
2024-09-09 Joint Input and Output Coordination for Class-Incremental Learning Shuai Wang et.al. 2409.05620 null
2024-09-09 LEROjD: Lidar Extended Radar-Only Object Detection Patrick Palmer et.al. 2409.05564 link
2024-09-09 Look One and More: Distilling Hybrid Order Relational Knowledge for Cross-Resolution Image Recognition Shiming Ge et.al. 2409.05384 null
2024-09-09 FedBrain-Distill: Communication-Efficient Federated Brain Tumor Classification Using Ensemble Knowledge Distillation on Non-IID Data Rasoul Jafari Gohari et.al. 2409.05359 link
2024-09-07 LoCa: Logit Calibration for Knowledge Distillation Runming Yang et.al. 2409.04778 null
2024-09-06 SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields Yuze Wang et.al. 2409.04482 null
2024-09-05 Experimentation in Content Moderation using RWKV Umut Yildirim et.al. 2409.03939 null
2024-09-05 Data-Efficient Generation for Dataset Distillation Zhe Li et.al. 2409.03929 null
2024-09-05 DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture Qianlong Xiang et.al. 2409.03550 null
2024-09-05 Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration Pei Wang et.al. 2409.03455 null
2024-09-05 Efficient Image Compression Using Advanced State Space Models Bouzid Arezki et.al. 2409.02743 null
2024-09-04 CLDA: Collaborative Learning for Enhanced Unsupervised Domain Adaptation Minhee Cho et.al. 2409.02699 null
2024-09-04 Low-Resolution Object Recognition with Cross-Resolution Relational Contrastive Distillation Kangkai Zhang et.al. 2409.02555 null
2024-09-04 A design of magnetic tunnel junctions for the deployment of neuromorphic hardware for edge computing Davi Rodrigues et.al. 2409.02528 null
2024-09-04 Non-target Divergence Hypothesis: Toward Understanding Domain Gaps in Cross-Modal Knowledge Distillation Yilong Chen et.al. 2409.02438 null
2024-09-03 Low-Resolution Face Recognition via Adaptable Instance-Relation Distillation Ruixin Shi et.al. 2409.02049 null
2024-09-03 Efficient Point Cloud Classification via Offline Distillation Framework and Negative-Weight Self-Distillation Technique Qiang Zheng et.al. 2409.02020 null
2024-09-03 Contemporary Model Compression on Large Language Models Inference Dong Liu et.al. 2409.01990 null
2024-09-05 Adaptive Explicit Knowledge Transfer for Knowledge Distillation Hyungkeun Park et.al. 2409.01679 null
2024-09-03 Improving Apple Object Detection with Occlusion-Enhanced Distillation Liang Geng et.al. 2409.01573 null
2024-09-02 Dataset Distillation from First Principles: Integrating Core Information Extraction and Purposeful Learning Vyacheslav Kungurtsev et.al. 2409.01410 null
2024-09-02 MobileIQA: Exploiting Mobile-level Diverse Opinion Network For No-Reference Image Quality Assessment Using Knowledge Distillation Zewen Chen et.al. 2409.01212 link
2024-09-04 Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning Jinglin Liang et.al. 2409.01128 link
2024-09-02 Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment Aditya Bansal et.al. 2409.00880 null
2024-09-01 LanguaShrink: Reducing Token Overhead with Psycholinguistics Xuechen Liang et.al. 2409.00855 null
2024-08-30 How Knowledge Distillation Mitigates the Synthetic Gap in Fair Face Recognition Pedro C. Neto et.al. 2408.17399 link
2024-08-30 HiTSR: A Hierarchical Transformer for Reference-based Super-Resolution Masoomeh Aslahishahri et.al. 2408.16959 link
2024-08-29 VLM-KD: Knowledge Distillation from VLM for Long-Tail Visual Recognition Zaiwei Zhang et.al. 2408.16930 null
2024-08-29 Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling Hritik Bansal et.al. 2408.16737 null
2024-08-29 MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition Eduarda Caldeira et.al. 2408.16563 link
2024-08-29 UDD: Dataset Distillation via Mining Underutilized Regions Shiguang Wang et.al. 2408.16268 null
2024-08-29 Neural Spectral Decomposition for Dataset Distillation Shaolei Yang et.al. 2408.16236 null
2024-08-28 EMP: Enhance Memory in Data Pruning Jinying Xiao et.al. 2408.16031 null
2024-08-28 LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation Fangxun Shu et.al. 2408.15881 link
2024-08-28 ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation Tiantian Feng et.al. 2408.15803 null
2024-08-28 Online pre-training with long-form videos Itsuki Kato et.al. 2408.15651 null
2024-08-28 Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation Lujun Gui et.al. 2408.15562 null
2024-08-27 Leveraging Self-supervised Audio Representations for Data-Efficient Acoustic Scene Classification Yiqiang Cai et.al. 2408.14862 link
2024-08-26 Bridging the Gap: Unpacking the Hidden Challenges in Knowledge Distillation for Online Ranking Systems Nikhil Khani et.al. 2408.14678 null
2024-08-26 TSAK: Two-Stage Semantic-Aware Knowledge Distillation for Efficient Wearable Modality and Model Optimization in Manufacturing Lines Hymalai Bello et.al. 2408.14146 null

Schrodinger Bridge

Publish Date Title Authors PDF Code
2024-09-18 Massively Multi-Person 3D Human Motion Forecasting with Scene Context Felix B Mueller et.al. 2409.12189 null
2024-09-18 MoRAG – Multi-Fusion Retrieval Augmented Generation for Human Motion Kalakonda Sai Shashank et.al. 2409.12140 null
2024-09-18 Cyclicity Analysis of the Ornstein-Uhlenbeck Process Vivek Kaushik et.al. 2409.12102 null
2024-09-18 Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance Jaehoon Joo et.al. 2409.12099 null
2024-09-18 Denoising diffusion models for high-resolution microscopy image restoration Pamela Osuna-Vargas et.al. 2409.12078 null
2024-09-18 SFDA-rPPG: Source-Free Domain Adaptive Remote Physiological Measurement with Spatio-Temporal Consistency Yiping Xie et.al. 2409.12040 null
2024-09-18 LEMON: Localized Editing with Mesh Optimization and Neural Shaders Furkan Mert Algan et.al. 2409.12024 null
2024-09-18 Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models Lorenzo Mandelli et.al. 2409.11920 null
2024-09-18 DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech Xin Qi et.al. 2409.11835 null
2024-09-18 RaggeDi: Diffusion-based State Estimation of Disordered Rags, Sheets, Towels and Blankets Jikai Ye et.al. 2409.11831 null
2024-09-18 InverseMeetInsert: Robust Real Image Editing via Geometric Accumulation Inversion in Guided Diffusion Models Yan Zheng et.al. 2409.11734 null
2024-09-18 GUNet: A Graph Convolutional Network United Diffusion Model for Stable and Diversity Pose Generation Shuowen Liang et.al. 2409.11689 null
2024-09-18 Recurrent Interpolants for Probabilistic Time Series Prediction Yu Chen et.al. 2409.11684 null
2024-09-18 SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation Mingze Sun et.al. 2409.11682 null
2024-09-18 Electromagnetic Property Sensing and Channel Reconstruction Based on Diffusion Schrödinger Bridge in ISAC Yuhua Jiang et.al. 2409.11651 null
2024-09-17 Ultrasound Image Enhancement with the Variance of Diffusion Models Yuxin Zhang et.al. 2409.11380 link
2024-09-17 OSV: One Step is Enough for High-Quality Image to Video Generation Xiaofeng Mao et.al. 2409.11367 null
2024-09-17 Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Gonzalo Martin Garcia et.al. 2409.11355 link
2024-09-17 OmniGen: Unified Image Generation Shitao Xiao et.al. 2409.11340 link
2024-09-17 Parameter dependent rough SDEs with applications to rough PDEs Fabio Bugini et.al. 2409.11330 null
2024-09-17 fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction Jianxiong Gao et.al. 2409.11315 null
2024-09-17 DroneDiffusion: Robust Quadrotor Dynamics Learning with Diffusion Models Avirup Das et.al. 2409.11292 null
2024-09-17 Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models Tianqi Chen et.al. 2409.11219 null
2024-09-17 High-Resolution Speech Restoration with Latent Diffusion Model Tushar Dhyani et.al. 2409.11145 null
2024-09-17 In-situ measurements of light diffusion in an optically dense atomic ensemble Antoine Glicenstein et.al. 2409.11117 null
2024-09-17 TacDiffusion: Force-domain Diffusion Policy for Precise Tactile Manipulation Yansong Wu et.al. 2409.11047 null
2024-09-17 Enhanced segmentation of femoral bone metastasis in CT scans of patients using synthetic data generation with 3D diffusion models Emile Saillard et.al. 2409.11011 null
2024-09-17 Local discontinuous Galerkin method for nonlinear BSPDEs of Neumann boundary conditions with deep backward dynamic programming time-marching Yixiang Dai et.al. 2409.11004 null
2024-09-17 Edge-based Denoising Image Compression Ryugo Morita et.al. 2409.10978 null
2024-09-17 CUNSB-RFIE: Context-aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement Xuanzhao Dong et.al. 2409.10966 null
2024-09-16 Incorporating Classifier-Free Guidance in Diffusion Model-Based Recommendation Noah Buchanan et.al. 2409.10494 null
2024-09-16 SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing Qi Qian et.al. 2409.10476 null
2024-09-16 MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion Lehong Wu et.al. 2409.10473 null
2024-09-16 Mamba-ST: State Space Model for Efficient Style Transfer Filippo Botti et.al. 2409.10385 link
2024-09-16 Stochastic Control of UAVs: An Optimal Tradeoff between Performance, Flight Smoothness and Control Effort George Rapakoulias et.al. 2409.10369 null
2024-09-16 Taming Diffusion Models for Image Restoration: A Review Ziwei Luo et.al. 2409.10353 null
2024-09-16 Fairness, not Emotion, Drives Socioeconomic Decision Making Rudra Mukhopadhyay et.al. 2409.10322 null
2024-09-16 DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis Fa-Ting Hong et.al. 2409.10281 null
2024-09-16 RealDiff: Real-world 3D Shape Completion using Self-Supervised Diffusion Models Başak Melis Öcal et.al. 2409.10180 null
2024-09-16 PSHuman: Photorealistic Single-view Human Reconstruction using Cross-Scale Diffusion Peng Li et.al. 2409.10141 null
2024-09-16 Approximating the signature of Brownian motion for high order SDE simulation James Foster et.al. 2409.10118 null
2024-09-16 DDoS: Diffusion Distribution Similarity for Out-of-Distribution Detection Kun Fang et.al. 2409.10094 null
2024-09-16 MotionCom: Automatic and Motion-Aware Image Composition with LLM and Video Diffusion Prior Weijing Tao et.al. 2409.10090 link
2024-09-16 Cross-modality image synthesis from TOF-MRA to CTA using diffusion-based models Alexander Koch et.al. 2409.10089 null
2024-09-16 A Riemannian Approach to Ground Metric Learning for Optimal Transport Pratik Jawanpuria et.al. 2409.10085 null
2024-09-13 Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation Qingwen Bu et.al. 2409.09016 link
2024-09-13 A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis Yohan Poirier-Ginter et.al. 2409.08947 null
2024-09-13 Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation Guojun Liang et.al. 2409.08917 link
2024-09-13 Gaussian is All You Need: A Unified Framework for Solving Inverse Problems via Diffusion Posterior Sampling Nebiyou Yismaw et.al. 2409.08906 null
2024-09-13 Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control Carles Domingo-Enrich et.al. 2409.08861 null
2024-09-13 InstantDrag: Improving Interactivity in Drag-based Image Editing Joonghyuk Shin et.al. 2409.08857 null
2024-09-13 DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s) Yun Su Jeong et.al. 2409.08850 null
2024-09-13 Measure-Theoretic Time-Delay Embedding Jonah Botvinick-Greenhouse et.al. 2409.08768 link
2024-09-13 DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset Jiawei Du et.al. 2409.08731 null
2024-09-13 Asymptotics for Random Quadratic Transportation Costs Martin Huesmann et.al. 2409.08612 null
2024-09-13 Finite-time thermodynamic bounds and tradeoff relations for information processing Takuya Kamijima et.al. 2409.08606 null
2024-09-13 STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment Yong Ren et.al. 2409.08601 null
2024-09-13 LHQ-SVC: Lightweight and High Quality Singing Voice Conversion Modeling Yubo Huang et.al. 2409.08583 null
2024-09-13 DiffFAS: Face Anti-Spoofing via Generative Diffusion Models Xinxu Ge et.al. 2409.08572 link
2024-09-13 Think Twice Before You Act: Improving Inverse Problem Solving With MCMC Yaxuan Zhu et.al. 2409.08551 null
2024-09-12 DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors Thomas Hanwen Zhu et.al. 2409.08278 null
2024-09-12 DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer Runjia Li et.al. 2409.08271 null
2024-09-12 Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation Samanta Rodriguez et.al. 2409.08269 null
2024-09-12 Improving Text-guided Object Inpainting with Semantic Pre-inpainting Yifu Chen et.al. 2409.08260 link
2024-09-12 Improving Virtual Try-On with Garment-focused Diffusion Models Siqi Wan et.al. 2409.08258 null
2024-09-12 LoRID: Low-Rank Iterative Diffusion for Adversarial Purification Geigh Zollicoffer et.al. 2409.08255 null
2024-09-12 Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding Hongyu Li et.al. 2409.08251 null
2024-09-12 IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation Yinwei Wu et.al. 2409.08240 null
2024-09-12 How can the tragedy of the commons be prevented?: Introducing Linear Quadratic Mixed Mean Field Games Gokce Dayanikli et.al. 2409.08235 null
2024-09-12 LT3SD: Latent Trees for 3D Scene Diffusion Quan Meng et.al. 2409.08215 null
2024-09-12 VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis Hao Chen et.al. 2409.08207 null
2024-09-12 MagicStyle: Portrait Stylization Based on Reference Image Zhaoli Deng et.al. 2409.08156 null
2024-09-12 EZIGen: Enhancing zero-shot subject-driven image generation with precise subject encoding and decoupled guidance Zicheng Duan et.al. 2409.08091 null
2024-09-12 Diffusion-Based Image-to-Image Translation by Noise Correction via Prompt Interpolation Junsung Lee et.al. 2409.08077 null
2024-09-12 AI-accelerated discovery of high critical temperature superconductors Xiao-Qi Han et.al. 2409.08065 null
2024-09-11 DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation Haibo Yang et.al. 2409.07454 null
2024-09-11 Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models Haibo Yang et.al. 2409.07452 link
2024-09-11 FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process Yang Luo et.al. 2409.07451 null
2024-09-11 Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging Yunzhen Wang et.al. 2409.07417 null
2024-09-11 Training-Free Guidance for Discrete Diffusion Models for Molecular Generation Thomas J. Kerby et.al. 2409.07359 null
2024-09-11 Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching Eugenio Chisari et.al. 2409.07343 null
2024-09-11 Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models Fengzhe Zhang et.al. 2409.07323 null
2024-09-11 Exploring User-level Gradient Inversion with a Diffusion Prior Zhuohang Li et.al. 2409.07291 null
2024-09-11 CCFExp: Facial Image Synthesis with Cycle Cross-Fusion Diffusion Model for Facial Paralysis Individuals Weixiang Gao et.al. 2409.07271 link
2024-09-11 Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models Sanoojan Baliah et.al. 2409.07269 link
2024-09-11 EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion Jian Zhang et.al. 2409.07255 null
2024-09-12 Alignment of Diffusion Models: Fundamentals, Challenges, and Future Buhua Liu et.al. 2409.07253 link
2024-09-11 Diff-VPS: Video Polyp Segmentation via a Multi-task Diffusion Network with Adversarial Temporal Reasoning Yingling Lu et.al. 2409.07238 link
2024-09-11 Phy124: Fast Physics-Driven 4D Content Generation from a Single Image Jiajing Lin et.al. 2409.07179 null
2024-09-11 Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models Jiahang Cao et.al. 2409.07163 null
2024-09-10 SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation Teng Hu et.al. 2409.06633 null
2024-09-10 One-Shot Imitation under Mismatched Execution Kushal Kedia et.al. 2409.06615 null
2024-09-10 Modelling Global Trade with Optimal Transport Thomas Gaskin et.al. 2409.06554 link
2024-09-10 Robust financial calibration: a Bayesian approach for neural SDEs Christa Cuchiero et.al. 2409.06551 link
2024-09-10 Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models Xin Jing et.al. 2409.06451 null
2024-09-10 Robust semi-parametric signal detection in particle physics with classifiers decorrelated via optimal transport Purvasha Chakravarti et.al. 2409.06399 null
2024-09-10 Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition Junzheng Zhang et.al. 2409.06371 null
2024-09-10 What happens to diffusion model likelihood when your model is conditional? Mattias Cross et.al. 2409.06364 null
2024-09-10 DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement Jia-Wei Liao et.al. 2409.06355 null
2024-09-10 Geometry of the Space of Partitioned Networks: A Unified Theoretical and Computational Framework Stephen Y Zhang et.al. 2409.06302 link
2024-09-10 Multi-Source Music Generation with Latent Diffusion Zhongweiyang Xu et.al. 2409.06190 link
2024-09-10 MyGo: Consistent and Controllable Multi-View Driving Video Generation with Camera Control Yining Yao et.al. 2409.06189 null
2024-09-10 EDADepth: Enhanced Data Augmentation for Monocular Depth Estimation Nischal Khanal et.al. 2409.06183 link
2024-09-09 Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer Michele Mancusi et.al. 2409.06096 null
2024-09-09 SVS-GAN: Leveraging GANs for Semantic Video Synthesis Khaled M. Seyam et.al. 2409.06074 null
2024-09-09 Enhancing Preference-based Linear Bandits via Human Response Time Shen Li et.al. 2409.05798 null
2024-09-09 Vector Quantized Diffusion Model Based Speech Bandwidth Extension Yuan Fang et.al. 2409.05784 null
2024-09-09 AS-Speech: Adaptive Style For Speech Synthesis Zhipeng Li et.al. 2409.05730 null
2024-09-09 Distributionally Robust Stochastic Data-Driven Predictive Control with Optimized Feedback Gain Ruiqi Li et.al. 2409.05727 null
2024-09-09 Quantitative approximation of stochastic kinetic equations: from discrete to continuum Zimo Hao et.al. 2409.05706 null
2024-09-09 pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning Jiahao Lai et.al. 2409.05701 null
2024-09-09 Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models Aakash Sen Sharma et.al. 2409.05668 null
2024-09-09 Forward KL Regularized Preference Optimization for Aligning Diffusion Policies Zhao Shan et.al. 2409.05622 null
2024-09-09 CipherDM: Secure Three-Party Inference for Diffusion Model Sampling Xin Zhao et.al. 2409.05414 null
2024-09-09 Sequential Posterior Sampling with Diffusion Models Tristan S. W. Stevens et.al. 2409.05399 null
2024-09-09 TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors Yichuan Mo et.al. 2409.05294 link
2024-09-08 The Stochastic Gause predator-prey model: noise-induced extinctions and invariance Leon Alexander Valencia et.al. 2409.05237 null
2024-09-08 Nuclear transparencies with a two step process of the $A(e,e'\pi^+)$ reactions Tae Keun Choi et.al. 2409.05129 null
2024-09-08 Diffusion-based Speech Enhancement with Schrödinger Bridge and Symmetric Noise Schedule Siyi Wang et.al. 2409.05116 null
2024-09-08 A Survey on Diffusion Models for Recommender Systems Jianghao Lin et.al. 2409.05033 link
2024-09-06 VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Yecheng Wu et.al. 2409.04429 null
2024-09-06 Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques Davide Clode da Silva et.al. 2409.04424 null
2024-09-06 How Fair is Your Diffusion Recommender Model? Daniele Malitesta et.al. 2409.04339 null
2024-09-06 Random effects estimation in a fractional diffusion model based on continuous observations Nesrine Chebli et.al. 2409.04331 null
2024-09-06 Probabilistic Representation for Viscosity Solutions to Double-Obstacle Quasi-Variational Inequalities Magnus Perninge et.al. 2409.04207 null
2024-09-06 Breaking the Brownian Barrier: Models and Manifestations of Molecular Diffusion in Complex Fluids Harish Srinivasan et.al. 2409.04199 null
2024-09-06 GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers Lorenza Prospero et.al. 2409.04196 null
2024-09-06 D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection Kentaro Hirahara et.al. 2409.04060 null
2024-09-06 A policy iteration algorithm for non-Markovian control problems Dylan Possamaï et.al. 2409.04037 null
2024-09-06 One-Shot Diffusion Mimicker for Handwritten Text Generation Gang Dai et.al. 2409.04004 link
2024-09-06 DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes Jianbiao Mei et.al. 2409.04003 link
2024-09-05 Data-Efficient Generation for Dataset Distillation Zhe Li et.al. 2409.03929 null
2024-09-05 Generating High Dimensional User-Specific Wireless Channels using Diffusion Models Taekyun Lee et.al. 2409.03924 null
2024-09-05 Neural Entropy Akhil Premkumar et.al. 2409.03817 null
2024-09-05 Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding Yunze Man et.al. 2409.03757 link
2024-09-05 ArtiFade: Learning to Generate High-quality Subject from Blemished Images Shuya Yang et.al. 2409.03745 null
2024-09-05 Quantum optimal transport with convex regularization Emanuele Caputo et.al. 2409.03698 null
2024-09-05 RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images Benzhi Wang et.al. 2409.03644 null
2024-09-05 DiffEVC: Any-to-Any Emotion Voice Conversion with Expressive Guidance Hsing-Hang Chou et.al. 2409.03636 null
2024-09-05 TCDiff: Triple Condition Diffusion Model with 3D Constraints for Stylizing Synthetic Faces Bernardo Biesseck et.al. 2409.03600 link
2024-09-05 DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture Qianlong Xiang et.al. 2409.03550 null
2024-09-05 On the mean field limit of consensus based methods Marvin Koß et.al. 2409.03518 null
2024-09-05 Blended Latent Diffusion under Attention Control for Real-World Video Editing Deyin Liu et.al. 2409.03514 null
2024-09-05 Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration Pei Wang et.al. 2409.03455 null
2024-09-05 Recursive Quantization for $\mathcal{L}_2$ Stabilization of a Finite Capacity Stochastic Control Loop with Intermittent State Observations Shrija Karmakar et.al. 2409.03398 null
2024-09-05 Enhancing User-Centric Privacy Protection: An Interactive Framework through Diffusion Models and Machine Unlearning Huaxi Huang et.al. 2409.03326 null
2024-09-05 SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model Weipeng Tan et.al. 2409.03270 null
2024-09-05 RoomDiffusion: A Specialized Diffusion Model in the Interior Design Industry Zhaowei Wang et.al. 2409.03198 null
2024-09-04 Spatial Diffusion for Cell Layout Generation Chen Li et.al. 2409.03106 link
2024-09-04 HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts Xinyu Liu et.al. 2409.02919 link
2024-09-04 Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling Kaiwen Zheng et.al. 2409.02908 null
2024-09-04 Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models Zhibin Liu et.al. 2409.02851 link
2024-09-04 Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model Tornike Karchkhadze et.al. 2409.02845 null
2024-09-04 Skip-and-Play: Depth-Driven Pose-Preserved Image Generation for Any Objects Kyungmin Jo et.al. 2409.02653 null
2024-09-04 MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos Junyi Ma et.al. 2409.02638 null
2024-09-04 Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Jianwen Jiang et.al. 2409.02634 null
2024-09-04 Rate-Adaptive Generative Semantic Communication Using Conditional Diffusion Models Pujing Yang et.al. 2409.02597 null
2024-09-04 Solving Video Inverse Problems Using Image Diffusion Models Taesung Kwon et.al. 2409.02574 null
2024-09-04 StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models Wen Li et.al. 2409.02543 link
2024-09-04 Sample what you cant compress Vighnesh Birodkar et.al. 2409.02529 null
2024-09-04 Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal Jifeng Hu et.al. 2409.02512 link
2024-09-04 Demographic parity in regression and classification within the unawareness framework Vincent Divol et.al. 2409.02471 null
2024-09-04 Training-free Color-Style Disentanglement for Constrained Text-to-Image Synthesis Aishwarya Agarwal et.al. 2409.02429 null
2024-09-04 Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering Peng Wang et.al. 2409.02426 link
2024-08-30 Subspace Diffusion Posterior Sampling for Travel-Time Tomography Xiang Cao et.al. 2408.17333 null
2024-08-30 Likelihood estimation for stochastic differential equations with mixed effects Fernando Baltazar-Larios et.al. 2408.17257 null
2024-08-30 The random periodic solutions for McKean-Vlasov stochastic differential equations Jianhai Bao et.al. 2408.17242 null
2024-08-30 A methodological framework for Resilience as a Service (RaaS) in multimodal urban transportation networks Sara Jaber et.al. 2408.17233 null
2024-09-02 RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance Avideep Mukherjee et.al. 2408.17095 null
2024-09-02 Instant Adversarial Purification with Adversarial Consistency Distillation Chun Tong Lei et.al. 2408.17064 null
2024-08-30 Text-to-Image Generation Via Energy-Based CLIP Roy Ganz et.al. 2408.17046 null
2024-08-30 High-fidelity holographic beam shaping with optimal transport and phase diversity Hunter Swan et.al. 2408.17025 null
2024-08-30 Contrastive Learning with Synthetic Positives Dewen Zeng et.al. 2408.16965 link
2024-09-02 Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis Theodoros Kouzelis et.al. 2408.16845 null
2024-08-29 ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model Fangfu Liu et.al. 2408.16767 null
2024-09-04 CSGO: Content-Style Composition in Text-to-Image Generation Peng Xing et.al. 2408.16766 null
2024-08-29 DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving Yongjie Fu et.al. 2408.16647 null
2024-09-02 RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model Zhuan Shi et.al. 2408.16634 null
2024-08-29 A Score-based Generative Solver for PDE-constrained Inverse Problems with Complex Priors Yankun Hong et.al. 2408.16626 null