Updated on 2024.12.21
Table of Contents
- <a href=#peft>PEFT</a>
- <a href=#text-to-image-generation>Text-to-Image Generation</a>
- <a href=#vision-language-models>Vision-Language Models</a>
- <a href=#generative-weight-space-modeling>Generative Weight Space Modeling</a>
- <a href=#data-distillation>Data Distillation</a>
- <a href=#schrodinger-bridge>Schrodinger Bridge</a>
- <a href=#dataset-distillation>Dataset Distillation</a>
- <a href=#synthetic-data-generation>Synthetic Data Generation</a>
PEFT
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-12-19 | FedPIA – Permuting and Integrating Adapters leveraging Wasserstein Barycenters for Finetuning Foundation Models in Multi-Modal Federated Learning | Pramit Saha et.al. | 2412.14424 | null |
2024-12-18 | Parameter-efficient Fine-tuning for improved Convolutional Baseline for Brain Tumor Segmentation in Sub-Saharan Africa Adult Glioma Dataset | Bijay Adhikari et.al. | 2412.14100 | null |
2024-12-18 | A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Method-Level Code Smell Detection | Beiqi Zhang et.al. | 2412.13801 | null |
2024-12-18 | Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models | Xinxin Liu et.al. | 2412.13488 | null |
2024-12-17 | Train More Parameters But Mind Their Placement: Insights into Language Adaptation with PEFT | Jenny Kunz et.al. | 2412.12674 | link |
2024-12-16 | Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering | Jinhe Bi et.al. | 2412.12359 | link |
2024-12-16 | A LoRA is Worth a Thousand Pictures | Chenxi Liu et.al. | 2412.12048 | null |
2024-12-11 | Adaptive Principal Components Allocation with the $\ell_{2,g}$-regularized Gaussian Graphical Model for Efficient Fine-Tuning Large Models | Jingjing Zheng et.al. | 2412.08592 | link |
2024-12-10 | PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition | Kartik Narayan et.al. | 2412.07771 | null |
2024-12-10 | MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning | Yufei Ma et.al. | 2412.07405 | null |
2024-12-13 | Crack-EdgeSAM Self-Prompting Crack Segmentation System for Edge Devices | Yingchu Wang et.al. | 2412.07205 | null |
2024-12-08 | Taming Sensitive Weights: Noise Perturbation Fine-tuning for Robust LLM Quantization | Dongwei Wang et.al. | 2412.06858 | null |
2024-12-09 | BoRA: Bi-dimensional Weight-Decomposed Low-Rank Adaptation | Qiushi Wang et.al. | 2412.06441 | null |
2024-12-19 | S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity | Xinyu Yang et.al. | 2412.06289 | null |
2024-12-08 | KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models | Fan Wang et.al. | 2412.06071 | link |
2024-12-07 | Training-Free Bayesianization for Low-Rank Adapters of Large Language Models | Haizhou Shi et.al. | 2412.05723 | link |
2024-12-06 | PETapter: Leveraging PET-style classification heads for modular few-shot parameter-efficient fine-tuning | Jonas Rieger et.al. | 2412.04975 | null |
2024-12-04 | Prompting Large Language Models for Clinical Temporal Relation Extraction | Jianping He et.al. | 2412.04512 | null |
2024-12-05 | SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | Seokju Yun et.al. | 2412.04077 | link |
2024-12-04 | Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning | Long Mai et.al. | 2412.03343 | link |
2024-12-03 | Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning | Zhaozhi Wang et.al. | 2412.02759 | null |
2024-12-03 | CPP-UT-Bench: Can LLMs Write Complex Unit Tests in C++? | Vaishnavi Bhargava et.al. | 2412.02735 | null |
2024-12-03 | LoRA Diffusion: Zero-Shot LoRA Synthesis for Diffusion Model Personalization | Ethan Smith et.al. | 2412.02352 | null |
2024-12-03 | A Comprehensive Evaluation of Large Language Models on Aspect-Based Sentiment Analysis | Changzhi Zhou et.al. | 2412.02279 | null |
2024-11-30 | Unified Parameter-Efficient Unlearning for LLMs | Chenlu Ding et.al. | 2412.00383 | null |
2024-11-29 | SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks | Kim-Celine Kahl et.al. | 2411.19688 | link |
2024-11-28 | Parameter-Efficient Transfer Learning for Music Foundation Models | Yiwei Ding et.al. | 2411.19371 | link |
2024-11-28 | PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning | Shenghui Li et.al. | 2411.19335 | null |
2024-11-28 | Enhancing Parameter-Efficient Fine-Tuning of Vision Transformers through Frequency-Based Adaptation | Son Thai Ly et.al. | 2411.19297 | link |
2024-11-27 | Challenges in Adapting Multilingual LLMs to Low-Resource Languages using LoRA PEFT Tuning | Omkar Khade et.al. | 2411.18571 | null |
2024-11-26 | PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning | Zhen Sun et.al. | 2411.17453 | null |
2024-11-29 | Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning | Hui-Yue Yang et.al. | 2411.17217 | null |
2024-11-25 | Towards Efficient Model-Heterogeneity Federated Learning for Large Models | Ruofan Jia et.al. | 2411.16796 | null |
2024-11-25 | Parameter Efficient Instruction Tuning: An Empirical Study | Pengfei He et.al. | 2411.16775 | null |
2024-11-25 | Graph Adapter of EEG Foundation Models for Parameter Efficient Fine Tuning | Toyotaro Suzumura et.al. | 2411.16155 | null |
2024-11-24 | Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models | Olivia Ma et.al. | 2411.15831 | null |
2024-11-21 | Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation | Seokil Ham et.al. | 2411.15224 | null |
2024-11-22 | LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement | Jieming Bian et.al. | 2411.14961 | null |
2024-11-21 | Multi LoRA Meets Vision: Merging multiple adapters to create a multi task model | Ege Kesim et.al. | 2411.14064 | null |
2024-11-17 | F$^3$OCUS – Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics | Pramit Saha et.al. | 2411.11912 | null |
2024-11-16 | HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization | Huaqin Zhao et.al. | 2411.10696 | null |
2024-11-12 | PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model | Yilun Liu et.al. | 2411.08212 | null |
2024-11-10 | Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques | Daniil Sulimov et.al. | 2411.06445 | null |
2024-11-06 | MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba | Masakazu Yoshimura et.al. | 2411.03855 | null |
2024-11-04 | PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption | Yifan Tan et.al. | 2411.03357 | null |
2024-11-05 | Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation | Junchen Fu et.al. | 2411.02992 | null |
2024-11-04 | Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study | André Storhaug et.al. | 2411.02462 | null |
2024-11-04 | Expanding Sparse Tuning for Low Memory Usage | Shufan Shen et.al. | 2411.01800 | link |
2024-11-15 | Visual Fourier Prompt Tuning | Runjia Zeng et.al. | 2411.01327 | link |
2024-10-31 | CleaR: Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning | Yeachan Kim et.al. | 2411.00873 | null |
2024-10-30 | FPE-LLM: Highly Intelligent Time-Series Forecasting and Language Interaction LLM in Energy Systems | Zihang Qiu et.al. | 2411.00852 | null |
2024-11-01 | Dual Low-Rank Adaptation for Continual Learning with Pre-Trained Models | Huancheng Chen et.al. | 2411.00623 | null |
2024-11-01 | Is Multiple Object Tracking a Matter of Specialization? | Gianluca Mancusi et.al. | 2411.00553 | null |
2024-11-01 | C2A: Client-Customized Adaptation for Parameter-Efficient Federated Learning | Yeachan Kim et.al. | 2411.00311 | link |
2024-10-29 | Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models | Donghoon Kim et.al. | 2411.00029 | null |
2024-10-30 | Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation | Wei Dong et.al. | 2410.22952 | null |
2024-10-30 | MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning | Xujia Wang et.al. | 2410.22782 | null |
2024-10-29 | Meta-Learning Adaptable Foundation Models | Jacob L. Block et.al. | 2410.22264 | null |
2024-10-29 | Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models | Raman Dutt et.al. | 2410.22149 | link |
2024-10-30 | IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models | Hang Guo et.al. | 2410.21759 | link |
2024-10-28 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Rambod Azimi et.al. | 2410.20777 | link |
2024-10-27 | Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation | Maohao Shen et.al. | 2410.20336 | null |
2024-11-01 | Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies | Luping Wang et.al. | 2410.19878 | null |
2024-10-23 | MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning | Jingfan Zhang et.al. | 2410.18035 | null |
2024-10-22 | Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations | Cheng Lei et.al. | 2410.16953 | null |
2024-10-22 | MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report | Samrajya Thapa et.al. | 2410.16239 | link |
2024-10-21 | Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning | Arijit Das et.al. | 2410.16029 | link |
2024-10-18 | Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Shuai Zhao et.al. | 2410.14425 | link |
2024-10-17 | LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning | Yiming Shi et.al. | 2410.13618 | link |
2024-10-16 | Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models | Sajjad Ghiasvand et.al. | 2410.13097 | null |
2024-10-17 | Prompt Compression for Large Language Models: A Survey | Zongqian Li et.al. | 2410.12388 | link |
2024-10-15 | Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models | Kai Yao et.al. | 2410.11772 | link |
2024-10-15 | LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models | Hossein Abdi et.al. | 2410.11551 | null |
2024-10-15 | RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates | Md Kowsher et.al. | 2410.10075 | link |
2024-10-13 | BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation | Peijia Qin et.al. | 2410.09758 | null |
2024-10-12 | Towards Efficient Visual-Language Alignment of the Q-Former for Visual Reasoning Tasks | Sungkyung Kim et.al. | 2410.09489 | link |
2024-10-15 | MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning | Yaming Yang et.al. | 2410.09437 | null |
2024-10-09 | Parameter-Efficient Fine-Tuning via Selective Discrete Cosine Transform | Yixian Shen et.al. | 2410.09103 | null |
2024-10-04 | BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language Models | Aofei Chang et.al. | 2410.09079 | null |
2024-10-11 | Parameter-Efficient Fine-Tuning of State Space Models | Kevin Galim et.al. | 2410.09016 | link |
2024-10-10 | Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning | Dingkang Liang et.al. | 2410.08114 | link |
2024-10-10 | SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture | Jiayi Han et.al. | 2410.07739 | null |
2024-10-10 | Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures | Yiming Chen et.al. | 2410.07698 | link |
2024-10-09 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers | Viktoriia Chekalina et.al. | 2410.07383 | link |
2024-10-09 | Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs | Ruijia Niu et.al. | 2410.06431 | null |
2024-10-08 | Are Large Language Models State-of-the-art Quality Estimators for Machine Translation of User-generated Content? | Shenbin Qian et.al. | 2410.06338 | link |
2024-10-15 | LoRTA: Low Rank Tensor Adaptation of Large Language Models | Ignacio Hounie et.al. | 2410.04060 | null |
2024-10-03 | Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection | Tianxiang Chen et.al. | 2410.02330 | link |
2024-10-02 | TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models | Zefang Liu et.al. | 2410.02062 | link |
2024-10-02 | NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models | Yibo Zhong et.al. | 2410.01870 | null |
2024-09-27 | A GEN AI Framework for Medical Note Generation | Hui Yi Leong et.al. | 2410.01841 | null |
2024-10-02 | DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models | Yuxuan Zhang et.al. | 2410.01497 | link |
2024-10-01 | PrivTuner with Homomorphic Encryption and LoRA: A P3EFT Scheme for Privacy-Preserving Parameter-Efficient Fine-Tuning of AI Foundation Models | Yang Li et.al. | 2410.00433 | null |
2024-09-30 | Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation | Pedro Henrique Paiola et.al. | 2410.00163 | null |
2024-09-30 | Resource Allocation for Stable LLM Training in Mobile Edge Computing | Chang Liu et.al. | 2409.20247 | null |
2024-09-30 | Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models | Luohe Shi et.al. | 2409.20181 | null |
2024-09-28 | FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models | Yucheng Xie et.al. | 2409.19289 | null |
2024-10-01 | Backdoor Attacks for LLMs with Weak-To-Strong Knowledge Distillation | Shuai Zhao et.al. | 2409.17946 | null |
2024-09-26 | PEDRO: Parameter-Efficient Fine-tuning with Prompt DEpenDent Representation MOdification | Tianfang Xie et.al. | 2409.17834 | null |
2024-09-30 | Efficient In-Domain Question Answering for Resource-Constrained Environments | Isaac Chung et.al. | 2409.17648 | null |
2024-10-07 | PACE: marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization | Yao Ni et.al. | 2409.17137 | link |
2024-09-25 | Parameter-efficient Bayesian Neural Networks for Uncertainty-aware Depth Estimation | Richard D. Paul et.al. | 2409.17085 | null |
2024-10-02 | Bone: Block Affine Transformation as Parameter Efficient Fine-tuning Methods for Large Language Models | Jiale Kang et.al. | 2409.15371 | link |
2024-09-22 | Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape | Tao Li et.al. | 2409.14396 | null |
2024-10-01 | Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm | Jaehan Kim et.al. | 2409.14119 | link |
2024-09-20 | HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation | Geyuan Zhang et.al. | 2409.13501 | null |
2024-09-17 | THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models | Mengfei Liang et.al. | 2409.11353 | link |
2024-09-17 | LPT++: Efficient Training on Mixture of Long-tailed Experts | Bowen Dong et.al. | 2409.11323 | null |
2024-09-17 | Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models | Divij Gupta et.al. | 2409.11302 | null |
2024-09-18 | Propulsion: Steering LLM with Tiny Fine-Tuning | Md Kowsher et.al. | 2409.10927 | link |
2024-09-16 | From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs | Navya Jain et.al. | 2409.10245 | null |
2024-09-14 | COMFORT: A Continual Fine-Tuning Framework for Foundation Models Targeted at Consumer Healthcare | Chia-Hao Li et.al. | 2409.09549 | null |
2024-09-14 | Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models | Alireza Salemi et.al. | 2409.09510 | link |
2024-09-13 | Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights | Dixi Yao et.al. | 2409.08482 | null |
2024-09-12 | Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation? | Kerem Cekmeceli et.al. | 2409.07960 | link |
2024-09-11 | Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region | Muhammad Akhtar Munir et.al. | 2409.07585 | link |
2024-09-10 | Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts | Assefa Seyoum Wahd et.al. | 2409.06821 | link |
2024-09-11 | Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models | Yao Shu et.al. | 2409.06277 | link |
2024-09-09 | SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values | Chengwei Sun et.al. | 2409.05926 | null |
2024-09-10 | Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment | Zhixian Zhao et.al. | 2409.05015 | null |
2024-09-06 | Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning | Xinyue Liu et.al. | 2409.04574 | null |
2024-09-04 | iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation | Hayeon Jo et.al. | 2409.02838 | null |
2024-09-04 | Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs | Ruoyu Wang et.al. | 2409.02686 | null |
2024-09-04 | Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA | Shuangyi Chen et.al. | 2409.02346 | null |
2024-09-02 | Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning | Chongjie Si et.al. | 2409.01035 | link |
2024-08-28 | 3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability | Baohao Liao et.al. | 2409.00119 | link |
2024-08-21 | SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models | Yang Cao et.al. | 2409.00055 | link |
2024-08-30 | MoRe Fine-Tuning with 10x Fewer Parameters | Wenxuan Tan et.al. | 2408.17383 | link |
2024-09-02 | Instant Adversarial Purification with Adversarial Consistency Distillation | Chun Tong Lei et.al. | 2408.17064 | null |
2024-08-28 | Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization | Léo Hemamou et.al. | 2408.15801 | null |
2024-08-27 | GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs | Maxim Zhelnin et.al. | 2408.15300 | link |
2024-08-27 | Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training | Xingliang Lei et.al. | 2408.15011 | null |
2024-08-27 | CVPT: Cross-Attention help Visual Prompt Tuning adapt visual task | Lingyun Huang et.al. | 2408.14961 | link |
2024-08-27 | Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models | Aradhye Agarwal et.al. | 2408.14470 | link |
2024-08-24 | Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings | Sagar Srinivas Sakhinana et.al. | 2408.13622 | null |
2024-08-21 | Positional Prompt Tuning for Efficient 3D Representation Learning | Shaochen Zhang et.al. | 2408.11567 | link |
2024-08-20 | Pluto and Charon: A Time and Memory Efficient Collaborative Edge AI Framework for Personal LLMs Fine-Tuning | Bei Ouyang et.al. | 2408.10746 | null |
2024-08-20 | TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning | Bin Wang et.al. | 2408.10688 | link |
2024-08-19 | TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition | Tianwei Lin et.al. | 2408.09856 | link |
2024-08-16 | Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models | Vladimir Araujo et.al. | 2408.09053 | null |
2024-08-14 | KIND: Knowledge Integration and Diversion in Diffusion Models | Yucheng Xie et.al. | 2408.07337 | null |
2024-08-30 | TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning | Yujie Feng et.al. | 2408.05200 | link |
2024-08-08 | Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models | Yupeng Chang et.al. | 2408.04556 | link |
2024-08-06 | SARA: Singular-Value Based Adaptive Low-Rank Adaption | Jihao Gu et.al. | 2408.03290 | null |
2024-08-06 | Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi | Pranita Deshmukh et.al. | 2408.03172 | null |
2024-08-03 | TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks | Yang Yu et.al. | 2408.01835 | link |
2024-08-02 | MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts | Lin Ning et.al. | 2408.01505 | null |
2024-08-02 | Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs | Afia Anjum et.al. | 2408.01008 | null |
2024-07-31 | A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation | Mothilal Asokan et.al. | 2407.21739 | null |
2024-07-28 | Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models | Jifeng Wang et.al. | 2407.19564 | link |
2024-07-24 | Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective | Jingren Liu et.al. | 2407.17120 | null |
2024-07-22 | Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders | Laura Niss et.al. | 2407.15731 | null |
2024-07-21 | Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization | Jiajun Hu et.al. | 2407.15085 | null |
2024-07-16 | InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification | Yujia Hu et.al. | 2407.12882 | link |
2024-07-18 | Turning Generative Models Degenerate: The Power of Data Poisoning Attacks | Shuli Jiang et.al. | 2407.12281 | null |
2024-07-16 | Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification | Naif Alkhunaizi et.al. | 2407.11573 | null |
2024-07-16 | An efficient framework based on large foundation model for cervical cytopathology whole slide image screening | Jialong Huang et.al. | 2407.11486 | link |
2024-07-10 | RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization | Xijie Huang et.al. | 2407.08044 | link |
2024-07-10 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning | Marawan Gamal Abdel Hameed et.al. | 2407.07802 | link |
2024-07-10 | Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction | Yumin Kim et.al. | 2407.07517 | null |
2024-07-09 | Reprogramming Distillation for Medical Foundation Models | Yuhang Zhou et.al. | 2407.06504 | null |
2024-07-07 | See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition | Chongjie Si et.al. | 2407.05417 | link |
2024-07-16 | LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Shaowen Wang et.al. | 2407.05000 | link |
2024-07-05 | GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning | Aleksander Ficek et.al. | 2407.04528 | null |
2024-07-04 | Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models | Vorakit Vorakitphan et.al. | 2407.04050 | link |
2024-07-04 | ASteISR: Adapting Single Image Super-resolution Pre-trained Model for Efficient Stereo Image Super-resolution | Yuanbo Zhou et.al. | 2407.03598 | null |
2024-07-03 | Knowledge Composition using Task Vectors with Learned Anisotropic Scaling | Frederic Z. Zhang et.al. | 2407.02880 | link |
2024-07-03 | Exploring the Capabilities of LLMs for Code Change Related Tasks | Lishui Fan et.al. | 2407.02824 | link |
2024-07-02 | FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs | Haodong Chen et.al. | 2407.02157 | null |
2024-07-02 | CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications | Yupeng Cao et.al. | 2407.01953 | null |
2024-07-05 | Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Zihan Wang et.al. | 2407.01906 | link |
2024-07-01 | A Fingerprint for Large Language Models | Zhiguang Yang et.al. | 2407.01235 | null |
2024-07-02 | Embedded Prompt Tuning: Towards Enhanced Calibration of Pretrained Models for Medical Images | Wenqiang Zu et.al. | 2407.01003 | link |
2024-06-25 | Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning | Arijit Sehanobish et.al. | 2406.17740 | null |
2024-06-19 | Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks | Liangxin Qian et.al. | 2406.13602 | null |
2024-06-19 | Sparse High Rank Adapters | Kartikeya Bhardwaj et.al. | 2406.13175 | null |
2024-06-18 | Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates | Cristian Meo et.al. | 2406.13046 | null |
2024-06-18 | Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation | Branislav Pecher et.al. | 2406.12471 | link |
2024-06-17 | A Semantic-based Layer Freezing Approach to Efficient Fine-Tuning of Language Models | Jian Gu et.al. | 2406.11753 | null |
2024-06-16 | ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts | Samar Khanna et.al. | 2406.10973 | null |
2024-06-16 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation | Yurun Song et.al. | 2406.10785 | null |
2024-06-16 | RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning | Haoyu Wang et.al. | 2406.10777 | null |
2024-06-15 | Benchmarking Children’s ASR with Supervised and Self-supervised Speech Foundation Models | Ruchao Fan et.al. | 2406.10507 | link |
2024-06-15 | Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts | Zhaoxuan Tan et.al. | 2406.10471 | link |
2024-06-13 | Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models | Lukas Thede et.al. | 2406.09384 | null |
2024-06-12 | Exploring Fact Memorization and Style Imitation in LLMs Using QLoRA: An Experimental Study and Quality Assessment Methods | Eugene Vyborov et.al. | 2406.08582 | null |
2024-06-12 | The Impact of Initialization on LoRA Finetuning Dynamics | Soufiane Hayou et.al. | 2406.08447 | null |
2024-06-20 | Low-Rank Quantization-Aware Training for LLMs | Yelysei Bondarenko et.al. | 2406.06385 | link |
2024-06-10 | A Parameter-efficient Language Extension Framework for Multilingual ASR | Wei Liu et.al. | 2406.06329 | null |
2024-06-09 | A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair | Guochang Li et.al. | 2406.05639 | link |
2024-06-07 | Efficient Differentially Private Fine-Tuning of Diffusion Models | Jing Liu et.al. | 2406.05257 | null |
2024-06-07 | CorDA: Context-Oriented Decomposition Adaptation of Large Language Models | Yibo Yang et.al. | 2406.05223 | link |
2024-06-07 | An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Xiongtao Zhou et.al. | 2406.05130 | link |
2024-06-07 | MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Jitai Hao et.al. | 2406.04984 | link |
2024-06-06 | Time Sensitive Knowledge Editing through Efficient Finetuning | Xiou Ge et.al. | 2406.04496 | link |
2024-06-06 | VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation | Prashanth Vijayaraghavan et.al. | 2406.04379 | null |
2024-06-10 | Hypernetworks for Personalizing ASR to Atypical Speech | Max Müller-Eberstein et.al. | 2406.04240 | null |
2024-06-06 | Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning | Naibin Gu et.al. | 2406.03792 | link |
2024-06-05 | Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need | Martin Wistuba et.al. | 2406.03216 | null |
2024-06-06 | Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision | Minglei Li et.al. | 2406.03051 | null |
2024-05-31 | Mamba State-Space Models Can Be Strong Downstream Learners | John T. Halloran et.al. | 2406.00209 | null |
2024-05-30 | ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections | Massimo Bini et.al. | 2405.20271 | link |
2024-05-30 | SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors | Vijay Lingam et.al. | 2405.19597 | link |
2024-05-29 | MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection | Raman Dutt et.al. | 2405.19458 | link |
2024-05-29 | MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning | Junjie Wang et.al. | 2405.18897 | link |
2024-05-29 | Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation | Zelin Peng et.al. | 2405.18840 | null |
2024-06-01 | Low-Rank Few-Shot Adaptation of Vision-Language Models | Maxime Zanella et.al. | 2405.18541 | null |
2024-05-28 | Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning | Renzhi Wang et.al. | 2405.18292 | null |
2024-05-28 | VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections | Roy Miles et.al. | 2405.17991 | link |
2024-05-28 | Sparsity- and Hybridity-Inspired Visual Parameter-Efficient Fine-Tuning for Medical Diagnosis | Mingyuan Liu et.al. | 2405.17877 | null |
2024-05-27 | LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Klaudia Bałazy et.al. | 2405.17604 | link |
2024-05-23 | EMR-Merging: Tuning-Free High-Performance Model Merging | Chenyu Huang et.al. | 2405.17461 | link |
2024-05-28 | DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution | Yulong Mao et.al. | 2405.17357 | link |
2024-05-27 | $\textit{Trans-LoRA}$: towards data-free Transferable Parameter Efficient Finetuning | Runqian Wang et.al. | 2405.17258 | null |
2024-05-30 | Sparse Matrix in Large Language Model Fine-tuning | Haoze He et.al. | 2405.15525 | null |
2024-05-24 | Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation | Abhinav Jain et.al. | 2405.15282 | link |
2024-05-27 | VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks | Yang Li et.al. | 2405.15179 | link |
2024-05-23 | Bitune: Bidirectional Instruction-Tuning | Dawid J. Kopiczko et.al. | 2405.14862 | null |
2024-05-23 | Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference | Ting Liu et.al. | 2405.14700 | link |
2024-05-22 | Spectral Adapter: Fine-Tuning in Spectral Space | Fangzhao Zhang et.al. | 2405.13952 | link |
2024-05-24 | MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Jingwei Xu et.al. | 2405.13053 | link |
2024-05-20 | FeTT: Continual Class Incremental Learning via Feature Transformation Tuning | Sunyuan Qiang et.al. | 2405.11822 | null |
2024-05-21 | HARIS: Human-Like Attention for Reference Image Segmentation | Mengxi Zhang et.al. | 2405.10707 | null |
2024-05-28 | DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation | Jie Xu et.al. | 2405.06368 | null |
2024-05-09 | Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection | Bhawesh Kumar et.al. | 2405.06093 | null |
2024-05-09 | Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning | Shibo Jie et.al. | 2405.05615 | link |
2024-05-07 | Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning | Karim Galliamov et.al. | 2405.04126 | link |
2024-05-04 | Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning | Jing Xu et.al. | 2405.02596 | link |
2024-03-16 | Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R | Amirreza Esmaeili et.al. | 2405.01553 | null |
2024-05-02 | NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment | Gerald Shen et.al. | 2405.01481 | link |
2024-04-29 | LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report | Justin Zhao et.al. | 2405.00732 | link |
2024-05-01 | Investigating Automatic Scoring and Feedback using Large Language Models | Gloria Ashiya Katuka et.al. | 2405.00602 | null |
2024-05-01 | MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model | Rajat Sahay et.al. | 2405.00293 | null |
2024-04-30 | SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models | Samir Arora et.al. | 2405.00201 | null |
2024-05-23 | HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning | Chunlin Tian et.al. | 2404.19245 | link |
2024-05-25 | FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition | Yuxuan Yan et.al. | 2404.18848 | null |
2024-04-25 | Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models | Jiawei Chen et.al. | 2404.16385 | null |
2024-05-23 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Dengchun Li et.al. | 2404.15159 | link |
2024-04-22 | ColA: Collaborative Adaptation with Gradient Learning | Enmao Diao et.al. | 2404.13844 | link |
2024-04-23 | Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications | Charith Chandra Sai Balne et.al. | 2404.13506 | null |
2024-04-18 | SKIP: Skill-Localized Prompt Tuning for Inference Speed Boost-Up | Nakyeong Yang et.al. | 2404.11916 | null |
2024-04-16 | Shears: Unstructured Sparsity with Neural Low-rank Adapter Search | J. Pablo Muñoz et.al. | 2404.10934 | link |
2024-04-16 | Exact and Efficient Unlearning for Large Language Model-based Recommendation | Zhiyu Hu et.al. | 2404.10327 | null |
2024-04-15 | LoRA Dropout as a Sparsity Regularizer for Overfitting Control | Yang Lin et.al. | 2404.09610 | null |
2024-04-21 | Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in LLMs | Ahmed Agiza et.al. | 2404.08699 | link |
2024-04-08 | Certified PEFTSmoothing: Parameter-Efficient Fine-Tuning with Randomized Smoothing | Chengyan Fu et.al. | 2404.05350 | null |
2024-04-08 | DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model | Chao Gao et.al. | 2404.05182 | null |
2024-04-12 | Q-PEFT: Query-dependent Parameter Efficient Fine-tuning for Text Reranking with Large Language Models | Zhiyuan Peng et.al. | 2404.04522 | null |
2024-04-05 | Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation | Tong Su et.al. | 2404.04212 | null |
2024-05-22 | ReFT: Representation Finetuning for Language Models | Zhengxuan Wu et.al. | 2404.03592 | link |
2024-06-11 | Personalized LLM Response Generation with Parameterized Memory Injection | Kai Zhang et.al. | 2404.03565 | null |
2024-06-20 | Eigenpruning: an Interpretability-Inspired PEFT Method | Tomás Vergara-Browne et.al. | 2404.03147 | link |
2024-05-28 | PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Fanxu Meng et.al. | 2404.02948 | link |
2024-04-03 | Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data | Parth Patwa et.al. | 2404.02422 | null |
2024-04-11 | IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT | Junchen Fu et.al. | 2404.02059 | link |
2024-03-31 | Query-driven Relevant Paragraph Extraction from Legal Judgments | T. Y. S. S Santosh et.al. | 2404.00595 | null |
2024-03-30 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune your model unless you have access to GPT-4 | Aryo Pradipta Gema et.al. | 2404.00484 | link |
2024-04-03 | InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning | Yan-Shuo Liang et.al. | 2404.00228 | link |
2024-03-27 | Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation | Mateusz Klimaszewski et.al. | 2403.18804 | link |
2024-03-26 | The Unreasonable Ineffectiveness of the Deeper Layers | Andrey Gromov et.al. | 2403.17887 | null |
2024-04-15 | ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models | Zequan Liu et.al. | 2403.16187 | null |
2024-03-22 | KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation | Xindi Luo et.al. | 2403.14950 | link |
2024-03-22 | A Single Linear Layer Yields Task-Adapted Low-Rank Matrices | Hwichan Kim et.al. | 2403.14946 | null |
2024-03-21 | AutoRE: Document-Level Relation Extraction with Large Language Models | Xue Lilong et.al. | 2403.14888 | link |
2024-04-29 | Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey | Zeyu Han et.al. | 2403.14608 | null |
2024-03-20 | Harnessing Large Language Models for Text-Rich Sequential Recommendation | Zhi Zheng et.al. | 2403.13325 | link |
2024-04-16 | AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models | Zeyu Liu et.al. | 2403.13269 | null |
2024-03-18 | Improving LoRA in Privacy-preserving Federated Learning | Youbang Sun et.al. | 2403.12313 | null |
2024-03-18 | Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Wangbo Zhao et.al. | 2403.11808 | link |
2024-03-18 | Let’s Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model | Haoyun Xu et.al. | 2403.11621 | null |
2024-03-19 | JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning | Anique Tahir et.al. | 2403.11366 | link |
2024-03-14 | Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks | Tingyu Qu et.al. | 2403.09377 | link |
2024-03-14 | PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation | Yizhe Xiong et.al. | 2403.09192 | link |
2024-03-13 | Data-oriented Dynamic Fine-tuning Parameter Selection Strategy for FISH Mask based Efficient Fine-tuning | Ming Dong et.al. | 2403.08484 | null |
Text-to-Image Generation
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-12-19 | LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis | Hanlin Wang et.al. | 2412.15214 | null |
2024-12-19 | Flowing from Words to Pixels: A Framework for Cross-Modality Evolution | Qihao Liu et.al. | 2412.15213 | null |
2024-12-19 | Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation | Hadi Alzayer et.al. | 2412.15211 | null |
2024-12-19 | AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation | Moayed Haji-Ali et.al. | 2412.15191 | null |
2024-12-19 | LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation | Weijia Shi et.al. | 2412.15188 | null |
2024-12-19 | Tiled Diffusion | Or Madar et.al. | 2412.15185 | null |
2024-12-19 | SqueezeMe: Efficient Gaussian Avatars for VR | Shunsuke Saito et.al. | 2412.15171 | null |
2024-12-19 | OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization | Jiacheng Zhang et.al. | 2412.15159 | null |
2024-12-19 | Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM | Yatai Ji et.al. | 2412.15156 | link |
2024-12-19 | Jet: A Modern Transformer-Based Normalizing Flow | Alexander Kolesnikov et.al. | 2412.15129 | null |
2024-12-19 | Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation | Yang Tian et.al. | 2412.15109 | null |
2024-12-19 | Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation | Haoran Liu et.al. | 2412.15086 | null |
2024-12-19 | Eigenstate Preparation on Quantum Computers | Joey Bonitati et.al. | 2412.15081 | null |
2024-12-19 | Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion | Zhifei Chen et.al. | 2412.15050 | null |
2024-12-19 | DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space | Mang Ning et.al. | 2412.15032 | link |
2024-12-18 | AniDoc: Animation Creation Made Easier | Yihao Meng et.al. | 2412.14173 | null |
2024-12-19 | E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling | Zhihang Yuan et.al. | 2412.14170 | null |
2024-12-18 | Autoregressive Video Generation without Vector Quantization | Haoge Deng et.al. | 2412.14169 | link |
2024-12-18 | VideoDPO: Omni-Preference Alignment for Video Diffusion Generation | Runtao Liu et.al. | 2412.14167 | null |
2024-12-18 | MetaMorph: Multimodal Understanding and Generation via Instruction Tuning | Shengbang Tong et.al. | 2412.14164 | null |
2024-12-18 | MCMat: Multiview-Consistent and Physically Accurate PBR Material Generation | Shenhao Zhu et.al. | 2412.14148 | null |
2024-12-18 | Event-based Photometric Bundle Adjustment | Shuang Guo et.al. | 2412.14111 | null |
2024-12-18 | Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report | Markus Dablander et.al. | 2412.14085 | null |
2024-12-18 | SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation | Tong Chen et.al. | 2412.14018 | null |
2024-12-18 | Comparative Analysis of Machine Learning-Based Imputation Techniques for Air Quality Datasets with High Missing Data Rates | Sen Yan et.al. | 2412.13966 | null |
2024-12-18 | A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI | Beiduo Chen et.al. | 2412.13942 | null |
2024-12-18 | Development of a High-Resolution, High-Dynamic-Range Charge Detector for Ion Beam Monitoring | O. Adriani et.al. | 2412.13934 | null |
2024-12-18 | Investigating the Effects of Diffusion-based Conditional Generative Speech Models Used for Speech Enhancement on Dysarthric Speech | Joanna Reszka et.al. | 2412.13933 | null |
2024-12-18 | Graph-Driven Models for Gas Mixture Identification and Concentration Estimation on Heterogeneous Sensor Array Signals | Ding Wang et.al. | 2412.13891 | null |
2024-12-18 | Navigating limitations with precision: A fine-grained ensemble approach to wrist pathology recognition on a limited x-ray dataset | Ammar Ahmed et.al. | 2412.13884 | null |
2024-12-17 | CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models | Gaoyang Zhang et.al. | 2412.13195 | link |
2024-12-17 | StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models | Yunzhi Yan et.al. | 2412.13188 | null |
2024-12-17 | Move-in-2D: 2D-Conditioned Human Motion Generation | Hsin-Ping Huang et.al. | 2412.13185 | null |
2024-12-17 | F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration | Lu Liu et.al. | 2412.13155 | null |
2024-12-17 | Prompt Augmentation for Self-supervised Text-guided Image Manipulation | Rumeysa Bodur et.al. | 2412.13081 | null |
2024-12-17 | 3D MedDiffusion: A 3D Medical Diffusion Model for Controllable and High-quality Medical Image Generation | Haoshen Wang et.al. | 2412.13059 | null |
2024-12-17 | Guiding Generative Protein Language Models with Reinforcement Learning | Filippo Stocco et.al. | 2412.12979 | null |
2024-12-18 | Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance | Wenhao Sun et.al. | 2412.12974 | link |
2024-12-17 | ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting | Guillaume Couairon et.al. | 2412.12971 | link |
2024-12-17 | Modified UNIFAC 2.0 – A Group-Contribution Method Completed with Machine Learning | Nicolas Hayer et.al. | 2412.12962 | null |
2024-12-17 | MOPO: Multi-Objective Prompt Optimization for Affective Text Generation | Yarik Menchaca Resendiz et.al. | 2412.12948 | null |
2024-12-17 | Generation of cosmic ray trajectories by a Diffusion Model trained on test particles in 3D magnetohydrodynamic turbulence | Johannes Martin et.al. | 2412.12923 | null |
2024-12-17 | Unsupervised Region-Based Image Editing of Denoising Diffusion Models | Zixiang Li et.al. | 2412.12912 | null |
2024-12-18 | ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction | Zhongjie Duan et.al. | 2412.12888 | link |
2024-12-17 | Memory-minimal quantum generation of stochastic processes: spectral invariants of quantum hidden Markov models | Magdalini Zonnios et.al. | 2412.12812 | null |
2024-12-16 | Causal Diffusion Transformers for Generative Modeling | Chaorui Deng et.al. | 2412.12095 | link |
2024-12-16 | CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models | Felix Taubner et.al. | 2412.12093 | null |
2024-12-16 | Wonderland: Navigating 3D Scenes from a Single Image | Hanwen Liang et.al. | 2412.12091 | null |
2024-12-16 | A LoRA is Worth a Thousand Pictures | Chenxi Liu et.al. | 2412.12048 | null |
2024-12-16 | LLMs for Cold-Start Cutting Plane Separator Configuration | Connor Lawless et.al. | 2412.12038 | null |
2024-12-16 | Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps | Linfeng Zhao et.al. | 2412.12024 | null |
2024-12-16 | The entropic optimal (self-)transport problem: Limit distributions for decreasing regularization with application to score function estimation | Gilles Mordant et.al. | 2412.12007 | null |
2024-12-16 | Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data | Onur Tasar et.al. | 2412.11972 | null |
2024-12-16 | The Erdős unit distance problem for small point sets | Boris Alexeev et.al. | 2412.11914 | null |
2024-12-16 | CharacterBench: Benchmarking Character Customization of Large Language Models | Jinfeng Zhou et.al. | 2412.11912 | link |
2024-12-16 | Towards Understanding Systems Trade-offs in Retrieval-Augmented Generation Model Inference | Michael Shen et.al. | 2412.11854 | null |
2024-12-16 | ColorFlow: Retrieval-Augmented Image Sequence Colorization | Junhao Zhuang et.al. | 2412.11815 | null |
2024-12-16 | InterDyn: Controllable Interactive Dynamics with Video Diffusion Models | Rick Akkerman et.al. | 2412.11785 | null |
2024-12-16 | Joint Reconstruction of the Activity and the Attenuation in PET by Diffusion Posterior Sampling: a Feasibility Study | Clémentine Phung-Ngoc et.al. | 2412.11776 | null |
2024-12-17 | No More Adam: Learning Rate Scaling at Initialization is All You Need | Minghao Xu et.al. | 2412.11768 | link |
2024-12-13 | Towards a foundation model for heavy-ion collision experiments through point cloud diffusion | Manjunath Omana Kuttan et.al. | 2412.10352 | null |
2024-12-13 | BrushEdit: All-In-One Image Inpainting and Editing | Yaowei Li et.al. | 2412.10316 | null |
2024-12-13 | Iterating the Transient Light Transport Matrix for Non-Line-of-Sight Imaging | Talha Sultan et.al. | 2412.10300 | null |
2024-12-13 | Coherent 3D Scene Diffusion From a Single RGB Image | Manuel Dahnert et.al. | 2412.10294 | null |
2024-12-13 | Adversarial Robustness of Bottleneck Injected Deep Neural Networks for Task-Oriented Communication | Alireza Furutanpey et.al. | 2412.10265 | null |
2024-12-13 | Targeted Angular Reversal of Weights (TARS) for Knowledge Removal in Large Language Models | Harry J. Davies et.al. | 2412.10257 | null |
2024-12-13 | Exploring the Frontiers of Animation Video Generation in the Sora Era: Method, Dataset and Benchmark | Yudong Jiang et.al. | 2412.10255 | null |
2024-12-13 | Radiator Tailoring for Enhanced Performance in InAs-Based Near-Field Thermophotovoltaics | Mathieu Giroux et.al. | 2412.10217 | null |
2024-12-13 | GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion | Jiapeng Tang et.al. | 2412.10209 | null |
2024-12-13 | Efficient Generative Modeling with Residual Vector Quantization-Based Tokens | Jaehyeon Kim et.al. | 2412.10208 | null |
2024-12-13 | Simple Guidance Mechanisms for Discrete Diffusion Models | Yair Schiff et.al. | 2412.10193 | link |
2024-12-13 | SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models | Hung Nguyen et.al. | 2412.10178 | null |
2024-12-13 | Learning payoffs while routing in skill-based queues | Sanne van Kempen et.al. | 2412.10168 | null |
2024-12-13 | The Art of Deception: Color Visual Illusions and Diffusion Models | Alex Gomez-Villa et.al. | 2412.10122 | null |
2024-12-13 | Familiarity: Better Evaluation of Zero-Shot Named Entity Recognition by Quantifying Label Shifts in Synthetic Training Data | Jonas Golde et.al. | 2412.10121 | null |
2024-12-12 | FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion | Haonan Qiu et.al. | 2412.09626 | null |
2024-12-12 | Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors | Yue Feng et.al. | 2412.09625 | null |
2024-12-12 | GenEx: Generating an Explorable World | Taiming Lu et.al. | 2412.09624 | null |
2024-12-12 | OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation | Weiqi Li et.al. | 2412.09623 | null |
2024-12-12 | LoRACLR: Contrastive Adaptation for Customization of Diffusion Models | Enis Simsar et.al. | 2412.09622 | null |
2024-12-12 | SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training | Dongting Hu et.al. | 2412.09619 | null |
2024-12-12 | EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM | Zhuofan Zong et.al. | 2412.09618 | null |
2024-12-12 | Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG | Kavana Venkatesh et.al. | 2412.09614 | null |
2024-12-13 | Olympus: A Universal Task Router for Computer Vision Tasks | Yuanze Lin et.al. | 2412.09612 | link |
2024-12-12 | Owl-1: Omni World Model for Consistent Long Video Generation | Yuanhui Huang et.al. | 2412.09600 | link |
2024-12-12 | LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors | Yabo Chen et.al. | 2412.09597 | null |
2024-12-12 | Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion | Zexin He et.al. | 2412.09593 | null |
2024-12-12 | Improving the Reliability of Cable Broadband Networks via Proactive Network Maintenance | Jiyao Hu et.al. | 2412.09564 | null |
2024-12-12 | Meshtron: High-Fidelity, Artist-Like 3D Mesh Generation at Scale | Zekun Hao et.al. | 2412.09548 | null |
2024-12-12 | SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing | Xueting Li et.al. | 2412.09545 | null |
2024-12-11 | Generative Semantic Communication: Architectures, Technologies, and Applications | Jinke Ren et.al. | 2412.08642 | null |
2024-12-11 | DMin: Scalable Training Data Influence Estimation for Diffusion Models | Huawei Lin et.al. | 2412.08637 | link |
2024-12-11 | Multimodal Latent Language Modeling with Next-Token Diffusion | Yutao Sun et.al. | 2412.08635 | link |
2024-12-11 | An SDR-Based Monostatic Wi-Fi System with Analog Self-Interference Cancellation for Sensing | Andreas Toftegaard Kristensen et.al. | 2412.08612 | null |
2024-12-12 | Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis | Feng Zhou et.al. | 2412.08603 | null |
2024-12-11 | TryOffAnyone: Tiled Cloth Generation from a Dressed Person | Ioannis Xarchakos et.al. | 2412.08573 | link |
2024-12-12 | Watermarking Training Data of Music Generation Models | Pascal Epple et.al. | 2412.08549 | null |
2024-12-11 | Orderly Management of Packets in RDMA by Eunomia | Sana Mahmood et.al. | 2412.08540 | null |
2024-12-11 | Ensemble-Based Quantum-Token Protocol Benchmarked on IBM Quantum Processors | Lucas Tsunaki et.al. | 2412.08530 | null |
2024-12-11 | Comparative Opinion Mining in Product Reviews: Multi-perspective Prompt-based Learning | Hai-Yen Thi Nguyen et.al. | 2412.08508 | null |
2024-12-11 | Open-Loop and Model Predictive Control for Electric Vehicle Charging to Manage Excess Renewable Energy Supply in Texas | Kelsey M. Nelson et.al. | 2412.08505 | null |
2024-12-11 | Learning Flow Fields in Attention for Controllable Person Image Generation | Zijian Zhou et.al. | 2412.08486 | link |
2024-12-11 | InvDiff: Invariant Guidance for Bias Mitigation in Diffusion Models | Min Hou et.al. | 2412.08480 | link |
2024-12-11 | CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis | Mu Zhang et.al. | 2412.08464 | null |
2024-12-11 | Federated Learning for Traffic Flow Prediction with Synthetic Data Augmentation | Fermin Orozco et.al. | 2412.08460 | null |
2024-12-10 | Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets | Zhen Liu et.al. | 2412.07775 | null |
2024-12-10 | UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics | Xi Chen et.al. | 2412.07774 | null |
2024-12-10 | From Slow Bidirectional to Fast Causal Video Generators | Tianwei Yin et.al. | 2412.07772 | null |
2024-12-10 | Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds | Xiaoyu Xiang et.al. | 2412.07766 | null |
2024-12-10 | Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences | Alan Nawzad Amin et.al. | 2412.07763 | link |
2024-12-10 | Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation | Jingxi Chen et.al. | 2412.07761 | null |
2024-12-10 | SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints | Jianhong Bai et.al. | 2412.07760 | link |
2024-12-10 | PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation | Fatemeh Nazarieh et.al. | 2412.07754 | null |
2024-12-10 | Multi-Shot Character Consistency for Text-to-Video Generation | Yuval Atzmon et.al. | 2412.07750 | null |
2024-12-10 | StyleMaster: Stylize Your Video with Artistic Generation and Translation | Zixuan Ye et.al. | 2412.07744 | null |
2024-12-10 | STIV: Scalable Text and Image Conditioned Video Generation | Zongyu Lin et.al. | 2412.07730 | null |
2024-12-10 | ObjCtrl-2.5D: Training-free Object Control with Camera Poses | Zhouxia Wang et.al. | 2412.07721 | null |
2024-12-10 | ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer | Jinyi Hu et.al. | 2412.07720 | link |
2024-12-10 | Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions | Anant Prakash Awasthi et.al. | 2412.07687 | null |
2024-12-10 | Optimizing Sensor Redundancy in Sequential Decision-Making Problems | Jonas Nüßlein et.al. | 2412.07686 | null |
2024-12-10 | [MASK] is All You Need | Vincent Tao Hu et.al. | 2412.06787 | link |
2024-12-09 | Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation | Ruihan Gao et.al. | 2412.06785 | link |
2024-12-09 | Diverse Score Distillation | Yanbo Xu et.al. | 2412.06780 | null |
2024-12-09 | Visual Lexicon: Rich Image Features in Language Space | XuDong Wang et.al. | 2412.06774 | null |
2024-12-09 | InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention | Howard Zhang et.al. | 2412.06753 | null |
2024-12-09 | ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities | Adhiraj Ghosh et.al. | 2412.06745 | null |
2024-12-10 | ContRail: A Framework for Realistic Railway Image Synthesis using ControlNet | Andrei-Robert Alexandrescu et.al. | 2412.06742 | null |
2024-12-09 | Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection | Caiyun Xie et.al. | 2412.06727 | link |
2024-12-09 | You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale | Baorui Ma et.al. | 2412.06699 | link |
2024-12-09 | Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy | Yuxuan Xue et.al. | 2412.06698 | null |
2024-12-09 | Diff5T: Benchmarking Human Brain Diffusion MRI with an Extensive 5.0 Tesla K-Space and Spatial Dataset | Shanshan Wang et.al. | 2412.06666 | null |
2024-12-09 | Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion | Shuaiting Li et.al. | 2412.06661 | null |
2024-12-09 | MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences | Weitao Wang et.al. | 2412.06614 | null |
2024-12-09 | Augmented reality for upper limb rehabilitation: real-time kinematic feedback with HoloLens 2 | Beatrice Luciani et.al. | 2412.06596 | null |
2024-12-09 | EmoSpeech: A Corpus of Emotionally Rich and Contextually Detailed Speech Annotations | Weizhen Bian et.al. | 2412.06581 | null |
2024-12-06 | Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model | Lening Wang et.al. | 2412.05280 | link |
2024-12-06 | Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories | Susung Hong et.al. | 2412.05279 | null |
2024-12-06 | Birth and Death of a Rose | Chen Geng et.al. | 2412.05278 | null |
2024-12-06 | MotionFlow: Attention-Driven Motion Transfer in Video Diffusion Models | Tuna Han Salih Meral et.al. | 2412.05275 | null |
2024-12-06 | Go-or-Grow Models in Biology: a Monster on a Leash | R. Thiessen et.al. | 2412.05191 | null |
2024-12-06 | Privacy Drift: Evolving Privacy Concerns in Incremental Learning | Sayyed Farid Ahamed et.al. | 2412.05183 | null |
2024-12-06 | DNF: Unconditional 4D Generation with Dictionary-based Neural Fields | Xinyi Zhang et.al. | 2412.05161 | null |
2024-12-06 | A text-to-tabular approach to generate synthetic patient data using LLMs | Margaux Tornqvist et.al. | 2412.05153 | link |
2024-12-06 | LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation | Donald Shenaj et.al. | 2412.05148 | null |
2024-12-06 | How to Squeeze An Explanation Out of Your Model | Tiago Roxo et.al. | 2412.05134 | null |
2024-12-06 | Probabilistic Galaxy Field Generation with Diffusion Models | Tanner Sether et.al. | 2412.05131 | null |
2024-12-06 | The Silent Prompt: Initial Noise as Implicit Guidance for Goal-Driven Image Generation | Ruoyu Wang et.al. | 2412.05101 | null |
2024-12-06 | Reconstructing Quantitative Cerebral Perfusion Images Directly From Measured Sinogram Data Acquired Using C-arm Cone-Beam CT | Haotian Zhao et.al. | 2412.05084 | null |
2024-12-06 | ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration | Chi-Wei Hsiao et.al. | 2412.05043 | null |
2024-12-06 | Get It Right: Improving Comprehensibility with Adaptable Speech Expression of a Humanoid Service Robot | Thomas Sievers et.al. | 2412.05022 | null |
2024-12-05 | PaintScene4D: Consistent 4D Scene Generation from Text Prompts | Vinayak Gupta et.al. | 2412.04471 | null |
2024-12-05 | LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors | Yusuf Dalva et.al. | 2412.04460 | null |
2024-12-05 | Four-Plane Factorized Video Autoencoders | Mohammed Suhail et.al. | 2412.04452 | null |
2024-12-05 | MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation | Longtao Zheng et.al. | 2412.04448 | null |
2024-12-05 | DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models | Yizhuo Li et.al. | 2412.04446 | null |
2024-12-05 | Learning Artistic Signatures: Symmetry Discovery and Style Transfer | Emma Finn et.al. | 2412.04441 | null |
2024-12-05 | GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration | Kaiyi Huang et.al. | 2412.04440 | null |
2024-12-05 | Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation | Yuying Ge et.al. | 2412.04432 | link |
2024-12-05 | Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | Jian Han et.al. | 2412.04431 | link |
2024-12-05 | Reversible molecular simulation for training classical and machine learning force fields | Joe G Greener et.al. | 2412.04374 | link |
2024-12-05 | Machine Theory of Mind for Autonomous Cyber-Defence | Luke Swaby et.al. | 2412.04367 | null |
2024-12-05 | ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation | Dayoung Gong et.al. | 2412.04353 | null |
2024-12-05 | RMD: A Simple Baseline for More General Human Motion Generation via Training-free Retrieval-Augmented Motion Diffuse | Zhouyingcheng Liao et.al. | 2412.04343 | null |
2024-12-05 | Likelihood-Scheduled Score-Based Generative Modeling for Fully 3D PET Image Reconstruction | George Webber et.al. | 2412.04339 | null |
2024-12-05 | Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction | George Webber et.al. | 2412.04324 | null |
2024-12-04 | Navigation World Models | Amir Bar et.al. | 2412.03572 | null |
2024-12-04 | MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation | Zehuan Huang et.al. | 2412.03558 | null |
2024-12-04 | NODE-AdvGAN: Improving the transferability and perceptual similarity of adversarial examples by dynamic-system-driven adversarial generative model | Xinheng Xie et.al. | 2412.03539 | null |
2024-12-04 | NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images | Lingen Li et.al. | 2412.03517 | null |
2024-12-04 | Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion | Shengyuan Zhang et.al. | 2412.03515 | link |
2024-12-04 | Data Fusion of Semantic and Depth Information in the Context of Object Detection | Md Abu Yusuf et.al. | 2412.03490 | null |
2024-12-04 | Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective | Neta Shaul et.al. | 2412.03487 | null |
2024-12-04 | Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks | Dario Serez et.al. | 2412.03453 | link |
2024-12-04 | CleanDIFT: Diffusion Features without Noise | Nick Stracke et.al. | 2412.03439 | link |
2024-12-04 | SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model | Yan Li et.al. | 2412.03430 | null |
2024-12-04 | Skel3D: Skeleton Guided Novel View Synthesis | Aron Fóthi et.al. | 2412.03407 | null |
2024-12-04 | Identifiability implies consistency of MLE in partially observed diffusions on a torus | Ibrahim Ekren et.al. | 2412.03380 | null |
2024-12-04 | TASR: Timestep-Aware Diffusion Model for Image Super-Resolution | Qinwei Lin et.al. | 2412.03355 | link |
2024-12-04 | DIVE: Taming DINO for Subject-Driven Video Editing | Yi Huang et.al. | 2412.03347 | null |
2024-12-04 | Geometry-guided Cross-view Diffusion for One-to-many Cross-view Image Synthesis | Tao Jun Lin et.al. | 2412.03315 | null |
2024-12-03 | Motion Prompting: Controlling Video Generation with Motion Trajectories | Daniel Geng et.al. | 2412.02700 | null |
2024-12-03 | Diffusion-based Visual Anagram as Multi-task Learning | Zhiyuan Xu et.al. | 2412.02693 | link |
2024-12-03 | FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation | Kefan Chen et.al. | 2412.02690 | null |
2024-12-04 | SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance | Viet Nguyen et.al. | 2412.02687 | null |
2024-12-03 | AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction | Lingteng Qiu et.al. | 2412.02684 | null |
2024-12-03 | Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation | Yiftach Edelstein et.al. | 2412.02631 | null |
2024-12-03 | The effect of priors on Learning with Restricted Boltzmann Machines | Gianluca Manzan et.al. | 2412.02623 | null |
2024-12-03 | ComPair-2: A Next Generation Medium Energy Gamma-ray Telescope Prototype | Regina Caputo et.al. | 2412.02562 | null |
2024-12-03 | The Two-Center Problem of Uncertain Points on Cactus Graphs | Haitao Xu et.al. | 2412.02559 | null |
2024-12-03 | ShadowHack: Hacking Shadows via Luminance-Color Divide and Conquer | Jin Hu et.al. | 2412.02545 | link |
2024-12-03 | Unveiling Concept Attribution in Diffusion Models | Quang H. Nguyen et.al. | 2412.02542 | null |
2024-12-03 | LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data | Hanyu Zhang et.al. | 2412.02525 | null |
2024-12-03 | GerPS-Compare: Comparing NER methods for legal norm analysis | Sarah T. Bachinger et.al. | 2412.02427 | null |
2024-12-03 | It Takes Two: Real-time Co-Speech Two-person’s Interaction Generation via Reactive Auto-regressive Diffusion Model | Mingyi Shi et.al. | 2412.02419 | null |
2024-12-03 | A Multi-Agent Framework for Extensible Structured Text Generation in PLCs | Donghao Yang et.al. | 2412.02410 | null |
2024-11-29 | Nanostructured micrometric-pore membranes for nanofiltration: Micrometric geometry may optimize performance, energy efficiency and operational lifetime | J. C. Verde et.al. | 2411.19900 | null |
2024-11-29 | Input-Output Optics as a Causal Time Series Mapping: A Generative Machine Learning Solution | Abhijit Sen et.al. | 2411.19897 | null |
2024-11-29 | MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks | Yiming Wu et.al. | 2411.19786 | null |
2024-11-29 | Riemannian Denoising Score Matching for Molecular Structure Optimization with Accurate Energy | Jeheon Woo et.al. | 2411.19769 | null |
2024-11-29 | JetFormer: An Autoregressive Generative Model of Raw Images and Text | Michael Tschannen et.al. | 2411.19722 | null |
2024-11-29 | Inverse Design of Mechanical Metamaterials Using a Point-Cloud-Based Deep Generative Model | Seungwook Hong et.al. | 2411.19681 | null |
2024-11-29 | TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting | Bojun Xiong et.al. | 2411.19654 | null |
2024-11-29 | Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing | Wenyi Mo et.al. | 2411.19652 | link |
2024-11-29 | Enhancing Security in Third-Party Library Reuse – Comprehensive Detection of 1-day Vulnerability through Code Patch Analysis | Shangzhi Xu et.al. | 2411.19648 | null |
2024-11-29 | Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings | Qiong Wu et.al. | 2411.19628 | link |
2024-11-29 | Unimib Assistant: designing a student-friendly RAG-based chatbot for all their needs | Chiara Antico et.al. | 2411.19554 | null |
2024-11-29 | Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook | Florinel-Alin Croitoru et.al. | 2411.19537 | link |
2024-11-29 | Quantized Delta Weight Is Safety Keeper | Yule Liu et.al. | 2411.19530 | null |
2024-12-02 | DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding | Jungbin Cho et.al. | 2411.19527 | null |
2024-11-29 | Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis | Tianqi Li et.al. | 2411.19509 | null |
2024-11-27 | Textured Gaussians for Enhanced 3D Scene Appearance Modeling | Brian Chao et.al. | 2411.18625 | null |
2024-11-27 | GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data | Wentao Wang et.al. | 2411.18624 | null |
2024-11-27 | Diffusion Self-Distillation for Zero-Shot Customized Image Generation | Shengqu Cai et.al. | 2411.18616 | null |
2024-11-27 | CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models | Rundi Wu et.al. | 2411.18613 | null |
2024-11-27 | Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis | Eva Prakash et.al. | 2411.18602 | null |
2024-11-27 | Bit symmetry entails the symmetry of the quantum transition probability | Gerd Niestegge et.al. | 2411.18589 | null |
2024-11-27 | Building Confidence in Deep Generative Protein Design | Tianyuan Zheng et.al. | 2411.18568 | link |
2024-11-27 | High-throughput antibody screening with high-quality factor nanophotonics and bioprinting | Sajjad Abdollahramezani et.al. | 2411.18557 | null |
2024-11-27 | FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion | Haosen Yang et.al. | 2411.18552 | null |
2024-11-28 | Enhancing weed detection performance by means of GenAI-based image augmentation | Sourav Modak et.al. | 2411.18513 | null |
2024-11-27 | GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation | Pengfei Zhou et.al. | 2411.18499 | null |
2024-11-27 | Synthetic ECG Generation for Data Augmentation and Transfer Learning in Arrhythmia Classification | José Fernando Núñez et.al. | 2411.18456 | null |
2024-11-27 | Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator | Frederic Kirstein et.al. | 2411.18444 | null |
2024-11-27 | Learning the Evolution of Physical Structure of Galaxies via Diffusion Models | Andrew Lizarraga et.al. | 2411.18440 | link |
2024-11-27 | Search for heavy scalar or pseudoscalar states in $\mathrm{t \bar{t}}$ events at CMS | Laurids Jeppe et.al. | 2411.18414 | null |
2024-11-27 | StableAnimator: High-Quality Identity-Preserving Human Image Animation | Shuyuan Tu et.al. | 2411.17697 | link |
2024-11-26 | ScribbleLight: Single Image Indoor Relighting with Scribbles | Jun Myeong Choi et.al. | 2411.17696 | null |
2024-11-26 | Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis | Akshita Gupta et.al. | 2411.17690 | null |
2024-11-26 | GenDeg: Diffusion-Based Degradation Synthesis for Generalizable All-in-One Image Restoration | Sudarshan Rajagopalan et.al. | 2411.17687 | null |
2024-11-26 | Semi-analytical model for the calculation of solar radiation pressure and its effects on a LEO satellite with predicting the change in position vectors using machine learning techniques | Pranava Seth et.al. | 2411.17626 | null |
2024-11-26 | Accelerating Vision Diffusion Transformers with Skip Branches | Guanjie Chen et.al. | 2411.17616 | link |
2024-11-26 | Mixed-State Quantum Denoising Diffusion Probabilistic Model | Gino Kwun et.al. | 2411.17608 | null |
2024-11-26 | Making History Readable | Bipasha Banerjee et.al. | 2411.17600 | null |
2024-11-26 | VideoDirector: Precise Video Editing via Text-to-Video Models | Yukun Wang et.al. | 2411.17592 | null |
2024-11-26 | Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving | Jon Gutiérrez-Zaballa et.al. | 2411.17543 | null |
2024-11-26 | Metaverse Innovation Canvas: A Tool for Extended Reality Product/Service Development | Amir Reza Asadi et.al. | 2411.17541 | null |
2024-11-26 | IMPROVE: Improving Medical Plausibility without Reliance on Human Validation – An Enhanced Prototype-Guided Diffusion Framework | Anurag Shandilya et.al. | 2411.17535 | null |
2024-11-26 | FTMoMamba: Motion Generation with Frequency and Text State Space Models | Chengjian Li et.al. | 2411.17532 | null |
2024-11-26 | Exact and Heuristic Approaches for the Covering Tour Location Routing Problem | Andreas Hagn et.al. | 2411.17510 | link |
2024-11-26 | WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model | Zongjian Li et.al. | 2411.17459 | link |
2024-11-25 | Generative Omnimatte: Learning to Decompose Video into Layers | Yao-Chih Lee et.al. | 2411.16683 | null |
2024-11-25 | Diffusion Features for Zero-Shot 6DoF Object Pose Estimation | Bernd Von Gimborn et.al. | 2411.16668 | null |
2024-11-25 | DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation | Zun Wang et.al. | 2411.16657 | null |
2024-11-25 | Exploring Discrete Flow Matching for 3D De Novo Molecule Generation | Ian Dunn et.al. | 2411.16644 | link |
2024-11-25 | LegoPET: Hierarchical Feature Guided Conditional Diffusion for PET Image Reconstruction | Yiran Sun et.al. | 2411.16629 | null |
2024-11-25 | Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models | Ronghuan Wu et.al. | 2411.16602 | null |
2024-11-25 | Unlocking The Potential of Adaptive Attacks on Diffusion-Based Purification | Andre Kassis et.al. | 2411.16598 | link |
2024-11-25 | Rethinking Diffusion for Text-Driven Human Motion Generation | Zichong Meng et.al. | 2411.16575 | null |
2024-11-25 | Representation Collapsing Problems in Vector Quantization | Wenhao Zhao et.al. | 2411.16550 | null |
2024-11-25 | ADOBI: Adaptive Diffusion Bridge For Blind Inverse Problems with Application to MRI Reconstruction | Yuyang Hu et.al. | 2411.16535 | null |
2024-11-25 | PriorPath: Coarse-To-Fine Approach for Controlled De-Novo Pathology Semantic Masks Generation | Nati Daniel et.al. | 2411.16515 | null |
2024-11-25 | Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis | Boming Miao et.al. | 2411.16503 | null |
2024-11-25 | Multi-Resolution Generative Modeling of Human Motion from Limited Data | David Eduardo Moreno-Villamarín et.al. | 2411.16498 | null |
2024-11-25 | Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval | Xiaocong Yang et.al. | 2411.16454 | null |
2024-11-25 | Model-based reinforcement corrosion prediction: Continuous calibration with Bayesian optimization and corrosion wire sensor data | A. Potnis et.al. | 2411.16447 | null |
2024-11-22 | DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving | Bencheng Liao et.al. | 2411.15139 | link |
2024-11-22 | Material Anything: Generating Materials for Any 3D Object via Diffusion | Xin Huang et.al. | 2411.15138 | null |
2024-11-22 | VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement | Daeun Lee et.al. | 2411.15115 | null |
2024-11-22 | RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts | Hjalmar Wijk et.al. | 2411.15114 | link |
2024-11-22 | Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion | Samarth N Ramesh et.al. | 2411.15113 | null |
2024-11-22 | Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation | Lakshmikar R. Polamreddy et.al. | 2411.15084 | link |
2024-11-22 | Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network | Irfan Nafiz Shahan et.al. | 2411.15082 | link |
2024-11-22 | Empowering Clients: Transformation of Design Processes Due to Generative AI | Johannes Schneider et.al. | 2411.15061 | null |
2024-11-22 | The 1D nonlocal Fisher-KPP equation with a top hat kernel. Part 3. The effect of perturbations in the kernel | David John Needham et.al. | 2411.15054 | null |
2024-11-22 | FloAt: Flow Warping of Self-Attention for Clothing Animation Generation | Swasti Shreya Mishra et.al. | 2411.15028 | null |
2024-11-22 | Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation | Huy Le et.al. | 2411.14913 | null |
2024-11-22 | Dynamically Encircled Higher-order Exceptional Points in an Optical Fiber | Arpan Roy et.al. | 2411.14874 | null |
2024-11-22 | Prioritize Denoising Steps on Diffusion Model Preference Alignment via Explicit Denoised Distribution Estimation | Dingyuan Shi et.al. | 2411.14871 | null |
2024-11-22 | Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation | Jeongsol Kim et.al. | 2411.14863 | null |
2024-11-22 | Style-Friendly SNR Sampler for Style-Driven Generation | Jooyoung Choi et.al. | 2411.14793 | null |
2024-11-21 | Stable Flow: Vital Layers for Training-Free Image Editing | Omri Avrahami et.al. | 2411.14430 | null |
2024-11-21 | Transformer-based Heuristic for Advanced Air Mobility Planning | Jun Xiang et.al. | 2411.14427 | null |
2024-11-21 | A Python-Based Approach to Sputter Deposition Simulations in Combinatorial Materials Science | Felix Thelen et.al. | 2411.14413 | null |
2024-11-21 | Multi-Agent Environments for Vehicle Routing Problems | Ricardo Gama et.al. | 2411.14411 | link |
2024-11-21 | Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation | Yuanhao Cai et.al. | 2411.14384 | null |
2024-11-21 | CoNFiLD-inlet: Synthetic Turbulence Inflow Using Generative Latent Diffusion Models with Neural Fields | Xin-Yang Liu et.al. | 2411.14378 | null |
2024-11-21 | Enhancing Medical Image Segmentation with Deep Learning and Diffusion Models | Houze Liu et.al. | 2411.14353 | null |
2024-11-21 | DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Tianhe Ren et.al. | 2411.14347 | link |
2024-11-21 | Lower Dimensional Spherical Representation of Medium Voltage Load Profiles for Visualization, Outlier Detection, and Generative Modelling | Edgar Mauricio Salazar Duque et.al. | 2411.14346 | null |
2024-11-21 | StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart | Jian Shi et.al. | 2411.14295 | null |
2024-11-21 | Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models | Iacopo Ghinassi et.al. | 2411.14272 | link |
2024-11-21 | Guided MRI Reconstruction via Schrödinger Bridge | Yue Wang et.al. | 2411.14269 | null |
2024-11-21 | Regional Attention for Shadow Removal | Hengxing Liu et.al. | 2411.14201 | link |
2024-11-21 | TaQ-DiT: Time-aware Quantization for Diffusion Transformers | Xinyan Liu et.al. | 2411.14172 | null |
2024-11-21 | Creating a Formally Verified Neural Network for Autonomous Navigation: An Experience Report | Syed Ali Asadullah Bukhari et.al. | 2411.14163 | link |
2024-11-20 | REDUCIO! Generating 1024 $\times$ 1024 Video within 16 Seconds using Extremely Compressed Motion Latents | Rui Tian et.al. | 2411.13552 | link |
2024-11-20 | Identity Preserving 3D Head Stylization with Multiview Score Distillation | Bahri Batuhan Bilecen et.al. | 2411.13536 | null |
2024-11-20 | VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models | Ziqi Huang et.al. | 2411.13503 | link |
2024-11-20 | LIMBA: An Open-Source Framework for the Preservation and Valorization of Low-Resource Languages using Generative Models | Salvatore Mario Carta et.al. | 2411.13453 | null |
2024-11-20 | Heuristically Adaptive Diffusion-Model Evolutionary Strategy | Benedikt Hartl et.al. | 2411.13420 | null |
2024-11-20 | Energy-based generative models for monoclonal antibodies | Paul Pereira et.al. | 2411.13390 | link |
2024-11-20 | Small and Close-In Planets are Uncommon around A-type Stars | Steven Giacalone et.al. | 2411.13363 | null |
2024-11-20 | Vertical Validation: Evaluating Implicit Generative Models for Graphs on Thin Support Regions | Mai Elkady et.al. | 2411.13358 | null |
2024-11-20 | A CSI Feedback Framework based on Transmitting the Important Values and Generating the Others | Zhilin Du et.al. | 2411.13298 | null |
2024-11-21 | Structure-Based Molecule Optimization via Gradient-Guided Bayesian Update | Keyue Qiu et.al. | 2411.13280 | null |
2024-11-20 | XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation | Ziyi Wang et.al. | 2411.13243 | link |
2024-11-20 | BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework | Xu Zou et.al. | 2411.13237 | null |
2024-11-20 | Building music with Lego bricks and Raspberry Pi | Ana M. Barbancho et.al. | 2411.13224 | null |
2024-11-20 | A computational framework for integrating Predictive processes with evidence Accumulation Models (PAM) | Antonino Visalli et.al. | 2411.13203 | link |
2024-11-20 | OpenMS WebApps: Building User-Friendly Solutions for MS Analysis | Tom David Müller et.al. | 2411.13189 | null |
2024-11-19 | Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs | Ahmed Akib Jawad Karim et.al. | 2411.12712 | null |
2024-11-19 | OrigamiPlot: An R Package and Shiny Web App Enhanced Visualizations for Multivariate Data | Yiwen Lu et.al. | 2411.12674 | null |
2024-11-19 | Auto-Evaluation with Few Labels through Post-hoc Regression | Benjamin Eyre et.al. | 2411.12665 | null |
2024-11-19 | PoM: Efficient Image and Video Generation with the Polynomial Mixer | David Picard et.al. | 2411.12663 | link |
2024-11-19 | Optimizing Airline Reservation Systems with Edge-Enabled Microservices: A Framework for Real-Time Data Processing and Enhanced User Responsiveness | Biman Barua et.al. | 2411.12650 | null |
2024-11-19 | DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models | Vinay Kumar Sankarapu et.al. | 2411.12643 | link |
2024-11-19 | Improving Controllability and Editability for Pretrained Text-to-Music Generation Models | Yixiao Zhang et.al. | 2411.12641 | null |
2024-11-19 | Universal programmable waveguide arrays | Akram Youssry et.al. | 2411.12610 | null |
2024-11-19 | Whisper Finetuning on Nepali Language | Sanjay Rijal et.al. | 2411.12587 | null |
2024-11-19 | Predicting Customer Satisfaction by Replicating the Survey Response Distribution | Etienne Manderscheid et.al. | 2411.12539 | null |
2024-11-19 | Data Pruning in Generative Diffusion Models | Rania Briq et.al. | 2411.12523 | null |
2024-11-19 | Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing | Ruyi Ding et.al. | 2411.12508 | null |
2024-11-19 | Empirical Privacy Evaluations of Generative and Predictive Machine Learning Models – A review and challenges for practice | Flavio Hafner et.al. | 2411.12451 | null |
2024-11-19 | Frequency-Aware Guidance for Blind Image Restoration via Diffusion Models | Jun Xiao et.al. | 2411.12450 | null |
2024-11-19 | A general modeling and simulation framework for dynamic vehicle routing | Markó Horváth et.al. | 2411.12406 | link |
2024-11-18 | QARM: Quantitative Alignment Multi-Modal Recommendation at Kuaishou | Xinchen Luo et.al. | 2411.11739 | null |
2024-11-18 | Aligning Few-Step Diffusion Models with Dense Reward Difference Learning | Ziyi Zhang et.al. | 2411.11727 | link |
2024-11-18 | Multiscale nonlinear integration drives accurate encoding of input information | Giorgio Nicoletti et.al. | 2411.11710 | null |
2024-11-18 | Robust Reinforcement Learning under Diffusion Models for Data with Jumps | Chenyang Jiang et.al. | 2411.11697 | null |
2024-11-18 | Active droplets controlled by enzymatic reactions | Jacques Fries et.al. | 2411.11696 | null |
2024-11-18 | Do Captioning Metrics Reflect Music Semantic Alignment? | Jinwoo Lee et.al. | 2411.11692 | null |
2024-11-18 | Conceptwm: A Diffusion Model Watermark for Concept Protection | Liangqi Lei et.al. | 2411.11688 | null |
2024-11-19 | GNN-Based Code Annotation Logic for Establishing Security Boundaries in C Code | Varun Gadey et.al. | 2411.11567 | null |
2024-11-19 | Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation | Rüveyda Yilmaz et.al. | 2411.11515 | null |
2024-11-18 | Collaborative Contrastive Network for Click-Through Rate Prediction | Chen Gao et.al. | 2411.11508 | null |
2024-11-18 | LaVin-DiT: Large Vision Diffusion Transformer | Zhaoqing Wang et.al. | 2411.11505 | null |
2024-11-18 | Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art | Alejandro Hernandez et.al. | 2411.11494 | null |
2024-11-18 | MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion | Dongseok Shim et.al. | 2411.11475 | null |
2024-11-18 | GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts | Junwen He et.al. | 2411.11435 | null |
2024-11-18 | CLUE-MARK: Watermarking Diffusion Models using CLWE | Kareem Shehata et.al. | 2411.11434 | null |
2024-11-15 | M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation | Sucheng Ren et.al. | 2411.10433 | link |
2024-11-15 | Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems | Feiqin Zhu et.al. | 2411.10431 | null |
2024-11-15 | Multiscale Dubuc: A New Similarity Measure for Time Series | Mahsa Khazaei et.al. | 2411.10418 | link |
2024-11-15 | Experimental generation of extreme electron beams for advanced accelerator applications | Claudio Emma et.al. | 2411.10413 | null |
2024-11-15 | How to Build a Quantum Supercomputer: Scaling Challenges and Opportunities | Masoud Mohseni et.al. | 2411.10406 | null |
2024-11-15 | Nonlinearity-Driven Morphing and Control of Topological Modes in Non-Hermitian Systems | Zhao-Fan Cai et.al. | 2411.10398 | null |
2024-11-15 | Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion | Haoran Wei et.al. | 2411.10369 | null |
2024-11-15 | Safe Text-to-Image Generation: Simply Sanitize the Prompt Embedding | Huming Qiu et.al. | 2411.10329 | null |
2024-11-15 | Probabilistic Prior Driven Attention Mechanism Based on Diffusion Model for Imaging Through Atmospheric Turbulence | Guodong Sun et.al. | 2411.10321 | null |
2024-11-15 | Assortment Optimization under the Multinomial Logit Model with Covering Constraints | Omar El Housni et.al. | 2411.10310 | null |
2024-11-15 | Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting | Ziqi Xie et.al. | 2411.10309 | link |
2024-11-15 | MDHP-Net: Detecting Injection Attacks on In-vehicle Network using Multi-Dimensional Hawkes Process and Temporal Model | Qi Liu et.al. | 2411.10258 | null |
2024-11-15 | The Unreasonable Effectiveness of Guidance for Diffusion Models | Tim Kaiser et.al. | 2411.10257 | null |
2024-11-15 | Smooth transport map via diffusion process | Arthur Stéphanovitch et.al. | 2411.10235 | null |
2024-11-15 | ColorEdit: Training-free Image-Guided Color editing with diffusion model | Xingxi Yin et.al. | 2411.10232 | null |
2024-11-14 | A Bayesian Optimization Approach to Machine Translation Reranking | Julius Cheng et.al. | 2411.09694 | null |
2024-11-14 | SimTube: Generating Simulated Video Comments through Multimodal AI and User Personas | Yu-Kai Hung et.al. | 2411.09577 | null |
2024-11-14 | Golden Noise for Diffusion Models: A Learning Framework | Zikai Zhou et.al. | 2411.09502 | null |
2024-11-14 | Sparse Bayesian Generative Modeling for Compressive Sensing | Benedikt Böck et.al. | 2411.09483 | link |
2024-11-14 | DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing | Junjie Zhou et.al. | 2411.09451 | null |
2024-11-14 | Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models | Chutian Meng et.al. | 2411.09449 | null |
2024-11-14 | A survey of probabilistic generative frameworks for molecular simulations | Richard John et.al. | 2411.09388 | link |
2024-11-14 | Multi-scale Generative Modeling for Fast Sampling | Xiongye Xiao et.al. | 2411.09356 | null |
2024-11-14 | ParaLBench: A Large-Scale Benchmark for Computational Paralinguistics over Acoustic Foundation Models | Zixing Zhang et.al. | 2411.09349 | null |
2024-11-15 | Approximate Probabilistic Inference for Time-Series Data: A Robust Latent Gaussian Model With Temporal Awareness | Anton Johansson et.al. | 2411.09312 | null |
2024-11-14 | EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models | Soowon Kim et.al. | 2411.09302 | null |
2024-11-14 | LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space | Guanwen Feng et.al. | 2411.09268 | null |
2024-11-14 | Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey | Xuannan Liu et.al. | 2411.09259 | link |
2024-11-14 | RibCageImp: A Deep Learning Framework for 3D Ribcage Implant Generation | Gyanendra Chaubey et.al. | 2411.09204 | null |
2024-11-14 | Improvement and Implementation of a Speech Emotion Recognition Model Based on Dual-Layer LSTM | Xiaoran Yang et.al. | 2411.09189 | null |
2024-11-13 | 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization | Mijeong Kim et.al. | 2411.08879 | null |
2024-11-13 | A generalized software framework for consolidation of radiotherapy planning and delivery data from diverse data sources | Yasin Abdulkadir et.al. | 2411.08876 | null |
2024-11-13 | Offline Adaptation of Quadruped Locomotion using Diffusion Models | Reece O’Mahoney et.al. | 2411.08832 | null |
2024-11-13 | SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate | Yifei Jin et.al. | 2411.08767 | null |
2024-11-13 | Analyst Reports and Stock Performance: Evidence from the Chinese Market | Rui Liu et.al. | 2411.08726 | null |
2024-11-14 | Reducing ADC Front-end Costs During Training of On-sensor Printed Multilayer Perceptrons | Florentia Afentaki et.al. | 2411.08674 | null |
2024-11-13 | Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks | Zhang Liu et.al. | 2411.08672 | null |
2024-11-13 | Toward Human Understanding with Controllable Synthesis | Hanz Cuevas-Velasquez et.al. | 2411.08663 | null |
2024-11-13 | The Galactica database: an open, generic and versatile tool for the dissemination of simulation data in astrophysics | Damien Chapon et.al. | 2411.08647 | null |
2024-11-13 | Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models | Chengdong Dong et.al. | 2411.08642 | null |
2024-11-13 | Deep Generative Demand Learning for Newsvendor and Pricing | Shijin Gong et.al. | 2411.08631 | null |
2024-11-13 | LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation | Pengwei Yin et.al. | 2411.08606 | null |
2024-11-13 | CorrSynth – A Correlated Sampling Method for Diverse Dataset Generation from LLMs | Suhas S Kowshik et.al. | 2411.08553 | null |
2024-11-13 | Explainers’ Mental Representations of Explainees’ Needs in Everyday Explanations | Michael Erol Schaffer et.al. | 2411.08514 | null |
2024-11-13 | HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere | Hatef Otroshi Shahreza et.al. | 2411.08470 | null |
2024-11-12 | Scaling Properties of Diffusion Models for Perceptual Tasks | Rahul Ravishankar et.al. | 2411.08034 | null |
2024-11-12 | GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation | Yushi Lan et.al. | 2411.08033 | null |
2024-11-12 | Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings | Aditya Sanghi et.al. | 2411.08017 | link |
2024-11-12 | JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Yiyang Ma et.al. | 2411.07975 | link |
2024-11-12 | Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules | Binxu Wang et.al. | 2411.07873 | null |
2024-11-12 | Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders | Xiaofeng Zhu et.al. | 2411.07870 | null |
2024-11-12 | CDXFormer: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory | Zhenkai Wu et.al. | 2411.07863 | link |
2024-11-12 | Sparsity-Aware Optimization of In-Memory Bayesian Binary Neural Network Accelerators | Prabodh Katti et.al. | 2411.07842 | null |
2024-11-12 | Novel View Synthesis with Pixel-Space Diffusion Models | Noam Elata et.al. | 2411.07765 | null |
2024-11-12 | Nanosecond nanothermometry in an electron microscope | Florian Castioni et.al. | 2411.07764 | null |
2024-11-12 | LapGSR: Laplacian Reconstructive Network for Guided Thermal Super-Resolution | Aditya Kasliwal et.al. | 2411.07750 | null |
2024-11-12 | The relationship between general equilibrium models with infinite-lived agents and overlapping generations models, and some applications | Ngoc-Sang Pham et.al. | 2411.07674 | null |
2024-11-12 | Evaluating the Generation of Spatial Relations in Text and Image Generative Models | Shang Hong Sim et.al. | 2411.07664 | null |
2024-11-12 | Leveraging Previous Steps: A Training-free Fast Solver for Flow Diffusion | Kaiyu Song et.al. | 2411.07627 | null |
2024-11-12 | Unraveling the Connections between Flow Matching and Diffusion Probabilistic Models in Training-free Conditional Generation | Kaiyu Song et.al. | 2411.07625 | null |
2024-11-11 | Score-based generative diffusion with “active” correlated noise sources | Alexandra Lamtyugina et.al. | 2411.07233 | null |
2024-11-12 | Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models | Yoad Tewel et.al. | 2411.07232 | null |
2024-11-11 | Learning from Limited and Imperfect Data | Harsh Rangwani et.al. | 2411.07229 | null |
2024-11-11 | TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models | Matheus Simão et.al. | 2411.07224 | null |
2024-11-11 | DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID | Nyle Siddiqui et.al. | 2411.07205 | link |
2024-11-11 | Crossover from inhomogeneous to homogeneous response of a resonantly driven hBN quantum emitter | Domitille Gérard et.al. | 2411.07202 | null |
2024-11-11 | OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision | Cong Wei et.al. | 2411.07199 | null |
2024-11-11 | More Expressive Attention with Negative Weights | Ang Lv et.al. | 2411.07176 | link |
2024-11-11 | Edify 3D: Scalable High-Quality 3D Asset Generation | NVIDIA et.al. | 2411.07135 | null |
2024-11-11 | Benchmarking LLMs’ Judgments with No Gold Standard | Shengwei Xu et.al. | 2411.07127 | link |
2024-11-11 | Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models | NVIDIA et.al. | 2411.07126 | null |
2024-11-11 | Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models | Yanchen Wang et.al. | 2411.07121 | link |
2024-11-11 | Scaling Mesh Generation via Compressive Tokenization | Haohan Weng et.al. | 2411.07025 | link |
2024-11-11 | An Electrocardiogram Monitoring Device Based on STM32 | Wenqi Guan et.al. | 2411.06962 | null |
2024-11-11 | Generative Feature Training of Thin 2-Layer Networks | Johannes Hertrich et.al. | 2411.06848 | link |
2024-11-08 | StdGEN: Semantic-Decomposed 3D Character Generation from Single Images | Yuze He et.al. | 2411.05738 | null |
2024-11-08 | Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models | Jia-Hong Huang et.al. | 2411.05706 | null |
2024-11-08 | Improving Molecular Graph Generation with Flow Matching and Optimal Transport | Xiaoyang Hou et.al. | 2411.05676 | null |
2024-11-08 | Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | Nan Song et.al. | 2411.05544 | null |
2024-11-08 | Improving image synthesis with diffusion-negative sampling | Alakh Desai et.al. | 2411.05473 | null |
2024-11-08 | Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation | Peidong Liu et.al. | 2411.05472 | link |
2024-11-08 | IntellBot: Retrieval Augmented LLM Chatbot for Cyber Threat Knowledge Delivery | Dincy R. Arikkat et.al. | 2411.05442 | null |
2024-11-08 | RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction | Xingyu Ai et.al. | 2411.05354 | null |
2024-11-08 | Electro-diffusive modeling and the role of spine geometry on action potential propagation in neurons | Rahul Gulati et.al. | 2411.05329 | null |
2024-11-08 | Social balance in directed networks | Bingjie Hao et.al. | 2411.05327 | null |
2024-11-08 | SeqRFM: Fast RFM Analysis in Sequence Data | Yanxin Zheng et.al. | 2411.05317 | link |
2024-11-08 | Differentiable Calibration of Inexact Stochastic Simulation Models via Kernel Score Minimization | Ziwei Su et.al. | 2411.05315 | null |
2024-11-08 | A Real-time Face Mask Detection and Social Distancing System for COVID-19 using Attention-InceptionV3 Model | Abdullah Al Asif et.al. | 2411.05312 | null |
2024-11-08 | Adaptive Whole-Body PET Image Denoising Using 3D Diffusion Models with ControlNet | Boxiao Yu et.al. | 2411.05302 | null |
2024-11-08 | GPT Semantic Cache: Reducing LLM Costs and Latency via Semantic Embedding Caching | Sajal Regmi et.al. | 2411.05276 | null |
2024-11-07 | SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Muyang Li et.al. | 2411.05007 | link |
2024-11-07 | ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing | Jun-Kun Chen et.al. | 2411.05006 | null |
2024-11-07 | Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models | Shuhong Zheng et.al. | 2411.05005 | null |
2024-11-07 | ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning | David Junhao Zhang et.al. | 2411.05003 | null |
2024-11-07 | SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation | Koichi Namekata et.al. | 2411.04989 | null |
2024-11-07 | Few-Shot Task Learning through Inverse Generative Modeling | Aviv Netanyahu et.al. | 2411.04987 | null |
2024-11-07 | How fast does the WallGo? A package for computing wall velocities in first-order phase transitions | Andreas Ekstedt et.al. | 2411.04970 | link |
2024-11-07 | VAIR: Visuo-Acoustic Implicit Representations for Low-Cost, Multi-Modal Transparent Surface Reconstruction in Indoor Scenes | Advaith V. Sethuraman et.al. | 2411.04963 | null |
2024-11-07 | Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification | Mischa Dombrowski et.al. | 2411.04956 | null |
2024-11-07 | Fed-LDR: Federated Local Data-infused Graph Creation with Node-centric Model Refinement | Jiechao Gao et.al. | 2411.04936 | null |
2024-11-07 | DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion | Wenqiang Sun et.al. | 2411.04928 | null |
2024-11-07 | StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration | Panwen Hu et.al. | 2411.04925 | null |
2024-11-07 | Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion | Kaizhe Hu et.al. | 2411.04919 | link |
2024-11-07 | GASE: Generatively Augmented Sentence Encoding | Manuel Frank et.al. | 2411.04914 | null |
2024-11-07 | Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation | Benito Buchheim et.al. | 2411.04724 | null |
2024-11-06 | Community Forensics: Using Thousands of Generators to Train Fake Image Detectors | Jeongsoo Park et.al. | 2411.04125 | null |
2024-11-06 | Stepping Forward on the Last Mile | Chen Feng et.al. | 2411.04036 | null |
2024-11-06 | Prototyping O-RAN Enabled UAV Experimentation for the AERPAW Testbed | Joshua Moore et.al. | 2411.04027 | null |
2024-11-06 | Object-Centric Dexterous Manipulation from Human Motion Data | Yuanpei Chen et.al. | 2411.04005 | null |
2024-11-06 | Synomaly Noise and Multi-Stage Diffusion: A Novel Approach for Unsupervised Anomaly Detection in Ultrasound Imaging | Yuan Bi et.al. | 2411.04004 | null |
2024-11-06 | ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy | Chenrui Tie et.al. | 2411.03990 | null |
2024-11-06 | ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models | Ashutosh Srivastava et.al. | 2411.03982 | null |
2024-11-06 | Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning | Jiawei Yao et.al. | 2411.03978 | link |
2024-11-06 | Bayesian algorithmic perfumery: A Hierarchical Relevance Vector Machine for the Estimation of Personalized Fragrance Preferences based on Three Sensory Layers and Jungian Personality Archetypes | Rolando Gonzales Martinez et.al. | 2411.03965 | null |
2024-11-06 | Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks | Felipe Marra et.al. | 2411.03948 | link |
2024-11-06 | Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks | Ryan Campbell et.al. | 2411.03945 | link |
2024-11-06 | GUIDE-VAE: Advancing Data Generation with User Information and Pattern Dictionaries | Kutay Bölat et.al. | 2411.03936 | null |
2024-11-06 | Large Generative Model-assisted Talking-face Semantic Communication System | Feibo Jiang et.al. | 2411.03876 | null |
2024-11-06 | ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization | Huayang Huang et.al. | 2411.03862 | link |
2024-11-06 | Sub-DM: Subspace Diffusion Model with Orthogonal Decomposition for MRI Reconstruction | Yu Guan et.al. | 2411.03758 | null |
2024-11-05 | MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning | Ziliang Gan et.al. | 2411.03314 | null |
2024-11-05 | LLMs for Domain Generation Algorithm Detection | Reynier Leyva La O et.al. | 2411.03307 | null |
2024-11-05 | DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | Ying Zhou et.al. | 2411.03250 | null |
2024-11-05 | On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models | Tariq Berrada Ifriqi et.al. | 2411.03177 | null |
2024-11-05 | Unleashing the power of novel conditional generative approaches for new materials discovery | Lev Novitskiy et.al. | 2411.03156 | link |
2024-11-05 | Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting | Adrian B. Chłopowiec et.al. | 2411.03098 | null |
2024-11-05 | Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and Denoising | Tao Huang et.al. | 2411.03053 | null |
2024-11-05 | GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details | Zhongjin Luo et.al. | 2411.03047 | null |
2024-11-05 | Speaker Emotion Recognition: Leveraging Self-Supervised Models for Feature Extraction Using Wav2Vec2 and HuBERT | Pourya Jafarzadeh et.al. | 2411.02964 | null |
2024-11-05 | IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems | Heiko Oppel et.al. | 2411.02954 | null |
2024-11-05 | LDPM: Towards undersampled MRI reconstruction with MR-VAE and Latent Diffusion Prior | Xingjian Tang et.al. | 2411.02951 | null |
2024-11-05 | A scalable generative model for dynamical system reconstruction from neuroimaging data | Eric Volkmann et.al. | 2411.02949 | link |
2024-11-05 | Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey | Ao Fu et.al. | 2411.02914 | null |
2024-11-05 | The Unreasonable Effectiveness of LLMs for Query Optimization | Peter Akioyamen et.al. | 2411.02862 | link |
2024-11-05 | ADOPT: Modified Adam Can Converge with Any $β_2$ with the Optimal Rate | Shohei Taniguchi et.al. | 2411.02853 | link |
2024-11-04 | Training-free Regional Prompting for Diffusion Transformers | Anthony Chen et.al. | 2411.02395 | link |
2024-11-04 | How Far is Video Generation from World Model: A Physical Law Perspective | Bingyi Kang et.al. | 2411.02385 | null |
2024-11-04 | Virgo Filaments IV: Using WISE to Measure the Modification of Star-Forming Disks in the Extended Regions Around the Virgo Cluster | Kim Conger et.al. | 2411.02352 | null |
2024-11-04 | Diffusion-based Generative Multicasting with Intent-aware Semantic Decomposition | Xinkai Liu et.al. | 2411.02334 | null |
2024-11-05 | PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Ruyang Liu et.al. | 2411.02327 | link |
2024-11-04 | LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation | Mufei Li et.al. | 2411.02322 | link |
2024-11-04 | CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments | Kung-Hsiang Huang et.al. | 2411.02305 | link |
2024-11-04 | Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation | Xianghui Yang et.al. | 2411.02293 | null |
2024-11-04 | Counterfactual Explanations via Riemannian Latent Space Traversal | Paraskevas Pegios et.al. | 2411.02259 | null |
2024-11-04 | FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training | Ruihong Yin et.al. | 2411.02229 | null |
2024-11-04 | Recursive Learning of Asymptotic Variational Objectives | Alessandro Mastrototaro et.al. | 2411.02217 | null |
2024-11-04 | Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models | Anjith George et.al. | 2411.02188 | null |
2024-11-04 | Touch-to-Touch Translation – Learning the Mapping Between Heterogeneous Tactile Sensing Technologies | Francesco Grella et.al. | 2411.02187 | null |
2024-11-04 | CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality | Yiqin Zhao et.al. | 2411.02179 | null |
2024-11-04 | CryptoEL: A Novel Experiential Learning Tool for Enhancing K-12 Cryptography Education | Pranathi Rayavaram et.al. | 2411.02143 | null |
2024-10-31 | Bridging Geometric States via Geometric Diffusion Bridge | Shengjie Luo et.al. | 2410.24220 | null |
2024-10-31 | Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning | Penghui Ruan et.al. | 2410.24219 | link |
2024-10-31 | DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion | Weicai Ye et.al. | 2410.24203 | link |
2024-10-31 | Multi-Attribute Linguistic Tuning for Controlled Paraphrase Generation | Mohamed Elgaar et.al. | 2410.24199 | null |
2024-10-31 | Generative modelling for mass-mapping with fast uncertainty quantification | Jessica J. Whitney et.al. | 2410.24197 | link |
2024-10-31 | AR-Pro: Counterfactual Explanations for Anomaly Repair with Formal Properties | Xiayan Ji et.al. | 2410.24178 | null |
2024-10-31 | Redefining | Fu Feng et.al. | 2410.24160 | null |
2024-10-31 | Scaling Concept With Text-Guided Diffusion Models | Chao Huang et.al. | 2410.24151 | null |
2024-10-31 | Repository-Level Compositional Code Translation and Validation | Ali Reza Ibrahimzada et.al. | 2410.24117 | link |
2024-10-31 | Extended electrochemical monitoring of biomolecular binding using commercially available, reusable electrodes in microliter volumes | Jeremy Mendez et.al. | 2410.24110 | null |
2024-10-31 | Sparsh: Self-supervised touch representations for vision-based tactile sensing | Carolina Higuera et.al. | 2410.24090 | null |
2024-10-31 | Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure | Xiang Li et.al. | 2410.24060 | link |
2024-10-31 | TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation | Sunjae Yoon et.al. | 2410.24037 | null |
2024-10-31 | Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities | Hatef Otroshi Shahreza et.al. | 2410.24015 | null |
2024-10-31 | DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination | Jia Fu et.al. | 2410.24006 | link |
2024-10-30 | ReferEverything: Towards Segmenting Everything We Can Speak of in Videos | Anurag Bagchi et.al. | 2410.23287 | null |
2024-10-30 | Provable acceleration for diffusion models under minimal assumptions | Gen Li et.al. | 2410.23285 | null |
2024-10-30 | RelationBooth: Towards Relation-Aware Customized Object Generation | Qingyu Shi et.al. | 2410.23280 | null |
2024-10-30 | SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation | Yining Hong et.al. | 2410.23277 | null |
2024-10-30 | Multi-student Diffusion Distillation for Better One-step Generators | Yanke Song et.al. | 2410.23274 | null |
2024-10-30 | ReaWristic: Remote Touch Sensation to Fingers from a Wristband via Visually Augmented Electro-Tactile Feedback | Yudai Tanaka et.al. | 2410.23193 | null |
2024-10-30 | Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning | Keqin Bao et.al. | 2410.23136 | link |
2024-10-30 | Educating for Hardware Specialization in the Chiplet Era: A Path for the HPC Community | Kazutomo Yoshii et.al. | 2410.23127 | null |
2024-10-30 | CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Mingkun Zhang et.al. | 2410.23091 | link |
2024-10-30 | General Bayesian quantile regression for counts via generative modeling | Yuta Yamauchi et.al. | 2410.23081 | null |
2024-10-30 | Controlling Language and Diffusion Models by Transporting Activations | Pau Rodriguez et.al. | 2410.23054 | link |
2024-10-30 | Dispersion kinks from electronic correlations in an unconventional iron-based superconductor | Ming-Hua Chang et.al. | 2410.23044 | null |
2024-10-30 | Improving Musical Accompaniment Co-creation via Diffusion Transformers | Javier Nistal et.al. | 2410.23005 | null |
2024-10-30 | DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes | Jialiang Zhang et.al. | 2410.23004 | null |
2024-10-30 | LumiSculpt: A Consistency Lighting Control Network for Video Generation | Yuxin Zhang et.al. | 2410.22979 | null |
2024-10-29 | CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning | Weihang Guo et.al. | 2410.22225 | null |
2024-10-29 | A Gaussian Process Generative Model for QCD Equation of State | Jiaxuan Gong et.al. | 2410.22160 | null |
2024-10-29 | Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models | Raman Dutt et.al. | 2410.22149 | link |
2024-10-29 | AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts | Vishal Kumar et.al. | 2410.22143 | null |
2024-10-29 | Infrared photometry with InGaAs detectors: First light with SPECULOOS | Peter P. Pedersen et.al. | 2410.22140 | link |
2024-10-29 | SimRec: Mitigating the Cold-Start Problem in Sequential Recommendation by Integrating Item Similarity | Shaked Brody et.al. | 2410.22136 | link |
2024-10-29 | Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench | Zheyuan Liu et.al. | 2410.22108 | link |
2024-10-29 | Variational inference for pile-up removal at hadron colliders with diffusion models | Malte Algren et.al. | 2410.22074 | null |
2024-10-29 | PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement | Shutong Jin et.al. | 2410.22059 | null |
2024-10-29 | Dual Conditional Diffusion Models for Sequential Recommendation | Hongtao Huang et.al. | 2410.21967 | null |
2024-10-29 | PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference | Kendong Liu et.al. | 2410.21966 | null |
2024-10-29 | CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach | Dac Thai Nguyen et.al. | 2410.21932 | link |
2024-10-29 | Guided Diffusion-based Counterfactual Augmentation for Robust Session-based Recommendation | Muskan Gupta et.al. | 2410.21892 | null |
2024-10-29 | On the study of the limit cycles for a class of population models with time-varying factors | Renhao Tian et.al. | 2410.21848 | null |
2024-10-29 | Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model | Yiming Ji et.al. | 2410.21842 | null |
2024-10-28 | On Inductive Biases That Enable Generalization of Diffusion Transformers | Jie An et.al. | 2410.21273 | link |
2024-10-28 | EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | Shih-Yang Liu et.al. | 2410.21271 | null |
2024-10-28 | LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Hanyu Wang et.al. | 2410.21264 | null |
2024-10-28 | One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation | Zhendong Wang et.al. | 2410.21257 | null |
2024-10-28 | On learning higher-order cumulants in diffusion models | Gert Aarts et.al. | 2410.21212 | null |
2024-10-28 | The VSPEC Collection: A suite of utilities to model spectroscopic phase curves of 3D exoplanet atmospheres in the presence of stellar variability | Ted M Johnson et.al. | 2410.21190 | null |
2024-10-28 | Trajectory Flow Matching with Applications to Clinical Time Series Modeling | Xi Zhang et.al. | 2410.21154 | link |
2024-10-28 | Synthetica: Large Scale Synthetic Data for Robot Perception | Ritvik Singh et.al. | 2410.21153 | null |
2024-10-28 | Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences | Zhihao Zhao et.al. | 2410.21130 | null |
2024-10-28 | Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models | Wenda Li et.al. | 2410.21088 | link |
2024-10-28 | Federated Time Series Generation on Feature and Temporally Misaligned Data | Chenrui Fan et.al. | 2410.21072 | null |
2024-10-28 | Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework | Vladimir Arkhipkin et.al. | 2410.21061 | link |
2024-10-28 | Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | Justin Deschenaux et.al. | 2410.21035 | link |
2024-10-29 | EEG-Driven 3D Object Reconstruction with Color Consistency and Diffusion Prior | Xin Xiang et.al. | 2410.20981 | null |
2024-10-28 | MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis | Di Qiu et.al. | 2410.20974 | null |
2024-10-25 | Model merging with SVD to tie the Knots | George Stoica et.al. | 2410.19735 | link |
2024-10-25 | Adversarial Environment Design via Regret-Guided Diffusion Models | Hojun Chung et.al. | 2410.19715 | null |
2024-10-25 | Perception, Control and Hardware for In-Hand Slip-Aware Object Manipulation with Parallel Grippers | Gabriel Arslan Waltersson et.al. | 2410.19660 | null |
2024-10-25 | DiffGS: Functional Gaussian Splatting Diffusion | Junsheng Zhou et.al. | 2410.19657 | null |
2024-10-25 | VARS: Vision-based Assessment of Risk in Security Systems | Pranav Gupta et.al. | 2410.19642 | null |
2024-10-25 | Diffusion models for lattice gauge field simulations | Qianteng Zhu et.al. | 2410.19602 | null |
2024-10-25 | Energy Efficient Dual Designs of FeFET-Based Analog In-Memory Computing with Inherent Shift-Add Capability | Zeyu Yang et.al. | 2410.19593 | null |
2024-10-25 | Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges | Zubin Zheng et.al. | 2410.19580 | null |
2024-10-25 | Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series | Ilan Naiman et.al. | 2410.19538 | null |
2024-10-25 | Ensemble Data Assimilation for Particle-based Methods | Marius Duvillard et.al. | 2410.19525 | null |
2024-10-25 | Marked Temporal Bayesian Flow Point Processes | Hui Chen et.al. | 2410.19512 | null |
2024-10-25 | EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data | Xuetian Chen et.al. | 2410.19461 | null |
2024-10-28 | NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction | Zixuan Gong et.al. | 2410.19452 | link |
2024-10-25 | Learned Reference-based Diffusion Sampling for multi-modal distributions | Maxence Noble et.al. | 2410.19449 | null |
2024-10-25 | Generative Diffusion Models for Sequential Recommendations | Sharare Zolghadr et.al. | 2410.19429 | null |
2024-10-24 | Framer: Interactive Frame Interpolation | Wen Wang et.al. | 2410.18978 | null |
2024-10-24 | MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms | Ling-Hao Chen et.al. | 2410.18977 | null |
2024-10-24 | Unbounded: A Generative Infinite Game of Character Life Simulation | Jialu Li et.al. | 2410.18975 | null |
2024-10-24 | 3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation | Hansheng Chen et.al. | 2410.18974 | link |
2024-10-24 | On the Crucial Role of Initialization for Matrix Factorization | Bingcong Li et.al. | 2410.18965 | null |
2024-10-24 | Stable Consistency Tuning: Understanding and Improving Consistency Models | Fu-Yun Wang et.al. | 2410.18958 | link |
2024-10-24 | Generation of synthetic financial time series by diffusion models | Tomonori Takahashi et.al. | 2410.18897 | null |
2024-10-24 | Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences | Weijian Luo et.al. | 2410.18881 | null |
2024-10-24 | The Cat and Mouse Game: The Ongoing Arms Race Between Diffusion Models and Detection Methods | Linda Laurier et.al. | 2410.18866 | null |
2024-10-24 | From Efficiency to Equity: Measuring Fairness in Preference Learning | Shreeyash Gowaikar et.al. | 2410.18841 | null |
2024-10-24 | From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages | Artur Kiulian et.al. | 2410.18836 | null |
2024-10-24 | Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation | Xiaoyu Zhang et.al. | 2410.18830 | null |
2024-10-24 | Towards Visual Text Design Transfer Across Languages | Yejin Choi et.al. | 2410.18823 | null |
2024-10-24 | Fast constrained sampling in pre-trained diffusion models | Alexandros Graikos et.al. | 2410.18804 | null |
2024-10-24 | Large Generative AI Models meet Open Networks for 6G: Integration, Platform, and Monetization | Peizheng Li et.al. | 2410.18790 | null |
2024-10-23 | DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes | Hengwei Bian et.al. | 2410.18084 | null |
2024-10-23 | Prioritized Generative Replay | Renhao Wang et.al. | 2410.18082 | null |
2024-10-23 | WorldSimBench: Towards Video Generation Models as World Simulators | Yiran Qin et.al. | 2410.18072 | null |
2024-10-23 | TP-Eval: Tap Multimodal LLMs’ Potential in Evaluation by Customizing Prompts | Yuxuan Xie et.al. | 2410.18071 | null |
2024-10-23 | Training Free Guided Flow Matching with Optimal Control | Luran Wang et.al. | 2410.18070 | null |
2024-10-23 | Spectrally shaped THz pulses from tapered dielectric waveguides | Karel Peetermans et.al. | 2410.17975 | null |
2024-10-23 | Optical Generative Models | Shiqi Chen et.al. | 2410.17970 | null |
2024-10-23 | A Wavelet Diffusion GAN for Image Super-Resolution | Lorenzo Aloisi et.al. | 2410.17966 | null |
2024-10-23 | Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation | Wenfang Yao et.al. | 2410.17918 | link |
2024-10-23 | regAL: Python Package for Active Learning of Regression Problems | Elizaveta Surzhikova et.al. | 2410.17917 | null |
2024-10-23 | Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Shansan Gong et.al. | 2410.17891 | link |
2024-10-23 | Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech | Danilo de Oliveira et.al. | 2410.17834 | null |
2024-10-23 | PGDiffSeg: Prior-Guided Denoising Diffusion Model with Parameter-Shared Attention for Breast Cancer Segmentation | Feiyan Feng et.al. | 2410.17812 | null |
2024-10-23 | GenUDC: High Quality 3D Mesh Generation with Unsigned Dual Contouring Representation | Ruowei Wang et.al. | 2410.17802 | link |
2024-10-23 | Regularized autoregressive modeling and its application to audio signal declipping | Ondřej Mokrý et.al. | 2410.17790 | link |
2024-10-22 | Large Language Models Empowered Personalized Web Agents | Hongru Cai et.al. | 2410.17236 | null |
2024-10-22 | Creativity in AI: Progresses and Challenges | Mete Ismayilzada et.al. | 2410.17218 | null |
2024-10-22 | Audio-to-Score Conversion Model Based on Whisper methodology | Hongyao Zhang et.al. | 2410.17209 | null |
2024-10-22 | Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding | Yasha Ektefaie et.al. | 2410.17173 | link |
2024-10-22 | Performance of the CMS high-level trigger during LHC Run 2 | CMS Collaboration et.al. | 2410.17038 | null |
2024-10-22 | Hybrid Generative AI for De Novo Design of Co-Crystals with Enhanced Tabletability | Nina Gubina et.al. | 2410.17005 | link |
2024-10-22 | DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization | Haowei Zhu et.al. | 2410.16942 | null |
2024-10-22 | Hierarchical Clustering for Conditional Diffusion in Image Generation | Jorge da Silva Goncalves et.al. | 2410.16910 | link |
2024-10-22 | Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections | Marco Miani et.al. | 2410.16901 | null |
2024-10-22 | VistaDream: Sampling multiview consistent images for single-view scene reconstruction | Haiping Wang et.al. | 2410.16892 | null |
2024-10-22 | CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare | Nicholas I-Hsien Kuo et.al. | 2410.16872 | null |
2024-10-22 | MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model | Meng Xu et.al. | 2410.16840 | null |
2024-10-22 | Bridging Search and Recommendation in Generative Retrieval: Does One Task Help the Other? | Gustavo Penha et.al. | 2410.16823 | null |
2024-10-22 | Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection | Laurent Colbois et.al. | 2410.16802 | link |
2024-10-22 | One-Step Diffusion Distillation through Score Implicit Matching | Weijian Luo et.al. | 2410.16794 | link |
2024-10-21 | MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors | Honghua Chen et.al. | 2410.16272 | null |
2024-10-21 | Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos | Gengshan Yang et.al. | 2410.16259 | null |
2024-10-21 | Distribution Learning with Valid Outputs Beyond the Worst-Case | Nick Rittler et.al. | 2410.16253 | null |
2024-10-21 | Building A Coding Assistant via the Retrieval-Augmented Language Model | Xinze Li et.al. | 2410.16229 | link |
2024-10-21 | CiteClick: A Browser Extension for Real-Time Scholar Citation Tracking | Nishat Raihan et.al. | 2410.16211 | null |
2024-10-21 | A Framework for Evaluating Predictive Models Using Synthetic Image Covariates and Longitudinal Data | Simon Deltadahl et.al. | 2410.16177 | null |
2024-10-22 | Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models | Giannis Daras et.al. | 2410.16152 | null |
2024-10-21 | Modelling Structured Data Learning with Restricted Boltzmann Machines in the Teacher-Student Setting | Robin Thériault et.al. | 2410.16150 | null |
2024-10-21 | SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation | Xinyi Zhou et.al. | 2410.16119 | null |
2024-10-21 | Critical Example Mining for Vehicle Trajectory Prediction using Flow-based Generative Models | Zhezhang Ding et.al. | 2410.16083 | null |
2024-10-21 | Continuous Speech Synthesis using per-token Latent Diffusion | Arnon Turetzky et.al. | 2410.16048 | null |
2024-10-21 | Some generalizations of the convective model of jet generation | S. N. Artekha et.al. | 2410.16035 | null |
2024-10-21 | ComPO: Community Preferences for Language Model Personalization | Sachin Kumar et.al. | 2410.16027 | null |
2024-10-21 | Massimo: Public Queue Monitoring and Management using Mass-Spring Model | Abhijeet Kumar et.al. | 2410.16012 | null |
2024-10-21 | AI-Driven Innovations in Modern Cloud Computing | Animesh Kumar et.al. | 2410.15960 | null |
2024-10-18 | BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities | Shaozhe Hao et.al. | 2410.14672 | link |
2024-10-18 | How Does Data Diversity Shape the Weight Landscape of Neural Networks? | Yang Ba et.al. | 2410.14602 | null |
2024-10-18 | Bayesian Multi-wavelength Imaging of the LMC SN1987A with SRG/eROSITA | Vincent Eberle et.al. | 2410.14599 | null |
2024-10-18 | Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets | Namid R. Stillman et.al. | 2410.14587 | null |
2024-10-18 | Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion | Y. Wang et.al. | 2410.14577 | null |
2024-10-18 | Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior | Calvin-Khang Ta et.al. | 2410.14540 | null |
2024-10-18 | Blockchain-Based Trust and Transparency in Airline Reservation Systems using Microservices Architecture | Biman Barua et.al. | 2410.14518 | null |
2024-10-18 | LEAD: Latent Realignment for Human Motion Diffusion | Nefeli Andreou et.al. | 2410.14508 | null |
2024-10-18 | Reinforcement Learning in Non-Markov Market-Making | Luca Lalor et.al. | 2410.14504 | null |
2024-10-18 | Data-driven topology design with persistent homology for enhancing population diversity | Taisei Kii et.al. | 2410.14496 | null |
2024-10-18 | ANT: Adaptive Noise Schedule for Time Series Diffusion Models | Seunghan Lee et.al. | 2410.14488 | link |
2024-10-21 | CaTs and DAGs: Integrating Directed Acyclic Graphs with Transformers and Fully-Connected Neural Networks for Causally Constrained Predictions | Matthew J. Vowels et.al. | 2410.14485 | link |
2024-10-18 | DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation | Junjie Wu et.al. | 2410.14481 | null |
2024-10-18 | Flow-based Sampling for Entanglement Entropy and the Machine Learning of Defects | Andrea Bulgarelli et.al. | 2410.14466 | null |
2024-10-18 | FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models | Rui Hu et.al. | 2410.14429 | null |
2024-10-17 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens | Lijie Fan et.al. | 2410.13863 | null |
2024-10-17 | Diffusing States and Matching Scores: A New Framework for Imitation Learning | Runzhe Wu et.al. | 2410.13855 | link |
2024-10-17 | Influence Functions for Scalable Data Attribution in Diffusion Models | Bruno Mlodozeniec et.al. | 2410.13850 | null |
2024-10-17 | VidPanos: Generative Panoramic Videos from Casual Panning Videos | Jingwei Ma et.al. | 2410.13832 | null |
2024-10-17 | DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control | Yujie Wei et.al. | 2410.13830 | null |
2024-10-17 | Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning | Xiaodan Xing et.al. | 2410.13823 | link |
2024-10-17 | ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution | Junhao Gu et.al. | 2410.13807 | null |
2024-10-17 | Probing the Latent Hierarchical Structure of Data via Diffusion Models | Antonio Sclocchi et.al. | 2410.13770 | null |
2024-10-17 | Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers | Yuchen Liang et.al. | 2410.13746 | null |
2024-10-17 | Improved Convergence Rate for Diffusion Probabilistic Models | Gen Li et.al. | 2410.13738 | null |
2024-10-17 | Optimizing Probabilistic Conformal Prediction with Vectorized Non-Conformity Scores | Minxing Zheng et.al. | 2410.13735 | null |
2024-10-18 | DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation | Hanbo Cheng et.al. | 2410.13726 | link |
2024-10-17 | Movie Gen: A Cast of Media Foundation Models | Adam Polyak et.al. | 2410.13720 | link |
2024-10-18 | Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion | Yijun Liang et.al. | 2410.13674 | link |
2024-10-17 | Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | Chenyu Wang et.al. | 2410.13643 | link |
2024-10-16 | Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds | Xingzhi Sun et.al. | 2410.12779 | null |
2024-10-16 | Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts | Hongcheng Gao et.al. | 2410.12777 | link |
2024-10-16 | SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Jaehong Yoon et.al. | 2410.12761 | null |
2024-10-16 | Signature of Vertical Mixing in Hydrogen-dominated Exoplanet Atmospheres | Vikas Soni et.al. | 2410.12737 | null |
2024-10-16 | Counterfactual Generative Modeling with Variational Causal Inference | Yulun Wu et.al. | 2410.12730 | null |
2024-10-16 | FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression | Zhenheng Tang et.al. | 2410.12707 | null |
2024-10-16 | Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization | Xingqi Wang et.al. | 2410.12700 | link |
2024-10-16 | AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing | DuoSheng Chen et.al. | 2410.12696 | null |
2024-10-16 | 3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation | Dewei Zhou et.al. | 2410.12669 | null |
2024-10-16 | Towards Designing Scalable Quantum-Enhanced Generative Networks for Neutrino Physics Experiments with Liquid Argon Time Projection Chambers | Andrea Delgado et.al. | 2410.12650 | null |
2024-10-16 | A Robo-Advisor System: expected utility modeling via pairwise comparisons | Bo Chen et.al. | 2410.12570 | null |
2024-10-16 | One Step Diffusion via Shortcut Models | Kevin Frans et.al. | 2410.12557 | link |
2024-10-16 | Disentangling data distribution for Federated Learning | Xinyuan Zhao et.al. | 2410.12530 | null |
2024-10-16 | Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing | Mingce Guo et.al. | 2410.12526 | null |
2024-10-16 | MING: A Functional Approach to Learning Molecular Generative Models | Van Khoa Nguyen et.al. | 2410.12522 | null |
2024-10-15 | High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion | Junhwa Hur et.al. | 2410.11838 | null |
2024-10-15 | On the Effectiveness of Dataset Alignment for Fake Image Detection | Anirudh Sundara Rajan et.al. | 2410.11835 | null |
2024-10-15 | Bayesian Experimental Design via Contrastive Diffusions | Jacopo Iollo et.al. | 2410.11826 | link |
2024-10-15 | KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities | Hsin-Ping Huang et.al. | 2410.11824 | null |
2024-10-15 | Improving Long-Text Alignment for Text-to-Image Diffusion Models | Luping Liu et.al. | 2410.11817 | link |
2024-10-15 | SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing | Zhiyuan Zhang et.al. | 2410.11815 | null |
2024-10-16 | Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Zhiyuan Ma et.al. | 2410.11795 | null |
2024-10-15 | G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks | Guibin Zhang et.al. | 2410.11782 | null |
2024-10-15 | Technical Report of 1:10 Scale Autonomous Vehicle Robot | Amirhossein Kheiri Holighi et.al. | 2410.11746 | null |
2024-10-15 | Probabilistic Principles for Biophysics and Neuroscience: Entropy Production, Bayesian Mechanics & the Free-Energy Principle | Lancelot Da Costa et.al. | 2410.11735 | null |
2024-10-15 | Patch-Based Diffusion Models Beat Whole-Image Models for Mismatched Distribution Inverse Problems | Jason Hu et.al. | 2410.11730 | null |
2024-10-15 | Parameter estimation of structural dynamics with neural operators enabled surrogate modeling | Mingyuan Zhou et.al. | 2410.11712 | null |
2024-10-15 | Findings of the WMT 2024 Shared Task on Chat Translation | Wafaa Mohammed et.al. | 2410.11624 | null |
2024-10-15 | DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment | Wendi Chen et.al. | 2410.11584 | link |
2024-10-15 | A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction | Zhouheng Li et.al. | 2410.11570 | link |
2024-10-14 | Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models | Jingzhi Bao et.al. | 2410.10821 | link |
2024-10-15 | TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models | Mu Cai et.al. | 2410.10818 | link |
2024-10-14 | LVD-2M: A Long-take Video Dataset with Temporally Dense Captions | Tianwei Xiong et.al. | 2410.10816 | link |
2024-10-14 | Depth Any Video with Scalable Synthetic Data | Honghui Yang et.al. | 2410.10815 | link |
2024-10-14 | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Haotian Tang et.al. | 2410.10812 | link |
2024-10-14 | TrajDiffuse: A Conditional Diffusion Model for Environment-Aware Trajectory Prediction | Qingze et.al. | 2410.10804 | link |
2024-10-14 | Boosting Camera Motion Control for Video Diffusion Transformers | Soon Yau Cheong et.al. | 2410.10802 | null |
2024-10-14 | Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations | Litu Rout et.al. | 2410.10792 | null |
2024-10-14 | ControlMM: Controllable Masked Motion Generation | Ekkasit Pinyoanuntapong et.al. | 2410.10780 | null |
2024-10-14 | Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation | Youwei Yu et.al. | 2410.10766 | null |
2024-10-14 | DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships | Zhang Wan et.al. | 2410.10751 | null |
2024-10-14 | CosForce: A Force-Based General Model for Simulating Pedestrian Anticipation and Reaction Mechanisms | Jinghui Wang et.al. | 2410.10746 | null |
2024-10-14 | FlexGen: Flexible Multi-View Generation from Text and Image Inputs | Xinli Xu et.al. | 2410.10745 | null |
2024-10-14 | Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models | Junyu Chen et.al. | 2410.10733 | link |
2024-10-14 | Large Language Models Are Active Critics in NLG Evaluation | Shuying Xu et.al. | 2410.10724 | null |
2024-10-11 | SceneCraft: Layout-Guided 3D Scene Generation | Xiuyu Yang et.al. | 2410.09049 | link |
2024-10-11 | Linear Convergence of Diffusion Models Under the Manifold Hypothesis | Peter Potaptchik et.al. | 2410.09046 | null |
2024-10-11 | PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents | Xiangyu Yin et.al. | 2410.09034 | link |
2024-10-11 | Semantic Score Distillation Sampling for Compositional Text-to-3D Generation | Ling Yang et.al. | 2410.09009 | link |
2024-10-11 | WaveDiffusion: Exploring Full Waveform Inversion via Joint Diffusion in the Latent Space | Hanchen Wang et.al. | 2410.09002 | null |
2024-10-11 | Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory | Aymane El Firdoussi et.al. | 2410.08942 | null |
2024-10-11 | DiffPO: A causal diffusion model for learning distributions of potential outcomes | Yuchen Ma et.al. | 2410.08924 | null |
2024-10-11 | An End-to-End Deep Learning Method for Solving Nonlocal Allen-Cahn and Cahn-Hilliard Phase-Field Models | Yuwei Geng et.al. | 2410.08914 | null |
2024-10-11 | Conditional Generative Models for Contrast-Enhanced Synthesis of T1w and T1 Maps in Brain MRI | Moritz Piening et.al. | 2410.08894 | link |
2024-10-11 | MATCH: Model-Aware TVM-based Compilation for Heterogeneous Edge Devices | Mohamed Amine Hamdi et.al. | 2410.08855 | link |
2024-10-14 | LIME-Eval: Rethinking Low-light Image Enhancement Evaluation via Object Detection | Mingjia Li et.al. | 2410.08810 | link |
2024-10-11 | Bad Neighbors: On Understanding VPN Provider Networks | Teemu Rytilahti et.al. | 2410.08737 | link |
2024-10-11 | 5G as Enabler for Industrie 4.0 Use Cases: Challenges and Concepts | M. Gundall et.al. | 2410.08726 | null |
2024-10-11 | Investigating Human-Computer Interaction and Visual Comprehension in Text Generation Process of Natural Language Generation Models | Yunchao Wang et.al. | 2410.08723 | null |
2024-10-11 | Impact of Surface Reflections in Maritime Obstacle Detection | Samed Yalçın et.al. | 2410.08713 | link |
2024-10-10 | LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts | Anh-Quan Cao et.al. | 2410.08211 | null |
2024-10-10 | DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models | Xiaoxiao He et.al. | 2410.08207 | null |
2024-10-10 | HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation | Shanyan Guan et.al. | 2410.08192 | null |
2024-10-10 | DifFRelight: Diffusion-Based Facial Performance Relighting | Mingming He et.al. | 2410.08188 | null |
2024-10-10 | RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image | Xiaoxue Chen et.al. | 2410.08181 | null |
2024-10-10 | ZeroComp: Zero-shot Object Compositing from Image Intrinsics via Diffusion | Zitian Zhang et.al. | 2410.08168 | null |
2024-10-10 | DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | Jiatao Gu et.al. | 2410.08159 | null |
2024-10-10 | Progressive Autoregressive Video Diffusion Models | Desai Xie et.al. | 2410.08151 | link |
2024-10-10 | Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction | Jarrid Rector-Brooks et.al. | 2410.08134 | null |
2024-10-10 | Robust AI-Generated Text Detection by Restricted Embeddings | Kristian Kuznetsov et.al. | 2410.08113 | link |
2024-10-10 | LiPO: LiDAR Inertial Odometry for ICP Comparison | Darwin Mick et.al. | 2410.08097 | null |
2024-10-10 | Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models | Vinith M. Suriyakumar et.al. | 2410.08074 | null |
2024-10-10 | Reversible Decoupling Network for Single Image Reflection Removal | Hao Zhao et.al. | 2410.08063 | link |
2024-10-10 | A Target-Aware Analysis of Data Augmentation for Hate Speech Detection | Camilla Casula et.al. | 2410.08053 | null |
2024-10-10 | LADIMO: Face Morph Generation through Biometric Template Inversion with Latent Diffusion | Marcel Grimmer et.al. | 2410.07988 | link |
2024-10-09 | IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | Xinchen Zhang et.al. | 2410.07171 | link |
2024-10-09 | Sylber: Syllabic Embedding Representation of Speech from Raw Audio | Cheol Jun Cho et.al. | 2410.07168 | link |
2024-10-09 | AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation | Yukang Cao et.al. | 2410.07164 | null |
2024-10-09 | InstructG2I: Synthesizing Images from Multimodal Attributed Graphs | Bowen Jin et.al. | 2410.07157 | link |
2024-10-09 | Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis | Bohan Zeng et.al. | 2410.07155 | link |
2024-10-10 | EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models | Rui Zhao et.al. | 2410.07133 | link |
2024-10-09 | Personalized Visual Instruction Tuning | Renjie Pi et.al. | 2410.07113 | link |
2024-10-09 | A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research | Seongjin Choi et.al. | 2410.07066 | link |
2024-10-09 | Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax | Ivan Butakov et.al. | 2410.06993 | null |
2024-10-09 | Diffusion Density Estimators | Akhil Premkumar et.al. | 2410.06986 | null |
2024-10-09 | Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control | Shimon Vainer et.al. | 2410.06985 | null |
2024-10-09 | Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation | Runze Chen et.al. | 2410.06982 | null |
2024-10-09 | Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think | Sihyun Yu et.al. | 2410.06940 | link |
2024-10-09 | VEC-Sim: A Simulation Platform for Evaluating Service Caching and Computation Offloading Policies in Vehicular Edge Networks | Fan Wu et.al. | 2410.06934 | null |
2024-10-09 | Generative Model for Less-Resourced Language with 1 billion parameters | Domen Vreš et.al. | 2410.06898 | null |
2024-10-07 | DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control | Kaifeng Zhao et.al. | 2410.05260 | null |
2024-10-07 | GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting | Yukang Cao et.al. | 2410.05259 | null |
2024-10-07 | SePPO: Semi-Policy Preference Optimization for Diffusion Alignment | Daoan Zhang et.al. | 2410.05255 | link |
2024-10-07 | DiffuseReg: Denoising Diffusion Model for Obtaining Deformation Fields in Unsupervised Deformable Image Registration | Yongtai Zhuo et.al. | 2410.05234 | link |
2024-10-07 | Density estimation with LLMs: a geometric investigation of in-context learning trajectories | Toni J. B. Liu et.al. | 2410.05218 | null |
2024-10-07 | Avoiding Deadlocks via Weak Deadlock Sets | Gianpaolo Oriolo et.al. | 2410.05175 | null |
2024-10-07 | Presto! Distilling Steps and Layers for Accelerating Music Generation | Zachary Novack et.al. | 2410.05167 | null |
2024-10-08 | A Simulation-Free Deep Learning Approach to Stochastic Optimal Control | Mengjian Hua et.al. | 2410.05163 | null |
2024-10-07 | Smart Jamming Attack and Mitigation on Deep Transfer Reinforcement Learning Enabled Resource Allocation for Network Slicing | Shavbo Salehi et.al. | 2410.05153 | null |
2024-10-07 | Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information | Timofey Efimov et.al. | 2410.05143 | null |
2024-10-07 | Agnostic Smoothed Online Learning | Moïse Blanchard et.al. | 2410.05124 | null |
2024-10-07 | Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning | Ayano Hiranaka et.al. | 2410.05116 | null |
2024-10-07 | Synthetic Generation of Dermatoscopic Images with GAN and Closed-Form Factorization | Rohan Reddy Mekala et.al. | 2410.05114 | null |
2024-10-07 | Hyper-Representations: Learning from Populations of Neural Networks | Konstantin Schürholt et.al. | 2410.05107 | link |
2024-10-07 | DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects | Nidhi Mathihalli et.al. | 2410.05097 | link |
2024-10-04 | Estimating Body and Hand Motion in an Ego-sensed World | Brent Yi et.al. | 2410.03665 | null |
2024-10-04 | Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models | Zhuochun Li et.al. | 2410.03663 | null |
2024-10-04 | Geometric Representation Condition Improves Equivariant Molecule Generation | Zian Li et.al. | 2410.03655 | null |
2024-10-04 | Aligning LLMs with Individual Preferences via Interaction | Shujin Wu et.al. | 2410.03642 | link |
2024-10-04 | Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models | Chumeng Liang et.al. | 2410.03640 | link |
2024-10-04 | Conditional Enzyme Generation Using Protein Language Models with Adapters | Jason Yang et.al. | 2410.03634 | null |
2024-10-04 | How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework | Yinuo Ren et.al. | 2410.03601 | null |
2024-10-04 | Teaching Transformers Modular Arithmetic at Scale | Eshika Saxena et.al. | 2410.03569 | null |
2024-10-04 | Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features | Benyuan Meng et.al. | 2410.03558 | link |
2024-10-04 | Loading Ceramics: Visualising Possibilities of Robotics in Ceramics | Varvara Guljajeva et.al. | 2410.03550 | null |
2024-10-04 | NRGBoost: Energy-Based Generative Boosted Trees | João Bravo et.al. | 2410.03535 | null |
2024-10-04 | Generative Artificial Intelligence for Navigating Synthesizable Chemical Space | Wenhao Gao et.al. | 2410.03494 | link |
2024-10-04 | SeBS-Flow: Benchmarking Serverless Cloud Function Workflows | Larissa Schmid et.al. | 2410.03480 | null |
2024-10-04 | Formalizing MLTL Formula Progression in Isabelle/HOL | Katherine Kosaian et.al. | 2410.03465 | null |
2024-10-04 | Diffusion State-Guided Projected Gradient for Inverse Problems | Rayhan Zirvi et.al. | 2410.03463 | null |
2024-10-03 | SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost | Jifan Zhang et.al. | 2410.02755 | null |
2024-10-03 | CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation | Han He et.al. | 2410.02748 | null |
2024-10-03 | Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization | Lei Xu et.al. | 2410.02741 | link |
2024-10-03 | Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models | Zhengfeng Lai et.al. | 2410.02740 | null |
2024-10-03 | Custom Non-Linear Model Predictive Control for Obstacle Avoidance in Indoor and Outdoor Environments | Lara Laban et.al. | 2410.02732 | link |
2024-10-03 | A Photonic Parameter-shift Rule: Enabling Gradient Computation for Photonic Quantum Computers | Axel Pappalardo et.al. | 2410.02726 | null |
2024-10-03 | AlzhiNet: Traversing from 2DCNN to 3DCNN, Towards Early Detection and Diagnosis of Alzheimer’s Disease | Romoke Grace Akindele et.al. | 2410.02714 | null |
2024-10-03 | SteerDiff: Steering towards Safe Text-to-Image Diffusion Models | Hongxiang Zhang et.al. | 2410.02710 | null |
2024-10-03 | ControlAR: Controllable Image Generation with Autoregressive Models | Zongming Li et.al. | 2410.02705 | link |
2024-10-03 | User-centric Immersive Communications in 6G: A Data-oriented Approach via Digital Twin | Conghao Zhou et.al. | 2410.02688 | null |
2024-10-03 | GUD: Generation with Unified Diffusion | Mathis Gerdes et.al. | 2410.02667 | null |
2024-10-03 | Grounded Answers for Multi-agent Decision-making Problem through Generative World Model | Zeyang Liu et.al. | 2410.02664 | null |
2024-10-03 | Scalable Simulation-free Entropic Unbalanced Optimal Transport | Jaemoo Choi et.al. | 2410.02656 | null |
2024-10-03 | Measuring and Improving Persuasiveness of Generative Models | Somesh Singh et.al. | 2410.02653 | null |
2024-10-03 | Efficient calibration of the shifted square-root diffusion model to credit default swap spreads using asymptotic approximations | Ankush Agarwal et.al. | 2410.02645 | null |
2024-10-02 | FabricDiffusion: High-Fidelity Texture Transfer for 3D Garments Generation from In-The-Wild Clothing Images | Cheng Zhang et.al. | 2410.01801 | null |
2024-10-02 | Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | Yangming Li et.al. | 2410.01796 | null |
2024-10-02 | Dynamical-generative downscaling of climate model ensembles | Ignacio Lopez-Gomez et.al. | 2410.01776 | null |
2024-10-02 | Towards deep learning sequence-structure co-generation for protein design | Chentong Wang et.al. | 2410.01773 | null |
2024-10-02 | ImageFolder: Autoregressive Image Generation with Folded Tokens | Xiang Li et.al. | 2410.01756 | link |
2024-10-02 | AssessITS: Integrating procedural guidelines and practical evaluation metrics for organizational IT and Cybersecurity risk assessment | Mir Mehedi Rahman et.al. | 2410.01750 | null |
2024-10-02 | VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models | Kailai Feng et.al. | 2410.01738 | link |
2024-10-02 | HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration | Yushi Huang et.al. | 2410.01723 | null |
2024-10-02 | Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective | Zeyu Gan et.al. | 2410.01720 | link |
2024-10-02 | COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation | Mingzhen Sun et.al. | 2410.01718 | null |
2024-10-02 | A Mathematics-Inspired Learning-to-Optimize Framework for Decentralized Optimization | Yutong He et.al. | 2410.01700 | null |
2024-10-02 | Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding | Yao Teng et.al. | 2410.01699 | link |
2024-10-02 | Lossy Semantic Communication for the Logical Deduction of the State of the World | Ahmet Faruk Saz et.al. | 2410.01676 | link |
2024-10-02 | Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering | Klaus-Rudolf Kladny et.al. | 2410.01660 | null |
2024-10-02 | On The Adaptation of Unlimiformer for Decoder-Only Transformers | Kian Ahrabian et.al. | 2410.01637 | null |
2024-09-30 | SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes | Tianchang Shen et.al. | 2409.20562 | null |
2024-09-30 | Annealing Flow Generative Model Towards Sampling High-Dimensional and Multi-Modal Distributions | Dongze Wu et.al. | 2409.20547 | link |
2024-09-30 | A Compact Quantum Random Number Generator Based on Balanced Detection of Shot Noise | Jaideep Singh et.al. | 2409.20515 | null |
2024-09-30 | NUTRIVISION: A System for Automatic Diet Management in Smart Healthcare | Madhumita Veeramreddy et.al. | 2409.20508 | null |
2024-09-30 | COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models | Divyanshu Daiya et.al. | 2409.20502 | null |
2024-09-30 | FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing | Lingling Cai et.al. | 2409.20500 | null |
2024-09-30 | All-optical autoencoder machine learning framework using diffractive processors | Peijie Feng et.al. | 2409.20346 | null |
2024-09-30 | Devil is in Details: Locality-Aware 3D Abdominal CT Volume Generation for Self-Supervised Organ Segmentation | Yuran Wang et.al. | 2409.20332 | null |
2024-09-30 | UIR-LoRA: Achieving Universal Image Restoration through Multiple Low-Rank Adaptation | Cheng Zhang et.al. | 2409.20197 | link |
2024-09-30 | Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems | Hongkai Zheng et.al. | 2409.20175 | null |
2024-09-30 | Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model | Fulong Ma et.al. | 2409.20164 | null |
2024-09-30 | Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation | Rong Tang et.al. | 2409.20124 | null |
2024-09-30 | Training a Computer Vision Model for Commercial Bakeries with Primarily Synthetic Images | Thomas H. Schmitt et.al. | 2409.20122 | null |
2024-09-30 | Reaction-diffusion model for a population structured in phenotype and space I – Criterion for persistence | Nathanaël Boutillon et.al. | 2409.20118 | null |
2024-09-30 | Near-Field Coupling Coil System: A Novel Radiofrequency Coil Solution for MRI | Zhiguang Mo et.al. | 2409.20095 | null |
2024-09-27 | $O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions | Gen Li et.al. | 2409.18959 | null |
2024-09-27 | ReviveDiff: A Universal Diffusion Model for Restoring Images in Adverse Weather Conditions | Wenfeng Huang et.al. | 2409.18932 | null |
2024-09-27 | Unsupervised Low-light Image Enhancement with Lookup Tables and Diffusion Priors | Yunlong Lin et.al. | 2409.18899 | null |
2024-09-27 | Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis | Songrui Wang et.al. | 2409.18897 | null |
2024-09-27 | HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models | Yu Zhou et.al. | 2409.18893 | null |
2024-09-27 | Explainable Artifacts for Synthetic Western Blot Source Attribution | João Phillipe Cardenuto et.al. | 2409.18881 | link |
2024-09-27 | Emu3: Next-Token Prediction is All You Need | Xinlong Wang et.al. | 2409.18869 | null |
2024-09-27 | Challenges of Generating Structurally Diverse Graphs | Fedor Velikonivtsev et.al. | 2409.18859 | link |
2024-09-27 | Moldable Development Patterns | Oscar Nierstrasz et.al. | 2409.18811 | null |
2024-09-27 | Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions | Iskander Azangulov et.al. | 2409.18804 | null |
2024-09-27 | Student-Oriented Teacher Knowledge Refinement for Knowledge Distillation | Chaomin Shen et.al. | 2409.18785 | null |
2024-09-27 | Geometric deep learning for galaxy-halo connection: a case study for galaxy intrinsic alignments | Yesukhei Jagvaral et.al. | 2409.18761 | null |
2024-09-27 | Cottention: Linear Transformers With Cosine Attention | Gabriel Mongaras et.al. | 2409.18747 | link |
2024-09-27 | Read Over the Lines: Attacking LLMs and Toxicity Detection Systems with ASCII Art to Mask Profanity | Sergey Berezin et.al. | 2409.18708 | link |
2024-09-27 | MG-Net: Learn to Customize QAOA with Circuit Depth Awareness | Yang Qian et.al. | 2409.18692 | link |
2024-09-26 | FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner | Wenliang Zhao et.al. | 2409.18128 | link |
2024-09-26 | Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction | Jing He et.al. | 2409.18124 | null |
2024-09-26 | EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation | Jiaxiang Tang et.al. | 2409.18114 | null |
2024-09-26 | MALPOLON: A Framework for Deep Species Distribution Modeling | Theo Larcher et.al. | 2409.18102 | link |
2024-09-26 | StackGen: Generating Stable Structures from Silhouettes via Diffusion | Luzhe Sun et.al. | 2409.18098 | null |
2024-09-26 | DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models | Helin Cao et.al. | 2409.18092 | null |
2024-09-26 | Stable Video Portraits | Mirela Ostrek et.al. | 2409.18083 | null |
2024-09-26 | LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field | Huan Wang et.al. | 2409.18057 | link |
2024-09-26 | Automated Detection and Analysis of Power Words in Persuasive Text Using Natural Language Processing | Sahil Garje et.al. | 2409.18033 | null |
2024-09-26 | PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging | Xin Cai et.al. | 2409.17996 | null |
2024-09-26 | Joint Localization and Planning using Diffusion | L. Lao Beyer et.al. | 2409.17995 | null |
2024-09-26 | Manufacturing, processing, applications, and advancements of Fe-based shape memory alloys | Anwar Algamal et.al. | 2409.17973 | null |
2024-09-26 | CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors | Linye Lyu et.al. | 2409.17963 | null |
2024-09-26 | Relativistic diffusion model for hadron production in p-Pb collisions at the LHC | Philipp Schulz et.al. | 2409.17960 | null |
2024-09-26 | Perturb, Attend, Detect and Localize (PADL): Robust Proactive Image Defense | Filippo Bartolucci et.al. | 2409.17941 | null |
2024-09-25 | DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion | Yukun Huang et.al. | 2409.17145 | link |
2024-09-25 | Language-oriented Semantic Communication for Image Transmission with Fine-Tuned Diffusion Model | Xinfeng Wei et.al. | 2409.17104 | null |
2024-09-25 | Accumulator-Aware Post-Training Quantization | Ian Colbert et.al. | 2409.17092 | null |
2024-09-25 | Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification | Xinrui Zhou et.al. | 2409.17091 | null |
2024-09-25 | Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors | Aiping Zhang et.al. | 2409.17058 | link |
2024-09-25 | ControlCity: A Multimodal Diffusion Model Based Approach for Accurate Geospatial Data Generation and Urban Morphology Analysis | Fangshuo Zhou et.al. | 2409.17049 | link |
2024-09-25 | GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design | Phillip Mueller et.al. | 2409.17045 | null |
2024-09-25 | CNN Mixture-of-Depths | Rinor Cakaj et.al. | 2409.17016 | null |
2024-09-25 | Single Image, Any Face: Generalisable 3D Face Generation | Wenqing Wang et.al. | 2409.16990 | null |
2024-09-25 | Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion | Vineet Punyamoorty et.al. | 2409.16950 | null |
2024-09-25 | DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling | Kyuheon Jung et.al. | 2409.16949 | link |
2024-09-25 | Divergence asymmetry and connected components in a general duplication-divergence graph model | Dario Borrelli et.al. | 2409.16943 | null |
2024-09-25 | Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model | Hongliang Zhong et.al. | 2409.16938 | link |
2024-09-25 | Linking in Style: Understanding learned features in deep learning models | Maren H. Wehrheim et.al. | 2409.16865 | link |
2024-09-25 | A Versatile and Differentiable Hand-Object Interaction Representation | Théo Morales et.al. | 2409.16855 | null |
2024-09-18 | Massively Multi-Person 3D Human Motion Forecasting with Scene Context | Felix B Mueller et.al. | 2409.12189 | link |
2024-09-18 | MoRAG – Multi-Fusion Retrieval Augmented Generation for Human Motion | Kalakonda Sai Shashank et.al. | 2409.12140 | null |
2024-09-24 | Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models | Sijing Chen et.al. | 2409.12139 | null |
2024-09-18 | Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance | Jaehoon Joo et.al. | 2409.12099 | null |
2024-09-19 | Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval | Warren Jouanneau et.al. | 2409.12097 | null |
2024-09-18 | Design of Ligand-Binding Proteins with Atomic Flow Matching | Junqi Liu et.al. | 2409.12080 | null |
2024-09-18 | Denoising diffusion models for high-resolution microscopy image restoration | Pamela Osuna-Vargas et.al. | 2409.12078 | null |
2024-09-19 | Using Large Language Models to Generate Clinical Trial Tables and Figures | Yumeng Yang et.al. | 2409.12046 | null |
2024-09-18 | LEMON: Localized Editing with Mesh Optimization and Neural Shaders | Furkan Mert Algan et.al. | 2409.12024 | null |
2024-09-18 | Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization | Zhi Chen et.al. | 2409.12020 | null |
2024-09-18 | Towards Global Localization using Multi-Modal Object-Instance Re-Identification | Aneesh Chavan et.al. | 2409.12002 | link |
2024-09-18 | Tracking Any Point with Frame-Event Fusion Network at High Frame Rate | Jiaxiong Liu et.al. | 2409.11953 | null |
2024-09-18 | Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models | Lorenzo Mandelli et.al. | 2409.11920 | null |
2024-09-18 | AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots | Zhaxizhuoma et.al. | 2409.11905 | null |
2024-09-18 | Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation | Dimitrios Christodoulou et.al. | 2409.11904 | null |
2024-09-17 | Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion | Zhenwei Wang et.al. | 2409.11406 | null |
2024-09-17 | Teaching dark matter simulations to speak the halo language | Shivam Pandey et.al. | 2409.11401 | link |
2024-09-17 | Ultrasound Image Enhancement with the Variance of Diffusion Models | Yuxin Zhang et.al. | 2409.11380 | link |
2024-09-17 | OSV: One Step is Enough for High-Quality Image to Video Generation | Xiaofeng Mao et.al. | 2409.11367 | null |
2024-09-17 | Ping! Your Food is Ready: Comparing Different Notification Techniques in 3D AR Cooking Environment | Aditya Raikwar et.al. | 2409.11357 | null |
2024-09-17 | Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think | Gonzalo Martin Garcia et.al. | 2409.11355 | link |
2024-09-17 | OmniGen: Unified Image Generation | Shitao Xiao et.al. | 2409.11340 | link |
2024-09-17 | fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction | Jianxiong Gao et.al. | 2409.11315 | null |
2024-09-17 | SpMis: An Investigation of Synthetic Spoken Misinformation Detection | Peizhuo Liu et.al. | 2409.11308 | null |
2024-09-17 | Measurement of top-quark pair production in association with charm quarks in proton-proton collisions at $\sqrt{s}=13$ TeV with the ATLAS detector | ATLAS Collaboration et.al. | 2409.11305 | null |
2024-09-17 | NirvaWave: An Accurate and Efficient Near Field Wave Propagation Simulator for 6G and Beyond | Vahid Yazdnian et.al. | 2409.11293 | link |
2024-09-17 | DroneDiffusion: Robust Quadrotor Dynamics Learning with Diffusion Models | Avirup Das et.al. | 2409.11292 | null |
2024-09-17 | Neural Networks for Vehicle Routing Problem | László Kovács et.al. | 2409.11290 | null |
2024-09-17 | Attacking Slicing Network via Side-channel Reinforcement Learning Attack | Wei Shao et.al. | 2409.11258 | null |
2024-09-17 | Learning Source Disentanglement in Neural Audio Codec | Xiaoyu Bie et.al. | 2409.11228 | null |
2024-09-16 | Pennsieve - A Collaborative Platform for Translational Neuroscience and Beyond | Zack Goldblum et.al. | 2409.10509 | null |
2024-09-16 | Torres funerarias chullpa en el valle del río Lauca: un primer análisis arqueoastronómico | Alejandro Gangui et.al. | 2409.10497 | null |
2024-09-16 | Incorporating Classifier-Free Guidance in Diffusion Model-Based Recommendation | Noah Buchanan et.al. | 2409.10494 | null |
2024-09-16 | SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing | Qi Qian et.al. | 2409.10476 | null |
2024-09-16 | MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion | Lehong Wu et.al. | 2409.10473 | null |
2024-09-16 | Signed Graph Autoencoder for Explainable and Polarization-Aware Network Embeddings | Nikolaos Nakis et.al. | 2409.10452 | null |
2024-09-16 | Mamba-ST: State Space Model for Efficient Style Transfer | Filippo Botti et.al. | 2409.10385 | link |
2024-09-16 | 2D or not 2D: How Does the Dimensionality of Gesture Representation Affect 3D Co-Speech Gesture Generation? | Téo Guichoux et.al. | 2409.10357 | null |
2024-09-16 | Taming Diffusion Models for Image Restoration: A Review | Ziwei Luo et.al. | 2409.10353 | null |
2024-09-16 | MEGS: Morphological Evaluation of Galactic Structure | Ufuk Çakır et.al. | 2409.10346 | link |
2024-09-16 | VAE-QWGAN: Improving Quantum GANs for High Resolution Image Generation | Aaron Mark Thomas et.al. | 2409.10339 | null |
2024-09-16 | Research and Design of a Financial Intelligent Risk Control Platform Based on Big Data Analysis and Deep Machine Learning | Shuochen Bi et.al. | 2409.10331 | null |
2024-09-16 | Fairness, not Emotion, Drives Socioeconomic Decision Making | Rudra Mukhopadhyay et.al. | 2409.10322 | null |
2024-09-16 | On Synthetic Texture Datasets: Challenges, Creation, and Curation | Blaine Hoak et.al. | 2409.10297 | null |
2024-09-16 | DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis | Fa-Ting Hong et.al. | 2409.10281 | null |
2024-09-13 | Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation | Qingwen Bu et.al. | 2409.09016 | link |
2024-09-13 | A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis | Yohan Poirier-Ginter et.al. | 2409.08947 | null |
2024-09-13 | Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions | Zahra Ashktorab et.al. | 2409.08937 | null |
2024-09-13 | Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation | Guojun Liang et.al. | 2409.08917 | link |
2024-09-13 | Gaussian is All You Need: A Unified Framework for Solving Inverse Problems via Diffusion Posterior Sampling | Nebiyou Yismaw et.al. | 2409.08906 | null |
2024-09-13 | Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control | Carles Domingo-Enrich et.al. | 2409.08861 | null |
2024-09-13 | The Line-Based Dial-a-Ride Problem | Kendra Reiter et.al. | 2409.08860 | link |
2024-09-13 | InstantDrag: Improving Interactivity in Drag-based Image Editing | Joonghyuk Shin et.al. | 2409.08857 | null |
2024-09-13 | DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s) | Yun Su Jeong et.al. | 2409.08850 | null |
2024-09-13 | Development of a Compton Imager Setup | Anuraag Arya et.al. | 2409.08822 | null |
2024-09-13 | LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment | Huan Zhang et.al. | 2409.08795 | link |
2024-09-13 | What You Say = What You Want? Teaching Humans to Articulate Requirements for LLMs | Qianou Ma et.al. | 2409.08775 | link |
2024-09-13 | A Hybrid Meta-Learning and Multi-Armed Bandit Approach for Context-Specific Multi-Objective Recommendation Optimization | Tiago Cunha et.al. | 2409.08752 | null |
2024-09-13 | Adaptive Sampling for Continuous Group Equivariant Neural Networks | Berfin Inal et.al. | 2409.08741 | null |
2024-09-13 | DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset | Jiawei Du et.al. | 2409.08731 | link |
2024-09-12 | DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors | Thomas Hanwen Zhu et.al. | 2409.08278 | null |
2024-09-12 | Hand-Object Interaction Pretraining from Videos | Himanshu Gaurav Singh et.al. | 2409.08273 | null |
2024-09-12 | Click2Mask: Local Editing with Dynamic Mask Generation | Omer Regev et.al. | 2409.08272 | null |
2024-09-12 | DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer | Runjia Li et.al. | 2409.08271 | null |
2024-09-12 | Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation | Samanta Rodriguez et.al. | 2409.08269 | null |
2024-09-12 | Improving Text-guided Object Inpainting with Semantic Pre-inpainting | Yifu Chen et.al. | 2409.08260 | link |
2024-09-12 | Improving Virtual Try-On with Garment-focused Diffusion Models | Siqi Wan et.al. | 2409.08258 | null |
2024-09-12 | LoRID: Low-Rank Iterative Diffusion for Adversarial Purification | Geigh Zollicoffer et.al. | 2409.08255 | null |
2024-09-12 | Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding | Hongyu Li et.al. | 2409.08251 | null |
2024-09-12 | IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation | Yinwei Wu et.al. | 2409.08240 | null |
2024-09-12 | Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources | Alisia Lupidi et.al. | 2409.08239 | null |
2024-09-12 | LT3SD: Latent Trees for 3D Scene Diffusion | Quan Meng et.al. | 2409.08215 | null |
2024-09-12 | VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis | Hao Chen et.al. | 2409.08207 | null |
2024-09-12 | High-Frequency Anti-DreamBooth: Robust Defense Against Image Synthesis | Takuto Onikubo et.al. | 2409.08167 | link |
2024-09-12 | MagicStyle: Portrait Stylization Based on Reference Image | Zhaoli Deng et.al. | 2409.08156 | null |
2024-09-11 | DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation | Haibo Yang et.al. | 2409.07454 | null |
2024-09-11 | Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models | Haibo Yang et.al. | 2409.07452 | link |
2024-09-11 | FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process | Yang Luo et.al. | 2409.07451 | null |
2024-09-11 | Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging | Yunzhen Wang et.al. | 2409.07417 | null |
2024-09-11 | Extracting TCPIP Headers at High Speed for the Anonymized Network Traffic Graph Challenge | Zhaoyang Han et.al. | 2409.07374 | null |
2024-09-11 | Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination | Daniel Zhang-Li et.al. | 2409.07372 | null |
2024-09-11 | Event-based Mosaicing Bundle Adjustment | Shuang Guo et.al. | 2409.07365 | link |
2024-09-11 | Training-Free Guidance for Discrete Diffusion Models for Molecular Generation | Thomas J. Kerby et.al. | 2409.07359 | null |
2024-09-11 | Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching | Eugenio Chisari et.al. | 2409.07343 | null |
2024-09-11 | Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models | Fengzhe Zhang et.al. | 2409.07323 | null |
2024-09-11 | Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding | Ronald Katende et.al. | 2409.07310 | null |
2024-09-11 | Exploring User-level Gradient Inversion with a Diffusion Prior | Zhuohang Li et.al. | 2409.07291 | null |
2024-09-11 | CCFExp: Facial Image Synthesis with Cycle Cross-Fusion Diffusion Model for Facial Paralysis Individuals | Weixiang Gao et.al. | 2409.07271 | link |
2024-09-11 | Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models | Sanoojan Baliah et.al. | 2409.07269 | link |
2024-09-11 | EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion | Jian Zhang et.al. | 2409.07255 | null |
2024-09-10 | Technical Report of Mobile Manipulator Robot for Industrial Environments | Erfan Amoozad Khalili et.al. | 2409.06693 | null |
2024-09-10 | SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation | Teng Hu et.al. | 2409.06633 | null |
2024-09-10 | MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification | Phu Pham et.al. | 2409.06620 | null |
2024-09-10 | A Primer on Variational Inference for Physics-Informed Deep Generative Modelling | Alex Glyn-Davies et.al. | 2409.06560 | null |
2024-09-10 | From LIMA to DeepLIMA: following a new path of interoperability | Victor Bocharov et.al. | 2409.06550 | null |
2024-09-10 | Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models | Xin Jing et.al. | 2409.06451 | null |
2024-09-10 | Prompt2Fashion: An automatically generated fashion dataset | Georgia Argyro et.al. | 2409.06442 | link |
2024-09-10 | Fast nonparametric inference of network backbones for graph sparsification | Alec Kirkley et.al. | 2409.06417 | link |
2024-09-10 | Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition | Junzheng Zhang et.al. | 2409.06371 | null |
2024-09-10 | What happens to diffusion model likelihood when your model is conditional? | Mattias Cross et.al. | 2409.06364 | null |
2024-09-10 | DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement | Jia-Wei Liao et.al. | 2409.06355 | null |
2024-09-10 | Improving Conditional Level Generation using Automated Validation in Match-3 Games | Monica Villanueva Aylagas et.al. | 2409.06349 | null |
2024-09-10 | Foragax: An Agent Based Modelling framework based on JAX | Siddharth Chaturvedi et.al. | 2409.06345 | link |
2024-09-10 | G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer | Jinzhi Zhang et.al. | 2409.06322 | null |
2024-09-10 | Learning Augmentation Policies from A Model Zoo for Time Series Forecasting | Haochen Yuan et.al. | 2409.06282 | null |
2024-09-09 | Fast Generation of Custom Floating-Point Spatial Filters on FPGAs | Nelson Campos et.al. | 2409.05837 | null |
2024-09-09 | Enhancing Preference-based Linear Bandits via Human Response Time | Shen Li et.al. | 2409.05798 | null |
2024-09-09 | Predicting Critical Heat Flux with Uncertainty Quantification and Domain Generalization Using Conditional Variational Autoencoders and Deep Neural Networks | Farah Alsafadi et.al. | 2409.05790 | null |
2024-09-09 | Vector Quantized Diffusion Model Based Speech Bandwidth Extension | Yuan Fang et.al. | 2409.05784 | null |
2024-09-09 | AS-Speech: Adaptive Style For Speech Synthesis | Zhipeng Li et.al. | 2409.05730 | null |
2024-09-09 | pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning | Jiahao Lai et.al. | 2409.05701 | null |
2024-09-09 | Citizen-Led Personalization of User Interfaces: Investigating How People Customize Interfaces for Themselves and Others | Sérgio Alves et.al. | 2409.05696 | null |
2024-09-09 | Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models | Aakash Sen Sharma et.al. | 2409.05668 | null |
2024-09-09 | Forward KL Regularized Preference Optimization for Aligning Diffusion Policies | Zhao Shan et.al. | 2409.05622 | null |
2024-09-09 | CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization | Nan Chen et.al. | 2409.05606 | null |
2024-09-09 | Latent 3D Brain MRI Counterfactual | Wei Peng et.al. | 2409.05585 | null |
2024-09-09 | Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation | Muraleekrishna Gopinathan et.al. | 2409.05583 | link |
2024-09-09 | Design and Implementation of TAO DAQ System | Shuihan Zhang et.al. | 2409.05522 | null |
2024-09-09 | A Taxonomy of Miscompressions: Preparing Image Forensics for Neural Compression | Nora Hofer et.al. | 2409.05490 | null |
2024-09-09 | DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation | Wei Wu et.al. | 2409.05463 | null |
2024-09-06 | VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation | Yecheng Wu et.al. | 2409.04429 | link |
2024-09-06 | Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques | Davide Clode da Silva et.al. | 2409.04424 | null |
2024-09-06 | Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation | Zhuoyan Luo et.al. | 2409.04410 | null |
2024-09-06 | Enhancing Skin Lesion Diagnosis with Ensemble Learning | Xiaoyi Liu et.al. | 2409.04381 | null |
2024-09-06 | How Fair is Your Diffusion Recommender Model? | Daniele Malitesta et.al. | 2409.04339 | null |
2024-09-06 | Random effects estimation in a fractional diffusion model based on continuous observations | Nesrine Chebli et.al. | 2409.04331 | null |
2024-09-06 | Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models | Yuxiao Huang et.al. | 2409.04270 | null |
2024-09-06 | An overview of domain-specific foundation model: key technologies, applications and challenges | Haolong Chen et.al. | 2409.04267 | null |
2024-09-06 | UniDet3D: Multi-dataset Indoor 3D Object Detection | Maksim Kolodiazhnyi et.al. | 2409.04234 | link |
2024-09-06 | Generative Modelling via Quantile Regression | Johannes Schmidt-Hieber et.al. | 2409.04231 | null |
2024-09-06 | Breaking the Brownian Barrier: Models and Manifestations of Molecular Diffusion in Complex Fluids | Harish Srinivasan et.al. | 2409.04199 | null |
2024-09-06 | GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers | Lorenza Prospero et.al. | 2409.04196 | null |
2024-09-06 | Subsampling of Correlated Graph Signals | Rishabh Ravi et.al. | 2409.04107 | null |
2024-09-06 | Estimation of service value parameters for a queue with unobserved balking | Daniel Podorojnyi et.al. | 2409.04090 | null |
2024-09-06 | D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection | Kentaro Hirahara et.al. | 2409.04060 | null |
2024-09-05 | Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding | Yunze Man et.al. | 2409.03757 | link |
2024-09-05 | WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild | Yuntian Deng et.al. | 2409.03753 | null |
2024-09-05 | ArtiFade: Learning to Generate High-quality Subject from Blemished Images | Shuya Yang et.al. | 2409.03745 | null |
2024-09-06 | RAG based Question-Answering for Contextual Response Prediction System | Sriram Veturi et.al. | 2409.03708 | null |
2024-09-05 | RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images | Benzhi Wang et.al. | 2409.03644 | link |
2024-09-05 | DiffEVC: Any-to-Any Emotion Voice Conversion with Expressive Guidance | Hsing-Hang Chou et.al. | 2409.03636 | null |
2024-09-05 | Generalizing Linear Graphs and Bond Graph Models with Hetero-functional Graphs for System-of-Systems Engineering Applications | Ehsanoddin Ghorbanichemazkati et.al. | 2409.03630 | null |
2024-09-05 | TCDiff: Triple Condition Diffusion Model with 3D Constraints for Stylizing Synthetic Faces | Bernardo Biesseck et.al. | 2409.03600 | link |
2024-09-05 | DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture | Qianlong Xiang et.al. | 2409.03550 | null |
2024-09-05 | Euclid preparation. Simulations and nonlinearities beyond $\Lambda$CDM. 2. Results from non-standard simulations | Euclid Collaboration et.al. | 2409.03523 | null |
2024-09-05 | Blended Latent Diffusion under Attention Control for Real-World Video Editing | Deyin Liu et.al. | 2409.03514 | null |
2024-09-05 | Physical Modelling of Piano Sound | Haifan Xie et.al. | 2409.03481 | null |
2024-09-05 | Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration | Pei Wang et.al. | 2409.03455 | null |
2024-09-05 | Rx Strategist: Prescription Verification using LLM Agents System | Phuc Phan Van et.al. | 2409.03440 | null |
2024-09-05 | KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale | Wei Gao et.al. | 2409.03439 | null |
2024-09-04 | HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts | Xinyu Liu et.al. | 2409.02919 | link |
2024-09-04 | Latent Watermarking of Audio Generative Models | Robin San Roman et.al. | 2409.02915 | null |
2024-09-04 | Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling | Kaiwen Zheng et.al. | 2409.02908 | null |
2024-09-04 | Configurable Foundation Models: Building LLMs from a Modular Perspective | Chaojun Xiao et.al. | 2409.02877 | null |
2024-09-04 | Look Into the LITE in Deep Learning for Time Series Classification | Ali Ismail-Fawaz et.al. | 2409.02869 | link |
2024-09-04 | Building a Scalable, Effective, and Steerable Search and Ranking Platform | Marjan Celikik et.al. | 2409.02856 | null |
2024-09-04 | Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models | Zhibin Liu et.al. | 2409.02851 | link |
2024-09-04 | Anomaly Detection in Offshore Open Radio Access Network Using Long Short-Term Memory Models on a Novel Artificial Intelligence-Driven Cloud-Native Data Platform | Abdelrahim Ahmad et.al. | 2409.02849 | null |
2024-09-04 | Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model | Tornike Karchkhadze et.al. | 2409.02845 | null |
2024-09-04 | SNNAX – Spiking Neural Networks in JAX | Jamie Lohoff et.al. | 2409.02842 | null |
2024-09-04 | Experimental Framework for Generating Reliable Ground Truth for Laryngeal Spatial Segmentation Tasks | Hamzeh Ghasemzadeh et.al. | 2409.02809 | null |
2024-09-04 | Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL | Mohammad Reshadati et.al. | 2409.02711 | null |
2024-09-04 | Rethinking HTG Evaluation: Bridging Generation and Recognition | Konstantina Nikolaidou et.al. | 2409.02683 | link |
2024-09-04 | Introduction to Machine Learning | Laurent Younes et.al. | 2409.02668 | null |
2024-09-04 | Creating Domain-Specific Translation Memories for Machine Translation Fine-tuning: The TRENCARD Bilingual Cardiology Corpus | Gokhan Dogru et.al. | 2409.02667 | null |
2024-08-30 | Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes | Li Zhang et.al. | 2408.17421 | link |
2024-08-30 | Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain | Francesca Grasso et.al. | 2408.17362 | link |
2024-08-30 | Subspace Diffusion Posterior Sampling for Travel-Time Tomography | Xiang Cao et.al. | 2408.17333 | null |
2024-08-30 | Structuring a Training Strategy to Robustify Perception Models with Realistic Image Augmentations | Ahmed Hammam et.al. | 2408.17311 | null |
2024-08-30 | Leveraging Deep Generative Model For Computational Protein Design And Optimization | Boqiao Lai et.al. | 2408.17241 | null |
2024-08-30 | Towards Symbolic XAI – Explanation Through Human Understandable Logical Relationships Between Features | Thomas Schnake et.al. | 2408.17198 | null |
2024-09-02 | Leveraging Blockchain and ANFIS for Optimal Supply Chain Management | Amirfarhad Farhadi et.al. | 2408.17161 | null |
2024-08-30 | Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning | Xiaoye Qu et.al. | 2408.17150 | link |
2024-08-30 | Flow Matching for Optimal Reaction Coordinates of Biomolecular System | Mingyuan Zhang et.al. | 2408.17139 | link |
2024-08-30 | Temporal and Interactive Modeling for Efficient Human-Human Motion Generation | Yabiao Wang et.al. | 2408.17135 | null |
2024-09-02 | RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance | Avideep Mukherjee et.al. | 2408.17095 | null |
2024-08-30 | FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition | Chen Hu et.al. | 2408.17090 | link |
2024-08-30 | Approximately Invertible Neural Network for Learned Image Compression | Yanbo Gao et.al. | 2408.17073 | null |
2024-09-02 | Instant Adversarial Purification with Adversarial Consistency Distillation | Chun Tong Lei et.al. | 2408.17064 | null |
2024-08-30 | Text-to-Image Generation Via Energy-Based CLIP | Roy Ganz et.al. | 2408.17046 | null |
2024-08-29 | ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model | Fangfu Liu et.al. | 2408.16767 | null |
2024-08-29 | CSGO: Content-Style Composition in Text-to-Image Generation | Peng Xing et.al. | 2408.16766 | null |
2024-08-29 | A Score-Based Density Formula, with Applications in Diffusion Generative Models | Gen Li et.al. | 2408.16765 | null |
2024-08-29 | UV-free Texture Generation with Denoising and Geodesic Heat Diffusions | Simone Foti et.al. | 2408.16762 | link |
2024-08-29 | One-Shot Learning Meets Depth Diffusion in Multi-Object Videos | Anisha Jain et.al. | 2408.16704 | null |
2024-08-29 | VMC: A Grammar for Visualizing Statistical Model Checks | Ziyang Guo et.al. | 2408.16702 | null |
2024-08-29 | GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models | Moreno D’Incà et.al. | 2408.16700 | link |
2024-08-29 | Optimization Models for the Quadratic Traveling Salesperson Problem | Yuxiao Chen et.al. | 2408.16680 | null |
2024-08-29 | DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving | Yongjie Fu et.al. | 2408.16647 | null |
2024-08-29 | RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model | Zhuan Shi et.al. | 2408.16634 | null |
2024-08-28 | TEDRA: Text-based Editing of Dynamic and Photoreal Actors | Basavaraj Sunagad et.al. | 2408.15995 | null |
2024-08-28 | Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation | Shengyuan Zhang et.al. | 2408.15991 | link |
2024-08-28 | Thoughtseeds: Evolutionary Priors, Nested Markov Blankets, and the Emergence of Embodied Cognition | Prakash Chandra Kavi et.al. | 2408.15982 | null |
2024-08-28 | Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems | Ibrahim K. Ozaslan et.al. | 2408.15969 | null |
2024-08-28 | MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets | Dominic Phillips et.al. | 2408.15905 | null |
2024-08-28 | Gen-Swarms: Adapting Deep Generative Models to Swarms of Drones | Carlos Plou et.al. | 2408.15899 | null |
2024-08-28 | Airfoil Diffusion: Denoising Diffusion Model For Conditional Airfoil Generation | Reid Graves et.al. | 2408.15898 | link |
2024-08-28 | Disentangled Diffusion Autoencoder for Harmonization of Multi-site Neuroimaging Data | Ayodeji Ijishakin et.al. | 2408.15890 | null |
2024-08-29 | Recent Decade’s Power Outage Data Reveals the Increasing Vulnerability of U.S. Power Infrastructure | Bo Li et.al. | 2408.15882 | null |
2024-08-28 | GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model | Yongjie Fu et.al. | 2408.15868 | null |
2024-08-27 | GenRec: Unifying Video Generation and Recognition with Diffusion Models | Zejia Weng et.al. | 2408.15241 | link |
2024-08-27 | Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation | Xiaojuan Wang et.al. | 2408.15239 | null |
2024-08-27 | Simulation of Stochastic Discrete Dislocation Dynamics in Ductile Vs Brittle Materials | Santosh Chhetri et.al. | 2408.15157 | null |
2024-08-27 | How transformers learn structured data: insights from hierarchical filtering | Jerome Garnier-Brun et.al. | 2408.15138 | link |
2024-08-27 | DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays | Yiran Sun et.al. | 2408.15118 | link |
2024-08-27 | Data-Driven Nonlinear Deformation Design of 3D-Printable Shells | Samuel Silverman et.al. | 2408.15097 | link |
2024-08-27 | Constrained Diffusion Models via Dual Training | Shervin Khalafi et.al. | 2408.15094 | null |
2024-08-27 | LN-Gen: Rectal Lymph Nodes Generation via Anatomical Features | Weidong Guo et.al. | 2408.14977 | null |
2024-08-27 | MegActor-$\Sigma$: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer | Shurong Yang et.al. | 2408.14975 | null |
2024-08-27 | Integrated Bundling and Pricing of Unique Items | Maxime Bouscary et.al. | 2408.14913 | null |
2024-08-26 | K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences | Zhikai Li et.al. | 2408.14468 | null |
2024-08-26 | Uncovering Knowledge Gaps in Radiology Report Generation Models through Knowledge Graphs | Xiaoman Zhang et.al. | 2408.14397 | link |
2024-08-26 | Reprogramming Foundational Large Language Models (LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning | Sakhinana Sagar Srinivas et.al. | 2408.14387 | null |
2024-08-26 | GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy | Peiyan Li et.al. | 2408.14368 | link |
2024-08-27 | Foundation Models for Music: A Survey | Yinghao Ma et.al. | 2408.14340 | link |
2024-08-26 | Automated Machine Learning in Insurance | Panyi Dong et.al. | 2408.14331 | link |
2024-08-26 | LLM-3D Print: Large Language Models To Monitor and Control 3D Printing | Yayati Jadhav et.al. | 2408.14307 | null |
2024-08-26 | Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes | Chao Chen et.al. | 2408.14279 | null |
2024-08-26 | Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach | Vittoriano Muttillo et.al. | 2408.14259 | null |
2024-08-27 | Text3DAug – Prompted Instance Augmentation for LiDAR Perception | Laurenz Reichardt et.al. | 2408.14253 | link |
2024-08-23 | How Diffusion Models Learn to Factorize and Compose | Qiyao Liang et.al. | 2408.13256 | null |
2024-08-23 | Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption | Sakhinana Sagar Srinivas et.al. | 2408.13248 | null |
2024-08-23 | CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities | Tao Wu et.al. | 2408.13239 | null |
2024-08-23 | Social Welfare Maximization for Federated Learning with Network Effects | Xiang Li et.al. | 2408.13223 | null |
2024-08-23 | Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews | Dineth Jayakody et.al. | 2408.13202 | null |
2024-08-23 | IFH: a Diffusion Framework for Flexible Design of Graph Generative Models | Samuel Cognolato et.al. | 2408.13194 | link |
2024-08-23 | Deep Learning for Lung Disease Classification Using Transfer Learning and a Customized CNN Architecture with Attention | Xiaoyi Liu et.al. | 2408.13180 | null |
2024-08-26 | Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation | Bonan Li et.al. | 2408.13149 | null |
2024-08-23 | Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning | Jihwan Oh et.al. | 2408.13092 | null |
2024-08-23 | General Intelligent Imaging and Uncertainty Quantification by Deterministic Diffusion Model | Weiru Fan et.al. | 2408.13061 | null |
2024-08-22 | xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations | Can Qin et.al. | 2408.12590 | null |
2024-08-22 | ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation | Lujia Zhong et.al. | 2408.12561 | link |
2024-08-22 | Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | Jinheng Xie et.al. | 2408.12528 | null |
2024-08-22 | FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing | Jue Wang et.al. | 2408.12429 | link |
2024-08-22 | Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification | Sudi Murindanyi et.al. | 2408.12426 | null |
2024-08-22 | 4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Motion Alignment | Kaihui Cheng et.al. | 2408.12419 | null |
2024-08-22 | CODE: Confident Ordinary Differential Editing | Bastien van Delft et.al. | 2408.12418 | link |
2024-08-22 | Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures | Ce Liu et.al. | 2408.12413 | null |
2024-08-22 | A Stable Polygamy Approach to Spectrum Access with Channel Reuse | Dan Ben Ami et.al. | 2408.12402 | null |
2024-08-22 | Multi-Style Facial Sketch Synthesis through Masked Generative Modeling | Bowen Sun et.al. | 2408.12400 | null |
2024-08-21 | Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models | Chun-Yen Shih et.al. | 2408.11810 | null |
2024-08-21 | ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation | Shiqi Yang et.al. | 2408.11805 | null |
2024-08-21 | DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework | Zhifei Xie et.al. | 2408.11788 | null |
2024-08-21 | Timeline and Boundary Guided Diffusion Network for Video Shadow Detection | Haipeng Zhou et.al. | 2408.11785 | link |
2024-08-21 | Sum of Squares Circuits | Lorenzo Loconte et.al. | 2408.11778 | null |
2024-08-21 | Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards | Omar Erak et.al. | 2408.11775 | link |
2024-08-21 | D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models | M. Forlini et.al. | 2408.11761 | null |
2024-08-21 | JieHua Paintings Style Feature Extracting Model using Stable Diffusion with ControlNet | Yujia Gu et.al. | 2408.11744 | null |
2024-08-21 | Enhancing Cross-Modal Medical Image Segmentation through Compositionality | Aniek Eijpe et.al. | 2408.11733 | link |
2024-08-21 | AI-assisted Automated Short Answer Grading of Handwritten University Level Mathematics Exams | Tianyi Liu et.al. | 2408.11728 | null |
2024-08-20 | Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research | Sreyoshi Bhaduri et.al. | 2408.11043 | null |
2024-08-20 | Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model | Chunting Zhou et.al. | 2408.11039 | null |
2024-08-20 | Full Detector Simulation of a Projective Dual-Readout Segmented Crystal Electromagnetic Calorimeter with Precision Timing | Wonyong Chung et.al. | 2408.11027 | null |
2024-08-20 | MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning | Haoning Wu et.al. | 2408.11001 | link |
2024-08-20 | GreediRIS: Scalable Influence Maximization using Distributed Streaming Maximum Cover | Reet Barik et.al. | 2408.10982 | null |
2024-08-21 | Assortment Optimization Under History-Dependent Effects | Taotao He et.al. | 2408.10967 | null |
2024-08-20 | Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling | Jaideep Pathak et.al. | 2408.10958 | null |
2024-08-20 | SysBench: Can Large Language Models Follow System Messages? | Yanzhao Qin et.al. | 2408.10943 | link |
2024-08-20 | A Closer Look at Data Augmentation Strategies for Finetuning-Based Low/Few-Shot Object Detection | Vladislav Li et.al. | 2408.10940 | null |
2024-08-20 | Large Point-to-Gaussian Model for Image-to-3D Generation | Longfei Lu et.al. | 2408.10935 | null |
2024-08-19 | MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model | Minghua Liu et.al. | 2408.10198 | null |
2024-08-19 | SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views | Chao Xu et.al. | 2408.10195 | null |
2024-08-19 | Customizing Language Models with Instance-wise LoRA for Sequential Recommendation | Xiaoyu Kong et.al. | 2408.10159 | link |
2024-08-19 | Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language | Manjil Karki et.al. | 2408.10128 | null |
2024-08-19 | Learning Precise Affordances from Egocentric Videos for Robotic Manipulation | Gen Li et.al. | 2408.10123 | null |
2024-08-19 | Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision | Zhijun Jia et.al. | 2408.10096 | null |
2024-08-19 | Stacked Intelligent Metasurfaces for Integrated Sensing and Communications | Haoxian Niu et.al. | 2408.10043 | null |
2024-08-19 | General Impedance Modeling for Modular Multilevel Converter with Grid-forming and Grid-following Control | Chu Sun et.al. | 2408.10017 | null |
2024-08-19 | Uniting contrastive and generative learning for event sequences models | Aleksandr Yugay et.al. | 2408.09995 | null |
2024-08-19 | Multi-layer diffusion model of photovoltaic installations | Tomasz Weron et.al. | 2408.09904 | null |
2024-08-16 | Automated High-throughput Organic Crystal Structure Prediction via Population-based Sampling | Qiang Zhu et.al. | 2408.08843 | link |
2024-08-16 | PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future | Guangyi Wang et.al. | 2408.08822 | null |
2024-08-16 | A Unified Automata-Theoretic Approach to LTLf Modulo Theories (Extended Version) | Marco Faella et.al. | 2408.08817 | null |
2024-08-16 | EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics | Chenwei Wan et.al. | 2408.08782 | link |
2024-08-16 | Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion | Sanchayan Vivekananthan et.al. | 2408.08751 | null |
2024-08-16 | The Blessing of Strategic Customers in Personalized Pricing | Zhi Chen et.al. | 2408.08738 | null |
2024-08-16 | ChatZero: Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language | Yongkang Liu et.al. | 2408.08724 | null |
2024-08-16 | An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation | Peiming Guo et.al. | 2408.08650 | null |
2024-08-16 | Modeling the Neonatal Brain Development Using Implicit Neural Representations | Florentin Bieder et.al. | 2408.08647 | link |
2024-08-16 | Sampling effects on Lasso estimation of drift functions in high-dimensional diffusion processes | Chiara Amorino et.al. | 2408.08638 | null |
2024-08-15 | Understanding the Local Geometry of Generative Model Manifolds | Ahmed Imtiaz Humayun et.al. | 2408.08307 | null |
2024-08-15 | Accelerated Image-Aware Generative Diffusion Modeling | Tanmay Asthana et.al. | 2408.08306 | null |
2024-08-15 | Marker or Markerless? Mode-Switchable Optical Tactile Sensing for Diverse Robot Tasks | Ni Ou et.al. | 2408.08276 | null |
2024-08-15 | mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis | Dae-young Kim et.al. | 2408.08261 | null |
2024-08-15 | Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding | Xiner Li et.al. | 2408.08252 | link |
2024-08-15 | Picosecond laser pulses for quantum dot-microcavity based single photon generation by cascaded electro-optic modulation of a narrow-linewidth laser | Mio Poortvliet et.al. | 2408.08213 | null |
2024-08-15 | Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion | Adi Haviv et.al. | 2408.08184 | null |
2024-08-15 | Impact of Comprehensive Data Preprocessing on Predictive Modelling of COVID-19 Mortality | Sangita Das et.al. | 2408.08142 | link |
2024-08-15 | Decoding Memes: A Comparative Study of Machine Learning Models for Template Identification | Levente Murgás et.al. | 2408.08126 | link |
2024-08-15 | When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding | Pingping Zhang et.al. | 2408.08093 | null |
2024-08-14 | Detecting Near-Duplicate Face Images | Sudipta Banerjee et.al. | 2408.07689 | link |
2024-08-14 | Composing Automatic Differentiation with Custom Derivatives of Higher-Order Functions | Sam Estep et.al. | 2408.07683 | null |
2024-08-14 | Drug Discovery SMILES-to-Pharmacokinetics Diffusion Models with Deep Molecular Understanding | Bing Hu et.al. | 2408.07636 | null |
2024-08-14 | Anisotropic Diffusion Model of Communication in 2D Biofilm | Yanahan Paramalingam et.al. | 2408.07626 | null |
2024-08-14 | Neural Quantum States and Peaked Molecular Wave Functions: Curse or Blessing? | Aleksei Malyshev et.al. | 2408.07625 | null |
2024-08-14 | MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials | Yan Chen et.al. | 2408.07608 | null |
2024-08-14 | PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation | Sang-Hoon Lee et.al. | 2408.07547 | link |
2024-08-14 | New Curriculum, New Chance – Retrieval Augmented Generation for Lesson Planning in Ugandan Secondary Schools. Prototype Quality Evaluation | Simon Kloker et.al. | 2408.07542 | null |
2024-08-14 | DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model | Erez Yosef et.al. | 2408.07541 | null |
2024-08-14 | Towards Real-time Video Compressive Sensing on Mobile Devices | Miao Cao et.al. | 2408.07530 | link |
2024-08-13 | Imagen 3 | Imagen-Team-Google et.al. | 2408.07009 | null |
2024-08-13 | Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models | Cheng Chen et.al. | 2408.06995 | null |
2024-08-13 | DCMSA: Multi-Head Self-Attention Mechanism Based on Deformable Convolution For Seismic Data Denoising | Wang Mingwei et.al. | 2408.06963 | null |
2024-08-13 | Neural Speech and Audio Coding | Minje Kim et.al. | 2408.06954 | null |
2024-08-13 | Diffusion Model for Slate Recommendation | Federico Tomasi et.al. | 2408.06883 | null |
2024-08-13 | Efficient Search for Customized Activation Functions with Gradient Descent | Lukas Strack et.al. | 2408.06820 | link |
2024-08-13 | Enhancing Diabetic Retinopathy Diagnosis: A Lightweight CNN Architecture for Efficient Exudate Detection in Retinal Fundus Images | Mujadded Al Rabbani Alif et.al. | 2408.06784 | null |
2024-08-13 | Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective | Ouxiang Li et.al. | 2408.06741 | link |
2024-08-13 | DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion | Yujia Wu et.al. | 2408.06740 | null |
2024-08-13 | Multimodal Analysis of White Blood Cell Differentiation in Acute Myeloid Leukemia Patients using a β-Variational Autoencoder | Gizem Mert et.al. | 2408.06720 | null |
2024-08-12 | The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Chris Lu et.al. | 2408.06292 | link |
2024-08-12 | Open-Source Molecular Processing Pipeline for Generating Molecules | Shreyas V et.al. | 2408.06261 | null |
2024-08-12 | 3D Reconstruction of Protein Structures from Multi-view AFM Images using Neural Radiance Fields (NeRFs) | Jaydeep Rade et.al. | 2408.06244 | null |
2024-08-12 | Cislunar Constellation Design for Space Situational Awareness with Time-Expanded Facility Location Problem | Yuri Shimane et.al. | 2408.06238 | null |
2024-08-12 | Novel View Synthesis from a Single Image with Pretrained Diffusion Guidance | Taewon Kang et.al. | 2408.06157 | null |
2024-08-12 | LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library | Tianhao Yu et.al. | 2408.06150 | null |
2024-08-12 | Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models | Ioannis Romanelis et.al. | 2408.06145 | link |
2024-08-12 | Med42-v2: A Suite of Clinical LLMs | Clément Christophe et.al. | 2408.06142 | null |
2024-08-12 | Five Pitfalls When Assessing Synthetic Medical Images with Reference Metrics | Melanie Dohmen et.al. | 2408.06075 | null |
2024-08-12 | CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer | Zhuoyi Yang et.al. | 2408.06072 | link |
2024-08-09 | Multi-Garment Customized Model Generation | Yichen Liu et.al. | 2408.05206 | null |
2024-08-09 | TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning | Yujie Feng et.al. | 2408.05200 | link |
2024-08-09 | Cell Morphology-Guided Small Molecule Generation with GFlowNets | Stephen Zhewen Lu et.al. | 2408.05196 | link |
2024-08-09 | Lithography-free patterning of chalcogenide materials for integrated photonic devices | Zhen Hu et.al. | 2408.05099 | null |
2024-08-09 | Social contagion under hybrid interactions | Xincheng Shu et.al. | 2408.05050 | null |
2024-08-09 | Infrared Beam-shaping on Demand via Tailored Geometric Phase Metasurfaces employing the Plasmonic Phase-Change Material In$_3$SbTe$_2$ | Lukas Conrads et.al. | 2408.05044 | null |
2024-08-09 | Collaborative Static-Dynamic Teaching: A Semi-Supervised Framework for Stripe-Like Space Target Detection | Zijian Zhu et.al. | 2408.05029 | null |
2024-08-09 | Retrieval-augmented code completion for local projects using large language models | Marko Hostnik et.al. | 2408.05026 | null |
2024-08-09 | DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow | Hangyu Li et.al. | 2408.05008 | null |
2024-08-09 | Pay Attention To Mean Fields For Point Cloud Generation | Benno Käch et.al. | 2408.04997 | link |
2024-08-08 | Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics | Ruining Li et.al. | 2408.04631 | null |
2024-08-08 | Transformer Explainer: Interactive Learning of Text-Generative Models | Aeree Cho et.al. | 2408.04619 | null |
2024-08-08 | Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User’s Casual Sketches | Yongzhi Xu et.al. | 2408.04567 | null |
2024-08-08 | Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models | Yupeng Chang et.al. | 2408.04556 | link |
2024-08-08 | On the Asymptotic Convergence of Subgraph Generated Models | Xinchen Xu et.al. | 2408.04541 | null |
2024-08-08 | AExGym: Benchmarks and Environments for Adaptive Experimentation | Jimmy Wang et.al. | 2408.04531 | null |
2024-08-08 | NFDI4Health workflow and service for synthetic data generation, assessment and risk management | Sobhan Moazemi et.al. | 2408.04478 | null |
2024-08-08 | Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations | Julen Urain et.al. | 2408.04380 | null |
2024-08-08 | Making sense of AI systems development | Mateusz Dolata et.al. | 2408.04311 | null |
2024-08-08 | AI-Driven Chatbot for Intrusion Detection in Edge Networks: Enhancing Cybersecurity with Ethical User Consent | Mugheez Asif et.al. | 2408.04281 | null |
2024-08-07 | Prospects for using drones to test formation-flying CubeSat concepts, and other astronomical applications | John D. Monnier et.al. | 2408.03911 | null |
2024-08-07 | Hate Speech Detection and Classification in Amharic Text with Deep Learning | Samuel Minale Gashe et.al. | 2408.03849 | null |
2024-08-07 | WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models | Prannaya Gupta et.al. | 2408.03837 | link |
2024-08-07 | A broken duet: multistable dynamics of dyadic interactions | Johan Medrano et.al. | 2408.03809 | link |
2024-08-07 | Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning | Martin Moder et.al. | 2408.03807 | link |
2024-08-07 | Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model | Guoqing Zhu et.al. | 2408.03748 | link |
2024-08-07 | Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction | Benjamin Matthias Ruppik et.al. | 2408.03706 | null |
2024-08-07 | Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling | Zilyu Ye et.al. | 2408.03695 | link |
2024-08-07 | Unsupervised Detection of Fetal Brain Anomalies using Denoising Diffusion Models | Markus Ditlev Sjøgren Olsen et.al. | 2408.03654 | null |
2024-08-07 | Goal-oriented Semantic Communication for the Metaverse Application | Zhe Wang et.al. | 2408.03646 | null |
2024-08-06 | MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation | Xiaofeng Mao et.al. | 2408.03312 | null |
2024-08-06 | IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts | Ciara Rowles et.al. | 2408.03209 | null |
2024-08-06 | Personalizing Federated Instrument Segmentation with Visual Trait Priors in Robotic Surgery | Jialang Xu et.al. | 2408.03208 | null |
2024-08-06 | An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion | Xingguang Yan et.al. | 2408.03178 | null |
2024-08-06 | Iterative CT Reconstruction via Latent Variable Optimization of Shallow Diffusion Models | Sho Ozaki et.al. | 2408.03156 | null |
2024-08-06 | Enhancing Twitter Bot Detection via Multimodal Invariant Representations | Jibing Gong et.al. | 2408.03096 | null |
2024-08-06 | Analysis of Argument Structure Constructions in a Deep Recurrent Language Model | Pegah Ramezani et.al. | 2408.03062 | null |
2024-08-06 | OpenOmni: A Collaborative Open Source Tool for Building Future-Ready Multimodal Conversational Agents | Qiang Sun et.al. | 2408.03047 | link |
2024-08-06 | Targeted Visual Prompting for Medical Visual Question Answering | Sergio Tascon-Morales et.al. | 2408.03043 | link |
2024-08-06 | Training-Free Condition Video Diffusion Models for single frame Spatial-Semantic Echocardiogram Synthesis | Van Phi Nguyen et.al. | 2408.03035 | link |
2024-08-05 | Command-line Obfuscation Detection using Small Language Models | Vojtech Outrata et.al. | 2408.02637 | null |
2024-08-05 | VidGen-1M: A Large-Scale Dataset for Text-to-video Generation | Zhiyu Tan et.al. | 2408.02629 | null |
2024-08-05 | YOWOv3: An Efficient and Generalized Framework for Human Action Detection and Recognition | Duc Manh Nguyen Dang et.al. | 2408.02623 | link |
2024-08-05 | LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba | Yunxiang Fu et.al. | 2408.02615 | link |
2024-08-05 | MetaParticles: Computationally engineered nanomaterials with tunable and responsive properties | Massimiliano Paesani et.al. | 2408.02564 | null |
2024-08-05 | Fairness and Bias Mitigation in Computer Vision: A Survey | Sepehr Dehdashtian et.al. | 2408.02464 | null |
2024-08-05 | TGS: Trajectory Generation and Selection using Vision Language Models in Mapless Outdoor Environments | Daeun Song et.al. | 2408.02454 | null |
2024-08-05 | Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models | Zi Liang et.al. | 2408.02416 | link |
2024-08-05 | Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models | Tongtong Feng et.al. | 2408.02408 | null |
2024-08-05 | A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models | Vanni Zavarella et.al. | 2408.02377 | null |
2024-08-02 | Conditional LoRA Parameter Generation | Xiaolong Jin et.al. | 2408.01415 | null |
2024-08-02 | Autoencoders in Function Space | Justin Bunker et.al. | 2408.01362 | link |
2024-08-02 | MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code | Kaiwen Ning et.al. | 2408.01354 | link |
2024-08-02 | TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling | Dong Huo et.al. | 2408.01291 | null |
2024-08-02 | A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness | Lutao Jiang et.al. | 2408.01269 | null |
2024-08-02 | Exchange control in a MOS double quantum dot made using a 300 mm wafer process | Jacob F. Chittock-Wood et.al. | 2408.01241 | null |
2024-08-02 | CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models | Kushal Kumar Jain et.al. | 2408.01233 | null |
2024-08-02 | Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion | Ke Li et.al. | 2408.01225 | link |
2024-08-02 | PSP-GEN: Stochastic inversion of the Process-Structure-Property chain in materials design through deep, generative probabilistic modeling | Yaohua Zang et.al. | 2408.01114 | null |
2024-08-02 | Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding | Danbinaerin Han et.al. | 2408.01096 | link |
2024-08-01 | Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation | Yixiao Wang et.al. | 2408.00766 | null |
2024-08-01 | Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention | Susung Hong et.al. | 2408.00760 | link |
2024-08-01 | DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency | Jovan Stojkovic et.al. | 2408.00741 | null |
2024-08-01 | TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models | Gilad Deutch et.al. | 2408.00735 | null |
2024-08-01 | A Natural Language Processing Framework for Hotel Recommendation Based on Users’ Text Reviews | Lavrentia Aravani et.al. | 2408.00716 | null |
2024-08-02 | Reinforcement Learning applied to Insurance Portfolio Pursuit | Edward James Young et.al. | 2408.00713 | link |
2024-08-01 | MotionFix: Text-Driven 3D Human Motion Editing | Nikos Athanasiou et.al. | 2408.00712 | null |
2024-08-01 | Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function | Matias Oscar Volman Stern et.al. | 2408.00707 | null |
2024-08-01 | AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models | Daqin Luo et.al. | 2408.00665 | link |
2024-08-01 | Privacy-preserving datasets by capturing feature distributions with Conditional VAEs | Francesco Di Salvo et.al. | 2408.00639 | link |
2024-07-31 | Detecting, Explaining, and Mitigating Memorization in Diffusion Models | Yuxin Wen et.al. | 2407.21720 | link |
2024-07-31 | Tora: Trajectory-oriented Diffusion Transformer for Video Generation | Zhenghao Zhang et.al. | 2407.21705 | link |
2024-07-31 | Generative Diffusion Model for Seismic Imaging Improvement of Sparsely Acquired Data and Uncertainty Quantification | Xingchen Shi et.al. | 2407.21683 | null |
2024-07-31 | Quality Control for Radiology Report Generation Models via Auxiliary Auditing Components | Hermione Warr et.al. | 2407.21638 | null |
2024-07-31 | LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows | Lukas Teufelberger et.al. | 2407.21593 | null |
2024-07-31 | Long-term investment and energy procurement risk management under uncertainty for an electrolytic green hydrogen producer | Owen Palmer et.al. | 2407.21574 | null |
2024-07-31 | Conditioned Prompt-Optimization for Continual Deepfake Detection | Francesco Laiti et.al. | 2407.21554 | link |
2024-07-31 | CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment | Akira Kasuga et.al. | 2407.21553 | null |
2024-07-31 | Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation | Junxuan Yu et.al. | 2407.21490 | null |
2024-07-31 | Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends | Giuliano Martinelli et.al. | 2407.21489 | link |
2024-07-30 | Matting by Generation | Zhixiang Wang et.al. | 2407.21017 | null |
2024-07-30 | Add-SD: Rational Generation without Manual Reference | Lingfeng Yang et.al. | 2407.21016 | link |
2024-07-30 | Integrating Agent-Based and Compartmental Models for Infectious Disease Modeling: A Novel Hybrid Approach | Inan Bostanci et.al. | 2407.20993 | null |
2024-07-30 | MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions | Xiaowei Chi et.al. | 2407.20962 | link |
2024-07-30 | Mitigating calibration errors from mutual coupling with time-domain filtering of 21 cm cosmological radio observations | N. Charles et.al. | 2407.20923 | null |
2024-07-30 | Impact of Geographical Separation on Spectrum Sharing Markets | Kangle Mu et.al. | 2407.20909 | null |
2024-07-30 | Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering | Yanpeng Zhao et.al. | 2407.20908 | link |
2024-07-30 | Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks | Yunfeng Diao et.al. | 2407.20836 | null |
2024-07-30 | Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning | Norman Di Palo et.al. | 2407.20798 | null |
2024-07-30 | SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models | Zheng Liu et.al. | 2407.20756 | link |
2024-07-29 | Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing | Ekaterina Iakovleva et.al. | 2407.20232 | null |
2024-07-29 | LatentArtiFusion: An Effective and Efficient Histological Artifacts Restoration Framework | Zhenqi He et.al. | 2407.20172 | link |
2024-07-29 | Diffusion Feedback Helps CLIP See Better | Wenxuan Wang et.al. | 2407.20171 | link |
2024-07-29 | DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models | Jing Yang et.al. | 2407.20141 | null |
2024-07-29 | Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning | Liyuan Mao et.al. | 2407.20109 | null |
2024-07-29 | On the significance of parameters and the projective level in the Choice and Collection axioms | Vladimir Kanovei et.al. | 2407.20098 | null |
2024-07-29 | Generative Diffusion Model Bootstraps Zero-shot Classification of Fetal Ultrasound Images In Underrepresented African Populations | Fangyijie Wang et.al. | 2407.20072 | link |
2024-07-29 | ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning | Delyan Boychev et.al. | 2407.20020 | link |
2024-07-29 | Reproducibility Study of “ITI-GEN: Inclusive Text-to-Image Generation” | Daniel Gallo Fernández et.al. | 2407.19996 | link |
2024-07-29 | HeadsetOff: Enabling Photorealistic Video Conferencing on Economical VR Headsets | Yili Jin et.al. | 2407.19988 | null |
2024-07-26 | Generative Adversarial Networks for Imputing Sparse Learning Performance | Liang Zhang et.al. | 2407.18875 | null |
2024-07-26 | Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment | Yuze Zheng et.al. | 2407.18854 | null |
2024-07-26 | Scalable Group Choreography via Variational Phase Manifold Learning | Nhat Le et.al. | 2407.18839 | null |
2024-07-26 | Revision of calcium and scandium abundances in Am stars based on NLTE calculations and comparison with diffusion stellar evolution models | L. I. Mashonkina et.al. | 2407.18736 | null |
2024-07-26 | BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation | Peng Hao et.al. | 2407.18715 | null |
2024-07-26 | Q-gen: A Parameterized Quantum Circuit Generator | Yikai Mao et.al. | 2407.18697 | link |
2024-07-26 | Adversarial Robustification via Text-to-Image Diffusion Models | Daewon Choi et.al. | 2407.18658 | link |
2024-07-26 | Robust VAEs via Generating Process of Noise Augmented Data | Hiroo Irobe et.al. | 2407.18632 | null |
2024-07-26 | Denoising Lévy Probabilistic Models | Dario Shariatian et.al. | 2407.18609 | link |
2024-07-26 | How To Segment in 3D Using 2D Models: Automated 3D Segmentation of Prostate Cancer Metastatic Lesions on PET Volumes Using Multi-Angle Maximum Intensity Projections and Diffusion Models | Amirhosein Toosi et.al. | 2407.18555 | link |
2024-07-25 | RegionDrag: Fast Region-Based Image Editing with Diffusion Models | Jingyi Lu et.al. | 2407.18247 | null |
2024-07-25 | VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads | Orest Kupyn et.al. | 2407.18245 | link |
2024-07-25 | CodedVO: Coded Visual Odometry | Sachin Shah et.al. | 2407.18240 | null |
2024-07-25 | SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits | Yanyue Xie et.al. | 2407.18209 | null |
2024-07-25 | Test2VA: Reusing GUI Test Cases for Voice Assistant Features Development in Mobile Applications | Garrett Weaver et.al. | 2407.18155 | null |
2024-07-25 | Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images | Roberto Di Via et.al. | 2407.18125 | null |
2024-07-25 | Keypoint Promptable Re-Identification | Vladimir Somers et.al. | 2407.18112 | link |
2024-07-25 | SSTD: Stripe-Like Space Target Detection using Single-Point Supervision | Zijian Zhu et.al. | 2407.18097 | null |
2024-07-25 | Cross-Observatory Coordination with tilepy: A Novel Tool for Observations of Multi-Messenger Transient Events | Monica Seglar-Arroyo et.al. | 2407.18076 | null |
2024-07-25 | AttentionHand: Text-driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild | Junho Park et.al. | 2407.18034 | link |
2024-07-24 | SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency | Yiming Xie et.al. | 2407.17470 | null |
2024-07-24 | BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social | Ujun Jeong et.al. | 2407.17451 | link |
2024-07-24 | ProvenanceWidgets: A Library of UI Control Elements to Track and Dynamically Overlay Analytic Provenance | Arpit Narechania et.al. | 2407.17431 | link |
2024-07-24 | CDDIP: Constrained Diffusion-Driven Deep Image Prior for Seismic Image Reconstruction | Paul Goyes-Peñafiel et.al. | 2407.17402 | link |
2024-07-24 | Cosmic ray susceptibility of the Terahertz Intensity Mapper detector arrays | Lun-Jun Liu et.al. | 2407.17381 | null |
2024-07-24 | ViPer: Visual Personalization of Generative Models via Individual Preference Learning | Sogand Salehi et.al. | 2407.17365 | null |
2024-07-24 | Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching | Yuyang Ding et.al. | 2407.17349 | link |
2024-07-24 | Quantum nonlocal modulation cancellation with distributed clocks | Stephen D. Chapman et.al. | 2407.17330 | null |
2024-07-25 | Enhanced Deep Learning Methodologies and MRI Selection Techniques for Dementia Diagnosis in the Elderly Population | Nikolaos Ntampakis et.al. | 2407.17324 | null |
2024-07-24 | Edge-Cloud Continuum Orchestration of Critical Services: A Smart-City Approach | Rodrigo Rosmaninho et.al. | 2407.17314 | null |
2024-07-23 | Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions | Fabio Tosi et.al. | 2407.16698 | link |
2024-07-23 | From Imitation to Refinement – Residual RL for Precise Visual Assembly | Lars Ankile et.al. | 2407.16677 | null |
2024-07-23 | RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent | Huiyu Xu et.al. | 2407.16667 | null |
2024-07-23 | MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence | Canyu Zhao et.al. | 2407.16655 | null |
2024-07-23 | Unveiling and Mitigating Bias in Audio Visual Segmentation | Peiwen Sun et.al. | 2407.16638 | null |
2024-07-23 | Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses | Haojun Yu et.al. | 2407.16634 | null |
2024-07-23 | GenRec: A Flexible Data Generator for Recommendations | Erica Coppolillo et.al. | 2407.16594 | null |
2024-07-23 | COALA: A Practical and Vision-Centric Federated Learning Platform | Weiming Zhuang et.al. | 2407.16560 | link |
2024-07-23 | DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models | Zhenyu Xie et.al. | 2407.16511 | null |
2024-07-23 | qMRI Diffusor: Quantitative T1 Mapping of the Brain using a Denoising Diffusion Probabilistic Model | Shishuai Wang et.al. | 2407.16477 | null |
2024-07-22 | Artist: Aesthetically Controllable Text-Driven Stylization without Training | Ruixiang Jiang et.al. | 2407.15842 | link |
2024-07-23 | A Large-scale Benchmark Dataset for Commuting Origin-destination Matrix Generation | Can Rong et.al. | 2407.15823 | link |
2024-07-22 | Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget | Vikash Sehwag et.al. | 2407.15811 | null |
2024-07-22 | Quantum Computing for Phonon Scattering Effects on Thermal Conductivity | Xiangjun Tan et.al. | 2407.15808 | null |
2024-07-22 | Enhancing Mass Customization Manufacturing: Multiobjective Metaheuristic Algorithms for flow shop Production in Smart Industry | Diego Rossit et.al. | 2407.15802 | null |
2024-07-22 | Diffusion Model Based Resource Allocation Strategy in Ultra-Reliable Wireless Networked Control Systems | Amirhassan Babazadeh Darabi et.al. | 2407.15784 | null |
2024-07-22 | A Hamilton-Jacobi approach to road-field reaction-diffusion models | Christopher Henderson et.al. | 2407.15760 | null |
2024-07-22 | Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond | Silvio Galesso et.al. | 2407.15739 | link |
2024-07-22 | DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design | Zhi Hao Luo et.al. | 2407.15723 | link |
2024-07-22 | Estimating Probability Densities with Transformer and Denoising Diffusion | Henry W. Leung et.al. | 2407.15703 | link |
2024-07-19 | DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks | Sarah Jabbour et.al. | 2407.14509 | null |
2024-07-19 | On Pre-training of Multimodal Language Models Customized for Chart Understanding | Wan-Cyuan Fan et.al. | 2407.14506 | null |
2024-07-19 | T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation | Kaiyue Sun et.al. | 2407.14505 | link |
2024-07-19 | M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models | Seunggeun Chi et.al. | 2407.14502 | null |
2024-07-19 | A Precision Cryogenic Positioning Stage for Detector Dithering and Flexure Compensation | Stephen A. Smee et.al. | 2407.14493 | null |
2024-07-19 | Contrastive Learning with Counterfactual Explanations for Radiology Report Generation | Mingjie Li et.al. | 2407.14474 | null |
2024-07-19 | Describe Data to get Science-Data-Ready Tooling: Awkward as a Target for Kaitai Struct YAML | Manasvi Goyal et.al. | 2407.14461 | null |
2024-07-19 | Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model | Seonghui Min et.al. | 2407.14434 | null |
2024-07-19 | Controllable and Efficient Multi-Class Pathology Nuclei Data Augmentation using Text-Conditioned Diffusion Models | Hyun-Jic Oh et.al. | 2407.14426 | null |
2024-07-19 | GLAudio Listens to the Sound of the Graph | Aurelio Sulser et.al. | 2407.14387 | link |
2024-07-18 | LogoSticker: Inserting Logos into Diffusion Models for Customized Generation | Mingkang Zhu et.al. | 2407.13752 | null |
2024-07-18 | Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review | Masatoshi Uehara et.al. | 2407.13734 | link |
2024-07-18 | Shaded Route Planning Using Active Segmentation and Identification of Satellite Images | Longchao Da et.al. | 2407.13689 | null |
2024-07-18 | PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers | Songlin Li et.al. | 2407.13677 | link |
2024-07-18 | MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis | Ziming Zhong et.al. | 2407.13675 | link |
2024-07-18 | Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models | Xiaoyu Zhu et.al. | 2407.13642 | null |
2024-07-18 | Training-free Composite Scene Generation for Layout-to-Image Synthesis | Jiaqi Liu et.al. | 2407.13609 | link |
2024-07-18 | EnergyDiff: Universal Time-Series Energy Data Generation using Diffusion Models | Nan Lin et.al. | 2407.13538 | null |
2024-07-18 | VeriQR: A Robustness Verification Tool for Quantum Machine Learning Models | Yanling Lin et.al. | 2407.13533 | null |
2024-07-18 | All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models | Charumathi Badrinath et.al. | 2407.13449 | link |
2024-07-17 | SMooDi: Stylized Motion Diffusion Model | Lei Zhong et.al. | 2407.12783 | null |
2024-07-17 | VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control | Sherwin Bahmani et.al. | 2407.12781 | null |
2024-07-17 | Hallucination Index: An Image Quality Metric for Generative Reconstruction Models | Matthew Tivnan et.al. | 2407.12780 | null |
2024-07-17 | GroundUp: Rapid Sketch-Based 3D City Massing | Gizem Esra Unlu et.al. | 2407.12739 | null |
2024-07-17 | EchoSight: Advancing Visual-Language Models with Wiki Knowledge | Yibin Yan et.al. | 2407.12735 | null |
2024-07-17 | NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model | Zhongqun Zhang et.al. | 2407.12727 | null |
2024-07-17 | An Evaluation of Continual Learning for Advanced Node Semiconductor Defect Inspection | Amit Prasad et.al. | 2407.12724 | null |
2024-07-17 | Unlocking planetesimal magnetic field histories: a refined, versatile model for thermal evolution and dynamo generation | Hannah R. Sanderson et.al. | 2407.12721 | null |
2024-07-17 | SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow | Yuanzhi Zhu et.al. | 2407.12718 | link |
2024-07-17 | Teleoperation in Robot-assisted MIS with Adaptive RCM via Admittance Control | Ehsan Nasiri et.al. | 2407.12711 | null |
2024-07-16 | Efficient Training with Denoised Neural Weights | Yifan Gong et.al. | 2407.11966 | null |
2024-07-16 | UrbanWorld: An Urban World Model for 3D City Generation | Yu Shang et.al. | 2407.11965 | link |
2024-07-16 | Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design | Leo Klarner et.al. | 2407.11942 | link |
2024-07-16 | Code Documentation and Analysis to Secure Software Development | Paul Attie et.al. | 2407.11934 | null |
2024-07-16 | Global Optimisation of Black-Box Functions with Generative Models in the Wasserstein Space | Tigran Ramazyan et.al. | 2407.11917 | link |
2024-07-16 | Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data | Tim Elsner et.al. | 2407.11913 | null |
2024-07-16 | Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development | Daoyuan Chen et.al. | 2407.11784 | link |
2024-07-16 | Diffusion-driven self-assembly of emerin nanodomains at the nuclear envelope | Carlos D. Alas et.al. | 2407.11758 | null |
2024-07-16 | Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen | Alessandro Palma et.al. | 2407.11734 | link |
2024-07-16 | Theoretical Insights into CycleGAN: Analyzing Approximation and Estimation Errors in Unpaired Data Generation | Luwei Sun et.al. | 2407.11678 | null |
2024-07-15 | Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion | Yongyuan Liang et.al. | 2407.10973 | null |
2024-07-15 | Fast Matrix Multiplications for Lookup Table-Quantized LLMs | Han Guo et.al. | 2407.10960 | link |
2024-07-15 | InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models | Nirat Saini et.al. | 2407.10958 | null |
2024-07-16 | DataDream: Few-shot Guided Dataset Generation | Jae Myung Kim et.al. | 2407.10910 | link |
2024-07-15 | Optical Diffusion Models for Image Generation | Ilker Oguz et.al. | 2407.10897 | null |
2024-07-15 | R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection | Zheyuan Zhou et.al. | 2407.10862 | null |
2024-07-15 | Physics-Inspired Generative Models in Medical Imaging: A Review | Dennis Hein et.al. | 2407.10856 | null |
2024-07-15 | Inferring dark energy properties from the scale factor parametrisation | Upala Mukhopadhayay et.al. | 2407.10845 | null |
2024-07-15 | MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration | Yulin Ren et.al. | 2407.10833 | null |
2024-07-15 | Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation | Tu Vu et.al. | 2407.10817 | null |
2024-07-12 | StyleSplat: 3D Object Style Transfer with Gaussian Splatting | Sahil Jain et.al. | 2407.09473 | null |
2024-07-12 | FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3 | Georgios Makridis et.al. | 2407.09467 | null |
2024-07-12 | The $\mu\mathcal{G}$ Language for Programming Graph Neural Networks | Matteo Belenchia et.al. | 2407.09441 | null |
2024-07-12 | Graph Neural Network Causal Explanation via Neural Causal Models | Arman Behnam et.al. | 2407.09378 | link |
2024-07-12 | Computationally Efficient Estimation of Large Probit Models | Patrick Ding et.al. | 2407.09371 | null |
2024-07-12 | Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text | Lucio La Cava et.al. | 2407.09364 | null |
2024-07-15 | Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees | Alexia Jolicoeur-Martineau et.al. | 2407.09357 | link |
2024-07-12 | PID: Physics-Informed Diffusion Model for Infrared Image Generation | Fangyuan Mao et.al. | 2407.09299 | link |
2024-07-12 | Learning Distances from Data with Normalizing Flows and Score Matching | Peter Sorrenson et.al. | 2407.09297 | null |
2024-07-12 | Surgical Text-to-Image Generation | Chinedu Innocent Nwoye et.al. | 2407.09230 | null |
2024-07-11 | Video Diffusion Alignment via Reward Gradients | Mihir Prabhudesai et.al. | 2407.08737 | link |
2024-07-11 | Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models | Zhening Xing et.al. | 2407.08701 | null |
2024-07-11 | FAR-Trans: An Investment Dataset for Financial Asset Recommendation | Javier Sanz-Cruzado et.al. | 2407.08692 | null |
2024-07-11 | Scattering transforms on the sphere, application to large scale structure modelling | Louise Mousset et.al. | 2407.08687 | null |
2024-07-11 | CAD-Prompted Generative Models: A Pathway to Feasible and Novel Engineering Designs | Leah Chong et.al. | 2407.08675 | null |
2024-07-11 | Still-Moving: Customized Video Generation without Customized Video Data | Hila Chefer et.al. | 2407.08674 | null |
2024-07-11 | Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density | Shuangqi Li et.al. | 2407.08659 | null |
2024-07-11 | Adaptive Smooth Non-Stationary Bandits | Joe Suk et.al. | 2407.08654 | null |
2024-07-11 | Fine-Tuning Stable Diffusion XL for Stylistic Icon Generation: A Comparison of Caption Size | Youssef Sultan et.al. | 2407.08513 | null |
2024-07-11 | Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Model | Yuxing Tian et.al. | 2407.08500 | null |
2024-07-10 | Generative Image as Action Models | Mohit Shridhar et.al. | 2407.07875 | link |
2024-07-10 | Dynamical Measure Transport and Neural PDE Solvers for Sampling | Jingtong Sun et.al. | 2407.07873 | null |
2024-07-10 | Controlling Space and Time with Diffusion Models | Daniel Watson et.al. | 2407.07860 | null |
2024-07-10 | Generic Numerical Analysis of Stochastic Reaction Diffusion Model with applications in excitable media | Yahya Alnashri et.al. | 2407.07834 | null |
2024-07-10 | Universal and non-universal signatures in the scaling functions of critical variables | Gianluca Teza et.al. | 2407.07782 | null |
2024-07-10 | Towards Human-Like Driving: Active Inference in Autonomous Vehicle Control | Elahe Delavari et.al. | 2407.07684 | null |
2024-07-10 | VEnhancer: Generative Space-Time Enhancement for Video Generation | Jingwen He et.al. | 2407.07667 | null |
2024-07-10 | A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry | Martin Lindström et.al. | 2407.07664 | link |
2024-07-10 | The heterogeneous impact of the EU-Canada agreement with causal machine learning | Lionel Fontagné et.al. | 2407.07652 | null |
2024-07-11 | MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis | Wanggui He et.al. | 2407.07614 | link |
2024-07-09 | ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction | Shaozhe Hao et.al. | 2407.07077 | link |
2024-07-09 | Latent Space Imaging | Matheus Souza et.al. | 2407.07052 | null |
2024-07-09 | Generative models of astrophysical fields with scattering transforms on the sphere | Louise Mousset et.al. | 2407.07007 | link |
2024-07-10 | PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods | Yiying Wang et.al. | 2407.06985 | link |
2024-07-09 | Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach | Taolin Zhang et.al. | 2407.06964 | null |
2024-07-09 | RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models | Bowen Zhang et.al. | 2407.06938 | null |
2024-07-09 | HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance | Guian Fang et.al. | 2407.06937 | link |
2024-07-09 | Fine-grained large-scale content recommendations for MSX sellers | Manpreet Singh et.al. | 2407.06910 | null |
2024-07-09 | Enhanced Battery Degradation-Aware Scheduling for Distribution Network with Electric Vehicle Load | Vijay Babu Pamshetti et.al. | 2407.06857 | null |
2024-07-09 | A reaction-diffusion model for relapsing-remitting multiple sclerosis with a treatment term | Romina Travaglini et.al. | 2407.06802 | null |
2024-07-08 | Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images | Zhangyang Qi et.al. | 2407.06191 | null |
2024-07-08 | CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation | Xinying Guo et.al. | 2407.06188 | null |
2024-07-08 | JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation | Yu Zeng et.al. | 2407.06187 | null |
2024-07-08 | The Tug-of-War Between Deepfake Generation and Detection | Hannah Lee et.al. | 2407.06174 | null |
2024-07-08 | ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation | Ethan Chern et.al. | 2407.06135 | link |
2024-07-08 | Structured Generations: Using Hierarchical Clusters to guide Diffusion Models | Jorge da Silva Goncalves et.al. | 2407.06124 | link |
2024-07-08 | PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models | Jinhua Zhang et.al. | 2407.06109 | link |
2024-07-08 | Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation | Xinyu Bai et.al. | 2407.06095 | null |
2024-07-08 | Assessing Cardiomegaly in Dogs Using a Simple CNN Model | Nikhil Deekonda et.al. | 2407.06092 | null |
2024-07-08 | Layered Diffusion Model for One-Shot High Resolution Text-to-Image Synthesis | Emaad Khwaja et.al. | 2407.06079 | null |
2024-07-05 | RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation | Yuxuan Kuang et.al. | 2407.04689 | link |
2024-07-05 | Thermal and mechanical study of a parametrised cryostat model for optical characterisation of upcoming CMB experiments | Thomas J. L. J. Gascard et.al. | 2407.04613 | link |
2024-07-08 | PartCraft: Crafting Creative Objects by Parts | Kam Woh Ng et.al. | 2407.04604 | link |
2024-07-05 | Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates | Ryotaro Okabe et.al. | 2407.04557 | null |
2024-07-05 | Unified continuous-time q-learning for mean-field game and mean-field control problems | Xiaoli Wei et.al. | 2407.04521 | null |
2024-07-08 | Speed-accuracy trade-off for the diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport | Kotaro Ikeda et.al. | 2407.04495 | null |
2024-07-05 | PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation | Yinghua Yao et.al. | 2407.04493 | link |
2024-07-05 | Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model | Duy M. H. Nguyen et.al. | 2407.04489 | null |
2024-07-05 | Leveraging Graph Structures to Detect Hallucinations in Large Language Models | Noa Nonkes et.al. | 2407.04485 | link |
2024-07-05 | VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing | Shang Liu et.al. | 2407.04461 | null |
2024-07-03 | DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents | Yilun Xu et.al. | 2407.03300 | link |
2024-07-03 | Improved Noise Schedule for Diffusion Training | Tiankai Hang et.al. | 2407.03297 | null |
2024-07-03 | Anomaly-based Framework for Detecting Power Overloading Cyberattacks in Smart Grid AMI | Abdelaziz Amara Korba et.al. | 2407.03264 | null |
2024-07-03 | SOS! Soft Prompt Attack Against Open-Source Large Language Models | Ziqing Yang et.al. | 2407.03160 | null |
2024-07-04 | Spatio-Temporal Adaptive Diffusion Models for EEG Super-Resolution in Epilepsy Diagnosis | Tong Zhou et.al. | 2407.03089 | null |
2024-07-03 | Artificial Inductive Bias for Synthetic Tabular Data Generation in Data-Scarce Scenarios | Patricia A. Apellániz et.al. | 2407.03080 | link |
2024-07-03 | Electromagnetic Property Sensing Based on Diffusion Model in ISAC System | Yuhua Jiang et.al. | 2407.03075 | null |
2024-07-03 | Semantic-Aware Power Allocation for Generative Semantic Communications with Foundation Models | Chunmei Xu et.al. | 2407.03050 | null |
2024-07-03 | SlerpFace: Face Template Protection via Spherical Linear Interpolation | Zhizhou Zhong et.al. | 2407.03043 | null |
2024-07-03 | An Organism Starts with a Single Pix-Cell: A Neural Cellular Diffusion for High-Resolution Image Synthesis | Marawan Elbatel et.al. | 2407.03018 | link |
2024-07-02 | Magic Insert: Style-Aware Drag-and-Drop | Nataniel Ruiz et.al. | 2407.02489 | null |
2024-07-02 | Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models | Fei Shen et.al. | 2407.02482 | link |
2024-07-02 | A Pattern Language for Machine Learning Tasks | Benjamin Rodatz et.al. | 2407.02424 | null |
2024-07-02 | GCF: Graph Convolutional Networks for Facial Expression Recognition | Hozaifa Kassab et.al. | 2407.02361 | null |
2024-07-02 | MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space | Yihong Tang et.al. | 2407.02345 | null |
2024-07-02 | Choice-based time slot management in attended home delivery | Dorsa Abdolhamidi et.al. | 2407.02339 | null |
2024-07-02 | Mining Constraints from Reference Process Models for Detecting Best-Practice Violations in Event Log | Adrian Rebmann et.al. | 2407.02336 | link |
2024-07-02 | A tactical time slot management problem under mixed logit demand | Dorsa Abdolhamidi et.al. | 2407.02308 | null |
2024-07-02 | Renard: A Modular Pipeline for Extracting Character Networks from Narrative Texts | Arthur Amalvy et.al. | 2407.02284 | link |
2024-07-03 | Federated Distillation for Medical Image Classification: Towards Trustworthy Computer-Aided Diagnosis | Sufen Ren et.al. | 2407.02261 | null |
2024-06-28 | Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language | Yicheng Chen et.al. | 2406.20085 | null |
2024-06-28 | The hybrid Josephson rhombus: A superconducting element with tailored current-phase relation | L. Banszerus et.al. | 2406.20082 | null |
2024-06-28 | HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model | Hieu T. Nguyen et.al. | 2406.20077 | null |
2024-06-28 | Modeling and LQR Control of Insect Sized Flapping Wing Robot | Daksh Dhingra et.al. | 2406.20061 | null |
2024-06-28 | Neural Differentiable Modeling with Diffusion-Based Super-resolution for Two-Dimensional Spatiotemporal Turbulence | Xiantao Fan et.al. | 2406.20047 | null |
2024-06-28 | Electrostatics-based particle sampling and approximate inference | Yongchao Huang et.al. | 2406.20044 | link |
2024-06-28 | HAITCH: A Framework for Distortion and Motion Correction in Fetal Multi-Shell Diffusion-Weighted MRI | Haykel Snoussi et.al. | 2406.20042 | null |
2024-06-28 | Concept Lens: Visually Analyzing the Consistency of Semantic Manipulation in GANs | Sangwon Jeong et.al. | 2406.19987 | null |
2024-07-01 | Text2Robot: Evolutionary Robot Design from Text Descriptions | Ryan P. Ringel et.al. | 2406.19963 | link |
2024-06-28 | Kolmogorov-Smirnov GAN | Maciej Falkiewicz et.al. | 2406.19948 | link |
2024-06-27 | Looking 3D: Anomaly Detection with 2D-3D Alignment | Ankan Bhunia et.al. | 2406.19393 | link |
2024-06-27 | Taming Data and Transformers for Audio Generation | Moayed Haji-Ali et.al. | 2406.19388 | null |
2024-06-27 | Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space | Core Francisco Park et.al. | 2406.19370 | link |
2024-06-27 | Accelerating Multiphase Flow Simulations with Denoising Diffusion Model Driven Initializations | Jaehong Chung et.al. | 2406.19333 | null |
2024-06-27 | Subtractive Training for Music Stem Insertion using Latent Diffusion Models | Ivan Villa-Renteria et.al. | 2406.19328 | null |
2024-06-27 | Efficient World Models with Context-Aware Tokenization | Vincent Micheli et.al. | 2406.19320 | link |
2024-06-27 | PNeRV: A Polynomial Neural Representation for Videos | Sonam Gupta et.al. | 2406.19299 | null |
2024-06-27 | Compositional Image Decomposition with Diffusion Models | Jocelin Su et.al. | 2406.19298 | null |
2024-06-27 | BISeizuRe: BERT-Inspired Seizure Data Representation to Improve Epilepsy Monitoring | Luca Benfenati et.al. | 2406.19189 | null |
2024-06-27 | On Pólya-Young urn models and growth processes | Markus Kuba et.al. | 2406.19110 | null |
2024-06-26 | MatchTime: Towards Automatic Soccer Game Commentary Generation | Jiayuan Rao et.al. | 2406.18530 | link |
2024-06-26 | MultiDiff: Consistent Novel View Synthesis from a Single Image | Norman Müller et.al. | 2406.18524 | null |
2024-06-26 | Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration | Kang Liao et.al. | 2406.18516 | link |
2024-06-26 | DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance | Younghyun Kim et.al. | 2406.18459 | link |
2024-06-26 | Cascading Large Language Models for Salient Event Graph Generation | Xingwei Tan et.al. | 2406.18449 | link |
2024-06-26 | Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling | Abril Corona-Figueroa et.al. | 2406.18422 | link |
2024-06-26 | Towards diffusion models for large-scale sea-ice modelling | Tobias Sebastian Finn et.al. | 2406.18417 | null |
2024-06-27 | Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process | Tianyu Lin et.al. | 2406.18361 | link |
2024-06-26 | Molecular Diffusion Models with Virtual Receptors | Matan Halfon et.al. | 2406.18330 | null |
2024-06-27 | Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems | Italo Luis da Silva et.al. | 2406.18245 | link |
2024-06-25 | DiffusionPDE: Generative PDE-Solving Under Partial Observation | Jiahe Huang et.al. | 2406.17763 | link |
2024-06-25 | MotionBooth: Motion-Aware Customized Text-to-Video Generation | Jianzong Wu et.al. | 2406.17758 | null |
2024-06-25 | Accelerating Clinical Evidence Synthesis with Large Language Models | Zifeng Wang et.al. | 2406.17755 | null |
2024-06-25 | Extensions of Panjer’s recursion for mixed compound distributions | Spyridon M. Tzaninis et.al. | 2406.17726 | null |
2024-06-25 | PANDA: A self-driving lab for studying electrodeposited polymer films | Harley Quinn et.al. | 2406.17725 | null |
2024-06-25 | Unified Auto-Encoding with Masked Diffusion | Philippe Hansen-Estruch et.al. | 2406.17688 | link |
2024-06-25 | LaTable: Towards Large Tabular Models | Boris van Breugel et.al. | 2406.17673 | null |
2024-06-26 | SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond | Marco Comunità et.al. | 2406.17672 | null |
2024-06-25 | Banishing LLM Hallucinations Requires Rethinking Generalization | Johnny Li et.al. | 2406.17642 | null |
2024-06-25 | The experience of humans’ and robots’ mutual (im)politeness in enacted service scenarios: An empirical study | Victor Kaptelinin et.al. | 2406.17641 | null |
2024-06-24 | FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | Haonan Qiu et.al. | 2406.16863 | link |
2024-06-24 | Dreamitate: Real-World Visuomotor Policy Learning via Video Generation | Junbang Liang et.al. | 2406.16862 | null |
2024-06-24 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation | Yuang Peng et.al. | 2406.16855 | link |
2024-06-24 | USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations | Mounika Marreddy et.al. | 2406.16833 | null |
2024-06-24 | General Binding Affinity Guidance for Diffusion Models in Structure-Based Drug Design | Yue Jian et.al. | 2406.16821 | null |
2024-06-24 | ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians | Yufei Liu et.al. | 2406.16815 | null |
2024-06-24 | Conformal time series decomposition with component-wise exchangeability | Derck W. E. Prinzhorn et.al. | 2406.16766 | link |
2024-06-24 | Inferring stochastic low-rank recurrent neural networks from neural data | Matthijs Pals et.al. | 2406.16749 | link |
2024-06-24 | Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image | Jinkun Hao et.al. | 2406.16710 | null |
2024-06-24 | Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling | Min-Seop Kwak et.al. | 2406.16695 | null |
2024-06-21 | Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild | Nadav Orzech et.al. | 2406.15331 | null |
2024-06-21 | Rethinking Remote Sensing Change Detection With A Mask View | Xiaowen Ma et.al. | 2406.15320 | link |
2024-06-21 | You Only Acquire Sparse-channel (YOAS): A Unified Framework for Dense-channel EEG Generation | Hongyu Chen et.al. | 2406.15269 | null |
2024-06-21 | Evaluating Diversity in Automatic Poetry Generation | Yanran Chen et.al. | 2406.15267 | link |
2024-06-21 | Fingerprint Membership and Identity Inference Against Generative Adversarial Networks | Saverio Cavasin et.al. | 2406.15253 | null |
2024-06-21 | MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation | Xuan He et.al. | 2406.15252 | null |
2024-06-21 | Unsupervised Bayesian Generation of Synthetic CT from CBCT Using Patient-Specific Score-Based Prior | Junbo Peng et.al. | 2406.15219 | null |
2024-06-21 | Sound and Fury, Signifying Nothing? Impact of Data Breach Disclosure Laws | Muhammad Zia Hydari et.al. | 2406.15215 | null |
2024-06-21 | Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors | Ali Naseh et.al. | 2406.15213 | link |
2024-06-21 | Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms | Santiago Berrezueta-Guzman et.al. | 2406.15198 | null |
2024-06-20 | A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models | Xincheng Shuai et.al. | 2406.14555 | link |
2024-06-21 | Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation | Eyal Michaeli et.al. | 2406.14551 | link |
2024-06-20 | Consistency Models Made Easy | Zhengyang Geng et.al. | 2406.14548 | link |
2024-06-20 | IRASim: Learning Interactive Real-Robot Action Simulators | Fangqi Zhu et.al. | 2406.14540 | null |
2024-06-20 | Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps | Nikita Starodubcev et.al. | 2406.14539 | null |
2024-06-20 | Fantastic Copyrighted Beasts and How (Not) to Generate Them | Luxi He et.al. | 2406.14526 | null |
2024-06-20 | Photoacoustic methane detection assisted by a gas-filled anti-resonant hollow-core fiber laser | Cuiling Zhang et.al. | 2406.14521 | null |
2024-06-20 | V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data | Rotem Shalev-Arkushin et.al. | 2406.14510 | null |
2024-06-20 | CodeRAG-Bench: Can Retrieval Augment Code Generation? | Zora Zhiruo Wang et.al. | 2406.14497 | link |
2024-06-20 | SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset | Josef Dai et.al. | 2406.14477 | link |
2024-06-20 | CollaFuse: Collaborative Diffusion Models | Simeon Allmendinger et.al. | 2406.14429 | link |
2024-06-20 | Active Diffusion Subsampling | Oisin Nolan et.al. | 2406.14388 | link |
2024-06-20 | Multicoloured Hardcore Model: Fast Mixing and Queueing | Sam Olesker-Taylor et.al. | 2406.14376 | null |
2024-06-20 | FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability | Md Fahim Sikder et.al. | 2406.14281 | link |
2024-06-20 | In Tree Structure Should Sentence Be Generated | Yaguang Li et.al. | 2406.14189 | link |
2024-06-20 | CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation | Tingwei Liu et.al. | 2406.14186 | link |
2024-06-20 | Tractable Equilibrium Computation in Markov Games through Risk Aversion | Eric Mazumdar et.al. | 2406.14156 | null |
2024-06-20 | ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning | Zhongjie Duan et.al. | 2406.14130 | link |
2024-06-20 | Dye4AI: Assuring Data Boundary on Generative AI Services | Shu Wang et.al. | 2406.14114 | null |
2024-06-20 | HeartBeat: Towards Controllable Echocardiography Video Synthesis with Multimodal Conditions-Guided Diffusion Models | Xinrui Zhou et.al. | 2406.14098 | null |
2024-06-20 | Bridging bulk and surface: An interacting particle system towards the field-road diffusion model | Matthieu Alfaro et.al. | 2406.14093 | null |
2024-06-20 | A Practical Diffusion Path for Sampling | Omar Chehab et.al. | 2406.14040 | null |
2024-06-20 | Leveraging eBPF and AI for Ransomware Nose Out | Arjun Sekar et.al. | 2406.14020 | null |
2024-06-20 | Feature Fusion Based on Mutual-Cross-Attention Mechanism for EEG Emotion Recognition | Yimin Zhao et.al. | 2406.14014 | link |
2024-06-20 | Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs | Mahammed Kamruzzaman et.al. | 2406.13993 | null |
2024-06-20 | The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging | Georgi Ganev et.al. | 2406.13985 | link |
2024-06-20 | Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning | Tingyi Lin et.al. | 2406.13977 | null |
2024-06-20 | Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models | Yuan Zhong et.al. | 2406.13942 | null |
2024-06-20 | EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations | Jie Ren et.al. | 2406.13933 | null |
2024-06-20 | Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions | Hamdireza Rouzegar et.al. | 2406.13903 | null |
2024-06-19 | INFusion: Diffusion Regularized Implicit Neural Representations for 2D and 3D accelerated MRI reconstruction | Yamin Arefeen et.al. | 2406.13895 | null |
2024-06-19 | Open Generative Large Language Models for Galician | Pablo Gamallo et.al. | 2406.13893 | null |
2024-06-19 | StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation | Davit Abrahamyan et.al. | 2406.13840 | link |
2024-06-19 | RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design | Rishabh Anand et.al. | 2406.13839 | link |
2024-06-19 | COAC: Cross-layer Optimization of Accelerator Configurability for Efficient CNN Processing | Steven Colleman et.al. | 2406.13752 | null |
2024-06-19 | GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation | Baiqi Li et.al. | 2406.13743 | link |
2024-06-19 | Tree-Sliced Wasserstein Distance on a System of Lines | Viet-Hoang Tran et.al. | 2406.13725 | null |
2024-06-19 | Hitchhiker’s guide on Energy-Based Models: a comprehensive review on the relation with other generative models, sampling and statistical physics | Davide Carbone et.al. | 2406.13661 | null |
2024-06-19 | Towards Minimal Targeted Updates of Language Models with Targeted Negative Training | Lily H. Zhang et.al. | 2406.13660 | link |
2024-06-19 | Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics | Weitong Zhang et.al. | 2406.13652 | null |
2024-06-19 | On AI-Inspired UI-Design | Jialiang Wei et.al. | 2406.13631 | null |
2024-06-19 | Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy | Elena Tomasi et.al. | 2406.13627 | link |
2024-06-19 | Enhance the Image: Super Resolution using Artificial Intelligence in MRI | Ziyu Li et.al. | 2406.13625 | null |
2024-06-19 | Generative Modeling by Minimizing the Wasserstein-2 Loss | Yu-Jui Huang et.al. | 2406.13619 | null |
2024-06-19 | Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks | Liangxin Qian et.al. | 2406.13602 | null |
2024-06-19 | ModSec-Learn: Boosting ModSecurity with Machine Learning | Christian Scano et.al. | 2406.13547 | link |
2024-06-19 | Towards Cyber Threat Intelligence for the IoT | Alfonso Iacovazzi et.al. | 2406.13543 | null |
2024-06-19 | Image Distillation for Safe Data Sharing in Histopathology | Zhe Li et.al. | 2406.13536 | link |
2024-06-19 | Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement | Chenda Li et.al. | 2406.13471 | null |
2024-06-19 | Unifying nonlinearly constrained nonconvex optimization | Charlie Vanaret et.al. | 2406.13454 | link |
2024-06-19 | Federating to Grow Transformers with Constrained Resources without Model Sharing | Shikun Shen et.al. | 2406.13450 | null |
2024-06-19 | Multi-messenger modeling of the Monogem pulsar halo | Youyou Li et.al. | 2406.13426 | null |
2024-06-19 | Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images | Haruo Fujiwara et.al. | 2406.13393 | null |
2024-06-19 | Effective Edge-wise Representation Learning in Edge-Attributed Bipartite Graphs | Hewen Wang et.al. | 2406.13369 | null |
2024-06-19 | Situational Instructions Database: Task Guidance in Dynamic Environments | Muhammad Saif Ullah Khan et.al. | 2406.13302 | link |
2024-06-19 | ARDuP: Active Region Video Diffusion for Universal Policies | Shuaiyi Huang et.al. | 2406.13301 | null |
2024-06-19 | AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models | Ken Chen et.al. | 2406.13272 | null |
2024-06-19 | Self-Supervised Diffusion Model for 3-D Seismic Data Reconstruction | Xinyang Wang et.al. | 2406.13252 | null |
2024-06-19 | Optimizing Inventory Management through Multiobjective Reverse Logistics with Environmental Impact | I. B. Wadhawan et.al. | 2406.13226 | null |
2024-06-19 | Neural Residual Diffusion Models for Deep Scalable Vision Generation | Zhiyuan Ma et.al. | 2406.13215 | null |
2024-06-19 | Surgical Triplet Recognition via Diffusion Model | Daochang Liu et.al. | 2406.13210 | null |
2024-06-19 | Diffusion Model-based FOD Restoration from High Distortion in dMRI | Shuo Huang et.al. | 2406.13209 | null |
2024-06-19 | Toward Structure Fairness in Dynamic Graph Embedding: A Trend-aware Dual Debiasing Approach | Yicong Li et.al. | 2406.13201 | link |
2024-06-19 | Synthetic Context Generation for Question Generation | Naiming Liu et.al. | 2406.13188 | null |
2024-06-19 | Conditional score-based diffusion models for solving inverse problems in mechanics | Agnimitra Dasgupta et.al. | 2406.13154 | null |
2024-06-19 | von Mises Quasi-Processes for Bayesian Circular Regression | Yarden Cohen et.al. | 2406.13151 | null |
2024-06-19 | MCAD: Multi-modal Conditioned Adversarial Diffusion Model for High-Quality PET Image Reconstruction | Jiaqi Cui et.al. | 2406.13150 | null |
2024-06-19 | GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement | Hao Wang et.al. | 2406.13136 | null |
2024-06-19 | Thruster-Assisted Incline Walking | Kaushik Venkatesh Krishnamurthy et.al. | 2406.13118 | null |
2024-06-18 | Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models | Paul Henderson et.al. | 2406.13099 | null |
2024-06-18 | RITA: A Real-time Interactive Talking Avatars Framework | Wuxinlin Cheng et.al. | 2406.13093 | null |
2024-06-18 | PIPPIN: Generating variable length full events from partons | Guillaume Quétant et.al. | 2406.13074 | link |
2024-06-18 | MaskPure: Improving Defense Against Text Adversaries with Stochastic Purification | Harrison Gietz et.al. | 2406.13066 | link |
2024-06-18 | Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach | Zilin Bian et.al. | 2406.13038 | null |
2024-06-18 | Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities | Matthew T. C. Li et.al. | 2406.13036 | null |
2024-06-18 | Data Plagiarism Index: Characterizing the Privacy Risk of Data-Copying in Tabular Generative Models | Joshua Ward et.al. | 2406.13012 | null |
2024-06-18 | Synergizing Foundation Models and Federated Learning: A Survey | Shenghui Li et.al. | 2406.12844 | null |
2024-06-18 | Evaluating the design space of diffusion-based generative models | Yuqing Wang et.al. | 2406.12839 | null |
2024-06-18 | Neural Approximate Mirror Maps for Constrained Diffusion Models | Berthy T. Feng et.al. | 2406.12816 | null |
2024-06-19 | AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation | Xinyu Hou et.al. | 2406.12805 | link |
2024-06-18 | Extracting Training Data from Unconditional Diffusion Models | Yunhao Chen et.al. | 2406.12752 | null |
2024-06-18 | Useful stochastic bounds in time-varying queues with service and patience times having general joint distribution | Shreehari Anand Bodas et.al. | 2406.12745 | null |
2024-06-18 | SUPER: Selfie Undistortion and Head Pose Editing with Identity Preservation | Polina Karpikova et.al. | 2406.12700 | null |
2024-06-18 | Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation | Miseul Kim et.al. | 2406.12688 | null |
2024-06-18 | GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models | Yongtao Ge et.al. | 2406.12671 | link |
2024-06-18 | Research and Implementation of Data Enhancement Techniques for Graph Neural Networks | Jingzhao Gu et.al. | 2406.12640 | null |
2024-06-18 | News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation | Andreea Iana et.al. | 2406.12634 | link |
2024-06-18 | Learning Diffusion at Lightspeed | Antonio Terpin et.al. | 2406.12616 | null |
2024-06-18 | Unmasking the Veil: An Investigation into Concept Ablation for Privacy and Copyright Protection in Images | Shivank Garg et.al. | 2406.12592 | link |
2024-06-18 | Behavior-Dependent Linear Recurrent Units for Efficient Sequential Recommendation | Chengkai Liu et.al. | 2406.12580 | link |
2024-06-18 | Training Diffusion Models with Federated Learning | Matthijs de Goede et.al. | 2406.12575 | null |
2024-06-18 | P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts | Yuhao Dan et.al. | 2406.12548 | null |
2024-06-18 | Structured Detection for Simultaneous Super-Resolution and Optical Sectioning in Laser Scanning Microscopy | Alessandro Zunino et.al. | 2406.12542 | link |
2024-06-18 | Variational Distillation of Diffusion Policies into Mixture of Experts | Hongyi Zhou et.al. | 2406.12538 | null |
2024-06-18 | HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors | Panwang Pan et.al. | 2406.12459 | link |
2024-06-18 | Planning Using Schrödinger Bridge Diffusion Models | Adarsh Srivastava et.al. | 2406.12458 | link |
2024-06-18 | Deep Temporal Deaggregation: Large-Scale Spatio-Temporal Generative Models | David Bergström et.al. | 2406.12423 | null |
2024-06-18 | ROVER: RTL Optimization via Verified E-Graph Rewriting | Samuel Coward et.al. | 2406.12421 | null |
2024-06-18 | TADM: Temporally-Aware Diffusion Model for Neurodegenerative Progression on Brain MRI | Mattia Litrico et.al. | 2406.12411 | null |
2024-06-18 | SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions | Yuexiong Ding et.al. | 2406.12395 | null |
Vision-Language Models
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2024-12-19 | OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving | Shuo Xing et.al. | 2412.15208 | null |
2024-12-19 | LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation | Weijia Shi et.al. | 2412.15188 | null |
2024-12-19 | Qwen2.5 Technical Report | Qwen et.al. | 2412.15115 | null |
2024-12-19 | Progressive Multimodal Reasoning via Active Retrieval | Guanting Dong et.al. | 2412.14835 | null |
2024-12-19 | Explainable Tampered Text Detection via Multimodal Large Models | Chenfan Qu et.al. | 2412.14816 | null |
2024-12-18 | Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception | Yanpeng Sun et.al. | 2412.14233 | link |
2024-12-18 | AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities | Guillaume Astruc et.al. | 2412.14123 | link |
2024-12-19 | G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o | Tony Cheng Tong et.al. | 2412.13647 | link |
2024-12-18 | Detecting Machine-Generated Music with Explainability – A Challenge and Early Benchmarks | Yupei Li et.al. | 2412.13421 | null |
2024-12-17 | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | Nikitha SR et.al. | 2412.12902 | null |
2024-12-17 | Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models | YiFan Zhang et.al. | 2412.12606 | null |
2024-12-17 | PBVS 2024 Solution: Self-Supervised Learning and Sampling Strategies for SAR Classification in Extreme Long-Tail Distribution | Yuhyun Kim et.al. | 2412.12565 | null |
2024-12-17 | Causal Diffusion Transformers for Generative Modeling | Chaorui Deng et.al. | 2412.12095 | link |
2024-12-16 | CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology | Yuxuan Sun et.al. | 2412.12077 | null |
2024-12-16 | Gramian Multimodal Representation Learning and Alignment | Giordano Cicchetti et.al. | 2412.11959 | null |
2024-12-16 | LMM-Regularized CLIP Embeddings for Image Classification | Maria Tzelepi et.al. | 2412.11663 | null |
2024-12-15 | Seeing the Forest and the Trees: Solving Visual Graph and Tree Based Data Structure Problems using Large Multimodal Models | Sebastian Gutierrez et.al. | 2412.11088 | null |
2024-12-13 | Apollo: An Exploration of Video Understanding in Large Multimodal Models | Orr Zohar et.al. | 2412.10360 | null |
2024-12-13 | Performance of ChatGPT on tasks involving physics visual representations: the case of the Brief Electricity and Magnetism Assessment | Giulia Polverini et.al. | 2412.10019 | null |
2024-12-12 | Vision-Language Models Represent Darker-Skinned Black Individuals as More Homogeneous than Lighter-Skinned Black Individuals | Messi H. J. Lee et.al. | 2412.09668 | null |
2024-12-12 | Exemplar Masking for Multimodal Incremental Learning | Yi-Lun Lee et.al. | 2412.09549 | link |
2024-12-12 | Embeddings are all you need! Achieving High Performance Medical Image Classification through Training-Free Embedding Analysis | Raj Hansini Khoiwal et.al. | 2412.09445 | null |
2024-12-12 | Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning | Meng Shen et.al. | 2412.09126 | null |
2024-12-12 | A Wander Through the Multimodal Landscape: Efficient Transfer Learning via Low-rank Sequence Multimodal Adapter | Zirun Guo et.al. | 2412.08979 | null |
2024-12-11 | StreamChat: Chatting with Streaming Video | Jihao Liu et.al. | 2412.08646 | null |
2024-12-11 | Multimodal Latent Language Modeling with Next-Token Diffusion | Yutao Sun et.al. | 2412.08635 | link |
2024-12-12 | Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis | Feng Zhou et.al. | 2412.08603 | null |
2024-12-11 | Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions | Mohammadmostafa Rostamkhani et.al. | 2412.08169 | link |
2024-12-10 | Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning | Can Yaras et.al. | 2412.07909 | null |
2024-12-10 | BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities | Sahal Shaji Mullappilly et.al. | 2412.07769 | link |
2024-12-10 | ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer | Jinyi Hu et.al. | 2412.07720 | link |
2024-12-13 | DriveMM: All-in-One Large Multimodal Model for Autonomous Driving | Zhijian Huang et.al. | 2412.07689 | link |
2024-12-10 | Driving with InternVL: Outstanding Champion in the Track on Driving with Language of the Autonomous Grand Challenge at CVPR 2024 | Jiahan Li et.al. | 2412.07247 | null |
2024-12-10 | Maya: An Instruction Finetuned Multilingual Multimodal Model | Nahid Alam et.al. | 2412.07112 | link |
2024-12-09 | How to Merge Your Multimodal Models Over Time? | Sebastian Dziadzio et.al. | 2412.06712 | link |
2024-12-09 | Ranked from Within: Ranking Large Multimodal Models for Visual Question Answering Without Labels | Weijie Tu et.al. | 2412.06461 | null |
2024-12-09 | iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models | Lianyu Hu et.al. | 2412.06263 | link |
2024-12-08 | A Self-Learning Multimodal Approach for Fake News Detection | Hao Chen et.al. | 2412.05843 | null |
2024-12-08 | SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation | Leigang Qu et.al. | 2412.05818 | null |
2024-12-07 | WavFusion: Towards wav2vec 2.0 Multimodal Speech Emotion Recognition | Feng Li et.al. | 2412.05558 | null |
2024-12-07 | Comprehensive Evaluation of Multimodal AI Models in Medical Imaging Diagnosis: From Data Augmentation to Preference-Based Comparison | Cailian Ruan et.al. | 2412.05536 | null |
2024-12-06 | Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | Zhe Chen et.al. | 2412.05271 | link |
2024-12-05 | Lattice Lingo: Effect of Textual Detail on Multimodal Learning for Property Prediction of Crystals | Mrigi Munjal et.al. | 2412.04670 | null |
2024-12-05 | BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks | Juan Rodriguez et.al. | 2412.04626 | null |
2024-12-05 | MageBench: Bridging Large Multimodal Models to Agents | Miaosen Zhang et.al. | 2412.04531 | link |
2024-12-04 | Video Quality Assessment: A Comprehensive Survey | Qi Zheng et.al. | 2412.04508 | link |
2024-12-05 | SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model | Zhenglin Huang et.al. | 2412.04292 | null |
2024-12-05 | CALMM-Drive: Confidence-Aware Autonomous Driving with Large Multimodal Model | Ruoyu Yao et.al. | 2412.04209 | null |
2024-12-05 | AIpparel: A Large Multimodal Generative Model for Digital Garments | Kiyohiro Nakayama et.al. | 2412.03937 | null |
2024-12-05 | MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models | Ming-Chang Chiu et.al. | 2412.03927 | link |
2024-12-04 | Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning | Wujian Peng et.al. | 2412.03565 | link |
2024-12-04 | Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning | Neale Ratzlaff et.al. | 2412.03467 | null |
2024-12-06 | SJTU: Spatial judgments in multimodal models towards unified segmentation through coordinate detection | Joongwon Chae et.al. | 2412.02565 | link |
2024-12-03 | Initial Study On Improving Segmentation By Combining Preoperative CT And Intraoperative CBCT Using Synthetic Data | Maximilian E. Tschuchnig et.al. | 2412.02294 | null |
2024-12-05 | CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy | Zhibo Yang et.al. | 2412.02210 | null |
2024-12-03 | VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding | Kangsan Kim et.al. | 2412.02186 | link |
2024-12-04 | Agri-LLaVA: Knowledge-Infused Large Multimodal Assistant on Agricultural Pests and Diseases | Liqiong Wang et.al. | 2412.02158 | link |
2024-12-02 | Attacks on multimodal models | Viacheslav Iablochnikov et.al. | 2412.01725 | link |
2024-12-02 | LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant | Yikun Liu et.al. | 2412.01720 | null |
2024-12-01 | VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation | Weiming Ren et.al. | 2412.00927 | null |
2024-11-30 | MaintAGT: Sim2Real-Guided Multimodal Large Model for Intelligent Maintenance with Chain-of-Thought Reasoning | Hongliang He et.al. | 2412.00481 | null |
2024-11-30 | Approximate Fiber Product: A Preliminary Algebraic-Geometric Perspective on Multimodal Embedding Alignment | Dongfang Zhao et.al. | 2412.00373 | null |
2024-12-04 | ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model | Kunyang Han et.al. | 2412.00153 | null |
2024-11-28 | Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers | Chancharik Mitra et.al. | 2412.00142 | null |
2024-12-02 | LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states | Luis Ibanez-Lissen et.al. | 2411.19876 | null |
2024-11-29 | SDR-GNN: Spectral Domain Reconstruction Graph Neural Network for Incomplete Multimodal Learning in Conversational Emotion Recognition | Fangze Fu et.al. | 2411.19822 | null |
2024-11-29 | JetFormer: An Autoregressive Generative Model of Raw Images and Text | Michael Tschannen et.al. | 2411.19722 | null |
2024-11-28 | Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs | Anirudh Phukan et.al. | 2411.19187 | null |
2024-11-28 | Examining Multimodal Gender and Content Bias in ChatGPT-4o | Roberto Balestri et.al. | 2411.19140 | null |
2024-11-28 | ScratchEval: Are GPT-4o Smarter than My Child? Evaluating Large Multimodal Models with Visual Programming Challenges | Rao Fu et.al. | 2411.18932 | link |
2024-11-27 | Active Data Curation Effectively Distills Large-Scale Multimodal Models | Vishaal Udandarao et.al. | 2411.18674 | null |
2024-11-27 | AMPS: ASR with Multimodal Paraphrase Supervision | Amruta Parulekar et.al. | 2411.18368 | null |
2024-12-03 | Large Language Model-Brained GUI Agents: A Survey | Chaoyun Zhang et.al. | 2411.18279 | link |
2024-11-27 | Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents | Joongwon Chae et.al. | 2411.18270 | link |
2024-11-27 | Multimodal Integration of Longitudinal Noninvasive Diagnostics for Survival Prediction in Immunotherapy Using Deep Learning | Melda Yeghaian et.al. | 2411.18253 | null |
2024-11-26 | NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects? | Jiaxuan Li et.al. | 2411.17794 | null |
2024-11-26 | Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis | Akshita Gupta et.al. | 2411.17690 | null |
2024-11-26 | AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM | Jiarui Wang et.al. | 2411.17221 | link |
2024-11-26 | Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation | Xu Zheng et.al. | 2411.17141 | link |
2024-11-26 | Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models | Colin Conwell et.al. | 2411.17066 | link |
2024-11-26 | Multimodal Alignment and Fusion: A Survey | Songtao Li et.al. | 2411.17040 | null |
2024-11-27 | SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE | Yongwei Chen et.al. | 2411.16856 | null |
2024-11-23 | Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents | Jun Chen et.al. | 2411.16740 | link |
2024-11-26 | All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages | Ashmal Vayani et.al. | 2411.16508 | link |
2024-11-25 | Boosting 3D Object Generation through PBR Materials | Yitong Wang et.al. | 2411.16080 | null |
2024-11-24 | M3-CVC: Controllable Video Compression with Multimodal Generative Models | Rui Wan et.al. | 2411.15798 | null |
2024-11-23 | Knowledge Transfer Across Modalities with Natural Language Supervision | Carlo Alberto Barbano et.al. | 2411.15611 | null |
2024-11-23 | From Complexity to Parsimony: Integrating Latent Class Analysis to Uncover Multimodal Learning Patterns in Collaborative Learning | Lixiang Yan et.al. | 2411.15590 | null |
2024-11-23 | Botfip-LLM: An Enhanced Multimodal Scientific Computing Framework Leveraging Knowledge Distillation from Large Language Models | Tianhao Chen et.al. | 2411.15525 | null |
2024-11-23 | MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking | Xinqi Liu et.al. | 2411.15459 | null |
2024-11-23 | freePruner: A Training-free Approach for Large Multimodal Model Acceleration | Bingxin Xu et.al. | 2411.15446 | null |
2024-11-22 | PRIMUS: Pretraining IMU Encoders with Multimodal Self-Supervision | Arnav M. Das et.al. | 2411.15127 | null |
2024-11-22 | Large Multi-modal Models Can Interpret Features in Large Multi-modal Models | Kaichen Zhang et.al. | 2411.14982 | link |
2024-11-25 | Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation | Aniket Bhattacharyya et.al. | 2411.14957 | null |
2024-11-22 | Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains | Yurii Paniv et.al. | 2411.14647 | null |
2024-11-21 | Generative AI for Music and Audio | Hao-Wen Dong et.al. | 2411.14627 | null |
2024-11-21 | FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers | Zehua Pei et.al. | 2411.14507 | null |
2024-11-21 | MMGenBench: Evaluating the Limits of LMMs from the Text-to-Image Generation Perspective | Hailang Huang et.al. | 2411.14062 | link |
2024-11-21 | Multimodal 3D Reasoning Segmentation with Complex Scenes | Xueying Jiang et.al. | 2411.13927 | null |
2024-11-20 | VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation | Ziyang Luo et.al. | 2411.13281 | null |
2024-11-19 | VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge | Vishwesh Nath et.al. | 2411.12915 | null |
2024-11-19 | Mitigating Perception Bias: A Training-Free Approach to Enhance LMM for Image Quality Assessment | Siyi Pan et.al. | 2411.12791 | null |
2024-11-18 | MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT | Xiaomin Ouyang et.al. | 2411.12126 | null |
2024-11-17 | SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization | Hongrui Jia et.al. | 2411.11909 | link |
2024-11-18 | The Power of Many: Multi-Agent Multimodal Models for Cultural Image Captioning | Longju Bai et.al. | 2411.11758 | link |
2024-11-18 | Artificial Scientific Discovery | Antonio Norelli et.al. | 2411.11672 | null |
2024-11-18 | InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models | Yu Yan et.al. | 2411.11394 | null |
2024-11-19 | SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach | Ruoxi Sun et.al. | 2411.11195 | null |
2024-11-16 | ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models | Vipula Rawte et.al. | 2411.10867 | null |
2024-11-19 | MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models | Jianhong Tu et.al. | 2411.10557 | link |
2024-11-15 | Everything is a Video: Unifying Modalities through Next-Frame Prediction | G. Thomas Hudson et.al. | 2411.10503 | null |
2024-11-15 | Weakly-Supervised Multimodal Learning on MIMIC-CXR | Andrea Agostini et.al. | 2411.10356 | link |
2024-11-21 | Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era | Thanh Tam Nguyen et.al. | 2411.09955 | link |
2024-11-14 | Cross-Modal Consistency in Multimodal Large Language Models | Xiang Zhang et.al. | 2411.09273 | null |
2024-11-14 | SmartInv: Multimodal Learning for Smart Contract Invariant Inference | Sally Junsong Wang et.al. | 2411.09217 | null |
2024-11-13 | Multimodal Object Detection using Depth and Image Data for Manufacturing Parts | Nazanin Mahjourian et.al. | 2411.09062 | null |
2024-11-13 | Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions | Moran Yanuka et.al. | 2411.09018 | null |
2024-11-13 | AstroM$^3$: A self-supervised multimodal model for astronomy | Mariia Rizhko et.al. | 2411.08842 | null |
2024-11-13 | Multimodal Instruction Tuning with Hybrid State Space Models | Jianing Zhou et.al. | 2411.08840 | null |
2024-11-13 | Retrieval Augmented Recipe Generation | Guoshan Liu et.al. | 2411.08715 | null |
2024-11-12 | DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection | Shawn Li et.al. | 2411.08227 | link |
2024-11-12 | Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer’s Disease | Francesco Chiumento et.al. | 2411.07871 | null |
2024-11-12 | SparrowVQE: Visual Question Explanation for Course Content Understanding | Jialu Li et.al. | 2411.07516 | link |
2024-11-12 | BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions | Anas Awadalla et.al. | 2411.07461 | null |
2024-11-11 | Multimodal Fusion Balancing Through Game-Theoretic Regularization | Konstantinos Kontras et.al. | 2411.07335 | null |
2024-11-11 | OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision | Cong Wei et.al. | 2411.07199 | null |
2024-11-09 | M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework | Yew Ken Chia et.al. | 2411.06176 | null |
2024-11-09 | An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models | Fatemeh Shiri et.al. | 2411.06048 | link |
2024-11-08 | Towards Low-Resource Harmful Meme Detection with LMM Agents | Jianzhao Huang et.al. | 2411.05383 | link |
2024-11-08 | Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation | Dong Shu et.al. | 2411.05316 | link |
2024-11-07 | HourVideo: 1-Hour Video-Language Understanding | Keshigeyan Chandrasegaran et.al. | 2411.04998 | link |
2024-11-07 | VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos | Shehan Munasinghe et.al. | 2411.04923 | null |
2024-11-07 | Exploring Hierarchical Molecular Graph Representation in Multimodal LLMs | Chengxin Hu et.al. | 2411.04708 | null |
2024-11-06 | AutoGameUI: Constructing High-Fidelity Game UIs via Multimodal Learning and Interactive Web-Based Tool | Zhongliang Tang et.al. | 2411.03709 | null |
2024-11-05 | MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning | Ziliang Gan et.al. | 2411.03314 | null |
2024-11-05 | HumanVLM: Foundation for Human-Scene Vision-Language Model | Dawei Dai et.al. | 2411.03034 | null |
2024-11-05 | Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning | Mingcheng Li et.al. | 2411.02793 | null |
2024-11-11 | INQUIRE: A Natural World Text-to-Image Retrieval Benchmark | Edward Vendrow et.al. | 2411.02537 | link |
2024-11-04 | See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers | Jiaxin Zhuang et.al. | 2411.02465 | null |
2024-11-07 | TableGPT2: A Large Multimodal Model with Tabular Data Integration | Aofeng Su et.al. | 2411.02059 | link |
2024-11-04 | Foundations and Recent Trends in Multimodal Mobile Agents: A Survey | Biao Wu et.al. | 2411.02006 | link |
2024-11-04 | KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension | Jie Yang et.al. | 2411.01846 | null |
2024-11-03 | EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark | Ming Li et.al. | 2411.01492 | null |
2024-11-03 | Classifier-guided Gradient Modulation for Enhanced Multimodal Learning | Zirun Guo et.al. | 2411.01409 | link |
2024-11-02 | LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding | Jian Chen et.al. | 2411.01106 | null |
2024-11-01 | Text2Freq: Learning Series Patterns from Text via Frequency Domain | Ming-Chih Lo et.al. | 2411.00929 | null |
2024-11-01 | V-LoRA: An Efficient and Flexible System Boosts Vision Applications with LoRA LMM | Liang Mi et.al. | 2411.00915 | null |
2024-11-01 | Analyzing Multimodal Integration in the Variational Autoencoder from an Information-Theoretic Perspective | Carlotta Langer et.al. | 2411.00522 | null |
2024-10-31 | TurtleBench: A Visual Programming Benchmark in Turtle Geometry | Sina Rismanchian et.al. | 2411.00264 | link |
2024-10-31 | ResiDual Transformer Alignment with Spectral Decomposition | Lorenzo Basile et.al. | 2411.00246 | null |
2024-10-31 | Nearest Neighbor Normalization Improves Multimodal Retrieval | Neil Chowdhury et.al. | 2410.24114 | link |
2024-11-04 | AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents | Yifan Xu et.al. | 2410.24024 | link |
2024-10-31 | Audio Is the Achilles’ Heel: Red Teaming Audio Large Multimodal Models | Hao Yang et.al. | 2410.23861 | null |
2024-10-30 | CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP | Tianyu Yang et.al. | 2410.23330 | null |
2024-10-30 | EMMA: End-to-End Multimodal Model for Autonomous Driving | Jyh-Jing Hwang et.al. | 2410.23262 | null |
2024-10-29 | ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding | Kimihiro Hasegawa et.al. | 2410.22211 | link |
2024-10-29 | Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications | Monica Riedler et.al. | 2410.21943 | link |
2024-10-28 | AiSciVision: A Framework for Specializing Large Multimodal Models in Scientific Image Classification | Brendan Hogan et.al. | 2410.21480 | link |
2024-10-27 | Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse | Ryan Liu et.al. | 2410.21333 | null |
2024-10-28 | IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks | Manjunath D et.al. | 2410.20953 | link |
2024-10-27 | Generator Matching: Generative modeling with arbitrary Markov processes | Peter Holderrieth et.al. | 2410.20587 | null |
2024-10-27 | PaPaGei: Open Foundation Models for Optical Physiological Signals | Arvind Pillai et.al. | 2410.20542 | link |
2024-10-25 | Turn-by-Turn Indoor Navigation for the Visually Impaired | Santosh Srinivasaiah et.al. | 2410.19954 | null |
2024-10-25 | A Multimodal Approach For Endoscopic VCE Image Classification Using BiomedCLIP-PubMedBERT | Nagarajan Ganapathy et.al. | 2410.19944 | link |
2024-10-25 | OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization | Hongliang He et.al. | 2410.19609 | link |
2024-10-24 | Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant | Abhirama Subramanyam Penamakuri et.al. | 2410.19144 | link |
2024-10-24 | VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks | Lawrence Jang et.al. | 2410.19100 | null |
2024-10-24 | CAMEL-Bench: A Comprehensive Arabic LMM Benchmark | Sara Ghaboura et.al. | 2410.18976 | link |
2024-10-24 | Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques | David Ortiz-Perez et.al. | 2410.18972 | null |
2024-10-24 | OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning | Xiaoqiang Wang et.al. | 2410.18963 | null |
2024-10-24 | A Survey of Multimodal Sarcasm Detection | Shafkat Farabi et.al. | 2410.18882 | null |
2024-10-27 | R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models | Linger Deng et.al. | 2410.17885 | link |
2024-10-22 | JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation | Shota Onohara et.al. | 2410.17250 | null |
2024-10-22 | An Eye for an AI: Evaluating GPT-4o’s Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions | Tony Haoran Feng et.al. | 2410.16991 | null |
2024-10-21 | DocEdit-v2: Document Structure Editing Via Multimodal LLM Grounding | Manan Suri et.al. | 2410.16472 | null |
2024-10-21 | Promoting cross-modal representations to improve multimodal foundation models for physiological signals | Ching Fang et.al. | 2410.16424 | null |
2024-10-22 | Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance | Zhangwei Gao et.al. | 2410.16261 | link |
2024-10-22 | MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report | Samrajya Thapa et.al. | 2410.16239 | link |
2024-10-21 | Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models | Yufei Zhan et.al. | 2410.16163 | link |
2024-10-21 | LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Ruikun Zhang et.al. | 2410.16095 | link |
2024-10-21 | How to Build a Pre-trained Multimodal model for Simultaneously Chatting and Decision-making? | Zuojin Tang et.al. | 2410.15885 | null |
2024-10-21 | Multimodal Learning for Embryo Viability Prediction in Clinical IVF | Junsik Kim et.al. | 2410.15581 | null |
2024-10-20 | IPO: Interpretable Prompt Optimization for Vision-Language Models | Yingjun Du et.al. | 2410.15397 | link |
2024-10-20 | Modality-Fair Preference Optimization for Trustworthy MLLM Alignment | Songtao Jiang et.al. | 2410.15334 | null |
2024-10-19 | ChitroJera: A Regionally Relevant Visual Question Answering Dataset for Bangla | Deeparghya Dutta Barua et.al. | 2410.14991 | null |
2024-10-19 | SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation | Junda Wang et.al. | 2410.14948 | link |
2024-10-18 | Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension | Yin Xie et.al. | 2410.14332 | link |
2024-10-18 | Personalized Image Generation with Large Multimodal Models | Yiyan Xu et.al. | 2410.14170 | null |
2024-10-18 | Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents | Sabit Hassan et.al. | 2410.14141 | null |
2024-10-17 | Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation | Chengyue Wu et.al. | 2410.13848 | link |
2024-10-18 | Harnessing Webpage UIs for Text-Rich Visual Understanding | Junpeng Liu et.al. | 2410.13824 | null |
2024-10-17 | Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR | Abhishek Gupta et.al. | 2410.13445 | null |
2024-10-16 | The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio | Sicong Leng et.al. | 2410.12787 | null |
2024-10-16 | HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks | Fengji Zhang et.al. | 2410.12381 | link |
2024-10-15 | CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning | Qingqing Cao et.al. | 2410.11963 | null |
2024-10-15 | Generalizable Spacecraft Trajectory Generation via Multimodal Learning with Transformers | Davide Celestini et.al. | 2410.11723 | null |
2024-10-15 | Unveiling the Mystery of Visual Attributes of Concrete and Abstract Concepts: Variability, Nearest Neighbors, and Challenging Categories | Tarun Tater et.al. | 2410.11657 | link |
2024-10-15 | On-the-fly Modulation for Balanced Multimodal Learning | Yake Wei et.al. | 2410.11582 | link |
2024-10-15 | Enhancing Unimodal Latent Representations in Multimodal VAEs through Iterative Amortized Inference | Yuta Oshima et.al. | 2410.11403 | null |
2024-10-14 | Saliency Guided Optimization of Diffusion Latents | Xiwen Wang et.al. | 2410.10257 | null |
2024-10-14 | MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models | Peng Xia et.al. | 2410.10139 | link |
2024-10-13 | LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models | Junyan Ye et.al. | 2410.09732 | null |
2024-10-12 | Reconstructive Visual Instruction Tuning | Haochen Wang et.al. | 2410.09575 | null |
2024-10-11 | Can GPTs Evaluate Graphic Design Based on Design Principles? | Daichi Haraguchi et.al. | 2410.08885 | null |
2024-10-11 | VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding | Houlun Chen et.al. | 2410.08593 | link |
2024-10-10 | ElasticTok: Adaptive Tokenization for Image and Video | Wilson Yan et.al. | 2410.08368 | null |
2024-10-10 | Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts | Sukwon Yun et.al. | 2410.08245 | link |
2024-10-10 | LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts | Anh-Quan Cao et.al. | 2410.08211 | null |
2024-10-10 | Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision | Shengcao Cao et.al. | 2410.08209 | null |
2024-10-10 | MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models | Wenbo Hu et.al. | 2410.08182 | null |
2024-10-10 | Generated Bias: Auditing Internal Bias Dynamics of Text-To-Image Generative Models | Abhishek Mandal et.al. | 2410.07884 | null |
2024-10-09 | The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks | Isaac R. Galatzer-Levy et.al. | 2410.07391 | null |
2024-10-12 | Deep Correlated Prompting for Visual Recognition with Missing Modalities | Lianyu Hu et.al. | 2410.06558 | link |
2024-10-11 | Chip-Tuning: Classify Before Language Models Say | Fangwei Zhu et.al. | 2410.06541 | link |
2024-10-09 | Does Spatial Cognition Emerge in Frontier Models? | Santhosh Kumar Ramakrishnan et.al. | 2410.06468 | null |
2024-10-08 | Multimodal Representation Learning using Adaptive Graph Construction | Weichen Huang et.al. | 2410.06395 | null |
2024-10-08 | Temporal Image Caption Retrieval Competition – Description and Results | Jakub Pokrywka et.al. | 2410.06314 | null |
2024-10-08 | PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling | Xudong Xie et.al. | 2410.05970 | link |
2024-10-08 | ModalPrompt: Dual-Modality Guided Prompt for Continual Learning of Large Multimodal Models | Fanhu Zeng et.al. | 2410.05849 | null |
2024-10-08 | Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond | Soyeon Caren Han et.al. | 2410.05608 | link |
2024-10-08 | TeaserGen: Generating Teasers for Long Documentaries | Weihan Xu et.al. | 2410.05586 | null |
2024-10-07 | R-Bench: Are your Large Multimodal Model Robust to Real-world Corruptions? | Chunyi Li et.al. | 2410.05474 | link |
2024-10-07 | RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction | Yuwei Zhang et.al. | 2410.05361 | null |
2024-10-07 | Patch is Enough: Naturalistic Adversarial Patch against Vision-Language Pre-training Models | Dehong Kong et.al. | 2410.04884 | null |
2024-10-06 | VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models | Harshit et.al. | 2410.04609 | null |
2024-10-06 | UniMuMo: Unified Text, Music and Motion Generation | Han Yang et.al. | 2410.04534 | link |
2024-10-08 | Gamified crowd-sourcing of high-quality data for visual fine-tuning | Shashank Yadav et.al. | 2410.04038 | null |
2024-10-07 | Multimodal Point-of-Interest Recommendation | Yuta Kanzawa et.al. | 2410.03265 | null |
2024-10-04 | Bridging the Gap between Text, Audio, Image, and Any Sequence: A Novel Approach using Gloss-based Annotation | Sen Fang et.al. | 2410.03146 | null |
2024-10-04 | AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark | Wenhao Chai et.al. | 2410.03051 | null |
2024-10-07 | CPFD: Confidence-aware Privileged Feature Distillation for Short Video Classification | Jinghao Shi et.al. | 2410.03038 | null |
2024-10-07 | MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection | Niki Nezakati et.al. | 2410.03010 | null |
2024-10-03 | Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos | Jianrui Zhang et.al. | 2410.02763 | null |
2024-10-03 | Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models | Zhengfeng Lai et.al. | 2410.02740 | null |
2024-10-04 | Video Instruction Tuning With Synthetic Data | Yuanhan Zhang et.al. | 2410.02713 | null |
2024-10-03 | LLaVA-Critic: Learning to Evaluate Multimodal Models | Tianyi Xiong et.al. | 2410.02712 | null |
2024-10-03 | Plots Unlock Time-Series Understanding in Multimodal Models | Mayank Daswani et.al. | 2410.02637 | null |
2024-10-02 | Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations | Minoh Jeong et.al. | 2410.02086 | null |
2024-10-02 | Toward a Holistic Evaluation of Robustness in CLIP Models | Weijie Tu et.al. | 2410.01534 | null |
2024-10-02 | SHAP-CAT: A interpretable multi-modal framework enhancing WSI classification via virtual staining and shapley-value-based multimodal fusion | Jun Wang et.al. | 2410.01408 | null |
2024-10-02 | Backdooring Vision-Language Models with Out-Of-Distribution Data | Weimin Lyu et.al. | 2410.01264 | null |
2024-10-02 | OCC-MLLM: Empowering Multimodal Large Language Model For the Understanding of Occluded Objects | Wenmo Qiu et.al. | 2410.01261 | null |
2024-09-30 | Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning | Weitai Kang et.al. | 2410.00255 | link |
2024-09-30 | Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information | Hyeongdon Moon et.al. | 2409.20167 | link |
2024-10-02 | Visual Context Window Extension: A New Perspective for Long Video Understanding | Hongchen Wei et.al. | 2409.20018 | null |
2024-09-30 | Towards Robust Multimodal Sentiment Analysis with Incomplete Data | Haoyu Zhang et.al. | 2409.20012 | link |
2024-09-28 | FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models | Diego A. B. Moreira et.al. | 2409.19474 | link |
2024-09-28 | From Unimodal to Multimodal: Scaling up Projectors to Align Modalities | Mayug Maniparambil et.al. | 2409.19425 | null |
2024-10-02 | CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Jihai Zhang et.al. | 2409.19291 | link |
2024-09-28 | TrojVLM: Backdoor Attack Against Vision Language Models | Weimin Lyu et.al. | 2409.19232 | null |
2024-09-27 | Multimodal Markup Document Models for Graphic Design Completion | Kotaro Kikuchi et.al. | 2409.19051 | null |
2024-09-27 | Emu3: Next-Token Prediction is All You Need | Xinlong Wang et.al. | 2409.18869 | null |
2024-09-27 | Data Analysis in the Era of Generative AI | Jeevana Priya Inala et.al. | 2409.18475 | null |
2024-09-26 | MultiClimate: Multimodal Stance Detection on Climate Change Videos | Jiawen Wang et.al. | 2409.18346 | link |
2024-09-26 | LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness | Chenming Zhu et.al. | 2409.18125 | null |
2024-09-26 | GSON: A Group-based Social Navigation Framework with Large Multimodal Model | Shangyi Luo et.al. | 2409.18084 | null |
2024-09-26 | A Multimodal Single-Branch Embedding Network for Recommendation in Cold-Start and Missing Modality Scenarios | Christian Ganhör et.al. | 2409.17864 | link |
2024-09-26 | Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification | Raja Kumar et.al. | 2409.17777 | link |
2024-09-26 | MIO: A Foundation Model on Multimodal Tokens | Zekun Wang et.al. | 2409.17692 | link |
2024-09-25 | Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models | Matt Deitke et.al. | 2409.17146 | link |
2024-09-24 | CDChat: A Large Multimodal Model for Remote Sensing Change Description | Mubashir Noman et.al. | 2409.16261 | link |
2024-09-24 | CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation | Fuxian Huang et.al. | 2409.15806 | null |
2024-09-18 | Recommendation with Generative Models | Yashar Deldjoo et.al. | 2409.15173 | null |
2024-09-23 | With Ears to See and Eyes to Hear: Sound Symbolism Experiments with Multimodal Large Language Models | Tyler Loakman et.al. | 2409.14917 | link |
2024-09-22 | Patch Ranking: Efficient CLIP by Learning to Rank Local Patches | Cheng-En Wu et.al. | 2409.14607 | null |
2024-09-22 | Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models | Yew Ken Chia et.al. | 2409.14277 | null |
2024-09-20 | Brain-Cognition Fingerprinting via Graph-GCCA with Contrastive Learning | Yixin Wang et.al. | 2409.13887 | null |
2024-09-20 | Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model | Li Zhou et.al. | 2409.13407 | link |
2024-09-20 | A Novel Adaptive Fine-Tuning Algorithm for Multimodal Models: Self-Optimizing Classification and Selection of High-Quality Datasets in Remote Sensing | Yi Ren et.al. | 2409.13345 | null |
2024-09-20 | ChemDFM-X: Towards Large Multimodal Model for Chemistry | Zihan Zhao et.al. | 2409.13194 | null |
2024-09-19 | MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines | Dongzhi Jiang et.al. | 2409.12959 | null |
2024-09-24 | TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation | Junjie Wen et.al. | 2409.12514 | null |
2024-09-18 | Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution | Peng Wang et.al. | 2409.12191 | link |
2024-09-18 | All-in-one foundational models learning across quantum chemical levels | Yuxinxin Chen et.al. | 2409.12015 | link |
2024-09-18 | LMMCoDrive: Cooperative Driving with Large Multimodal Model | Haichao Liu et.al. | 2409.11981 | link |
2024-09-16 | MusicLIME: Explainable Multimodal Music Understanding | Theodoros Sotirou et.al. | 2409.10496 | link |
2024-09-19 | IRIS: Interactive Responsive Intelligent Segmentation for 3D Affordance Analysis | Meng Chu et.al. | 2409.10078 | null |
2024-09-16 | AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing | Huawei Ji et.al. | 2409.10016 | link |
2024-09-14 | Keypoints-Integrated Instruction-Following Data Generation for Enhanced Human Pose Understanding in Multimodal Models | Dewen Zhang et.al. | 2409.09306 | null |
2024-09-13 | Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing | Minh-Duc Vu et.al. | 2409.08885 | null |
2024-09-13 | A Multimodal Approach for Fluid Overload Prediction: Integrating Lung Ultrasound and Clinical Data | Tianqi Yang et.al. | 2409.08790 | null |
2024-09-13 | Dynamics of Collective Group Affect: Group-level Annotations and the Multimodal Modeling of Convergence and Divergence | Navin Raj Prabhu et.al. | 2409.08578 | null |
2024-09-13 | A Comprehensive Survey on Deep Multimodal Learning with Missing Modality | Renjie Wu et.al. | 2409.07825 | null |
2024-09-12 | Top-down Activity Representation Learning for Video Question Answering | Yanan Wang et.al. | 2409.07748 | null |
2024-09-11 | What to align in multimodal contrastive learning? | Benoit Dufumier et.al. | 2409.07402 | null |
2024-09-11 | MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis | Hanyu Jiang et.al. | 2409.07129 | null |
2024-09-11 | FSMDet: Vision-guided feature diffusion for fully sparse 3D detector | Tianran Liu et.al. | 2409.06945 | null |
2024-09-16 | Scaling Law Hypothesis for Multimodal Model | Qingyun Sun et.al. | 2409.06754 | null |
2024-09-10 | Multiclass Arrhythmia Classification using Smartwatch Photoplethysmography Signals Collected in Real-life Settings | Dong Han et.al. | 2409.06147 | null |
2024-09-11 | A Survey of Multimodal Composite Editing and Retrieval | Suyan Li et.al. | 2409.05405 | link |
2024-09-05 | Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis | Xianbing Zhao et.al. | 2409.04473 | null |
2024-09-06 | Generating Faithful and Salient Text from Multimodal Data | Tahsina Hashem et.al. | 2409.03961 | link |
2024-09-06 | CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Wentao Liu et.al. | 2409.02834 | link |
2024-09-10 | MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark | Xiang Yue et.al. | 2409.02813 | null |
2024-09-04 | Understanding eGFR Trajectories and Kidney Function Decline via Large Multimodal Models | Chih-Yuan Li et.al. | 2409.02530 | null |
2024-09-03 | Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models | Bin Fu et.al. | 2409.01560 | null |
2024-09-03 | Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition | Yaozong Gan et.al. | 2409.01534 | null |
2024-09-02 | Towards General Industrial Intelligence: A Survey on IIoT-Enhanced Continual Large Models | Jiao Chen et.al. | 2409.01207 | null |
2024-09-02 | Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information | Yi Chen et.al. | 2409.01179 | null |
2024-08-31 | Comparative Analysis of Modality Fusion Approaches for Audio-Visual Person Identification and Verification | Aref Farhadipour et.al. | 2409.00562 | null |
2024-08-30 | UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios | Baichuan Zhou et.al. | 2408.17267 | null |
2024-08-29 | Seeking the Sufficiency and Necessity Causal Features in Multimodal Representation Learning | Boyu Chen et.al. | 2408.16577 | null |
2024-08-29 | Toward Robust Early Detection of Alzheimer’s Disease via an Integrated Multimodal Learning Approach | Yifei Chen et.al. | 2408.16343 | link |
2024-08-28 | Meta-Learn Unimodal Signals with Weak Supervision for Multimodal Sentiment Analysis | Sijie Mai et.al. | 2408.16029 | null |
2024-08-28 | ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation | Tiantian Feng et.al. | 2408.15803 | null |
2024-08-28 | Visual Prompt Engineering for Medical Vision Language Models in Radiology | Stefan Denner et.al. | 2408.15802 | null |
2024-08-27 | X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation | Hanjia Lyu et.al. | 2408.15172 | null |
2024-08-27 | The Benefits of Balance: From Information Projections to Variance Reduction | Lang Liu et.al. | 2408.15065 | null |
2024-08-27 | NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework | Shuangchen Zhao et.al. | 2408.14950 | null |
2024-08-26 | MMR: Evaluating Reading Ability of Large Multimodal Models | Jian Chen et.al. | 2408.14594 | null |
2024-09-03 | Foundation Models for Music: A Survey | Yinghao Ma et.al. | 2408.14340 | link |
2024-08-26 | LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models | Qihang Ge et.al. | 2408.14008 | null |
2024-08-27 | Quantum Multimodal Contrastive Learning Framework | Chi-Sheng Chen et.al. | 2408.13919 | null |
2024-08-25 | Tangram: A Challenging Benchmark for Geometric Element Recognizing | Jiamin Tang et.al. | 2408.13854 | null |
2024-08-25 | Multimodal Ensemble with Conditional Feature Fusion for Dysgraphia Diagnosis in Children from Handwriting Samples | Jayakanth Kunhoth et.al. | 2408.13754 | null |
2024-08-24 | Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models | Sakhinana Sagar Srinivas et.al. | 2408.13621 | null |
2024-08-23 | Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption | Sakhinana Sagar Srinivas et.al. | 2408.13248 | null |
2024-08-23 | Indoor scene recognition from images under visual corruptions | Willams de Lima Costa et.al. | 2408.13029 | null |
2024-08-23 | Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition | Cam-Van Thi Nguyen et.al. | 2408.12895 | null |
2024-08-23 | Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey | Qika Lin et.al. | 2408.12880 | link |
2024-08-22 | Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models | Jean Park et.al. | 2408.12763 | null |
2024-08-22 | Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization | Luyao Cheng et.al. | 2408.12102 | null |
2024-08-22 | Mental-Perceiver: Audio-Textual Multimodal Learning for Mental Health Assessment | Jinghui Qin et.al. | 2408.12088 | null |
2024-08-21 | GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models | Jonathan Roberts et.al. | 2408.11817 | null |
2024-08-21 | D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models | M. Forlini et.al. | 2408.11761 | null |
2024-08-21 | UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation | Xiangyu Zhao et.al. | 2408.11305 | link |
2024-08-21 | BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation | Haotian Peng et.al. | 2408.11281 | link |
2024-08-20 | Exploring the use of Generative AI to Support Automated Just-in-Time Programming for Visual Scene Displays | Cynthia Zastudil et.al. | 2408.11137 | null |
2024-08-21 | SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition | Zebang Cheng et.al. | 2408.10500 | link |
2024-08-19 | Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting | Yun-Da Tsai et.al. | 2408.09798 | null |
2024-08-19 | Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation | Yunxin Li et.al. | 2408.09787 | link |
2024-08-18 | PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding | Dawei Dai et.al. | 2408.09530 | link |
2024-08-17 | Measuring Visual Sycophancy in Multimodal Models | Jaehyuk Lim et.al. | 2408.09111 | link |
2024-08-16 | AdaRank: Disagreement Based Module Rank Prediction for Low-rank Adaptation | Yihe Dong et.al. | 2408.09015 | link |
2024-08-16 | xGen-MM (BLIP-3): A Family of Open Large Multimodal Models | Le Xue et.al. | 2408.08872 | null |
2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | Jinming Liu et.al. | 2408.08575 | null |
2024-08-15 | LLaVA-Surg: Towards Multimodal Surgical Assistant via Structured Surgical Video Learning | Jiajie Li et.al. | 2408.07981 | null |
2024-08-15 | MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Minxuan Zhou et.al. | 2408.07543 | link |
2024-08-14 | Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach | Muhammad Saad Saeed et.al. | 2408.07445 | null |
2024-08-14 | Robust Semi-supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration | Xiaogen Zhon et.al. | 2408.07341 | link |
2024-08-14 | Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion | Peiyuan Chen et.al. | 2408.07303 | null |
2024-08-13 | PathInsight: Instruction Tuning of Multimodal Datasets and Models for Intelligence Assisted Diagnosis in Histopathology | Xiaomin Wu et.al. | 2408.07037 | null |
2024-08-13 | EditScribe: Non-Visual Image Editing with Natural Language Verification Loops | Ruei-Che Chang et.al. | 2408.06632 | null |
2024-08-13 | CROME: Cross-Modal Adapters for Efficient Multimodal LLM | Sayna Ebrahimi et.al. | 2408.06610 | null |
2024-08-13 | Prioritizing Modalities: Flexible Importance Scheduling in Federated Multimodal Learning | Jieming Bian et.al. | 2408.06549 | null |
2024-08-12 | VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents | Xiao Liu et.al. | 2408.06327 | link |
2024-08-11 | HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes | Xuanyu Su et.al. | 2408.05794 | null |
2024-08-08 | Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles using LLMs and LMMs | Aliki Anagnostopoulou et.al. | 2408.04331 | null |
2024-08-06 | LLaVA-OneVision: Easy Visual Task Transfer | Bo Li et.al. | 2408.03326 | link |
2024-08-06 | Multitask and Multimodal Neural Tuning for Large Models | Hao Sun et.al. | 2408.03001 | null |
2024-08-06 | Body of Her: A Preliminary Study on End-to-End Humanoid Agent | Tenglong Ao et.al. | 2408.02879 | null |
2024-08-04 | Distribution-Level Memory Recall for Continual Learning: Preserving Knowledge and Avoiding Confusion | Shaoxu Cheng et.al. | 2408.02695 | null |
2024-08-02 | A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications | Valerio Guarrasi et.al. | 2408.02686 | null |
2024-08-05 | REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models | Agneet Chatterjee et.al. | 2408.02231 | null |
2024-08-04 | CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization | Xiang He et.al. | 2408.01952 | link |
2024-08-02 | MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models | Benno Weck et.al. | 2408.01337 | link |
2024-08-05 | Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions | Jin Gao et.al. | 2408.01091 | link |
2024-08-02 | GraphAge: Unleashing the power of Graph Neural Network to Decode Epigenetic Aging | Saleh Sakib Ahmed et.al. | 2408.00984 | link |
2024-08-01 | MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities | Weihao Yu et.al. | 2408.00765 | link |
2024-08-01 | GalleryGPT: Analyzing Paintings with Large Multimodal Models | Yi Bin et.al. | 2408.00491 | link |
2024-08-01 | Everything We Hear: Towards Tackling Misinformation in Podcasts | Sachin Pathiyan Cherumanal et.al. | 2408.00292 | null |
2024-08-01 | OmniParser for Pure Vision Based GUI Agent | Yadong Lu et.al. | 2408.00203 | null |
2024-07-30 | Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection | Jinfa Huang et.al. | 2407.21004 | null |
2024-07-30 | HyperMM: Robust Multimodal Learning with Varying-sized Inputs | Hava Chaptoukaev et.al. | 2407.20768 | null |
2024-07-30 | Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos | Dhruv Verma et.al. | 2407.20642 | link |
2024-07-29 | Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter | Chao Liu et.al. | 2407.19981 | null |
2024-07-29 | ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2 | Wenjun Huang et.al. | 2407.19832 | null |
2024-08-02 | XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training | Biao Wu et.al. | 2407.19546 | link |
2024-07-28 | Detached and Interactive Multimodal Learning | Yunfeng Fan et.al. | 2407.19514 | link |
2024-07-27 | Data Processing Techniques for Modern Multimodal Models | Yinheng Li et.al. | 2407.19180 | null |
2024-07-26 | MangaUB: A Manga Understanding Benchmark for Large Multimodal Models | Hikaru Ikuta et.al. | 2407.19034 | null |
2024-07-26 | Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment | Yuze Zheng et.al. | 2407.18854 | null |
2024-07-26 | ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema | Fei Wang et.al. | 2407.18716 | null |
2024-07-25 | Sparse vs Contiguous Adversarial Pixel Perturbations in Multimodal Models: An Empirical Analysis | Cristian-Alexandru Botocan et.al. | 2407.18251 | link |
2024-07-25 | $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs | Vlad Sobal et.al. | 2407.18134 | null |
2024-07-25 | Cross-Vendor Reproducibility of Radiomics-based Machine Learning Models for Computer-aided Diagnosis | Jatin Chaudhary et.al. | 2407.18060 | null |
2024-07-25 | What does Kiki look like? Cross-modal associations between speech sounds and visual shapes in vision-and-language models | Tessa Verhoef et.al. | 2407.17974 | null |
2024-07-25 | Shapley Value-based Contrastive Alignment for Multimodal Information Extraction | Wen Luo et.al. | 2407.17854 | null |
2024-07-25 | Enhancing Model Performance: Another Approach to Vision-Language Instruction Tuning | Vedanshu et.al. | 2407.17813 | null |
2024-07-25 | KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models | Eunice Yiu et.al. | 2407.17773 | link |
2024-07-24 | Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles | Zuoyin Tang et.al. | 2407.17211 | null |
2024-07-23 | Chameleon: Images Are What You Need For Multimodal Learning Robust To Missing Modalities | Muhammad Irzam Liaqat et.al. | 2407.16243 | null |
2024-07-22 | LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding | Haoning Wu et.al. | 2407.15754 | link |
2024-07-22 | Resource-Efficient Federated Multimodal Learning via Layer-wise and Progressive Training | Ye Lin Tun et.al. | 2407.15426 | null |
2024-07-21 | VideoGameBunny: Towards vision assistants for video games | Mohammad Reza Taesiri et.al. | 2407.15295 | null |
2024-07-22 | Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer’s Disease classification | Lisa Anita De Santi et.al. | 2407.14277 | link |
2024-07-18 | Visual Haystacks: Answering Harder Questions About Sets of Images | Tsung-Han Wu et.al. | 2407.13766 | link |
2024-07-17 | Text- and Feature-based Models for Compound Multimodal Emotion Recognition in the Wild | Nicolas Richet et.al. | 2407.12927 | link |
2024-07-16 | ChatBCG: Can AI Read Your Slide Deck? | Nikita Singh et.al. | 2407.12875 | null |
2024-07-17 | LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models | Kaichen Zhang et.al. | 2407.12772 | link |
2024-07-17 | Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models | Donggeun Kim et.al. | 2407.12616 | null |
2024-07-17 | E5-V: Universal Embeddings with Multimodal Large Language Models | Ting Jiang et.al. | 2407.12580 | link |
2024-07-16 | FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models | Pengxiang Li et.al. | 2407.11522 | null |
2024-07-16 | COMET: “Cone of experience” enhanced large multimodal model for mathematical problem generation | Sannyuya Liu et.al. | 2407.11315 | null |
2024-07-15 | OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models | Zijian Zhou et.al. | 2407.11213 | link |
2024-07-15 | FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries | Yuqi Jiang et.al. | 2407.10810 | null |
2024-07-15 | Scaling 3D Reasoning with LMMs to Large Robot Mission Environments Using Datagraphs | W. J. Meijer et.al. | 2407.10743 | null |
2024-07-16 | Qwen2 Technical Report | An Yang et.al. | 2407.10671 | link |
2024-07-15 | How and where does CLIP process negation? | Vincent Quantmeyer et.al. | 2407.10488 | null |
2024-07-12 | Diagnosing and Re-learning for Balanced Multimodal Learning | Yake Wei et.al. | 2407.09705 | link |
2024-07-12 | Unifying Sequences, Structures, and Descriptions for Any-to-Any Protein Generation with the Large Multimodal Model HelixProtX | Zhiyuan Chen et.al. | 2407.09274 | link |
2024-07-12 | DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training | Chen Xin et.al. | 2407.09174 | link |
2024-07-11 | Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design | Jingyi Xie et.al. | 2407.08882 | null |
2024-07-10 | RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization | Xijie Huang et.al. | 2407.08044 | link |
2024-07-10 | LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models | Feng Li et.al. | 2407.07895 | link |
2024-07-11 | InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior | Chenguo Lin et.al. | 2407.07580 | null |
2024-07-10 | Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model | Wenqi Zhang et.al. | 2407.07053 | link |
2024-07-08 | ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation | Ethan Chern et.al. | 2407.06135 | link |
2024-07-07 | Multimodal Language Models for Domain-Specific Procedural Video Summarization | Nafisa Hussain et.al. | 2407.05419 | null |
2024-07-07 | Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition | Zirun Guo et.al. | 2407.05374 | link |
2024-07-06 | Enhance the Robustness of Text-Centric Multimodal Alignments | Ting-Yu Yen et.al. | 2407.05036 | null |
2024-07-06 | Completed Feature Disentanglement Learning for Multimodal MRIs Analysis | Tianling Liu et.al. | 2407.04916 | null |
2024-07-06 | MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension | Zekun Li et.al. | 2407.04903 | link |
2024-07-05 | VCoME: Verbal Video Composition with Multimodal Editing Effects | Weibo Gong et.al. | 2407.04697 | null |
2024-07-05 | Multimodal Classification via Modal-Aware Interactive Enhancement | Qing-Yuan Jiang et.al. | 2407.04587 | null |
2024-07-05 | Robust Multimodal Learning via Representation Decoupling | Shicai Wei et.al. | 2407.04458 | null |
2024-07-05 | Smart Vision-Language Reasoners | Denisa Roberts et.al. | 2407.04212 | link |
2024-07-04 | Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks | Amit Parekh et.al. | 2407.03967 | link |
2024-07-04 | ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities | Julie Mordacq et.al. | 2407.03836 | link |
2024-07-04 | M$\mathbf{5}$ – A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks | Florian Schneider et.al. | 2407.03791 | null |
2024-07-03 | HEMM: Holistic Evaluation of Multimodal Foundation Models | Paul Pu Liang et.al. | 2407.03418 | link |
2024-07-02 | Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties | Srivathsan Badrinarayanan et.al. | 2407.03380 | link |
2024-07-02 | Understanding Alignment in Multimodal LLMs: A Comprehensive Study | Elmira Amirloo et.al. | 2407.02477 | null |
2024-07-02 | Synthetic Multimodal Question Generation | Ian Wu et.al. | 2407.02233 | null |
2024-07-02 | Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models | Anjishnu Mukherjee et.al. | 2407.02067 | link |
2024-07-01 | Empathic Grounding: Explorations using Multimodal Interaction and Large Language Models with Conversational Agents | Mehdi Arjmand et.al. | 2407.01824 | link |
2024-07-01 | We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? | Runqi Qiao et.al. | 2407.01284 | link |
2024-07-01 | Unaligning Everything: Or Aligning Any Text to Any Image in Multimodal Models | Shaeke Salman et.al. | 2407.01157 | null |
2024-06-29 | AI-powered multimodal modeling of personalized hemodynamics in aortic stenosis | Caglar Ozturk et.al. | 2407.00535 | null |
2024-06-29 | MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation | Jinsheng Huang et.al. | 2407.00468 | link |
2024-06-29 | How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models | Jaeyoung Lee et.al. | 2407.00369 | null |
2024-06-28 | PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration | Yuxuan Sun et.al. | 2407.00203 | null |
2024-06-28 | EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | Yuxuan Zhang et.al. | 2406.20076 | link |
2024-06-28 | InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding | Kirolos Ataallah et.al. | 2406.19875 | link |
2024-06-28 | MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis | Jun-Yan He et.al. | 2406.19859 | null |
2024-06-28 | MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment | Jihao Liu et.al. | 2406.19736 | link |
2024-06-28 | Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction | Akash Awasthi et.al. | 2406.19686 | null |
2024-06-28 | SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs | Xin Su et.al. | 2406.19593 | null |
2024-06-27 | OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding | Tao Zhang et.al. | 2406.19389 | null |
2024-06-28 | FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts | Shubhankar Singh et.al. | 2406.19237 | null |
2024-06-27 | RAVEN: Multitask Retrieval Augmented Vision-Language Learning | Varun Nagaraj Rao et.al. | 2406.19150 | null |
2024-06-27 | DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming | Jiaxin Zhang et.al. | 2406.19101 | null |
2024-06-27 | Fairness and Bias in Multimodal AI: A Survey | Tosin Adewumi et.al. | 2406.19097 | null |
2024-06-27 | MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation | Sanggeon Yun et.al. | 2406.18815 | null |
2024-06-26 | MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data | William Berman et.al. | 2406.18790 | null |
2024-06-26 | S3: A Simple Strong Sample-effective Multimodal Dialog System | Elisei Rykov et.al. | 2406.18305 | link |
2024-06-26 | EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models | Chun-Chieh Liao et.al. | 2406.18087 | null |
2024-06-26 | Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs | Uttaran Bhattacharya et.al. | 2406.18068 | null |
2024-06-25 | Human-centered In-building Embodied Delivery Benchmark | Zhuoqun Xu et.al. | 2406.17898 | link |
2024-06-25 | InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation | Jinbin Huang et.al. | 2406.17838 | null |
2024-06-25 | Data curation via joint example selection further accelerates multimodal learning | Talfan Evans et.al. | 2406.17711 | null |
2024-06-25 | Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights | Hao Yang et.al. | 2406.17430 | link |
2024-06-24 | At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models | Dimitrios Tanoglidis et.al. | 2406.17057 | null |
2024-06-24 | Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models | Jierun Chen et.al. | 2406.16866 | link |
2024-06-24 | Long Context Transfer from Language to Vision | Peiyuan Zhang et.al. | 2406.16852 | link |
2024-06-24 | QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds | Ye Wang et.al. | 2406.16578 | null |
2024-06-21 | Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning | Brandon Huang et.al. | 2406.15334 | link |
2024-06-21 | Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models | Jiayu Wang et.al. | 2406.14852 | link |
2024-06-20 | Evaluating vision-capable chatbots in interpreting kinematics graphs: a comparative study of free and subscription-based models | Giulia Polverini et.al. | 2406.14685 | null |
2024-06-20 | Revealing Vision-Language Integration in the Brain with Multimodal Networks | Vighnesh Subramaniam et.al. | 2406.14481 | link |
2024-06-25 | iWISDM: Assessing instruction following in multimodal models at scale | Xiaoxuan Lei et.al. | 2406.14343 | link |
2024-06-20 | Two Giraffes in a Dirt Field: Using Game Play to Investigate Situation Modelling in Large Multimodal Models | Sherzod Hakimov et.al. | 2406.14035 | null |
2024-06-20 | Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning | Yupei Zhang et.al. | 2406.13979 | link |
2024-06-20 | PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents | Junjie Wang et.al. | 2406.13923 | null |
2024-06-19 | Through the Theory of Mind’s Eye: Reading Minds with Multimodal Video Large Language Models | Zhawnen Chen et.al. | 2406.13763 | null |
2024-06-19 | GUI Action Narrator: Where and When Did That Action Take Place? | Qinchen Wu et.al. | 2406.13719 | null |
2024-06-19 | Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor | Veedant Jain et.al. | 2406.13564 | null |
2024-06-19 | VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models | Haowen Hou et.al. | 2406.13362 | link |
2024-06-19 | Learnable In-Context Vector for Visual Question Answering | Yingzhe Peng et.al. | 2406.13185 | link |
2024-06-18 | Synergizing Foundation Models and Federated Learning: A Survey | Shenghui Li et.al. | 2406.12844 | null |
2024-06-18 | OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI | Zhen Huang et.al. | 2406.12753 | link |
2024-06-18 | Disturbing Image Detection Using LMM-Elicited Emotion Embeddings | Maria Tzelepi et.al. | 2406.12668 | null |
2024-06-18 | Automatic benchmarking of large multimodal models via iterative experiment programming | Alessandro Conti et.al. | 2406.12321 | link |
2024-06-18 | Language and Multimodal Models in Sports: A Survey of Datasets and Applications | Haotian Xia et.al. | 2406.12252 | null |
2024-06-17 | VideoLLM-online: Online Video Large Language Model for Streaming Video | Joya Chen et.al. | 2406.11816 | null |
2024-06-17 | LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning | Dantong Niu et.al. | 2406.11815 | null |
2024-06-17 | Multimodal Learning To Improve Segmentation With Intraoperative CBCT & Preoperative CT | Maximilian E. Tschuchnig et.al. | 2406.11650 | null |
2024-06-17 | Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment | Chao Wen et.al. | 2406.11334 | null |
2024-06-17 | VideoVista: A Versatile Benchmark for Video Understanding and Reasoning | Yunxin Li et.al. | 2406.11303 | null |
2024-06-17 | i-SRT: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective Judgment | Daechul Ahn et.al. | 2406.11280 | link |
2024-06-17 | MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens | Anas Awadalla et.al. | 2406.11271 | link |
2024-06-17 | Generative Visual Instruction Tuning | Jefferson Hernandez et.al. | 2406.11262 | link |
2024-06-17 | Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective | Yang Chen et.al. | 2406.11249 | null |
2024-06-16 | Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies | Hung-Ting Su et.al. | 2406.10923 | null |
2024-06-15 | Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model | Lu Xu et.al. | 2406.10484 | link |
2024-06-12 | MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases | Rithesh Murthy et.al. | 2406.10290 | null |
2024-06-14 | VideoGUI: A Benchmark for GUI Automation from Instructional Videos | Kevin Qinghong Lin et.al. | 2406.10227 | null |
2024-06-14 | ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation | Chufan Shi et.al. | 2406.09961 | link |
2024-06-14 | BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval | Imanol Miranda et.al. | 2406.09952 | link |
2024-06-13 | VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding | Muhammad Maaz et.al. | 2406.09418 | link |
2024-06-13 | Explore the Limits of Omni-modal Pretraining at Scale | Yiyuan Zhang et.al. | 2406.09412 | link |
2024-06-14 | 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities | Roman Bachmann et.al. | 2406.09406 | null |
2024-06-13 | Yo’LLaVA: Your Personalized Language and Vision Assistant | Thao Nguyen et.al. | 2406.09400 | link |
2024-06-13 | CMC-Bench: Towards a New Paradigm of Visual Signal Compression | Chunyi Li et.al. | 2406.09356 | link |
2024-06-13 | Comparison Visual Instruction Tuning | Wei Lin et.al. | 2406.09240 | null |
2024-06-13 | Zoom and Shift are All You Need | Jiahao Qin et.al. | 2406.08866 | null |
2024-06-11 | Embedding-based Multimodal Learning on Pan-Squamous Cell Carcinomas for Improved Survival Outcomes | Asim Waqas et.al. | 2406.08521 | null |
2024-06-14 | Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models | Yi-Fan Zhang et.al. | 2406.08487 | link |
2024-06-13 | OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text | Qingyun Li et.al. | 2406.08418 | link |
2024-06-12 | A Concept-Based Explainability Framework for Large Multimodal Models | Jayneel Parekh et.al. | 2406.08074 | link |
2024-06-12 | LVBench: An Extreme Long Video Understanding Benchmark | Weihan Wang et.al. | 2406.08035 | link |
2024-06-11 | Cognitive Insights Across Languages: Enhancing Multimodal Interview Analysis | David Ortiz-Perez et.al. | 2406.07542 | link |
2024-06-11 | Understanding Visual Concepts Across Models | Brandon Trabucco et.al. | 2406.07506 | link |
2024-06-11 | Unified Modeling Enhanced Multimodal Learning for Precision Neuro-Oncology | Huahui Yi et.al. | 2406.07078 | link |
2024-06-14 | BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification | June-Woo Kim et.al. | 2406.06786 | link |
2024-06-10 | Vript: A Video Is Worth Thousands of Words | Dongjie Yang et.al. | 2406.06040 | link |
2024-06-10 | FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model | Yebin Lee et.al. | 2406.06004 | link |
2024-06-10 | CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark | David Romero et.al. | 2406.05967 | null |
2024-06-09 | Stealthy Targeted Backdoor Attacks against Image Captioning | Wenshu Fan et.al. | 2406.05874 | link |
2024-06-09 | F-LMM: Grounding Frozen Large Multimodal Models | Size Wu et.al. | 2406.05821 | link |
2024-06-08 | Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities | Sai Munikoti et.al. | 2406.05496 | null |
2024-06-07 | Semantic Segmentation on VSPW Dataset through Masked Video Consistency | Chen Liang et.al. | 2406.04979 | null |
2024-06-07 | Predictive Dynamic Fusion | Bing Cao et.al. | 2406.04802 | link |
2024-06-07 | MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description | Cong Yang et.al. | 2406.04716 | link |
2024-06-07 | AICoderEval: Improving AI Domain Code Generation of Large Language Models | Yinghui Xia et.al. | 2406.04712 | null |
2024-06-06 | GenAI Arena: An Open Evaluation Platform for Generative Models | Dongfu Jiang et.al. | 2406.04485 | null |
2024-06-06 | MAIRA-2: Grounded Radiology Report Generation | Shruthi Bannur et.al. | 2406.04449 | link |
2024-06-06 | DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs | Lingchen Meng et.al. | 2406.04334 | null |
2024-06-06 | BLSP-Emo: Towards Empathetic Large Speech-Language Models | Chen Wang et.al. | 2406.03872 | link |
2024-06-05 | Identification of Stone Deterioration Patterns with Large Multimodal Models | Daniele Corradetti et.al. | 2406.03207 | link |
2024-06-05 | Exploiting LMM-based knowledge for image classification tasks | Maria Tzelepi et.al. | 2406.03071 | null |
2024-06-02 | Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications | David Restrepo et.al. | 2406.02601 | null |
2024-06-04 | Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning | Alex Jinpeng Wang et.al. | 2406.02547 | link |
2024-06-04 | Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization | Yunpeng Zhao et.al. | 2406.01987 | null |
2024-06-03 | Automatic Fused Multimodal Deep Learning for Plant Identification | Alfreds Lapkovskis et.al. | 2406.01455 | link |
2024-06-05 | Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data | Zhusi Zhong et.al. | 2406.01302 | null |
2024-06-03 | Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model | Kezhen Chen et.al. | 2406.00977 | link |
2024-06-02 | Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient | Zechu Li et.al. | 2406.00681 | null |
2024-06-04 | StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond | Pengyuan Lyu et.al. | 2405.21013 | null |
2024-05-31 | Don’t Buy it! Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models | A. Bavaresco et.al. | 2405.20846 | link |
2024-06-17 | Ovis: Structural Embedding Alignment for Multimodal Large Language Model | Shiyin Lu et.al. | 2405.20797 | link |
2024-05-31 | Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning | Yang Chen et.al. | 2405.20606 | link |
2024-05-30 | Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA | Qianqi Yan et.al. | 2405.20421 | link |
2024-05-30 | Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use | Franz Louis Cesista et.al. | 2405.20245 | null |
2024-05-31 | Visual Attention Analysis in Online Learning | Miriam Navarro et.al. | 2405.20091 | null |
2024-05-30 | MM-Lego: Modular Biomedical Multimodal Models with Minimal Fine-Tuning | Konstantin Hemker et.al. | 2405.19950 | null |
2024-05-30 | Instruction-Guided Visual Masking | Jinliang Zheng et.al. | 2405.19783 | link |
2024-05-29 | Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining | Blake R. Duschatko et.al. | 2405.19386 | null |
2024-06-09 | LLMs Meet Multimodal Generation and Editing: A Survey | Yingqing He et.al. | 2405.19334 | link |
2024-05-29 | Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare | Hanwei Zhu et.al. | 2405.19298 | link |
2024-05-31 | Benchmarking and Improving Detail Image Caption | Hongyuan Dong et.al. | 2405.19092 | link |
2024-05-29 | Topological Perspectives on Optimal Multimodal Embedding Spaces | Abdul Aziz A. B et.al. | 2405.18867 | null |
2024-05-29 | Exploring Exotic Decays of the Higgs Boson to Multi-Photons at the LHC via Multimodal Learning Approaches | A. Hammad et.al. | 2405.18834 | null |
2024-05-28 | The Evolution of Multimodal Model Architectures | Shakti N. Wadekar et.al. | 2405.17927 | null |
2024-05-28 | Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment | Xin Xiao et.al. | 2405.17871 | link |
2024-05-28 | Full-Stack Allreduce on Multi-Rail Networks | Enda Yu et.al. | 2405.17870 | null |
2024-05-28 | MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance | Yake Wei et.al. | 2405.17730 | link |
2024-05-27 | Matryoshka Multimodal Models | Mu Cai et.al. | 2405.17430 | null |
2024-05-27 | XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | Xianfu Cheng et.al. | 2405.17336 | link |
2024-05-28 | LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding | Haoyu Zhao et.al. | 2405.17104 | null |
2024-05-27 | Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning | Zihua Zhao et.al. | 2405.16996 | link |
2024-05-27 | Multilingual Diversity Improves Vision-Language Representations | Thao Nguyen et.al. | 2405.16915 | null |
2024-05-26 | Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs | Mustafa Shukor et.al. | 2405.16700 | link |
2024-05-25 | How Well Do Deep Learning Models Capture Human Concepts? The Case of the Typicality Effect | Siddhartha K. Vemuri et.al. | 2405.16128 | null |
2024-05-24 | ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models | Chunjiang Ge et.al. | 2405.15738 | link |
2024-05-24 | Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models | Yongsheng Yu et.al. | 2405.15687 | null |
2024-05-24 | M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models | Hongyu Wang et.al. | 2405.15638 | link |
2024-05-24 | DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception | Run Luo et.al. | 2405.15232 | link |
2024-05-24 | Shopping Queries Image Dataset (SQID): An Image-Enriched ESCI Dataset for Exploring Multimodal Learning in Product Search | Marie Al Ghossein et.al. | 2405.15190 | link |
Generative Weight Space Modeling
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2024-12-19 | DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation | Wang Zhao et.al. | 2412.15200 | null |
2024-12-18 | On the principle of linearized stability for quasilinear evolution equations in time-weighted spaces | Bogdan-Vasile Matioc et.al. | 2412.13940 | null |
2024-12-17 | On the Bäcklund transform and the stability of the line soliton of the KP-II equation on $\mathbb R^2$ | Lorenzo Pompili et.al. | 2412.12530 | null |
2024-12-13 | On the embedding of weighted Sobolev spaces with applications to a planar nonlinear Schrödinger equation | Antonio Azzolini et.al. | 2412.10067 | null |
2024-12-12 | Modified scattering for the cubic dispersion-managed NLS | Jason Murphy et.al. | 2412.09762 | null |
2024-12-12 | LoRACLR: Contrastive Adaptation for Customization of Diffusion Models | Enis Simsar et.al. | 2412.09622 | null |
2024-12-11 | Exploring superconformal Yang-Mills theories through matrix Bessel kernels | Zoltan Bajnok et.al. | 2412.08732 | null |
2024-12-09 | Bilinear singular integral operators with kernels in weighted spaces | Petr Honzík et.al. | 2412.07014 | null |
2024-12-04 | Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach | Lingchen Sun et.al. | 2412.03017 | link |
2024-11-21 | Strong localization blurs criticality of time series for spreading phenomena on networks | Juliane T. Moraes et.al. | 2412.01842 | null |
2024-12-02 | Geometric invariant theory and stretched Kostka quasi-polynomials | Marc Besson et.al. | 2412.01651 | null |
2024-11-29 | Origin-Destination Demand Prediction: An Urban Radiation and Attraction Perspective | Xuan Ma et.al. | 2412.00167 | null |
2024-11-29 | Rényi complexity in mean-field disordered systems | Nina Javerzat et.al. | 2411.19817 | null |
2024-11-28 | An Extensive Evaluation of Factual Consistency in Large Language Models for Data-to-Text Generation | Joy Mahapatra et.al. | 2411.19203 | null |
2024-11-27 | Task Arithmetic Through The Lens Of One-Shot Federated Learning | Zhixu Tao et.al. | 2411.18607 | null |
2024-11-25 | Spectral properties of Lévy Fokker–Planck equations | Hardy Chan et.al. | 2411.16424 | null |
2024-11-20 | Nonlinear orbital stability of stationary shock profiles for the Lax-Wendroff scheme | Jean-François Coulombel et.al. | 2411.13094 | null |
2024-11-26 | Enhancing generalization in high energy physics using white-box adversarial attacks | Franck Rothen et.al. | 2411.09296 | null |
2024-11-11 | Minimal nilpotent finite $W$-algebra and cuspidal module category of $\mathfrak{sp}_{2n}$ | Genqiang Liu et.al. | 2411.06768 | null |
2024-11-07 | Well-Posedness and Regularity of the Heat Equation with Robin Boundary Conditions in the Two-Dimensional Wedge | Marco Bravin et.al. | 2411.04651 | null |
2024-11-04 | SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF | Atoosa Chegini et.al. | 2411.01798 | null |
2024-12-06 | Modular Duality in Deep Learning | Jeremy Bernstein et.al. | 2410.21265 | null |
2024-10-26 | MarDini: Masked Autoregressive Diffusion for Video Generation at Scale | Haozhe Liu et.al. | 2410.20280 | null |
2024-10-25 | Four-parameter Mittag-Leffler functions and their associated coherent states | Dušan Popov et.al. | 2410.19462 | null |
2024-10-24 | Bielik 7B v0.1: A Polish Language Model – Development, Insights, and Evaluation | Krzysztof Ociepa et.al. | 2410.18565 | null |
2024-10-21 | Two dimensional delta Bose gas in a weighted space | Sudheesh Surendranath et.al. | 2410.16550 | null |
2024-10-21 | In Search of the Successful Interpolation: On the Role of Sharpness in CLIP Generalization | Alireza Abdollahpoorrostam et.al. | 2410.16476 | link |
2024-10-23 | Universal approximation results for neural networks with non-polynomial activation function over non-compact domains | Ariel Neufeld et.al. | 2410.14759 | null |
2024-10-23 | Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching | Jie Peng et.al. | 2410.14740 | null |
2024-10-16 | Differential Shape Optimization with Image Representation for Photonic Design | Zhaocheng Liu et.al. | 2410.13074 | null |
2024-10-15 | Scaling Laws for Multilingual Language Models | Yifei He et.al. | 2410.12883 | null |
2024-10-16 | AutoSimTTF: A Fully Automatic Pipeline for Electric Field Simulation and Treatment Planning of Tumor Treating Fields | Minmin Wang et.al. | 2410.12196 | null |
2024-10-15 | Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence | Shangbin Feng et.al. | 2410.11163 | null |
2024-10-14 | Deep Linear Probe Generators for Weight Space Learning | Jonathan Kahana et.al. | 2410.10811 | null |
2024-10-14 | Generating Model Parameters for Controlling: Parameter Diffusion for Controllable Multi-Task Recommendation | Chenglei Shen et.al. | 2410.10639 | null |
2024-10-14 | MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer | Minghao Zhu et.al. | 2410.10589 | link |
2024-10-15 | Regions of Level $\ell$ of Catalan/Semiorder-Type Arrangements | Yanru Chen et.al. | 2410.10198 | null |
2024-10-13 | A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning | Chen-Yu Liu et.al. | 2410.09846 | null |
2024-10-11 | Meta-Transfer Learning Empowered Temporal Graph Networks for Cross-City Real Estate Appraisal | Weijia Zhang et.al. | 2410.08947 | null |
2024-10-09 | Efficient Weight-Space Laplace-Gaussian Filtering and Smoothing for Sequential Deep Learning | Joanna Sliwa et.al. | 2410.06800 | null |
2024-10-09 | Revisiting Multi-Permutation Equivariance through the Lens of Irreducible Representations | Yonatan Sverdlov et.al. | 2410.06665 | link |
2024-10-08 | Weighted Embeddings for Low-Dimensional Graph Representation | Thomas Bläsius et.al. | 2410.06042 | null |
2024-10-05 | Computing ground states of Bose-Einstein condensation by normalized deep neural network | Weizhu Bao et.al. | 2410.05319 | link |
2024-10-07 | Hyper-Representations: Learning from Populations of Neural Networks | Konstantin Schürholt et.al. | 2410.05107 | link |
2024-10-06 | Integrable Modules of Map full Toroidal Lie Algebras | Pradeep Bisht et.al. | 2410.04495 | null |
2024-10-06 | Global well-posedness for the defocusing 3D quadratic NLS in the sharp critical space | Jia Shen et.al. | 2410.04337 | null |
2024-10-05 | Equivariant Neural Functional Networks for Transformers | Viet-Hoang Tran et.al. | 2410.04209 | null |
2024-10-15 | Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models | Theo Putterman et.al. | 2410.04207 | null |
2024-10-04 | Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks | Ann Huang et.al. | 2410.03972 | null |
2024-10-04 | Autoregressive Moving-average Attention Mechanism for Time Series Forecasting | Jiecheng Lu et.al. | 2410.03159 | link |
2024-10-02 | Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets | Yuandong Tian et.al. | 2410.01779 | link |
2024-10-01 | SynCOM: A tool for simulating coronal outflows | Valmir Moraes Filho et.al. | 2410.01004 | null |
2024-10-01 | On the prime ideals of higher secant varieties of Veronese embeddings of small degrees | Katsuhisa Furukawa et.al. | 2410.00652 | null |
2024-09-30 | Old Optimizer, New Norm: An Anthology | Jeremy Bernstein et.al. | 2409.20325 | null |
2024-09-27 | Effects of Peierls phases in open linear chains | Anselmo M. Marques et.al. | 2409.18780 | null |
2024-09-27 | Density of states in neural networks: an in-depth exploration of learning in parameter space | Margherita Mele et.al. | 2409.18683 | null |
2024-09-26 | The time periodic problem for the Navier-Stokes equations in exterior domains in weighted spaces | Reinhard Farwig et.al. | 2409.17590 | null |
2024-09-25 | Scalable Ensemble Diversification for OOD Generalization and Detection | Alexander Rubinstein et.al. | 2409.16797 | null |
2024-10-04 | Lessons Learned from a Unifying Empirical Study of Parameter-Efficient Transfer Learning (PETL) in Visual Recognition | Zheda Mai et.al. | 2409.16434 | link |
2024-09-24 | VascX Models: Model Ensembles for Retinal Vascular Analysis from Color Fundus Images | Jose Vargas Quiros et.al. | 2409.16016 | link |
2024-09-23 | Efficient Large-Scale Quantum Optimization via Counterdiabatic Ansatz | Jie Liu et.al. | 2409.15055 | null |
2024-09-24 | Weighted Approximation By Max-Product Generalized Exponential Sampling Series | Satyaranjan Pradhan et.al. | 2409.14884 | null |
2024-09-21 | Weakly magnetized black holes in Einstein-ModMax theory | Haryanto M. Siahaan et.al. | 2409.13967 | null |
2024-09-18 | Monomial Matrix Group Equivariant Neural Functional Networks | Hoang V. Tran et.al. | 2409.11697 | link |
2024-09-17 | Existence of an extremal function of Sobolev critical embedding with an $\alpha$-homogeneous weight | Petr Gurka et.al. | 2409.11193 | null |
2024-09-16 | Inferring stellar parameters and their uncertainties from high-resolution spectroscopy using invertible neural networks | Nils Candebat et.al. | 2409.10621 | null |
2024-09-13 | Non-unitary Wightman CFTs and non-unitary vertex algebras | Sebastiano Carpi et.al. | 2409.08454 | null |
2024-09-12 | Global well-posedness and scattering in weighted space for nonlinear Schrödinger equations below the Strauss exponent without gauge-invariance | Masaki Kawamoto et.al. | 2409.08432 | null |
2024-09-09 | Fast gradient-free optimization of excitations in variational quantum eigensolvers | Jonas Jäger et.al. | 2409.05939 | null |
2024-09-06 | SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields | Yuze Wang et.al. | 2409.04482 | null |
2024-09-04 | Federated Quantum-Train with Batched Parameter Generation | Chen-Yu Liu et.al. | 2409.02763 | null |
2024-09-16 | Regret Analysis for Randomized Gaussian Process Upper Confidence Bound | Shion Takeno et.al. | 2409.00979 | null |
2024-08-30 | Abstracted Gaussian Prototypes for One-Shot Concept Learning | Chelsea Zou et.al. | 2408.17251 | link |
2024-08-23 | Emergence of global receptive fields capturing multipartite quantum correlations | Oleg M. Sotnikov et.al. | 2408.13033 | null |
2024-08-22 | Action of $\mathfrak{osp}(1\vert 2n)$ on polynomials tensor $\mathbb{C}^{0\vert 2n}$ | Dwight Anderson Williams II et.al. | | |
2024-08-19 | Unimodal sequences and mixed false theta functions | Kevin Allen et.al. | 2408.09789 | null |
2024-08-16 | Onsager-Machlup functional for stochastic lattice dynamical systems driven by time-varying noise | Xinze Zhang et.al. | 2408.08465 | null |
2024-08-10 | Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks | Yoav Gelberg et.al. | 2408.05496 | null |
2024-08-09 | Quasilinear parabolic equations with superlinear nonlinearities in critical spaces | Bogdan-Vasile Matioc et.al. | 2408.05067 | null |
2024-08-08 | A framework for generalizing toric inequalities for holographic entanglement entropy | Ning Bao et.al. | 2408.04741 | null |
2024-08-07 | Counterfactuals and Uncertainty-Based Explainable Paradigm for the Automated Detection and Segmentation of Renal Cysts in Computed Tomography Images: A Multi-Center Study | Zohaib Salahuddin et.al. | 2408.03789 | null |
2024-08-05 | BOTS-LM: Training Large Language Models for Setswana | Nathan Brown et.al. | 2408.02239 | null |
2024-08-02 | Conditional LoRA Parameter Generation | Xiaolong Jin et.al. | 2408.01415 | null |
2024-08-01 | Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization | Róisín Luo et.al. | 2408.00923 | null |
2024-07-31 | Semantic Codebook Learning for Dynamic Recommendation Models | Zheqi Lv et.al. | 2408.00123 | null |
2024-07-29 | Tensor product weight modules over the affine-Virasoro algebra | Qiu-Fan Chen et.al. | 2407.19844 | null |
2024-07-24 | Generalized Hilbert operators acting on weighted spaces of holomorphic functions with sup-norms | María J. Beltrán-Meneu et.al. | 2407.17646 | null |
2024-07-24 | Generalized Ordinal Priority Approach for Multi-Attribute Decision-Making under Incomplete Preference Information | Renlong Wang et.al. | 2407.17099 | null |
2024-07-22 | WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation | Zirui Shao et.al. | 2407.15502 | link |
2024-07-18 | FSP-Laplace: Function-Space Priors for the Laplace Approximation in Bayesian Deep Learning | Tristan Cinquin et.al. | 2407.13711 | null |
2024-07-19 | Parameter Generation of Quantum Approximate Optimization Algorithm with Diffusion Model | Fanxu Meng et.al. | 2407.12242 | null |
2024-07-24 | Effect Heterogeneity with Earth Observation in Randomized Controlled Trials: Exploring the Role of Data, Model, and Evaluation Metric Choice | Connor T. Jerzak et.al. | 2407.11674 | link |
2024-07-15 | Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion | Yongyuan Liang et.al. | 2407.10973 | null |
2024-07-16 | The well-posedness of generalized nonlinear wave equations on the lattice graph | Bobo Hua et.al. | 2407.09815 | null |
2024-07-15 | Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization | Jinlong Li et.al. | 2407.08374 | null |
2024-07-09 | Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic | Ruochen Jin et.al. | 2407.07089 | link |
2024-07-04 | Recovering Initial States in Semilinear Parabolic Problems from Time-Averages | Lina Sophie Schmitz et.al. | 2407.03829 | null |
2024-07-01 | A quantum deformation of the ${\mathcal N}=2$ superconformal algebra | H. Awata et.al. | 2407.00901 | null |
2024-06-24 | WARP: On the Benefits of Weight Averaged Rewarded Policies | Alexandre Ramé et.al. | 2406.16768 | null |
2024-06-24 | Improving robustness to corruptions with multiplicative weight perturbations | Trung Trinh et.al. | 2406.16540 | link |
2024-06-21 | Determination of certain mod $p$ Galois representations using local constancy | Abhik Ganguli et.al. | 2406.15600 | null |
2024-06-21 | Elliptic analysis on collapsing gravitational instantons modelled using the Gibbons-Hawking ansatz | Willem Adriaan Salm et.al. | 2406.15008 | null |
2024-06-20 | MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization | Zhaozhe Hu et.al. | 2406.14259 | link |
2024-06-18 | From Instance Training to Instruction Learning: Task Adapters Generation from Instructions | Huanxuan Liao et.al. | 2406.12382 | link |
2024-06-17 | Kaniadakis entropy in extreme gravitational and cosmological environments: a review on the state-of-the-art and future prospects | Giuseppe Gaetano Luciano et.al. | 2406.11373 | null |
2024-06-16 | Analysis and approximation of elliptic problems with Uhlenbeck structure in convex polytopes | Tadele Mengesha et.al. | 2406.10762 | null |
2024-06-14 | Towards Scalable and Versatile Weight Space Learning | Konstantin Schürholt et.al. | 2406.09997 | link |
2024-06-13 | Interpreting the Weight Space of Customized Diffusion Models | Amil Dravid et.al. | 2406.09413 | link |
2024-06-12 | Diffusion Soup: Model Merging for Text-to-Image Diffusion Models | Benjamin Biggs et.al. | 2406.08431 | null |
2024-06-24 | Cartan monopoles | Andrei Smilga et.al. | 2406.06042 | null |
2024-06-08 | Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models | Minho Park et.al. | 2406.05432 | link |
2024-06-06 | Regularized KL-Divergence for Well-Defined Function-Space Variational Inference in Bayesian neural networks | Tristan Cinquin et.al. | 2406.04317 | null |
2024-06-06 | A characterization of $(μ,ν)$-dichotomies via admissibility | Lucas Backes et.al. | 2406.04126 | null |
2024-06-05 | Reproducing Kernel Thesis of Hankel Operators on Weighted Hardy Spaces | Ana Čolović et.al. | 2406.03106 | null |
2024-05-21 | Backpropogation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration | Wei Ji et.al. | 2406.01601 | null |
2024-05-29 | Thermodynamics of the most generalized form of Holographic Dark Energy and some particular cases with Corrected Entropies | Sanghati Saha et.al. | 2405.20783 | null |
2024-06-20 | The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof | Derek Lim et.al. | 2405.20231 | link |
2024-05-28 | Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography | Jie Liu et.al. | 2405.18356 | link |
2024-05-28 | $C^2M^3$: Cycle-Consistent Multi-Model Merging | Donato Crisostomi et.al. | 2405.17897 | link |
2024-05-27 | Smoothing effects and extinction in finite time for fractional fast diffusions on Riemannian manifolds | Elvise Berchio et.al. | 2405.17126 | null |
2024-05-31 | FedSheafHN: Personalized Federated Learning on Graph-structured Data | Wenfei Liang et.al. | 2405.16056 | null |
2024-05-27 | HyperInterval: Hypernetwork approach to training weight interval regions in continual learning | Patryk Krukowski et.al. | 2405.15444 | link |
2024-05-23 | Scalable Optimization in the Modular Norm | Tim Large et.al. | 2405.14813 | link |
2024-06-16 | A refined Weyl character formula for comodules on $\operatorname{GL}_{2,A}$ | Helge Øystein Maakestad et.al. | 2405.09210 | null |
2024-05-13 | Localizing Task Information for Improved Model Merging and Compression | Ke Wang et.al. | 2405.07813 | link |
2024-05-13 | $α$VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning | Rafael Kourdis et.al. | 2405.07769 | null |
2024-05-12 | Approximation by a new sequence of operators involving Laguerre polynomials | Kapil Kumar et.al. | 2405.07228 | null |
2024-05-06 | Swarm intelligence for full Stokes dynamic imaging reconstruction of interferometric data | Alejandro Mus et.al. | 2405.03330 | null |
2024-05-04 | Large Deviation Principles of Invariant Measures of Stochastic Reaction-Diffusion Lattice Systems | Bixiang Wang et.al. | 2405.02720 | null |
2024-05-03 | The Immersed Inextensible Interface Problem in 2D Stokes Flow | Eduardo García-Juárez et.al. | 2405.02446 | null |
2024-05-02 | Customizing Text-to-Image Models with a Single Image Pair | Maxwell Jones et.al. | 2405.01536 | null |
2024-04-25 | Robust Fine-tuning for Pre-trained 3D Point Cloud Models | Zhibo Zhang et.al. | 2404.16422 | null |
2024-04-23 | The Geometry of the Set of Equivalent Linear Neural Networks | Jonathan Richard Shewchuk et.al. | 2404.14855 | null |
2024-04-24 | Nonexistence of solutions to parabolic problems with a potential on weighted graphs | Dario D. Monticelli et.al. | 2404.12058 | null |
2024-04-17 | On the relaxation to equilibrium of a quantum oscillator interacting with a radiation field | Pierre-A. Vuillermot et.al. | 2404.11329 | null |
2024-04-15 | Higher-curvature gravity in AdS$_3$, holographic $c$-theorems and black hole microstates | Mariano Chernicoff et.al. | 2404.10128 | null |
2024-04-16 | Asymptotic-preserving approximations for stochastic incompressible viscous fluids and SPDEs on graph | Jianbo Cui et.al. | 2404.09168 | null |
2024-04-09 | Perspective on Physical Interpretations of Rényi Entropy in Statistical Mechanics | Misaki Ozawa et.al. | 2404.06436 | null |
2024-04-09 | A gluing construction of singular solutions for a fully non-linear equation in conformal geometry | María Fernanda Espinal et.al. | 2404.05965 | null |
2024-04-05 | Dissipative Euler flows originating from circular vortex filaments | Francisco Gancedo et.al. | 2404.04250 | null |
2024-04-05 | Macdonald characters from a new formula for Macdonald polynomials | Houcine Ben Dali et.al. | 2404.03904 | null |
2024-04-04 | Fundamental inequalities for the iterated Fourier-cosine convolution with Gaussian weight and its application | Nguyen Thi Hong Phuong et.al. | 2404.03609 | null |
2024-03-29 | Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World | Bowen Lei et.al. | 2403.20047 | link |
2024-03-28 | Model Stock: All we need is just a few fine-tuned models | Dong-Hwan Jang et.al. | 2403.19522 | link |
2024-03-26 | A location Invariant Statistic-Based Consistent Estimation Method for Three-Parameter Generalized Exponential Distribution | Kiran Prajapat et.al. | 2403.17609 | null |
2024-06-03 | FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis | Santosh Sanjeev et.al. | 2403.13341 | link |
2024-06-18 | Learning Useful Representations of Recurrent Neural Network Weight Matrices | Vincent Herrmann et.al. | 2403.11998 | link |
2024-03-16 | Function-space Parameterization of Neural Networks for Sequential Learning | Aidan Scannell et.al. | 2403.10929 | link |
2024-03-14 | Imprints of Barrow-Tsallis Cosmology in Primordial Gravitational Waves | Petr Jizba et.al. | 2403.09797 | null |
2024-03-14 | Eigenvariety for partially classical Hilbert modular forms | Mladen Dimitrov et.al. | 2403.09784 | null |
2024-03-12 | The solenoidal Heisenberg Virasoro algebra and its simple weight modules | Boujemaa Agrebaoui et.al. | 2403.07381 | null |
2024-03-10 | FrameQuant: Flexible Low-Bit Quantization for Transformers | Harshavardhan Adepu et.al. | 2403.06082 | link |
2024-03-06 | The solenoidal Virasoro algebra and its simple weight modules | Boujemaa Agrebaoui et.al. | 2403.03753 | null |
2024-03-05 | Tensor Decomposition-based Time Varying Channel Estimation for mmWave MIMO-OFDM Systems | Ruizhe Wang et.al. | 2403.02942 | null |
2024-03-05 | Neural Redshift: Random Networks are not Random Functions | Damien Teney et.al. | 2403.02241 | null |
2024-03-04 | Tiny fluctuations of the averaging process around its degenerate steady state | Federico Sau et.al. | 2403.02032 | null |
2024-03-15 | Training-Free Pretrained Model Merging | Zhengqi Xu et.al. | 2403.01753 | link |
2024-04-22 | HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances | Supreeth Narasimhaswamy et.al. | 2403.01693 | null |
2024-03-13 | TOOLVERIFIER: Generalization to New Tools via Self-Verification | Dheeraj Mekala et.al. | 2402.14158 | link |
2024-02-21 | Computing Tangent Spaces to Eigenvarieties | James Rawson et.al. | 2402.13799 | null |
2024-05-28 | Neural Network Parameter Diffusion | Kai Wang et.al. | 2402.13144 | link |
2024-02-19 | Exponential attractors for a nonlocal delayed reaction-diffusion equation on an unbounded domain | Wenjie Hu et.al. | 2402.11856 | null |
2024-02-18 | Discrete Neural Algorithmic Reasoning | Gleb Rodionov et.al. | 2402.11628 | link |
2024-02-17 | Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes | Jeremiah Hauth et.al. | 2402.11179 | null |
2024-06-06 | Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning | Tuc Nguyen et.al. | 2402.10639 | null |
2024-02-14 | TAI-GAN: A Temporally and Anatomically Informed Generative Adversarial Network for early-to-late frame conversion in dynamic cardiac PET inter-frame motion correction | Xueqi Guo et.al. | 2402.09567 | null |
2024-02-14 | The cohomology of $p$-adic Deligne-Lusztig schemes of Coxeter type | Alexander B. Ivanov et.al. | 2402.09017 | null |
2024-02-09 | The Asymptotic Structure of Cosmological Integrals | Paolo Benincasa et.al. | 2402.06558 | null |
2024-02-07 | Universal Neural Functionals | Allan Zhou et.al. | 2402.05232 | link |
2024-02-06 | Maximal regularity and optimal control for a non-local Cahn-Hilliard tumour growth model | Matteo Fornoni et.al. | 2402.04204 | null |
2024-02-06 | Improved Generalization of Weight Space Networks via Augmentations | Aviv Shamsian et.al. | 2402.04081 | link |
2024-02-02 | Training-time Neuron Alignment through Permutation Subspace for Improving Linear Mode Connectivity and Model Fusion | Zexi Li et.al. | 2402.01342 | null |
2024-02-01 | Understanding Neural Network Systems for Image Analysis using Vector Spaces and Inverse Maps | Rebecca Pattichis et.al. | 2402.00261 | link |
2024-01-26 | Do deep neural networks utilize the weight space efficiently? | Onur Can Koyun et.al. | 2401.16438 | null |
2024-01-22 | On strong growth conditions for weighted spaces of entire functions | Gerhard Schindl et.al. | 2401.14330 | null |
2024-01-24 | Task structure and nonlinearity jointly determine learned representational geometry | Matteo Alleman et.al. | 2401.13558 | null |
2024-01-25 | Sparse Domination of Singular Bilinear Forms on Non-Homogeneous spaces | Paco Villarroya et.al. | 2401.13130 | null |
2024-01-22 | WARM: On the Benefits of Weight Averaged Reward Models | Alexandre Ramé et.al. | 2401.12187 | null |
2024-01-17 | Cesàro operators associated with Borel measures acting on weighted spaces of holomorphic functions with sup-norm | Maria José Beltrán Meneu et.al. | 2401.09406 | null |
2024-01-15 | Singular fractal dimension at periodicity cascades in parameters spaces | Carlos E. P. Abreu et.al. | 2401.07648 | null |
2024-01-17 | Computing Fringe Presentations of Multigraded Persistence Modules | Fabian Lenzen et.al. | 2401.06008 | null |
2024-01-10 | Grimoire is All You Need for Enhancing Large Language Models | Ding Chen et.al. | 2401.03385 | link |
2024-03-26 | Artificial Intelligence for Operations Research: Revolutionizing the Operations Research Process | Zhenan Fan et.al. | 2401.03244 | null |
2023-12-31 | A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry | Tim Z. Xiao et.al. | 2401.00611 | link |
2023-12-28 | Fractional non-homogeneous counting process | Nick Laskin et.al. | 2312.17389 | null |
2023-12-28 | Some unimodal sequences of Kronecker coefficients | Alimzhan Amanov et.al. | 2312.17054 | null |
2023-12-24 | The Vlasov-Maxwell-Boltzmann/Landau system with polynomial perturbation near Maxwellian | Chuqi Cao et.al. | 2312.15510 | null |
2023-12-22 | Emage: Non-Autoregressive Text-to-Image Generation | Zhangyin Feng et.al. | 2312.14988 | null |
2023-12-21 | Hypercyclic shifts on lattice graphs | Anton Baranov et.al. | 2312.13934 | null |
2023-12-21 | Scattering for 2d semi-relativistic Hartree equations with short range potential | Changhun Yang et.al. | 2312.13606 | null |
2023-12-21 | Entropic Inflation in Presence of Scalar Field | Sergei D. Odintsov et.al. | 2312.13587 | null |
2023-12-30 | Time is Encoded in the Weights of Finetuned Language Models | Kai Nylund et.al. | 2312.13401 | link |
2023-12-14 | Efficient momentum space approach to superconductivity in quasiperiodic systems | Mao Yoshii et.al. | 2312.09124 | null |
2023-12-13 | Best one-sided algebraic approximation by average modulus | Raheam A. Al-Saphory et.al. | 2312.08407 | null |
2023-12-19 | Well-Posedness of Quasilinear Parabolic Equations in Time-Weighted Spaces | Bogdan Matioc et.al. | 2312.07974 | null |
2023-12-12 | Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models | Arnav Chavan et.al. | 2312.07046 | link |
2023-12-11 | Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks | MohammadReza Davari et.al. | 2312.06795 | null |
2023-12-08 | Stoichiometry preservation and generalization of Bilger mixture fraction for non-premixed combustion with differential molecular diffusion | Haifeng Wang et.al. | 2312.05204 | null |
2023-12-01 | New polyconvolution product for Fourier-cosine and Laplace integral operators and their applications | Trinh Tuan et.al. | 2312.00764 | null |
2023-11-30 | Modelling Einstein cluster using Einasto profile | Ritwik Acharyya et.al. | 2311.18622 | null |
2023-11-27 | Extraction of the microscopic properties of quasi-particles using deep neural networks | Olga Soloveva et.al. | 2311.15984 | null |
2024-01-24 | Deep Latent Force Models: ODE-based Process Convolutions for Bayesian Deep Learning | Thomas Baldwin-McDonald et.al. | 2311.14828 | null |
Data Distillation
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-10-25 | FLiP: Privacy-Preserving Federated Learning based on the Principle of Least Privileg | ShiMao Xu et.al. | 2410.19548 | null |
2024-10-25 | SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models | Jahyun Koo et.al. | 2410.19503 | null |
2024-10-24 | AlignCap: Aligning Speech Emotion Captioning to Human Preferences | Ziqi Liang et.al. | 2410.19134 | null |
2024-10-24 | High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws | M. Emrullah Ildiz et.al. | 2410.18837 | null |
2024-10-24 | Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data | Anup Shirgaonkar et.al. | 2410.18588 | null |
2024-10-24 | SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning | Shivam Adarsh et.al. | 2410.18574 | link |
2024-10-23 | ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams | Srija Anand et.al. | 2410.17901 | null |
2024-10-23 | Towards Active Participant-Centric Vertical Federated Learning: Some Representations May Be All You Need | Jon Irureta et.al. | 2410.17648 | null |
2024-10-23 | Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation | Muquan Li et.al. | 2410.17606 | link |
2024-10-23 | Physics-driven AI for Channel Estimation in Cellular Network | Xiaoqian Qi et.al. | 2410.17525 | null |
2024-10-22 | MiniPLM: Knowledge Distillation for Pre-Training Language Models | Yuxian Gu et.al. | 2410.17215 | link |
2024-10-22 | Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios | Kai Wang et.al. | 2410.17193 | link |
2024-10-22 | CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare | Nicholas I-Hsien Kuo et.al. | 2410.16872 | null |
2024-10-22 | AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Yongjian Wu et.al. | 2410.16820 | link |
2024-10-22 | SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation | Jing-Jing Li et.al. | 2410.16665 | null |
2024-10-21 | Pre-training Distillation for Large Language Models: A Design Space Exploration | Hao Peng et.al. | 2410.16215 | null |
2024-10-18 | Interpreting Microbiome Relative Abundance Data Using Symbolic Regression | Swagatam Haldar et.al. | 2410.16109 | link |
2024-10-21 | Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation? | Lingao Xiao et.al. | 2410.15919 | link |
2024-10-21 | Model Mimic Attack: Knowledge Distillation for Provably Transferable Adversarial Examples | Kirill Lukyanov et.al. | 2410.15889 | null |
2024-10-20 | Hybrid Memory Replay: Blending Real and Distilled Data for Class Incremental Learning | Jiangtao Kong et.al. | 2410.15372 | null |
2024-10-20 | GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning | Haiwen Diao et.al. | 2410.15266 | link |
2024-10-19 | LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound | Xuechen Guo et.al. | 2410.15074 | null |
2024-10-19 | Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS | Tuan Nam Nguyen et.al. | 2410.14997 | null |
2024-10-17 | CAKD: A Correlation-Aware Knowledge Distillation Framework Based on Decoupling Kullback-Leibler Divergence | Zao Zhang et.al. | 2410.14741 | null |
2024-10-18 | Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Shuai Zhao et.al. | 2410.14425 | link |
2024-10-18 | Preview-based Category Contrastive Learning for Knowledge Distillation | Muhe Ding et.al. | 2410.14143 | null |
2024-10-17 | Leveraging Fine-Tuned Language Models for Efficient and Accurate Smart Contract Auditing | Zhiyuan Wei et.al. | 2410.13918 | link |
2024-10-17 | GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning | Guibin Zhang et.al. | 2410.13761 | link |
2024-10-17 | An Active Learning Framework for Inclusive Generation by Large Language Models | Sabit Hassan et.al. | 2410.13641 | null |
2024-10-18 | Towards Satellite Non-IID Imagery: A Spectral Clustering-Assisted Federated Learning Approach | Luyao Zou et.al. | 2410.13602 | null |
2024-10-17 | Enhancing Dataset Distillation via Label Inconsistency Elimination and Learning Pattern Refinement | Chuhao Zhou et.al. | 2410.13311 | link |
2024-10-18 | Cyber Attacks Prevention Towards Prosumer-based EV Charging Stations: An Edge-assisted Federated Prototype Knowledge Distillation Approach | Luyao Zou et.al. | 2410.13260 | null |
2024-10-16 | TAS: Distilling Arbitrary Teacher and Student via a Hybrid Assistant | Guopeng Li et.al. | 2410.12342 | null |
2024-10-16 | Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm | Guanming Huang et.al. | 2410.12259 | null |
2024-10-16 | TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration | Yiwei Guo et.al. | 2410.12183 | link |
2024-10-17 | SAM-Guided Masked Token Prediction for 3D Scene Understanding | Zhimin Chen et.al. | 2410.12158 | null |
2024-10-15 | MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router | Yanyue Xie et.al. | 2410.12013 | null |
2024-10-15 | Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation | Andong Lu et.al. | 2410.11586 | link |
2024-10-15 | Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL | Qihuang Zhong et.al. | 2410.11371 | null |
2024-10-15 | Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling | Wenda Xu et.al. | 2410.11325 | null |
2024-10-14 | BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI | Shaohao Rui et.al. | 2410.10604 | null |
2024-10-14 | ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2410.10554 | link |
2024-10-14 | Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation | Siru Ouyang et.al. | 2410.10141 | null |
2024-10-14 | REHRSeg: Unleashing the Power of Self-Supervised Super-Resolution for Resource-Efficient 3D MRI Segmentation | Zhiyun Song et.al. | 2410.10097 | null |
2024-10-15 | Self-Data Distillation for Recovering Quality in Pruned Large Language Models | Vithursan Thangarasa et.al. | 2410.09982 | null |
2024-10-13 | Generalized Group Data Attribution | Dan Ley et.al. | 2410.09940 | null |
2024-10-12 | Distilling Invariant Representations with Dual Augmentation | Nikolaos Giakoumoglou et.al. | 2410.09474 | null |
2024-10-12 | Declarative Knowledge Distillation from Large Language Models for Visual Question Answering Datasets | Thomas Eiter et.al. | 2410.09428 | link |
2024-10-15 | Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI | Muhammet Anil Yagiz et.al. | 2410.09043 | null |
2024-10-11 | Mentor-KD: Making Small Language Models Better Multi-step Reasoners | Hojae Lee et.al. | 2410.09037 | link |
2024-10-11 | Contrastive Knowledge Distillation for Robust Multimodal Sentiment Analysis | Zhongyi Sang et.al. | 2410.08692 | null |
2024-10-11 | DistDD: Distributed Data Distillation Aggregation through Gradient Matching | Peiran Wang et.al. | 2410.08665 | null |
2024-10-11 | GAI-Enabled Explainable Personalized Federated Semi-Supervised Learning | Yubo Peng et.al. | 2410.08634 | null |
2024-10-11 | Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both | Abhijnan Nath et.al. | 2410.08458 | null |
2024-10-10 | What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias | Aida Mohammadshahi et.al. | 2410.08407 | null |
2024-10-10 | A Lightweight Target-Driven Network of Stereo Matching for Inland Waterways | Jing Su et.al. | 2410.07915 | null |
2024-10-10 | SNN-PAR: Energy Efficient Pedestrian Attribute Recognition via Spiking Neural Networks | Haiyang Wang et.al. | 2410.07857 | link |
2024-10-12 | Relational Diffusion Distillation for Efficient Image Generation | Weilun Feng et.al. | 2410.07679 | link |
2024-10-10 | Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching | Ruonan Yu et.al. | 2410.07579 | null |
2024-10-09 | Unlocking Real-Time Fluorescence Lifetime Imaging: Multi-Pixel Parallelism for FPGA-Accelerated Processing | Ismail Erbas et.al. | 2410.07364 | null |
2024-10-09 | S2HPruner: Soft-to-Hard Distillation Bridges the Discretization Gap in Pruning | Weihao Lin et.al. | 2410.07046 | null |
2024-10-09 | Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation | Runze Chen et.al. | 2410.06982 | null |
2024-10-09 | Efficient and Robust Knowledge Distillation from A Stronger Teacher Based on Correlation Matching | Wenqi Niu et.al. | 2410.06561 | null |
2024-10-10 | KnowledgeSG: Privacy-Preserving Synthetic Text Generation with Knowledge Distillation from Server | Wenhao Wang et.al. | 2410.05725 | link |
2024-10-07 | Progressive distillation induces an implicit curriculum | Abhishek Panigrahi et.al. | 2410.05464 | null |
2024-10-07 | ReasoningRank: Teaching Student Models to Rank through Reasoning-Based Knowledge Distillation | Yuelyu Ji et.al. | 2410.05168 | null |
2024-10-07 | MetaDD: Boosting Dataset Distillation with Neural Network Architecture-Invariant Generalization | Yunlong Zhao et.al. | 2410.05103 | null |
2024-10-06 | CAPEEN: Image Captioning with Early Exits and Knowledge Distillation | Divya Jyoti Bajpai et.al. | 2410.04433 | link |
2024-10-06 | DAdEE: Unsupervised Domain Adaptation in Early Exit PLMs | Divya Jyoti Bajpai et.al. | 2410.04424 | link |
2024-10-10 | Towards Understanding and Enhancing Security of Proof-of-Training for DNN Model Ownership Verification | Yijia Chang et.al. | 2410.04397 | null |
2024-10-10 | Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution | Jianze Li et.al. | 2410.04224 | link |
2024-10-05 | Accelerating Diffusion Models with One-to-Many Knowledge Distillation | Linfeng Zhang et.al. | 2410.04191 | null |
2024-10-05 | DiDOTS: Knowledge Distillation from Large-Language-Models for Dementia Obfuscation in Transcribed Speech | Dominika Woszczyk et.al. | 2410.04188 | null |
2024-10-05 | Gap Preserving Distillation by Building Bidirectional Mappings with A Dynamic Teacher | Yong Guo et.al. | 2410.04140 | null |
2024-10-05 | WiDistill: Distilling Large-scale Wi-Fi Datasets with Trajectory Matching | Tiantian Wang et.al. | 2410.04073 | link |
2024-10-04 | Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models | Zhuochun Li et.al. | 2410.03663 | null |
2024-10-04 | DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models | Sungnyun Kim et.al. | 2410.03061 | null |
2024-10-03 | Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-Training of Deep Networks | Siddharth Joshi et.al. | 2410.02116 | null |
2024-10-02 | PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation | Mike Ranzinger et.al. | 2410.01680 | null |
2024-10-04 | HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models | Seanie Lee et.al. | 2410.01524 | link |
2024-10-02 | Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks | Edan Kinderman et.al. | 2410.01483 | link |
2024-10-02 | PairDistill: Pairwise Relevance Distillation for Dense Retrieval | Chao-Wei Huang et.al. | 2410.01383 | link |
2024-10-02 | “No Matter What You Do!”: Mitigating Backdoor Attacks in Graph Neural Networks | Jiale Zhang et.al. | 2410.01272 | link |
2024-10-01 | Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging | Ismail Erbas et.al. | 2410.00948 | null |
2024-10-01 | Local-to-Global Self-Supervised Representation Learning for Diabetic Retinopathy Grading | Mostafa Hajighasemloua et.al. | 2410.00779 | null |
2024-10-01 | Efficient Technical Term Translation: A Knowledge Distillation Approach for Parenthetical Terminology Translation | Jiyoon Myung et.al. | 2410.00683 | null |
2024-10-01 | AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation | Ziyang Luo et.al. | 2410.00558 | link |
2024-10-01 | Self-Updatable Large Language Models with Parameter Integration | Yu Wang et.al. | 2410.00487 | null |
2024-10-01 | Advancing Medical Radiograph Representation Learning: A Hybrid Pre-training Paradigm with Multilevel Semantic Granularity | Hanqi Jiang et.al. | 2410.00448 | null |
2024-09-30 | Collaborative Knowledge Distillation via a Learning-by-Education Node Community | Anestis Kaimakamidis et.al. | 2410.00074 | null |
2024-09-30 | Enhancing Romanian Offensive Language Detection through Knowledge Distillation, Multi-Task Learning, and Data Augmentation | Vlad-Cristian Matei et.al. | 2409.20498 | null |
2024-10-02 | Linear Projections of Teacher Embeddings for Few-Class Distillation | Noel Loo et.al. | 2409.20449 | null |
2024-09-30 | Classroom-Inspired Multi-Mentor Distillation with Adaptive Learning Strategies | Shalini Sarode et.al. | 2409.20237 | null |
2024-10-01 | HYDRA-FL: Hybrid Knowledge Distillation for Robust and Accurate Federated Learning | Momin Ahmad Khan et.al. | 2409.19912 | null |
2024-09-29 | Tailored Federated Learning: Leveraging Direction Regulation & Knowledge Distillation | Huidong Tang et.al. | 2409.19741 | null |
2024-09-29 | InfantCryNet: A Data-driven Framework for Intelligent Analysis of Infant Cries | Mengze Hong et.al. | 2409.19689 | null |
2024-09-28 | Mind the Gap: Promoting Missing Modality Brain Tumor Segmentation with Alignment | Tianyi Liu et.al. | 2409.19366 | null |
2024-09-27 | Semi-Supervised Bone Marrow Lesion Detection from Knee MRI Segmentation Using Mask Inpainting Models | Shihua Qin et.al. | 2409.19185 | null |
2024-09-27 | Multi-modal Cross-domain Self-supervised Pre-training for fMRI and EEG Fusion | Xinxu Wei et.al. | 2409.19130 | null |
2024-10-01 | Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models | Yize Li et.al. | 2409.19128 | link |
2024-09-27 | MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation | Junyou Zhu et.al. | 2409.18800 | null |
2024-09-27 | Student-Oriented Teacher Knowledge Refinement for Knowledge Distillation | Chaomin Shen et.al. | 2409.18785 | null |
2024-09-27 | Harmonizing knowledge Transfer in Neural Network with Unified Distillation | Yaomin Huang et.al. | 2409.18565 | null |
2024-09-27 | Towards Diverse Device Heterogeneous Federated Learning via Task Arithmetic Knowledge Integration | Mahdi Morafah et.al. | 2409.18461 | link |
2024-10-01 | Backdoor Attacks for LLMs with Weak-To-Strong Knowledge Distillation | Shuai Zhao et.al. | 2409.17946 | null |
2024-09-26 | Kendall’s $τ$ Coefficient for Logits Distillation | Yuchen Guan et.al. | 2409.17823 | null |
2024-09-26 | Diversity-Driven Synthesis: Enhancing Dataset Distillation through Directed Weight Adjustment | Jiawei Du et.al. | 2409.17612 | link |
2024-09-26 | Dataset Distillation-based Hybrid Federated Learning on Non-IID Data | Xiufang Shi et.al. | 2409.17517 | null |
2024-09-26 | Shape-intensity knowledge distillation for robust medical image segmentation | Wenhui Dong et.al. | 2409.17503 | link |
2024-09-25 | MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events | Xiaoyu Yang et.al. | 2409.17010 | null |
2024-09-25 | Adverse Weather Optical Flow: Cumulative Homogeneous-Heterogeneous Adaptation | Hanyu Zhou et.al. | 2409.17001 | null |
2024-09-25 | A Novel Framework for Analyzing Structural Transformation in Data-Constrained Economies Using Bayesian Modeling and Machine Learning | Ronald Katende et.al. | 2409.16738 | null |
2024-09-25 | SelectiveKD: A semi-supervised framework for cancer detection in DBT through Knowledge Distillation and Pseudo-labeling | Laurent Dillard et.al. | 2409.16581 | null |
2024-09-24 | AIM 2024 Challenge on UHD Blind Photo Quality Assessment | Vlad Hosu et.al. | 2409.16271 | null |
2024-09-24 | Label-Augmented Dataset Distillation | Seoungyoon Kang et.al. | 2409.16239 | null |
2024-09-25 | Privacy Evaluation Benchmarks for NLP Models | Wei Huang et.al. | 2409.15868 | link |
2024-09-24 | Twin Network Augmentation: A Novel Training Strategy for Improved Spiking Neural Networks and Efficient Weight Quantization | Lucas Deckers et.al. | 2409.15849 | null |
2024-09-23 | TS-TCD: Triplet-Level Cross-Modal Distillation for Time-Series Forecasting Using Large Language Models | Pengfei Wang et.al. | 2409.14978 | null |
2024-09-23 | DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models | Sangyeon Cho et.al. | 2409.14904 | link |
2024-09-23 | Pre-trained Language Model and Knowledge Distillation for Lightweight Sequential Recommendation | Li Li et.al. | 2409.14810 | null |
2024-09-23 | An Adverse Weather-Immune Scheme with Unfolded Regularization and Foundation Model Knowledge Distillation for Street Scene Understanding | Wei-Bin Kou et.al. | 2409.14737 | null |
2024-09-22 | EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models | Hossein Rajabzadeh et.al. | 2409.14595 | null |
2024-09-22 | Prior Knowledge Distillation Network for Face Super-Resolution | Qiu Yang et.al. | 2409.14385 | null |
2024-09-25 | DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation | Xuewen Liu et.al. | 2409.14307 | null |
2024-09-18 | Applications of Knowledge Distillation in Remote Sensing: A Survey | Yassine Himeur et.al. | 2409.12111 | null |
2024-09-18 | Data Efficient Acoustic Scene Classification using Teacher-Informed Confusing Class Instruction | Jin Jie Sean Yeo et.al. | 2409.11964 | null |
2024-09-18 | Distillation-free Scaling of Large SSMs for Images and Videos | Hamid Suleman et.al. | 2409.11867 | null |
2024-09-18 | EFCM: Efficient Fine-tuning on Compressed Models for deployment of large models in medical image analysis | Shaojie Li et.al. | 2409.11817 | null |
2024-09-18 | Efficient Low-Resolution Face Recognition via Bridge Distillation | Shiming Ge et.al. | 2409.11786 | null |
2024-09-18 | RUIE: Retrieval-based Unified Information Extraction using Large Language Model | Xincheng Liao et.al. | 2409.11673 | null |
2024-09-17 | Time-Series Forecasting, Knowledge Distillation, and Refinement within a Multimodal PDE Foundation Model | Derek Jollie et.al. | 2409.11609 | link |
2024-09-17 | Unleashing the Potential of Mamba: Boosting a LiDAR 3D Sparse Detector by Using Cross-Model Knowledge Distillation | Rui Yu et.al. | 2409.11018 | null |
2024-09-17 | Single-stage TTS with Masked Audio Token Modeling and Semantic Knowledge Distillation | Gerard I. Gállego et.al. | 2409.11003 | null |
2024-09-16 | Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning | Amin Karimi Monsefi et.al. | 2409.10362 | null |
2024-09-16 | Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference | Huy-Dung Nguyen et.al. | 2409.10095 | null |
2024-09-14 | Effective Pre-Training of Audio Transformers for Sound Event Detection | Florian Schmid et.al. | 2409.09546 | link |
2024-09-14 | Integrated Multi-Level Knowledge Distillation for Enhanced Speaker Verification | Wenhao Yang et.al. | 2409.09389 | null |
2024-09-14 | Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility | Xiaoyu Liu et.al. | 2409.09357 | null |
2024-09-13 | Exploring System-Heterogeneous Federated Learning with Dynamic Model Selection | Dixi Yao et.al. | 2409.08858 | null |
2024-09-13 | AWF: Adaptive Weight Fusion for Enhanced Class Incremental Semantic Segmentation | Zechao Sun et.al. | 2409.08516 | null |
2024-09-12 | DiReDi: Distillation and Reverse Distillation for AIoT Applications | Chen Sun et.al. | 2409.08308 | null |
2024-09-12 | Ruri: Japanese General Text Embeddings | Hayato Tsukagoshi et.al. | 2409.07737 | link |
2024-09-12 | Learn from Balance: Rectifying Knowledge Transfer for Long-Tailed Scenarios | Xinlei Huang et.al. | 2409.07694 | null |
2024-09-11 | DS-ViT: Dual-Stream Vision Transformer for Cross-Task Distillation in Alzheimer’s Early Diagnosis | Ke Chen et.al. | 2409.07584 | null |
2024-09-11 | EchoDFKD: Data-Free Knowledge Distillation for Cardiac Ultrasound Segmentation using Synthetic Data | Grégoire Petit et.al. | 2409.07566 | null |
2024-09-11 | Enhancing CTC-Based Visual Speech Recognition | Hendrik Laux et.al. | 2409.07210 | null |
2024-09-11 | A Continual and Incremental Learning Approach for TinyML On-device Training Using Dataset Distillation and Model Size Adaption | Marcus Rüb et.al. | 2409.07114 | null |
2024-09-16 | Privacy-Preserving Federated Learning with Consistency via Knowledge Distillation Using Conditional Generator | Kangyang Luo et.al. | 2409.06955 | null |
2024-09-10 | Applied Federated Model Personalisation in the Industrial Domain: A Comparative Study | Ilias Siniosoglou et.al. | 2409.06904 | null |
2024-09-10 | EasyST: A Simple Framework for Spatio-Temporal Prediction | Jiabin Tang et.al. | 2409.06748 | link |
2024-09-10 | Knowledge Distillation via Query Selection for Detection Transformer | Yi Liu et.al. | 2409.06443 | null |
2024-09-10 | Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition | Junzheng Zhang et.al. | 2409.06371 | null |
2024-09-09 | Joint Input and Output Coordination for Class-Incremental Learning | Shuai Wang et.al. | 2409.05620 | null |
2024-09-09 | LEROjD: Lidar Extended Radar-Only Object Detection | Patrick Palmer et.al. | 2409.05564 | link |
2024-09-09 | Look One and More: Distilling Hybrid Order Relational Knowledge for Cross-Resolution Image Recognition | Shiming Ge et.al. | 2409.05384 | null |
2024-09-09 | FedBrain-Distill: Communication-Efficient Federated Brain Tumor Classification Using Ensemble Knowledge Distillation on Non-IID Data | Rasoul Jafari Gohari et.al. | 2409.05359 | link |
2024-09-07 | LoCa: Logit Calibration for Knowledge Distillation | Runming Yang et.al. | 2409.04778 | null |
2024-09-06 | SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields | Yuze Wang et.al. | 2409.04482 | null |
2024-09-05 | Experimentation in Content Moderation using RWKV | Umut Yildirim et.al. | 2409.03939 | null |
2024-09-05 | Data-Efficient Generation for Dataset Distillation | Zhe Li et.al. | 2409.03929 | null |
2024-09-05 | DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture | Qianlong Xiang et.al. | 2409.03550 | null |
2024-09-05 | Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration | Pei Wang et.al. | 2409.03455 | null |
2024-09-05 | Efficient Image Compression Using Advanced State Space Models | Bouzid Arezki et.al. | 2409.02743 | null |
2024-09-04 | CLDA: Collaborative Learning for Enhanced Unsupervised Domain Adaptation | Minhee Cho et.al. | 2409.02699 | null |
2024-09-04 | Low-Resolution Object Recognition with Cross-Resolution Relational Contrastive Distillation | Kangkai Zhang et.al. | 2409.02555 | null |
2024-09-04 | A design of magnetic tunnel junctions for the deployment of neuromorphic hardware for edge computing | Davi Rodrigues et.al. | 2409.02528 | null |
2024-09-04 | Non-target Divergence Hypothesis: Toward Understanding Domain Gaps in Cross-Modal Knowledge Distillation | Yilong Chen et.al. | 2409.02438 | null |
2024-09-03 | Low-Resolution Face Recognition via Adaptable Instance-Relation Distillation | Ruixin Shi et.al. | 2409.02049 | null |
2024-09-03 | Efficient Point Cloud Classification via Offline Distillation Framework and Negative-Weight Self-Distillation Technique | Qiang Zheng et.al. | 2409.02020 | null |
2024-09-03 | Contemporary Model Compression on Large Language Models Inference | Dong Liu et.al. | 2409.01990 | null |
2024-09-05 | Adaptive Explicit Knowledge Transfer for Knowledge Distillation | Hyungkeun Park et.al. | 2409.01679 | null |
2024-09-03 | Improving Apple Object Detection with Occlusion-Enhanced Distillation | Liang Geng et.al. | 2409.01573 | null |
2024-09-02 | Dataset Distillation from First Principles: Integrating Core Information Extraction and Purposeful Learning | Vyacheslav Kungurtsev et.al. | 2409.01410 | null |
2024-09-02 | MobileIQA: Exploiting Mobile-level Diverse Opinion Network For No-Reference Image Quality Assessment Using Knowledge Distillation | Zewen Chen et.al. | 2409.01212 | link |
2024-09-04 | Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning | Jinglin Liang et.al. | 2409.01128 | link |
2024-09-02 | Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment | Aditya Bansal et.al. | 2409.00880 | null |
2024-09-01 | LanguaShrink: Reducing Token Overhead with Psycholinguistics | Xuechen Liang et.al. | 2409.00855 | null |
2024-08-30 | How Knowledge Distillation Mitigates the Synthetic Gap in Fair Face Recognition | Pedro C. Neto et.al. | 2408.17399 | link |
2024-08-30 | HiTSR: A Hierarchical Transformer for Reference-based Super-Resolution | Masoomeh Aslahishahri et.al. | 2408.16959 | link |
2024-08-29 | VLM-KD: Knowledge Distillation from VLM for Long-Tail Visual Recognition | Zaiwei Zhang et.al. | 2408.16930 | null |
2024-08-29 | Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling | Hritik Bansal et.al. | 2408.16737 | null |
2024-08-29 | MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition | Eduarda Caldeira et.al. | 2408.16563 | link |
2024-08-29 | UDD: Dataset Distillation via Mining Underutilized Regions | Shiguang Wang et.al. | 2408.16268 | null |
2024-08-29 | Neural Spectral Decomposition for Dataset Distillation | Shaolei Yang et.al. | 2408.16236 | null |
2024-08-28 | EMP: Enhance Memory in Data Pruning | Jinying Xiao et.al. | 2408.16031 | null |
2024-08-28 | LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Fangxun Shu et.al. | 2408.15881 | link |
2024-08-28 | ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation | Tiantian Feng et.al. | 2408.15803 | null |
2024-08-28 | Online pre-training with long-form videos | Itsuki Kato et.al. | 2408.15651 | null |
2024-08-28 | Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation | Lujun Gui et.al. | 2408.15562 | null |
2024-08-27 | Leveraging Self-supervised Audio Representations for Data-Efficient Acoustic Scene Classification | Yiqiang Cai et.al. | 2408.14862 | link |
2024-08-26 | Bridging the Gap: Unpacking the Hidden Challenges in Knowledge Distillation for Online Ranking Systems | Nikhil Khani et.al. | 2408.14678 | null |
2024-08-26 | TSAK: Two-Stage Semantic-Aware Knowledge Distillation for Efficient Wearable Modality and Model Optimization in Manufacturing Lines | Hymalai Bello et.al. | 2408.14146 | null |
Schrodinger Bridge
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-12-19 | LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis | Hanlin Wang et.al. | 2412.15214 | null |
2024-12-19 | Flowing from Words to Pixels: A Framework for Cross-Modality Evolution | Qihao Liu et.al. | 2412.15213 | null |
2024-12-19 | Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation | Hadi Alzayer et.al. | 2412.15211 | null |
2024-12-19 | Preventing Local Pitfalls in Vector Quantization via Optimal Transport | Borui Zhang et.al. | 2412.15195 | link |
2024-12-19 | AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation | Moayed Haji-Ali et.al. | 2412.15191 | null |
2024-12-19 | Tiled Diffusion | Or Madar et.al. | 2412.15185 | null |
2024-12-19 | OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization | Jiacheng Zhang et.al. | 2412.15159 | null |
2024-12-19 | Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM | Yatai Ji et.al. | 2412.15156 | link |
2024-12-19 | Jet: A Modern Transformer-Based Normalizing Flow | Alexander Kolesnikov et.al. | 2412.15129 | null |
2024-12-19 | Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion | Zhifei Chen et.al. | 2412.15050 | null |
2024-12-19 | DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space | Mang Ning et.al. | 2412.15032 | link |
2024-12-19 | Stable-V2A: Synthesis of Synchronized Sound Effects with Temporal and Semantic Controls | Riccardo Fosco Gramaccioni et.al. | 2412.15023 | null |
2024-12-19 | MagicNaming: Consistent Identity Generation by Finding a “Name Space” in T2I Diffusion Models | Jing Zhao et.al. | 2412.14902 | null |
2024-12-19 | Diffusion priors for Bayesian 3D reconstruction from incomplete measurements | Julian L. Möbius et.al. | 2412.14897 | null |
2024-12-19 | Quantum Algorithms for Stochastic Differential Equations: A Schrödingerisation Approach | Shi Jin et.al. | 2412.14868 | null |
2024-12-18 | AniDoc: Animation Creation Made Easier | Yihao Meng et.al. | 2412.14173 | null |
2024-12-19 | E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling | Zhihang Yuan et.al. | 2412.14170 | null |
2024-12-18 | Autoregressive Video Generation without Vector Quantization | Haoge Deng et.al. | 2412.14169 | link |
2024-12-18 | VideoDPO: Omni-Preference Alignment for Video Diffusion Generation | Runtao Liu et.al. | 2412.14167 | null |
2024-12-18 | MCMat: Multiview-Consistent and Physically Accurate PBR Material Generation | Shenhao Zhu et.al. | 2412.14148 | null |
2024-12-18 | SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation | Tong Chen et.al. | 2412.14018 | null |
2024-12-18 | Comparative Analysis of Machine Learning-Based Imputation Techniques for Air Quality Datasets with High Missing Data Rates | Sen Yan et.al. | 2412.13966 | null |
2024-12-18 | IDEQ: an improved diffusion model for the TSP | Mickael Basson et.al. | 2412.13858 | null |
2024-12-18 | Object Style Diffusion for Generalized Object Detection in Urban Scene | Hao Li et.al. | 2412.13815 | null |
2024-12-18 | Text2Relight: Creative Portrait Relighting with Text Guidance | Junuk Cha et.al. | 2412.13734 | null |
2024-12-18 | Diffusion models and stochastic quantisation in lattice field theory | Gert Aarts et.al. | 2412.13704 | null |
2024-12-18 | MMO-IG: Multi-Class and Multi-Scale Object Image Generation for Remote Sensing | Chuang Yang et.al. | 2412.13684 | null |
2024-12-18 | VIIS: Visible and Infrared Information Synthesis for Severe Low-light Image Enhancement | Chen Zhao et.al. | 2412.13655 | link |
2024-12-18 | TAUDiff: Improving statistical downscaling for extreme weather events using generative diffusion models | Rahul Sundar et.al. | 2412.13627 | null |
2024-12-18 | PASCO (PArallel Structured COarsening): an overlay to speed up graph clustering algorithms | Etienne Lasalle et.al. | 2412.13592 | link |
2024-12-17 | CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models | Gaoyang Zhang et.al. | 2412.13195 | link |
2024-12-17 | StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models | Yunzhi Yan et.al. | 2412.13188 | null |
2024-12-17 | Move-in-2D: 2D-Conditioned Human Motion Generation | Hsin-Ping Huang et.al. | 2412.13185 | null |
2024-12-17 | A Pontryagin-Guided Neural Policy Optimization Framework for Merton’s Portfolio Problem | Jeonggyu Huh et.al. | 2412.13101 | null |
2024-12-17 | Prompt Augmentation for Self-supervised Text-guided Image Manipulation | Rumeysa Bodur et.al. | 2412.13081 | null |
2024-12-17 | 3D MedDiffusion: A 3D Medical Diffusion Model for Controllable and High-quality Medical Image Generation | Haoshen Wang et.al. | 2412.13059 | null |
2024-12-18 | Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance | Wenhao Sun et.al. | 2412.12974 | link |
2024-12-17 | ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting | Guillaume Couairon et.al. | 2412.12971 | link |
2024-12-17 | Generation of cosmic ray trajectories by a Diffusion Model trained on test particles in 3D magnetohydrodynamic turbulence | Johannes Martin et.al. | 2412.12923 | null |
2024-12-17 | Unsupervised Region-Based Image Editing of Denoising Diffusion Models | Zixiang Li et.al. | 2412.12912 | null |
2024-12-17 | Design of Restricted Normalizing Flow towards Arbitrary Stochastic Policy with Computational Efficiency | Taisuke Kobayashi et.al. | 2412.12894 | null |
2024-12-18 | ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction | Zhongjie Duan et.al. | 2412.12888 | link |
2024-12-17 | Rethinking Diffusion-Based Image Generators for Fundus Fluorescein Angiography Synthesis on Limited Data | Chengzhou Yu et.al. | 2412.12778 | null |
2024-12-17 | Guided and Variance-Corrected Fusion with One-shot Style Alignment for Large-Content Image Generation | Shoukun Sun et.al. | 2412.12771 | null |
2024-12-17 | Towards a Training Free Approach for 3D Scene Editing | Vivek Madhavaram et.al. | 2412.12766 | null |
2024-12-16 | Causal Diffusion Transformers for Generative Modeling | Chaorui Deng et.al. | 2412.12095 | link |
2024-12-16 | CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models | Felix Taubner et.al. | 2412.12093 | null |
2024-12-16 | Wonderland: Navigating 3D Scenes from a Single Image | Hanwen Liang et.al. | 2412.12091 | null |
2024-12-16 | A LoRA is Worth a Thousand Pictures | Chenxi Liu et.al. | 2412.12048 | null |
2024-12-16 | The entropic optimal (self-)transport problem: Limit distributions for decreasing regularization with application to score function estimation | Gilles Mordant et.al. | 2412.12007 | null |
2024-12-16 | Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data | Onur Tasar et.al. | 2412.11972 | null |
2024-12-16 | ColorFlow: Retrieval-Augmented Image Sequence Colorization | Junhao Zhuang et.al. | 2412.11815 | null |
2024-12-16 | InterDyn: Controllable Interactive Dynamics with Video Diffusion Models | Rick Akkerman et.al. | 2412.11785 | null |
2024-12-16 | Joint Reconstruction of the Activity and the Attenuation in PET by Diffusion Posterior Sampling: a Feasibility Study | Clémentine Phung-Ngoc et.al. | 2412.11776 | null |
2024-12-17 | No More Adam: Learning Rate Scaling at Initialization is All You Need | Minghao Xu et.al. | 2412.11768 | link |
2024-12-16 | Conditional Diffusion Models Based Conditional Independence Testing | Yanfeng Yang et.al. | 2412.11744 | link |
2024-12-16 | Re-Attentional Controllable Video Diffusion Editing | Yuanzhi Wang et.al. | 2412.11710 | link |
2024-12-16 | VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting | Muhammet Furkan Ilaslan et.al. | 2412.11621 | link |
2024-12-16 | 3D$^2$-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling | Zichen Tang et.al. | 2412.11599 | link |
2024-12-16 | StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors | Xiaokun Sun et.al. | 2412.11586 | link |
2024-12-13 | Towards a foundation model for heavy-ion collision experiments through point cloud diffusion | Manjunath Omana Kuttan et.al. | 2412.10352 | null |
2024-12-13 | BrushEdit: All-In-One Image Inpainting and Editing | Yaowei Li et.al. | 2412.10316 | null |
2024-12-13 | Coherent 3D Scene Diffusion From a Single RGB Image | Manuel Dahnert et.al. | 2412.10294 | null |
2024-12-13 | GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion | Jiapeng Tang et.al. | 2412.10209 | null |
2024-12-13 | Efficient Generative Modeling with Residual Vector Quantization-Based Tokens | Jaehyeon Kim et.al. | 2412.10208 | null |
2024-12-13 | Simple Guidance Mechanisms for Discrete Diffusion Models | Yair Schiff et.al. | 2412.10193 | link |
2024-12-13 | SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models | Hung Nguyen et.al. | 2412.10178 | null |
2024-12-13 | The Art of Deception: Color Visual Illusions and Diffusion Models | Alex Gomez-Villa et.al. | 2412.10122 | null |
2024-12-13 | SuperMark: Robust and Training-free Image Watermarking via Diffusion-based Super-Resolution | Runyi Hu et.al. | 2412.10049 | null |
2024-12-13 | Emergence of complexity in opinion propagation: A reaction-diffusion model | Romain Ducasse et.al. | 2412.10000 | null |
2024-12-13 | Cycle-Consistent Bridge Diffusion Model for Accelerated MRI Reconstruction | Tao Song et.al. | 2412.09998 | null |
2024-12-13 | EP-CFG: Energy-Preserving Classifier-Free Guidance | Kai Zhang et.al. | 2412.09966 | null |
2024-12-13 | Generating 3D Pseudo-Healthy Knee MR Images to Support Trochleoplasty Planning | Michael Wehrli et.al. | 2412.09962 | null |
2024-12-13 | Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization | Xinhao Zhong et.al. | 2412.09959 | null |
2024-12-13 | Latent feedback control of distributed systems in multiple scenarios through deep learning-based reduced order models | Matteo Tomasetto et.al. | 2412.09942 | null |
2024-12-12 | FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion | Haonan Qiu et.al. | 2412.09626 | null |
2024-12-12 | Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors | Yue Feng et.al. | 2412.09625 | null |
2024-12-12 | OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation | Weiqi Li et.al. | 2412.09623 | null |
2024-12-12 | LoRACLR: Contrastive Adaptation for Customization of Diffusion Models | Enis Simsar et.al. | 2412.09622 | null |
2024-12-12 | SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training | Dongting Hu et.al. | 2412.09619 | null |
2024-12-12 | EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM | Zhuofan Zong et.al. | 2412.09618 | null |
2024-12-12 | Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG | Kavana Venkatesh et.al. | 2412.09614 | null |
2024-12-12 | LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors | Yabo Chen et.al. | 2412.09597 | null |
2024-12-12 | Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion | Zexin He et.al. | 2412.09593 | null |
2024-12-12 | SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing | Xueting Li et.al. | 2412.09545 | null |
2024-12-12 | Learned Compression for Compressed Learning | Dan Jacobellis et.al. | 2412.09405 | link |
2024-12-12 | Diffusion Model with Representation Alignment for Protein Inverse Folding | Chenglin Wang et.al. | 2412.09380 | null |
2024-12-12 | Diffusion Predictive Control with Constraints | Ralf Römer et.al. | 2412.09342 | link |
2024-12-12 | Auto-Regressive Moving Diffusion Models for Time Series Forecasting | Jiaxin Gao et.al. | 2412.09328 | link |
2024-12-13 | Are Conditional Latent Diffusion Models Effective for Image Restoration? | Yunchen Yuan et.al. | 2412.09324 | null |
2024-12-11 | Generative Semantic Communication: Architectures, Technologies, and Applications | Jinke Ren et.al. | 2412.08642 | null |
2024-12-11 | DMin: Scalable Training Data Influence Estimation for Diffusion Models | Huawei Lin et.al. | 2412.08637 | link |
2024-12-11 | TryOffAnyone: Tiled Cloth Generation from a Dressed Person | Ioannis Xarchakos et.al. | 2412.08573 | link |
2024-12-11 | A numerical method to simulate the stochastic linear-quadratic optimal control problem with control constraint in higher dimensions | Abhishek Chaudhary et.al. | 2412.08553 | null |
2024-12-11 | Learning Flow Fields in Attention for Controllable Person Image Generation | Zijian Zhou et.al. | 2412.08486 | link |
2024-12-11 | InvDiff: Invariant Guidance for Bias Mitigation in Diffusion Models | Min Hou et.al. | 2412.08480 | link |
2024-12-11 | CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis | Mu Zhang et.al. | 2412.08464 | null |
2024-12-11 | Reliable Uncertainty Quantification for Fiber Orientation in Composite Molding Processes using Multilevel Polynomial Surrogates | Stjepan Salatovic et.al. | 2412.08459 | null |
2024-12-11 | Generalized free energy and excess entropy production for active systems | Artemy Kolchinsky et.al. | 2412.08432 | null |
2024-12-12 | Pragmatist: Multiview Conditional Diffusion Models for High-Fidelity 3D Reconstruction from Unposed Sparse Views | Songchun Zhang et.al. | 2412.08412 | null |
2024-12-11 | Grasp Diffusion Network: Learning Grasp Generators from Partial Point Clouds with Diffusion Models in SO(3)xR3 | Joao Carvalho et.al. | 2412.08398 | null |
2024-12-11 | Digging into Intrinsic Contextual Information for High-fidelity 3D Point Cloud Completion | Jisheng Chu et.al. | 2412.08326 | link |
2024-12-11 | GDSG: Graph Diffusion-based Solution Generation for Optimization Problems in MEC Networks | Ruihuai Liang et.al. | 2412.08296 | link |
2024-12-11 | Self-Refining Diffusion Samplers: Enabling Parallelization via Parareal Iterations | Nikil Roashan Selvam et.al. | 2412.08292 | link |
2024-12-11 | Toward Near-Globally Optimal Nonlinear Model Predictive Control via Diffusion Models | Tzu-Yuan Huang et.al. | 2412.08278 | null |
2024-12-10 | Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets | Zhen Liu et.al. | 2412.07775 | null |
2024-12-10 | From Slow Bidirectional to Fast Causal Video Generators | Tianwei Yin et.al. | 2412.07772 | null |
2024-12-10 | Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds | Xiaoyu Xiang et.al. | 2412.07766 | null |
2024-12-10 | Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation | Jingxi Chen et.al. | 2412.07761 | null |
2024-12-10 | SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints | Jianhong Bai et.al. | 2412.07760 | link |
2024-12-10 | Multi-Shot Character Consistency for Text-to-Video Generation | Yuval Atzmon et.al. | 2412.07750 | null |
2024-12-10 | FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models | Tong Wu et.al. | 2412.07674 | null |
2024-12-10 | TraSCE: Trajectory Steering for Concept Erasure | Anubhav Jain et.al. | 2412.07658 | null |
2024-12-11 | Motion Artifact Removal in Pixel-Frequency Domain via Alternate Masks and Diffusion Model | Jiahua Xu et.al. | 2412.07590 | link |
2024-12-10 | DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation | Jianzong Wu et.al. | 2412.07589 | null |
2024-12-10 | Mobile Video Diffusion | Haitam Ben Yahia et.al. | 2412.07583 | null |
2024-12-10 | Parallel simulation for sampling under isoperimetry and score-based diffusion models | Huanjian Zhou et.al. | 2412.07435 | null |
2024-12-10 | Non-Progressive Influence Maximization in Dynamic Social Networks | Yunming Hui et.al. | 2412.07402 | null |
2024-12-10 | Fusion Embedding for Pose-Guided Person Image Synthesis with Diffusion Model | Donghwna Lee et.al. | 2412.07333 | null |
2024-12-10 | AppGen: Mobility-aware App Usage Behavior Generation for Mobile Users | Zihan Huang et.al. | 2412.07267 | null |
2024-12-10 | [MASK] is All You Need | Vincent Tao Hu et.al. | 2412.06787 | link |
2024-12-09 | Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation | Ruihan Gao et.al. | 2412.06785 | link |
2024-12-09 | Diverse Score Distillation | Yanbo Xu et.al. | 2412.06780 | null |
2024-12-09 | Visual Lexicon: Rich Image Features in Language Space | XuDong Wang et.al. | 2412.06774 | null |
2024-12-09 | InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention | Howard Zhang et.al. | 2412.06753 | null |
2024-12-10 | ContRail: A Framework for Realistic Railway Image Synthesis using ControlNet | Andrei-Robert Alexandrescu et.al. | 2412.06742 | null |
2024-12-09 | Partially Observed Optimal Stochastic Control: Regularity, Optimality, Approximations, and Learning | Ali Devran Kara et.al. | 2412.06735 | null |
2024-12-09 | Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection | Caiyun Xie et.al. | 2412.06727 | link |
2024-12-09 | You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale | Baorui Ma et.al. | 2412.06699 | link |
2024-12-09 | Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy | Yuxuan Xue et.al. | 2412.06698 | null |
2024-12-09 | Diff5T: Benchmarking Human Brain Diffusion MRI with an Extensive 5.0 Tesla K-Space and Spatial Dataset | Shanshan Wang et.al. | 2412.06666 | null |
2024-12-09 | Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion | Shuaiting Li et.al. | 2412.06661 | null |
2024-12-09 | MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences | Weitao Wang et.al. | 2412.06614 | null |
2024-12-09 | On the problem of optimal fair exchange | Alexander Kolesnikov et.al. | 2412.06522 | null |
2024-12-09 | Generative Lines Matching Models | Ori Matityahu et.al. | 2412.06403 | null |
2024-12-06 | Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories | Susung Hong et.al. | 2412.05279 | null |
2024-12-06 | Birth and Death of a Rose | Chen Geng et.al. | 2412.05278 | null |
2024-12-06 | MotionFlow: Attention-Driven Motion Transfer in Video Diffusion Models | Tuna Han Salih Meral et.al. | 2412.05275 | null |
2024-12-06 | Go-or-Grow Models in Biology: a Monster on a Leash | R. Thiessen et.al. | 2412.05191 | null |
2024-12-06 | On Mean Field Monotonicity Conditions from Control Theoretical Perspective | Alain Bensoussan et.al. | 2412.05189 | null |
2024-12-06 | DNF: Unconditional 4D Generation with Dictionary-based Neural Fields | Xinyi Zhang et.al. | 2412.05161 | null |
2024-12-06 | Probabilistic Galaxy Field Generation with Diffusion Models | Tanner Sether et.al. | 2412.05131 | null |
2024-12-06 | The Silent Prompt: Initial Noise as Implicit Guidance for Goal-Driven Image Generation | Ruoyu Wang et.al. | 2412.05101 | null |
2024-12-06 | ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration | Chi-Wei Hsiao et.al. | 2412.05043 | null |
2024-12-06 | Noise Matters: Diffusion Model-based Urban Mobility Generation with Collaborative Noise Priors | Yuheng Zhang et.al. | 2412.05000 | null |
2024-12-06 | Continuous Video Process: Modeling Videos as Continuous Multi-Dimensional Processes for Video Prediction | Gaurav Shrivastava et.al. | 2412.04929 | null |
2024-12-06 | SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models | Zilan Wang et.al. | 2412.04852 | null |
2024-12-06 | Wavelet Diffusion Neural Operator | Peiyan Hu et.al. | 2412.04833 | null |
2024-12-06 | DAWN-SI: Data-Aware and Noise-Informed Stochastic Interpolation for Solving Inverse Problems | Shadab Ahamed et.al. | 2412.04766 | null |
2024-12-06 | Diff4Steer: Steerable Diffusion Prior for Generative Music Retrieval with Semantic Guidance | Xuchan Bao et.al. | 2412.04746 | null |
2024-12-05 | PaintScene4D: Consistent 4D Scene Generation from Text Prompts | Vinayak Gupta et.al. | 2412.04471 | null |
2024-12-05 | LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors | Yusuf Dalva et.al. | 2412.04460 | null |
2024-12-05 | Four-Plane Factorized Video Autoencoders | Mohammed Suhail et.al. | 2412.04452 | null |
2024-12-05 | MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation | Longtao Zheng et.al. | 2412.04448 | null |
2024-12-05 | DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models | Yizhuo Li et.al. | 2412.04446 | null |
2024-12-05 | Learning Artistic Signatures: Symmetry Discovery and Style Transfer | Emma Finn et.al. | 2412.04441 | null |
2024-12-05 | Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation | Yuying Ge et.al. | 2412.04432 | link |
2024-12-05 | Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | Jian Han et.al. | 2412.04431 | link |
2024-12-05 | Reversible molecular simulation for training classical and machine learning force fields | Joe G Greener et.al. | 2412.04374 | link |
2024-12-05 | ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation | Dayoung Gong et.al. | 2412.04353 | null |
2024-12-05 | RMD: A Simple Baseline for More General Human Motion Generation via Training-free Retrieval-Augmented Motion Diffuse | Zhouyingcheng Liao et.al. | 2412.04343 | null |
2024-12-05 | Likelihood-Scheduled Score-Based Generative Modeling for Fully 3D PET Image Reconstruction | George Webber et.al. | 2412.04339 | null |
2024-12-05 | Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction | George Webber et.al. | 2412.04324 | null |
2024-12-05 | Structure-Aware Stylized Image Synthesis for Robust Medical Image Segmentation | Jie Bao et.al. | 2412.04296 | link |
2024-12-05 | Alpha shapes and optimal transport on the sphere | Erik Carlsson et.al. | 2412.04286 | link |
2024-12-04 | MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation | Zehuan Huang et.al. | 2412.03558 | null |
2024-12-04 | NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images | Lingen Li et.al. | 2412.03517 | null |
2024-12-04 | Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion | Shengyuan Zhang et.al. | 2412.03515 | link |
2024-12-04 | Self-test loss functions for learning weak-form operators and gradient flows | Yuan Gao et.al. | 2412.03506 | null |
2024-12-04 | Solving Monge problem by Hilbert space embeddings of probability measures | Takafumi Saito et.al. | 2412.03478 | null |
2024-12-04 | CleanDIFT: Diffusion Features without Noise | Nick Stracke et.al. | 2412.03439 | link |
2024-12-04 | SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model | Yan Li et.al. | 2412.03430 | null |
2024-12-04 | Skel3D: Skeleton Guided Novel View Synthesis | Aron Fóthi et.al. | 2412.03407 | null |
2024-12-04 | Deep Operator BSDE: a Numerical Scheme to Approximate the Solution Operators | Giulia Di Nunno et.al. | 2412.03405 | null |
2024-12-04 | Identifiability implies consistency of MLE in partially observed diffusions on a torus | Ibrahim Ekren et.al. | 2412.03380 | null |
2024-12-04 | TASR: Timestep-Aware Diffusion Model for Image Super-Resolution | Qinwei Lin et.al. | 2412.03355 | link |
2024-12-04 | DIVE: Taming DINO for Subject-Driven Video Editing | Yi Huang et.al. | 2412.03347 | null |
2024-12-04 | Geometry-guided Cross-view Diffusion for One-to-many Cross-view Image Synthesis | Tao Jun Lin et.al. | 2412.03315 | null |
2024-12-04 | Schrodinger Bridge over Averaged Systems | Daniel Owusu Adu et.al. | 2412.03294 | null |
2024-12-04 | Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression | Junjie Wen et.al. | 2412.03293 | null |
2024-12-03 | Diffusion-based Visual Anagram as Multi-task Learning | Zhiyuan Xu et.al. | 2412.02693 | link |
2024-12-03 | FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation | Kefan Chen et.al. | 2412.02690 | null |
2024-12-04 | SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance | Viet Nguyen et.al. | 2412.02687 | null |
2024-12-03 | Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation | Yiftach Edelstein et.al. | 2412.02631 | null |
2024-12-03 | Unveiling Concept Attribution in Diffusion Models | Quang H. Nguyen et.al. | 2412.02542 | null |
2024-12-03 | It Takes Two: Real-time Co-Speech Two-person’s Interaction Generation via Reactive Auto-regressive Diffusion Model | Mingyi Shi et.al. | 2412.02419 | null |
2024-12-03 | GenMix: Effective Data Augmentation with Generative Diffusion Model Image Editing | Khawar Islam et.al. | 2412.02366 | null |
2024-12-03 | LoRA Diffusion: Zero-Shot LoRA Synthesis for Diffusion Model Personalization | Ethan Smith et.al. | 2412.02352 | null |
2024-12-03 | SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models | Sabina Martyniak et.al. | 2412.02332 | link |
2024-12-03 | Controlling the Latent Diffusion Model for Generative Image Shadow Removal via Residual Generation | Xinjie Li et.al. | 2412.02322 | null |
2024-12-03 | Viewpoint Consistency in 3D Generation via Attention and CLIP Guidance | Qing Zhang et.al. | 2412.02287 | null |
2024-12-03 | Fast LiDAR Data Generation with Rectified Flows | Kazuto Nakashima et.al. | 2412.02241 | link |
2024-12-03 | Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generative Models | Jungwon Park et.al. | 2412.02237 | link |
2024-12-03 | How to Use Diffusion Priors under Sparse Views? | Qisen Wang et.al. | 2412.02225 | link |
2024-12-03 | GIST: Towards Photorealistic Style Transfer via Multiscale Geometric Representations | Renan A. Rojas-Gomez et.al. | 2412.02214 | null |
2024-11-29 | Gaussian multi-target filtering with target dynamics driven by a stochastic differential equation | Ángel F. García-Fernández et.al. | 2411.19814 | null |
2024-11-29 | MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks | Yiming Wu et.al. | 2411.19786 | null |
2024-11-29 | Riemannian Denoising Score Matching for Molecular Structure Optimization with Accurate Energy | Jeheon Woo et.al. | 2411.19769 | null |
2024-11-29 | TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting | Bojun Xiong et.al. | 2411.19654 | null |
2024-11-29 | Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing | Wenyi Mo et.al. | 2411.19652 | link |
2024-11-29 | Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook | Florinel-Alin Croitoru et.al. | 2411.19537 | link |
2024-11-29 | Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis | Tianqi Li et.al. | 2411.19509 | null |
2024-11-29 | Diffusion Models Meet Network Management: Improving Traffic Matrix Analysis with Diffusion-based Approach | Xinyu Yuan et.al. | 2411.19493 | link |
2024-11-28 | DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models | Shwetha Ram et.al. | 2411.19390 | null |
2024-11-28 | Enhancing Sketch Animation: Text-to-Video Diffusion Models with Temporal Consistency and Rigidity Constraints | Gaurav Rai et.al. | 2411.19381 | null |
2024-11-28 | Towards a Mechanistic Explanation of Diffusion Model Generalization | Matthew Niedoba et.al. | 2411.19339 | null |
2024-11-28 | Trajectory Attention for Fine-grained Video Motion Control | Zeqi Xiao et.al. | 2411.19324 | null |
2024-11-28 | Improving Multi-Subject Consistency in Open-Domain Image Generation with Isolation and Reposition Attention | Huiguo He et.al. | 2411.19261 | null |
2024-11-28 | Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes | Thomas Wimmer et.al. | 2411.19233 | link |
2024-11-28 | Z-STAR+: A Zero-shot Style Transfer Method via Adjusting Style Distribution | Yingying Deng et.al. | 2411.19231 | null |
2024-11-27 | GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data | Wentao Wang et.al. | 2411.18624 | null |
2024-11-27 | Diffusion Self-Distillation for Zero-Shot Customized Image Generation | Shengqu Cai et.al. | 2411.18616 | null |
2024-11-27 | CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models | Rundi Wu et.al. | 2411.18613 | null |
2024-11-27 | Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis | Eva Prakash et.al. | 2411.18602 | null |
2024-11-27 | FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion | Haosen Yang et.al. | 2411.18552 | null |
2024-11-28 | Enhancing weed detection performance by means of GenAI-based image augmentation | Sourav Modak et.al. | 2411.18513 | null |
2024-11-27 | Learning the Evolution of Physical Structure of Galaxies via Diffusion Models | Andrew Lizarraga et.al. | 2411.18440 | link |
2024-11-27 | De-baryonifying halos via optimal transport | Leander Thiele et.al. | 2411.18399 | null |
2024-11-27 | Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models | Yiming Wu et.al. | 2411.18375 | null |
2024-11-28 | Large systems of symmetrized trapped Brownian Bridges and Schrodinger processes | Stefan Adams et.al. | 2411.18359 | null |
2024-11-27 | TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models | Riza Velioglu et.al. | 2411.18350 | link |
2024-11-27 | HiFiVFS: High Fidelity Video Face Swapping | Xu Chen et.al. | 2411.18293 | null |
2024-11-27 | TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution | Linwei Dong et.al. | 2411.18263 | null |
2024-11-27 | Dependency-Aware CAV Task Scheduling via Diffusion-Based Reinforcement Learning | Xiang Cheng et.al. | 2411.18230 | null |
2024-11-27 | Uniqueness and regularity of weak solutions of a drift-diffusion system for perovskite solar cells | Annegret Glitzky et.al. | 2411.18223 | null |
2024-11-27 | StableAnimator: High-Quality Identity-Preserving Human Image Animation | Shuyuan Tu et.al. | 2411.17697 | link |
2024-11-26 | ScribbleLight: Single Image Indoor Relighting with Scribbles | Jun Myeong Choi et.al. | 2411.17696 | null |
2024-11-26 | GenDeg: Diffusion-Based Degradation Synthesis for Generalizable All-in-One Image Restoration | Sudarshan Rajagopalan et.al. | 2411.17687 | null |
2024-11-26 | Accelerating Vision Diffusion Transformers with Skip Branches | Guanjie Chen et.al. | 2411.17616 | link |
2024-11-26 | VideoDirector: Precise Video Editing via Text-to-Video Models | Yukun Wang et.al. | 2411.17592 | null |
2024-11-26 | FTMoMamba: Motion Generation with Frequency and Text State Space Models | Chengjian Li et.al. | 2411.17532 | null |
2024-11-26 | WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model | Zongjian Li et.al. | 2411.17459 | link |
2024-11-26 | Image Generation with Multimodule Semantic Feature-Aided Selection for Semantic Communications | Chengyang Liang et.al. | 2411.17428 | null |
2024-11-26 | Reward Incremental Learning in Text-to-Image Generation | Maorong Wang et.al. | 2411.17310 | null |
2024-11-26 | APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents | Jun Yu Chen et.al. | 2411.17255 | link |
2024-11-26 | DiffSLT: Enhancing Diversity in Sign Language Translation via Diffusion Model | JiHwan Moon et.al. | 2411.17248 | null |
2024-11-26 | Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration | Junyuan Deng et.al. | 2411.17240 | link |
2024-11-26 | From Graph Diffusion to Graph Classification | Jia Jun Cheng Xian et.al. | 2411.17236 | null |
2024-11-26 | DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting | Yicheng Yang et.al. | 2411.17223 | link |
2024-11-26 | Large deviations of the empirical measures of a strong-Feller Markov process inside a subset and quasi-ergodic distribution | Arnaud Guillin et.al. | 2411.17216 | null |
2024-11-25 | Generative Omnimatte: Learning to Decompose Video into Layers | Yao-Chih Lee et.al. | 2411.16683 | null |
2024-11-25 | Diffusion Features for Zero-Shot 6DoF Object Pose Estimation | Bernd Von Gimborn et.al. | 2411.16668 | null |
2024-11-25 | On a problem of optimal mixing | Kirill Sokolov et.al. | 2411.16651 | null |
2024-11-25 | LegoPET: Hierarchical Feature Guided Conditional Diffusion for PET Image Reconstruction | Yiran Sun et.al. | 2411.16629 | null |
2024-11-25 | Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models | Ronghuan Wu et.al. | 2411.16602 | null |
2024-11-25 | Unlocking The Potential of Adaptive Attacks on Diffusion-Based Purification | Andre Kassis et.al. | 2411.16598 | link |
2024-11-25 | Rethinking Diffusion for Text-Driven Human Motion Generation | Zichong Meng et.al. | 2411.16575 | null |
2024-11-25 | Representation Collapsing Problems in Vector Quantization | Wenhao Zhao et.al. | 2411.16550 | null |
2024-11-25 | ADOBI: Adaptive Diffusion Bridge For Blind Inverse Problems with Application to MRI Reconstruction | Yuyang Hu et.al. | 2411.16535 | null |
2024-11-25 | Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis | Boming Miao et.al. | 2411.16503 | null |
2024-11-25 | On approximations of stochastic optimal control problems with an application to climate equations | Franco Flandoli et.al. | 2411.16491 | null |
2024-11-25 | Model-based reinforcement corrosion prediction: Continuous calibration with Bayesian optimization and corrosion wire sensor data | A. Potnis et.al. | 2411.16447 | null |
2024-11-25 | Privacy Protection in Personalized Diffusion Models via Targeted Cross-Attention Adversarial Attack | Xide Xu et.al. | 2411.16437 | null |
2024-11-25 | Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing | Kaifeng Gao et.al. | 2411.16375 | link |
2024-11-25 | One Diffusion to Generate Them All | Duong H. Le et.al. | 2411.16318 | link |
2024-11-22 | DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving | Bencheng Liao et.al. | 2411.15139 | link |
2024-11-22 | Material Anything: Generating Materials for Any 3D Object via Diffusion | Xin Huang et.al. | 2411.15138 | null |
2024-11-22 | VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement | Daeun Lee et.al. | 2411.15115 | null |
2024-11-22 | Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation | Lakshmikar R. Polamreddy et.al. | 2411.15084 | link |
2024-11-22 | The 1D nonlocal Fisher-KPP equation with a top hat kernel. Part 3. The effect of perturbations in the kernel | David John Needham et.al. | 2411.15054 | null |
2024-11-22 | FloAt: Flow Warping of Self-Attention for Clothing Animation Generation | Swasti Shreya Mishra et.al. | 2411.15028 | null |
2024-11-22 | Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation | Huy Le et.al. | 2411.14913 | null |
2024-11-22 | Prioritize Denoising Steps on Diffusion Model Preference Alignment via Explicit Denoised Distribution Estimation | Dingyuan Shi et.al. | 2411.14871 | null |
2024-11-22 | Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation | Jeongsol Kim et.al. | 2411.14863 | null |
2024-11-22 | Style-Friendly SNR Sampler for Style-Driven Generation | Jooyoung Choi et.al. | 2411.14793 | null |
2024-11-22 | FastGrasp: Efficient Grasp Synthesis with Diffusion | Xiaofei Wu et.al. | 2411.14786 | link |
2024-11-22 | Kolmogorov Modes and Linear Response of Jump-Diffusion Models: Applications to Stochastic Excitation of the ENSO Recharge Oscillator | Mickaël D. Chekroun et.al. | 2411.14769 | null |
2024-11-22 | Measurement of the dynamic charge susceptibility near the charge density wave transition in ErTe$_3$ | Dipanjan Chaudhuri et.al. | 2411.14746 | null |
2024-11-22 | TEXGen: a Generative Diffusion Model for Mesh Textures | Xin Yu et.al. | 2411.14740 | link |
2024-11-22 | AI Tailoring: Evaluating Influence of Image Features on Fashion Product Popularity | Xiaomin Li et.al. | 2411.14737 | null |
2024-11-21 | Stable Flow: Vital Layers for Training-Free Image Editing | Omri Avrahami et.al. | 2411.14430 | null |
2024-11-21 | Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation | Yuanhao Cai et.al. | 2411.14384 | null |
2024-11-21 | CoNFiLD-inlet: Synthetic Turbulence Inflow Using Generative Latent Diffusion Models with Neural Fields | Xin-Yang Liu et.al. | 2411.14378 | null |
2024-11-21 | Enhancing Medical Image Segmentation with Deep Learning and Diffusion Models | Houze Liu et.al. | 2411.14353 | null |
2024-11-21 | Continuous nonlinear adaptive experimental design with gradient flow | Ruhui Jin et.al. | 2411.14332 | null |
2024-11-21 | StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart | Jian Shi et.al. | 2411.14295 | null |
2024-11-21 | Stochastic interventions, sensitivity analysis, and optimal transport | Alexander W. Levis et.al. | 2411.14285 | null |
2024-11-21 | Guided MRI Reconstruction via Schrödinger Bridge | Yue Wang et.al. | 2411.14269 | null |
2024-11-21 | TaQ-DiT: Time-aware Quantization for Diffusion Transformers | Xinyan Liu et.al. | 2411.14172 | null |
2024-11-21 | RestorerID: Towards Tuning-Free Face Restoration with ID Preservation | Jiacheng Ying et.al. | 2411.14125 | link |
2024-11-21 | Point Cloud Resampling with Learnable Heat Diffusion | Wenqiang Xu et.al. | 2411.14120 | null |
2024-11-21 | Transforming Static Images Using Generative Models for Video Salient Object Detection | Suhwan Cho et.al. | 2411.13975 | link |
2024-11-21 | Continuum of coupled Wasserstein gradient flows | Clément Cancès et.al. | 2411.13969 | null |
2024-11-21 | Decoupled Sparse Priors Guided Diffusion Compression Model for Point Clouds | Xiaoge Zhang et.al. | 2411.13860 | null |
2024-11-21 | Detecting Human Artifacts from Text-to-Image Models | Kaihong Wang et.al. | 2411.13842 | link |
2024-11-20 | REDUCIO! Generating 1024 $\times$ 1024 Video within 16 Seconds using Extremely Compressed Motion Latents | Rui Tian et.al. | 2411.13552 | link |
2024-11-20 | Identity Preserving 3D Head Stylization with Multiview Score Distillation | Bahri Batuhan Bilecen et.al. | 2411.13536 | null |
2024-11-20 | Heuristically Adaptive Diffusion-Model Evolutionary Strategy | Benedikt Hartl et.al. | 2411.13420 | null |
2024-11-20 | ripALM: A Relative-Type Inexact Proximal Augmented Lagrangian Method with Applications to Quadratically Regularized Optimal Transport | Jiayi Zhu et.al. | 2411.13267 | null |
2024-11-20 | A new maximal regularity for parabolic equations and an application | Jinlong Wei et.al. | 2411.13266 | null |
2024-11-20 | XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation | Ziyi Wang et.al. | 2411.13243 | link |
2024-11-20 | Backward Stochastic Control System with Entropy Regularization | Ziyue Chen et.al. | 2411.13219 | null |
2024-11-20 | A computational framework for integrating Predictive processes with evidence Accumulation Models (PAM) | Antonino Visalli et.al. | 2411.13203 | link |
2024-11-20 | RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation | Christoph Reinders et.al. | 2411.13150 | link |
2024-11-20 | CopyrightMeter: Revisiting Copyright Protection in Text-to-image Models | Naen Xu et.al. | 2411.13144 | null |
2024-11-20 | Virtual Staining of Label-Free Tissue in Imaging Mass Spectrometry | Yijie Zhang et.al. | 2411.13120 | null |
2024-11-20 | Distribution-free Measures of Association based on Optimal Transport | Nabarun Deb et.al. | 2411.13080 | null |
2024-11-19 | Breaking the wire: the impact of critical length on melting pathways in silver nanowires | Kannan M Ridings et.al. | 2411.12891 | null |
2024-11-19 | From Text to Pose to Image: Improving Diffusion Model Control and Quality | Clément Bonnett et.al. | 2411.12872 | link |
2024-11-19 | CDI: Copyrighted Data Identification in Diffusion Models | Jan Dubiński et.al. | 2411.12858 | link |
2024-11-19 | PoM: Efficient Image and Video Generation with the Polynomial Mixer | David Picard et.al. | 2411.12663 | link |
2024-11-19 | Improving Controllability and Editability for Pretrained Text-to-Music Generation Models | Yixiao Zhang et.al. | 2411.12641 | null |
2024-11-19 | Data Pruning in Generative Diffusion Models | Rania Briq et.al. | 2411.12523 | null |
2024-11-19 | Itô, Stratonovich, and zoom-in schemes in stochastic inflation | Eemeli Tomberg et.al. | 2411.12465 | null |
2024-11-19 | Frequency-Aware Guidance for Blind Image Restoration via Diffusion Models | Jun Xiao et.al. | 2411.12450 | null |
2024-11-19 | Combinational Backdoor Attack against Customized Text-to-Image Models | Wenbo Jiang et.al. | 2411.12389 | null |
2024-11-19 | Scalable and Effective Negative Sample Generation for Hyperedge Prediction | Shilin Qu et.al. | 2411.12354 | null |
2024-11-19 | Diffusion Product Quantization | Jie Shao et.al. | 2411.12306 | null |
2024-11-19 | SSEditor: Controllable Mask-to-Scene Generation with Diffusion Model | Haowen Zheng et.al. | 2411.12290 | link |
2024-11-20 | HouseLLM: LLM-Assisted Two-Phase Text-to-Floorplan Generation | Ziyang Zong et.al. | 2411.12279 | null |
2024-11-19 | On sensitivities regarding shape and topology optimization as derivatives on Wasserstein spaces | Fumiya Okazaki et.al. | 2411.12234 | null |
2024-11-19 | Wavespeed selection of travelling wave solutions of a two-component reaction-diffusion model of cell invasion | Yuhui Chen et.al. | 2411.12232 | null |
2024-11-19 | Constant Rate Schedule: Constant-Rate Distributional Change for Efficient Training and Sampling in Diffusion Models | Shuntaro Okada et.al. | 2411.12188 | null |
2024-11-19 | Diffusion-Inspired Cold Start with Sufficient Prior in Computerized Adaptive Testing | Haiping Ma et.al. | 2411.12182 | link |
2024-11-19 | Enhancing Low Dose Computed Tomography Images Using Consistency Training Techniques | Mahmut S. Gokmen et.al. | 2411.12181 | null |
2024-11-18 | Milstein-type schemes for McKean-Vlasov SDEs driven by Brownian motion and Poisson random measure (with super-linear coefficients) | Sani Biswas et.al. | 2411.11759 | null |
2024-11-18 | Aligning Few-Step Diffusion Models with Dense Reward Difference Learning | Ziyi Zhang et.al. | 2411.11727 | link |
2024-11-18 | Robust Reinforcement Learning under Diffusion Models for Data with Jumps | Chenyang Jiang et.al. | 2411.11697 | null |
2024-11-18 | Conceptwm: A Diffusion Model Watermark for Concept Protection | Liangqi Lei et.al. | 2411.11688 | null |
2024-11-19 | Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation | Rüveyda Yilmaz et.al. | 2411.11515 | null |
2024-11-18 | MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion | Dongseok Shim et.al. | 2411.11475 | null |
2024-11-18 | CLUE-MARK: Watermarking Diffusion Models using CLWE | Kareem Shehata et.al. | 2411.11434 | null |
2024-11-18 | Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge | Qinglong Cao et.al. | 2411.11343 | null |
2024-11-18 | Stochastic quantization and diffusion models | Kenji Fukushima et.al. | 2411.11297 | null |
2024-11-18 | Unbiased Approximations for Stationary Distributions of McKean-Vlasov SDEs | Elsiddig Awadelkarim et.al. | 2411.11270 | null |
2024-11-17 | Stealing Training Graphs from Graph Neural Networks | Minhua Lin et.al. | 2411.11197 | null |
2024-11-17 | DeepSPV: An Interpretable Deep Learning Pipeline for 3D Spleen Volume Estimation from 2D Ultrasound Images | Zhen Yuan et.al. | 2411.11190 | null |
2024-11-17 | Strong Stability Preservation for Stochastic Partial Differential Equations | James Woodfield et.al. | 2411.11172 | null |
2024-11-17 | Integrated Ising Model with global inhibition for decision making | Olga Tapinova et.al. | 2411.11143 | null |
2024-11-17 | Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method | Yan Zheng et.al. | 2411.11135 | null |
2024-11-15 | M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation | Sucheng Ren et.al. | 2411.10433 | link |
2024-11-15 | Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems | Feiqin Zhu et.al. | 2411.10431 | null |
2024-11-15 | Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion | Haoran Wei et.al. | 2411.10369 | null |
2024-11-15 | Probabilistic Prior Driven Attention Mechanism Based on Diffusion Model for Imaging Through Atmospheric Turbulence | Guodong Sun et.al. | 2411.10321 | null |
2024-11-15 | Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting | Ziqi Xie et.al. | 2411.10309 | link |
2024-11-15 | The Unreasonable Effectiveness of Guidance for Diffusion Models | Tim Kaiser et.al. | 2411.10257 | null |
2024-11-15 | Smooth transport map via diffusion process | Arthur Stéphanovitch et.al. | 2411.10235 | null |
2024-11-15 | ColorEdit: Training-free Image-Guided Color editing with diffusion model | Xingxi Yin et.al. | 2411.10232 | null |
2024-11-15 | Fused Gromov-Wasserstein Variance Decomposition with Linear Optimal Transport | Michael Wilson et.al. | 2411.10204 | null |
2024-11-15 | Evaluating Text-to-Image Diffusion Models for Texturing Synthetic Data | Thomas Lips et.al. | 2411.10164 | link |
2024-11-15 | Towards Multi-View Consistent Style Transfer with One-Step Diffusion via Vision Conditioning | Yushen Zuo et.al. | 2411.10130 | null |
2024-11-15 | SPLIT: SE(3)-diffusion via Local Geometry-based Score Prediction for 3D Scene-to-Pose-Set Matching Problems | Kanghyun Kim et.al. | 2411.10049 | null |
2024-11-15 | EyeDiff: text-to-image diffusion model improves rare eye disease diagnosis | Ruoyu Chen et.al. | 2411.10004 | null |
2024-11-15 | Adaptive Non-Uniform Timestep Sampling for Diffusion Model Training | Myunsoo Kim et.al. | 2411.09998 | null |
2024-11-15 | Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era | Thanh Tam Nguyen et.al. | 2411.09955 | link |
2024-11-14 | How to implement the Bayes’ formula in the age of ML? | Amirhossein Taghvaei et.al. | 2411.09653 | null |
2024-11-14 | Golden Noise for Diffusion Models: A Learning Framework | Zikai Zhou et.al. | 2411.09502 | null |
2024-11-14 | DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing | Junjie Zhou et.al. | 2411.09451 | null |
2024-11-14 | Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models | Chutian Meng et.al. | 2411.09449 | null |
2024-11-14 | A survey of probabilistic generative frameworks for molecular simulations | Richard John et.al. | 2411.09388 | link |
2024-11-14 | EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models | Soowon Kim et.al. | 2411.09302 | null |
2024-11-14 | Advancing Diffusion Models: Alias-Free Resampling and Enhanced Rotational Equivariance | Md Fahim Anjum et.al. | 2411.09174 | null |
2024-11-14 | VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation | Youpeng Wen et.al. | 2411.09153 | null |
2024-11-14 | General linear threshold models with application to influence maximization | Alexander Kagan et.al. | 2411.09100 | link |
2024-11-13 | Microfoundation Inference for Strategic Prediction | Daniele Bracale et.al. | 2411.08998 | null |
2024-11-15 | Inconsistencies In Consistency Models: Better ODE Solving Does Not Imply Better Samples | Noël Vouitsis et.al. | 2411.08954 | link |
2024-11-13 | 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization | Mijeong Kim et.al. | 2411.08879 | null |
2024-11-13 | Offline Adaptation of Quadruped Locomotion using Diffusion Models | Reece O’Mahoney et.al. | 2411.08832 | null |
2024-11-13 | Optimal Transport-Based Displacement Interpolation with Data Augmentation for Reduced Order Modeling of Nonlinear Dynamical Systems | Moaad Khamlich et.al. | 2411.08750 | null |
2024-11-13 | Berry-Esseen bounds for large-time asymptotics of one-dimensional diffusion processes via Malliavin-Stein method | Seiichiro Kusuoka et.al. | 2411.08725 | null |
2024-11-13 | A Machine Learning Algorithm for Finite-Horizon Stochastic Control Problems in Economics | Xianhua Peng et.al. | 2411.08668 | null |
2024-11-13 | Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models | Chengdong Dong et.al. | 2411.08642 | null |
2024-11-13 | Neural Topic Modeling with Large Language Models in the Loop | Xiaohao Yang et.al. | 2411.08534 | null |
2024-11-13 | V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion | Xun Huang et.al. | 2411.08402 | link |
2024-11-13 | Physics Informed Distillation for Diffusion Models | Joshua Tian Jin Tee et.al. | 2411.08378 | link |
2024-11-13 | Multiscale Graph Construction Using Non-local Cluster Features | Reina Kaneko et.al. | 2411.08371 | null |
2024-11-13 | Generative AI for Data Augmentation in Wireless Networks: Analysis, Applications, and Case Study | Jinbo Wen et.al. | 2411.08341 | null |
2024-11-13 | Motion Control for Enhanced Complex Action Video Generation | Qiang Zhou et.al. | 2411.08328 | null |
2024-11-13 | Conditional Variable Flow Matching: Transforming Conditional Densities with Amortized Conditional Optimal Transport | Adam P. Generale et.al. | 2411.08314 | link |
2024-11-13 | DNN Task Assignment in UAV Networks: A Generative AI Enhanced Multi-Agent Reinforcement Learning Approach | Xin Tang et.al. | 2411.08299 | null |
2024-11-12 | Joint Diffusion models in Continual Learning | Paweł Skierś et.al. | 2411.08224 | null |
2024-11-12 | Scaling Properties of Diffusion Models for Perceptual Tasks | Rahul Ravishankar et.al. | 2411.08034 | null |
2024-11-12 | GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation | Yushi Lan et.al. | 2411.08033 | null |
2024-11-12 | Approximation rates of entropic maps in semidiscrete optimal transport | Ritwik Sadhu et.al. | 2411.07947 | null |
2024-11-12 | Stochastic MPC for Finite Gaussian Mixture Disturbances with Guarantees | Maico H. W. Engelaar et.al. | 2411.07887 | null |
2024-11-12 | Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules | Binxu Wang et.al. | 2411.07873 | null |
2024-11-12 | Federated Learning for Discrete Optimal Transport with Large Population under Incomplete Information | Navpreet Kaur et.al. | 2411.07841 | null |
2024-11-12 | Novel View Synthesis with Pixel-Space Diffusion Models | Noam Elata et.al. | 2411.07765 | null |
2024-11-12 | Nanosecond nanothermometry in an electron microscope | Florian Castioni et.al. | 2411.07764 | null |
2024-11-12 | Leveraging Previous Steps: A Training-free Fast Solver for Flow Diffusion | Kaiyu Song et.al. | 2411.07627 | null |
2024-11-12 | Unraveling the Connections between Flow Matching and Diffusion Probabilistic Models in Training-free Conditional Generation | Kaiyu Song et.al. | 2411.07625 | null |
2024-11-12 | Harmonizing Pixels and Melodies: Maestro-Guided Film Score Generation and Composition Style Transfer | F. Qi et.al. | 2411.07539 | null |
2024-11-12 | FM-TS: Flow Matching for Time Series Generation | Yang Hu et.al. | 2411.07506 | link |
2024-11-12 | Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors | Anisha Pal et.al. | 2411.07472 | link |
2024-11-12 | Tracing the Roots: Leveraging Temporal Dynamics in Diffusion Trajectories for Origin Attribution | Andreas Floros et.al. | 2411.07449 | null |
2024-11-12 | All-in-one Weather-degraded Image Restoration via Adaptive Degradation-aware Self-prompting Model | Yuanbo Wen et.al. | 2411.07445 | null |
2024-11-11 | Score-based generative diffusion with “active” correlated noise sources | Alexandra Lamtyugina et.al. | 2411.07233 | null |
2024-11-12 | Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models | Yoad Tewel et.al. | 2411.07232 | null |
2024-11-11 | DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID | Nyle Siddiqui et.al. | 2411.07205 | link |
2024-11-11 | Crossover from inhomogeneous to homogeneous response of a resonantly driven hBN quantum emitter | Domitille Gérard et.al. | 2411.07202 | null |
2024-11-11 | OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision | Cong Wei et.al. | 2411.07199 | null |
2024-11-11 | More Expressive Attention with Negative Weights | Ang Lv et.al. | 2411.07176 | link |
2024-11-11 | Rough differential equations in the flow approach | Ajay Chandra et.al. | 2411.07157 | null |
2024-11-11 | Conditional simulation via entropic optimal transport: Toward non-parametric estimation of conditional Brenier maps | Ricardo Baptista et.al. | 2411.07154 | null |
2024-11-11 | Variational Graph Contrastive Learning | Shifeng Xie et.al. | 2411.07150 | link |
2024-11-11 | Edify 3D: Scalable High-Quality 3D Asset Generation | NVIDIA et.al. | 2411.07135 | null |
2024-11-11 | Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models | NVIDIA et.al. | 2411.07126 | null |
2024-11-12 | Distribution dependent SDEs with multiplicative fractional noise | Xiliang Fan et.al. | 2411.06974 | null |
2024-11-11 | Nonparametric estimation of trend for stochastic differential equations driven by multiplicative stochastic volatility | B. L. S. Prakasa Rao et.al. | 2411.06865 | null |
2024-11-11 | The Exponential Lie Series and a Chen-Strichartz Formula for Levy Processes | Kurusch Ebrahimi-Fard et.al. | 2411.06827 | null |
2024-11-11 | White-Box Diffusion Transformer for single-cell RNA-seq generation | Zhuorui Cui et.al. | 2411.06785 | link |
2024-11-08 | StdGEN: Semantic-Decomposed 3D Character Generation from Single Images | Yuze He et.al. | 2411.05738 | null |
2024-11-08 | Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models | Jia-Hong Huang et.al. | 2411.05706 | null |
2024-11-08 | Relative Optimal Transport | Peter Bubenik et.al. | 2411.05678 | null |
2024-11-08 | Improving Molecular Graph Generation with Flow Matching and Optimal Transport | Xiaoyang Hou et.al. | 2411.05676 | null |
2024-11-08 | Rigidly breaking potential flows and a countable Alexandrov theorem for polytopes | Jian-Guo Liu et.al. | 2411.05606 | null |
2024-11-08 | Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | Nan Song et.al. | 2411.05544 | null |
2024-11-08 | Improving image synthesis with diffusion-negative sampling | Alakh Desai et.al. | 2411.05473 | null |
2024-11-08 | Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation | Peidong Liu et.al. | 2411.05472 | link |
2024-11-08 | Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs | Levi Rauchwerger et.al. | 2411.05464 | null |
2024-11-08 | Sticky diffusions on star graphs: characterization and Itô formula | Jules Berry et.al. | 2411.05441 | null |
2024-11-08 | Stochastic games of parental vaccination decision making and bounded rationality | Andras Balogh et.al. | 2411.05369 | null |
2024-11-08 | RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction | Xingyu Ai et.al. | 2411.05354 | null |
2024-11-08 | Electro-diffusive modeling and the role of spine geometry on action potential propagation in neurons | Rahul Gulati et.al. | 2411.05329 | null |
2024-11-08 | Adaptive Whole-Body PET Image Denoising Using 3D Diffusion Models with ControlNet | Boxiao Yu et.al. | 2411.05302 | null |
2024-11-08 | SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding | Ryan Sun et.al. | 2411.05289 | link |
2024-11-07 | SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Muyang Li et.al. | 2411.05007 | link |
2024-11-07 | ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing | Jun-Kun Chen et.al. | 2411.05006 | null |
2024-11-07 | Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models | Shuhong Zheng et.al. | 2411.05005 | null |
2024-11-07 | ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning | David Junhao Zhang et.al. | 2411.05003 | null |
2024-11-07 | SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation | Koichi Namekata et.al. | 2411.04989 | null |
2024-11-07 | Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification | Mischa Dombrowski et.al. | 2411.04956 | null |
2024-11-07 | DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion | Wenqiang Sun et.al. | 2411.04928 | null |
2024-11-07 | Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion | Kaizhe Hu et.al. | 2411.04919 | link |
2024-11-07 | Gluing methods for quantitative stability of optimal transport maps | Cyril Letrouit et.al. | 2411.04908 | null |
2024-11-07 | Coupling between Brownian motion and random walks on the infinite percolation cluster | Chenlin Gu et.al. | 2411.04778 | null |
2024-11-07 | Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation | Benito Buchheim et.al. | 2411.04724 | null |
2024-11-07 | DanceFusion: A Spatio-Temporal Skeleton Diffusion Transformer for Audio-Driven Dance Motion Reconstruction | Li Zhao et.al. | 2411.04646 | null |
2024-11-07 | Brain Tumour Removing and Missing Modality Generation using 3D WDM | André Ferreira et.al. | 2411.04630 | link |
2024-11-07 | Social EgoMesh Estimation | Luca Scofano et.al. | 2411.04598 | link |
2024-11-07 | Series-to-Series Diffusion Bridge Model | Hao Yang et.al. | 2411.04491 | null |
2024-11-06 | Community Forensics: Using Thousands of Generators to Train Fake Image Detectors | Jeongsoo Park et.al. | 2411.04125 | null |
2024-11-06 | A Multi-level Monte Carlo simulation for invariant distribution of Markovian switching Lévy-driven SDEs with super-linearly growth coefficients | Hoang-Viet Nguyen et.al. | 2411.04081 | null |
2024-11-06 | Synomaly Noise and Multi-Stage Diffusion: A Novel Approach for Unsupervised Anomaly Detection in Ultrasound Imaging | Yuan Bi et.al. | 2411.04004 | null |
2024-11-06 | ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy | Chenrui Tie et.al. | 2411.03990 | null |
2024-11-06 | ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models | Ashutosh Srivastava et.al. | 2411.03982 | null |
2024-11-06 | ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization | Huayang Huang et.al. | 2411.03862 | link |
2024-11-06 | Sub-DM: Subspace Diffusion Model with Orthogonal Decomposition for MRI Reconstruction | Yu Guan et.al. | 2411.03758 | null |
2024-11-06 | Zero-shot Dynamic MRI Reconstruction with Global-to-local Diffusion Model | Yu Guan et.al. | 2411.03723 | null |
2024-11-06 | Asymptotic analysis of estimators of ergodic stochastic differential equations | Arnab Ganguly et.al. | 2411.03623 | null |
2024-11-06 | Investigating Conceptual Blending of a Diffusion Model for Improving Nonword-to-Image Generation | Chihaya Matsuhira et.al. | 2411.03595 | null |
2024-11-05 | Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data | Seunggeun Chi et.al. | 2411.03561 | null |
2024-11-05 | Ergodicity and Mixing of Sublinear Expectation System and Applications | Wen Huang et.al. | 2411.03512 | null |
2024-11-05 | SynthSet: Generative Diffusion Model for Semantic Segmentation in Precision Agriculture | Andrew Heschl et.al. | 2411.03505 | link |
2024-11-05 | Chance-Constrained Convex MPC for Robust Quadruped Locomotion Under Parametric and Additive Uncertainties | Ananya Trivedi et.al. | 2411.03481 | link |
2024-11-05 | Exo-Daisy World: Revisiting Gaia Theory through an Informational Architecture Perspective | Damian R Sowinski et.al. | 2411.03421 | null |
2024-11-05 | Information geometry of diffeomorphism groups | Boris Khesin et.al. | 2411.03265 | null |
2024-11-05 | DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | Ying Zhou et.al. | 2411.03250 | null |
2024-11-05 | On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models | Tariq Berrada Ifriqi et.al. | 2411.03177 | null |
2024-11-05 | Unleashing the power of novel conditional generative approaches for new materials discovery | Lev Novitskiy et.al. | 2411.03156 | link |
2024-11-05 | Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and Denoising | Tao Huang et.al. | 2411.03053 | null |
2024-11-05 | GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details | Zhongjin Luo et.al. | 2411.03047 | null |
2024-11-05 | IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems | Heiko Oppel et.al. | 2411.02954 | null |
2024-11-05 | LDPM: Towards undersampled MRI reconstruction with MR-VAE and Latent Diffusion Prior | Xingjian Tang et.al. | 2411.02951 | null |
2024-11-05 | Theoretically Guaranteed Distribution Adaptable Learning | Chao Xu et.al. | 2411.02921 | null |
2024-11-05 | How much is a noisy image worth? Data Scaling Laws for Ambient Diffusion | Giannis Daras et.al. | 2411.02780 | link |
2024-11-04 | Modelling Alzheimer’s Protein Dynamics: A Data-Driven Integration of Stochastic Methods, Machine Learning and Connectome Insights | Alec MacIver et.al. | 2411.02644 | null |
2024-11-04 | Training-free Regional Prompting for Diffusion Transformers | Anthony Chen et.al. | 2411.02395 | link |
2024-11-04 | Diffusion-based Generative Multicasting with Intent-aware Semantic Decomposition | Xinkai Liu et.al. | 2411.02334 | null |
2024-11-04 | LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation | Mufei Li et.al. | 2411.02322 | link |
2024-11-05 | Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation | Xianghui Yang et.al. | 2411.02293 | null |
2024-11-04 | FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training | Ruihong Yin et.al. | 2411.02229 | null |
2024-11-04 | Metric properties of partial and robust Gromov-Wasserstein distances | Jannatul Chhoa et.al. | 2411.02198 | null |
2024-11-04 | CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality | Yiqin Zhao et.al. | 2411.02179 | null |
2024-11-04 | Model Integrity when Unlearning with T2I Diffusion Models | Andrea Schioppa et.al. | 2411.02068 | null |
2024-11-04 | Learning Controlled Stochastic Differential Equations | Luc Brogat-Motte et.al. | 2411.01982 | null |
2024-11-04 | A tamed-adaptive Milstein scheme for stochastic differential equations with low regularity coefficients | Thi-Huong Vu et.al. | 2411.01849 | null |
2024-11-04 | DiffuMask-Editor: A Novel Paradigm of Integration Between the Segmentation Diffusion Model and Image Editing to Improve Segmentation Ability | Bo Gao et.al. | 2411.01819 | null |
2024-11-04 | MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence | Fuming You et.al. | 2411.01805 | null |
2024-11-04 | A Regressor-Guided Graph Diffusion Model for Predicting Enzyme Mutations to Enhance Turnover Number | Xiaozhu Yu et.al. | 2411.01745 | link |
2024-11-04 | xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism | Jiarui Fang et.al. | 2411.01738 | link |
2024-11-04 | LaGDif: Latent Graph Diffusion Model for Efficient Protein Inverse Folding with Self-Ensemble | Taoyu Wu et.al. | 2411.01737 | link |
2024-10-31 | DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion | Weicai Ye et.al. | 2410.24203 | link |
2024-10-31 | Redefining | Fu Feng et.al. | 2410.24160 | null |
2024-10-31 | Scaling Concept With Text-Guided Diffusion Models | Chao Huang et.al. | 2410.24151 | null |
2024-10-31 | Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure | Xiang Li et.al. | 2410.24060 | link |
2024-10-31 | TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation | Sunjae Yoon et.al. | 2410.24037 | null |
2024-10-31 | DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination | Jia Fu et.al. | 2410.24006 | link |
2024-11-01 | Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model | Wenjia Xie et.al. | 2410.23994 | null |
2024-10-31 | Stochastic Reconstruction of Gappy Lagrangian Turbulent Signals by Conditional Diffusion Models | Tianyi Li et.al. | 2410.23971 | null |
2024-10-31 | Image Synthesis with Class-Aware Semantic Diffusion Models for Surgical Scene Segmentation | Yihang Zhou et.al. | 2410.23962 | null |
2024-10-31 | A dynamic programming principle for multiperiod control problems with bicausal constraints | Ruslan Mirmominov et.al. | 2410.23927 | null |
2024-10-31 | Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model | Hao Zhang et.al. | 2410.23905 | link |
2024-10-31 | DiffBatt: A Diffusion Model for Battery Degradation Prediction and Synthesis | Hamidreza Eivazi et.al. | 2410.23893 | link |
2024-10-31 | Denoising Diffusion Models for Anomaly Localization in Medical Images | Cosmin I. Bercea et.al. | 2410.23834 | null |
2024-10-31 | Disentangling Disentangled Representations: Towards Improved Latent Units via Diffusion Models | Youngjun Jun et.al. | 2410.23820 | null |
2024-10-31 | EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching | Xinwang Chen et.al. | 2410.23788 | link |
2024-10-30 | ReferEverything: Towards Segmenting Everything We Can Speak of in Videos | Anurag Bagchi et.al. | 2410.23287 | null |
2024-10-30 | Provable acceleration for diffusion models under minimal assumptions | Gen Li et.al. | 2410.23285 | null |
2024-10-30 | RelationBooth: Towards Relation-Aware Customized Object Generation | Qingyu Shi et.al. | 2410.23280 | null |
2024-10-30 | SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation | Yining Hong et.al. | 2410.23277 | null |
2024-10-30 | Multi-student Diffusion Distillation for Better One-step Generators | Yanke Song et.al. | 2410.23274 | null |
2024-10-30 | A uniform point vortex approximation for the solution of the two-dimensional Navier Stokes equation with transport noise | Filippo Giovagnini et.al. | 2410.23163 | null |
2024-10-30 | Identifiability of the Optimal Transport Cost on Finite Spaces | Alberto González-Sanz et.al. | 2410.23146 | null |
2024-10-30 | CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Mingkun Zhang et.al. | 2410.23091 | link |
2024-10-30 | Controlling Language and Diffusion Models by Transporting Activations | Pau Rodriguez et.al. | 2410.23054 | link |
2024-10-30 | Improving Musical Accompaniment Co-creation via Diffusion Transformers | Javier Nistal et.al. | 2410.23005 | null |
2024-10-30 | DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes | Jialiang Zhang et.al. | 2410.23004 | null |
2024-10-30 | LumiSculpt: A Consistency Lighting Control Network for Video Generation | Yuxin Zhang et.al. | 2410.22979 | null |
2024-10-30 | Private Synthetic Text Generation with Diffusion Models | Sebastian Ochs et.al. | 2410.22971 | link |
2024-10-31 | DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing Data | Hanyang Chen et.al. | 2410.22938 | link |
2024-10-30 | HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models | Shengkai Zhang et.al. | 2410.22901 | link |
2024-10-29 | Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models | Raman Dutt et.al. | 2410.22149 | link |
2024-10-29 | Averaging principle for multiscale controlled jump diffusions and associated nonlocal HJB equations | Qi Zhang et.al. | 2410.22141 | null |
2024-10-29 | Variational inference for pile-up removal at hadron colliders with diffusion models | Malte Algren et.al. | 2410.22074 | null |
2024-10-29 | Self-normalized Cramér-type Moderate Deviation of Stochastic Gradient Langevin Dynamics | Hongsheng Dai et.al. | 2410.22047 | null |
2024-10-29 | Dual Conditional Diffusion Models for Sequential Recommendation | Hongtao Huang et.al. | 2410.21967 | null |
2024-10-29 | PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference | Kendong Liu et.al. | 2410.21966 | null |
2024-10-29 | CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach | Dac Thai Nguyen et.al. | 2410.21932 | link |
2024-10-29 | Guided Diffusion-based Counterfactual Augmentation for Robust Session-based Recommendation | Muskan Gupta et.al. | 2410.21892 | null |
2024-10-29 | On invariance of observability for BSDEs and its applications to stochastic control systems | Bao-Zhu Guo et.al. | 2410.21863 | null |
2024-10-29 | Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model | Yiming Ji et.al. | 2410.21842 | null |
2024-10-29 | Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images | Suhyun Ahn et.al. | 2410.21826 | link |
2024-10-29 | Robot Policy Learning with Temporal Optimal Transport Reward | Yuwei Fu et.al. | 2410.21795 | link |
2024-10-29 | HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion | Yu Zeng et.al. | 2410.21789 | null |
2024-10-29 | DiffusionVel: Multi-Information Integrated Velocity Inversion Using Generative Diffusion Models | Hao Zhang et.al. | 2410.21776 | null |
2024-10-30 | IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models | Hang Guo et.al. | 2410.21759 | link |
2024-10-28 | On Inductive Biases That Enable Generalization of Diffusion Transformers | Jie An et.al. | 2410.21273 | link |
2024-10-28 | One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation | Zhendong Wang et.al. | 2410.21257 | null |
2024-10-28 | $\texttt{skwdro}$ : a library for Wasserstein distributionally robust machine learning | Florian Vincent et.al. | 2410.21231 | link |
2024-10-28 | On learning higher-order cumulants in diffusion models | Gert Aarts et.al. | 2410.21212 | null |
2024-10-28 | Trajectory Flow Matching with Applications to Clinical Time Series Modeling | Xi Zhang et.al. | 2410.21154 | link |
2024-10-28 | Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences | Zhihao Zhao et.al. | 2410.21130 | null |
2024-10-28 | Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models | Wenda Li et.al. | 2410.21088 | link |
2024-10-28 | Federated Time Series Generation on Feature and Temporally Misaligned Data | Chenrui Fan et.al. | 2410.21072 | null |
2024-10-28 | Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework | Vladimir Arkhipkin et.al. | 2410.21061 | link |
2024-10-28 | Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | Justin Deschenaux et.al. | 2410.21035 | link |
2024-10-28 | Reference-Free Formula Drift with Reinforcement Learning: From Driving Data to Tire Energy-Inspired, Real-World Policies | Franck Djeumou et.al. | 2410.20990 | null |
2024-10-29 | EEG-Driven 3D Object Reconstruction with Color Consistency and Diffusion Prior | Xin Xiang et.al. | 2410.20981 | null |
2024-10-28 | Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models! | Arash Marioriyad et.al. | 2410.20972 | null |
2024-10-28 | Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models | Weijian Luo et.al. | 2410.20898 | null |
2024-10-28 | Novel Object Synthesis via Adaptive Text-Image Harmony | Zeren Xiong et.al. | 2410.20823 | null |
2024-10-25 | Adversarial Environment Design via Regret-Guided Diffusion Models | Hojun Chung et.al. | 2410.19715 | null |
2024-10-25 | DiffGS: Functional Gaussian Splatting Diffusion | Junsheng Zhou et.al. | 2410.19657 | null |
2024-10-25 | Diffusion models for lattice gauge field simulations | Qianteng Zhu et.al. | 2410.19602 | null |
2024-10-25 | On the robustness of semi-discrete optimal transport | Davy Paindaveine et.al. | 2410.19596 | null |
2024-10-25 | Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series | Ilan Naiman et.al. | 2410.19538 | null |
2024-10-25 | Ensemble Data Assimilation for Particle-based Methods | Marius Duvillard et.al. | 2410.19525 | null |
2024-10-28 | NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction | Zixuan Gong et.al. | 2410.19452 | link |
2024-10-25 | Learned Reference-based Diffusion Sampling for multi-modal distributions | Maxence Noble et.al. | 2410.19449 | null |
2024-10-25 | Generative Diffusion Models for Sequential Recommendations | Sharare Zolghadr et.al. | 2410.19429 | null |
2024-10-25 | FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality | Zhengyao Lv et.al. | 2410.19355 | null |
2024-10-25 | High Resolution Seismic Waveform Generation using Denoising Diffusion | Andreas Bergmeister et.al. | 2410.19343 | null |
2024-10-25 | Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion | Emiel Hoogeboom et.al. | 2410.19324 | null |
2024-10-25 | A prescriptive theory for brain-like inference | Hadi Vafaii et.al. | 2410.19315 | null |
2024-10-25 | TEARS: Textual Representations for Scrutable Recommendations | Emiliano Penaloza et.al. | 2410.19302 | null |
2024-10-25 | A Flow-based Truncated Denoising Diffusion Model for Super-resolution Magnetic Resonance Spectroscopic Imaging | Siyuan Dong et.al. | 2410.19288 | null |
2024-10-24 | MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms | Ling-Hao Chen et.al. | 2410.18977 | null |
2024-10-24 | 3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation | Hansheng Chen et.al. | 2410.18974 | link |
2024-10-24 | On the Crucial Role of Initialization for Matrix Factorization | Bingcong Li et.al. | 2410.18965 | null |
2024-10-24 | Stable Consistency Tuning: Understanding and Improving Consistency Models | Fu-Yun Wang et.al. | 2410.18958 | link |
2024-10-24 | Generation of synthetic financial time series by diffusion models | Tomonori Takahashi et.al. | 2410.18897 | null |
2024-10-24 | The Cat and Mouse Game: The Ongoing Arms Race Between Diffusion Models and Detection Methods | Linda Laurier et.al. | 2410.18866 | null |
2024-10-24 | Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation | Xiaoyu Zhang et.al. | 2410.18830 | null |
2024-10-24 | Fast constrained sampling in pre-trained diffusion models | Alexandros Graikos et.al. | 2410.18804 | null |
2024-10-24 | Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances | Shilin Lu et.al. | 2410.18775 | link |
2024-10-25 | Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing | Haonan Lin et.al. | 2410.18756 | null |
2024-10-24 | Rectified Diffusion Guidance for Conditional Generation | Mengfei Xia et.al. | 2410.18737 | null |
2024-10-24 | Retrieval-Augmented Diffusion Models for Time Series Forecasting | Jingwei Liu et.al. | 2410.18712 | link |
2024-10-24 | Ali-AUG: Innovative Approaches to Labeled Data Augmentation using One-Step Diffusion Model | Ali Hamza et.al. | 2410.18678 | null |
2024-10-24 | DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation | Yuang Ai et.al. | 2410.18666 | link |
2024-10-25 | Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Model | Jinxu Lin et.al. | 2410.18639 | null |
2024-10-23 | DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes | Hengwei Bian et.al. | 2410.18084 | null |
2024-10-23 | Prioritized Generative Replay | Renhao Wang et.al. | 2410.18082 | null |
2024-10-23 | Optical Generative Models | Shiqi Chen et.al. | 2410.17970 | null |
2024-10-23 | A Wavelet Diffusion GAN for Image Super-Resolution | Lorenzo Aloisi et.al. | 2410.17966 | null |
2024-10-23 | Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation | Wenfang Yao et.al. | 2410.17918 | link |
2024-10-23 | Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Shansan Gong et.al. | 2410.17891 | link |
2024-10-23 | Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech | Danilo de Oliveira et.al. | 2410.17834 | null |
2024-10-23 | PGDiffSeg: Prior-Guided Denoising Diffusion Model with Parameter-Shared Attention for Breast Cancer Segmentation | Feiyan Feng et.al. | 2410.17812 | null |
2024-10-23 | AdaDiffSR: Adaptive Region-aware Dynamic Acceleration Diffusion Model for Real-World Image Super-Resolution | Yuanting Fan et.al. | 2410.17752 | null |
2024-10-23 | VISAGE: Video Synthesis using Action Graphs for Surgery | Yousef Yeganeh et.al. | 2410.17751 | null |
2024-10-23 | Optimal Impulse Control for Cyber Risk Management | Caroline Hillairet et.al. | 2410.17706 | null |
2024-10-23 | Deep Generative Models for 3D Medical Image Synthesis | Paul Friedrich et.al. | 2410.17664 | null |
2024-10-23 | Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation | Muquan Li et.al. | 2410.17606 | link |
2024-10-23 | How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization? | Jiahua Dong et.al. | 2410.17594 | link |
2024-10-23 | GDDA: Semantic OOD Detection on Graphs under Covariate Shift via Score-Based Diffusion Models | Zhixia He et.al. | 2410.17526 | null |
2024-10-22 | Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding | Yasha Ektefaie et.al. | 2410.17173 | link |
2024-10-22 | CLAP: Concave Linear APproximation for Quadratic Graph Matching | Yongqing Liang et.al. | 2410.17101 | link |
2024-10-22 | DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization | Haowei Zhu et.al. | 2410.16942 | null |
2024-10-22 | Hierarchical Clustering for Conditional Diffusion in Image Generation | Jorge da Silva Goncalves et.al. | 2410.16910 | link |
2024-10-22 | VistaDream: Sampling multiview consistent images for single-view scene reconstruction | Haiping Wang et.al. | 2410.16892 | null |
2024-10-22 | MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model | Meng Xu et.al. | 2410.16840 | null |
2024-10-22 | Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection | Laurent Colbois et.al. | 2410.16802 | link |
2024-10-22 | One-Step Diffusion Distillation through Score Implicit Matching | Weijian Luo et.al. | 2410.16794 | link |
2024-10-22 | LLM-Assisted Red Teaming of Diffusion Models through “Failures Are Fated, But Can Be Faded” | Som Sagar et.al. | 2410.16738 | null |
2024-10-22 | Polyp-E: Benchmarking the Robustness of Deep Segmentation Models via Polyp Editing | Runpu Wei et.al. | 2410.16732 | null |
2024-10-22 | DiffusionSeeder: Seeding Motion Optimization with Diffusion for Rapid Motion Planning | Huang Huang et.al. | 2410.16727 | null |
2024-10-22 | Progressive Compositionality In Text-to-Image Generative Models | Xu Han et.al. | 2410.16719 | null |
2024-10-22 | Governing equation discovery of a complex system from snapshots | Qunxi Zhu et.al. | 2410.16694 | null |
2024-10-22 | DARE: Diffusion Policy for Autonomous Robot Exploration | Yuhong Cao et.al. | 2410.16687 | null |
2024-10-22 | NucleiMix: Realistic Data Augmentation for Nuclei Instance Segmentation | Jiamu Wang et.al. | 2410.16671 | null |
2024-10-21 | MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors | Honghua Chen et.al. | 2410.16272 | null |
2024-10-21 | A Framework for Evaluating Predictive Models Using Synthetic Image Covariates and Longitudinal Data | Simon Deltadahl et.al. | 2410.16177 | null |
2024-10-22 | Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models | Giannis Daras et.al. | 2410.16152 | null |
2024-10-21 | SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation | Xinyi Zhou et.al. | 2410.16119 | null |
2024-10-21 | Continuous Speech Synthesis using per-token Latent Diffusion | Arnon Turetzky et.al. | 2410.16048 | null |
2024-10-22 | CamI2V: Camera-Controlled Image-to-Video Diffusion Model | Guangcong Zheng et.al. | 2410.15957 | link |
2024-10-21 | Global existence and mean-field limit for a stochastic interacting particle system of signed Coulomb charges | Patrick van Meurs et.al. | 2410.15855 | null |
2024-10-21 | Learning signals defined on graphs with optimal transport and Gaussian process regression | Raphaël Carpintero Perez et.al. | 2410.15721 | null |
2024-10-21 | Quantiles and Quantile Regression on Riemannian Manifolds: a measure-transportation-based approach | Marc Hallin et.al. | 2410.15711 | null |
2024-10-21 | Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces | Jifeng Hu et.al. | 2410.15698 | null |
2024-10-21 | Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation | Anh Bui et.al. | 2410.15618 | link |
2024-10-20 | Data Augmentation via Diffusion Model to Enhance AI Fairness | Christina Hastings Blow et.al. | 2410.15470 | null |
2024-10-20 | MedDiff-FM: A Diffusion-based Foundation Model for Versatile Medical Image Applications | Yongrui Yu et.al. | 2410.15432 | null |
2024-10-20 | ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps | Yulin Song et.al. | 2410.15342 | null |
2024-10-20 | Diffusion-PINN Sampler | Zhekun Shi et.al. | 2410.15336 | null |
2024-10-18 | A Lipschitz spaces view of infinitely wide shallow neural networks | Francesca Bartolucci et.al. | 2410.14591 | null |
2024-10-18 | Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets | Namid R. Stillman et.al. | 2410.14587 | null |
2024-10-18 | Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior | Calvin-Khang Ta et.al. | 2410.14540 | null |
2024-10-18 | LEAD: Latent Realignment for Human Motion Diffusion | Nefeli Andreou et.al. | 2410.14508 | null |
2024-10-18 | Reinforcement Learning in Non-Markov Market-Making | Luca Lalor et.al. | 2410.14504 | null |
2024-10-18 | ANT: Adaptive Noise Schedule for Time Series Diffusion Models | Seunghan Lee et.al. | 2410.14488 | link |
2024-10-18 | DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation | Junjie Wu et.al. | 2410.14481 | null |
2024-10-18 | FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models | Rui Hu et.al. | 2410.14429 | null |
2024-10-18 | Dynamic Negative Guidance of Diffusion Models | Felix Koulischer et.al. | 2410.14398 | null |
2024-10-18 | Unscrambling disease progression at scale: fast inference of event permutations with optimal transport | Peter A. Wijeratne et.al. | 2410.14388 | null |
2024-10-18 | HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation | Bo Cheng et.al. | 2410.14324 | link |
2024-10-18 | A class of kernel-based scalable algorithms for data science | Philippe G. LeFloch et.al. | 2410.14323 | null |
2024-10-18 | ClearSR: Latent Low-Resolution Image Embeddings Help Diffusion-Based Real-World Super Resolution Models See Clearer | Yuhao Wan et.al. | 2410.14279 | null |
2024-10-18 | HYPNOS: Highly Precise Foreground-focused Diffusion Finetuning for Inanimate Objects | Oliverio Theophilus Nathanael et.al. | 2410.14265 | null |
2024-10-18 | ERDDCI: Exact Reversible Diffusion via Dual-Chain Inversion for High-Quality Image Editing | Jimin Dai et.al. | 2410.14247 | null |
2024-10-17 | Diffusing States and Matching Scores: A New Framework for Imitation Learning | Runzhe Wu et.al. | 2410.13855 | link |
2024-10-17 | Influence Functions for Scalable Data Attribution in Diffusion Models | Bruno Mlodozeniec et.al. | 2410.13850 | null |
2024-10-17 | Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning | Xiaodan Xing et.al. | 2410.13823 | link |
2024-10-17 | ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution | Junhao Gu et.al. | 2410.13807 | null |
2024-10-17 | Probing the Latent Hierarchical Structure of Data via Diffusion Models | Antonio Sclocchi et.al. | 2410.13770 | null |
2024-10-17 | Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers | Yuchen Liang et.al. | 2410.13746 | null |
2024-10-17 | Improved Convergence Rate for Diffusion Probabilistic Models | Gen Li et.al. | 2410.13738 | null |
2024-10-18 | DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation | Hanbo Cheng et.al. | 2410.13726 | link |
2024-10-18 | Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion | Yijun Liang et.al. | 2410.13674 | link |
2024-10-17 | Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | Chenyu Wang et.al. | 2410.13643 | link |
2024-10-17 | Preference Aligned Diffusion Planner for Quadrupedal Locomotion Control | Xinyi Yuan et.al. | 2410.13586 | null |
2024-10-17 | Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data? | Che Liu et.al. | 2410.13523 | null |
2024-10-17 | Solving Prior Distribution Mismatch in Diffusion Models via Optimal Transport | Zhanpeng Wang et.al. | 2410.13431 | null |
2024-10-17 | MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models | Donghao Zhou et.al. | 2410.13370 | null |
2024-10-17 | DiffImp: Efficient Diffusion Model for Probabilistic Time Series Imputation with Bidirectional Mamba Backbone | Hongfan Gao et.al. | 2410.13338 | null |
2024-10-16 | Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts | Hongcheng Gao et.al. | 2410.12777 | link |
2024-10-16 | SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Jaehong Yoon et.al. | 2410.12761 | null |
2024-10-16 | Geometry and Duality of Alternating Markov Chains | Deven Mithal et.al. | 2410.12721 | null |
2024-10-16 | Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization | Xingqi Wang et.al. | 2410.12700 | link |
2024-10-16 | AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing | DuoSheng Chen et.al. | 2410.12696 | null |
2024-10-16 | One Step Diffusion via Shortcut Models | Kevin Frans et.al. | 2410.12557 | link |
2024-10-16 | Disentangling data distribution for Federated Learning | Xinyuan Zhao et.al. | 2410.12530 | null |
2024-10-16 | Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing | Mingce Guo et.al. | 2410.12526 | null |
2024-10-16 | Price impact and long-term profitability of energy storage | Roxana Dumitrescu et.al. | 2410.12495 | null |
2024-10-16 | Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective | Yongxin Zhu et.al. | 2410.12490 | link |
2024-10-16 | A Class of Degenerate Mean Field Games, Associated FBSDEs and Master Equations | Alain Bensoussan et.al. | 2410.12404 | null |
2024-10-16 | DaDiff: Domain-aware Diffusion Model for Nighttime UAV Tracking | Haobo Zuo et.al. | 2410.12270 | link |
2024-10-16 | FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation | Huadai Liu et.al. | 2410.12266 | null |
2024-10-17 | Expected Sliced Transport Plans | Xinran Liu et.al. | 2410.12176 | null |
2024-10-16 | Preference Optimization with Multi-Sample Comparisons | Chaoqi Wang et.al. | 2410.12138 | null |
2024-10-15 | High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion | Junhwa Hur et.al. | 2410.11838 | null |
2024-10-15 | On the Effectiveness of Dataset Alignment for Fake Image Detection | Anirudh Sundara Rajan et.al. | 2410.11835 | null |
2024-10-15 | Bayesian Experimental Design via Contrastive Diffusions | Jacopo Iollo et.al. | 2410.11826 | link |
2024-10-15 | Improving Long-Text Alignment for Text-to-Image Diffusion Models | Luping Liu et.al. | 2410.11817 | link |
2024-10-15 | SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing | Zhiyuan Zhang et.al. | 2410.11815 | null |
2024-10-16 | Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Zhiyuan Ma et.al. | 2410.11795 | null |
2024-10-15 | Probabilistic Principles for Biophysics and Neuroscience: Entropy Production, Bayesian Mechanics & the Free-Energy Principle | Lancelot Da Costa et.al. | 2410.11735 | null |
2024-10-15 | Patch-Based Diffusion Models Beat Whole-Image Models for Mismatched Distribution Inverse Problems | Jason Hu et.al. | 2410.11730 | null |
2024-10-15 | On the potential of Optimal Transport in Geospatial Data Science | Nina Wiedemann et.al. | 2410.11709 | link |
2024-10-15 | Optimal Finite-time Maxwell’s Demons in Langevin Systems | Takuya Kamijima et.al. | 2410.11603 | null |
2024-10-15 | DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment | Wendi Chen et.al. | 2410.11584 | link |
2024-10-15 | Bayesian inference of mixed Gaussian phylogenetic models | Bayu Brahmantio et.al. | 2410.11548 | link |
2024-10-15 | Riemann-Liouville fractional Brownian motion with random Hurst exponent | Hubert Woszczek et.al. | 2410.11546 | null |
2024-10-15 | InvSeg: Test-Time Prompt Inversion for Semantic Segmentation | Jiayi Lin et.al. | 2410.11473 | null |
2024-10-15 | A Simple Approach to Unifying Diffusion-based Conditional Generation | Xirui Li et.al. | 2410.11439 | null |
2024-10-14 | Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models | Jingzhi Bao et.al. | 2410.10821 | link |
2024-10-14 | Depth Any Video with Scalable Synthetic Data | Honghui Yang et.al. | 2410.10815 | link |
2024-10-14 | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Haotian Tang et.al. | 2410.10812 | link |
2024-10-14 | TrajDiffuse: A Conditional Diffusion Model for Environment-Aware Trajectory Prediction | Qingze et.al. | 2410.10804 | link |
2024-10-14 | Boosting Camera Motion Control for Video Diffusion Transformers | Soon Yau Cheong et.al. | 2410.10802 | null |
2024-10-14 | Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations | Litu Rout et.al. | 2410.10792 | null |
2024-10-14 | ControlMM: Controllable Masked Motion Generation | Ekkasit Pinyoanuntapong et.al. | 2410.10780 | null |
2024-10-14 | Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation | Youwei Yu et.al. | 2410.10766 | null |
2024-10-14 | DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships | Zhang Wan et.al. | 2410.10751 | null |
2024-10-14 | FlexGen: Flexible Multi-View Generation from Text and Image Inputs | Xinli Xu et.al. | 2410.10745 | null |
2024-10-14 | Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models | Junyu Chen et.al. | 2410.10733 | link |
2024-10-14 | TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model | Jiazhi Guan et.al. | 2410.10696 | null |
2024-10-14 | Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation | Peiwen Sun et.al. | 2410.10676 | null |
2024-10-14 | Generating Model Parameters for Controlling: Parameter Diffusion for Controllable Multi-Task Recommendation | Chenglei Shen et.al. | 2410.10639 | null |
2024-10-15 | SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Enze Xie et.al. | 2410.10629 | null |
2024-10-11 | SceneCraft: Layout-Guided 3D Scene Generation | Xiuyu Yang et.al. | 2410.09049 | link |
2024-10-11 | Linear Convergence of Diffusion Models Under the Manifold Hypothesis | Peter Potaptchik et.al. | 2410.09046 | null |
2024-10-11 | Semantic Score Distillation Sampling for Compositional Text-to-3D Generation | Ling Yang et.al. | 2410.09009 | link |
2024-10-11 | WaveDiffusion: Exploring Full Waveform Inversion via Joint Diffusion in the Latent Space | Hanchen Wang et.al. | 2410.09002 | null |
2024-10-11 | Gradient-adjusted underdamped Langevin dynamics for sampling | Xinzhe Zuo et.al. | 2410.08987 | null |
2024-10-11 | DiffPO: A causal diffusion model for learning distributions of potential outcomes | Yuchen Ma et.al. | 2410.08924 | null |
2024-10-11 | Lifelong Event Detection via Optimal Transport | Viet Dao et.al. | 2410.08905 | null |
2024-10-11 | Domain decomposition for entropic unbalanced optimal transport | Ismael Medina et.al. | 2410.08859 | link |
2024-10-11 | Zero-Shot Offline Imitation Learning via Optimal Transport | Thomas Rupf et.al. | 2410.08751 | link |
2024-10-11 | Multi-dimensional non-Markovian backward stochastic differential equations of interactively quadratic generators | Shengjun Fan et.al. | 2410.08748 | null |
2024-10-11 | Distillation of Discrete Diffusion through Dimensional Correlations | Satoshi Hayakawa et.al. | 2410.08709 | null |
2024-10-14 | Gait Sequence Upsampling using Diffusion Models for Single LiDAR Sensors | Jeongho Ahn et.al. | 2410.08680 | null |
2024-10-11 | E-Motion: Future Motion Simulation via Event Sequence Diffusion | Song Wu et.al. | 2410.08649 | link |
2024-10-11 | Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism via Dual Diffusion Models and GPT Prompting | Purushothaman Natarajan et.al. | 2410.08612 | link |
2024-10-11 | Context-Aware Full Body Anonymization using Text-to-Image Diffusion Models | Pascal Zwick et.al. | 2410.08551 | link |
2024-10-10 | DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models | Xiaoxiao He et.al. | 2410.08207 | null |
2024-10-10 | HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation | Shanyan Guan et.al. | 2410.08192 | null |
2024-10-10 | DifFRelight: Diffusion-Based Facial Performance Relighting | Mingming He et.al. | 2410.08188 | null |
2024-10-10 | ZeroComp: Zero-shot Object Compositing from Image Intrinsics via Diffusion | Zitian Zhang et.al. | 2410.08168 | null |
2024-10-10 | DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | Jiatao Gu et.al. | 2410.08159 | null |
2024-10-10 | Progressive Autoregressive Video Diffusion Models | Desai Xie et.al. | 2410.08151 | link |
2024-10-10 | Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction | Jarrid Rector-Brooks et.al. | 2410.08134 | null |
2024-10-10 | On Barycenter Computation: Semi-Unbalanced Optimal Transport-based Method on Gaussians | Ngoc-Hai Nguyen et.al. | 2410.08117 | null |
2024-10-10 | CrackSegDiff: Diffusion Probability Model-based Multi-modal Crack Segmentation | Xiaoyan Jiang et.al. | 2410.08100 | link |
2024-10-10 | Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models | Vinith M. Suriyakumar et.al. | 2410.08074 | null |
2024-10-10 | Optimal Transportation by Orthogonal Coupling Dynamics | Mohsen Sadr et.al. | 2410.08060 | null |
2024-10-10 | LADIMO: Face Morph Generation through Biometric Template Inversion with Latent Diffusion | Marcel Grimmer et.al. | 2410.07988 | link |
2024-10-10 | Convex comparison of Gaussian mixtures | Benjamin Jourdain et.al. | 2410.07958 | null |
2024-10-10 | AI Surrogate Model for Distributed Computing Workloads | David K. Park et.al. | 2410.07940 | null |
2024-10-10 | Congestion and Penalization in Optimal Transport | Marcelo Gallardo et.al. | 2410.07363 | null |
2024-10-09 | IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | Xinchen Zhang et.al. | 2410.07171 | link |
2024-10-09 | AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation | Yukang Cao et.al. | 2410.07164 | null |
2024-10-09 | InstructG2I: Synthesizing Images from Multimodal Attributed Graphs | Bowen Jin et.al. | 2410.07157 | link |
2024-10-09 | Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis | Bohan Zeng et.al. | 2410.07155 | link |
2024-10-09 | Through the Looking Glass: Mirror Schrödinger Bridges | Leticia Mattos Da Silva et.al. | 2410.07003 | null |
2024-10-09 | Diffusion Density Estimators | Akhil Premkumar et.al. | 2410.06986 | null |
2024-10-09 | Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control | Shimon Vainer et.al. | 2410.06985 | null |
2024-10-09 | Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think | Sihyun Yu et.al. | 2410.06940 | link |
2024-10-09 | Boosting Few-Shot Detection with Large Language Models and Layout-to-Image Synthesis | Ahmed Abdullah et.al. | 2410.06841 | null |
2024-10-09 | Diffuse or Confuse: A Diffusion Deepfake Speech Dataset | Anton Firc et.al. | 2410.06796 | link |
2024-10-09 | Diff-FMT: Diffusion Models for Fluorescence Molecular Tomography | Qianqian Xue et.al. | 2410.06757 | null |
2024-10-10 | Suppress Content Shift: Better Diffusion Features via Off-the-Shelf Generation Techniques | Benyuan Meng et.al. | 2410.06719 | link |
2024-10-09 | Decouple-Then-Merge: Towards Better Training for Diffusion Models | Qianli Ma et.al. | 2410.06664 | null |
2024-10-09 | WardropNet: Traffic Flow Predictions via Equilibrium-Augmented Learning | Kai Jungel et.al. | 2410.06656 | link |
2024-10-10 | DeepMuon: Accelerating Cosmic-Ray Muon Simulation Based on Optimal Transport | Ao-Bo Wang et.al. | 2410.06539 | link |
2024-10-07 | DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control | Kaifeng Zhao et.al. | 2410.05260 | null |
2024-10-07 | GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting | Yukang Cao et.al. | 2410.05259 | null |
2024-10-07 | SePPO: Semi-Policy Preference Optimization for Diffusion Alignment | Daoan Zhang et.al. | 2410.05255 | link |
2024-10-07 | DiffuseReg: Denoising Diffusion Model for Obtaining Deformation Fields in Unsupervised Deformable Image Registration | Yongtai Zhuo et.al. | 2410.05234 | link |
2024-10-07 | Presto! Distilling Steps and Layers for Accelerating Music Generation | Zachary Novack et.al. | 2410.05167 | null |
2024-10-08 | A Simulation-Free Deep Learning Approach to Stochastic Optimal Control | Mengjian Hua et.al. | 2410.05163 | null |
2024-10-07 | Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information | Timofey Efimov et.al. | 2410.05143 | null |
2024-10-07 | Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning | Ayano Hiranaka et.al. | 2410.05116 | null |
2024-10-07 | DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects | Nidhi Mathihalli et.al. | 2410.05097 | link |
2024-10-07 | A nodally bound-preserving discontinuous Galerkin method for the drift-diffusion equation | Gabriel R. Barrenechea et.al. | 2410.05040 | null |
2024-10-07 | Revealing Directions for Text-guided 3D Face Editing | Zhuo Chen et.al. | 2410.04965 | null |
2024-10-07 | Low-Rank Continual Personalization of Diffusion Models | Łukasz Staniszewski et.al. | 2410.04891 | null |
2024-10-07 | Patch is Enough: Naturalistic Adversarial Patch against Vision-Language Pre-training Models | Dehong Kong et.al. | 2410.04884 | null |
2024-10-07 | Artificial Barriers for stochastic differential equations and for construction of Boundary-preserving schemes | Johan Ulander et.al. | 2410.04850 | null |
2024-10-07 | Real-time cardiac cine MRI – A comparison of a diffusion probabilistic model with alternative state-of-the-art image reconstruction techniques for undersampled spiral acquisitions | Oliver Schad et.al. | 2410.04843 | null |
2024-10-04 | Estimating Body and Hand Motion in an Ego-sensed World | Brent Yi et.al. | 2410.03665 | null |
2024-10-04 | Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models | Chumeng Liang et.al. | 2410.03640 | link |
2024-10-04 | How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework | Yinuo Ren et.al. | 2410.03601 | null |
2024-10-04 | Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features | Benyuan Meng et.al. | 2410.03558 | link |
2024-10-04 | Diffusion State-Guided Projected Gradient for Inverse Problems | Rayhan Zirvi et.al. | 2410.03463 | null |
2024-10-04 | Generative Semantic Communication for Text-to-Speech Synthesis | Jiahao Zheng et.al. | 2410.03459 | null |
2024-10-04 | Dynamic Diffusion Transformer | Wangbo Zhao et.al. | 2410.03456 | link |
2024-10-04 | CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control | Guy Tevet et.al. | 2410.03441 | link |
2024-10-04 | Sparsity of Quadratically Regularized Optimal Transport: Bounds on concentration and bias | Johannes Wiesel et.al. | 2410.03425 | null |
2024-10-04 | One2set + Large Language Model: Best Partners for Keyphrase Generation | Liangying Shao et.al. | 2410.03421 | link |
2024-10-04 | The scaling behaviour of localised and extended states in one-dimensional tight-binding models with disorder | Luca Schaefer et.al. | 2410.03405 | null |
2024-10-04 | Latent Abstractions in Generative Diffusion Models | Giulio Franzese et.al. | 2410.03368 | null |
2024-10-04 | LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding | Doohyuk Jang et.al. | 2410.03355 | null |
2024-10-04 | Sparsity of Quadratically Regularized Optimal Transport: Scalar Case | Alberto González-Sanz et.al. | 2410.03353 | null |
2024-10-04 | Optimal Transport for $ε$-Contaminated Credal Sets | Michele Caprio et.al. | 2410.03267 | null |
2024-10-03 | Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models | Zhengfeng Lai et.al. | 2410.02740 | null |
2024-10-03 | NETS: A Non-Equilibrium Transport Sampler | Michael S. Albergo et.al. | 2410.02711 | null |
2024-10-03 | SteerDiff: Steering towards Safe Text-to-Image Diffusion Models | Hongxiang Zhang et.al. | 2410.02710 | null |
2024-10-03 | ControlAR: Controllable Image Generation with Autoregressive Models | Zongming Li et.al. | 2410.02705 | link |
2024-10-03 | Unsupervised Point Cloud Completion through Unbalanced Optimal Transport | Taekyung Lee et.al. | 2410.02671 | null |
2024-10-03 | GUD: Generation with Unified Diffusion | Mathis Gerdes et.al. | 2410.02667 | null |
2024-10-03 | Scalable Simulation-free Entropic Unbalanced Optimal Transport | Jaemoo Choi et.al. | 2410.02656 | null |
2024-10-03 | Efficient calibration of the shifted square-root diffusion model to credit default swap spreads using asymptotic approximations | Ankush Agarwal et.al. | 2410.02645 | null |
2024-10-03 | Inverse Entropic Optimal Transport Solves Semi-supervised Learning via Data Likelihood Maximization | Mikhail Persiianov et.al. | 2410.02628 | null |
2024-10-03 | Diffusion & Adversarial Schrödinger Bridges via Iterative Proportional Markovian Fitting | Sergei Kholkin et.al. | 2410.02601 | null |
2024-10-04 | Diffusion Models are Evolutionary Algorithms | Yanbo Zhang et.al. | 2410.02543 | link |
2024-10-03 | Lightweight Diffusion Models for Resource-Constrained Semantic Communication | Giovanni Pignata et.al. | 2410.02491 | link |
2024-10-03 | Towards a Theoretical Understanding of Memorization in Diffusion Models | Yunhao Chen et.al. | 2410.02467 | null |
2024-10-03 | Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models | Seyedmorteza Sadat et.al. | 2410.02416 | null |
2024-10-03 | Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks | Zeyu Feng et.al. | 2410.02389 | null |
2024-10-02 | FabricDiffusion: High-Fidelity Texture Transfer for 3D Garments Generation from In-The-Wild Clothing Images | Cheng Zhang et.al. | 2410.01801 | null |
2024-10-02 | Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | Yangming Li et.al. | 2410.01796 | null |
2024-10-02 | Learning To Solve Differential Equation Constrained Optimization Problems | Vincenzo Di Vito et.al. | 2410.01786 | null |
2024-10-02 | Dynamical-generative downscaling of climate model ensembles | Ignacio Lopez-Gomez et.al. | 2410.01776 | null |
2024-10-02 | ImageFolder: Autoregressive Image Generation with Folded Tokens | Xiang Li et.al. | 2410.01756 | link |
2024-10-02 | VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models | Kailai Feng et.al. | 2410.01738 | link |
2024-10-02 | HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration | Yushi Huang et.al. | 2410.01723 | null |
2024-10-02 | KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models | Pouyan Navard et.al. | 2410.01595 | link |
2024-10-02 | MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation | Mingzhen Sun et.al. | 2410.01594 | link |
2024-10-02 | HRTF Estimation using a Score-based Prior | Etienne Thuillier et.al. | 2410.01562 | null |
2024-10-02 | Weighted $L^p~(p\geq1)$ solutions of random time horizon BSDEs with stochastic monotonicity generators | Xinying Li et.al. | 2410.01543 | null |
2024-10-02 | Edge-preserving noise for diffusion models | Jente Vandersanden et.al. | 2410.01540 | null |
2024-10-02 | Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation | Jun Hyeong Kim et.al. | 2410.01500 | null |
2024-10-02 | Modeling Cosmic-Ray Transport: A CRPropa based stochastic differential equation solver | Lukas Merten et.al. | 2410.01472 | null |
2024-10-02 | Information-Theoretical Principled Trade-off between Jailbreakability and Stealthiness on Vision Language Models | Ching-Chia Kao et.al. | 2410.01438 | null |
2024-09-30 | COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models | Divyanshu Daiya et.al. | 2409.20502 | null |
2024-09-30 | FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing | Lingling Cai et.al. | 2409.20500 | null |
2024-09-30 | A mean field Jacobi process for modeling sustainable tourism | Hidekazu Yoshioka et.al. | 2409.20347 | null |
2024-09-30 | Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems | Hongkai Zheng et.al. | 2409.20175 | null |
2024-09-30 | Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model | Fulong Ma et.al. | 2409.20164 | null |
2024-09-30 | Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation | Rong Tang et.al. | 2409.20124 | null |
2024-09-30 | Reaction-diffusion model for a population structured in phenotype and space I – Criterion for persistence | Nathanaël Boutillon et.al. | 2409.20118 | null |
2024-09-30 | RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models | Jangyeong Kim et.al. | 2409.19989 | null |
2024-09-30 | Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function | Chenyi Zhuang et.al. | 2409.19967 | link |
2024-10-02 | Image Copy Detection for Diffusion Models | Wenhao Wang et.al. | 2409.19952 | null |
2024-09-30 | Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner | Chenyou Fan et.al. | 2409.19949 | null |
2024-09-30 | Replace Anyone in Videos | Xiang Wang et.al. | 2409.19911 | null |
2024-09-30 | The only admissible way of merging e-values | Ruodu Wang et.al. | 2409.19888 | null |
2024-09-30 | Partial Stochastic Dominance via Optimal Transport | Takashi Kamihigashi et.al. | 2409.19876 | null |
2024-09-30 | GameLabel-10K: Collecting Image Preference Data Through Mobile Game Crowdsourcing | Jonathan Zhou et.al. | 2409.19830 | null |
2024-09-27 | $O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions | Gen Li et.al. | 2409.18959 | null |
2024-09-27 | ReviveDiff: A Universal Diffusion Model for Restoring Images in Adverse Weather Conditions | Wenfeng Huang et.al. | 2409.18932 | null |
2024-09-27 | Unsupervised Low-light Image Enhancement with Lookup Tables and Diffusion Priors | Yunlong Lin et.al. | 2409.18899 | null |
2024-09-27 | Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis | Songrui Wang et.al. | 2409.18897 | null |
2024-09-27 | Explainable Artifacts for Synthetic Western Blot Source Attribution | João Phillipe Cardenuto et.al. | 2409.18881 | link |
2024-09-27 | Emu3: Next-Token Prediction is All You Need | Xinlong Wang et.al. | 2409.18869 | null |
2024-09-27 | Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions | Iskander Azangulov et.al. | 2409.18804 | null |
2024-09-27 | Unsupervised Fingerphoto Presentation Attack Detection With Diffusion Models | Hailin Li et.al. | 2409.18636 | null |
2024-09-27 | Treating Brain-inspired Memories as Priors for Diffusion Model to Forecast Multivariate Time Series | Muyao Wang et.al. | 2409.18491 | null |
2024-09-27 | Gradient-free Decoder Inversion in Latent Diffusion Models | Seongmin Hong et.al. | 2409.18442 | null |
2024-09-27 | GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation | Jiawei Lu et.al. | 2409.18401 | null |
2024-09-27 | Multi-hypotheses Conditioned Point Cloud Diffusion for 3D Human Reconstruction from Occluded Images | Donghwan Kim et.al. | 2409.18364 | link |
2024-09-27 | Generative AI for fast and accurate Statistical Computation of Fluids | Roberto Molinaro et.al. | 2409.18359 | null |
2024-09-26 | Harnessing Wavelet Transformations for Generalizable Deepfake Forgery Detection | Lalith Bharadwaj Baru et.al. | 2409.18301 | link |
2024-09-26 | Synthesizing beta-amyloid PET images from T1-weighted Structural MRI: A Preliminary Study | Qing Lyu et.al. | 2409.18282 | null |
2024-09-26 | FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner | Wenliang Zhao et.al. | 2409.18128 | link |
2024-09-26 | Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction | Jing He et.al. | 2409.18124 | null |
2024-09-26 | EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation | Jiaxiang Tang et.al. | 2409.18114 | null |
2024-09-26 | Nonnegative cross-curvature in infinite dimensions: synthetic definition and spaces of measures | Flavien Léger et.al. | 2409.18112 | null |
2024-09-26 | StackGen: Generating Stable Structures from Silhouettes via Diffusion | Luzhe Sun et.al. | 2409.18098 | null |
2024-09-26 | DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models | Helin Cao et.al. | 2409.18092 | null |
2024-09-26 | Stable Video Portraits | Mirela Ostrek et.al. | 2409.18083 | null |
2024-09-26 | PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging | Xin Cai et.al. | 2409.17996 | null |
2024-09-26 | Joint Localization and Planning using Diffusion | L. Lao Beyer et.al. | 2409.17995 | null |
2024-09-26 | CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors | Linye Lyu et.al. | 2409.17963 | null |
2024-09-26 | Relativistic diffusion model for hadron production in p-Pb collisions at the LHC | Philipp Schulz et.al. | 2409.17960 | null |
2024-09-26 | Pioneering Reliable Assessment in Text-to-Image Knowledge Editing: Leveraging a Fine-Grained Dataset and an Innovative Criterion | Hengrui Gu et.al. | 2409.17928 | link |
2024-09-26 | Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation | Qihan Huang et.al. | 2409.17920 | link |
2024-09-26 | Physics-aligned Schrödinger bridge | Zeyu Li et.al. | 2409.17825 | null |
2024-09-26 | Continual learning with task specialist | Indu Solomon et.al. | 2409.17806 | null |
2024-09-25 | DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion | Yukun Huang et.al. | 2409.17145 | link |
2024-09-25 | Strong solutions to degenerate SDEs and uniqueness for degenerate Fokker-Planck equations | Sebastian Grube et.al. | 2409.17135 | null |
2024-09-25 | Language-oriented Semantic Communication for Image Transmission with Fine-Tuned Diffusion Model | Xinfeng Wei et.al. | 2409.17104 | null |
2024-09-25 | Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors | Aiping Zhang et.al. | 2409.17058 | link |
2024-09-25 | ControlCity: A Multimodal Diffusion Model Based Approach for Accurate Geospatial Data Generation and Urban Morphology Analysis | Fangshuo Zhou et.al. | 2409.17049 | link |
2024-09-25 | Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion | Vineet Punyamoorty et.al. | 2409.16950 | null |
2024-09-25 | DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling | Kyuheon Jung et.al. | 2409.16949 | link |
2024-09-25 | Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model | Hongliang Zhong et.al. | 2409.16938 | link |
2024-09-25 | Weak Closed-loop Solvability of Linear Quadratic Stochastic Optimal Control Problems with Partial Information | Xun Li et.al. | 2409.16924 | null |
2024-09-25 | Automating Traffic Model Enhancement with AI Research Agent | Xusen Guo et.al. | 2409.16876 | null |
2024-09-25 | A Versatile and Differentiable Hand-Object Interaction Representation | Théo Morales et.al. | 2409.16855 | null |
2024-09-25 | Analytical assessment of workers’ safety concerning direct and indirect ways of getting infected by dangerous pathogen | Krzysztof Domino et.al. | 2409.16809 | null |
2024-09-25 | Layout-Corrector: Alleviating Layout Sticking Phenomenon in Discrete Diffusion Model | Shoma Iwai et.al. | 2409.16689 | null |
2024-09-25 | CasFT: Future Trend Modeling for Information Popularity Prediction with Dynamic Cues-Driven Diffusion Models | Xin Jing et.al. | 2409.16619 | null |
2024-09-25 | BSDEs driven by G-Brownian motion with time-varying uniformly continuous generators | Bingru Zhao et.al. | 2409.16574 | null |
2024-09-18 | Massively Multi-Person 3D Human Motion Forecasting with Scene Context | Felix B Mueller et.al. | 2409.12189 | link |
2024-09-18 | MoRAG – Multi-Fusion Retrieval Augmented Generation for Human Motion | Kalakonda Sai Shashank et.al. | 2409.12140 | null |
2024-09-18 | Cyclicity Analysis of the Ornstein-Uhlenbeck Process | Vivek Kaushik et.al. | 2409.12102 | null |
2024-09-18 | Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance | Jaehoon Joo et.al. | 2409.12099 | null |
2024-09-18 | Denoising diffusion models for high-resolution microscopy image restoration | Pamela Osuna-Vargas et.al. | 2409.12078 | null |
2024-09-18 | SFDA-rPPG: Source-Free Domain Adaptive Remote Physiological Measurement with Spatio-Temporal Consistency | Yiping Xie et.al. | 2409.12040 | null |
2024-09-18 | LEMON: Localized Editing with Mesh Optimization and Neural Shaders | Furkan Mert Algan et.al. | 2409.12024 | null |
2024-09-18 | Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models | Lorenzo Mandelli et.al. | 2409.11920 | null |
2024-09-18 | DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech | Xin Qi et.al. | 2409.11835 | null |
2024-09-18 | RaggeDi: Diffusion-based State Estimation of Disordered Rags, Sheets, Towels and Blankets | Jikai Ye et.al. | 2409.11831 | null |
2024-09-18 | InverseMeetInsert: Robust Real Image Editing via Geometric Accumulation Inversion in Guided Diffusion Models | Yan Zheng et.al. | 2409.11734 | null |
2024-09-18 | GUNet: A Graph Convolutional Network United Diffusion Model for Stable and Diversity Pose Generation | Shuowen Liang et.al. | 2409.11689 | link |
2024-09-18 | Recurrent Interpolants for Probabilistic Time Series Prediction | Yu Chen et.al. | 2409.11684 | null |
2024-09-18 | SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation | Mingze Sun et.al. | 2409.11682 | null |
2024-09-18 | Electromagnetic Property Sensing and Channel Reconstruction Based on Diffusion Schrödinger Bridge in ISAC | Yuhua Jiang et.al. | 2409.11651 | null |
2024-09-17 | Ultrasound Image Enhancement with the Variance of Diffusion Models | Yuxin Zhang et.al. | 2409.11380 | link |
2024-09-17 | OSV: One Step is Enough for High-Quality Image to Video Generation | Xiaofeng Mao et.al. | 2409.11367 | null |
2024-09-17 | Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think | Gonzalo Martin Garcia et.al. | 2409.11355 | link |
2024-09-17 | OmniGen: Unified Image Generation | Shitao Xiao et.al. | 2409.11340 | link |
2024-09-17 | Parameter dependent rough SDEs with applications to rough PDEs | Fabio Bugini et.al. | 2409.11330 | null |
2024-09-17 | fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction | Jianxiong Gao et.al. | 2409.11315 | null |
2024-09-17 | DroneDiffusion: Robust Quadrotor Dynamics Learning with Diffusion Models | Avirup Das et.al. | 2409.11292 | null |
2024-09-17 | Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models | Tianqi Chen et.al. | 2409.11219 | null |
2024-09-17 | High-Resolution Speech Restoration with Latent Diffusion Model | Tushar Dhyani et.al. | 2409.11145 | null |
2024-09-17 | In-situ measurements of light diffusion in an optically dense atomic ensemble | Antoine Glicenstein et.al. | 2409.11117 | null |
2024-09-17 | TacDiffusion: Force-domain Diffusion Policy for Precise Tactile Manipulation | Yansong Wu et.al. | 2409.11047 | null |
2024-09-17 | Enhanced segmentation of femoral bone metastasis in CT scans of patients using synthetic data generation with 3D diffusion models | Emile Saillard et.al. | 2409.11011 | null |
2024-09-17 | Local discontinuous Galerkin method for nonlinear BSPDEs of Neumann boundary conditions with deep backward dynamic programming time-marching | Yixiang Dai et.al. | 2409.11004 | null |
2024-09-17 | Edge-based Denoising Image Compression | Ryugo Morita et.al. | 2409.10978 | null |
2024-09-17 | CUNSB-RFIE: Context-aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement | Xuanzhao Dong et.al. | 2409.10966 | link |
2024-09-16 | Incorporating Classifier-Free Guidance in Diffusion Model-Based Recommendation | Noah Buchanan et.al. | 2409.10494 | null |
2024-09-16 | SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing | Qi Qian et.al. | 2409.10476 | null |
2024-09-16 | MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion | Lehong Wu et.al. | 2409.10473 | null |
2024-09-16 | Mamba-ST: State Space Model for Efficient Style Transfer | Filippo Botti et.al. | 2409.10385 | link |
2024-09-16 | Stochastic Control of UAVs: An Optimal Tradeoff between Performance, Flight Smoothness and Control Effort | George Rapakoulias et.al. | 2409.10369 | null |
2024-09-16 | Taming Diffusion Models for Image Restoration: A Review | Ziwei Luo et.al. | 2409.10353 | null |
2024-09-16 | Fairness, not Emotion, Drives Socioeconomic Decision Making | Rudra Mukhopadhyay et.al. | 2409.10322 | null |
2024-09-16 | DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis | Fa-Ting Hong et.al. | 2409.10281 | null |
2024-09-16 | RealDiff: Real-world 3D Shape Completion using Self-Supervised Diffusion Models | Başak Melis Öcal et.al. | 2409.10180 | null |
2024-09-16 | PSHuman: Photorealistic Single-view Human Reconstruction using Cross-Scale Diffusion | Peng Li et.al. | 2409.10141 | null |
2024-09-16 | Approximating the signature of Brownian motion for high order SDE simulation | James Foster et.al. | 2409.10118 | link |
2024-09-16 | DDoS: Diffusion Distribution Similarity for Out-of-Distribution Detection | Kun Fang et.al. | 2409.10094 | null |
2024-09-16 | MotionCom: Automatic and Motion-Aware Image Composition with LLM and Video Diffusion Prior | Weijing Tao et.al. | 2409.10090 | link |
2024-09-16 | Cross-modality image synthesis from TOF-MRA to CTA using diffusion-based models | Alexander Koch et.al. | 2409.10089 | null |
2024-09-16 | A Riemannian Approach to Ground Metric Learning for Optimal Transport | Pratik Jawanpuria et.al. | 2409.10085 | null |
2024-09-13 | Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation | Qingwen Bu et.al. | 2409.09016 | link |
2024-09-13 | A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis | Yohan Poirier-Ginter et.al. | 2409.08947 | null |
2024-09-13 | Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation | Guojun Liang et.al. | 2409.08917 | link |
2024-09-13 | Gaussian is All You Need: A Unified Framework for Solving Inverse Problems via Diffusion Posterior Sampling | Nebiyou Yismaw et.al. | 2409.08906 | null |
2024-09-13 | Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control | Carles Domingo-Enrich et.al. | 2409.08861 | null |
2024-09-13 | InstantDrag: Improving Interactivity in Drag-based Image Editing | Joonghyuk Shin et.al. | 2409.08857 | null |
2024-09-13 | DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s) | Yun Su Jeong et.al. | 2409.08850 | null |
2024-09-13 | Measure-Theoretic Time-Delay Embedding | Jonah Botvinick-Greenhouse et.al. | 2409.08768 | link |
2024-09-13 | DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset | Jiawei Du et.al. | 2409.08731 | link |
2024-09-13 | Asymptotics for Random Quadratic Transportation Costs | Martin Huesmann et.al. | 2409.08612 | null |
2024-09-13 | Finite-time thermodynamic bounds and tradeoff relations for information processing | Takuya Kamijima et.al. | 2409.08606 | null |
2024-09-13 | STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment | Yong Ren et.al. | 2409.08601 | null |
2024-09-13 | LHQ-SVC: Lightweight and High Quality Singing Voice Conversion Modeling | Yubo Huang et.al. | 2409.08583 | null |
2024-09-13 | DiffFAS: Face Anti-Spoofing via Generative Diffusion Models | Xinxu Ge et.al. | 2409.08572 | link |
2024-09-13 | Think Twice Before You Act: Improving Inverse Problem Solving With MCMC | Yaxuan Zhu et.al. | 2409.08551 | null |
2024-09-12 | DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors | Thomas Hanwen Zhu et.al. | 2409.08278 | null |
2024-09-12 | DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer | Runjia Li et.al. | 2409.08271 | null |
2024-09-12 | Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation | Samanta Rodriguez et.al. | 2409.08269 | null |
2024-09-12 | Improving Text-guided Object Inpainting with Semantic Pre-inpainting | Yifu Chen et.al. | 2409.08260 | link |
2024-09-12 | Improving Virtual Try-On with Garment-focused Diffusion Models | Siqi Wan et.al. | 2409.08258 | null |
2024-09-12 | LoRID: Low-Rank Iterative Diffusion for Adversarial Purification | Geigh Zollicoffer et.al. | 2409.08255 | null |
2024-09-12 | Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding | Hongyu Li et.al. | 2409.08251 | null |
2024-09-12 | IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation | Yinwei Wu et.al. | 2409.08240 | null |
2024-09-12 | How can the tragedy of the commons be prevented?: Introducing Linear Quadratic Mixed Mean Field Games | Gokce Dayanikli et.al. | 2409.08235 | null |
2024-09-12 | LT3SD: Latent Trees for 3D Scene Diffusion | Quan Meng et.al. | 2409.08215 | null |
2024-09-12 | VI3DRM:Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis | Hao Chen et.al. | 2409.08207 | null |
2024-09-12 | MagicStyle: Portrait Stylization Based on Reference Image | Zhaoli Deng et.al. | 2409.08156 | null |
2024-09-12 | EZIGen: Enhancing zero-shot subject-driven image generation with precise subject encoding and decoupled guidance | Zicheng Duan et.al. | 2409.08091 | link |
2024-09-12 | Diffusion-Based Image-to-Image Translation by Noise Correction via Prompt Interpolation | Junsung Lee et.al. | 2409.08077 | null |
2024-09-12 | AI-accelerated discovery of high critical temperature superconductors | Xiao-Qi Han et.al. | 2409.08065 | link |
2024-09-11 | DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation | Haibo Yang et.al. | 2409.07454 | null |
2024-09-11 | Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models | Haibo Yang et.al. | 2409.07452 | link |
2024-09-11 | FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process | Yang Luo et.al. | 2409.07451 | null |
2024-09-11 | Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging | Yunzhen Wang et.al. | 2409.07417 | null |
2024-09-11 | Training-Free Guidance for Discrete Diffusion Models for Molecular Generation | Thomas J. Kerby et.al. | 2409.07359 | null |
2024-09-11 | Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching | Eugenio Chisari et.al. | 2409.07343 | null |
2024-09-11 | Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models | Fengzhe Zhang et.al. | 2409.07323 | null |
2024-09-11 | Exploring User-level Gradient Inversion with a Diffusion Prior | Zhuohang Li et.al. | 2409.07291 | null |
2024-09-11 | CCFExp: Facial Image Synthesis with Cycle Cross-Fusion Diffusion Model for Facial Paralysis Individuals | Weixiang Gao et.al. | 2409.07271 | link |
2024-09-11 | Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models | Sanoojan Baliah et.al. | 2409.07269 | link |
2024-09-11 | EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion | Jian Zhang et.al. | 2409.07255 | null |
2024-09-12 | Alignment of Diffusion Models: Fundamentals, Challenges, and Future | Buhua Liu et.al. | 2409.07253 | link |
2024-09-11 | Diff-VPS: Video Polyp Segmentation via a Multi-task Diffusion Network with Adversarial Temporal Reasoning | Yingling Lu et.al. | 2409.07238 | link |
2024-09-11 | Phy124: Fast Physics-Driven 4D Content Generation from a Single Image | Jiajing Lin et.al. | 2409.07179 | null |
2024-09-11 | Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models | Jiahang Cao et.al. | 2409.07163 | null |
2024-09-10 | SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation | Teng Hu et.al. | 2409.06633 | null |
2024-09-10 | One-Shot Imitation under Mismatched Execution | Kushal Kedia et.al. | 2409.06615 | null |
2024-09-10 | Modelling Global Trade with Optimal Transport | Thomas Gaskin et.al. | 2409.06554 | link |
2024-09-10 | Robust financial calibration: a Bayesian approach for neural SDEs | Christa Cuchiero et.al. | 2409.06551 | link |
2024-09-10 | Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models | Xin Jing et.al. | 2409.06451 | null |
2024-09-10 | Robust semi-parametric signal detection in particle physics with classifiers decorrelated via optimal transport | Purvasha Chakravarti et.al. | 2409.06399 | null |
2024-09-10 | Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition | Junzheng Zhang et.al. | 2409.06371 | null |
2024-09-10 | What happens to diffusion model likelihood when your model is conditional? | Mattias Cross et.al. | 2409.06364 | null |
2024-09-10 | DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement | Jia-Wei Liao et.al. | 2409.06355 | null |
2024-09-10 | Geometry of the Space of Partitioned Networks: A Unified Theoretical and Computational Framework | Stephen Y Zhang et.al. | 2409.06302 | link |
2024-09-10 | Multi-Source Music Generation with Latent Diffusion | Zhongweiyang Xu et.al. | 2409.06190 | link |
2024-09-10 | MyGo: Consistent and Controllable Multi-View Driving Video Generation with Camera Control | Yining Yao et.al. | 2409.06189 | null |
2024-09-10 | EDADepth: Enhanced Data Augmentation for Monocular Depth Estimation | Nischal Khanal et.al. | 2409.06183 | link |
2024-09-09 | Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer | Michele Mancusi et.al. | 2409.06096 | null |
2024-09-09 | SVS-GAN: Leveraging GANs for Semantic Video Synthesis | Khaled M. Seyam et.al. | 2409.06074 | null |
2024-09-09 | Enhancing Preference-based Linear Bandits via Human Response Time | Shen Li et.al. | 2409.05798 | null |
2024-09-09 | Vector Quantized Diffusion Model Based Speech Bandwidth Extension | Yuan Fang et.al. | 2409.05784 | null |
2024-09-09 | AS-Speech: Adaptive Style For Speech Synthesis | Zhipeng Li et.al. | 2409.05730 | null |
2024-09-09 | Distributionally Robust Stochastic Data-Driven Predictive Control with Optimized Feedback Gain | Ruiqi Li et.al. | 2409.05727 | null |
2024-09-09 | Quantitative approximation of stochastic kinetic equations: from discrete to continuum | Zimo Hao et.al. | 2409.05706 | null |
2024-09-09 | pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning | Jiahao Lai et.al. | 2409.05701 | null |
2024-09-09 | Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models | Aakash Sen Sharma et.al. | 2409.05668 | null |
2024-09-09 | Forward KL Regularized Preference Optimization for Aligning Diffusion Policies | Zhao Shan et.al. | 2409.05622 | null |
2024-09-09 | CipherDM: Secure Three-Party Inference for Diffusion Model Sampling | Xin Zhao et.al. | 2409.05414 | null |
2024-09-09 | Sequential Posterior Sampling with Diffusion Models | Tristan S. W. Stevens et.al. | 2409.05399 | null |
2024-09-09 | TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors | Yichuan Mo et.al. | 2409.05294 | link |
2024-09-08 | The Stochastic Gause predator-prey model: noise-induced extinctions and invariance | Leon Alexander Valencia et.al. | 2409.05237 | null |
2024-09-08 | Nuclear transparencies with a two step process of the $A(e,e'\pi^+)$ reactions | Tae Keun Choi et.al. | 2409.05129 | null |
2024-09-08 | Diffusion-based Speech Enhancement with Schrödinger Bridge and Symmetric Noise Schedule | Siyi Wang et.al. | 2409.05116 | null |
2024-09-08 | A Survey on Diffusion Models for Recommender Systems | Jianghao Lin et.al. | 2409.05033 | link |
2024-09-06 | VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation | Yecheng Wu et.al. | 2409.04429 | link |
2024-09-06 | Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques | Davide Clode da Silva et.al. | 2409.04424 | null |
2024-09-06 | How Fair is Your Diffusion Recommender Model? | Daniele Malitesta et.al. | 2409.04339 | null |
2024-09-06 | Random effects estimation in a fractional diffusion model based on continuous observations | Nesrine Chebli et.al. | 2409.04331 | null |
2024-09-06 | Probabilistic Representation for Viscosity Solutions to Double-Obstacle Quasi-Variational Inequalities | Magnus Perninge et.al. | 2409.04207 | null |
2024-09-06 | Breaking the Brownian Barrier: Models and Manifestations of Molecular Diffusion in Complex Fluids | Harish Srinivasan et.al. | 2409.04199 | null |
2024-09-06 | GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers | Lorenza Prospero et.al. | 2409.04196 | null |
2024-09-06 | D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection | Kentaro Hirahara et.al. | 2409.04060 | null |
2024-09-06 | A policy iteration algorithm for non-Markovian control problems | Dylan Possamaï et.al. | 2409.04037 | null |
2024-09-06 | One-Shot Diffusion Mimicker for Handwritten Text Generation | Gang Dai et.al. | 2409.04004 | link |
2024-09-06 | DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes | Jianbiao Mei et.al. | 2409.04003 | link |
2024-09-05 | Data-Efficient Generation for Dataset Distillation | Zhe Li et.al. | 2409.03929 | null |
2024-09-05 | Generating High Dimensional User-Specific Wireless Channels using Diffusion Models | Taekyun Lee et.al. | 2409.03924 | null |
2024-09-05 | Neural Entropy | Akhil Premkumar et.al. | 2409.03817 | null |
2024-09-05 | Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding | Yunze Man et.al. | 2409.03757 | link |
2024-09-05 | ArtiFade: Learning to Generate High-quality Subject from Blemished Images | Shuya Yang et.al. | 2409.03745 | null |
2024-09-05 | Quantum optimal transport with convex regularization | Emanuele Caputo et.al. | 2409.03698 | null |
2024-09-05 | RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images | Benzhi Wang et.al. | 2409.03644 | link |
2024-09-05 | DiffEVC: Any-to-Any Emotion Voice Conversion with Expressive Guidance | Hsing-Hang Chou et.al. | 2409.03636 | null |
2024-09-05 | TCDiff: Triple Condition Diffusion Model with 3D Constraints for Stylizing Synthetic Faces | Bernardo Biesseck et.al. | 2409.03600 | link |
2024-09-05 | DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture | Qianlong Xiang et.al. | 2409.03550 | null |
2024-09-05 | On the mean field limit of consensus based methods | Marvin Koß et.al. | 2409.03518 | null |
2024-09-05 | Blended Latent Diffusion under Attention Control for Real-World Video Editing | Deyin Liu et.al. | 2409.03514 | null |
2024-09-05 | Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration | Pei Wang et.al. | 2409.03455 | null |
2024-09-05 | Recursive Quantization for $\mathcal{L}_2$ Stabilization of a Finite Capacity Stochastic Control Loop with Intermittent State Observations | Shrija Karmakar et.al. | 2409.03398 | null |
2024-09-05 | Enhancing User-Centric Privacy Protection: An Interactive Framework through Diffusion Models and Machine Unlearning | Huaxi Huang et.al. | 2409.03326 | null |
2024-09-05 | SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model | Weipeng Tan et.al. | 2409.03270 | null |
2024-09-05 | RoomDiffusion: A Specialized Diffusion Model in the Interior Design Industry | Zhaowei Wang et.al. | 2409.03198 | null |
2024-09-04 | Spatial Diffusion for Cell Layout Generation | Chen Li et.al. | 2409.03106 | link |
2024-09-04 | HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts | Xinyu Liu et.al. | 2409.02919 | link |
2024-09-04 | Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling | Kaiwen Zheng et.al. | 2409.02908 | null |
2024-09-04 | Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models | Zhibin Liu et.al. | 2409.02851 | link |
2024-09-04 | Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model | Tornike Karchkhadze et.al. | 2409.02845 | null |
2024-09-04 | Skip-and-Play: Depth-Driven Pose-Preserved Image Generation for Any Objects | Kyungmin Jo et.al. | 2409.02653 | null |
2024-09-04 | MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos | Junyi Ma et.al. | 2409.02638 | null |
2024-09-04 | Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency | Jianwen Jiang et.al. | 2409.02634 | null |
2024-09-04 | Rate-Adaptive Generative Semantic Communication Using Conditional Diffusion Models | Pujing Yang et.al. | 2409.02597 | null |
2024-09-04 | Solving Video Inverse Problems Using Image Diffusion Models | Taesung Kwon et.al. | 2409.02574 | null |
2024-09-04 | StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models | Wen Li et.al. | 2409.02543 | link |
2024-09-04 | Sample what you cant compress | Vighnesh Birodkar et.al. | 2409.02529 | null |
2024-09-04 | Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal | Jifeng Hu et.al. | 2409.02512 | link |
2024-09-04 | Demographic parity in regression and classification within the unawareness framework | Vincent Divol et.al. | 2409.02471 | null |
2024-09-04 | Training-free Color-Style Disentanglement for Constrained Text-to-Image Synthesis | Aishwarya Agarwal et.al. | 2409.02429 | null |
2024-09-04 | Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering | Peng Wang et.al. | 2409.02426 | link |
2024-08-30 | Subspace Diffusion Posterior Sampling for Travel-Time Tomography | Xiang Cao et.al. | 2408.17333 | null |
2024-08-30 | Likelihood estimation for stochastic differential equations with mixed effects | Fernando Baltazar-Larios et.al. | 2408.17257 | null |
2024-08-30 | The random periodic solutions for McKean-Vlasov stochastic differential equations | Jianhai Bao et.al. | 2408.17242 | null |
2024-08-30 | A methodological framework for Resilience as a Service (RaaS) in multimodal urban transportation networks | Sara Jaber et.al. | 2408.17233 | null |
2024-09-02 | RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance | Avideep Mukherjee et.al. | 2408.17095 | null |
2024-09-02 | Instant Adversarial Purification with Adversarial Consistency Distillation | Chun Tong Lei et.al. | 2408.17064 | null |
2024-08-30 | Text-to-Image Generation Via Energy-Based CLIP | Roy Ganz et.al. | 2408.17046 | null |
2024-08-30 | High-fidelity holographic beam shaping with optimal transport and phase diversity | Hunter Swan et.al. | 2408.17025 | null |
2024-08-30 | Contrastive Learning with Synthetic Positives | Dewen Zeng et.al. | 2408.16965 | link |
2024-09-02 | Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis | Theodoros Kouzelis et.al. | 2408.16845 | null |
2024-08-29 | ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model | Fangfu Liu et.al. | 2408.16767 | null |
2024-09-04 | CSGO: Content-Style Composition in Text-to-Image Generation | Peng Xing et.al. | 2408.16766 | null |
2024-08-29 | DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving | Yongjie Fu et.al. | 2408.16647 | null |
2024-09-02 | RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model | Zhuan Shi et.al. | 2408.16634 | null |
2024-08-29 | A Score-based Generative Solver for PDE-constrained Inverse Problems with Complex Priors | Yankun Hong et.al. | 2408.16626 | null |
Dataset Distillation
Publish Date | Title | Authors | PDF | Code |
---|---|---|---|---|
2024-12-19 | SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection | Ruoyu Xu et.al. | 2412.14571 | null |
2024-12-19 | Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models | Xiao Cui et.al. | 2412.14528 | null |
2024-12-19 | Knowledge Distillation in RNN-Attention Models for Early Prediction of Student Performance | Sukrit Leelaluk et.al. | 2412.14526 | link |
2024-12-18 | A Survey on Inference Optimization Techniques for Mixture of Experts Models | Jiacheng Liu et.al. | 2412.14219 | link |
2024-12-18 | Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective | Zhiyuan Zeng et.al. | 2412.14135 | null |
2024-12-18 | On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process | Gereziher Adhane et.al. | 2412.13943 | null |
2024-12-18 | Learnable Prompting SAM-induced Knowledge Distillation for Semi-supervised Medical Image Segmentation | Kaiwen Huang et.al. | 2412.13742 | null |
2024-12-18 | On the Compression of Language Models for Code: An Empirical Study on CodeBERT | Giordano d’Aloisio et.al. | 2412.13737 | null |
2024-12-18 | Hybrid Data-Free Knowledge Distillation | Jialiang Tang et.al. | 2412.13525 | link |
2024-12-17 | In-Context Learning Distillation for Efficient Few-Shot Fine-Tuning | Yifei Duan et.al. | 2412.13243 | null |
2024-12-17 | Modality-Inconsistent Continual Learning of Multimodal Large Language Models | Weiguo Pian et.al. | 2412.13050 | null |
2024-12-17 | Efficient Speech Command Recognition Leveraging Spiking Neural Network and Curriculum Learning-based Knowledge Distillation | Jiaqi Wang et.al. | 2412.12858 | null |
2024-12-17 | PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts | Kun Guo et.al. | 2412.12460 | null |
2024-12-16 | Neural Collapse Inspired Knowledge Distillation | Shuoxi Zhang et.al. | 2412.11788 | null |
2024-12-16 | Relation-Guided Adversarial Learning for Data-free Knowledge Transfer | Yingping Liang et.al. | 2412.11380 | null |
2024-12-16 | BiM-VFI: directional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions | Wonyong Seo et.al. | 2412.11365 | null |
2024-12-15 | Wearable Accelerometer Foundation Models for Health via Knowledge Distillation | Salar Abbaspourazad et.al. | 2412.11276 | null |
2024-12-15 | ProFe: Communication-Efficient Decentralized Federated Learning via Distillation and Prototypes | Pedro Miguel Sánchez Sánchez et.al. | 2412.11207 | null |
2024-12-15 | Leveraging Large Language Models for Active Merchant Non-player Characters | Byungjun Kim et.al. | 2412.11189 | null |
2024-12-15 | Knowledge Migration Framework for Smart Contract Vulnerability Detection | Luqi Wang et.al. | 2412.11175 | null |
2024-12-15 | Redefining Normal: A Novel Object-Level Approach for Multi-Object Novelty Detection | Mohammadreza Salehi et.al. | 2412.11148 | link |
2024-12-17 | On Distilling the Displacement Knowledge for Few-Shot Class-Incremental Learning | Pengfei Fang et.al. | 2412.11017 | null |
2024-12-13 | Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization | Xinhao Zhong et.al. | 2412.09959 | null |
2024-12-13 | Going Beyond Feature Similarity: Effective Dataset distillation based on Class-aware Conditional Mutual Information | Xinhao Zhong et.al. | 2412.09945 | null |
2024-12-13 | Can Students Beyond The Teacher? Distilling Knowledge from Teacher’s Bias | Jianhua Zhang et.al. | 2412.09874 | null |
2024-12-13 | ScaleOT: Privacy-utility-scalable Offsite-tuning with Dynamic LayerReplace and Selective Rank Compression | Kai Yao et.al. | 2412.09812 | null |
2024-12-13 | LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering | Patrick Sutanto et.al. | 2412.09807 | null |
2024-12-12 | SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training | Dongting Hu et.al. | 2412.09619 | null |
2024-12-12 | A Theoretical Analysis of Soft-Label vs Hard-Label Training in Neural Networks | Saptarshi Mandal et.al. | 2412.09579 | null |
2024-12-12 | All You Need in Knowledge Distillation Is a Tailored Coordinate System | Junjie Zhou et.al. | 2412.09388 | null |
2024-12-12 | Optimising TinyML with Quantization and Distillation of Transformer and Mamba Models for Indoor Localisation on Edge Devices | Thanaphon Suwannaphong et.al. | 2412.09289 | null |
2024-12-12 | DASK: Distribution Rehearsing via Adaptive Style Kernel Learning for Exemplar-Free Lifelong Person Re-Identification | Kunlun Xu et.al. | 2412.09224 | null |
2024-12-12 | Multimodal Industrial Anomaly Detection by Crossmodal Reverse Distillation | Xinyue Liu et.al. | 2412.08949 | link |
2024-12-12 | Dynamic Contrastive Knowledge Distillation for Efficient Image Restoration | Yunshuai Zhou et.al. | 2412.08939 | null |
2024-12-11 | Efficient Gravitational Wave Parameter Estimation via Knowledge Distillation: A ResNet1D-IAF Approach | Xihua Zhu et.al. | 2412.08672 | null |
2024-12-11 | Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation | Jiaming Lv et.al. | 2412.08139 | null |
2024-12-11 | DAKD: Data Augmentation and Knowledge Distillation using Diffusion Models for SAR Oil Spill Segmentation | Jaeho Moon et.al. | 2412.08116 | null |
2024-12-10 | Unlocking the Potential of Reverse Distillation for Anomaly Detection | Xinyue Liu et.al. | 2412.07579 | link |
2024-12-10 | TT-MPD: Test Time Model Pruning and Distillation | Haihang Wu et.al. | 2412.07114 | null |
2024-12-09 | FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering | Amirhossein Abaskohi et.al. | 2412.07030 | link |
2024-12-09 | U-Know-DiffPAN: An Uncertainty-aware Knowledge Distillation Diffusion Framework with Details Enhancement for PAN-Sharpening | Sungpyo Kim et.al. | 2412.06243 | null |
2024-12-08 | Enhancing Content Representation for AR Image Quality Assessment Using Knowledge Distillation | Aymen Sekhri et.al. | 2412.06003 | null |
2024-12-07 | Neighborhood Commonality-aware Evolution Network for Continuous Generalized Category Discovery | Ye Wang et.al. | 2412.05573 | null |
2024-12-06 | BEExformer: A Fast Inferencing Transformer Architecture via Binarization with Multiple Early Exits | Wazib Ansar et.al. | 2412.05225 | null |
2024-12-06 | One-shot Federated Learning via Synthetic Distiller-Distillate Communication | Junyuan Zhang et.al. | 2412.05186 | link |
2024-12-06 | CCS: Continuous Learning for Customized Incremental Wireless Sensing Services | Qunhang Fu et.al. | 2412.04821 | null |
2024-12-06 | Decomposed Distribution Matching in Dataset Condensation | Sahar Rahimi Malakshan et.al. | 2412.04748 | link |
2024-12-05 | Diffusion-Augmented Coreset Expansion for Scalable Dataset Distillation | Ali Abbasi et.al. | 2412.04668 | null |
2024-12-05 | FedDW: Distilling Weights through Consistency Optimization in Heterogeneous Federated Learning | Jiayu Liu et.al. | 2412.04521 | link |
2024-12-05 | Expanding Deep Learning-based Sensing Systems with Multi-Source Knowledge Transfer | Gaole Dai et.al. | 2412.04060 | null |
2024-12-07 | Enhancing CLIP Conceptual Embedding through Knowledge Distillation | Kuei-Chun Kao et.al. | 2412.03513 | null |
2024-12-04 | Distillation of Diffusion Features for Semantic Correspondence | Frank Fundel et.al. | 2412.03512 | null |
2024-12-02 | Mutli-View 3D Reconstruction using Knowledge Distillation | Aditya Dutt et.al. | 2412.02039 | link |
2024-12-02 | Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model | Qianhan Feng et.al. | 2412.01282 | link |
2024-12-01 | QABISAR: Query-Article Bipartite Interactions for Statutory Article Retrieval | T. Y. S. S. Santosh et.al. | 2412.00934 | null |
2024-12-01 | Local vs. Global: Local Land-Use and Land-Cover Models Deliver Higher Quality Maps | Girmaw Abebe Tadesse et.al. | 2412.00777 | null |
2024-11-30 | Continuous Concepts Removal in Text-to-image Diffusion Models | Tingxu Han et.al. | 2412.00580 | null |
2024-11-30 | Toward Fair Graph Neural Networks Via Dual-Teacher Knowledge Distillation | Chengyu Li et.al. | 2412.00382 | null |
2024-11-28 | PP-SSL : Priority-Perception Self-Supervised Learning for Fine-Grained Recognition | ShuaiHeng Li et.al. | 2412.00134 | null |
2024-11-28 | Video Set Distillation: Information Diversification and Temporal Densification | Yinjie Zhao et.al. | 2412.00111 | null |
2024-11-29 | DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation | Zhiqiang Shen et.al. | 2411.19946 | link |
2024-11-29 | Reverse Thinking Makes LLMs Stronger Reasoners | Justin Chih-Yao Chen et.al. | 2411.19865 | null |
2024-11-29 | FairDD: Fair Dataset Distillation via Synchronized Matching | Qihang Zhou et.al. | 2411.19623 | null |
2024-11-28 | Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG | Xinxu Wei et.al. | 2411.19230 | null |
2024-12-03 | Puzzle: Distillation-Based NAS for Inference-Optimized LLMs | Akhiad Bercovich et.al. | 2411.19146 | null |
2024-11-28 | Headache to Overstock? Promoting Long-tail Items through Debiased Product Bundling | Shuo Xu et.al. | 2411.19107 | null |
2024-11-28 | Zero-shot Slot Filling in the Age of LLMs for Dialogue Systems | Mansi Rana et.al. | 2411.18980 | null |
2024-11-27 | Active Data Curation Effectively Distills Large-Scale Multimodal Models | Vishaal Udandarao et.al. | 2411.18674 | null |
2024-11-27 | Vision Mamba Distillation for Low-resolution Fine-grained Image Classification | Yao Chen et.al. | 2411.17980 | link |
2024-11-27 | Improved implicit diffusion model with knowledge distillation to estimate the spatial distribution density of carbon stock in remote sensing imagery | Zhenyu Yu et.al. | 2411.17973 | null |
2024-11-26 | Large-Scale Data-Free Knowledge Distillation for ImageNet via Multi-Resolution Data Generation | Minh-Tuan Tran et.al. | 2411.17046 | null |
2024-11-26 | Words Matter: Leveraging Individual Text Embeddings for Code Generation in CLIP Test-Time Adaptation | Shambhavi Mishra et.al. | 2411.17002 | link |
2024-11-25 | Dynamic Self-Distillation via Previous Mini-batches for Fine-tuning Small Language Models | Yao Fu et.al. | 2411.16991 | null |
2024-11-25 | Leveraging Foundation Models To learn the shape of semi-fluid deformable objects | Omar El Assal et.al. | 2411.16802 | null |
2024-11-25 | O1 Replication Journey – Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? | Zhen Huang et.al. | 2411.16489 | link |
2024-11-25 | When Babies Teach Babies: Can student knowledge sharing outperform Teacher-Guided Distillation on small datasets? | Srikrishna Iyer et.al. | 2411.16487 | link |
2024-11-25 | Learn from Foundation Model: Fruit Detection Model without Manual Annotation | Yanan Wang et.al. | 2411.16196 | link |
2024-11-25 | Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics | Tian Bowen et.al. | 2411.16139 | null |
2024-11-25 | Ensemble Learning via Knowledge Transfer for CTR Prediction | Honghao Li et.al. | 2411.16122 | link |
2024-11-24 | Data Lineage Inference: Uncovering Privacy Vulnerabilities of Dataset Pruning | Qi Li et.al. | 2411.15796 | null |
2024-11-23 | Botfip-LLM: An Enhanced Multimodal Scientific Computing Framework Leveraging Knowledge Distillation from Large Language Models | Tianhao Chen et.al. | 2411.15525 | null |
2024-11-23 | Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance | Jiayi Chen et.al. | 2411.15438 | link |
2024-11-23 | Partial Knowledge Distillation for Alleviating the Inherent Inter-Class Discrepancy in Federated Learning | Xiaoyu Gan et.al. | 2411.15403 | null |
2024-11-22 | BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques | Muhammad Rafsan Kabir et.al. | 2411.15270 | null |
2024-11-22 | RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency | Wentao Huang et.al. | 2411.15076 | null |
2024-11-22 | Adaptive Group Robust Ensemble Knowledge Distillation | Patrik Kenfack et.al. | 2411.14984 | null |
2024-11-25 | Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation | Aniket Bhattacharyya et.al. | 2411.14957 | null |
2024-11-22 | Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers | Hongbo Liu et.al. | 2411.14789 | null |
2024-11-22 | Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation | Xunyu Zhu et.al. | 2411.14698 | null |
2024-11-21 | Teaching MLPs to Master Heterogeneous Graph-Structured Knowledge for Efficient and Accurate Inference | Yunhui Liu et.al. | 2411.14035 | link |
2024-11-21 | CLFace: A Scalable and Resource-Efficient Continual Learning Framework for Lifelong Face Recognition | Md Mahedi Hasan et.al. | 2411.13886 | null |
2024-11-20 | RTSR: A Real-Time Super-Resolution Model for AV1 Compressed Content | Yuxuan Jiang et.al. | 2411.13362 | null |
2024-11-20 | Explainable LLM-driven Multi-dimensional Distillation for E-Commerce Relevance Learning | Gang Zhao et.al. | 2411.13045 | null |
2024-11-19 | Reward Modeling with Ordinal Feedback: Wisdom of the Crowd | Shang Liu et.al. | 2411.12843 | null |
2024-11-19 | Data-to-Model Distillation: Data-Efficient Learning Framework | Ahmad Sajedi et.al. | 2411.12841 | link |
2024-11-19 | What Makes a Good Dataset for Knowledge Distillation? | Logan Frank et.al. | 2411.12817 | null |
2024-11-19 | KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder | Maheswar Bora et.al. | 2411.12270 | null |
2024-11-19 | Just KIDDIN: Knowledge Infusion and Distillation for Detection of INdecent Memes | Rahul Garg et.al. | 2411.12174 | null |
2024-11-18 | Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning | Brian B. Moser et.al. | 2411.12115 | link |
2024-11-18 | Dataset Distillers Are Good Label Denoisers In the Wild | Lechao Cheng et.al. | 2411.11924 | link |
2024-11-18 | Federated Incremental Named Entity Recognition | Duzhen Zhang et.al. | 2411.11623 | null |
2024-11-18 | Color-Oriented Redundancy Reduction in Dataset Distillation | Bowen Yuan et.al. | 2411.11329 | link |
2024-11-17 | Map-Free Trajectory Prediction with Map Distillation and Hierarchical Encoding | Xiaodong Liu et.al. | 2411.10961 | null |
2024-11-16 | Hybrid Attention Model Using Feature Decomposition and Knowledge Distillation for Glucose Forecasting | Ebrahim Farahmand et.al. | 2411.10703 | null |
2024-11-16 | Multi-perspective Contrastive Logit Distillation | Qi Wang et.al. | 2411.10693 | null |
2024-11-16 | Exploring Feature-based Knowledge Distillation For Recommender System: A Frequency Perspective | Zhangchi Zhu et.al. | 2411.10676 | null |
2024-11-15 | Evidential Federated Learning for Skin Lesion Image Classification | Rutger Hendrix et.al. | 2411.10071 | null |
2024-11-14 | VPBSD: Vessel-Pattern-Based Semi-Supervised Distillation for Efficient 3D Microscopic Cerebrovascular Segmentation | Xi Lin et.al. | 2411.09567 | null |
2024-11-14 | BEARD: Benchmarking the Adversarial Robustness for Dataset Distillation | Zheng Zhou et.al. | 2411.09265 | link |
2024-11-14 | Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching | Yuran Wang et.al. | 2411.09151 | null |
2024-11-14 | Toward Democratized Generative AI in Next-Generation Mobile Edge Networks | Ruichen Zhang et.al. | 2411.09148 | null |
2024-11-14 | SCAN: Bootstrapping Contrastive Pre-training for Data Efficiency | Yangyang Guo et.al. | 2411.09126 | link |
2024-11-13 | Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head | Penghui Yang et.al. | 2411.08937 | null |
2024-11-13 | UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation | Chengyuan Zhang et.al. | 2411.08569 | null |
2024-11-13 | Federated Graph Learning with Graphless Clients | Xingbo Fu et.al. | 2411.08374 | null |
2024-11-12 | Joint Diffusion models in Continual Learning | Paweł Skierś et.al. | 2411.08224 | null |
2024-11-12 | Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data | Juanhui Li et.al. | 2411.08028 | null |
2024-11-13 | Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models | Youan Cong et.al. | 2411.07820 | null |
2024-11-12 | Robust Offline Reinforcement Learning for Non-Markovian Decision Processes | Ruiquan Huang et.al. | 2411.07514 | null |
2024-11-13 | Feature Interaction Fusion Self-Distillation Network For CTR Prediction | Lei Sang et.al. | 2411.07508 | null |
2024-11-12 | Quantifying Knowledge Distillation Using Partial Information Decomposition | Pasan Dissanayake et.al. | 2411.07483 | null |
2024-11-08 | Multi-Document Financial Question Answering using LLMs | Shalin Shah et.al. | 2411.07264 | null |
2024-11-11 | SAMPart3D: Segment Any Part in 3D Objects | Yunhan Yang et.al. | 2411.07184 | link |
2024-11-11 | LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models | Runming Yang et.al. | 2411.06839 | null |
2024-11-11 | ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | Jiawei Fan et.al. | 2411.06786 | link |
2024-11-11 | An Efficient Memory Module for Graph Few-Shot Class-Incremental Learning | Dong Li et.al. | 2411.06659 | link |
2024-11-10 | CULL-MT: Compression Using Language and Layer pruning for Machine Translation | Pedram Rostami et.al. | 2411.06506 | null |
2024-11-10 | Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation | Yu-Liang Zhan et.al. | 2411.06448 | link |
2024-11-09 | Dynamic Textual Prompt For Rehearsal-free Lifelong Person Re-identification | Hongyu Chen et.al. | 2411.06023 | null |
2024-11-09 | Multi-hop RIS-aided Learning Model Sharing for Urban Air Mobility | Kai Xiong et.al. | 2411.06015 | null |
2024-11-08 | Mitigating Hallucination with ZeroG: An Advanced Knowledge Management Engine | Anantha Sharma et.al. | 2411.05936 | null |
2024-11-08 | Asterisk*: Keep it Simple | Andrew Semenov et.al. | 2411.05691 | null |
2024-11-08 | Knowledge Distillation Neural Network for Predicting Car-following Behaviour of Human-driven and Autonomous Vehicles | Ayobami Adewale et.al. | 2411.05618 | null |
2024-11-08 | Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | Nan Song et.al. | 2411.05544 | null |
2024-11-07 | Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale | Flavio Di Palo et.al. | 2411.05045 | null |
2024-11-07 | Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers | Zhichao Geng et.al. | 2411.04403 | null |
2024-11-07 | GazeGen: Gaze-Driven User Interaction for Visual Content Generation | He-Yen Hsieh et.al. | 2411.04335 | null |
2024-11-06 | Towards Personalized Federated Learning via Comprehensive Knowledge Distillation | Pengju Wang et.al. | 2411.03569 | null |
2024-11-05 | Transformer-Based Fault-Tolerant Control for Fixed-Wing UAVs Using Knowledge Distillation and In-Context Adaptation | Francisco Giral et.al. | 2411.02975 | null |
2024-11-05 | Centerness-based Instance-aware Knowledge Distillation with Task-wise Mutual Lifting for Object Detection on Drone Imagery | Bowei Du et.al. | 2411.02861 | null |
2024-11-05 | Brewing Vodka: Distilling Pure Knowledge for Lightweight Threat Detection in Audit Logs | Weiheng Wu et.al. | 2411.02775 | null |
2024-11-05 | Multimodal Commonsense Knowledge Distillation for Visual Question Answering | Shuo Yang et.al. | 2411.02722 | null |
2024-11-04 | Training on the Test Model: Contamination in Ranking Distillation | Vishakha Suresh Kalal et.al. | 2411.02284 | link |
2024-11-03 | Decoupling Dark Knowledge via Block-wise Logit Distillation for Feature-level Alignment | Chengting Yu et.al. | 2411.01547 | null |
2024-11-01 | On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance | Jaskirat Singh et.al. | 2411.00907 | null |
2024-10-30 | The Graph’s Apprentice: Teaching an LLM Low Level Knowledge for Circuit Quality Estimation | Reza Moravej et.al. | 2411.00843 | null |
2024-10-29 | Unsupervised Training of a Dynamic Context-Aware Deep Denoising Framework for Low-Dose Fluoroscopic Imaging | Sun-Young Jeon et.al. | 2411.00830 | link |
2024-11-01 | Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation | Bohan Lyu et.al. | 2411.00412 | null |
2024-11-01 | Towards Building Secure UAV Navigation with FHE-aware Knowledge Distillation | Arjun Ramesh Kaushik et.al. | 2411.00403 | null |
2024-10-31 | Semantic Knowledge Distillation for Onboard Satellite Earth Observation Image Classification | Thanh-Dung Le et.al. | 2411.00209 | link |
2024-10-30 | Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation | Ahmed Akib Jawad Karim et.al. | 2411.00052 | null |
2024-10-30 | IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking | Run Luo et.al. | 2410.23907 | null |
2024-10-28 | Unveiling Context-Aware Criteria in Self-Assessing LLMs | Taneesh Gupta et.al. | 2410.21545 | null |
2024-10-28 | Knowledge Distillation for Real-Time Classification of Early Media in Voice Communications | Kemal Altwlkany et.al. | 2410.21478 | null |
2024-10-28 | Less is More: Efficient Time Series Dataset Condensation via Two-fold Modal Matching – Extended Version | Hao Miao et.al. | 2410.20905 | null |
2024-10-28 | Deep Learning for Medical Text Processing: BERT Model Fine-Tuning and Comparative Study | Jiacheng Hu et.al. | 2410.20792 | null |
2024-10-28 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Rambod Azimi et.al. | 2410.20777 | link |
2024-10-28 | Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive Pruning | Bing Han et.al. | 2410.20775 | null |
2024-10-28 | Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA | Sangmin Bae et.al. | 2410.20672 | null |
2024-10-28 | FLiP: Privacy-Preserving Federated Learning based on the Principle of Least Privilege | ShiMao Xu et.al. | 2410.19548 | null |
2024-10-25 | SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models | Jahyun Koo et.al. | 2410.19503 | null |
2024-10-24 | AlignCap: Aligning Speech Emotion Captioning to Human Preferences | Ziqi Liang et.al. | 2410.19134 | null |
2024-10-24 | High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws | M. Emrullah Ildiz et.al. | 2410.18837 | null |
2024-10-24 | Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data | Anup Shirgaonkar et.al. | 2410.18588 | null |
2024-10-24 | SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning | Shivam Adarsh et.al. | 2410.18574 | link |
2024-10-23 | ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams | Srija Anand et.al. | 2410.17901 | null |
2024-10-23 | Towards Active Participant-Centric Vertical Federated Learning: Some Representations May Be All You Need | Jon Irureta et.al. | 2410.17648 | null |
2024-10-23 | Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation | Muquan Li et.al. | 2410.17606 | link |
2024-10-23 | Physics-driven AI for Channel Estimation in Cellular Network | Xiaoqian Qi et.al. | 2410.17525 | null |
2024-10-22 | MiniPLM: Knowledge Distillation for Pre-Training Language Models | Yuxian Gu et.al. | 2410.17215 | link |
2024-10-22 | Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios | Kai Wang et.al. | 2410.17193 | link |
2024-10-22 | CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare | Nicholas I-Hsien Kuo et.al. | 2410.16872 | null |
2024-10-22 | AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Yongjian Wu et.al. | 2410.16820 | link |
2024-10-22 | SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation | Jing-Jing Li et.al. | 2410.16665 | null |
Synthetic Data Generation
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-12-19 | OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization | Jiacheng Zhang et.al. | 2412.15159 | null |
2024-12-19 | Language Models as Continuous Self-Evolving Data Engineers | Peidong Wang et.al. | 2412.15151 | null |
2024-12-19 | Assessing treatment effects in observational data with missing confounders: A comparative study of practical doubly-robust and traditional missing data methods | Brian D. Williamson et.al. | 2412.15012 | null |
2024-12-19 | DS$^2$-ABSA: Dual-Stream Data Synthesis with Label Refinement for Few-Shot Aspect-Based Sentiment Analysis | Hongling Xu et.al. | 2412.14849 | link |
2024-12-19 | ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis | Zeao Tu et.al. | 2412.14809 | link |
2024-12-19 | ALKAFI-LLAMA3: Fine-Tuning LLMs for Precise Legal Understanding in Palestine | Rabee Qasem et.al. | 2412.14771 | null |
2024-12-19 | How to Synthesize Text Data without Model Collapse? | Xuekai Zhu et.al. | 2412.14689 | null |
2024-12-19 | Bel Esprit: Multi-Agent Framework for Building AI Model Pipelines | Yunsu Kim et.al. | 2412.14684 | null |
2024-12-19 | Drive-1-to-3: Enriching Diffusion Priors for Novel View Synthesis of Real Vehicles | Chuang Lin et.al. | 2412.14494 | null |
2024-12-19 | MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval | Junjie Zhou et.al. | 2412.14475 | null |
2024-12-18 | GREGoR: Accelerating Genomics for Rare Diseases | Moez Dawood et.al. | 2412.14338 | null |
2024-12-18 | MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data | Hanwen Jiang et.al. | 2412.14166 | null |
2024-12-18 | Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective | Zhiyuan Zeng et.al. | 2412.14135 | null |
2024-12-18 | Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation | Haotong Lin et.al. | 2412.14015 | null |
2024-12-18 | Domain-adaptative Continual Learning for Low-resource Tasks: Evaluation on Nepali | Sharad Duwal et.al. | 2412.13860 | null |
2024-12-18 | RadField3D: A Data Generator and Data Format for Deep Learning in Radiation-Protection Dosimetry for Medical Applications | Felix Lehner et.al. | 2412.13852 | link |
2024-12-18 | Object Style Diffusion for Generalized Object Detection in Urban Scene | Hao Li et.al. | 2412.13815 | null |
2024-12-18 | Text2Relight: Creative Portrait Relighting with Text Guidance | Junuk Cha et.al. | 2412.13734 | null |
2024-12-18 | NPC: Neural Predictive Control for Fuel-Efficient Autonomous Trucks | Jiaping Ren et.al. | 2412.13618 | null |
2024-12-18 | Single-cell spatial (scs) omics: Recent developments in data analysis | José Camacho et.al. | 2412.13591 | null |
2024-12-18 | Hybrid Data-Free Knowledge Distillation | Jialiang Tang et.al. | 2412.13525 | link |
2024-12-18 | Learning Causal Transition Matrix for Instance-dependent Label Noise | Jiahui Li et.al. | 2412.13516 | null |
2024-12-18 | AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark | Jianlyu Chen et.al. | 2412.13102 | link |
2024-12-17 | Are Data Experts Buying into Differentially Private Synthetic Data? Gathering Community Perspectives | Lucas Rosenblatt et.al. | 2412.13030 | null |
2024-12-17 | OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain | Shuting Wang et.al. | 2412.13018 | link |
2024-12-17 | Synthetic Data Generation for Anomaly Detection on Table Grapes | Ionut Marian Motoi et.al. | 2412.12949 | null |
2024-12-17 | SynthCypher: A Fully Synthetic Data Generation Framework for Text-to-Cypher Querying in Knowledge Graphs | Aman Tiwari et.al. | 2412.12612 | null |
2024-12-17 | Libri2Vox Dataset: Target Speaker Extraction with Diverse Speaker Conditions and Synthetic Data | Yun Liu et.al. | 2412.12512 | null |
2024-12-17 | Persona-SQ: A Personalized Suggested Question Generation Framework For Real-world Documents | Zihao Lin et.al. | 2412.12445 | null |
2024-12-17 | On the Number of Vertices in a Hyperplane Section of a Polytope | Jesús A. De Loera et.al. | 2412.12419 | null |
2024-12-16 | LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts | Zhuhao Wang et.al. | 2412.12001 | link |
2024-12-16 | Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data | Onur Tasar et.al. | 2412.11972 | null |
2024-12-16 | Scalable Data Transmission Framework for Earth Observation Satellites with Channel Adaptation | Van-Phuc Bui et.al. | 2412.11857 | null |
2024-12-16 | Beyond Dataset Creation: Critical View of Annotation Variation and Bias Probing of a Dataset for Online Radical Content Detection | Arij Riabi et.al. | 2412.11745 | null |
2024-12-18 | Conditional Diffusion Models Based Conditional Independence Testing | Yanfeng Yang et.al. | 2412.11744 | link |
2024-12-16 | Generalized Bayesian deep reinforcement learning | Shreya Sinha Roy et.al. | 2412.11743 | null |
2024-12-16 | PSGraph: Differentially Private Streaming Graph Synthesis by Considering Temporal Dynamics | Quan Yuan et.al. | 2412.11369 | null |
2024-12-17 | Learning Set Functions with Implicit Differentiation | Gözde Özcan et.al. | 2412.11239 | null |
2024-12-15 | Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal | Yuhao Wang et.al. | 2412.11196 | null |
2024-12-15 | OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation | Bohan Li et.al. | 2412.11183 | null |
2024-12-15 | AD-LLM: Benchmarking Large Language Models for Anomaly Detection | Tiankai Yang et.al. | 2412.11142 | link |
2024-12-15 | Empowering LLMs to Understand and Generate Complex Vector Graphics | Ximing Xing et.al. | 2412.11102 | null |
2024-12-15 | Understanding and Mitigating Memorization in Diffusion Models for Tabular Data | Zhengyu Fang et.al. | 2412.11044 | null |
2024-12-13 | Differentially Private Multi-Sampling from Distributions | Albert Cheu et.al. | 2412.10512 | null |
2024-12-13 | Uncertainties in Signal Recovery from Heterogeneous and Convoluted Time Series with Principal Component Analysis | Mariia Legenkaia et.al. | 2412.10175 | null |
2024-12-13 | Research Integrity and GenAI: A Systematic Analysis of Ethical Challenges Across Research Phases | Sonja Bjelobaba et.al. | 2412.10134 | null |
2024-12-13 | AMUSE: Adaptive Model Updating using a Simulated Environment | Louis Chislett et.al. | 2412.10119 | null |
2024-12-13 | Quaffure: Real-Time Quasi-Static Neural Hair Simulation | Tuur Stuyck et.al. | 2412.10061 | null |
2024-12-13 | Are you doing better than random guessing? A call for using negative controls when evaluating causal discovery algorithms | Anne Helby Petersen et.al. | 2412.10039 | null |
2024-12-13 | Latent feedback control of distributed systems in multiple scenarios through deep learning-based reduced order models | Matteo Tomasetto et.al. | 2412.09942 | null |
2024-12-13 | Financial Sentiment Analysis: Leveraging Actual and Synthetic Data for Supervised Fine-tuning | Abraham Atsiwo et.al. | 2412.09859 | link |
2024-12-13 | Leveraging Programmatically Generated Synthetic Data for Differentially Private Diffusion Training | Yujin Choi et.al. | 2412.09842 | null |
2024-12-13 | LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering | Patrick Sutanto et.al. | 2412.09807 | null |
2024-12-12 | Private Synthetic Data Generation in Small Memory | Rayne Holland et.al. | 2412.09756 | null |
2024-12-12 | Should We Learn Contact-Rich Manipulation Policies from Sampling-Based Planners? | Huaijiang Zhu et.al. | 2412.09743 | null |
2024-12-12 | AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials | Yiheng Xu et.al. | 2412.09605 | null |
2024-12-12 | A Plug-and-Play Algorithm for 3D Video Super-Resolution of Single-Photon LiDAR data | Alice Ruget et.al. | 2412.09427 | null |
2024-12-12 | MaskTerial: A Foundation Model for Automated 2D Material Flake Detection | Jan-Lucas Uslu et.al. | 2412.09333 | null |
2024-12-13 | First Train to Generate, then Generate to Train: UnitedSynT5 for Few-Shot NLI | Sourav Banerjee et.al. | 2412.09263 | null |
2024-12-12 | VLMs meet UDA: Boosting Transferability of Open Vocabulary Segmentation with Unsupervised Domain Adaptation | Roberto Alcover-Couso et.al. | 2412.09240 | null |
2024-12-12 | eCARLA-scenes: A synthetically generated dataset for event-based optical flow prediction | Jad Mansour et.al. | 2412.09209 | link |
2024-12-12 | Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method | Xinshuai Song et.al. | 2412.09082 | null |
2024-12-12 | Phi-4 Technical Report | Marah Abdin et.al. | 2412.08905 | null |
2024-12-12 | A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions | Jiankang Wang et.al. | 2412.08864 | null |
2024-12-12 | Exploring Large Language Models on Cross-Cultural Values in Connection with Training Methodology | Minsang Kim et.al. | 2412.08846 | null |
2024-12-11 | Efficient Dynamic Attributed Graph Generation | Fan Li et.al. | 2412.08810 | null |
2024-12-11 | Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions | Jiarui Zhang et.al. | 2412.08737 | null |
2024-12-11 | Coherent3D: Coherent 3D Portrait Video Reconstruction via Triplane Fusion | Shengze Wang et.al. | 2412.08684 | null |
2024-12-11 | A 1% accurate method to include baryonic effects in galaxy-galaxy lensing models | Matteo Zennaro et.al. | 2412.08623 | null |
2024-12-11 | Can We Generate Visual Programs Without Prompting LLMs? | Michal Shlapentokh-Rothman et.al. | 2412.08564 | null |
2024-12-11 | Federated Learning for Traffic Flow Prediction with Synthetic Data Augmentation | Fermin Orozco et.al. | 2412.08460 | null |
2024-12-11 | Generate Any Scene: Evaluating and Improving Text-to-Vision Generation with Scene Graph Programming | Ziqi Gao et.al. | 2412.08221 | null |
2024-12-11 | Analyzing and Improving Model Collapse in Rectified Flow Models | Huminhao Zhu et.al. | 2412.08175 | null |
2024-12-11 | DiffRaman: A Conditional Latent Denoising Diffusion Probabilistic Model for Bacterial Raman Spectroscopy Identification Under Limited Data Conditions | Haiming Yao et.al. | 2412.08131 | null |
2024-12-11 | Progressive Multi-granular Alignments for Grounded Reasoning in Large Vision-Language Models | Quang-Hung Le et.al. | 2412.08125 | null |
2024-12-11 | Generative Zoo | Tomasz Niewiadomski et.al. | 2412.08101 | null |
2024-12-11 | THUD++: Large-Scale Dynamic Indoor Scene Dataset and Benchmark for Mobile Robots | Zeshun Li et.al. | 2412.08096 | null |
2024-12-11 | DialogAgent: An Auto-engagement Agent for Code Question Answering Data Production | Xiaoyun Liang et.al. | 2412.08069 | null |
2024-12-10 | Mitigating exponential concentration in covariant quantum kernels for subspace and real-world data | Gabriele Agliardi et.al. | 2412.07915 | null |
2024-12-10 | Spectral Differential Network Analysis for High-Dimensional Time Series | Michael Hellstern et.al. | 2412.07905 | null |
2024-12-10 | GASP: Gaussian Avatars with Synthetic Priors | Jack Saunders et.al. | 2412.07739 | null |
2024-12-10 | Granite Guardian | Inkit Padhi et.al. | 2412.07724 | link |
2024-12-10 | SimVS: Simulating World Inconsistencies for Robust View Synthesis | Alex Trevithick et.al. | 2412.07696 | null |
2024-12-10 | Bayesian Data Augmentation and Training for Perception DNN in Autonomous Aerial Vehicles | Ashik E Rasul et.al. | 2412.07655 | link |
2024-12-10 | SurvBETA: Ensemble-Based Survival Models Using Beran Estimators and Several Attention Mechanisms | Lev V. Utkin et.al. | 2412.07638 | link |
2024-12-10 | Causal World Representation in the GPT Model | Raanan Y. Rohekar et.al. | 2412.07446 | null |
2024-12-10 | AppGen: Mobility-aware App Usage Behavior Generation for Mobile Users | Zihan Huang et.al. | 2412.07267 | null |
2024-12-10 | Epidemiological Model Calibration via Graybox Bayesian Optimization | Puhua Niu et.al. | 2412.07193 | null |
2024-12-11 | Rate-In: Information-Driven Adaptive Dropout Rates for Improved Inference-Time Uncertainty Estimation | Tal Zeevi et.al. | 2412.07169 | link |
2024-12-10 | Enhancing radioisotope identification in gamma spectra with transfer learning | Peter Lalor et.al. | 2412.07069 | null |
2024-12-09 | Data Augmentation with Variational Autoencoder for Imbalanced Dataset | Samuel Stocksieker et.al. | 2412.07039 | link |
2024-12-09 | FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering | Amirhossein Abaskohi et.al. | 2412.07030 | link |
2024-12-09 | ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models | Jieyu Zhang et.al. | 2412.07012 | link |
2024-12-09 | JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM | Takuro Fujii et.al. | 2412.06738 | link |
2024-12-11 | Numerical Estimation of Spatial Distributions under Differential Privacy | Leilei Du et.al. | 2412.06541 | null |
2024-12-09 | Improving text-conditioned latent diffusion for cancer pathology | Aakash Madhav Rao et.al. | 2412.06487 | link |
2024-12-09 | World-Consistent Data Generation for Vision-and-Language Navigation | Yu Zhong et.al. | 2412.06413 | null |
2024-12-09 | Exploring the Impact of Synthetic Data on Human Gesture Recognition Tasks Using GANs | George Kontogiannis et.al. | 2412.06389 | null |
2024-12-09 | Rendering-Refined Stable Diffusion for Privacy Compliant Synthetic Data | Kartik Patwari et.al. | 2412.06248 | null |
2024-12-09 | AIDE: Task-Specific Fine Tuning with Attribute Guided Multi-Hop Data Expansion | Jiayu Li et.al. | 2412.06136 | null |
2024-12-08 | Implicit Delta Learning of High Fidelity Neural Network Potentials | Stephan Thaler et.al. | 2412.06064 | null |
2024-12-08 | Concerning the Use of Turbulent Flow Data for Machine Learning | Mohammed Sardar et.al. | 2412.06050 | null |
2024-12-08 | Accelerating Video Diffusion Models via Distribution Matching | Yuanzhi Zhu et.al. | 2412.05899 | null |
2024-12-08 | XKV: Personalized KV Cache Memory Reduction for Long-Context LLM Inference | Weizhuo Li et.al. | 2412.05896 | null |
2024-12-08 | Towards Modeling Data Quality and Machine Learning Model Performance | Usman Anjum et.al. | 2412.05882 | link |
2024-12-08 | Laser Ultrasonic Imaging via the Time Domain Linear Sampling Method | Jian Song et.al. | 2412.05803 | null |
2024-12-08 | Prism: Semi-Supervised Multi-View Stereo with Monocular Structure Priors | Alex Rich et.al. | 2412.05771 | null |
2024-12-07 | A new basic air shower observable sensitive to the cosmic-ray elemental mass | Animesh Basak et.al. | 2412.05727 | null |
2024-12-06 | One-shot Federated Learning via Synthetic Distiller-Distillate Communication | Junyuan Zhang et.al. | 2412.05186 | link |
2024-12-06 | A text-to-tabular approach to generate synthetic patient data using LLMs | Margaux Tornqvist et.al. | 2412.05153 | link |
2024-12-06 | Noise Matters: Diffusion Model-based Urban Mobility Generation with Collaborative Noise Priors | Yuheng Zhang et.al. | 2412.05000 | null |
2024-12-06 | Neuro-Symbolic Data Generation for Math Reasoning | Zenan Li et.al. | 2412.04857 | null |
2024-12-06 | DrIFT: Autonomous Drone Dataset with Integrated Real and Synthetic Data, Flexible Views, and Transformed Domains | Fardad Dadboud et.al. | 2412.04789 | link |
2024-12-06 | Differentially Private Random Feature Model | Chunyang Liao et.al. | 2412.04785 | link |
2024-12-06 | SpasticMyoElbow: Physical Human-Robot Interaction Simulation Framework for Modelling Elbow Spasticity | Hao Yu et.al. | 2412.04700 | null |
2024-12-05 | Give me Some Hard Questions: Synthetic Data Generation for Clinical QA | Fan Bai et.al. | 2412.04573 | null |
2024-12-05 | DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction | Ben Kaye et.al. | 2412.04464 | null |
2024-12-05 | Monocular Dynamic Gaussian Splatting is Fast and Brittle but Smooth Motion Helps | Yiqing Liang et.al. | 2412.04457 | null |
2024-12-05 | BhashaVerse: Translation Ecosystem for Indian Subcontinent Languages | Vandan Mujadia et.al. | 2412.04351 | null |
2024-12-05 | ALMA: Alignment with Minimal Annotation | Michihiro Yasunaga et.al. | 2412.04305 | null |
2024-12-05 | Methodology for Online Estimation of Rheological Parameters in Polymer Melts Using Deep Learning and Microfluidics | Juan Sandubete-López et.al. | 2412.04142 | null |
2024-12-05 | AI-based Attacker Models for Enhancing Multi-Stage Cyberattack Simulations in Smart Grids Using Co-Simulation Environments | Omer Sen et.al. | 2412.03979 | null |
2024-12-05 | Learning Speed-Adaptive Walking Agent Using Imitation Learning with Physics-Informed Simulation | Yi-Hung Chiu et.al. | 2412.03949 | link |
2024-12-05 | Towards Data Governance of Frontier AI Models | Jason Hausenloy et.al. | 2412.03824 | null |
2024-12-04 | Diffusion in Zero-Shot Learning for Environmental Audio | Ysobel Sims et.al. | 2412.03771 | link |
2024-12-04 | End to End Collaborative Synthetic Data Generation | Sikha Pentyala et.al. | 2412.03766 | null |
2024-12-04 | Evaluating Language Models as Synthetic Data Generators | Seungone Kim et.al. | 2412.03679 | link |
2024-12-04 | Interpreting Transformers for Jet Tagging | Aaron Wang et.al. | 2412.03673 | link |
2024-12-04 | DiffuPT: Class Imbalance Mitigation for Glaucoma Detection via Diffusion Based Generation and Model Pretraining | Youssof Nawar et.al. | 2412.03629 | null |
2024-12-04 | MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation | Zehuan Huang et.al. | 2412.03558 | null |
2024-12-04 | Microwave Remote Sensing of Soil Moisture, Above Ground Biomass and Freeze-Thaw Dynamic: Modeling and Empirical Approaches | Laura Angeloni et.al. | 2412.03523 | null |
2024-12-04 | Domain-Agnostic Stroke Lesion Segmentation Using Physics-Constrained Synthetic Data | Liam Chalcroft et.al. | 2412.03318 | null |
2024-12-04 | GERD: Geometric event response data generation | Jens Egholm Pedersen et.al. | 2412.03259 | link |
2024-12-04 | Semi-Supervised Transfer Boosting (SS-TrBoosting) | Lingfei Deng et.al. | 2412.03212 | null |
2024-12-04 | ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning | Zhe Xie et.al. | 2412.03104 | null |
2024-12-04 | Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models | Alex Havrilla et.al. | 2412.02980 | null |
2024-12-03 | MACAW: A Causal Generative Model for Medical Imaging | Vibujithan Vigneshwaran et.al. | 2412.02900 | link |
2024-12-03 | Learning constitutive relations from experiments: 1. PDE constrained optimization | Andrew Akerson et.al. | 2412.02864 | null |
2024-12-03 | Unpaired Modality Translation for Pseudo Labeling of Histology Images | Arthur Boschet et.al. | 2412.02858 | null |
2024-12-03 | Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset | Dan Su et.al. | 2412.02595 | null |
2024-12-03 | Active learning of neural population dynamics using two-photon holographic optogenetics | Andrew Wagenmaker et.al. | 2412.02529 | null |
2024-12-03 | DP-2Stage: Adapting Language Models as Differentially Private Tabular Data Generators | Tejumade Afonja et.al. | 2412.02467 | link |
2024-12-03 | 3D Face Reconstruction From Radar Images | Valentin Braeutigam et.al. | 2412.02403 | null |
2024-12-03 | Probing jet dynamics and collimation in radio galaxies. Application to NGC 1052 | Ainara Saiz-Pérez et.al. | 2412.02358 | null |
2024-12-03 | SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models | Sabina Martyniak et.al. | 2412.02332 | link |
2024-12-03 | Initial Study On Improving Segmentation By Combining Preoperative CT And Intraoperative CBCT Using Synthetic Data | Maximilian E. Tschuchnig et.al. | 2412.02294 | null |
2024-12-03 | Connecting Large Language Models with Blockchain: Advancing the Evolution of Smart Contracts from Automation to Intelligence | Youquan Xian et.al. | 2412.02263 | null |
2024-12-03 | Fast LiDAR Data Generation with Rectified Flows | Kazuto Nakashima et.al. | 2412.02241 | link |
2024-12-03 | FaaSRCA: Full Lifecycle Root Cause Analysis for Serverless Applications | Jin Huang et.al. | 2412.02239 | null |
2024-12-03 | Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs | Zixuan Hu et.al. | 2412.02220 | null |
2024-12-03 | Thallus: An RDMA-based Columnar Data Transport Protocol | Jayjeet Chakraborty et.al. | 2412.02192 | null |
2024-12-02 | Who’s Gaming the System? A Causally-Motivated Approach for Detecting Strategic Adaptation | Trenton Chang et.al. | 2412.02000 | link |
2024-12-02 | MALT: Improving Reasoning with Multi-Agent LLM Training | Sumeet Ramesh Motwani et.al. | 2412.01928 | null |
2024-12-02 | VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval | Dhiman Paul et.al. | 2412.01558 | link |
2024-11-29 | On Domain-Specific Post-Training for Multimodal Large Language Models | Daixuan Cheng et.al. | 2411.19930 | null |
2024-11-29 | Linear methods for non-linear inverse problems | Geerten Koers et.al. | 2411.19797 | null |
2024-11-29 | Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems | Rafael Teixeira de Lima et.al. | 2411.19710 | null |
2024-11-29 | MIMDE: Exploring the Use of Synthetic vs Human Data for Evaluating Multi-Insight Multi-Document Extraction Tasks | John Francis et.al. | 2411.19689 | null |
2024-11-29 | Diorama: Unleashing Zero-shot Single-view 3D Scene Modeling | Qirui Wu et.al. | 2411.19492 | null |
2024-11-28 | UrbanCAD: Towards Highly Controllable and Photorealistic 3D Vehicles for Urban Scene Simulation | Yichong Lu et.al. | 2411.19292 | null |
2024-11-28 | Parallel and Mini-Batch Stable Matching for Large-Scale Reciprocal Recommender Systems | Kento Nakada et.al. | 2411.19214 | null |
2024-11-27 | Reconstructing Animals and the Wild | Peter Kulits et.al. | 2411.18807 | null |
2024-11-27 | Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis | Eva Prakash et.al. | 2411.18602 | null |
2024-11-28 | Enhancing weed detection performance by means of GenAI-based image augmentation | Sourav Modak et.al. | 2411.18513 | null |
2024-11-27 | Synthetic ECG Generation for Data Augmentation and Transfer Learning in Arrhythmia Classification | José Fernando Núñez et.al. | 2411.18456 | null |
2024-11-27 | The more, the better? Evaluating the role of EEG preprocessing for deep learning applications | Federico Del Pup et.al. | 2411.18392 | link |
2024-11-27 | Two-Timescale Digital Twin Assisted Model Interference and Retraining over Wireless Network | Jiayi Cong et.al. | 2411.18329 | null |
2024-11-27 | Dependency-Aware CAV Task Scheduling via Diffusion-Based Reinforcement Learning | Xiang Cheng et.al. | 2411.18230 | null |
2024-11-27 | SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation | Duc-Hai Pham et.al. | 2411.18229 | null |
2024-11-27 | Training Data Synthesis with Difficulty Controlled Diffusion Model | Zerun Wang et.al. | 2411.18109 | null |
2024-11-27 | Training and Evaluating Language Models with Template-based Data Generation | Yifan Zhang et.al. | 2411.18104 | link |
2024-11-26 | CrypQ: A Database Benchmark Based on Dynamic, Ever-Evolving Ethereum Data | Vincent Capol et.al. | 2411.17913 | null |
2024-11-26 | Repeated sampling of different individuals but the same clusters to improve precision of difference-in-differences estimators: the DISC design | Jordan Downey et.al. | 2411.17905 | null |
2024-11-26 | RealSeal: Revolutionizing Media Authentication with Real-Time Realism Scoring | Bhaktipriya Radharapu et.al. | 2411.17684 | null |
2024-11-26 | Synthetic Data Generation with LLM for Improved Depression Prediction | Andrea Kang et.al. | 2411.17672 | null |
2024-11-26 | Pre-training for Action Recognition with Automatically Generated Fractal Datasets | Davyd Svyezhentsev et.al. | 2411.17584 | link |
2024-11-26 | Evolving Markov Chains: Unsupervised Mode Discovery and Recognition from Data Streams | Kutalmış Coşkun et.al. | 2411.17528 | null |
2024-11-26 | A Method for Fabricating CMOS Back-End-of-Line-Compatible Solid-State Nanopore Devices | Mohamed Yassine Bouhamidi et.al. | 2411.17416 | null |
2024-11-26 | vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation | Bastian Wittmann et.al. | 2411.17386 | null |
2024-11-27 | RealTraj: Towards Real-World Pedestrian Trajectory Forecasting | Ryo Fujii et.al. | 2411.17376 | null |
2024-11-26 | On the Generalization of Handwritten Text Recognition Models | Carlos Garrido-Munoz et.al. | 2411.17332 | null |
2024-11-26 | ER2Score: LLM-based Explainable and Customizable Metric for Assessing Radiology Reports with Reward-Control Loss | Yunyi Liu et.al. | 2411.17301 | null |
2024-11-26 | LHPF: Look back the History and Plan for the Future in Autonomous Driving | Sheng Wang et.al. | 2411.17253 | null |
2024-11-26 | DOGE: Towards Versatile Visual Document Grounding and Referring | Yinan Zhou et.al. | 2411.17125 | null |
2024-11-26 | Average X-ray properties of galaxy groups. From Milky Way-like halos to massive clusters | P. Popesso et.al. | 2411.17120 | null |
2024-11-26 | Large-Scale Data-Free Knowledge Distillation for ImageNet via Multi-Resolution Data Generation | Minh-Tuan Tran et.al. | 2411.17046 | null |
2024-11-25 | Decision Making under the Exponential Family: Distributionally Robust Optimisation with Bayesian Ambiguity Sets | Charita Dellaporta et.al. | 2411.16829 | null |
2024-11-25 | A Study on Unsupervised Domain Adaptation for Semantic Segmentation in the Era of Vision-Language Models | Manuel Schwonberg et.al. | 2411.16407 | null |
2024-11-25 | Video-Text Dataset Construction from Multi-AI Feedback: Promoting Weak-to-Strong Preference Learning for Video Large Language Models | Hao Yi et.al. | 2411.16201 | null |
2024-11-25 | On the Robustness of the Successive Projection Algorithm | Giovanni Barbarino et.al. | 2411.16195 | link |
2024-11-25 | Image Generation Diversity Issues and How to Tame Them | Mischa Dombrowski et.al. | 2411.16171 | link |
2024-11-25 | DP-CDA: An Algorithm for Enhanced Privacy Preservation in Dataset Synthesis Through Randomized Mixing | Utsab Saha et.al. | 2411.16121 | null |
2024-11-25 | Boosting 3D Object Generation through PBR Materials | Yitong Wang et.al. | 2411.16080 | null |
2024-11-24 | PINNs4Drops: Convolutional feature-enhanced physics-informed neural networks for reconstructing two-phase flows | Maximilian Dreisbach et.al. | 2411.15949 | null |
2024-11-24 | Generative Context Distillation | Haebin Shin et.al. | 2411.15927 | link |
2024-11-24 | Beyond Data Scarcity: A Frequency-Driven Framework for Zero-Shot Forecasting | Liran Nochumsohn et.al. | 2411.15743 | null |
2024-11-24 | Comparative Analysis of Diffusion Generative Models in Computational Pathology | Denisha Thakkar et.al. | 2411.15719 | link |
2024-11-24 | Tackling Data Heterogeneity in Federated Time Series Forecasting | Wei Yuan et.al. | 2411.15716 | null |
2024-11-24 | ROOT: VLM based System for Indoor Scene Understanding and Beyond | Yonghui Wang et.al. | 2411.15714 | link |
2024-11-26 | GraphGrad: Efficient Estimation of Sparse Polynomial Representations for General State-Space Models | Benjamin Cox et.al. | 2411.15637 | null |
2024-11-23 | Enhancing Object Detection Accuracy in Autonomous Vehicles Using Synthetic Data | Sergei Voronin et.al. | 2411.15602 | null |
2024-11-23 | Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing | Yadong Qu et.al. | 2411.15585 | link |
2024-11-22 | OminiControl: Minimal and Universal Control for Diffusion Transformer | Zhenxiong Tan et.al. | 2411.15098 | link |
2024-11-22 | The EE-Classifier: A classification method for functional data based on extremality indexes | Catalina Lesmes et.al. | 2411.14999 | null |
2024-11-22 | Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models | Alec Wright et.al. | 2411.14972 | link |
2024-11-22 | LLM for Barcodes: Generating Diverse Synthetic Data for Identity Documents | Hitesh Laxmichand Patel et.al. | 2411.14962 | null |
2024-11-22 | Morph: A Motion-free Physics Optimization Framework for Human Motion Generation | Zhuo Li et.al. | 2411.14951 | null |
2024-11-22 | The NANOGrav 15 year Data Set: Removing pulsars one by one from the pulsar timing array | Gabriella Agazie et.al. | 2411.14846 | null |
2024-11-22 | Harlequin: Color-driven Generation of Synthetic Data for Referring Expression Comprehension | Luca Parolari et.al. | 2411.14807 | null |
2024-11-22 | Aim My Robot: Precision Local Navigation to Any Object | Xiangyun Meng et.al. | 2411.14770 | null |
2024-11-22 | Double Machine Learning for Adaptive Causal Representation in High-Dimensional Data | Lynda Aouar et.al. | 2411.14665 | null |
2024-11-21 | The importance of the clustering model to detect new types of intrusion in data traffic | Noor Saud Abd et.al. | 2411.14550 | null |
2024-11-21 | Learning Fair Robustness via Domain Mixup | Meiyu Zhong et.al. | 2411.14424 | null |
2024-11-21 | Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification | Junhua Liu et.al. | 2411.14252 | null |
2024-11-21 | Learning from “Silly” Questions Improves Large Language Models, But Only Slightly | Tingyuan Zhu et.al. | 2411.14121 | null |
2024-11-21 | Generative Intervention Models for Causal Perturbation Modeling | Nora Schneider et.al. | 2411.14003 | null |
2024-11-21 | iHQGAN: A Lightweight Invertible Hybrid Quantum-Classical Generative Adversarial Network for Unsupervised Image-to-Image Translation | Xue Yang et.al. | 2411.13920 | link |
2024-11-21 | Towards Full Delegation: Designing Ideal Agentic Behaviors for Travel Planning | Song Jiang et.al. | 2411.13904 | null |
2024-11-21 | PIORS: Personalized Intelligent Outpatient Reception based on Large Language Model with Multi-Agents Medical Scenario Simulation | Zhijie Bao et.al. | 2411.13902 | null |
2024-11-21 | Robust Detection of Watermarks for Large Language Models Under Human Edits | Xiang Li et.al. | 2411.13868 | link |
2024-11-21 | Dealing with Synthetic Data Contamination in Online Continual Learning | Maorong Wang et.al. | 2411.13852 | link |
2024-11-21 | GalaxyEdit: Large-Scale Image Editing Dataset with Enhanced Diffusion Adapter | Aniruddha Bala et.al. | 2411.13794 | null |
2024-11-21 | Adaptable Embeddings Network (AEN) | Stan Loosmore et.al. | 2411.13786 | null |
2024-11-22 | Utilizing Large Language Models to Synthesize Product Desirability Datasets | John D. Hastings et.al. | 2411.13485 | null |
2024-11-20 | Heuristically Adaptive Diffusion-Model Evolutionary Strategy | Benedikt Hartl et.al. | 2411.13420 | null |
2024-11-20 | Enhanced Gas Source Localization Using Distributed IoT Sensors and Bayesian Inference | Leonardo Balocchi et.al. | 2411.13268 | null |
2024-11-20 | BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation | Umamaheswaran Raman Kumar et.al. | 2411.13251 | null |
2024-11-20 | SONNET: Enhancing Time Delay Estimation by Leveraging Simulated Audio | Erik Tegler et.al. | 2411.13179 | null |
2024-11-20 | Writing Style Matters: An Examination of Bias and Fairness in Information Retrieval Systems | Hongliu Cao et.al. | 2411.13173 | null |
2024-11-20 | Data driven learning to enhance a kinetic model of distressed crowd dynamics | Daewa Kim et.al. | 2411.12974 | null |
2024-11-20 | Machine learned reconstruction of tsunami dynamics from sparse observations | Edward McDugald et.al. | 2411.12948 | null |
2024-11-20 | Improving Low-Fidelity Models of Li-ion Batteries via Hybrid Sparse Identification of Nonlinear Dynamics | Samuel Filgueira da Silva et.al. | 2411.12935 | null |
2024-11-19 | Data-to-Model Distillation: Data-Efficient Learning Framework | Ahmad Sajedi et.al. | 2411.12841 | link |
2024-11-19 | Regular-pattern-sensitive CRFs for Distant Label Interactions | Sean Papay et.al. | 2411.12484 | null |
2024-11-19 | Empirical Privacy Evaluations of Generative and Predictive Machine Learning Models – A review and challenges for practice | Flavio Hafner et.al. | 2411.12451 | null |
2024-11-19 | Could Humans Outshine AI in Visual Data Analysis? | Ratanond Koonchanok et.al. | 2411.12299 | null |
2024-11-18 | SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input | Zhen Lv et.al. | 2411.11934 | null |
2024-11-18 | RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator | Xinhai Li et.al. | 2411.11839 | null |
2024-11-18 | Theoretical Foundations of Conformal Prediction | Anastasios N. Angelopoulos et.al. | 2411.11824 | null |
2024-11-18 | Parallelly Tempered Generative Adversarial Networks | Jinwon Sohn et.al. | 2411.11786 | null |
2024-11-18 | Open Catalyst Experiments 2024 (OCx24): Bridging Experiments and Computational Models | Jehad Abed et.al. | 2411.11783 | null |
2024-11-18 | Few-shot Model Extraction Attacks against Sequential Recommender Systems | Hui Zhang et.al. | 2411.11677 | null |
2024-11-18 | Real-Time Fitness Exercise Classification and Counting from Video Frames | Riccardo Riccio et.al. | 2411.11548 | link |
2024-11-18 | A Pre-Trained Graph-Based Model for Adaptive Sequencing of Educational Documents | Jean Vassoyan et.al. | 2411.11520 | link |
2024-11-19 | Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation | Rüveyda Yilmaz et.al. | 2411.11515 | null |
2024-11-18 | Lorentz: Learned SKU Recommendation Using Profile Data | Nicholas Glaze et.al. | 2411.11325 | null |
2024-11-18 | Subgroup analysis in multi level hierarchical cluster randomized trials | Shubhadeep Chakraborty et.al. | 2411.11301 | null |
2024-11-17 | MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild | Xi Fang et.al. | 2411.11098 | null |
2024-11-17 | SRA-MCTS: Self-driven Reasoning Augmentation with Monte Carlo Tree Search for Enhanced Code Generation | Bin Xu et.al. | 2411.11053 | link |
2024-11-17 | Towards a framework on tabular synthetic data generation: a minimalist approach: theory, use cases, and limitations | Agus Sudjianto et.al. | 2411.10982 | null |
2024-11-16 | Efficient, Low-Regret, Online Reinforcement Learning for Linear MDPs | Philips George John et.al. | 2411.10906 | null |
2024-11-16 | Watermarking Generative Categorical Data | Bochao Gu et.al. | 2411.10898 | null |
2024-11-15 | Dynamic Causal Effects in a Nonlinear World: the Good, the Bad, and the Ugly | Michal Kolesár et.al. | 2411.10415 | link |
2024-11-15 | How to Build a Quantum Supercomputer: Scaling Challenges and Opportunities | Masoud Mohseni et.al. | 2411.10406 | null |
2024-11-15 | Generation of synthetic gait data: application to multiple sclerosis patients’ gait patterns | Klervi Le Gall et.al. | 2411.10377 | null |
2024-11-15 | Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation | Tim Elsner et.al. | 2411.10281 | link |
2024-11-15 | Evaluating Text-to-Image Diffusion Models for Texturing Synthetic Data | Thomas Lips et.al. | 2411.10164 | link |
2024-11-15 | Mitigating Sycophancy in Decoder-Only Transformer Architectures: Synthetic Data Intervention | Libo Wang et.al. | 2411.10156 | link |
2024-11-15 | Adaptive Physics-Guided Neural Network | David Shulman et.al. | 2411.10064 | null |
2024-11-14 | Cross-Matched Interval Prevalence of High Dimensional Point Clouds | Jonathan M. Mousley et.al. | 2411.09797 | null |
2024-11-14 | Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models | Wei Wang et.al. | 2411.09691 | null |
2024-11-16 | SAFES: Sequential Privacy and Fairness Enhancing Data Synthesis for Responsible AI | Spencer Giddens et.al. | 2411.09178 | link |
2024-11-14 | Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching | Yuran Wang et.al. | 2411.09151 | null |
2024-11-13 | Drone Detection using Deep Neural Networks Trained on Pure Synthetic Data | Mariusz Wisniewski et.al. | 2411.09077 | link |
2024-11-13 | Evaluating cosmological simulations of galaxy formation with spectral variance in the optical window | Z. Sharbaf et.al. | 2411.08945 | null |
2024-11-13 | A probabilistic reduced-order modeling framework for patient-specific cardio-mechanical analysis | Robin Willems et.al. | 2411.08822 | null |
2024-11-13 | Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models | Chengdong Dong et.al. | 2411.08642 | null |
2024-11-13 | Generalized Pose Space Embeddings for Training In-the-Wild using Analysis-by-Synthesis | Dominik Borer et.al. | 2411.08603 | null |
2024-11-13 | Space-local memory in generalized master equations: Reaching the thermodynamic limit for the cost of a small lattice simulation | Srijan Bhattacharyya et.al. | 2411.08598 | null |
2024-11-13 | CorrSynth – A Correlated Sampling Method for Diverse Dataset Generation from LLMs | Suhas S Kowshik et.al. | 2411.08553 | null |
2024-11-13 | A dark energy parameterization independent constraint of the spatial curvature $\Omega_K$ | Zhennan Li et.al. | 2411.08498 | null |
2024-11-13 | Generative AI for Data Augmentation in Wireless Networks: Analysis, Applications, and Case Study | Jinbo Wen et.al. | 2411.08341 | null |
2024-11-13 | DNN Task Assignment in UAV Networks: A Generative AI Enhanced Multi-Agent Reinforcement Learning Approach | Xin Tang et.al. | 2411.08299 | null |
2024-11-13 | Dynamic Thresholding Algorithm with Memory for Linear Inverse Problems | Zhong-Feng Sun et.al. | 2411.08284 | null |
2024-11-12 | SynapsNet: Enhancing Neuronal Population Dynamics Modeling via Learning Functional Connectivity | Parsa Delavari et.al. | 2411.08221 | null |
2024-11-12 | Design optimization of semiconductor manufacturing equipment using a novel multi-fidelity surrogate modeling approach | Bingran Wang et.al. | 2411.08149 | null |
2024-11-12 | Large Language Models Can Self-Improve in Long-context Reasoning | Siheng Li et.al. | 2411.08147 | link |
2024-11-12 | Language Models as Causal Effect Generators | Lucius E. J. Bynum et.al. | 2411.08019 | link |
2024-11-12 | Scalable piecewise smoothing with BART | Ryan Yee et.al. | 2411.07984 | null |
2024-11-12 | Maritime Search and Rescue Missions with Aerial Images: A Survey | Juan P. Martinez-Esteso et.al. | 2411.07649 | null |
2024-11-11 | Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models | SeungHeon Doh et.al. | 2411.07439 | link |
2024-11-11 | Feature-Space Semantic Invariance: Enhanced OOD Detection for Open-Set Domain Generalization | Haoliang Wang et.al. | 2411.07392 | null |
2024-11-11 | SynRL: Aligning Synthetic Clinical Trial Data with Human-preferred Clinical Endpoints Using Reinforcement Learning | Trisha Das et.al. | 2411.07317 | null |
2024-11-11 | DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID | Nyle Siddiqui et.al. | 2411.07205 | link |
2024-11-11 | Data-Driven Predictive Control of Nonholonomic Robots Based on a Bilinear Koopman Realization: Data Does Not Replace Geometry | Mario Rosenfelder et.al. | 2411.07192 | null |
2024-11-11 | Hierarchical Conditional Tabular GAN for Multi-Tabular Synthetic Data Generation | Wilhelm Ågren et.al. | 2411.07009 | null |
2024-11-11 | Maximizing domain generalization in fetal brain tissue segmentation: the role of synthetic data generation, intensity clustering and real image fine-tuning | Vladyslav Zalevskyi et.al. | 2411.06842 | null |
2024-11-11 | Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models | Yeming Wen et.al. | 2411.06722 | null |
2024-11-11 | DiffSR: Learning Radar Reflectivity Synthesis via Diffusion Model from Satellite Observations | Xuming He et.al. | 2411.06714 | null |
2024-11-11 | What Should Baby Models Read? Exploring Sample-Efficient Data Composition on Model Performance | Hong Meng Yam et.al. | 2411.06672 | null |
2024-11-10 | In-Context Learning for Preserving Patient Privacy: A Framework for Synthesizing Realistic Patient Portal Messages | Joseph Gatto et.al. | 2411.06549 | link |
2024-11-10 | CRTRE: Causal Rule Generation with Target Trial Emulation Framework | Junda Wang et.al. | 2411.06338 | null |
2024-11-09 | Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs | Shan Zhong et.al. | 2411.06175 | null |
2024-11-09 | Behavior-Aware Efficient Detection of Malicious EVs in V2G Systems | Ruixiang Wu et.al. | 2411.06113 | null |
2024-11-09 | A novel study on the MUSIC-type imaging of small electromagnetic inhomogeneities in the limited-aperture inverse scattering problem | Won-Kwang Park et.al. | 2411.06030 | null |
2024-11-08 | DNAMite: Interpretable Calibrated Survival Analysis with Discretized Additive Models | Mike Van Ness et.al. | 2411.05923 | link |
2024-11-08 | Differential Privacy Under Class Imbalance: Methods and Empirical Insights | Lucas Rosenblatt et.al. | 2411.05733 | null |
2024-11-08 | Evaluating Large Language Model Capability in Vietnamese Fact-Checking Data Generation | Long Truong To et.al. | 2411.05641 | null |
2024-11-08 | SynDroneVision: A Synthetic Dataset for Image-Based Drone Detection | Tamara R. Lenhard et.al. | 2411.05633 | null |
2024-11-08 | DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions | Rafael Berral-Soler et.al. | 2411.05552 | link |
2024-11-08 | A Quality-Centric Framework for Generic Deepfake Detection | Wentang Song et.al. | 2411.05335 | null |
2024-11-08 | Discovering Latent Structural Causal Models from Spatio-Temporal Data | Kun Wang et.al. | 2411.05331 | null |
2024-11-08 | Cancer-Net SCa-Synth: An Open Access Synthetically Generated 2D Skin Lesion Dataset for Skin Cancer Classification | Chi-en Amy Tai et.al. | 2411.05269 | link |
2024-11-07 | Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model | Sheng Cheng et.al. | 2411.05079 | link |
2024-11-07 | Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models | Shuhong Zheng et.al. | 2411.05005 | null |
2024-11-07 | Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification | Mischa Dombrowski et.al. | 2411.04956 | null |
2024-11-09 | OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models | Siming Huang et.al. | 2411.04905 | null |
2024-11-07 | Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation | Benito Buchheim et.al. | 2411.04724 | null |
2024-11-08 | BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages | Sparsh Jain et.al. | 2411.04699 | link |
2024-11-07 | Improved Multi-Task Brain Tumour Segmentation with Synthetic Data Augmentation | André Ferreira et.al. | 2411.04632 | link |
2024-11-07 | Enhancing Bronchoscopy Depth Estimation through Synthetic-to-Real Domain Adaptation | Qingyao Tian et.al. | 2411.04404 | null |
2024-11-06 | Generating Synthetic Electronic Health Record (EHR) Data: A Review with Benchmarking | Xingran Chen et.al. | 2411.04281 | link |
2024-11-06 | Debiasing Synthetic Data Generated by Deep Generative Models | Alexander Decruyenaere et.al. | 2411.04216 | null |
2024-11-06 | Topology Bench: Systematic Graph Based Benchmarking for Core Optical Networks | Robin Matzner et.al. | 2411.04160 | null |
2024-11-06 | GUIDE-VAE: Advancing Data Generation with User Information and Pattern Dictionaries | Kutay Bölat et.al. | 2411.03936 | null |
2024-11-06 | VQA$^2$: Visual Question Answering for Video Quality Assessment | Ziheng Jia et.al. | 2411.03795 | link |
2024-11-06 | Content-Style Learning from Unaligned Domains: Identifiability under Unknown Latent Dimensions | Sagar Shrestha et.al. | 2411.03755 | null |
2024-11-06 | Where Do We Stand with Implicit Neural Representations? A Technical and Performance Survey | Amer Essakine et.al. | 2411.03688 | null |
2024-11-06 | Open-Source High-Speed Flight Surrogate Modeling Framework | Tyler E. Korenyi-Both et.al. | 2411.03598 | null |
2024-11-05 | Forecasting Outside the Box: Application-Driven Optimal Pointwise Forecasts for Stochastic Optimization | Tito Homem-de-Mello et.al. | 2411.03520 | null |
2024-11-04 | Enhancing Table Representations with LLM-powered Synthetic Data Generation | Dayu Yang et.al. | 2411.03356 | null |
2024-11-05 | DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | Ying Zhou et.al. | 2411.03250 | null |
2024-11-05 | A data-driven study on Implicit LES using a spectral difference method | Nicola Clinco et.al. | 2411.03211 | null |
2024-11-05 | Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting | Adrian B. Chłopowiec et.al. | 2411.03098 | null |
2024-11-05 | Speech Separation with Pretrained Frontend to Minimize Domain Mismatch | Wupeng Wang et.al. | 2411.03085 | link |
2024-11-05 | Controlling for Unobserved Confounding with Large Language Model Classification of Patient Smoking Status | Samuel Lee et.al. | 2411.03004 | null |
2024-11-05 | IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems | Heiko Oppel et.al. | 2411.02954 | null |
2024-11-05 | SpiDR: A Reconfigurable Digital Compute-in-Memory Spiking Neural Network Accelerator for Event-based Perception | Deepika Sharma et.al. | 2411.02854 | null |
2024-11-05 | On the Comparison between Multi-modal and Single-modal Contrastive Learning | Wei Huang et.al. | 2411.02837 | null |
2024-11-04 | Combining Induction and Transduction for Abstract Reasoning | Wen-Ding Li et.al. | 2411.02272 | link |
2024-11-06 | Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | Xingwu Sun et.al. | 2411.02265 | link |
2024-11-06 | Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models | Anjith George et.al. | 2411.02188 | null |
2024-11-04 | Generating the Traces You Need: A Conditional Generative Model for Process Mining Data | Riccardo Graziosi et.al. | 2411.02131 | link |
2024-11-04 | GDP nowcasting with large-scale inter-industry payment data in real time – A network approach | Anastasia Mantziou et.al. | 2411.02029 | null |
2024-11-04 | Learning Where to Edit Vision Transformers | Yunqiao Yang et.al. | 2411.01948 | link |
2024-11-04 | Exploring the Landscape for Generative Sequence Models for Specialized Data Synthesis | Mohammad Zbeeb et.al. | 2411.01929 | link |
2024-11-04 | ManiBox: Enhancing Spatial Grasping Generalization via Scalable Simulation Data Generation | Hengkai Tan et.al. | 2411.01850 | null |
2024-11-04 | DiffuMask-Editor: A Novel Paradigm of Integration Between the Segmentation Diffusion Model and Image Editing to Improve Segmentation Ability | Bo Gao et.al. | 2411.01819 | null |
2024-11-03 | Enhancing Forecasts Using Real-Time Data Flow and Hierarchical Forecast Reconciliation, with Applications to the Energy Sector | Lukas Neubauer et.al. | 2411.01528 | link |
2024-11-03 | Privacy-Preserving Customer Churn Prediction Model in the Context of Telecommunication Industry | Joydeb Kumar Sana et.al. | 2411.01447 | null |
2024-11-02 | Network Causal Effect Estimation In Graphical Models Of Contagion And Latent Confounding | Yufeng Wu et.al. | 2411.01371 | null |
2024-11-02 | Guided Synthesis of Labeled Brain MRI Data Using Latent Diffusion Models for Segmentation of Enlarged Ventricles | Tim Ruschke et.al. | 2411.01351 | null |
2024-11-02 | Marginal Causal Flows for Validation and Inference | Daniel de Vassimon Manela et.al. | 2411.01295 | link |
2024-11-02 | Efficient Collaborative Navigation through Perception Fusion for Multi-Robots in Unknown Environments | Qingquan Lin et.al. | 2411.01274 | null |
2024-11-01 | SelfCodeAlign: Self-Alignment for Code Generation | Yuxiang Wei et.al. | 2410.24198 | link |
2024-10-31 | DexMimicGen: Automated Data Generation for Bimanual Dexterous Manipulation via Imitation Learning | Zhenyu Jiang et.al. | 2410.24185 | null |
2024-10-31 | Constraint Back-translation Improves Complex Instruction Following of Large Language Models | Yunjia Qi et.al. | 2410.24175 | null |
2024-11-02 | $\pi_0$: A Vision-Language-Action Flow Model for General Robot Control | Kevin Black et.al. | 2410.24164 | null |
2024-10-31 | Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure | Xiang Li et.al. | 2410.24060 | link |
2024-10-31 | Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities | Hatef Otroshi Shahreza et.al. | 2410.24015 | null |
2024-10-31 | Towards Fast Algorithms for the Preference Consistency Problem Based on Hierarchical Models | Anne-Marie George et.al. | 2410.23934 | null |
2024-10-31 | Bayesian Hierarchical Model for Synthesizing Registry and Survey Data on Female Breast Cancer Prevalence | Qiao Wang et.al. | 2410.23580 | null |
2024-10-30 | Neural spell-checker: Beyond words with synthetic data generation | Matej Klemen et.al. | 2410.23514 | link |
2024-10-30 | Development and Comparative Analysis of Machine Learning Models for Hypoxemia Severity Triage in CBRNE Emergency Scenarios Using Physiological and Demographic Data from Medical-Grade Devices | Santino Nanini et.al. | 2410.23503 | null |
2024-10-30 | PACER: Preference-conditioned All-terrain Costmap Generation | Luisa Mao et.al. | 2410.23488 | null |
2024-10-30 | Multilingual Vision-Language Pre-training for the Remote Sensing Domain | João Daniel Silva et.al. | 2410.23370 | link |
2024-10-30 | Strategic communication of narratives | Gerrit Bauch et.al. | 2410.23259 | null |
2024-10-31 | Enhancing Autonomous Driving Safety Analysis with Generative AI: A Comparative Study on Automated Hazard and Risk Assessment | Alireza Abbaspour et.al. | 2410.23207 | null |
2024-10-30 | Directional anomaly detection | Oliver Urs Lenz et.al. | 2410.23158 | null |
2024-10-30 | Federated Learning under Periodic Client Participation and Heterogeneous Data: A New Communication-Efficient Algorithm and Analysis | Michael Crawshaw et.al. | 2410.23131 | link |
2024-10-30 | Automated Image-Based Identification and Consistent Classification of Fire Patterns with Quantitative Shape Analysis and Spatial Location Identification | Pengkun Liu et.al. | 2410.23105 | null |
2024-10-30 | CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Mingkun Zhang et.al. | 2410.23091 | link |
2024-10-30 | Private Synthetic Text Generation with Diffusion Models | Sebastian Ochs et.al. | 2410.22971 | link |
2024-10-30 | Augmenting Polish Automatic Speech Recognition System With Synthetic Data | Łukasz Bondaruk et.al. | 2410.22903 | null |
2024-10-30 | Universality of the $\pi^2/6$ Pathway in Avoiding Model Collapse | Apratim Dey et.al. | 2410.22812 | link |
2024-10-30 | Analysis of Classifier Training on Synthetic Data for Cross-Domain Datasets | Andoni Cortés et.al. | 2410.22748 | null |
2024-10-29 | Unpicking Data at the Seams: VAEs, Disentanglement and Independent Components | Carl Allen et.al. | 2410.22559 | null |
2024-10-29 | Evaluating utility in synthetic banking microdata applications | Hugo E. Caceres et.al. | 2410.22519 | null |
2024-10-30 | Nanoscale Connectomics Annotation Standards Framework | Nicole K. Guittari et.al. | 2410.22320 | null |
2024-10-29 | Understanding Synthetic Context Extension via Retrieval Heads | Xinyu Zhao et.al. | 2410.22316 | null |
2024-10-29 | Model-free Estimation of Latent Structure via Multiscale Nonparametric Maximum Likelihood | Bryon Aragam et.al. | 2410.22248 | null |
2024-10-29 | Synthetic Data Generation with Large Language Models for Personalized Community Question Answering | Marco Braga et.al. | 2410.22182 | link |
2024-10-29 | Data Generation for Hardware-Friendly Post-Training Quantization | Lior Dikstein et.al. | 2410.22110 | link |
2024-10-29 | Cross-Entropy Is All You Need To Invert the Data Generating Process | Patrik Reizinger et.al. | 2410.21869 | null |
2024-10-29 | Generating Realistic Tabular Data with Large Language Models | Dang Nguyen et.al. | 2410.21717 | null |
2024-10-28 | Identifying Selections for Unsupervised Subtask Discovery | Yiwen Qiu et.al. | 2410.21616 | null |
2024-10-28 | Approximate Bayesian Computation with Statistical Distances for Model Selection | Clara Grazian et.al. | 2410.21603 | link |
2024-10-28 | Unveiling Context-Aware Criteria in Self-Assessing LLMs | Taneesh Gupta et.al. | 2410.21545 | null |
2024-10-28 | Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification | Hsun-Yu Kuo et.al. | 2410.21526 | null |
2024-10-28 | LLM-Forest for Health Tabular Data Imputation | Xinrui He et.al. | 2410.21520 | null |
2024-10-28 | Inferring the Morphology of the Galactic Center Excess with Gaussian Processes | Edward D. Ramirez et.al. | 2410.21367 | link |
2024-10-28 | Reconstructing dynamics from sparse observations with no training on target system | Zheng-Meng Zhai et.al. | 2410.21222 | null |
2024-10-29 | Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction | Qintong Zhang et.al. | 2410.21169 | null |
2024-10-28 | Synthetica: Large Scale Synthetic Data for Robot Perception | Ritvik Singh et.al. | 2410.21153 | null |
2024-10-28 | Topological Identification of Agent Status in Information Contagions: Application to Financial Markets | Anubha Goel et.al. | 2410.21104 | link |
2024-10-28 | Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models | Wenda Li et.al. | 2410.21088 | link |
2024-10-28 | Federated Time Series Generation on Feature and Temporally Misaligned Data | Chenrui Fan et.al. | 2410.21072 | null |
2024-10-28 | Push-Forward Signed Distance Functions enable interpretable and robust continuous shape quantification | Roua Rouatbi et.al. | 2410.21004 | null |
2024-10-29 | Valid Bootstraps for Networks with Applications to Network Visualisation | Emerald Dilworth et.al. | 2410.20895 | null |
2024-10-28 | Super-resolution with dynamics in the loss | Jacob Page et.al. | 2410.20884 | null |
2024-10-29 | zGAN: An Outlier-focused Generative Adversarial Network For Realistic Synthetic Data Generation | Azizjon Azimi et.al. | 2410.20808 | link |
2024-10-28 | Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training | Michael Pieler et.al. | 2410.20796 | null |
2024-10-28 | Scaling-based Data Augmentation for Generative Models and its Theoretical Extension | Yoshitaka Koike et.al. | 2410.20780 | null |
2024-10-28 | Plan $\times$ RAG: Planning-guided Retrieval Augmented Generation | Prakhar Verma et.al. | 2410.20753 | null |
2024-10-28 | General Causal Imputation via Synthetic Interventions | Marco Jiralerspong et.al. | 2410.20647 | null |
2024-10-29 | TabDiff: a Multi-Modal Diffusion Model for Tabular Data Generation | Juntong Shi et.al. | 2410.20626 | link |
2024-10-25 | Considerations for Distribution Shift Robustness of Diagnostic Models in Healthcare | Arno Blaas et.al. | 2410.19575 | null |
2024-10-25 | EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data | Xuetian Chen et.al. | 2410.19461 | null |
2024-10-25 | Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning | Yujian Liu et.al. | 2410.19290 | link |
2024-10-25 | In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators | Dmytro Humeniuk et.al. | 2410.19277 | null |
2024-10-24 | Equitable Federated Learning with Activation Clustering | Antesh Upadhyay et.al. | 2410.19207 | null |
2024-10-24 | Heterogeneous Random Forest | Ye-eun Kim et.al. | 2410.19022 | link |
2024-10-24 | Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms | Zhangheng Li et.al. | 2410.18967 | null |
2024-10-24 | SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment | Caelan Garrett et.al. | 2410.18907 | null |
2024-10-24 | Distill Visual Chart Reasoning Ability from LLMs to MLLMs | Wei He et.al. | 2410.18798 | link |
2024-10-24 | Learning Geodesics of Geometric Shape Deformations From Images | Nian Wu et.al. | 2410.18797 | null |
2024-10-24 | Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch | Yuyang Ding et.al. | 2410.18693 | link |
2024-10-24 | DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation | Yuang Ai et.al. | 2410.18666 | link |
2024-10-24 | Little Giants: Synthesizing High-Quality Embedding Data at Scale | Haonan Chen et.al. | 2410.18634 | link |
2024-10-24 | Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data | Anup Shirgaonkar et.al. | 2410.18588 | null |
2024-10-24 | Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data | Shuhao Gu et.al. | 2410.18558 | null |