Updated on 2024.11.21
Table of Contents
- <a href=#peft>PEFT</a>
- <a href=#text-to-image-generation>Text-to-Image Generation</a>
- <a href=#vision-language-models>Vision-Language Models</a>
- <a href=#generative-weight-space-modeling>Generative Weight Space Modeling</a>
- <a href=#data-distillation>Data Distillation</a>
- <a href=#schrodinger-bridge>Schrodinger Bridge</a>
- <a href=#dataset-distillation>Dataset Distillation</a>
- <a href=#synthetic-data-generation>Synthetic Data Generation</a>
PEFT
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-11-17 | F$^3$OCUS – Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics | Pramit Saha et.al. | 2411.11912 | null |
2024-11-16 | HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization | Huaqin Zhao et.al. | 2411.10696 | null |
2024-11-12 | PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model | Yilun Liu et.al. | 2411.08212 | null |
2024-11-10 | Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques | Daniil Sulimov et.al. | 2411.06445 | null |
2024-11-06 | MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba | Masakazu Yoshimura et.al. | 2411.03855 | null |
2024-11-04 | PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption | Yifan Tan et.al. | 2411.03357 | null |
2024-11-05 | Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation | Junchen Fu et.al. | 2411.02992 | null |
2024-11-04 | Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study | André Storhaug et.al. | 2411.02462 | null |
2024-11-04 | Expanding Sparse Tuning for Low Memory Usage | Shufan Shen et.al. | 2411.01800 | link |
2024-11-15 | Visual Fourier Prompt Tuning | Runjia Zeng et.al. | 2411.01327 | link |
2024-10-31 | CleaR: Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning | Yeachan Kim et.al. | 2411.00873 | null |
2024-10-30 | FPE-LLM: Highly Intelligent Time-Series Forecasting and Language Interaction LLM in Energy Systems | Zihang Qiu et.al. | 2411.00852 | null |
2024-11-01 | Dual Low-Rank Adaptation for Continual Learning with Pre-Trained Models | Huancheng Chen et.al. | 2411.00623 | null |
2024-11-01 | Is Multiple Object Tracking a Matter of Specialization? | Gianluca Mancusi et.al. | 2411.00553 | null |
2024-11-01 | C2A: Client-Customized Adaptation for Parameter-Efficient Federated Learning | Yeachan Kim et.al. | 2411.00311 | link |
2024-10-29 | Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models | Donghoon Kim et.al. | 2411.00029 | null |
2024-10-30 | Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation | Wei Dong et.al. | 2410.22952 | null |
2024-10-30 | MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning | Xujia Wang et.al. | 2410.22782 | null |
2024-10-29 | Meta-Learning Adaptable Foundation Models | Jacob L. Block et.al. | 2410.22264 | null |
2024-10-29 | Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models | Raman Dutt et.al. | 2410.22149 | link |
2024-10-30 | IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models | Hang Guo et.al. | 2410.21759 | null |
2024-10-28 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Rambod Azimi et.al. | 2410.20777 | link |
2024-10-27 | Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation | Maohao Shen et.al. | 2410.20336 | null |
2024-11-01 | Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies | Luping Wang et.al. | 2410.19878 | null |
2024-10-23 | MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning | Jingfan Zhang et.al. | 2410.18035 | null |
2024-10-22 | Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations | Cheng Lei et.al. | 2410.16953 | null |
2024-10-22 | MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report | Samrajya Thapa et.al. | 2410.16239 | link |
2024-10-21 | Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning | Arijit Das et.al. | 2410.16029 | link |
2024-10-18 | Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Shuai Zhao et.al. | 2410.14425 | link |
2024-10-17 | LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning | Yiming Shi et.al. | 2410.13618 | link |
2024-10-16 | Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models | Sajjad Ghiasvand et.al. | 2410.13097 | null |
2024-10-17 | Prompt Compression for Large Language Models: A Survey | Zongqian Li et.al. | 2410.12388 | link |
2024-10-15 | Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models | Kai Yao et.al. | 2410.11772 | link |
2024-10-15 | LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models | Hossein Abdi et.al. | 2410.11551 | null |
2024-10-15 | RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates | Md Kowsher et.al. | 2410.10075 | link |
2024-10-13 | BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation | Peijia Qin et.al. | 2410.09758 | null |
2024-10-12 | Towards Efficient Visual-Language Alignment of the Q-Former for Visual Reasoning Tasks | Sungkyung Kim et.al. | 2410.09489 | link |
2024-10-15 | MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning | Yaming Yang et.al. | 2410.09437 | null |
2024-10-09 | Parameter-Efficient Fine-Tuning via Selective Discrete Cosine Transform | Yixian Shen et.al. | 2410.09103 | null |
2024-10-04 | BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language Models | Aofei Chang et.al. | 2410.09079 | null |
2024-10-11 | Parameter-Efficient Fine-Tuning of State Space Models | Kevin Galim et.al. | 2410.09016 | link |
2024-10-10 | Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning | Dingkang Liang et.al. | 2410.08114 | link |
2024-10-10 | SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture | Jiayi Han et.al. | 2410.07739 | null |
2024-10-10 | Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures | Yiming Chen et.al. | 2410.07698 | link |
2024-10-09 | SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers | Viktoriia Chekalina et.al. | 2410.07383 | link |
2024-10-09 | Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs | Ruijia Niu et.al. | 2410.06431 | null |
2024-10-08 | Are Large Language Models State-of-the-art Quality Estimators for Machine Translation of User-generated Content? | Shenbin Qian et.al. | 2410.06338 | link |
2024-10-15 | LoRTA: Low Rank Tensor Adaptation of Large Language Models | Ignacio Hounie et.al. | 2410.04060 | null |
2024-10-03 | Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection | Tianxiang Chen et.al. | 2410.02330 | link |
2024-10-02 | TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models | Zefang Liu et.al. | 2410.02062 | link |
2024-10-02 | NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models | Yibo Zhong et.al. | 2410.01870 | null |
2024-09-27 | A GEN AI Framework for Medical Note Generation | Hui Yi Leong et.al. | 2410.01841 | null |
2024-10-02 | DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models | Yuxuan Zhang et.al. | 2410.01497 | link |
2024-10-01 | PrivTuner with Homomorphic Encryption and LoRA: A P3EFT Scheme for Privacy-Preserving Parameter-Efficient Fine-Tuning of AI Foundation Models | Yang Li et.al. | 2410.00433 | null |
2024-09-30 | Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation | Pedro Henrique Paiola et.al. | 2410.00163 | null |
2024-09-30 | Resource Allocation for Stable LLM Training in Mobile Edge Computing | Chang Liu et.al. | 2409.20247 | null |
2024-09-30 | Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models | Luohe Shi et.al. | 2409.20181 | null |
2024-09-28 | FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models | Yucheng Xie et.al. | 2409.19289 | null |
2024-10-01 | Backdoor Attacks for LLMs with Weak-To-Strong Knowledge Distillation | Shuai Zhao et.al. | 2409.17946 | null |
2024-09-26 | PEDRO: Parameter-Efficient Fine-tuning with Prompt DEpenDent Representation MOdification | Tianfang Xie et.al. | 2409.17834 | null |
2024-09-30 | Efficient In-Domain Question Answering for Resource-Constrained Environments | Isaac Chung et.al. | 2409.17648 | null |
2024-10-07 | PACE: marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization | Yao Ni et.al. | 2409.17137 | null |
2024-09-25 | Parameter-efficient Bayesian Neural Networks for Uncertainty-aware Depth Estimation | Richard D. Paul et.al. | 2409.17085 | null |
2024-10-02 | Bone: Block Affine Transformation as Parameter Efficient Fine-tuning Methods for Large Language Models | Jiale Kang et.al. | 2409.15371 | link |
2024-09-22 | Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape | Tao Li et.al. | 2409.14396 | null |
2024-10-01 | Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm | Jaehan Kim et.al. | 2409.14119 | link |
2024-09-20 | HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation | Geyuan Zhang et.al. | 2409.13501 | null |
2024-09-17 | THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models | Mengfei Liang et.al. | 2409.11353 | null |
2024-09-17 | LPT++: Efficient Training on Mixture of Long-tailed Experts | Bowen Dong et.al. | 2409.11323 | null |
2024-09-17 | Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models | Divij Gupta et.al. | 2409.11302 | null |
2024-09-18 | Propulsion: Steering LLM with Tiny Fine-Tuning | Md Kowsher et.al. | 2409.10927 | link |
2024-09-16 | From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs | Navya Jain et.al. | 2409.10245 | null |
2024-09-14 | COMFORT: A Continual Fine-Tuning Framework for Foundation Models Targeted at Consumer Healthcare | Chia-Hao Li et.al. | 2409.09549 | null |
2024-09-14 | Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models | Alireza Salemi et.al. | 2409.09510 | link |
2024-09-13 | Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights | Dixi Yao et.al. | 2409.08482 | null |
2024-09-12 | Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation? | Kerem Cekmeceli et.al. | 2409.07960 | link |
2024-09-11 | Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region | Muhammad Akhtar Munir et.al. | 2409.07585 | link |
2024-09-10 | Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts | Assefa Seyoum Wahd et.al. | 2409.06821 | link |
2024-09-11 | Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models | Yao Shu et.al. | 2409.06277 | link |
2024-09-09 | SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values | Chengwei Sun et.al. | 2409.05926 | null |
2024-09-10 | Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment | Zhixian Zhao et.al. | 2409.05015 | null |
2024-09-06 | Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning | Xinyue Liu et.al. | 2409.04574 | null |
2024-09-04 | iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation | Hayeon Jo et.al. | 2409.02838 | null |
2024-09-04 | Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs | Ruoyu Wang et.al. | 2409.02686 | null |
2024-09-04 | Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA | Shuangyi Chen et.al. | 2409.02346 | null |
2024-09-02 | Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning | Chongjie Si et.al. | 2409.01035 | link |
2024-08-28 | 3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability | Baohao Liao et.al. | 2409.00119 | link |
2024-08-21 | SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models | Yang Cao et.al. | 2409.00055 | link |
2024-08-30 | MoRe Fine-Tuning with 10x Fewer Parameters | Wenxuan Tan et.al. | 2408.17383 | link |
2024-09-02 | Instant Adversarial Purification with Adversarial Consistency Distillation | Chun Tong Lei et.al. | 2408.17064 | null |
2024-08-28 | Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization | Léo Hemamou et.al. | 2408.15801 | null |
2024-08-27 | GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs | Maxim Zhelnin et.al. | 2408.15300 | link |
2024-08-27 | Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training | Xingliang Lei et.al. | 2408.15011 | null |
2024-08-27 | CVPT: Cross-Attention help Visual Prompt Tuning adapt visual task | Lingyun Huang et.al. | 2408.14961 | link |
2024-08-27 | Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models | Aradhye Agarwal et.al. | 2408.14470 | link |
2024-08-24 | Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings | Sagar Srinivas Sakhinana et.al. | 2408.13622 | null |
2024-08-21 | Positional Prompt Tuning for Efficient 3D Representation Learning | Shaochen Zhang et.al. | 2408.11567 | link |
2024-08-20 | Pluto and Charon: A Time and Memory Efficient Collaborative Edge AI Framework for Personal LLMs Fine-Tuning | Bei Ouyang et.al. | 2408.10746 | null |
2024-08-20 | TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning | Bin Wang et.al. | 2408.10688 | link |
2024-08-19 | TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition | Tianwei Lin et.al. | 2408.09856 | link |
2024-08-16 | Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models | Vladimir Araujo et.al. | 2408.09053 | null |
2024-08-14 | KIND: Knowledge Integration and Diversion in Diffusion Models | Yucheng Xie et.al. | 2408.07337 | null |
2024-08-30 | TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning | Yujie Feng et.al. | 2408.05200 | link |
2024-08-08 | Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models | Yupeng Chang et.al. | 2408.04556 | link |
2024-08-06 | SARA: Singular-Value Based Adaptive Low-Rank Adaption | Jihao Gu et.al. | 2408.03290 | null |
2024-08-06 | Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi | Pranita Deshmukh et.al. | 2408.03172 | null |
2024-08-03 | TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks | Yang Yu et.al. | 2408.01835 | link |
2024-08-02 | MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts | Lin Ning et.al. | 2408.01505 | null |
2024-08-02 | Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs | Afia Anjum et.al. | 2408.01008 | null |
2024-07-31 | A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation | Mothilal Asokan et.al. | 2407.21739 | null |
2024-07-28 | Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models | Jifeng Wang et.al. | 2407.19564 | link |
2024-07-24 | Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective | Jingren Liu et.al. | 2407.17120 | null |
2024-07-22 | Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders | Laura Niss et.al. | 2407.15731 | null |
2024-07-21 | Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization | Jiajun Hu et.al. | 2407.15085 | null |
2024-07-16 | InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification | Yujia Hu et.al. | 2407.12882 | link |
2024-07-18 | Turning Generative Models Degenerate: The Power of Data Poisoning Attacks | Shuli Jiang et.al. | 2407.12281 | null |
2024-07-16 | Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification | Naif Alkhunaizi et.al. | 2407.11573 | null |
2024-07-16 | An efficient framework based on large foundation model for cervical cytopathology whole slide image screening | Jialong Huang et.al. | 2407.11486 | link |
2024-07-10 | RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization | Xijie Huang et.al. | 2407.08044 | link |
2024-07-10 | ROSA: Random Subspace Adaptation for Efficient Fine-Tuning | Marawan Gamal Abdel Hameed et.al. | 2407.07802 | link |
2024-07-10 | Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction | Yumin Kim et.al. | 2407.07517 | null |
2024-07-09 | Reprogramming Distillation for Medical Foundation Models | Yuhang Zhou et.al. | 2407.06504 | null |
2024-07-07 | See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition | Chongjie Si et.al. | 2407.05417 | link |
2024-07-16 | LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Shaowen Wang et.al. | 2407.05000 | link |
2024-07-05 | GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning | Aleksander Ficek et.al. | 2407.04528 | null |
2024-07-04 | Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models | Vorakit Vorakitphan et.al. | 2407.04050 | link |
2024-07-04 | ASteISR: Adapting Single Image Super-resolution Pre-trained Model for Efficient Stereo Image Super-resolution | Yuanbo Zhou et.al. | 2407.03598 | null |
2024-07-03 | Knowledge Composition using Task Vectors with Learned Anisotropic Scaling | Frederic Z. Zhang et.al. | 2407.02880 | link |
2024-07-03 | Exploring the Capabilities of LLMs for Code Change Related Tasks | Lishui Fan et.al. | 2407.02824 | link |
2024-07-02 | FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs | Haodong Chen et.al. | 2407.02157 | null |
2024-07-02 | CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications | Yupeng Cao et.al. | 2407.01953 | null |
2024-07-05 | Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Zihan Wang et.al. | 2407.01906 | link |
2024-07-01 | A Fingerprint for Large Language Models | Zhiguang Yang et.al. | 2407.01235 | null |
2024-07-02 | Embedded Prompt Tuning: Towards Enhanced Calibration of Pretrained Models for Medical Images | Wenqiang Zu et.al. | 2407.01003 | link |
2024-06-25 | Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning | Arijit Sehanobish et.al. | 2406.17740 | null |
2024-06-19 | Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks | Liangxin Qian et.al. | 2406.13602 | null |
2024-06-19 | Sparse High Rank Adapters | Kartikeya Bhardwaj et.al. | 2406.13175 | null |
2024-06-18 | Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates | Cristian Meo et.al. | 2406.13046 | null |
2024-06-18 | Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation | Branislav Pecher et.al. | 2406.12471 | link |
2024-06-17 | A Semantic-based Layer Freezing Approach to Efficient Fine-Tuning of Language Models | Jian Gu et.al. | 2406.11753 | null |
2024-06-16 | ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts | Samar Khanna et.al. | 2406.10973 | null |
2024-06-16 | ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation | Yurun Song et.al. | 2406.10785 | null |
2024-06-16 | RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning | Haoyu Wang et.al. | 2406.10777 | null |
2024-06-15 | Benchmarking Children’s ASR with Supervised and Self-supervised Speech Foundation Models | Ruchao Fan et.al. | 2406.10507 | link |
2024-06-15 | Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts | Zhaoxuan Tan et.al. | 2406.10471 | link |
2024-06-13 | Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models | Lukas Thede et.al. | 2406.09384 | null |
2024-06-12 | Exploring Fact Memorization and Style Imitation in LLMs Using QLoRA: An Experimental Study and Quality Assessment Methods | Eugene Vyborov et.al. | 2406.08582 | null |
2024-06-12 | The Impact of Initialization on LoRA Finetuning Dynamics | Soufiane Hayou et.al. | 2406.08447 | null |
2024-06-20 | Low-Rank Quantization-Aware Training for LLMs | Yelysei Bondarenko et.al. | 2406.06385 | link |
2024-06-10 | A Parameter-efficient Language Extension Framework for Multilingual ASR | Wei Liu et.al. | 2406.06329 | null |
2024-06-09 | A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair | Guochang Li et.al. | 2406.05639 | link |
2024-06-07 | Efficient Differentially Private Fine-Tuning of Diffusion Models | Jing Liu et.al. | 2406.05257 | null |
2024-06-07 | CorDA: Context-Oriented Decomposition Adaptation of Large Language Models | Yibo Yang et.al. | 2406.05223 | link |
2024-06-07 | An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Xiongtao Zhou et.al. | 2406.05130 | link |
2024-06-07 | MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Jitai Hao et.al. | 2406.04984 | link |
2024-06-06 | Time Sensitive Knowledge Editing through Efficient Finetuning | Xiou Ge et.al. | 2406.04496 | link |
2024-06-06 | VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation | Prashanth Vijayaraghavan et.al. | 2406.04379 | null |
2024-06-10 | Hypernetworks for Personalizing ASR to Atypical Speech | Max Müller-Eberstein et.al. | 2406.04240 | null |
2024-06-06 | Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning | Naibin Gu et.al. | 2406.03792 | link |
2024-06-05 | Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need | Martin Wistuba et.al. | 2406.03216 | null |
2024-06-06 | Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision | Minglei Li et.al. | 2406.03051 | null |
2024-05-31 | Mamba State-Space Models Can Be Strong Downstream Learners | John T. Halloran et.al. | 2406.00209 | null |
2024-05-30 | ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections | Massimo Bini et.al. | 2405.20271 | link |
2024-05-30 | SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors | Vijay Lingam et.al. | 2405.19597 | link |
2024-05-29 | MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection | Raman Dutt et.al. | 2405.19458 | link |
2024-05-29 | MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning | Junjie Wang et.al. | 2405.18897 | null |
2024-05-29 | Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation | Zelin Peng et.al. | 2405.18840 | null |
2024-06-01 | Low-Rank Few-Shot Adaptation of Vision-Language Models | Maxime Zanella et.al. | 2405.18541 | null |
2024-05-28 | Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning | Renzhi Wang et.al. | 2405.18292 | null |
2024-05-28 | VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections | Roy Miles et.al. | 2405.17991 | link |
2024-05-28 | Sparsity- and Hybridity-Inspired Visual Parameter-Efficient Fine-Tuning for Medical Diagnosis | Mingyuan Liu et.al. | 2405.17877 | null |
2024-05-27 | LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Klaudia Bałazy et.al. | 2405.17604 | link |
2024-05-23 | EMR-Merging: Tuning-Free High-Performance Model Merging | Chenyu Huang et.al. | 2405.17461 | link |
2024-05-28 | DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution | Yulong Mao et.al. | 2405.17357 | link |
2024-05-27 | $\textit{Trans-LoRA}$: towards data-free Transferable Parameter Efficient Finetuning | Runqian Wang et.al. | 2405.17258 | null |
2024-05-30 | Sparse Matrix in Large Language Model Fine-tuning | Haoze He et.al. | 2405.15525 | null |
2024-05-24 | Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation | Abhinav Jain et.al. | 2405.15282 | link |
2024-05-27 | VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks | Yang Li et.al. | 2405.15179 | link |
2024-05-23 | Bitune: Bidirectional Instruction-Tuning | Dawid J. Kopiczko et.al. | 2405.14862 | null |
2024-05-23 | Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference | Ting Liu et.al. | 2405.14700 | link |
2024-05-22 | Spectral Adapter: Fine-Tuning in Spectral Space | Fangzhao Zhang et.al. | 2405.13952 | link |
2024-05-24 | MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Jingwei Xu et.al. | 2405.13053 | link |
2024-05-20 | FeTT: Continual Class Incremental Learning via Feature Transformation Tuning | Sunyuan Qiang et.al. | 2405.11822 | null |
2024-05-21 | HARIS: Human-Like Attention for Reference Image Segmentation | Mengxi Zhang et.al. | 2405.10707 | null |
2024-05-28 | DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation | Jie Xu et.al. | 2405.06368 | null |
2024-05-09 | Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection | Bhawesh Kumar et.al. | 2405.06093 | null |
2024-05-09 | Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning | Shibo Jie et.al. | 2405.05615 | link |
2024-05-07 | Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning | Karim Galliamov et.al. | 2405.04126 | link |
2024-05-04 | Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning | Jing Xu et.al. | 2405.02596 | link |
2024-03-16 | Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R | Amirreza Esmaeili et.al. | 2405.01553 | null |
2024-05-02 | NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment | Gerald Shen et.al. | 2405.01481 | link |
2024-04-29 | LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report | Justin Zhao et.al. | 2405.00732 | link |
2024-05-01 | Investigating Automatic Scoring and Feedback using Large Language Models | Gloria Ashiya Katuka et.al. | 2405.00602 | null |
2024-05-01 | MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model | Rajat Sahay et.al. | 2405.00293 | null |
2024-04-30 | SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models | Samir Arora et.al. | 2405.00201 | null |
2024-05-23 | HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning | Chunlin Tian et.al. | 2404.19245 | link |
2024-05-25 | FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition | Yuxuan Yan et.al. | 2404.18848 | null |
2024-04-25 | Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models | Jiawei Chen et.al. | 2404.16385 | null |
2024-05-23 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Dengchun Li et.al. | 2404.15159 | link |
2024-04-22 | ColA: Collaborative Adaptation with Gradient Learning | Enmao Diao et.al. | 2404.13844 | link |
2024-04-23 | Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications | Charith Chandra Sai Balne et.al. | 2404.13506 | null |
2024-04-18 | SKIP: Skill-Localized Prompt Tuning for Inference Speed Boost-Up | Nakyeong Yang et.al. | 2404.11916 | null |
2024-04-16 | Shears: Unstructured Sparsity with Neural Low-rank Adapter Search | J. Pablo Muñoz et.al. | 2404.10934 | link |
2024-04-16 | Exact and Efficient Unlearning for Large Language Model-based Recommendation | Zhiyu Hu et.al. | 2404.10327 | null |
2024-04-15 | LoRA Dropout as a Sparsity Regularizer for Overfitting Control | Yang Lin et.al. | 2404.09610 | null |
2024-04-21 | Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in LLMs | Ahmed Agiza et.al. | 2404.08699 | link |
2024-04-08 | Certified PEFTSmoothing: Parameter-Efficient Fine-Tuning with Randomized Smoothing | Chengyan Fu et.al. | 2404.05350 | null |
2024-04-08 | DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model | Chao Gao et.al. | 2404.05182 | null |
2024-04-12 | Q-PEFT: Query-dependent Parameter Efficient Fine-tuning for Text Reranking with Large Language Models | Zhiyuan Peng et.al. | 2404.04522 | null |
2024-04-05 | Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation | Tong Su et.al. | 2404.04212 | null |
2024-05-22 | ReFT: Representation Finetuning for Language Models | Zhengxuan Wu et.al. | 2404.03592 | link |
2024-06-11 | Personalized LLM Response Generation with Parameterized Memory Injection | Kai Zhang et.al. | 2404.03565 | null |
2024-06-20 | Eigenpruning: an Interpretability-Inspired PEFT Method | Tomás Vergara-Browne et.al. | 2404.03147 | link |
2024-05-28 | PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Fanxu Meng et.al. | 2404.02948 | link |
2024-04-03 | Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data | Parth Patwa et.al. | 2404.02422 | null |
2024-04-11 | IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT | Junchen Fu et.al. | 2404.02059 | link |
2024-03-31 | Query-driven Relevant Paragraph Extraction from Legal Judgments | T. Y. S. S Santosh et.al. | 2404.00595 | null |
2024-03-30 | Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune your model unless you have access to GPT-4 | Aryo Pradipta Gema et.al. | 2404.00484 | link |
2024-04-03 | InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning | Yan-Shuo Liang et.al. | 2404.00228 | link |
2024-03-27 | Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation | Mateusz Klimaszewski et.al. | 2403.18804 | link |
2024-03-26 | The Unreasonable Ineffectiveness of the Deeper Layers | Andrey Gromov et.al. | 2403.17887 | null |
2024-04-15 | ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models | Zequan Liu et.al. | 2403.16187 | null |
2024-03-22 | KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation | Xindi Luo et.al. | 2403.14950 | link |
2024-03-22 | A Single Linear Layer Yields Task-Adapted Low-Rank Matrices | Hwichan Kim et.al. | 2403.14946 | null |
2024-03-21 | AutoRE: Document-Level Relation Extraction with Large Language Models | Xue Lilong et.al. | 2403.14888 | link |
2024-04-29 | Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey | Zeyu Han et.al. | 2403.14608 | null |
2024-03-20 | Harnessing Large Language Models for Text-Rich Sequential Recommendation | Zhi Zheng et.al. | 2403.13325 | link |
2024-04-16 | AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models | Zeyu Liu et.al. | 2403.13269 | null |
2024-03-18 | Improving LoRA in Privacy-preserving Federated Learning | Youbang Sun et.al. | 2403.12313 | null |
2024-03-18 | Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Wangbo Zhao et.al. | 2403.11808 | link |
2024-03-18 | Let’s Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model | Haoyun Xu et.al. | 2403.11621 | null |
2024-03-19 | JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning | Anique Tahir et.al. | 2403.11366 | link |
2024-03-14 | Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks | Tingyu Qu et.al. | 2403.09377 | link |
2024-03-14 | PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation | Yizhe Xiong et.al. | 2403.09192 | link |
2024-03-13 | Data-oriented Dynamic Fine-tuning Parameter Selection Strategy for FISH Mask based Efficient Fine-tuning | Ming Dong et.al. | 2403.08484 | null |
Text-to-Image Generation
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-11-19 | Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs | Ahmed Akib Jawad Karim et.al. | 2411.12712 | null |
2024-11-19 | OrigamiPlot: An R Package and Shiny Web App Enhanced Visualizations for Multivariate Data | Yiwen Lu et.al. | 2411.12674 | null |
2024-11-19 | Auto-Evaluation with Few Labels through Post-hoc Regression | Benjamin Eyre et.al. | 2411.12665 | null |
2024-11-19 | PoM: Efficient Image and Video Generation with the Polynomial Mixer | David Picard et.al. | 2411.12663 | link |
2024-11-19 | Optimizing Airline Reservation Systems with Edge-Enabled Microservices: A Framework for Real-Time Data Processing and Enhanced User Responsiveness | Biman Barua et.al. | 2411.12650 | null |
2024-11-19 | DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models | Vinay Kumar Sankarapu et.al. | 2411.12643 | link |
2024-11-19 | Improving Controllability and Editability for Pretrained Text-to-Music Generation Models | Yixiao Zhang et.al. | 2411.12641 | null |
2024-11-19 | Universal programmable waveguide arrays | Akram Youssry et.al. | 2411.12610 | null |
2024-11-19 | Whisper Finetuning on Nepali Language | Sanjay Rijal et.al. | 2411.12587 | null |
2024-11-19 | Predicting Customer Satisfaction by Replicating the Survey Response Distribution | Etienne Manderscheid et.al. | 2411.12539 | null |
2024-11-19 | Data Pruning in Generative Diffusion Models | Rania Briq et.al. | 2411.12523 | null |
2024-11-19 | Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing | Ruyi Ding et.al. | 2411.12508 | null |
2024-11-19 | Empirical Privacy Evaluations of Generative and Predictive Machine Learning Models – A review and challenges for practice | Flavio Hafner et.al. | 2411.12451 | null |
2024-11-19 | Frequency-Aware Guidance for Blind Image Restoration via Diffusion Models | Jun Xiao et.al. | 2411.12450 | null |
2024-11-19 | A general modeling and simulation framework for dynamic vehicle routing | Markó Horváth et.al. | 2411.12406 | null |
2024-11-18 | QARM: Quantitative Alignment Multi-Modal Recommendation at Kuaishou | Xinchen Luo et.al. | 2411.11739 | null |
2024-11-18 | Aligning Few-Step Diffusion Models with Dense Reward Difference Learning | Ziyi Zhang et.al. | 2411.11727 | link |
2024-11-18 | Multiscale nonlinear integration drives accurate encoding of input information | Giorgio Nicoletti et.al. | 2411.11710 | null |
2024-11-18 | Robust Reinforcement Learning under Diffusion Models for Data with Jumps | Chenyang Jiang et.al. | 2411.11697 | null |
2024-11-18 | Active droplets controlled by enzymatic reactions | Jacques Fries et.al. | 2411.11696 | null |
2024-11-18 | Do Captioning Metrics Reflect Music Semantic Alignment? | Jinwoo Lee et.al. | 2411.11692 | null |
2024-11-18 | Conceptwm: A Diffusion Model Watermark for Concept Protection | Liangqi Lei et.al. | 2411.11688 | null |
2024-11-19 | GNN-Based Code Annotation Logic for Establishing Security Boundaries in C Code | Varun Gadey et.al. | 2411.11567 | null |
2024-11-19 | Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation | Rüveyda Yilmaz et.al. | 2411.11515 | null |
2024-11-18 | Collaborative Contrastive Network for Click-Through Rate Prediction | Chen Gao et.al. | 2411.11508 | null |
2024-11-18 | LaVin-DiT: Large Vision Diffusion Transformer | Zhaoqing Wang et.al. | 2411.11505 | null |
2024-11-18 | Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art | Alejandro Hernandez et.al. | 2411.11494 | null |
2024-11-18 | MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion | Dongseok Shim et.al. | 2411.11475 | null |
2024-11-18 | GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts | Junwen He et.al. | 2411.11435 | null |
2024-11-18 | CLUE-MARK: Watermarking Diffusion Models using CLWE | Kareem Shehata et.al. | 2411.11434 | null |
2024-11-15 | M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation | Sucheng Ren et.al. | 2411.10433 | null |
2024-11-15 | Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems | Feiqin Zhu et.al. | 2411.10431 | null |
2024-11-15 | Multiscale Dubuc: A New Similarity Measure for Time Series | Mahsa Khazaei et.al. | 2411.10418 | null |
2024-11-15 | Experimental generation of extreme electron beams for advanced accelerator applications | Claudio Emma et.al. | 2411.10413 | null |
2024-11-15 | How to Build a Quantum Supercomputer: Scaling Challenges and Opportunities | Masoud Mohseni et.al. | 2411.10406 | null |
2024-11-15 | Nonlinearity-Driven Morphing and Control of Topological Modes in Non-Hermitian Systems | Zhao-Fan Cai et.al. | 2411.10398 | null |
2024-11-15 | Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion | Haoran Wei et.al. | 2411.10369 | null |
2024-11-15 | Safe Text-to-Image Generation: Simply Sanitize the Prompt Embedding | Huming Qiu et.al. | 2411.10329 | null |
2024-11-15 | Probabilistic Prior Driven Attention Mechanism Based on Diffusion Model for Imaging Through Atmospheric Turbulence | Guodong Sun et.al. | 2411.10321 | null |
2024-11-15 | Assortment Optimization under the Multinomial Logit Model with Covering Constraints | Omar El Housni et.al. | 2411.10310 | null |
2024-11-15 | Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting | Ziqi Xie et.al. | 2411.10309 | link |
2024-11-15 | MDHP-Net: Detecting Injection Attacks on In-vehicle Network using Multi-Dimensional Hawkes Process and Temporal Model | Qi Liu et.al. | 2411.10258 | null |
2024-11-15 | The Unreasonable Effectiveness of Guidance for Diffusion Models | Tim Kaiser et.al. | 2411.10257 | null |
2024-11-15 | Smooth transport map via diffusion process | Arthur Stéphanovitch et.al. | 2411.10235 | null |
2024-11-15 | ColorEdit: Training-free Image-Guided Color editing with diffusion model | Xingxi Yin et.al. | 2411.10232 | null |
2024-11-14 | A Bayesian Optimization Approach to Machine Translation Reranking | Julius Cheng et.al. | 2411.09694 | null |
2024-11-14 | SimTube: Generating Simulated Video Comments through Multimodal AI and User Personas | Yu-Kai Hung et.al. | 2411.09577 | null |
2024-11-14 | Golden Noise for Diffusion Models: A Learning Framework | Zikai Zhou et.al. | 2411.09502 | null |
2024-11-14 | Sparse Bayesian Generative Modeling for Compressive Sensing | Benedikt Böck et.al. | 2411.09483 | null |
2024-11-14 | DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing | Junjie Zhou et.al. | 2411.09451 | null |
2024-11-14 | Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models | Chutian Meng et.al. | 2411.09449 | null |
2024-11-14 | A survey of probabilistic generative frameworks for molecular simulations | Richard John et.al. | 2411.09388 | link |
2024-11-14 | Multi-scale Generative Modeling for Fast Sampling | Xiongye Xiao et.al. | 2411.09356 | null |
2024-11-14 | ParaLBench: A Large-Scale Benchmark for Computational Paralinguistics over Acoustic Foundation Models | Zixing Zhang et.al. | 2411.09349 | null |
2024-11-15 | Approximate Probabilistic Inference for Time-Series Data: A Robust Latent Gaussian Model With Temporal Awareness | Anton Johansson et.al. | 2411.09312 | null |
2024-11-14 | EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models | Soowon Kim et.al. | 2411.09302 | null |
2024-11-14 | LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space | Guanwen Feng et.al. | 2411.09268 | null |
2024-11-14 | Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey | Xuannan Liu et.al. | 2411.09259 | null |
2024-11-14 | RibCageImp: A Deep Learning Framework for 3D Ribcage Implant Generation | Gyanendra Chaubey et.al. | 2411.09204 | null |
2024-11-14 | Improvement and Implementation of a Speech Emotion Recognition Model Based on Dual-Layer LSTM | Xiaoran Yang et.al. | 2411.09189 | null |
2024-11-13 | 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization | Mijeong Kim et.al. | 2411.08879 | null |
2024-11-13 | A generalized software framework for consolidation of radiotherapy planning and delivery data from diverse data sources | Yasin Abdulkadir et.al. | 2411.08876 | null |
2024-11-13 | Offline Adaptation of Quadruped Locomotion using Diffusion Models | Reece O’Mahoney et.al. | 2411.08832 | null |
2024-11-13 | SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate | Yifei Jin et.al. | 2411.08767 | null |
2024-11-13 | Analyst Reports and Stock Performance: Evidence from the Chinese Market | Rui Liu et.al. | 2411.08726 | null |
2024-11-14 | Reducing ADC Front-end Costs During Training of On-sensor Printed Multilayer Perceptrons | Florentia Afentaki et.al. | 2411.08674 | null |
2024-11-13 | Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks | Zhang Liu et.al. | 2411.08672 | null |
2024-11-13 | Toward Human Understanding with Controllable Synthesis | Hanz Cuevas-Velasquez et.al. | 2411.08663 | null |
2024-11-13 | The Galactica database: an open, generic and versatile tool for the dissemination of simulation data in astrophysics | Damien Chapon et.al. | 2411.08647 | null |
2024-11-13 | Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models | Chengdong Dong et.al. | 2411.08642 | null |
2024-11-13 | Deep Generative Demand Learning for Newsvendor and Pricing | Shijin Gong et.al. | 2411.08631 | null |
2024-11-13 | LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation | Pengwei Yin et.al. | 2411.08606 | null |
2024-11-13 | CorrSynth – A Correlated Sampling Method for Diverse Dataset Generation from LLMs | Suhas S Kowshik et.al. | 2411.08553 | null |
2024-11-13 | Explainers’ Mental Representations of Explainees’ Needs in Everyday Explanations | Michael Erol Schaffer et.al. | 2411.08514 | null |
2024-11-13 | HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere | Hatef Otroshi Shahreza et.al. | 2411.08470 | null |
2024-11-12 | Scaling Properties of Diffusion Models for Perceptual Tasks | Rahul Ravishankar et.al. | 2411.08034 | null |
2024-11-12 | GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation | Yushi Lan et.al. | 2411.08033 | null |
2024-11-12 | Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings | Aditya Sanghi et.al. | 2411.08017 | link |
2024-11-12 | JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Yiyang Ma et.al. | 2411.07975 | link |
2024-11-12 | Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules | Binxu Wang et.al. | 2411.07873 | null |
2024-11-12 | Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders | Xiaofeng Zhu et.al. | 2411.07870 | null |
2024-11-12 | CDXFormer: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory | Zhenkai Wu et.al. | 2411.07863 | link |
2024-11-12 | Sparsity-Aware Optimization of In-Memory Bayesian Binary Neural Network Accelerators | Prabodh Katti et.al. | 2411.07842 | null |
2024-11-12 | Novel View Synthesis with Pixel-Space Diffusion Models | Noam Elata et.al. | 2411.07765 | null |
2024-11-12 | Nanosecond nanothermometry in an electron microscope | Florian Castioni et.al. | 2411.07764 | null |
2024-11-12 | LapGSR: Laplacian Reconstructive Network for Guided Thermal Super-Resolution | Aditya Kasliwal et.al. | 2411.07750 | null |
2024-11-12 | The relationship between general equilibrium models with infinite-lived agents and overlapping generations models, and some applications | Ngoc-Sang Pham et.al. | 2411.07674 | null |
2024-11-12 | Evaluating the Generation of Spatial Relations in Text and Image Generative Models | Shang Hong Sim et.al. | 2411.07664 | null |
2024-11-12 | Leveraging Previous Steps: A Training-free Fast Solver for Flow Diffusion | Kaiyu Song et.al. | 2411.07627 | null |
2024-11-12 | Unraveling the Connections between Flow Matching and Diffusion Probabilistic Models in Training-free Conditional Generation | Kaiyu Song et.al. | 2411.07625 | null |
2024-11-11 | Score-based generative diffusion with “active” correlated noise sources | Alexandra Lamtyugina et.al. | 2411.07233 | null |
2024-11-12 | Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models | Yoad Tewel et.al. | 2411.07232 | null |
2024-11-11 | Learning from Limited and Imperfect Data | Harsh Rangwani et.al. | 2411.07229 | null |
2024-11-11 | TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models | Matheus Simão et.al. | 2411.07224 | null |
2024-11-11 | DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID | Nyle Siddiqui et.al. | 2411.07205 | link |
2024-11-11 | Crossover from inhomogeneous to homogeneous response of a resonantly driven hBN quantum emitter | Domitille Gérard et.al. | 2411.07202 | null |
2024-11-11 | OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision | Cong Wei et.al. | 2411.07199 | null |
2024-11-11 | More Expressive Attention with Negative Weights | Ang Lv et.al. | 2411.07176 | link |
2024-11-11 | Edify 3D: Scalable High-Quality 3D Asset Generation | NVIDIA et.al. | 2411.07135 | null |
2024-11-11 | Benchmarking LLMs’ Judgments with No Gold Standard | Shengwei Xu et.al. | 2411.07127 | link |
2024-11-11 | Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models | NVIDIA et.al. | 2411.07126 | null |
2024-11-11 | Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models | Yanchen Wang et.al. | 2411.07121 | link |
2024-11-11 | Scaling Mesh Generation via Compressive Tokenization | Haohan Weng et.al. | 2411.07025 | link |
2024-11-11 | An Electrocardiogram Monitoring Device Based on STM32 | Wenqi Guan et.al. | 2411.06962 | null |
2024-11-11 | Generative Feature Training of Thin 2-Layer Networks | Johannes Hertrich et.al. | 2411.06848 | link |
2024-11-08 | StdGEN: Semantic-Decomposed 3D Character Generation from Single Images | Yuze He et.al. | 2411.05738 | null |
2024-11-08 | Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models | Jia-Hong Huang et.al. | 2411.05706 | null |
2024-11-08 | Improving Molecular Graph Generation with Flow Matching and Optimal Transport | Xiaoyang Hou et.al. | 2411.05676 | null |
2024-11-08 | Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | Nan Song et.al. | 2411.05544 | null |
2024-11-08 | Improving image synthesis with diffusion-negative sampling | Alakh Desai et.al. | 2411.05473 | null |
2024-11-08 | Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation | Peidong Liu et.al. | 2411.05472 | link |
2024-11-08 | IntellBot: Retrieval Augmented LLM Chatbot for Cyber Threat Knowledge Delivery | Dincy R. Arikkat et.al. | 2411.05442 | null |
2024-11-08 | RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction | Xingyu Ai et.al. | 2411.05354 | null |
2024-11-08 | Electro-diffusive modeling and the role of spine geometry on action potential propagation in neurons | Rahul Gulati et.al. | 2411.05329 | null |
2024-11-08 | Social balance in directed networks | Bingjie Hao et.al. | 2411.05327 | null |
2024-11-08 | SeqRFM: Fast RFM Analysis in Sequence Data | Yanxin Zheng et.al. | 2411.05317 | link |
2024-11-08 | Differentiable Calibration of Inexact Stochastic Simulation Models via Kernel Score Minimization | Ziwei Su et.al. | 2411.05315 | null |
2024-11-08 | A Real-time Face Mask Detection and Social Distancing System for COVID-19 using Attention-InceptionV3 Model | Abdullah Al Asif et.al. | 2411.05312 | null |
2024-11-08 | Adaptive Whole-Body PET Image Denoising Using 3D Diffusion Models with ControlNet | Boxiao Yu et.al. | 2411.05302 | null |
2024-11-08 | GPT Semantic Cache: Reducing LLM Costs and Latency via Semantic Embedding Caching | Sajal Regmi et.al. | 2411.05276 | null |
2024-11-07 | SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Muyang Li et.al. | 2411.05007 | link |
2024-11-07 | ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing | Jun-Kun Chen et.al. | 2411.05006 | null |
2024-11-07 | Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models | Shuhong Zheng et.al. | 2411.05005 | null |
2024-11-07 | ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning | David Junhao Zhang et.al. | 2411.05003 | null |
2024-11-07 | SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation | Koichi Namekata et.al. | 2411.04989 | null |
2024-11-07 | Few-Shot Task Learning through Inverse Generative Modeling | Aviv Netanyahu et.al. | 2411.04987 | null |
2024-11-07 | How fast does the WallGo? A package for computing wall velocities in first-order phase transitions | Andreas Ekstedt et.al. | 2411.04970 | link |
2024-11-07 | VAIR: Visuo-Acoustic Implicit Representations for Low-Cost, Multi-Modal Transparent Surface Reconstruction in Indoor Scenes | Advaith V. Sethuraman et.al. | 2411.04963 | null |
2024-11-07 | Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification | Mischa Dombrowski et.al. | 2411.04956 | null |
2024-11-07 | Fed-LDR: Federated Local Data-infused Graph Creation with Node-centric Model Refinement | Jiechao Gao et.al. | 2411.04936 | null |
2024-11-07 | DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion | Wenqiang Sun et.al. | 2411.04928 | null |
2024-11-07 | StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration | Panwen Hu et.al. | 2411.04925 | null |
2024-11-07 | Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion | Kaizhe Hu et.al. | 2411.04919 | link |
2024-11-07 | GASE: Generatively Augmented Sentence Encoding | Manuel Frank et.al. | 2411.04914 | null |
2024-11-07 | Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation | Benito Buchheim et.al. | 2411.04724 | null |
2024-11-06 | Community Forensics: Using Thousands of Generators to Train Fake Image Detectors | Jeongsoo Park et.al. | 2411.04125 | null |
2024-11-06 | Stepping Forward on the Last Mile | Chen Feng et.al. | 2411.04036 | null |
2024-11-06 | Prototyping O-RAN Enabled UAV Experimentation for the AERPAW Testbed | Joshua Moore et.al. | 2411.04027 | null |
2024-11-06 | Object-Centric Dexterous Manipulation from Human Motion Data | Yuanpei Chen et.al. | 2411.04005 | null |
2024-11-06 | Synomaly Noise and Multi-Stage Diffusion: A Novel Approach for Unsupervised Anomaly Detection in Ultrasound Imaging | Yuan Bi et.al. | 2411.04004 | null |
2024-11-06 | ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy | Chenrui Tie et.al. | 2411.03990 | null |
2024-11-06 | ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models | Ashutosh Srivastava et.al. | 2411.03982 | null |
2024-11-06 | Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning | Jiawei Yao et.al. | 2411.03978 | null |
2024-11-06 | Bayesian algorithmic perfumery: A Hierarchical Relevance Vector Machine for the Estimation of Personalized Fragrance Preferences based on Three Sensory Layers and Jungian Personality Archetypes | Rolando Gonzales Martinez et.al. | 2411.03965 | null |
2024-11-06 | Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks | Felipe Marra et.al. | 2411.03948 | null |
2024-11-06 | Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks | Ryan Campbell et.al. | 2411.03945 | link |
2024-11-06 | GUIDE-VAE: Advancing Data Generation with User Information and Pattern Dictionaries | Kutay Bölat et.al. | 2411.03936 | null |
2024-11-06 | Large Generative Model-assisted Talking-face Semantic Communication System | Feibo Jiang et.al. | 2411.03876 | null |
2024-11-06 | ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization | Huayang Huang et.al. | 2411.03862 | link |
2024-11-06 | Sub-DM: Subspace Diffusion Model with Orthogonal Decomposition for MRI Reconstruction | Yu Guan et.al. | 2411.03758 | null |
2024-11-05 | MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning | Ziliang Gan et.al. | 2411.03314 | null |
2024-11-05 | LLMs for Domain Generation Algorithm Detection | Reynier Leyva La O et.al. | 2411.03307 | null |
2024-11-05 | DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | Ying Zhou et.al. | 2411.03250 | null |
2024-11-05 | On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models | Tariq Berrada Ifriqi et.al. | 2411.03177 | null |
2024-11-05 | Unleashing the power of novel conditional generative approaches for new materials discovery | Lev Novitskiy et.al. | 2411.03156 | link |
2024-11-05 | Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting | Adrian B. Chłopowiec et.al. | 2411.03098 | null |
2024-11-05 | Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and Denoising | Tao Huang et.al. | 2411.03053 | null |
2024-11-05 | GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details | Zhongjin Luo et.al. | 2411.03047 | null |
2024-11-05 | Speaker Emotion Recognition: Leveraging Self-Supervised Models for Feature Extraction Using Wav2Vec2 and HuBERT | Pourya Jafarzadeh et.al. | 2411.02964 | null |
2024-11-05 | IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems | Heiko Oppel et.al. | 2411.02954 | null |
2024-11-05 | LDPM: Towards undersampled MRI reconstruction with MR-VAE and Latent Diffusion Prior | Xingjian Tang et.al. | 2411.02951 | null |
2024-11-05 | A scalable generative model for dynamical system reconstruction from neuroimaging data | Eric Volkmann et.al. | 2411.02949 | null |
2024-11-05 | Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey | Ao Fu et.al. | 2411.02914 | null |
2024-11-05 | The Unreasonable Effectiveness of LLMs for Query Optimization | Peter Akioyamen et.al. | 2411.02862 | link |
2024-11-05 | ADOPT: Modified Adam Can Converge with Any $β_2$ with the Optimal Rate | Shohei Taniguchi et.al. | 2411.02853 | link |
2024-11-04 | Training-free Regional Prompting for Diffusion Transformers | Anthony Chen et.al. | 2411.02395 | link |
2024-11-04 | How Far is Video Generation from World Model: A Physical Law Perspective | Bingyi Kang et.al. | 2411.02385 | null |
2024-11-04 | Virgo Filaments IV: Using WISE to Measure the Modification of Star-Forming Disks in the Extended Regions Around the Virgo Cluster | Kim Conger et.al. | 2411.02352 | null |
2024-11-04 | Diffusion-based Generative Multicasting with Intent-aware Semantic Decomposition | Xinkai Liu et.al. | 2411.02334 | null |
2024-11-05 | PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Ruyang Liu et.al. | 2411.02327 | link |
2024-11-04 | LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation | Mufei Li et.al. | 2411.02322 | link |
2024-11-04 | CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments | Kung-Hsiang Huang et.al. | 2411.02305 | link |
2024-11-04 | Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation | Xianghui Yang et.al. | 2411.02293 | null |
2024-11-04 | Counterfactual Explanations via Riemannian Latent Space Traversal | Paraskevas Pegios et.al. | 2411.02259 | null |
2024-11-04 | FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training | Ruihong Yin et.al. | 2411.02229 | null |
2024-11-04 | Recursive Learning of Asymptotic Variational Objectives | Alessandro Mastrototaro et.al. | 2411.02217 | null |
2024-11-04 | Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models | Anjith George et.al. | 2411.02188 | null |
2024-11-04 | Touch-to-Touch Translation – Learning the Mapping Between Heterogeneous Tactile Sensing Technologies | Francesco Grella et.al. | 2411.02187 | null |
2024-11-04 | CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality | Yiqin Zhao et.al. | 2411.02179 | null |
2024-11-04 | CryptoEL: A Novel Experiential Learning Tool for Enhancing K-12 Cryptography Education | Pranathi Rayavaram et.al. | 2411.02143 | null |
2024-10-31 | Bridging Geometric States via Geometric Diffusion Bridge | Shengjie Luo et.al. | 2410.24220 | null |
2024-10-31 | Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning | Penghui Ruan et.al. | 2410.24219 | link |
2024-10-31 | DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion | Weicai Ye et.al. | 2410.24203 | link |
2024-10-31 | Multi-Attribute Linguistic Tuning for Controlled Paraphrase Generation | Mohamed Elgaar et.al. | 2410.24199 | null |
2024-10-31 | Generative modelling for mass-mapping with fast uncertainty quantification | Jessica J. Whitney et.al. | 2410.24197 | link |
2024-10-31 | AR-Pro: Counterfactual Explanations for Anomaly Repair with Formal Properties | Xiayan Ji et.al. | 2410.24178 | null |
2024-10-31 | Redefining | Fu Feng et.al. | 2410.24160 | null |
2024-10-31 | Scaling Concept With Text-Guided Diffusion Models | Chao Huang et.al. | 2410.24151 | null |
2024-10-31 | Repository-Level Compositional Code Translation and Validation | Ali Reza Ibrahimzada et.al. | 2410.24117 | link |
2024-10-31 | Extended electrochemical monitoring of biomolecular binding using commercially available, reusable electrodes in microliter volumes | Jeremy Mendez et.al. | 2410.24110 | null |
2024-10-31 | Sparsh: Self-supervised touch representations for vision-based tactile sensing | Carolina Higuera et.al. | 2410.24090 | null |
2024-10-31 | Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure | Xiang Li et.al. | 2410.24060 | link |
2024-10-31 | TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation | Sunjae Yoon et.al. | 2410.24037 | null |
2024-10-31 | Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities | Hatef Otroshi Shahreza et.al. | 2410.24015 | null |
2024-10-31 | DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination | Jia Fu et.al. | 2410.24006 | link |
2024-10-30 | ReferEverything: Towards Segmenting Everything We Can Speak of in Videos | Anurag Bagchi et.al. | 2410.23287 | null |
2024-10-30 | Provable acceleration for diffusion models under minimal assumptions | Gen Li et.al. | 2410.23285 | null |
2024-10-30 | RelationBooth: Towards Relation-Aware Customized Object Generation | Qingyu Shi et.al. | 2410.23280 | null |
2024-10-30 | SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation | Yining Hong et.al. | 2410.23277 | null |
2024-10-30 | Multi-student Diffusion Distillation for Better One-step Generators | Yanke Song et.al. | 2410.23274 | null |
2024-10-30 | ReaWristic: Remote Touch Sensation to Fingers from a Wristband via Visually Augmented Electro-Tactile Feedback | Yudai Tanaka et.al. | 2410.23193 | null |
2024-10-30 | Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning | Keqin Bao et.al. | 2410.23136 | link |
2024-10-30 | Educating for Hardware Specialization in the Chiplet Era: A Path for the HPC Community | Kazutomo Yoshii et.al. | 2410.23127 | null |
2024-10-30 | CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Mingkun Zhang et.al. | 2410.23091 | null |
2024-10-30 | General Bayesian quantile regression for counts via generative modeling | Yuta Yamauchi et.al. | 2410.23081 | null |
2024-10-30 | Controlling Language and Diffusion Models by Transporting Activations | Pau Rodriguez et.al. | 2410.23054 | null |
2024-10-30 | Dispersion kinks from electronic correlations in an unconventional iron-based superconductor | Ming-Hua Chang et.al. | 2410.23044 | null |
2024-10-30 | Improving Musical Accompaniment Co-creation via Diffusion Transformers | Javier Nistal et.al. | 2410.23005 | null |
2024-10-30 | DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes | Jialiang Zhang et.al. | 2410.23004 | null |
2024-10-30 | LumiSculpt: A Consistency Lighting Control Network for Video Generation | Yuxin Zhang et.al. | 2410.22979 | null |
2024-10-29 | CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning | Weihang Guo et.al. | 2410.22225 | null |
2024-10-29 | A Gaussian Process Generative Model for QCD Equation of State | Jiaxuan Gong et.al. | 2410.22160 | null |
2024-10-29 | Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models | Raman Dutt et.al. | 2410.22149 | link |
2024-10-29 | AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts | Vishal Kumar et.al. | 2410.22143 | null |
2024-10-29 | Infrared photometry with InGaAs detectors: First light with SPECULOOS | Peter P. Pedersen et.al. | 2410.22140 | link |
2024-10-29 | SimRec: Mitigating the Cold-Start Problem in Sequential Recommendation by Integrating Item Similarity | Shaked Brody et.al. | 2410.22136 | link |
2024-10-29 | Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench | Zheyuan Liu et.al. | 2410.22108 | link |
2024-10-29 | Variational inference for pile-up removal at hadron colliders with diffusion models | Malte Algren et.al. | 2410.22074 | null |
2024-10-29 | PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement | Shutong Jin et.al. | 2410.22059 | null |
2024-10-29 | Dual Conditional Diffusion Models for Sequential Recommendation | Hongtao Huang et.al. | 2410.21967 | null |
2024-10-29 | PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference | Kendong Liu et.al. | 2410.21966 | null |
2024-10-29 | CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach | Dac Thai Nguyen et.al. | 2410.21932 | link |
2024-10-29 | Guided Diffusion-based Counterfactual Augmentation for Robust Session-based Recommendation | Muskan Gupta et.al. | 2410.21892 | null |
2024-10-29 | On the study of the limit cycles for a class of population models with time-varying factors | Renhao Tian et.al. | 2410.21848 | null |
2024-10-29 | Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model | Yiming Ji et.al. | 2410.21842 | null |
2024-10-28 | On Inductive Biases That Enable Generalization of Diffusion Transformers | Jie An et.al. | 2410.21273 | link |
2024-10-28 | EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | Shih-Yang Liu et.al. | 2410.21271 | null |
2024-10-28 | LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Hanyu Wang et.al. | 2410.21264 | null |
2024-10-28 | One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation | Zhendong Wang et.al. | 2410.21257 | null |
2024-10-28 | On learning higher-order cumulants in diffusion models | Gert Aarts et.al. | 2410.21212 | null |
2024-10-28 | The VSPEC Collection: A suite of utilities to model spectroscopic phase curves of 3D exoplanet atmospheres in the presence of stellar variability | Ted M Johnson et.al. | 2410.21190 | null |
2024-10-28 | Trajectory Flow Matching with Applications to Clinical Time Series Modeling | Xi Zhang et.al. | 2410.21154 | link |
2024-10-28 | Synthetica: Large Scale Synthetic Data for Robot Perception | Ritvik Singh et.al. | 2410.21153 | null |
2024-10-28 | Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences | Zhihao Zhao et.al. | 2410.21130 | null |
2024-10-28 | Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models | Wenda Li et.al. | 2410.21088 | link |
2024-10-28 | Federated Time Series Generation on Feature and Temporally Misaligned Data | Chenrui Fan et.al. | 2410.21072 | null |
2024-10-28 | Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework | Vladimir Arkhipkin et.al. | 2410.21061 | link |
2024-10-28 | Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | Justin Deschenaux et.al. | 2410.21035 | link |
2024-10-29 | EEG-Driven 3D Object Reconstruction with Color Consistency and Diffusion Prior | Xin Xiang et.al. | 2410.20981 | null |
2024-10-28 | MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis | Di Qiu et.al. | 2410.20974 | null |
2024-10-25 | Model merging with SVD to tie the Knots | George Stoica et.al. | 2410.19735 | link |
2024-10-25 | Adversarial Environment Design via Regret-Guided Diffusion Models | Hojun Chung et.al. | 2410.19715 | null |
2024-10-25 | Perception, Control and Hardware for In-Hand Slip-Aware Object Manipulation with Parallel Grippers | Gabriel Arslan Waltersson et.al. | 2410.19660 | null |
2024-10-25 | DiffGS: Functional Gaussian Splatting Diffusion | Junsheng Zhou et.al. | 2410.19657 | null |
2024-10-25 | VARS: Vision-based Assessment of Risk in Security Systems | Pranav Gupta et.al. | 2410.19642 | null |
2024-10-25 | Diffusion models for lattice gauge field simulations | Qianteng Zhu et.al. | 2410.19602 | null |
2024-10-25 | Energy Efficient Dual Designs of FeFET-Based Analog In-Memory Computing with Inherent Shift-Add Capability | Zeyu Yang et.al. | 2410.19593 | null |
2024-10-25 | Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges | Zubin Zheng et.al. | 2410.19580 | null |
2024-10-25 | Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series | Ilan Naiman et.al. | 2410.19538 | null |
2024-10-25 | Ensemble Data Assimilation for Particle-based Methods | Marius Duvillard et.al. | 2410.19525 | null |
2024-10-25 | Marked Temporal Bayesian Flow Point Processes | Hui Chen et.al. | 2410.19512 | null |
2024-10-25 | EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data | Xuetian Chen et.al. | 2410.19461 | null |
2024-10-28 | NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction | Zixuan Gong et.al. | 2410.19452 | link |
2024-10-25 | Learned Reference-based Diffusion Sampling for multi-modal distributions | Maxence Noble et.al. | 2410.19449 | null |
2024-10-25 | Generative Diffusion Models for Sequential Recommendations | Sharare Zolghadr et.al. | 2410.19429 | null |
2024-10-24 | Framer: Interactive Frame Interpolation | Wen Wang et.al. | 2410.18978 | null |
2024-10-24 | MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms | Ling-Hao Chen et.al. | 2410.18977 | null |
2024-10-24 | Unbounded: A Generative Infinite Game of Character Life Simulation | Jialu Li et.al. | 2410.18975 | null |
2024-10-24 | 3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation | Hansheng Chen et.al. | 2410.18974 | link |
2024-10-24 | On the Crucial Role of Initialization for Matrix Factorization | Bingcong Li et.al. | 2410.18965 | null |
2024-10-24 | Stable Consistency Tuning: Understanding and Improving Consistency Models | Fu-Yun Wang et.al. | 2410.18958 | link |
2024-10-24 | Generation of synthetic financial time series by diffusion models | Tomonori Takahashi et.al. | 2410.18897 | null |
2024-10-24 | Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences | Weijian Luo et.al. | 2410.18881 | null |
2024-10-24 | The Cat and Mouse Game: The Ongoing Arms Race Between Diffusion Models and Detection Methods | Linda Laurier et.al. | 2410.18866 | null |
2024-10-24 | From Efficiency to Equity: Measuring Fairness in Preference Learning | Shreeyash Gowaikar et.al. | 2410.18841 | null |
2024-10-24 | From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages | Artur Kiulian et.al. | 2410.18836 | null |
2024-10-24 | Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation | Xiaoyu Zhang et.al. | 2410.18830 | null |
2024-10-24 | Towards Visual Text Design Transfer Across Languages | Yejin Choi et.al. | 2410.18823 | null |
2024-10-24 | Fast constrained sampling in pre-trained diffusion models | Alexandros Graikos et.al. | 2410.18804 | null |
2024-10-24 | Large Generative AI Models meet Open Networks for 6G: Integration, Platform, and Monetization | Peizheng Li et.al. | 2410.18790 | null |
2024-10-23 | DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes | Hengwei Bian et.al. | 2410.18084 | null |
2024-10-23 | Prioritized Generative Replay | Renhao Wang et.al. | 2410.18082 | null |
2024-10-23 | WorldSimBench: Towards Video Generation Models as World Simulators | Yiran Qin et.al. | 2410.18072 | null |
2024-10-23 | TP-Eval: Tap Multimodal LLMs’ Potential in Evaluation by Customizing Prompts | Yuxuan Xie et.al. | 2410.18071 | null |
2024-10-23 | Training Free Guided Flow Matching with Optimal Control | Luran Wang et.al. | 2410.18070 | null |
2024-10-23 | Spectrally shaped THz pulses from tapered dielectric waveguides | Karel Peetermans et.al. | 2410.17975 | null |
2024-10-23 | Optical Generative Models | Shiqi Chen et.al. | 2410.17970 | null |
2024-10-23 | A Wavelet Diffusion GAN for Image Super-Resolution | Lorenzo Aloisi et.al. | 2410.17966 | null |
2024-10-23 | Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation | Wenfang Yao et.al. | 2410.17918 | link |
2024-10-23 | regAL: Python Package for Active Learning of Regression Problems | Elizaveta Surzhikova et.al. | 2410.17917 | null |
2024-10-23 | Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Shansan Gong et.al. | 2410.17891 | link |
2024-10-23 | Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech | Danilo de Oliveira et.al. | 2410.17834 | null |
2024-10-23 | PGDiffSeg: Prior-Guided Denoising Diffusion Model with Parameter-Shared Attention for Breast Cancer Segmentation | Feiyan Feng et.al. | 2410.17812 | null |
2024-10-23 | GenUDC: High Quality 3D Mesh Generation with Unsigned Dual Contouring Representation | Ruowei Wang et.al. | 2410.17802 | link |
2024-10-23 | Regularized autoregressive modeling and its application to audio signal declipping | Ondřej Mokrý et.al. | 2410.17790 | link |
2024-10-22 | Large Language Models Empowered Personalized Web Agents | Hongru Cai et.al. | 2410.17236 | null |
2024-10-22 | Creativity in AI: Progresses and Challenges | Mete Ismayilzada et.al. | 2410.17218 | null |
2024-10-22 | Audio-to-Score Conversion Model Based on Whisper methodology | Hongyao Zhang et.al. | 2410.17209 | null |
2024-10-22 | Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding | Yasha Ektefaie et.al. | 2410.17173 | link |
2024-10-22 | Performance of the CMS high-level trigger during LHC Run 2 | CMS Collaboration et.al. | 2410.17038 | null |
2024-10-22 | Hybrid Generative AI for De Novo Design of Co-Crystals with Enhanced Tabletability | Nina Gubina et.al. | 2410.17005 | link |
2024-10-22 | DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization | Haowei Zhu et.al. | 2410.16942 | null |
2024-10-22 | Hierarchical Clustering for Conditional Diffusion in Image Generation | Jorge da Silva Goncalves et.al. | 2410.16910 | link |
2024-10-22 | Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections | Marco Miani et.al. | 2410.16901 | null |
2024-10-22 | VistaDream: Sampling multiview consistent images for single-view scene reconstruction | Haiping Wang et.al. | 2410.16892 | null |
2024-10-22 | CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare | Nicholas I-Hsien Kuo et.al. | 2410.16872 | null |
2024-10-22 | MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model | Meng Xu et.al. | 2410.16840 | null |
2024-10-22 | Bridging Search and Recommendation in Generative Retrieval: Does One Task Help the Other? | Gustavo Penha et.al. | 2410.16823 | null |
2024-10-22 | Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection | Laurent Colbois et.al. | 2410.16802 | link |
2024-10-22 | One-Step Diffusion Distillation through Score Implicit Matching | Weijian Luo et.al. | 2410.16794 | link |
2024-10-21 | MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors | Honghua Chen et.al. | 2410.16272 | null |
2024-10-21 | Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos | Gengshan Yang et.al. | 2410.16259 | null |
2024-10-21 | Distribution Learning with Valid Outputs Beyond the Worst-Case | Nick Rittler et.al. | 2410.16253 | null |
2024-10-21 | Building A Coding Assistant via the Retrieval-Augmented Language Model | Xinze Li et.al. | 2410.16229 | link |
2024-10-21 | CiteClick: A Browser Extension for Real-Time Scholar Citation Tracking | Nishat Raihan et.al. | 2410.16211 | null |
2024-10-21 | A Framework for Evaluating Predictive Models Using Synthetic Image Covariates and Longitudinal Data | Simon Deltadahl et.al. | 2410.16177 | null |
2024-10-22 | Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models | Giannis Daras et.al. | 2410.16152 | null |
2024-10-21 | Modelling Structured Data Learning with Restricted Boltzmann Machines in the Teacher-Student Setting | Robin Thériault et.al. | 2410.16150 | null |
2024-10-21 | SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation | Xinyi Zhou et.al. | 2410.16119 | null |
2024-10-21 | Critical Example Mining for Vehicle Trajectory Prediction using Flow-based Generative Models | Zhezhang Ding et.al. | 2410.16083 | null |
2024-10-21 | Continuous Speech Synthesis using per-token Latent Diffusion | Arnon Turetzky et.al. | 2410.16048 | null |
2024-10-21 | Some generalizations of the convective model of jet generation | S. N. Artekha et.al. | 2410.16035 | null |
2024-10-21 | ComPO: Community Preferences for Language Model Personalization | Sachin Kumar et.al. | 2410.16027 | null |
2024-10-21 | Massimo: Public Queue Monitoring and Management using Mass-Spring Model | Abhijeet Kumar et.al. | 2410.16012 | null |
2024-10-21 | AI-Driven Innovations in Modern Cloud Computing | Animesh Kumar et.al. | 2410.15960 | null |
2024-10-18 | BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities | Shaozhe Hao et.al. | 2410.14672 | link |
2024-10-18 | How Does Data Diversity Shape the Weight Landscape of Neural Networks? | Yang Ba et.al. | 2410.14602 | null |
2024-10-18 | Bayesian Multi-wavelength Imaging of the LMC SN1987A with SRG/eROSITA | Vincent Eberle et.al. | 2410.14599 | null |
2024-10-18 | Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets | Namid R. Stillman et.al. | 2410.14587 | null |
2024-10-18 | Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion | Y. Wang et.al. | 2410.14577 | null |
2024-10-18 | Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior | Calvin-Khang Ta et.al. | 2410.14540 | null |
2024-10-18 | Blockchain-Based Trust and Transparency in Airline Reservation Systems using Microservices Architecture | Biman Barua et.al. | 2410.14518 | null |
2024-10-18 | LEAD: Latent Realignment for Human Motion Diffusion | Nefeli Andreou et.al. | 2410.14508 | null |
2024-10-18 | Reinforcement Learning in Non-Markov Market-Making | Luca Lalor et.al. | 2410.14504 | null |
2024-10-18 | Data-driven topology design with persistent homology for enhancing population diversity | Taisei Kii et.al. | 2410.14496 | null |
2024-10-18 | ANT: Adaptive Noise Schedule for Time Series Diffusion Models | Seunghan Lee et.al. | 2410.14488 | link |
2024-10-21 | CaTs and DAGs: Integrating Directed Acyclic Graphs with Transformers and Fully-Connected Neural Networks for Causally Constrained Predictions | Matthew J. Vowels et.al. | 2410.14485 | link |
2024-10-18 | DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation | Junjie Wu et.al. | 2410.14481 | null |
2024-10-18 | Flow-based Sampling for Entanglement Entropy and the Machine Learning of Defects | Andrea Bulgarelli et.al. | 2410.14466 | null |
2024-10-18 | FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models | Rui Hu et.al. | 2410.14429 | null |
2024-10-17 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens | Lijie Fan et.al. | 2410.13863 | null |
2024-10-17 | Diffusing States and Matching Scores: A New Framework for Imitation Learning | Runzhe Wu et.al. | 2410.13855 | link |
2024-10-17 | Influence Functions for Scalable Data Attribution in Diffusion Models | Bruno Mlodozeniec et.al. | 2410.13850 | null |
2024-10-17 | VidPanos: Generative Panoramic Videos from Casual Panning Videos | Jingwei Ma et.al. | 2410.13832 | null |
2024-10-17 | DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control | Yujie Wei et.al. | 2410.13830 | null |
2024-10-17 | Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning | Xiaodan Xing et.al. | 2410.13823 | link |
2024-10-17 | ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution | Junhao Gu et.al. | 2410.13807 | null |
2024-10-17 | Probing the Latent Hierarchical Structure of Data via Diffusion Models | Antonio Sclocchi et.al. | 2410.13770 | null |
2024-10-17 | Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers | Yuchen Liang et.al. | 2410.13746 | null |
2024-10-17 | Improved Convergence Rate for Diffusion Probabilistic Models | Gen Li et.al. | 2410.13738 | null |
2024-10-17 | Optimizing Probabilistic Conformal Prediction with Vectorized Non-Conformity Scores | Minxing Zheng et.al. | 2410.13735 | null |
2024-10-18 | DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation | Hanbo Cheng et.al. | 2410.13726 | link |
2024-10-17 | Movie Gen: A Cast of Media Foundation Models | Adam Polyak et.al. | 2410.13720 | link |
2024-10-18 | Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion | Yijun Liang et.al. | 2410.13674 | link |
2024-10-17 | Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | Chenyu Wang et.al. | 2410.13643 | link |
2024-10-16 | Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds | Xingzhi Sun et.al. | 2410.12779 | null |
2024-10-16 | Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts | Hongcheng Gao et.al. | 2410.12777 | link |
2024-10-16 | SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Jaehong Yoon et.al. | 2410.12761 | null |
2024-10-16 | Signature of Vertical Mixing in Hydrogen-dominated Exoplanet Atmospheres | Vikas Soni et.al. | 2410.12737 | null |
2024-10-16 | Counterfactual Generative Modeling with Variational Causal Inference | Yulun Wu et.al. | 2410.12730 | null |
2024-10-16 | FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression | Zhenheng Tang et.al. | 2410.12707 | null |
2024-10-16 | Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization | Xingqi Wang et.al. | 2410.12700 | link |
2024-10-16 | AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing | DuoSheng Chen et.al. | 2410.12696 | null |
2024-10-16 | 3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation | Dewei Zhou et.al. | 2410.12669 | null |
2024-10-16 | Towards Designing Scalable Quantum-Enhanced Generative Networks for Neutrino Physics Experiments with Liquid Argon Time Projection Chambers | Andrea Delgado et.al. | 2410.12650 | null |
2024-10-16 | A Robo-Advisor System: expected utility modeling via pairwise comparisons | Bo Chen et.al. | 2410.12570 | null |
2024-10-16 | One Step Diffusion via Shortcut Models | Kevin Frans et.al. | 2410.12557 | link |
2024-10-16 | Disentangling data distribution for Federated Learning | Xinyuan Zhao et.al. | 2410.12530 | null |
2024-10-16 | Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing | Mingce Guo et.al. | 2410.12526 | null |
2024-10-16 | MING: A Functional Approach to Learning Molecular Generative Models | Van Khoa Nguyen et.al. | 2410.12522 | null |
2024-10-15 | High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion | Junhwa Hur et.al. | 2410.11838 | null |
2024-10-15 | On the Effectiveness of Dataset Alignment for Fake Image Detection | Anirudh Sundara Rajan et.al. | 2410.11835 | null |
2024-10-15 | Bayesian Experimental Design via Contrastive Diffusions | Jacopo Iollo et.al. | 2410.11826 | link |
2024-10-15 | KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities | Hsin-Ping Huang et.al. | 2410.11824 | null |
2024-10-15 | Improving Long-Text Alignment for Text-to-Image Diffusion Models | Luping Liu et.al. | 2410.11817 | link |
2024-10-15 | SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing | Zhiyuan Zhang et.al. | 2410.11815 | null |
2024-10-16 | Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Zhiyuan Ma et.al. | 2410.11795 | null |
2024-10-15 | G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks | Guibin Zhang et.al. | 2410.11782 | null |
2024-10-15 | Technical Report of 1:10 Scale Autonomous Vehicle Robot | Amirhossein Kheiri Holighi et.al. | 2410.11746 | null |
2024-10-15 | Probabilistic Principles for Biophysics and Neuroscience: Entropy Production, Bayesian Mechanics & the Free-Energy Principle | Lancelot Da Costa et.al. | 2410.11735 | null |
2024-10-15 | Patch-Based Diffusion Models Beat Whole-Image Models for Mismatched Distribution Inverse Problems | Jason Hu et.al. | 2410.11730 | null |
2024-10-15 | Parameter estimation of structural dynamics with neural operators enabled surrogate modeling | Mingyuan Zhou et.al. | 2410.11712 | null |
2024-10-15 | Findings of the WMT 2024 Shared Task on Chat Translation | Wafaa Mohammed et.al. | 2410.11624 | null |
2024-10-15 | DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment | Wendi Chen et.al. | 2410.11584 | link |
2024-10-15 | A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction | Zhouheng Li et.al. | 2410.11570 | link |
2024-10-14 | Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models | Jingzhi Bao et.al. | 2410.10821 | link |
2024-10-15 | TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models | Mu Cai et.al. | 2410.10818 | link |
2024-10-14 | LVD-2M: A Long-take Video Dataset with Temporally Dense Captions | Tianwei Xiong et.al. | 2410.10816 | link |
2024-10-14 | Depth Any Video with Scalable Synthetic Data | Honghui Yang et.al. | 2410.10815 | link |
2024-10-14 | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Haotian Tang et.al. | 2410.10812 | link |
2024-10-14 | TrajDiffuse: A Conditional Diffusion Model for Environment-Aware Trajectory Prediction | Qingze et.al. | 2410.10804 | link |
2024-10-14 | Boosting Camera Motion Control for Video Diffusion Transformers | Soon Yau Cheong et.al. | 2410.10802 | null |
2024-10-14 | Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations | Litu Rout et.al. | 2410.10792 | null |
2024-10-14 | ControlMM: Controllable Masked Motion Generation | Ekkasit Pinyoanuntapong et.al. | 2410.10780 | null |
2024-10-14 | Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation | Youwei Yu et.al. | 2410.10766 | null |
2024-10-14 | DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships | Zhang Wan et.al. | 2410.10751 | null |
2024-10-14 | CosForce: A Force-Based General Model for Simulating Pedestrian Anticipation and Reaction Mechanisms | Jinghui Wang et.al. | 2410.10746 | null |
2024-10-14 | FlexGen: Flexible Multi-View Generation from Text and Image Inputs | Xinli Xu et.al. | 2410.10745 | null |
2024-10-14 | Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models | Junyu Chen et.al. | 2410.10733 | link |
2024-10-14 | Large Language Models Are Active Critics in NLG Evaluation | Shuying Xu et.al. | 2410.10724 | null |
2024-10-11 | SceneCraft: Layout-Guided 3D Scene Generation | Xiuyu Yang et.al. | 2410.09049 | link |
2024-10-11 | Linear Convergence of Diffusion Models Under the Manifold Hypothesis | Peter Potaptchik et.al. | 2410.09046 | null |
2024-10-11 | PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents | Xiangyu Yin et.al. | 2410.09034 | null |
2024-10-11 | Semantic Score Distillation Sampling for Compositional Text-to-3D Generation | Ling Yang et.al. | 2410.09009 | link |
2024-10-11 | WaveDiffusion: Exploring Full Waveform Inversion via Joint Diffusion in the Latent Space | Hanchen Wang et.al. | 2410.09002 | null |
2024-10-11 | Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory | Aymane El Firdoussi et.al. | 2410.08942 | null |
2024-10-11 | DiffPO: A causal diffusion model for learning distributions of potential outcomes | Yuchen Ma et.al. | 2410.08924 | null |
2024-10-11 | An End-to-End Deep Learning Method for Solving Nonlocal Allen-Cahn and Cahn-Hilliard Phase-Field Models | Yuwei Geng et.al. | 2410.08914 | null |
2024-10-11 | Conditional Generative Models for Contrast-Enhanced Synthesis of T1w and T1 Maps in Brain MRI | Moritz Piening et.al. | 2410.08894 | link |
2024-10-11 | MATCH: Model-Aware TVM-based Compilation for Heterogeneous Edge Devices | Mohamed Amine Hamdi et.al. | 2410.08855 | link |
2024-10-14 | LIME-Eval: Rethinking Low-light Image Enhancement Evaluation via Object Detection | Mingjia Li et.al. | 2410.08810 | link |
2024-10-11 | Bad Neighbors: On Understanding VPN Provider Networks | Teemu Rytilahti et.al. | 2410.08737 | link |
2024-10-11 | 5G as Enabler for Industrie 4.0 Use Cases: Challenges and Concepts | M. Gundall et.al. | 2410.08726 | null |
2024-10-11 | Investigating Human-Computer Interaction and Visual Comprehension in Text Generation Process of Natural Language Generation Models | Yunchao Wang et.al. | 2410.08723 | null |
2024-10-11 | Impact of Surface Reflections in Maritime Obstacle Detection | Samed Yalçın et.al. | 2410.08713 | link |
2024-10-10 | LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts | Anh-Quan Cao et.al. | 2410.08211 | null |
2024-10-10 | DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models | Xiaoxiao He et.al. | 2410.08207 | null |
2024-10-10 | HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation | Shanyan Guan et.al. | 2410.08192 | null |
2024-10-10 | DifFRelight: Diffusion-Based Facial Performance Relighting | Mingming He et.al. | 2410.08188 | null |
2024-10-10 | RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image | Xiaoxue Chen et.al. | 2410.08181 | null |
2024-10-10 | ZeroComp: Zero-shot Object Compositing from Image Intrinsics via Diffusion | Zitian Zhang et.al. | 2410.08168 | null |
2024-10-10 | DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | Jiatao Gu et.al. | 2410.08159 | null |
2024-10-10 | Progressive Autoregressive Video Diffusion Models | Desai Xie et.al. | 2410.08151 | link |
2024-10-10 | Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction | Jarrid Rector-Brooks et.al. | 2410.08134 | null |
2024-10-10 | Robust AI-Generated Text Detection by Restricted Embeddings | Kristian Kuznetsov et.al. | 2410.08113 | link |
2024-10-10 | LiPO: LiDAR Inertial Odometry for ICP Comparison | Darwin Mick et.al. | 2410.08097 | null |
2024-10-10 | Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models | Vinith M. Suriyakumar et.al. | 2410.08074 | null |
2024-10-10 | Reversible Decoupling Network for Single Image Reflection Removal | Hao Zhao et.al. | 2410.08063 | link |
2024-10-10 | A Target-Aware Analysis of Data Augmentation for Hate Speech Detection | Camilla Casula et.al. | 2410.08053 | null |
2024-10-10 | LADIMO: Face Morph Generation through Biometric Template Inversion with Latent Diffusion | Marcel Grimmer et.al. | 2410.07988 | link |
2024-10-09 | IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | Xinchen Zhang et.al. | 2410.07171 | link |
2024-10-09 | Sylber: Syllabic Embedding Representation of Speech from Raw Audio | Cheol Jun Cho et.al. | 2410.07168 | link |
2024-10-09 | AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation | Yukang Cao et.al. | 2410.07164 | null |
2024-10-09 | InstructG2I: Synthesizing Images from Multimodal Attributed Graphs | Bowen Jin et.al. | 2410.07157 | link |
2024-10-09 | Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis | Bohan Zeng et.al. | 2410.07155 | link |
2024-10-10 | EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models | Rui Zhao et.al. | 2410.07133 | link |
2024-10-09 | Personalized Visual Instruction Tuning | Renjie Pi et.al. | 2410.07113 | link |
2024-10-09 | A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research | Seongjin Choi et.al. | 2410.07066 | link |
2024-10-09 | Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax | Ivan Butakov et.al. | 2410.06993 | null |
2024-10-09 | Diffusion Density Estimators | Akhil Premkumar et.al. | 2410.06986 | null |
2024-10-09 | Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control | Shimon Vainer et.al. | 2410.06985 | null |
2024-10-09 | Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation | Runze Chen et.al. | 2410.06982 | null |
2024-10-09 | Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think | Sihyun Yu et.al. | 2410.06940 | link |
2024-10-09 | VEC-Sim: A Simulation Platform for Evaluating Service Caching and Computation Offloading Policies in Vehicular Edge Networks | Fan Wu et.al. | 2410.06934 | null |
2024-10-09 | Generative Model for Less-Resourced Language with 1 billion parameters | Domen Vreš et.al. | 2410.06898 | null |
2024-10-07 | DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control | Kaifeng Zhao et.al. | 2410.05260 | null |
2024-10-07 | GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting | Yukang Cao et.al. | 2410.05259 | null |
2024-10-07 | SePPO: Semi-Policy Preference Optimization for Diffusion Alignment | Daoan Zhang et.al. | 2410.05255 | link |
2024-10-07 | DiffuseReg: Denoising Diffusion Model for Obtaining Deformation Fields in Unsupervised Deformable Image Registration | Yongtai Zhuo et.al. | 2410.05234 | link |
2024-10-07 | Density estimation with LLMs: a geometric investigation of in-context learning trajectories | Toni J. B. Liu et.al. | 2410.05218 | null |
2024-10-07 | Avoiding Deadlocks via Weak Deadlock Sets | Gianpaolo Oriolo et.al. | 2410.05175 | null |
2024-10-07 | Presto! Distilling Steps and Layers for Accelerating Music Generation | Zachary Novack et.al. | 2410.05167 | null |
2024-10-08 | A Simulation-Free Deep Learning Approach to Stochastic Optimal Control | Mengjian Hua et.al. | 2410.05163 | null |
2024-10-07 | Smart Jamming Attack and Mitigation on Deep Transfer Reinforcement Learning Enabled Resource Allocation for Network Slicing | Shavbo Salehi et.al. | 2410.05153 | null |
2024-10-07 | Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information | Timofey Efimov et.al. | 2410.05143 | null |
2024-10-07 | Agnostic Smoothed Online Learning | Moïse Blanchard et.al. | 2410.05124 | null |
2024-10-07 | Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning | Ayano Hiranaka et.al. | 2410.05116 | null |
2024-10-07 | Synthetic Generation of Dermatoscopic Images with GAN and Closed-Form Factorization | Rohan Reddy Mekala et.al. | 2410.05114 | null |
2024-10-07 | Hyper-Representations: Learning from Populations of Neural Networks | Konstantin Schürholt et.al. | 2410.05107 | link |
2024-10-07 | DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects | Nidhi Mathihalli et.al. | 2410.05097 | link |
2024-10-04 | Estimating Body and Hand Motion in an Ego-sensed World | Brent Yi et.al. | 2410.03665 | null |
2024-10-04 | Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models | Zhuochun Li et.al. | 2410.03663 | null |
2024-10-04 | Geometric Representation Condition Improves Equivariant Molecule Generation | Zian Li et.al. | 2410.03655 | null |
2024-10-04 | Aligning LLMs with Individual Preferences via Interaction | Shujin Wu et.al. | 2410.03642 | link |
2024-10-04 | Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models | Chumeng Liang et.al. | 2410.03640 | link |
2024-10-04 | Conditional Enzyme Generation Using Protein Language Models with Adapters | Jason Yang et.al. | 2410.03634 | null |
2024-10-04 | How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework | Yinuo Ren et.al. | 2410.03601 | null |
2024-10-04 | Teaching Transformers Modular Arithmetic at Scale | Eshika Saxena et.al. | 2410.03569 | null |
2024-10-04 | Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features | Benyuan Meng et.al. | 2410.03558 | link |
2024-10-04 | Loading Ceramics: Visualising Possibilities of Robotics in Ceramics | Varvara Guljajeva et.al. | 2410.03550 | null |
2024-10-04 | NRGBoost: Energy-Based Generative Boosted Trees | João Bravo et.al. | 2410.03535 | null |
2024-10-04 | Generative Artificial Intelligence for Navigating Synthesizable Chemical Space | Wenhao Gao et.al. | 2410.03494 | link |
2024-10-04 | SeBS-Flow: Benchmarking Serverless Cloud Function Workflows | Larissa Schmid et.al. | 2410.03480 | null |
2024-10-04 | Formalizing MLTL Formula Progression in Isabelle/HOL | Katherine Kosaian et.al. | 2410.03465 | null |
2024-10-04 | Diffusion State-Guided Projected Gradient for Inverse Problems | Rayhan Zirvi et.al. | 2410.03463 | null |
2024-10-03 | SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost | Jifan Zhang et.al. | 2410.02755 | null |
2024-10-03 | CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation | Han He et.al. | 2410.02748 | null |
2024-10-03 | Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization | Lei Xu et.al. | 2410.02741 | null |
2024-10-03 | Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models | Zhengfeng Lai et.al. | 2410.02740 | null |
2024-10-03 | Custom Non-Linear Model Predictive Control for Obstacle Avoidance in Indoor and Outdoor Environments | Lara Laban et.al. | 2410.02732 | link |
2024-10-03 | A Photonic Parameter-shift Rule: Enabling Gradient Computation for Photonic Quantum Computers | Axel Pappalardo et.al. | 2410.02726 | null |
2024-10-03 | AlzhiNet: Traversing from 2DCNN to 3DCNN, Towards Early Detection and Diagnosis of Alzheimer’s Disease | Romoke Grace Akindele et.al. | 2410.02714 | null |
2024-10-03 | SteerDiff: Steering towards Safe Text-to-Image Diffusion Models | Hongxiang Zhang et.al. | 2410.02710 | null |
2024-10-03 | ControlAR: Controllable Image Generation with Autoregressive Models | Zongming Li et.al. | 2410.02705 | link |
2024-10-03 | User-centric Immersive Communications in 6G: A Data-oriented Approach via Digital Twin | Conghao Zhou et.al. | 2410.02688 | null |
2024-10-03 | GUD: Generation with Unified Diffusion | Mathis Gerdes et.al. | 2410.02667 | null |
2024-10-03 | Grounded Answers for Multi-agent Decision-making Problem through Generative World Model | Zeyang Liu et.al. | 2410.02664 | null |
2024-10-03 | Scalable Simulation-free Entropic Unbalanced Optimal Transport | Jaemoo Choi et.al. | 2410.02656 | null |
2024-10-03 | Measuring and Improving Persuasiveness of Generative Models | Somesh Singh et.al. | 2410.02653 | null |
2024-10-03 | Efficient calibration of the shifted square-root diffusion model to credit default swap spreads using asymptotic approximations | Ankush Agarwal et.al. | 2410.02645 | null |
2024-10-02 | FabricDiffusion: High-Fidelity Texture Transfer for 3D Garments Generation from In-The-Wild Clothing Images | Cheng Zhang et.al. | 2410.01801 | null |
2024-10-02 | Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | Yangming Li et.al. | 2410.01796 | null |
2024-10-02 | Dynamical-generative downscaling of climate model ensembles | Ignacio Lopez-Gomez et.al. | 2410.01776 | null |
2024-10-02 | Towards deep learning sequence-structure co-generation for protein design | Chentong Wang et.al. | 2410.01773 | null |
2024-10-02 | ImageFolder: Autoregressive Image Generation with Folded Tokens | Xiang Li et.al. | 2410.01756 | link |
2024-10-02 | AssessITS: Integrating procedural guidelines and practical evaluation metrics for organizational IT and Cybersecurity risk assessment | Mir Mehedi Rahman et.al. | 2410.01750 | null |
2024-10-02 | VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models | Kailai Feng et.al. | 2410.01738 | link |
2024-10-02 | HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration | Yushi Huang et.al. | 2410.01723 | null |
2024-10-02 | Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective | Zeyu Gan et.al. | 2410.01720 | link |
2024-10-02 | COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation | Mingzhen Sun et.al. | 2410.01718 | null |
2024-10-02 | A Mathematics-Inspired Learning-to-Optimize Framework for Decentralized Optimization | Yutong He et.al. | 2410.01700 | null |
2024-10-02 | Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding | Yao Teng et.al. | 2410.01699 | link |
2024-10-02 | Lossy Semantic Communication for the Logical Deduction of the State of the World | Ahmet Faruk Saz et.al. | 2410.01676 | null |
2024-10-02 | Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering | Klaus-Rudolf Kladny et.al. | 2410.01660 | null |
2024-10-02 | On The Adaptation of Unlimiformer for Decoder-Only Transformers | Kian Ahrabian et.al. | 2410.01637 | null |
2024-09-30 | SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes | Tianchang Shen et.al. | 2409.20562 | null |
2024-09-30 | Annealing Flow Generative Model Towards Sampling High-Dimensional and Multi-Modal Distributions | Dongze Wu et.al. | 2409.20547 | link |
2024-09-30 | A Compact Quantum Random Number Generator Based on Balanced Detection of Shot Noise | Jaideep Singh et.al. | 2409.20515 | null |
2024-09-30 | NUTRIVISION: A System for Automatic Diet Management in Smart Healthcare | Madhumita Veeramreddy et.al. | 2409.20508 | null |
2024-09-30 | COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models | Divyanshu Daiya et.al. | 2409.20502 | null |
2024-09-30 | FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing | Lingling Cai et.al. | 2409.20500 | null |
2024-09-30 | All-optical autoencoder machine learning framework using diffractive processors | Peijie Feng et.al. | 2409.20346 | null |
2024-09-30 | Devil is in Details: Locality-Aware 3D Abdominal CT Volume Generation for Self-Supervised Organ Segmentation | Yuran Wang et.al. | 2409.20332 | null |
2024-09-30 | UIR-LoRA: Achieving Universal Image Restoration through Multiple Low-Rank Adaptation | Cheng Zhang et.al. | 2409.20197 | link |
2024-09-30 | Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems | Hongkai Zheng et.al. | 2409.20175 | null |
2024-09-30 | Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model | Fulong Ma et.al. | 2409.20164 | null |
2024-09-30 | Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation | Rong Tang et.al. | 2409.20124 | null |
2024-09-30 | Training a Computer Vision Model for Commercial Bakeries with Primarily Synthetic Images | Thomas H. Schmitt et.al. | 2409.20122 | null |
2024-09-30 | Reaction-diffusion model for a population structured in phenotype and space I – Criterion for persistence | Nathanaël Boutillon et.al. | 2409.20118 | null |
2024-09-30 | Near-Field Coupling Coil System: A Novel Radiofrequency Coil Solution for MRI | Zhiguang Mo et.al. | 2409.20095 | null |
2024-09-27 | $O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions | Gen Li et.al. | 2409.18959 | null |
2024-09-27 | ReviveDiff: A Universal Diffusion Model for Restoring Images in Adverse Weather Conditions | Wenfeng Huang et.al. | 2409.18932 | null |
2024-09-27 | Unsupervised Low-light Image Enhancement with Lookup Tables and Diffusion Priors | Yunlong Lin et.al. | 2409.18899 | null |
2024-09-27 | Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis | Songrui Wang et.al. | 2409.18897 | null |
2024-09-27 | HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models | Yu Zhou et.al. | 2409.18893 | null |
2024-09-27 | Explainable Artifacts for Synthetic Western Blot Source Attribution | João Phillipe Cardenuto et.al. | 2409.18881 | link |
2024-09-27 | Emu3: Next-Token Prediction is All You Need | Xinlong Wang et.al. | 2409.18869 | null |
2024-09-27 | Challenges of Generating Structurally Diverse Graphs | Fedor Velikonivtsev et.al. | 2409.18859 | link |
2024-09-27 | Moldable Development Patterns | Oscar Nierstrasz et.al. | 2409.18811 | null |
2024-09-27 | Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions | Iskander Azangulov et.al. | 2409.18804 | null |
2024-09-27 | Student-Oriented Teacher Knowledge Refinement for Knowledge Distillation | Chaomin Shen et.al. | 2409.18785 | null |
2024-09-27 | Geometric deep learning for galaxy-halo connection: a case study for galaxy intrinsic alignments | Yesukhei Jagvaral et.al. | 2409.18761 | null |
2024-09-27 | Cottention: Linear Transformers With Cosine Attention | Gabriel Mongaras et.al. | 2409.18747 | link |
2024-09-27 | Read Over the Lines: Attacking LLMs and Toxicity Detection Systems with ASCII Art to Mask Profanity | Sergey Berezin et.al. | 2409.18708 | link |
2024-09-27 | MG-Net: Learn to Customize QAOA with Circuit Depth Awareness | Yang Qian et.al. | 2409.18692 | link |
2024-09-26 | FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner | Wenliang Zhao et.al. | 2409.18128 | link |
2024-09-26 | Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction | Jing He et.al. | 2409.18124 | null |
2024-09-26 | EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation | Jiaxiang Tang et.al. | 2409.18114 | null |
2024-09-26 | MALPOLON: A Framework for Deep Species Distribution Modeling | Theo Larcher et.al. | 2409.18102 | link |
2024-09-26 | StackGen: Generating Stable Structures from Silhouettes via Diffusion | Luzhe Sun et.al. | 2409.18098 | null |
2024-09-26 | DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models | Helin Cao et.al. | 2409.18092 | null |
2024-09-26 | Stable Video Portraits | Mirela Ostrek et.al. | 2409.18083 | null |
2024-09-26 | LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field | Huan Wang et.al. | 2409.18057 | link |
2024-09-26 | Automated Detection and Analysis of Power Words in Persuasive Text Using Natural Language Processing | Sahil Garje et.al. | 2409.18033 | null |
2024-09-26 | PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging | Xin Cai et.al. | 2409.17996 | null |
2024-09-26 | Joint Localization and Planning using Diffusion | L. Lao Beyer et.al. | 2409.17995 | null |
2024-09-26 | Manufacturing, processing, applications, and advancements of Fe-based shape memory alloys | Anwar Algamal et.al. | 2409.17973 | null |
2024-09-26 | CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors | Linye Lyu et.al. | 2409.17963 | null |
2024-09-26 | Relativistic diffusion model for hadron production in p-Pb collisions at the LHC | Philipp Schulz et.al. | 2409.17960 | null |
2024-09-26 | Perturb, Attend, Detect and Localize (PADL): Robust Proactive Image Defense | Filippo Bartolucci et.al. | 2409.17941 | null |
2024-09-25 | DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion | Yukun Huang et.al. | 2409.17145 | link |
2024-09-25 | Language-oriented Semantic Communication for Image Transmission with Fine-Tuned Diffusion Model | Xinfeng Wei et.al. | 2409.17104 | null |
2024-09-25 | Accumulator-Aware Post-Training Quantization | Ian Colbert et.al. | 2409.17092 | null |
2024-09-25 | Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification | Xinrui Zhou et.al. | 2409.17091 | null |
2024-09-25 | Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors | Aiping Zhang et.al. | 2409.17058 | link |
2024-09-25 | ControlCity: A Multimodal Diffusion Model Based Approach for Accurate Geospatial Data Generation and Urban Morphology Analysis | Fangshuo Zhou et.al. | 2409.17049 | link |
2024-09-25 | GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design | Phillip Mueller et.al. | 2409.17045 | null |
2024-09-25 | CNN Mixture-of-Depths | Rinor Cakaj et.al. | 2409.17016 | null |
2024-09-25 | Single Image, Any Face: Generalisable 3D Face Generation | Wenqing Wang et.al. | 2409.16990 | null |
2024-09-25 | Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion | Vineet Punyamoorty et.al. | 2409.16950 | null |
2024-09-25 | DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling | Kyuheon Jung et.al. | 2409.16949 | link |
2024-09-25 | Divergence asymmetry and connected components in a general duplication-divergence graph model | Dario Borrelli et.al. | 2409.16943 | null |
2024-09-25 | Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model | Hongliang Zhong et.al. | 2409.16938 | link |
2024-09-25 | Linking in Style: Understanding learned features in deep learning models | Maren H. Wehrheim et.al. | 2409.16865 | link |
2024-09-25 | A Versatile and Differentiable Hand-Object Interaction Representation | Théo Morales et.al. | 2409.16855 | null |
2024-09-18 | Massively Multi-Person 3D Human Motion Forecasting with Scene Context | Felix B Mueller et.al. | 2409.12189 | link |
2024-09-18 | MoRAG – Multi-Fusion Retrieval Augmented Generation for Human Motion | Kalakonda Sai Shashank et.al. | 2409.12140 | null |
2024-09-24 | Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models | Sijing Chen et.al. | 2409.12139 | null |
2024-09-18 | Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance | Jaehoon Joo et.al. | 2409.12099 | null |
2024-09-19 | Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval | Warren Jouanneau et.al. | 2409.12097 | null |
2024-09-18 | Design of Ligand-Binding Proteins with Atomic Flow Matching | Junqi Liu et.al. | 2409.12080 | null |
2024-09-18 | Denoising diffusion models for high-resolution microscopy image restoration | Pamela Osuna-Vargas et.al. | 2409.12078 | null |
2024-09-19 | Using Large Language Models to Generate Clinical Trial Tables and Figures | Yumeng Yang et.al. | 2409.12046 | null |
2024-09-18 | LEMON: Localized Editing with Mesh Optimization and Neural Shaders | Furkan Mert Algan et.al. | 2409.12024 | null |
2024-09-18 | Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization | Zhi Chen et.al. | 2409.12020 | null |
2024-09-18 | Towards Global Localization using Multi-Modal Object-Instance Re-Identification | Aneesh Chavan et.al. | 2409.12002 | link |
2024-09-18 | Tracking Any Point with Frame-Event Fusion Network at High Frame Rate | Jiaxiong Liu et.al. | 2409.11953 | null |
2024-09-18 | Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models | Lorenzo Mandelli et.al. | 2409.11920 | null |
2024-09-18 | AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots | Zhaxizhuoma et.al. | 2409.11905 | null |
2024-09-18 | Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation | Dimitrios Christodoulou et.al. | 2409.11904 | null |
2024-09-17 | Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion | Zhenwei Wang et.al. | 2409.11406 | null |
2024-09-17 | Teaching dark matter simulations to speak the halo language | Shivam Pandey et.al. | 2409.11401 | link |
2024-09-17 | Ultrasound Image Enhancement with the Variance of Diffusion Models | Yuxin Zhang et.al. | 2409.11380 | link |
2024-09-17 | OSV: One Step is Enough for High-Quality Image to Video Generation | Xiaofeng Mao et.al. | 2409.11367 | null |
2024-09-17 | Ping! Your Food is Ready: Comparing Different Notification Techniques in 3D AR Cooking Environment | Aditya Raikwar et.al. | 2409.11357 | null |
2024-09-17 | Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think | Gonzalo Martin Garcia et.al. | 2409.11355 | link |
2024-09-17 | OmniGen: Unified Image Generation | Shitao Xiao et.al. | 2409.11340 | link |
2024-09-17 | fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction | Jianxiong Gao et.al. | 2409.11315 | null |
2024-09-17 | SpMis: An Investigation of Synthetic Spoken Misinformation Detection | Peizhuo Liu et.al. | 2409.11308 | null |
2024-09-17 | Measurement of top-quark pair production in association with charm quarks in proton-proton collisions at $\sqrt{s}=13$ TeV with the ATLAS detector | ATLAS Collaboration et.al. | 2409.11305 | null |
2024-09-17 | NirvaWave: An Accurate and Efficient Near Field Wave Propagation Simulator for 6G and Beyond | Vahid Yazdnian et.al. | 2409.11293 | link |
2024-09-17 | DroneDiffusion: Robust Quadrotor Dynamics Learning with Diffusion Models | Avirup Das et.al. | 2409.11292 | null |
2024-09-17 | Neural Networks for Vehicle Routing Problem | László Kovács et.al. | 2409.11290 | null |
2024-09-17 | Attacking Slicing Network via Side-channel Reinforcement Learning Attack | Wei Shao et.al. | 2409.11258 | null |
2024-09-17 | Learning Source Disentanglement in Neural Audio Codec | Xiaoyu Bie et.al. | 2409.11228 | null |
2024-09-16 | Pennsieve - A Collaborative Platform for Translational Neuroscience and Beyond | Zack Goldblum et.al. | 2409.10509 | null |
2024-09-16 | Torres funerarias chullpa en el valle del río Lauca: un primer análisis arqueoastronómico | Alejandro Gangui et.al. | 2409.10497 | null |
2024-09-16 | Incorporating Classifier-Free Guidance in Diffusion Model-Based Recommendation | Noah Buchanan et.al. | 2409.10494 | null |
2024-09-16 | SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing | Qi Qian et.al. | 2409.10476 | null |
2024-09-16 | MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion | Lehong Wu et.al. | 2409.10473 | null |
2024-09-16 | Signed Graph Autoencoder for Explainable and Polarization-Aware Network Embeddings | Nikolaos Nakis et.al. | 2409.10452 | null |
2024-09-16 | Mamba-ST: State Space Model for Efficient Style Transfer | Filippo Botti et.al. | 2409.10385 | link |
2024-09-16 | 2D or not 2D: How Does the Dimensionality of Gesture Representation Affect 3D Co-Speech Gesture Generation? | Téo Guichoux et.al. | 2409.10357 | null |
2024-09-16 | Taming Diffusion Models for Image Restoration: A Review | Ziwei Luo et.al. | 2409.10353 | null |
2024-09-16 | MEGS: Morphological Evaluation of Galactic Structure | Ufuk Çakır et.al. | 2409.10346 | link |
2024-09-16 | VAE-QWGAN: Improving Quantum GANs for High Resolution Image Generation | Aaron Mark Thomas et.al. | 2409.10339 | null |
2024-09-16 | Research and Design of a Financial Intelligent Risk Control Platform Based on Big Data Analysis and Deep Machine Learning | Shuochen Bi et.al. | 2409.10331 | null |
2024-09-16 | Fairness, not Emotion, Drives Socioeconomic Decision Making | Rudra Mukhopadhyay et.al. | 2409.10322 | null |
2024-09-16 | On Synthetic Texture Datasets: Challenges, Creation, and Curation | Blaine Hoak et.al. | 2409.10297 | null |
2024-09-16 | DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis | Fa-Ting Hong et.al. | 2409.10281 | null |
2024-09-13 | Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation | Qingwen Bu et.al. | 2409.09016 | link |
2024-09-13 | A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis | Yohan Poirier-Ginter et.al. | 2409.08947 | null |
2024-09-13 | Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions | Zahra Ashktorab et.al. | 2409.08937 | null |
2024-09-13 | Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation | Guojun Liang et.al. | 2409.08917 | link |
2024-09-13 | Gaussian is All You Need: A Unified Framework for Solving Inverse Problems via Diffusion Posterior Sampling | Nebiyou Yismaw et.al. | 2409.08906 | null |
2024-09-13 | Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control | Carles Domingo-Enrich et.al. | 2409.08861 | null |
2024-09-13 | The Line-Based Dial-a-Ride Problem | Kendra Reiter et.al. | 2409.08860 | link |
2024-09-13 | InstantDrag: Improving Interactivity in Drag-based Image Editing | Joonghyuk Shin et.al. | 2409.08857 | null |
2024-09-13 | DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s) | Yun Su Jeong et.al. | 2409.08850 | null |
2024-09-13 | Development of a Compton Imager Setup | Anuraag Arya et.al. | 2409.08822 | null |
2024-09-13 | LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment | Huan Zhang et.al. | 2409.08795 | link |
2024-09-13 | What You Say = What You Want? Teaching Humans to Articulate Requirements for LLMs | Qianou Ma et.al. | 2409.08775 | null |
2024-09-13 | A Hybrid Meta-Learning and Multi-Armed Bandit Approach for Context-Specific Multi-Objective Recommendation Optimization | Tiago Cunha et.al. | 2409.08752 | null |
2024-09-13 | Adaptive Sampling for Continuous Group Equivariant Neural Networks | Berfin Inal et.al. | 2409.08741 | null |
2024-09-13 | DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset | Jiawei Du et.al. | 2409.08731 | link |
2024-09-12 | DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors | Thomas Hanwen Zhu et.al. | 2409.08278 | null |
2024-09-12 | Hand-Object Interaction Pretraining from Videos | Himanshu Gaurav Singh et.al. | 2409.08273 | null |
2024-09-12 | Click2Mask: Local Editing with Dynamic Mask Generation | Omer Regev et.al. | 2409.08272 | null |
2024-09-12 | DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer | Runjia Li et.al. | 2409.08271 | null |
2024-09-12 | Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation | Samanta Rodriguez et.al. | 2409.08269 | null |
2024-09-12 | Improving Text-guided Object Inpainting with Semantic Pre-inpainting | Yifu Chen et.al. | 2409.08260 | link |
2024-09-12 | Improving Virtual Try-On with Garment-focused Diffusion Models | Siqi Wan et.al. | 2409.08258 | null |
2024-09-12 | LoRID: Low-Rank Iterative Diffusion for Adversarial Purification | Geigh Zollicoffer et.al. | 2409.08255 | null |
2024-09-12 | Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding | Hongyu Li et.al. | 2409.08251 | null |
2024-09-12 | IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation | Yinwei Wu et.al. | 2409.08240 | null |
2024-09-12 | Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources | Alisia Lupidi et.al. | 2409.08239 | null |
2024-09-12 | LT3SD: Latent Trees for 3D Scene Diffusion | Quan Meng et.al. | 2409.08215 | null |
2024-09-12 | VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis | Hao Chen et.al. | 2409.08207 | null |
2024-09-12 | High-Frequency Anti-DreamBooth: Robust Defense Against Image Synthesis | Takuto Onikubo et.al. | 2409.08167 | link |
2024-09-12 | MagicStyle: Portrait Stylization Based on Reference Image | Zhaoli Deng et.al. | 2409.08156 | null |
2024-09-11 | DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation | Haibo Yang et.al. | 2409.07454 | null |
2024-09-11 | Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models | Haibo Yang et.al. | 2409.07452 | link |
2024-09-11 | FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process | Yang Luo et.al. | 2409.07451 | null |
2024-09-11 | Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging | Yunzhen Wang et.al. | 2409.07417 | null |
2024-09-11 | Extracting TCPIP Headers at High Speed for the Anonymized Network Traffic Graph Challenge | Zhaoyang Han et.al. | 2409.07374 | null |
2024-09-11 | Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination | Daniel Zhang-Li et.al. | 2409.07372 | null |
2024-09-11 | Event-based Mosaicing Bundle Adjustment | Shuang Guo et.al. | 2409.07365 | link |
2024-09-11 | Training-Free Guidance for Discrete Diffusion Models for Molecular Generation | Thomas J. Kerby et.al. | 2409.07359 | null |
2024-09-11 | Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching | Eugenio Chisari et.al. | 2409.07343 | null |
2024-09-11 | Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models | Fengzhe Zhang et.al. | 2409.07323 | null |
2024-09-11 | Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding | Ronald Katende et.al. | 2409.07310 | null |
2024-09-11 | Exploring User-level Gradient Inversion with a Diffusion Prior | Zhuohang Li et.al. | 2409.07291 | null |
2024-09-11 | CCFExp: Facial Image Synthesis with Cycle Cross-Fusion Diffusion Model for Facial Paralysis Individuals | Weixiang Gao et.al. | 2409.07271 | link |
2024-09-11 | Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models | Sanoojan Baliah et.al. | 2409.07269 | link |
2024-09-11 | EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion | Jian Zhang et.al. | 2409.07255 | null |
2024-09-10 | Technical Report of Mobile Manipulator Robot for Industrial Environments | Erfan Amoozad Khalili et.al. | 2409.06693 | null |
2024-09-10 | SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation | Teng Hu et.al. | 2409.06633 | null |
2024-09-10 | MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification | Phu Pham et.al. | 2409.06620 | null |
2024-09-10 | A Primer on Variational Inference for Physics-Informed Deep Generative Modelling | Alex Glyn-Davies et.al. | 2409.06560 | null |
2024-09-10 | From LIMA to DeepLIMA: following a new path of interoperability | Victor Bocharov et.al. | 2409.06550 | null |
2024-09-10 | Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models | Xin Jing et.al. | 2409.06451 | null |
2024-09-10 | Prompt2Fashion: An automatically generated fashion dataset | Georgia Argyro et.al. | 2409.06442 | link |
2024-09-10 | Fast nonparametric inference of network backbones for graph sparsification | Alec Kirkley et.al. | 2409.06417 | link |
2024-09-10 | Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition | Junzheng Zhang et.al. | 2409.06371 | null |
2024-09-10 | What happens to diffusion model likelihood when your model is conditional? | Mattias Cross et.al. | 2409.06364 | null |
2024-09-10 | DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement | Jia-Wei Liao et.al. | 2409.06355 | null |
2024-09-10 | Improving Conditional Level Generation using Automated Validation in Match-3 Games | Monica Villanueva Aylagas et.al. | 2409.06349 | null |
2024-09-10 | Foragax: An Agent Based Modelling framework based on JAX | Siddharth Chaturvedi et.al. | 2409.06345 | link |
2024-09-10 | G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer | Jinzhi Zhang et.al. | 2409.06322 | null |
2024-09-10 | Learning Augmentation Policies from A Model Zoo for Time Series Forecasting | Haochen Yuan et.al. | 2409.06282 | null |
2024-09-09 | Fast Generation of Custom Floating-Point Spatial Filters on FPGAs | Nelson Campos et.al. | 2409.05837 | null |
2024-09-09 | Enhancing Preference-based Linear Bandits via Human Response Time | Shen Li et.al. | 2409.05798 | null |
2024-09-09 | Predicting Critical Heat Flux with Uncertainty Quantification and Domain Generalization Using Conditional Variational Autoencoders and Deep Neural Networks | Farah Alsafadi et.al. | 2409.05790 | null |
2024-09-09 | Vector Quantized Diffusion Model Based Speech Bandwidth Extension | Yuan Fang et.al. | 2409.05784 | null |
2024-09-09 | AS-Speech: Adaptive Style For Speech Synthesis | Zhipeng Li et.al. | 2409.05730 | null |
2024-09-09 | pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning | Jiahao Lai et.al. | 2409.05701 | null |
2024-09-09 | Citizen-Led Personalization of User Interfaces: Investigating How People Customize Interfaces for Themselves and Others | Sérgio Alves et.al. | 2409.05696 | null |
2024-09-09 | Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models | Aakash Sen Sharma et.al. | 2409.05668 | null |
2024-09-09 | Forward KL Regularized Preference Optimization for Aligning Diffusion Policies | Zhao Shan et.al. | 2409.05622 | null |
2024-09-09 | CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization | Nan Chen et.al. | 2409.05606 | null |
2024-09-09 | Latent 3D Brain MRI Counterfactual | Wei Peng et.al. | 2409.05585 | null |
2024-09-09 | Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation | Muraleekrishna Gopinathan et.al. | 2409.05583 | link |
2024-09-09 | Design and Implementation of TAO DAQ System | Shuihan Zhang et.al. | 2409.05522 | null |
2024-09-09 | A Taxonomy of Miscompressions: Preparing Image Forensics for Neural Compression | Nora Hofer et.al. | 2409.05490 | null |
2024-09-09 | DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation | Wei Wu et.al. | 2409.05463 | null |
2024-09-06 | VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation | Yecheng Wu et.al. | 2409.04429 | link |
2024-09-06 | Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques | Davide Clode da Silva et.al. | 2409.04424 | null |
2024-09-06 | Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation | Zhuoyan Luo et.al. | 2409.04410 | null |
2024-09-06 | Enhancing Skin Lesion Diagnosis with Ensemble Learning | Xiaoyi Liu et.al. | 2409.04381 | null |
2024-09-06 | How Fair is Your Diffusion Recommender Model? | Daniele Malitesta et.al. | 2409.04339 | null |
2024-09-06 | Random effects estimation in a fractional diffusion model based on continuous observations | Nesrine Chebli et.al. | 2409.04331 | null |
2024-09-06 | Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models | Yuxiao Huang et.al. | 2409.04270 | null |
2024-09-06 | An overview of domain-specific foundation model: key technologies, applications and challenges | Haolong Chen et.al. | 2409.04267 | null |
2024-09-06 | UniDet3D: Multi-dataset Indoor 3D Object Detection | Maksim Kolodiazhnyi et.al. | 2409.04234 | link |
2024-09-06 | Generative Modelling via Quantile Regression | Johannes Schmidt-Hieber et.al. | 2409.04231 | null |
2024-09-06 | Breaking the Brownian Barrier: Models and Manifestations of Molecular Diffusion in Complex Fluids | Harish Srinivasan et.al. | 2409.04199 | null |
2024-09-06 | GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers | Lorenza Prospero et.al. | 2409.04196 | null |
2024-09-06 | Subsampling of Correlated Graph Signals | Rishabh Ravi et.al. | 2409.04107 | null |
2024-09-06 | Estimation of service value parameters for a queue with unobserved balking | Daniel Podorojnyi et.al. | 2409.04090 | null |
2024-09-06 | D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection | Kentaro Hirahara et.al. | 2409.04060 | null |
2024-09-05 | Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding | Yunze Man et.al. | 2409.03757 | link |
2024-09-05 | WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild | Yuntian Deng et.al. | 2409.03753 | null |
2024-09-05 | ArtiFade: Learning to Generate High-quality Subject from Blemished Images | Shuya Yang et.al. | 2409.03745 | null |
2024-09-06 | RAG based Question-Answering for Contextual Response Prediction System | Sriram Veturi et.al. | 2409.03708 | null |
2024-09-05 | RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images | Benzhi Wang et.al. | 2409.03644 | link |
2024-09-05 | DiffEVC: Any-to-Any Emotion Voice Conversion with Expressive Guidance | Hsing-Hang Chou et.al. | 2409.03636 | null |
2024-09-05 | Generalizing Linear Graphs and Bond Graph Models with Hetero-functional Graphs for System-of-Systems Engineering Applications | Ehsanoddin Ghorbanichemazkati et.al. | 2409.03630 | null |
2024-09-05 | TCDiff: Triple Condition Diffusion Model with 3D Constraints for Stylizing Synthetic Faces | Bernardo Biesseck et.al. | 2409.03600 | link |
2024-09-05 | DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture | Qianlong Xiang et.al. | 2409.03550 | null |
2024-09-05 | Euclid preparation. Simulations and nonlinearities beyond $Λ$CDM. 2. Results from non-standard simulations | Euclid Collaboration et.al. | 2409.03523 | null |
2024-09-05 | Blended Latent Diffusion under Attention Control for Real-World Video Editing | Deyin Liu et.al. | 2409.03514 | null |
2024-09-05 | Physical Modelling of Piano Sound | Haifan Xie et.al. | 2409.03481 | null |
2024-09-05 | Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration | Pei Wang et.al. | 2409.03455 | null |
2024-09-05 | Rx Strategist: Prescription Verification using LLM Agents System | Phuc Phan Van et.al. | 2409.03440 | null |
2024-09-05 | KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale | Wei Gao et.al. | 2409.03439 | null |
2024-09-04 | HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts | Xinyu Liu et.al. | 2409.02919 | link |
2024-09-04 | Latent Watermarking of Audio Generative Models | Robin San Roman et.al. | 2409.02915 | null |
2024-09-04 | Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling | Kaiwen Zheng et.al. | 2409.02908 | null |
2024-09-04 | Configurable Foundation Models: Building LLMs from a Modular Perspective | Chaojun Xiao et.al. | 2409.02877 | null |
2024-09-04 | Look Into the LITE in Deep Learning for Time Series Classification | Ali Ismail-Fawaz et.al. | 2409.02869 | link |
2024-09-04 | Building a Scalable, Effective, and Steerable Search and Ranking Platform | Marjan Celikik et.al. | 2409.02856 | null |
2024-09-04 | Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models | Zhibin Liu et.al. | 2409.02851 | link |
2024-09-04 | Anomaly Detection in Offshore Open Radio Access Network Using Long Short-Term Memory Models on a Novel Artificial Intelligence-Driven Cloud-Native Data Platform | Abdelrahim Ahmad et.al. | 2409.02849 | null |
2024-09-04 | Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model | Tornike Karchkhadze et.al. | 2409.02845 | null |
2024-09-04 | SNNAX – Spiking Neural Networks in JAX | Jamie Lohoff et.al. | 2409.02842 | null |
2024-09-04 | Experimental Framework for Generating Reliable Ground Truth for Laryngeal Spatial Segmentation Tasks | Hamzeh Ghasemzadeh et.al. | 2409.02809 | null |
2024-09-04 | Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL | Mohammad Reshadati et.al. | 2409.02711 | null |
2024-09-04 | Rethinking HTG Evaluation: Bridging Generation and Recognition | Konstantina Nikolaidou et.al. | 2409.02683 | link |
2024-09-04 | Introduction to Machine Learning | Laurent Younes et.al. | 2409.02668 | null |
2024-09-04 | Creating Domain-Specific Translation Memories for Machine Translation Fine-tuning: The TRENCARD Bilingual Cardiology Corpus | Gokhan Dogru et.al. | 2409.02667 | null |
2024-08-30 | Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes | Li Zhang et.al. | 2408.17421 | link |
2024-08-30 | Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain | Francesca Grasso et.al. | 2408.17362 | link |
2024-08-30 | Subspace Diffusion Posterior Sampling for Travel-Time Tomography | Xiang Cao et.al. | 2408.17333 | null |
2024-08-30 | Structuring a Training Strategy to Robustify Perception Models with Realistic Image Augmentations | Ahmed Hammam et.al. | 2408.17311 | null |
2024-08-30 | Leveraging Deep Generative Model For Computational Protein Design And Optimization | Boqiao Lai et.al. | 2408.17241 | null |
2024-08-30 | Towards Symbolic XAI – Explanation Through Human Understandable Logical Relationships Between Features | Thomas Schnake et.al. | 2408.17198 | null |
2024-09-02 | Leveraging Blockchain and ANFIS for Optimal Supply Chain Management | Amirfarhad Farhadi et.al. | 2408.17161 | null |
2024-08-30 | Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning | Xiaoye Qu et.al. | 2408.17150 | link |
2024-08-30 | Flow Matching for Optimal Reaction Coordinates of Biomolecular System | Mingyuan Zhang et.al. | 2408.17139 | link |
2024-08-30 | Temporal and Interactive Modeling for Efficient Human-Human Motion Generation | Yabiao Wang et.al. | 2408.17135 | null |
2024-09-02 | RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance | Avideep Mukherjee et.al. | 2408.17095 | null |
2024-08-30 | FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition | Chen Hu et.al. | 2408.17090 | link |
2024-08-30 | Approximately Invertible Neural Network for Learned Image Compression | Yanbo Gao et.al. | 2408.17073 | null |
2024-09-02 | Instant Adversarial Purification with Adversarial Consistency Distillation | Chun Tong Lei et.al. | 2408.17064 | null |
2024-08-30 | Text-to-Image Generation Via Energy-Based CLIP | Roy Ganz et.al. | 2408.17046 | null |
2024-08-29 | ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model | Fangfu Liu et.al. | 2408.16767 | null |
2024-08-29 | CSGO: Content-Style Composition in Text-to-Image Generation | Peng Xing et.al. | 2408.16766 | null |
2024-08-29 | A Score-Based Density Formula, with Applications in Diffusion Generative Models | Gen Li et.al. | 2408.16765 | null |
2024-08-29 | UV-free Texture Generation with Denoising and Geodesic Heat Diffusions | Simone Foti et.al. | 2408.16762 | link |
2024-08-29 | One-Shot Learning Meets Depth Diffusion in Multi-Object Videos | Anisha Jain et.al. | 2408.16704 | null |
2024-08-29 | VMC: A Grammar for Visualizing Statistical Model Checks | Ziyang Guo et.al. | 2408.16702 | null |
2024-08-29 | GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models | Moreno D’Incà et.al. | 2408.16700 | link |
2024-08-29 | Optimization Models for the Quadratic Traveling Salesperson Problem | Yuxiao Chen et.al. | 2408.16680 | null |
2024-08-29 | DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving | Yongjie Fu et.al. | 2408.16647 | null |
2024-08-29 | RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model | Zhuan Shi et.al. | 2408.16634 | null |
2024-08-28 | TEDRA: Text-based Editing of Dynamic and Photoreal Actors | Basavaraj Sunagad et.al. | 2408.15995 | null |
2024-08-28 | Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation | Shengyuan Zhang et.al. | 2408.15991 | link |
2024-08-28 | Thoughtseeds: Evolutionary Priors, Nested Markov Blankets, and the Emergence of Embodied Cognition | Prakash Chandra Kavi et.al. | 2408.15982 | null |
2024-08-28 | Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems | Ibrahim K. Ozaslan et.al. | 2408.15969 | null |
2024-08-28 | MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets | Dominic Phillips et.al. | 2408.15905 | null |
2024-08-28 | Gen-Swarms: Adapting Deep Generative Models to Swarms of Drones | Carlos Plou et.al. | 2408.15899 | null |
2024-08-28 | Airfoil Diffusion: Denoising Diffusion Model For Conditional Airfoil Generation | Reid Graves et.al. | 2408.15898 | link |
2024-08-28 | Disentangled Diffusion Autoencoder for Harmonization of Multi-site Neuroimaging Data | Ayodeji Ijishakin et.al. | 2408.15890 | null |
2024-08-29 | Recent Decade’s Power Outage Data Reveals the Increasing Vulnerability of U.S. Power Infrastructure | Bo Li et.al. | 2408.15882 | null |
2024-08-28 | GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model | Yongjie Fu et.al. | 2408.15868 | null |
2024-08-27 | GenRec: Unifying Video Generation and Recognition with Diffusion Models | Zejia Weng et.al. | 2408.15241 | link |
2024-08-27 | Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation | Xiaojuan Wang et.al. | 2408.15239 | null |
2024-08-27 | Simulation of Stochastic Discrete Dislocation Dynamics in Ductile Vs Brittle Materials | Santosh Chhetri et.al. | 2408.15157 | null |
2024-08-27 | How transformers learn structured data: insights from hierarchical filtering | Jerome Garnier-Brun et.al. | 2408.15138 | null |
2024-08-27 | DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays | Yiran Sun et.al. | 2408.15118 | link |
2024-08-27 | Data-Driven Nonlinear Deformation Design of 3D-Printable Shells | Samuel Silverman et.al. | 2408.15097 | link |
2024-08-27 | Constrained Diffusion Models via Dual Training | Shervin Khalafi et.al. | 2408.15094 | null |
2024-08-27 | LN-Gen: Rectal Lymph Nodes Generation via Anatomical Features | Weidong Guo et.al. | 2408.14977 | null |
2024-08-27 | MegActor-$Σ$: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer | Shurong Yang et.al. | 2408.14975 | null |
2024-08-27 | Integrated Bundling and Pricing of Unique Items | Maxime Bouscary et.al. | 2408.14913 | null |
2024-08-26 | K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences | Zhikai Li et.al. | 2408.14468 | null |
2024-08-26 | Uncovering Knowledge Gaps in Radiology Report Generation Models through Knowledge Graphs | Xiaoman Zhang et.al. | 2408.14397 | link |
2024-08-26 | Reprogramming Foundational Large Language Models (LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning | Sakhinana Sagar Srinivas et.al. | 2408.14387 | null |
2024-08-26 | GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy | Peiyan Li et.al. | 2408.14368 | link |
2024-08-27 | Foundation Models for Music: A Survey | Yinghao Ma et.al. | 2408.14340 | link |
2024-08-26 | Automated Machine Learning in Insurance | Panyi Dong et.al. | 2408.14331 | link |
2024-08-26 | LLM-3D Print: Large Language Models To Monitor and Control 3D Printing | Yayati Jadhav et.al. | 2408.14307 | null |
2024-08-26 | Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes | Chao Chen et.al. | 2408.14279 | null |
2024-08-26 | Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach | Vittoriano Muttillo et.al. | 2408.14259 | null |
2024-08-27 | Text3DAug – Prompted Instance Augmentation for LiDAR Perception | Laurenz Reichardt et.al. | 2408.14253 | link |
2024-08-23 | How Diffusion Models Learn to Factorize and Compose | Qiyao Liang et.al. | 2408.13256 | null |
2024-08-23 | Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption | Sakhinana Sagar Srinivas et.al. | 2408.13248 | null |
2024-08-23 | CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities | Tao Wu et.al. | 2408.13239 | null |
2024-08-23 | Social Welfare Maximization for Federated Learning with Network Effects | Xiang Li et.al. | 2408.13223 | null |
2024-08-23 | Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews | Dineth Jayakody et.al. | 2408.13202 | null |
2024-08-23 | IFH: a Diffusion Framework for Flexible Design of Graph Generative Models | Samuel Cognolato et.al. | 2408.13194 | link |
2024-08-23 | Deep Learning for Lung Disease Classification Using Transfer Learning and a Customized CNN Architecture with Attention | Xiaoyi Liu et.al. | 2408.13180 | null |
2024-08-26 | Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation | Bonan Li et.al. | 2408.13149 | null |
2024-08-23 | Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning | Jihwan Oh et.al. | 2408.13092 | null |
2024-08-23 | General Intelligent Imaging and Uncertainty Quantification by Deterministic Diffusion Model | Weiru Fan et.al. | 2408.13061 | null |
2024-08-22 | xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations | Can Qin et.al. | 2408.12590 | null |
2024-08-22 | ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation | Lujia Zhong et.al. | 2408.12561 | link |
2024-08-22 | Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | Jinheng Xie et.al. | 2408.12528 | null |
2024-08-22 | FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing | Jue Wang et.al. | 2408.12429 | link |
2024-08-22 | Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification | Sudi Murindanyi et.al. | 2408.12426 | null |
2024-08-22 | 4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Motion Alignment | Kaihui Cheng et.al. | 2408.12419 | null |
2024-08-22 | CODE: Confident Ordinary Differential Editing | Bastien van Delft et.al. | 2408.12418 | link |
2024-08-22 | Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures | Ce Liu et.al. | 2408.12413 | null |
2024-08-22 | A Stable Polygamy Approach to Spectrum Access with Channel Reuse | Dan Ben Ami et.al. | 2408.12402 | null |
2024-08-22 | Multi-Style Facial Sketch Synthesis through Masked Generative Modeling | Bowen Sun et.al. | 2408.12400 | null |
2024-08-21 | Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models | Chun-Yen Shih et.al. | 2408.11810 | null |
2024-08-21 | ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation | Shiqi Yang et.al. | 2408.11805 | null |
2024-08-21 | DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework | Zhifei Xie et.al. | 2408.11788 | null |
2024-08-21 | Timeline and Boundary Guided Diffusion Network for Video Shadow Detection | Haipeng Zhou et.al. | 2408.11785 | link |
2024-08-21 | Sum of Squares Circuits | Lorenzo Loconte et.al. | 2408.11778 | null |
2024-08-21 | Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards | Omar Erak et.al. | 2408.11775 | link |
2024-08-21 | D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models | M. Forlini et.al. | 2408.11761 | null |
2024-08-21 | JieHua Paintings Style Feature Extracting Model using Stable Diffusion with ControlNet | Yujia Gu et.al. | 2408.11744 | null |
2024-08-21 | Enhancing Cross-Modal Medical Image Segmentation through Compositionality | Aniek Eijpe et.al. | 2408.11733 | link |
2024-08-21 | AI-assisted Automated Short Answer Grading of Handwritten University Level Mathematics Exams | Tianyi Liu et.al. | 2408.11728 | null |
2024-08-20 | Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research | Sreyoshi Bhaduri et.al. | 2408.11043 | null |
2024-08-20 | Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model | Chunting Zhou et.al. | 2408.11039 | null |
2024-08-20 | Full Detector Simulation of a Projective Dual-Readout Segmented Crystal Electromagnetic Calorimeter with Precision Timing | Wonyong Chung et.al. | 2408.11027 | null |
2024-08-20 | MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning | Haoning Wu et.al. | 2408.11001 | link |
2024-08-20 | GreediRIS: Scalable Influence Maximization using Distributed Streaming Maximum Cover | Reet Barik et.al. | 2408.10982 | null |
2024-08-21 | Assortment Optimization Under History-Dependent Effects | Taotao He et.al. | 2408.10967 | null |
2024-08-20 | Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling | Jaideep Pathak et.al. | 2408.10958 | null |
2024-08-20 | SysBench: Can Large Language Models Follow System Messages? | Yanzhao Qin et.al. | 2408.10943 | link |
2024-08-20 | A Closer Look at Data Augmentation Strategies for Finetuning-Based Low/Few-Shot Object Detection | Vladislav Li et.al. | 2408.10940 | null |
2024-08-20 | Large Point-to-Gaussian Model for Image-to-3D Generation | Longfei Lu et.al. | 2408.10935 | null |
2024-08-19 | MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model | Minghua Liu et.al. | 2408.10198 | null |
2024-08-19 | SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views | Chao Xu et.al. | 2408.10195 | null |
2024-08-19 | Customizing Language Models with Instance-wise LoRA for Sequential Recommendation | Xiaoyu Kong et.al. | 2408.10159 | link |
2024-08-19 | Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language | Manjil Karki et.al. | 2408.10128 | null |
2024-08-19 | Learning Precise Affordances from Egocentric Videos for Robotic Manipulation | Gen Li et.al. | 2408.10123 | null |
2024-08-19 | Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision | Zhijun Jia et.al. | 2408.10096 | null |
2024-08-19 | Stacked Intelligent Metasurfaces for Integrated Sensing and Communications | Haoxian Niu et.al. | 2408.10043 | null |
2024-08-19 | General Impedance Modeling for Modular Multilevel Converter with Grid-forming and Grid-following Control | Chu Sun et.al. | 2408.10017 | null |
2024-08-19 | Uniting contrastive and generative learning for event sequences models | Aleksandr Yugay et.al. | 2408.09995 | null |
2024-08-19 | Multi-layer diffusion model of photovoltaic installations | Tomasz Weron et.al. | 2408.09904 | null |
2024-08-16 | Automated High-throughput Organic Crystal Structure Prediction via Population-based Sampling | Qiang Zhu et.al. | 2408.08843 | link |
2024-08-16 | PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future | Guangyi Wang et.al. | 2408.08822 | null |
2024-08-16 | A Unified Automata-Theoretic Approach to LTLf Modulo Theories (Extended Version) | Marco Faella et.al. | 2408.08817 | null |
2024-08-16 | EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics | Chenwei Wan et.al. | 2408.08782 | link |
2024-08-16 | Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion | Sanchayan Vivekananthan et.al. | 2408.08751 | null |
2024-08-16 | The Blessing of Strategic Customers in Personalized Pricing | Zhi Chen et.al. | 2408.08738 | null |
2024-08-16 | ChatZero: Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language | Yongkang Liu et.al. | 2408.08724 | null |
2024-08-16 | An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation | Peiming Guo et.al. | 2408.08650 | null |
2024-08-16 | Modeling the Neonatal Brain Development Using Implicit Neural Representations | Florentin Bieder et.al. | 2408.08647 | link |
2024-08-16 | Sampling effects on Lasso estimation of drift functions in high-dimensional diffusion processes | Chiara Amorino et.al. | 2408.08638 | null |
2024-08-15 | Understanding the Local Geometry of Generative Model Manifolds | Ahmed Imtiaz Humayun et.al. | 2408.08307 | null |
2024-08-15 | Accelerated Image-Aware Generative Diffusion Modeling | Tanmay Asthana et.al. | 2408.08306 | null |
2024-08-15 | Marker or Markerless? Mode-Switchable Optical Tactile Sensing for Diverse Robot Tasks | Ni Ou et.al. | 2408.08276 | null |
2024-08-15 | mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis | Dae-young Kim et.al. | 2408.08261 | null |
2024-08-15 | Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding | Xiner Li et.al. | 2408.08252 | link |
2024-08-15 | Picosecond laser pulses for quantum dot-microcavity based single photon generation by cascaded electro-optic modulation of a narrow-linewidth laser | Mio Poortvliet et.al. | 2408.08213 | null |
2024-08-15 | Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion | Adi Haviv et.al. | 2408.08184 | null |
2024-08-15 | Impact of Comprehensive Data Preprocessing on Predictive Modelling of COVID-19 Mortality | Sangita Das et.al. | 2408.08142 | link |
2024-08-15 | Decoding Memes: A Comparative Study of Machine Learning Models for Template Identification | Levente Murgás et.al. | 2408.08126 | link |
2024-08-15 | When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding | Pingping Zhang et.al. | 2408.08093 | null |
2024-08-14 | Detecting Near-Duplicate Face Images | Sudipta Banerjee et.al. | 2408.07689 | link |
2024-08-14 | Composing Automatic Differentiation with Custom Derivatives of Higher-Order Functions | Sam Estep et.al. | 2408.07683 | null |
2024-08-14 | Drug Discovery SMILES-to-Pharmacokinetics Diffusion Models with Deep Molecular Understanding | Bing Hu et.al. | 2408.07636 | null |
2024-08-14 | Anisotropic Diffusion Model of Communication in 2D Biofilm | Yanahan Paramalingam et.al. | 2408.07626 | null |
2024-08-14 | Neural Quantum States and Peaked Molecular Wave Functions: Curse or Blessing? | Aleksei Malyshev et.al. | 2408.07625 | null |
2024-08-14 | MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials | Yan Chen et.al. | 2408.07608 | null |
2024-08-14 | PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation | Sang-Hoon Lee et.al. | 2408.07547 | link |
2024-08-14 | New Curriculum, New Chance – Retrieval Augmented Generation for Lesson Planning in Ugandan Secondary Schools. Prototype Quality Evaluation | Simon Kloker et.al. | 2408.07542 | null |
2024-08-14 | DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model | Erez Yosef et.al. | 2408.07541 | null |
2024-08-14 | Towards Real-time Video Compressive Sensing on Mobile Devices | Miao Cao et.al. | 2408.07530 | link |
2024-08-13 | Imagen 3 | Imagen-Team-Google et.al. | 2408.07009 | null |
2024-08-13 | Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models | Cheng Chen et.al. | 2408.06995 | null |
2024-08-13 | DCMSA: Multi-Head Self-Attention Mechanism Based on Deformable Convolution For Seismic Data Denoising | Wang Mingwei et.al. | 2408.06963 | null |
2024-08-13 | Neural Speech and Audio Coding | Minje Kim et.al. | 2408.06954 | null |
2024-08-13 | Diffusion Model for Slate Recommendation | Federico Tomasi et.al. | 2408.06883 | null |
2024-08-13 | Efficient Search for Customized Activation Functions with Gradient Descent | Lukas Strack et.al. | 2408.06820 | link |
2024-08-13 | Enhancing Diabetic Retinopathy Diagnosis: A Lightweight CNN Architecture for Efficient Exudate Detection in Retinal Fundus Images | Mujadded Al Rabbani Alif et.al. | 2408.06784 | null |
2024-08-13 | Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective | Ouxiang Li et.al. | 2408.06741 | link |
2024-08-13 | DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion | Yujia Wu et.al. | 2408.06740 | null |
2024-08-13 | Multimodal Analysis of White Blood Cell Differentiation in Acute Myeloid Leukemia Patients using a β-Variational Autoencoder | Gizem Mert et.al. | 2408.06720 | null |
2024-08-12 | The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Chris Lu et.al. | 2408.06292 | link |
2024-08-12 | Open-Source Molecular Processing Pipeline for Generating Molecules | Shreyas V et.al. | 2408.06261 | null |
2024-08-12 | 3D Reconstruction of Protein Structures from Multi-view AFM Images using Neural Radiance Fields (NeRFs) | Jaydeep Rade et.al. | 2408.06244 | null |
2024-08-12 | Cislunar Constellation Design for Space Situational Awareness with Time-Expanded Facility Location Problem | Yuri Shimane et.al. | 2408.06238 | null |
2024-08-12 | Novel View Synthesis from a Single Image with Pretrained Diffusion Guidance | Taewon Kang et.al. | 2408.06157 | null |
2024-08-12 | LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library | Tianhao Yu et.al. | 2408.06150 | null |
2024-08-12 | Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models | Ioannis Romanelis et.al. | 2408.06145 | link |
2024-08-12 | Med42-v2: A Suite of Clinical LLMs | Clément Christophe et.al. | 2408.06142 | null |
2024-08-12 | Five Pitfalls When Assessing Synthetic Medical Images with Reference Metrics | Melanie Dohmen et.al. | 2408.06075 | null |
2024-08-12 | CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer | Zhuoyi Yang et.al. | 2408.06072 | link |
2024-08-09 | Multi-Garment Customized Model Generation | Yichen Liu et.al. | 2408.05206 | null |
2024-08-09 | TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning | Yujie Feng et.al. | 2408.05200 | link |
2024-08-09 | Cell Morphology-Guided Small Molecule Generation with GFlowNets | Stephen Zhewen Lu et.al. | 2408.05196 | link |
2024-08-09 | Lithography-free patterning of chalcogenide materials for integrated photonic devices | Zhen Hu et.al. | 2408.05099 | null |
2024-08-09 | Social contagion under hybrid interactions | Xincheng Shu et.al. | 2408.05050 | null |
2024-08-09 | Infrared Beam-shaping on Demand via Tailored Geometric Phase Metasurfaces employing the Plasmonic Phase-Change Material In3SbTe2 | Lukas Conrads et.al. | 2408.05044 | null |
2024-08-09 | Collaborative Static-Dynamic Teaching: A Semi-Supervised Framework for Stripe-Like Space Target Detection | Zijian Zhu et.al. | 2408.05029 | null |
2024-08-09 | Retrieval-augmented code completion for local projects using large language models | Marko Hostnik et.al. | 2408.05026 | null |
2024-08-09 | DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow | Hangyu Li et.al. | 2408.05008 | null |
2024-08-09 | Pay Attention To Mean Fields For Point Cloud Generation | Benno Käch et.al. | 2408.04997 | link |
2024-08-08 | Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics | Ruining Li et.al. | 2408.04631 | null |
2024-08-08 | Transformer Explainer: Interactive Learning of Text-Generative Models | Aeree Cho et.al. | 2408.04619 | null |
2024-08-08 | Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User’s Casual Sketches | Yongzhi Xu et.al. | 2408.04567 | null |
2024-08-08 | Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models | Yupeng Chang et.al. | 2408.04556 | link |
2024-08-08 | On the Asymptotic Convergence of Subgraph Generated Models | Xinchen Xu et.al. | 2408.04541 | null |
2024-08-08 | AExGym: Benchmarks and Environments for Adaptive Experimentation | Jimmy Wang et.al. | 2408.04531 | null |
2024-08-08 | NFDI4Health workflow and service for synthetic data generation, assessment and risk management | Sobhan Moazemi et.al. | 2408.04478 | null |
2024-08-08 | Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations | Julen Urain et.al. | 2408.04380 | null |
2024-08-08 | Making sense of AI systems development | Mateusz Dolata et.al. | 2408.04311 | null |
2024-08-08 | AI-Driven Chatbot for Intrusion Detection in Edge Networks: Enhancing Cybersecurity with Ethical User Consent | Mugheez Asif et.al. | 2408.04281 | null |
2024-08-07 | Prospects for using drones to test formation-flying CubeSat concepts, and other astronomical applications | John D. Monnier et.al. | 2408.03911 | null |
2024-08-07 | Hate Speech Detection and Classification in Amharic Text with Deep Learning | Samuel Minale Gashe et.al. | 2408.03849 | null |
2024-08-07 | WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models | Prannaya Gupta et.al. | 2408.03837 | link |
2024-08-07 | A broken duet: multistable dynamics of dyadic interactions | Johan Medrano et.al. | 2408.03809 | link |
2024-08-07 | Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning | Martin Moder et.al. | 2408.03807 | link |
2024-08-07 | Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model | Guoqing Zhu et.al. | 2408.03748 | link |
2024-08-07 | Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction | Benjamin Matthias Ruppik et.al. | 2408.03706 | null |
2024-08-07 | Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling | Zilyu Ye et.al. | 2408.03695 | null |
2024-08-07 | Unsupervised Detection of Fetal Brain Anomalies using Denoising Diffusion Models | Markus Ditlev Sjøgren Olsen et.al. | 2408.03654 | null |
2024-08-07 | Goal-oriented Semantic Communication for the Metaverse Application | Zhe Wang et.al. | 2408.03646 | null |
2024-08-06 | MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation | Xiaofeng Mao et.al. | 2408.03312 | null |
2024-08-06 | IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts | Ciara Rowles et.al. | 2408.03209 | null |
2024-08-06 | Personalizing Federated Instrument Segmentation with Visual Trait Priors in Robotic Surgery | Jialang Xu et.al. | 2408.03208 | null |
2024-08-06 | An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion | Xingguang Yan et.al. | 2408.03178 | null |
2024-08-06 | Iterative CT Reconstruction via Latent Variable Optimization of Shallow Diffusion Models | Sho Ozaki et.al. | 2408.03156 | null |
2024-08-06 | Enhancing Twitter Bot Detection via Multimodal Invariant Representations | Jibing Gong et.al. | 2408.03096 | null |
2024-08-06 | Analysis of Argument Structure Constructions in a Deep Recurrent Language Model | Pegah Ramezani et.al. | 2408.03062 | null |
2024-08-06 | OpenOmni: A Collaborative Open Source Tool for Building Future-Ready Multimodal Conversational Agents | Qiang Sun et.al. | 2408.03047 | link |
2024-08-06 | Targeted Visual Prompting for Medical Visual Question Answering | Sergio Tascon-Morales et.al. | 2408.03043 | link |
2024-08-06 | Training-Free Condition Video Diffusion Models for single frame Spatial-Semantic Echocardiogram Synthesis | Van Phi Nguyen et.al. | 2408.03035 | link |
2024-08-05 | Command-line Obfuscation Detection using Small Language Models | Vojtech Outrata et.al. | 2408.02637 | null |
2024-08-05 | VidGen-1M: A Large-Scale Dataset for Text-to-video Generation | Zhiyu Tan et.al. | 2408.02629 | null |
2024-08-05 | YOWOv3: An Efficient and Generalized Framework for Human Action Detection and Recognition | Duc Manh Nguyen Dang et.al. | 2408.02623 | link |
2024-08-05 | LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba | Yunxiang Fu et.al. | 2408.02615 | link |
2024-08-05 | MetaParticles: Computationally engineered nanomaterials with tunable and responsive properties | Massimiliano Paesani et.al. | 2408.02564 | null |
2024-08-05 | Fairness and Bias Mitigation in Computer Vision: A Survey | Sepehr Dehdashtian et.al. | 2408.02464 | null |
2024-08-05 | TGS: Trajectory Generation and Selection using Vision Language Models in Mapless Outdoor Environments | Daeun Song et.al. | 2408.02454 | null |
2024-08-05 | Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models | Zi Liang et.al. | 2408.02416 | link |
2024-08-05 | Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models | Tongtong Feng et.al. | 2408.02408 | null |
2024-08-05 | A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models | Vanni Zavarella et.al. | 2408.02377 | null |
2024-08-02 | Conditional LoRA Parameter Generation | Xiaolong Jin et.al. | 2408.01415 | null |
2024-08-02 | Autoencoders in Function Space | Justin Bunker et.al. | 2408.01362 | link |
2024-08-02 | MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code | Kaiwen Ning et.al. | 2408.01354 | link |
2024-08-02 | TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling | Dong Huo et.al. | 2408.01291 | null |
2024-08-02 | A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness | Lutao Jiang et.al. | 2408.01269 | null |
2024-08-02 | Exchange control in a MOS double quantum dot made using a 300 mm wafer process | Jacob F. Chittock-Wood et.al. | 2408.01241 | null |
2024-08-02 | CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models | Kushal Kumar Jain et.al. | 2408.01233 | null |
2024-08-02 | Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion | Ke Li et.al. | 2408.01225 | link |
2024-08-02 | PSP-GEN: Stochastic inversion of the Process-Structure-Property chain in materials design through deep, generative probabilistic modeling | Yaohua Zang et.al. | 2408.01114 | null |
2024-08-02 | Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding | Danbinaerin Han et.al. | 2408.01096 | link |
2024-08-01 | Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation | Yixiao Wang et.al. | 2408.00766 | null |
2024-08-01 | Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention | Susung Hong et.al. | 2408.00760 | link |
2024-08-01 | DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency | Jovan Stojkovic et.al. | 2408.00741 | null |
2024-08-01 | TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models | Gilad Deutch et.al. | 2408.00735 | null |
2024-08-01 | A Natural Language Processing Framework for Hotel Recommendation Based on Users’ Text Reviews | Lavrentia Aravani et.al. | 2408.00716 | null |
2024-08-02 | Reinforcement Learning applied to Insurance Portfolio Pursuit | Edward James Young et.al. | 2408.00713 | link |
2024-08-01 | MotionFix: Text-Driven 3D Human Motion Editing | Nikos Athanasiou et.al. | 2408.00712 | null |
2024-08-01 | Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function | Matias Oscar Volman Stern et.al. | 2408.00707 | null |
2024-08-01 | AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models | Daqin Luo et.al. | 2408.00665 | link |
2024-08-01 | Privacy-preserving datasets by capturing feature distributions with Conditional VAEs | Francesco Di Salvo et.al. | 2408.00639 | link |
2024-07-31 | Detecting, Explaining, and Mitigating Memorization in Diffusion Models | Yuxin Wen et.al. | 2407.21720 | link |
2024-07-31 | Tora: Trajectory-oriented Diffusion Transformer for Video Generation | Zhenghao Zhang et.al. | 2407.21705 | link |
2024-07-31 | Generative Diffusion Model for Seismic Imaging Improvement of Sparsely Acquired Data and Uncertainty Quantification | Xingchen Shi et.al. | 2407.21683 | null |
2024-07-31 | Quality Control for Radiology Report Generation Models via Auxiliary Auditing Components | Hermione Warr et.al. | 2407.21638 | null |
2024-07-31 | LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows | Lukas Teufelberger et.al. | 2407.21593 | null |
2024-07-31 | Long-term investment and energy procurement risk management under uncertainty for an electrolytic green hydrogen producer | Owen Palmer et.al. | 2407.21574 | null |
2024-07-31 | Conditioned Prompt-Optimization for Continual Deepfake Detection | Francesco Laiti et.al. | 2407.21554 | link |
2024-07-31 | CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment | Akira Kasuga et.al. | 2407.21553 | null |
2024-07-31 | Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation | Junxuan Yu et.al. | 2407.21490 | null |
2024-07-31 | Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends | Giuliano Martinelli et.al. | 2407.21489 | link |
2024-07-30 | Matting by Generation | Zhixiang Wang et.al. | 2407.21017 | null |
2024-07-30 | Add-SD: Rational Generation without Manual Reference | Lingfeng Yang et.al. | 2407.21016 | link |
2024-07-30 | Integrating Agent-Based and Compartmental Models for Infectious Disease Modeling: A Novel Hybrid Approach | Inan Bostanci et.al. | 2407.20993 | null |
2024-07-30 | MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions | Xiaowei Chi et.al. | 2407.20962 | link |
2024-07-30 | Mitigating calibration errors from mutual coupling with time-domain filtering of 21 cm cosmological radio observations | N. Charles et.al. | 2407.20923 | null |
2024-07-30 | Impact of Geographical Separation on Spectrum Sharing Markets | Kangle Mu et.al. | 2407.20909 | null |
2024-07-30 | Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering | Yanpeng Zhao et.al. | 2407.20908 | link |
2024-07-30 | Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks | Yunfeng Diao et.al. | 2407.20836 | null |
2024-07-30 | Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning | Norman Di Palo et.al. | 2407.20798 | null |
2024-07-30 | SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models | Zheng Liu et.al. | 2407.20756 | link |
2024-07-29 | Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing | Ekaterina Iakovleva et.al. | 2407.20232 | null |
2024-07-29 | LatentArtiFusion: An Effective and Efficient Histological Artifacts Restoration Framework | Zhenqi He et.al. | 2407.20172 | link |
2024-07-29 | Diffusion Feedback Helps CLIP See Better | Wenxuan Wang et.al. | 2407.20171 | link |
2024-07-29 | DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models | Jing Yang et.al. | 2407.20141 | null |
2024-07-29 | Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning | Liyuan Mao et.al. | 2407.20109 | null |
2024-07-29 | On the significance of parameters and the projective level in the Choice and Collection axioms | Vladimir Kanovei et.al. | 2407.20098 | null |
2024-07-29 | Generative Diffusion Model Bootstraps Zero-shot Classification of Fetal Ultrasound Images In Underrepresented African Populations | Fangyijie Wang et.al. | 2407.20072 | link |
2024-07-29 | ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning | Delyan Boychev et.al. | 2407.20020 | link |
2024-07-29 | Reproducibility Study of “ITI-GEN: Inclusive Text-to-Image Generation” | Daniel Gallo Fernández et.al. | 2407.19996 | link |
2024-07-29 | HeadsetOff: Enabling Photorealistic Video Conferencing on Economical VR Headsets | Yili Jin et.al. | 2407.19988 | null |
2024-07-26 | Generative Adversarial Networks for Imputing Sparse Learning Performance | Liang Zhang et.al. | 2407.18875 | null |
2024-07-26 | Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment | Yuze Zheng et.al. | 2407.18854 | null |
2024-07-26 | Scalable Group Choreography via Variational Phase Manifold Learning | Nhat Le et.al. | 2407.18839 | null |
2024-07-26 | Revision of calcium and scandium abundances in Am stars based on NLTE calculations and comparison with diffusion stellar evolution models | L. I. Mashonkina et.al. | 2407.18736 | null |
2024-07-26 | BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation | Peng Hao et.al. | 2407.18715 | null |
2024-07-26 | Q-gen: A Parameterized Quantum Circuit Generator | Yikai Mao et.al. | 2407.18697 | link |
2024-07-26 | Adversarial Robustification via Text-to-Image Diffusion Models | Daewon Choi et.al. | 2407.18658 | link |
2024-07-26 | Robust VAEs via Generating Process of Noise Augmented Data | Hiroo Irobe et.al. | 2407.18632 | null |
2024-07-26 | Denoising Lévy Probabilistic Models | Dario Shariatian et.al. | 2407.18609 | link |
2024-07-26 | How To Segment in 3D Using 2D Models: Automated 3D Segmentation of Prostate Cancer Metastatic Lesions on PET Volumes Using Multi-Angle Maximum Intensity Projections and Diffusion Models | Amirhosein Toosi et.al. | 2407.18555 | null |
2024-07-25 | RegionDrag: Fast Region-Based Image Editing with Diffusion Models | Jingyi Lu et.al. | 2407.18247 | null |
2024-07-25 | VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads | Orest Kupyn et.al. | 2407.18245 | null |
2024-07-25 | CodedVO: Coded Visual Odometry | Sachin Shah et.al. | 2407.18240 | null |
2024-07-25 | SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits | Yanyue Xie et.al. | 2407.18209 | null |
2024-07-25 | Test2VA: Reusing GUI Test Cases for Voice Assistant Features Development in Mobile Applications | Garrett Weaver et.al. | 2407.18155 | null |
2024-07-25 | Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images | Roberto Di Via et.al. | 2407.18125 | null |
2024-07-25 | Keypoint Promptable Re-Identification | Vladimir Somers et.al. | 2407.18112 | link |
2024-07-25 | SSTD: Stripe-Like Space Target Detection using Single-Point Supervision | Zijian Zhu et.al. | 2407.18097 | null |
2024-07-25 | Cross-Observatory Coordination with tilepy: A Novel Tool for Observations of Multi-Messenger Transient Events | Monica Seglar-Arroyo et.al. | 2407.18076 | null |
2024-07-25 | AttentionHand: Text-driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild | Junho Park et.al. | 2407.18034 | link |
2024-07-24 | SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency | Yiming Xie et.al. | 2407.17470 | null |
2024-07-24 | BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social | Ujun Jeong et.al. | 2407.17451 | link |
2024-07-24 | ProvenanceWidgets: A Library of UI Control Elements to Track and Dynamically Overlay Analytic Provenance | Arpit Narechania et.al. | 2407.17431 | link |
2024-07-24 | CDDIP: Constrained Diffusion-Driven Deep Image Prior for Seismic Image Reconstruction | Paul Goyes-Peñafiel et.al. | 2407.17402 | link |
2024-07-24 | Cosmic ray susceptibility of the Terahertz Intensity Mapper detector arrays | Lun-Jun Liu et.al. | 2407.17381 | null |
2024-07-24 | ViPer: Visual Personalization of Generative Models via Individual Preference Learning | Sogand Salehi et.al. | 2407.17365 | null |
2024-07-24 | Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching | Yuyang Ding et.al. | 2407.17349 | link |
2024-07-24 | Quantum nonlocal modulation cancellation with distributed clocks | Stephen D. Chapman et.al. | 2407.17330 | null |
2024-07-25 | Enhanced Deep Learning Methodologies and MRI Selection Techniques for Dementia Diagnosis in the Elderly Population | Nikolaos Ntampakis et.al. | 2407.17324 | null |
2024-07-24 | Edge-Cloud Continuum Orchestration of Critical Services: A Smart-City Approach | Rodrigo Rosmaninho et.al. | 2407.17314 | null |
2024-07-23 | Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions | Fabio Tosi et.al. | 2407.16698 | link |
2024-07-23 | From Imitation to Refinement – Residual RL for Precise Visual Assembly | Lars Ankile et.al. | 2407.16677 | null |
2024-07-23 | RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent | Huiyu Xu et.al. | 2407.16667 | null |
2024-07-23 | MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence | Canyu Zhao et.al. | 2407.16655 | null |
2024-07-23 | Unveiling and Mitigating Bias in Audio Visual Segmentation | Peiwen Sun et.al. | 2407.16638 | null |
2024-07-23 | Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses | Haojun Yu et.al. | 2407.16634 | null |
2024-07-23 | GenRec: A Flexible Data Generator for Recommendations | Erica Coppolillo et.al. | 2407.16594 | null |
2024-07-23 | COALA: A Practical and Vision-Centric Federated Learning Platform | Weiming Zhuang et.al. | 2407.16560 | link |
2024-07-23 | DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models | Zhenyu Xie et.al. | 2407.16511 | null |
2024-07-23 | qMRI Diffusor: Quantitative T1 Mapping of the Brain using a Denoising Diffusion Probabilistic Model | Shishuai Wang et.al. | 2407.16477 | null |
2024-07-22 | Artist: Aesthetically Controllable Text-Driven Stylization without Training | Ruixiang Jiang et.al. | 2407.15842 | link |
2024-07-23 | A Large-scale Benchmark Dataset for Commuting Origin-destination Matrix Generation | Can Rong et.al. | 2407.15823 | link |
2024-07-22 | Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget | Vikash Sehwag et.al. | 2407.15811 | null |
2024-07-22 | Quantum Computing for Phonon Scattering Effects on Thermal Conductivity | Xiangjun Tan et.al. | 2407.15808 | null |
2024-07-22 | Enhancing Mass Customization Manufacturing: Multiobjective Metaheuristic Algorithms for flow shop Production in Smart Industry | Diego Rossit et.al. | 2407.15802 | null |
2024-07-22 | Diffusion Model Based Resource Allocation Strategy in Ultra-Reliable Wireless Networked Control Systems | Amirhassan Babazadeh Darabi et.al. | 2407.15784 | null |
2024-07-22 | A Hamilton-Jacobi approach to road-field reaction-diffusion models | Christopher Henderson et.al. | 2407.15760 | null |
2024-07-22 | Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond | Silvio Galesso et.al. | 2407.15739 | link |
2024-07-22 | DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design | Zhi Hao Luo et.al. | 2407.15723 | link |
2024-07-22 | Estimating Probability Densities with Transformer and Denoising Diffusion | Henry W. Leung et.al. | 2407.15703 | link |
2024-07-19 | DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks | Sarah Jabbour et.al. | 2407.14509 | null |
2024-07-19 | On Pre-training of Multimodal Language Models Customized for Chart Understanding | Wan-Cyuan Fan et.al. | 2407.14506 | null |
2024-07-19 | T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation | Kaiyue Sun et.al. | 2407.14505 | link |
2024-07-19 | M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models | Seunggeun Chi et.al. | 2407.14502 | null |
2024-07-19 | A Precision Cryogenic Positioning Stage for Detector Dithering and Flexure Compensation | Stephen A. Smee et.al. | 2407.14493 | null |
2024-07-19 | Contrastive Learning with Counterfactual Explanations for Radiology Report Generation | Mingjie Li et.al. | 2407.14474 | null |
2024-07-19 | Describe Data to get Science-Data-Ready Tooling: Awkward as a Target for Kaitai Struct YAML | Manasvi Goyal et.al. | 2407.14461 | null |
2024-07-19 | Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model | Seonghui Min et.al. | 2407.14434 | null |
2024-07-19 | Controllable and Efficient Multi-Class Pathology Nuclei Data Augmentation using Text-Conditioned Diffusion Models | Hyun-Jic Oh et.al. | 2407.14426 | null |
2024-07-19 | GLAudio Listens to the Sound of the Graph | Aurelio Sulser et.al. | 2407.14387 | link |
2024-07-18 | LogoSticker: Inserting Logos into Diffusion Models for Customized Generation | Mingkang Zhu et.al. | 2407.13752 | null |
2024-07-18 | Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review | Masatoshi Uehara et.al. | 2407.13734 | link |
2024-07-18 | Shaded Route Planning Using Active Segmentation and Identification of Satellite Images | Longchao Da et.al. | 2407.13689 | null |
2024-07-18 | PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers | Songlin Li et.al. | 2407.13677 | link |
2024-07-18 | MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis | Ziming Zhong et.al. | 2407.13675 | link |
2024-07-18 | Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models | Xiaoyu Zhu et.al. | 2407.13642 | null |
2024-07-18 | Training-free Composite Scene Generation for Layout-to-Image Synthesis | Jiaqi Liu et.al. | 2407.13609 | link |
2024-07-18 | EnergyDiff: Universal Time-Series Energy Data Generation using Diffusion Models | Nan Lin et.al. | 2407.13538 | null |
2024-07-18 | VeriQR: A Robustness Verification Tool for Quantum Machine Learning Models | Yanling Lin et.al. | 2407.13533 | null |
2024-07-18 | All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models | Charumathi Badrinath et.al. | 2407.13449 | link |
2024-07-17 | SMooDi: Stylized Motion Diffusion Model | Lei Zhong et.al. | 2407.12783 | null |
2024-07-17 | VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control | Sherwin Bahmani et.al. | 2407.12781 | null |
2024-07-17 | Hallucination Index: An Image Quality Metric for Generative Reconstruction Models | Matthew Tivnan et.al. | 2407.12780 | null |
2024-07-17 | GroundUp: Rapid Sketch-Based 3D City Massing | Gizem Esra Unlu et.al. | 2407.12739 | null |
2024-07-17 | EchoSight: Advancing Visual-Language Models with Wiki Knowledge | Yibin Yan et.al. | 2407.12735 | null |
2024-07-17 | NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model | Zhongqun Zhang et.al. | 2407.12727 | null |
2024-07-17 | An Evaluation of Continual Learning for Advanced Node Semiconductor Defect Inspection | Amit Prasad et.al. | 2407.12724 | null |
2024-07-17 | Unlocking planetesimal magnetic field histories: a refined, versatile model for thermal evolution and dynamo generation | Hannah R. Sanderson et.al. | 2407.12721 | null |
2024-07-17 | SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow | Yuanzhi Zhu et.al. | 2407.12718 | link |
2024-07-17 | Teleoperation in Robot-assisted MIS with Adaptive RCM via Admittance Control | Ehsan Nasiri et.al. | 2407.12711 | null |
2024-07-16 | Efficient Training with Denoised Neural Weights | Yifan Gong et.al. | 2407.11966 | null |
2024-07-16 | UrbanWorld: An Urban World Model for 3D City Generation | Yu Shang et.al. | 2407.11965 | link |
2024-07-16 | Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design | Leo Klarner et.al. | 2407.11942 | link |
2024-07-16 | Code Documentation and Analysis to Secure Software Development | Paul Attie et.al. | 2407.11934 | null |
2024-07-16 | Global Optimisation of Black-Box Functions with Generative Models in the Wasserstein Space | Tigran Ramazyan et.al. | 2407.11917 | link |
2024-07-16 | Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data | Tim Elsner et.al. | 2407.11913 | null |
2024-07-16 | Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development | Daoyuan Chen et.al. | 2407.11784 | link |
2024-07-16 | Diffusion-driven self-assembly of emerin nanodomains at the nuclear envelope | Carlos D. Alas et.al. | 2407.11758 | null |
2024-07-16 | Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen | Alessandro Palma et.al. | 2407.11734 | link |
2024-07-16 | Theoretical Insights into CycleGAN: Analyzing Approximation and Estimation Errors in Unpaired Data Generation | Luwei Sun et.al. | 2407.11678 | null |
2024-07-15 | Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion | Yongyuan Liang et.al. | 2407.10973 | null |
2024-07-15 | Fast Matrix Multiplications for Lookup Table-Quantized LLMs | Han Guo et.al. | 2407.10960 | link |
2024-07-15 | InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models | Nirat Saini et.al. | 2407.10958 | null |
2024-07-16 | DataDream: Few-shot Guided Dataset Generation | Jae Myung Kim et.al. | 2407.10910 | link |
2024-07-15 | Optical Diffusion Models for Image Generation | Ilker Oguz et.al. | 2407.10897 | null |
2024-07-15 | R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection | Zheyuan Zhou et.al. | 2407.10862 | null |
2024-07-15 | Physics-Inspired Generative Models in Medical Imaging: A Review | Dennis Hein et.al. | 2407.10856 | null |
2024-07-15 | Inferring dark energy properties from the scale factor parametrisation | Upala Mukhopadhayay et.al. | 2407.10845 | null |
2024-07-15 | MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration | Yulin Ren et.al. | 2407.10833 | null |
2024-07-15 | Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation | Tu Vu et.al. | 2407.10817 | null |
2024-07-12 | StyleSplat: 3D Object Style Transfer with Gaussian Splatting | Sahil Jain et.al. | 2407.09473 | null |
2024-07-12 | FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3 | Georgios Makridis et.al. | 2407.09467 | null |
2024-07-12 | The $\mu\mathcal{G}$ Language for Programming Graph Neural Networks | Matteo Belenchia et.al. | 2407.09441 | null |
2024-07-12 | Graph Neural Network Causal Explanation via Neural Causal Models | Arman Behnam et.al. | 2407.09378 | link |
2024-07-12 | Computationally Efficient Estimation of Large Probit Models | Patrick Ding et.al. | 2407.09371 | null |
2024-07-12 | Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text | Lucio La Cava et.al. | 2407.09364 | null |
2024-07-15 | Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees | Alexia Jolicoeur-Martineau et.al. | 2407.09357 | link |
2024-07-12 | PID: Physics-Informed Diffusion Model for Infrared Image Generation | Fangyuan Mao et.al. | 2407.09299 | link |
2024-07-12 | Learning Distances from Data with Normalizing Flows and Score Matching | Peter Sorrenson et.al. | 2407.09297 | null |
2024-07-12 | Surgical Text-to-Image Generation | Chinedu Innocent Nwoye et.al. | 2407.09230 | null |
2024-07-11 | Video Diffusion Alignment via Reward Gradients | Mihir Prabhudesai et.al. | 2407.08737 | link |
2024-07-11 | Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models | Zhening Xing et.al. | 2407.08701 | null |
2024-07-11 | FAR-Trans: An Investment Dataset for Financial Asset Recommendation | Javier Sanz-Cruzado et.al. | 2407.08692 | null |
2024-07-11 | Scattering transforms on the sphere, application to large scale structure modelling | Louise Mousset et.al. | 2407.08687 | null |
2024-07-11 | CAD-Prompted Generative Models: A Pathway to Feasible and Novel Engineering Designs | Leah Chong et.al. | 2407.08675 | null |
2024-07-11 | Still-Moving: Customized Video Generation without Customized Video Data | Hila Chefer et.al. | 2407.08674 | null |
2024-07-11 | Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density | Shuangqi Li et.al. | 2407.08659 | null |
2024-07-11 | Adaptive Smooth Non-Stationary Bandits | Joe Suk et.al. | 2407.08654 | null |
2024-07-11 | Fine-Tuning Stable Diffusion XL for Stylistic Icon Generation: A Comparison of Caption Size | Youssef Sultan et.al. | 2407.08513 | null |
2024-07-11 | Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Model | Yuxing Tian et.al. | 2407.08500 | null |
2024-07-10 | Generative Image as Action Models | Mohit Shridhar et.al. | 2407.07875 | link |
2024-07-10 | Dynamical Measure Transport and Neural PDE Solvers for Sampling | Jingtong Sun et.al. | 2407.07873 | null |
2024-07-10 | Controlling Space and Time with Diffusion Models | Daniel Watson et.al. | 2407.07860 | null |
2024-07-10 | Generic Numerical Analysis of Stochastic Reaction Diffusion Model with applications in excitable media | Yahya Alnashri et.al. | 2407.07834 | null |
2024-07-10 | Universal and non-universal signatures in the scaling functions of critical variables | Gianluca Teza et.al. | 2407.07782 | null |
2024-07-10 | Towards Human-Like Driving: Active Inference in Autonomous Vehicle Control | Elahe Delavari et.al. | 2407.07684 | null |
2024-07-10 | VEnhancer: Generative Space-Time Enhancement for Video Generation | Jingwen He et.al. | 2407.07667 | null |
2024-07-10 | A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry | Martin Lindström et.al. | 2407.07664 | link |
2024-07-10 | The heterogeneous impact of the EU-Canada agreement with causal machine learning | Lionel Fontagné et.al. | 2407.07652 | null |
2024-07-11 | MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis | Wanggui He et.al. | 2407.07614 | link |
2024-07-09 | ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction | Shaozhe Hao et.al. | 2407.07077 | link |
2024-07-09 | Latent Space Imaging | Matheus Souza et.al. | 2407.07052 | null |
2024-07-09 | Generative models of astrophysical fields with scattering transforms on the sphere | Louise Mousset et.al. | 2407.07007 | link |
2024-07-10 | PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods | Yiying Wang et.al. | 2407.06985 | link |
2024-07-09 | Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach | Taolin Zhang et.al. | 2407.06964 | null |
2024-07-09 | RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models | Bowen Zhang et.al. | 2407.06938 | null |
2024-07-09 | HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance | Guian Fang et.al. | 2407.06937 | link |
2024-07-09 | Fine-grained large-scale content recommendations for MSX sellers | Manpreet Singh et.al. | 2407.06910 | null |
2024-07-09 | Enhanced Battery Degradation-Aware Scheduling for Distribution Network with Electric Vehicle Load | Vijay Babu Pamshetti et.al. | 2407.06857 | null |
2024-07-09 | A reaction-diffusion model for relapsing-remitting multiple sclerosis with a treatment term | Romina Travaglini et.al. | 2407.06802 | null |
2024-07-08 | Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images | Zhangyang Qi et.al. | 2407.06191 | null |
2024-07-08 | CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation | Xinying Guo et.al. | 2407.06188 | null |
2024-07-08 | JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation | Yu Zeng et.al. | 2407.06187 | null |
2024-07-08 | The Tug-of-War Between Deepfake Generation and Detection | Hannah Lee et.al. | 2407.06174 | null |
2024-07-08 | ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation | Ethan Chern et.al. | 2407.06135 | link |
2024-07-08 | Structured Generations: Using Hierarchical Clusters to guide Diffusion Models | Jorge da Silva Goncalves et.al. | 2407.06124 | link |
2024-07-08 | PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models | Jinhua Zhang et.al. | 2407.06109 | link |
2024-07-08 | Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation | Xinyu Bai et.al. | 2407.06095 | null |
2024-07-08 | Assessing Cardiomegaly in Dogs Using a Simple CNN Model | Nikhil Deekonda et.al. | 2407.06092 | null |
2024-07-08 | Layered Diffusion Model for One-Shot High Resolution Text-to-Image Synthesis | Emaad Khwaja et.al. | 2407.06079 | null |
2024-07-05 | RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation | Yuxuan Kuang et.al. | 2407.04689 | link |
2024-07-05 | Thermal and mechanical study of a parametrised cryostat model for optical characterisation of upcoming CMB experiments | Thomas J. L. J. Gascard et.al. | 2407.04613 | link |
2024-07-08 | PartCraft: Crafting Creative Objects by Parts | Kam Woh Ng et.al. | 2407.04604 | link |
2024-07-05 | Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates | Ryotaro Okabe et.al. | 2407.04557 | null |
2024-07-05 | Unified continuous-time q-learning for mean-field game and mean-field control problems | Xiaoli Wei et.al. | 2407.04521 | null |
2024-07-08 | Speed-accuracy trade-off for the diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport | Kotaro Ikeda et.al. | 2407.04495 | null |
2024-07-05 | PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation | Yinghua Yao et.al. | 2407.04493 | link |
2024-07-05 | Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model | Duy M. H. Nguyen et.al. | 2407.04489 | null |
2024-07-05 | Leveraging Graph Structures to Detect Hallucinations in Large Language Models | Noa Nonkes et.al. | 2407.04485 | link |
2024-07-05 | VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing | Shang Liu et.al. | 2407.04461 | null |
2024-07-03 | DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents | Yilun Xu et.al. | 2407.03300 | link |
2024-07-03 | Improved Noise Schedule for Diffusion Training | Tiankai Hang et.al. | 2407.03297 | null |
2024-07-03 | Anomaly-based Framework for Detecting Power Overloading Cyberattacks in Smart Grid AMI | Abdelaziz Amara Korba et.al. | 2407.03264 | null |
2024-07-03 | SOS! Soft Prompt Attack Against Open-Source Large Language Models | Ziqing Yang et.al. | 2407.03160 | null |
2024-07-04 | Spatio-Temporal Adaptive Diffusion Models for EEG Super-Resolution in Epilepsy Diagnosis | Tong Zhou et.al. | 2407.03089 | null |
2024-07-03 | Artificial Inductive Bias for Synthetic Tabular Data Generation in Data-Scarce Scenarios | Patricia A. Apellániz et.al. | 2407.03080 | link |
2024-07-03 | Electromagnetic Property Sensing Based on Diffusion Model in ISAC System | Yuhua Jiang et.al. | 2407.03075 | null |
2024-07-03 | Semantic-Aware Power Allocation for Generative Semantic Communications with Foundation Models | Chunmei Xu et.al. | 2407.03050 | null |
2024-07-03 | SlerpFace: Face Template Protection via Spherical Linear Interpolation | Zhizhou Zhong et.al. | 2407.03043 | null |
2024-07-03 | An Organism Starts with a Single Pix-Cell: A Neural Cellular Diffusion for High-Resolution Image Synthesis | Marawan Elbatel et.al. | 2407.03018 | link |
2024-07-02 | Magic Insert: Style-Aware Drag-and-Drop | Nataniel Ruiz et.al. | 2407.02489 | null |
2024-07-02 | Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models | Fei Shen et.al. | 2407.02482 | link |
2024-07-02 | A Pattern Language for Machine Learning Tasks | Benjamin Rodatz et.al. | 2407.02424 | null |
2024-07-02 | GCF: Graph Convolutional Networks for Facial Expression Recognition | Hozaifa Kassab et.al. | 2407.02361 | null |
2024-07-02 | MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space | Yihong Tang et.al. | 2407.02345 | null |
2024-07-02 | Choice-based time slot management in attended home delivery | Dorsa Abdolhamidi et.al. | 2407.02339 | null |
2024-07-02 | Mining Constraints from Reference Process Models for Detecting Best-Practice Violations in Event Log | Adrian Rebmann et.al. | 2407.02336 | link |
2024-07-02 | A tactical time slot management problem under mixed logit demand | Dorsa Abdolhamidi et.al. | 2407.02308 | null |
2024-07-02 | Renard: A Modular Pipeline for Extracting Character Networks from Narrative Texts | Arthur Amalvy et.al. | 2407.02284 | link |
2024-07-03 | Federated Distillation for Medical Image Classification: Towards Trustworthy Computer-Aided Diagnosis | Sufen Ren et.al. | 2407.02261 | null |
2024-06-28 | Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language | Yicheng Chen et.al. | 2406.20085 | null |
2024-06-28 | The hybrid Josephson rhombus: A superconducting element with tailored current-phase relation | L. Banszerus et.al. | 2406.20082 | null |
2024-06-28 | HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model | Hieu T. Nguyen et.al. | 2406.20077 | null |
2024-06-28 | Modeling and LQR Control of Insect Sized Flapping Wing Robot | Daksh Dhingra et.al. | 2406.20061 | null |
2024-06-28 | Neural Differentiable Modeling with Diffusion-Based Super-resolution for Two-Dimensional Spatiotemporal Turbulence | Xiantao Fan et.al. | 2406.20047 | null |
2024-06-28 | Electrostatics-based particle sampling and approximate inference | Yongchao Huang et.al. | 2406.20044 | link |
2024-06-28 | HAITCH: A Framework for Distortion and Motion Correction in Fetal Multi-Shell Diffusion-Weighted MRI | Haykel Snoussi et.al. | 2406.20042 | null |
2024-06-28 | Concept Lens: Visually Analyzing the Consistency of Semantic Manipulation in GANs | Sangwon Jeong et.al. | 2406.19987 | null |
2024-07-01 | Text2Robot: Evolutionary Robot Design from Text Descriptions | Ryan P. Ringel et.al. | 2406.19963 | link |
2024-06-28 | Kolmogorov-Smirnov GAN | Maciej Falkiewicz et.al. | 2406.19948 | link |
2024-06-27 | Looking 3D: Anomaly Detection with 2D-3D Alignment | Ankan Bhunia et.al. | 2406.19393 | link |
2024-06-27 | Taming Data and Transformers for Audio Generation | Moayed Haji-Ali et.al. | 2406.19388 | null |
2024-06-27 | Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space | Core Francisco Park et.al. | 2406.19370 | link |
2024-06-27 | Accelerating Multiphase Flow Simulations with Denoising Diffusion Model Driven Initializations | Jaehong Chung et.al. | 2406.19333 | null |
2024-06-27 | Subtractive Training for Music Stem Insertion using Latent Diffusion Models | Ivan Villa-Renteria et.al. | 2406.19328 | null |
2024-06-27 | Efficient World Models with Context-Aware Tokenization | Vincent Micheli et.al. | 2406.19320 | link |
2024-06-27 | PNeRV: A Polynomial Neural Representation for Videos | Sonam Gupta et.al. | 2406.19299 | null |
2024-06-27 | Compositional Image Decomposition with Diffusion Models | Jocelin Su et.al. | 2406.19298 | null |
2024-06-27 | BISeizuRe: BERT-Inspired Seizure Data Representation to Improve Epilepsy Monitoring | Luca Benfenati et.al. | 2406.19189 | null |
2024-06-27 | On Pólya-Young urn models and growth processes | Markus Kuba et.al. | 2406.19110 | null |
2024-06-26 | MatchTime: Towards Automatic Soccer Game Commentary Generation | Jiayuan Rao et.al. | 2406.18530 | link |
2024-06-26 | MultiDiff: Consistent Novel View Synthesis from a Single Image | Norman Müller et.al. | 2406.18524 | null |
2024-06-26 | Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration | Kang Liao et.al. | 2406.18516 | link |
2024-06-26 | DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance | Younghyun Kim et.al. | 2406.18459 | link |
2024-06-26 | Cascading Large Language Models for Salient Event Graph Generation | Xingwei Tan et.al. | 2406.18449 | link |
2024-06-26 | Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling | Abril Corona-Figueroa et.al. | 2406.18422 | link |
2024-06-26 | Towards diffusion models for large-scale sea-ice modelling | Tobias Sebastian Finn et.al. | 2406.18417 | null |
2024-06-27 | Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process | Tianyu Lin et.al. | 2406.18361 | link |
2024-06-26 | Molecular Diffusion Models with Virtual Receptors | Matan Halfon et.al. | 2406.18330 | null |
2024-06-27 | Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems | Italo Luis da Silva et.al. | 2406.18245 | link |
2024-06-25 | DiffusionPDE: Generative PDE-Solving Under Partial Observation | Jiahe Huang et.al. | 2406.17763 | link |
2024-06-25 | MotionBooth: Motion-Aware Customized Text-to-Video Generation | Jianzong Wu et.al. | 2406.17758 | null |
2024-06-25 | Accelerating Clinical Evidence Synthesis with Large Language Models | Zifeng Wang et.al. | 2406.17755 | null |
2024-06-25 | Extensions of Panjer’s recursion for mixed compound distributions | Spyridon M. Tzaninis et.al. | 2406.17726 | null |
2024-06-25 | PANDA: A self-driving lab for studying electrodeposited polymer films | Harley Quinn et.al. | 2406.17725 | null |
2024-06-25 | Unified Auto-Encoding with Masked Diffusion | Philippe Hansen-Estruch et.al. | 2406.17688 | link |
2024-06-25 | LaTable: Towards Large Tabular Models | Boris van Breugel et.al. | 2406.17673 | null |
2024-06-26 | SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond | Marco Comunità et.al. | 2406.17672 | null |
2024-06-25 | Banishing LLM Hallucinations Requires Rethinking Generalization | Johnny Li et.al. | 2406.17642 | null |
2024-06-25 | The experience of humans’ and robots’ mutual (im)politeness in enacted service scenarios: An empirical study | Victor Kaptelinin et.al. | 2406.17641 | null |
2024-06-24 | FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | Haonan Qiu et.al. | 2406.16863 | link |
2024-06-24 | Dreamitate: Real-World Visuomotor Policy Learning via Video Generation | Junbang Liang et.al. | 2406.16862 | null |
2024-06-24 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation | Yuang Peng et.al. | 2406.16855 | link |
2024-06-24 | USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations | Mounika Marreddy et.al. | 2406.16833 | null |
2024-06-24 | General Binding Affinity Guidance for Diffusion Models in Structure-Based Drug Design | Yue Jian et.al. | 2406.16821 | null |
2024-06-24 | ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians | Yufei Liu et.al. | 2406.16815 | null |
2024-06-24 | Conformal time series decomposition with component-wise exchangeability | Derck W. E. Prinzhorn et.al. | 2406.16766 | link |
2024-06-24 | Inferring stochastic low-rank recurrent neural networks from neural data | Matthijs Pals et.al. | 2406.16749 | link |
2024-06-24 | Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image | Jinkun Hao et.al. | 2406.16710 | null |
2024-06-24 | Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling | Min-Seop Kwak et.al. | 2406.16695 | null |
2024-06-21 | Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild | Nadav Orzech et.al. | 2406.15331 | null |
2024-06-21 | Rethinking Remote Sensing Change Detection With A Mask View | Xiaowen Ma et.al. | 2406.15320 | link |
2024-06-21 | You Only Acquire Sparse-channel (YOAS): A Unified Framework for Dense-channel EEG Generation | Hongyu Chen et.al. | 2406.15269 | null |
2024-06-21 | Evaluating Diversity in Automatic Poetry Generation | Yanran Chen et.al. | 2406.15267 | link |
2024-06-21 | Fingerprint Membership and Identity Inference Against Generative Adversarial Networks | Saverio Cavasin et.al. | 2406.15253 | null |
2024-06-21 | MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation | Xuan He et.al. | 2406.15252 | null |
2024-06-21 | Unsupervised Bayesian Generation of Synthetic CT from CBCT Using Patient-Specific Score-Based Prior | Junbo Peng et.al. | 2406.15219 | null |
2024-06-21 | Sound and Fury, Signifying Nothing? Impact of Data Breach Disclosure Laws | Muhammad Zia Hydari et.al. | 2406.15215 | null |
2024-06-21 | Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors | Ali Naseh et.al. | 2406.15213 | link |
2024-06-21 | Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms | Santiago Berrezueta-Guzman et.al. | 2406.15198 | null |
2024-06-20 | A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models | Xincheng Shuai et.al. | 2406.14555 | link |
2024-06-21 | Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation | Eyal Michaeli et.al. | 2406.14551 | link |
2024-06-20 | Consistency Models Made Easy | Zhengyang Geng et.al. | 2406.14548 | link |
2024-06-20 | IRASim: Learning Interactive Real-Robot Action Simulators | Fangqi Zhu et.al. | 2406.14540 | null |
2024-06-20 | Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps | Nikita Starodubcev et.al. | 2406.14539 | null |
2024-06-20 | Fantastic Copyrighted Beasts and How (Not) to Generate Them | Luxi He et.al. | 2406.14526 | null |
2024-06-20 | Photoacoustic methane detection assisted by a gas-filled anti-resonant hollow-core fiber laser | Cuiling Zhang et.al. | 2406.14521 | null |
2024-06-20 | V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data | Rotem Shalev-Arkushin et.al. | 2406.14510 | null |
2024-06-20 | CodeRAG-Bench: Can Retrieval Augment Code Generation? | Zora Zhiruo Wang et.al. | 2406.14497 | link |
2024-06-20 | SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset | Josef Dai et.al. | 2406.14477 | link |
2024-06-20 | CollaFuse: Collaborative Diffusion Models | Simeon Allmendinger et.al. | 2406.14429 | link |
2024-06-20 | Active Diffusion Subsampling | Oisin Nolan et.al. | 2406.14388 | link |
2024-06-20 | Multicoloured Hardcore Model: Fast Mixing and Queueing | Sam Olesker-Taylor et.al. | 2406.14376 | null |
2024-06-20 | FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability | Md Fahim Sikder et.al. | 2406.14281 | link |
2024-06-20 | In Tree Structure Should Sentence Be Generated | Yaguang Li et.al. | 2406.14189 | link |
2024-06-20 | CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation | Tingwei Liu et.al. | 2406.14186 | link |
2024-06-20 | Tractable Equilibrium Computation in Markov Games through Risk Aversion | Eric Mazumdar et.al. | 2406.14156 | null |
2024-06-20 | ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning | Zhongjie Duan et.al. | 2406.14130 | link |
2024-06-20 | Dye4AI: Assuring Data Boundary on Generative AI Services | Shu Wang et.al. | 2406.14114 | null |
2024-06-20 | HeartBeat: Towards Controllable Echocardiography Video Synthesis with Multimodal Conditions-Guided Diffusion Models | Xinrui Zhou et.al. | 2406.14098 | null |
2024-06-20 | Bridging bulk and surface: An interacting particle system towards the field-road diffusion model | Matthieu Alfaro et.al. | 2406.14093 | null |
2024-06-20 | A Practical Diffusion Path for Sampling | Omar Chehab et.al. | 2406.14040 | null |
2024-06-20 | Leveraging eBPF and AI for Ransomware Nose Out | Arjun Sekar et.al. | 2406.14020 | null |
2024-06-20 | Feature Fusion Based on Mutual-Cross-Attention Mechanism for EEG Emotion Recognition | Yimin Zhao et.al. | 2406.14014 | link |
2024-06-20 | Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs | Mahammed Kamruzzaman et.al. | 2406.13993 | null |
2024-06-20 | The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging | Georgi Ganev et.al. | 2406.13985 | link |
2024-06-20 | Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning | Tingyi Lin et.al. | 2406.13977 | null |
2024-06-20 | Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models | Yuan Zhong et.al. | 2406.13942 | null |
2024-06-20 | EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations | Jie Ren et.al. | 2406.13933 | null |
2024-06-20 | Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions | Hamdireza Rouzegar et.al. | 2406.13903 | null |
2024-06-19 | INFusion: Diffusion Regularized Implicit Neural Representations for 2D and 3D accelerated MRI reconstruction | Yamin Arefeen et.al. | 2406.13895 | null |
2024-06-19 | Open Generative Large Language Models for Galician | Pablo Gamallo et.al. | 2406.13893 | null |
2024-06-19 | StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation | Davit Abrahamyan et.al. | 2406.13840 | link |
2024-06-19 | RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design | Rishabh Anand et.al. | 2406.13839 | link |
2024-06-19 | COAC: Cross-layer Optimization of Accelerator Configurability for Efficient CNN Processing | Steven Colleman et.al. | 2406.13752 | null |
2024-06-19 | GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation | Baiqi Li et.al. | 2406.13743 | link |
2024-06-19 | Tree-Sliced Wasserstein Distance on a System of Lines | Viet-Hoang Tran et.al. | 2406.13725 | null |
2024-06-19 | Hitchhiker’s guide on Energy-Based Models: a comprehensive review on the relation with other generative models, sampling and statistical physics | Davide Carbone et.al. | 2406.13661 | null |
2024-06-19 | Towards Minimal Targeted Updates of Language Models with Targeted Negative Training | Lily H. Zhang et.al. | 2406.13660 | link |
2024-06-19 | Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics | Weitong Zhang et.al. | 2406.13652 | null |
2024-06-19 | On AI-Inspired UI-Design | Jialiang Wei et.al. | 2406.13631 | null |
2024-06-19 | Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy | Elena Tomasi et.al. | 2406.13627 | link |
2024-06-19 | Enhance the Image: Super Resolution using Artificial Intelligence in MRI | Ziyu Li et.al. | 2406.13625 | null |
2024-06-19 | Generative Modeling by Minimizing the Wasserstein-2 Loss | Yu-Jui Huang et.al. | 2406.13619 | null |
2024-06-19 | Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks | Liangxin Qian et.al. | 2406.13602 | null |
2024-06-19 | ModSec-Learn: Boosting ModSecurity with Machine Learning | Christian Scano et.al. | 2406.13547 | link |
2024-06-19 | Towards Cyber Threat Intelligence for the IoT | Alfonso Iacovazzi et.al. | 2406.13543 | null |
2024-06-19 | Image Distillation for Safe Data Sharing in Histopathology | Zhe Li et.al. | 2406.13536 | link |
2024-06-19 | Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement | Chenda Li et.al. | 2406.13471 | null |
2024-06-19 | Unifying nonlinearly constrained nonconvex optimization | Charlie Vanaret et.al. | 2406.13454 | link |
2024-06-19 | Federating to Grow Transformers with Constrained Resources without Model Sharing | Shikun Shen et.al. | 2406.13450 | null |
2024-06-19 | Multi-messenger modeling of the Monogem pulsar halo | Youyou Li et.al. | 2406.13426 | null |
2024-06-19 | Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images | Haruo Fujiwara et.al. | 2406.13393 | null |
2024-06-19 | Effective Edge-wise Representation Learning in Edge-Attributed Bipartite Graphs | Hewen Wang et.al. | 2406.13369 | null |
2024-06-19 | Situational Instructions Database: Task Guidance in Dynamic Environments | Muhammad Saif Ullah Khan et.al. | 2406.13302 | link |
2024-06-19 | ARDuP: Active Region Video Diffusion for Universal Policies | Shuaiyi Huang et.al. | 2406.13301 | null |
2024-06-19 | AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models | Ken Chen et.al. | 2406.13272 | null |
2024-06-19 | Self-Supervised Diffusion Model for 3-D Seismic Data Reconstruction | Xinyang Wang et.al. | 2406.13252 | null |
2024-06-19 | Optimizing Inventory Management through Multiobjective Reverse Logistics with Environmental Impact | I. B. Wadhawan et.al. | 2406.13226 | null |
2024-06-19 | Neural Residual Diffusion Models for Deep Scalable Vision Generation | Zhiyuan Ma et.al. | 2406.13215 | null |
2024-06-19 | Surgical Triplet Recognition via Diffusion Model | Daochang Liu et.al. | 2406.13210 | null |
2024-06-19 | Diffusion Model-based FOD Restoration from High Distortion in dMRI | Shuo Huang et.al. | 2406.13209 | null |
2024-06-19 | Toward Structure Fairness in Dynamic Graph Embedding: A Trend-aware Dual Debiasing Approach | Yicong Li et.al. | 2406.13201 | link |
2024-06-19 | Synthetic Context Generation for Question Generation | Naiming Liu et.al. | 2406.13188 | null |
2024-06-19 | Conditional score-based diffusion models for solving inverse problems in mechanics | Agnimitra Dasgupta et.al. | 2406.13154 | null |
2024-06-19 | von Mises Quasi-Processes for Bayesian Circular Regression | Yarden Cohen et.al. | 2406.13151 | null |
2024-06-19 | MCAD: Multi-modal Conditioned Adversarial Diffusion Model for High-Quality PET Image Reconstruction | Jiaqi Cui et.al. | 2406.13150 | null |
2024-06-19 | GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement | Hao Wang et.al. | 2406.13136 | null |
2024-06-19 | Thruster-Assisted Incline Walking | Kaushik Venkatesh Krishnamurthy et.al. | 2406.13118 | null |
2024-06-18 | Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models | Paul Henderson et.al. | 2406.13099 | null |
2024-06-18 | RITA: A Real-time Interactive Talking Avatars Framework | Wuxinlin Cheng et.al. | 2406.13093 | null |
2024-06-18 | PIPPIN: Generating variable length full events from partons | Guillaume Quétant et.al. | 2406.13074 | link |
2024-06-18 | MaskPure: Improving Defense Against Text Adversaries with Stochastic Purification | Harrison Gietz et.al. | 2406.13066 | link |
2024-06-18 | Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach | Zilin Bian et.al. | 2406.13038 | null |
2024-06-18 | Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities | Matthew T. C. Li et.al. | 2406.13036 | null |
2024-06-18 | Data Plagiarism Index: Characterizing the Privacy Risk of Data-Copying in Tabular Generative Models | Joshua Ward et.al. | 2406.13012 | null |
2024-06-18 | Synergizing Foundation Models and Federated Learning: A Survey | Shenghui Li et.al. | 2406.12844 | null |
2024-06-18 | Evaluating the design space of diffusion-based generative models | Yuqing Wang et.al. | 2406.12839 | null |
2024-06-18 | Neural Approximate Mirror Maps for Constrained Diffusion Models | Berthy T. Feng et.al. | 2406.12816 | null |
2024-06-19 | AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation | Xinyu Hou et.al. | 2406.12805 | link |
2024-06-18 | Extracting Training Data from Unconditional Diffusion Models | Yunhao Chen et.al. | 2406.12752 | null |
2024-06-18 | Useful stochastic bounds in time-varying queues with service and patience times having general joint distribution | Shreehari Anand Bodas et.al. | 2406.12745 | null |
2024-06-18 | SUPER: Selfie Undistortion and Head Pose Editing with Identity Preservation | Polina Karpikova et.al. | 2406.12700 | null |
2024-06-18 | Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation | Miseul Kim et.al. | 2406.12688 | null |
2024-06-18 | GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models | Yongtao Ge et.al. | 2406.12671 | link |
2024-06-18 | Research and Implementation of Data Enhancement Techniques for Graph Neural Networks | Jingzhao Gu et.al. | 2406.12640 | null |
2024-06-18 | News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation | Andreea Iana et.al. | 2406.12634 | link |
2024-06-18 | Learning Diffusion at Lightspeed | Antonio Terpin et.al. | 2406.12616 | null |
2024-06-18 | Unmasking the Veil: An Investigation into Concept Ablation for Privacy and Copyright Protection in Images | Shivank Garg et.al. | 2406.12592 | link |
2024-06-18 | Behavior-Dependent Linear Recurrent Units for Efficient Sequential Recommendation | Chengkai Liu et.al. | 2406.12580 | link |
2024-06-18 | Training Diffusion Models with Federated Learning | Matthijs de Goede et.al. | 2406.12575 | null |
2024-06-18 | P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts | Yuhao Dan et.al. | 2406.12548 | null |
2024-06-18 | Structured Detection for Simultaneous Super-Resolution and Optical Sectioning in Laser Scanning Microscopy | Alessandro Zunino et.al. | 2406.12542 | link |
2024-06-18 | Variational Distillation of Diffusion Policies into Mixture of Experts | Hongyi Zhou et.al. | 2406.12538 | null |
2024-06-18 | HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors | Panwang Pan et.al. | 2406.12459 | link |
2024-06-18 | Planning Using Schrödinger Bridge Diffusion Models | Adarsh Srivastava et.al. | 2406.12458 | link |
2024-06-18 | Deep Temporal Deaggregation: Large-Scale Spatio-Temporal Generative Models | David Bergström et.al. | 2406.12423 | null |
2024-06-18 | ROVER: RTL Optimization via Verified E-Graph Rewriting | Samuel Coward et.al. | 2406.12421 | null |
2024-06-18 | TADM: Temporally-Aware Diffusion Model for Neurodegenerative Progression on Brain MRI | Mattia Litrico et.al. | 2406.12411 | null |
2024-06-18 | SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions | Yuexiong Ding et.al. | 2406.12395 | null |
Vision-Language Models
Publish Date | Title | Authors | PDF | Code |
---|---|---|---|---|
2024-11-18 | MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT | Xiaomin Ouyang et.al. | 2411.12126 | null |
2024-11-17 | SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization | Hongrui Jia et.al. | 2411.11909 | null |
2024-11-18 | The Power of Many: Multi-Agent Multimodal Models for Cultural Image Captioning | Longju Bai et.al. | 2411.11758 | null |
2024-11-18 | Artificial Scientific Discovery | Antonio Norelli et.al. | 2411.11672 | null |
2024-11-18 | InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models | Yu Yan et.al. | 2411.11394 | null |
2024-11-19 | SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach | Ruoxi Sun et.al. | 2411.11195 | null |
2024-11-16 | ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models | Vipula Rawte et.al. | 2411.10867 | null |
2024-11-19 | MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models | Jianhong Tu et.al. | 2411.10557 | link |
2024-11-15 | Everything is a Video: Unifying Modalities through Next-Frame Prediction | G. Thomas Hudson et.al. | 2411.10503 | null |
2024-11-15 | Weakly-Supervised Multimodal Learning on MIMIC-CXR | Andrea Agostini et.al. | 2411.10356 | null |
2024-11-15 | Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era | Thanh Tam Nguyen et.al. | 2411.09955 | link |
2024-11-14 | Cross-Modal Consistency in Multimodal Large Language Models | Xiang Zhang et.al. | 2411.09273 | null |
2024-11-14 | SmartInv: Multimodal Learning for Smart Contract Invariant Inference | Sally Junsong Wang et.al. | 2411.09217 | null |
2024-11-13 | Multimodal Object Detection using Depth and Image Data for Manufacturing Parts | Nazanin Mahjourian et.al. | 2411.09062 | null |
2024-11-13 | Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions | Moran Yanuka et.al. | 2411.09018 | null |
2024-11-13 | AstroM$^3$: A self-supervised multimodal model for astronomy | Mariia Rizhko et.al. | 2411.08842 | null |
2024-11-13 | Multimodal Instruction Tuning with Hybrid State Space Models | Jianing Zhou et.al. | 2411.08840 | null |
2024-11-13 | Retrieval Augmented Recipe Generation | Guoshan Liu et.al. | 2411.08715 | null |
2024-11-12 | DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection | Shawn Li et.al. | 2411.08227 | link |
2024-11-12 | Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer’s Disease | Francesco Chiumento et.al. | 2411.07871 | null |
2024-11-12 | SparrowVQE: Visual Question Explanation for Course Content Understanding | Jialu Li et.al. | 2411.07516 | link |
2024-11-12 | BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions | Anas Awadalla et.al. | 2411.07461 | null |
2024-11-11 | Multimodal Fusion Balancing Through Game-Theoretic Regularization | Konstantinos Kontras et.al. | 2411.07335 | null |
2024-11-11 | OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision | Cong Wei et.al. | 2411.07199 | null |
2024-11-09 | M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework | Yew Ken Chia et.al. | 2411.06176 | null |
2024-11-09 | An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models | Fatemeh Shiri et.al. | 2411.06048 | link |
2024-11-08 | Towards Low-Resource Harmful Meme Detection with LMM Agents | Jianzhao Huang et.al. | 2411.05383 | link |
2024-11-08 | Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation | Dong Shu et.al. | 2411.05316 | link |
2024-11-07 | HourVideo: 1-Hour Video-Language Understanding | Keshigeyan Chandrasegaran et.al. | 2411.04998 | null |
2024-11-07 | VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos | Shehan Munasinghe et.al. | 2411.04923 | null |
2024-11-07 | Exploring Hierarchical Molecular Graph Representation in Multimodal LLMs | Chengxin Hu et.al. | 2411.04708 | null |
2024-11-06 | AutoGameUI: Constructing High-Fidelity Game UIs via Multimodal Learning and Interactive Web-Based Tool | Zhongliang Tang et.al. | 2411.03709 | null |
2024-11-05 | MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning | Ziliang Gan et.al. | 2411.03314 | null |
2024-11-05 | HumanVLM: Foundation for Human-Scene Vision-Language Model | Dawei Dai et.al. | 2411.03034 | null |
2024-11-05 | Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning | Mingcheng Li et.al. | 2411.02793 | null |
2024-11-11 | INQUIRE: A Natural World Text-to-Image Retrieval Benchmark | Edward Vendrow et.al. | 2411.02537 | link |
2024-11-04 | See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers | Jiaxin Zhuang et.al. | 2411.02465 | null |
2024-11-07 | TableGPT2: A Large Multimodal Model with Tabular Data Integration | Aofeng Su et.al. | 2411.02059 | link |
2024-11-04 | Foundations and Recent Trends in Multimodal Mobile Agents: A Survey | Biao Wu et.al. | 2411.02006 | link |
2024-11-04 | KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension | Jie Yang et.al. | 2411.01846 | null |
2024-11-03 | EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark | Ming Li et.al. | 2411.01492 | null |
2024-11-03 | Classifier-guided Gradient Modulation for Enhanced Multimodal Learning | Zirun Guo et.al. | 2411.01409 | link |
2024-11-02 | LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding | Jian Chen et.al. | 2411.01106 | null |
2024-11-01 | Text2Freq: Learning Series Patterns from Text via Frequency Domain | Ming-Chih Lo et.al. | 2411.00929 | null |
2024-11-01 | V-LoRA: An Efficient and Flexible System Boosts Vision Applications with LoRA LMM | Liang Mi et.al. | 2411.00915 | null |
2024-11-01 | Analyzing Multimodal Integration in the Variational Autoencoder from an Information-Theoretic Perspective | Carlotta Langer et.al. | 2411.00522 | null |
2024-10-31 | TurtleBench: A Visual Programming Benchmark in Turtle Geometry | Sina Rismanchian et.al. | 2411.00264 | link |
2024-10-31 | ResiDual Transformer Alignment with Spectral Decomposition | Lorenzo Basile et.al. | 2411.00246 | null |
2024-10-31 | Nearest Neighbor Normalization Improves Multimodal Retrieval | Neil Chowdhury et.al. | 2410.24114 | link |
2024-11-04 | AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents | Yifan Xu et.al. | 2410.24024 | link |
2024-10-31 | Audio Is the Achilles’ Heel: Red Teaming Audio Large Multimodal Models | Hao Yang et.al. | 2410.23861 | null |
2024-10-30 | CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP | Tianyu Yang et.al. | 2410.23330 | null |
2024-10-30 | EMMA: End-to-End Multimodal Model for Autonomous Driving | Jyh-Jing Hwang et.al. | 2410.23262 | null |
2024-10-29 | ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding | Kimihiro Hasegawa et.al. | 2410.22211 | link |
2024-10-29 | Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications | Monica Riedler et.al. | 2410.21943 | link |
2024-10-28 | AiSciVision: A Framework for Specializing Large Multimodal Models in Scientific Image Classification | Brendan Hogan et.al. | 2410.21480 | link |
2024-10-27 | Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse | Ryan Liu et.al. | 2410.21333 | null |
2024-10-28 | IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks | Manjunath D et.al. | 2410.20953 | link |
2024-10-27 | Generator Matching: Generative modeling with arbitrary Markov processes | Peter Holderrieth et.al. | 2410.20587 | null |
2024-10-27 | PaPaGei: Open Foundation Models for Optical Physiological Signals | Arvind Pillai et.al. | 2410.20542 | link |
2024-10-25 | Turn-by-Turn Indoor Navigation for the Visually Impaired | Santosh Srinivasaiah et.al. | 2410.19954 | null |
2024-10-25 | A Multimodal Approach For Endoscopic VCE Image Classification Using BiomedCLIP-PubMedBERT | Nagarajan Ganapathy et.al. | 2410.19944 | link |
2024-10-25 | OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization | Hongliang He et.al. | 2410.19609 | link |
2024-10-24 | Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant | Abhirama Subramanyam Penamakuri et.al. | 2410.19144 | link |
2024-10-24 | VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks | Lawrence Jang et.al. | 2410.19100 | null |
2024-10-24 | CAMEL-Bench: A Comprehensive Arabic LMM Benchmark | Sara Ghaboura et.al. | 2410.18976 | link |
2024-10-24 | Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques | David Ortiz-Perez et.al. | 2410.18972 | null |
2024-10-24 | OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning | Xiaoqiang Wang et.al. | 2410.18963 | null |
2024-10-24 | A Survey of Multimodal Sarcasm Detection | Shafkat Farabi et.al. | 2410.18882 | null |
2024-10-27 | R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models | Linger Deng et.al. | 2410.17885 | link |
2024-10-22 | JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation | Shota Onohara et.al. | 2410.17250 | null |
2024-10-22 | An Eye for an AI: Evaluating GPT-4o’s Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions | Tony Haoran Feng et.al. | 2410.16991 | null |
2024-10-21 | DocEdit-v2: Document Structure Editing Via Multimodal LLM Grounding | Manan Suri et.al. | 2410.16472 | null |
2024-10-21 | Promoting cross-modal representations to improve multimodal foundation models for physiological signals | Ching Fang et.al. | 2410.16424 | null |
2024-10-22 | Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance | Zhangwei Gao et.al. | 2410.16261 | link |
2024-10-22 | MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report | Samrajya Thapa et.al. | 2410.16239 | link |
2024-10-21 | Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models | Yufei Zhan et.al. | 2410.16163 | link |
2024-10-21 | LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Ruikun Zhang et.al. | 2410.16095 | link |
2024-10-21 | How to Build a Pre-trained Multimodal model for Simultaneously Chatting and Decision-making? | Zuojin Tang et.al. | 2410.15885 | null |
2024-10-21 | Multimodal Learning for Embryo Viability Prediction in Clinical IVF | Junsik Kim et.al. | 2410.15581 | null |
2024-10-20 | IPO: Interpretable Prompt Optimization for Vision-Language Models | Yingjun Du et.al. | 2410.15397 | link |
2024-10-20 | Modality-Fair Preference Optimization for Trustworthy MLLM Alignment | Songtao Jiang et.al. | 2410.15334 | null |
2024-10-19 | ChitroJera: A Regionally Relevant Visual Question Answering Dataset for Bangla | Deeparghya Dutta Barua et.al. | 2410.14991 | null |
2024-10-19 | SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation | Junda Wang et.al. | 2410.14948 | link |
2024-10-18 | Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension | Yin Xie et.al. | 2410.14332 | link |
2024-10-18 | Personalized Image Generation with Large Multimodal Models | Yiyan Xu et.al. | 2410.14170 | null |
2024-10-18 | Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents | Sabit Hassan et.al. | 2410.14141 | null |
2024-10-17 | Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation | Chengyue Wu et.al. | 2410.13848 | link |
2024-10-18 | Harnessing Webpage UIs for Text-Rich Visual Understanding | Junpeng Liu et.al. | 2410.13824 | null |
2024-10-17 | Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR | Abhishek Gupta et.al. | 2410.13445 | null |
2024-10-16 | The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio | Sicong Leng et.al. | 2410.12787 | null |
2024-10-16 | HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks | Fengji Zhang et.al. | 2410.12381 | link |
2024-10-15 | CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning | Qingqing Cao et.al. | 2410.11963 | null |
2024-10-15 | Generalizable Spacecraft Trajectory Generation via Multimodal Learning with Transformers | Davide Celestini et.al. | 2410.11723 | null |
2024-10-15 | Unveiling the Mystery of Visual Attributes of Concrete and Abstract Concepts: Variability, Nearest Neighbors, and Challenging Categories | Tarun Tater et.al. | 2410.11657 | link |
2024-10-15 | On-the-fly Modulation for Balanced Multimodal Learning | Yake Wei et.al. | 2410.11582 | link |
2024-10-15 | Enhancing Unimodal Latent Representations in Multimodal VAEs through Iterative Amortized Inference | Yuta Oshima et.al. | 2410.11403 | null |
2024-10-14 | Saliency Guided Optimization of Diffusion Latents | Xiwen Wang et.al. | 2410.10257 | null |
2024-10-14 | MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models | Peng Xia et.al. | 2410.10139 | link |
2024-10-13 | LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models | Junyan Ye et.al. | 2410.09732 | null |
2024-10-12 | Reconstructive Visual Instruction Tuning | Haochen Wang et.al. | 2410.09575 | null |
2024-10-11 | Can GPTs Evaluate Graphic Design Based on Design Principles? | Daichi Haraguchi et.al. | 2410.08885 | null |
2024-10-11 | VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding | Houlun Chen et.al. | 2410.08593 | link |
2024-10-10 | ElasticTok: Adaptive Tokenization for Image and Video | Wilson Yan et.al. | 2410.08368 | null |
2024-10-10 | Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts | Sukwon Yun et.al. | 2410.08245 | link |
2024-10-10 | LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts | Anh-Quan Cao et.al. | 2410.08211 | null |
2024-10-10 | Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision | Shengcao Cao et.al. | 2410.08209 | null |
2024-10-10 | MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models | Wenbo Hu et.al. | 2410.08182 | null |
2024-10-10 | Generated Bias: Auditing Internal Bias Dynamics of Text-To-Image Generative Models | Abhishek Mandal et.al. | 2410.07884 | null |
2024-10-09 | The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks | Isaac R. Galatzer-Levy et.al. | 2410.07391 | null |
2024-10-12 | Deep Correlated Prompting for Visual Recognition with Missing Modalities | Lianyu Hu et.al. | 2410.06558 | link |
2024-10-11 | Chip-Tuning: Classify Before Language Models Say | Fangwei Zhu et.al. | 2410.06541 | link |
2024-10-09 | Does Spatial Cognition Emerge in Frontier Models? | Santhosh Kumar Ramakrishnan et.al. | 2410.06468 | null |
2024-10-08 | Multimodal Representation Learning using Adaptive Graph Construction | Weichen Huang et.al. | 2410.06395 | null |
2024-10-08 | Temporal Image Caption Retrieval Competition – Description and Results | Jakub Pokrywka et.al. | 2410.06314 | null |
2024-10-08 | PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling | Xudong Xie et.al. | 2410.05970 | link |
2024-10-08 | ModalPrompt: Dual-Modality Guided Prompt for Continual Learning of Large Multimodal Models | Fanhu Zeng et.al. | 2410.05849 | null |
2024-10-08 | Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond | Soyeon Caren Han et.al. | 2410.05608 | link |
2024-10-08 | TeaserGen: Generating Teasers for Long Documentaries | Weihan Xu et.al. | 2410.05586 | null |
2024-10-07 | R-Bench: Are your Large Multimodal Model Robust to Real-world Corruptions? | Chunyi Li et.al. | 2410.05474 | link |
2024-10-07 | RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction | Yuwei Zhang et.al. | 2410.05361 | null |
2024-10-07 | Patch is Enough: Naturalistic Adversarial Patch against Vision-Language Pre-training Models | Dehong Kong et.al. | 2410.04884 | null |
2024-10-06 | VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models | Harshit et.al. | 2410.04609 | null |
2024-10-06 | UniMuMo: Unified Text, Music and Motion Generation | Han Yang et.al. | 2410.04534 | null |
2024-10-08 | Gamified crowd-sourcing of high-quality data for visual fine-tuning | Shashank Yadav et.al. | 2410.04038 | null |
2024-10-07 | Multimodal Point-of-Interest Recommendation | Yuta Kanzawa et.al. | 2410.03265 | null |
2024-10-04 | Bridging the Gap between Text, Audio, Image, and Any Sequence: A Novel Approach using Gloss-based Annotation | Sen Fang et.al. | 2410.03146 | null |
2024-10-04 | AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark | Wenhao Chai et.al. | 2410.03051 | null |
2024-10-07 | CPFD: Confidence-aware Privileged Feature Distillation for Short Video Classification | Jinghao Shi et.al. | 2410.03038 | null |
2024-10-07 | MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection | Niki Nezakati et.al. | 2410.03010 | null |
2024-10-03 | Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos | Jianrui Zhang et.al. | 2410.02763 | null |
2024-10-03 | Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models | Zhengfeng Lai et.al. | 2410.02740 | null |
2024-10-04 | Video Instruction Tuning With Synthetic Data | Yuanhan Zhang et.al. | 2410.02713 | null |
2024-10-03 | LLaVA-Critic: Learning to Evaluate Multimodal Models | Tianyi Xiong et.al. | 2410.02712 | null |
2024-10-03 | Plots Unlock Time-Series Understanding in Multimodal Models | Mayank Daswani et.al. | 2410.02637 | null |
2024-10-02 | Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations | Minoh Jeong et.al. | 2410.02086 | null |
2024-10-02 | Toward a Holistic Evaluation of Robustness in CLIP Models | Weijie Tu et.al. | 2410.01534 | null |
2024-10-02 | SHAP-CAT: A interpretable multi-modal framework enhancing WSI classification via virtual staining and shapley-value-based multimodal fusion | Jun Wang et.al. | 2410.01408 | null |
2024-10-02 | Backdooring Vision-Language Models with Out-Of-Distribution Data | Weimin Lyu et.al. | 2410.01264 | null |
2024-10-02 | OCC-MLLM: Empowering Multimodal Large Language Model For the Understanding of Occluded Objects | Wenmo Qiu et.al. | 2410.01261 | null |
2024-09-30 | Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning | Weitai Kang et.al. | 2410.00255 | link |
2024-09-30 | Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information | Hyeongdon Moon et.al. | 2409.20167 | link |
2024-10-02 | Visual Context Window Extension: A New Perspective for Long Video Understanding | Hongchen Wei et.al. | 2409.20018 | null |
2024-09-30 | Towards Robust Multimodal Sentiment Analysis with Incomplete Data | Haoyu Zhang et.al. | 2409.20012 | link |
2024-09-28 | FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models | Diego A. B. Moreira et.al. | 2409.19474 | link |
2024-09-28 | From Unimodal to Multimodal: Scaling up Projectors to Align Modalities | Mayug Maniparambil et.al. | 2409.19425 | null |
2024-10-02 | CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Jihai Zhang et.al. | 2409.19291 | link |
2024-09-28 | TrojVLM: Backdoor Attack Against Vision Language Models | Weimin Lyu et.al. | 2409.19232 | null |
2024-09-27 | Multimodal Markup Document Models for Graphic Design Completion | Kotaro Kikuchi et.al. | 2409.19051 | null |
2024-09-27 | Emu3: Next-Token Prediction is All You Need | Xinlong Wang et.al. | 2409.18869 | null |
2024-09-27 | Data Analysis in the Era of Generative AI | Jeevana Priya Inala et.al. | 2409.18475 | null |
2024-09-26 | MultiClimate: Multimodal Stance Detection on Climate Change Videos | Jiawen Wang et.al. | 2409.18346 | link |
2024-09-26 | LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness | Chenming Zhu et.al. | 2409.18125 | null |
2024-09-26 | GSON: A Group-based Social Navigation Framework with Large Multimodal Model | Shangyi Luo et.al. | 2409.18084 | null |
2024-09-26 | A Multimodal Single-Branch Embedding Network for Recommendation in Cold-Start and Missing Modality Scenarios | Christian Ganhör et.al. | 2409.17864 | link |
2024-09-26 | Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification | Raja Kumar et.al. | 2409.17777 | link |
2024-09-26 | MIO: A Foundation Model on Multimodal Tokens | Zekun Wang et.al. | 2409.17692 | link |
2024-09-25 | Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models | Matt Deitke et.al. | 2409.17146 | null |
2024-09-24 | CDChat: A Large Multimodal Model for Remote Sensing Change Description | Mubashir Noman et.al. | 2409.16261 | link |
2024-09-24 | CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation | Fuxian Huang et.al. | 2409.15806 | null |
2024-09-18 | Recommendation with Generative Models | Yashar Deldjoo et.al. | 2409.15173 | null |
2024-09-23 | With Ears to See and Eyes to Hear: Sound Symbolism Experiments with Multimodal Large Language Models | Tyler Loakman et.al. | 2409.14917 | link |
2024-09-22 | Patch Ranking: Efficient CLIP by Learning to Rank Local Patches | Cheng-En Wu et.al. | 2409.14607 | null |
2024-09-22 | Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models | Yew Ken Chia et.al. | 2409.14277 | null |
2024-09-20 | Brain-Cognition Fingerprinting via Graph-GCCA with Contrastive Learning | Yixin Wang et.al. | 2409.13887 | null |
2024-09-20 | Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model | Li Zhou et.al. | 2409.13407 | link |
2024-09-20 | A Novel Adaptive Fine-Tuning Algorithm for Multimodal Models: Self-Optimizing Classification and Selection of High-Quality Datasets in Remote Sensing | Yi Ren et.al. | 2409.13345 | null |
2024-09-20 | ChemDFM-X: Towards Large Multimodal Model for Chemistry | Zihan Zhao et.al. | 2409.13194 | null |
2024-09-19 | MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines | Dongzhi Jiang et.al. | 2409.12959 | null |
2024-09-24 | TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation | Junjie Wen et.al. | 2409.12514 | null |
2024-09-18 | Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution | Peng Wang et.al. | 2409.12191 | link |
2024-09-18 | All-in-one foundational models learning across quantum chemical levels | Yuxinxin Chen et.al. | 2409.12015 | link |
2024-09-18 | LMMCoDrive: Cooperative Driving with Large Multimodal Model | Haichao Liu et.al. | 2409.11981 | link |
2024-09-16 | MusicLIME: Explainable Multimodal Music Understanding | Theodoros Sotirou et.al. | 2409.10496 | link |
2024-09-19 | IRIS: Interactive Responsive Intelligent Segmentation for 3D Affordance Analysis | Meng Chu et.al. | 2409.10078 | null |
2024-09-16 | AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing | Huawei Ji et.al. | 2409.10016 | link |
2024-09-14 | Keypoints-Integrated Instruction-Following Data Generation for Enhanced Human Pose Understanding in Multimodal Models | Dewen Zhang et.al. | 2409.09306 | null |
2024-09-13 | Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing | Minh-Duc Vu et.al. | 2409.08885 | null |
2024-09-13 | A Multimodal Approach for Fluid Overload Prediction: Integrating Lung Ultrasound and Clinical Data | Tianqi Yang et.al. | 2409.08790 | null |
2024-09-13 | Dynamics of Collective Group Affect: Group-level Annotations and the Multimodal Modeling of Convergence and Divergence | Navin Raj Prabhu et.al. | 2409.08578 | null |
2024-09-13 | A Comprehensive Survey on Deep Multimodal Learning with Missing Modality | Renjie Wu et.al. | 2409.07825 | null |
2024-09-12 | Top-down Activity Representation Learning for Video Question Answering | Yanan Wang et.al. | 2409.07748 | null |
2024-09-11 | What to align in multimodal contrastive learning? | Benoit Dufumier et.al. | 2409.07402 | null |
2024-09-11 | MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis | Hanyu Jiang et.al. | 2409.07129 | null |
2024-09-11 | FSMDet: Vision-guided feature diffusion for fully sparse 3D detector | Tianran Liu et.al. | 2409.06945 | null |
2024-09-16 | Scaling Law Hypothesis for Multimodal Model | Qingyun Sun et.al. | 2409.06754 | null |
2024-09-10 | Multiclass Arrhythmia Classification using Smartwatch Photoplethysmography Signals Collected in Real-life Settings | Dong Han et.al. | 2409.06147 | null |
2024-09-11 | A Survey of Multimodal Composite Editing and Retrieval | Suyan Li et.al. | 2409.05405 | link |
2024-09-05 | Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis | Xianbing Zhao et.al. | 2409.04473 | null |
2024-09-06 | Generating Faithful and Salient Text from Multimodal Data | Tahsina Hashem et.al. | 2409.03961 | link |
2024-09-06 | CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Wentao Liu et.al. | 2409.02834 | link |
2024-09-10 | MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark | Xiang Yue et.al. | 2409.02813 | null |
2024-09-04 | Understanding eGFR Trajectories and Kidney Function Decline via Large Multimodal Models | Chih-Yuan Li et.al. | 2409.02530 | null |
2024-09-03 | Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models | Bin Fu et.al. | 2409.01560 | null |
2024-09-03 | Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition | Yaozong Gan et.al. | 2409.01534 | null |
2024-09-02 | Towards General Industrial Intelligence: A Survey on IIoT-Enhanced Continual Large Models | Jiao Chen et.al. | 2409.01207 | null |
2024-09-02 | Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information | Yi Chen et.al. | 2409.01179 | null |
2024-08-31 | Comparative Analysis of Modality Fusion Approaches for Audio-Visual Person Identification and Verification | Aref Farhadipour et.al. | 2409.00562 | null |
2024-08-30 | UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios | Baichuan Zhou et.al. | 2408.17267 | null |
2024-08-29 | Seeking the Sufficiency and Necessity Causal Features in Multimodal Representation Learning | Boyu Chen et.al. | 2408.16577 | null |
2024-08-29 | Toward Robust Early Detection of Alzheimer’s Disease via an Integrated Multimodal Learning Approach | Yifei Chen et.al. | 2408.16343 | link |
2024-08-28 | Meta-Learn Unimodal Signals with Weak Supervision for Multimodal Sentiment Analysis | Sijie Mai et.al. | 2408.16029 | null |
2024-08-28 | ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation | Tiantian Feng et.al. | 2408.15803 | null |
2024-08-28 | Visual Prompt Engineering for Medical Vision Language Models in Radiology | Stefan Denner et.al. | 2408.15802 | null |
2024-08-27 | X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation | Hanjia Lyu et.al. | 2408.15172 | null |
2024-08-27 | The Benefits of Balance: From Information Projections to Variance Reduction | Lang Liu et.al. | 2408.15065 | null |
2024-08-27 | NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework | Shuangchen Zhao et.al. | 2408.14950 | null |
2024-08-26 | MMR: Evaluating Reading Ability of Large Multimodal Models | Jian Chen et.al. | 2408.14594 | null |
2024-09-03 | Foundation Models for Music: A Survey | Yinghao Ma et.al. | 2408.14340 | link |
2024-08-26 | LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models | Qihang Ge et.al. | 2408.14008 | null |
2024-08-27 | Quantum Multimodal Contrastive Learning Framework | Chi-Sheng Chen et.al. | 2408.13919 | null |
2024-08-25 | Tangram: A Challenging Benchmark for Geometric Element Recognizing | Jiamin Tang et.al. | 2408.13854 | null |
2024-08-25 | Multimodal Ensemble with Conditional Feature Fusion for Dysgraphia Diagnosis in Children from Handwriting Samples | Jayakanth Kunhoth et.al. | 2408.13754 | null |
2024-08-24 | Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models | Sakhinana Sagar Srinivas et.al. | 2408.13621 | null |
2024-08-23 | Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption | Sakhinana Sagar Srinivas et.al. | 2408.13248 | null |
2024-08-23 | Indoor scene recognition from images under visual corruptions | Willams de Lima Costa et.al. | 2408.13029 | null |
2024-08-23 | Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition | Cam-Van Thi Nguyen et.al. | 2408.12895 | null |
2024-08-23 | Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey | Qika Lin et.al. | 2408.12880 | link |
2024-08-22 | Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models | Jean Park et.al. | 2408.12763 | null |
2024-08-22 | Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization | Luyao Cheng et.al. | 2408.12102 | null |
2024-08-22 | Mental-Perceiver: Audio-Textual Multimodal Learning for Mental Health Assessment | Jinghui Qin et.al. | 2408.12088 | null |
2024-08-21 | GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models | Jonathan Roberts et.al. | 2408.11817 | null |
2024-08-21 | D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models | M. Forlini et.al. | 2408.11761 | null |
2024-08-21 | UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation | Xiangyu Zhao et.al. | 2408.11305 | link |
2024-08-21 | BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation | Haotian Peng et.al. | 2408.11281 | link |
2024-08-20 | Exploring the use of Generative AI to Support Automated Just-in-Time Programming for Visual Scene Displays | Cynthia Zastudil et.al. | 2408.11137 | null |
2024-08-21 | SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition | Zebang Cheng et.al. | 2408.10500 | link |
2024-08-19 | Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting | Yun-Da Tsai et.al. | 2408.09798 | null |
2024-08-19 | Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation | Yunxin Li et.al. | 2408.09787 | link |
2024-08-18 | PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding | Dawei Dai et.al. | 2408.09530 | link |
2024-08-17 | Measuring Visual Sycophancy in Multimodal Models | Jaehyuk Lim et.al. | 2408.09111 | link |
2024-08-16 | AdaRank: Disagreement Based Module Rank Prediction for Low-rank Adaptation | Yihe Dong et.al. | 2408.09015 | link |
2024-08-16 | xGen-MM (BLIP-3): A Family of Open Large Multimodal Models | Le Xue et.al. | 2408.08872 | null |
2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | Jinming Liu et.al. | 2408.08575 | null |
2024-08-15 | LLaVA-Surg: Towards Multimodal Surgical Assistant via Structured Surgical Video Learning | Jiajie Li et.al. | 2408.07981 | null |
2024-08-15 | MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Minxuan Zhou et.al. | 2408.07543 | link |
2024-08-14 | Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach | Muhammad Saad Saeed et.al. | 2408.07445 | null |
2024-08-14 | Robust Semi-supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration | Xiaogen Zhou et.al. | 2408.07341 | link |
2024-08-14 | Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion | Peiyuan Chen et.al. | 2408.07303 | null |
2024-08-13 | PathInsight: Instruction Tuning of Multimodal Datasets and Models for Intelligence Assisted Diagnosis in Histopathology | Xiaomin Wu et.al. | 2408.07037 | null |
2024-08-13 | EditScribe: Non-Visual Image Editing with Natural Language Verification Loops | Ruei-Che Chang et.al. | 2408.06632 | null |
2024-08-13 | CROME: Cross-Modal Adapters for Efficient Multimodal LLM | Sayna Ebrahimi et.al. | 2408.06610 | null |
2024-08-13 | Prioritizing Modalities: Flexible Importance Scheduling in Federated Multimodal Learning | Jieming Bian et.al. | 2408.06549 | null |
2024-08-12 | VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents | Xiao Liu et.al. | 2408.06327 | link |
2024-08-11 | HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes | Xuanyu Su et.al. | 2408.05794 | null |
2024-08-08 | Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles using LLMs and LMMs | Aliki Anagnostopoulou et.al. | 2408.04331 | null |
2024-08-06 | LLaVA-OneVision: Easy Visual Task Transfer | Bo Li et.al. | 2408.03326 | link |
2024-08-06 | Multitask and Multimodal Neural Tuning for Large Models | Hao Sun et.al. | 2408.03001 | null |
2024-08-06 | Body of Her: A Preliminary Study on End-to-End Humanoid Agent | Tenglong Ao et.al. | 2408.02879 | null |
2024-08-04 | Distribution-Level Memory Recall for Continual Learning: Preserving Knowledge and Avoiding Confusion | Shaoxu Cheng et.al. | 2408.02695 | null |
2024-08-02 | A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications | Valerio Guarrasi et.al. | 2408.02686 | null |
2024-08-05 | REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models | Agneet Chatterjee et.al. | 2408.02231 | null |
2024-08-04 | CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization | Xiang He et.al. | 2408.01952 | link |
2024-08-02 | MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models | Benno Weck et.al. | 2408.01337 | link |
2024-08-05 | Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions | Jin Gao et.al. | 2408.01091 | link |
2024-08-02 | GraphAge: Unleashing the power of Graph Neural Network to Decode Epigenetic Aging | Saleh Sakib Ahmed et.al. | 2408.00984 | link |
2024-08-01 | MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities | Weihao Yu et.al. | 2408.00765 | link |
2024-08-01 | GalleryGPT: Analyzing Paintings with Large Multimodal Models | Yi Bin et.al. | 2408.00491 | link |
2024-08-01 | Everything We Hear: Towards Tackling Misinformation in Podcasts | Sachin Pathiyan Cherumanal et.al. | 2408.00292 | null |
2024-08-01 | OmniParser for Pure Vision Based GUI Agent | Yadong Lu et.al. | 2408.00203 | null |
2024-07-30 | Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection | Jinfa Huang et.al. | 2407.21004 | null |
2024-07-30 | HyperMM: Robust Multimodal Learning with Varying-sized Inputs | Hava Chaptoukaev et.al. | 2407.20768 | null |
2024-07-30 | Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos | Dhruv Verma et.al. | 2407.20642 | link |
2024-07-29 | Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter | Chao Liu et.al. | 2407.19981 | null |
2024-07-29 | ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2 | Wenjun Huang et.al. | 2407.19832 | null |
2024-08-02 | XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training | Biao Wu et.al. | 2407.19546 | link |
2024-07-28 | Detached and Interactive Multimodal Learning | Yunfeng Fan et.al. | 2407.19514 | link |
2024-07-27 | Data Processing Techniques for Modern Multimodal Models | Yinheng Li et.al. | 2407.19180 | null |
2024-07-26 | MangaUB: A Manga Understanding Benchmark for Large Multimodal Models | Hikaru Ikuta et.al. | 2407.19034 | null |
2024-07-26 | Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment | Yuze Zheng et.al. | 2407.18854 | null |
2024-07-26 | ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema | Fei Wang et.al. | 2407.18716 | null |
2024-07-25 | Sparse vs Contiguous Adversarial Pixel Perturbations in Multimodal Models: An Empirical Analysis | Cristian-Alexandru Botocan et.al. | 2407.18251 | link |
2024-07-25 | $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs | Vlad Sobal et.al. | 2407.18134 | null |
2024-07-25 | Cross-Vendor Reproducibility of Radiomics-based Machine Learning Models for Computer-aided Diagnosis | Jatin Chaudhary et.al. | 2407.18060 | null |
2024-07-25 | What does Kiki look like? Cross-modal associations between speech sounds and visual shapes in vision-and-language models | Tessa Verhoef et.al. | 2407.17974 | null |
2024-07-25 | Shapley Value-based Contrastive Alignment for Multimodal Information Extraction | Wen Luo et.al. | 2407.17854 | null |
2024-07-25 | Enhancing Model Performance: Another Approach to Vision-Language Instruction Tuning | Vedanshu et.al. | 2407.17813 | null |
2024-07-25 | KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models | Eunice Yiu et.al. | 2407.17773 | link |
2024-07-24 | Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles | Zuoyin Tang et.al. | 2407.17211 | null |
2024-07-23 | Chameleon: Images Are What You Need For Multimodal Learning Robust To Missing Modalities | Muhammad Irzam Liaqat et.al. | 2407.16243 | null |
2024-07-22 | LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding | Haoning Wu et.al. | 2407.15754 | link |
2024-07-22 | Resource-Efficient Federated Multimodal Learning via Layer-wise and Progressive Training | Ye Lin Tun et.al. | 2407.15426 | null |
2024-07-21 | VideoGameBunny: Towards vision assistants for video games | Mohammad Reza Taesiri et.al. | 2407.15295 | null |
2024-07-22 | Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer’s Disease classification | Lisa Anita De Santi et.al. | 2407.14277 | link |
2024-07-18 | Visual Haystacks: Answering Harder Questions About Sets of Images | Tsung-Han Wu et.al. | 2407.13766 | link |
2024-07-17 | Text- and Feature-based Models for Compound Multimodal Emotion Recognition in the Wild | Nicolas Richet et.al. | 2407.12927 | link |
2024-07-16 | ChatBCG: Can AI Read Your Slide Deck? | Nikita Singh et.al. | 2407.12875 | null |
2024-07-17 | LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models | Kaichen Zhang et.al. | 2407.12772 | link |
2024-07-17 | Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models | Donggeun Kim et.al. | 2407.12616 | null |
2024-07-17 | E5-V: Universal Embeddings with Multimodal Large Language Models | Ting Jiang et.al. | 2407.12580 | link |
2024-07-16 | FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models | Pengxiang Li et.al. | 2407.11522 | null |
2024-07-16 | COMET: “Cone of experience” enhanced large multimodal model for mathematical problem generation | Sannyuya Liu et.al. | 2407.11315 | null |
2024-07-15 | OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models | Zijian Zhou et.al. | 2407.11213 | link |
2024-07-15 | FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries | Yuqi Jiang et.al. | 2407.10810 | null |
2024-07-15 | Scaling 3D Reasoning with LMMs to Large Robot Mission Environments Using Datagraphs | W. J. Meijer et.al. | 2407.10743 | null |
2024-07-16 | Qwen2 Technical Report | An Yang et.al. | 2407.10671 | link |
2024-07-15 | How and where does CLIP process negation? | Vincent Quantmeyer et.al. | 2407.10488 | null |
2024-07-12 | Diagnosing and Re-learning for Balanced Multimodal Learning | Yake Wei et.al. | 2407.09705 | link |
2024-07-12 | Unifying Sequences, Structures, and Descriptions for Any-to-Any Protein Generation with the Large Multimodal Model HelixProtX | Zhiyuan Chen et.al. | 2407.09274 | link |
2024-07-12 | DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training | Chen Xin et.al. | 2407.09174 | link |
2024-07-11 | Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design | Jingyi Xie et.al. | 2407.08882 | null |
2024-07-10 | RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization | Xijie Huang et.al. | 2407.08044 | link |
2024-07-10 | LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models | Feng Li et.al. | 2407.07895 | link |
2024-07-11 | InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior | Chenguo Lin et.al. | 2407.07580 | null |
2024-07-10 | Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model | Wenqi Zhang et.al. | 2407.07053 | link |
2024-07-08 | ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation | Ethan Chern et.al. | 2407.06135 | link |
2024-07-07 | Multimodal Language Models for Domain-Specific Procedural Video Summarization | Nafisa Hussain et.al. | 2407.05419 | null |
2024-07-07 | Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition | Zirun Guo et.al. | 2407.05374 | link |
2024-07-06 | Enhance the Robustness of Text-Centric Multimodal Alignments | Ting-Yu Yen et.al. | 2407.05036 | null |
2024-07-06 | Completed Feature Disentanglement Learning for Multimodal MRIs Analysis | Tianling Liu et.al. | 2407.04916 | null |
2024-07-06 | MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension | Zekun Li et.al. | 2407.04903 | link |
2024-07-05 | VCoME: Verbal Video Composition with Multimodal Editing Effects | Weibo Gong et.al. | 2407.04697 | null |
2024-07-05 | Multimodal Classification via Modal-Aware Interactive Enhancement | Qing-Yuan Jiang et.al. | 2407.04587 | null |
2024-07-05 | Robust Multimodal Learning via Representation Decoupling | Shicai Wei et.al. | 2407.04458 | null |
2024-07-05 | Smart Vision-Language Reasoners | Denisa Roberts et.al. | 2407.04212 | link |
2024-07-04 | Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks | Amit Parekh et.al. | 2407.03967 | link |
2024-07-04 | ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities | Julie Mordacq et.al. | 2407.03836 | link |
2024-07-04 | M5 – A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks | Florian Schneider et.al. | 2407.03791 | null |
2024-07-03 | HEMM: Holistic Evaluation of Multimodal Foundation Models | Paul Pu Liang et.al. | 2407.03418 | link |
2024-07-02 | Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties | Srivathsan Badrinarayanan et.al. | 2407.03380 | link |
2024-07-02 | Understanding Alignment in Multimodal LLMs: A Comprehensive Study | Elmira Amirloo et.al. | 2407.02477 | null |
2024-07-02 | Synthetic Multimodal Question Generation | Ian Wu et.al. | 2407.02233 | null |
2024-07-02 | Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models | Anjishnu Mukherjee et.al. | 2407.02067 | link |
2024-07-01 | Empathic Grounding: Explorations using Multimodal Interaction and Large Language Models with Conversational Agents | Mehdi Arjmand et.al. | 2407.01824 | link |
2024-07-01 | We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? | Runqi Qiao et.al. | 2407.01284 | link |
2024-07-01 | Unaligning Everything: Or Aligning Any Text to Any Image in Multimodal Models | Shaeke Salman et.al. | 2407.01157 | null |
2024-06-29 | AI-powered multimodal modeling of personalized hemodynamics in aortic stenosis | Caglar Ozturk et.al. | 2407.00535 | null |
2024-06-29 | MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation | Jinsheng Huang et.al. | 2407.00468 | link |
2024-06-29 | How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models | Jaeyoung Lee et.al. | 2407.00369 | null |
2024-06-28 | PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration | Yuxuan Sun et.al. | 2407.00203 | null |
2024-06-28 | EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | Yuxuan Zhang et.al. | 2406.20076 | link |
2024-06-28 | InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding | Kirolos Ataallah et.al. | 2406.19875 | link |
2024-06-28 | MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis | Jun-Yan He et.al. | 2406.19859 | null |
2024-06-28 | MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment | Jihao Liu et.al. | 2406.19736 | link |
2024-06-28 | Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction | Akash Awasthi et.al. | 2406.19686 | null |
2024-06-28 | SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs | Xin Su et.al. | 2406.19593 | null |
2024-06-27 | OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding | Tao Zhang et.al. | 2406.19389 | null |
2024-06-28 | FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts | Shubhankar Singh et.al. | 2406.19237 | null |
2024-06-27 | RAVEN: Multitask Retrieval Augmented Vision-Language Learning | Varun Nagaraj Rao et.al. | 2406.19150 | null |
2024-06-27 | DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming | Jiaxin Zhang et.al. | 2406.19101 | null |
2024-06-27 | Fairness and Bias in Multimodal AI: A Survey | Tosin Adewumi et.al. | 2406.19097 | null |
2024-06-27 | MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation | Sanggeon Yun et.al. | 2406.18815 | null |
2024-06-26 | MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data | William Berman et.al. | 2406.18790 | null |
2024-06-26 | S3: A Simple Strong Sample-effective Multimodal Dialog System | Elisei Rykov et.al. | 2406.18305 | link |
2024-06-26 | EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models | Chun-Chieh Liao et.al. | 2406.18087 | null |
2024-06-26 | Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs | Uttaran Bhattacharya et.al. | 2406.18068 | null |
2024-06-25 | Human-centered In-building Embodied Delivery Benchmark | Zhuoqun Xu et.al. | 2406.17898 | link |
2024-06-25 | InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation | Jinbin Huang et.al. | 2406.17838 | null |
2024-06-25 | Data curation via joint example selection further accelerates multimodal learning | Talfan Evans et.al. | 2406.17711 | null |
2024-06-25 | Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights | Hao Yang et.al. | 2406.17430 | link |
2024-06-24 | At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models | Dimitrios Tanoglidis et.al. | 2406.17057 | null |
2024-06-24 | Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models | Jierun Chen et.al. | 2406.16866 | link |
2024-06-24 | Long Context Transfer from Language to Vision | Peiyuan Zhang et.al. | 2406.16852 | link |
2024-06-24 | QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds | Ye Wang et.al. | 2406.16578 | null |
2024-06-21 | Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning | Brandon Huang et.al. | 2406.15334 | null |
2024-06-21 | Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models | Jiayu Wang et.al. | 2406.14852 | link |
2024-06-20 | Evaluating vision-capable chatbots in interpreting kinematics graphs: a comparative study of free and subscription-based models | Giulia Polverini et.al. | 2406.14685 | null |
2024-06-20 | Revealing Vision-Language Integration in the Brain with Multimodal Networks | Vighnesh Subramaniam et.al. | 2406.14481 | link |
2024-06-25 | iWISDM: Assessing instruction following in multimodal models at scale | Xiaoxuan Lei et.al. | 2406.14343 | link |
2024-06-20 | Two Giraffes in a Dirt Field: Using Game Play to Investigate Situation Modelling in Large Multimodal Models | Sherzod Hakimov et.al. | 2406.14035 | null |
2024-06-20 | Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning | Yupei Zhang et.al. | 2406.13979 | link |
2024-06-20 | PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents | Junjie Wang et.al. | 2406.13923 | null |
2024-06-19 | Through the Theory of Mind’s Eye: Reading Minds with Multimodal Video Large Language Models | Zhawnen Chen et.al. | 2406.13763 | null |
2024-06-19 | GUI Action Narrator: Where and When Did That Action Take Place? | Qinchen Wu et.al. | 2406.13719 | null |
2024-06-19 | Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor | Veedant Jain et.al. | 2406.13564 | null |
2024-06-19 | VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models | Haowen Hou et.al. | 2406.13362 | link |
2024-06-19 | Learnable In-Context Vector for Visual Question Answering | Yingzhe Peng et.al. | 2406.13185 | link |
2024-06-18 | Synergizing Foundation Models and Federated Learning: A Survey | Shenghui Li et.al. | 2406.12844 | null |
2024-06-18 | OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI | Zhen Huang et.al. | 2406.12753 | link |
2024-06-18 | Disturbing Image Detection Using LMM-Elicited Emotion Embeddings | Maria Tzelepi et.al. | 2406.12668 | null |
2024-06-18 | Automatic benchmarking of large multimodal models via iterative experiment programming | Alessandro Conti et.al. | 2406.12321 | link |
2024-06-18 | Language and Multimodal Models in Sports: A Survey of Datasets and Applications | Haotian Xia et.al. | 2406.12252 | null |
2024-06-17 | VideoLLM-online: Online Video Large Language Model for Streaming Video | Joya Chen et.al. | 2406.11816 | null |
2024-06-17 | LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning | Dantong Niu et.al. | 2406.11815 | null |
2024-06-17 | Multimodal Learning To Improve Segmentation With Intraoperative CBCT & Preoperative CT | Maximilian E. Tschuchnig et.al. | 2406.11650 | null |
2024-06-17 | Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment | Chao Wen et.al. | 2406.11334 | null |
2024-06-17 | VideoVista: A Versatile Benchmark for Video Understanding and Reasoning | Yunxin Li et.al. | 2406.11303 | null |
2024-06-17 | i-SRT: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective Judgment | Daechul Ahn et.al. | 2406.11280 | link |
2024-06-17 | MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens | Anas Awadalla et.al. | 2406.11271 | link |
2024-06-17 | Generative Visual Instruction Tuning | Jefferson Hernandez et.al. | 2406.11262 | link |
2024-06-17 | Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective | Yang Chen et.al. | 2406.11249 | null |
2024-06-16 | Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies | Hung-Ting Su et.al. | 2406.10923 | null |
2024-06-15 | Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model | Lu Xu et.al. | 2406.10484 | link |
2024-06-12 | MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases | Rithesh Murthy et.al. | 2406.10290 | null |
2024-06-14 | VideoGUI: A Benchmark for GUI Automation from Instructional Videos | Kevin Qinghong Lin et.al. | 2406.10227 | null |
2024-06-14 | ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation | Chufan Shi et.al. | 2406.09961 | link |
2024-06-14 | BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval | Imanol Miranda et.al. | 2406.09952 | link |
2024-06-13 | VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding | Muhammad Maaz et.al. | 2406.09418 | link |
2024-06-13 | Explore the Limits of Omni-modal Pretraining at Scale | Yiyuan Zhang et.al. | 2406.09412 | link |
2024-06-14 | 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities | Roman Bachmann et.al. | 2406.09406 | null |
2024-06-13 | Yo’LLaVA: Your Personalized Language and Vision Assistant | Thao Nguyen et.al. | 2406.09400 | null |
2024-06-13 | CMC-Bench: Towards a New Paradigm of Visual Signal Compression | Chunyi Li et.al. | 2406.09356 | link |
2024-06-13 | Comparison Visual Instruction Tuning | Wei Lin et.al. | 2406.09240 | null |
2024-06-13 | Zoom and Shift are All You Need | Jiahao Qin et.al. | 2406.08866 | null |
2024-06-11 | Embedding-based Multimodal Learning on Pan-Squamous Cell Carcinomas for Improved Survival Outcomes | Asim Waqas et.al. | 2406.08521 | null |
2024-06-14 | Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models | Yi-Fan Zhang et.al. | 2406.08487 | link |
2024-06-13 | OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text | Qingyun Li et.al. | 2406.08418 | link |
2024-06-12 | A Concept-Based Explainability Framework for Large Multimodal Models | Jayneel Parekh et.al. | 2406.08074 | null |
2024-06-12 | LVBench: An Extreme Long Video Understanding Benchmark | Weihan Wang et.al. | 2406.08035 | link |
2024-06-11 | Cognitive Insights Across Languages: Enhancing Multimodal Interview Analysis | David Ortiz-Perez et.al. | 2406.07542 | link |
2024-06-11 | Understanding Visual Concepts Across Models | Brandon Trabucco et.al. | 2406.07506 | link |
2024-06-11 | Unified Modeling Enhanced Multimodal Learning for Precision Neuro-Oncology | Huahui Yi et.al. | 2406.07078 | link |
2024-06-14 | BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification | June-Woo Kim et.al. | 2406.06786 | link |
2024-06-10 | Vript: A Video Is Worth Thousands of Words | Dongjie Yang et.al. | 2406.06040 | link |
2024-06-10 | FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model | Yebin Lee et.al. | 2406.06004 | link |
2024-06-10 | CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark | David Romero et.al. | 2406.05967 | null |
2024-06-09 | Stealthy Targeted Backdoor Attacks against Image Captioning | Wenshu Fan et.al. | 2406.05874 | link |
2024-06-09 | F-LMM: Grounding Frozen Large Multimodal Models | Size Wu et.al. | 2406.05821 | link |
2024-06-08 | Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities | Sai Munikoti et.al. | 2406.05496 | null |
2024-06-07 | Semantic Segmentation on VSPW Dataset through Masked Video Consistency | Chen Liang et.al. | 2406.04979 | null |
2024-06-07 | Predictive Dynamic Fusion | Bing Cao et.al. | 2406.04802 | link |
2024-06-07 | MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description | Cong Yang et.al. | 2406.04716 | link |
2024-06-07 | AICoderEval: Improving AI Domain Code Generation of Large Language Models | Yinghui Xia et.al. | 2406.04712 | null |
2024-06-06 | GenAI Arena: An Open Evaluation Platform for Generative Models | Dongfu Jiang et.al. | 2406.04485 | null |
2024-06-06 | MAIRA-2: Grounded Radiology Report Generation | Shruthi Bannur et.al. | 2406.04449 | link |
2024-06-06 | DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs | Lingchen Meng et.al. | 2406.04334 | null |
2024-06-06 | BLSP-Emo: Towards Empathetic Large Speech-Language Models | Chen Wang et.al. | 2406.03872 | link |
2024-06-05 | Identification of Stone Deterioration Patterns with Large Multimodal Models | Daniele Corradetti et.al. | 2406.03207 | link |
2024-06-05 | Exploiting LMM-based knowledge for image classification tasks | Maria Tzelepi et.al. | 2406.03071 | null |
2024-06-02 | Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications | David Restrepo et.al. | 2406.02601 | null |
2024-06-04 | Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning | Alex Jinpeng Wang et.al. | 2406.02547 | link |
2024-06-04 | Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization | Yunpeng Zhao et.al. | 2406.01987 | null |
2024-06-03 | Automatic Fused Multimodal Deep Learning for Plant Identification | Alfreds Lapkovskis et.al. | 2406.01455 | link |
2024-06-05 | Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data | Zhusi Zhong et.al. | 2406.01302 | null |
2024-06-03 | Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model | Kezhen Chen et.al. | 2406.00977 | link |
2024-06-02 | Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient | Zechu Li et.al. | 2406.00681 | null |
2024-06-04 | StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond | Pengyuan Lyu et.al. | 2405.21013 | null |
2024-05-31 | Don’t Buy it! Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models | A. Bavaresco et.al. | 2405.20846 | link |
2024-06-17 | Ovis: Structural Embedding Alignment for Multimodal Large Language Model | Shiyin Lu et.al. | 2405.20797 | link |
2024-05-31 | Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning | Yang Chen et.al. | 2405.20606 | link |
2024-05-30 | Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA | Qianqi Yan et.al. | 2405.20421 | link |
2024-05-30 | Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use | Franz Louis Cesista et.al. | 2405.20245 | null |
2024-05-31 | Visual Attention Analysis in Online Learning | Miriam Navarro et.al. | 2405.20091 | null |
2024-05-30 | MM-Lego: Modular Biomedical Multimodal Models with Minimal Fine-Tuning | Konstantin Hemker et.al. | 2405.19950 | null |
2024-05-30 | Instruction-Guided Visual Masking | Jinliang Zheng et.al. | 2405.19783 | link |
2024-05-29 | Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining | Blake R. Duschatko et.al. | 2405.19386 | null |
2024-06-09 | LLMs Meet Multimodal Generation and Editing: A Survey | Yingqing He et.al. | 2405.19334 | link |
2024-05-29 | Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare | Hanwei Zhu et.al. | 2405.19298 | link |
2024-05-31 | Benchmarking and Improving Detail Image Caption | Hongyuan Dong et.al. | 2405.19092 | link |
2024-05-29 | Topological Perspectives on Optimal Multimodal Embedding Spaces | Abdul Aziz A. B et.al. | 2405.18867 | null |
2024-05-29 | Exploring Exotic Decays of the Higgs Boson to Multi-Photons at the LHC via Multimodal Learning Approaches | A. Hammad et.al. | 2405.18834 | null |
2024-05-28 | The Evolution of Multimodal Model Architectures | Shakti N. Wadekar et.al. | 2405.17927 | null |
2024-05-28 | Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment | Xin Xiao et.al. | 2405.17871 | link |
2024-05-28 | Full-Stack Allreduce on Multi-Rail Networks | Enda Yu et.al. | 2405.17870 | null |
2024-05-28 | MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance | Yake Wei et.al. | 2405.17730 | link |
2024-05-27 | Matryoshka Multimodal Models | Mu Cai et.al. | 2405.17430 | null |
2024-05-27 | XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | Xianfu Cheng et.al. | 2405.17336 | link |
2024-05-28 | LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding | Haoyu Zhao et.al. | 2405.17104 | null |
2024-05-27 | Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning | Zihua Zhao et.al. | 2405.16996 | link |
2024-05-27 | Multilingual Diversity Improves Vision-Language Representations | Thao Nguyen et.al. | 2405.16915 | null |
2024-05-26 | Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs | Mustafa Shukor et.al. | 2405.16700 | link |
2024-05-25 | How Well Do Deep Learning Models Capture Human Concepts? The Case of the Typicality Effect | Siddhartha K. Vemuri et.al. | 2405.16128 | null |
2024-05-24 | ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models | Chunjiang Ge et.al. | 2405.15738 | link |
2024-05-24 | Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models | Yongsheng Yu et.al. | 2405.15687 | null |
2024-05-24 | M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models | Hongyu Wang et.al. | 2405.15638 | link |
2024-05-24 | DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception | Run Luo et.al. | 2405.15232 | link |
2024-05-24 | Shopping Queries Image Dataset (SQID): An Image-Enriched ESCI Dataset for Exploring Multimodal Learning in Product Search | Marie Al Ghossein et.al. | 2405.15190 | link |
Generative Weight Space Modeling
Publish Date | Title | Authors | PDF | Code |
---|---|---|---|---|
2024-11-14 | Enhancing generalization in high energy physics using white-box adversarial attacks | Franck Rothen et.al. | 2411.09296 | null |
2024-11-11 | Minimal nilpotent finite $W$-algebra and cuspidal module category of $\mathfrak{sp}_{2n}$ | Genqiang Liu et.al. | 2411.06768 | null |
2024-11-07 | Well-Posedness and Regularity of the Heat Equation with Robin Boundary Conditions in the Two-Dimensional Wedge | Marco Bravin et.al. | 2411.04651 | null |
2024-11-04 | SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF | Atoosa Chegini et.al. | 2411.01798 | null |
2024-10-28 | Modular Duality in Deep Learning | Jeremy Bernstein et.al. | 2410.21265 | null |
2024-10-26 | MarDini: Masked Autoregressive Diffusion for Video Generation at Scale | Haozhe Liu et.al. | 2410.20280 | null |
2024-10-25 | Four-parameter Mittag-Leffler functions and their associated coherent states | Dušan Popov et.al. | 2410.19462 | null |
2024-10-24 | Bielik 7B v0.1: A Polish Language Model – Development, Insights, and Evaluation | Krzysztof Ociepa et.al. | 2410.18565 | null |
2024-10-21 | Two dimensional delta Bose gas in a weighted space | Sudheesh Surendranath et.al. | 2410.16550 | null |
2024-10-21 | In Search of the Successful Interpolation: On the Role of Sharpness in CLIP Generalization | Alireza Abdollahpoorrostam et.al. | 2410.16476 | link |
2024-10-23 | Universal approximation results for neural networks with non-polynomial activation function over non-compact domains | Ariel Neufeld et.al. | 2410.14759 | null |
2024-10-23 | Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching | Jie Peng et.al. | 2410.14740 | null |
2024-10-16 | Differential Shape Optimization with Image Representation for Photonic Design | Zhaocheng Liu et.al. | 2410.13074 | null |
2024-10-15 | Scaling Laws for Multilingual Language Models | Yifei He et.al. | 2410.12883 | null |
2024-10-16 | AutoSimTTF: A Fully Automatic Pipeline for Electric Field Simulation and Treatment Planning of Tumor Treating Fields | Minmin Wang et.al. | 2410.12196 | null |
2024-10-15 | Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence | Shangbin Feng et.al. | 2410.11163 | null |
2024-10-14 | Deep Linear Probe Generators for Weight Space Learning | Jonathan Kahana et.al. | 2410.10811 | null |
2024-10-14 | Generating Model Parameters for Controlling: Parameter Diffusion for Controllable Multi-Task Recommendation | Chenglei Shen et.al. | 2410.10639 | null |
2024-10-14 | MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer | Minghao Zhu et.al. | 2410.10589 | link |
2024-10-15 | Regions of Level $\ell$ of Catalan/Semiorder-Type Arrangements | Yanru Chen et.al. | 2410.10198 | null |
2024-10-13 | A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning | Chen-Yu Liu et.al. | 2410.09846 | null |
2024-10-11 | Meta-Transfer Learning Empowered Temporal Graph Networks for Cross-City Real Estate Appraisal | Weijia Zhang et.al. | 2410.08947 | null |
2024-10-09 | Efficient Weight-Space Laplace-Gaussian Filtering and Smoothing for Sequential Deep Learning | Joanna Sliwa et.al. | 2410.06800 | null |
2024-10-09 | Revisiting Multi-Permutation Equivariance through the Lens of Irreducible Representations | Yonatan Sverdlov et.al. | 2410.06665 | link |
2024-10-08 | Weighted Embeddings for Low-Dimensional Graph Representation | Thomas Bläsius et.al. | 2410.06042 | null |
2024-10-05 | Computing ground states of Bose-Einstein condensation by normalized deep neural network | Weizhu Bao et.al. | 2410.05319 | link |
2024-10-07 | Hyper-Representations: Learning from Populations of Neural Networks | Konstantin Schürholt et.al. | 2410.05107 | link |
2024-10-06 | Integrable Modules of Map full Toroidal Lie Algebras | Pradeep Bisht et.al. | 2410.04495 | null |
2024-10-06 | Global well-posedness for the defocusing 3D quadratic NLS in the sharp critical space | Jia Shen et.al. | 2410.04337 | null |
2024-10-05 | Equivariant Neural Functional Networks for Transformers | Viet-Hoang Tran et.al. | 2410.04209 | null |
2024-10-15 | Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models | Theo Putterman et.al. | 2410.04207 | null |
2024-10-04 | Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks | Ann Huang et.al. | 2410.03972 | null |
2024-10-04 | Autoregressive Moving-average Attention Mechanism for Time Series Forecasting | Jiecheng Lu et.al. | 2410.03159 | link |
2024-10-02 | Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets | Yuandong Tian et.al. | 2410.01779 | link |
2024-10-01 | SynCOM: A tool for simulating coronal outflows | Valmir Moraes Filho et.al. | 2410.01004 | null |
2024-10-01 | On the prime ideals of higher secant varieties of Veronese embeddings of small degrees | Katsuhisa Furukawa et.al. | 2410.00652 | null |
2024-09-30 | Old Optimizer, New Norm: An Anthology | Jeremy Bernstein et.al. | 2409.20325 | null |
2024-09-27 | Effects of Peierls phases in open linear chains | Anselmo M. Marques et.al. | 2409.18780 | null |
2024-09-27 | Density of states in neural networks: an in-depth exploration of learning in parameter space | Margherita Mele et.al. | 2409.18683 | null |
2024-09-26 | The time periodic problem for the Navier-Stokes equations in exterior domains in weighted spaces | Reinhard Farwig et.al. | 2409.17590 | null |
2024-09-25 | Scalable Ensemble Diversification for OOD Generalization and Detection | Alexander Rubinstein et.al. | 2409.16797 | null |
2024-10-04 | Lessons Learned from a Unifying Empirical Study of Parameter-Efficient Transfer Learning (PETL) in Visual Recognition | Zheda Mai et.al. | 2409.16434 | link |
2024-09-24 | VascX Models: Model Ensembles for Retinal Vascular Analysis from Color Fundus Images | Jose Vargas Quiros et.al. | 2409.16016 | link |
2024-09-23 | Efficient Large-Scale Quantum Optimization via Counterdiabatic Ansatz | Jie Liu et.al. | 2409.15055 | null |
2024-09-24 | Weighted Approximation By Max-Product Generalized Exponential Sampling Series | Satyaranjan Pradhan et.al. | 2409.14884 | null |
2024-09-21 | Weakly magnetized black holes in Einstein-ModMax theory | Haryanto M. Siahaan et.al. | 2409.13967 | null |
2024-09-18 | Monomial Matrix Group Equivariant Neural Functional Networks | Hoang V. Tran et.al. | 2409.11697 | link |
2024-09-17 | Existence of an extremal function of Sobolev critical embedding with an $\alpha$-homogeneous weight | Petr Gurka et.al. | 2409.11193 | null |
2024-09-16 | Inferring stellar parameters and their uncertainties from high-resolution spectroscopy using invertible neural networks | Nils Candebat et.al. | 2409.10621 | null |
2024-09-13 | Non-unitary Wightman CFTs and non-unitary vertex algebras | Sebastiano Carpi et.al. | 2409.08454 | null |
2024-09-12 | Global well-posedness and scattering in weighted space for nonlinear Schrödinger equations below the Strauss exponent without gauge-invariance | Masaki Kawamoto et.al. | 2409.08432 | null |
2024-09-09 | Fast gradient-free optimization of excitations in variational quantum eigensolvers | Jonas Jäger et.al. | 2409.05939 | null |
2024-09-06 | SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields | Yuze Wang et.al. | 2409.04482 | null |
2024-09-04 | Federated Quantum-Train with Batched Parameter Generation | Chen-Yu Liu et.al. | 2409.02763 | null |
2024-09-16 | Regret Analysis for Randomized Gaussian Process Upper Confidence Bound | Shion Takeno et.al. | 2409.00979 | null |
2024-08-30 | Abstracted Gaussian Prototypes for One-Shot Concept Learning | Chelsea Zou et.al. | 2408.17251 | link |
2024-08-23 | Emergence of global receptive fields capturing multipartite quantum correlations | Oleg M. Sotnikov et.al. | 2408.13033 | null |
2024-08-22 | Action of $\mathfrak{osp}(1\vert 2n)$ on polynomials tensor $\mathbb{C}^{0\vert 2n}$ | Dwight Anderson Williams II et.al. |
2024-08-19 | Unimodal sequences and mixed false theta functions | Kevin Allen et.al. | 2408.09789 | null |
2024-08-16 | Onsager-Machlup functional for stochastic lattice dynamical systems driven by time-varying noise | Xinze Zhang et.al. | 2408.08465 | null |
2024-08-10 | Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks | Yoav Gelberg et.al. | 2408.05496 | null |
2024-08-09 | Quasilinear parabolic equations with superlinear nonlinearities in critical spaces | Bogdan-Vasile Matioc et.al. | 2408.05067 | null |
2024-08-08 | A framework for generalizing toric inequalities for holographic entanglement entropy | Ning Bao et.al. | 2408.04741 | null |
2024-08-07 | Counterfactuals and Uncertainty-Based Explainable Paradigm for the Automated Detection and Segmentation of Renal Cysts in Computed Tomography Images: A Multi-Center Study | Zohaib Salahuddin et.al. | 2408.03789 | null |
2024-08-05 | BOTS-LM: Training Large Language Models for Setswana | Nathan Brown et.al. | 2408.02239 | null |
2024-08-02 | Conditional LoRA Parameter Generation | Xiaolong Jin et.al. | 2408.01415 | null |
2024-08-01 | Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization | Róisín Luo et.al. | 2408.00923 | null |
2024-07-31 | Semantic Codebook Learning for Dynamic Recommendation Models | Zheqi Lv et.al. | 2408.00123 | null |
2024-07-29 | Tensor product weight modules over the affine-Virasoro algebra | Qiu-Fan Chen et.al. | 2407.19844 | null |
2024-07-24 | Generalized Hilbert operators acting on weighted spaces of holomorphic functions with sup-norms | María J. Beltrán-Meneu et.al. | 2407.17646 | null |
2024-07-24 | Generalized Ordinal Priority Approach for Multi-Attribute Decision-Making under Incomplete Preference Information | Renlong Wang et.al. | 2407.17099 | null |
2024-07-22 | WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation | Zirui Shao et.al. | 2407.15502 | link |
2024-07-18 | FSP-Laplace: Function-Space Priors for the Laplace Approximation in Bayesian Deep Learning | Tristan Cinquin et.al. | 2407.13711 | null |
2024-07-19 | Parameter Generation of Quantum Approximate Optimization Algorithm with Diffusion Model | Fanxu Meng et.al. | 2407.12242 | null |
2024-07-24 | Effect Heterogeneity with Earth Observation in Randomized Controlled Trials: Exploring the Role of Data, Model, and Evaluation Metric Choice | Connor T. Jerzak et.al. | 2407.11674 | link |
2024-07-15 | Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion | Yongyuan Liang et.al. | 2407.10973 | null |
2024-07-16 | The well-posedness of generalized nonlinear wave equations on the lattice graph | Bobo Hua et.al. | 2407.09815 | null |
2024-07-15 | Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization | Jinlong Li et.al. | 2407.08374 | null |
2024-07-09 | Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic | Ruochen Jin et.al. | 2407.07089 | link |
2024-07-04 | Recovering Initial States in Semilinear Parabolic Problems from Time-Averages | Lina Sophie Schmitz et.al. | 2407.03829 | null |
2024-07-01 | A quantum deformation of the ${\mathcal N}=2$ superconformal algebra | H. Awata et.al. | 2407.00901 | null |
2024-06-24 | WARP: On the Benefits of Weight Averaged Rewarded Policies | Alexandre Ramé et.al. | 2406.16768 | null |
2024-06-24 | Improving robustness to corruptions with multiplicative weight perturbations | Trung Trinh et.al. | 2406.16540 | null |
2024-06-21 | Determination of certain mod $p$ Galois representations using local constancy | Abhik Ganguli et.al. | 2406.15600 | null |
2024-06-21 | Elliptic analysis on collapsing gravitational instantons modelled using the Gibbons-Hawking ansatz | Willem Adriaan Salm et.al. | 2406.15008 | null |
2024-06-20 | MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization | Zhaozhe Hu et.al. | 2406.14259 | link |
2024-06-18 | From Instance Training to Instruction Learning: Task Adapters Generation from Instructions | Huanxuan Liao et.al. | 2406.12382 | link |
2024-06-17 | Kaniadakis entropy in extreme gravitational and cosmological environments: a review on the state-of-the-art and future prospects | Giuseppe Gaetano Luciano et.al. | 2406.11373 | null |
2024-06-16 | Analysis and approximation of elliptic problems with Uhlenbeck structure in convex polytopes | Tadele Mengesha et.al. | 2406.10762 | null |
2024-06-14 | Towards Scalable and Versatile Weight Space Learning | Konstantin Schürholt et.al. | 2406.09997 | link |
2024-06-13 | Interpreting the Weight Space of Customized Diffusion Models | Amil Dravid et.al. | 2406.09413 | link |
2024-06-12 | Diffusion Soup: Model Merging for Text-to-Image Diffusion Models | Benjamin Biggs et.al. | 2406.08431 | null |
2024-06-24 | Cartan monopoles | Andrei Smilga et.al. | 2406.06042 | null |
2024-06-08 | Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models | Minho Park et.al. | 2406.05432 | link |
2024-06-06 | Regularized KL-Divergence for Well-Defined Function-Space Variational Inference in Bayesian neural networks | Tristan Cinquin et.al. | 2406.04317 | null |
2024-06-06 | A characterization of $(μ,ν)$-dichotomies via admissibility | Lucas Backes et.al. | 2406.04126 | null |
2024-06-05 | Reproducing Kernel Thesis of Hankel Operators on Weighted Hardy Spaces | Ana Čolović et.al. | 2406.03106 | null |
2024-05-21 | Backpropagation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration | Wei Ji et.al. | 2406.01601 | null |
2024-05-29 | Thermodynamics of the most generalized form of Holographic Dark Energy and some particular cases with Corrected Entropies | Sanghati Saha et.al. | 2405.20783 | null |
2024-06-20 | The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof | Derek Lim et.al. | 2405.20231 | link |
2024-05-28 | Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography | Jie Liu et.al. | 2405.18356 | link |
2024-05-28 | $C^2M^3$: Cycle-Consistent Multi-Model Merging | Donato Crisostomi et.al. | 2405.17897 | link |
2024-05-27 | Smoothing effects and extinction in finite time for fractional fast diffusions on Riemannian manifolds | Elvise Berchio et.al. | 2405.17126 | null |
2024-05-31 | FedSheafHN: Personalized Federated Learning on Graph-structured Data | Wenfei Liang et.al. | 2405.16056 | null |
2024-05-27 | HyperInterval: Hypernetwork approach to training weight interval regions in continual learning | Patryk Krukowski et.al. | 2405.15444 | link |
2024-05-23 | Scalable Optimization in the Modular Norm | Tim Large et.al. | 2405.14813 | link |
2024-06-16 | A refined Weyl character formula for comodules on $\operatorname{GL}_{2,A}$ | Helge Øystein Maakestad et.al. | 2405.09210 | null |
2024-05-13 | Localizing Task Information for Improved Model Merging and Compression | Ke Wang et.al. | 2405.07813 | link |
2024-05-13 | $α$VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning | Rafael Kourdis et.al. | 2405.07769 | null |
2024-05-12 | Approximation by a new sequence of operators involving Laguerre polynomials | Kapil Kumar et.al. | 2405.07228 | null |
2024-05-06 | Swarm intelligence for full Stokes dynamic imaging reconstruction of interferometric data | Alejandro Mus et.al. | 2405.03330 | null |
2024-05-04 | Large Deviation Principles of Invariant Measures of Stochastic Reaction-Diffusion Lattice Systems | Bixiang Wang et.al. | 2405.02720 | null |
2024-05-03 | The Immersed Inextensible Interface Problem in 2D Stokes Flow | Eduardo García-Juárez et.al. | 2405.02446 | null |
2024-05-02 | Customizing Text-to-Image Models with a Single Image Pair | Maxwell Jones et.al. | 2405.01536 | null |
2024-04-25 | Robust Fine-tuning for Pre-trained 3D Point Cloud Models | Zhibo Zhang et.al. | 2404.16422 | null |
2024-04-23 | The Geometry of the Set of Equivalent Linear Neural Networks | Jonathan Richard Shewchuk et.al. | 2404.14855 | null |
2024-04-24 | Nonexistence of solutions to parabolic problems with a potential on weighted graphs | Dario D. Monticelli et.al. | 2404.12058 | null |
2024-04-17 | On the relaxation to equilibrium of a quantum oscillator interacting with a radiation field | Pierre-A. Vuillermot et.al. | 2404.11329 | null |
2024-04-15 | Higher-curvature gravity in AdS$_3$, holographic $c$-theorems and black hole microstates | Mariano Chernicoff et.al. | 2404.10128 | null |
2024-04-16 | Asymptotic-preserving approximations for stochastic incompressible viscous fluids and SPDEs on graph | Jianbo Cui et.al. | 2404.09168 | null |
2024-04-09 | Perspective on Physical Interpretations of Rényi Entropy in Statistical Mechanics | Misaki Ozawa et.al. | 2404.06436 | null |
2024-04-09 | A gluing construction of singular solutions for a fully non-linear equation in conformal geometry | María Fernanda Espinal et.al. | 2404.05965 | null |
2024-04-05 | Dissipative Euler flows originating from circular vortex filaments | Francisco Gancedo et.al. | 2404.04250 | null |
2024-04-05 | Macdonald characters from a new formula for Macdonald polynomials | Houcine Ben Dali et.al. | 2404.03904 | null |
2024-04-04 | Fundamental inequalities for the iterated Fourier-cosine convolution with Gaussian weight and its application | Nguyen Thi Hong Phuong et.al. | 2404.03609 | null |
2024-03-29 | Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World | Bowen Lei et.al. | 2403.20047 | link |
2024-03-28 | Model Stock: All we need is just a few fine-tuned models | Dong-Hwan Jang et.al. | 2403.19522 | link |
2024-03-26 | A location Invariant Statistic-Based Consistent Estimation Method for Three-Parameter Generalized Exponential Distribution | Kiran Prajapat et.al. | 2403.17609 | null |
2024-06-03 | FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis | Santosh Sanjeev et.al. | 2403.13341 | link |
2024-06-18 | Learning Useful Representations of Recurrent Neural Network Weight Matrices | Vincent Herrmann et.al. | 2403.11998 | link |
2024-03-16 | Function-space Parameterization of Neural Networks for Sequential Learning | Aidan Scannell et.al. | 2403.10929 | link |
2024-03-14 | Imprints of Barrow-Tsallis Cosmology in Primordial Gravitational Waves | Petr Jizba et.al. | 2403.09797 | null |
2024-03-14 | Eigenvariety for partially classical Hilbert modular forms | Mladen Dimitrov et.al. | 2403.09784 | null |
2024-03-12 | The solenoidal Heisenberg Virasoro algebra and its simple weight modules | Boujemaa Agrebaoui et.al. | 2403.07381 | null |
2024-03-10 | FrameQuant: Flexible Low-Bit Quantization for Transformers | Harshavardhan Adepu et.al. | 2403.06082 | link |
2024-03-06 | The solenoidal Virasoro algebra and its simple weight modules | Boujemaa Agrebaoui et.al. | 2403.03753 | null |
2024-03-05 | Tensor Decomposition-based Time Varying Channel Estimation for mmWave MIMO-OFDM Systems | Ruizhe Wang et.al. | 2403.02942 | null |
2024-03-05 | Neural Redshift: Random Networks are not Random Functions | Damien Teney et.al. | 2403.02241 | null |
2024-03-04 | Tiny fluctuations of the averaging process around its degenerate steady state | Federico Sau et.al. | 2403.02032 | null |
2024-03-15 | Training-Free Pretrained Model Merging | Zhengqi Xu et.al. | 2403.01753 | link |
2024-04-22 | HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances | Supreeth Narasimhaswamy et.al. | 2403.01693 | null |
2024-03-13 | TOOLVERIFIER: Generalization to New Tools via Self-Verification | Dheeraj Mekala et.al. | 2402.14158 | link |
2024-02-21 | Computing Tangent Spaces to Eigenvarieties | James Rawson et.al. | 2402.13799 | null |
2024-05-28 | Neural Network Parameter Diffusion | Kai Wang et.al. | 2402.13144 | link |
2024-02-19 | Exponential attractors for a nonlocal delayed reaction-diffusion equation on an unbounded domain | Wenjie Hu et.al. | 2402.11856 | null |
2024-02-18 | Discrete Neural Algorithmic Reasoning | Gleb Rodionov et.al. | 2402.11628 | link |
2024-02-17 | Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes | Jeremiah Hauth et.al. | 2402.11179 | null |
2024-06-06 | Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning | Tuc Nguyen et.al. | 2402.10639 | null |
2024-02-14 | TAI-GAN: A Temporally and Anatomically Informed Generative Adversarial Network for early-to-late frame conversion in dynamic cardiac PET inter-frame motion correction | Xueqi Guo et.al. | 2402.09567 | null |
2024-02-14 | The cohomology of $p$-adic Deligne-Lusztig schemes of Coxeter type | Alexander B. Ivanov et.al. | 2402.09017 | null |
2024-02-09 | The Asymptotic Structure of Cosmological Integrals | Paolo Benincasa et.al. | 2402.06558 | null |
2024-02-07 | Universal Neural Functionals | Allan Zhou et.al. | 2402.05232 | link |
2024-02-06 | Maximal regularity and optimal control for a non-local Cahn-Hilliard tumour growth model | Matteo Fornoni et.al. | 2402.04204 | null |
2024-02-06 | Improved Generalization of Weight Space Networks via Augmentations | Aviv Shamsian et.al. | 2402.04081 | link |
2024-02-02 | Training-time Neuron Alignment through Permutation Subspace for Improving Linear Mode Connectivity and Model Fusion | Zexi Li et.al. | 2402.01342 | null |
2024-02-01 | Understanding Neural Network Systems for Image Analysis using Vector Spaces and Inverse Maps | Rebecca Pattichis et.al. | 2402.00261 | null |
2024-01-26 | Do deep neural networks utilize the weight space efficiently? | Onur Can Koyun et.al. | 2401.16438 | null |
2024-01-22 | On strong growth conditions for weighted spaces of entire functions | Gerhard Schindl et.al. | 2401.14330 | null |
2024-01-24 | Task structure and nonlinearity jointly determine learned representational geometry | Matteo Alleman et.al. | 2401.13558 | null |
2024-01-25 | Sparse Domination of Singular Bilinear Forms on Non-Homogeneous spaces | Paco Villarroya et.al. | 2401.13130 | null |
2024-01-22 | WARM: On the Benefits of Weight Averaged Reward Models | Alexandre Ramé et.al. | 2401.12187 | null |
2024-01-17 | Cesàro operators associated with Borel measures acting on weighted spaces of holomorphic functions with sup-norm | Maria José Beltrán Meneu et.al. | 2401.09406 | null |
2024-01-15 | Singular fractal dimension at periodicity cascades in parameters spaces | Carlos E. P. Abreu et.al. | 2401.07648 | null |
2024-01-17 | Computing Fringe Presentations of Multigraded Persistence Modules | Fabian Lenzen et.al. | 2401.06008 | null |
2024-01-10 | Grimoire is All You Need for Enhancing Large Language Models | Ding Chen et.al. | 2401.03385 | link |
2024-03-26 | Artificial Intelligence for Operations Research: Revolutionizing the Operations Research Process | Zhenan Fan et.al. | 2401.03244 | null |
2023-12-31 | A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry | Tim Z. Xiao et.al. | 2401.00611 | link |
2023-12-28 | Fractional non-homogeneous counting process | Nick Laskin et.al. | 2312.17389 | null |
2023-12-28 | Some unimodal sequences of Kronecker coefficients | Alimzhan Amanov et.al. | 2312.17054 | null |
2023-12-24 | The Vlasov-Maxwell-Boltzmann/Landau system with polynomial perturbation near Maxwellian | Chuqi Cao et.al. | 2312.15510 | null |
2023-12-22 | Emage: Non-Autoregressive Text-to-Image Generation | Zhangyin Feng et.al. | 2312.14988 | null |
2023-12-21 | Hypercyclic shifts on lattice graphs | Anton Baranov et.al. | 2312.13934 | null |
2023-12-21 | Scattering for 2d semi-relativistic Hartree equations with short range potential | Changhun Yang et.al. | 2312.13606 | null |
2023-12-21 | Entropic Inflation in Presence of Scalar Field | Sergei D. Odintsov et.al. | 2312.13587 | null |
2023-12-30 | Time is Encoded in the Weights of Finetuned Language Models | Kai Nylund et.al. | 2312.13401 | link |
2023-12-14 | Efficient momentum space approach to superconductivity in quasiperiodic systems | Mao Yoshii et.al. | 2312.09124 | null |
2023-12-13 | Best one-sided algebraic approximation by average modulus | Raheam A. Al-Saphory et.al. | 2312.08407 | null |
2023-12-19 | Well-Posedness of Quasilinear Parabolic Equations in Time-Weighted Spaces | Bogdan Matioc et.al. | 2312.07974 | null |
2023-12-12 | Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models | Arnav Chavan et.al. | 2312.07046 | link |
2023-12-11 | Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks | MohammadReza Davari et.al. | 2312.06795 | null |
2023-12-08 | Stoichiometry preservation and generalization of Bilger mixture fraction for non-premixed combustion with differential molecular diffusion | Haifeng Wang et.al. | 2312.05204 | null |
2023-12-01 | New polyconvolution product for Fourier-cosine and Laplace integral operators and their applications | Trinh Tuan et.al. | 2312.00764 | null |
2023-11-30 | Modelling Einstein cluster using Einasto profile | Ritwik Acharyya et.al. | 2311.18622 | null |
2023-11-27 | Extraction of the microscopic properties of quasi-particles using deep neural networks | Olga Soloveva et.al. | 2311.15984 | null |
2024-01-24 | Deep Latent Force Models: ODE-based Process Convolutions for Bayesian Deep Learning | Thomas Baldwin-McDonald et.al. | 2311.14828 | null |
Data Distillation
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-10-25 | FLiP: Privacy-Preserving Federated Learning based on the Principle of Least Privilege | ShiMao Xu et.al. | 2410.19548 | null |
2024-10-25 | SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models | Jahyun Koo et.al. | 2410.19503 | null |
2024-10-24 | AlignCap: Aligning Speech Emotion Captioning to Human Preferences | Ziqi Liang et.al. | 2410.19134 | null |
2024-10-24 | High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws | M. Emrullah Ildiz et.al. | 2410.18837 | null |
2024-10-24 | Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data | Anup Shirgaonkar et.al. | 2410.18588 | null |
2024-10-24 | SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning | Shivam Adarsh et.al. | 2410.18574 | link |
2024-10-23 | ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams | Srija Anand et.al. | 2410.17901 | null |
2024-10-23 | Towards Active Participant-Centric Vertical Federated Learning: Some Representations May Be All You Need | Jon Irureta et.al. | 2410.17648 | null |
2024-10-23 | Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation | Muquan Li et.al. | 2410.17606 | link |
2024-10-23 | Physics-driven AI for Channel Estimation in Cellular Network | Xiaoqian Qi et.al. | 2410.17525 | null |
2024-10-22 | MiniPLM: Knowledge Distillation for Pre-Training Language Models | Yuxian Gu et.al. | 2410.17215 | link |
2024-10-22 | Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios | Kai Wang et.al. | 2410.17193 | link |
2024-10-22 | CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare | Nicholas I-Hsien Kuo et.al. | 2410.16872 | null |
2024-10-22 | AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Yongjian Wu et.al. | 2410.16820 | link |
2024-10-22 | SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation | Jing-Jing Li et.al. | 2410.16665 | null |
2024-10-21 | Pre-training Distillation for Large Language Models: A Design Space Exploration | Hao Peng et.al. | 2410.16215 | null |
2024-10-18 | Interpreting Microbiome Relative Abundance Data Using Symbolic Regression | Swagatam Haldar et.al. | 2410.16109 | link |
2024-10-21 | Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation? | Lingao Xiao et.al. | 2410.15919 | link |
2024-10-21 | Model Mimic Attack: Knowledge Distillation for Provably Transferable Adversarial Examples | Kirill Lukyanov et.al. | 2410.15889 | null |
2024-10-20 | Hybrid Memory Replay: Blending Real and Distilled Data for Class Incremental Learning | Jiangtao Kong et.al. | 2410.15372 | null |
2024-10-20 | GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning | Haiwen Diao et.al. | 2410.15266 | link |
2024-10-19 | LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound | Xuechen Guo et.al. | 2410.15074 | null |
2024-10-19 | Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS | Tuan Nam Nguyen et.al. | 2410.14997 | null |
2024-10-17 | CAKD: A Correlation-Aware Knowledge Distillation Framework Based on Decoupling Kullback-Leibler Divergence | Zao Zhang et.al. | 2410.14741 | null |
2024-10-18 | Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Shuai Zhao et.al. | 2410.14425 | link |
2024-10-18 | Preview-based Category Contrastive Learning for Knowledge Distillation | Muhe Ding et.al. | 2410.14143 | null |
2024-10-17 | Leveraging Fine-Tuned Language Models for Efficient and Accurate Smart Contract Auditing | Zhiyuan Wei et.al. | 2410.13918 | link |
2024-10-17 | GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning | Guibin Zhang et.al. | 2410.13761 | link |
2024-10-17 | An Active Learning Framework for Inclusive Generation by Large Language Models | Sabit Hassan et.al. | 2410.13641 | null |
2024-10-18 | Towards Satellite Non-IID Imagery: A Spectral Clustering-Assisted Federated Learning Approach | Luyao Zou et.al. | 2410.13602 | null |
2024-10-17 | Enhancing Dataset Distillation via Label Inconsistency Elimination and Learning Pattern Refinement | Chuhao Zhou et.al. | 2410.13311 | link |
2024-10-18 | Cyber Attacks Prevention Towards Prosumer-based EV Charging Stations: An Edge-assisted Federated Prototype Knowledge Distillation Approach | Luyao Zou et.al. | 2410.13260 | null |
2024-10-16 | TAS: Distilling Arbitrary Teacher and Student via a Hybrid Assistant | Guopeng Li et.al. | 2410.12342 | null |
2024-10-16 | Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm | Guanming Huang et.al. | 2410.12259 | null |
2024-10-16 | TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration | Yiwei Guo et.al. | 2410.12183 | link |
2024-10-17 | SAM-Guided Masked Token Prediction for 3D Scene Understanding | Zhimin Chen et.al. | 2410.12158 | null |
2024-10-15 | MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router | Yanyue Xie et.al. | 2410.12013 | null |
2024-10-15 | Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation | Andong Lu et.al. | 2410.11586 | link |
2024-10-15 | Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL | Qihuang Zhong et.al. | 2410.11371 | null |
2024-10-15 | Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling | Wenda Xu et.al. | 2410.11325 | null |
2024-10-14 | BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI | Shaohao Rui et.al. | 2410.10604 | null |
2024-10-14 | ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2410.10554 | link |
2024-10-14 | Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation | Siru Ouyang et.al. | 2410.10141 | null |
2024-10-14 | REHRSeg: Unleashing the Power of Self-Supervised Super-Resolution for Resource-Efficient 3D MRI Segmentation | Zhiyun Song et.al. | 2410.10097 | null |
2024-10-15 | Self-Data Distillation for Recovering Quality in Pruned Large Language Models | Vithursan Thangarasa et.al. | 2410.09982 | null |
2024-10-13 | Generalized Group Data Attribution | Dan Ley et.al. | 2410.09940 | null |
2024-10-12 | Distilling Invariant Representations with Dual Augmentation | Nikolaos Giakoumoglou et.al. | 2410.09474 | null |
2024-10-12 | Declarative Knowledge Distillation from Large Language Models for Visual Question Answering Datasets | Thomas Eiter et.al. | 2410.09428 | link |
2024-10-15 | Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI | Muhammet Anil Yagiz et.al. | 2410.09043 | null |
2024-10-11 | Mentor-KD: Making Small Language Models Better Multi-step Reasoners | Hojae Lee et.al. | 2410.09037 | link |
2024-10-11 | Contrastive Knowledge Distillation for Robust Multimodal Sentiment Analysis | Zhongyi Sang et.al. | 2410.08692 | null |
2024-10-11 | DistDD: Distributed Data Distillation Aggregation through Gradient Matching | Peiran Wang et.al. | 2410.08665 | null |
2024-10-11 | GAI-Enabled Explainable Personalized Federated Semi-Supervised Learning | Yubo Peng et.al. | 2410.08634 | null |
2024-10-11 | Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both | Abhijnan Nath et.al. | 2410.08458 | null |
2024-10-10 | What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias | Aida Mohammadshahi et.al. | 2410.08407 | null |
2024-10-10 | A Lightweight Target-Driven Network of Stereo Matching for Inland Waterways | Jing Su et.al. | 2410.07915 | null |
2024-10-10 | SNN-PAR: Energy Efficient Pedestrian Attribute Recognition via Spiking Neural Networks | Haiyang Wang et.al. | 2410.07857 | link |
2024-10-12 | Relational Diffusion Distillation for Efficient Image Generation | Weilun Feng et.al. | 2410.07679 | link |
2024-10-10 | Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching | Ruonan Yu et.al. | 2410.07579 | null |
2024-10-09 | Unlocking Real-Time Fluorescence Lifetime Imaging: Multi-Pixel Parallelism for FPGA-Accelerated Processing | Ismail Erbas et.al. | 2410.07364 | null |
2024-10-09 | S2HPruner: Soft-to-Hard Distillation Bridges the Discretization Gap in Pruning | Weihao Lin et.al. | 2410.07046 | null |
2024-10-09 | Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation | Runze Chen et.al. | 2410.06982 | null |
2024-10-09 | Efficient and Robust Knowledge Distillation from A Stronger Teacher Based on Correlation Matching | Wenqi Niu et.al. | 2410.06561 | null |
2024-10-10 | KnowledgeSG: Privacy-Preserving Synthetic Text Generation with Knowledge Distillation from Server | Wenhao Wang et.al. | 2410.05725 | link |
2024-10-07 | Progressive distillation induces an implicit curriculum | Abhishek Panigrahi et.al. | 2410.05464 | null |
2024-10-07 | ReasoningRank: Teaching Student Models to Rank through Reasoning-Based Knowledge Distillation | Yuelyu Ji et.al. | 2410.05168 | null |
2024-10-07 | MetaDD: Boosting Dataset Distillation with Neural Network Architecture-Invariant Generalization | Yunlong Zhao et.al. | 2410.05103 | null |
2024-10-06 | CAPEEN: Image Captioning with Early Exits and Knowledge Distillation | Divya Jyoti Bajpai et.al. | 2410.04433 | link |
2024-10-06 | DAdEE: Unsupervised Domain Adaptation in Early Exit PLMs | Divya Jyoti Bajpai et.al. | 2410.04424 | link |
2024-10-10 | Towards Understanding and Enhancing Security of Proof-of-Training for DNN Model Ownership Verification | Yijia Chang et.al. | 2410.04397 | null |
2024-10-10 | Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution | Jianze Li et.al. | 2410.04224 | link |
2024-10-05 | Accelerating Diffusion Models with One-to-Many Knowledge Distillation | Linfeng Zhang et.al. | 2410.04191 | null |
2024-10-05 | DiDOTS: Knowledge Distillation from Large-Language-Models for Dementia Obfuscation in Transcribed Speech | Dominika Woszczyk et.al. | 2410.04188 | null |
2024-10-05 | Gap Preserving Distillation by Building Bidirectional Mappings with A Dynamic Teacher | Yong Guo et.al. | 2410.04140 | null |
2024-10-05 | WiDistill: Distilling Large-scale Wi-Fi Datasets with Trajectory Matching | Tiantian Wang et.al. | 2410.04073 | link |
2024-10-04 | Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models | Zhuochun Li et.al. | 2410.03663 | null |
2024-10-04 | DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models | Sungnyun Kim et.al. | 2410.03061 | null |
2024-10-03 | Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-Training of Deep Networks | Siddharth Joshi et.al. | 2410.02116 | null |
2024-10-02 | PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation | Mike Ranzinger et.al. | 2410.01680 | null |
2024-10-04 | HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models | Seanie Lee et.al. | 2410.01524 | link |
2024-10-02 | Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks | Edan Kinderman et.al. | 2410.01483 | link |
2024-10-02 | PairDistill: Pairwise Relevance Distillation for Dense Retrieval | Chao-Wei Huang et.al. | 2410.01383 | link |
2024-10-02 | “No Matter What You Do!”: Mitigating Backdoor Attacks in Graph Neural Networks | Jiale Zhang et.al. | 2410.01272 | link |
2024-10-01 | Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging | Ismail Erbas et.al. | 2410.00948 | null |
2024-10-01 | Local-to-Global Self-Supervised Representation Learning for Diabetic Retinopathy Grading | Mostafa Hajighasemloua et.al. | 2410.00779 | null |
2024-10-01 | Efficient Technical Term Translation: A Knowledge Distillation Approach for Parenthetical Terminology Translation | Jiyoon Myung et.al. | 2410.00683 | null |
2024-10-01 | AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation | Ziyang Luo et.al. | 2410.00558 | link |
2024-10-01 | Self-Updatable Large Language Models with Parameter Integration | Yu Wang et.al. | 2410.00487 | null |
2024-10-01 | Advancing Medical Radiograph Representation Learning: A Hybrid Pre-training Paradigm with Multilevel Semantic Granularity | Hanqi Jiang et.al. | 2410.00448 | null |
2024-09-30 | Collaborative Knowledge Distillation via a Learning-by-Education Node Community | Anestis Kaimakamidis et.al. | 2410.00074 | null |
2024-09-30 | Enhancing Romanian Offensive Language Detection through Knowledge Distillation, Multi-Task Learning, and Data Augmentation | Vlad-Cristian Matei et.al. | 2409.20498 | null |
2024-10-02 | Linear Projections of Teacher Embeddings for Few-Class Distillation | Noel Loo et.al. | 2409.20449 | null |
2024-09-30 | Classroom-Inspired Multi-Mentor Distillation with Adaptive Learning Strategies | Shalini Sarode et.al. | 2409.20237 | null |
2024-10-01 | HYDRA-FL: Hybrid Knowledge Distillation for Robust and Accurate Federated Learning | Momin Ahmad Khan et.al. | 2409.19912 | null |
2024-09-29 | Tailored Federated Learning: Leveraging Direction Regulation & Knowledge Distillation | Huidong Tang et.al. | 2409.19741 | null |
2024-09-29 | InfantCryNet: A Data-driven Framework for Intelligent Analysis of Infant Cries | Mengze Hong et.al. | 2409.19689 | null |
2024-09-28 | Mind the Gap: Promoting Missing Modality Brain Tumor Segmentation with Alignment | Tianyi Liu et.al. | 2409.19366 | null |
2024-09-27 | Semi-Supervised Bone Marrow Lesion Detection from Knee MRI Segmentation Using Mask Inpainting Models | Shihua Qin et.al. | 2409.19185 | null |
2024-09-27 | Multi-modal Cross-domain Self-supervised Pre-training for fMRI and EEG Fusion | Xinxu Wei et.al. | 2409.19130 | null |
2024-10-01 | Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models | Yize Li et.al. | 2409.19128 | link |
2024-09-27 | MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation | Junyou Zhu et.al. | 2409.18800 | null |
2024-09-27 | Student-Oriented Teacher Knowledge Refinement for Knowledge Distillation | Chaomin Shen et.al. | 2409.18785 | null |
2024-09-27 | Harmonizing knowledge Transfer in Neural Network with Unified Distillation | Yaomin Huang et.al. | 2409.18565 | null |
2024-09-27 | Towards Diverse Device Heterogeneous Federated Learning via Task Arithmetic Knowledge Integration | Mahdi Morafah et.al. | 2409.18461 | link |
2024-10-01 | Backdoor Attacks for LLMs with Weak-To-Strong Knowledge Distillation | Shuai Zhao et.al. | 2409.17946 | null |
2024-09-26 | Kendall’s $τ$ Coefficient for Logits Distillation | Yuchen Guan et.al. | 2409.17823 | null |
2024-09-26 | Diversity-Driven Synthesis: Enhancing Dataset Distillation through Directed Weight Adjustment | Jiawei Du et.al. | 2409.17612 | null |
2024-09-26 | Dataset Distillation-based Hybrid Federated Learning on Non-IID Data | Xiufang Shi et.al. | 2409.17517 | null |
2024-09-26 | Shape-intensity knowledge distillation for robust medical image segmentation | Wenhui Dong et.al. | 2409.17503 | link |
2024-09-25 | MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events | Xiaoyu Yang et.al. | 2409.17010 | null |
2024-09-25 | Adverse Weather Optical Flow: Cumulative Homogeneous-Heterogeneous Adaptation | Hanyu Zhou et.al. | 2409.17001 | null |
2024-09-25 | A Novel Framework for Analyzing Structural Transformation in Data-Constrained Economies Using Bayesian Modeling and Machine Learning | Ronald Katende et.al. | 2409.16738 | null |
2024-09-25 | SelectiveKD: A semi-supervised framework for cancer detection in DBT through Knowledge Distillation and Pseudo-labeling | Laurent Dillard et.al. | 2409.16581 | null |
2024-09-24 | AIM 2024 Challenge on UHD Blind Photo Quality Assessment | Vlad Hosu et.al. | 2409.16271 | null |
2024-09-24 | Label-Augmented Dataset Distillation | Seoungyoon Kang et.al. | 2409.16239 | null |
2024-09-25 | Privacy Evaluation Benchmarks for NLP Models | Wei Huang et.al. | 2409.15868 | link |
2024-09-24 | Twin Network Augmentation: A Novel Training Strategy for Improved Spiking Neural Networks and Efficient Weight Quantization | Lucas Deckers et.al. | 2409.15849 | null |
2024-09-23 | TS-TCD: Triplet-Level Cross-Modal Distillation for Time-Series Forecasting Using Large Language Models | Pengfei Wang et.al. | 2409.14978 | null |
2024-09-23 | DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models | Sangyeon Cho et.al. | 2409.14904 | link |
2024-09-23 | Pre-trained Language Model and Knowledge Distillation for Lightweight Sequential Recommendation | Li Li et.al. | 2409.14810 | null |
2024-09-23 | An Adverse Weather-Immune Scheme with Unfolded Regularization and Foundation Model Knowledge Distillation for Street Scene Understanding | Wei-Bin Kou et.al. | 2409.14737 | null |
2024-09-22 | EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models | Hossein Rajabzadeh et.al. | 2409.14595 | null |
2024-09-22 | Prior Knowledge Distillation Network for Face Super-Resolution | Qiu Yang et.al. | 2409.14385 | null |
2024-09-25 | DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation | Xuewen Liu et.al. | 2409.14307 | null |
2024-09-18 | Applications of Knowledge Distillation in Remote Sensing: A Survey | Yassine Himeur et.al. | 2409.12111 | null |
2024-09-18 | Data Efficient Acoustic Scene Classification using Teacher-Informed Confusing Class Instruction | Jin Jie Sean Yeo et.al. | 2409.11964 | null |
2024-09-18 | Distillation-free Scaling of Large SSMs for Images and Videos | Hamid Suleman et.al. | 2409.11867 | null |
2024-09-18 | EFCM: Efficient Fine-tuning on Compressed Models for deployment of large models in medical image analysis | Shaojie Li et.al. | 2409.11817 | null |
2024-09-18 | Efficient Low-Resolution Face Recognition via Bridge Distillation | Shiming Ge et.al. | 2409.11786 | null |
2024-09-18 | RUIE: Retrieval-based Unified Information Extraction using Large Language Model | Xincheng Liao et.al. | 2409.11673 | null |
2024-09-17 | Time-Series Forecasting, Knowledge Distillation, and Refinement within a Multimodal PDE Foundation Model | Derek Jollie et.al. | 2409.11609 | link |
2024-09-17 | Unleashing the Potential of Mamba: Boosting a LiDAR 3D Sparse Detector by Using Cross-Model Knowledge Distillation | Rui Yu et.al. | 2409.11018 | null |
2024-09-17 | Single-stage TTS with Masked Audio Token Modeling and Semantic Knowledge Distillation | Gerard I. Gállego et.al. | 2409.11003 | null |
2024-09-16 | Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning | Amin Karimi Monsefi et.al. | 2409.10362 | null |
2024-09-16 | Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference | Huy-Dung Nguyen et.al. | 2409.10095 | null |
2024-09-14 | Effective Pre-Training of Audio Transformers for Sound Event Detection | Florian Schmid et.al. | 2409.09546 | link |
2024-09-14 | Integrated Multi-Level Knowledge Distillation for Enhanced Speaker Verification | Wenhao Yang et.al. | 2409.09389 | null |
2024-09-14 | Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility | Xiaoyu Liu et.al. | 2409.09357 | null |
2024-09-13 | Exploring System-Heterogeneous Federated Learning with Dynamic Model Selection | Dixi Yao et.al. | 2409.08858 | null |
2024-09-13 | AWF: Adaptive Weight Fusion for Enhanced Class Incremental Semantic Segmentation | Zechao Sun et.al. | 2409.08516 | null |
2024-09-12 | DiReDi: Distillation and Reverse Distillation for AIoT Applications | Chen Sun et.al. | 2409.08308 | null |
2024-09-12 | Ruri: Japanese General Text Embeddings | Hayato Tsukagoshi et.al. | 2409.07737 | link |
2024-09-12 | Learn from Balance: Rectifying Knowledge Transfer for Long-Tailed Scenarios | Xinlei Huang et.al. | 2409.07694 | null |
2024-09-11 | DS-ViT: Dual-Stream Vision Transformer for Cross-Task Distillation in Alzheimer’s Early Diagnosis | Ke Chen et.al. | 2409.07584 | null |
2024-09-11 | EchoDFKD: Data-Free Knowledge Distillation for Cardiac Ultrasound Segmentation using Synthetic Data | Grégoire Petit et.al. | 2409.07566 | null |
2024-09-11 | Enhancing CTC-Based Visual Speech Recognition | Hendrik Laux et.al. | 2409.07210 | null |
2024-09-11 | A Continual and Incremental Learning Approach for TinyML On-device Training Using Dataset Distillation and Model Size Adaption | Marcus Rüb et.al. | 2409.07114 | null |
2024-09-16 | Privacy-Preserving Federated Learning with Consistency via Knowledge Distillation Using Conditional Generator | Kangyang Luo et.al. | 2409.06955 | null |
2024-09-10 | Applied Federated Model Personalisation in the Industrial Domain: A Comparative Study | Ilias Siniosoglou et.al. | 2409.06904 | null |
2024-09-10 | EasyST: A Simple Framework for Spatio-Temporal Prediction | Jiabin Tang et.al. | 2409.06748 | link |
2024-09-10 | Knowledge Distillation via Query Selection for Detection Transformer | Yi Liu et.al. | 2409.06443 | null |
2024-09-10 | Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition | Junzheng Zhang et.al. | 2409.06371 | null |
2024-09-09 | Joint Input and Output Coordination for Class-Incremental Learning | Shuai Wang et.al. | 2409.05620 | null |
2024-09-09 | LEROjD: Lidar Extended Radar-Only Object Detection | Patrick Palmer et.al. | 2409.05564 | link |
2024-09-09 | Look One and More: Distilling Hybrid Order Relational Knowledge for Cross-Resolution Image Recognition | Shiming Ge et.al. | 2409.05384 | null |
2024-09-09 | FedBrain-Distill: Communication-Efficient Federated Brain Tumor Classification Using Ensemble Knowledge Distillation on Non-IID Data | Rasoul Jafari Gohari et.al. | 2409.05359 | link |
2024-09-07 | LoCa: Logit Calibration for Knowledge Distillation | Runming Yang et.al. | 2409.04778 | null |
2024-09-06 | SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields | Yuze Wang et.al. | 2409.04482 | null |
2024-09-05 | Experimentation in Content Moderation using RWKV | Umut Yildirim et.al. | 2409.03939 | null |
2024-09-05 | Data-Efficient Generation for Dataset Distillation | Zhe Li et.al. | 2409.03929 | null |
2024-09-05 | DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture | Qianlong Xiang et.al. | 2409.03550 | null |
2024-09-05 | Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration | Pei Wang et.al. | 2409.03455 | null |
2024-09-05 | Efficient Image Compression Using Advanced State Space Models | Bouzid Arezki et.al. | 2409.02743 | null |
2024-09-04 | CLDA: Collaborative Learning for Enhanced Unsupervised Domain Adaptation | Minhee Cho et.al. | 2409.02699 | null |
2024-09-04 | Low-Resolution Object Recognition with Cross-Resolution Relational Contrastive Distillation | Kangkai Zhang et.al. | 2409.02555 | null |
2024-09-04 | A design of magnetic tunnel junctions for the deployment of neuromorphic hardware for edge computing | Davi Rodrigues et.al. | 2409.02528 | null |
2024-09-04 | Non-target Divergence Hypothesis: Toward Understanding Domain Gaps in Cross-Modal Knowledge Distillation | Yilong Chen et.al. | 2409.02438 | null |
2024-09-03 | Low-Resolution Face Recognition via Adaptable Instance-Relation Distillation | Ruixin Shi et.al. | 2409.02049 | null |
2024-09-03 | Efficient Point Cloud Classification via Offline Distillation Framework and Negative-Weight Self-Distillation Technique | Qiang Zheng et.al. | 2409.02020 | null |
2024-09-03 | Contemporary Model Compression on Large Language Models Inference | Dong Liu et.al. | 2409.01990 | null |
2024-09-05 | Adaptive Explicit Knowledge Transfer for Knowledge Distillation | Hyungkeun Park et.al. | 2409.01679 | null |
2024-09-03 | Improving Apple Object Detection with Occlusion-Enhanced Distillation | Liang Geng et.al. | 2409.01573 | null |
2024-09-02 | Dataset Distillation from First Principles: Integrating Core Information Extraction and Purposeful Learning | Vyacheslav Kungurtsev et.al. | 2409.01410 | null |
2024-09-02 | MobileIQA: Exploiting Mobile-level Diverse Opinion Network For No-Reference Image Quality Assessment Using Knowledge Distillation | Zewen Chen et.al. | 2409.01212 | link |
2024-09-04 | Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning | Jinglin Liang et.al. | 2409.01128 | link |
2024-09-02 | Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment | Aditya Bansal et.al. | 2409.00880 | null |
2024-09-01 | LanguaShrink: Reducing Token Overhead with Psycholinguistics | Xuechen Liang et.al. | 2409.00855 | null |
2024-08-30 | How Knowledge Distillation Mitigates the Synthetic Gap in Fair Face Recognition | Pedro C. Neto et.al. | 2408.17399 | link |
2024-08-30 | HiTSR: A Hierarchical Transformer for Reference-based Super-Resolution | Masoomeh Aslahishahri et.al. | 2408.16959 | link |
2024-08-29 | VLM-KD: Knowledge Distillation from VLM for Long-Tail Visual Recognition | Zaiwei Zhang et.al. | 2408.16930 | null |
2024-08-29 | Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling | Hritik Bansal et.al. | 2408.16737 | null |
2024-08-29 | MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition | Eduarda Caldeira et.al. | 2408.16563 | link |
2024-08-29 | UDD: Dataset Distillation via Mining Underutilized Regions | Shiguang Wang et.al. | 2408.16268 | null |
2024-08-29 | Neural Spectral Decomposition for Dataset Distillation | Shaolei Yang et.al. | 2408.16236 | null |
2024-08-28 | EMP: Enhance Memory in Data Pruning | Jinying Xiao et.al. | 2408.16031 | null |
2024-08-28 | LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Fangxun Shu et.al. | 2408.15881 | link |
2024-08-28 | ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation | Tiantian Feng et.al. | 2408.15803 | null |
2024-08-28 | Online pre-training with long-form videos | Itsuki Kato et.al. | 2408.15651 | null |
2024-08-28 | Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation | Lujun Gui et.al. | 2408.15562 | null |
2024-08-27 | Leveraging Self-supervised Audio Representations for Data-Efficient Acoustic Scene Classification | Yiqiang Cai et.al. | 2408.14862 | link |
2024-08-26 | Bridging the Gap: Unpacking the Hidden Challenges in Knowledge Distillation for Online Ranking Systems | Nikhil Khani et.al. | 2408.14678 | null |
2024-08-26 | TSAK: Two-Stage Semantic-Aware Knowledge Distillation for Efficient Wearable Modality and Model Optimization in Manufacturing Lines | Hymalai Bello et.al. | 2408.14146 | null |
Schrodinger Bridge
Publish Date | Title | Authors | PDF | Code |
---|---|---|---|---|
2024-11-19 | PoM: Efficient Image and Video Generation with the Polynomial Mixer | David Picard et.al. | 2411.12663 | link |
2024-11-19 | Improving Controllability and Editability for Pretrained Text-to-Music Generation Models | Yixiao Zhang et.al. | 2411.12641 | null |
2024-11-19 | Data Pruning in Generative Diffusion Models | Rania Briq et.al. | 2411.12523 | null |
2024-11-19 | Itô, Stratonovich, and zoom-in schemes in stochastic inflation | Eemeli Tomberg et.al. | 2411.12465 | null |
2024-11-19 | Frequency-Aware Guidance for Blind Image Restoration via Diffusion Models | Jun Xiao et.al. | 2411.12450 | null |
2024-11-19 | Combinational Backdoor Attack against Customized Text-to-Image Models | Wenbo Jiang et.al. | 2411.12389 | null |
2024-11-19 | Scalable and Effective Negative Sample Generation for Hyperedge Prediction | Shilin Qu et.al. | 2411.12354 | null |
2024-11-19 | Diffusion Product Quantization | Jie Shao et.al. | 2411.12306 | null |
2024-11-19 | SSEditor: Controllable Mask-to-Scene Generation with Diffusion Model | Haowen Zheng et.al. | 2411.12290 | null |
2024-11-20 | HouseLLM: LLM-Assisted Two-Phase Text-to-Floorplan Generation | Ziyang Zong et.al. | 2411.12279 | null |
2024-11-19 | On sensitivities regarding shape and topology optimization as derivatives on Wasserstein spaces | Fumiya Okazaki et.al. | 2411.12234 | null |
2024-11-19 | Wavespeed selection of travelling wave solutions of a two-component reaction-diffusion model of cell invasion | Yuhui Chen et.al. | 2411.12232 | null |
2024-11-19 | Constant Rate Schedule: Constant-Rate Distributional Change for Efficient Training and Sampling in Diffusion Models | Shuntaro Okada et.al. | 2411.12188 | null |
2024-11-19 | Diffusion-Inspired Cold Start with Sufficient Prior in Computerized Adaptive Testing | Haiping Ma et.al. | 2411.12182 | link |
2024-11-19 | Enhancing Low Dose Computed Tomography Images Using Consistency Training Techniques | Mahmut S. Gokmen et.al. | 2411.12181 | null |
2024-11-18 | Milstein-type schemes for McKean-Vlasov SDEs driven by Brownian motion and Poisson random measure (with super-linear coefficients) | Sani Biswas et.al. | 2411.11759 | null |
2024-11-18 | Aligning Few-Step Diffusion Models with Dense Reward Difference Learning | Ziyi Zhang et.al. | 2411.11727 | link |
2024-11-18 | Robust Reinforcement Learning under Diffusion Models for Data with Jumps | Chenyang Jiang et.al. | 2411.11697 | null |
2024-11-18 | Conceptwm: A Diffusion Model Watermark for Concept Protection | Liangqi Lei et.al. | 2411.11688 | null |
2024-11-19 | Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation | Rüveyda Yilmaz et.al. | 2411.11515 | null |
2024-11-18 | MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion | Dongseok Shim et.al. | 2411.11475 | null |
2024-11-18 | CLUE-MARK: Watermarking Diffusion Models using CLWE | Kareem Shehata et.al. | 2411.11434 | null |
2024-11-18 | Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge | Qinglong Cao et.al. | 2411.11343 | null |
2024-11-18 | Stochastic quantization and diffusion models | Kenji Fukushima et.al. | 2411.11297 | null |
2024-11-18 | Unbiased Approximations for Stationary Distributions of McKean-Vlasov SDEs | Elsiddig Awadelkarim et.al. | 2411.11270 | null |
2024-11-17 | Stealing Training Graphs from Graph Neural Networks | Minhua Lin et.al. | 2411.11197 | null |
2024-11-17 | DeepSPV: An Interpretable Deep Learning Pipeline for 3D Spleen Volume Estimation from 2D Ultrasound Images | Zhen Yuan et.al. | 2411.11190 | null |
2024-11-17 | Strong Stability Preservation for Stochastic Partial Differential Equations | James Woodfield et.al. | 2411.11172 | null |
2024-11-17 | Integrated Ising Model with global inhibition for decision making | Olga Tapinova et.al. | 2411.11143 | null |
2024-11-17 | Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method | Yan Zheng et.al. | 2411.11135 | null |
2024-11-15 | M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation | Sucheng Ren et.al. | 2411.10433 | null |
2024-11-15 | Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems | Feiqin Zhu et.al. | 2411.10431 | null |
2024-11-15 | Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion | Haoran Wei et.al. | 2411.10369 | null |
2024-11-15 | Probabilistic Prior Driven Attention Mechanism Based on Diffusion Model for Imaging Through Atmospheric Turbulence | Guodong Sun et.al. | 2411.10321 | null |
2024-11-15 | Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting | Ziqi Xie et.al. | 2411.10309 | link |
2024-11-15 | The Unreasonable Effectiveness of Guidance for Diffusion Models | Tim Kaiser et.al. | 2411.10257 | null |
2024-11-15 | Smooth transport map via diffusion process | Arthur Stéphanovitch et.al. | 2411.10235 | null |
2024-11-15 | ColorEdit: Training-free Image-Guided Color editing with diffusion model | Xingxi Yin et.al. | 2411.10232 | null |
2024-11-15 | Fused Gromov-Wasserstein Variance Decomposition with Linear Optimal Transport | Michael Wilson et.al. | 2411.10204 | null |
2024-11-15 | Evaluating Text-to-Image Diffusion Models for Texturing Synthetic Data | Thomas Lips et.al. | 2411.10164 | null |
2024-11-15 | Towards Multi-View Consistent Style Transfer with One-Step Diffusion via Vision Conditioning | Yushen Zuo et.al. | 2411.10130 | null |
2024-11-15 | SPLIT: SE(3)-diffusion via Local Geometry-based Score Prediction for 3D Scene-to-Pose-Set Matching Problems | Kanghyun Kim et.al. | 2411.10049 | null |
2024-11-15 | EyeDiff: text-to-image diffusion model improves rare eye disease diagnosis | Ruoyu Chen et.al. | 2411.10004 | null |
2024-11-15 | Adaptive Non-Uniform Timestep Sampling for Diffusion Model Training | Myunsoo Kim et.al. | 2411.09998 | null |
2024-11-15 | Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era | Thanh Tam Nguyen et.al. | 2411.09955 | null |
2024-11-14 | How to implement the Bayes’ formula in the age of ML? | Amirhossein Taghvaei et.al. | 2411.09653 | null |
2024-11-14 | Golden Noise for Diffusion Models: A Learning Framework | Zikai Zhou et.al. | 2411.09502 | null |
2024-11-14 | DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing | Junjie Zhou et.al. | 2411.09451 | null |
2024-11-14 | Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models | Chutian Meng et.al. | 2411.09449 | null |
2024-11-14 | A survey of probabilistic generative frameworks for molecular simulations | Richard John et.al. | 2411.09388 | link |
2024-11-14 | EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models | Soowon Kim et.al. | 2411.09302 | null |
2024-11-14 | Advancing Diffusion Models: Alias-Free Resampling and Enhanced Rotational Equivariance | Md Fahim Anjum et.al. | 2411.09174 | null |
2024-11-14 | VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation | Youpeng Wen et.al. | 2411.09153 | null |
2024-11-14 | General linear threshold models with application to influence maximization | Alexander Kagan et.al. | 2411.09100 | link |
2024-11-13 | Microfoundation Inference for Strategic Prediction | Daniele Bracale et.al. | 2411.08998 | null |
2024-11-15 | Inconsistencies In Consistency Models: Better ODE Solving Does Not Imply Better Samples | Noël Vouitsis et.al. | 2411.08954 | link |
2024-11-13 | 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization | Mijeong Kim et.al. | 2411.08879 | null |
2024-11-13 | Offline Adaptation of Quadruped Locomotion using Diffusion Models | Reece O’Mahoney et.al. | 2411.08832 | null |
2024-11-13 | Optimal Transport-Based Displacement Interpolation with Data Augmentation for Reduced Order Modeling of Nonlinear Dynamical Systems | Moaad Khamlich et.al. | 2411.08750 | null |
2024-11-13 | Berry-Esseen bounds for large-time asymptotics of one-dimensional diffusion processes via Malliavin-Stein method | Seiichiro Kusuoka et.al. | 2411.08725 | null |
2024-11-13 | A Machine Learning Algorithm for Finite-Horizon Stochastic Control Problems in Economics | Xianhua Peng et.al. | 2411.08668 | null |
2024-11-13 | Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models | Chengdong Dong et.al. | 2411.08642 | null |
2024-11-13 | Neural Topic Modeling with Large Language Models in the Loop | Xiaohao Yang et.al. | 2411.08534 | null |
2024-11-13 | V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion | Xun Huang et.al. | 2411.08402 | link |
2024-11-13 | Physics Informed Distillation for Diffusion Models | Joshua Tian Jin Tee et.al. | 2411.08378 | link |
2024-11-13 | Multiscale Graph Construction Using Non-local Cluster Features | Reina Kaneko et.al. | 2411.08371 | null |
2024-11-13 | Generative AI for Data Augmentation in Wireless Networks: Analysis, Applications, and Case Study | Jinbo Wen et.al. | 2411.08341 | null |
2024-11-13 | Motion Control for Enhanced Complex Action Video Generation | Qiang Zhou et.al. | 2411.08328 | null |
2024-11-13 | Conditional Variable Flow Matching: Transforming Conditional Densities with Amortized Conditional Optimal Transport | Adam P. Generale et.al. | 2411.08314 | link |
2024-11-13 | DNN Task Assignment in UAV Networks: A Generative AI Enhanced Multi-Agent Reinforcement Learning Approach | Xin Tang et.al. | 2411.08299 | null |
2024-11-12 | Joint Diffusion models in Continual Learning | Paweł Skierś et.al. | 2411.08224 | null |
2024-11-12 | Scaling Properties of Diffusion Models for Perceptual Tasks | Rahul Ravishankar et.al. | 2411.08034 | null |
2024-11-12 | GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation | Yushi Lan et.al. | 2411.08033 | null |
2024-11-12 | Approximation rates of entropic maps in semidiscrete optimal transport | Ritwik Sadhu et.al. | 2411.07947 | null |
2024-11-12 | Stochastic MPC for Finite Gaussian Mixture Disturbances with Guarantees | Maico H. W. Engelaar et.al. | 2411.07887 | null |
2024-11-12 | Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules | Binxu Wang et.al. | 2411.07873 | null |
2024-11-12 | Federated Learning for Discrete Optimal Transport with Large Population under Incomplete Information | Navpreet Kaur et.al. | 2411.07841 | null |
2024-11-12 | Novel View Synthesis with Pixel-Space Diffusion Models | Noam Elata et.al. | 2411.07765 | null |
2024-11-12 | Nanosecond nanothermometry in an electron microscope | Florian Castioni et.al. | 2411.07764 | null |
2024-11-12 | Leveraging Previous Steps: A Training-free Fast Solver for Flow Diffusion | Kaiyu Song et.al. | 2411.07627 | null |
2024-11-12 | Unraveling the Connections between Flow Matching and Diffusion Probabilistic Models in Training-free Conditional Generation | Kaiyu Song et.al. | 2411.07625 | null |
2024-11-12 | Harmonizing Pixels and Melodies: Maestro-Guided Film Score Generation and Composition Style Transfer | F. Qi et.al. | 2411.07539 | null |
2024-11-12 | FM-TS: Flow Matching for Time Series Generation | Yang Hu et.al. | 2411.07506 | link |
2024-11-12 | Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors | Anisha Pal et.al. | 2411.07472 | link |
2024-11-12 | Tracing the Roots: Leveraging Temporal Dynamics in Diffusion Trajectories for Origin Attribution | Andreas Floros et.al. | 2411.07449 | null |
2024-11-12 | All-in-one Weather-degraded Image Restoration via Adaptive Degradation-aware Self-prompting Model | Yuanbo Wen et.al. | 2411.07445 | null |
2024-11-11 | Score-based generative diffusion with “active” correlated noise sources | Alexandra Lamtyugina et.al. | 2411.07233 | null |
2024-11-12 | Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models | Yoad Tewel et.al. | 2411.07232 | null |
2024-11-11 | DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID | Nyle Siddiqui et.al. | 2411.07205 | link |
2024-11-11 | Crossover from inhomogeneous to homogeneous response of a resonantly driven hBN quantum emitter | Domitille Gérard et.al. | 2411.07202 | null |
2024-11-11 | OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision | Cong Wei et.al. | 2411.07199 | null |
2024-11-11 | More Expressive Attention with Negative Weights | Ang Lv et.al. | 2411.07176 | link |
2024-11-11 | Rough differential equations in the flow approach | Ajay Chandra et.al. | 2411.07157 | null |
2024-11-11 | Conditional simulation via entropic optimal transport: Toward non-parametric estimation of conditional Brenier maps | Ricardo Baptista et.al. | 2411.07154 | null |
2024-11-11 | Variational Graph Contrastive Learning | Shifeng Xie et.al. | 2411.07150 | link |
2024-11-11 | Edify 3D: Scalable High-Quality 3D Asset Generation | NVIDIA et.al. | 2411.07135 | null |
2024-11-11 | Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models | NVIDIA et.al. | 2411.07126 | null |
2024-11-12 | Distribution dependent SDEs with multiplicative fractional noise | Xiliang Fan et.al. | 2411.06974 | null |
2024-11-11 | Nonparametric estimation of trend for stochastic differential equations driven by multiplicative stochastic volatility | B. L. S. Prakasa Rao et.al. | 2411.06865 | null |
2024-11-11 | The Exponential Lie Series and a Chen-Strichartz Formula for Levy Processes | Kurusch Ebrahimi-Fard et.al. | 2411.06827 | null |
2024-11-11 | White-Box Diffusion Transformer for single-cell RNA-seq generation | Zhuorui Cui et.al. | 2411.06785 | link |
2024-11-08 | StdGEN: Semantic-Decomposed 3D Character Generation from Single Images | Yuze He et.al. | 2411.05738 | null |
2024-11-08 | Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models | Jia-Hong Huang et.al. | 2411.05706 | null |
2024-11-08 | Relative Optimal Transport | Peter Bubenik et.al. | 2411.05678 | null |
2024-11-08 | Improving Molecular Graph Generation with Flow Matching and Optimal Transport | Xiaoyang Hou et.al. | 2411.05676 | null |
2024-11-08 | Rigidly breaking potential flows and a countable Alexandrov theorem for polytopes | Jian-Guo Liu et.al. | 2411.05606 | null |
2024-11-08 | Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | Nan Song et.al. | 2411.05544 | null |
2024-11-08 | Improving image synthesis with diffusion-negative sampling | Alakh Desai et.al. | 2411.05473 | null |
2024-11-08 | Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation | Peidong Liu et.al. | 2411.05472 | link |
2024-11-08 | Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs | Levi Rauchwerger et.al. | 2411.05464 | null |
2024-11-08 | Sticky diffusions on star graphs: characterization and Itô formula | Jules Berry et.al. | 2411.05441 | null |
2024-11-08 | Stochastic games of parental vaccination decision making and bounded rationality | Andras Balogh et.al. | 2411.05369 | null |
2024-11-08 | RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction | Xingyu Ai et.al. | 2411.05354 | null |
2024-11-08 | Electro-diffusive modeling and the role of spine geometry on action potential propagation in neurons | Rahul Gulati et.al. | 2411.05329 | null |
2024-11-08 | Adaptive Whole-Body PET Image Denoising Using 3D Diffusion Models with ControlNet | Boxiao Yu et.al. | 2411.05302 | null |
2024-11-08 | SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding | Ryan Sun et.al. | 2411.05289 | link |
2024-11-07 | SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Muyang Li et.al. | 2411.05007 | link |
2024-11-07 | ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing | Jun-Kun Chen et.al. | 2411.05006 | null |
2024-11-07 | Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models | Shuhong Zheng et.al. | 2411.05005 | null |
2024-11-07 | ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning | David Junhao Zhang et.al. | 2411.05003 | null |
2024-11-07 | SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation | Koichi Namekata et.al. | 2411.04989 | null |
2024-11-07 | Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification | Mischa Dombrowski et.al. | 2411.04956 | null |
2024-11-07 | DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion | Wenqiang Sun et.al. | 2411.04928 | null |
2024-11-07 | Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion | Kaizhe Hu et.al. | 2411.04919 | link |
2024-11-07 | Gluing methods for quantitative stability of optimal transport maps | Cyril Letrouit et.al. | 2411.04908 | null |
2024-11-07 | Coupling between Brownian motion and random walks on the infinite percolation cluster | Chenlin Gu et.al. | 2411.04778 | null |
2024-11-07 | Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation | Benito Buchheim et.al. | 2411.04724 | null |
2024-11-07 | DanceFusion: A Spatio-Temporal Skeleton Diffusion Transformer for Audio-Driven Dance Motion Reconstruction | Li Zhao et.al. | 2411.04646 | null |
2024-11-07 | Brain Tumour Removing and Missing Modality Generation using 3D WDM | André Ferreira et.al. | 2411.04630 | link |
2024-11-07 | Social EgoMesh Estimation | Luca Scofano et.al. | 2411.04598 | link |
2024-11-07 | Series-to-Series Diffusion Bridge Model | Hao Yang et.al. | 2411.04491 | null |
2024-11-06 | Community Forensics: Using Thousands of Generators to Train Fake Image Detectors | Jeongsoo Park et.al. | 2411.04125 | null |
2024-11-06 | A Multi-level Monte Carlo simulation for invariant distribution of Markovian switching Lévy-driven SDEs with super-linearly growth coefficients | Hoang-Viet Nguyen et.al. | 2411.04081 | null |
2024-11-06 | Synomaly Noise and Multi-Stage Diffusion: A Novel Approach for Unsupervised Anomaly Detection in Ultrasound Imaging | Yuan Bi et.al. | 2411.04004 | null |
2024-11-06 | ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy | Chenrui Tie et.al. | 2411.03990 | null |
2024-11-06 | ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models | Ashutosh Srivastava et.al. | 2411.03982 | null |
2024-11-06 | ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization | Huayang Huang et.al. | 2411.03862 | link |
2024-11-06 | Sub-DM: Subspace Diffusion Model with Orthogonal Decomposition for MRI Reconstruction | Yu Guan et.al. | 2411.03758 | null |
2024-11-06 | Zero-shot Dynamic MRI Reconstruction with Global-to-local Diffusion Model | Yu Guan et.al. | 2411.03723 | null |
2024-11-06 | Asymptotic analysis of estimators of ergodic stochastic differential equations | Arnab Ganguly et.al. | 2411.03623 | null |
2024-11-06 | Investigating Conceptual Blending of a Diffusion Model for Improving Nonword-to-Image Generation | Chihaya Matsuhira et.al. | 2411.03595 | null |
2024-11-05 | Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data | Seunggeun Chi et.al. | 2411.03561 | null |
2024-11-05 | Ergodicity and Mixing of Sublinear Expectation System and Applications | Wen Huang et.al. | 2411.03512 | null |
2024-11-05 | SynthSet: Generative Diffusion Model for Semantic Segmentation in Precision Agriculture | Andrew Heschl et.al. | 2411.03505 | link |
2024-11-05 | Chance-Constrained Convex MPC for Robust Quadruped Locomotion Under Parametric and Additive Uncertainties | Ananya Trivedi et.al. | 2411.03481 | link |
2024-11-05 | Exo-Daisy World: Revisiting Gaia Theory through an Informational Architecture Perspective | Damian R Sowinski et.al. | 2411.03421 | null |
2024-11-05 | Information geometry of diffeomorphism groups | Boris Khesin et.al. | 2411.03265 | null |
2024-11-05 | DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | Ying Zhou et.al. | 2411.03250 | null |
2024-11-05 | On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models | Tariq Berrada Ifriqi et.al. | 2411.03177 | null |
2024-11-05 | Unleashing the power of novel conditional generative approaches for new materials discovery | Lev Novitskiy et.al. | 2411.03156 | link |
2024-11-05 | Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and Denoising | Tao Huang et.al. | 2411.03053 | null |
2024-11-05 | GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details | Zhongjin Luo et.al. | 2411.03047 | null |
2024-11-05 | IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems | Heiko Oppel et.al. | 2411.02954 | null |
2024-11-05 | LDPM: Towards undersampled MRI reconstruction with MR-VAE and Latent Diffusion Prior | Xingjian Tang et.al. | 2411.02951 | null |
2024-11-05 | Theoretically Guaranteed Distribution Adaptable Learning | Chao Xu et.al. | 2411.02921 | null |
2024-11-05 | How much is a noisy image worth? Data Scaling Laws for Ambient Diffusion | Giannis Daras et.al. | 2411.02780 | link |
2024-11-04 | Modelling Alzheimer’s Protein Dynamics: A Data-Driven Integration of Stochastic Methods, Machine Learning and Connectome Insights | Alec MacIver et.al. | 2411.02644 | null |
2024-11-04 | Training-free Regional Prompting for Diffusion Transformers | Anthony Chen et.al. | 2411.02395 | link |
2024-11-04 | Diffusion-based Generative Multicasting with Intent-aware Semantic Decomposition | Xinkai Liu et.al. | 2411.02334 | null |
2024-11-04 | LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation | Mufei Li et.al. | 2411.02322 | link |
2024-11-05 | Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation | Xianghui Yang et.al. | 2411.02293 | null |
2024-11-04 | FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training | Ruihong Yin et.al. | 2411.02229 | null |
2024-11-04 | Metric properties of partial and robust Gromov-Wasserstein distances | Jannatul Chhoa et.al. | 2411.02198 | null |
2024-11-04 | CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality | Yiqin Zhao et.al. | 2411.02179 | null |
2024-11-04 | Model Integrity when Unlearning with T2I Diffusion Models | Andrea Schioppa et.al. | 2411.02068 | null |
2024-11-04 | Learning Controlled Stochastic Differential Equations | Luc Brogat-Motte et.al. | 2411.01982 | null |
2024-11-04 | A tamed-adaptive Milstein scheme for stochastic differential equations with low regularity coefficients | Thi-Huong Vu et.al. | 2411.01849 | null |
2024-11-04 | DiffuMask-Editor: A Novel Paradigm of Integration Between the Segmentation Diffusion Model and Image Editing to Improve Segmentation Ability | Bo Gao et.al. | 2411.01819 | null |
2024-11-04 | MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence | Fuming You et.al. | 2411.01805 | null |
2024-11-04 | A Regressor-Guided Graph Diffusion Model for Predicting Enzyme Mutations to Enhance Turnover Number | Xiaozhu Yu et.al. | 2411.01745 | link |
2024-11-04 | xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism | Jiarui Fang et.al. | 2411.01738 | link |
2024-11-04 | LaGDif: Latent Graph Diffusion Model for Efficient Protein Inverse Folding with Self-Ensemble | Taoyu Wu et.al. | 2411.01737 | link |
2024-10-31 | DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion | Weicai Ye et.al. | 2410.24203 | link |
2024-10-31 | Redefining | Fu Feng et.al. | 2410.24160 | null |
2024-10-31 | Scaling Concept With Text-Guided Diffusion Models | Chao Huang et.al. | 2410.24151 | null |
2024-10-31 | Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure | Xiang Li et.al. | 2410.24060 | link |
2024-10-31 | TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation | Sunjae Yoon et.al. | 2410.24037 | null |
2024-10-31 | DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination | Jia Fu et.al. | 2410.24006 | link |
2024-11-01 | Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model | Wenjia Xie et.al. | 2410.23994 | null |
2024-10-31 | Stochastic Reconstruction of Gappy Lagrangian Turbulent Signals by Conditional Diffusion Models | Tianyi Li et.al. | 2410.23971 | null |
2024-10-31 | Image Synthesis with Class-Aware Semantic Diffusion Models for Surgical Scene Segmentation | Yihang Zhou et.al. | 2410.23962 | null |
2024-10-31 | A dynamic programming principle for multiperiod control problems with bicausal constraints | Ruslan Mirmominov et.al. | 2410.23927 | null |
2024-10-31 | Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model | Hao Zhang et.al. | 2410.23905 | link |
2024-10-31 | DiffBatt: A Diffusion Model for Battery Degradation Prediction and Synthesis | Hamidreza Eivazi et.al. | 2410.23893 | link |
2024-10-31 | Denoising Diffusion Models for Anomaly Localization in Medical Images | Cosmin I. Bercea et.al. | 2410.23834 | null |
2024-10-31 | Disentangling Disentangled Representations: Towards Improved Latent Units via Diffusion Models | Youngjun Jun et.al. | 2410.23820 | null |
2024-10-31 | EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching | Xinwang Chen et.al. | 2410.23788 | link |
2024-10-30 | ReferEverything: Towards Segmenting Everything We Can Speak of in Videos | Anurag Bagchi et.al. | 2410.23287 | null |
2024-10-30 | Provable acceleration for diffusion models under minimal assumptions | Gen Li et.al. | 2410.23285 | null |
2024-10-30 | RelationBooth: Towards Relation-Aware Customized Object Generation | Qingyu Shi et.al. | 2410.23280 | null |
2024-10-30 | SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation | Yining Hong et.al. | 2410.23277 | null |
2024-10-30 | Multi-student Diffusion Distillation for Better One-step Generators | Yanke Song et.al. | 2410.23274 | null |
2024-10-30 | A uniform point vortex approximation for the solution of the two-dimensional Navier Stokes equation with transport noise | Filippo Giovagnini et.al. | 2410.23163 | null |
2024-10-30 | Identifiability of the Optimal Transport Cost on Finite Spaces | Alberto González-Sanz et.al. | 2410.23146 | null |
2024-10-30 | CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Mingkun Zhang et.al. | 2410.23091 | null |
2024-10-30 | Controlling Language and Diffusion Models by Transporting Activations | Pau Rodriguez et.al. | 2410.23054 | null |
2024-10-30 | Improving Musical Accompaniment Co-creation via Diffusion Transformers | Javier Nistal et.al. | 2410.23005 | null |
2024-10-30 | DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes | Jialiang Zhang et.al. | 2410.23004 | null |
2024-10-30 | LumiSculpt: A Consistency Lighting Control Network for Video Generation | Yuxin Zhang et.al. | 2410.22979 | null |
2024-10-30 | Private Synthetic Text Generation with Diffusion Models | Sebastian Ochs et.al. | 2410.22971 | link |
2024-10-31 | DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing Data | Hanyang Chen et.al. | 2410.22938 | link |
2024-10-30 | HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models | Shengkai Zhang et.al. | 2410.22901 | link |
2024-10-29 | Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models | Raman Dutt et.al. | 2410.22149 | link |
2024-10-29 | Averaging principle for multiscale controlled jump diffusions and associated nonlocal HJB equations | Qi Zhang et.al. | 2410.22141 | null |
2024-10-29 | Variational inference for pile-up removal at hadron colliders with diffusion models | Malte Algren et.al. | 2410.22074 | null |
2024-10-29 | Self-normalized Cramér-type Moderate Deviation of Stochastic Gradient Langevin Dynamics | Hongsheng Dai et.al. | 2410.22047 | null |
2024-10-29 | Dual Conditional Diffusion Models for Sequential Recommendation | Hongtao Huang et.al. | 2410.21967 | null |
2024-10-29 | PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference | Kendong Liu et.al. | 2410.21966 | null |
2024-10-29 | CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach | Dac Thai Nguyen et.al. | 2410.21932 | link |
2024-10-29 | Guided Diffusion-based Counterfactual Augmentation for Robust Session-based Recommendation | Muskan Gupta et.al. | 2410.21892 | null |
2024-10-29 | On invariance of observability for BSDEs and its applications to stochastic control systems | Bao-Zhu Guo et.al. | 2410.21863 | null |
2024-10-29 | Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model | Yiming Ji et.al. | 2410.21842 | null |
2024-10-29 | Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images | Suhyun Ahn et.al. | 2410.21826 | link |
2024-10-29 | Robot Policy Learning with Temporal Optimal Transport Reward | Yuwei Fu et.al. | 2410.21795 | link |
2024-10-29 | HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion | Yu Zeng et.al. | 2410.21789 | null |
2024-10-29 | DiffusionVel: Multi-Information Integrated Velocity Inversion Using Generative Diffusion Models | Hao Zhang et.al. | 2410.21776 | null |
2024-10-30 | IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models | Hang Guo et.al. | 2410.21759 | null |
2024-10-28 | On Inductive Biases That Enable Generalization of Diffusion Transformers | Jie An et.al. | 2410.21273 | link |
2024-10-28 | One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation | Zhendong Wang et.al. | 2410.21257 | null |
2024-10-28 | $\texttt{skwdro}$ : a library for Wasserstein distributionally robust machine learning | Florian Vincent et.al. | 2410.21231 | link |
2024-10-28 | On learning higher-order cumulants in diffusion models | Gert Aarts et.al. | 2410.21212 | null |
2024-10-28 | Trajectory Flow Matching with Applications to Clinical Time Series Modeling | Xi Zhang et.al. | 2410.21154 | link |
2024-10-28 | Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences | Zhihao Zhao et.al. | 2410.21130 | null |
2024-10-28 | Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models | Wenda Li et.al. | 2410.21088 | link |
2024-10-28 | Federated Time Series Generation on Feature and Temporally Misaligned Data | Chenrui Fan et.al. | 2410.21072 | null |
2024-10-28 | Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework | Vladimir Arkhipkin et.al. | 2410.21061 | link |
2024-10-28 | Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | Justin Deschenaux et.al. | 2410.21035 | link |
2024-10-28 | Reference-Free Formula Drift with Reinforcement Learning: From Driving Data to Tire Energy-Inspired, Real-World Policies | Franck Djeumou et.al. | 2410.20990 | null |
2024-10-29 | EEG-Driven 3D Object Reconstruction with Color Consistency and Diffusion Prior | Xin Xiang et.al. | 2410.20981 | null |
2024-10-28 | Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models! | Arash Marioriyad et.al. | 2410.20972 | null |
2024-10-28 | Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models | Weijian Luo et.al. | 2410.20898 | null |
2024-10-28 | Novel Object Synthesis via Adaptive Text-Image Harmony | Zeren Xiong et.al. | 2410.20823 | null |
2024-10-25 | Adversarial Environment Design via Regret-Guided Diffusion Models | Hojun Chung et.al. | 2410.19715 | null |
2024-10-25 | DiffGS: Functional Gaussian Splatting Diffusion | Junsheng Zhou et.al. | 2410.19657 | null |
2024-10-25 | Diffusion models for lattice gauge field simulations | Qianteng Zhu et.al. | 2410.19602 | null |
2024-10-25 | On the robustness of semi-discrete optimal transport | Davy Paindaveine et.al. | 2410.19596 | null |
2024-10-25 | Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series | Ilan Naiman et.al. | 2410.19538 | null |
2024-10-25 | Ensemble Data Assimilation for Particle-based Methods | Marius Duvillard et.al. | 2410.19525 | null |
2024-10-28 | NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction | Zixuan Gong et.al. | 2410.19452 | link |
2024-10-25 | Learned Reference-based Diffusion Sampling for multi-modal distributions | Maxence Noble et.al. | 2410.19449 | null |
2024-10-25 | Generative Diffusion Models for Sequential Recommendations | Sharare Zolghadr et.al. | 2410.19429 | null |
2024-10-25 | FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality | Zhengyao Lv et.al. | 2410.19355 | null |
2024-10-25 | High Resolution Seismic Waveform Generation using Denoising Diffusion | Andreas Bergmeister et.al. | 2410.19343 | null |
2024-10-25 | Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion | Emiel Hoogeboom et.al. | 2410.19324 | null |
2024-10-25 | A prescriptive theory for brain-like inference | Hadi Vafaii et.al. | 2410.19315 | null |
2024-10-25 | TEARS: Textual Representations for Scrutable Recommendations | Emiliano Penaloza et.al. | 2410.19302 | null |
2024-10-25 | A Flow-based Truncated Denoising Diffusion Model for Super-resolution Magnetic Resonance Spectroscopic Imaging | Siyuan Dong et.al. | 2410.19288 | null |
2024-10-24 | MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms | Ling-Hao Chen et.al. | 2410.18977 | null |
2024-10-24 | 3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation | Hansheng Chen et.al. | 2410.18974 | link |
2024-10-24 | On the Crucial Role of Initialization for Matrix Factorization | Bingcong Li et.al. | 2410.18965 | null |
2024-10-24 | Stable Consistency Tuning: Understanding and Improving Consistency Models | Fu-Yun Wang et.al. | 2410.18958 | link |
2024-10-24 | Generation of synthetic financial time series by diffusion models | Tomonori Takahashi et.al. | 2410.18897 | null |
2024-10-24 | The Cat and Mouse Game: The Ongoing Arms Race Between Diffusion Models and Detection Methods | Linda Laurier et.al. | 2410.18866 | null |
2024-10-24 | Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation | Xiaoyu Zhang et.al. | 2410.18830 | null |
2024-10-24 | Fast constrained sampling in pre-trained diffusion models | Alexandros Graikos et.al. | 2410.18804 | null |
2024-10-24 | Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances | Shilin Lu et.al. | 2410.18775 | link |
2024-10-25 | Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing | Haonan Lin et.al. | 2410.18756 | null |
2024-10-24 | Rectified Diffusion Guidance for Conditional Generation | Mengfei Xia et.al. | 2410.18737 | null |
2024-10-24 | Retrieval-Augmented Diffusion Models for Time Series Forecasting | Jingwei Liu et.al. | 2410.18712 | link |
2024-10-24 | Ali-AUG: Innovative Approaches to Labeled Data Augmentation using One-Step Diffusion Model | Ali Hamza et.al. | 2410.18678 | null |
2024-10-24 | DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation | Yuang Ai et.al. | 2410.18666 | link |
2024-10-25 | Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Model | Jinxu Lin et.al. | 2410.18639 | null |
2024-10-23 | DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes | Hengwei Bian et.al. | 2410.18084 | null |
2024-10-23 | Prioritized Generative Replay | Renhao Wang et.al. | 2410.18082 | null |
2024-10-23 | Optical Generative Models | Shiqi Chen et.al. | 2410.17970 | null |
2024-10-23 | A Wavelet Diffusion GAN for Image Super-Resolution | Lorenzo Aloisi et.al. | 2410.17966 | null |
2024-10-23 | Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation | Wenfang Yao et.al. | 2410.17918 | link |
2024-10-23 | Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Shansan Gong et.al. | 2410.17891 | link |
2024-10-23 | Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech | Danilo de Oliveira et.al. | 2410.17834 | null |
2024-10-23 | PGDiffSeg: Prior-Guided Denoising Diffusion Model with Parameter-Shared Attention for Breast Cancer Segmentation | Feiyan Feng et.al. | 2410.17812 | null |
2024-10-23 | AdaDiffSR: Adaptive Region-aware Dynamic Acceleration Diffusion Model for Real-World Image Super-Resolution | Yuanting Fan et.al. | 2410.17752 | null |
2024-10-23 | VISAGE: Video Synthesis using Action Graphs for Surgery | Yousef Yeganeh et.al. | 2410.17751 | null |
2024-10-23 | Optimal Impulse Control for Cyber Risk Management | Caroline Hillairet et.al. | 2410.17706 | null |
2024-10-23 | Deep Generative Models for 3D Medical Image Synthesis | Paul Friedrich et.al. | 2410.17664 | null |
2024-10-23 | Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation | Muquan Li et.al. | 2410.17606 | link |
2024-10-23 | How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization? | Jiahua Dong et.al. | 2410.17594 | link |
2024-10-23 | GDDA: Semantic OOD Detection on Graphs under Covariate Shift via Score-Based Diffusion Models | Zhixia He et.al. | 2410.17526 | null |
2024-10-22 | Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding | Yasha Ektefaie et.al. | 2410.17173 | link |
2024-10-22 | CLAP: Concave Linear APproximation for Quadratic Graph Matching | Yongqing Liang et.al. | 2410.17101 | link |
2024-10-22 | DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization | Haowei Zhu et.al. | 2410.16942 | null |
2024-10-22 | Hierarchical Clustering for Conditional Diffusion in Image Generation | Jorge da Silva Goncalves et.al. | 2410.16910 | link |
2024-10-22 | VistaDream: Sampling multiview consistent images for single-view scene reconstruction | Haiping Wang et.al. | 2410.16892 | null |
2024-10-22 | MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model | Meng Xu et.al. | 2410.16840 | null |
2024-10-22 | Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection | Laurent Colbois et.al. | 2410.16802 | link |
2024-10-22 | One-Step Diffusion Distillation through Score Implicit Matching | Weijian Luo et.al. | 2410.16794 | link |
2024-10-22 | LLM-Assisted Red Teaming of Diffusion Models through “Failures Are Fated, But Can Be Faded” | Som Sagar et.al. | 2410.16738 | null |
2024-10-22 | Polyp-E: Benchmarking the Robustness of Deep Segmentation Models via Polyp Editing | Runpu Wei et.al. | 2410.16732 | null |
2024-10-22 | DiffusionSeeder: Seeding Motion Optimization with Diffusion for Rapid Motion Planning | Huang Huang et.al. | 2410.16727 | null |
2024-10-22 | Progressive Compositionality In Text-to-Image Generative Models | Xu Han et.al. | 2410.16719 | null |
2024-10-22 | Governing equation discovery of a complex system from snapshots | Qunxi Zhu et.al. | 2410.16694 | null |
2024-10-22 | DARE: Diffusion Policy for Autonomous Robot Exploration | Yuhong Cao et.al. | 2410.16687 | null |
2024-10-22 | NucleiMix: Realistic Data Augmentation for Nuclei Instance Segmentation | Jiamu Wang et.al. | 2410.16671 | null |
2024-10-21 | MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors | Honghua Chen et.al. | 2410.16272 | null |
2024-10-21 | A Framework for Evaluating Predictive Models Using Synthetic Image Covariates and Longitudinal Data | Simon Deltadahl et.al. | 2410.16177 | null |
2024-10-22 | Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models | Giannis Daras et.al. | 2410.16152 | null |
2024-10-21 | SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation | Xinyi Zhou et.al. | 2410.16119 | null |
2024-10-21 | Continuous Speech Synthesis using per-token Latent Diffusion | Arnon Turetzky et.al. | 2410.16048 | null |
2024-10-22 | CamI2V: Camera-Controlled Image-to-Video Diffusion Model | Guangcong Zheng et.al. | 2410.15957 | link |
2024-10-21 | Global existence and mean-field limit for a stochastic interacting particle system of signed Coulomb charges | Patrick van Meurs et.al. | 2410.15855 | null |
2024-10-21 | Learning signals defined on graphs with optimal transport and Gaussian process regression | Raphaël Carpintero Perez et.al. | 2410.15721 | null |
2024-10-21 | Quantiles and Quantile Regression on Riemannian Manifolds: a measure-transportation-based approach | Marc Hallin et.al. | 2410.15711 | null |
2024-10-21 | Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces | Jifeng Hu et.al. | 2410.15698 | null |
2024-10-21 | Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation | Anh Bui et.al. | 2410.15618 | link |
2024-10-20 | Data Augmentation via Diffusion Model to Enhance AI Fairness | Christina Hastings Blow et.al. | 2410.15470 | null |
2024-10-20 | MedDiff-FM: A Diffusion-based Foundation Model for Versatile Medical Image Applications | Yongrui Yu et.al. | 2410.15432 | null |
2024-10-20 | ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps | Yulin Song et.al. | 2410.15342 | null |
2024-10-20 | Diffusion-PINN Sampler | Zhekun Shi et.al. | 2410.15336 | null |
2024-10-18 | A Lipschitz spaces view of infinitely wide shallow neural networks | Francesca Bartolucci et.al. | 2410.14591 | null |
2024-10-18 | Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets | Namid R. Stillman et.al. | 2410.14587 | null |
2024-10-18 | Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior | Calvin-Khang Ta et.al. | 2410.14540 | null |
2024-10-18 | LEAD: Latent Realignment for Human Motion Diffusion | Nefeli Andreou et.al. | 2410.14508 | null |
2024-10-18 | Reinforcement Learning in Non-Markov Market-Making | Luca Lalor et.al. | 2410.14504 | null |
2024-10-18 | ANT: Adaptive Noise Schedule for Time Series Diffusion Models | Seunghan Lee et.al. | 2410.14488 | link |
2024-10-18 | DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation | Junjie Wu et.al. | 2410.14481 | null |
2024-10-18 | FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models | Rui Hu et.al. | 2410.14429 | null |
2024-10-18 | Dynamic Negative Guidance of Diffusion Models | Felix Koulischer et.al. | 2410.14398 | null |
2024-10-18 | Unscrambling disease progression at scale: fast inference of event permutations with optimal transport | Peter A. Wijeratne et.al. | 2410.14388 | null |
2024-10-18 | HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation | Bo Cheng et.al. | 2410.14324 | link |
2024-10-18 | A class of kernel-based scalable algorithms for data science | Philippe G. LeFloch et.al. | 2410.14323 | null |
2024-10-18 | ClearSR: Latent Low-Resolution Image Embeddings Help Diffusion-Based Real-World Super Resolution Models See Clearer | Yuhao Wan et.al. | 2410.14279 | null |
2024-10-18 | HYPNOS: Highly Precise Foreground-focused Diffusion Finetuning for Inanimate Objects | Oliverio Theophilus Nathanael et.al. | 2410.14265 | null |
2024-10-18 | ERDDCI: Exact Reversible Diffusion via Dual-Chain Inversion for High-Quality Image Editing | Jimin Dai et.al. | 2410.14247 | null |
2024-10-17 | Diffusing States and Matching Scores: A New Framework for Imitation Learning | Runzhe Wu et.al. | 2410.13855 | link |
2024-10-17 | Influence Functions for Scalable Data Attribution in Diffusion Models | Bruno Mlodozeniec et.al. | 2410.13850 | null |
2024-10-17 | Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning | Xiaodan Xing et.al. | 2410.13823 | link |
2024-10-17 | ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution | Junhao Gu et.al. | 2410.13807 | null |
2024-10-17 | Probing the Latent Hierarchical Structure of Data via Diffusion Models | Antonio Sclocchi et.al. | 2410.13770 | null |
2024-10-17 | Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers | Yuchen Liang et.al. | 2410.13746 | null |
2024-10-17 | Improved Convergence Rate for Diffusion Probabilistic Models | Gen Li et.al. | 2410.13738 | null |
2024-10-18 | DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation | Hanbo Cheng et.al. | 2410.13726 | link |
2024-10-18 | Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion | Yijun Liang et.al. | 2410.13674 | link |
2024-10-17 | Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | Chenyu Wang et.al. | 2410.13643 | link |
2024-10-17 | Preference Aligned Diffusion Planner for Quadrupedal Locomotion Control | Xinyi Yuan et.al. | 2410.13586 | null |
2024-10-17 | Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data? | Che Liu et.al. | 2410.13523 | null |
2024-10-17 | Solving Prior Distribution Mismatch in Diffusion Models via Optimal Transport | Zhanpeng Wang et.al. | 2410.13431 | null |
2024-10-17 | MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models | Donghao Zhou et.al. | 2410.13370 | null |
2024-10-17 | DiffImp: Efficient Diffusion Model for Probabilistic Time Series Imputation with Bidirectional Mamba Backbone | Hongfan Gao et.al. | 2410.13338 | null |
2024-10-16 | Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts | Hongcheng Gao et.al. | 2410.12777 | link |
2024-10-16 | SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Jaehong Yoon et.al. | 2410.12761 | null |
2024-10-16 | Geometry and Duality of Alternating Markov Chains | Deven Mithal et.al. | 2410.12721 | null |
2024-10-16 | Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization | Xingqi Wang et.al. | 2410.12700 | link |
2024-10-16 | AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing | DuoSheng Chen et.al. | 2410.12696 | null |
2024-10-16 | One Step Diffusion via Shortcut Models | Kevin Frans et.al. | 2410.12557 | link |
2024-10-16 | Disentangling data distribution for Federated Learning | Xinyuan Zhao et.al. | 2410.12530 | null |
2024-10-16 | Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing | Mingce Guo et.al. | 2410.12526 | null |
2024-10-16 | Price impact and long-term profitability of energy storage | Roxana Dumitrescu et.al. | 2410.12495 | null |
2024-10-16 | Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective | Yongxin Zhu et.al. | 2410.12490 | link |
2024-10-16 | A Class of Degenerate Mean Field Games, Associated FBSDEs and Master Equations | Alain Bensoussan et.al. | 2410.12404 | null |
2024-10-16 | DaDiff: Domain-aware Diffusion Model for Nighttime UAV Tracking | Haobo Zuo et.al. | 2410.12270 | link |
2024-10-16 | FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation | Huadai Liu et.al. | 2410.12266 | null |
2024-10-17 | Expected Sliced Transport Plans | Xinran Liu et.al. | 2410.12176 | null |
2024-10-16 | Preference Optimization with Multi-Sample Comparisons | Chaoqi Wang et.al. | 2410.12138 | null |
2024-10-15 | High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion | Junhwa Hur et.al. | 2410.11838 | null |
2024-10-15 | On the Effectiveness of Dataset Alignment for Fake Image Detection | Anirudh Sundara Rajan et.al. | 2410.11835 | null |
2024-10-15 | Bayesian Experimental Design via Contrastive Diffusions | Jacopo Iollo et.al. | 2410.11826 | link |
2024-10-15 | Improving Long-Text Alignment for Text-to-Image Diffusion Models | Luping Liu et.al. | 2410.11817 | link |
2024-10-15 | SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing | Zhiyuan Zhang et.al. | 2410.11815 | null |
2024-10-16 | Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Zhiyuan Ma et.al. | 2410.11795 | null |
2024-10-15 | Probabilistic Principles for Biophysics and Neuroscience: Entropy Production, Bayesian Mechanics & the Free-Energy Principle | Lancelot Da Costa et.al. | 2410.11735 | null |
2024-10-15 | Patch-Based Diffusion Models Beat Whole-Image Models for Mismatched Distribution Inverse Problems | Jason Hu et.al. | 2410.11730 | null |
2024-10-15 | On the potential of Optimal Transport in Geospatial Data Science | Nina Wiedemann et.al. | 2410.11709 | link |
2024-10-15 | Optimal Finite-time Maxwell’s Demons in Langevin Systems | Takuya Kamijima et.al. | 2410.11603 | null |
2024-10-15 | DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment | Wendi Chen et.al. | 2410.11584 | link |
2024-10-15 | Bayesian inference of mixed Gaussian phylogenetic models | Bayu Brahmantio et.al. | 2410.11548 | link |
2024-10-15 | Riemann-Liouville fractional Brownian motion with random Hurst exponent | Hubert Woszczek et.al. | 2410.11546 | null |
2024-10-15 | InvSeg: Test-Time Prompt Inversion for Semantic Segmentation | Jiayi Lin et.al. | 2410.11473 | null |
2024-10-15 | A Simple Approach to Unifying Diffusion-based Conditional Generation | Xirui Li et.al. | 2410.11439 | null |
2024-10-14 | Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models | Jingzhi Bao et.al. | 2410.10821 | link |
2024-10-14 | Depth Any Video with Scalable Synthetic Data | Honghui Yang et.al. | 2410.10815 | link |
2024-10-14 | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Haotian Tang et.al. | 2410.10812 | link |
2024-10-14 | TrajDiffuse: A Conditional Diffusion Model for Environment-Aware Trajectory Prediction | Qingze et.al. | 2410.10804 | link |
2024-10-14 | Boosting Camera Motion Control for Video Diffusion Transformers | Soon Yau Cheong et.al. | 2410.10802 | null |
2024-10-14 | Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations | Litu Rout et.al. | 2410.10792 | null |
2024-10-14 | ControlMM: Controllable Masked Motion Generation | Ekkasit Pinyoanuntapong et.al. | 2410.10780 | null |
2024-10-14 | Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation | Youwei Yu et.al. | 2410.10766 | null |
2024-10-14 | DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships | Zhang Wan et.al. | 2410.10751 | null |
2024-10-14 | FlexGen: Flexible Multi-View Generation from Text and Image Inputs | Xinli Xu et.al. | 2410.10745 | null |
2024-10-14 | Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models | Junyu Chen et.al. | 2410.10733 | link |
2024-10-14 | TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model | Jiazhi Guan et.al. | 2410.10696 | null |
2024-10-14 | Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation | Peiwen Sun et.al. | 2410.10676 | null |
2024-10-14 | Generating Model Parameters for Controlling: Parameter Diffusion for Controllable Multi-Task Recommendation | Chenglei Shen et.al. | 2410.10639 | null |
2024-10-15 | SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Enze Xie et.al. | 2410.10629 | null |
2024-10-11 | SceneCraft: Layout-Guided 3D Scene Generation | Xiuyu Yang et.al. | 2410.09049 | link |
2024-10-11 | Linear Convergence of Diffusion Models Under the Manifold Hypothesis | Peter Potaptchik et.al. | 2410.09046 | null |
2024-10-11 | Semantic Score Distillation Sampling for Compositional Text-to-3D Generation | Ling Yang et.al. | 2410.09009 | link |
2024-10-11 | WaveDiffusion: Exploring Full Waveform Inversion via Joint Diffusion in the Latent Space | Hanchen Wang et.al. | 2410.09002 | null |
2024-10-11 | Gradient-adjusted underdamped Langevin dynamics for sampling | Xinzhe Zuo et.al. | 2410.08987 | null |
2024-10-11 | DiffPO: A causal diffusion model for learning distributions of potential outcomes | Yuchen Ma et.al. | 2410.08924 | null |
2024-10-11 | Lifelong Event Detection via Optimal Transport | Viet Dao et.al. | 2410.08905 | null |
2024-10-11 | Domain decomposition for entropic unbalanced optimal transport | Ismael Medina et.al. | 2410.08859 | link |
2024-10-11 | Zero-Shot Offline Imitation Learning via Optimal Transport | Thomas Rupf et.al. | 2410.08751 | link |
2024-10-11 | Multi-dimensional non-Markovian backward stochastic differential equations of interactively quadratic generators | Shengjun Fan et.al. | 2410.08748 | null |
2024-10-11 | Distillation of Discrete Diffusion through Dimensional Correlations | Satoshi Hayakawa et.al. | 2410.08709 | null |
2024-10-14 | Gait Sequence Upsampling using Diffusion Models for Single LiDAR Sensors | Jeongho Ahn et.al. | 2410.08680 | null |
2024-10-11 | E-Motion: Future Motion Simulation via Event Sequence Diffusion | Song Wu et.al. | 2410.08649 | link |
2024-10-11 | Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism via Dual Diffusion Models and GPT Prompting | Purushothaman Natarajan et.al. | 2410.08612 | link |
2024-10-11 | Context-Aware Full Body Anonymization using Text-to-Image Diffusion Models | Pascal Zwick et.al. | 2410.08551 | link |
2024-10-10 | DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models | Xiaoxiao He et.al. | 2410.08207 | null |
2024-10-10 | HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation | Shanyan Guan et.al. | 2410.08192 | null |
2024-10-10 | DifFRelight: Diffusion-Based Facial Performance Relighting | Mingming He et.al. | 2410.08188 | null |
2024-10-10 | ZeroComp: Zero-shot Object Compositing from Image Intrinsics via Diffusion | Zitian Zhang et.al. | 2410.08168 | null |
2024-10-10 | DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | Jiatao Gu et.al. | 2410.08159 | null |
2024-10-10 | Progressive Autoregressive Video Diffusion Models | Desai Xie et.al. | 2410.08151 | link |
2024-10-10 | Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction | Jarrid Rector-Brooks et.al. | 2410.08134 | null |
2024-10-10 | On Barycenter Computation: Semi-Unbalanced Optimal Transport-based Method on Gaussians | Ngoc-Hai Nguyen et.al. | 2410.08117 | null |
2024-10-10 | CrackSegDiff: Diffusion Probability Model-based Multi-modal Crack Segmentation | Xiaoyan Jiang et.al. | 2410.08100 | link |
2024-10-10 | Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models | Vinith M. Suriyakumar et.al. | 2410.08074 | null |
2024-10-10 | Optimal Transportation by Orthogonal Coupling Dynamics | Mohsen Sadr et.al. | 2410.08060 | null |
2024-10-10 | LADIMO: Face Morph Generation through Biometric Template Inversion with Latent Diffusion | Marcel Grimmer et.al. | 2410.07988 | link |
2024-10-10 | Convex comparison of Gaussian mixtures | Benjamin Jourdain et.al. | 2410.07958 | null |
2024-10-10 | AI Surrogate Model for Distributed Computing Workloads | David K. Park et.al. | 2410.07940 | null |
2024-10-10 | Congestion and Penalization in Optimal Transport | Marcelo Gallardo et.al. | 2410.07363 | null |
2024-10-09 | IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | Xinchen Zhang et.al. | 2410.07171 | link |
2024-10-09 | AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation | Yukang Cao et.al. | 2410.07164 | null |
2024-10-09 | InstructG2I: Synthesizing Images from Multimodal Attributed Graphs | Bowen Jin et.al. | 2410.07157 | link |
2024-10-09 | Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis | Bohan Zeng et.al. | 2410.07155 | link |
2024-10-09 | Through the Looking Glass: Mirror Schrödinger Bridges | Leticia Mattos Da Silva et.al. | 2410.07003 | null |
2024-10-09 | Diffusion Density Estimators | Akhil Premkumar et.al. | 2410.06986 | null |
2024-10-09 | Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control | Shimon Vainer et.al. | 2410.06985 | null |
2024-10-09 | Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think | Sihyun Yu et.al. | 2410.06940 | link |
2024-10-09 | Boosting Few-Shot Detection with Large Language Models and Layout-to-Image Synthesis | Ahmed Abdullah et.al. | 2410.06841 | null |
2024-10-09 | Diffuse or Confuse: A Diffusion Deepfake Speech Dataset | Anton Firc et.al. | 2410.06796 | link |
2024-10-09 | Diff-FMT: Diffusion Models for Fluorescence Molecular Tomography | Qianqian Xue et.al. | 2410.06757 | null |
2024-10-10 | Suppress Content Shift: Better Diffusion Features via Off-the-Shelf Generation Techniques | Benyuan Meng et.al. | 2410.06719 | link |
2024-10-09 | Decouple-Then-Merge: Towards Better Training for Diffusion Models | Qianli Ma et.al. | 2410.06664 | null |
2024-10-09 | WardropNet: Traffic Flow Predictions via Equilibrium-Augmented Learning | Kai Jungel et.al. | 2410.06656 | link |
2024-10-10 | DeepMuon: Accelerating Cosmic-Ray Muon Simulation Based on Optimal Transport | Ao-Bo Wang et.al. | 2410.06539 | link |
2024-10-07 | DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control | Kaifeng Zhao et.al. | 2410.05260 | null |
2024-10-07 | GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting | Yukang Cao et.al. | 2410.05259 | null |
2024-10-07 | SePPO: Semi-Policy Preference Optimization for Diffusion Alignment | Daoan Zhang et.al. | 2410.05255 | link |
2024-10-07 | DiffuseReg: Denoising Diffusion Model for Obtaining Deformation Fields in Unsupervised Deformable Image Registration | Yongtai Zhuo et.al. | 2410.05234 | link |
2024-10-07 | Presto! Distilling Steps and Layers for Accelerating Music Generation | Zachary Novack et.al. | 2410.05167 | null |
2024-10-08 | A Simulation-Free Deep Learning Approach to Stochastic Optimal Control | Mengjian Hua et.al. | 2410.05163 | null |
2024-10-07 | Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information | Timofey Efimov et.al. | 2410.05143 | null |
2024-10-07 | Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning | Ayano Hiranaka et.al. | 2410.05116 | null |
2024-10-07 | DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects | Nidhi Mathihalli et.al. | 2410.05097 | link |
2024-10-07 | A nodally bound-preserving discontinuous Galerkin method for the drift-diffusion equation | Gabriel R. Barrenechea et.al. | 2410.05040 | null |
2024-10-07 | Revealing Directions for Text-guided 3D Face Editing | Zhuo Chen et.al. | 2410.04965 | null |
2024-10-07 | Low-Rank Continual Personalization of Diffusion Models | Łukasz Staniszewski et.al. | 2410.04891 | null |
2024-10-07 | Patch is Enough: Naturalistic Adversarial Patch against Vision-Language Pre-training Models | Dehong Kong et.al. | 2410.04884 | null |
2024-10-07 | Artificial Barriers for stochastic differential equations and for construction of Boundary-preserving schemes | Johan Ulander et.al. | 2410.04850 | null |
2024-10-07 | Real-time cardiac cine MRI – A comparison of a diffusion probabilistic model with alternative state-of-the-art image reconstruction techniques for undersampled spiral acquisitions | Oliver Schad et.al. | 2410.04843 | null |
2024-10-04 | Estimating Body and Hand Motion in an Ego-sensed World | Brent Yi et.al. | 2410.03665 | null |
2024-10-04 | Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models | Chumeng Liang et.al. | 2410.03640 | link |
2024-10-04 | How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework | Yinuo Ren et.al. | 2410.03601 | null |
2024-10-04 | Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features | Benyuan Meng et.al. | 2410.03558 | link |
2024-10-04 | Diffusion State-Guided Projected Gradient for Inverse Problems | Rayhan Zirvi et.al. | 2410.03463 | null |
2024-10-04 | Generative Semantic Communication for Text-to-Speech Synthesis | Jiahao Zheng et.al. | 2410.03459 | null |
2024-10-04 | Dynamic Diffusion Transformer | Wangbo Zhao et.al. | 2410.03456 | link |
2024-10-04 | CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control | Guy Tevet et.al. | 2410.03441 | link |
2024-10-04 | Sparsity of Quadratically Regularized Optimal Transport: Bounds on concentration and bias | Johannes Wiesel et.al. | 2410.03425 | null |
2024-10-04 | One2set + Large Language Model: Best Partners for Keyphrase Generation | Liangying Shao et.al. | 2410.03421 | link |
2024-10-04 | The scaling behaviour of localised and extended states in one-dimensional tight-binding models with disorder | Luca Schaefer et.al. | 2410.03405 | null |
2024-10-04 | Latent Abstractions in Generative Diffusion Models | Giulio Franzese et.al. | 2410.03368 | null |
2024-10-04 | LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding | Doohyuk Jang et.al. | 2410.03355 | null |
2024-10-04 | Sparsity of Quadratically Regularized Optimal Transport: Scalar Case | Alberto González-Sanz et.al. | 2410.03353 | null |
2024-10-04 | Optimal Transport for $ε$-Contaminated Credal Sets | Michele Caprio et.al. | 2410.03267 | null |
2024-10-03 | Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models | Zhengfeng Lai et.al. | 2410.02740 | null |
2024-10-03 | NETS: A Non-Equilibrium Transport Sampler | Michael S. Albergo et.al. | 2410.02711 | null |
2024-10-03 | SteerDiff: Steering towards Safe Text-to-Image Diffusion Models | Hongxiang Zhang et.al. | 2410.02710 | null |
2024-10-03 | ControlAR: Controllable Image Generation with Autoregressive Models | Zongming Li et.al. | 2410.02705 | link |
2024-10-03 | Unsupervised Point Cloud Completion through Unbalanced Optimal Transport | Taekyung Lee et.al. | 2410.02671 | null |
2024-10-03 | GUD: Generation with Unified Diffusion | Mathis Gerdes et.al. | 2410.02667 | null |
2024-10-03 | Scalable Simulation-free Entropic Unbalanced Optimal Transport | Jaemoo Choi et.al. | 2410.02656 | null |
2024-10-03 | Efficient calibration of the shifted square-root diffusion model to credit default swap spreads using asymptotic approximations | Ankush Agarwal et.al. | 2410.02645 | null |
2024-10-03 | Inverse Entropic Optimal Transport Solves Semi-supervised Learning via Data Likelihood Maximization | Mikhail Persiianov et.al. | 2410.02628 | null |
2024-10-03 | Diffusion & Adversarial Schrödinger Bridges via Iterative Proportional Markovian Fitting | Sergei Kholkin et.al. | 2410.02601 | null |
2024-10-04 | Diffusion Models are Evolutionary Algorithms | Yanbo Zhang et.al. | 2410.02543 | link |
2024-10-03 | Lightweight Diffusion Models for Resource-Constrained Semantic Communication | Giovanni Pignata et.al. | 2410.02491 | link |
2024-10-03 | Towards a Theoretical Understanding of Memorization in Diffusion Models | Yunhao Chen et.al. | 2410.02467 | null |
2024-10-03 | Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models | Seyedmorteza Sadat et.al. | 2410.02416 | null |
2024-10-03 | Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks | Zeyu Feng et.al. | 2410.02389 | null |
2024-10-02 | FabricDiffusion: High-Fidelity Texture Transfer for 3D Garments Generation from In-The-Wild Clothing Images | Cheng Zhang et.al. | 2410.01801 | null |
2024-10-02 | Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | Yangming Li et.al. | 2410.01796 | null |
2024-10-02 | Learning To Solve Differential Equation Constrained Optimization Problems | Vincenzo Di Vito et.al. | 2410.01786 | null |
2024-10-02 | Dynamical-generative downscaling of climate model ensembles | Ignacio Lopez-Gomez et.al. | 2410.01776 | null |
2024-10-02 | ImageFolder: Autoregressive Image Generation with Folded Tokens | Xiang Li et.al. | 2410.01756 | link |
2024-10-02 | VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models | Kailai Feng et.al. | 2410.01738 | link |
2024-10-02 | HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration | Yushi Huang et.al. | 2410.01723 | null |
2024-10-02 | KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models | Pouyan Navard et.al. | 2410.01595 | link |
2024-10-02 | MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation | Mingzhen Sun et.al. | 2410.01594 | link |
2024-10-02 | HRTF Estimation using a Score-based Prior | Etienne Thuillier et.al. | 2410.01562 | null |
2024-10-02 | Weighted $L^p~(p\geq1)$ solutions of random time horizon BSDEs with stochastic monotonicity generators | Xinying Li et.al. | 2410.01543 | null |
2024-10-02 | Edge-preserving noise for diffusion models | Jente Vandersanden et.al. | 2410.01540 | null |
2024-10-02 | Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation | Jun Hyeong Kim et.al. | 2410.01500 | null |
2024-10-02 | Modeling Cosmic-Ray Transport: A CRPropa based stochastic differential equation solver | Lukas Merten et.al. | 2410.01472 | null |
2024-10-02 | Information-Theoretical Principled Trade-off between Jailbreakability and Stealthiness on Vision Language Models | Ching-Chia Kao et.al. | 2410.01438 | null |
2024-09-30 | COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models | Divyanshu Daiya et.al. | 2409.20502 | null |
2024-09-30 | FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing | Lingling Cai et.al. | 2409.20500 | null |
2024-09-30 | A mean field Jacobi process for modeling sustainable tourism | Hidekazu Yoshioka et.al. | 2409.20347 | null |
2024-09-30 | Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems | Hongkai Zheng et.al. | 2409.20175 | null |
2024-09-30 | Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model | Fulong Ma et.al. | 2409.20164 | null |
2024-09-30 | Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation | Rong Tang et.al. | 2409.20124 | null |
2024-09-30 | Reaction-diffusion model for a population structured in phenotype and space I – Criterion for persistence | Nathanaël Boutillon et.al. | 2409.20118 | null |
2024-09-30 | RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models | Jangyeong Kim et.al. | 2409.19989 | null |
2024-09-30 | Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function | Chenyi Zhuang et.al. | 2409.19967 | link |
2024-10-02 | Image Copy Detection for Diffusion Models | Wenhao Wang et.al. | 2409.19952 | null |
2024-09-30 | Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner | Chenyou Fan et.al. | 2409.19949 | null |
2024-09-30 | Replace Anyone in Videos | Xiang Wang et.al. | 2409.19911 | null |
2024-09-30 | The only admissible way of merging e-values | Ruodu Wang et.al. | 2409.19888 | null |
2024-09-30 | Partial Stochastic Dominance via Optimal Transport | Takashi Kamihigashi et.al. | 2409.19876 | null |
2024-09-30 | GameLabel-10K: Collecting Image Preference Data Through Mobile Game Crowdsourcing | Jonathan Zhou et.al. | 2409.19830 | null |
2024-09-27 | $O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions | Gen Li et.al. | 2409.18959 | null |
2024-09-27 | ReviveDiff: A Universal Diffusion Model for Restoring Images in Adverse Weather Conditions | Wenfeng Huang et.al. | 2409.18932 | null |
2024-09-27 | Unsupervised Low-light Image Enhancement with Lookup Tables and Diffusion Priors | Yunlong Lin et.al. | 2409.18899 | null |
2024-09-27 | Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis | Songrui Wang et.al. | 2409.18897 | null |
2024-09-27 | Explainable Artifacts for Synthetic Western Blot Source Attribution | João Phillipe Cardenuto et.al. | 2409.18881 | link |
2024-09-27 | Emu3: Next-Token Prediction is All You Need | Xinlong Wang et.al. | 2409.18869 | null |
2024-09-27 | Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions | Iskander Azangulov et.al. | 2409.18804 | null |
2024-09-27 | Unsupervised Fingerphoto Presentation Attack Detection With Diffusion Models | Hailin Li et.al. | 2409.18636 | null |
2024-09-27 | Treating Brain-inspired Memories as Priors for Diffusion Model to Forecast Multivariate Time Series | Muyao Wang et.al. | 2409.18491 | null |
2024-09-27 | Gradient-free Decoder Inversion in Latent Diffusion Models | Seongmin Hong et.al. | 2409.18442 | null |
2024-09-27 | GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation | Jiawei Lu et.al. | 2409.18401 | null |
2024-09-27 | Multi-hypotheses Conditioned Point Cloud Diffusion for 3D Human Reconstruction from Occluded Images | Donghwan Kim et.al. | 2409.18364 | link |
2024-09-27 | Generative AI for fast and accurate Statistical Computation of Fluids | Roberto Molinaro et.al. | 2409.18359 | null |
2024-09-26 | Harnessing Wavelet Transformations for Generalizable Deepfake Forgery Detection | Lalith Bharadwaj Baru et.al. | 2409.18301 | link |
2024-09-26 | Synthesizing beta-amyloid PET images from T1-weighted Structural MRI: A Preliminary Study | Qing Lyu et.al. | 2409.18282 | null |
2024-09-26 | FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner | Wenliang Zhao et.al. | 2409.18128 | link |
2024-09-26 | Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction | Jing He et.al. | 2409.18124 | null |
2024-09-26 | EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation | Jiaxiang Tang et.al. | 2409.18114 | null |
2024-09-26 | Nonnegative cross-curvature in infinite dimensions: synthetic definition and spaces of measures | Flavien Léger et.al. | 2409.18112 | null |
2024-09-26 | StackGen: Generating Stable Structures from Silhouettes via Diffusion | Luzhe Sun et.al. | 2409.18098 | null |
2024-09-26 | DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models | Helin Cao et.al. | 2409.18092 | null |
2024-09-26 | Stable Video Portraits | Mirela Ostrek et.al. | 2409.18083 | null |
2024-09-26 | PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging | Xin Cai et.al. | 2409.17996 | null |
2024-09-26 | Joint Localization and Planning using Diffusion | L. Lao Beyer et.al. | 2409.17995 | null |
2024-09-26 | CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors | Linye Lyu et.al. | 2409.17963 | null |
2024-09-26 | Relativistic diffusion model for hadron production in p-Pb collisions at the LHC | Philipp Schulz et.al. | 2409.17960 | null |
2024-09-26 | Pioneering Reliable Assessment in Text-to-Image Knowledge Editing: Leveraging a Fine-Grained Dataset and an Innovative Criterion | Hengrui Gu et.al. | 2409.17928 | link |
2024-09-26 | Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation | Qihan Huang et.al. | 2409.17920 | link |
2024-09-26 | Physics-aligned Schrödinger bridge | Zeyu Li et.al. | 2409.17825 | null |
2024-09-26 | Continual learning with task specialist | Indu Solomon et.al. | 2409.17806 | null |
2024-09-25 | DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion | Yukun Huang et.al. | 2409.17145 | link |
2024-09-25 | Strong solutions to degenerate SDEs and uniqueness for degenerate Fokker-Planck equations | Sebastian Grube et.al. | 2409.17135 | null |
2024-09-25 | Language-oriented Semantic Communication for Image Transmission with Fine-Tuned Diffusion Model | Xinfeng Wei et.al. | 2409.17104 | null |
2024-09-25 | Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors | Aiping Zhang et.al. | 2409.17058 | link |
2024-09-25 | ControlCity: A Multimodal Diffusion Model Based Approach for Accurate Geospatial Data Generation and Urban Morphology Analysis | Fangshuo Zhou et.al. | 2409.17049 | link |
2024-09-25 | Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion | Vineet Punyamoorty et.al. | 2409.16950 | null |
2024-09-25 | DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling | Kyuheon Jung et.al. | 2409.16949 | link |
2024-09-25 | Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model | Hongliang Zhong et.al. | 2409.16938 | link |
2024-09-25 | Weak Closed-loop Solvability of Linear Quadratic Stochastic Optimal Control Problems with Partial Information | Xun Li et.al. | 2409.16924 | null |
2024-09-25 | Automating Traffic Model Enhancement with AI Research Agent | Xusen Guo et.al. | 2409.16876 | null |
2024-09-25 | A Versatile and Differentiable Hand-Object Interaction Representation | Théo Morales et.al. | 2409.16855 | null |
2024-09-25 | Analytical assessment of workers’ safety concerning direct and indirect ways of getting infected by dangerous pathogen | Krzysztof Domino et.al. | 2409.16809 | null |
2024-09-25 | Layout-Corrector: Alleviating Layout Sticking Phenomenon in Discrete Diffusion Model | Shoma Iwai et.al. | 2409.16689 | null |
2024-09-25 | CasFT: Future Trend Modeling for Information Popularity Prediction with Dynamic Cues-Driven Diffusion Models | Xin Jing et.al. | 2409.16619 | null |
2024-09-25 | BSDEs driven by G-Brownian motion with time-varying uniformly continuous generators | Bingru Zhao et.al. | 2409.16574 | null |
2024-09-18 | Massively Multi-Person 3D Human Motion Forecasting with Scene Context | Felix B Mueller et.al. | 2409.12189 | link |
2024-09-18 | MoRAG – Multi-Fusion Retrieval Augmented Generation for Human Motion | Kalakonda Sai Shashank et.al. | 2409.12140 | null |
2024-09-18 | Cyclicity Analysis of the Ornstein-Uhlenbeck Process | Vivek Kaushik et.al. | 2409.12102 | null |
2024-09-18 | Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance | Jaehoon Joo et.al. | 2409.12099 | null |
2024-09-18 | Denoising diffusion models for high-resolution microscopy image restoration | Pamela Osuna-Vargas et.al. | 2409.12078 | null |
2024-09-18 | SFDA-rPPG: Source-Free Domain Adaptive Remote Physiological Measurement with Spatio-Temporal Consistency | Yiping Xie et.al. | 2409.12040 | null |
2024-09-18 | LEMON: Localized Editing with Mesh Optimization and Neural Shaders | Furkan Mert Algan et.al. | 2409.12024 | null |
2024-09-18 | Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models | Lorenzo Mandelli et.al. | 2409.11920 | null |
2024-09-18 | DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech | Xin Qi et.al. | 2409.11835 | null |
2024-09-18 | RaggeDi: Diffusion-based State Estimation of Disordered Rags, Sheets, Towels and Blankets | Jikai Ye et.al. | 2409.11831 | null |
2024-09-18 | InverseMeetInsert: Robust Real Image Editing via Geometric Accumulation Inversion in Guided Diffusion Models | Yan Zheng et.al. | 2409.11734 | null |
2024-09-18 | GUNet: A Graph Convolutional Network United Diffusion Model for Stable and Diversity Pose Generation | Shuowen Liang et.al. | 2409.11689 | link |
2024-09-18 | Recurrent Interpolants for Probabilistic Time Series Prediction | Yu Chen et.al. | 2409.11684 | null |
2024-09-18 | SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation | Mingze Sun et.al. | 2409.11682 | null |
2024-09-18 | Electromagnetic Property Sensing and Channel Reconstruction Based on Diffusion Schrödinger Bridge in ISAC | Yuhua Jiang et.al. | 2409.11651 | null |
2024-09-17 | Ultrasound Image Enhancement with the Variance of Diffusion Models | Yuxin Zhang et.al. | 2409.11380 | link |
2024-09-17 | OSV: One Step is Enough for High-Quality Image to Video Generation | Xiaofeng Mao et.al. | 2409.11367 | null |
2024-09-17 | Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think | Gonzalo Martin Garcia et.al. | 2409.11355 | link |
2024-09-17 | OmniGen: Unified Image Generation | Shitao Xiao et.al. | 2409.11340 | link |
2024-09-17 | Parameter dependent rough SDEs with applications to rough PDEs | Fabio Bugini et.al. | 2409.11330 | null |
2024-09-17 | fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction | Jianxiong Gao et.al. | 2409.11315 | null |
2024-09-17 | DroneDiffusion: Robust Quadrotor Dynamics Learning with Diffusion Models | Avirup Das et.al. | 2409.11292 | null |
2024-09-17 | Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models | Tianqi Chen et.al. | 2409.11219 | null |
2024-09-17 | High-Resolution Speech Restoration with Latent Diffusion Model | Tushar Dhyani et.al. | 2409.11145 | null |
2024-09-17 | In-situ measurements of light diffusion in an optically dense atomic ensemble | Antoine Glicenstein et.al. | 2409.11117 | null |
2024-09-17 | TacDiffusion: Force-domain Diffusion Policy for Precise Tactile Manipulation | Yansong Wu et.al. | 2409.11047 | null |
2024-09-17 | Enhanced segmentation of femoral bone metastasis in CT scans of patients using synthetic data generation with 3D diffusion models | Emile Saillard et.al. | 2409.11011 | null |
2024-09-17 | Local discontinuous Galerkin method for nonlinear BSPDEs of Neumann boundary conditions with deep backward dynamic programming time-marching | Yixiang Dai et.al. | 2409.11004 | null |
2024-09-17 | Edge-based Denoising Image Compression | Ryugo Morita et.al. | 2409.10978 | null |
2024-09-17 | CUNSB-RFIE: Context-aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement | Xuanzhao Dong et.al. | 2409.10966 | link |
2024-09-16 | Incorporating Classifier-Free Guidance in Diffusion Model-Based Recommendation | Noah Buchanan et.al. | 2409.10494 | null |
2024-09-16 | SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing | Qi Qian et.al. | 2409.10476 | null |
2024-09-16 | MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion | Lehong Wu et.al. | 2409.10473 | null |
2024-09-16 | Mamba-ST: State Space Model for Efficient Style Transfer | Filippo Botti et.al. | 2409.10385 | link |
2024-09-16 | Stochastic Control of UAVs: An Optimal Tradeoff between Performance, Flight Smoothness and Control Effort | George Rapakoulias et.al. | 2409.10369 | null |
2024-09-16 | Taming Diffusion Models for Image Restoration: A Review | Ziwei Luo et.al. | 2409.10353 | null |
2024-09-16 | Fairness, not Emotion, Drives Socioeconomic Decision Making | Rudra Mukhopadhyay et.al. | 2409.10322 | null |
2024-09-16 | DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis | Fa-Ting Hong et.al. | 2409.10281 | null |
2024-09-16 | RealDiff: Real-world 3D Shape Completion using Self-Supervised Diffusion Models | Başak Melis Öcal et.al. | 2409.10180 | null |
2024-09-16 | PSHuman: Photorealistic Single-view Human Reconstruction using Cross-Scale Diffusion | Peng Li et.al. | 2409.10141 | null |
2024-09-16 | Approximating the signature of Brownian motion for high order SDE simulation | James Foster et.al. | 2409.10118 | link |
2024-09-16 | DDoS: Diffusion Distribution Similarity for Out-of-Distribution Detection | Kun Fang et.al. | 2409.10094 | null |
2024-09-16 | MotionCom: Automatic and Motion-Aware Image Composition with LLM and Video Diffusion Prior | Weijing Tao et.al. | 2409.10090 | link |
2024-09-16 | Cross-modality image synthesis from TOF-MRA to CTA using diffusion-based models | Alexander Koch et.al. | 2409.10089 | null |
2024-09-16 | A Riemannian Approach to Ground Metric Learning for Optimal Transport | Pratik Jawanpuria et.al. | 2409.10085 | null |
2024-09-13 | Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation | Qingwen Bu et.al. | 2409.09016 | link |
2024-09-13 | A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis | Yohan Poirier-Ginter et.al. | 2409.08947 | null |
2024-09-13 | Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation | Guojun Liang et.al. | 2409.08917 | link |
2024-09-13 | Gaussian is All You Need: A Unified Framework for Solving Inverse Problems via Diffusion Posterior Sampling | Nebiyou Yismaw et.al. | 2409.08906 | null |
2024-09-13 | Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control | Carles Domingo-Enrich et.al. | 2409.08861 | null |
2024-09-13 | InstantDrag: Improving Interactivity in Drag-based Image Editing | Joonghyuk Shin et.al. | 2409.08857 | null |
2024-09-13 | DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s) | Yun Su Jeong et.al. | 2409.08850 | null |
2024-09-13 | Measure-Theoretic Time-Delay Embedding | Jonah Botvinick-Greenhouse et.al. | 2409.08768 | link |
2024-09-13 | DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset | Jiawei Du et.al. | 2409.08731 | link |
2024-09-13 | Asymptotics for Random Quadratic Transportation Costs | Martin Huesmann et.al. | 2409.08612 | null |
2024-09-13 | Finite-time thermodynamic bounds and tradeoff relations for information processing | Takuya Kamijima et.al. | 2409.08606 | null |
2024-09-13 | STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment | Yong Ren et.al. | 2409.08601 | null |
2024-09-13 | LHQ-SVC: Lightweight and High Quality Singing Voice Conversion Modeling | Yubo Huang et.al. | 2409.08583 | null |
2024-09-13 | DiffFAS: Face Anti-Spoofing via Generative Diffusion Models | Xinxu Ge et.al. | 2409.08572 | link |
2024-09-13 | Think Twice Before You Act: Improving Inverse Problem Solving With MCMC | Yaxuan Zhu et.al. | 2409.08551 | null |
2024-09-12 | DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors | Thomas Hanwen Zhu et.al. | 2409.08278 | null |
2024-09-12 | DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer | Runjia Li et.al. | 2409.08271 | null |
2024-09-12 | Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation | Samanta Rodriguez et.al. | 2409.08269 | null |
2024-09-12 | Improving Text-guided Object Inpainting with Semantic Pre-inpainting | Yifu Chen et.al. | 2409.08260 | link |
2024-09-12 | Improving Virtual Try-On with Garment-focused Diffusion Models | Siqi Wan et.al. | 2409.08258 | null |
2024-09-12 | LoRID: Low-Rank Iterative Diffusion for Adversarial Purification | Geigh Zollicoffer et.al. | 2409.08255 | null |
2024-09-12 | Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding | Hongyu Li et.al. | 2409.08251 | null |
2024-09-12 | IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation | Yinwei Wu et.al. | 2409.08240 | null |
2024-09-12 | How can the tragedy of the commons be prevented?: Introducing Linear Quadratic Mixed Mean Field Games | Gokce Dayanikli et.al. | 2409.08235 | null |
2024-09-12 | LT3SD: Latent Trees for 3D Scene Diffusion | Quan Meng et.al. | 2409.08215 | null |
2024-09-12 | VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis | Hao Chen et.al. | 2409.08207 | null |
2024-09-12 | MagicStyle: Portrait Stylization Based on Reference Image | Zhaoli Deng et.al. | 2409.08156 | null |
2024-09-12 | EZIGen: Enhancing zero-shot subject-driven image generation with precise subject encoding and decoupled guidance | Zicheng Duan et.al. | 2409.08091 | link |
2024-09-12 | Diffusion-Based Image-to-Image Translation by Noise Correction via Prompt Interpolation | Junsung Lee et.al. | 2409.08077 | null |
2024-09-12 | AI-accelerated discovery of high critical temperature superconductors | Xiao-Qi Han et.al. | 2409.08065 | null |
2024-09-11 | DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation | Haibo Yang et.al. | 2409.07454 | null |
2024-09-11 | Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models | Haibo Yang et.al. | 2409.07452 | link |
2024-09-11 | FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process | Yang Luo et.al. | 2409.07451 | null |
2024-09-11 | Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging | Yunzhen Wang et.al. | 2409.07417 | null |
2024-09-11 | Training-Free Guidance for Discrete Diffusion Models for Molecular Generation | Thomas J. Kerby et.al. | 2409.07359 | null |
2024-09-11 | Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching | Eugenio Chisari et.al. | 2409.07343 | null |
2024-09-11 | Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models | Fengzhe Zhang et.al. | 2409.07323 | null |
2024-09-11 | Exploring User-level Gradient Inversion with a Diffusion Prior | Zhuohang Li et.al. | 2409.07291 | null |
2024-09-11 | CCFExp: Facial Image Synthesis with Cycle Cross-Fusion Diffusion Model for Facial Paralysis Individuals | Weixiang Gao et.al. | 2409.07271 | link |
2024-09-11 | Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models | Sanoojan Baliah et.al. | 2409.07269 | link |
2024-09-11 | EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion | Jian Zhang et.al. | 2409.07255 | null |
2024-09-12 | Alignment of Diffusion Models: Fundamentals, Challenges, and Future | Buhua Liu et.al. | 2409.07253 | link |
2024-09-11 | Diff-VPS: Video Polyp Segmentation via a Multi-task Diffusion Network with Adversarial Temporal Reasoning | Yingling Lu et.al. | 2409.07238 | link |
2024-09-11 | Phy124: Fast Physics-Driven 4D Content Generation from a Single Image | Jiajing Lin et.al. | 2409.07179 | null |
2024-09-11 | Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models | Jiahang Cao et.al. | 2409.07163 | null |
2024-09-10 | SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation | Teng Hu et.al. | 2409.06633 | null |
2024-09-10 | One-Shot Imitation under Mismatched Execution | Kushal Kedia et.al. | 2409.06615 | null |
2024-09-10 | Modelling Global Trade with Optimal Transport | Thomas Gaskin et.al. | 2409.06554 | link |
2024-09-10 | Robust financial calibration: a Bayesian approach for neural SDEs | Christa Cuchiero et.al. | 2409.06551 | link |
2024-09-10 | Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models | Xin Jing et.al. | 2409.06451 | null |
2024-09-10 | Robust semi-parametric signal detection in particle physics with classifiers decorrelated via optimal transport | Purvasha Chakravarti et.al. | 2409.06399 | null |
2024-09-10 | Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition | Junzheng Zhang et.al. | 2409.06371 | null |
2024-09-10 | What happens to diffusion model likelihood when your model is conditional? | Mattias Cross et.al. | 2409.06364 | null |
2024-09-10 | DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement | Jia-Wei Liao et.al. | 2409.06355 | null |
2024-09-10 | Geometry of the Space of Partitioned Networks: A Unified Theoretical and Computational Framework | Stephen Y Zhang et.al. | 2409.06302 | link |
2024-09-10 | Multi-Source Music Generation with Latent Diffusion | Zhongweiyang Xu et.al. | 2409.06190 | link |
2024-09-10 | MyGo: Consistent and Controllable Multi-View Driving Video Generation with Camera Control | Yining Yao et.al. | 2409.06189 | null |
2024-09-10 | EDADepth: Enhanced Data Augmentation for Monocular Depth Estimation | Nischal Khanal et.al. | 2409.06183 | link |
2024-09-09 | Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer | Michele Mancusi et.al. | 2409.06096 | null |
2024-09-09 | SVS-GAN: Leveraging GANs for Semantic Video Synthesis | Khaled M. Seyam et.al. | 2409.06074 | null |
2024-09-09 | Enhancing Preference-based Linear Bandits via Human Response Time | Shen Li et.al. | 2409.05798 | null |
2024-09-09 | Vector Quantized Diffusion Model Based Speech Bandwidth Extension | Yuan Fang et.al. | 2409.05784 | null |
2024-09-09 | AS-Speech: Adaptive Style For Speech Synthesis | Zhipeng Li et.al. | 2409.05730 | null |
2024-09-09 | Distributionally Robust Stochastic Data-Driven Predictive Control with Optimized Feedback Gain | Ruiqi Li et.al. | 2409.05727 | null |
2024-09-09 | Quantitative approximation of stochastic kinetic equations: from discrete to continuum | Zimo Hao et.al. | 2409.05706 | null |
2024-09-09 | pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning | Jiahao Lai et.al. | 2409.05701 | null |
2024-09-09 | Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models | Aakash Sen Sharma et.al. | 2409.05668 | null |
2024-09-09 | Forward KL Regularized Preference Optimization for Aligning Diffusion Policies | Zhao Shan et.al. | 2409.05622 | null |
2024-09-09 | CipherDM: Secure Three-Party Inference for Diffusion Model Sampling | Xin Zhao et.al. | 2409.05414 | null |
2024-09-09 | Sequential Posterior Sampling with Diffusion Models | Tristan S. W. Stevens et.al. | 2409.05399 | null |
2024-09-09 | TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors | Yichuan Mo et.al. | 2409.05294 | link |
2024-09-08 | The Stochastic Gause predator-prey model: noise-induced extinctions and invariance | Leon Alexander Valencia et.al. | 2409.05237 | null |
2024-09-08 | Nuclear transparencies with a two step process of the $A(e,e'π^+)$ reactions | Tae Keun Choi et.al. | 2409.05129 | null |
2024-09-08 | Diffusion-based Speech Enhancement with Schrödinger Bridge and Symmetric Noise Schedule | Siyi Wang et.al. | 2409.05116 | null |
2024-09-08 | A Survey on Diffusion Models for Recommender Systems | Jianghao Lin et.al. | 2409.05033 | link |
2024-09-06 | VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation | Yecheng Wu et.al. | 2409.04429 | link |
2024-09-06 | Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques | Davide Clode da Silva et.al. | 2409.04424 | null |
2024-09-06 | How Fair is Your Diffusion Recommender Model? | Daniele Malitesta et.al. | 2409.04339 | null |
2024-09-06 | Random effects estimation in a fractional diffusion model based on continuous observations | Nesrine Chebli et.al. | 2409.04331 | null |
2024-09-06 | Probabilistic Representation for Viscosity Solutions to Double-Obstacle Quasi-Variational Inequalities | Magnus Perninge et.al. | 2409.04207 | null |
2024-09-06 | Breaking the Brownian Barrier: Models and Manifestations of Molecular Diffusion in Complex Fluids | Harish Srinivasan et.al. | 2409.04199 | null |
2024-09-06 | GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers | Lorenza Prospero et.al. | 2409.04196 | null |
2024-09-06 | D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection | Kentaro Hirahara et.al. | 2409.04060 | null |
2024-09-06 | A policy iteration algorithm for non-Markovian control problems | Dylan Possamaï et.al. | 2409.04037 | null |
2024-09-06 | One-Shot Diffusion Mimicker for Handwritten Text Generation | Gang Dai et.al. | 2409.04004 | link |
2024-09-06 | DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes | Jianbiao Mei et.al. | 2409.04003 | link |
2024-09-05 | Data-Efficient Generation for Dataset Distillation | Zhe Li et.al. | 2409.03929 | null |
2024-09-05 | Generating High Dimensional User-Specific Wireless Channels using Diffusion Models | Taekyun Lee et.al. | 2409.03924 | null |
2024-09-05 | Neural Entropy | Akhil Premkumar et.al. | 2409.03817 | null |
2024-09-05 | Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding | Yunze Man et.al. | 2409.03757 | link |
2024-09-05 | ArtiFade: Learning to Generate High-quality Subject from Blemished Images | Shuya Yang et.al. | 2409.03745 | null |
2024-09-05 | Quantum optimal transport with convex regularization | Emanuele Caputo et.al. | 2409.03698 | null |
2024-09-05 | RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images | Benzhi Wang et.al. | 2409.03644 | link |
2024-09-05 | DiffEVC: Any-to-Any Emotion Voice Conversion with Expressive Guidance | Hsing-Hang Chou et.al. | 2409.03636 | null |
2024-09-05 | TCDiff: Triple Condition Diffusion Model with 3D Constraints for Stylizing Synthetic Faces | Bernardo Biesseck et.al. | 2409.03600 | link |
2024-09-05 | DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture | Qianlong Xiang et.al. | 2409.03550 | null |
2024-09-05 | On the mean field limit of consensus based methods | Marvin Koß et.al. | 2409.03518 | null |
2024-09-05 | Blended Latent Diffusion under Attention Control for Real-World Video Editing | Deyin Liu et.al. | 2409.03514 | null |
2024-09-05 | Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration | Pei Wang et.al. | 2409.03455 | null |
2024-09-05 | Recursive Quantization for $\mathcal{L}_2$ Stabilization of a Finite Capacity Stochastic Control Loop with Intermittent State Observations | Shrija Karmakar et.al. | 2409.03398 | null |
2024-09-05 | Enhancing User-Centric Privacy Protection: An Interactive Framework through Diffusion Models and Machine Unlearning | Huaxi Huang et.al. | 2409.03326 | null |
2024-09-05 | SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model | Weipeng Tan et.al. | 2409.03270 | null |
2024-09-05 | RoomDiffusion: A Specialized Diffusion Model in the Interior Design Industry | Zhaowei Wang et.al. | 2409.03198 | null |
2024-09-04 | Spatial Diffusion for Cell Layout Generation | Chen Li et.al. | 2409.03106 | link |
2024-09-04 | HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts | Xinyu Liu et.al. | 2409.02919 | link |
2024-09-04 | Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling | Kaiwen Zheng et.al. | 2409.02908 | null |
2024-09-04 | Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models | Zhibin Liu et.al. | 2409.02851 | link |
2024-09-04 | Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model | Tornike Karchkhadze et.al. | 2409.02845 | null |
2024-09-04 | Skip-and-Play: Depth-Driven Pose-Preserved Image Generation for Any Objects | Kyungmin Jo et.al. | 2409.02653 | null |
2024-09-04 | MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos | Junyi Ma et.al. | 2409.02638 | null |
2024-09-04 | Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency | Jianwen Jiang et.al. | 2409.02634 | null |
2024-09-04 | Rate-Adaptive Generative Semantic Communication Using Conditional Diffusion Models | Pujing Yang et.al. | 2409.02597 | null |
2024-09-04 | Solving Video Inverse Problems Using Image Diffusion Models | Taesung Kwon et.al. | 2409.02574 | null |
2024-09-04 | StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models | Wen Li et.al. | 2409.02543 | link |
2024-09-04 | Sample what you cant compress | Vighnesh Birodkar et.al. | 2409.02529 | null |
2024-09-04 | Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal | Jifeng Hu et.al. | 2409.02512 | link |
2024-09-04 | Demographic parity in regression and classification within the unawareness framework | Vincent Divol et.al. | 2409.02471 | null |
2024-09-04 | Training-free Color-Style Disentanglement for Constrained Text-to-Image Synthesis | Aishwarya Agarwal et.al. | 2409.02429 | null |
2024-09-04 | Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering | Peng Wang et.al. | 2409.02426 | link |
2024-08-30 | Subspace Diffusion Posterior Sampling for Travel-Time Tomography | Xiang Cao et.al. | 2408.17333 | null |
2024-08-30 | Likelihood estimation for stochastic differential equations with mixed effects | Fernando Baltazar-Larios et.al. | 2408.17257 | null |
2024-08-30 | The random periodic solutions for McKean-Vlasov stochastic differential equations | Jianhai Bao et.al. | 2408.17242 | null |
2024-08-30 | A methodological framework for Resilience as a Service (RaaS) in multimodal urban transportation networks | Sara Jaber et.al. | 2408.17233 | null |
2024-09-02 | RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance | Avideep Mukherjee et.al. | 2408.17095 | null |
2024-09-02 | Instant Adversarial Purification with Adversarial Consistency Distillation | Chun Tong Lei et.al. | 2408.17064 | null |
2024-08-30 | Text-to-Image Generation Via Energy-Based CLIP | Roy Ganz et.al. | 2408.17046 | null |
2024-08-30 | High-fidelity holographic beam shaping with optimal transport and phase diversity | Hunter Swan et.al. | 2408.17025 | null |
2024-08-30 | Contrastive Learning with Synthetic Positives | Dewen Zeng et.al. | 2408.16965 | link |
2024-09-02 | Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis | Theodoros Kouzelis et.al. | 2408.16845 | null |
2024-08-29 | ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model | Fangfu Liu et.al. | 2408.16767 | null |
2024-09-04 | CSGO: Content-Style Composition in Text-to-Image Generation | Peng Xing et.al. | 2408.16766 | null |
2024-08-29 | DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving | Yongjie Fu et.al. | 2408.16647 | null |
2024-09-02 | RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model | Zhuan Shi et.al. | 2408.16634 | null |
2024-08-29 | A Score-based Generative Solver for PDE-constrained Inverse Problems with Complex Priors | Yankun Hong et.al. | 2408.16626 | null |
Dataset Distillation
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-11-19 | KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder | Maheswar Bora et.al. | 2411.12270 | null |
2024-11-19 | Just KIDDIN: Knowledge Infusion and Distillation for Detection of INdecent Memes | Rahul Garg et.al. | 2411.12174 | null |
2024-11-18 | Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning | Brian B. Moser et.al. | 2411.12115 | link |
2024-11-18 | Dataset Distillers Are Good Label Denoisers In the Wild | Lechao Cheng et.al. | 2411.11924 | null |
2024-11-18 | Federated Incremental Named Entity Recognition | Duzhen Zhang et.al. | 2411.11623 | null |
2024-11-18 | Color-Oriented Redundancy Reduction in Dataset Distillation | Bowen Yuan et.al. | 2411.11329 | link |
2024-11-17 | Map-Free Trajectory Prediction with Map Distillation and Hierarchical Encoding | Xiaodong Liu et.al. | 2411.10961 | null |
2024-11-16 | Hybrid Attention Model Using Feature Decomposition and Knowledge Distillation for Glucose Forecasting | Ebrahim Farahmand et.al. | 2411.10703 | null |
2024-11-16 | Multi-perspective Contrastive Logit Distillation | Qi Wang et.al. | 2411.10693 | null |
2024-11-16 | Exploring Feature-based Knowledge Distillation For Recommender System: A Frequency Perspective | Zhangchi Zhu et.al. | 2411.10676 | null |
2024-11-15 | Evidential Federated Learning for Skin Lesion Image Classification | Rutger Hendrix et.al. | 2411.10071 | null |
2024-11-14 | VPBSD: Vessel-Pattern-Based Semi-Supervised Distillation for Efficient 3D Microscopic Cerebrovascular Segmentation | Xi Lin et.al. | 2411.09567 | null |
2024-11-14 | BEARD: Benchmarking the Adversarial Robustness for Dataset Distillation | Zheng Zhou et.al. | 2411.09265 | link |
2024-11-14 | Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching | Yuran Wang et.al. | 2411.09151 | null |
2024-11-14 | Toward Democratized Generative AI in Next-Generation Mobile Edge Networks | Ruichen Zhang et.al. | 2411.09148 | null |
2024-11-14 | SCAN: Bootstrapping Contrastive Pre-training for Data Efficiency | Yangyang Guo et.al. | 2411.09126 | link |
2024-11-13 | Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head | Penghui Yang et.al. | 2411.08937 | null |
2024-11-13 | UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation | Chengyuan Zhang et.al. | 2411.08569 | null |
2024-11-13 | Federated Graph Learning with Graphless Clients | Xingbo Fu et.al. | 2411.08374 | null |
2024-11-12 | Joint Diffusion models in Continual Learning | Paweł Skierś et.al. | 2411.08224 | null |
2024-11-12 | Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data | Juanhui Li et.al. | 2411.08028 | null |
2024-11-13 | Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models | Youan Cong et.al. | 2411.07820 | null |
2024-11-12 | Robust Offline Reinforcement Learning for Non-Markovian Decision Processes | Ruiquan Huang et.al. | 2411.07514 | null |
2024-11-13 | Feature Interaction Fusion Self-Distillation Network For CTR Prediction | Lei Sang et.al. | 2411.07508 | null |
2024-11-12 | Quantifying Knowledge Distillation Using Partial Information Decomposition | Pasan Dissanayake et.al. | 2411.07483 | null |
2024-11-08 | Multi-Document Financial Question Answering using LLMs | Shalin Shah et.al. | 2411.07264 | null |
2024-11-11 | SAMPart3D: Segment Any Part in 3D Objects | Yunhan Yang et.al. | 2411.07184 | link |
2024-11-11 | LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models | Runming Yang et.al. | 2411.06839 | null |
2024-11-11 | ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | Jiawei Fan et.al. | 2411.06786 | link |
2024-11-11 | An Efficient Memory Module for Graph Few-Shot Class-Incremental Learning | Dong Li et.al. | 2411.06659 | link |
2024-11-10 | CULL-MT: Compression Using Language and Layer pruning for Machine Translation | Pedram Rostami et.al. | 2411.06506 | null |
2024-11-10 | Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation | Yu-Liang Zhan et.al. | 2411.06448 | link |
2024-11-09 | Dynamic Textual Prompt For Rehearsal-free Lifelong Person Re-identification | Hongyu Chen et.al. | 2411.06023 | null |
2024-11-09 | Multi-hop RIS-aided Learning Model Sharing for Urban Air Mobility | Kai Xiong et.al. | 2411.06015 | null |
2024-11-08 | Mitigating Hallucination with ZeroG: An Advanced Knowledge Management Engine | Anantha Sharma et.al. | 2411.05936 | null |
2024-11-08 | Asterisk*: Keep it Simple | Andrew Semenov et.al. | 2411.05691 | null |
2024-11-08 | Knowledge Distillation Neural Network for Predicting Car-following Behaviour of Human-driven and Autonomous Vehicles | Ayobami Adewale et.al. | 2411.05618 | null |
2024-11-08 | Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | Nan Song et.al. | 2411.05544 | null |
2024-11-07 | Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale | Flavio Di Palo et.al. | 2411.05045 | null |
2024-11-07 | Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers | Zhichao Geng et.al. | 2411.04403 | null |
2024-11-07 | GazeGen: Gaze-Driven User Interaction for Visual Content Generation | He-Yen Hsieh et.al. | 2411.04335 | null |
2024-11-06 | Towards Personalized Federated Learning via Comprehensive Knowledge Distillation | Pengju Wang et.al. | 2411.03569 | null |
2024-11-05 | Transformer-Based Fault-Tolerant Control for Fixed-Wing UAVs Using Knowledge Distillation and In-Context Adaptation | Francisco Giral et.al. | 2411.02975 | null |
2024-11-05 | Centerness-based Instance-aware Knowledge Distillation with Task-wise Mutual Lifting for Object Detection on Drone Imagery | Bowei Du et.al. | 2411.02861 | null |
2024-11-05 | Brewing Vodka: Distilling Pure Knowledge for Lightweight Threat Detection in Audit Logs | Weiheng Wu et.al. | 2411.02775 | null |
2024-11-05 | Multimodal Commonsense Knowledge Distillation for Visual Question Answering | Shuo Yang et.al. | 2411.02722 | null |
2024-11-04 | Training on the Test Model: Contamination in Ranking Distillation | Vishakha Suresh Kalal et.al. | 2411.02284 | link |
2024-11-03 | Decoupling Dark Knowledge via Block-wise Logit Distillation for Feature-level Alignment | Chengting Yu et.al. | 2411.01547 | null |
2024-11-01 | On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance | Jaskirat Singh et.al. | 2411.00907 | null |
2024-10-30 | The Graph’s Apprentice: Teaching an LLM Low Level Knowledge for Circuit Quality Estimation | Reza Moravej et.al. | 2411.00843 | null |
2024-10-29 | Unsupervised Training of a Dynamic Context-Aware Deep Denoising Framework for Low-Dose Fluoroscopic Imaging | Sun-Young Jeon et.al. | 2411.00830 | link |
2024-11-01 | Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation | Bohan Lyu et.al. | 2411.00412 | null |
2024-11-01 | Towards Building Secure UAV Navigation with FHE-aware Knowledge Distillation | Arjun Ramesh Kaushik et.al. | 2411.00403 | null |
2024-10-31 | Semantic Knowledge Distillation for Onboard Satellite Earth Observation Image Classification | Thanh-Dung Le et.al. | 2411.00209 | link |
2024-10-30 | Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation | Ahmed Akib Jawad Karim et.al. | 2411.00052 | null |
2024-10-30 | IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking | Run Luo et.al. | 2410.23907 | null |
2024-10-28 | Unveiling Context-Aware Criteria in Self-Assessing LLMs | Taneesh Gupta et.al. | 2410.21545 | null |
2024-10-28 | Knowledge Distillation for Real-Time Classification of Early Media in Voice Communications | Kemal Altwlkany et.al. | 2410.21478 | null |
2024-10-28 | Less is More: Efficient Time Series Dataset Condensation via Two-fold Modal Matching–Extended Version | Hao Miao et.al. | 2410.20905 | null |
2024-10-28 | Deep Learning for Medical Text Processing: BERT Model Fine-Tuning and Comparative Study | Jiacheng Hu et.al. | 2410.20792 | null |
2024-10-28 | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Rambod Azimi et.al. | 2410.20777 | link |
2024-10-28 | Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive Pruning | Bing Han et.al. | 2410.20775 | null |
2024-10-28 | Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA | Sangmin Bae et.al. | 2410.20672 | null |
2024-10-28 | FLiP: Privacy-Preserving Federated Learning based on the Principle of Least Privilege | ShiMao Xu et.al. | 2410.19548 | null |
2024-10-25 | SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models | Jahyun Koo et.al. | 2410.19503 | null |
2024-10-24 | AlignCap: Aligning Speech Emotion Captioning to Human Preferences | Ziqi Liang et.al. | 2410.19134 | null |
2024-10-24 | High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws | M. Emrullah Ildiz et.al. | 2410.18837 | null |
2024-10-24 | Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data | Anup Shirgaonkar et.al. | 2410.18588 | null |
2024-10-24 | SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning | Shivam Adarsh et.al. | 2410.18574 | link |
2024-10-23 | ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams | Srija Anand et.al. | 2410.17901 | null |
2024-10-23 | Towards Active Participant-Centric Vertical Federated Learning: Some Representations May Be All You Need | Jon Irureta et.al. | 2410.17648 | null |
2024-10-23 | Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation | Muquan Li et.al. | 2410.17606 | link |
2024-10-23 | Physics-driven AI for Channel Estimation in Cellular Network | Xiaoqian Qi et.al. | 2410.17525 | null |
2024-10-22 | MiniPLM: Knowledge Distillation for Pre-Training Language Models | Yuxian Gu et.al. | 2410.17215 | link |
2024-10-22 | Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios | Kai Wang et.al. | 2410.17193 | link |
2024-10-22 | CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare | Nicholas I-Hsien Kuo et.al. | 2410.16872 | null |
2024-10-22 | AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Yongjian Wu et.al. | 2410.16820 | link |
2024-10-22 | SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation | Jing-Jing Li et.al. | 2410.16665 | null |
Synthetic Data Generation
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-11-19 | Regular-pattern-sensitive CRFs for Distant Label Interactions | Sean Papay et.al. | 2411.12484 | null |
2024-11-19 | Empirical Privacy Evaluations of Generative and Predictive Machine Learning Models – A review and challenges for practice | Flavio Hafner et.al. | 2411.12451 | null |
2024-11-19 | Could Humans Outshine AI in Visual Data Analysis? | Ratanond Koonchanok et.al. | 2411.12299 | null |
2024-11-18 | SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input | Zhen Lv et.al. | 2411.11934 | null |
2024-11-18 | RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator | Xinhai Li et.al. | 2411.11839 | null |
2024-11-18 | Theoretical Foundations of Conformal Prediction | Anastasios N. Angelopoulos et.al. | 2411.11824 | null |
2024-11-18 | Parallelly Tempered Generative Adversarial Networks | Jinwon Sohn et.al. | 2411.11786 | null |
2024-11-18 | Open Catalyst Experiments 2024 (OCx24): Bridging Experiments and Computational Models | Jehad Abed et.al. | 2411.11783 | null |
2024-11-18 | Few-shot Model Extraction Attacks against Sequential Recommender Systems | Hui Zhang et.al. | 2411.11677 | null |
2024-11-18 | Real-Time Fitness Exercise Classification and Counting from Video Frames | Riccardo Riccio et.al. | 2411.11548 | link |
2024-11-18 | A Pre-Trained Graph-Based Model for Adaptive Sequencing of Educational Documents | Jean Vassoyan et.al. | 2411.11520 | link |
2024-11-19 | Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation | Rüveyda Yilmaz et.al. | 2411.11515 | null |
2024-11-18 | Lorentz: Learned SKU Recommendation Using Profile Data | Nicholas Glaze et.al. | 2411.11325 | null |
2024-11-18 | Subgroup analysis in multi level hierarchical cluster randomized trials | Shubhadeep Chakraborty et.al. | 2411.11301 | null |
2024-11-17 | MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild | Xi Fang et.al. | 2411.11098 | null |
2024-11-17 | SRA-MCTS: Self-driven Reasoning Augmentation with Monte Carlo Tree Search for Enhanced Code Generation | Bin Xu et.al. | 2411.11053 | null |
2024-11-17 | Towards a framework on tabular synthetic data generation: a minimalist approach: theory, use cases, and limitations | Agus Sudjianto et.al. | 2411.10982 | null |
2024-11-16 | Efficient, Low-Regret, Online Reinforcement Learning for Linear MDPs | Philips George John et.al. | 2411.10906 | null |
2024-11-16 | Watermarking Generative Categorical Data | Bochao Gu et.al. | 2411.10898 | null |
2024-11-15 | Dynamic Causal Effects in a Nonlinear World: the Good, the Bad, and the Ugly | Michal Kolesár et.al. | 2411.10415 | link |
2024-11-15 | How to Build a Quantum Supercomputer: Scaling Challenges and Opportunities | Masoud Mohseni et.al. | 2411.10406 | null |
2024-11-15 | Generation of synthetic gait data: application to multiple sclerosis patients’ gait patterns | Klervi Le Gall et.al. | 2411.10377 | null |
2024-11-15 | Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation | Tim Elsner et.al. | 2411.10281 | null |
2024-11-15 | Evaluating Text-to-Image Diffusion Models for Texturing Synthetic Data | Thomas Lips et.al. | 2411.10164 | null |
2024-11-15 | Mitigating Sycophancy in Decoder-Only Transformer Architectures: Synthetic Data Intervention | Libo Wang et.al. | 2411.10156 | link |
2024-11-15 | Adaptive Physics-Guided Neural Network | David Shulman et.al. | 2411.10064 | null |
2024-11-14 | Cross-Matched Interval Prevalence of High Dimensional Point Clouds | Jonathan M. Mousley et.al. | 2411.09797 | null |
2024-11-14 | Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models | Wei Wang et.al. | 2411.09691 | null |
2024-11-16 | SAFES: Sequential Privacy and Fairness Enhancing Data Synthesis for Responsible AI | Spencer Giddens et.al. | 2411.09178 | link |
2024-11-14 | Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching | Yuran Wang et.al. | 2411.09151 | null |
2024-11-13 | Drone Detection using Deep Neural Networks Trained on Pure Synthetic Data | Mariusz Wisniewski et.al. | 2411.09077 | link |
2024-11-13 | Evaluating cosmological simulations of galaxy formation with spectral variance in the optical window | Z. Sharbaf et.al. | 2411.08945 | null |
2024-11-13 | A probabilistic reduced-order modeling framework for patient-specific cardio-mechanical analysis | Robin Willems et.al. | 2411.08822 | null |
2024-11-13 | Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models | Chengdong Dong et.al. | 2411.08642 | null |
2024-11-13 | Generalized Pose Space Embeddings for Training In-the-Wild using Analysis-by-Synthesis | Dominik Borer et.al. | 2411.08603 | null |
2024-11-13 | Space-local memory in generalized master equations: Reaching the thermodynamic limit for the cost of a small lattice simulation | Srijan Bhattacharyya et.al. | 2411.08598 | null |
2024-11-13 | CorrSynth – A Correlated Sampling Method for Diverse Dataset Generation from LLMs | Suhas S Kowshik et.al. | 2411.08553 | null |
2024-11-13 | A dark energy parameterization independent constraint of the spatial curvature $Ω_K$ | Zhennan Li et.al. | 2411.08498 | null |
2024-11-13 | Generative AI for Data Augmentation in Wireless Networks: Analysis, Applications, and Case Study | Jinbo Wen et.al. | 2411.08341 | null |
2024-11-13 | DNN Task Assignment in UAV Networks: A Generative AI Enhanced Multi-Agent Reinforcement Learning Approach | Xin Tang et.al. | 2411.08299 | null |
2024-11-13 | Dynamic Thresholding Algorithm with Memory for Linear Inverse Problems | Zhong-Feng Sun et.al. | 2411.08284 | null |
2024-11-12 | SynapsNet: Enhancing Neuronal Population Dynamics Modeling via Learning Functional Connectivity | Parsa Delavari et.al. | 2411.08221 | null |
2024-11-12 | Design optimization of semiconductor manufacturing equipment using a novel multi-fidelity surrogate modeling approach | Bingran Wang et.al. | 2411.08149 | null |
2024-11-12 | Large Language Models Can Self-Improve in Long-context Reasoning | Siheng Li et.al. | 2411.08147 | link |
2024-11-12 | Language Models as Causal Effect Generators | Lucius E. J. Bynum et.al. | 2411.08019 | link |
2024-11-12 | Scalable piecewise smoothing with BART | Ryan Yee et.al. | 2411.07984 | null |
2024-11-12 | Maritime Search and Rescue Missions with Aerial Images: A Survey | Juan P. Martinez-Esteso et.al. | 2411.07649 | null |
2024-11-11 | Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models | SeungHeon Doh et.al. | 2411.07439 | link |
2024-11-11 | Feature-Space Semantic Invariance: Enhanced OOD Detection for Open-Set Domain Generalization | Haoliang Wang et.al. | 2411.07392 | null |
2024-11-11 | SynRL: Aligning Synthetic Clinical Trial Data with Human-preferred Clinical Endpoints Using Reinforcement Learning | Trisha Das et.al. | 2411.07317 | null |
2024-11-11 | DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID | Nyle Siddiqui et.al. | 2411.07205 | link |
2024-11-11 | Data-Driven Predictive Control of Nonholonomic Robots Based on a Bilinear Koopman Realization: Data Does Not Replace Geometry | Mario Rosenfelder et.al. | 2411.07192 | null |
2024-11-11 | Hierarchical Conditional Tabular GAN for Multi-Tabular Synthetic Data Generation | Wilhelm Ågren et.al. | 2411.07009 | null |
2024-11-11 | Maximizing domain generalization in fetal brain tissue segmentation: the role of synthetic data generation, intensity clustering and real image fine-tuning | Vladyslav Zalevskyi et.al. | 2411.06842 | null |
2024-11-11 | Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models | Yeming Wen et.al. | 2411.06722 | null |
2024-11-11 | DiffSR: Learning Radar Reflectivity Synthesis via Diffusion Model from Satellite Observations | Xuming He et.al. | 2411.06714 | null |
2024-11-11 | What Should Baby Models Read? Exploring Sample-Efficient Data Composition on Model Performance | Hong Meng Yam et.al. | 2411.06672 | null |
2024-11-10 | In-Context Learning for Preserving Patient Privacy: A Framework for Synthesizing Realistic Patient Portal Messages | Joseph Gatto et.al. | 2411.06549 | link |
2024-11-10 | CRTRE: Causal Rule Generation with Target Trial Emulation Framework | Junda Wang et.al. | 2411.06338 | null |
2024-11-09 | Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs | Shan Zhong et.al. | 2411.06175 | null |
2024-11-09 | Behavior-Aware Efficient Detection of Malicious EVs in V2G Systems | Ruixiang Wu et.al. | 2411.06113 | null |
2024-11-09 | A novel study on the MUSIC-type imaging of small electromagnetic inhomogeneities in the limited-aperture inverse scattering problem | Won-Kwang Park et.al. | 2411.06030 | null |
2024-11-08 | DNAMite: Interpretable Calibrated Survival Analysis with Discretized Additive Models | Mike Van Ness et.al. | 2411.05923 | link |
2024-11-08 | Differential Privacy Under Class Imbalance: Methods and Empirical Insights | Lucas Rosenblatt et.al. | 2411.05733 | null |
2024-11-08 | Evaluating Large Language Model Capability in Vietnamese Fact-Checking Data Generation | Long Truong To et.al. | 2411.05641 | null |
2024-11-08 | SynDroneVision: A Synthetic Dataset for Image-Based Drone Detection | Tamara R. Lenhard et.al. | 2411.05633 | null |
2024-11-08 | DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions | Rafael Berral-Soler et.al. | 2411.05552 | link |
2024-11-08 | A Quality-Centric Framework for Generic Deepfake Detection | Wentang Song et.al. | 2411.05335 | null |
2024-11-08 | Discovering Latent Structural Causal Models from Spatio-Temporal Data | Kun Wang et.al. | 2411.05331 | null |
2024-11-08 | Cancer-Net SCa-Synth: An Open Access Synthetically Generated 2D Skin Lesion Dataset for Skin Cancer Classification | Chi-en Amy Tai et.al. | 2411.05269 | link |
2024-11-07 | Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model | Sheng Cheng et.al. | 2411.05079 | link |
2024-11-07 | Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models | Shuhong Zheng et.al. | 2411.05005 | null |
2024-11-07 | Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification | Mischa Dombrowski et.al. | 2411.04956 | null |
2024-11-09 | OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models | Siming Huang et.al. | 2411.04905 | null |
2024-11-07 | Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation | Benito Buchheim et.al. | 2411.04724 | null |
2024-11-08 | BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages | Sparsh Jain et.al. | 2411.04699 | link |
2024-11-07 | Improved Multi-Task Brain Tumour Segmentation with Synthetic Data Augmentation | André Ferreira et.al. | 2411.04632 | link |
2024-11-07 | Enhancing Bronchoscopy Depth Estimation through Synthetic-to-Real Domain Adaptation | Qingyao Tian et.al. | 2411.04404 | null |
2024-11-06 | Generating Synthetic Electronic Health Record (EHR) Data: A Review with Benchmarking | Xingran Chen et.al. | 2411.04281 | link |
2024-11-06 | Debiasing Synthetic Data Generated by Deep Generative Models | Alexander Decruyenaere et.al. | 2411.04216 | null |
2024-11-06 | Topology Bench: Systematic Graph Based Benchmarking for Core Optical Networks | Robin Matzner et.al. | 2411.04160 | null |
2024-11-06 | GUIDE-VAE: Advancing Data Generation with User Information and Pattern Dictionaries | Kutay Bölat et.al. | 2411.03936 | null |
2024-11-06 | VQA$^2$: Visual Question Answering for Video Quality Assessment | Ziheng Jia et.al. | 2411.03795 | null |
2024-11-06 | Content-Style Learning from Unaligned Domains: Identifiability under Unknown Latent Dimensions | Sagar Shrestha et.al. | 2411.03755 | null |
2024-11-06 | Where Do We Stand with Implicit Neural Representations? A Technical and Performance Survey | Amer Essakine et.al. | 2411.03688 | null |
2024-11-06 | Open-Source High-Speed Flight Surrogate Modeling Framework | Tyler E. Korenyi-Both et.al. | 2411.03598 | null |
2024-11-05 | Forecasting Outside the Box: Application-Driven Optimal Pointwise Forecasts for Stochastic Optimization | Tito Homem-de-Mello et.al. | 2411.03520 | null |
2024-11-04 | Enhancing Table Representations with LLM-powered Synthetic Data Generation | Dayu Yang et.al. | 2411.03356 | null |
2024-11-05 | DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | Ying Zhou et.al. | 2411.03250 | null |
2024-11-05 | A data-driven study on Implicit LES using a spectral difference method | Nicola Clinco et.al. | 2411.03211 | null |
2024-11-05 | Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting | Adrian B. Chłopowiec et.al. | 2411.03098 | null |
2024-11-05 | Speech Separation with Pretrained Frontend to Minimize Domain Mismatch | Wupeng Wang et.al. | 2411.03085 | null |
2024-11-05 | Controlling for Unobserved Confounding with Large Language Model Classification of Patient Smoking Status | Samuel Lee et.al. | 2411.03004 | null |
2024-11-05 | IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems | Heiko Oppel et.al. | 2411.02954 | null |
2024-11-05 | SpiDR: A Reconfigurable Digital Compute-in-Memory Spiking Neural Network Accelerator for Event-based Perception | Deepika Sharma et.al. | 2411.02854 | null |
2024-11-05 | On the Comparison between Multi-modal and Single-modal Contrastive Learning | Wei Huang et.al. | 2411.02837 | null |
2024-11-04 | Combining Induction and Transduction for Abstract Reasoning | Wen-Ding Li et.al. | 2411.02272 | link |
2024-11-06 | Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | Xingwu Sun et.al. | 2411.02265 | link |
2024-11-06 | Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models | Anjith George et.al. | 2411.02188 | null |
2024-11-04 | Generating the Traces You Need: A Conditional Generative Model for Process Mining Data | Riccardo Graziosi et.al. | 2411.02131 | link |
2024-11-04 | GDP nowcasting with large-scale inter-industry payment data in real time – A network approach | Anastasia Mantziou et.al. | 2411.02029 | null |
2024-11-04 | Learning Where to Edit Vision Transformers | Yunqiao Yang et.al. | 2411.01948 | link |
2024-11-04 | Exploring the Landscape for Generative Sequence Models for Specialized Data Synthesis | Mohammad Zbeeb et.al. | 2411.01929 | link |
2024-11-04 | ManiBox: Enhancing Spatial Grasping Generalization via Scalable Simulation Data Generation | Hengkai Tan et.al. | 2411.01850 | null |
2024-11-04 | DiffuMask-Editor: A Novel Paradigm of Integration Between the Segmentation Diffusion Model and Image Editing to Improve Segmentation Ability | Bo Gao et.al. | 2411.01819 | null |
2024-11-03 | Enhancing Forecasts Using Real-Time Data Flow and Hierarchical Forecast Reconciliation, with Applications to the Energy Sector | Lukas Neubauer et.al. | 2411.01528 | link |
2024-11-03 | Privacy-Preserving Customer Churn Prediction Model in the Context of Telecommunication Industry | Joydeb Kumar Sana et.al. | 2411.01447 | null |
2024-11-02 | Network Causal Effect Estimation In Graphical Models Of Contagion And Latent Confounding | Yufeng Wu et.al. | 2411.01371 | null |
2024-11-02 | Guided Synthesis of Labeled Brain MRI Data Using Latent Diffusion Models for Segmentation of Enlarged Ventricles | Tim Ruschke et.al. | 2411.01351 | null |
2024-11-02 | Marginal Causal Flows for Validation and Inference | Daniel de Vassimon Manela et.al. | 2411.01295 | link |
2024-11-02 | Efficient Collaborative Navigation through Perception Fusion for Multi-Robots in Unknown Environments | Qingquan Lin et.al. | 2411.01274 | null |
2024-11-01 | SelfCodeAlign: Self-Alignment for Code Generation | Yuxiang Wei et.al. | 2410.24198 | link |
2024-10-31 | DexMimicGen: Automated Data Generation for Bimanual Dexterous Manipulation via Imitation Learning | Zhenyu Jiang et.al. | 2410.24185 | null |
2024-10-31 | Constraint Back-translation Improves Complex Instruction Following of Large Language Models | Yunjia Qi et.al. | 2410.24175 | null |
2024-11-02 | $π_0$: A Vision-Language-Action Flow Model for General Robot Control | Kevin Black et.al. | 2410.24164 | null |
2024-10-31 | Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure | Xiang Li et.al. | 2410.24060 | link |
2024-10-31 | Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities | Hatef Otroshi Shahreza et.al. | 2410.24015 | null |
2024-10-31 | Towards Fast Algorithms for the Preference Consistency Problem Based on Hierarchical Models | Anne-Marie George et.al. | 2410.23934 | null |
2024-10-31 | Bayesian Hierarchical Model for Synthesizing Registry and Survey Data on Female Breast Cancer Prevalence | Qiao Wang et.al. | 2410.23580 | null |
2024-10-30 | Neural spell-checker: Beyond words with synthetic data generation | Matej Klemen et.al. | 2410.23514 | link |
2024-10-30 | Development and Comparative Analysis of Machine Learning Models for Hypoxemia Severity Triage in CBRNE Emergency Scenarios Using Physiological and Demographic Data from Medical-Grade Devices | Santino Nanini et.al. | 2410.23503 | null |
2024-10-30 | PACER: Preference-conditioned All-terrain Costmap Generation | Luisa Mao et.al. | 2410.23488 | null |
2024-10-30 | Multilingual Vision-Language Pre-training for the Remote Sensing Domain | João Daniel Silva et.al. | 2410.23370 | link |
2024-10-30 | Strategic communication of narratives | Gerrit Bauch et.al. | 2410.23259 | null |
2024-10-31 | Enhancing Autonomous Driving Safety Analysis with Generative AI: A Comparative Study on Automated Hazard and Risk Assessment | Alireza Abbaspour et.al. | 2410.23207 | null |
2024-10-30 | Directional anomaly detection | Oliver Urs Lenz et.al. | 2410.23158 | null |
2024-10-30 | Federated Learning under Periodic Client Participation and Heterogeneous Data: A New Communication-Efficient Algorithm and Analysis | Michael Crawshaw et.al. | 2410.23131 | link |
2024-10-30 | Automated Image-Based Identification and Consistent Classification of Fire Patterns with Quantitative Shape Analysis and Spatial Location Identification | Pengkun Liu et.al. | 2410.23105 | null |
2024-10-30 | CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Mingkun Zhang et.al. | 2410.23091 | null |
2024-10-30 | Private Synthetic Text Generation with Diffusion Models | Sebastian Ochs et.al. | 2410.22971 | link |
2024-10-30 | Augmenting Polish Automatic Speech Recognition System With Synthetic Data | Łukasz Bondaruk et.al. | 2410.22903 | null |
2024-10-30 | Universality of the $π^2/6$ Pathway in Avoiding Model Collapse | Apratim Dey et.al. | 2410.22812 | link |
2024-10-30 | Analysis of Classifier Training on Synthetic Data for Cross-Domain Datasets | Andoni Cortés et.al. | 2410.22748 | null |
2024-10-29 | Unpicking Data at the Seams: VAEs, Disentanglement and Independent Components | Carl Allen et.al. | 2410.22559 | null |
2024-10-29 | Evaluating utility in synthetic banking microdata applications | Hugo E. Caceres et.al. | 2410.22519 | null |
2024-10-30 | Nanoscale Connectomics Annotation Standards Framework | Nicole K. Guittari et.al. | 2410.22320 | null |
2024-10-29 | Understanding Synthetic Context Extension via Retrieval Heads | Xinyu Zhao et.al. | 2410.22316 | null |
2024-10-29 | Model-free Estimation of Latent Structure via Multiscale Nonparametric Maximum Likelihood | Bryon Aragam et.al. | 2410.22248 | null |
2024-10-29 | Synthetic Data Generation with Large Language Models for Personalized Community Question Answering | Marco Braga et.al. | 2410.22182 | link |
2024-10-29 | Data Generation for Hardware-Friendly Post-Training Quantization | Lior Dikstein et.al. | 2410.22110 | null |
2024-10-29 | Cross-Entropy Is All You Need To Invert the Data Generating Process | Patrik Reizinger et.al. | 2410.21869 | null |
2024-10-29 | Generating Realistic Tabular Data with Large Language Models | Dang Nguyen et.al. | 2410.21717 | null |
2024-10-28 | Identifying Selections for Unsupervised Subtask Discovery | Yiwen Qiu et.al. | 2410.21616 | null |
2024-10-28 | Approximate Bayesian Computation with Statistical Distances for Model Selection | Clara Grazian et.al. | 2410.21603 | link |
2024-10-28 | Unveiling Context-Aware Criteria in Self-Assessing LLMs | Taneesh Gupta et.al. | 2410.21545 | null |
2024-10-28 | Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification | Hsun-Yu Kuo et.al. | 2410.21526 | null |
2024-10-28 | LLM-Forest for Health Tabular Data Imputation | Xinrui He et.al. | 2410.21520 | null |
2024-10-28 | Inferring the Morphology of the Galactic Center Excess with Gaussian Processes | Edward D. Ramirez et.al. | 2410.21367 | link |
2024-10-28 | Reconstructing dynamics from sparse observations with no training on target system | Zheng-Meng Zhai et.al. | 2410.21222 | null |
2024-10-29 | Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction | Qintong Zhang et.al. | 2410.21169 | null |
2024-10-28 | Synthetica: Large Scale Synthetic Data for Robot Perception | Ritvik Singh et.al. | 2410.21153 | null |
2024-10-28 | Topological Identification of Agent Status in Information Contagions: Application to Financial Markets | Anubha Goel et.al. | 2410.21104 | link |
2024-10-28 | Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models | Wenda Li et.al. | 2410.21088 | link |
2024-10-28 | Federated Time Series Generation on Feature and Temporally Misaligned Data | Chenrui Fan et.al. | 2410.21072 | null |
2024-10-28 | Push-Forward Signed Distance Functions enable interpretable and robust continuous shape quantification | Roua Rouatbi et.al. | 2410.21004 | null |
2024-10-29 | Valid Bootstraps for Networks with Applications to Network Visualisation | Emerald Dilworth et.al. | 2410.20895 | null |
2024-10-28 | Super-resolution with dynamics in the loss | Jacob Page et.al. | 2410.20884 | null |
2024-10-29 | zGAN: An Outlier-focused Generative Adversarial Network For Realistic Synthetic Data Generation | Azizjon Azimi et.al. | 2410.20808 | null |
2024-10-28 | Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training | Michael Pieler et.al. | 2410.20796 | null |
2024-10-28 | Scaling-based Data Augmentation for Generative Models and its Theoretical Extension | Yoshitaka Koike et.al. | 2410.20780 | null |
2024-10-28 | Plan$\times$RAG: Planning-guided Retrieval Augmented Generation | Prakhar Verma et.al. | 2410.20753 | null |
2024-10-28 | General Causal Imputation via Synthetic Interventions | Marco Jiralerspong et.al. | 2410.20647 | null |
2024-10-29 | TabDiff: a Multi-Modal Diffusion Model for Tabular Data Generation | Juntong Shi et.al. | 2410.20626 | link |
2024-10-25 | Considerations for Distribution Shift Robustness of Diagnostic Models in Healthcare | Arno Blaas et.al. | 2410.19575 | null |
2024-10-25 | EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data | Xuetian Chen et.al. | 2410.19461 | null |
2024-10-25 | Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning | Yujian Liu et.al. | 2410.19290 | link |
2024-10-25 | In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators | Dmytro Humeniuk et.al. | 2410.19277 | null |
2024-10-24 | Equitable Federated Learning with Activation Clustering | Antesh Upadhyay et.al. | 2410.19207 | null |
2024-10-24 | Heterogeneous Random Forest | Ye-eun Kim et.al. | 2410.19022 | link |
2024-10-24 | Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms | Zhangheng Li et.al. | 2410.18967 | null |
2024-10-24 | SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment | Caelan Garrett et.al. | 2410.18907 | null |
2024-10-24 | Distill Visual Chart Reasoning Ability from LLMs to MLLMs | Wei He et.al. | 2410.18798 | link |
2024-10-24 | Learning Geodesics of Geometric Shape Deformations From Images | Nian Wu et.al. | 2410.18797 | null |
2024-10-24 | Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch | Yuyang Ding et.al. | 2410.18693 | link |
2024-10-24 | DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation | Yuang Ai et.al. | 2410.18666 | link |
2024-10-24 | Little Giants: Synthesizing High-Quality Embedding Data at Scale | Haonan Chen et.al. | 2410.18634 | link |
2024-10-24 | Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data | Anup Shirgaonkar et.al. | 2410.18588 | null |
2024-10-24 | Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data | Shuhao Gu et.al. | 2410.18558 | null |