This article is the checklist version; for the detailed write-up, see:

60,000 Words! 30 Topics, 130 Papers! The Most Complete Collection of CVPR 2023 AIGC Papers, Read in One Go

Follow the WeChat official account 【机器学习与AI生成创作】 and reply AIGC in the chat (long-press the red text to select and copy) to receive the paper collection, categorized and organized into folders — let's GAN!!!

  • 1. Image-to-Image Translation

  • 2. GAN Improvements / Controllable GANs

  • 3. Controllable / Customized Text-to-Image Generation

  • 4. Image Restoration

  • 5. Layout-Controllable Generation

  • 6. Medical Imaging

  • 7. Face-Related

  • 8. 3D-Related

  • 9. Deepfake Detection

  • 10. Image Super-Resolution

  • 11. Style Transfer

  • 12. Deraining / Denoising / Deblurring

  • 13. Image Segmentation

  • 14. Video-Related

  • 15. Adversarial Attacks

  • 16. Diffusion Model Improvements

  • 17. Data Augmentation

  • 18. Talking-Head Generation

  • 19. View Synthesis

  • 20. Object Detection

  • 21. Person Image Generation / Pose Transfer

  • 22. Hairstyle Transfer

  • 23. Image Inpainting

  • 24. Representation Learning

  • 25. Audio-Related

  • 26. Domain Adaptation / Transfer Learning

  • 27. Knowledge Distillation

  • 28. Font Generation

  • 29. Anomaly Detection

  • 30. Datasets

1. Image-to-Image Translation

  • 1、Masked and Adaptive Transformer for Exemplar Based Image Translation

  • 2、LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

  • 3、Interactive Cartoonization with Controllable Perceptual Factors

  • 4、LightPainter: Interactive Portrait Relighting with Freehand Scribble

  • 5、Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

  • 6、Few-shot Semantic Image Synthesis with Class Affinity Transfer

2. GAN Improvements / Controllable GANs

  • 7、CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing

  • 8、Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models

  • 9、Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis

  • 10、Fix the Noise: Disentangling Source Feature for Transfer Learning of StyleGAN

  • 11、Improving GAN Training via Feature Space Shrinkage

  • 12、Look ATME: The Discriminator Mean Entropy Needs Attention

  • 13、NoisyTwins: Class-Consistent and Diverse Image Generation through StyleGANs

  • 14、DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation

  • 15、Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint

  • 16、SIEDOB: Semantic Image Editing by Disentangling Object and Background

3. Controllable / Customized Text-to-Image Generation

  • 17、DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

  • 18、Ablating Concepts in Text-to-Image Diffusion Models

  • 19、Multi-Concept Customization of Text-to-Image Diffusion

  • 20、Imagic: Text-Based Real Image Editing with Diffusion Models

  • 21、Shifted Diffusion for Text-to-image Generation

  • 22、SpaText: Spatio-Textual Representation for Controllable Image Generation

  • 23、Scaling up GANs for Text-to-Image Synthesis

  • 24、GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis

  • 25、Variational Distribution Learning for Unsupervised Text-to-Image Generation

4. Image Restoration

  • 26、Bitstream-Corrupted JPEG Images are Restorable: Two-stage Compensation and Alignment Framework for Image Restoration

  • 27、Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank

  • 28、Efficient and Explicit Modelling of Image Hierarchies for Image Restoration

  • 29、Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera

  • 30、Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement

  • 31、Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Model

  • 32、Robust Model-based Face Reconstruction through Weakly-Supervised Outlier Segmentation

  • 33、Robust Unsupervised StyleGAN Image Restoration

5. Layout-Controllable Generation

  • 34、LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

  • 35、LayoutDM: Discrete Diffusion Model for Controllable Layout Generation

  • 36、PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout

  • 37、Unifying Layout Generation with a Decoupled Diffusion Model

  • 38、Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation

6. Medical Imaging

  • 39、High-resolution image reconstruction with latent diffusion models from human brain activity

  • 40、Leveraging GANs for data scarcity of COVID-19: Beyond the hype

  • 41、Why is the winner the best?

  • 45、Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models

7. Face-Related

  • 46、A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images

  • 47、DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration

  • 48、DiffusionRig: Learning Personalized Priors for Facial Appearance Editing

  • 49、Fine-Grained Face Swapping via Regional GAN Inversion

  • 50、SunStage: Portrait Reconstruction and Relighting using the Sun as a Light Stage

8. 3D-Related

  • 51、3DQD: Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process

  • 52、Controllable Mesh Generation Through Sparse Latent Point Diffusion Models

  • 53、GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

  • 54、GINA-3D: Learning to Generate Implicit Neural Assets in the Wild

  • 55、Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images

  • 56、HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images

  • 57、Learning 3D-aware Image Synthesis with Unknown Pose Distribution

  • 58、Lift3D: Synthesize 3D Training Data by Lifting 2D GAN to 3D Generative Radiance Field

  • 59、Magic3D: High-Resolution Text-to-3D Content Creation

  • 60、NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images

  • 61、NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

  • 62、Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars

  • 63、SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

  • 64、SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation

  • 65、Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models

  • 66、T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations

  • 67、TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision

9. Deepfake Detection

  • 68、Detecting and Grounding Multi-Modal Media Manipulation

10. Image Super-Resolution

  • 69、Activating More Pixels in Image Super-Resolution Transformer

  • 70、Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild

  • 71、Implicit Diffusion Models for Continuous Super-Resolution

  • 72、Perception-Oriented Single Image Super-Resolution using Optimal Objective Estimation

  • 73、Structured Sparsity Learning for Efficient Video Super-Resolution

  • 74、Super-Resolution Neural Operator

  • 75、Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting

11. Style Transfer

  • 76、CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer

  • 77、Inversion-Based Style Transfer with Diffusion Models

  • 78、Neural Preset for Color Style Transfer

12. Deraining / Denoising / Deblurring

  • 79、Learning A Sparse Transformer Network for Effective Image Deraining

  • 80、Masked Image Training for Generalizable Deep Image Denoising

  • 81、Uncertainty-Aware Unsupervised Image Deblurring with Deep Residual Prior

13. Image Segmentation

  • 82、DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation

  • 83、Generative Semantic Segmentation

  • 84、Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs

  • 85、Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

14. Video-Related

  • 86、A Dynamic Multi-Scale Voxel Flow Network for Video Prediction

  • 87、A Unified Pyramid Recurrent Network for Video Frame Interpolation

  • 88、Conditional Image-to-Video Generation with Latent Flow Diffusion Models

  • 89、Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding

  • 90、Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation

  • 91、MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

  • 92、MOSO: Decomposing MOtion, Scene and Object for Video Prediction

  • 93、Text-Visual Prompting for Efficient 2D Temporal Video Grounding

  • 94、Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers

  • 95、Video Probabilistic Diffusion Models in Projected Latent Space

  • 96、VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

15. Adversarial Attacks

  • 97、Adversarial Attack with Raindrops

  • 98、TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

16. Diffusion Model Improvements

  • 99、All are Worth Words: A ViT Backbone for Diffusion Models

  • 100、Towards Practical Plug-and-Play Diffusion Models

  • 101、Wavelet Diffusion Models are fast and scalable Image Generators

17. Data Augmentation

  • 102、DCFace: Synthetic Face Generation with Dual Condition Diffusion Model

  • 103、Leveraging GANs for data scarcity of COVID-19: Beyond the hype

  • 104、Lift3D: Synthesize 3D Training Data by Lifting 2D GAN to 3D Generative Radiance Field

18. Talking-Head Generation

  • 105、MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation

  • 106、Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

19. View Synthesis

  • 107、Consistent View Synthesis with Pose-Guided Diffusion Models

20. Object Detection

  • 108、Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains

21. Person Image Generation / Pose Transfer

  • 109、Person Image Synthesis via Denoising Diffusion Model

  • 110、VGFlow: Visibility guided Flow Network for Human Reposing

22. Hairstyle Transfer

  • 111、StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer

23. Image Inpainting

  • 112、SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model

24. Representation Learning

  • 113、GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

25. Audio-Related

  • 114、Conditional Generation of Audio from Video via Foley Analogies

  • 115、Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos

  • 116、Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment

26. Domain Adaptation / Transfer Learning

  • 117、Back to the Source: Diffusion-Driven Test-Time Adaptation

  • 118、Domain Expansion of Image Generators

  • 119、Zero-shot Generative Model Adaptation via Image-specific Prompt Learning

27. Knowledge Distillation

  • 120、KD-DLGAN: Data Limited Image Generation via Knowledge Distillation

28. Font Generation

  • 121、CF-Font: Content Fusion for Few-shot Font Generation

  • 122、Handwritten Text Generation from Visual Archetypes

29. Anomaly Detection

  • 123、SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection

30. Datasets

  • 124、An Image Quality Assessment Dataset for Portraits

  • 125、CelebV-Text: A Large-Scale Facial Text-Video Dataset

  • 126、Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes

  • 127、Uncurated Image-Text Datasets: Shedding Light on Demographic Bias

Follow the official account 【机器学习与AI生成创作】 for more great reads:

Stable Diffusion explained: a walkthrough of the latent diffusion model paper behind AI painting

ControlNet explained: a controllable-generation algorithm for AIGC painting!

A classic GAN you have to read: StyleGAN

Click here to browse the GAN series collection!

For the price of a cup of milk tea, ride the cutting edge of AIGC + CV!

The latest and most complete roundup: 100 papers on generative diffusion models

ECCV 2022 | A roundup of GAN papers

CVPR 2022 | 50 recent GAN papers across 25+ topics

ICCV 2021 | GAN papers organized into 35 topics

110+ papers! The most complete CVPR 2021 GAN paper roundup

100+ papers! The most complete CVPR 2020 GAN paper roundup

A GAN that disassembles and recombines: disentangled representations with MixNMatch

StarGAN v2: diverse image generation across multiple domains

Download | Interpretable Machine Learning (Chinese edition)

Download | Hands-On Deep Learning Algorithms with TensorFlow 2.0

Download | Mathematical Methods in Computer Vision

A Survey of Deep-Learning-Based Surface Defect Detection Methods

A Survey of Zero-Shot Image Classification: A Decade of Progress

A Survey of Few-Shot Learning Based on Deep Neural Networks

As the Record of Learning (Liji · Xueji) says: to study alone without companions is to remain shallow and ill-informed.

Click "For the price of a cup of milk tea, ride the cutting edge of AIGC + CV!" to join the AI生成创作与计算机视觉 Knowledge Planet!

Original article: https://mp.weixin.qq.com/s?__biz=MzU5MTgzNzE0MA==&mid=2247500553&idx=1&sn=d878ff781f4aafd84a7c8284bfd2e664&chksm=fe2a61b2c95de8a4de30949bb42ddb12988123bc813798560a3f005b936263de83fddd40f4fa&scene=126&sessionid=0