CLC number: TP391.4  Document code: A  Article ID: 1671-5489(2024)02-0339-08
A Probability Method of Denoising Diffusion Based on Rough Sets
SHE Zhiyong1, GUO Xiaoxin2, FENG Yueping2, ZHANG Dongpo1
(1. School of Information Network Security, Xinjiang University of Political Science and Law,
Tumxuk 844000, Xinjiang Uygur Autonomous Region, China;
2. College of Computer Science and Technology, Jilin University, Changchun 130012, China)
Abstract: Based on the non-Markov chain denoising diffusion implicit model (DDIM), we propose a probability method of denoising diffusion based on rough sets. Rough set theory was used to equivalently partition the sampled original sequence, to construct the upper and lower approximation sets and the roughness of subsequences on the original sequence, and to obtain an effective subsequence of the non-Markov chain DDIM when the roughness was minimal. Comparative experiments were conducted with the denoising diffusion probabilistic model (DDPM) and DDIM. The experimental results show that the sequence obtained by the proposed method is an effective subsequence, and that the sampling efficiency on this sequence is better than that of DDPM.
Keywords: rough set; denoising diffusion probabilistic model; non-Markov chain denoising diffusion probabilistic model; Markov chain
Diffusion models [1] are widely used in computer vision, natural language processing, temporal data modeling, and other fields. From the high level of detail to the diversity of generated examples, they have demonstrated powerful generative capabilities. At present, diffusion models have been widely applied to various generative modeling tasks, such as image generation [2], image super-resolution [3-4], image inpainting [5], image editing [6], and image-to-image translation [7]. In addition, the latent representations learned by diffusion models have also been applied to discriminative tasks, such as image segmentation [8-9], classification [10], and anomaly detection [11], confirming the broad applicability of diffusion models.
At present, the denoising diffusion probabilistic model (DDPM) [12] has attracted much attention in the field of generation. DDPM has several advantages: it does not require aligning an approximate posterior distribution as in variational autoencoders (VAE) [13], it avoids the partition function that energy-based models (EBM) [14] struggle to handle, it needs no extra discriminator as in generative adversarial networks (GAN) [15], and it imposes none of the network constraints of normalizing flows [16]. The model provides a tractable probabilistic parameterization, with sufficient theoretical support for a stable training procedure, and its unified loss function design is highly simple. The model transforms the prior data distribution into random noise by gradually adding noise in a diffusion process: it first reaches a fully Gaussian noise distribution and then gradually denoises, thereby reconstructing brand-new samples with the same distribution. However, compared with models such as GAN [17] and VAE [18], DDPM has one drawback: its many sampling steps lead to long sampling times.
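The forward noising process described above can be sketched numerically as follows. This is a minimal illustration, not the paper's implementation: the schedule length T = 100 matches the sequence {t100, ..., t1} used in the experiments later, but the linear beta schedule and the toy 8-dimensional "image" are assumptions.

```python
import numpy as np

# Minimal sketch of the DDPM forward (noising) process.
T = 100
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (assumption)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)         # cumulative products, the signal fraction at step t

def q_sample(x0, t, eps):
    """Closed-form sample x_t ~ q(x_t | x_0): scale the signal, add Gaussian noise."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal(8)             # toy 8-pixel "image"
x_mid = q_sample(x0, T // 2, rng.standard_normal(8))
```

The reverse process then steps from t = T down to t = 1, each step removing the noise predicted by the trained network; sampling cost therefore grows linearly with T, which is exactly the drawback noted above.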
To address the above problem, Nichol et al. [19] proposed the non-Markov chain denoising diffusion implicit model (DDIM), which mainly shows that the diffusion process can be treated as a non-Markov chain process. DDIM samples on a randomly selected subsequence of the original sequence; although this improves sampling efficiency, the random selection of the subsequence cannot guarantee that every sampling run outperforms DDPM. To address this problem, this paper proposes a rough-set-based denoising diffusion probability method on the basis of DDIM, solving the problem of selecting a better sampling subsequence.
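The subsequence sampling that DDIM enables can be sketched as follows. This is a hedged illustration of the deterministic DDIM update (eta = 0): `predict_eps` is a trivial stand-in for the trained noise-prediction network, and the subsequence `tau` is chosen arbitrarily here, whereas the paper selects it by minimizing roughness.

```python
import numpy as np

# Hedged sketch of deterministic DDIM sampling on a timestep subsequence.
T = 100
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))

def predict_eps(x, t):
    # Placeholder for the trained noise predictor eps_theta(x, t); NOT the paper's model.
    return 0.1 * x

def ddim_step(x, t, t_prev):
    """One deterministic DDIM update from timestep t down to t_prev."""
    a_t, a_prev = alpha_bars[t], alpha_bars[t_prev]
    eps = predict_eps(x, t)
    x0_pred = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)   # estimate of x_0
    return np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps

tau = list(range(74, 32, -3))        # illustrative subsequence; far shorter than T steps
x = np.random.default_rng(1).standard_normal(8)
for t, t_prev in zip(tau[:-1], tau[1:]):
    x = ddim_step(x, t, t_prev)
```

Because the update only needs the cumulative products at the two endpoints of each jump, any subsequence of the original timesteps yields a valid sampler; the open question the paper addresses is which subsequence to pick.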
1 Preliminaries
1.1 Roughness
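The roughness used to select subsequences is the standard Pawlak rough-set notion: a subset is approximated from below by the equivalence classes it fully contains and from above by the classes it touches. A minimal sketch, with an illustrative partition of timesteps rather than the paper's actual equivalence relation:

```python
# Pawlak roughness of a subset X of a universe under an equivalence partition.
def roughness(X, partition):
    """Return 1 - |lower(X)| / |upper(X)|; 0 means X is crisp (exactly definable)."""
    X = set(X)
    lower, upper = set(), set()
    for cls in partition:
        cls = set(cls)
        if cls & X:                 # class intersects X -> part of upper approximation
            upper |= cls
            if cls <= X:            # class contained in X -> part of lower approximation
                lower |= cls
    if not upper:
        return 0.0
    return 1.0 - len(lower) / len(upper)

# Timesteps 1..8 partitioned into pairs; the candidate subsequence {1, 2, 3}
# straddles the class {3, 4}, so it is rough.
partition = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(roughness([1, 2, 3], partition))   # 0.5
print(roughness([1, 2], partition))      # 0.0: exactly a union of classes
```

Per the abstract and conclusion, the method searches (with a particle swarm optimization algorithm) for the subsequence whose roughness under such a partition is minimal.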
DDPM, DDIM, and the proposed method were trained with a U-net network to the same loss value. Figure 5 compares the sampling results of DDPM on the sequence {t100, t99, t98, ..., t2, t1} with those of DDIM and the proposed method on the sequence {t75, t74, t73, ..., t35, t34, t33}. The FID (Fréchet inception distance) values of DDPM, DDIM, and the proposed method are listed in Table 3.
Figure 5 shows three face images sampled by DDPM on the original sequence, and three face images sampled by DDIM and the proposed algorithm on the obtained subsequence. As seen in Figure 5, a visual comparison of the images sampled by the three algorithms shows that the sampling quality of DDIM and the proposed algorithm on the obtained subsequence is comparable to that of DDPM on the original sequence, indicating that the subsequence obtained by the proposed algorithm is an effective subsequence with similar sampling quality. Table 3 quantitatively evaluates the quality of the images generated by the three algorithms via FID; the smaller the FID, the better the generated image quality. As seen in Table 3, the FID values of all three algorithms are relatively small, indicating that the generated images are of good quality. The FID of the images generated by the proposed algorithm is close to those of DDIM and DDPM, further showing that the subsequence obtained by the proposed algorithm is effective and that the sampling quality is similar.
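The FID cited in Table 3 compares two Gaussians fitted to Inception-network features of the real and generated images. As a hedged sketch of the formula only, the example below restricts both covariances to be diagonal so that the matrix square root reduces to an element-wise square root; real FID uses full covariance matrices of Inception-v3 activations.

```python
import numpy as np

# FID between N(mu1, diag(var1)) and N(mu2, diag(var2)).
# Diagonal covariances are an illustrative simplification.
def fid_diag(mu1, var1, mu2, var2):
    mu1, var1, mu2, var2 = map(np.asarray, (mu1, var1, mu2, var2))
    mean_term = np.sum((mu1 - mu2) ** 2)
    # Tr(S1 + S2 - 2 (S1 S2)^{1/2}) with commuting diagonal S1, S2
    cov_term = np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))
    return float(mean_term + cov_term)

print(fid_diag([0, 0], [1, 1], [0, 0], [1, 1]))   # 0.0: identical distributions
print(fid_diag([1, 0], [1, 1], [0, 0], [1, 1]))   # 1.0: mean shift only
```

A score of 0 means the two feature distributions coincide, which is why lower FID indicates better generated image quality.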
To verify that the subsequence obtained by the proposed algorithm is a better sampling subsequence, Figure 6 compares the sampling results of DDIM on three randomly selected subsequences and on the subsequence obtained in this paper with the sampling results of the proposed algorithm. The FID values of the sampling results in Figure 6 are listed in Table 4.
To demonstrate the superiority of sampling on the subsequence obtained by the proposed algorithm, the three random subsequences in Figure 6 are all of equal length. Comparing the sampling results of DDIM on the three random subsequences with those of DDIM and the proposed algorithm on the obtained subsequence, the sampling quality on the obtained subsequence is clearly better than that on the three random subsequences, showing that the obtained sampling subsequence is not only effective but also better. Comparing the FID values of the images generated on different subsequences in Table 4, the FID of the images generated by DDIM and the proposed method on the obtained subsequence is significantly lower than that of DDIM on the other random subsequences, further showing that the subsequence obtained by the proposed method is a better sampling subsequence.
As seen in Figure 5 and Table 3, although the sampling results of DDPM on the original sequence are similar to those of the proposed method on the obtained subsequence, DDPM is theoretically less efficient than the proposed method, because DDPM samples on the original sequence while the proposed method samples on a subsequence. The sampling times of DDPM were 103, 104, and 102 s, while those of the proposed method were 21, 22, and 20 s. The experimental results show that the sampling efficiency of the proposed method on the obtained subsequence is more than four times that of DDPM on the original sequence, i.e., the proposed method is significantly more efficient than DDPM.
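The reported speed-up can be checked directly from the timings above:

```python
# Average sampling times from the experiment (seconds).
ddpm_times = [103, 104, 102]
ours_times = [21, 22, 20]
speedup = (sum(ddpm_times) / len(ddpm_times)) / (sum(ours_times) / len(ours_times))
print(round(speedup, 2))   # 4.9: consistent with "more than four times" faster
```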
In summary, on the basis of DDIM, this paper obtains a better sampling subsequence by combining rough sets with a particle swarm optimization algorithm. The experimental results show that the images generated on the sampling subsequence obtained by the proposed method are similar to those sampled by DDPM, indicating that the obtained subsequence is effective. The sampling results of DDIM and the proposed method on the obtained subsequence are clearly better than those of DDIM on multiple random subsequences, indicating that the obtained effective subsequence is a better subsequence, and further showing that the sampling results of DDIM on random subsequences are unstable. Finally, a comparison of the sampling times of the proposed method and DDPM, on the obtained subsequence and the original sequence respectively, shows that the proposed method is significantly more efficient than DDPM.
References
[1]SOHL-DICKSTEIN J, WEISS E, MAHESWARANATHAN N, et al. Deep Unsupervised Learning Using Nonequilibrium Thermodynamics [C]//International Conference on Machine Learning. [S.l.]: PMLR, 2015: 2256-2265.
[2]LU C, ZHOU Y H, BAO F, et al. DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps [J]. Advances in Neural Information Processing Systems, 2022, 35: 5775-5787.
[3]KAWAR B, ELAD M, ERMON S, et al. Denoising Diffusion Restoration Models [J]. Advances in Neural Information Processing Systems, 2022, 35: 23593-23606.
[4]SAHARIA C, HO J, CHAN W, et al. Image Super-resolution via Iterative Refinement [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(4): 4713-4726.
[5]LUGMAYR A, DANELLJAN M, ROMERO A, et al. Repaint: Inpainting Using Denoising Diffusion Probabilistic Models [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2022: 11461-11471.
[6]AVRAHAMI O, LISCHINSKI D, FRIED O. Blended Diffusion for Text-Driven Editing of Natural Images [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2022: 18208-18218.
[7]SAHARIA C, CHAN W, CHANG H W, et al. Palette: Image-to-Image Diffusion Models [C]//ACM SIGGRAPH 2022 Conference Proceedings. New York: ACM, 2022: 1-10.
[8]ROMBACH R, BLATTMANN A, LORENZ D, et al. High-Resolution Image Synthesis with Latent Diffusion Models [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2022: 10684-10695.
[9]LI M Y, LIN J, MENG C L, et al. Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models [J]. Advances in Neural Information Processing Systems, 2022, 35: 28858-28873.
[10]MIKUNI V, NACHMAN B. Score-Based Generative Models for Calorimeter Shower Simulation [EB/OL]. (2022-06-17)[2023-02-10]. https://arxiv.org/abs/2206.11898.
[11]PINAYA W H L, GRAHAM M S, GRAY R, et al. Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models [C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin: Springer, 2022: 705-714.
[12]HO J, JAIN A, ABBEEL P. Denoising Diffusion Probabilistic Models [J]. Advances in Neural Information Processing Systems, 2020, 33: 6840-6851.
[13]GUU K, HASHIMOTO T B, OREN Y, et al. Generating Sentences by Editing Prototypes [J]. Transactions of the Association for Computational Linguistics, 2018, 6: 437-450.
[14]YU L T, SONG Y, SONG J M, et al. Training Deep Energy-Based Models with f-Divergence Minimization [C]//International Conference on Machine Learning. [S.l.]: PMLR, 2020: 10957-10967.
[15]CRESWELL A, WHITE T, DUMOULIN V, et al. Generative Adversarial Networks: An Overview [J]. IEEE Signal Processing Magazine, 2018, 35(1): 53-65.
[16]GARCÍA G G, CASAS P, FERNÁNDEZ A, et al. On the Usage of Generative Models for Network Anomaly Detection in Multivariate Time-Series [J]. ACM SIGMETRICS Performance Evaluation Review, 2021, 48(4): 49-52.
[17]PEI H Z, REN K, YANG Y Q, et al. Towards Generating Real-World Time Series Data [C]//2021 IEEE International Conference on Data Mining (ICDM). Piscataway, NJ: IEEE, 2021: 469-478.
[18]LIANG D W, KRISHNAN R G, HOFFMAN M D, et al. Variational Autoencoders for Collaborative Filtering [C]//Proceedings of the 2018 World Wide Web Conference. New York: ACM, 2018: 689-698.
[19]NICHOL A Q, DHARIWAL P. Improved Denoising Diffusion Probabilistic Models [C]//International Conference on Machine Learning. [S.l.]: PMLR, 2021: 8162-8171.
[20]SHE Z Y, DUAN C, ZHANG L. An Image Segmentation Algorithm Using Variable Precision Least Square Rough Entropy [J]. Computer Engineering & Science, 2019, 41(4): 657-664.
(Managing editor: HAN Xiao)
Received: 2023-04-18.
First author: SHE Zhiyong (born 1990), male, Han nationality, master, lecturer, engaged in research on image processing and intelligent decision-making, E-mail: szy@xjzfu.edu.cn.
Corresponding author: FENG Yueping (born 1958), female, Han nationality, Ph.D., professor, engaged in research on computer graphics and image processing, E-mail: fengyp@jlu.edu.cn.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 82071995), the Key R&D Project of the Jilin Province Science and Technology Development Plan (Grant No. 20220201141GX), and the President's Fund of Xinjiang University of Political Science and Law (Grant Nos. XZZK2021002; XZZK2022008).