Paper Weekly | Latest Research Progress in Recommender Systems
2023-10-05 23:19
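First of all, happy National Day to everyone!
This post selects 15 recommender-system papers released last week (Sept. 25 to Oct. 1). The main research directions include robust recommendation, knowledge distillation in recommendation, fairness in recommendation, recommendation datasets, multi-modal recommendation, multi-behavior recommendation, and the cold-start problem in recommendation.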
1. Toward Robust Recommendation via Real-time Vicinal Defense
2. Interactive Content Diversity and User Exploration in Online Movie Recommenders: A Field Experiment
3. Beyond Co-occurrence: Multi-modal Session-based Recommendation, TKDE2023
4. Multi-Granularity Click Confidence Learning via Self-Distillation in Recommendation
5. Clinical Trial Recommendations Using Semantics-Based Inductive Inference and Knowledge Graph Embeddings
6. Enhancing Cross-Category Learning in Recommendation Systems with Multi-Layer Embedding Training, ACML2023
7. Automatic Feature Fairness in Recommendation via Adversaries, SIGIR2023
8. A Content-Driven Micro-Video Recommendation Dataset at Scale
9. LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation, WWW2023
10. Modeling Multi-aspect Preferences and Intents for Multi-behavioral Sequential Recommendation, KBS2023
11. Algorithmic Collusion or Competition: the Role of Platforms' Recommender Systems
12. Related Rhythms: Recommendation System To Discover Music You May Like
13. On the Sweet Spot of Contrastive Views for Knowledge-enhanced Recommendation
14. Cold & Warm Net: Addressing Cold-Start Users in Recommender Systems
15. The Role of Document Embedding in Research Paper Recommender Systems: To Breakdown or to Bolster Disciplinary Borders?
1. Toward Robust Recommendation via Real-time Vicinal Defense
Yichang Xu, Chenwang Wu, Defu Lian
https://arxiv.org/abs/2309.17278
Recommender systems have been shown to be vulnerable to poisoning attacks, where malicious data is injected into the dataset to cause the recommender system to provide biased recommendations. To defend against such attacks, various robust learning methods have been proposed. However, most methods are model-specific or attack-specific, making them lack generality, while other methods, such as adversarial training, are oriented towards evasion attacks and thus have a weak defense strength in poisoning attacks.
In this paper, we propose a general method, Real-time Vicinal Defense (RVD), which leverages neighboring training data to fine-tune the model before making a recommendation for each user. RVD works in the inference phase to ensure the robustness of the specific sample in real-time, so there is no need to change the model structure and training process, making it more practical. Extensive experimental results demonstrate that RVD effectively mitigates targeted poisoning attacks across various models without sacrificing accuracy. Moreover, the defensive effect can be further amplified when our method is combined with other strategies.
2. Interactive Content Diversity and User Exploration in Online Movie Recommenders: A Field Experiment
Ruixuan Sun, Avinash Akella, Ruoyan Kong, Moyan Zhou, Joseph A. Konstan
https://arxiv.org/abs/2309.13296
Recommender systems often struggle to strike a balance between matching users' tastes and providing unexpected recommendations. When recommendations are too narrow and fail to cover the full range of users' preferences, the system is perceived as useless. Conversely, when the system suggests too many items that users don't like, it is considered impersonal or ineffective. To better understand user sentiment about the breadth of recommendations given by a movie recommender, we conducted interviews and surveys and found that many users considered narrow recommendations to be useful, while a smaller number explicitly wanted greater breadth. Additionally, we designed and ran an online field experiment with a larger user group, evaluating two new interfaces designed to provide users with greater access to broader recommendations. We looked at user preferences and behavior for two groups of users: those with higher initial movie diversity and those with lower diversity. Among our findings, we discovered that different levels of exploration control and users' subjective preferences on interfaces are more predictive of their satisfaction with the recommender.
3. Beyond Co-occurrence: Multi-modal Session-based Recommendation, TKDE2023
Xiaokun Zhang, Bo Xu, Fenglong Ma, Chenliang Li, Liang Yang, Hongfei Lin
https://arxiv.org/abs/2309.17037
Session-based recommendation is devoted to characterizing preferences of anonymous users based on short sessions. Existing methods mostly focus on mining limited item co-occurrence patterns exposed by item ID within sessions, while ignoring that what attracts users to engage with certain items is the rich multi-modal information displayed on pages. Generally, the multi-modal information can be classified into two categories: descriptive information (e.g., item images and description text) and numerical information (e.g., price). In this paper, we aim to improve session-based recommendation by modeling the above multi-modal information holistically. There are mainly three issues in revealing user intent from multi-modal information: (1) How to extract relevant semantics from heterogeneous descriptive information with different noise? (2) How to fuse this heterogeneous descriptive information to comprehensively infer user interests? (3) How to handle the probabilistic influence of numerical information on user behaviors? To solve the above issues, we propose a novel multi-modal session-based recommendation (MMSBR) that models both descriptive and numerical information under a unified framework. Specifically, a pseudo-modality contrastive learning is devised to enhance the representation learning of descriptive information. Afterwards, a hierarchical pivot transformer is presented to fuse heterogeneous descriptive information. Moreover, we represent numerical information with Gaussian distribution and design a Wasserstein self-attention to handle the probabilistic influence mode. Extensive experiments on three real-world datasets demonstrate the effectiveness of the proposed MMSBR. Further analysis also proves that our MMSBR can alleviate the cold-start problem in SBR effectively.
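To make the idea of attention over Gaussian-represented prices more concrete, here is a minimal sketch. It is not the MMSBR implementation: the abstract does not give the exact formulation, so this assumes diagonal Gaussians, uses the standard closed form of the squared 2-Wasserstein distance for that case (squared difference of means plus squared difference of standard deviations), and turns negative distances into attention weights with a softmax. The function names and the toy prices are hypothetical.

```python
import math

def w2_squared(mu_a, sigma_a, mu_b, sigma_b):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    Each Gaussian is given as a list of means and a list of standard deviations.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_a, sigma_b))
    return mean_term + std_term

def wasserstein_attention(query, keys):
    """Softmax over negative squared distances (an assumed scoring rule):
    items whose distributions are closer to the query get more weight."""
    scores = [-w2_squared(query[0], query[1], mu, sigma) for mu, sigma in keys]
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy session: each item's price is a 1-D Gaussian (mean, std). Values are made up.
prices = [([10.0], [1.0]), ([11.0], [1.5]), ([50.0], [5.0])]
weights = wasserstein_attention(prices[0], prices)
```

In this toy session the first item attends mostly to itself and to the similarly priced second item, while the far more expensive third item receives almost no weight.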
4. Multi-Granularity Click Confidence Learning via Self-Distillation in Recommendation
Chong Liu, Xiaoyang Liu, Lixin Zhang, Feng Xia, Leyu Lin
https://arxiv.org/abs/2309.16322
Recommendation systems rely on historical clicks to learn user interests and provide appropriate items. However, current studies tend to treat clicks equally, which may ignore the assorted intensities of user interests in different clicks. In this paper, we aim to achieve multi-granularity Click confidence Learning via Self-Distillation in recommendation (CLSD). Due to the lack of supervised signals in click confidence, we first apply self-supervised learning to obtain click confidence scores via a global self-distillation method. After that, we define a local confidence function to adapt confidence scores at the user group level, since the confidence distributions can be varied among user groups. With the combination of multi-granularity confidence learning, we can distinguish the quality of clicks and model user interests more accurately without involving extra data and model structures. The significant improvements over different backbones on industrial offline and online experiments in a real-world recommender system prove the effectiveness of our model. Recently, CLSD has been deployed on a large-scale recommender system, affecting over 400 million users.
5. Clinical Trial Recommendations Using Semantics-Based Inductive Inference and Knowledge Graph Embeddings
Murthy V. Devarakonda, Smita Mohanty, Raja Rao Sunkishala, Nag Mallampalli, Xiong Liu
https://arxiv.org/abs/2309.15979
Designing a new clinical trial entails many decisions, such as defining a cohort and setting the study objectives to name a few, and therefore can benefit from recommendations based on exhaustive mining of past clinical trial records. Here, we propose a novel recommendation methodology, based on neural embeddings trained on a first-of-a-kind knowledge graph of clinical trials. We addressed several important research questions in this context, including designing a knowledge graph (KG) for clinical trial data, effectiveness of various KG embedding (KGE) methods for it, a novel inductive inference using KGE, and its use in generating recommendations for clinical trial design. We used publicly available data from
6. Enhancing Cross-Category Learning in Recommendation Systems with Multi-Layer Embedding Training, ACML2023
Zihao Deng, Benjamin Ghaemmaghami, Ashish Kumar Singh, Benjamin Cho, Leo Orshansky, Mattan Erez, Michael Orshansky
https://arxiv.org/abs/2309.15881
Modern DNN-based recommendation systems rely on training-derived embeddings of sparse features. Input sparsity makes obtaining high-quality embeddings for rarely-occurring categories harder as their representations are updated infrequently. We demonstrate a training-time technique to produce superior embeddings via effective cross-category learning and theoretically explain its surprising effectiveness. The scheme, termed the multi-layer embeddings training (MLET), trains embeddings using factorization of the embedding layer, with an inner dimension higher than the target embedding dimension. For inference efficiency, MLET converts the trained two-layer embedding into a single-layer one thus keeping inference-time model size unchanged.
Empirical superiority of MLET is puzzling as its search space is not larger than that of the single-layer embedding. The strong dependence of MLET on the inner dimension is even more surprising. We develop a theory that explains both of these behaviors by showing that MLET creates an adaptive update mechanism modulated by the singular vectors of embeddings. When tested on multiple state-of-the-art recommendation models for click-through rate (CTR) prediction tasks, MLET consistently produces better models, especially for rare items. At constant model quality, MLET allows embedding dimension, and model size, reduction by up to 16x, and 5.8x on average, across the models.
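The collapse step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the sizes, the random weights standing in for trained ones, and the helper names are all hypothetical. It shows that a two-layer embedding with an inner dimension larger than the target dimension can be multiplied out into a single table of the target size, so lookups at inference time are unchanged.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

random.seed(0)
num_categories, inner_dim, embed_dim = 5, 8, 4   # inner_dim > embed_dim, as in MLET

# Stand-ins for the trained factors: w1 is (categories x inner), w2 is (inner x embed).
w1 = [[random.gauss(0, 1) for _ in range(inner_dim)] for _ in range(num_categories)]
w2 = [[random.gauss(0, 1) for _ in range(embed_dim)] for _ in range(inner_dim)]

def two_layer_lookup(category_id):
    """Training-time view: pick a row of w1, then project it through w2."""
    return matmul([w1[category_id]], w2)[0]

# Inference-time view: collapse once into a single (categories x embed) table.
collapsed = matmul(w1, w2)

def single_layer_lookup(category_id):
    return collapsed[category_id]
```

Because `w2` is shared by every category, a gradient step caused by one category also moves the collapsed embeddings of the others, which is one way to read the "cross-category learning" in the abstract.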
7. Automatic Feature Fairness in Recommendation via Adversaries, SIGIR2023
Hengchang Hu, Yiming Cao, Zhankui He, Samson Tan, Min-Yen Kan
https://arxiv.org/abs/2309.15418
Fairness is a widely discussed topic in recommender systems, but its practical implementation faces challenges in defining sensitive features while maintaining recommendation accuracy. We propose feature fairness as the foundation to achieve equitable treatment across diverse groups defined by various feature combinations. This improves overall accuracy through balanced feature generalizability. We introduce unbiased feature learning through adversarial training, using adversarial perturbation to enhance feature representation. The adversaries improve model generalization for under-represented features. We adapt adversaries automatically based on two forms of feature biases: frequency and combination variety of feature values. This allows us to dynamically adjust perturbation strengths and adversarial training weights. Stronger perturbations are applied to feature values with fewer combination varieties to improve generalization, while higher weights for low-frequency features address training imbalances. We leverage the Adaptive Adversarial perturbation based on the widely-applied Factorization Machine (AAFM) as our backbone model. In experiments, AAFM surpasses strong baselines in both fairness and accuracy measures. AAFM excels in providing item- and user-fairness for single- and multi-feature tasks, showcasing their versatility and scalability. To maintain good accuracy, we find that adversarial perturbation must be well-managed: during training, perturbations should not overly persist and their strengths should decay.
8. A Content-Driven Micro-Video Recommendation Dataset at Scale
Yongxin Ni, Yu Cheng, Xiangyan Liu, Junchen Fu, Youhua Li, Xiangnan He, Yongfeng Zhang, Fajie Yuan
https://arxiv.org/abs/2309.15379
Micro-videos have recently gained immense popularity, sparking critical research in micro-video recommendation with significant implications for the entertainment, advertising, and e-commerce industries. However, the lack of large-scale public micro-video datasets poses a major challenge for developing effective recommender systems. To address this challenge, we introduce a very large micro-video recommendation dataset, named "MicroLens", consisting of one billion user-item interaction behaviors, 34 million users, and one million micro-videos. This dataset also contains various raw modality information about videos, including titles, cover images, audio, and full-length videos. MicroLens serves as a benchmark for content-driven micro-video recommendation, enabling researchers to utilize various modalities of video information for recommendation, rather than relying solely on item IDs or off-the-shelf video features extracted from a pre-trained network. Our benchmarking of multiple recommender models and video encoders on MicroLens has yielded valuable insights into the performance of micro-video recommendation. We believe that this dataset will not only benefit the recommender system community but also promote the development of the video understanding field. Our datasets and code are available at https://github.com/westlake-repl/MicroLens
9. LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation, WWW2023
Penghang Yu, Zhiyi Tan, Guanming Lu, Bing-Kun Bao
https://arxiv.org/abs/2309.15363
Multimedia recommendation aims to predict users' future behaviors based on historical behavioral data and item's multimodal information. However, noise inherent in behavioral data, arising from unintended user interactions with uninteresting items, detrimentally impacts recommendation performance. Recently, diffusion models have achieved high-quality information generation, in which the reverse process iteratively infers future information based on the corrupted state. It meets the need of predictive tasks under noisy conditions, and inspires exploring their application to predicting user behaviors. Nonetheless, several challenges must be addressed: 1) Classical diffusion models require excessive computation, which does not meet the efficiency requirements of recommendation systems. 2) Existing reverse processes are mainly designed for continuous data, whereas behavioral information is discrete in nature. Therefore, an effective method is needed for the generation of discrete behavioral information.
10. Modeling Multi-aspect Preferences and Intents for Multi-behavioral Sequential Recommendation, KBS2023
Haobing Liu, Jianyu Ding, Yanmin Zhu, Feilong Tang, Jiadi Yu, Ruobing Jiang, Zhongwen Guo
https://arxiv.org/abs/2309.14938
Multi-behavioral sequential recommendation has recently attracted increasing attention. However, existing methods suffer from two major limitations. Firstly, user preferences and intents can be described in fine-grained detail from multiple perspectives; yet, these methods fail to capture their multi-aspect nature. Secondly, user behaviors may contain noises, and most existing methods could not effectively deal with noises. In this paper, we present an attentive recurrent model with multiple projections to capture Multi-Aspect preferences and INTents (MAINT in short). To extract multi-aspect preferences from target behaviors, we propose a multi-aspect projection mechanism for generating multiple preference representations from multiple aspects. To extract multi-aspect intents from multi-typed behaviors, we propose a behavior-enhanced LSTM and a multi-aspect refinement attention mechanism. The attention mechanism can filter out noises and generate multiple intent representations from different aspects. To adaptively fuse user preferences and intents, we propose a multi-aspect gated fusion mechanism. Extensive experiments conducted on real-world datasets have demonstrated the effectiveness of our model.
11. Algorithmic Collusion or Competition: the Role of Platforms' Recommender Systems
Xingchen Xu, Stephanie Lee, Yong Tan
https://arxiv.org/abs/2309.14548
Recent academic research has extensively examined algorithmic collusion resulting from the utilization of artificial intelligence (AI)-based dynamic pricing algorithms. Nevertheless, e-commerce platforms employ recommendation algorithms to allocate exposure to various products, and this important aspect has been largely overlooked in previous studies on algorithmic collusion. Our study bridges this important gap in the literature and examines how recommendation algorithms can determine the competitive or collusive dynamics of AI-based pricing algorithms. Specifically, two commonly deployed recommendation algorithms are examined: (i) a recommender system that aims to maximize the sellers' total profit (profit-based recommender system) and (ii) a recommender system that aims to maximize the demand for products sold on the platform (demand-based recommender system). We construct a repeated game framework that incorporates both pricing algorithms adopted by sellers and the platform's recommender system. Subsequently, we conduct experiments to observe price dynamics and ascertain the final equilibrium. Experimental results reveal that a profit-based recommender system intensifies algorithmic collusion among sellers due to its congruence with sellers' profit-maximizing objectives. Conversely, a demand-based recommender system fosters price competition among sellers and results in a lower price, owing to its misalignment with sellers' goals. Extended analyses suggest the robustness of our findings in various market scenarios. Overall, we highlight the importance of platforms' recommender systems in delineating the competitive structure of the digital marketplace, providing important insights for market participants and corresponding policymakers.
12. Related Rhythms: Recommendation System To Discover Music You May Like
Rahul Singh, Pranav Kanuparthi
https://arxiv.org/abs/2309.13544
Machine Learning models are being utilized extensively to drive recommender systems, which is a widely explored topic today. This is especially true of the music industry, where we are witnessing a surge in growth. Besides a large chunk of active users, these systems are fueled by massive amounts of data. These large-scale systems yield applications that aim to provide a better user experience and to keep customers actively engaged. In this paper, a distributed Machine Learning (ML) pipeline is delineated, which is capable of taking a subset of songs as input and producing a new subset of songs identified as being similar to the inputted subset. The publicly accessible Million Songs Dataset (MSD) enables researchers to develop and explore reasonably efficient systems for audio track analysis and recommendations, without having to access a commercialized music platform. The objective of the proposed application is to leverage an ML system trained to optimally recommend songs that a user might like.
13. On the Sweet Spot of Contrastive Views for Knowledge-enhanced Recommendation
Haibo Ye, Xinjie Li, Yuan Yao, Hanghang Tong
https://arxiv.org/abs/2309.13384
In recommender systems, knowledge graph (KG) can offer critical information that is lacking in the original user-item interaction graph (IG). Recent progress has explored this direction and shows that contrastive learning is a promising way to integrate both. However, we observe that existing KG-enhanced recommenders struggle in balancing between the two contrastive views of IG and KG, making them sometimes even less effective than simply applying contrastive learning on IG without using KG. In this paper, we propose a new contrastive learning framework for KG-enhanced recommendation. Specifically, to make full use of the knowledge, we construct two separate contrastive views for KG and IG, and maximize their mutual information; to ease the contrastive learning on the two views, we further fuse KG information into IG in a one-direction manner. Extensive experimental results on three real-world datasets demonstrate the effectiveness and efficiency of our method, compared to the state-of-the-art. Our code is available through the anonymous link: https://figshare.com/articles/conference_contribution/SimKGCL/22783382
14. Cold & Warm Net: Addressing Cold-Start Users in Recommender Systems
Xiangyu Zhang, Zongqiang Kuang, Zehao Zhang, Fan Huang, Xianfeng Tan
https://arxiv.org/abs/2309.15646
Cold-start recommendation is one of the major challenges faced by recommender systems (RS). Herein, we focus on the user cold-start problem. Recently, methods utilizing side information or meta-learning have been used to model cold-start users. However, it is difficult to deploy these methods to industrial RS. There has not been much research that pays attention to the user cold-start problem in the matching stage. In this paper, we propose Cold & Warm Net based on expert models who are responsible for modeling cold-start and warm-up users respectively. A gate network is applied to incorporate the results from two experts. Furthermore, dynamic knowledge distillation acting as a teacher selector is introduced to assist experts in better learning user representation. With comprehensive mutual information, features highly relevant to user behavior are selected for the bias net which explicitly models user behavior bias. Finally, we evaluate our Cold & Warm Net on public datasets in comparison to models commonly applied in the matching stage and it outperforms other models on all user types. The proposed model has also been deployed on an industrial short video platform and achieves a significant increase in app dwell time and user retention rate.
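The gated combination of the two experts can be sketched as follows. This is a simplified illustration, not the Cold & Warm Net architecture: in the paper the gate is a learned network, whereas here it is a fixed sigmoid of the user's interaction count, and the `scale` and `midpoint` parameters, the function names, and the example vectors are all assumptions made for the sketch.

```python
import math

def gate(interaction_count, scale=0.5, midpoint=10.0):
    """Hypothetical scalar gate: approaches 1 as a user accumulates history."""
    return 1.0 / (1.0 + math.exp(-scale * (interaction_count - midpoint)))

def fuse(cold_vec, warm_vec, interaction_count):
    """Blend the cold-start expert's and the warm expert's user representations."""
    g = gate(interaction_count)
    return [g * w + (1.0 - g) * c for c, w in zip(cold_vec, warm_vec)]

# A brand-new user leans on the cold expert; a heavy user leans on the warm one.
new_user = fuse([1.0, 0.0], [0.0, 1.0], interaction_count=0)
heavy_user = fuse([1.0, 0.0], [0.0, 1.0], interaction_count=100)
```

With this choice of gate, a user exactly at the midpoint gets an equal mix of the two experts.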
15. The Role of Document Embedding in Research Paper Recommender Systems: To Breakdown or to Bolster Disciplinary Borders?
Eoghan Cunningham, Derek Greene, Barry Smyth
https://arxiv.org/abs/2309.14984
In the extensive recommender systems literature, novelty and diversity have been identified as key properties of useful recommendations. However, these properties have received limited attention in the specific sub-field of research paper recommender systems. In this work, we argue for the importance of offering novel and diverse research paper recommendations to scientists. This approach aims to reduce siloed reading, break down filter bubbles, and promote interdisciplinary research. We propose a novel framework for evaluating the novelty and diversity of research paper recommendations that leverages methods from network analysis and natural language processing. Using this framework, we show that the choice of representational method within a larger research paper recommendation system can have a measurable impact on the nature of downstream recommendations, specifically on their novelty and diversity. We introduce a novel paper embedding method, which we demonstrate offers more innovative and diverse recommendations without sacrificing precision, compared to other state-of-the-art baselines.