MIS2-TrueFact 2022:
Joint International Workshop on Misinformation and Misbehavior Mining on the Web &
Making a Credible Web for Tomorrow
Workshop held in conjunction with
KDD 2022
August 15th, 2022
Overview
The web is a space for all, allowing participating individuals to read, publish, and share content openly. Despite its groundbreaking benefits in areas such as education and communication, it has also become a breeding ground for misbehavior and misinformation. On one hand, any individual can reach thousands of people on the web near-instantaneously, purporting whatever they wish while shielded by anonymity. Additionally, cutting-edge techniques such as deepfakes have made it increasingly important to check the credibility and reliability of data. Similarly, the large volumes of data generated by diverse information channels such as social media, online news outlets, and crowd-sourcing contribute valuable knowledge; however, this comes with additional challenges: ascertaining the credibility of user-generated and machine-generated information, resolving conflicts among heterogeneous data sources, and identifying misinformation, rumors, and bias. This explosion of information and knowledge has led to rampant increases in vectors of misbehavior and misinformation: harassment, online scams, the spread of propaganda, hate speech, deceptive reviews, and more. Such issues have severe implications on both social and financial fronts.
MIS2-TrueFact@KDD 2022 provides a venue where researchers and practitioners from academia, government, and industry can share insights, identify new challenges and opportunities across these adjacent areas, coalesce around central and timely topics in online misinformation and misbehavior, conflict resolution, fact-checking, and the credibility assessment of claims, and present recent research advances.
The joint workshop will be held in Washington, DC on August 15th, 2022, in conjunction with ACM SIGKDD 2022.
Main themes of the workshop include, but are not limited to:
- False information on the web: Detecting fake reviews, fake news, rumors, fabricated images, videos, and more. False information on the web is multi-modal, encompassing relational data, social networks, natural language text, structured logs, images, video, etc., and a family of techniques geared toward one data source may not work for another. One major focus area of this joint workshop is data heterogeneity, multi-modality, and novel applications and data sources related to truth discovery, fact-checking, and rumor detection.
- Misbehavior and threats on the web: Detecting and preventing harmful experiences such as spam, trolling, scams, fraud, bots, coordinated attacks, cyberbullying, sockpuppets, propaganda, extremism, hate speech, flaming, and others.
- Interpretability: Exploring tools and methods that generate human-interpretable explanations, as opposed to black-box methods, since the lack of transparency and explanations remains a barrier to industry readily deploying these techniques in practice.
- Robustness: Techniques to improve the security and robustness of fake-content detection in known or unknown environments where adversaries can learn and mount arbitrary attack strategies, such as poisoning data or evading detectors.
- Resource-efficient learning: The lack of labeled training data and scarce annotation resources remain a serious challenge for truth discovery and credibility analysis, making it difficult to train state-of-the-art methods such as deep neural networks. In this context, we encourage submissions on techniques such as few-shot learning, weak supervision from user interactions, and distant supervision from auxiliary knowledge sources; a toy distant-supervision sketch follows this list.
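To make the distant-supervision theme concrete, here is a minimal Python sketch. The claims, the keyword cues standing in for an auxiliary knowledge source, and the test sentence are all hypothetical illustrations, not a method prescribed by the workshop.

    # Minimal sketch of distant supervision for claim classification.
    # The "knowledge source" (keyword cues) and all text are toy examples.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Unlabeled claims (hypothetical).
    claims = [
        "miracle cure eliminates disease overnight",
        "study finds moderate exercise improves health",
        "secret they do not want you to know about vaccines",
        "researchers report new results in peer-reviewed journal",
    ]

    # Distant supervision: heuristic cues stand in for a real auxiliary
    # knowledge source; they yield noisy (weak) labels, not gold labels.
    suspicious_cues = {"miracle", "secret", "overnight"}

    def weak_label(text):
        return 1 if suspicious_cues & set(text.split()) else 0  # 1 = suspicious

    noisy_labels = [weak_label(c) for c in claims]

    # Train a standard classifier on the noisy labels.
    vec = TfidfVectorizer()
    X = vec.fit_transform(claims)
    clf = LogisticRegression().fit(X, noisy_labels)

    # Score a new, unseen claim.
    print(clf.predict_proba(vec.transform(["miracle weight loss secret"]))[0, 1])

The point of the sketch is the pipeline shape: noisy labels derived from an auxiliary source substitute for manual annotation, and a standard classifier is trained on them.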
Topics of interest include (but are not limited to):
- Truth finding and discovery
- Fact-checking, rumor, and misinformation
- Credibility analysis and spam detection
- Fake reviews and reviewers
- Leveraging knowledge bases for reasoning about, validating, and explaining contentious claims
- Transparency, fairness, bias, privacy and ethics of information systems
- Emerging applications, novel data sources, and case studies
- Explainable and interpretable models
- Robust detection under adversarial and unknown data poisoning and evasion attacks
- Heterogeneous and multi-modal information including relational data, natural language text, search logs, images, video, etc.
- Empirical characterization of false information
- Measuring real world and online impact
- Deception in misinformation and misbehavior
- Reputation manipulation
- Measuring economic, ideological, and other rationales behind creation
- Rationale behind spread and success
- Targets or victims of misbehavior and misinformation
- Effect of echo chambers, personalization, confirmation bias, and other socio-psychological and technological phenomena
- Detection methods using graphs, text, behavior, image, video, and audio analysis
- Adversarial analysis of misbehavior and misinformation
- Prevention and mitigation techniques, tools, and countermeasures
- Theoretical and/or empirical modeling of spread
- Visualizing spread
- Anonymity, security, and privacy aspects of data collection
- Usable security in misbehavior detection
- Ethics, privacy, fairness, and biases in current tools and techniques
- Case studies
Paper Submission
We encourage submissions with both academic and industrial motivations, of the following types:
- Novel research papers in full or short length
- Demo papers
- Survey papers
- Comparison papers of existing methods and tools
- Work-in-progress papers
- Case studies
- Extended abstracts
- Relevant work that has been recently published (authors must select the option to opt out of publication in the workshop's companion proceedings)
- Work that will be presented at the main conference of KDD 2022 (the submission should state that the paper has been accepted at the main conference)
- We explicitly encourage the submission of preliminary work in the form of extended abstracts (2 pages).
All submissions will be peer-reviewed by at least two reviewers, and all accepted manuscripts will be presented (virtually) at the workshop.
Papers must be submitted in PDF according to the new ACM format published in the ACM guidelines, selecting the generic “sigconf” sample, with a font size no smaller than 9pt. Submissions should be 2 to 8 pages long. Submissions need not be anonymized.
Submission Link
Papers should be submitted via this link.
Papers accepted by MIS2-TrueFact will also be invited for publication in the Frontiers in Big Data journal (Data Mining and Management section), with financial support for the cost of journal publication.
Important Dates (Subject to Change)
- Workshop Paper Submission: June 2, 2022
- Workshop Paper Notification: June 20, 2022
- Workshop date: August 15, 2022
Previous Iterations
- TrueFact 2021: TrueFact Workshop: Making a Credible Web for Tomorrow
- MIS2 2021: Second International MIS2 Workshop: Misinformation and Misbehavior Mining on the Web
- MIS2 2020: MIS2: Misinformation and Misbehavior Mining on the Web
- TrueFact 2020: TrueFact Workshop: Making a Credible Web for Tomorrow
Schedule
Keynote Speakers
Prof. Dongwon Lee
Title:
Combating (Neural) False Information
Abstract:
The recent worldwide rise of diverse types of “false information” (including rumor, clickbait, misinformation, and disinformation) has caused significant confusion and disruption in societies. In addition, the recently emerged phenomenon of “deepfakes” (AI-synthesized realistic artifacts) and other neural fakes has exacerbated the problem even further. In this talk, I will first present a review of state-of-the-art computational solutions, from my group as well as other leading groups, that attempt to combat false information and deepfakes using AI frameworks. I will next discuss the important implications and impacts of “neural” false information and deepfakes in diverse applications, and conclude with final thoughts, emphasizing the need for interdisciplinary and transdisciplinary approaches to this very challenging and important problem.
Bio:
Dongwon Lee is a full professor in the College of Information Sciences and Technology (a.k.a. iSchool) at Penn State University, an ACM Distinguished Scientist, and a Fulbright Cyber Security Scholar. Before starting at Penn State, he worked at AT&T Bell Labs and obtained his Ph.D. in Computer Science from UCLA. From 2015 to 2017, he also served as a Program Director at the National Science Foundation (NSF), co-managing cybersecurity education and research programs and contributing to the development of national research priorities. He researches problems in the areas of data science, machine learning, and cybersecurity. Since 2017, in particular, he has led the SysFake project at Penn State, investigating computational and socio-technical solutions to better combat fake news. More details of his research can be found at http://pike.psu.edu/.
Prof. Huan Liu
Title:
Defense against Disinformation on Social Media and Its Challenges
Abstract:
Disinformation has become a global phenomenon, particularly during times of crisis such as the COVID-19 pandemic. It appears in a gamut of forms, from scams and conspiracy theories to political campaigns and rumors, and its wide dissemination can harm both individuals and society. Despite recent progress in detecting fake news, disinformation detection and mitigation remains a daunting task due to its scale, complexity, diversity, and speed, the costs of fact-checking and annotation, and social and psychological factors. In this talk, we look at some lessons learned in exploring strategies for detecting disinformation and fake news, and discuss challenges in disinformation research and the pressing need for interdisciplinary research.
Bio:
Dr. Huan Liu is a professor of Computer Science and Engineering at Arizona State University. His research interests include AI, data mining, feature selection, social computing, and social media mining. He is a co-author of the textbook Social Media Mining: An Introduction (Cambridge University Press). He is the Founding Field Chief Editor of Frontiers in Big Data and Specialty Chief Editor of its Data Mining and Management section, Editor-in-Chief of ACM TIST, and a founding organizer of the International Conference Series on Social Computing, Behavioral-Cultural Modeling, and Prediction. He is a Fellow of ACM, AAAI, AAAS, and IEEE.
Prof. Neil Gong
Title:
Data Poisoning to Web Mining: Attacks and Defenses
Abstract:
Web mining tasks, such as recommender systems, social network analysis, and crowdsourcing, rely heavily on user-provided data from the Internet. Due to the open nature of the Internet, where anyone can contribute data, web mining is fundamentally vulnerable to data poisoning attacks: an attacker can post carefully crafted data on the Internet such that the web mining results are attacker-desired. For instance, an attacker can inject carefully crafted fake rating scores into a recommender system such that an attacker-selected item is recommended to many genuine users. In this talk, we will discuss data poisoning attacks on various web mining tasks and defenses against them.
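As a toy illustration of the attack pattern described above (a hypothetical sketch, not the methods presented in the talk), the following Python snippet poisons a naive average-rating recommender with fake 5-star ratings; all items and numbers are made up.

    # Toy data poisoning attack on a naive average-rating recommender.
    # All items and ratings are hypothetical.
    from statistics import mean

    ratings = {                        # item -> genuine user ratings
        "item_a": [5, 4, 5, 4, 5],     # genuinely popular item
        "item_b": [3, 2, 3, 3],        # mediocre, attacker-selected item
    }

    def recommend(ratings):
        # Recommend the item with the highest mean rating.
        return max(ratings, key=lambda item: mean(ratings[item]))

    print(recommend(ratings))          # -> item_a

    # The attacker injects fake 5-star ratings for the item they want promoted.
    ratings["item_b"] += [5] * 20

    print(recommend(ratings))          # -> item_b, the attacker-desired result

Real recommenders are far more sophisticated, but the failure mode is the same: attacker-controlled inputs shift the mined output toward an attacker-desired result.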
Bio:
Neil Gong is an Assistant Professor in the Department of Electrical and Computer Engineering and Department of Computer Science (secondary appointment) at Duke University. He is interested in cybersecurity and data privacy with a recent focus on the intersections between security, privacy, and machine learning. He received an NSF CAREER Award, ARO Young Investigator Program (YIP) Award, Rising Star Award from the Association of Chinese Scholars in Computing, IBM Faculty Award, Facebook Research Award, and multiple best paper or best paper honorable mention awards. He received a B.E. from the University of Science and Technology of China in 2010 (with the highest honor) and a Ph.D. in Computer Science from the University of California, Berkeley in 2015.
Dr. Tong Zhao
Title:
Graph Anomaly Detection with Scarce Data
Abstract:
Social networks and review platforms indirectly create a market for malicious incentives, enabling malicious users to make large profits via fraudulent behavior, and many graph-based anomaly detection techniques have been proposed to identify suspicious accounts or behaviors. Nonetheless, because anomalies are by nature a minority, we often lack available data such as labels and training examples. In this talk, I will first go through background on graph-based anomaly detection and then introduce two recent works on graph anomaly detection under circumstances where training labels or data are limited.
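For orientation, here is a deliberately simple unsupervised baseline in Python that needs no labels at all, flagging accounts whose degree is a statistical outlier. The graph, the "bot" node, and the z-score threshold are hypothetical, and this is not one of the methods presented in the talk.

    # Label-free baseline for graph anomaly detection:
    # flag accounts whose degree is a statistical outlier.
    import networkx as nx
    from statistics import mean, stdev

    G = nx.Graph()
    # A small ring of ordinary accounts, each linked to two neighbors...
    G.add_edges_from((i, (i + 1) % 10) for i in range(10))
    # ...plus one hypothetical account that connects to everyone (e.g., a spam bot).
    G.add_edges_from(("bot", i) for i in range(10))

    degrees = dict(G.degree())
    mu, sigma = mean(degrees.values()), stdev(degrees.values())

    # Anomaly score: z-score of each node's degree; flag scores above 2.
    for node, deg in degrees.items():
        z = (deg - mu) / sigma
        if z > 2:
            print(f"flagged: {node} (degree={deg}, z-score={z:.1f})")

Such heuristics only catch the crudest anomalies; the talk's premise is precisely that stronger detectors usually need labels or training examples, which are scarce.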
Bio:
Tong Zhao is a Research Scientist in the Computational Social Science group at Snap Research. His research focuses on graph machine learning, representation learning, anomaly detection, and user modeling. Prior to joining Snap full time, he worked as a research intern at Snap Research and Amazon Search, and received fellowships from both companies. He earned a PhD in Computer Science and Engineering from the University of Notre Dame in 2022.
Prof. Siwei Lyu
Title:
Combatting DeepFakes
Abstract:
AI techniques, especially deep neural networks (DNNs), have significantly improved the realism of falsified multimedia, with a severely disconcerting impact on society. In particular, AI-based face forgery, known as DeepFake, is one of the most recent AI techniques attracting increasing attention due to its ease of use and powerful performance. To counter the negative impact of DeepFakes, defense strategies have been developed rapidly, including detection (i.e., distinguishing forged content) and obstruction (i.e., preventing the synthesis of forged content). In this tutorial, we plan to provide a review of the fundamentals of DeepFake creation and recent advances in detection and obstruction methods.
Bio:
Siwei Lyu is a SUNY Empire Innovation Professor in the Department of Computer Science and Engineering, the Director of the UB Media Forensic Lab (UB MDFL), and the founding Co-Director of the Center for Information Integrity (CII) at the University at Buffalo, State University of New York. Dr. Lyu's research interests include digital media forensics, computer vision, and machine learning. He has published over 170 refereed journal and conference papers, and his research projects are funded by NSF, DARPA, NIJ, UTRC, and the Department of Homeland Security. He is the recipient of the IEEE Signal Processing Society Best Paper Award (2011), the National Science Foundation CAREER Award (2010), SUNY Albany's Presidential Award for Excellence in Research and Creative Activities (2017), the SUNY Chancellor's Award for Excellence in Research and Creative Activities (2018), the Google Faculty Research Award (2019), and the IEEE Region 1 Technological Innovation (Academic) Award (2021). He served on the IEEE Signal Processing Society's Information Forensics and Security Technical Committee (2016-2021) and was on the Editorial Board of IEEE Transactions on Information Forensics and Security (2016-2021). Dr. Lyu is a Fellow of IEEE and IAPR.
Dr. James Verbus
Title:
Using Deep Learning to Detect Abusive Sequences of Member Activity
Abstract:
The Anti-Abuse AI Team at LinkedIn creates, deploys, and maintains models that detect and prevent many types of abuse, including the creation of fake accounts, member profile scraping, automated spam, and account takeovers. Bad actors use automation to scale their attempted abuse. There are many unique challenges associated with using machine learning to stop abuse on a large professional network, including maximizing signal, keeping up with adversarial attackers, and covering many heterogeneous attack surfaces. In addition, traditional machine learning models require hand-engineered features that are often specific to a particular type of abuse and attack surface. To address these challenges, we have productionized a deep learning model that operates directly on raw sequences of member activity, allowing us to scalably leverage more of the available signal hidden in the data and stop adversarial attacks more effectively. Our first production use case of this model was the detection of logged-in accounts scraping member profile data. We will present results demonstrating the promise of this modeling approach and discuss how it helps to solve many of the unique challenges in the anti-abuse domain.
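To make the "raw sequences of member activity" idea concrete, here is a minimal PyTorch sketch of a sequence classifier over activity-event IDs. Every name, dimension, and the random batch below are illustrative assumptions; this is a generic sketch, not LinkedIn's production model.

    # Minimal sketch: classify accounts from raw sequences of activity events.
    # Event vocabulary, dimensions, and data are hypothetical.
    import torch
    import torch.nn as nn

    NUM_EVENT_TYPES = 50  # assumed size of the activity-event vocabulary

    class ActivitySequenceModel(nn.Module):
        def __init__(self, num_events=NUM_EVENT_TYPES, embed_dim=16, hidden=32):
            super().__init__()
            self.embed = nn.Embedding(num_events, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)  # one logit: abusive or not

        def forward(self, event_ids):              # (batch, seq_len) of event IDs
            x = self.embed(event_ids)              # (batch, seq_len, embed_dim)
            _, (h_n, _) = self.lstm(x)             # final hidden state summarizes the sequence
            return self.head(h_n[-1]).squeeze(-1)  # (batch,) abuse logits

    model = ActivitySequenceModel()
    batch = torch.randint(0, NUM_EVENT_TYPES, (4, 20))  # 4 accounts, 20 events each
    print(torch.sigmoid(model(batch)))                  # per-account abuse scores

The design choice the abstract highlights is visible here: the model consumes event IDs directly, so no per-abuse-type hand-engineered features are needed; the embedding and recurrent layers learn the representation.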
Bio:
James Verbus is the tech lead of the Anti-Abuse AI Team at LinkedIn. His current focus includes the development of advanced, scalable modeling techniques and improving AI developer productivity. Before he began using AI to prevent abuse at LinkedIn, he spent his days looking for a different type of rare event while building and operating the world’s most sensitive dark matter detector one mile underground in an abandoned gold mine. James received his Ph.D. in experimental particle astrophysics from Brown University in 2016.
Organizers