MIS2 workshop at KDD 2021

The Second International MIS2 Workshop:
Misinformation and Misbehavior Mining on the Web

Workshop held in conjunction with KDD 2021
August 15, 2021, 1pm–9pm ET (10am–6pm PT)
The Zoom link for the workshop is available here

About

The web is a space for all, allowing participating individuals to read, publish, and share content openly. Despite its groundbreaking benefits in areas such as education and communication, it has also become a breeding ground for misbehavior and misinformation: any individual can reach thousands of people near-instantaneously, claiming whatever they wish while shielded by anonymity. This capacity has fueled rampant growth in misbehavior and misinformation through vectors such as harassment, online scams, propaganda, hate speech, deceptive reviews, and more. These issues have severe social and financial implications.

The study of misinformation and misbehavior mining has become a focal point for researchers across many subfields of the data, computational, and social sciences, including network science, machine learning, cybersecurity, privacy, natural language processing, human-computer interaction, and more. MIS2 @ KDD 2021 provides a venue for researchers working in, or adjacent to, these diverse areas to coalesce around central and timely topics in online misinformation and misbehavior, and to present recent advances in research.

Areas of interest include, but are not limited to:

Paper Submission

We encourage submissions with both academic and industrial motivations, of the following types:

All submissions will be peer-reviewed by at least two reviewers, and all accepted manuscripts will be presented (virtually) at the workshop.

Format

Papers must be submitted as PDFs in the new ACM format described in the ACM guidelines, using the generic “sigconf” template, with a font size no smaller than 9pt. Submissions should be 2 to 8 pages long and need not be anonymized.

Papers should be submitted via the CMT submission portal.

Keynote Speakers

James Caverlee — Professor, Texas A&M University
Emiliano De Cristofaro — Professor, University College London
Jiawei Han — Professor, UIUC
Giovanni Luca Ciampaglia — Assistant Professor, University of South Florida
Monica Lee — Research Manager, Facebook
Vishwakarma Singh — Machine Learning Researcher, Pinterest
Kevin Alejandro Roundy — Senior Technical Director, NortonLifeLock

Important Dates (Subject to Change)

Accepted Papers

Schedule

Keynote Speakers

Prof. Giovanni Luca Ciampaglia

Title: Threats to the information ecosystem
Abstract: Social media, and before them the Web, have deeply transformed the information ecosystem, ushering us into a new era of civic engagement and innovation at a global scale. However, a number of worrisome trends threaten the foundations of these public platforms. Misinformation boosted by automated amplifiers (i.e., social bots) is an obvious example, but we are also witnessing the rise of other negative trends, such as harassment, hate speech, and gender inequity. In this talk I will describe my contributions to understanding what is threatening the integrity of the modern digital information ecosystem and reducing universal access to it. I will first describe one of the first large-scale, systematic studies of gender inequality and objectification on the social gaming platform Twitch. I will then talk about tracking the diffusion of digital misinformation and the manipulation strategies employed by social bots on Twitter. Finally, on the side of possible countermeasures to the spread of false information, I will describe two possible approaches: (i) a novel approach to mining knowledge networks like Wikipedia to perform fact-checking, and (ii) a collaboration with a political scientist to find scalable signals of news quality based on information diversity. This research could find application in domains that are currently struggling to cope with massive volumes of unstructured data that need to be checked, such as newsrooms and civic society at large. I will conclude by describing an agenda of future research in this exciting new field.
Bio: Giovanni Luca Ciampaglia is an assistant professor in the Department of Computer Science and Engineering at the University of South Florida (USF). He is interested in all problems arising from the interplay between people and computing systems, in particular the integrity of information in cyberspace and the trustworthiness and reliability of social computing systems. At USF, he leads the Computational Sociodynamics Laboratory. Prior to joining USF he was an assistant research scientist at the Indiana University Network Science Institute (IUNI), and before that a postdoctoral fellow at the Center for Complex Networks and Systems Research, an analyst for the Wikimedia Foundation, and a research associate at the Professorship of Computational Social Science at the Swiss Federal Institute of Technology in Zurich. His work has been covered in major news outlets, including the Wall Street Journal, Wired, MIT Technology Review, NPR, and CBS News, to name a few.

Dr. Vishwakarma Singh

Title: Fighting online abuse in the real world at web scale
Abstract: Large web platforms like Pinterest fight bad actors, users who exploit the platform to engage in abusive behaviors, on an ongoing basis to continuously protect both the platform and the experience of hundreds of millions of valued users. Identifying the various forms of online abuse is a hard research problem. Effectively addressing them on a real-world platform is even harder because of the ecosystem's complexity, dynamic nature, system requirements, massive size, and impact. In the context of Pinterest, I will discuss challenges as well as the standard processes, techniques, and systems used to identify and act against harmful content, abusive behavior, and bad actors at scale.
Bio: Vishwakarma Singh is the Machine Learning Technical Lead for Trust and Safety at Pinterest, where he leads strategy, innovation, and solutions for proactively fighting platform abuse at scale using machine learning. He previously worked at Apple as a Principal Machine Learning Scientist. He earned a PhD in Computer Science with a specialization in “Pattern Querying in Heterogeneous Datasets” from the University of California, Santa Barbara. He has published many research papers in peer-reviewed conferences and journals.

Prof. Jiawei Han 

Title: Text Mining for Automatic Identification of Misinformation: Are We Close?
Abstract: In data mining research, we have made good progress on truth discovery, which identifies which pieces of information are likely to be true among a few conflicting pieces of information related to a data item. However, misinformation is more challenging since the opinions involved are abstract, implicit, heterogeneous, and semantically ambiguous, and thus much more difficult to assess effectively. In this talk, I will summarize what we have achieved on truth discovery, sentiment mining, and transforming unstructured text into structures, and discuss the difficulties in identifying misinformation. We hope that further progress on data-driven methods with minimal human supervision may shed light on the effective and automatic identification of misinformation.
Bio: Jiawei Han is the Michael Aiken Chair Professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. He is a Fellow of the ACM and a Fellow of the IEEE. He received the ACM SIGKDD Innovation Award (2004), the IEEE Computer Society Technical Achievement Award (2005), the IEEE Computer Society W. Wallace McDowell Award (2009), and Japan's Funai Achievement Award (2018).

Prof. Emiliano De Cristofaro

Title: iDRAMA History X: A Data-Driven Approach to Probing the Fringe Web
Abstract: The so-called "Information Revolution" has helped advance society in unprecedented ways. Social networks have let us build and foster personal relationships. At the same time, however, the Web has also enabled anti-social and toxic behavior to occur at an unprecedented scale. This has prompted a multitude of challenging research problems around understanding, modeling, and countering safety issues online. First, even discovering appropriate data sources is not a straightforward task. Next, although the Web enables us to collect highly detailed digital information, there are issues of availability and ephemerality: simply put, researchers have no control over what data a third-party platform collects and exposes, and more specifically, no control over how long that data will remain available. Third, the massive scale and the multiple formats in which data are available require creative execution of analysis. Finally, modern socio-technical problems, while related to typical social problems, are fundamentally different, and in addition to posing a research challenge, can also cause disruption in researchers' personal lives. In this talk, I will discuss how my work with the iDRAMA Lab has tackled these challenges. Using concrete examples from our research, I will delve into some of the unique datasets and analyses we have performed, focusing on emerging issues like hate speech, coordinated harassment campaigns, and deplatforming, as well as modeling the influence that Web communities have on the spread of disinformation, weaponized memes, etc. Finally, I will discuss how we can design proactive systems to anticipate and predict online abuse and, if time permits, how the "fringe" information ecosystem exposes researchers to attacks by the very actors they study.
Bio: Emiliano De Cristofaro is Professor of Security and Privacy-Enhancing Technologies at University College London (UCL), where he heads the Information Security Research Group. He is also Faculty Fellow at the Alan Turing Institute, technology advisor to the Information Commissioner's Office, and co-founder of the iDRAMA Lab. Before moving to pre-Brexit London in 2013, he was a research scientist at Xerox PARC. Emiliano received his PhD in Networked Systems from the University of California, Irvine, advised by Gene Tsudik. Overall, Emiliano does research in the broad security, safety, and privacy areas. These days he mostly works on tackling problems at the intersection of machine learning and security/privacy, as well as understanding and countering information weaponization via data-driven analysis. In 2013 and 2014, he co-chaired the Privacy Enhancing Technologies Symposium, and, in 2018, the security and privacy tracks at WWW and ACM CCS. In the same year, he received distinguished paper awards from NDSS and ACM IMC. In his free(?) time, he likes acting as a coffee snob and an anti pineapple-on-pizza radical, remembering (not without melancholy) the times he used to surf San Onofre, cooking (only pasta and pizza obviously), while finding it awkward to write about himself in the third person.

Dr. Monica Lee

Title: Mining Facebook for Adversarial Networks
Abstract: Facebook is the world’s largest social network. Accordingly, network and graph mining methods have proven effective in detecting pockets of hate and misinformation. This talk presents an overview of some of our key network-based abuse detection models, including applications of k-core decomposition, information corridors, and biclique mining, which have helped us detect and deplatform harmful movements around the 2020 US presidential election. Using machine learning, we are able to fundamentally improve the efficiency of our enforcement against dangerous hate and conspiracy groups.
Bio: Monica Lee leads the Core Data Science: Political & Organizational Science team at Facebook, which develops innovative graph and machine learning models to detect and deter social media abuse, including misinformation, hate, and fraud. Since receiving her Ph.D. in Sociology from the University of Chicago, she has played a key role in developing Facebook’s technologies and strategy to protect global elections from foreign and domestic interference. In addition to hunting bad guys on the internet, she enjoys publishing works on quantitative methods for measuring culture, text mining, musical taste, graph mining, challenges & limitations of big data, and ethics & morality.

Prof. James Caverlee

Title: Rabbit Holes, Fake Reviews, and Misinformation
Abstract: Our daily experiences are mediated by powerful platforms that shape our information diet and impact decisions of who to befriend and what businesses to patronize. But can these platforms be trusted? In this talk, I first highlight our work on both crowd and AI-powered attacks designed to manipulate opinions and perceived support of key issues on these platforms. Then I explore how the design of these platforms can lead to rabbit holes, resulting in undesirable (and often unforeseen) outcomes. I conclude with some thoughts on future directions.
Bio: James Caverlee is Professor and Lynn '84 and Bill Crane '83 Faculty Fellow in the Department of Computer Science and Engineering at Texas A&M University. His research spans recommender systems, social media, information retrieval, data mining, and emerging networked information systems. His group has been supported by NSF, DARPA, AFOSR, Amazon, and Google, among others. Caverlee serves as an associate editor for IEEE Transactions on Knowledge and Data Engineering (TKDE) and has been a senior program committee member of venues like KDD, SIGIR, SDM, WSDM, CIKM, and ICWSM. He was general co-chair of the 13th ACM International Conference on Web Search and Data Mining (WSDM 2020).

Dr. Kevin Roundy

Title: Detecting Bots, Impersonators, and Hate on Social Media
Abstract: Social media contains an abundance of content that is misleading or toxic. In this talk, we describe tools built by researchers at NortonLifeLock Research Group to help detect, measure, and mitigate three categories of social-media threats. We present BotSight, a tool that detects and labels automatically generated content on Twitter, much of which serves to amplify misinformation.  Our tool Doppelgänger detects another source of misinformation, impersonator social media accounts. We developed Detox to detect hate speech and abusive language on social media, including on online gaming platforms, where minority populations are routinely subjected to a fire-hose of abuse.
Bio: Kevin Alejandro Roundy is a Senior Technical Director at NortonLifeLock Research Group (NRG), where he has been since 2012, when the team was known as Symantec Research Labs. Kevin received his Ph.D. from the University of Wisconsin in 2012, with a focus on developing tools for obfuscated malware analysis. He and his colleagues have developed several reputation-based algorithms that block millions of malware files and apps each day on behalf of NortonLifeLock’s nearly 50 million customers. Kevin's recent research has discovered and detected a wide variety of abuse-enabling apps for mobile devices, particularly stalkerware apps that enable interpersonal spying. He also authors publications on the human side of security and privacy, which highlight many instances in which existing security solutions align poorly with customer needs. Currently, he and the NRG team are investigating fraud, scams, toxic speech, and misinformation on social media.

Organizers

Srijan Kumar — Assistant Professor, Georgia Institute of Technology
Aude Hofleitner — Research Scientist and Manager, Facebook
Meng Jiang — Assistant Professor, University of Notre Dame
Neil Shah — Senior Research Scientist, Snap
Kai Shu — Assistant Professor, Illinois Institute of Technology

Contact

Please direct all questions to srijan@gatech.edu

Website created and maintained by Bing He, a CS PhD student at the Georgia Institute of Technology.