"Social Network Pattern Analysis via Large Language Models and Diffusio" by Sophie Kadifa

Date of Award

Spring 2025

Degree Type

Campus Access Only Thesis

Degree Name

Bachelor of Science

Department

Mathematics

First Advisor

Dr. Lin Junyuan

Abstract

False or misleading information surrounding COVID-19 is prevalent across social media platforms and poses serious public health risks. Existing misinformation detection models often lack interpretability and overlook the role of social interactions. In this study, we enhance the misinformation detection workflow by integrating Large Language Models (LLMs) to decompose tweet content into factual claims and subjective statements, verify the factual claims against ground truths, and pass only the factual portions to detection models. This approach increases agreement with human raters, raising Cohen’s kappa from 0.167 to 0.432268 and the F1 score from 0.235 to 0.489143. We also construct a large-scale Twitter interaction network from over 7 million COVID-19-related tweets (2020–2021), where nodes represent tweets and edges reflect direct interactions (replies, quotes) or content similarity (based on shared keywords/hashtags weighted via TF-IDF). To model misinformation diffusion, we apply the award-winning Diffusion State Distance (DSD) metric, which better captures proximity in high-degree networks than traditional shortest-path measures. This enables us to analyze the relational structure of misinformation, observe label proximity, and predict misinformation labels for new tweets using k-nearest neighbors. Our framework enhances both the interpretability and accuracy of misinformation detection models and is adaptable to a wide range of topics and social media platforms.
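
As a rough illustration of the DSD and k-nearest-neighbor step described in the abstract, the sketch below computes pairwise DSD on a tiny toy graph and labels one node by majority vote among its nearest labeled neighbors. It assumes the converged closed form from the DSD literature for a connected, undirected graph, DSD(u, v) = ||(e_u - e_v)^T (I - P + W)^{-1}||_1, where P is the random-walk transition matrix and every row of W is the stationary distribution. The graph, the labels, and the function names `dsd_matrix` and `knn_label` are hypothetical and are not taken from the thesis; a dense matrix inverse is also only feasible for small graphs, so a network built from millions of tweets would need a sparse or approximate method instead.

```python
import numpy as np
from collections import Counter


def dsd_matrix(adj):
    """Pairwise Diffusion State Distance for a connected, undirected graph.

    Uses the converged closed form (an assumption of this sketch):
    row u of (I - P + W)^{-1} describes node u, and DSD(u, v) is the
    L1 distance between rows u and v.
    """
    adj = np.asarray(adj, dtype=float)
    n = len(adj)
    deg = adj.sum(axis=1)
    P = adj / deg[:, None]            # random-walk transition matrix
    pi = deg / deg.sum()              # stationary distribution
    W = np.tile(pi, (n, 1))           # every row equals pi
    Z = np.linalg.inv(np.eye(n) - P + W)
    # L1 distance between every pair of rows of Z
    return np.abs(Z[:, None, :] - Z[None, :, :]).sum(axis=2)


def knn_label(dist_row, labels, k):
    """Majority label among the k labeled nodes closest in DSD."""
    nearest = sorted(labels, key=lambda j: dist_row[j])[:k]
    return Counter(labels[j] for j in nearest).most_common(1)[0][0]


# Toy graph (hypothetical): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

D = dsd_matrix(A)

# Hypothetical labels for four tweets; nodes 2 and 3 are unlabeled.
labels = {0: "misinfo", 1: "misinfo", 4: "reliable", 5: "reliable"}
predicted = knn_label(D[2], labels, k=2)
print(predicted)
```

In this toy graph node 2 is far closer in DSD to nodes 0 and 1 (its own triangle) than to nodes 4 and 5 across the bridge, so the vote with k = 2 assigns it the label of its own triangle.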

Available for download on Sunday, May 14, 2028
