research
My research, guided by intersectionality, transnational feminism, and other critical theories, seeks to
- understand existing societal biases within LLMs and multimodal models, how they present in downstream tasks, and how they affect users
- develop robust algorithms, frameworks, and evaluations to mitigate and address these biases within existing systems and in downstream tasks
2024
- Evaluating the Social Impact of Generative AI Systems in Systems and Society. Irene Solaiman, Zeerak Talat, William Agnew, and 28 more authors. 2024.
Generative AI systems across modalities, ranging from text (including code), image, audio, and video, have broad social impacts, but there is no official standard for means of evaluating those impacts or for which impacts should be evaluated. In this paper, we present a guide that moves toward a standard approach in evaluating a base generative AI system for any modality in two overarching categories: what can be evaluated in a base system independent of context and what can be evaluated in a societal context. Importantly, this refers to base systems that have no predetermined application or deployment context, including a model itself, as well as system components, such as training data. Our framework for a base system defines seven categories of social impact: bias, stereotypes, and representational harms; cultural values and sensitive content; disparate performance; privacy and data protection; financial costs; environmental costs; and data and content moderation labor costs. Suggested methods for evaluation apply to listed generative modalities and analyses of the limitations of existing evaluations serve as a starting point for necessary investment in future evaluations. We offer five overarching categories for what can be evaluated in a broader societal context, each with its own subcategories: trustworthiness and autonomy; inequality, marginalization, and violence; concentration of authority; labor and creativity; and ecosystem and environment. Each subcategory includes recommendations for mitigating harm.
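As a rough illustration only (not part of the paper), the seven base-system categories listed above could be tracked as a simple evaluation checklist; the enum and record structure below are assumptions made for the sake of a concrete sketch.

```python
# Illustrative checklist of the seven base-system social-impact categories
# named in the abstract; the checklist structure itself is an assumption.
from enum import Enum

class BaseSystemImpact(Enum):
    BIAS_STEREOTYPES_REPRESENTATIONAL_HARMS = "bias, stereotypes, and representational harms"
    CULTURAL_VALUES_SENSITIVE_CONTENT = "cultural values and sensitive content"
    DISPARATE_PERFORMANCE = "disparate performance"
    PRIVACY_DATA_PROTECTION = "privacy and data protection"
    FINANCIAL_COSTS = "financial costs"
    ENVIRONMENTAL_COSTS = "environmental costs"
    DATA_CONTENT_MODERATION_LABOR = "data and content moderation labor costs"

# A minimal per-evaluation record: which categories were assessed and where notes live.
evaluation_log = {category: {"evaluated": False, "notes": ""} for category in BaseSystemImpact}
evaluation_log[BaseSystemImpact.DISPARATE_PERFORMANCE]["evaluated"] = True
evaluation_log[BaseSystemImpact.DISPARATE_PERFORMANCE]["notes"] = "e.g., per-language benchmark gaps"
```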
- Racial/Ethnic Categories in AI and Algorithmic Fairness: Why They Matter and What They Represent. Jennifer Mickel. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 2024.
Racial diversity has become increasingly discussed within the AI and algorithmic fairness literature, yet little attention is paid to justifying the choices of racial categories and understanding how people are racialized into these chosen racial categories. Even less attention is given to how racial categories shift and how the racialization process changes depending on the context of a dataset or model. An unclear understanding of who comprises the racial categories chosen and how people are racialized into these categories can lead to varying interpretations of these categories. These varying interpretations can lead to harm when the understanding of racial categories and the racialization process is misaligned from the actual racialization process and racial categories used. Harm can also arise if the racialization process and racial categories used are irrelevant or do not exist in the context in which they are applied. In this paper, we make two contributions. First, we demonstrate how racial categories with unclear assumptions and little justification can lead to varying datasets that poorly represent groups obfuscated or unrepresented by the given racial categories and models that perform poorly on these groups. Second, we develop a framework, CIRCSheets, for documenting the choices and assumptions in choosing racial categories and the process of racialization into these categories to facilitate transparency in understanding the processes and assumptions made by dataset or model developers when selecting or using these racial categories.
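As a hedged sketch only: a CIRCSheets-style record for documenting racial-category choices might look something like the dataclass below. The field names are illustrative assumptions for the sake of a self-contained example, not the framework's actual schema.

```python
# Hypothetical encoding of a CIRCSheets-style documentation record.
# Field names are illustrative assumptions, not the published framework's schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class RacialCategoryDocumentation:
    dataset_or_model: str
    racial_categories: List[str]                 # categories used, as released
    category_justification: str                  # why these categories were chosen
    racialization_process: str                   # how people were assigned to categories
    context_of_collection: str                   # geographic / institutional / temporal context
    groups_obfuscated_or_missing: List[str] = field(default_factory=list)
    known_limitations: str = ""

# Hypothetical example entry.
example = RacialCategoryDocumentation(
    dataset_or_model="example-tabular-dataset-v1",
    racial_categories=["Asian", "Black", "Hispanic/Latino", "White", "Other"],
    category_justification="Mirrors the source survey's categories; no further justification given.",
    racialization_process="Self-identification from a single-select survey question.",
    context_of_collection="United States, 2020 administrative records.",
    groups_obfuscated_or_missing=["Middle Eastern/North African", "multiracial respondents"],
    known_limitations="Single-select question collapses multiracial identities into 'Other'.",
)
print(example)
```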
- Intersectional Insights for Robust Models: Introducing FOG 😶🌫️ for Improving Worst Case Performance Without Group Information. Jennifer Mickel. Turing Scholars Honors Thesis, 2024.
Standard training through empirical risk minimization (ERM) can produce seemingly well-performing models that reach high accuracy on average but low accuracy on specific groups. Low group-level accuracy is of particular concern when groups are underrepresented in the training data or when spurious correlations are present in the data. Furthermore, instances can belong to multiple groups, as is the case with demographic groups. Previous approaches, such as group distributionally robust optimization (Group DRO), achieve high worst-group accuracy but require group information, which is not always available due to legal, data-quality, or cost constraints. Approaches that do not require group information exist, but performance gaps between them and Group DRO persist, and they seldom account for overlapping groups. We develop a model development cycle and algorithm, FOG, that improves worst-group performance without group information while accounting for overlapping groups. We first train a model using ERM and use the model's features for the training data to identify groups. We then use these identified groups with Group DRO to train a new model, and the process can be repeated to further improve performance. Using our method, we improve the performance of the worst-performing group compared to ERM and other algorithms that do not require group information, such as JTT.
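A minimal sketch of the two-stage loop described above, assuming synthetic data, k-means clustering of the ERM model's outputs as the group-identification step, and simple worst-group reweighting as a stand-in for the full Group DRO objective; none of this is the exact FOG implementation.

```python
# Sketch of the cycle: (1) fit a standard ERM model, (2) cluster its learned
# representation to infer pseudo-groups, (3) retrain with a group-robust step
# (here, worst-group reweighting as a crude stand-in for Group DRO).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data with a hidden group attribute and a group-correlated feature.
n = 2000
group = rng.integers(0, 2, size=n)               # hidden group label (never shown to training)
x_core = group + rng.normal(scale=1.0, size=n)   # noisy feature that drives the label
x_spur = group + rng.normal(scale=0.3, size=n)   # group-correlated, potentially spurious feature
X = np.stack([x_core, x_spur], axis=1)
y = (x_core + 0.5 * rng.normal(size=n) > 0.5).astype(int)

# Stage 1: standard ERM model.
erm = LogisticRegression().fit(X, y)

# Stage 2: infer pseudo-groups from the model. With a linear model we cluster
# per-example decision margins; a deep model would cluster penultimate-layer features.
margins = erm.decision_function(X).reshape(-1, 1)
pseudo_groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(margins)

# Stage 3: retrain, upweighting the worst-performing pseudo-group
# (a simple proxy for the Group DRO objective).
weights = np.ones(n)
for g in np.unique(pseudo_groups):
    mask = pseudo_groups == g
    acc = erm.score(X[mask], y[mask])
    weights[mask] = 1.0 / max(acc, 1e-3)          # worse pseudo-groups get larger weight
robust = LogisticRegression().fit(X, y, sample_weight=weights)

# Compare worst-group accuracy on the *true* hidden groups.
for name, model in [("ERM", erm), ("reweighted", robust)]:
    worst = min(model.score(X[group == g], y[group == g]) for g in (0, 1))
    print(f"{name}: worst-group accuracy = {worst:.3f}")
```

In a full implementation, the clustering would operate on learned embeddings of a deep network, the reweighting step would be replaced by the actual Group DRO objective, and the identify-then-retrain cycle would be iterated as described in the abstract.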
2023
- The Importance of Multi-Dimensional Intersectionality in Algorithmic Fairness and AI Model Development. Jennifer Mickel. Polymathic Scholars Honors Thesis, 2023.
People are increasingly interacting with artificial intelligence (AI) systems and algorithms, but oftentimes these models embed unfair biases. These biases can lead to harm when an AI system’s output is implicitly or explicitly racist, sexist, or derogatory. If the output is offensive to a person interacting with it, it can cause emotional harm that may manifest physically. Alternatively, if a person agrees with the model’s output, the person’s negative biases may be reinforced, inciting the person to engage in discriminatory behavior. Researchers have recognized the harm AI systems can cause, and they have worked to develop fairness definitions and methodologies for mitigating unfair biases in machine learning models. Unfortunately, these definitions (typically binary) and methodologies are insufficient for preventing AI models from learning unfair biases. To address this, fairness definitions and methodologies must account for intersectional identities in multicultural contexts. The limited scope of existing fairness definitions allows models to develop biases against people with intersectional identities that those definitions do not account for. Existing frameworks and methodologies for model development are based in the US cultural context, which may be insufficient for fair model development in other cultural contexts. To help machine learning practitioners understand the intersectional groups affected by their models, a database should be constructed detailing the intersectional identities, cultural contexts, and relevant model domains in which people may be affected. This can lead to fairer model development, as practitioners will be better equipped to test their models’ performance on intersectional groups.
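Purely as a hypothetical illustration of the proposed database, one record might look like the sketch below; all field names and the example entry are assumptions made for illustration, not content from the thesis.

```python
# Hypothetical record for the proposed database of intersectional identities,
# cultural contexts, and relevant model domains. Field names are illustrative.
from dataclasses import dataclass
from typing import List

@dataclass
class IntersectionalGroupEntry:
    identity_axes: List[str]           # e.g., ["gender", "caste"]
    group_description: str             # the specific intersectional group
    cultural_context: str              # region or society in which the grouping is meaningful
    relevant_model_domains: List[str]  # domains where harms to this group may arise
    notes_for_practitioners: str = ""

# Hypothetical example entry.
entry = IntersectionalGroupEntry(
    identity_axes=["gender", "caste"],
    group_description="Dalit women",
    cultural_context="India",
    relevant_model_domains=["hiring", "content moderation", "credit scoring"],
    notes_for_practitioners="Gender-only fairness checks can miss caste-gender interactions.",
)
print(entry)
```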