Research, Projects & Publications
Exploring the frontiers of artificial intelligence and data ethics involves not only a commitment to specific areas of focus but also an open-minded approach to emerging challenges and opportunities.
Our research addresses issues such as bias and discrimination in generative models, AI's impact on employment, and the proliferation of misinformation. While our mission is defined by these domains, we recognize that the evolving landscape of technology and society demands adaptability. In addition to these dedicated pursuits, we therefore remain open to exploring new areas where AI and data ethics intersect, seeking to provide insights, foster ethical foundations, and contribute to informed decision-making in an ever-changing world.
Current Research Areas
AI and Data Ethics Case Studies
With the rapid rise of AI tools and technologies, it is crucial that current and future policymakers, business leaders, and data/computer science professionals have a strong ethical foundation. Our research team has embarked on a project to aggregate and analyze real-world case studies related to AI ethics, algorithmic bias, privacy, and other emerging issues. By curating a database of case studies and examining trends, common challenges, and best practices, we aim to develop educational materials that will prepare students in technical and non-technical fields to grapple with complex sociotechnical trade-offs.
AI Impact on Jobs
In a rapidly evolving technological landscape, concerns about the impact of AI on jobs have gained prominence. At CAIDE, we go beyond the headlines and anecdotes: through real-life stories and comprehensive surveys, we examine how AI is shaping employment across industries and job sectors, offering a balanced perspective on the challenges, transformations, and new possibilities AI introduces to the workforce. Whether you're a job seeker, a business owner, or a policy influencer, our research equips you with the knowledge to make informed decisions in the face of AI-driven change.
Grants
Craig Newmark Fund for Data Ethics, 2019
Established by a generous donation from Craig Newmark, an internet entrepreneur best known as the founder of Craigslist, who was awarded an honorary degree from USF in 2009. The Center for AI & Data Ethics (formerly CADE) aligns with Craig's priorities of strengthening the foundations of a trustworthy press and expanding access for women in technology.
Projects
Every year, our faculty and graduate students in data science collaborate with organizations worldwide to tackle real-world data science and data engineering challenges.
Candid
Student Team: Zemin Cai, Harrison Jinglun Yu
Faculty Mentor: Shan Wang
Company Liaison: Cathleen Clerkin
Project Outcomes: Candid's Insights department engaged students in impactful research projects in data ethics. These projects included an examination of diversity, equity, and inclusion within nonprofits, an exploration of nonprofits' societal impact, and an investigation into real-time grantmaking data, particularly in relation to issues like racial equity. Students were tasked with identifying factors influencing organizations' willingness to share demographic data and analyzing data to predict nonprofits' societal impact. Additionally, they explored methodologies to provide real-time insights into philanthropic trends while addressing potential biases and confounding factors. These projects harnessed various data science techniques and underscored the importance of ethical considerations in data analysis.
Kidas Inc.
Student Team: Raghavendra Kommavarapu
Faculty Mentor: Mustafa Hajij
Company Liaison: Amit Yungman
Project Outcomes: Students optimized point-of-interest detection algorithms, including hate speech and sexual content detection, using data and metadata. They attempted age detection in audio and text, emotion detection in audio and text, and voice changer detection in audio. Additionally, they worked on displaying data visualizations on personal pages based on user activity and algorithm results using Python.
YLabs (Youth Development Labs)
Student Team: Tejaswi Dasari
Faculty Mentor: Diane Woodbridge
Company Liaison: Robert On
Project Outcomes: In the CyberRwanda project, focused on enhancing the well-being and prospects of urban teenagers through digital education, students used various technologies and techniques to measure project progress and effectiveness. They employed Google Analytics to track engagement metrics and designed KPI dashboards for automatic data generation. However, challenges included manual data tracking, discrepancies between Google Analytics versions, and gaps in tracking product pick-ups. Integrating and utilizing data from different sources for decision-making was identified as a crucial goal.
ACLU
Our Team: Joleena Marshall
Faculty Mentor: Michael Ruddy
Company Liaisons: Linnea Nelson, Tedde Simon, Brandon Greene
Project Outcomes: The team developed a tool with Python to acquire and preprocess publicly available data related to the Oakland Unified School District to investigate whether OUSD's allocation of resources results in inequities between schools. The team also provided an updated data analysis, including data visualizations, of educational outcomes for Indigenous students in a select number of Humboldt County unified school districts.
California Forward
Our Team: Evie Klaassen
Faculty Mentor: Michael Ruddy
Company Liaison: Patrick Atwater
Project Outcomes: The team built a tool with Python to determine where high wage jobs are located in California. This tool serves as an extension to current data tools created and maintained by the organization. The team also developed a pipeline to clean and prepare new public data when it is released, and for the tool’s outputs to be regularly updated given any new data.
ACLU Criminal Justice
Our Team: Qianyun Li
Goal: At the ACLU, the student identified potential discrimination in school suspensions by performing feature importance analysis with machine learning models and statistical tests.
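The statistical testing described above can be illustrated with a small sketch. The data, group labels, and rates below are entirely hypothetical, and the actual ACLU analysis used machine learning feature-importance methods on real district data; this only shows the permutation-test idea of asking whether a gap in suspension rates between two student groups could plausibly arise by chance.

```python
import random

# Hypothetical synthetic data: binary suspension outcomes for two
# student groups (1 = suspended, 0 = not). All numbers are invented.
random.seed(0)
group_a = [1] * 30 + [0] * 170   # 15% suspension rate
group_b = [1] * 12 + [0] * 188   # 6% suspension rate

def rate(xs):
    return sum(xs) / len(xs)

observed = rate(group_a) - rate(group_b)

# Permutation test: repeatedly shuffle the group labels and count how
# often a gap at least this large appears under random assignment.
pooled = group_a + group_b
n_a = len(group_a)
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = rate(pooled[:n_a]) - rate(pooled[n_a:])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / trials
print(f"observed gap = {observed:.3f}, p ≈ {p_value:.4f}")
```

A small p-value indicates the observed disparity is unlikely under random assignment, which is a signal for, not proof of, discrimination; the real analysis must also control for confounding factors.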
ACLU Micromobility
Our Team: Max Shinnerl
Goal: At the ACLU, the student analyzed COVID-19 vaccine equitable distribution data. They developed interactive maps with Leaflet to visualize shortcomings of the distribution algorithm and automated the cleaning of legislative record data. They also developed a pipeline for storing data to enable remote SQL queries using Amazon RDS and S3 from AWS.
Human Rights Data Analysis Group (HRDAG)
Our Team: Bing Wang
Goal: At the Human Rights Data Analysis Group (HRDAG), Bing gleaned critical location-of-death information from unstructured Arabic text fields using Google Translate and Python pandas, adding identifiable records to Syrian conflict data. She wrote R scripts and bash/Make workflows to create blocks of similar records on killings in the Sri Lankan conflict, reducing the size of the search space in the semi-supervised machine learning record linkage (database deduplication) process.
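The blocking step described above can be sketched briefly. The records and blocking key below are hypothetical, and HRDAG's production pipeline uses R scripts and Makefiles rather than this toy Python; the point is only that grouping records by a cheap key means expensive pairwise comparisons happen within blocks, not across the whole database.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; real blocking keys and fields differ.
records = [
    {"id": 1, "name": "Perera",   "year": 2009},
    {"id": 2, "name": "Perara",   "year": 2009},  # likely duplicate of 1
    {"id": 3, "name": "Silva",    "year": 2008},
    {"id": 4, "name": "Silva",    "year": 2009},
    {"id": 5, "name": "Fernando", "year": 2008},
]

def blocking_key(rec):
    # Records sharing the first letter of the name and the year land in
    # the same block; only records within a block are compared.
    return (rec["name"][0], rec["year"])

blocks = defaultdict(list)
for rec in records:
    blocks[blocking_key(rec)].append(rec)

candidate_pairs = [
    (a["id"], b["id"])
    for block in blocks.values()
    for a, b in combinations(block, 2)
]

all_pairs = len(records) * (len(records) - 1) // 2
print(f"{len(candidate_pairs)} candidate pair(s) instead of {all_pairs}")
```

Here blocking cuts ten possible comparisons down to one, at the cost of missing duplicates whose keys disagree; real pipelines mitigate that by combining several blocking passes.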
Publications
Research activities and publications by our faculty, accomplished fellows, and affiliates.
- "Toward Realignment: Big Tech, Organized Labor, and the Politics of the Future of Work" by Nantina Vgontzas, Sage Journals
- “Inside DeepMind's Secret Plot to Break Away From Google”, Business Insider
- “Social Media Content Moderation Is Not Neutral, USF Researcher Says”, SF Public Press
- “How to poison the data that Big Tech uses to surveil you”, Technology Review
- "To Live in Their Utopia: Why Algorithmic Systems Create Absurd Outcomes" by Ali Alkhatib at CHI 2021 - also available as a video summary on YouTube
- "The politicization of face masks in the American public sphere during the COVID-19 pandemic" by Scoville, C., McCumber, A., Amironesei, R., Jeon, J. at the American Sociological Association
- "On the Genealogy of Machine Learning Datasets: A Critical History of ImageNet" by Denton, E., Hanna, A., Amironesei, R., Smart, A., Nicole, H. at Big Data & Society
- "Notes on Problem Formulation" by Amironesei, R., Denton, E., Hanna, A. at IEEE Technology and Society Magazine
- "Algorithmic Conservation in a Changing Climate" by Scoville, C., Chapman, M., Amironesei, R., and Boettiger, C. at Current Opinion in Environmental Sustainability
- "'You Can’t Sit With Us': Exclusionary Pedagogy in AI Ethics Education" by Raji, I.D., Scheuerman, M.K., Amironesei, R. at FAccT
- "Genealogy, Archeology, Hermeneutics: Techniques of Interpretation in Machine Learning Datasets" by Amironesei, R., Denton, E., Hanna, A. at IEEE SSIT
- "Bringing the People Back In: Contesting Benchmark Machine Learning Datasets" by Denton, E., Hanna, A., Amironesei, R., Smart, A., Nicole, H., Scheuerman, M.K. at arXiv
- “The Dark Side of Big Tech’s Funding for AI Research”
- “Google workers reject company's account of AI researcher's exit as anger grows”
- “Is your boss spying on you while you work remotely?”
- “Open Letters by Tech Industry, Google Employees Criticize Google’s Lack of Transparency in AI Research”
- "Bridging Data Science with Ignatian Spirituality"
Make A Gift
Your support plays a pivotal role in shaping the trajectory of ethical discussions and practices within the field, empowering us to lead the way in responsible AI and data ethics.