Curtis G. Northcutt

cgn |AT| mit |DOT| edu


CEO & Co-Founder of Cleanlab. I enjoy building AI companies to empower people. My scientific contributions focus on theory and algorithms for artificial intelligence including dataset uncertainty estimation and augmenting human capabilities.


Short Biography

Curtis Northcutt is an American computer scientist and entrepreneur focusing on using machine learning and artificial intelligence to empower people. He is the CEO and Co-Founder of Cleanlab, whose AI software is used by 100+ Fortune 500 companies to improve analytics and ML/LLM model performance by automatically improving the accuracy and dollar value of every datapoint.

Curtis completed his PhD at MIT where he invented Cleanlab’s algorithms for automatically finding issues in most datasets. Curtis is the recipient of the MIT Morris Levin Thesis Award, the NSF Fellowship, and the Goldwater Scholarship and has worked at several leading AI research groups, including Google, Oculus, Amazon, Facebook, Microsoft, and NASA.


While these two ideas appear disparate, they are mutually dependent. Humans often have false notions about the world and encounter misinformation, yet we still learn well in noisy environments. Augmenting human learning with machine learning necessitates a deeper understanding of learning in noisy environments. Across healthcare, agriculture, politics, economics, transportation… our future as a species relies on an increasing synergy between machine learning and human learning: it is paramount that we have the tools to deal with real-world uncertainty, while maintaining the foresight to focus our advances in machine intelligence towards social good.

Industry and Institutional Research

I am fortunate to have had the opportunity to work or intern at Google AI, Oculus Research, Facebook AI Research, Amazon AI, Microsoft Research, and NASA, as well as to pursue academic collaborations and visiting research with MIT, Harvard, Vanderbilt, Notre Dame, and the University of Kentucky. Details here.

News Highlights -- updated May 2021

May 2021: Completed my Ph.D. at MIT: “Confident Learning for Machines and Humans”. The questions (and answers) in this thesis develop the field of confident learning for data-centric machine learning with noisy labels + applications for enhancing human capabilities. [ PhD Thesis (PDF) ]
Apr 2021: Released [ paper | code | blog ]
Mar 2021: Published Confident Learning in JAIR (Journal of Artificial Intelligence Research): Confident learning is a subfield for data-centric machine learning with noisy labels, with theory for exactly finding label errors in real-world datasets. [ paper | code | blog ]
June 2020: Founded ChipBrain, an empathy AI company building digital brains with IQ and state-of-the-art EQ.
Dec 2019: Added many updates to my [ Research page ].
Nov 2019: Announcing cleanlab: The official Python framework for machine learning and deep learning with noisy labels in datasets. [ code | docs ]
Jan 2019: Announcing the L7 blog! A place for machine learning and human learning.

See news for more.

The Gift of Education

When you educate a person, you empower them within their community, and when you empower people socially, you give them hope, purpose, opportunity, and most importantly, you give them freedom.

Growing up below the poverty line in rural Kentucky, I experienced a glass ceiling of limited human and monetary resources. The ladder of opportunity often rises from prosperity rather than ability. My ladder was my education. Education led to exposure, then summer programs, then small scholarships, then bigger scholarships, and eventually opportunity. Everyone deserves access to quality educational resources – this underlies my motivation to pursue research that democratizes education.

To this end, I develop robust machine learning algorithms to enable open learning, i.e., to make advanced education more accessible. I work with edX student data to (1) infer user intent across terabytes of noisy, massive interaction datasets and (2) implement prediction, inference, and detection algorithms distributed across 400+ MITx and HarvardX open online courses. For example, I ensure the legitimacy of online course certificates via cheating detection algorithms and, with the help of exceptional colleagues, have demonstrated how machine learning can transform human learning with accurate proficiency estimation and diversification of comment rankings in discussion forums.

Life Mantras

1. Do work that empowers people, especially the underprivileged.
2. The "smarter" person is just the one who studied more than you.
3. Work seriously, but don't take yourself too seriously 😉
4. Spend your time with people you admire.
5. Be curious, not judgemental.
6. Loosen your grip. Nothing good gets away.
7. The worst unsolved problem on Earth is bad parenting.
8. Stop asking why. Start asking why not?
9. Never stop welcoming new experiences.
10. Think about what you want to say. Say it. Stop.
11. To know yourself, embrace failure and seek discomfort.
12. Exploit your strengths. Spend your time on your weaknesses.

When asked if I like rap, I recommend PomDP the PhD rapper.

Long Biography


I completed my Ph.D. in Computer Science at MIT, where I was fortunate to work with Isaac Chuang. Before that, I was awarded the MIT Morris Joseph Levin Masters Thesis Award for my master's thesis work at MIT, the NSF Fellowship, and the MITx Digital Learning Research Fellowship. I also taught as a TA for MIT’s graduate machine learning course (6.867). Before that, I graduated as valedictorian from Vanderbilt University (2009-2013), where I majored in mathematics and computer science and was awarded the Barry M. Goldwater National Scholarship.

My work spans the theory and applications of artificial intelligence, including uncertainty quantification and augmenting human capabilities. I invented confident learning and cleanlab (1.5k+ stars on GitHub), the Python package for machine learning with noisy labels and finding label errors in datasets. Before that, I created the CAMEO cheating detection system used by MITx and HarvardX online course teams to validate certificates. I am grateful to have worked at many of the world’s leading AI research groups, including Google AI, Oculus Research, Facebook AI Research, Amazon AI, Microsoft Research, NASA, MIT, and Harvard.

Working with Richard Newcombe, I created the first augmented reality dataset for multi-person conversational AI, EgoCom. Our associated T-PAMI paper uses the EgoCom dataset to predict turn-taking in conversations.

With friends from Harvard and MIT, I co-founded ChipBrain, an empathy AI company building digital brains. As CTO of ChipBrain, I led our mission to build emotionally intelligent AI that helps anyone build better relationships and connect with their audience more deeply. I envisioned a world where people from different backgrounds can empathize with one another, whether it’s solving an argument with a partner, selling a product to a customer, or asking for time off from your boss. I resigned from ChipBrain in late 2021 to focus full-time on Cleanlab.

In my spare time, I help researchers build affordable state-of-the-art deep learning machines and enjoy competitive mountaineering, hiking, and cycling.