Curtis G. Northcutt

cgn |AT| mit |DOT| edu


I completed my Ph.D. in Computer Science at MIT in May 2021. I am the Founder & CTO of ChipBrain. My work spans theory and algorithms for artificial intelligence, including dataset uncertainty estimation and augmenting human capabilities.


News Highlights -- updated May 2021

May 2021: Completed my Ph.D. at MIT: “Confident Learning for Machines and Humans”. The questions (and answers) in this thesis develop the field of confident learning for data-centric machine learning with noisy labels + applications for enhancing human capabilities. [ PhD Thesis (PDF) | PhD Defense Slides (PDF) ]
Apr 2021: Released labelerrors.com. [ paper | code | blog ]
Mar 2021: Published Confident Learning in JAIR (Journal of Artificial Intelligence Research): Confident learning is a subfield for data-centric machine learning with noisy labels, with theory for exactly finding label errors in real-world datasets. [ paper | code | blog ]
June 2020: Founded ChipBrain.com, an empathy AI company building digital brains with IQ and state-of-the-art EQ.
Dec 2019: Added many updates to my [ Research page ].
Nov 2019: Announcing cleanlab: The official Python framework for machine learning and deep learning with noisy labels in datasets. [ code | docs ]
Jan 2019: Announcing the L7 blog! A place for machine learning and human learning. [ l7.curtisnorthcutt.com ]

See news for more.

About

I completed my Ph.D. in Computer Science at MIT, where I was fortunate to work with Isaac Chuang. Along the way, I was awarded the MIT Morris Joseph Levin Masters Thesis Award for my master's thesis work, the NSF Fellowship, and the MITx Digital Learning Research Fellowship, and I served as a TA for MIT’s graduate machine learning course (6.867). Before MIT, I graduated as valedictorian from Vanderbilt University (2009-2013), where I majored in mathematics and computer science and was awarded the Barry M. Goldwater National Scholarship.

My work spans the theory and applications of artificial intelligence, including uncertainty quantification and augmenting human capabilities. I invented confident learning and cleanlab (1.5k+ stars on GitHub), the Python package for machine learning with noisy labels and finding label errors in datasets. Before that, I created the CAMEO cheating detection system, which MITx and HarvardX online course teams use to validate certificates. I am grateful to have worked at many of the world’s leading AI research groups, including Google AI, Oculus Research, Facebook AI Research, Amazon AI, Microsoft Research, NASA, MIT, and Harvard.

Working with Richard Newcombe, I created the first augmented reality dataset for multi-person conversational AI, EgoCom. Our associated T-PAMI paper uses the EgoCom dataset to predict turn-taking in conversations.

With friends from Harvard and MIT, I co-founded ChipBrain, an empathy AI company building digital brains. As CTO of ChipBrain, I lead our mission to build emotionally intelligent AI that helps anyone build better relationships and connect with their audience more deeply. We envision a world where people from different backgrounds can empathize with one another, whether that means resolving an argument with a partner, selling a product to a customer, or asking a boss for time off. You can learn more about ChipBrain in this interview.

In my spare time, I help researchers build affordable state-of-the-art deep learning machines and enjoy competitive mountaineering, hiking, and cycling.

Research Manifesto

While these two ideas appear disparate, they are mutually dependent. Humans often have false notions about the world and encounter misinformation, yet we still learn well in noisy environments. Augmenting human learning with machine learning necessitates a deeper understanding of learning in noisy environments. Across healthcare, agriculture, politics, economics, transportation… our future as a species relies on an increasing synergy between machine learning and human learning: it is paramount that we have the tools to deal with real-world uncertainty, while maintaining the foresight to focus our advances in machine intelligence towards social good.

Industry and Institutional Research

I am fortunate to have had the opportunity to work or intern at Google AI, Oculus Research, Facebook AI Research, Amazon AI, Microsoft Research, and NASA, as well as academic collaborations and visiting research with MIT, Harvard, Vanderbilt, Notre Dame, and the University of Kentucky. Details here.

The Gift of Education

When you educate a person, you empower them within their community, and when you empower people socially, you give them hope, purpose, opportunity, and most importantly, you give them freedom.

Growing up below the poverty line in rural Kentucky, I experienced a glass ceiling of limited human and monetary resources. The ladder of opportunity often rises from prosperity rather than ability. My ladder was my education. Education led to exposure, then summer programs, then small scholarships, then bigger scholarships, and eventually opportunity. Everyone deserves access to quality educational resources – this underlies my motivation to pursue research that democratizes education.

To this end, I develop robust machine learning algorithms to enable open learning, i.e., to make advanced education more accessible. I work with edX student data to (1) infer user intent across terabytes of noisy interaction data and (2) implement prediction, inference, and detection algorithms distributed across 400+ MITx and HarvardX open online courses. For example, I ensure the legitimacy of online course certificates via cheating detection algorithms. With the help of exceptional colleagues, I have also demonstrated how machine learning can transform human learning through accurate proficiency estimation and diversified comment rankings in discussion forums.

Life Mantras

When asked if I like rap, I recommend PomDP the PhD rapper.