Disentangling Fiction from Frameworks: A Critical Review of Instrumental Convergence in AI Ethics Risk

Project by Polygence alum Rachel

Project's result

I published it as a research paper in the American Journal of Student Research and presented it at the Symposium of Rising Scholars.

Summary

This paper debunks the hypothetical scenario posed by instrumental convergence, the belief that Artificial Intelligence will develop sub-goals that make it indifferent to its original programming and unstoppable by humans. It does so by arguing that AI lacks intrinsic desires, self-preservation instincts, and agency unless it is explicitly programmed with such tendencies. In other words, because AI lacks the characteristics that would motivate it to act selfishly, the scenario is unrealistic and a product of anthropomorphism, which wrongly attributes to AI the human traits that would make instrumental convergence possible. Such attributions are dangerous because they can lead to mischaracterization of AI, overestimation of its abilities, and a lack of accountability for the people responsible for its bugs and errors. Overall, instrumental convergence is unrealistic and creates additional barriers to aligning AI with human values and to understanding its capabilities.

Aditya

Polygence mentor

PhD candidate

Subjects

Social Science, Computer Science, Quantitative

Expertise

computer science (including web dev and machine learning), economics (esp. game theory and/or behavioral economics), philosophy

Rachel

Student

Hello! My name is Rachel Rishita, and I am very excited to learn more about the negative effects of Artificial Intelligence and their implications.

Graduation Year

2026

About my mentor

“He is very patient and will be able to help a lot in researching.”
