OpenAI Details Its Approach to Artificial Intelligence System Safety

By Sarnith Varun

April 7, 2023

On 5 April 2023, OpenAI published a blog post addressing AI safety, outlining its approach to ensuring that AI systems are built responsibly and deployed safely to the public.

Recently, several groups have demanded governance over the development of AI and the massive technological changes happening this decade[1].

In fact, an "International Panel on Technological Changes" (IPTC) was proposed to the G20 by Martin Rees, Astronomer Royal of the Royal Household of the United Kingdom; Shivaji Sondhi, Wykeham Professor of Physics at Oxford; and K Vijay Raghavan, former principal scientific adviser to the Government of India.

The proposal expressed experts' concern over the rapid development of AI and its impact on the next generation's livelihoods, hence demanding a governance system for AI and "post-human technology".

More recently, a group of intellectuals including Elon Musk, Steve Wozniak, and Max Tegmark, along with several other signatories, signed an open letter calling for a halt to the training of AI systems more powerful than GPT-4. The letter points to a lack of planning and management in AI advancement and stresses that AI with human-competitive intelligence can pose risks to humanity and society.

"Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable," states the open letter.

Directly or indirectly, OpenAI's post provides answers and insights into the safety measures it takes before making its products public.

OpenAI’s Approach to AI safety

We believe that powerful AI systems should be subject to rigorous safety evaluations. Regulation is needed to ensure that such practices are adopted, and we actively engage with governments on the best form such regulation could take. 

OpenAI

OpenAI begins by reiterating that it is committed to keeping powerful AI broadly beneficial and safe and talks about the positive feedback it receives from users around the globe. 

Rigorous Testing of AI Systems Before Release

OpenAI has stated that it conducts rigorous testing, receives feedback from experts, and works consistently to improve the behavior of the system, learning through human feedback and by building safety and monitoring systems.

To support this claim, OpenAI notes that after completing training of its latest and most powerful model, GPT-4, it spent six months working on its safety and alignment before releasing it to the public.

Testing of AI Systems in the Real World to Improve Safeguards

The prevention of foreseeable risks requires testing in the real world rather than in the laboratory.

Even after extensive research and testing in a laboratory, it is not possible to predict all the ways people will benefit from the system, or abuse it.

So, OpenAI believes firmly in real-world use as a key component in creating and releasing safer AI systems over time. 

OpenAI releases AI systems cautiously and gradually, with strong protections, to a steadily broader audience, making improvements as the team learns more about how people actually use its products.

OpenAI's models are made available through an Application Programming Interface (API). Serving the models through an API enables OpenAI to monitor and address potential misuse, and to develop countermeasures that reflect actual instances of misuse rather than relying solely on theoretical predictions.

What is an API?

An Application Programming Interface (API) is a way for two or more computer programs to communicate with each other. It is a type of software interface that offers a service to other pieces of software.
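To make the idea concrete, here is a minimal, hypothetical sketch of the API concept in Python: one piece of software (the "service") exposes a small documented interface, and another (the "client") interacts with it only through that interface. The `moderate_text` function and its blocked-terms check are illustrative inventions, not OpenAI's actual API; a real web API works the same way in principle, but over HTTP with authentication.

```python
# Hypothetical service side of an API: a documented function that
# other programs can call without knowing how it works internally.
def moderate_text(text: str) -> dict:
    """Classify text as allowed or not (toy example).

    Contract: takes a string, returns a dict with an "allowed" flag
    and any "flagged_terms". The caller relies only on this contract.
    """
    blocked_terms = {"hate", "violence"}  # toy stand-in for a real classifier
    flagged = [t for t in blocked_terms if t in text.lower()]
    return {"allowed": not flagged, "flagged_terms": flagged}


# Client side: uses the service purely through its interface.
result = moderate_text("A friendly greeting")
print(result["allowed"])  # True — nothing was flagged
```

Because every client request passes through the interface, the service operator can log usage and intervene on misuse, which is the monitoring advantage the article describes.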

As AI technology is being used in the real world, developers have come across different scenarios where the technology can be beneficial while also being potentially harmful to people. 

To ensure the safety of individuals, the developers have formulated policies that can identify and prevent such harmful behavior.

However, these policies are designed to be nuanced, meaning that they can differentiate between genuine risks to people and the many positive and beneficial uses of the technology.

This way, the policies can still allow for the useful applications of AI technology while mitigating potential harm.


OpenAI believes that society needs time to adapt to AI, and everyone impacted by this technology should have a say in its development. 

“Crucially, we believe that society must have time to update and adjust to increasingly capable AI, and that everyone who is affected by this technology should have a significant say in how AI develops further. Iterative deployment has helped us bring various stakeholders into the conversation about the adoption of AI technology more effectively than if they hadn’t had firsthand experience with these tools.” states OpenAI.

Protection of Children using AI 

The company places significant importance on safeguarding children using its AI technology. 

OpenAI has established age restrictions for the use of its tools, allowing only users aged 18 or older, or 13 or older with parental approval.

OpenAI prohibits the use of its technology to generate harmful content, including hate speech, harassment, violence, and adult material.

Their latest model, GPT-4, is 82% less likely to generate disallowed content than their previous model, GPT-3.5.

Safety measures have also been implemented to prevent the generation of harmful content involving children: any attempt to upload Child Sexual Abuse Material to their image tools is blocked and reported to the National Center for Missing and Exploited Children.

In addition to their default safety measures, they are collaborating with developers, such as the non-profit Khan Academy, to tailor safety mitigations to their specific use cases. 

Protection of the Privacy of Individuals

OpenAI’s large language models (LLMs), such as ChatGPT, are trained on various types of text data to improve their abilities to assist people.

This includes publicly available content, licensed content, and human-reviewed content.

However, the purpose of using this data is to enhance the models, not to sell services, advertise, or build profiles of individuals.


The training data used by OpenAI may include personal information that is available on the public internet.

OpenAI claims to prioritize removing personal information from the dataset and training models to reject requests for private individuals’ personal information.

They also promise to respond to requests from individuals to delete their personal information from their systems. 

By taking these steps, OpenAI aims to minimize the likelihood that their models generate responses that include the personal information of private individuals.

OpenAI’s main focus is on improving the models’ knowledge and accuracy; ChatGPT’s performance improves as it continues to engage in conversations with users.

In doing so, the models learn about the world without compromising individual privacy.

Increase in Factual Accuracy

ChatGPT works by predicting the next series of words based on patterns it has previously seen in its training data.
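A toy sketch can illustrate what "predicting the next word from seen patterns" means. The bigram model below is a deliberately simplified stand-in: real large language models use neural networks over far longer contexts, but the core idea of choosing a likely continuation given preceding text is the same.

```python
# Toy next-word predictor: count which word follows each word in a
# tiny training corpus, then predict the most frequent follower.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Build bigram counts: followers["the"] == Counter({"cat": 2, "mat": 1})
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent word that followed `word` in training."""
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" — seen after "the" twice, "mat" once
```

Note that the model picks the *statistically likely* next word, not a verified fact, which is exactly why fluency does not guarantee factual accuracy.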


Although this doesn’t always guarantee that the output will be factually accurate, OpenAI and other AI developers are working towards improving the factual accuracy of these models.

GPT-4 is 40% more likely to produce factual content than their previous model, GPT-3.5.

When users sign up to use ChatGPT, they are informed that it may not always be accurate. However, there is still more work to be done to minimize inaccuracies and educate the public about the current limitations of these AI tools.

OpenAI Promises to Enhance Global AI Safety

OpenAI supports a practical approach to solving AI safety concerns, dedicating time and resources to researching effective mitigation and alignment techniques and testing them against real-world abuse.

It also believes that improving AI safety and capabilities should go hand in hand, stating that its best safety work to date has come from working with its most capable models, because they are better at following users’ instructions and easier to steer or “guide”.

OpenAI says it will become increasingly cautious in creating and deploying stronger models and will continue to enhance safety precautions as its AI systems evolve. This could be an indirect response to the open letter signed by Elon Musk, Max Tegmark, and others.

While the post-training testing period of GPT-4 took almost six months, future models could potentially take even longer.

OpenAI seems eager to contribute to effective AI development and deployment at the global level, stating that it will take both technical and institutional innovation.

Addressing safety issues also requires extensive debate, experimentation, and engagement, including on the bounds of AI system behavior. OpenAI promises to continue to foster collaboration and open dialogue among stakeholders to create a safe AI ecosystem.


  1. OpenAI, ‘Our approach to AI safety’, 5 April 2023: “OpenAI is committed to keeping powerful AI safe and broadly beneficial. We know our AI tools provide many benefits to people today. Our users around the world have told us that ChatGPT helps to increase their productivity, enhance their creativity, and offer tailored learning experiences. We also recognize that, like any technology, these tools come with real risks—so we work to ensure safety is built into our system at all levels.”