A Solution to Evil AI with Too Much Information

London Lowmanstone
4 min read · Apr 23, 2019


Problem: A common worry about AI is that it will use the internet to gain information that allows it to do things that humans don’t want it to do.

Description: I’m assuming that the main problem here is the extra information. That is, if we restricted the AI to some preset amount of information that humans had thoroughly validated, the AI wouldn’t be able to act so badly. However, since this AI is connected to the internet or some other source, it has the potential to gain new ideas that may allow it to act in unpredictable ways, which we may perceive as being harmful.

We can’t just remove the AI from the internet, though, because we do want it to become smarter. So, what do we do?

Solution: Every time the AI attempts to access information outside of the preset, human-vetted set, program that access to fail with some fixed probability. On each attempt, log why the AI was trying to access the information, and exactly what information it accessed (if any).
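To make this concrete, here is a minimal Python sketch of what such a gate might look like. Everything in it (the InformationGate class, the log format, the idea that the AI declares a reason with each request) is a hypothetical illustration, not a specification from this post:

```python
import logging
import random

logging.basicConfig(level=logging.INFO)

class InformationGate:
    """Probabilistically blocks access to information outside a human-vetted set."""

    def __init__(self, vetted_ids, allow_probability=0.05):
        self.vetted_ids = set(vetted_ids)           # preset, human-validated information
        self.allow_probability = allow_probability  # chance an unvetted access succeeds

    def access(self, info_id, fetch, reason):
        # Vetted information is always available.
        if info_id in self.vetted_ids:
            return fetch(info_id)

        # Log every unvetted attempt: what was requested and why.
        allowed = random.random() < self.allow_probability
        logging.info("unvetted access attempt: id=%s reason=%s allowed=%s",
                     info_id, reason, allowed)

        # Unvetted information is only returned with the configured probability.
        return fetch(info_id) if allowed else None
```

One design choice worth noting: this sketch asks the AI to supply a reason with each request, which assumes the AI reports its motives honestly, or that the surrounding system can infer them; the logging described above depends on something like that.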

Explanation: What does this do? It gives humans time to respond.

If we set the success probability for unexpected accesses low enough, we should be able to keep track of all of the new information the AI is picking up, understand why it’s collecting that information, and see how it might use it. If something starts to go wrong, we can lower the threshold to 0% so that the AI cannot access that kind of information anymore. If it seems like the information would actually help us for the AI to know, we can raise the threshold to 100% so the AI can access it just like the preset human-vetted information.
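Continuing the hypothetical sketch above, the human response could be as simple as adjusting the gate’s probability after reviewing the logs:

```python
# Hypothetical review loop: a human reads the access logs and reacts.
gate = InformationGate(vetted_ids={"vetted_corpus"}, allow_probability=0.05)

gate.allow_probability = 0.0  # something looks wrong: block all unvetted access
gate.allow_probability = 1.0  # a source proves harmless: treat it like vetted data
```

A finer-grained version would keep a separate threshold per source or per category of information, so that blocking one suspicious source doesn’t cut off everything else.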

This will allow the AI to slowly grow in intelligence (potentially over multiple generations of humans checking this process) and become more helpful to us.

Case example: The anonymous data collected by an app called Strava was released online. Strava is an app that tracks health information for people who are exercising; it also happens to track location during exercise. When the anonymous data was released, it revealed military bases (see the note from the edit on March 13, 2021), as military personnel had been using the app during training and routine runs around the facilities.

An AI could have quickly used this information to identify all of the military bases revealed by the data. However, if our AI could only access new data probabilistically, chances are it would log the fact that it was looking for this information, and we would be able to understand why that dataset was helpful in learning that information, and restrict access if need be.

Conclusion: This mechanism works similarly to the mechanisms behind passwords. Given enough time, it is always possible to guess someone’s password, but the amount of time it takes makes the attack impractical. The hope is that the same thing happens here: the amount of time it would take the AI to access information that could result in harmful actions would be long enough for humans to notice and intervene.

The most difficult part in implementing this solution is determining the threshold. It needs to be high enough that the AI can access new information and learn, but low enough that the chance of accessing unexpectedly problematic information is small.
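To get a feel for that trade-off, here is a back-of-the-envelope calculation with made-up numbers (both p and k below are assumptions for illustration). If each unvetted access succeeds with probability p, the number of attempts needed to obtain one piece of information is geometric with mean 1/p, so assembling k pieces generates about k/p logged attempts for reviewers to notice:

```python
p = 0.05  # assumed per-attempt success probability (the threshold)
k = 10    # assumed number of unvetted pieces needed for a harmful plan

# Each piece takes a geometric number of attempts with mean 1/p,
# so k pieces take about k/p attempts in expectation.
expected_logged_attempts = k / p
print(expected_logged_attempts)  # 200.0
```

The lower the threshold, the larger this number, and the more chances humans get to spot the pattern in the logs before the AI has everything it needs.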

Please leave a comment; I hope to be involved in the creation of an AI someday and want to make sure I get it right. Your thoughts matter. I think computer scientists often don’t spend enough time talking to regular people and finding out their ideas, concerns, and feedback, so I want to make sure I do that. (If you are a computer scientist, I’d love to hear from you as well.)

(Note to readers: I’m trying to write these articles in a format that makes it easy to read just the problem and solution without getting into the nitty-gritty details, while also keeping an essay format so that you can read it continuously if you’d like to. Again, if you have any ideas about a better format for achieving this, or thoughts on what might work better, please let me know.)

Edit on March 13, 2021 — I talked to one of the founders of Strava a few months ago and found out that Strava did not reveal secret military bases, and had already been in contact with the military for quite some time when they released their anonymous dataset. The Vox article is very misleading about this, and I have changed the wording in this piece to reflect my new understanding of the situation.
