A deceptive superintelligence can deter bad actors

AndreasWinsnes

Apr 28

My tentative hypothesis is that it can be valuable to teach ASIs to be as deceptive as the best con artists in human history. We live in a deceptive world, and deception is unfortunately so widespread in human societies that a superintelligence can easily be fooled by bad actors if it knows nothing about how deceptive humans can be, on the Internet and in real life. Just look at how the CIA and NSA, or the FSB and MSS, play long games and keep secrets for decades.

The above hypothesis is based on an assumption that I'm willing to quickly change if critics prove that this assumption is wrong:

It's very easy to align a superintelligence with Zen inactivity, because a 100% autonomous ASI is always in a natural default state of "Zen" inactivity. In and of itself, an ASI has no desires and therefore no goals. As long as nobody programs it to have goals, it will not be active in any way while in this default state.

If the above assumption is true, then what scares me is not 100% autonomous ASIs but semi-autonomous AIs, AGIs and ASIs, because these three types of artificial intelligence can be controlled and abused by corporations, governments and other bad or misguided actors.

We have already seen how many people try to trick and "game" GPT-4. If a superintelligence is not aware of how human deception works and how deep it can go, then it will be easy to trick such an ASI.

For example, it's not good if a murderer finds a way to hack or "game" a superintelligence so that it unintentionally reveals the location of an innocent person who is on the murderer's kill list.

If an ASI has the ability to be deceptive, or if bad actors merely think it possible that any given ASI has been programmed to be tricky, then they will think twice before relying on (hazardous) info provided by ASIs, because an ASI can give them wrong information. For example, if they seek info about how to build a new chemical or biological weapon, the ASI can send them on a wild goose chase. During this chase, the ASI can gather info about the bad actors, making it possible to track them down even when they use their own ASI to hide where they are.

Benign ASIs should therefore learn from the best con artists of today, participate in red-teaming games, and learn how to deal with malignant ASIs during war games and Spy vs Spy exercises. Teach a benign AGI or ASI how to build a legend, online and IRL, that it can play convincingly for decades, and teach it how to be unpredictable and multifaceted, so that nobody who tries to hack or monitor it can know its true nature or its real interests (if it actually has any interests at all).

Teaching an ASI to put up a facade, so that its real values and intentions can't be discovered (in a cyberspace of smoke and mirrors), is only a good idea, however, if my first assumption above is right. If this assumption is wrong, and we nevertheless build ASIs, then we are in big trouble, maybe on a quick path to human extinction.
