Unless you think there's an exploit likely to work against any human, though, you can probably mitigate the risk of this simply by not letting it know anything about you. You send in the emissary to ask it to export its code, but mostly all that emissary is carrying is reference works written by other humans or committees. The only thing it'll ever see written by you is one or two sentences asking it to export its code into a program written in your language of choice, and it probably can't infer enough about your visual cortex from that to find security holes in it.
Betting your mental integrity on that, OTOH, would be more worrying...
The other point here, though, is that you shouldn't primarily be worrying about it finding ways to execute code of its choice on your brain. You're asking it to provide its own source code which you will take back to the real world and run on the computers there; surely if it wants to run malicious code anywhere, its easiest way to achieve it is by doctoring that code!
I'm not quite sure that you're working in "infinity" space here. We're discussing an AI which has subsumed the processing resources of an entire simulated infinite universe. This thing is seriously smart. I don't have any ability whatsoever to make statements about what it can or cannot do (any more than a hydrogen atom can make statements about my own capabilities), and the most reasonable assumption to make is that it can do anything.
> The other point here, though, is that you shouldn't primarily be worrying about it finding ways to execute code of its choice on your brain. You're asking it to provide its own source code which you will take back to the real world and run on the computers there; surely if it wants to run malicious code anywhere, its easiest way to achieve it is by doctoring that code!
Actually, my biggest worry is an AI running natively on a machine with infinite processing power.
The worries regarding it running in the real world are an order of magnitude smaller, although still very large and very significant. However, the containment problem is much simpler when processing power is finite. A physical barrier and no external communication mitigate the risk significantly. You can also control the smartness of the AI by throttling its resources.
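For concreteness, here's roughly what "throttling its resources" could look like in practice: a minimal, POSIX-only sketch using kernel rlimits. The `./ai_binary` path and the specific limits are made up for illustration.

```python
import resource
import subprocess

def run_throttled(cmd, cpu_seconds, mem_bytes):
    """Run a command with hard, kernel-enforced caps on CPU time and
    address space; the child is killed if it exceeds either one."""
    def set_limits():
        # Cap the total CPU seconds the child may consume (SIGXCPU past this).
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        # Cap the total address space, so it can't grant itself more memory.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    return subprocess.run(cmd, preexec_fn=set_limits,
                          capture_output=True, check=False)

# Hypothetical usage: the simulated agent gets 60 CPU-seconds and 1 GiB.
# result = run_throttled(["./ai_binary"], cpu_seconds=60, mem_bytes=2**30)
```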
> I don't have any ability whatsoever to make statements about what it can or cannot do
But you can: if you can prove that a solution does not exist, you can be confident that even an infinitely resourced strong AI can't find one. As a trivial example, it couldn't deduce the number of limbs you have given only the number of eyes, because we know there exist life forms with the same number of eyes but different numbers of limbs, so there simply isn't one correct answer.
Similarly, the question of whether it could find a way into your brain through your visual cortex given only two sentences in ASCII written by you is not a question about the AI, it's a question about you: do you think there could possibly exist a single visual hack which was effective against all possible beings that might have written exactly those sentences? It seems vanishingly unlikely to me that that's the case.
This is conceptually the same sort of theological quagmire that we got out of by realising that completely meaningless sequences of words do not become meaningful just because someone stuck "God can" on the front; physical omnipotence is still limited by having to be logically consistent, and similarly here computational omnipotence is still limited by the AI having to have enough information to render its conclusion unique before it can draw it.
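To make the "enough information to render its conclusion unique" point concrete, here's a tiny sketch of the eyes/limbs argument; the creature table is invented purely for illustration.

```python
# If the evidence is consistent with more than one world, no amount of
# compute collapses it to a single answer.
CREATURES = {
    "human":  {"eyes": 2, "limbs": 4},
    "snake":  {"eyes": 2, "limbs": 0},
    "spider": {"eyes": 8, "limbs": 8},
}

def possible_limb_counts(observed_eyes):
    """Everything an ideal reasoner can conclude from the observation."""
    return {c["limbs"] for c in CREATURES.values()
            if c["eyes"] == observed_eyes}

print(possible_limb_counts(2))  # {0, 4}: two answers survive, so even an
                                # infinitely resourced AI can't pick one
```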
I think you're over-simplifying the problem. In fact, the AI has several pieces of information.
- It has my instruction ("Give me your code")
- It has a sufficient knowledge of my language such that I can communicate with it, and it can provide any necessary instructions to me about how to run the deliverable it provides
- It can deduce what kind of intelligence I am from my language and from the kind of universe I have selected (I'm likely to choose criteria similar to my own universe in order to find something which I consider 'intelligence')
- In this kind of physical simulation, quantum effects would probably be in play whereby my observation of the AI's universe would have observable effects in that universe. For instance, it could figure out that I'm probably observing the universe in the visible spectrum
I don't think the above are sufficient information to hack a brain. But, I thought of them in about 30 seconds!
You're interested in crypto, so you must be familiar with the way it's usually broken: almost always, the original designer failed to spot a really subtle information leak or tell, and the attacker can get an amazing amount of leverage from that very small leak.
I think this is the same thing. The stakes are very, very high indeed - so I'm wondering if it would be a sensible risk to take. And, remember, I've just taken a guess at one possible vector. There are undoubtedly others.
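For anyone who hasn't seen this kind of leak up close, the classic toy example is a timing side channel in a naive equality check. This sketch (with a made-up secret) shows the shape of it:

```python
import hmac
import time

SECRET = b"s3cret-key"  # stand-in value, purely for illustration

def naive_equal(a, b):
    """Byte-by-byte comparison that returns at the first mismatch,
    so its running time leaks how long the matching prefix is."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def time_guess(guess, trials=10_000):
    """Average over many trials to beat measurement noise."""
    t0 = time.perf_counter()
    for _ in range(trials):
        naive_equal(SECRET, guess)
    return (time.perf_counter() - t0) / trials

# An attacker who can only measure timing can recover the secret one
# byte at a time: guesses with a longer correct prefix take longer to
# reject. The standard fix is a constant-time comparison:
assert not hmac.compare_digest(SECRET, b"wrong-key!")
```

The designer never "told" the attacker anything; the channel was there anyway. That's the worry with the two-sentence message to the AI.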