You are misrepresenting a lot of stuff here.
it’s behavior is unpredictable
This depends entirely on the quality of the AI and the task at hand. A well-made AI can be relatively predictable. However, most tasks that AI excels at are tasks that do not themselves have a predictable solution. For instance, handwriting recognition can be solved by a neural network with accuracy that rivals, and on some benchmarks exceeds, a human’s. That task does not have a perfect solution, and there is no single ideal answer for each possible input (one person’s ‘a’ could look exactly the same as another’s ‘o’). The same can be said for almost all games, especially those involving a human player.
and therefore cannot be tested
Unpredictable things can be tested. That’s pretty much what the entire field of statistics and probability is about. Also, testability is a fundamental requirement for any kind of machine learning. It isn’t just good practice; if you can’t test your model, you don’t have a model in the first place. The whole point is to create many candidate models and test them to find the best one.
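To make that concrete, here is a rough sketch of testing something unpredictable: run it many times and look at the statistics. Everything in it is made up for illustration. `play_episode` is a hypothetical stand-in for running one game, and the `skill` numbers are assumed win probabilities rather than measurements of any real model.

```python
import math
import random


def play_episode(skill, rng):
    """Hypothetical stand-in for one game: True if the agent wins.

    A real test would run the actual game with the actual model here.
    """
    return rng.random() < skill


def estimate_win_rate(skill, episodes=2000, seed=0):
    """Estimate the win rate and a rough 95% margin of error."""
    rng = random.Random(seed)  # fixed seed so the test is repeatable
    wins = sum(play_episode(skill, rng) for _ in range(episodes))
    rate = wins / episodes
    # Normal-approximation interval for a proportion.
    margin = 1.96 * math.sqrt(rate * (1 - rate) / episodes)
    return rate, margin


# Assumed candidates; the numbers are placeholders, not real results.
candidates = {"model_a": 0.55, "model_b": 0.70}
results = {name: estimate_win_rate(skill) for name, skill in candidates.items()}

# Keep whichever candidate scored best on the test.
best = max(results, key=lambda name: results[name][0])
```

No single episode is predictable, but the win rate over a few thousand episodes is stable enough to compare candidates and pick one.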
It would cheat and find ways to know things about the game state that it’s not supposed to know
A neural network only knows what you tell it. If you don’t tell it where the player is, it’s not going to magically deduce it from nothing. Also, its output has to be interpreted before it can even be used. The raw output is a vector of numbers. How this is transformed into usable actions is entirely up to the developer. If that transformation allows violating the rules, that’s the developer’s fault, not the network’s. The same can be said of human input; it is the developer’s responsibility to transform that into permissible actions in game.
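As a minimal sketch of that transformation step, assuming a made-up action list and a made-up output vector (neither comes from any particular game or library), the developer can simply ignore every output that corresponds to an action the rules forbid:

```python
# Hypothetical action set; index i of the network output scores ACTIONS[i].
ACTIONS = ["up", "down", "left", "right", "wait"]


def choose_action(raw_output, legal_actions):
    """Return the highest-scoring action among those the rules allow."""
    if len(raw_output) != len(ACTIONS):
        raise ValueError("expected one score per action")
    # Drop every action the game rules do not currently permit.
    allowed = [
        (score, name)
        for score, name in zip(raw_output, ACTIONS)
        if name in legal_actions
    ]
    if not allowed:
        return "wait"  # assumed safe fallback when nothing is legal
    return max(allowed, key=lambda pair: pair[0])[1]


# The network "prefers" moving up, but a wall makes that illegal here.
raw = [0.9, 0.1, 0.3, 0.2, 0.05]
legal = {"down", "left", "wait"}
action = choose_action(raw, legal)
```

Whatever the network outputs, it can only ever produce an action from the legal set, because the game code, not the network, has the final say.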
it would hide in a corner as far away from the player as possible because it’s parameters is to avoid death
That is possible, which is why you should design a performance metric that reflects what you actually want it to do. This is a very common issue and is just part of the process of making an AI. It is not an insurmountable problem.
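Here is a rough sketch of the difference, assuming some hypothetical per-episode stats and arbitrary weights that you would tune for a real game:

```python
def survival_only_score(episode):
    """Naive metric: rewards staying alive and nothing else."""
    return episode["ticks_alive"]


def shaped_score(episode):
    """Metric closer to the real goal: engage the player, don't just hide.

    The weights are assumptions picked for illustration, not tuned values.
    """
    return (
        0.01 * episode["ticks_alive"]
        + 1.0 * episode["damage_dealt"]
        - 0.05 * episode["ticks_idle_far_from_player"]
    )


# Made-up episodes: one agent hides in a corner, the other fights.
hider = {"ticks_alive": 1000, "damage_dealt": 0, "ticks_idle_far_from_player": 900}
fighter = {"ticks_alive": 400, "damage_dealt": 30, "ticks_idle_far_from_player": 20}

naive_prefers_hider = survival_only_score(hider) > survival_only_score(fighter)
shaped_prefers_fighter = shaped_score(fighter) > shaped_score(hider)
```

Under the survival-only metric the corner hider looks like the better agent. Once the metric also counts engagement, the fighter scores higher and training selects for that behavior instead.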
Neural networks have been used to play countless games before. Game playing is probably one of the most studied use cases, simply because it is so easy to set up.
I kind of assumed that it’s some kind of brain-scanning tech that can extract meaning directly from the language processing part of the brain, and it just needs some calibration for each language. If two random ships can synchronize a communication frequency and video format, they can probably also have some standard brain-scan info dump, so the scan could be done by the speaker.