Programming @programming.dev zolax @programming.dev 5 mo. ago

training a neural network to play a bullet hell game

neural network is trained with deep Q-learning in its own training environment
controls the game with twinject
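
for readers new to deep Q-learning, here is a minimal sketch of its two core pieces: the Bellman target the network is trained toward, and epsilon-greedy action selection. the post doesn't describe the network, reward, or hyperparameters, so `ACTIONS`, `GAMMA`, and both function names are assumptions for illustration only, not the actual training code.

```python
import random

ACTIONS = ["up", "down", "left", "right"]  # hypothetical key set, not from the post
GAMMA = 0.99                               # discount factor (assumed value)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """Bellman target for one transition: r + gamma * max over a' of Q(s', a')."""
    if done:
        # terminal transition: there is no future value to bootstrap from
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

in a full setup the network's output for the chosen action is regressed toward `td_target(...)`, usually on batches sampled from a replay buffer.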

demonstration video of the neural network playing Touhou (Imperishable Night):

it actually makes progress up to the stage boss, which is fairly impressive. it performs okay in its training environment but poorly in an existing bullet hell game, where it makes a lot of mistakes.

let me know your thoughts and any questions you have!

14 comments

So the training environment was not Touhou? So what does the training environment look like? I'd be interested to see that, and how it improved over time.
- yeah, the training environment was a basic bullet hell "game" (really just bullets being fired at the player and in random directions) to teach the neural network basic bullet dodging skills
  
  the white dot with 2 surrounding squares is the player and the red dots are bullets
  
  the data input from the environment is at the top-left and the confidence levels for each key (green = pressed) are at the bottom-left
  
  the scoring system is basically the total of all bullet distances
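
  a minimal sketch of that score for a single frame, assuming 2D `(x, y)` positions and plain Euclidean distance. the function name and the data layout are assumptions, the real environment may compute it differently.

```python
import math


def bullet_distance_score(player, bullets):
    """Sum of Euclidean distances from the player to every bullet on screen."""
    px, py = player
    return sum(math.hypot(bx - px, by - py) for bx, by in bullets)
```

  for example, `bullet_distance_score((0, 0), [(3, 4), (0, 5)])` adds a distance of 5 for each bullet. note that a plain sum also grows with the number of bullets on screen; whether the actual score is normalised is not stated here.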
  
  this was one of the training sessions
  
  the fitness does improve but stops improving pretty quickly
  
  the increase in validation error (while training error decreased) indicates overfitting
  
  it's kinda hard to explain here but basically the neural network performs well on the data it was trained with but doesn't perform well on data it wasn't trained with (which it should also be good at)
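
  one common way to act on that signal is early stopping: keep the weights from the epoch with the lowest validation error and stop once it hasn't improved for a while. this is a generic sketch, not something used in this project as far as the thread says; `early_stop_epoch` and `patience` are made-up names.

```python
def early_stop_epoch(val_errors, patience=3):
    """Return the epoch with the lowest validation error once `patience`
    epochs have passed without a new best, or None if that never happens."""
    best_epoch = 0
    for epoch, err in enumerate(val_errors):
        if err < val_errors[best_epoch]:
            best_epoch = epoch  # new best validation error
        elif epoch - best_epoch >= patience:
            return best_epoch   # validation error has stalled or risen
    return None
```

  with validation errors `[1.0, 0.8, 0.7, 0.75, 0.8, 0.9]` and the default patience, the function points back to epoch 2, the last epoch before validation error started climbing.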
  
  That's an interesting approach. The traditional way would be to go by game score, like the AI Mario projects. But I can see the value in prioritizing bullet avoidance over pure score.
  
  Does your training environment model that shooting at enemies (eventually) makes them stop spitting out bullets? I would also assume that total survival time is part of the score, otherwise the boss would just be a losing game score-wise.
  
  the training environment is pretty basic right now so all bullets shoot from the top of the screen with no enemy to destroy.
  
  additionally, the program I'm using to get player and bullet data (twinject) doesn't support enemy detection, so the neural network wouldn't be able to see enemies in an existing bullet hell game. the character used has a wide bullet spread and homing bullets, so the neural network inadvertently destroys the enemies on screen.
  
  the time spent in each training session is constant rather than dependent on survival time because the scoring system is based on the total bullet distance only.
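
  to make the fixed-length idea concrete, here is a toy episode loop: bullets spawn along the top edge, the session always runs the same number of frames, and a hit never ends it early. every name and number here (`run_fixed_episode`, screen size, spawn rate, bullet speeds, the stationary player) is an assumption for illustration, not the real environment.

```python
import math
import random


def run_fixed_episode(steps=200, width=100.0, height=100.0, spawn_every=5, seed=0):
    """Run exactly `steps` frames and return (frames_run, total_score)."""
    rng = random.Random(seed)
    player = (width / 2, height * 0.9)  # stationary player, for illustration only
    bullets = []                        # each bullet is [x, y, dx, dy]
    total = 0.0
    frames_run = 0
    for step in range(steps):
        if step % spawn_every == 0:
            # spawn on the top edge, moving down with a random sideways drift
            bullets.append([rng.uniform(0, width), 0.0,
                            rng.uniform(-1, 1), rng.uniform(1, 3)])
        for b in bullets:
            b[0] += b[2]
            b[1] += b[3]
        # drop bullets that have left the screen
        bullets = [b for b in bullets if 0 <= b[0] <= width and b[1] <= height]
        # per-frame score: total distance from the player to every bullet
        total += sum(math.hypot(b[0] - player[0], b[1] - player[1]) for b in bullets)
        frames_run += 1                 # no early exit: a hit does not end the session
    return frames_run, total
```

  because the frame count is fixed, two runs are directly comparable by `total_score` alone, which is why survival time doesn't need to be part of the score in this setup.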