Dervis Mansuroglu
For a given state B and all possible moves, find the best move M given by function C.
For a given state B, calculate a score R given by the target function V.
public int getMove(Board b) {
List<Move> moves = generateMoves(b);
int bestMove = 0;
for (Move : moves) {
b.doMove(move);
score = calcScore(b);
if (score > bestScore) bestMove = move;
...
return bestMove;
}
Identify properties that represent state:
public double calcScore(Board board) {
x1 = evaluator.evaluateConnectedPieces(board, Board.player1);
x2 = evaluator.evaluateConnectedPieces(board, Board.player2);
x3 = evaluator.evaluateThreats(board, N_IN_A_ROW, Board.player1);
x4 = evaluator.evaluateThreats(board, N_IN_A_ROW, Board.player2);
return w0 + w1*x1 + w2*x2 + w3*x3 + w4*x4;
}
Weights,
, is the importance of each property
The target function gives you a score for the next state of the board (with a new move applied)
public int getMove(Board b) {
List<Move> moves = generateMoves(b);
for (Move : moves) {
nesteBrett = gjeldendeBrett.doMove(move);
score = calcScore(nesteBrett);
if (score > bestScore) bestMove = move;
...
Described by Tom Mitchel, Machine Learning, 1997.
Algorithm that "adjusts" the weights after each move
By playing several times, E is reduced and the weights are changed so the machine learns better.
E = Deviation between the training set (next state) and current state.
Weight values reside in an extremely large search space
GAs often find good values very fast
My example is inspired by Watchmaker and JGap
Permutation of 15 towns: 1 307 674 368 000 (3-4 dager)
With genetic algorithm, after just 13 seconds:
Generation 1: distance 588, population size 140
Generation 2: distance 519, population size 196
Generation 3: distance 459, population size 275
.......
Generation 11: distance 200, population size 4143
Generation 14: distance 195, population size 11463
Generation 25: distance 168, population size 478947
Create and mutate neural networks with GAs
INFO Generation 99:
INFO 76238 vs Gomoku Random: 184-15-1
INFO 76238 vs Gomoku Better: 158-41-1
INFO Generation 0:
INFO 2549 vs Gomoku Random: 1-89-10
INFO 2549 vs Gomoku Better: 2-90-8
Neuroph Studio
@dervis_m
dervis.no
Thanks!
https://github.com/dervism