Neural Networks in PHP
Introduction
Neural networks are a relatively new technology that aims to reverse engineer the functionality of the brain within a mathematical model. This may sound daunting and complex, but the underlying concepts are very simple, and Neural Mesh does the hard work for you.
In a Web environment, NNs (neural networks) are often considered too slow and complex to warrant the effort on what might even be a trivial task. To solve this, Neural Mesh has been heavily optimized: it uses caching to speed up the running and training of networks.
The API and framework have been kept simple so they are easy to use in PHP projects. Unlike other neural network frameworks, Neural Mesh is written purely in PHP and uses a MySQL database, which makes it easy to install on most hosting providers. It requires PHP 5 or later and MySQL 5 or later, with the following extensions installed: SimpleXML, MySQLi and GZip.
The concepts
Because neural networks utilize fuzzy logic, the standard system architecture is slightly different. With programs and systems we create, we are used to entering some input data, processing that data with conditions and logic and then outputting some data.
Neural networks work in much the same way, except that the processing is done behind the scenes. We do not really care how the network gets to an output; we just want to make sure it does.
The magic behind neural networks is their ability to learn. Because the second tier of the system, the processing layer, is essentially out of our control, we need another way to make it return correct results. This is done by training the network: it has to learn from past experience to be able to produce reliable results.
The main types of training are supervised and unsupervised training. Supervised training is when we know exactly what output we want for certain inputs and perform the training iterations ourselves, manually.
Unsupervised training is when the training set is built automatically from past experience. For example, if the network is learning to play chess and is beaten, we want it to learn from its mistake, so we train it with the preferred response or output.
Figure 1
Neural networks take in binary digits and output fits (fuzzy bits), which are numbers between 0 and 1 that never reach either extreme (e.g. 0.4323, 0.9, 0.1). For this reason, to make use of the output, we have to round the fits off to form bits (binary digits).
A reasonable rule is to treat anything greater than 0.8 as 1 and anything lower than 0.2 as 0. Anything in between means the network is not smart enough yet and requires more training.
Figure 1 above shows the sigmoid function, the function used to return a fit. Notice that the curve never reaches 0 or 1, which is why we must round the output.
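The rounding rule above can be sketched in a few lines of PHP. This is only an illustration: the function names sigmoid and fitToBit are made up for this example and are not part of Neural Mesh, and the sketch assumes PHP 7.1 or later for the type declarations.

```php
<?php
// Logistic sigmoid: maps any real number to a value between 0 and 1.
function sigmoid(float $x): float
{
    return 1.0 / (1.0 + exp(-$x));
}

// Round a fit to a bit using the 0.2 / 0.8 thresholds from the text.
// Returns null when the fit falls in the undecided middle band.
function fitToBit(float $fit, float $low = 0.2, float $high = 0.8): ?int
{
    if ($fit > $high) {
        return 1;
    }
    if ($fit < $low) {
        return 0;
    }
    return null; // not confident enough: the network needs more training
}

foreach ([0.9, 0.1, 0.4323] as $fit) {
    var_export(fitToBit($fit));
    echo PHP_EOL;
}
```

The three sample fits print 1, 0 and NULL. Returning null for the middle band keeps "undecided" separate from a real 0, so the calling code can decide to train further.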
Figure 2
Figure 2 is a graphical representation of the structure of a neural network.
Parts of a Neural Network
A neural network consists of three or more layers, each with some neurons, and a synapse linking each neuron in a layer to each neuron in the next layer. A neuron is basically a black box that takes some input and returns an output based on a formula. A synapse is the connection from neuron to neuron across layers. Each synapse has a weight value (a number) which affects the neuron's output.
The first layer of the network is the input layer. This is where we enter the data for which we want results. The number of neurons in this layer depends on the data we are entering and how long the bit string (string of binary digits) will be. This layer is abstract: it acts as a virtual layer that passes data on to the succeeding layers.
The information from the input layer is passed on to the hidden layer. Using the synapse weights and a mathematical function, each hidden neuron computes a value and outputs it to the next layer (usually the output layer).
Multiple hidden layers are possible, but you will rarely need more than one. The output layer, like the input layer, needs a number of neurons that depends on the type of data you want returned.
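To make the layer description concrete, here is a minimal sketch of how values could flow from the input layer through a hidden layer to the output layer. It is not Neural Mesh's internal code: the function names, the array layout and the sample weights are all assumptions chosen for illustration, and the sigmoid is used as the neuron formula because it is the function mentioned earlier.

```php
<?php
function sigmoid(float $x): float
{
    return 1.0 / (1.0 + exp(-$x));
}

// Compute the outputs of one layer.
// $weights[$j][$i] is the synapse weight from input $i to neuron $j;
// $biases[$j] is an extra constant input for neuron $j.
function layerOutputs(array $inputs, array $weights, array $biases): array
{
    $outputs = [];
    foreach ($weights as $j => $neuronWeights) {
        $sum = $biases[$j];
        foreach ($neuronWeights as $i => $weight) {
            $sum += $weight * $inputs[$i];
        }
        $outputs[$j] = sigmoid($sum);
    }
    return $outputs;
}

// Feed the inputs through every layer in turn (hidden layers, then output).
function runNetwork(array $inputs, array $layers): array
{
    foreach ($layers as $layer) {
        $inputs = layerOutputs($inputs, $layer['weights'], $layer['biases']);
    }
    return $inputs;
}

// A network with 2 inputs, 3 hidden neurons and 1 output neuron,
// using arbitrary, untrained weights.
$layers = [
    ['weights' => [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]], 'biases' => [0.0, 0.0, 0.0]],
    ['weights' => [[0.7, -0.2, 0.4]], 'biases' => [0.0]],
];
print_r(runNetwork([1, 0], $layers));
```

The input layer does no work of its own here, which matches its description as a virtual layer: the input array is simply handed to the first real layer.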
Designing a Network
Designing the neural network is one of the most important aspects of using this technology. In particular, make sure you have a way to convert your input data to binary and to convert the resulting binary output to the format you need.
A popular example for explaining neural networks is the XOR logic gate, which returns 1 when there is an odd number of 1s in the input. Here is the truth table for a 2-input XOR logic gate:
00 => 0
01 => 1
10 => 1
11 => 0
So if we wanted a neural network to perform the XOR logic gate, we would need 2 neurons for the input layer, around 3 for the hidden layer (there is no clear-cut way of determining the neuron count for the hidden layer, so play around and see which gives the best results) and 1 for the output layer.
Another good use for a neural network is the AI in a Connect Four game. We can use a neural network to decide, based on the status of the board, which column to put a token down.
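As a small illustration, the truth table can be generated in PHP and stored as input and desired-output pairs, which is the shape of data a training set needs. The names xorGate and xorTrainingSet and the array keys are hypothetical, picked only for this sketch.

```php
<?php
// XOR over any number of bits: 1 when the count of 1s is odd, else 0.
function xorGate(array $bits): int
{
    return array_sum($bits) % 2;
}

// Build the training set for a 2-input gate: each entry pairs an input
// pattern with the output we want the network to learn.
function xorTrainingSet(): array
{
    $set = [];
    foreach ([[0, 0], [0, 1], [1, 0], [1, 1]] as $pattern) {
        $set[] = ['inputs' => $pattern, 'desired' => [xorGate($pattern)]];
    }
    return $set;
}

// Print the truth table in the same "inputs => output" form as above.
foreach (xorTrainingSet() as $row) {
    echo implode('', $row['inputs']), ' => ', $row['desired'][0], PHP_EOL;
}
```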
The rules of Connect Four are simple. Players take turns in placing tokens down a column of a grid and try to form a line of their tokens either diagonally, horizontally or vertically. The first player to connect four tokens wins.
First we need a system for recognising tokens. We need to represent an empty slot, a slot filled by player one and a slot filled by player two, which means we need two bits per slot: 00 indicates an empty slot, 01 a slot filled by player one and 10 a slot filled by player two.
A Connect Four board is 7 columns wide and 6 rows deep, so we need 7 * 6 * 2 = 84 input neurons to represent the status of the board. We only need 7 output neurons, one per column, and we choose the column whose output neuron is closest to 1. For now we will use 40 hidden layer neurons because we want reasonable speed, but we can increase this if the network is not learning.
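The board encoding described above might look like this in PHP. Treat it as a sketch: the function name encodeBoard, the use of 0, 1 and 2 for the slot states, and the order in which slots are read (row by row, left to right) are assumptions made for the example.

```php
<?php
// Turn a board into the input string: two binary digits per slot.
// $board[$row][$column] holds 0 (empty), 1 (player one) or 2 (player two).
// Reading the slots row by row, left to right, is an assumption.
function encodeBoard(array $board): string
{
    $codes = [0 => '00', 1 => '01', 2 => '10'];
    $bits = '';
    foreach ($board as $row) {
        foreach ($row as $slot) {
            $bits .= $codes[$slot];
        }
    }
    return $bits;
}

// Empty 6-row by 7-column board with one player-one token in the bottom row.
$board = array_fill(0, 6, array_fill(0, 7, 0));
$board[5][3] = 1;

$inputs = encodeBoard($board);
echo strlen($inputs), PHP_EOL; // 84, one digit per input neuron
echo $inputs, PHP_EOL;
```

Whatever slot order is chosen, it has to be the same for training and running, because each position in the string feeds a fixed input neuron.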
The rest of the application can be built in another environment; for my example I used Adobe Flash. I also logged the moves that the player and the computer made, along with the status of the board at the time, and then trained the network with the winner's moves. So the more you play it, the smarter it gets.
You can try it on the Neural Mesh project examples page.
Hidden Neurons
As mentioned above, there is no exact method for deciding on the number of hidden neurons. There are rules of thumb and heuristics that make the first decision easier, such as using two thirds as many hidden neurons as inputs, but these are not perfect.
Remember, the more hidden neurons there are, the longer the network takes to run and train, so it is a very bad idea to choose a large number for the sake of it.
Training the Network
Our network is designed, so now we want it to actually produce correct results. To do that, we need to train it. A training set needs to be made using the data from the truth table. Training works by taking a pattern or input and then specifying a desired output (the output we want the network to return given said pattern).
An algorithm then works out which synapses need modifying to produce this output and adjusts their weights by an amount scaled by the learning rate. This needs to be repeated many times to enforce these rules upon the network.
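To show what "modifying a synapse by the learning rate" can mean in practice, here is a deliberately simplified sketch: one sigmoid neuron trained on a single pattern with the delta rule. It is a stand-in for the full algorithm a multi-layer network needs (a single neuron cannot learn XOR), it is not Neural Mesh's own training code, and every name in it is hypothetical.

```php
<?php
function sigmoid(float $x): float
{
    return 1.0 / (1.0 + exp(-$x));
}

// Output of a single neuron: weighted sum of inputs plus bias, squashed.
function neuronOutput(array $weights, float $bias, array $inputs): float
{
    $sum = $bias;
    foreach ($inputs as $i => $input) {
        $sum += $weights[$i] * $input;
    }
    return sigmoid($sum);
}

// One training step: nudge each weight in the direction that reduces the
// error, scaled by the learning rate. Returns the updated [weights, bias].
function trainStep(array $weights, float $bias, array $inputs, float $desired, float $learningRate): array
{
    $output = neuronOutput($weights, $bias, $inputs);
    // Error multiplied by the slope of the sigmoid at the current output.
    $delta = ($desired - $output) * $output * (1.0 - $output);
    foreach ($inputs as $i => $input) {
        $weights[$i] += $learningRate * $delta * $input;
    }
    $bias += $learningRate * $delta;
    return [$weights, $bias];
}

$weights = [0.5, -0.5];
$bias = 0.0;
$pattern = [1, 0]; // inputs
$desired = 1.0;    // output we want for this pattern

// Repeat the same step many times so the rule sinks in.
for ($i = 0; $i < 100; $i++) {
    [$weights, $bias] = trainStep($weights, $bias, $pattern, $desired, 1.0);
}
echo neuronOutput($weights, $bias, $pattern), PHP_EOL;
```

Each step moves the output a little closer to the desired value, which is why training has to be repeated many times before the fit can be rounded reliably.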
How are Neural Networks useful?
Hopefully, the basics of neural networks are making sense to you, but how does this technology help us?
We can use neural networks in many ways, such as character recognition, game artificial intelligence and stock prediction. They are useful when the data and design are too complex to implement by hand, when we want decisions to be based on experience, or when we want the program to learn from mistakes through trial and error.
Using Neural Mesh
With Neural Mesh installed, we can go about creating the Connect Four network we designed. The online documentation covers downloading and installing Neural Mesh in more detail.
After logging into the Neural Mesh administration interface, create a network with 84 input neurons, 56 hidden layer neurons and 7 output layer neurons, and name it Connect Four. The rest of the details can be left at their defaults for now. The fields are defined below.
The learning rate defines how quickly the network learns. There is a trade-off between speed and accuracy when choosing this value. If it is too large, the network will oscillate back and forth without settling on a precise result. If it is too small, the network will take much longer to train. A good medium is 1.
Target MSE (Mean Squared Error) is the error level that we want our network to reach. The MSE is an indication of how inaccurate the network is. The lower this number, the more intelligent the network will be. If this value is reached during supervised training, the training stops. Do not make this number too small, as we can deal with an error of around 0.02 by rounding the values.
The epoch max number is the maximum number of supervised training iterations; you can think of it as a timeout. If training has gone on for this many iterations and the target MSE has not been reached, training stops.
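The way the target MSE and the epoch max work together as stopping conditions can be sketched as follows. The functions meanSquaredError and trainUntil are hypothetical, the exact way Neural Mesh averages its error is not documented here, and the "training" is faked with an error that halves on each pass so the loop can be run on its own.

```php
<?php
// Mean squared error between desired and actual outputs for one pattern.
function meanSquaredError(array $desired, array $actual): float
{
    $sum = 0.0;
    foreach ($desired as $i => $target) {
        $sum += ($target - $actual[$i]) ** 2;
    }
    return $sum / count($desired);
}

// Repeat training until the target MSE is reached or the epoch limit hits.
// $runEpoch is any callable that performs one training pass and returns
// the MSE measured after it.
function trainUntil(callable $runEpoch, float $targetMse, int $epochMax): array
{
    $mse = INF;
    $epochs = 0;
    while ($epochs < $epochMax && $mse > $targetMse) {
        $mse = $runEpoch();
        $epochs++;
    }
    return ['epochs' => $epochs, 'mse' => $mse];
}

// Stand-in for real training: the error halves on every pass.
$error = 0.5;
$fakeEpoch = function () use (&$error): float {
    $error /= 2;
    return $error;
};

print_r(trainUntil($fakeEpoch, 0.02, 1000));
```

With the fake error starting at 0.5, the loop stops after 5 passes, well before the 1000-pass limit, because the error has dropped to 0.015625, which is below the 0.02 target.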
Now it gets even more advanced. Most of these values you will want to leave as default.
The initial bias value should usually be 1.
The momentum rate helps when the network reaches a plateau in its learning. The momentum continues to carry it in the right direction, overcoming the plateau. This value is normally kept between 0.1 and 0.75.
The initial weight range sets the bounds for the random weights assigned before the network is trained. If it is 5, each weight is randomly generated between -5 and 5.
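A rough sketch of these two settings, under the common textbook formulation rather than anything confirmed about Neural Mesh's internals: weights start as random numbers within the range, and each weight change adds a fraction (the momentum rate) of the previous change. The function names and the sample step sizes are made up for this example.

```php
<?php
// Random starting weight between -$range and $range.
function randomWeight(float $range): float
{
    return (mt_rand() / mt_getrandmax() * 2.0 - 1.0) * $range;
}

// Weight change with momentum: the usual learning-rate step plus a
// fraction of the previous change, which keeps the weight moving when
// the current step is tiny (a plateau).
function weightChange(float $step, float $previousChange, float $learningRate, float $momentum): float
{
    return $learningRate * $step + $momentum * $previousChange;
}

$weight = randomWeight(5.0);
$previousChange = 0.0;

// Three updates with a shrinking step, as on a plateau.
foreach ([0.1, 0.01, 0.001] as $step) {
    $previousChange = weightChange($step, $previousChange, 1.0, 0.5);
    $weight += $previousChange;
    echo $previousChange, PHP_EOL;
}
```

Even though the raw step shrinks by a factor of ten each time, the printed changes shrink far more slowly, because half of the previous change is carried over.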
With the network created, select it from the control panel and take note of the Authkey. This string is essential for working with our network outside the administration interface, either via the API or the framework.
If we are working outside of PHP, the API must be used to connect with the network. If we are using PHP, we simply include the framework controller.
Using the Web services API
Using the Web services API we can run and train the network in any application by sending simple HTTP POST requests (which every language should be capable of) and using XML as the data format. There are five types of functionality we can utilise via the API: create, run, train, bulk and destroy.
Create - will create an unmanaged network and return the authkey.
Run - simply takes a string of inputs and returns some outputs.
Train - will take a string of inputs (or pattern) and its desired outputs.
Bulk - will do the same as train but can take more patterns (rather than do multiple train requests).
Destroy - will remove a network.
For added security, you must use an authorisation key specific to a user associated with a network. This can be found in the Structure page in the administration interface. To determine which function is executed, use the listed types as the query string (e.g. nm-api.php?run). The XML request should be posted with the variable name request.
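As a hedged sketch of what a run request could look like from PHP, the code below builds the XML and posts it as the request variable. The auth and inputs element names follow the request format shown later in this article; the helper names and the example.com URL are placeholders, and whether the API accepts the XML declaration that SimpleXML adds is an assumption.

```php
<?php
// Build the XML body for a "run" request: the authorisation key plus
// the input string of binary digits.
function buildRunRequest(string $authKey, string $inputs): string
{
    $xml = new SimpleXMLElement('<request/>');
    $xml->addChild('auth', $authKey);
    $xml->addChild('inputs', $inputs);
    return $xml->asXML();
}

// POST the XML as the form variable "request" and return the raw response.
// $apiUrl is hypothetical; point it at your own Neural Mesh installation.
function postRequest(string $apiUrl, string $requestXml)
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => 'Content-Type: application/x-www-form-urlencoded',
            'content' => http_build_query(['request' => $requestXml]),
        ],
    ]);
    return file_get_contents($apiUrl, false, $context); // false on failure
}

$requestXml = buildRunRequest('08b5c0f308f7ba12afcc9e9e0682426b', str_repeat('0', 84));
echo $requestXml;

// Uncomment once the URL points at a real installation:
// $responseXml = postRequest('http://example.com/neuralmesh/nm-api.php?run', $requestXml);
```

The same two steps (build the XML, post it as request) apply in any other language, which is the point of using the API rather than the PHP framework.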
Using the Framework
If you want to use Neural Mesh from PHP, simply include the nm-controller.php script and get an instance of the NeuralMesh class, passing the authkey to select the network.
define("AUTH_KEY", "08b5c0f308f7ba12afcc9e9e0682426b");
require("neuralmesh/nm-controller.php");
$nn = NeuralMesh::getNetwork(AUTH_KEY);
This gives us the same functionality as the API, but directly in PHP.
Using Neural Mesh (continued)
The Connect Four example uses Adobe Flash as the interface, so we need to use the API to connect to the network. The first request runs the network and looks something like this:
<request>
  <auth>Authorisation Key</auth>
  <inputs>Input String</inputs>
</request>
Here, the Authorisation Key is the one we noted from the Neural Mesh administration interface, and the input string is a snapshot of the Connect Four board: a very long string of binary digits (84 digits, since we have 84 input neurons).
00 00 00 00 00 00 00
00 00 00 00 00 00 00
00 00 00 00 00 00 00
00 00 00 00 00 00 00
00 00 00 00 00 00 00
00 00 00 00 00 00 00
00 => Empty
01 => Player 1
10 => Computer
Above is an example of a snapshot. When passed to the network it will look like:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000
This returns an XML response containing the network's outputs, which will look something like the following:
<response>
  <outputs>
    <output>0.994505</output>
    <output>0.001673</output>
    <output>0.000907</output>
    <output>0.004143</output>
    <output>0.012191</output>
    <output>0.000961</output>
    <output>0.004201</output>
  </outputs>
</response>
Based on this result, the first column is the one to put a token down because its output is the closest to 1.
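If the calling application were written in PHP, reading such a response and picking the column might look like the sketch below. The function names parseOutputs and chooseColumn are hypothetical, and the response is the sample shown above pasted in as a string rather than fetched from a server.

```php
<?php
// Extract the output fits from a response document as an array of floats.
function parseOutputs(string $responseXml): array
{
    $xml = simplexml_load_string($responseXml);
    $outputs = [];
    foreach ($xml->outputs->output as $output) {
        $outputs[] = (float) $output;
    }
    return $outputs;
}

// Index (0-based) of the output closest to 1, i.e. the column to play.
function chooseColumn(array $outputs): int
{
    $best = 0;
    foreach ($outputs as $column => $fit) {
        if (abs(1.0 - $fit) < abs(1.0 - $outputs[$best])) {
            $best = $column;
        }
    }
    return $best;
}

$responseXml = '<response><outputs>'
    . '<output>0.994505</output><output>0.001673</output>'
    . '<output>0.000907</output><output>0.004143</output>'
    . '<output>0.012191</output><output>0.000961</output>'
    . '<output>0.004201</output>'
    . '</outputs></response>';

echo chooseColumn(parseOutputs($responseXml)), PHP_EOL; // 0, the first column
```

A real game would also need to skip columns that are already full before accepting the network's choice.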
The network was trained by keeping a log of each board snapshot and the column in which a token was placed (for both the player and the computer) and then, depending on who won, training the network with that data.
Plans for the future
More features are planned for Neural Mesh, including support for Bayesian networks and other types of machine learning; the ability to save networks to a file and import them; more user management support; greater control over algorithms, tweaks and other settings; and, crucially, more optimization.
If you have any questions, feel free to post a comment on this article. If you are interested in contributing, use the e-mail contact link above to send me a message.