import { Component } from '@angular/core'
import { AppModeService } from '../../app-mode.service'

@Component({
    template: `
    <div id="main-content">
    <table class="mission-statement-table">
        <tr>
            <td class="mission-statement-column1">
            </td>  
        </tr>
        <td class="mission-statement-column2">
                <h2><strong>n-Neural Network without Regularization</strong></h2>
                <div class="author-date-item">Jagan Lakshmipathy</div>
                <div class="author-date-item">December 6, 2019</div>
                <p></p>
                <p>This article focuses on a bare-minimum neural network implementation in Python 3. It is inspired by <a href="https://medium.com/binaryandmore/beginners-guide-to-deriving-and-implementing-backpropagation-e3c1a5a1e536">A beginner's guide</a>, a self-contained article that offers an easy-to-follow derivation of backpropagation and a very readable implementation, and is a good starting point.</p>
                <p>This is by no means an optimal implementation, and we will attempt to improve on it in future work. In this article, we modify the binary classifier code from that guide into a multi-class classifier. We use the well-known MNIST dataset to train and test our multi-class classifier. This dataset provides 60K training images and 10K test images.</p>
                <p>In this implementation, we use the cross-entropy cost function C and the sigmoid activation function G, which are defined as follows:</p>
                <pre>
    C = -(y ln(<span style="text-decoration:overline">y</span>) + (1 - y) ln(1 - <span style="text-decoration:overline">y</span>))

    G(P<sup>T</sup>x) = 1 / (1 + e<sup>-P<sup>T</sup>x</sup>)</pre>
                <p>Here, y is the actual output and <span style="text-decoration:overline">y</span> is the predicted output. While deriving the partial derivatives using backpropagation as described <a href="https://medium.com/binaryandmore/beginners-guide-to-deriving-and-implementing-backpropagation-e3c1a5a1e536">here</a>, we assumed the sigmoid function as the activation function for the last layer, because its derivative has a particularly simple form.</p>
                <p>The first change we made was to have the NeuralNetwork class constructor take an additional tuple argument called classification. The first value in the tuple is an array of one-hot encoded class vectors, and the second value is the number of classes.</p>
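                <p>As a minimal sketch of these two definitions (not the implementation from this article), the NumPy snippet below evaluates the sigmoid and the cross-entropy cost, then checks numerically that for a sigmoid output the derivative of C with respect to the pre-activation z reduces to <span style="text-decoration:overline">y</span> - y. The function names sigmoid and cross_entropy and the sample values of z and y are assumptions made only for illustration.</p>
                <pre class="pre-code-area">import numpy as np

def sigmoid(z):
    # G(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(y, y_hat):
    # C = -(y ln(y_hat) + (1 - y) ln(1 - y_hat))
    return -(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

# Hypothetical pre-activation and target, chosen only for illustration
z, y = 0.3, 1.0
y_hat = sigmoid(z)

# Analytic gradient dC/dz for a sigmoid output layer
analytic = y_hat - y

# Central finite-difference estimate of the same gradient
eps = 1e-6
numeric = (cross_entropy(y, sigmoid(z + eps)) - cross_entropy(y, sigmoid(z - eps))) / (2 * eps)

print(analytic, numeric)</pre>
                <p>The two printed values should agree to several decimal places, which is the simplification the sigmoid choice buys in the last layer.</p>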
 

                <pre class="pre-code-area">def __init__(self, architecture, classification):
    #architecture - numpy array with ith element representing the number of neurons in the ith layer.
    #classification - tuple of (one-hot encoded class vectors, number of classes).

    #Initialize the network architecture
    self.L = architecture.size - 1 #L corresponds to the last layer of the network.
    self.n = architecture #n stores the number of neurons in each layer
    #input_size is the number of neurons in the first layer i.e. n[0]
    #output_size is the number of neurons in the last layer i.e. n[L]

    self.n_class = classification[1] #number of classes
    self.classes = classification[0] #one-hot encoded class vectors
    ....</pre>
        <p>So, our invocation of the constructor would look like the following:</p>
        <pre class="pre-code-area">import numpy as np
from sklearn.preprocessing import OneHotEncoder

#scikit-learn 1.2 and later name this argument sparse_output instead of sparse
onehotencoder = OneHotEncoder(categories='auto', sparse=False)
categories = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
classes = onehotencoder.fit_transform(categories.reshape(-1,1))
classification = (classes, 10)
classifier = NeuralNetwork(architecture, classification)</pre>
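        <p>For the sorted digits 0 to 9, the one-hot encoding is simply the 10 by 10 identity matrix, so the same tuple can be sketched with NumPy alone. This is an illustrative alternative, not the code used in this article; the variable name n_class is an assumption.</p>
        <pre class="pre-code-area">import numpy as np

n_class = 10
# Row i is the one-hot vector for digit i (a 1 in column i, 0 elsewhere)
classes = np.eye(n_class)
classification = (classes, n_class)

print(classes[3])</pre>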
                <p>The second change we made was to increase the number of layers and the number of neurons in our network architecture. As each MNIST image has 784 (28x28) components, we set the number of neurons in the initial layer to 784. Thereafter, we decreased the number of neurons by roughly 200 per layer. So our network architecture looked like [784, 600, 400, 200, 1].</p>
                <pre class="pre-code-area">architecture = np.array([784, 600, 400, 200, 1])</pre>
    <p>As our final layer has to be projected onto our classes (10 classes in our example), we have to output a vector of 10 components. Our intermediate layers and their parameters have to be modified accordingly. The dimensions of the partial derivatives and intermediate variables are as follows.</p>
    <pre>dZ at layer 1 is of shape (600, 10)
Variables W and dW at layer 1 are of shape (600, 784)
Variables b and db at layer 1 are of shape (600, 10)
Variable a at layer 1 is of shape (600, 10)
Variable z at layer 1 is of shape (600, 10)
Variable z at last layer is of shape (1, 10)
Variable y is of shape (10,)</pre>
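    <p>The shape of W at layer 1 follows directly from the architecture array: it has one row per neuron in layer 1 and one column per neuron in layer 0. The sketch below lists the weight shapes for every layer, assuming the same (n[l], n[l-1]) convention holds for the deeper layers, which this article states only for layer 1. The names w_shapes and n_weights are illustrative.</p>
    <pre class="pre-code-area">import numpy as np

architecture = np.array([784, 600, 400, 200, 1])

# W at layer l maps n[l-1] inputs to n[l] neurons, so its shape is (n[l], n[l-1])
w_shapes = [(int(architecture[l]), int(architecture[l - 1]))
            for l in range(1, architecture.size)]

for l, shape in enumerate(w_shapes, start=1):
    print("W at layer", l, "has shape", shape)

# Total number of weight entries across all layers
n_weights = sum(rows * cols for rows, cols in w_shapes)
print("total weights:", n_weights)</pre>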
    
    <p>The following code uses the learned model to predict the test data and maps each 10-component output vector to an integer. The argmax function maps the vector to the index of its dominant component. This integer is compared with the actual value to measure the accuracy.</p>
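    <pre class="pre-code-area">y_pred = classifier.predict(x)  #10-component output vector for a test image
yi = np.argmax(y_pred, 1)       #index of the dominant component

#n_c counts the correct predictions
if (yi == y):
    n_c += 1</pre>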
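    <p>The same accuracy measure can be sketched for a whole batch at once. The snippet below is a minimal illustration on made-up scores, not the evaluation code from this article; the arrays y_pred and y_true and their values are hypothetical.</p>
    <pre class="pre-code-area">import numpy as np

# Hypothetical network outputs for four test images, one row of 10 scores each
y_pred = np.zeros((4, 10))
y_pred[0, 7] = 0.9
y_pred[1, 2] = 0.8
y_pred[2, 1] = 0.6
y_pred[3, 4] = 0.7

# Hypothetical true digits; the last prediction above is deliberately wrong
y_true = np.array([7, 2, 1, 9])

# argmax along axis 1 picks the index of the largest score in each row
predicted = np.argmax(y_pred, axis=1)

# Fraction of rows where the predicted digit equals the true digit
accuracy = np.mean(predicted == y_true)
print(predicted, accuracy)</pre>
    <p>With three of the four made-up predictions matching, the computed accuracy is 0.75.</p>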
    <p>This implementation predicted the test dataset with 91% accuracy. Feel free to check out the code <a href="https://github.com/jagan-lakshmipathy/n_nn_classifier">here</a>.</p>
    </table>
    `
})
export class nnWithoutRegComponent {
   
    constructor(private modeService: AppModeService){
        this.modeService.displaySidebar()
    }
}