This is a series where I’m discussing what I’ve learned in Coursera’s machine learning course, taught by Andrew Ng of Stanford University. Why? See Machine Learning, Nanodegrees, and Bitcoin. I’m definitely not going into depth, just briefly summarizing from a 10,000-foot view.

This is a continuation of week 2.

## Installing Octave

I use Ubuntu 16.04. If you want to install Octave on your OS, I’m sure there are plenty of resources out there telling you how to do that. For me, the install was just

```
sudo apt install octave
```

Granted, I later figured out that version of Octave had a bug with Coursera’s submit function for this course. I ended up having to pull from a different repo in order to get a more recent version of Octave. I can’t remember the exact command I used, but it was similar to the following:

```
sudo apt-add-repository -y ppa:picaso/octave
sudo apt-get update
sudo apt-get install octave
sudo apt-get install liboctave-dev
```

Once Octave is installed, “octave-cli” from the command line will launch Octave within the terminal.

## A Few Basic Octave Commands

This is certainly not exhaustive, but here are a few example commands, where I’m starting from a gnome-terminal in Ubuntu 16.04. You’ll notice by the few errors that are in there that I’m not perfect. 🙂

```
tscott@tscott:~/workspace/mach_learn$ octave-cli
GNU Octave, version 4.2.1
Copyright (C) 2017 John W. Eaton and others.
This is free software; see the source code for copying conditions.
There is ABSOLUTELY NO WARRANTY; not even for MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  For details, type 'warranty'.

Octave was configured for "x86_64-pc-linux-gnu".

Additional information about Octave is available at http://www.octave.org.

Please contribute if you find this software useful.
For more information, visit http://www.octave.org/get-involved.html

Read http://www.octave.org/bugs.html to learn how to submit bug reports.
For information about changes from previous versions, type 'news'.

octave:1> % Define a 2x3 matrix
octave:1> x = [1 2 3; 4 5 6]
x =

   1   2   3
   4   5   6

octave:2> % Transpose matrix
octave:2> x'
ans =

   1   4
   2   5
   3   6

octave:3> % Inverse of matrix
octave:3> inv(x)
error: inverse: A must be a square matrix
octave:3> inv([1 2; 3 4])
ans =

  -2.00000   1.00000
   1.50000  -0.50000

octave:4> % Multiply two matrices
octave:4> x * [1 2; 3 4]
error: operator *: nonconformant arguments (op1 is 2x3, op2 is 2x2)
octave:4> [1 2; 3 4] * x
ans =

    9   12   15
   19   26   33

octave:5>
```

## The Assignment

The assignment starts with loading data from a file and then plotting it. Most of this code has already been done, or the commands needed are specified within the assignment itself.

This kind of annoyed me – it gives us the easy stuff, then leaves the math-intensive part for us to do. I feel like I would have been a lot better prepared for this assignment if we had been using Octave all along during our lectures. Nevertheless…

It then asks us to program a compute cost function (solving J(theta)). After that, it asks us to program the gradient descent function, which uses our compute cost function. In both cases, the expected answer is given so we can validate our program. In addition, we can submit our answers to Coursera to get almost immediate feedback on our progress.
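For reference, here are the two formulas as I understand them for one-variable linear regression. The notation is my own shorthand (m training examples, learning rate α, hypothesis h_θ(x) = θ₀ + θ₁x), so treat the assignment's write-up as the authority:

$$
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
$$

$$
\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
$$

Both updates for θ₀ and θ₁ happen simultaneously, using the old values of theta on the right-hand side.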

Figuring out the math for this took me forever. I tried to put what was in the assignment directly into Octave, but I got confused. How do I represent a summation? I knew I could represent those values as matrices in order to do them all at once, but I couldn’t figure out the right conversion. I tried the sum function, until finally I had a big mathematical insight…

**A summation of element-by-element products is the same as a matrix multiplication!**
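To make that concrete, here is a minimal sketch using two made-up column vectors, `a` and `b` (they are not from the assignment). Adding up the element-by-element products gives the same number as multiplying the transposed first vector by the second:

```octave
% Two made-up 3x1 column vectors (illustration only)
a = [1; 2; 3];
b = [4; 5; 6];

% Summation form: multiply element by element, then add everything up
s_sum = sum(a .* b);

% Matrix form: (1x3) * (3x1) collapses to a single 1x1 value
s_mat = a' * b;

printf("%d %d\n", s_sum, s_mat);   % prints "32 32"
```

Both come out to 1·4 + 2·5 + 3·6 = 32.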

In addition, the second big insight was keeping the dimensions of my matrices in mind. I knew theta was 2×1 to begin with. So if I need to multiply theta by another matrix, the inner dimensions have to match: a 2×1 on the left needs a 1×Something on the right. Seeing as some of the matrices were 97×1, I just needed to transpose them to get them to play nice in the multiplication.

To sum it up, these two insights got me through the lesson (assuming I used vector forms for my problems):

- The summation can be converted to matrix multiplication
- Look at the size of your matrices to determine how to convert them to the proper form to multiply them
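Here is a rough sketch of how those two insights fit together, using a tiny made-up data set rather than the assignment's data. The variable names (`X`, `err`, `grad`, `alpha`) are my own choices for illustration, not necessarily what the assignment's starter code uses, and the size of each product is noted in the comments:

```octave
% Toy data that lies exactly on y = 1 + 2x (made up, not the assignment's data)
x = [1; 2; 3];
y = [3; 5; 7];
m = length(y);                % number of training examples

X = [ones(m, 1), x];          % m x 2: a column of ones for the intercept term
theta = zeros(2, 1);          % 2 x 1

% Cost J = 1/(2m) * sum of squared errors, written as a matrix product
err = X * theta - y;          % (m x 2) * (2 x 1) - (m x 1)  ->  m x 1
J = (err' * err) / (2 * m);   % (1 x m) * (m x 1)            ->  1 x 1

% One gradient descent step with a hypothetical learning rate
alpha = 0.1;
grad = (X' * err) / m;        % (2 x m) * (m x 1)  ->  2 x 1, same shape as theta
theta = theta - alpha * grad;

disp(J);
disp(theta');
```

Repeating the last three lines in a loop is the whole of gradient descent. The transposes are there purely to make the inner dimensions line up.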

I really appreciate the graphs the lesson gives. The cost function is graphed so you can see that a certain value gives you a minimum… thus your hypothesis function works best when you use those values of theta. Graphing the data makes it so much easier to understand.
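To get a feel for what such a graph shows, here is a small sketch that evaluates the cost over a grid of theta values and picks out the lowest point. The data and grid ranges are made up for illustration, and the plotting call is left commented out so the snippet also runs without a display:

```octave
% Same made-up data as a line y = 1 + 2x (not the assignment's data)
x = [1; 2; 3];
y = [3; 5; 7];
m = length(y);
X = [ones(m, 1), x];

% Grid of candidate values for theta(1) and theta(2)
theta0_vals = linspace(-2, 4, 61);
theta1_vals = linspace(-1, 5, 61);
J_vals = zeros(length(theta0_vals), length(theta1_vals));

for i = 1:length(theta0_vals)
  for j = 1:length(theta1_vals)
    t = [theta0_vals(i); theta1_vals(j)];
    err = X * t - y;
    J_vals(i, j) = (err' * err) / (2 * m);
  end
end

% Locate the grid point with the smallest cost
[min_J, idx] = min(J_vals(:));
[i_min, j_min] = ind2sub(size(J_vals), idx);
best = [theta0_vals(i_min), theta1_vals(j_min)];

% surf(theta0_vals, theta1_vals, J_vals');   % uncomment to see the bowl shape
```

For this toy data the lowest grid point lands at roughly theta = [1, 2], which is exactly the line the data was generated from.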

## Conclusion

This assignment was difficult mostly because I struggled to get my data into the right vector form, so I’m hoping future assignments will be a little easier now that I understand how to manipulate the data into the right vector representation. The programming part of this course (the implementation part rather than the theory part), as I expected, was much more exciting. I enjoy theory, but only so far as I can experiment with it in order to learn better. Hopefully the following weeks will continue to have programs I can use to help me understand the material.