Linear algebra is a branch of mathematics that deals with solving systems of linear equations. It is widely used throughout science and engineering, and it is essential to understanding machine learning algorithms.
Linear algebra defines three basic data structures – vectors, matrices and tensors – which are used constantly in machine learning and deep learning. In this blog post we’ll go over all three structures and show how to do some simple math with them.
In the computer science world, we represent vectors using one-dimensional arrays of numbers. A vector can contain any number of values.
If a vector holds n values, we say that it is an n-dimensional vector.
I just said that a vector is a one-dimensional array, but I also said that it can be n-dimensional as well. What’s with that?
In programming, a vector is always represented with a one-dimensional array, but the vector itself can be n-dimensional. The dimensions of a vector are not the same as the dimensions of an array: the array has a single axis, while the vector is said to be n-dimensional because it stores n values.
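The difference between the two meanings of "dimension" can be checked in code. This is a minimal sketch that assumes NumPy (the post itself does not name a library); the variable names are only for illustration.

```python
import numpy as np

# A three-dimensional vector stored in a one-dimensional array.
v = np.array([-4, 4, 1])

array_dimensions = v.ndim        # number of axes the array has
vector_dimensions = v.shape[0]   # number of values the vector stores

print(array_dimensions)    # 1
print(vector_dimensions)   # 3
```

The array has one axis, yet the vector it holds is three-dimensional because it stores three values.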
The image above is a representation of a vector.
Vectors define points in space, where each value represents a coordinate at a certain axis.
So a vector with values [-4, 4, 1] would look something like this:
A vector is often represented as a directed line going from the origin to the point defined by the values of the vector. The origin is usually defined as the point at which the coordinates on all of the axes are 0. So in the example above, the origin is at (0, 0, 0) because we are representing a three-dimensional vector.
In the case above, we represented the vector in the vector space $\mathbb{R}^3$, because the values stored in the vector are real numbers and the vector is three-dimensional.
In deep learning, we usually use vectors to store biases, as they are one-dimensional arrays.
Matrices are two-dimensional arrays. Elements of a matrix are accessed by two indices, usually denoted i and j, where i selects the row and j selects the column.
If we have a real-valued matrix A with m rows and n columns, then we write $A \in \mathbb{R}^{m \times n}$.
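As a small illustration of rows, columns and the two indices, here is a sketch that assumes NumPy. Note that NumPy counts indices from 0, while mathematical notation usually counts from 1.

```python
import numpy as np

# A matrix with m = 2 rows and n = 3 columns.
A = np.array([[1, 2, 3],
              [4, 5, 6]])

m, n = A.shape

# A[1, 2] is the second row, third column
# (i = 2, j = 3 in 1-based mathematical notation).
element = A[1, 2]

print(m, n)      # 2 3
print(element)   # 6
```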
In deep learning, we use matrices to store weights and inputs.
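To make the roles of weights and biases concrete, here is a hedged sketch of one fully connected layer without an activation function. The layer sizes, the values, and the names `W`, `x`, `b` and `y` are assumptions made for illustration, and NumPy is assumed as the library.

```python
import numpy as np

# Hypothetical layer with 3 inputs and 2 outputs.
W = np.array([[0.5, -1.0, 2.0],
              [1.5,  0.0, 1.0]])   # weight matrix, shape (2, 3)
x = np.array([1.0, 2.0, 3.0])      # input vector, shape (3,)
b = np.array([0.1, -0.2])          # bias vector, shape (2,)

# Multiply the weights by the input, then add the biases.
y = W @ x + b

print(y)   # approximately [4.6 4.3]
```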
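Tensors are mostly described as n-dimensional arrays (arrays in the computer science sense, not n-dimensional vectors), where n is typically greater than 2. A vector can then be seen as a tensor with one axis and a matrix as a tensor with two.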
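A quick sketch of an array with more than two axes, assuming NumPy. The shape (2, 3, 4) is arbitrary and chosen only for illustration.

```python
import numpy as np

# An array with three axes: 2 blocks, each holding a 3 x 4 matrix.
T = np.zeros((2, 3, 4))

number_of_axes = T.ndim
number_of_values = T.size

print(number_of_axes)     # 3
print(number_of_values)   # 24
```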
Some basic matrix and vector math
Matrix operations can be computed very efficiently, especially on parallel hardware such as GPUs, so they’re used in computer games and, of course, machine learning.
The two most basic operations in matrix math are multiplication and addition.
Linear algebra features two main types of matrix multiplication – the dot product and the Hadamard product.
The Hadamard product is the element-wise product, and it is the one most people would assume we’d be using. It requires both matrices to have the same shape, and it’s denoted like this: $A \odot B$.
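A minimal sketch of the element-wise product, assuming NumPy, where the `*` operator on two arrays of the same shape multiplies matching elements:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Element-wise (Hadamard) product: each entry is A[i, j] * B[i, j].
hadamard = A * B

print(hadamard)
# [[ 5 12]
#  [21 32]]
```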
The more commonly used operation in machine learning is the dot product. When calculating the dot product of two matrices, the first matrix needs to have the same number of columns as the second one has rows: if A has m rows and n columns and B has n rows and p columns, the result has m rows and p columns. It is denoted like this: $AB$.
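The shape rule can be checked with a short sketch, assuming NumPy, where the `@` operator performs matrix multiplication:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # 2 rows, 3 columns
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])       # 3 rows, 2 columns

# The 3 columns of A match the 3 rows of B, so the product is defined.
product = A @ B

print(product.shape)   # (2, 2)
print(product)
# [[ 58  64]
#  [139 154]]
```

Each entry is the sum of the products of a row of A with a column of B, for example 1*7 + 2*9 + 3*11 = 58.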
When you say matrix multiplication, most of the time you’d mean the dot product.
Matrix multiplication is associative and distributive, but not necessarily commutative.
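These properties can be checked numerically on small examples. This is a sketch assuming NumPy, with integer matrices chosen so the comparisons are exact. It demonstrates the properties for these particular matrices and is not a proof.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
C = np.array([[2, 0],
              [0, 2]])

# Associative: (AB)C equals A(BC).
is_associative = np.array_equal((A @ B) @ C, A @ (B @ C))

# Distributive: A(B + C) equals AB + AC.
is_distributive = np.array_equal(A @ (B + C), A @ B + A @ C)

# Commutativity fails for this pair: AB differs from BA.
is_commutative = np.array_equal(A @ B, B @ A)

print(is_associative)    # True
print(is_distributive)   # True
print(is_commutative)    # False
```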
You can also apply the dot product to two vectors x and y of the same dimensionality by calculating $x^T y$.
The T in the exponent of x is the transpose operator. Transposing mirrors a matrix across its main diagonal, so its rows become columns and its columns become rows.
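Both the vector dot product and the transpose can be sketched in a few lines, assuming NumPy. One caveat specific to NumPy: transposing a one-dimensional array has no effect, so for 1-D arrays `x @ y` gives the dot product directly.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Dot product of two vectors: 1*4 + 2*5 + 3*6.
dot = int(x @ y)
print(dot)   # 32

# Transposing a matrix swaps its rows and columns.
A = np.array([[1, 2, 3],
              [4, 5, 6]])
A_T = A.T

print(A_T.shape)   # (3, 2)
print(A_T)
# [[1 4]
#  [2 5]
#  [3 6]]
```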