Machine Learning Week 1

Linear Regression and Linear Algebra

Posted by yellowDog on 2018-07-10

What is Machine Learning?

Two definitions of Machine Learning are offered. Arthur Samuel described it as: “the field of study that gives computers the ability to learn without being explicitly programmed.” This is an older, informal definition.

Tom Mitchell provides a more modern definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Example: playing checkers.

  • E = the experience of playing many games of checkers.

  • T = the task of playing checkers.

  • P = the probability that the program will win the next game.

In general, any machine learning problem can be assigned to one of two broad classifications:

  • Supervised learning
  • Unsupervised learning

Supervised Learning

Each example in the data set comes with the correct answer (this labeled data forms the training set).

  • Regression -> predict a continuous-valued output
  • Classification -> predict a discrete-valued output

Unsupervised Learning

All the data looks the same (unlabeled); the task is to discover structure or types in it.

  • Clustering algorithms
  • The cocktail party algorithm (separating mixed audio sources)

Model

Linear regression

Training set notation:
  m = number of training examples
  x = input variable / features
  y = output variable / target variable
  (x, y) = one training example
  h (hypothesis) -> the function output by the learning algorithm, mapping x to a predicted y

When y can take on only a small number of discrete values (for example, given the living area, predicting whether a dwelling is a house or an apartment), we call it a classification problem.

$$h_{\theta}(x) = \theta_{0} + \theta_{1}x$$
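
As a minimal sketch of what this hypothesis computes, here it is in Octave (the parameter values and inputs are made up for illustration):

theta0 = 1;  theta1 = 2;        % hypothetical parameter values
x = [1; 2; 3];                  % example input features
h = theta0 + theta1 * x         % predictions: [3; 5; 7]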


Cost Function

Also called the squared error function.
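
Written out, the squared error cost function is:

$$J(\theta_{0},\theta_{1}) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)}) - y^{(i)}\right)^{2}$$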

Goal: plot J(θ0, θ1) as a function of the parameters and find the values that minimize J(θ0, θ1).
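
A minimal Octave sketch of evaluating J on a toy training set (all values made up):

x = [1; 2; 3];  y = [2; 4; 6];            % m = 3 training examples
m = length(y);
theta0 = 0;  theta1 = 2;
h = theta0 + theta1 * x;                  % hypothesis on every example
J = (1 / (2 * m)) * sum((h - y) .^ 2)     % J = 0 here, since h fits y exactly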


Gradient Descent -> find the minimum of J

:= the assignment operator

= the equality operator (an assertion of truth)

When computing θ0 and θ1, be careful to update them simultaneously: do not plug the newly computed value of θ0 into the expression that then computes θ1 within the same step.
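
The update rule being described is, repeated until convergence (simultaneously for j = 0 and j = 1):

$$\theta_{j} := \theta_{j} - \alpha\frac{\partial}{\partial\theta_{j}}J(\theta_{0},\theta_{1})$$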

The learning rate $\alpha$ can stay fixed, because the closer we get to a local/global minimum, the smaller the derivative term becomes, so the updates to θ shrink on their own.


Using gradient descent to minimize the squared error cost function:
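
A minimal Octave sketch of the whole procedure, on a made-up data set whose ideal fit is theta0 = 1, theta1 = 2 (alpha and the iteration count are arbitrary choices):

x = [1; 2; 3; 4];  y = [3; 5; 7; 9];
m = length(y);
theta0 = 0;  theta1 = 0;
alpha = 0.05;                                            % learning rate
for iter = 1:2000
  h = theta0 + theta1 * x;                               % current predictions
  temp0 = theta0 - alpha * (1 / m) * sum(h - y);         % dJ/dtheta0
  temp1 = theta1 - alpha * (1 / m) * sum((h - y) .* x);  % dJ/dtheta1
  theta0 = temp0;  theta1 = temp1;                       % simultaneous update
end
[theta0, theta1]                                         % approaches [1, 2]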


Linear Algebra

Matrices and Vectors

  • Dimension: rows * columns

  • Vector: n * 1 matrix

    1-indexed (common in math) or 0-indexed (common in applied settings / programming languages): where indexing starts
    
  • Matrix Addition: only matrices with the same dimensions can be added

  • Scalar Multiplication: multiplying a matrix by a real number

  • Identity Matrix

    For any matrix A
    AI = IA = A
    
  • Matrix multiplication is not commutative (demonstrated in the Octave session at the end)

  • Inverse
    $$AA^{-1} = A^{-1}A = I$$

    where A is an n * n matrix (only square, non-singular matrices have an inverse)

  • Transpose(转置)

$$(A^{T})_{ij} = A_{ji}$$


Demonstrating these operations in Octave:

A = [1,2,3;4,5,6;7,8,9]
v = [1;1;1]
Av = A * v              % matrix-vector product, a 3 x 1 vector

I = eye(3)              % 3 x 3 identity (eye(2) would not match A's dimensions)
% Note that I*A = A*I = A, but in general A*B != B*A

% A is singular (its rows are linearly dependent), so inv(A) returns
% numerical garbage; Octave warns "matrix singular to machine precision"
octave:15> inv(A)
ans =

   3.1525e+15  -6.3050e+15   3.1525e+15
  -6.3050e+15   1.2610e+16  -6.3050e+15
   3.1525e+15  -6.3050e+15   3.1525e+15

octave:23> A'           % transpose
ans =

   1   4   7
   2   5   8
   3   6   9
3 6 9