004-lec. Linear regression with multiple variables (x1, x2, ...)
# When you have multiple features and one label,
# for example, suppose you want to predict the final exam score from 3 scores:
# $$$x_{1}$$$ (quiz 1)  $$$x_{2}$$$ (quiz 2)  $$$x_{3}$$$ (midterm 1)  Y (final exam score)
# 73  80  75   152
# 93  88  93   185
# 89  91  90   180
# 96  98  100  196
# 73  66  70   142
# The hypothesis function is the following formula:
# $$$H(x_{1},x_{2},x_{3})=w_{1}x_{1} + w_{2}x_{2} + w_{3}x_{3} + b$$$
# The loss function is the following formula:
# $$$loss(W,b)=\frac{1}{m} \sum\limits_{i=1}^{m} (H(x_{1}^{(i)},x_{2}^{(i)},x_{3}^{(i)}) - y^{(i)})^{2}$$$
# @
# The more features you have, the longer the wx term becomes:
# $$$H(x_{1},x_{2},x_{3},...,x_{n})=w_{1}x_{1} + w_{2}x_{2} + ... + w_{n}x_{n} + b$$$
# To represent the wx term conveniently, you can use matrix multiplication.
# Note that the order in the following formula is important
# when you find the shape of the weight matrix later:
# $$$\begin{bmatrix} x_{1}&x_{2}&...&x_{n} \end{bmatrix} \cdot \begin{bmatrix} w_{1} \\ w_{2} \\ ... \\ w_{n} \end{bmatrix}=x_{1}w_{1} + x_{2}w_{2} + ... + x_{n}w_{n}$$$
# Therefore, you can write XW=H(X).
# $$$x_{1}$$$ (feature 1)  $$$x_{2}$$$ (feature 2)  $$$x_{3}$$$ (feature 3)  Y (final exam score)
# instance1: 73  80  75   152
# instance2: 93  88  93   185
# instance3: 89  91  90   180
# instance4: 96  98  100  196
# instance5: 73  66  70   142
# Hypothesis function:
# $$$H(x_{1},x_{2},x_{3})=w_{1}x_{1} + w_{2}x_{2} + w_{3}x_{3} + b$$$
# You can calculate each prediction one by one:
# $$$\begin{bmatrix} 73&80&75 \end{bmatrix} \cdot \begin{bmatrix} w_{1} \\ w_{2} \\ w_{3} \end{bmatrix}=73w_{1} + 80w_{2} + 75w_{3}$$$
# But this way is inefficient,
# so you can use matrix multiplication again,
# with one row per instance:
# XW=H(X)
# $$$\begin{bmatrix} x_{11}&x_{12}&x_{13} \\ x_{21}&x_{22}&x_{23} \\ x_{31}&x_{32}&x_{33} \\ x_{41}&x_{42}&x_{43} \\ x_{51}&x_{52}&x_{53} \end{bmatrix} \cdot \begin{bmatrix} w_{1} \\ w_{2} \\ w_{3} \end{bmatrix}=\begin{bmatrix} x_{11}w_{1} + x_{12}w_{2} + x_{13}w_{3} \\ x_{21}w_{1} + x_{22}w_{2} + x_{23}w_{3} \\ x_{31}w_{1} + x_{32}w_{2} + x_{33}w_{3} \\ x_{41}w_{1} + x_{42}w_{2} + x_{43}w_{3} \\ x_{51}w_{1} + x_{52}w_{2} + x_{53}w_{3} \end{bmatrix}$$$
# Given the x values and the w values, XW gives H(X).
# @
# XW=H(X)
# The shapes of H(X) and X are given:
# shape of H(X): [5,1]
#   5: number of instances
#   1: number of labels Y
# shape of X: [5,3]
#   5: number of instances
#   3: number of features
# You should decide the shape of W:
# XW=H(X)
# $$$[5,3] \cdot [?,?]=[5,1]$$$
# $$$[?,?]=[3,1]$$$
# The shape of W is decided by the number of features in X and the number of labels in Y.
# @
# Generally, the number of instances is n.
# In NumPy, n can be denoted by -1.
# In TensorFlow, n can be denoted by None.
# XW=H(X)
# $$$[n,3] \cdot [3,1]=[n,1]$$$
# @
# Now let's talk about the case where the number of labels y is 2:
# $$$[n,3] \cdot [?,?]=[n,2]$$$
# $$$[?,?]=[3,2]$$$
# @
# Note the following order:
# in function notation, we write H(x)=Wx+b;
# in linear algebra (matrix) notation, we write XW=H(X).
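# @
# The shape reasoning above can be checked directly in NumPy.
# The sketch below is not part of the lecture; the weight and bias values
# are made-up placeholders, chosen only to exercise the shapes
# [5,3].[3,1]=[5,1] and the loss formula.

import numpy as np

# 5 instances, 3 features: shape [5,3]
x_data = np.array([[73., 80., 75.],
                   [93., 88., 93.],
                   [89., 91., 90.],
                   [96., 98., 100.],
                   [73., 66., 70.]])

# 5 labels: shape [5,1]
y_data = np.array([[152.], [185.], [180.], [196.], [142.]])

# Shape of W is [3,1]: number of features x number of labels
W = np.array([[0.7], [0.6], [0.7]])  # made-up values for illustration
b = 1.0

# Hypothesis H(X)=XW+b: [5,3].[3,1]=[5,1]
hypothesis = x_data @ W + b
print(hypothesis.shape)  # (5, 1)

# Loss: mean of squared errors over the m=5 instances
loss = np.mean(np.square(hypothesis - y_data))
print(loss)

# n instances can be denoted by -1 in NumPy:
# reshape(-1, 3) lets NumPy infer n from the data
print(x_data.reshape(-1, 3).shape)  # (5, 3)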
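# @
# The same kind of check works for the two-label case; again this is a
# sketch with hypothetical zero-valued parameters, used only to verify
# $$$[n,3] \cdot [3,2]=[n,2]$$$.

import numpy as np

X = np.zeros((5, 3))   # n=5 instances, 3 features
W = np.zeros((3, 2))   # [3,2]: 3 features, 2 labels
b = np.zeros((1, 2))   # one bias per label; broadcasts over the n rows
H = X @ W + b
print(H.shape)         # (5, 2)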