# Zankoku na tenshi no thesis 2009 version

At one point, I was taking requests, but my backlog quickly grew beyond what I could handle. I haven't done a new transcription in several years now, and I likely will not return to it, except perhaps someday to finish a few that I already started. If you are a transcriber yourself, please take a look at the current requests and see if anything there strikes you as something you might want to tackle. If you do, or if you find sheet music already available for anything listed, please let me know and I'll update the request list accordingly.

We have a user-item ratings matrix $Q$, with one row per user $u$ and one column per item $i$, defined as follows:
$$q_{ui} = \cases{ r & \text{if user } u \text{ rated item } i \cr 0 & \text{if user } u \text{ did not rate item } i }$$ where $r$ is the rating value and $q_{ui}$ is the entry of $Q$ for user $u$ and item $i$.

If we have $m$ users and $n$ items, then we want to learn a matrix of factors that represents the movies. That is, we learn a factor vector for each movie, and that vector is how we represent the movie in the feature space. Note that we do not have any knowledge of the category of the movie at this point. We also want to learn a factor vector for each user, in the same way that we represent the movies. With $f$ latent factors, the factor matrix for movies is $Y \in \mathbb{R}^{f \times n}$ (each movie is a column vector $y_i$) and the factor matrix for users is $X \in \mathbb{R}^{m \times f}$ (each user is a row vector $x_u$).

However, we have two unknown variables. Therefore, we will adopt an alternating least squares approach with regularization. We first estimate $Y$ using $X$, then estimate $X$ using $Y$, and repeat. After enough iterations, we aim to reach a convergence point where the matrices $X$ and $Y$ either no longer change or change only slightly.

However, there is a small problem in the data. We have neither full user data nor full item data, which (unsurprisingly) is also why we are trying to build the recommendation engine in the first place. Therefore, we want to give zero weight in the update rule to the movies that do not have ratings. By doing so, we depend only on the movies that have ratings from the users and make no assumption about the unrated movies. Let's call the entries of this weight matrix $w_{ui}$: $$w_{ui} = \cases{ 0 & \text{if } q_{ui} = 0 \cr 1 & \text{otherwise} }$$

Then, the cost functions that we are trying to minimize are the following, where $q_u$ is the row of $Q$ for user $u$ and $q_i$ is the column of $Q$ for item $i$: $$J(x_u) = (q_u - x_u Y) W_u (q_u - x_u Y)^T + \lambda x_u x_u^T$$ $$J(y_i) = (q_i - X y_i)^T W_i (q_i - X y_i) + \lambda y_i^T y_i$$ Note that we need the regularization terms in order to avoid overfitting the data.
Ideally, the regularization parameter $\lambda$ should be tuned using cross-validation so that the algorithm generalizes better. In this post, I will use the whole dataset. Setting the gradient of each cost function to zero gives the solutions for the factor vectors (transposed for users, since $x_u$ and $q_u$ are row vectors): $$x_u^T = (Y W_u Y^T + \lambda I)^{-1} Y W_u q_u^T$$ $$y_i = (X^T W_i X + \lambda I)^{-1} X^T W_i q_i$$ where $W_u \in \mathbb{R}^{n \times n}$ and $W_i \in \mathbb{R}^{m \times m}$ are diagonal matrices holding the weights for user $u$ and item $i$, respectively. That is pretty much the whole algorithm. In the regularization, we may also want to incorporate both factor matrices in the update rules if we want to be more restrictive. That may generalize better.
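
To make the update rules concrete, here is a minimal NumPy sketch of the alternating updates. It is an illustration under a few assumptions, not a tuned implementation: the function names `weighted_als` and `weighted_rmse`, the toy ratings matrix, the random initialization, and the values of `n_factors`, `lam`, and `n_iters` are all made up for this example, and a zero in `Q` is taken to mean "not rated".

```python
import numpy as np


def weighted_als(Q, n_factors=2, lam=0.1, n_iters=20, seed=0):
    """Weighted ALS on a ratings matrix Q (m users x n items).

    Zeros in Q are treated as "not rated" and get weight 0.
    Returns X (m x f, one row per user) and Y (f x n, one column per item).
    """
    Q = np.asarray(Q, dtype=float)
    m, n = Q.shape
    W = (Q > 0).astype(float)            # w_ui = 1 if rated, else 0
    rng = np.random.default_rng(seed)
    X = rng.random((m, n_factors))       # assumption: uniform random init
    Y = rng.random((n_factors, n))
    reg = lam * np.eye(n_factors)        # lambda * I

    for _ in range(n_iters):
        # Fix Y and solve for each user's factor vector x_u.
        for u in range(m):
            Wu = np.diag(W[u])                        # n x n diagonal weights
            A = Y @ Wu @ Y.T + reg                    # f x f
            b = Y @ Wu @ Q[u]                         # length f
            X[u] = np.linalg.solve(A, b)
        # Fix X and solve for each item's factor vector y_i.
        for i in range(n):
            Wi = np.diag(W[:, i])                     # m x m diagonal weights
            A = X.T @ Wi @ X + reg                    # f x f
            b = X.T @ Wi @ Q[:, i]                    # length f
            Y[:, i] = np.linalg.solve(A, b)
    return X, Y


def weighted_rmse(Q, X, Y):
    """Root mean squared error over the rated entries only."""
    rated = Q > 0
    err = (Q - X @ Y)[rated]
    return float(np.sqrt(np.mean(err ** 2)))


# Toy example (made-up ratings; 0 means "not rated").
Q = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)

X, Y = weighted_als(Q, n_factors=2, lam=0.1, n_iters=20)
print("RMSE on rated entries:", weighted_rmse(Q, X, Y))
print(np.round(X @ Y, 2))  # predicted ratings, including the unrated cells
```

The sketch calls `np.linalg.solve` rather than forming the inverse explicitly, which gives the same solution as the closed-form expressions. Building the full diagonal matrices `Wu` and `Wi` mirrors the formulas but is wasteful for large data; indexing only the rated entries would be the usual shortcut.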