
[Recommender System] Neural Collaborative Filtering (NCF) PyTorch implementation

거북이주인장 2024. 11. 4. 07:46

In this post, we implement the model architectures from the NCF paper in PyTorch, apply them to MovieLens, and discuss the experimental results. A detailed review of the NCF paper is available in the post below.

https://steady-programming.tistory.com/56

 


The implementation follows the code structure of the repository below. It is part of a personal project that implements recommender-system models in a consistent pipeline and reports experiment results. PRs are always welcome!

https://github.com/bohyunshin/recommender

 


Model architecture

Let's look at the model architectures to implement. The NCF paper proposes three models in total: GMF, MLP, and a fusion model that combines the two.

 

The architecture of GMF is as follows.

The user vector and item vector are multiplied elementwise, and the result is projected onto a weight vector $h$ via an inner product. An activation function then produces the final prediction for the target. If $h$ is the one-vector and the activation is the identity function, this reduces to the standard MF formulation. Implemented with torch, it looks like the following.
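As a quick sanity check, here is a small NumPy sketch (with made-up toy vectors) showing that with $h$ equal to the one-vector and an identity activation, the GMF score is exactly the MF inner product:

```python
import numpy as np

# Hypothetical toy vectors, purely for illustration.
p_u = np.array([0.5, -1.0, 2.0])  # user latent vector
q_i = np.array([1.0, 0.2, -0.5])  # item latent vector
h = np.ones(3)                    # weight vector h set to the one-vector

# GMF score before activation: h^T (p_u ⊙ q_i)
gmf_score = h @ (p_u * q_i)

# Plain MF score: inner product of user and item vectors
mf_score = p_u @ q_i

assert np.isclose(gmf_score, mf_score)  # identical when h = 1, activation = identity
```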

https://github.com/bohyunshin/recommender/blob/master/recommender/model/deep_learning/gmf.py

 


import torch.nn as nn
import torch.nn.functional as F
from model.torch_model_base import TorchModelBase


class Model(TorchModelBase):
    def __init__(self, num_users, num_items, num_factors, **kwargs):
        super(Model, self).__init__()

        # user / item embedding tables; each row is a latent factor vector
        self.embed_user = nn.Embedding(num_users, num_factors)
        self.embed_item = nn.Embedding(num_items, num_factors)
        # bias-free linear layer playing the role of the weight vector h
        self.h = nn.Linear(num_factors, 1, bias=False)

        nn.init.xavier_normal_(self.embed_user.weight)
        nn.init.xavier_normal_(self.embed_item.weight)
        nn.init.xavier_normal_(self.h.weight)

    def forward(self, user_idx, item_idx):
        # elementwise product of user and item vectors, then project with h
        x = self.embed_user(user_idx) * self.embed_item(item_idx)
        x = self.h(x)
        return F.sigmoid(x)

 

Defining the $h$ vector from the paper as the `self.h` linear layer (with no bias) achieves the same effect. Finally, the output is passed through `F.sigmoid` to obtain a probability.

 

Next, let's look at the MLP architecture.

 

With L layers, the output of the final (L-th) layer is again rescaled through the weight vector $h$. Note that the initial input is the concatenation of the user vector $p_u$ and the item vector $q_i$. On reflection, this is not the usual two-tower architecture: a typical two-tower model encodes the user and item embeddings into dense vectors through separate encoders and combines them only at the final layer. In this paper, by contrast, the embeddings are concatenated first and then passed through a single encoder. That said, the model that fuses GMF and MLP does end up with a structure resembling a two-tower model.
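To make the distinction concrete, here is a minimal NumPy sketch (toy vectors and stand-in encoders, not the paper's actual layers) contrasting the two input styles:

```python
import numpy as np

k = 16  # embedding dimension (assumed for illustration)
rng = np.random.default_rng(0)
p_u = rng.normal(size=k)  # user embedding
q_i = rng.normal(size=k)  # item embedding

# NCF-MLP style: concatenate first, then feed one shared encoder
x_ncf = np.concatenate([p_u, q_i])  # shape (2k,), fed to the MLP as a whole

# Typical two-tower style: encode each side separately, combine at the top
user_enc = np.tanh(p_u)  # stand-in for a user-side encoder
item_enc = np.tanh(q_i)  # stand-in for an item-side encoder
score = user_enc @ item_enc  # the two sides interact only at the last step

assert x_ncf.shape == (2 * k,)
```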

 

Let's implement this with torch. We set $L=3$, use ReLU as the activation for each layer, and halve the output dimension at every layer.
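With `num_factors = 16` (the value used in the training commands later in this post), the halving rule gives the following layer dimensions, sketched in plain Python:

```python
num_factors = 16
dims = [num_factors * 2]  # input is the concat of user & item embeddings
for _ in range(3):        # L = 3 hidden layers
    dims.append(dims[-1] // 2)

print(dims)  # [32, 16, 8, 4]; the final h layer then maps 4 -> 1
```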

https://github.com/bohyunshin/recommender/blob/master/recommender/model/deep_learning/mlp.py

 


import torch
import torch.nn as nn
import torch.nn.functional as F

from model.torch_model_base import TorchModelBase


class Model(TorchModelBase):
    def __init__(self, num_users, num_items, num_factors, **kwargs):
        super().__init__()

        self.num_users = num_users
        self.num_items = num_items
        self.num_factors = num_factors

        self.embed_user = nn.Embedding(num_users, num_factors)
        self.embed_item = nn.Embedding(num_items, num_factors)

        num_factors = num_factors * 2  # concat of user & item embeddings
        layers = []
        num_layers = 3
        for i in range(num_layers):
            # halve the output dimension at every layer
            output_dim = num_factors // 2
            layers.append(nn.Linear(num_factors, output_dim))
            layers.append(nn.ReLU())
            num_factors = output_dim
        self.layers = nn.Sequential(*layers)
        # weight vector h: final bias-free projection to a scalar
        self.h = nn.Linear(num_factors, 1, bias=False)

        nn.init.xavier_normal_(self.embed_user.weight)
        nn.init.xavier_normal_(self.embed_item.weight)
        for layer in self.layers:
            # only the Linear layers have weights; skip the ReLU modules
            if getattr(layer, "weight", None) is not None:
                nn.init.xavier_normal_(layer.weight)
        nn.init.xavier_normal_(self.h.weight)

    def forward(self, user_idx, item_idx):
        # concatenate user & item embeddings along the feature axis
        x = torch.concat((self.embed_user(user_idx), self.embed_item(item_idx)), dim=1)
        x = self.layers(x)
        x = self.h(x)
        return F.sigmoid(x)

    def predict(self, user_factors, item_factors, userid, **kwargs):
        # score every item for each user in `userid`, one user at a time
        item_idx = torch.arange(self.num_items)
        user_item_pred = torch.tensor([])
        for user_idx in userid:
            pred = self.forward(
                torch.tensor(user_idx).repeat(self.num_items), item_idx
            ).reshape(1, -1)
            user_item_pred = torch.concat((user_item_pred, pred), dim=0)
        return user_item_pred.detach().cpu().numpy()

 

Put the desired PyTorch modules into the `layers` list and wrap them with `nn.Sequential`. After passing through the L layers, `self.h` applies the final weighting. The `predict` method is used by the experiment pipeline, so it can be ignored for now.
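For intuition, what `predict` produces is a full user-by-item score matrix. Here is a hypothetical NumPy sketch of the same idea, with a simple dot-product scorer standing in for `forward`:

```python
import numpy as np

def score_all_items(user_factors, item_factors, user_ids):
    """Score every item for each requested user.

    Returns an array of shape (len(user_ids), num_items), one row per user.
    """
    return user_factors[user_ids] @ item_factors.T

rng = np.random.default_rng(42)
U = rng.normal(size=(5, 4))  # 5 users, 4 latent factors
V = rng.normal(size=(7, 4))  # 7 items, 4 latent factors

preds = score_all_items(U, V, [0, 2])
assert preds.shape == (2, 7)  # scores of all 7 items for users 0 and 2
```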

Applying to ml-1m

To check that the implementation is correct, let's train the models on the MovieLens 1M dataset. See the Python file below for the pipeline.

https://github.com/bohyunshin/recommender/blob/master/recommender/train.py

 


 

See the README for the run commands.

https://github.com/bohyunshin/recommender/tree/master?tab=readme-ov-file#experiment-results

 


 

Run GMF with the following command.

$ python3 recommender/train.py \
  --dataset movielens \
  --model gmf \
  --implicit \
  --epochs 1 \
  --num_factors 16 \
  --train_ratio 0.8 \
  --random_state 42 \
  --movielens_data_type ml-1m \
  --model_path "../gmf_ml_1m.pkl" \
  --log_path "../gmf_ml_1m.log"

 

The resulting log is shown below.


2024-10-31 18:06:29,015 - recommender - INFO - selected dataset: movielens
2024-10-31 18:06:29,018 - recommender - INFO - selected model: gmf
2024-10-31 18:06:29,018 - recommender - INFO - batch size: 32
2024-10-31 18:06:29,018 - recommender - INFO - learning rate: 0.01
2024-10-31 18:06:29,018 - recommender - INFO - regularization: 0.0001
2024-10-31 18:06:29,018 - recommender - INFO - epochs: 30
2024-10-31 18:06:29,018 - recommender - INFO - number of factors for user / item embedding: 16
2024-10-31 18:06:29,018 - recommender - INFO - train ratio: 0.8
2024-10-31 18:06:29,018 - recommender - INFO - patience for watching validation loss: 5
2024-10-31 18:06:29,018 - recommender - INFO - selected movielens data type: ml-1m
2024-10-31 18:06:31,950 - recommender - INFO - device info: cuda:0
2024-10-31 18:08:14,648 - recommender - INFO - token time for negative sampling: 54.80386924743652
2024-10-31 18:08:15,616 - recommender - INFO - ####### Epoch 0 #######
2024-10-31 18:10:38,489 - recommender - INFO - Train Loss: 0.693147
2024-10-31 18:10:38,504 - recommender - INFO - Validation Loss: 0.693148
2024-10-31 18:10:38,508 - recommender - INFO - Best validation: 0.693148, Previous validation loss: inf
2024-10-31 18:10:38,509 - recommender - INFO - ####### Epoch 1 #######
2024-10-31 18:12:54,923 - recommender - INFO - Train Loss: 0.693147
2024-10-31 18:12:54,930 - recommender - INFO - Validation Loss: 0.693148
2024-10-31 18:12:54,930 - recommender - INFO - Validation loss did not decrease. Patience 4 left.
2024-10-31 18:12:54,930 - recommender - INFO - ####### Epoch 2 #######
2024-10-31 18:15:19,920 - recommender - INFO - Train Loss: 0.693147
2024-10-31 18:15:19,926 - recommender - INFO - Validation Loss: 0.693148
2024-10-31 18:15:19,926 - recommender - INFO - Validation loss did not decrease. Patience 3 left.
2024-10-31 18:15:19,926 - recommender - INFO - ####### Epoch 3 #######
2024-10-31 18:17:45,198 - recommender - INFO - Train Loss: 0.693147
2024-10-31 18:17:45,204 - recommender - INFO - Validation Loss: 0.693148
2024-10-31 18:17:45,204 - recommender - INFO - Validation loss did not decrease. Patience 2 left.
2024-10-31 18:17:45,204 - recommender - INFO - ####### Epoch 4 #######
2024-10-31 18:20:06,678 - recommender - INFO - Train Loss: 0.693147
2024-10-31 18:20:06,691 - recommender - INFO - Validation Loss: 0.693148
2024-10-31 18:20:06,691 - recommender - INFO - Validation loss did not decrease. Patience 1 left.
2024-10-31 18:20:06,691 - recommender - INFO - ####### Epoch 5 #######
2024-10-31 18:22:24,704 - recommender - INFO - Train Loss: 0.693147
2024-10-31 18:22:24,711 - recommender - INFO - Validation Loss: 0.693148
2024-10-31 18:22:24,711 - recommender - INFO - Validation loss did not decrease. Patience 0 left.
2024-10-31 18:22:24,711 - recommender - INFO - Patience over. Early stopping at epoch 5 with 0.693148 validation loss
2024-10-31 18:23:19,748 - recommender - INFO - Metric for K=10
2024-10-31 18:23:19,749 - recommender - INFO - NDCG@10: 0.010084462523330978
2024-10-31 18:23:19,749 - recommender - INFO - mAP@10: 0.0031447316038900716
2024-10-31 18:23:32,284 - recommender - INFO - Metric for K=20
2024-10-31 18:23:32,284 - recommender - INFO - NDCG@20: 0.010774124277642733
2024-10-31 18:23:32,284 - recommender - INFO - mAP@20: 0.0022548978920949843
2024-10-31 18:23:43,753 - recommender - INFO - Metric for K=50
2024-10-31 18:23:43,753 - recommender - INFO - NDCG@50: 0.013380610656625528
2024-10-31 18:23:43,753 - recommender - INFO - mAP@50: 0.0017988427810576775
2024-10-31 18:23:43,754 - recommender - INFO - Load weight with best validation loss
2024-10-31 18:23:43,763 - recommender - INFO - Save final model

 

Run MLP with the following command.

python3 recommender/train.py \
  --dataset movielens \
  --model mlp \
  --implicit \
  --epochs 1 \
  --num_factors 16 \
  --train_ratio 0.8 \
  --random_state 42 \
  --movielens_data_type ml-1m \
  --model_path "../mlp_ml_1m.pkl" \
  --log_path "../mlp_ml_1m.log"

 

The resulting log is shown below.


2024-10-31 18:04:57,711 - recommender - INFO - selected dataset: movielens
2024-10-31 18:04:57,715 - recommender - INFO - selected model: mlp
2024-10-31 18:04:57,715 - recommender - INFO - batch size: 32
2024-10-31 18:04:57,715 - recommender - INFO - learning rate: 0.01
2024-10-31 18:04:57,715 - recommender - INFO - regularization: 0.0001
2024-10-31 18:04:57,715 - recommender - INFO - epochs: 30
2024-10-31 18:04:57,715 - recommender - INFO - number of factors for user / item embedding: 16
2024-10-31 18:04:57,715 - recommender - INFO - train ratio: 0.8
2024-10-31 18:04:57,715 - recommender - INFO - patience for watching validation loss: 5
2024-10-31 18:04:57,715 - recommender - INFO - selected movielens data type: ml-1m
2024-10-31 18:05:01,033 - recommender - INFO - device info: cuda:0
2024-10-31 18:06:41,455 - recommender - INFO - token time for negative sampling: 52.37311387062073
2024-10-31 18:06:42,318 - recommender - INFO - ####### Epoch 0 #######
2024-10-31 18:10:07,137 - recommender - INFO - Train Loss: 0.568843
2024-10-31 18:10:07,143 - recommender - INFO - Validation Loss: 0.484593
2024-10-31 18:10:07,152 - recommender - INFO - Best validation: 0.484593, Previous validation loss: inf
2024-10-31 18:10:07,152 - recommender - INFO - ####### Epoch 1 #######
2024-10-31 18:13:23,067 - recommender - INFO - Train Loss: 0.47863
2024-10-31 18:13:23,072 - recommender - INFO - Validation Loss: 0.476106
2024-10-31 18:13:23,081 - recommender - INFO - Best validation: 0.476106, Previous validation loss: 0.484593
2024-10-31 18:13:23,081 - recommender - INFO - ####### Epoch 2 #######
2024-10-31 18:16:53,992 - recommender - INFO - Train Loss: 0.473603
2024-10-31 18:16:53,999 - recommender - INFO - Validation Loss: 0.475374
2024-10-31 18:16:54,009 - recommender - INFO - Best validation: 0.475374, Previous validation loss: 0.476106
2024-10-31 18:16:54,009 - recommender - INFO - ####### Epoch 3 #######
2024-10-31 18:20:14,890 - recommender - INFO - Train Loss: 0.470527
2024-10-31 18:20:14,896 - recommender - INFO - Validation Loss: 0.479517
2024-10-31 18:20:14,897 - recommender - INFO - Validation loss did not decrease. Patience 4 left.
2024-10-31 18:20:14,897 - recommender - INFO - ####### Epoch 4 #######
2024-10-31 18:23:38,197 - recommender - INFO - Train Loss: 0.468094
2024-10-31 18:23:38,202 - recommender - INFO - Validation Loss: 0.469739
2024-10-31 18:23:38,212 - recommender - INFO - Best validation: 0.469739, Previous validation loss: 0.475374
2024-10-31 18:23:38,212 - recommender - INFO - ####### Epoch 5 #######
2024-10-31 18:27:07,789 - recommender - INFO - Train Loss: 0.466584
2024-10-31 18:27:07,796 - recommender - INFO - Validation Loss: 0.469123
2024-10-31 18:27:07,807 - recommender - INFO - Best validation: 0.469123, Previous validation loss: 0.469739
2024-10-31 18:27:07,807 - recommender - INFO - ####### Epoch 6 #######
2024-10-31 18:30:31,073 - recommender - INFO - Train Loss: 0.465391
2024-10-31 18:30:31,087 - recommender - INFO - Validation Loss: 0.4701
2024-10-31 18:30:31,087 - recommender - INFO - Validation loss did not decrease. Patience 4 left.
2024-10-31 18:30:31,087 - recommender - INFO - ####### Epoch 7 #######
2024-10-31 18:34:01,434 - recommender - INFO - Train Loss: 0.464173
2024-10-31 18:34:01,441 - recommender - INFO - Validation Loss: 0.468072
2024-10-31 18:34:01,452 - recommender - INFO - Best validation: 0.468072, Previous validation loss: 0.469123
2024-10-31 18:34:01,453 - recommender - INFO - ####### Epoch 8 #######
2024-10-31 18:37:28,354 - recommender - INFO - Train Loss: 0.462456
2024-10-31 18:37:28,361 - recommender - INFO - Validation Loss: 0.467111
2024-10-31 18:37:28,372 - recommender - INFO - Best validation: 0.467111, Previous validation loss: 0.468072
2024-10-31 18:37:28,372 - recommender - INFO - ####### Epoch 9 #######
2024-10-31 18:40:55,650 - recommender - INFO - Train Loss: 0.458776
2024-10-31 18:40:55,656 - recommender - INFO - Validation Loss: 0.463846
2024-10-31 18:40:55,665 - recommender - INFO - Best validation: 0.463846, Previous validation loss: 0.467111
2024-10-31 18:40:55,665 - recommender - INFO - ####### Epoch 10 #######
2024-10-31 18:44:30,668 - recommender - INFO - Train Loss: 0.451102
2024-10-31 18:44:30,674 - recommender - INFO - Validation Loss: 0.453609
2024-10-31 18:44:30,683 - recommender - INFO - Best validation: 0.453609, Previous validation loss: 0.463846
2024-10-31 18:44:30,683 - recommender - INFO - ####### Epoch 11 #######
2024-10-31 18:48:06,162 - recommender - INFO - Train Loss: 0.441097
2024-10-31 18:48:06,170 - recommender - INFO - Validation Loss: 0.44296
2024-10-31 18:48:06,184 - recommender - INFO - Best validation: 0.44296, Previous validation loss: 0.453609
2024-10-31 18:48:06,184 - recommender - INFO - ####### Epoch 12 #######
2024-10-31 18:51:45,326 - recommender - INFO - Train Loss: 0.432934
2024-10-31 18:51:45,401 - recommender - INFO - Validation Loss: 0.438264
2024-10-31 18:51:45,433 - recommender - INFO - Best validation: 0.438264, Previous validation loss: 0.44296
2024-10-31 18:51:45,433 - recommender - INFO - ####### Epoch 13 #######
2024-10-31 18:55:20,593 - recommender - INFO - Train Loss: 0.42761
2024-10-31 18:55:20,602 - recommender - INFO - Validation Loss: 0.43501
2024-10-31 18:55:20,615 - recommender - INFO - Best validation: 0.43501, Previous validation loss: 0.438264
2024-10-31 18:55:20,615 - recommender - INFO - ####### Epoch 14 #######
2024-10-31 18:58:56,529 - recommender - INFO - Train Loss: 0.423283
2024-10-31 18:58:56,535 - recommender - INFO - Validation Loss: 0.431918
2024-10-31 18:58:56,546 - recommender - INFO - Best validation: 0.431918, Previous validation loss: 0.43501
2024-10-31 18:58:56,547 - recommender - INFO - ####### Epoch 15 #######
2024-10-31 19:02:36,283 - recommender - INFO - Train Loss: 0.419294
2024-10-31 19:02:36,290 - recommender - INFO - Validation Loss: 0.433076
2024-10-31 19:02:36,290 - recommender - INFO - Validation loss did not decrease. Patience 4 left.
2024-10-31 19:02:36,290 - recommender - INFO - ####### Epoch 16 #######
2024-10-31 19:05:56,208 - recommender - INFO - Train Loss: 0.415246
2024-10-31 19:05:56,214 - recommender - INFO - Validation Loss: 0.42949
2024-10-31 19:05:56,223 - recommender - INFO - Best validation: 0.42949, Previous validation loss: 0.431918
2024-10-31 19:05:56,223 - recommender - INFO - ####### Epoch 17 #######
2024-10-31 19:09:23,258 - recommender - INFO - Train Loss: 0.411042
2024-10-31 19:09:23,265 - recommender - INFO - Validation Loss: 0.423634
2024-10-31 19:09:23,275 - recommender - INFO - Best validation: 0.423634, Previous validation loss: 0.42949
2024-10-31 19:09:23,275 - recommender - INFO - ####### Epoch 18 #######
2024-10-31 19:12:46,785 - recommender - INFO - Train Loss: 0.406544
2024-10-31 19:12:46,793 - recommender - INFO - Validation Loss: 0.421556
2024-10-31 19:12:46,802 - recommender - INFO - Best validation: 0.421556, Previous validation loss: 0.423634
2024-10-31 19:12:46,803 - recommender - INFO - ####### Epoch 19 #######
2024-10-31 19:16:11,996 - recommender - INFO - Train Loss: 0.402227
2024-10-31 19:16:12,003 - recommender - INFO - Validation Loss: 0.418276
2024-10-31 19:16:12,013 - recommender - INFO - Best validation: 0.418276, Previous validation loss: 0.421556
2024-10-31 19:16:12,013 - recommender - INFO - ####### Epoch 20 #######
2024-10-31 19:19:51,527 - recommender - INFO - Train Loss: 0.39777
2024-10-31 19:19:51,536 - recommender - INFO - Validation Loss: 0.416555
2024-10-31 19:19:51,547 - recommender - INFO - Best validation: 0.416555, Previous validation loss: 0.418276
2024-10-31 19:19:51,547 - recommender - INFO - ####### Epoch 21 #######
2024-10-31 19:23:21,204 - recommender - INFO - Train Loss: 0.393332
2024-10-31 19:23:21,210 - recommender - INFO - Validation Loss: 0.410948
2024-10-31 19:23:21,219 - recommender - INFO - Best validation: 0.410948, Previous validation loss: 0.416555
2024-10-31 19:23:21,220 - recommender - INFO - ####### Epoch 22 #######
2024-10-31 19:27:00,852 - recommender - INFO - Train Loss: 0.388957
2024-10-31 19:27:00,858 - recommender - INFO - Validation Loss: 0.408472
2024-10-31 19:27:00,868 - recommender - INFO - Best validation: 0.408472, Previous validation loss: 0.410948
2024-10-31 19:27:00,868 - recommender - INFO - ####### Epoch 23 #######
2024-10-31 19:30:40,472 - recommender - INFO - Train Loss: 0.384779
2024-10-31 19:30:40,478 - recommender - INFO - Validation Loss: 0.405596
2024-10-31 19:30:40,486 - recommender - INFO - Best validation: 0.405596, Previous validation loss: 0.408472
2024-10-31 19:30:40,486 - recommender - INFO - ####### Epoch 24 #######
2024-10-31 19:34:05,020 - recommender - INFO - Train Loss: 0.380777
2024-10-31 19:34:05,027 - recommender - INFO - Validation Loss: 0.404474
2024-10-31 19:34:05,037 - recommender - INFO - Best validation: 0.404474, Previous validation loss: 0.405596
2024-10-31 19:34:05,037 - recommender - INFO - ####### Epoch 25 #######
2024-10-31 19:37:40,649 - recommender - INFO - Train Loss: 0.376955
2024-10-31 19:37:40,656 - recommender - INFO - Validation Loss: 0.403129
2024-10-31 19:37:40,668 - recommender - INFO - Best validation: 0.403129, Previous validation loss: 0.404474
2024-10-31 19:37:40,668 - recommender - INFO - ####### Epoch 26 #######
2024-10-31 19:41:16,565 - recommender - INFO - Train Loss: 0.373543
2024-10-31 19:41:16,573 - recommender - INFO - Validation Loss: 0.400969
2024-10-31 19:41:16,589 - recommender - INFO - Best validation: 0.400969, Previous validation loss: 0.403129
2024-10-31 19:41:16,589 - recommender - INFO - ####### Epoch 27 #######
2024-10-31 19:44:42,240 - recommender - INFO - Train Loss: 0.370422
2024-10-31 19:44:42,250 - recommender - INFO - Validation Loss: 0.399427
2024-10-31 19:44:42,260 - recommender - INFO - Best validation: 0.399427, Previous validation loss: 0.400969
2024-10-31 19:44:42,260 - recommender - INFO - ####### Epoch 28 #######
2024-10-31 19:48:24,659 - recommender - INFO - Train Loss: 0.367451
2024-10-31 19:48:24,665 - recommender - INFO - Validation Loss: 0.399182
2024-10-31 19:48:24,674 - recommender - INFO - Best validation: 0.399182, Previous validation loss: 0.399427
2024-10-31 19:48:24,675 - recommender - INFO - ####### Epoch 29 #######
2024-10-31 19:51:54,415 - recommender - INFO - Train Loss: 0.364744
2024-10-31 19:51:54,423 - recommender - INFO - Validation Loss: 0.398261
2024-10-31 19:51:54,436 - recommender - INFO - Best validation: 0.398261, Previous validation loss: 0.399182
2024-10-31 19:52:52,764 - recommender - INFO - Metric for K=10
2024-10-31 19:52:52,764 - recommender - INFO - NDCG@10: 0.2613917212269529
2024-10-31 19:52:52,764 - recommender - INFO - mAP@10: 0.15854812493594114
2024-10-31 19:53:06,869 - recommender - INFO - Metric for K=20
2024-10-31 19:53:06,871 - recommender - INFO - NDCG@20: 0.24894602064051807
2024-10-31 19:53:06,871 - recommender - INFO - mAP@20: 0.1275482565021402
2024-10-31 19:53:20,917 - recommender - INFO - Metric for K=50
2024-10-31 19:53:20,917 - recommender - INFO - NDCG@50: 0.2611605084197204
2024-10-31 19:53:20,917 - recommender - INFO - mAP@50: 0.10867373536865668
2024-10-31 19:53:20,918 - recommender - INFO - Load weight with best validation loss
2024-10-31 19:53:20,927 - recommender - INFO - Save final model

Metric comparison

Let's compare the performance of the two models with mAP and NDCG.

        mAP@10   mAP@20   mAP@50   NDCG@10   NDCG@20   NDCG@50
GMF     0.0031   0.0022   0.0017   0.0100    0.0107    0.0133
MLP     0.1585   0.1275   0.1086   0.2613    0.2489    0.2611
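For reference, here is a minimal sketch of one common definition of NDCG@K and AP@K on a toy binary-relevance ranking (the repository's pipeline may compute them slightly differently):

```python
import math

def ndcg_at_k(ranked_rels, k):
    """DCG of the top-k ranking divided by the DCG of the ideal ordering."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ranked_rels[:k]))
    ideal = sorted(ranked_rels, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def ap_at_k(ranked_rels, k):
    """Average precision over the top-k positions, binary relevance."""
    hits, total = 0, 0.0
    for i, rel in enumerate(ranked_rels[:k]):
        if rel:
            hits += 1
            total += hits / (i + 1)  # precision at each relevant position
    n_rel = sum(ranked_rels)
    return total / min(n_rel, k) if n_rel else 0.0

# Toy ranking: 1 = relevant item, 0 = irrelevant
rels = [1, 0, 1, 0, 0]
ndcg = ndcg_at_k(rels, 5)  # ≈ 0.9197
ap = ap_at_k(rels, 5)      # ≈ 0.8333
```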

 

MLP clearly outperforms GMF. The deeper neural network presumably gives MLP the edge, though in a real production setting, factors such as response time would also need to be considered.
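One detail worth flagging in the GMF log above: the train and validation losses never move from 0.693147, which is exactly ln 2, the binary cross-entropy of predicting a constant 0.5 regardless of the label. This suggests GMF effectively failed to learn in this run (a learning-rate or optimization issue is a plausible suspect), which would also explain its near-zero ranking metrics:

```python
import math

# BCE of always predicting p = 0.5, for either label y in {0, 1}:
# -(y * log(0.5) + (1 - y) * log(0.5)) = log(2)
constant_bce = math.log(2)
print(round(constant_bce, 6))  # 0.693147 -- matches the stuck GMF loss
```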

Conclusion

In this post, we implemented the GMF and MLP models introduced in the NCF paper. One open question is why they underperform the previously implemented ALS and BPR models; this will be revisited later. MLP can be turned into a two-tower model by passing the user and item embeddings through separate neural networks, a structure many companies use as a kind of baseline. Additional user and item metadata could also be fed in, which is left as future work.