# Supervised Learning - TV-Show recommender

## Specification

This program recommends TV shows to watch based on a show you like.
You enter the title of a TV show you like and the number of recommendations you want, and you get that
many TV-show recommendations, ranked by how closely they match your search.

### Data Source:
I will use a dataset from TMDB:
https://www.kaggle.com/datasets/asaniczka/full-tmdb-tv-shows-dataset-2023-150k-shows

### Model:
I will use the NearestNeighbors (NN) algorithm together with the k-nearest neighbors (k-NN) algorithm.
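
As a rough illustration of how a nearest-neighbour lookup can drive recommendations, the sketch below fits scikit-learn's `NearestNeighbors` on TF-IDF vectors of a few made-up show descriptions and asks for the closest titles. The toy catalogue, the cosine metric, and the helper name `recommend` are assumptions for illustration only; the project's actual features and settings may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Toy catalogue (made up for illustration, not taken from the TMDB dataset).
titles = ["Space Patrol", "Galaxy Rangers", "Baking Duel", "Cake Masters"]
descriptions = [
    "space crew explores distant galaxy",
    "rangers defend the galaxy in space",
    "bakers compete in a baking contest",
    "bakers compete to bake the best cake",
]

# Turn each description into a TF-IDF vector.
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(descriptions)

# Nearest-neighbour index; the cosine metric is an assumption.
model = NearestNeighbors(metric="cosine", algorithm="brute")
model.fit(features)


def recommend(title, n_recommendations=1):
    """Return the n titles closest to the given title, nearest first."""
    idx = titles.index(title)
    # Ask for one extra neighbour because the query show is returned as well.
    _, neighbours = model.kneighbors(
        features[idx], n_neighbors=n_recommendations + 1
    )
    return [titles[i] for i in neighbours[0] if i != idx][:n_recommendations]


print(recommend("Space Patrol"))
```

The query show is filtered out of its own result list, which is why one extra neighbour is requested.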

### Features:
1.  Load data from the dataset and preprocess it.
2.  Train the model with the NN and k-NN algorithms.
3.  Read user input.
4.  Show recommendations.

### Requirements:
1. Title data:
    * Title
    * Genres
    * First/last air date
    * Vote count/average
    * Director
    * Description
    * Networks
    * Spoken languages
    * Number of seasons/episodes
2. User data:
    * Which movie / TV show the user prefers
    * Number of recommendations wanted

### Libraries
  * pandas: Data manipulation and analysis
  * scikit-learn: machine learning algorithms and preprocessing
  * scipy: A scientific computing package for Python
  * time: provides various functions for working with time
  * os: functions for interacting with the operating system
  * re: provides regular expression support
  * textwrap: Text wrapping and filling
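
To show how these libraries can fit together, here is a minimal sketch that builds one feature matrix from a text column and two numeric columns. It uses `TfidfVectorizer`, `StandardScaler`, and `scipy.sparse.hstack`, which appear in the references, but the rows are made up and the column names are assumptions that have not been checked against the CSV.

```python
import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

# Made-up rows; column names are assumptions, not verified against the dataset.
shows = pd.DataFrame(
    {
        "name": ["Show A", "Show B", "Show C"],
        "genres": ["Drama, Crime", "Comedy", "Drama, Mystery"],
        "vote_average": [8.9, 7.1, 8.0],
        "number_of_seasons": [5, 9, 2],
    }
)

# Text column -> sparse TF-IDF matrix.
text_features = TfidfVectorizer().fit_transform(shows["genres"])

# Numeric columns -> zero mean, unit variance, so no column dominates distances.
numeric_features = StandardScaler().fit_transform(
    shows[["vote_average", "number_of_seasons"]]
)

# Stack both blocks side by side into one sparse matrix (one row per show).
combined = hstack([text_features, csr_matrix(numeric_features)]).tocsr()

print(combined.shape)
```

A matrix like `combined` could then be passed to a nearest-neighbour model; whether to reduce it first (for example with `TruncatedSVD`) is a design choice this sketch leaves open.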
    
### Classes
  1. LoadData
     * load_data
     * read_data
     * clean_data
  2. ImportData
     * load_dataset
     * create_data
     * clean_data
     * save_data
  3. TrainModel
     * train
     * recommend
     * preprocess_title_data
     * preprocess_target_data
  4. UserData
     * input
     * n_recommendations
  5. RecommendationLoader
     * run 
     * get_recommendations
     * display_recommendations
     * get_explanation
     * check_genre_overlap
     * check_created_by_overlap
     * extract_years
     * filter_genres
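
The helper methods listed under `RecommendationLoader` are only named here, so the following is a guess at what two of them might look like, written as plain functions. The date format (`YYYY-MM-DD`), the comma-separated genre strings, and the signatures are all assumptions.

```python
import re


def extract_years(first_air_date, last_air_date):
    """Return (first_year, last_year) from date strings such as '2008-01-20'.

    A missing or empty date yields None for that position.
    """

    def year_of(date_text):
        match = re.search(r"\b(\d{4})\b", date_text or "")
        return int(match.group(1)) if match else None

    return year_of(first_air_date), year_of(last_air_date)


def check_genre_overlap(genres_a, genres_b):
    """Return the sorted, lower-cased genres shared by two comma-separated strings."""

    def split(genres):
        return {g.strip().lower() for g in genres.split(",") if g.strip()}

    return sorted(split(genres_a) & split(genres_b))


print(extract_years("2008-01-20", "2013-09-29"))
print(check_genre_overlap("Drama, Crime", "crime, Comedy"))
```

Shared genres or matching years like these could feed an explanation of why a show was recommended, which is presumably what `get_explanation` is for.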

### References   
   * https://scikit-learn.org/dev/modules/generated/sklearn.neighbors.NearestNeighbors.html
   * https://scikit-learn.org/1.5/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
   * https://scikit-learn.org/dev/modules/generated/sklearn.preprocessing.StandardScaler.html
   * https://scikit-learn.org/0.16/modules/generated/sklearn.decomposition.TruncatedSVD.html
   * https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.hstack.html
   * https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html
   * https://maartengr.github.io/BERTopic/getting_started/embeddings/embeddings.html

## How to run program

### Before running program
First, extract TMDB_tv_dataset_v3.zip in the dataset folder so that the folder contains TMDB_tv_dataset_v3.csv.
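
If you prefer to do the extraction from Python rather than with a file manager, a small sketch using the standard `zipfile` module could look like this. The function name `extract_dataset` is hypothetical, and the paths assume the archive sits in a `dataset` folder next to the script.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir):
    """Extract every file in zip_path into dest_dir and return the file names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


if __name__ == "__main__":
    # Folder and file names follow the text above; adjust if your layout differs.
    archive_path = Path("dataset") / "TMDB_tv_dataset_v3.zip"
    if archive_path.exists():
        print(extract_dataset(archive_path, "dataset"))
```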

### Running program
Start main.py. It loads the dataset, then asks for a title to base the recommendations on and for how many recommendations you want. Press Enter and the recommendations are shown on screen.