2.8 KiB
Supervised Learning - TV-Show recommender
How to run program
Before running program First thing to do is to extract TMDB_tv_dataset_v3.zip in dataset folder so that it contains TMDB_tv_dataset_v3.csv.
Running program Start main.py and it will load dataset and ask for a title to get recommendations from, also how many recommendations wanted. Then enter and you will have those recommendations presented on screen.
Specification
TV-Show recommender This program will recommend you what tv-show to view based on what you like. You will tell what tv-show you like and how many recommendations wanted, then you will get that amount of recommendations of tv-shows in order of rank from your search.
Data Source:
I will use a dataset from TMBD
https://www.kaggle.com/datasets/asaniczka/full-tmdb-tv-shows-dataset-2023-150k-shows
Model:
I will use NearestNeighbors (NN) alhorithm together with K-NearestNeighbors alhorithm.
Features:
- Load data from dataset and preprocessing.
- Model training with NN & k-NN algorithm.
- User input
- Recommendations
Requirements:
- Title data:
- Title
- Genres
- First/last air date
- Vote count/average
- Director
- Description
- Networks
- Spoken languages
- Number of seasons/episodes
- User data:
- What Movie / TV-Show prefers
- Number of recommendations wanted
Libraries
- pandas: Data manipulation and analysis
- scikit-learn: machine learning algorithms and preprocessing
- scipy: A scientific computing package for Python
- time: provides various functions for working with time
- os: functions for interacting with the operating system
- re: provides regular expression support
- textwrap: Text wrapping and filling
Classes
- LoadData
- load_data
- read_data
- clean_data
- ImportData
- load_dataset
- create_data
- clean_data
- save_data
- TrainModel
- train
- recommend
- preprocess_title_data
- preprocess_target_data
- UserData
- input
- n_recommendations
- RecommendationLoader
- run
- get_recommendations
- display_recommendations
- get_explanation
- check_genre_overlap
- check_created_by_overlap
- extract_years
- filter_genres
References
- https://scikit-learn.org/dev/modules/generated/sklearn.neighbors.NearestNeighbors.html
- https://scikit-learn.org/1.5/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
- https://scikit-learn.org/dev/modules/generated/sklearn.preprocessing.StandardScaler.html
- https://scikit-learn.org/0.16/modules/generated/sklearn.decomposition.TruncatedSVD.html
- https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.hstack.html
- https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html