2020-09-08, 15:09
Hi,
I am developing a scraper to find all movies with a specific actor that are not yet in the library. Example: you liked a movie with Jessica Alba, you start watching other movies with Jessica Alba, and when you have finished them you would like to know which other movies with her you don't have in your library yet.
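To make the goal concrete, here is a minimal sketch of the comparison I have in mind. It assumes every movie, both in the actor's filmography and in the library, is identified by an IMDB ID. The function name, the dictionary keys and the sample IDs and titles are made up for illustration and are not part of Kodi or any scraper API.

```python
def missing_movies(filmography, library):
    """Return the filmography entries whose IMDB ID is not in the library."""
    owned_ids = {movie["imdb_id"] for movie in library}
    return [movie for movie in filmography if movie["imdb_id"] not in owned_ids]


# Hypothetical sample data: placeholder IDs and titles, not real movies.
filmography = [
    {"imdb_id": "tt0000001", "title": "Example Movie A"},
    {"imdb_id": "tt0000002", "title": "Example Movie B"},
    {"imdb_id": "tt0000003", "title": "Example Movie C"},
]
library = [
    {"imdb_id": "tt0000002", "title": "Example Movie B"},
]

missing = missing_movies(filmography, library)
missing_titles = [movie["title"] for movie in missing]
```

This only works if the filmography can be fetched for the right person, which is why I need a reliable ID for each actor.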
In the database, the actor has only 3 fields:
- id (internal index)
- name
- thumb url
There is no link in the "uniqueid" table to an IMDB/TMDB ID either.
I realise that at the moment of scraping the API returns the IMDB ID. But when a new movie is scraped, how does Kodi know whether its actors are already in the system (and if so, which entry each one is) or whether it has to create a new one?
So, imagine one of the actors in the movie is Chris Evans, namely this one: https://www.imdb.com/name/nm0262635/?ref_=fn_al_nm_1 .
Then how does Kodi know which Chris Evans in the database is the correct one? How does Kodi choose between these four?
https://www.imdb.com/name/nm0262635/?ref_=fn_al_nm_1
https://www.imdb.com/name/nm0262632/?ref_=fn_al_nm_2
https://www.imdb.com/name/nm7429569/?ref_=fn_al_nm_3
https://www.imdb.com/name/nm4087470/?ref_=fn_al_nm_5
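To illustrate my concern, here is a small sketch of what I would expect to happen if actors were matched by name alone. This is not Kodi's actual code or schema: the table layout simply mirrors the three fields listed above, and `find_or_create_actor` is a hypothetical helper. I do not know whether Kodi really works this way, which is exactly my question.

```python
import sqlite3

# In-memory table with only the three fields described above (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT, thumb TEXT)")


def find_or_create_actor(conn, name, thumb=""):
    """Hypothetical lookup: reuse the row with the same name, else insert one."""
    row = conn.execute("SELECT id FROM actor WHERE name = ?", (name,)).fetchone()
    if row is not None:
        return row[0]
    cursor = conn.execute(
        "INSERT INTO actor (name, thumb) VALUES (?, ?)", (name, thumb)
    )
    return cursor.lastrowid


# Two different people on IMDB (nm0262635 and nm0262632) share the same name.
first_id = find_or_create_actor(conn, "Chris Evans")
second_id = find_or_create_actor(conn, "Chris Evans")
actor_count = conn.execute("SELECT COUNT(*) FROM actor").fetchone()[0]
```

With a name-only lookup, both calls return the same id and the table ends up with a single "Chris Evans" row, so the two people could no longer be told apart.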