v21 file names with dots only as separators result in incorrect search expressions
#1
Video 
Kodi 21.0, probably also Kodi 19
At least I am pretty sure that the issue did not exist with Matrix (which I was running until 04-2024).

I have been using my own scraper for a long time to read XML tags from MKV files, but recently I found that some files fail to scrape.
Further investigation showed that in the failing cases the filename (which is the movie name) is not delivered correctly to the scraper.
If the filename uses only dots as separators (like M.A.S.H.mkv), Kodi replaces the dots with spaces - see log extract.

log:
2024-05-06 10:06:47.852 T:18968   debug <general>: ADDON::CScraper::FindMovie: Searching for 'M A S H' using MKV-TAGs scraper (path: ...
2024-05-06 10:06:47.990 T:18968   debug <general>: CScraperUrl::Get: Using "UTF-8" charset for HTML "http://httpdi.vnet.de/movies/read_mkv_details.php?title=M%20A%20S%20H"
2024-05-06 10:06:47.990 T:18968   error <general>: ADDON::CScraper::Run: Unable to parse web site
2024-05-06 10:06:53.212 T:18968 warning <general>: No information found for item 'nfs://san.vnet.de/movies/M.A.S.H.mkv', it won't be added to the library.

The search expression 'M A S H' does not match the file name 'M.A.S.H' and so the file cannot be found by the scraper.

BUT
as soon as the filename also contains spaces (other than trailing spaces), the search expression is correct - see log extract.

log:
2024-05-06 10:06:46.753 T:18968   debug <general>: ADDON::CScraper::FindMovie: Searching for 'M.A.R.K. 13 - Hardware' using MKV-TAGs scraper (path: ...
2024-05-06 10:06:46.935 T:18968   debug <general>: CScraperUrl::Get: Using "UTF-8" charset for HTML "http://httpdi.vnet.de/movies/read_mkv_details.php?title=M.A.R.K.%2013%20-%20Hardware"
2024-05-06 10:06:46.935 T:18968   debug <general>: scraper: GetSearchResults returned <?xml version="1.0" encoding="utf-8" standalone="yes"?><results>
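
To make the difference between the two cases easier to compare, here is a small Python sketch that rebuilds the two request URLs from the search expressions shown in the logs. The helper name `build_query` is hypothetical and only mirrors the URL shape visible in the log lines; it is not part of Kodi or of the scraper.

```python
from urllib.parse import quote

# Base URL copied from the log lines above.
BASE_URL = "http://httpdi.vnet.de/movies/read_mkv_details.php"

def build_query(title: str) -> str:
    """Hypothetical helper: percent-encode a search title as seen in the log."""
    # quote() leaves '.' and '-' untouched and turns each space into %20.
    return f"{BASE_URL}?title={quote(title)}"

# Search expression from the failing case (dots already replaced by spaces).
print(build_query("M A S H"))
# Search expression from the working case (dots preserved).
print(build_query("M.A.R.K. 13 - Hardware"))
```

The encoding step keeps the dots in both cases, so the dots in 'M.A.S.H' must already be gone before the URL is built.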

My settings in advancedsettings.xml:
<video>
  <cleanstrings>
   <regexp>(.*)</regexp>
  </cleanstrings>
  <cleandatetime>(.*)</cleandatetime>
</video>

I tried to dig through the <cleanstrings> description, but did not find a way to change this behaviour.
Any ideas?
Thanks
Gregor
#2
From the documentation I read that '.' and '_' are considered separators, like ' '. But it does not say that such separators are replaced by spaces. In fact, Kodi only replaces '.' if there are no other separators in the filename. I don't think this is intended behaviour.
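
The behaviour I see in the logs can be written down as a short Python sketch. This is only my reading of the two log extracts, not Kodi's actual code; the function name `observed_search_title` is made up for illustration, and it covers only the '.' case because that is the only one the logs show.

```python
def observed_search_title(stem: str) -> str:
    """Mimic the behaviour seen in the logs (an assumption, not Kodi's code).

    `stem` is the filename without its extension, e.g. 'M.A.S.H'.
    """
    # If the name already contains a space, the dots are left alone.
    if " " in stem:
        return stem
    # Otherwise every dot is treated as a separator and becomes a space.
    return stem.replace(".", " ")

print(observed_search_title("M.A.S.H"))
print(observed_search_title("M.A.R.K. 13 - Hardware"))
```

With this rule, 'M.A.S.H' turns into 'M A S H', while 'M.A.R.K. 13 - Hardware' is passed through unchanged, which matches both log extracts.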