i have POC (non-published) code for moving scrapers to python. while i love my own brainchild (xml based scrapers), there are simply much more human resources out there capable/interested in writing python based scrapers.
i use the current plugin mechanism, so doing lookups is completely done as vfs operations. this has several advantages, the most obvious one being that it's accessible to anything which can access our vfs, i.e. to python add-ons, to anything using json-rpc aso. basically, it's simply queries of the form
plugin://<scraperid>?action=<list>&title=<name>
with some of the calls tailored for tvshows, movies, albums, artists etc.
with one of the "emulators" for our python bindings out there, this approach should also be possible to take outside xbmc. the only con is that it will tie the system somewhat to xbmc internals. personally i don't think this is a big problem, since it's rather general in nature (even though the classes are called xbmc.xxx)
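A minimal sketch of how such a query could be built and taken apart in python. The scraper id (metadata.example) and the action name (find) are made up for illustration; neither comes from the POC code, and only the general plugin://<scraperid>?action=...&title=... shape is taken from the description above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_scraper_query(scraper_id, action, **params):
    """Build a plugin:// lookup URL of the form sketched above."""
    query = urlencode({"action": action, **params})
    return f"plugin://{scraper_id}?{query}"


def parse_scraper_query(url):
    """Split a plugin:// URL back into (scraper_id, params)."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, params


# Hypothetical scraper id and action name, for illustration only.
url = build_scraper_query("metadata.example", "find", title="Lady In Red")
print(url)  # plugin://metadata.example?action=find&title=Lady+In+Red
print(parse_scraper_query(url))
```

Because the lookup is just a URL, anything that can open a vfs path could issue it without knowing how the scraper is implemented.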
spiff
Grumpy Bastard Developer Joined: Nov 2003 Reputation: 82 |
2012-03-26 14:57
Post: #11
Always read the XBMC online-manual, FAQ and search the forum before posting. Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules. For troubleshooting and bug reporting please make sure you read this first. |
jim0thy
Junior Member Posts: 17 Joined: Mar 2010 Reputation: 0 |
2012-03-26 17:43
Post: #12
With regards to XBMC requiring only one path, I imagine there will be a lot of people who have a media collection spanning several volumes and / or drives; would this still allow for multiple sources?
dzan
Junior Member Posts: 39 Joined: Jul 2010 Reputation: 0 |
2012-03-26 18:06
Post: #13
(2012-03-26 17:43)jim0thy Wrote: With regards to XBMC requiring only one path, I imagine there will be a lot of people who have a media collection spanning several volumes and / or drives; would this still allow for multiple sources?

Sure, those people could just enter all those paths. The point isn't so much that the data is in one path; it's that you have to indicate what type of data it is and what scraper to use.

(2012-03-26 14:57)spiff Wrote: i have POC (non-published) code for moving scrapers to python. while i love my own brainchild (xml based scrapers), there are simply much more human resources out there capable/interested in writing python based scrapers.

I'm sure you have the best feel for the community, but I can just say from my own experience that reading the wiki on xml scrapers once was enough to be able to create or adjust one... that's pretty user-friendly. While there certainly are a lot of coders around here, not all of them know python. Another advantage of the xml approach is the fact that it leaves less freedom. This may seem odd, but this way others can easily adapt existing scrapers and know how to read them, and it has some benefits for security (which isn't a real issue for xbmc, I admit). I like the xml scrapers, but if the devs think this will be removed in the future it might be better for me to spend more time on other parts of the project, of course.

(2012-03-26 14:57)spiff Wrote: i use the current plugin mechanism, so doing lookups is completely done as vfs operations. this has several advantages, the most obvious one being that it's accessible to anything which can access our vfs, i.e. to python add-ons, to anything using json-rpc aso. basically, it's simply queries of the form

I haven't seen your code changes of course, but if I understand it correctly they don't remove the need for, or the advantages of, my proposal?

XBMC would still benefit from a total separation of the scraper code into a library, both directly and through the extra options it would leave for developers of third-party software and such. It would also still help in the client/back-end movement. XBMC would then just use the library tied to your changes. Am I right?

I could start separating and adding extra functionality for scrapers, but wait with including XML-scraper support in the library until a decision is made? So the functionality implementing callbacks and classes would be made stand-alone anyway. I could spend more time on the file iteration/detection while waiting for the decision.

Thanks for your feedback and please continue to provide it.
spiff
Grumpy Bastard Developer Joined: Nov 2003 Reputation: 82 |
2012-03-26 18:13
Post: #14
how we obtain the info is, indeed, completely orthogonal to your ideas. which is a hint that supporting both should be doable under one (abstracted) interface.
i can, and will, be much more verbose at a later stage, but right now, it's important that we mentors don't detail the tasks too much. a part of the gsoc idea is that the ideas should come from you
jim0thy
Junior Member Posts: 17 Joined: Mar 2010 Reputation: 0 |
2012-03-26 18:47
Post: #15
(2012-03-26 18:06)dzan Wrote: Sure, those people could just enter all those paths. The point isn't as much that the data is in one path it's that you have to indicate what type of data it is and what scraper to use.

Just thought I'd check. You're right that it can be quite confusing for non-technically minded people to get their sources configured correctly.

What would be nice would be if XBMC could return a list, via an API call, of all the parameters it requires from a scraper. That way a possible scraper-building front end could be developed that would allow the end-user to provide the path to the data source's API, then match the returned values to the relevant XBMC fields. The aim would be to automatically generate the scraper code for new data sources without the end-user having to get their hands dirty with the code. Returning the parameters via an API call would allow the front-end to keep itself in sync with any changes to the XBMC library fields.
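As a rough illustration of the matching step, here is a small Python sketch. Everything in it is an assumption: the REQUIRED_FIELDS list stands in for whatever XBMC would return from the proposed API call (no such call exists in this thread), and the sample record and its key names are invented.

```python
# Hypothetical field list; in the idea above XBMC itself would return this
# via an API call, so a front end would never hard-code it.
REQUIRED_FIELDS = ["title", "year", "plot"]


def check_mapping(mapping, required=REQUIRED_FIELDS):
    """Return the library fields the user has not mapped yet."""
    return [field for field in required if field not in mapping]


def apply_mapping(source_record, mapping):
    """Translate one record from the data source into library fields."""
    return {field: source_record.get(src_key) for field, src_key in mapping.items()}


# User-supplied mapping: library field -> key in the data source's response.
mapping = {"title": "name", "year": "release_year"}
# Invented sample record from an imaginary data source.
record = {"name": "Lady In Red", "release_year": 2000, "rating": 7.1}

missing = check_mapping(mapping)         # fields still to be matched
result = apply_mapping(record, mapping)  # record expressed in library fields
print(missing, result)
```

A front end could refuse to generate the scraper until check_mapping returns an empty list, and would pick up new library fields automatically if the field list came from XBMC.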
Bstrdsmkr
Fan Posts: 711 Joined: Oct 2010 Reputation: 13 |
2012-03-26 21:09
Post: #16
Just wanted to throw my 2 cents in. From an addon dev point of view, no matter how it's implemented, making the scrapers usable by python addon devs would be a huge advantage. There are multiple addons out there (the most complete of which is Eldorado's Metahandlers lib) which have to reinvent the wheel to provide a decent eye candy experience.
Currently, the addons feel like a whole separate experience from local media. I believe the difference between scripts and addons was meant to be that addons emulated local content, and therefore should be as seamless as possible. I think that enabling addons to access the same scrapers used for true local content would go a long way toward making that happen. I admit that I don't fully understand the description and intention of your project, but from what I understand, this seems to be within scope. |
lad1337
Junior Member Posts: 19 Joined: May 2010 Reputation: 0 |
2012-03-27 02:34
Post: #17
(2012-03-26 09:34)dzan Wrote: This seems like a very useful project, again it's mostly up to scraper developers to use it but it might come in handy when working on the modular scraping ( would have to ask around about depending on such a new initiative ). I could write an optional library function accepting the show's name and the desired source which would then return an url from that source instead of having to search the source ( ex. tvdb ) directly. So the first 'block' of xml scrapers could use it and wouldn't need to parse each specific website anymore... Good stuff to think about

If you have any questions about xem or need a feature, feel free to contact me either here or on freenode in #sickbeard or #xem.
Syncopation
Member Joined: Dec 2009 Reputation: 0 |
2012-03-27 12:58
Post: #18
+1 for getting this done in GSoC 2012. More flexible scrapers are needed to provide a much improved meta-data experience. And isn't metadata the thing that makes media centers such a great thing?
OS X 10.8.3, XBMC 12.1, XBMC Remote iOS 1.3 |
dzan
Junior Member Posts: 39 Joined: Jul 2010 Reputation: 0 |
2012-03-27 13:02
Post: #19
(2012-03-26 18:47)jim0thy Wrote: Just thought I'd check.

No problem. Indeed, it's hard for us ( geeks ) to imagine what "regular" people must go through when using some of our stuff sometimes.

(2012-03-26 18:47)jim0thy Wrote: What would be nice would be if XBMC could return a list via an API call of all the parameters it requires from a scraper. That way a possible scraper-building front end could be developed that would allow for the end-user to provide the path to the the data source's API, then match the returned values to the relevant XBMC fields. The aim being to automatically generate the scraper code for new data sources without the end-user having to get their hands dirty with the code.

This is a pretty advanced idea, but I don't see why it wouldn't be possible even with the current xml scrapers. You know how an xml scraper is constructed, so one could write a GUI the way you describe to fill in the blanks in the xml template.

(2012-03-26 21:09)Bstrdsmkr Wrote: Just wanted to throw my 2 cents in. From an addon dev point of view, no matter how it's implemented, making the scrapers usable by python addon devs would be a huge advantage. There are multiple addons out there (the most complete of which is Eldorado's Metahandlers lib) which have to reinvent the wheel to provide a decent eye candy experience.

Thanks for your feedback! I already concluded from a talk with spiff that there are huge benefits to exposing the scraper lib in the addon fashion too. I haven't really looked into this yet, but I'm pretty sure that once everything is separated into a lib it wouldn't be hard to also expose the functionality that way. So I'm carefully saying my work would lead to what you want.

(2012-03-27 02:34)lad1337 Wrote: if you have any questions about xem or need a feature fell free to contact me either here or on freenode in #sickbeard or #xem

Thanks! If I implement some callback to "translate" movie/show IDs in the library using xem, I'll contact you for sure.

(2012-03-27 12:58)Syncopation Wrote: +1 for getting this done in GSoC 2012. More flexible scrapers are needed to provide a much improved meta-data experience. And isn't metadata the thing that makes media centers such a great thing?

Thx!
(This post was last modified: 2012-03-27 14:02 by dzan.)
AnalogKid
Fan Joined: Feb 2009 Reputation: 141 |
2012-04-07 03:19
Post: #20
I've just discovered this thread having raised a 'similar' proposal in the feature suggestions thread (for non developers).
I'm a former software architect for elements of the Symbian operating system, along with being a former developer on a number of handsets for Nokia, SonyEricsson and Motorola. Sadly, like most folks, I stopped coding and went from spending more time drawing UML, then more time in front of 'managers', to becoming a manager. Suffice to say, I don't know the XBMC code, but I'm not a total 'flake' either.

At the risk of sounding patronising, or being too simplistic, I think it's important to get right back to basics:

1) We have media on the user's system, and we 'force' the user to at least classify it as TV, Movie, Music etc. Whether this is entirely necessary is up for debate, but it's how things stand right now.
2) In some cases, we ask the user to go a little further still and help us 'classify' media in more detail by having them stick to SOME semblance of a scheme... e.g. using a filename that might help define the media, or a TV season/episode scheme that regex can parse etc.
3) Based on what we can 'deduce' from 1 and 2 we then try to obtain additional meta information and content (trailers, thumbs, fanart etc).
3a) It SEEMS that XBMC introduces some 'core' functionality that attempts to find some additional meta info and content from the 'local' sources - typically side by side, or within a sub folder of the media itself... (I'm talking about the NFO file, tbn, jpg etc)
3b) If we can't find that 'local' stuff, or we deem there wasn't enough stuff found, or that local stuff explicitly pointed to online resources, we then use 'scrapers' to try and get meta info and content from online resources.

My assertion is this: 3a and 3b are the same thing... 'an attempt to obtain meta info and content'.

The highly abstract concept is that XBMC is saying to a scraper:

PHP Code:
XBMC Scraper Level 42

How 'Scraper' finds its information should be of no concern to XBMC. XBMC only needs to provide the scraper with as much information as it possibly can in order to 'help' the scraper.

It would be a mistake to make assumptions about how Scraper might do its work... so to assume it will search online, and only provide it with simplistic hints such as 'we think the movie is called Lady In Red', isn't enough. We COULD provide it with:

- The media resides at 'C:\my movies\lady in red\man in blue.mp4'
- We THINK the movie title is Lady In Red

This allows the scraper to think for itself!... It can go along with our suggestion of 'lady in red' OR it can attempt to be smarter and opt for 'man in blue'.

With me so far? So here's my particular small suggestion in the grand scheme of things (and I'll comment more on the bigger picture later!)...

***** Make the 'local meta / content' detection a scraper like all other scrapers (putting aside the current limitations of scraper APIs). *****

The fact that that content exists in the local file system as opposed to an online resource simply doesn't matter. The local file system IS a 'scrapeable resource' and should be scraped by a scraper. That means of course that scrapers would have to be able to exercise logic and effectively be 'executable modules' in some way. But it makes sense to me that scrapers have this ability.

Benefits:
- It abstracts metadata retrieval away from core XBMC in a consistent manner through the use of 'scrapers'.
- XBMC makes no assumptions whatsoever about how a scraper deduces the information.
- It allows (theoretically) for an entirely different local NFO / tbn / jpg scheme to be implemented as long as there's a scraper that supports it.
- It moves the NFO / tbn / jpg scanning functionality out of XBMC core and into a scraper.

Cons:
- It's probably a lot of work initially, and a widening of scraper capability.

There's more to come... e.g. a strategy for 'daisy chaining' scrapers so that Meta Data and content can progressively be enhanced (sequentially / via priority). There's even scope for a more complex parallel scrape where multiple sources of Meta Data and content are collated and rationalised.

It's a lot of words, but a simple concept, and I THINK it's in keeping with the OP's line of thinking. If it's way off, I'll gladly drop out and leave you guys to it. I'm at an 'abstract' level... you guys are at a practical level... but there's a chance somewhere in between lies perfection ;-)

I'll possibly come back with the 'daisy chaining' / sequentially scraping stuff later...
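To make the idea a bit more tangible, here is a rough Python sketch of 'everything is a scraper' plus a simple sequential daisy chain. It is only an illustration under stated assumptions: the names Scraper, FilenameScraper, HintScraper and scrape_chain are invented here and are not XBMC APIs, and the two concrete scrapers are trivial stand-ins for a local (NFO / tbn / jpg) scraper and an online one.

```python
from abc import ABC, abstractmethod
from pathlib import PureWindowsPath


class Scraper(ABC):
    """Hypothetical common interface: XBMC hands over what it knows."""

    @abstractmethod
    def scrape(self, path, title_hint):
        """Return a dict of metadata found for the media item (may be empty)."""


class FilenameScraper(Scraper):
    """Stand-in for a 'local' scraper: works only from the path it is given."""

    def scrape(self, path, title_hint):
        return {"title": PureWindowsPath(path).stem.title()}


class HintScraper(Scraper):
    """Stand-in for an online scraper: here it simply trusts XBMC's guess."""

    def scrape(self, path, title_hint):
        return {"title": title_hint, "source": "hint"}


def scrape_chain(scrapers, path, title_hint):
    """Run scrapers in priority order; earlier results win, later ones fill gaps."""
    info = {}
    for scraper in scrapers:
        for key, value in scraper.scrape(path, title_hint).items():
            info.setdefault(key, value)
    return info


info = scrape_chain(
    [FilenameScraper(), HintScraper()],
    r"C:\my movies\lady in red\man in blue.mp4",
    "Lady In Red",
)
print(info)  # {'title': 'Man In Blue', 'source': 'hint'}
```

The caller never needs to know whether a scraper looked at the local file system or went online; swapping the order of the list changes which source takes priority.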
(This post was last modified: 2012-04-07 03:21 by AnalogKid.)