Allow artist scraping from more than one service (music section)
#1
Currently I can choose only one default service in the settings to look up artist info (thumbnail, fanart, info, ...). Allmusic is the default service, but sometimes it does not provide an artist picture, and there is no way to edit an artist entry on allmusic. So it might be good to search on freebase.com or last.fm instead. But then I'd have to switch the default scraping service.

Wouldn't it be smarter to simply let the user set up the services in order, so that the first service is the priority service to check? If certain info is not found on that service, move on to the second entry and check there.

That would be a much more efficient method to get this job done.

Maybe this is already possible (then I'd be very happy about a "how-to" for noobs).

Cheerios,
sync

PS: How can I auto-scan all artists in my library? Having to go through each artist manually is a waste of time. Didn't we build computers to do stupid tasks like that, in the first place?
Reply
#2
How many services do you want? If order takes priority, what happens if the first one contains useless information? How does the user then switch to the next one?

Think through how it would work from a user perspective (what would the UI look like?)
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


Reply
#3
(2012-03-25, 16:42)Syncopation Wrote: PS: How can I auto-scan all artists in my library? Having to go through each artist manually is a waste of time. Didn't we build computers to do stupid tasks like that, in the first place?

Hi. We were talking about this on IRC earlier, and I am happy to say that I found out how to do it.

1. Go to System -> Settings -> Music -> Database
2. Select "Grab additional information when refreshing database" (I am loosely translating from the German UI)
3. Refresh your Database
4. Huh
5. Profit
Reply
#4
@jmarshall: In my case I'd like to have the following services in the following order:

a) allmusic
b) last.fm
c) freebase.com

allmusic has the most solid database, but sometimes artist pictures are missing and I can't edit their database, so I'd use b) or c) for that. The order would be determined by the order I give the scrapers in the settings view (currently not possible). If no artist picture is found, XBMC would move to the second scraper and try to find an artist picture there, and so on. The user would not switch; all of this would be done automatically, simply by using the scraper order as the priority: if fields remain empty, XBMC moves to the next scraper on the list.

Currently I don't really understand the situation. I used allmusic first, and it seems that using more than one scraper is not allowed.

@Horscht: I tried your suggestion like this: I changed the setting you mention and let allmusic run. Many information components were missing. I switched to last.fm and re-ran, but it seems many items that already have partial info get skipped. So it's not really useful.

I'm still unsure how to deal with this, and I think it could be done better by allowing 2 or 3 scrapers. (Doesn't Plex work like this: if information is not found, move to the next service?)
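
The field-by-field fallback described above could be sketched roughly like this in Python (the language XBMC add-ons are written in). This is only an illustration: the `Scraper` type, the field names and the two stand-in scrapers are hypothetical and are not the real XBMC scraper API.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical scraper type: takes an artist name, returns whatever fields it found.
Scraper = Callable[[str], Dict[str, Optional[str]]]

FIELDS = ("biography", "thumbnail", "fanart")  # illustrative field names only


def scrape_artist(artist: str, scrapers: List[Scraper]) -> Dict[str, Optional[str]]:
    """Fill each field from the first scraper (in priority order) that has it."""
    result: Dict[str, Optional[str]] = {field: None for field in FIELDS}
    for scraper in scrapers:
        missing = [f for f in FIELDS if not result[f]]
        if not missing:
            break  # everything is filled, no need to ask lower-priority scrapers
        found = scraper(artist)
        for field in missing:
            if found.get(field):
                result[field] = found[field]
    return result


# Stand-in scrapers with canned data, in place of real allmusic / last.fm lookups.
def fake_allmusic(artist: str) -> Dict[str, Optional[str]]:
    return {"biography": "bio from allmusic", "thumbnail": None}


def fake_lastfm(artist: str) -> Dict[str, Optional[str]]:
    return {"biography": "bio from last.fm", "thumbnail": "thumb from last.fm"}


info = scrape_artist("Some Artist", [fake_allmusic, fake_lastfm])
```

Here the biography stays with the higher-priority scraper, the thumbnail is taken from the second one, and the fanart remains empty because neither stand-in provides it.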
Reply
#5
Hi,

I'm proposing some scraper changes, and one of them would allow just this kind of modularity (I didn't have the fallback option in mind yet, but it would be an easy extension). See here: http://forum.xbmc.org/showthread.php?tid=126365 You can always post your request there too; then I'd include it in my proposal when writing. So it's up to you to convince jmarshall to pick me :p
Reply
#6
Hey Sander!

Thanks for linking this existing thread. I wish I could convince jmarshall of this, but he has not responded to my newest post yet, so I don't know whether there are any counterarguments so far.
Reply
#7
Hi everybody,

actually I'm new to XBMC and I'm thinking of using it as my new media backend. I like XBMC a lot, although I still have some problems understanding how the GUI works ;). I'm coming from the thread "Music Section: Artist Info (how to scrape artist info automatically)" and I just experienced a strange problem when I wanted to add thumbnails for all the artists in my library.

I'm using Eden on OS X 10.6, and my music is stored on a NAS which I access via Samba. Unfortunately, I have some AFP issues that I do not have time to solve at the moment. The music is organized in a folder containing a folder for every artist, each of which contains folders for the albums. I managed (after reinitializing the library and doing some "random" clicking) to get the scrapers to download the album covers. Now I would like to have the fancy artist info and fanart like in http://wiki.xbmc.org/index.php?title=Fil...t.WSCR.jpg So I went into the artist section of the library and opened the context menu. I only had the generic thumbnails, and there was no option to load all artist information. So I clicked "artist information", and then the window to _enter_ the name of the artist opened, with no artist information. Clicking "done" actually put me into an infinite loop: I was brought back to entering the name of the artist. Hitting escape solved the problem. Then I tried changing the scraper (I used the last.fm one), and suddenly, when opening the context menu of an artist, the option to "load _all_ artist information" showed up. Nevertheless, when I clicked that option, only the information for the selected artist was downloaded, and after that the option vanished again. So I changed the scraper back to AllMusic and the option reappeared. I tried this several times and I can reproduce it: I always have to switch the scraper before I can download one new artist thumbnail. When I open the context menu of an artist that already has the information, I get the option to load all artist information, but it doesn't load anything.

Either I'm doing something completely wrong or there is a problem in the way the scrapers work.

Thanks in advance,

Ben
Reply
#8
Hi everyone,

I'm facing the same problem as Meow: I need to switch scrapers every time I want to download a new artist's information. It's very annoying. Is there a solution/fix for this?
Reply
#9
I would put myself in this camp, as well, although I would extend the idea to all the scrapers; it's annoying to change scrapers and rescan, not to mention the cumbersome process of changing each source's settings when you switch to a new preferred scraper.

To answer jmarshall's final question above, here are two different visions for the UX (my apologies for the length, but I'd like to be as complete as possible):

First option: lower flexibility, easier implementation (call this the "FST" (i.e. "first","second","third") option):
1. Consolidate scraper options into a single location (central, not per-source). This is currently done (as of Eden) for music (System->Music->Library->Default ... service), but video scrapers are tied to individual sources.
2. That central location would be a "Metadata Sources" panel in the system settings, with a section for each scraper type. In a Confluence-style skin, this would mean a left-menu containing "Artist Info","Movie Info", etc., with the right-side list being the ordered list of scrapers.
3. Implement the "ordered list" as 3 settings: "First Scraper", "Second Scraper", and "Third Scraper", where each setting is an enum of the installed scrapers of a given type, plus 'none'.
By using the enum and fixed settings for "first", "second", and "third", we keep using all the same widgets, and internal code churn is kept to a minimum.
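
A rough Python sketch of how the three fixed settings could be turned into a priority list. The setting ids (`first_scraper` etc.) and the scraper names are made up for illustration; they are not existing XBMC settings.

```python
from typing import Dict, List

NONE = "none"  # the extra enum entry meaning "slot not used"
SLOTS = ("first_scraper", "second_scraper", "third_scraper")  # hypothetical setting ids


def ordered_scrapers(settings: Dict[str, str], installed: List[str]) -> List[str]:
    """Turn the three fixed settings into a de-duplicated priority list."""
    order: List[str] = []
    for slot in SLOTS:
        choice = settings.get(slot, NONE)
        # Skip empty slots, scrapers that are no longer installed, and repeats.
        if choice != NONE and choice in installed and choice not in order:
            order.append(choice)
    return order


settings = {
    "first_scraper": "allmusic",
    "second_scraper": "none",
    "third_scraper": "last.fm",
}
order = ordered_scrapers(settings, installed=["allmusic", "last.fm", "freebase"])
```

With these example settings the unused second slot is skipped, so the effective order is allmusic followed by last.fm.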

Second option: full flexibility with new setting/widget type (call this the "new-widget" option)
1. As with the previous option, I'd collect the scraper preference settings into a common location under the main settings
2. Add a new setting type "orderedenum" (or something similar) that pops up a list when selected; the ordered list would be populated by the set of currently-installed scrapers of the appropriate type, and the user could move entries up or down the list using the same mechanisms that are in place for the "Now Playing..." window.
By using the new setting/widget type, we allow all scrapers to come into play, and avoid stale settings if a scraper is uninstalled. There would be nontrivial internal (and skin) churn in implementing the new setting type, but it would be a fairly straightforward effort, and the results would be useful in more cases than just prioritized scrapers.
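
The core of such an "orderedenum" setting would be little more than a list with a move-up/move-down operation. A minimal sketch, assuming the setting's value is stored as a plain list of scraper names (the function name and the storage format are assumptions, not existing XBMC code):

```python
from typing import List


def move(order: List[str], name: str, offset: int) -> List[str]:
    """Return a copy of `order` with `name` moved by `offset` positions (-1 = up)."""
    result = list(order)
    old = result.index(name)  # raises ValueError if the scraper is not in the list
    new = max(0, min(len(result) - 1, old + offset))  # clamp at both ends
    result.insert(new, result.pop(old))
    return result


order = ["allmusic", "last.fm", "freebase"]
promoted = move(order, "freebase", -1)
```

Moving an entry that is already at the top (or bottom) further in the same direction leaves the list unchanged, which is roughly how one would expect the widget to behave.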

Now, the use cases:
1. Music
Music hardly changes. Currently the user goes to the main music library settings and configures a default scraper. This behavior would be preserved, except that the user picks a sequence of scrapers rather than a single default scraper.
2. Video
When adding a video source, the user would only select a source type, rather than source type+scraper. This actually simplifies the process of adding or editing a video source, since it removes a chunk of complexity from the edit source window.
To set the scraper preference, the user would go to the system settings instead of editing each source. I believe that this is more intuitive, since this makes video scraper selection the same as music scraper selection, and consistency is _good_ for UX.

Adding/removing scrapers:
1. When installing a scraper, it is immediately added to the lowest-priority slot on the list of that type. For the FST option above, this means adding it to the highest-priority slot that is not yet set (or not at all if all three have been user-selected). For the new-widget option, this means adding it to the bottom of the list.
2. When uninstalling a scraper, it is simply removed from the list. For the FST option, this may mean shifting up (i.e. the second choice becomes the first choice). For the new-widget option, the entry is just removed.
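
The install/uninstall rules for both options could look something like the following sketch. The function names are hypothetical, and the FST slots are modelled as a three-element list with `None` standing for 'none'.

```python
from typing import List, Optional

Slots = List[Optional[str]]  # exactly three entries; None means "none"


def on_install_fst(slots: Slots, scraper: str) -> Slots:
    """FST option: put the new scraper into the first unset slot, if there is one."""
    result = list(slots)
    if scraper not in result and None in result:
        result[result.index(None)] = scraper
    return result


def on_uninstall_fst(slots: Slots, scraper: str) -> Slots:
    """FST option: drop the scraper and shift lower choices up, padding with None.

    Note that this also closes any gap left by an unset slot.
    """
    kept = [s for s in slots if s is not None and s != scraper]
    return kept + [None] * (len(slots) - len(kept))


def on_install_list(order: List[str], scraper: str) -> List[str]:
    """New-widget option: append the new scraper at the bottom of the list."""
    return list(order) if scraper in order else list(order) + [scraper]


def on_uninstall_list(order: List[str], scraper: str) -> List[str]:
    """New-widget option: simply remove the entry."""
    return [s for s in order if s != scraper]
```

For example, uninstalling the first choice from `["allmusic", "last.fm", None]` would leave `["last.fm", None, None]`, i.e. the second choice becomes the first.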

Scraping data:
When scraping data, the system would do much like it does now -- pick the preferred scraper and attempt to fetch the data. The difference would be how it handles a "not found" failure -- previously it gave up, but now it would go to the next-most-preferred scraper and repeat the attempt.
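
In code, the changed handling of a "not found" failure might look roughly like this. The `NotFound` exception and the scraper callables are hypothetical stand-ins, since the real scraper interface is not shown in this thread.

```python
from typing import Callable, Dict, List, Tuple


class NotFound(Exception):
    """Hypothetical error a scraper raises when it has no entry for the item."""


Scraper = Callable[[str], Dict[str, str]]


def fetch(title: str, scrapers: List[Tuple[str, Scraper]]) -> Tuple[str, Dict[str, str]]:
    """Try each scraper in preference order; return (scraper name, data) from the first hit."""
    for name, scraper in scrapers:
        try:
            return name, scraper(title)
        except NotFound:
            continue  # previously the lookup stopped here; now the next scraper is tried
    raise NotFound(title)  # every scraper in the list came up empty


# Stand-in scrapers: the first never finds anything, the second always does.
def fake_first(title: str) -> Dict[str, str]:
    raise NotFound(title)


def fake_second(title: str) -> Dict[str, str]:
    return {"title": title, "plot": "plot from the second scraper"}


source, data = fetch("Some Movie", [("first", fake_first), ("second", fake_second)])
```

Only when every scraper on the list fails does the lookup give up, which matches the current behaviour for a list of length one.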

To answer a few more of the original questions:
Q1. "How many services?"
A1. I'd say at least 3, although ideally it would be "however many are installed"
Q2. "If order takes priority, what happens if the first one contains useless information?"
A2. If it contains useless information, you get that useless information. This is how it works now. If you don't trust a scraper's data, put it lower in the priority list.
Q3. "How does the user then switch to the next one?"
A3. The user doesn't. If a scraper fails (can't contact server, entry not found, etc), the system falls through to the next scraper in the list. The user expresses a preference in the settings, and the system implements those preferences.

There are a number of other changes that could be implemented (e.g. merged data from scrapers), but a simple failover across a user-prioritized list would seem to satisfy the bulk of the need.
Reply
#10
The only problem there is that tv source "A" needs scraper 1 whereas tv source B will crap out badly with scraper 1 but will do awesome with scraper 2. Prioritising 1 over 2 means tv source B has bad data. Thus, the per-source selection. Now, it may well be that this can be solved by having a more refined source type (tvshow split into a bunch of different genre types for example), or by allowing the source to be tagged in some way (or allowing a different order of scraping for that source...)
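
One way the "different order of scraping for that source" idea could sit next to a system-wide order is a simple per-source override table. This is only a sketch; the source names, scraper names and the `scraper_order` helper are invented for illustration.

```python
from typing import Dict, List


def scraper_order(
    source: str, default_order: List[str], overrides: Dict[str, List[str]]
) -> List[str]:
    """Use the source's own order if one is configured, else the system-wide order.

    An empty override list is treated the same as no override.
    """
    return overrides.get(source) or default_order


default_order = ["scraper 1", "scraper 2"]
overrides = {"tv source B": ["scraper 2", "scraper 1"]}

order_a = scraper_order("tv source A", default_order, overrides)
order_b = scraper_order("tv source B", default_order, overrides)
```

Source A keeps the system-wide order, while source B tries scraper 2 first.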

Hopefully some of this will be addressed with topfs2's GSoC this summer, by allowing both better chaining/multiple runs of scrapers, and allowing scrapers to feed back accuracy information.

Cheers,
Jonathan


Reply
#11
Jonathan,

Fair enough, if by "crap out badly" you mean "has info, but it's wrong or isn't very good" -- is that a common case? It usually seems to be a completeness (rather than quality) issue for me (so I'd love to see a system default even if there are per-source overrides), but I'd be a fool to define the world by my own limited experience. Now that I think about it, you might be more network-efficient even in the completeness case if you knew that a particular scraper was more likely to have the data associated with a particular source.

Do you think that this same paradigm might apply to the music side of things as well? I've never really thought about splitting music across multiple sources, but I could certainly imagine people doing it...

I do really think that some kind of user-prioritized scraper chaining (or merging) makes a heck of a lot of sense, though, so I'll happily wait to see what comes out of GSoC...

Keep up the good work, and thanks for responding so quickly!

Reply
#12
It's both basically - completeness is no problem to take care of (as you can simply replace it with info from the next in the chain), but rubbish data (or data that is not very good) is also an issue. A simple example would be themoviedb's ratings vs imdb's ratings (or rotten tomatoes ratings etc) where all have the data, but some are better (to the user at least) than others.

If we could get each scrape source to give a rating of what is decent data and what isn't, then we could prioritise individual bits of data from each scraper reasonably well. Of course, user config will always be required, but hopefully it won't need to be too fancy (or everything will work well enough for most users that they don't need to care about it.)
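
As a rough illustration of prioritising individual bits of data, assume each scraper returns a value plus a quality score between 0 and 1 for every field. Both the scores and the data layout are assumptions made for this sketch; no scraper is known to report such ratings today.

```python
from typing import Dict, Tuple

# scraper name -> field -> (value, quality score in [0, 1]); scores are hypothetical
Scraped = Dict[str, Dict[str, Tuple[str, float]]]


def pick_best(scraped: Scraped) -> Dict[str, Tuple[str, str]]:
    """For each field, keep (value, scraper) from the scraper reporting the highest quality.

    On equal scores the scraper listed first wins.
    """
    best: Dict[str, Tuple[str, float, str]] = {}
    for scraper, fields in scraped.items():
        for field, (value, quality) in fields.items():
            if field not in best or quality > best[field][1]:
                best[field] = (value, quality, scraper)
    return {field: (value, scraper) for field, (value, _quality, scraper) in best.items()}


scraped = {
    "themoviedb": {"rating": ("7.1", 0.6), "plot": ("plot A", 0.9)},
    "imdb": {"rating": ("7.4", 0.8)},
}
chosen = pick_best(scraped)
```

With these made-up scores, the rating would come from imdb and the plot from themoviedb.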

Cheers,
Jonathan


Reply
#13
True, some kind of quality metric makes sense. Thinking about implementation, though -- how would you (the general, plural, "you", here) envision the quality rating info for scraper data to be acquired? There's no way under the blue sky that all of the data providers will add that kind of meta-metadata. I guess you could break down the metadata into subclasses and have community or personal quality ratings for that (e.g. scraper A has good ratings, while scraper B has good thumbnails, etc.)

There might be a way to get date/time information for the metadata -- perhaps equating "newer" with "better" is a reasonable rule of thumb...
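
The "newer is better" rule of thumb is easy to express, assuming each provider exposes some last-modified time for its data (which is itself an assumption; the `Candidate` type below is made up for this sketch):

```python
from datetime import datetime
from typing import List, NamedTuple


class Candidate(NamedTuple):
    scraper: str
    value: str
    updated: datetime  # assumes the provider exposes a last-modified time at all


def newest(candidates: List[Candidate]) -> Candidate:
    """Pick the most recently updated value; ties go to the earlier list entry."""
    return max(candidates, key=lambda c: c.updated)


candidates = [
    Candidate("scraper A", "old bio", datetime(2011, 1, 1)),
    Candidate("scraper B", "new bio", datetime(2012, 5, 1)),
]
winner = newest(candidates)
```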

Maybe the core doesn't really need to worry about it at all, though. Between CU Lyrics going to chained sources and the Universal Scraper that's all over the forums, it looks like the addons themselves are taking up the charge and handling chaining, merging, etc. I get a little worried about the addons doing that, though, as it generates a more fragmented interface, so I'd like to see something from the core.
Reply
#14
(2012-06-04, 15:37)PartialGestalt Wrote: True, some kind of quality metric makes sense. Thinking about implementation, though -- how would you (the general, plural, "you", here) envision the quality rating info for scraper data to be acquired? There's no way under the blue sky that all of the data providers will add that kind of meta-metadata. I guess you could break down the metadata into subclasses and have community or personal quality ratings for that (e.g. scraper A has good ratings, while scraper B has good thumbnails, etc.)

There might be a way to get date/time information for the metadata -- perhaps equating "newer" with "better" is a reasonable rule of thumb...

Maybe the core doesn't really need to worry about it at all, though. Between CU Lyrics going to chained sources and the Universal Scraper that's all over the forums, it looks like the addons themselves are taking up the charge and handling chaining, merging, etc. I get a little worried about the addons doing that, though, as it generates a more fragmented interface, so I'd like to see something from the core.

Instead of thinking about how to use multiple scrapers, why not just improve the content of the source sites?
If everyone just does their part in adding the missing info, everyone can benefit.

Besides that, there are some new scrapers available that let you choose the order of the scraper info. This is what you want. However, I still ask everyone to do their part in adding info.
Reply
#15
(2012-06-04, 15:40)Martijn Wrote: Instead of thinking about how to use multiple scrapers, why not just improve the content of the source sites?
If everyone just does their part in adding the missing info, everyone can benefit.
I agree wholeheartedly. It is undoubtedly true that everyone benefits from improved and updated metadata, and we should all update the sources we use with the information that we know. But we can't rely on that to fully solve the problem, since it would require every metadata source to have all of the data at the same quality as every other source. Most folks aren't going to add their updates to IMDB and TMDB and other favorite movie sites, etc.; they won't update MusicBrainz and AllMusic and Last.FM. They'll pick their favorite and add the data there. The ideal is that everyone adds the info, and that "everyone" is a large enough set that it'll work perfectly for whatever site is your favorite. The reality is, and probably always will be, that the available metadata from any single site is not as good as the union of data gleaned from chained or merged accesses to multiple sites.

(2012-06-04, 15:40)Martijn Wrote: Besides that, there are some new scrapers available that let you choose the order of the scraper info. This is what you want. However, I still ask everyone to do their part in adding info.
Yep, and that's what I was talking about in the previous post. The presence of this kind of aggregate data scraper really only proves the point that users want to be able to get combined data. Those scraper authors are fantastic and talented, and I have nothing but appreciation for their time and efforts and results, and if they end up being sufficient, then wonderful -- we all win. It seems to me like the overall user experience would be better if that kind of data aggregation was handled in the core, though, since the config and selection UI would be consistent across the data types. It would avoid a situation where one datatype is purely priority-based while another is fully merged, while another is handled some other way, etc.
Reply
