2012-02-26, 19:03
I have noticed since moving to SQL that loading the movie wall is very slow. I have about 550 movies and it takes about 8sec. Not great but not horrible. Primary motivation is that I also have an ATV2 on 802.11n 5Ghz where it takes MUCH longer. Almost 30sec. I can imagine for larger datasets, music or larger movie libraries, things could take much longer.
To preface, I have very little skills(almost none) in either SQL or programing so please forgive if I have misinterpreted some of the payload data. What I do have a great deal of skill in is network analysis. I run a Cisco Datacenter network. I have undertaken this effort in hopes that it would encourage the XBMC devs to pick off the low hanging fruit to speed up XBMC's use of SQL and SMB for thumbs. Any gains in efficiency I would hope would have a magnified effect on ATV2 for operations that are not CPU bound.
Test were performed on the following setup.
XBMC Eden beta 3 for OSX, PM3.HD view=thunmbnail, show sets, no hide watched
XBMC Desktop Q6600 @ 3ghz, 4GB RAM, 256GB SSD
SQL server Q6600 @ 3ghz, 8GB RAM, 128GB SSD
Gbit LAN, no jumbo
Procedure: start packet cap for only TCP port 3306, click movies, stop packet cap as fast as possible once movie wall loads. There are still packets going by but I think I have nailed down the end of the packet stream needed to load the movie wall. I also did a few caps with SMB included and comments related to such but packet numbers and timestamps are for cap that only includes SQL.
There is some routine that gets called a lot. Seems to happen every time a SQL connection is opened. Action seems to take about 39 packets and about .015 seconds. While this is not a lot of packets or time keep in mind it is repeated MANY MANY times.
TCP 3WAY
Login Request user=xbmc
Command unknown 66
SET NAMES utf8
Command Ping 14
show databases like 'xbmc_video60'
show tables // response is error as we have not issued USE yet
Use Database xbmc_video60
Command Ping 14
show databases like 'xbmc_video60'
show tables
SELECT idVersion FROM version\n
sometimes do more stuff, sometimes just close connection.
Part 1 // first run at our fav routine
start packet 1
start time 0.00000
Do our fav 39 packet routine and close connection. Nothing else happens here.
end packet 39
end timestamp 0.014939
Part 2 // grab some counts
start packet 40
start timestamp 0.017022
do our fav routine
ask for "select count(1) from movie"
ask for "select count(1) from tvshow"
ask for "select count(1) from musicvideo"
issue quit and close connection
end packet 90
end timestamp 0.039172
Part 3 // part 2 was fun, let's do it again
start packet 91
start timestamp 0.039198
do our fav routine
ask for exact same things as in part 2
quit
close connection
end packet 142
end timestamp 0.136162
big delta in time in for part 2 compared to part 3 is TCP 3WAY takes longer this time(~80ms - server was likely busy). Not XBMC's problem other than why did we do part 3 at all as it's identical to part 2.
Part 4 // find info needed for sets
start packet 143
start timestamp 0.136199
do our fav routine
"SELECT sets.idSet,sets.strSet,COUNT(1) AS c,count(files.playCount) FROM sets JOIN setlinkmovie ON sets.idSet=setlinkmovie.idSet JOIN movie ON setlinkmovie.idMovie=movie.idMovie JOIN files ON files.idFile=movie.idFile GROUP BY sets.idSet HAVING" // truncated by wireshark
select strPath, strFileName from movieview join setlinkmovie on setlinkmovie.idMovie=movieview.idMovie where setlinkmovie.idSet=1
repeat for all movie sets one at a time. it seems to take XBMC 3X as long each time to think of it's next query at SQL server does to answer it.
end packet 236
end timestamp 0.356115
connection still open
Part 5 // bulk of info needed for the wall
start packet 237
start timestamp 0.362995
select * from movieview WHERE movieview.idMovie NOT IN (SELECT idMovie FROM setlinkmovie s1 JOIN(SELECT idSet, COUNT(1) AS c FROM setlinkmovie GROUP BY idSet HAVING c>1) s2 ON s2.idSet=s1.idSet) // Query something along the lines of ask for a dump of info(title, year, plot, file path, etc) for all movies not in a set.
lots of 1500 byte packets as SQL server sends over answer and TCP acks going back
connection still open
end packet 967
end time 0.429138
Part 6 // not really sure why we need stream info to show the wall
start packet 968
start timestamp 0.437753
ask for "SELECT * FROM streamdetails WHERE idFile = <ID>"
we walk through each and every ID as a single query, response, TCP ACK
If also using SMB for thumbs we will issue QUERY_PATH_INFO for each thumb path before moving on to next SQL query. Though I haven't figured out where the thumb path is as I do not see it in any of the SQL responses.
end packet 2329
end timestamp 3.174321
From here on out it appears to be all about loading thumbs.
Part 7 // doing something like file system stat for thumb files…?
There appears to be some inefficiencies here as well as we do a LOT of one by one QUERY_PATH_INFO. There appears to be possible improvement here as well but it's hard for me to be sure as SMB is a much more complicated protocol on the network side than SQL.
takes about 3 sec.
Part 8 // get the thumbs themselves
"Open AndX Request" for the thumbs
takes about 1 sec.
Part 9 // This is only included to show that there are times when XBMC is even MORE inefficient.
Ask for "select timeInSeconds, totalTimeInSeconds from bookmark where idFile=<each ID one at a time> and type=1 order by timeInSeconds"
This appears to be the part where XBMC is calculating the total duration of your media.
In part 9 unlike part 6 instead of just doing each Query one at a time. We open a new connection, run through our fav oft repeated routine, do a query, close the connection, open a new connection again, rinse, wash, repeat. Amazingly inefficient!!
As you can see there are two parts, 6 and 7, which take up the bulk of our time(~90% - half SQL, half SMB) and these appear to be to be spent doing lots of one by one transactions. I'm not even sure 6 and 7 are strictly needed to load the wall…? From a network point of view this is a very inefficient way to go. These kinds of things do not scale well to much larger datasets or, not that it matters on a wired LAN, to networks with higher latency between client and server(wireless).
To preface, I have very little skills(almost none) in either SQL or programing so please forgive if I have misinterpreted some of the payload data. What I do have a great deal of skill in is network analysis. I run a Cisco Datacenter network. I have undertaken this effort in hopes that it would encourage the XBMC devs to pick off the low hanging fruit to speed up XBMC's use of SQL and SMB for thumbs. Any gains in efficiency I would hope would have a magnified effect on ATV2 for operations that are not CPU bound.
Test were performed on the following setup.
XBMC Eden beta 3 for OSX, PM3.HD view=thunmbnail, show sets, no hide watched
XBMC Desktop Q6600 @ 3ghz, 4GB RAM, 256GB SSD
SQL server Q6600 @ 3ghz, 8GB RAM, 128GB SSD
Gbit LAN, no jumbo
Procedure: start packet cap for only TCP port 3306, click movies, stop packet cap as fast as possible once movie wall loads. There are still packets going by but I think I have nailed down the end of the packet stream needed to load the movie wall. I also did a few caps with SMB included and comments related to such but packet numbers and timestamps are for cap that only includes SQL.
There is some routine that gets called a lot. Seems to happen every time a SQL connection is opened. Action seems to take about 39 packets and about .015 seconds. While this is not a lot of packets or time keep in mind it is repeated MANY MANY times.
TCP 3WAY
Login Request user=xbmc
Command unknown 66
SET NAMES utf8
Command Ping 14
show databases like 'xbmc_video60'
show tables // response is error as we have not issued USE yet
Use Database xbmc_video60
Command Ping 14
show databases like 'xbmc_video60'
show tables
SELECT idVersion FROM version\n
sometimes do more stuff, sometimes just close connection.
Part 1 // first run at our fav routine
start packet 1
start time 0.00000
Do our fav 39 packet routine and close connection. Nothing else happens here.
end packet 39
end timestamp 0.014939
Part 2 // grab some counts
start packet 40
start timestamp 0.017022
do our fav routine
ask for "select count(1) from movie"
ask for "select count(1) from tvshow"
ask for "select count(1) from musicvideo"
issue quit and close connection
end packet 90
end timestamp 0.039172
Part 3 // part 2 was fun, let's do it again
start packet 91
start timestamp 0.039198
do our fav routine
ask for exact same things as in part 2
quit
close connection
end packet 142
end timestamp 0.136162
big delta in time in for part 2 compared to part 3 is TCP 3WAY takes longer this time(~80ms - server was likely busy). Not XBMC's problem other than why did we do part 3 at all as it's identical to part 2.
Part 4 // find info needed for sets
start packet 143
start timestamp 0.136199
do our fav routine
"SELECT sets.idSet,sets.strSet,COUNT(1) AS c,count(files.playCount) FROM sets JOIN setlinkmovie ON sets.idSet=setlinkmovie.idSet JOIN movie ON setlinkmovie.idMovie=movie.idMovie JOIN files ON files.idFile=movie.idFile GROUP BY sets.idSet HAVING" // truncated by wireshark
select strPath, strFileName from movieview join setlinkmovie on setlinkmovie.idMovie=movieview.idMovie where setlinkmovie.idSet=1
repeat for all movie sets one at a time. it seems to take XBMC 3X as long each time to think of it's next query at SQL server does to answer it.
end packet 236
end timestamp 0.356115
connection still open
Part 5 // bulk of info needed for the wall
start packet 237
start timestamp 0.362995
select * from movieview WHERE movieview.idMovie NOT IN (SELECT idMovie FROM setlinkmovie s1 JOIN(SELECT idSet, COUNT(1) AS c FROM setlinkmovie GROUP BY idSet HAVING c>1) s2 ON s2.idSet=s1.idSet) // Query something along the lines of ask for a dump of info(title, year, plot, file path, etc) for all movies not in a set.
lots of 1500 byte packets as SQL server sends over answer and TCP acks going back
connection still open
end packet 967
end time 0.429138
Part 6 // not really sure why we need stream info to show the wall
start packet 968
start timestamp 0.437753
ask for "SELECT * FROM streamdetails WHERE idFile = <ID>"
we walk through each and every ID as a single query, response, TCP ACK
If also using SMB for thumbs we will issue QUERY_PATH_INFO for each thumb path before moving on to next SQL query. Though I haven't figured out where the thumb path is as I do not see it in any of the SQL responses.
end packet 2329
end timestamp 3.174321
From here on out it appears to be all about loading thumbs.
Part 7 // doing something like file system stat for thumb files…?
There appears to be some inefficiencies here as well as we do a LOT of one by one QUERY_PATH_INFO. There appears to be possible improvement here as well but it's hard for me to be sure as SMB is a much more complicated protocol on the network side than SQL.
takes about 3 sec.
Part 8 // get the thumbs themselves
"Open AndX Request" for the thumbs
takes about 1 sec.
Part 9 // This is only included to show that there are times when XBMC is even MORE inefficient.
Ask for "select timeInSeconds, totalTimeInSeconds from bookmark where idFile=<each ID one at a time> and type=1 order by timeInSeconds"
This appears to be the part where XBMC is calculating the total duration of your media.
In part 9 unlike part 6 instead of just doing each Query one at a time. We open a new connection, run through our fav oft repeated routine, do a query, close the connection, open a new connection again, rinse, wash, repeat. Amazingly inefficient!!
As you can see there are two parts, 6 and 7, which take up the bulk of our time(~90% - half SQL, half SMB) and these appear to be to be spent doing lots of one by one transactions. I'm not even sure 6 and 7 are strictly needed to load the wall…? From a network point of view this is a very inefficient way to go. These kinds of things do not scale well to much larger datasets or, not that it matters on a wired LAN, to networks with higher latency between client and server(wireless).