[RFC] Database Abstraction Proposal - Request for Comments (Devs and DB Admins only!)

  Thread Rating:
  • 0 Votes - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Post Reply
firnsy Offline
Team-XBMC Developer
Posts: 104
Joined: Jul 2009
Reputation: 0
Location: Australia
Lightbulb  [RFC] Database Abstraction Proposal - Request for Comments (Devs and DB Admins only!) Post: #1
Pre-reading
The purpose of this new thread is to focus the discussion, garner support, and tweak the outlined draft proposal for the next iteration of XBMC database support.

I'll keep this first post as the WIP until the majority are content. This first post may be better facilitated by a wiki page but I'll leave that decision to the XBMC team.

Proposals are inline, in BLUE. These proposals are designed to become part of the official documentation as appropriate.

Ideally it would be great to get input from as many as possible to ensure we get it right first time round and that it is scalable from the word go.

Background
The current implementation utilises a separate database for each of Video, Music and Program information. Each database schema shares some similarities in structure and naming convention but do not conform to a strict standard.

A visual representation of each schema is provided in the attachments.

The database backend implementation is based on SQLite3 (SQLite version 3). SQLite[1] is an embedded SQL database engine, in-process library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine.

Unlike most other SQL databases, SQLite does not have a separate server process. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file.

Moving Forward
The aim of this document is to standardise the design and structure of the database to facilitate the future scaling of XBMC in both a standalone configuration and when networked.

The current SQLite implementation stores the database information locally only. Each user profile has there own databases stored in userdata://Database

This works well for single XBMC setups, however, it does not scale well with multiple XBMC setups that utilise a single media repository. In the latter case each XBMC instance requires a dedicated database to store and retrieve information. Currently, to have the same information across all instances requires the databases to be manually replicated.

A solution to this problem is for XBMC to communicate to a variety of database backends; locally and remotely.

Database Servers
The additional database backends to be supported initially include the existing SQLite3 as well as MySQL[2] and PostgreSQL[3].

The inclusion of MySQL and PostgreSQL will allow for remote database connections that facilitate information storage and retrieval from a common database to multiple XBMC instances.

Data types
Due to the current SQLite implementation the data types are already suitably constrained to be well supported across multiple database backends. In SQLite the data type of a value is associated with the value itself, not with the container in which it is stored. Fortunately The dynamic type system of SQLite is backwards compatible with the more common static type systems of other database engines in the sense that SQL statement that work on statically typed databases should would the same way in SQLite.

The data types referenced in the existing schema are:
  • INTEGER
  • TEXT
  • BOOLEAN
  • DOUBLE
  • FLOAT

Proposed: The data types DOUBLE and FLOAT are to be consolidated to the use of FLOAT only.

Schema Naming
The current naming of columns appears to follow a standard for the majority but a number of fields are clearly not conforming.

Proposed: All table names are to utilise CamelCase with the initial identifier representing the data type of that column with the exception of Primary Key and Foreign Key identifiers which are always to be of type INTEGER. The identifiers for the data types are:
  • INTEGER (i)
  • TEXT (s)
  • BOOLEAN (b)
  • FLOAT (f)
  • PRIMARY KEY / FOREIGN KEY (id)

Examples of the usage are:
  • idArtist - Integer representing a Primary/Foreign Key for each unique artist
  • sArtist - String representing the Artist’s name
  • iTrack - Integer representing the track number of a song on an album
  • bSubtitlesOn - Boolean representing whether subtitles should be displayed
  • fAudioDelay - Float representing the delay of audio stream relative to the video stream.


Map table names do not conform to any particular standard, for example in the Music database the table “exgenrealbum” maps the “genre” table to the “album” table, whilst the table “extragenresong” maps the “genre” table to the “song” table. In the Video database we have an “actorlinkmovie” table which maps the “actor” table to the “movie” table.

Proposed: Mapping tables are to be named “map_primarytable_secondarytable”. The previous tables would then be named as “map_genre_album”, “map_genre_song” and “map_actor_movie” respectively.

Indexs are to be prefixed with “idx_”.

Views are to be prefixed with “vw_”.


Database Joining
The current XBMC database implementation separates the Music, Video and Program information into separate databases. This does not easily support the ability to link a music video (from the Video database) to an artist (from the Music database).

Proposed: The Video, Music, and possible the Program, databases to be merged into one Media database. The existing conflicts of table names are to be addressed with a “music_”, “video_” or “program_” prefix were appropriate. The version table will be consolidated to a single table.

Database Settings
Proposed: The advancedsettings.xml will need to be modified to specify the database backend you wish to communicate with. The default will be the existing SQLite3 implementation.

The available tags for configuring the database are:
Code:
<database>
  <type></type>
  <name></name>
  <host></host>
  <port></port>
  <user></user>
  <pass></pass>
</database>

The available tags are described as follows:
  • Type is a string containing one of the following key words: “sqlite3”, “mysql”, or “postgresql”.
  • Name is a string that represents the database name. For SQLite3 this is the name of the file.
  • Host is a string that represents the network name, network IP address or for MySQL and PostgreSQL configurations or the directory path for SQLite3 configurations.
  • Port is an integer that represents the listening port of the database backend. This is ignored for SQLite3 configurations.
  • User is a string represents the user name for access to the database. This is ignored for SQLite3 configurations.
  • Pass is a string represents the password associated with the aforementioned user name for access to the database. This is ignored for SQLite3 configurations.

A typical default installation will look like the following:

Code:
<database>
  <backend>sqlite3</backend>
  <name>MyMedia.db</name>
  <host>userdata://Database</host>
  <port></port>
  <user></user>
  <pass></pass>
</database>


Database Abstraction Library
The current database abstraction library utilised dbiplus for which their only exists and SQLite3 driver.

Proposed: OpenDBX[4] will be used as the database abstraction library. OpenDBX supports the initially desired backends SQLite, MySQL and PostgreSQL. OpenDBX supports a number of other backends which may be supported at a later stage.

Missing Tables
In the current implementation there appears to be a number of tables missing due to the existence of mapping tables that link to undefined Primary Keys. These tables are director and writer.

Proposed: Add the missing tables director and writer.

Footnotes:
[1] SQLite - http://www.sqlite.org
[2] MySQL - http://www.mysql.com
[3] PostgreSQL - http://www.postgresql.org
[4] OpenDBX – http://www.linuxnetworks.de/doc/index.php/OpenDBX

Attachments:
Video Database Schema [Current]

[Image: VideoDatabase_thumb.png]


Music Database Schema [Current]

[Image: MusicDatabase_thumb.png]
find quote
DonJ Offline
Team-XBMC Member
Posts: 490
Joined: May 2005
Reputation: 5
Post: #2
First of all, great initiative & appreciate research/analysis done so far!

jmarshall & cptspiff are probably in the best position to comment on the above. The proposal seems sound to me...

To go one step further, and this can be addressed at a later stage I guess:
How do we manage local (i.e. sources not shared between clients) and network sources in case the db is used by multiple xbmc clients?
I.e. if we use a central database server for multiple xbmc clients, how do we ensure differentiation between 'local' and 'network' sources. Moreover, do the clients need consistent mapping of network drives?

Best,
DonJ

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
find quote
firnsy Offline
Team-XBMC Developer
Posts: 104
Joined: Jul 2009
Reputation: 0
Location: Australia
Post: #3
Great questions...

DonJ Wrote:How do we manage local (i.e. sources not shared between clients) and network sources in case the db is used by multiple xbmc clients?
I.e. if we use a central database server for multiple xbmc clients, how do we ensure differentiation between 'local' and 'network' sources.

The table of interest here is the "path" table, both in the Music and Video databases. In this table the column strPath stores the path (eg. smb://media/movies/). We can add two fields here, namely a boolean to indicate its a local path (eg. bLocalPath) and a unique profile ID (eg. idProfile) or XBMC ID (eg. idXBMC). Where local media paths are found only ID's matching the ID doing the requesting will be processed.

This will need a new table that can list all XBMC instances in an XBMC network with persistent unique ID's.

DonJ Wrote:Moreover, do the clients need consistent mapping of network drives?

If you are referring to the individual mapping between multiple XBMC instances to the network drive then I would think yes. However, right now I can't think of a case where consistent mapping would not be the norm.
find quote
wstewart Offline
Senior Member
Posts: 213
Joined: Jan 2009
Reputation: 1
Location: Canada
Post: #4
Hi,
Looks great, your proposal solves many of the problems with the database naming when moving to a consolidated database.

As I mentioned in the other thread (http://forum.xbmc.org/showthread.php?tid=37449), there is already an abstraction layer in XBMC, so I am not sure that OpenDBX is really necessary. The existing code has hooks to pass the ip address and user id/password to the database wrapper layer (currently sqlite).

My plans are/were to expand this current layer to include mysql, however, conversion of my system from mythtv to XBMC has taken much longer than expected, so I haven't had time to work further on it. So if you beat me to it, that would be great.
find quote
eriksmith200 Offline
Senior Member
Posts: 238
Joined: Aug 2005
Reputation: -5
Post: #5
Maybe this has already been discussed: Right now the database tells XBMC which media files are present. The file system also contains this info, and has the correct info. The database right now is leading for XBMC, requiring frequent library updates and cleaning, which from a user perspective I find slightly awkward.

Maybe a better way would be to do a quick directory listing on startup (or on network events, or ...), and then use the database to retrieve all the associated meta data, for the existing files only

Benefits:
- starting XBMC without a network connection will only mean you don't see those files which are on the network
- when removing a file from your HDD you will not need to clean your library, the meta data is still present in the database but not retrieved
- a "clean library" is only needed once in a while to purge the db from meta data for files no longer present in the file system.

Please don't hit me if this has been discussed without me being aware of it Nerd
find quote
ZIOLele Offline
Senior Member
Posts: 130
Joined: Oct 2008
Reputation: 0
Post: #6
eriksmith200 Wrote:Maybe this has already been discussed: Right now the database tells XBMC which media files are present. The file system also contains this info, and has the correct info. The database right now is leading for XBMC, requiring frequent library updates and cleaning, which from a user perspective I find slightly awkward.

Maybe a better way would be to do a quick directory listing on startup (or on network events, or ...), and then use the database to retrieve all the associated meta data, for the existing files only

Benefits:
- starting XBMC without a network connection will only mean you don't see those files which are on the network
- when removing a file from your HDD you will not need to clean your library, the meta data is still present in the database but not retrieved
- a "clean library" is only needed once in a while to purge the db from meta data for files no longer present in the file system.

Please don't hit me if this has been discussed without me being aware of it Nerd

Isn't simpler to just purge the library on deletion of the file?(when the network is reachable) And in case the net is not reachable we could just make a simple journal ( to just list what is to be deleted) to apply as soon as the network become reachable.
find quote
spiff Offline
Grumpy Bastard Developer
Posts: 12,172
Joined: Nov 2003
Reputation: 81
Post: #7
smack. irrelevant to this discussion

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
find quote
ZIOLele Offline
Senior Member
Posts: 130
Joined: Oct 2008
Reputation: 0
Post: #8
First of: Great analisys, you wheren't kidding when you spoke of the engineer in you Shocked

firnsy Wrote:Proposed: The Video, Music, and possible the Program, databases to be merged into one Media database. The existing conflicts of table names are to be addressed with a “music_”, “video_” or “program_” prefix were appropriate. The version table will be consolidated to a single table.

I haven't looked at the schemas yet, so i can be wrong, but this duplication is really necessary(if it is duplication)? can't this be solved whit an intertable (if necessary)?
find quote
skunkm0nkee Offline
Team Razorfish
Posts: 1,887
Joined: Aug 2007
Location: London
Post: #9
Or a master media table that contains all the similarities and then separate child tables (music_*, video_* etc) for the media specific columns?
find quote
eriksmith200 Offline
Senior Member
Posts: 238
Joined: Aug 2005
Reputation: -5
Post: #10
spiff Wrote:smack. irrelevant to this discussion

/cowers in corner :o
find quote
Post Reply