GetDetails?
#1
I'm having problems extracting the plot from XML. It seems that the '!' in '<![CDATA[' makes the regex go bananas.
How do I go about cutting out the multi-line plot?

Current regex:

<Description>([^<]*)</Description>

XML:

Code:
<Description>
<![CDATA[
Arn Magnusson bliver født i 1150 som søn af stormanden Magnus og hans hustru Sigrid. Han bliver opdraget af cisterciensermunkene ved Varnhem Kloster i Västra Götaland og får her den bedste uddannelse, Europa på det tidspunkt kan præstere...
  ]]>
  </Description>


/Tnx
#2
Code:
"<!\\[CDATA\\[(.*?)\\]\\]>"
In Python you have to pass re.DOTALL, so that '.' also matches newlines.
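For anyone trying this in plain Python, here is a minimal sketch of the re.DOTALL point. The sample is the XML from the first post with the plot shortened; the variable names are only for illustration, and this says nothing about how the XBMC scraper engine itself behaves.

```python
import re

# Sample based on the first post (plot text shortened for the example).
xml = """<Description>
<![CDATA[
Arn Magnusson bliver født i 1150 som søn af stormanden Magnus og hans hustru Sigrid.
Han bliver opdraget af cisterciensermunkene ved Varnhem Kloster.
  ]]>
  </Description>"""

# Raw string, so each backslash reaches the regex engine exactly once.
CDATA = r"<!\[CDATA\[(.*?)\]\]>"

# Without re.DOTALL, "." does not match "\n", so the multi-line block is not found.
no_flag = re.search(CDATA, xml)

# With re.DOTALL, "." also matches "\n"; the lazy (.*?) stops at the first "]]>".
with_flag = re.search(CDATA, xml, re.DOTALL)
plot = with_flag.group(1).strip() if with_flag else None

print(no_flag)
print(plot)
```

The .strip() is just there to drop the newlines and indentation around the plot text.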
#3
Well, this is a scraper, and the regex does not seem to make any difference :( No match.
#4
It was escaped too many times.

Try:

Code:
<!\[CDATA\[(.*?)\]\]>
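A small Python sketch of what "escaped too many times" likely means here. The doubled backslashes from post #2 are correct inside a normal string literal, where the language strips one level before the regex engine sees the pattern. The assumption (not confirmed in this thread) is that the scraper XML hands the pattern to the regex engine untouched, so the doubled form arrives with both backslashes. The test string is made up for the example.

```python
import re

# As posted in #2: a normal (non-raw) string literal, one escape level is consumed by Python.
escaped_twice = "<!\\[CDATA\\[(.*?)\\]\\]>"
# As posted in #4: the pattern the regex engine should actually receive.
escaped_once = r"<!\[CDATA\[(.*?)\]\]>"

# Both spellings produce the very same pattern text.
same_pattern = escaped_twice == escaped_once

# Assumption: this is what the engine sees if the doubled form is passed through unchanged.
literal_doubled = r"<!\\[CDATA\\[(.*?)\\]\\]>"

text = "<Description><![CDATA[some plot]]></Description>"  # made-up test input
m_once = re.search(escaped_once, text)
m_doubled = re.search(literal_doubled, text)  # now requires a literal backslash after "<!"

print(same_pattern, m_once.group(1), m_doubled)
```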
#5
Thanks tslayer, I managed to get it working now :)

Code:
<Description>[^$]*<!\[CDATA\[([^$]*)\]\][^$]*</Description>
#6
Crap! It only worked in the regex tester :( When used in the scraper XML tester or in XBMC, the plot is not set to anything...

Here is the RegExp from GetDetails:

Code:
<RegExp input="$$1" output="&lt;plot&gt;\1&lt;/plot&gt;" dest="5+"><expression>&lt;Description&gt;[^$]*&lt;!\[CDATA\[*([^$]*)\]\][^$]*&lt;/Description&gt;</expression></RegExp>

Any ideas?
#7
Sorry, not sure about scrapers.

Did you try that new scraper program? Maybe that will help.
#8
This seems to work so far, in case anyone has the same problem (I'm basically a regex noob) :)

Code:
<Description>[^<]*<!\[CDATA\[([^<]*)\]\][^<]*</Description>
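A quick way to check this pattern outside XBMC is a few lines of Python. This is only a sketch: Python's re module stands in for the scraper's regex engine, the first sample is the XML from the first post (shortened), and the second input is a hypothetical description invented to show one limitation of [^<]*.

```python
import re

# The pattern from this post, as a raw string.
PATTERN = r"<Description>[^<]*<!\[CDATA\[([^<]*)\]\][^<]*</Description>"

# Sample based on the first post (plot text shortened for the example).
sample = """<Description>
<![CDATA[
Arn Magnusson bliver født i 1150 som søn af stormanden Magnus og hans hustru Sigrid.
  ]]>
  </Description>"""

# Hypothetical description whose plot contains markup.
with_markup = "<Description><![CDATA[A <b>bold</b> plot]]></Description>"

# [^<] is a negated class, so it matches newlines too; no DOTALL-style flag is needed.
m = re.search(PATTERN, sample)
plot = m.group(1).strip() if m else None

# [^<]* cannot cross a "<", so a plot with tags inside the CDATA block is not matched.
m_markup = re.search(PATTERN, with_markup)

print(plot)
print(m_markup)
```

So the pattern is fine as long as the plot text never contains a '<'; if it can, the lazy (.*?) form from post #4 is the safer capture, provided the engine lets '.' match newlines.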