How to match characters with gb2312 in a scraper
#1
I have written two scrapers for Chinese movie sites. Mtime.com uses UTF-8, so I can match any word from the web page. But imdb.cn uses gb2312: only English characters can be matched, and Chinese keywords can't be matched. How can I resolve this?
#2
Set the correct encoding on your scraper.
#3
I'll modify the imdb.cn scraper to use Chinese keywords to gather more information. Thanks!
#4
To make sure you got my point:

The scraper itself is an XML file. Set its encoding with the XML declaration at the top of the file:

<?xml version="1.0" encoding="gb2312"?>
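
For anyone wondering why the declared encoding matters, here is a rough Python illustration. It is not how the scraper engine works internally; it only assumes that the pattern ends up compared byte-for-byte against the raw page. The helper `keyword_matches` and the sample page text are made up for the example.

```python
import re


def keyword_matches(page_bytes: bytes, keyword: str, pattern_encoding: str) -> bool:
    """Hypothetical helper: search raw page bytes for a keyword that has
    been encoded with the given encoding."""
    pattern = re.escape(keyword.encode(pattern_encoding))
    return re.search(pattern, page_bytes) is not None


# Stand-in for a page served as gb2312 (made-up content).
page = "导演: Zhang Yimou".encode("gb2312")

# Chinese keyword encoded as UTF-8: its bytes differ from the page's bytes.
print(keyword_matches(page, "导演", "utf-8"))

# Same keyword encoded as gb2312: the bytes line up with the page.
print(keyword_matches(page, "导演", "gb2312"))

# ASCII text has the same bytes in both encodings, so it matches either way.
print(keyword_matches(page, "Zhang", "utf-8"))
```

This is consistent with the symptom in the first post: English words matched regardless, while Chinese keywords only match once the pattern and the page use the same encoding.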
#5
Thanks for your help. I can use Chinese keywords in the scraper now. It works well.