PHP Web Crawler or Python Web Crawler
I need to build a web crawler for a small project. I've already found
PHPCrawl and gotten it up and running, and so far it works fine.
However, PHPCrawl doesn't seem to handle huge websites like php.net very
well, probably because I'm inserting the data I fetch from the website
into a database and running the crawler from the browser. After doing some
more research, I found that Python seems to be the main language of choice
for building web crawlers.
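To make the suspected database bottleneck concrete, here is a minimal sketch in Python. It is not taken from my PHPCrawl setup: the `pages` table, the `store_pages` function, and the batch size of 500 are all hypothetical, and it uses SQLite from the standard library only to stay self-contained. The idea is to buffer crawled rows and commit once per batch instead of once per fetched page, which may or may not be the real limiting factor on a large site.

```python
import sqlite3

BATCH_SIZE = 500  # assumed value for illustration; tune for the real workload


def _flush(conn, batch):
    # INSERT OR IGNORE skips URLs that are already stored.
    conn.executemany(
        "INSERT OR IGNORE INTO pages (url, title) VALUES (?, ?)", batch
    )
    conn.commit()  # one commit per batch rather than one per page
    return len(batch)


def store_pages(conn, pages, batch_size=BATCH_SIZE):
    """Write (url, title) rows in batches; return the number of rows submitted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT)"
    )
    batch = []
    submitted = 0
    for row in pages:
        batch.append(row)
        if len(batch) >= batch_size:
            submitted += _flush(conn, batch)
            batch = []
    if batch:  # write whatever is left over after the loop
        submitted += _flush(conn, batch)
    return submitted
```

Run from the command line rather than through a browser request, a script like this is also not subject to a web server's request timeout, which is the other thing I suspect in my current setup.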
What are the main performance differences between a web crawler written in
PHP, such as PHPCrawl, and a web crawler written in Python, such as
Scrapy?
Which one can handle big/huge websites better?
And which one do you personally recommend?
Thanks