Showing posts with the label Screen Scraping

Can Scrapy Be Used To Scrape Dynamic Content From Websites That Are Using AJAX?

Answer: Here is a simple example of using Scrapy with an AJAX request. Let's look at the site rubin-kazan.ru. All messages there are loaded with an AJAX request, and my goal is to fetch these messages with all their attributes (author, date, ...). When I inspect the page source, I can't see the messages, because the page loads them via AJAX. But I can use Firebug in Mozilla Firefox (or an equivalent tool in another browser) to analyze the HTTP request that generates the messages on the page. That request doesn't reload the whole page, only the parts of the page that contain messages. To trigger it, I click an arbitrary page number at the bottom of the page and observe the HTTP request that is responsible for the message body. Afterwards, I analyze the headers of the request (note that I'll extract this URL from the page source, from the var section; see the code below) and the form data of the request (the HTTP method is POST). And...
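The steps described above (read the AJAX URL out of the page's var section, then replay the POST request with its form data) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the code from the original answer: the JavaScript variable name `ajax_url`, the form field `page`, the start URL, and the JSON shape of the response are all hypothetical placeholders that would need to be replaced with whatever the browser's network inspector actually shows.

```python
import json
import re
from urllib.parse import urljoin

try:
    import scrapy
except ImportError:  # the helper functions below still work without Scrapy
    scrapy = None

# Hypothetical: assumes the page contains something like
#   var ajax_url = '/some/path';
# The real variable name must be taken from the page source.
AJAX_URL_RE = re.compile(r"var\s+ajax_url\s*=\s*['\"]([^'\"]+)['\"]")


def extract_ajax_url(html, base_url):
    """Return the absolute AJAX endpoint found in an inline script, or None."""
    match = AJAX_URL_RE.search(html)
    return urljoin(base_url, match.group(1)) if match else None


def build_form_data(page):
    """Form fields for one page of messages (the field name is an assumption).

    Scrapy expects form values to be strings.
    """
    return {"page": str(page)}


if scrapy is not None:

    class MessagesSpider(scrapy.Spider):
        name = "ajax_messages"
        # Placeholder start URL; substitute the page that embeds the messages.
        start_urls = ["https://example.com/guestbook"]

        def parse(self, response):
            ajax_url = extract_ajax_url(response.text, response.url)
            if ajax_url is None:
                self.logger.warning("AJAX URL not found in %s", response.url)
                return
            # Replay the request the browser sends when a page number is clicked.
            # FormRequest uses the POST method when formdata is given.
            for page in range(1, 4):
                yield scrapy.FormRequest(
                    ajax_url,
                    formdata=build_form_data(page),
                    callback=self.parse_messages,
                )

        def parse_messages(self, response):
            # Assumption: the endpoint returns a JSON list of message objects.
            for message in json.loads(response.text):
                yield {
                    "author": message.get("author"),
                    "date": message.get("date"),
                }
```

If the endpoint returns an HTML fragment rather than JSON, `parse_messages` would use `response.css(...)` or `response.xpath(...)` selectors instead of `json.loads`.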