I am a complete newbie at Python, Scrapy and web scraping, and this is my first learning project. I want to scrape multiple pages from this website using Scrapy: http://ift.tt/1C0cyxT
The links seem to be generated using AJAX. At the end of the page is the link to the next page. Clicking on <2> and checking the link generated in Firebug shows the following request being made:
GET directory?p=2&category=1&map[disable]=0&map[height]=500&map[list_height]=500&map[span]=5&map[style]=&map[list_show]=0&map[listing_default_zoom]=15&map[options][scrollwheel]=0&map[options][marker_clusters]=1&map[options][force_fit_bounds]=0&distance=0&is_mile=0&zoom=15&perpage=16&scroll_list=0&feature=1&featured_only=0&hide_searchbox=0&hide_nav=0&hide_nav_views=0&hide_pager=0&template=&grid_columns=4&sort=title
So I thought, in my limited understanding, that if I replace the value of p with any page number, that should get me the required page. I tried using the following URL to request the page directly:
However, this link generates an error page saying "page not found".
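To make the idea concrete, here is a minimal sketch of what I am attempting: take the query string captured in Firebug, swap the p value, and resolve the relative "directory?..." request against the page it was issued from. The base URL below is a placeholder (the real site sits behind the short link above), build_page_url is a name made up for illustration, and the query string is shortened to a few of the captured parameters.

```python
from urllib.parse import parse_qsl, urlencode, urljoin

# Placeholder base URL (assumption): the real address is hidden behind the short link.
BASE_PAGE_URL = "http://example.com/some/path/"

# Query string as seen in Firebug, shortened to a few parameters for readability.
CAPTURED_QUERY = "p=2&category=1&map[disable]=0&perpage=16&sort=title"


def build_page_url(page_number, base=BASE_PAGE_URL, query=CAPTURED_QUERY):
    """Return the URL for one results page (hypothetical helper)."""
    # Parse the captured query into key/value pairs, keeping empty values
    # such as "template=" that appear in the full request.
    params = dict(parse_qsl(query, keep_blank_values=True))
    # Replace only the page number; every other parameter is left untouched.
    params["p"] = str(page_number)
    # The captured request is relative ("directory?..."), so it has to be
    # resolved against the URL of the page that issued it.
    return urljoin(base, "directory?" + urlencode(params))


if __name__ == "__main__":
    for page in range(1, 4):
        print(build_page_url(page))
```

Note that urlencode percent-encodes the square brackets in keys such as map[disable]. Each resulting URL could then be handed to scrapy.Request from a spider, assuming the server accepts a plain GET for these addresses.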
Can anyone help me understand what I am doing wrong here?
Thanks for your guidance.