How to scrape data behind a link?

Suppose I want the names of the hotels in this example. They are all on page 1, so that part is easy.

But say I also want more information about each hotel, and to get it I have to click on the hotel name to open that hotel's information page. How can I extract the data from both the first page and the info page?

See the screenshots for an example.

[Screenshot: hotel-website-scraping]

[Screenshot: tripadvisor hotels data scraping]

Posted 1 year ago


Nice question. To scrape data from the details page behind each link, you need one more scraping agent.

So you need to create two scraping agents:

  1. The first agent scrapes the data from the list page, including the details-page URL for each hotel (as in your first screenshot), and saves it to a file in your local directory, say "category.csv", which will look like the screenshot below.
    [Screenshot: hotel website scraping]

  2. The second agent scrapes the details pages: it reads the URLs from the first agent's output file (category.csv), visits each page, and extracts the data you want.
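
If you are not using a scraping tool, the same two-agent idea can be sketched in plain Python with only the standard library. Treat this as a rough illustration under assumptions, not the tool's own implementation: the CSS classes "hotel-name" and "address", the CSV column names, and all function names here are made up for the example and would need to match the real site's markup.

```python
import csv
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ElementCollector(HTMLParser):
    """Collects (text, attrs) for every <tag> element carrying css_class.

    Assumes matching elements are not nested inside each other.
    """

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag, self.css_class = tag, css_class
        self.found = []      # finished (text, attrs dict) pairs
        self._attrs = None   # attrs of the matching element currently open
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._attrs, self._text = attrs, []

    def handle_data(self, data):
        if self._attrs is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag and self._attrs is not None:
            self.found.append(("".join(self._text).strip(), self._attrs))
            self._attrs = None


def collect(html, tag, css_class):
    parser = ElementCollector(tag, css_class)
    parser.feed(html)
    parser.close()
    return parser.found


def fetch_page(url):
    """Download a page as text (real network access)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape_list(list_url, csv_path, fetch=fetch_page):
    """Agent 1: save each hotel name and its details-page URL to a CSV."""
    # "hotel-name" is a hypothetical class on the hotel links.
    rows = [(name, urljoin(list_url, attrs.get("href") or ""))
            for name, attrs in collect(fetch(list_url), "a", "hotel-name")]
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "details_url"])
        writer.writerows(rows)
    return rows


def scrape_details(csv_path, fetch=fetch_page):
    """Agent 2: read the URLs from agent 1's CSV and visit each details page."""
    results = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # "address" is a hypothetical class on the details page.
            found = collect(fetch(row["details_url"]), "span", "address")
            address = found[0][0] if found else ""
            results.append({"name": row["name"], "address": address})
    return results
```

You would call scrape_list() once with the list-page URL and "category.csv", then scrape_details("category.csv"). Keeping the CSV between the two steps mirrors the two agents above: the second step can be re-run or resumed without scraping the list page again.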

Posted 1 year ago

Topic closed! This question is closed and no longer accepts posts.
