Extracting meta tags from URLs

I have an online scraping agent set up in my account that doesn't appear to be grabbing the <meta> tags. I set it up through the Chrome plugin, which showed that it had grabbed the right information, but the content extracted from the <meta> tags in the page header is missing from the output file.

[Screenshot: meta tags scraper]

The result is empty for sku and the other meta fields when I run the same agent in the hosted app. Am I missing something?

{
  "agent_id": "93bff75677",
  "status": "Completed",
  "version": 1,
  "job_id": 7302,
  "started_at": "10/20/2016 06:33:35 PM",
  "completed_at": "10/20/2016 06:33:45 PM",
  "total": 1,
  "limit": 2500,
  "offset": 0,
  "returned": 1,
  "result": [
    {
      "sku": "",
      "name": "",
      "brand": "",
      "currency": "",
      "price": "",
      "list_price": "$600.00"
    }
  ]
}

Posted by anonymous 7 months ago


@Harley Just add double quotes around the value of the meta tag's property attribute. Normally the selector works without quotes as well, but because the value contains a special character (the colon in og:upc), the quotes are required.

So the correct selector will be

meta[property="og:upc"]

for the meta tag below:

<meta xmlns:og="http://opengraphprotocol.org/schema/" property="og:upc" content="THRE1007" />

I just tested this by updating the selector for the sku field, and it works perfectly.

[Screenshot: header html meta tags scraping]

Here is the result after updating the sku field. You can do the same for your other meta tags extraction fields and you should be all set.
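
If you want to double-check outside the agent what value a quoted selector such as meta[property="og:upc"] should return, here is a minimal sketch in Python using only the standard library. It does not use the agent's own selector engine; it simply collects the content attribute of every <meta> tag that has a property attribute. The names MetaPropertyParser, extract_meta_properties and SAMPLE are made up for this illustration, and the sample HTML is just the tag quoted above wrapped in a <head>.

```python
from html.parser import HTMLParser


class MetaPropertyParser(HTMLParser):
    """Collects <meta property="..." content="..."> pairs (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here by default.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        if prop is not None:
            # Keep the first occurrence, like a selector returning the first match.
            self.properties.setdefault(prop, attr_map.get("content"))


def extract_meta_properties(html):
    """Return a dict mapping each meta property name to its content value."""
    parser = MetaPropertyParser()
    parser.feed(html)
    return parser.properties


# Sample markup: the meta tag from this thread, wrapped in a <head> element.
SAMPLE = (
    '<head><meta xmlns:og="http://opengraphprotocol.org/schema/" '
    'property="og:upc" content="THRE1007" /></head>'
)

if __name__ == "__main__":
    props = extract_meta_properties(SAMPLE)
    print(props.get("og:upc"))  # value the sku field is expected to hold
```

To use it on a real page, save the page source and pass it to extract_meta_properties; any property that comes back here but is empty in the agent output points to a selector problem rather than missing data.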

[Screenshot: extracting meta tags description]

Posted by anonymous 7 months ago
