I am very new to the whole idea of web scraping, data mining, harvesting, what have you, but I'm a quick study. I recently finished a project for a client, and if it were not for Data Scraping Studio I would never have completed the task. I LOVE DATA SCRAPING STUDIO!!! Stick that in your customer reviews, LOL.
Here's my issue: I need to scrape the value of a span depending on whether its parent category has a specific value. For example...
(screenshot; link in alt text)
As you can see, I have managed to extract values that are children of a parent category. If you look at the Chrome extension, the values I need are listed in sequential order:
- Prop_type | .ad-info li:nth-child(1) .info-value = Casa
- prop_size | .ad-info li:nth-child(4) .info-value = 1,300
- prop_location | li:nth-child(8) .info-value = CARRETERA SUR/SOUTH PANAMERICAN ROAD
The problem is that each span is selected by its position (nth-child). When a particular property listing has more or fewer fields, the order of the .info-value spans shifts and the scraper spits out the wrong information.
The page source for the section in question is as follows...
The picture above shows a snippet of the values I am trying to scrape, but keep in mind there are many more and, as mentioned, if a property's posting includes more or fewer fields to fill these boxes, the order changes. I have been looking at if/then statements in regex, which I am still very new to, thinking it may be possible to add some sort of rule: if the span with class "info-name" has the value "Categoria:", then output its adjacent info-value (in this case "Casas"), so the scrape reliably obtains the value I'm... well, scraping for.
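To make the rule above concrete, here is a minimal Python sketch of "match the label, then take its adjacent value". It is not how Data Scraping Studio works internally, just an illustration using the standard library. The markup in SAMPLE_HTML is an assumption (one li per field, holding an info-name span followed by an info-value span), the "Metros:" and "Direccion:" labels are made up for the example, and InfoPairParser and parse_info are hypothetical names.

```python
from html.parser import HTMLParser

# Assumed markup: each <li> holds a label span and a value span.
# Only "Categoria:" / "Casas" come from the page; the other labels are made up.
SAMPLE_HTML = """
<ul class="ad-info">
  <li><span class="info-name">Categoria:</span> <span class="info-value">Casas</span></li>
  <li><span class="info-name">Metros:</span> <span class="info-value">1,300</span></li>
  <li><span class="info-name">Direccion:</span> <span class="info-value">CARRETERA SUR/SOUTH PANAMERICAN ROAD</span></li>
</ul>
"""


class InfoPairParser(HTMLParser):
    """Collect {label: value} pairs, one per <li>, regardless of position."""

    def __init__(self):
        super().__init__()
        self.pairs = {}
        self._field = None  # "name" or "value" while inside a matching span
        self._name = ""
        self._value = ""

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # A new field starts: forget the previous label and value.
            self._name, self._value = "", ""
        elif tag == "span":
            classes = (dict(attrs).get("class") or "").split()
            if "info-name" in classes:
                self._field = "name"
            elif "info-value" in classes:
                self._field = "value"

    def handle_endtag(self, tag):
        if tag == "span":
            # Sketch only: assumes no spans nested inside the value span.
            self._field = None
        elif tag == "li" and self._name.strip():
            label = self._name.strip().rstrip(":")
            self.pairs[label] = self._value.strip()

    def handle_data(self, data):
        if self._field == "name":
            self._name += data
        elif self._field == "value":
            self._value += data


def parse_info(html):
    """Return a dict mapping each label (without the colon) to its value."""
    parser = InfoPairParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


print(parse_info(SAMPLE_HTML).get("Categoria"))  # Casas
```

Because the lookup is by label and not by nth-child, a listing with extra or missing fields still returns the right value, and a missing label simply comes back as None. Plain CSS selectors cannot match on an element's text, so if a tool accepts XPath, an expression along the lines of `//span[@class="info-name"][normalize-space()="Categoria:"]/following-sibling::span[@class="info-value"]` expresses the same idea, again assuming the markup above.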
Any help or direction would be greatly appreciated,
Posted by Erick
10 months ago