Size limit to Inputs in Scraping?

Hello,

I've been having issues trying to input very large datasets for my URLs using the Windows app. It doesn't seem to like anything over 10k rows. Is there a fix for this, or is it just a limitation of the program itself? Thanks!

Posted by Kanemyr von Manstein 1 year ago


There is no limit on input size. Which input type are you using? For a large URL list, we recommend using a local TSV, JSON or CSV file instead of a "DIRECT" copy-paste in the setup.

The most recommended local input file format is TSV (tab-delimited file), for better performance.
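If your URL list already exists as a CSV, a small script can re-save it as TSV. The following is only a rough Python sketch, not something built into Data Scraping Studio: the function name `csv_to_tsv` and the file names are made up for illustration, and it assumes the CSV is UTF-8 encoded.

```python
import csv


def csv_to_tsv(csv_path, tsv_path):
    """Copy every non-blank row of a CSV file into a tab-delimited file.

    Returns the number of rows written. Both paths are hypothetical
    examples; point them at your own files.
    """
    rows = 0
    # "utf-8-sig" silently drops a byte-order mark if the file starts with one.
    with open(csv_path, newline="", encoding="utf-8-sig") as src, \
         open(tsv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in csv.reader(src):
            if not row:  # csv.reader yields [] for blank lines; skip them
                continue
            writer.writerow(row)
            rows += 1
    return rows


if __name__ == "__main__":
    # Hypothetical file names, shown only as a usage example.
    print(csv_to_tsv("urls.csv", "urls.tsv"), "rows written")
```

The resulting `urls.tsv` can then be selected as the local input file.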

[Screenshot: scraping agent setup]

When you have the "Local file" option selected as the input type, Data Scraping Studio checks and reads that file during execution, adds all of the URLs to the queue, and then starts the extraction (see the log messages in the screenshot below).

I just tested with 800k+ URLs: all of the URLs were queued in 8 seconds and scraping started without any issue.

[Screenshot: 1 million pages scraping]

[Screenshot: large input file in scraping]
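To illustrate the "read the file, queue every URL, then start" flow described above, here is a minimal Python sketch. It is not Data Scraping Studio's actual code: `load_url_queue`, `url_column` and `has_header` are hypothetical names, and it assumes a UTF-8 TSV file with the URL in the first column.

```python
import csv
from collections import deque


def load_url_queue(tsv_path, url_column=0, has_header=True):
    """Read a tab-delimited file and return a queue of its URLs.

    Hypothetical helper for illustration only; the column index and
    header flag are assumptions about how the input file is laid out.
    """
    queue = deque()
    with open(tsv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        if has_header:
            next(reader, None)  # skip the header row if there is one
        for row in reader:
            # Ignore blank lines and rows that have no value in the URL column.
            if len(row) > url_column and row[url_column].strip():
                queue.append(row[url_column].strip())
    return queue


if __name__ == "__main__":
    # Hypothetical file name, shown only as a usage example.
    urls = load_url_queue("urls.tsv")
    print(len(urls), "URLs queued")
```

Reading the file line by line like this keeps the loading step simple, which is one plausible reason a plain delimited file queues faster than a large paste into a text box.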

Posted by 1 year ago


Thank you so much! I was using CSV and it was taking forever to load. I couldn't figure out why (CSV might still work and I may have been saving it wrong), but TSV worked perfectly!

Posted by Kanemyr von Manstein 1 year ago

Topic closed! This question is closed and does not accept new posts.
