support@sysnucleus.com | sales@sysnucleus.com | YouTube Channel | KB Articles


Anonymously scrape data from websites


When you repeatedly access remote websites for data extraction, the remote web server might block your computer's IP address, preventing further access to the website. Also, while scraping data, you may not want to reveal your identity (network details) to remote web servers.

You may use the 'Inject pauses during mining' feature to avoid making continuous page requests to web servers for long durations. Although this method reduces the chances of being detected and blocked by web servers, it may not always be effective, and your identity is still not hidden from the web server.
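The general idea behind injecting pauses can be illustrated with a short Python sketch. This is not WebHarvy's implementation; the function name `paced`, the pause bounds, and the injectable `sleep` parameter are assumptions made purely for illustration.

```python
import random
import time


def paced(items, min_pause=2.0, max_pause=5.0, sleep=time.sleep):
    """Yield items one at a time, pausing a random interval between them.

    Hypothetical helper: the pause bounds are illustrative values,
    not settings taken from WebHarvy.
    """
    for index, item in enumerate(items):
        if index > 0:
            # Wait a random time so requests are not sent back to back.
            sleep(random.uniform(min_pause, max_pause))
        yield item
```

Wrapping a list of page URLs in `paced(...)` and fetching each one inside a `for` loop spaces the requests out. As noted above, pacing alone does not hide your IP address from the server.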

WebHarvy lets you scrape data anonymously from websites with the help of proxy servers. The 'Scrape via Proxy Server' feature allows you to access and scrape websites through proxy servers, thereby maintaining anonymity while scraping data.

You may also use a VPN instead of proxies to anonymously scrape websites.

To configure this feature, click the 'Settings' option from the Edit menu and select the 'Proxy Settings' tab. You may provide a single proxy address or a list of proxy addresses as shown below.

Scrape using Proxy Server

Either a single proxy server or a list of proxy servers can be used for web scraping. If you select the 'Rotate proxies' option, WebHarvy will automatically rotate through the list, using each proxy server in turn. Otherwise, only the first proxy in the list will be used.
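For readers who want to see the rotate-or-use-first behaviour in code, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not how WebHarvy works internally: the names `proxy_picker` and `opener_for` are hypothetical, the addresses in the usage note are placeholders, and the sketch switches proxy on every call, whereas the interval WebHarvy uses is not stated here.

```python
import itertools
import urllib.request


def proxy_picker(proxies, rotate=True):
    """Return a function that gives the proxy to use for the next request.

    With rotate=True the list is cycled in order; otherwise the first
    entry is always returned. (Hypothetical helper for illustration.)
    """
    if not proxies:
        raise ValueError("at least one proxy address is required")
    if rotate:
        cycle = itertools.cycle(proxies)
        return lambda: next(cycle)
    return lambda: proxies[0]


def opener_for(proxy):
    """Build a urllib opener that sends HTTP and HTTPS traffic via proxy.

    proxy is a 'host:port' string; an HTTP proxy is assumed.
    """
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    return urllib.request.build_opener(handler)
```

A caller would create the picker once, for example `pick = proxy_picker(["192.0.2.10:8080", "192.0.2.11:8080"])`, and then call `opener_for(pick()).open(url)` for each page, so that consecutive requests leave through different proxies.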

How to obtain proxy server addresses?


Both free and paid proxy servers are available on the Internet. You can find them with a Google search. Free proxies are often slow and unreliable, and may cause the mining process to terminate early. For this reason, we do not recommend using free proxies with WebHarvy.

However, if you want to try free proxies with WebHarvy, we recommend the following list provided by HMA:
http://www.hidemyass.com/proxy-list/

Our recommendation


We have tested Trusted Proxies with WebHarvy and recommend their Proxy Server Cloud service. You can sign up for a free trial account with Trusted Proxies at the following link, which lets you try their service for free with WebHarvy.

Sign up for a FREE Trusted Proxies Proxy Server Cloud Account for WebHarvy

Once you create a free Proxy Server Cloud trial account, you will receive your proxy IP address and port number, as well as your account username and password. Add the proxy server IP address and port in WebHarvy as explained under Scrape via Proxy Servers. Then, before mining, authenticate with Trusted Proxies at the web link provided in your account settings details, using your username and password.

Please contact our support team if you need assistance or have any questions.