~ Email: support@sysnucleus.com Phone: 91.484.4015479 / 91.94950.45285 Skype: sysnucleus ~
Anonymously scrape data from websites
When you repeatedly access a remote website for data extraction, the remote web server may block your computer's IP address, preventing further access to the website. Also, while scraping data, you may not want to reveal your network details to remote web servers.
WebHarvy lets you scrape data anonymously from websites with the help of proxy servers. The 'Scrape via Proxy Server' feature allows you to access and scrape websites through proxy servers, thereby maintaining anonymity while scraping data.
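For readers curious about what scraping through a proxy involves, here is a minimal Python sketch using only the standard library. It is not WebHarvy's implementation. The helper name build_proxy_opener and the address 203.0.113.10:8080 are hypothetical placeholders, and the sketch assumes a plain HTTP proxy that needs no authentication.

```python
import urllib.request


def build_proxy_opener(proxy_address):
    """Return an opener that sends HTTP and HTTPS requests via proxy_address.

    proxy_address is a "host:port" string. This assumes a plain HTTP proxy
    without authentication.
    """
    proxy_url = "http://" + proxy_address
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical placeholder address; replace it with a proxy you are allowed to use.
opener = build_proxy_opener("203.0.113.10:8080")

# Requests made with opener.open(...) are routed through the proxy, so the
# remote web server sees the proxy's IP address instead of yours, for example:
# html = opener.open("http://example.com/", timeout=10).read()
```

The request itself is left commented out because the placeholder address does not point to a real proxy.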
To configure this feature, click the 'Settings' option in the Edit menu and select the 'Proxy Settings' tab. You may provide a single proxy address or a list of proxy addresses.

If you provide a list of proxy server addresses, WebHarvy will automatically rotate through them after the specified time interval.
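The idea of rotating through a proxy list on a time interval can be sketched in Python as follows. This is an illustration only and makes no claim about how WebHarvy implements rotation. The class name TimedProxyRotator, its parameters and the sample addresses are all hypothetical.

```python
import time


class TimedProxyRotator:
    """Cycle through a list of proxy addresses, one per time interval."""

    def __init__(self, proxies, interval_seconds, clock=time.monotonic):
        if not proxies:
            raise ValueError("at least one proxy address is required")
        if interval_seconds <= 0:
            raise ValueError("interval_seconds must be positive")
        self._proxies = list(proxies)
        self._interval = interval_seconds
        self._clock = clock          # injectable so the rotation can be tested
        self._start = clock()

    def current(self):
        """Return the proxy address that is active at this moment."""
        elapsed = self._clock() - self._start
        # Number of whole intervals elapsed, wrapped around the list length.
        index = int(elapsed // self._interval) % len(self._proxies)
        return self._proxies[index]


# Hypothetical placeholder addresses; switch to the next one every 60 seconds.
rotator = TimedProxyRotator(["203.0.113.10:8080", "203.0.113.11:3128"], 60)
active_proxy = rotator.current()
```

A scraper would call rotator.current() before each request and route that request through the returned address. Once the last address in the list has had its turn, the rotation starts again from the first.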
How to obtain proxy server addresses?
Both free and paid proxy servers are available on the internet. You can find them with a Google search. Free proxies may be slow and unreliable.
Some premium proxy providers also offer free proxy lists:
http://www.hidemyass.com/proxy-list/
http://www.xroxy.com/proxylist.htm
You may also purchase paid, reliable and fast proxy servers from them.
Another service that lets you try its paid, reliable and fast proxy servers for free (for a limited time) is http://www.cactusvpn.com/.
Kindly note that we are not affiliated in any way with the above-mentioned companies or services. We have mentioned them here so that you can get started with proxy servers without delay. Users are advised to contact the respective services for more information.
