Scraping Articles and Press Releases using WebHarvy
In this article we will see how WebHarvy can be easily configured to scrape articles, publications and press releases. Being a generic web scraping tool, WebHarvy can be configured to extract the data you need from any website.
WebHarvy can be used to scrape articles from article directories and press releases from PR websites.
WebHarvy can save the content of each article as a text file; see Scrape text as file for details. The Capture More Content option is also useful when scraping articles.
The following demo shows how WebHarvy can be used to scrape articles from www.ezinearticles.com. Details such as the article title, author name, date, article body and keywords can be easily extracted using WebHarvy.
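WebHarvy selects these fields by pointing and clicking, so no code is required. For readers curious about what such an extraction amounts to, the following is a minimal Python sketch using only the standard library. It is not WebHarvy's implementation, and the sample markup, the class names (title, author, date, body) and the names ArticleParser and extract_article are all hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser

# Hypothetical article markup; real sites use different tags and class names.
SAMPLE_HTML = """
<html><head>
<meta name="keywords" content="web scraping, articles">
</head><body>
<h1 class="title">Sample Article</h1>
<span class="author">Jane Doe</span>
<time class="date">2020-01-15</time>
<div class="body"><p>First paragraph.</p><p>Second paragraph.</p></div>
</body></html>
"""


class ArticleParser(HTMLParser):
    """Collects text from elements whose class attribute names a field."""

    FIELDS = {"title", "author", "date", "body"}
    # Elements that never have a closing tag, so they must not affect nesting.
    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.record = {name: "" for name in self.FIELDS}
        self.record["keywords"] = []
        self._field = None  # field currently being captured, if any
        self._depth = 0     # nesting depth inside that field's element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "keywords":
            content = attrs.get("content") or ""
            self.record["keywords"] = [
                k.strip() for k in content.split(",") if k.strip()
            ]
        if tag in self.VOID:
            return
        if self._field is not None:
            self._depth += 1
        elif attrs.get("class") in self.FIELDS:
            self._field = attrs["class"]
            self._depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._field = None

    def handle_data(self, data):
        text = data.strip()
        if self._field is not None and text:
            sep = " " if self.record[self._field] else ""
            self.record[self._field] += sep + text


def extract_article(html):
    """Return a dict with title, author, date, body and keywords."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return parser.record


article = extract_article(SAMPLE_HTML)
print(article["title"], "by", article["author"])
```

The sketch matches elements by an exact class name, which is far more brittle than selecting data visually; it is meant only to make the list of extracted fields concrete.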
The following video shows how articles can be extracted (downloaded/saved) from the CNN website using WebHarvy.
If you need assistance in configuring WebHarvy, please do not hesitate to contact our support team (firstname.lastname@example.org) with the details (URL of the webpage + details of the data to be scraped). We are happy to help you get started with your first data extraction project using WebHarvy!