In this article, you will learn how to scrape data from Google News (news.google.com) without writing any code, using WebHarvy. WebHarvy is visual web scraping software that can be used to scrape data from any website.
You will need to download and install WebHarvy on your computer. WebHarvy allows you to select data from web pages via an easy-to-use, point-and-click user interface. You can select the data you need to scrape with simple mouse clicks. WebHarvy automatically identifies and parses repeating data (in lists or tables) displayed on web pages.
Steps to follow to scrape Google News articles
- Download and install WebHarvy on your computer
- Open WebHarvy
- Load the news.google.com page from which you need to scrape data. WebHarvy’s inbuilt browser, which is based on Chromium, can load and navigate web pages just like any normal browser.
- Once the page displaying the data is loaded, click Start Configuration
- Now you can click and select the data which you need to scrape
- The news title and URL can be selected by clicking directly on the text displayed on the page and then choosing the corresponding option from the Capture window.
- WebHarvy allows data to be scraped from multiple pages of listings. Various pagination techniques employed by websites are handled by WebHarvy.
- The news article page can be opened by using either the Follow this link or Open Popup option.
- Once all required data is selected, you can stop configuration
- Click on the Start Mine button to start scraping data
- WebHarvy allows you to save the scraped data to a file or database.
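The scraping itself needs no code, but the saved file can be post-processed afterwards if you wish. The following is a minimal Python sketch, assuming the data was saved as a CSV file with columns named Title and URL. Both column names are assumptions, so substitute the names you chose during configuration. The load_articles function is a hypothetical helper written for this illustration and is not part of WebHarvy. It reads the CSV text, skips incomplete rows, and drops rows whose URL has already been seen.

```python
import csv
import io


def load_articles(csv_text, title_col="Title", url_col="URL"):
    """Parse exported CSV text into a list of unique (title, url) pairs.

    The column names are assumptions; pass the ones used in your export.
    """
    seen = set()
    articles = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = (row.get(title_col) or "").strip()
        url = (row.get(url_col) or "").strip()
        # Skip incomplete rows and URLs that were already collected.
        if not title or not url or url in seen:
            continue
        seen.add(url)
        articles.append((title, url))
    return articles


if __name__ == "__main__":
    # Made-up sample data standing in for a saved file.
    sample = (
        "Title,URL\n"
        "Example headline,https://example.com/a\n"
        "Example headline,https://example.com/a\n"
        "Another headline,https://example.com/b\n"
    )
    for title, url in load_articles(sample):
        print(f"{title} -> {url}")
```

To use it with a real file, read the file contents first (for example with open("news.csv", encoding="utf-8", newline="").read(), where news.csv is a placeholder file name) and pass the text to load_articles.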
Video: Scraping Google News Articles
You may download and try the free evaluation version of WebHarvy available on our website. Follow this link to get started.