Scrape Gumtree listings using WebHarvy – No Code

Gumtree is one of the UK’s biggest classified ads websites. WebHarvy’s point-and-click, visual web scraping interface can be used to easily scrape Gumtree listings. Details like name, phone number, email, description and images can be easily scraped from Gumtree ads using WebHarvy.

How to Scrape Gumtree using WebHarvy?

In order to scrape data from Gumtree listings using WebHarvy, first load the Gumtree website within WebHarvy’s configuration browser. Navigate to the page from which you need to scrape data.

load website to scrape

Selecting data to scrape

Click the Start button in the Home menu to start selecting the data which you need to scrape from Gumtree listings. You can select any data item by simply clicking on it. Clicking on a data item on the page brings up a Capture window with various options. Click the Capture Text option to scrape the selected item’s text.

selecting data to scrape

WebHarvy will automatically parse and identify similar data (from subsequent listings) on the page and display them in the Captured Data Preview pane.

Since listings span multiple pages, WebHarvy can be configured to scrape data from all of them. Pagination can be configured by clicking on the link which loads the next page and selecting the Set as next page link option from the Capture window.

You will need to click and follow each listing link to load its details page, so that additional data like description and images can be scraped. Links can be followed by clicking on them and by selecting the Follow this link option from the Capture window.

Scraping Gumtree listings data

Once you have selected all required data during configuration, click the Stop button in the Home menu to stop the configuration process. The configuration can now be saved so that it can be run or edited later. Click the Start Mine button to start mining data.

scraping gumtree data

Once mining is finished, you can save the scraped data to a file or database.

Video

The following video clearly explains the steps which you need to follow to scrape Gumtree listings data.

Using WebHarvy to Scrape Gumtree

Questions? Need Help?

In case you have any questions, please do not hesitate to contact us.

Try WebHarvy

We highly recommend that you download and try the free evaluation version of WebHarvy available on our website. To get started, please follow this link.

Scraping Transfermarkt with No Code

Transfermarkt is a website which displays market values, transfer news and rumours of international football players. In this article we will learn how scraping data from Transfermarkt website is possible without writing any code.

WebHarvy is a visual web scraping software which can be used to easily scrape data from websites. You can load web pages within WebHarvy and select data with mouse clicks.

scraping transfermarkt

The image above shows the Transfermarkt club listings page loaded within WebHarvy. Since each of these clubs has multiple players whose details we wish to scrape, the ‘Scrape a list of similar links’ feature is used.

After clicking on Actions menu > Scrape a list of similar links option, click on the first club name. WebHarvy will select all clubs in the league and display them in the Data Preview pane.

scraping multiple clubs from transfermarkt

After selecting all clubs, WebHarvy will load the first club’s details page in its browser. This is the page from which we need to select data for scraping. So, click on the ‘Start’ button to start configuration. Then click on the first player name.

starting selecting data to scrape

WebHarvy will display a Capture window from which you should select the ‘Capture Text’ option to scrape the player name. In similar fashion, you can select player position, age/date of birth and market value from the page. WebHarvy will automatically identify and collect all repeating items from the page and display them in the preview pane.

scraping player details from transfermarkt

Now, we can stop configuration (Stop button) and start mining data (Start Mine button).

scraping data from transfermarkt

Once mining is finished, you can save the scraped data to a file or a database.

Video

Watch the following video to see the detailed steps of configuring WebHarvy to scrape data from transfermarkt website.

Try WebHarvy

You can download and try the 15-day free evaluation version of WebHarvy from our website. To get started, please follow the link below.

Getting Started With WebHarvy

If you have any questions, please do not hesitate to contact us.

How to scrape WhoScored.com Live Scores?

Using WebHarvy, you can easily scrape whoscored.com website for live match scores and other data. WhoScored is a website which displays live stats of football matches from various tournaments.

Since WebHarvy is a visual web scraping software, you can easily select the data which you need to scrape from whoscored.com website using mouse clicks. There is no need to write any code or script to scrape data.

Scrape WhoScored.com Live Scores

To start scraping data, first load the whoscored.com home page within WebHarvy’s browser as shown below. Then, click on the Start button in the Home menu to start the configuration process. Now you can select the data which you need to scrape from this page by just clicking on it.

Load the page from which data needs to be scraped within WebHarvy’s browser & start configuration

To select any data, just click on it. WebHarvy will display a Capture window with various options. To capture the selected item’s text, click on the ‘Capture Text’ option. In the resulting window, you can specify a name for the data column. WebHarvy will automatically identify patterns of data in the page and group similar data under the same column. You can see a preview of captured data in the Preview pane.

Selecting data to scrape from WhoScored.com website

Scrape match data by following links

Once you have selected all required data from the starting page of the configuration, you can follow each match link to load its details page. Click on the link which you need to follow and select the ‘Follow this link’ option from the resulting Capture window.

Wait for the match details page to load and, once loaded, click and select more data. The ‘Capture following text’ option in the Capture menu will help you accurately select match details like score, elapsed time, half time / full time score, date, match summary etc.

Click on any data item on page to select it for scraping

After selecting all required data from the details page, stop the configuration process by clicking on the ‘Stop’ button in the Home menu. You can now save the configuration. By clicking on the ‘Start Mine’ button you can start collecting data. WebHarvy allows you to save the mined data to a file or database.

WebHarvy’s Miner Window

Video

The following video explains in detail the configuration process which you need to follow to scrape data from whoscored.com website.

Video showing how WebHarvy can be used to scrape WhoScored.com live scores

The Regular Expression string used in the video to follow each match link from the starting page to load its details page is given below.

href="([^"]*)
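
For readers curious about what this pattern does: the capture group ([^"]*) grabs everything between the opening quote after href= and the next quote, which is the match details page URL. The short JavaScript snippet below is only an illustration of how the pattern behaves; the sample anchor tag in it is invented and is not actual WhoScored markup.

// Illustration only: the sample HTML below is invented, not real WhoScored markup.
var html = '<a class="match-link" href="/matches/1549539/live">Live</a>';
// Same pattern as above: capture everything after href=" up to the next quote.
var match = html.match(/href="([^"]*)/);
if (match) {
	console.log(match[1]); // prints "/matches/1549539/live" – the URL to follow
}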

Try WebHarvy

We highly recommend that you download and try the 15-day free evaluation version of WebHarvy available on our website. Follow the link below to get started.

Getting Started with WebHarvy

If you have any questions, please do not hesitate to reach out to us.

WebHarvy 6.3 – Custom Data Fields, Page Screenshot, Miner Settings in Configuration

WebHarvy 6.3 comes with support for custom data fields. The following custom data fields can be added to a configuration.

  1. Current page URL
  2. Screenshot of currently loaded page
  3. Date and Time of mining data
  4. User provided text

Custom data fields can be added by clicking anywhere on the page during configuration and by selecting ‘Add Custom Data’ option under ‘More Options’ from the Capture window.

Add Custom Data

Page URL
Captures URL/address of currently loaded page.

Page Screenshot
Captures screenshot of the currently loaded page and saves it as an image file.

Date Time
Column filled with date-time of mining data.

Text
Column filled with user provided text label.

Other changes

The other changes in this version are:

  1. Automatically suggests field names for the ‘Capture following text’ option
  2. ‘Capture following text’ made faster during configuration
  3. Supports scraping multiple image URLs automatically. Earlier only multiple image downloads were supported.
  4. Advanced Miner Options are now saved in the configuration. No need to adjust WebHarvy settings before running configuration with non-default miner options.
  5. Updated internal browser to the latest possible version.
  6. Duplicate images are named -001, -002 etc. instead of -1, -2 etc. This helps in sorting images by name.
  7. Improved smart help (article, video suggestions based on loaded website)

Questions?

In case you are new to WebHarvy, we highly recommend that you refer to our getting started guide. If you have any questions or need assistance in configuring WebHarvy to scrape data as per your requirement, please do not hesitate to contact us.

Scraping Chrome Web Store

WebHarvy can be used to scrape Chrome Web Store data. The Chrome Web Store displays Chrome extensions listed under various categories. In this article we will see how WebHarvy can be used to scrape the details of extensions listed under a specific category.

Using WebHarvy to Scrape Chrome Web Store

The Chrome Web Store uses infinite scroll for pagination: extensions are loaded on the same page as you scroll down. However, the newly loaded extensions are placed under a different HTML element. For this reason, we need to run a small JavaScript snippet to bring all extensions under a single HTML element, so that all of them get selected during mining.

The following video shows the steps involved in detail. You can find the various codes used in the video description.

As shown in the above video, scraping extension details from Chrome Web Store is performed in 2 stages:

  1. In stage 1, we get the URLs of all extension details pages.
  2. In stage 2, we scrape data from all these URLs using a single configuration.

The JavaScript code used to collate all rows of data under a single element is given below.

// Collect every container that holds extension listings.
var groups = document.querySelectorAll('[role="grid"]');
// Use the first container as the common parent for all listings.
var parent = groups[0];
for (var i = groups.length - 1; i >= 1; i--) {
	var group = groups[i];
	// Iterate backwards because appendChild removes each node from this
	// live 'children' collection, which keeps the index valid.
	for (var j = group.children.length - 1; j >= 0; j--) {
		parent.appendChild(group.children[j]);
	}
}

The regular expression string used to get extension details page URL is given below.

href="([^"]*)

Try WebHarvy

We highly recommend that you download and try the free evaluation version of WebHarvy available on our website. To get started, please follow the link given below.

Getting started with WebHarvy

How to scrape job postings from various websites?

It is no surprise that, these days, most new job openings are posted online and most job seekers find them online as well. This makes it easy to collect job data from various job boards and websites, which can be useful in many ways.

  • If you own a new job board, aggregator or website, you can use this data to populate your site with fresh and updated job postings
  • If you are a staffing agency or HR consultant, you can use this data to better serve your clients
  • Companies can analyze the latest trends in the job market – which skills are valuable, which roles are high paying etc.
  • As a job seeker, you can use this data to apply to open positions matching your skillset and land a better deal and working environment

WebHarvy for scraping job postings

WebHarvy is a generic and visual web scraping software which can be configured to scrape data from any website. WebHarvy can scrape job postings from websites like Indeed, Google Jobs, Dice etc. Various job details like job title, description, application URL, job id etc. can be easily scraped using WebHarvy.

If you are interested, we highly recommend that you download and try the free evaluation version of WebHarvy. Scraping data from websites is very easy using WebHarvy.

Scraping Google Jobs / Google Careers

The following video shows how WebHarvy can be used to scrape newly posted job details from Google Careers job listings. Job title, URL, position title and other details can be extracted from Google job listings using WebHarvy. To know the regular expression strings used in the video, please refer to the video description.

Scraping Indeed Job Listings

The video below shows how WebHarvy can be used to scrape job listings from Indeed.com. As shown in the video, WebHarvy can scrape the job title, company name, company website, job post date, job description and job URL from Indeed.com listings.

The codes used in the above video can be found in the video description.

Scrape Job Reviews from Glassdoor

WebHarvy can also be used to scrape job reviews from websites like Glassdoor.

Questions?

Please feel free to contact us in case you have any questions regarding WebHarvy. We can check and let you know if WebHarvy can be used to solve your data scraping requirement.

Scraping Tennis Scores from Flashscore

Flashscore is a website which displays live scores and odds of various sports matches like football, tennis, basketball etc. Scraping live match scores and betting odds from Flashscore is possible using WebHarvy.

In this article we will see how WebHarvy can be used to scrape tennis match scores from Flashscore. WebHarvy is a generic and visual web scraper which can be configured to scrape data from any website. WebHarvy can also scrape data from other live score/betting odds websites like SofaScore, OddsPortal, BetExplorer etc. The scraped data can be saved to a file or a database. WebHarvy also allows you to scrape data periodically from these websites via an automated scraping process that runs and saves data without user intervention.

The following video shows how WebHarvy can scrape tennis match scores from the FlashScore website for a list of matches. In this example, we provide WebHarvy with a list of tennis match details page URLs, and WebHarvy scrapes the score data and presents it in spreadsheet format.

Set-by-set match points of tennis matches can also be scraped from Flashscore using WebHarvy, as shown in the following video.

To know more about scraping live scores from FlashScore, please refer to this link.

Try WebHarvy for free

If you are interested, we highly recommend that you download and try the free evaluation version of WebHarvy available on our website.

Getting started with web scraping using WebHarvy

In case you have any questions, please do not hesitate to contact us anytime.

Zillow Scraping Update : How to get all 40 property details per page?

Recently, Zillow updated their website such that the property listings in the search results page are loaded only when the user scrolls down the list (lazy loading). Due to this, if you follow the normal method of data selection, only 9 out of 40 property details per page will be scraped.

To solve this problem and to get all 40 property data per page, please follow the method shown in the following video.

Note that, at the beginning of the video, a placeholder/dummy field is selected. This is important, as it helps WebHarvy scrape all listings from all pages. To simulate scrolling down the property list, the following JavaScript code is used.

https://gist.github.com/sysnucleus/6a7af56a6a6abe14a697c691d12d4840
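
Since the gist contents are not reproduced in this article, the snippet below is only a rough sketch of what such a scroll-simulation script typically looks like. The '#search-results' selector and the interval values are placeholders rather than Zillow's actual markup; please refer to the gist linked above for the exact code used in the video.

// Rough sketch only – '#search-results' is a placeholder selector,
// not Zillow's real markup. See the gist linked above for the actual code.
var list = document.querySelector('#search-results');
var position = 0;
var timer = setInterval(function () {
	position += 500;            // scroll the list down in small steps
	list.scrollTop = position;  // so lazily loaded listings get rendered
	if (position >= list.scrollHeight) {
		clearInterval(timer);   // stop once the end of the list is reached
	}
}, 500);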

As before, you need to use the scroll functionality of your mouse or trackpad to navigate the list up/down, since clicking on the scroll bar during configuration will bring up the Capture window.

In case you need further assistance or have any questions, please do not hesitate to contact our technical support.

Scraping FlashScore Opening/Closing Odds

This article explains how WebHarvy can be used to scrape opening & closing odds from FlashScore website (www.flashscore.com). WebHarvy is a visual web scraping software which can be used to scrape data from any website.

The following video shows the configuration steps which you need to follow to scrape opening and closing odds of various bookmakers from FlashScore website, for multiple matches in a league. The video also shows how basic match details like team names and scores can be scraped.

The Regular Expression strings used in the above video can be found here.

As shown in the video, most of the data which you need to scrape from a web page can be selected using mouse clicks. But to correctly scrape the odds values corresponding to a specific bookmaker from the match details page, regular expression strings are used. This ensures that the data is correctly selected even if the order and number of bookmakers on the match details page vary.
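
As a purely hypothetical illustration of that idea (the sample HTML below is invented and FlashScore's real markup will differ; the actual patterns used in the video are linked above), anchoring the pattern on the bookmaker's name lets the odds be picked out no matter where that bookmaker appears in the list.

// Hypothetical illustration only – the sample HTML is invented, not FlashScore markup.
// Anchoring on the bookmaker name ('bet365' here) makes the match independent of
// the order and number of bookmakers on the page.
var html = 'bet365</span><span class="odd">1.85</span><span class="odd">1.72</span>';
var match = html.match(/bet365[\s\S]*?class="odd">([\d.]+)[\s\S]*?class="odd">([\d.]+)/);
if (match) {
	console.log(match[1], match[2]); // e.g. opening and closing odds for bet365
}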

In addition to scraping various odds values like opening, closing, half time, full time, correct score, over/under etc., WebHarvy can also scrape live match details like the score, timing of goals scored (in football) and even the video URL of goal highlights. You can watch all WebHarvy demonstration videos related to scraping FlashScore at this link.

WebHarvy can also scrape sports betting odds from other websites like OddsPortal, BetExplorer etc.

Try WebHarvy

We highly recommend that you download and try the free evaluation version of WebHarvy available on our website. To get started, please follow the link.

Need Support?

In case you have any questions, please feel free to reach out to our support team.

The easiest way to scrape Zillow property listings – No coding required!

WebHarvy lets you easily scrape property data from multiple real estate websites like Zillow, Trulia, Realtor etc. via a visual and intuitive point-and-click interface. In this article we will see how WebHarvy can be used to scrape Zillow property listings, of course without any coding.

Scraping Zillow Property Data

The following video shows how WebHarvy can be used to scrape Zillow property data. Details like address, price, Zestimate, facts and figures, neighborhood details, pricing history, agent/owner contact details (including phone number) etc. can be scraped from Zillow’s property listings using WebHarvy.

Update (June 2021): Due to recent changes in the Zillow website, a new technique has to be used to scrape all 40 properties displayed on each page. Please watch this video to know more.

Scraping Zillow Property Data for a list of locations / ZIP codes

WebHarvy’s Keyword Scraping feature allows you to scrape property listings data for multiple locations using a single configuration. You can submit the location ZIP codes from which you need to scrape property data and WebHarvy will automatically perform the scraping from all locations. The following video shows how WebHarvy can be used to scrape property data for a list of locations (ZIP codes) using the Keyword Scraping feature.

Land Academy on using WebHarvy to scrape Zillow

Shown below is a recent video by Land Academy showcasing WebHarvy for real estate data scraping from Zillow.

Scraping Zillow owner/agent phone numbers

In addition to scraping property data, WebHarvy can also scrape the contact details (phone numbers) of agents and owners of properties listed on Zillow. The following videos show how.

Try WebHarvy

Download and try the 15-day free trial version of WebHarvy from our website. To get started, follow this link.

Need Support?

In case you need assistance in using WebHarvy for your data scraping requirement please reach out to our support team.