Scraping prices and images from product listings

Being a visual web scraper, WebHarvy lets you select most of the data you need from web pages with simple mouse clicks (click on the data > select the required option from the Capture window). Sometimes, however, the layout of product tiles on ecommerce websites varies from product to product: some products may have a discounted price, others a sponsored or special tag, and so on. In such cases, when you click and select data such as prices and images in the usual way, some rows of the data column may be blank. The following technique works around this problem.

  1. Scraping Prices
  2. Scraping Images
  3. Video

Scraping price from product tiles

Use the following method only if the normal ‘click and select data’ method fails to capture all product prices.

Step 1

Click on the title of the first product.


Step 2

Click the Capture More Content option in the Capture window toolbar once or twice, until the entire product tile text (including the price) is displayed in the preview area.


Step 3

Click the Apply RegEx option in the Capture window and select the regex string for getting the price from the dropdown.


Step 4

Apply the selected RegEx and, once the price is displayed in the preview area, click the main ‘Capture Text’ button.
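
To illustrate why applying a regex to the whole tile text is more robust than clicking the price directly, here is a minimal Python sketch of the same idea outside WebHarvy. The pattern, the function name first_price and the sample tile texts are assumptions made for illustration; they are not the regex string that WebHarvy offers in its dropdown.

```python
import re

# Hypothetical pattern (not WebHarvy's built-in string): a currency symbol
# followed by digits, with optional thousands separators and decimals.
PRICE_RE = re.compile(r"([$€£]\s?\d[\d,]*(?:\.\d+)?)")


def first_price(tile_text: str) -> str:
    """Return the first price-like string in the tile text, or '' if none."""
    match = PRICE_RE.search(tile_text)
    return match.group(1) if match else ""


# Made-up tile texts: the price sits on a different line in each tile,
# which is what breaks position-based selection.
tiles = [
    "Acme Kettle\n$24.99\nFree delivery",
    "Sponsored\nAcme Toaster\n$1,049.50\n4.5 stars",
]

for tile in tiles:
    print(first_price(tile))
```

Because the pattern searches the entire tile text, it finds the price wherever it appears in the tile. Note that this sketch returns the first match, so for a tile that lists both an original and a discounted price it returns whichever comes first in the text.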

Scraping images or image URLs from product tiles

Normally, images can be selected for scraping by clicking directly on them during configuration and then clicking the Capture Image button. If the normal method does not select all product images, the workaround given below can be followed.

Step 1

Click on the title of the first product.


Step 2

Apply the Capture More Content option as many times as required, until the entire product tile text is displayed in the preview area.


Step 3

Click the Capture HTML option in the Capture window toolbar.


Step 4

Click the Apply RegEx option in the Capture window toolbar and, in the resulting window, select the regex string for getting the image URL from HTML.


Step 5

Apply the selected regex string and click on the ‘Capture Image’ button to download the product image or to scrape its URL.
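
The same approach can be sketched in Python for image URLs: take the HTML of the whole tile and pull out the src attribute of the first img tag. The pattern, the function name first_image_url and the sample HTML are assumptions made for illustration; they are not the regex string that WebHarvy provides.

```python
import re

# Hypothetical pattern (not WebHarvy's built-in string): capture the value of
# the src attribute of an <img> tag. The whitespace required before "src"
# keeps attributes such as data-src from matching.
IMG_SRC_RE = re.compile(
    r"""<img[^>]*?\ssrc\s*=\s*["']([^"']+)["']""",
    re.IGNORECASE,
)


def first_image_url(tile_html: str) -> str:
    """Return the src of the first <img> tag in the tile HTML, or ''."""
    match = IMG_SRC_RE.search(tile_html)
    return match.group(1) if match else ""


# Made-up tile HTML with an extra "Sponsored" tag ahead of the image.
tile_html = (
    '<div class="tile"><span>Sponsored</span>'
    '<img class="thumb" src="https://example.com/img/kettle.jpg" alt="Kettle">'
    "<h2>Acme Kettle</h2></div>"
)

print(first_image_url(tile_html))
```

A regex like this is only a rough sketch. Sites that lazy-load images may keep the real URL in a different attribute, in which case the pattern would need to be adjusted.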


Video

The following video shows an example in which image URLs and product prices are selected using the method described above.