Scraping data from paginasamarillas.es | Paginas Amarillas data extraction

In this article we will see how WebHarvy can be used to extract data from the Spanish Yellow Pages website, paginasamarillas.es.

Paginas Amarillas Data Extraction

WebHarvy can extract data such as business name, address, website, email and phone number from paginasamarillas.es listings. The following video shows how this can be done. Most of the details, except the email address (which the website does not display directly), can be selected by clicking on them during configuration. The email address can be extracted from the HTML source of the business details page by applying a regular expression. The regular expression string used to extract the email address is copied below.

customerMail[^;]*;[^;]*;([^\&]*)
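Outside WebHarvy, the same expression can be tried against a piece of page source. The following is a minimal Python sketch, not the tool's internal behaviour. The `sample_html` fragment and the `extract_email` helper are hypothetical and only illustrate the kind of entity-encoded markup the pattern appears to target; the real page source may differ.

```python
import re

# The pattern quoted in the article, unchanged.
EMAIL_PATTERN = re.compile(r"customerMail[^;]*;[^;]*;([^\&]*)")

# Hypothetical fragment of page source (an assumption for illustration):
# the email sits in an attribute whose quotes are encoded as &quot;.
sample_html = (
    'data-business="{&quot;customerMail&quot;:&quot;info@example.com&quot;,'
    '&quot;phone&quot;:&quot;910000000&quot;}"'
)


def extract_email(html):
    """Return the first captured email-like value, or None if absent."""
    match = EMAIL_PATTERN.search(html)
    return match.group(1) if match else None


print(extract_email(sample_html))
```

The pattern skips two semicolons after `customerMail` (the ends of two `&quot;` entities in this sample) and then captures everything up to the next `&`.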

Try WebHarvy

To know more we highly recommend that you download and try the free evaluation version of WebHarvy. To get started, please follow the link below.

Getting started with web scraping using WebHarvy

 

Scraping Yellow Pages Australia (yellowpages.com.au) – phone, email, website

WebHarvy is a visual web scraping software which can be easily configured to scrape data from any website. In this article we will see how WebHarvy can be configured to extract data from www.yellowpages.com.au listings.

Scraping yellowpages.com.au

A special technique is employed to extract data correctly and consistently from yellowpages.com.au listings. This is mainly because the layout of the listing boxes varies from one listing to another: some have a header with a logo/image, some do not, and so on.

The regular expression strings used in the video to extract email, phone, website and address are given below.

https://gist.github.com/sysnucleus/436a2b0be80882f0ae61a391931abf5d

Know More

We highly recommend that you download and try the free evaluation version of WebHarvy available on our website. To get started, please follow the link below.

Getting started with web scraping using WebHarvy

Scraping Zillow for real estate data and agent phone numbers

WebHarvy can be used to easily scrape data from real estate websites like Zillow, Realtor, Trulia, RedFin etc. In this article we will see how real estate data including agent/owner contact details (phone numbers) can be extracted using WebHarvy.

Scraping Real Estate Data from Zillow

The following video shows the steps involved. Data like property address, price, Zestimate, beds/baths, area, property facts and features (type, year built, parking etc.), pricing history, tax history and neighborhood details can be selected for extraction using a point and click interface. WebHarvy will automatically scrape the selected data from multiple properties listed across multiple pages in Zillow.

Scraping agent phone numbers from Zillow

The following video shows how agent phone numbers can be scraped from Zillow property listings. The ‘contact agent’ button needs to be clicked on each property details page to get the agent contact details.

Try WebHarvy

We recommend that you download and try the free evaluation version of WebHarvy available on our website, and make use of our free technical assistance for your first data scraping project. To get started, please follow the link below.

Getting started with web scraping using WebHarvy

Scraping Flashscore – Statistics of all matches in a league

The following video shows how match statistics (possession, goal attempts, shots on goal, blocked shots, corners, offsides etc.) of all matches in a league can be extracted from the FlashScore website using WebHarvy.

In addition to FlashScore, WebHarvy can also be used to extract sports betting odds from many other betting sites like BetExplorer, OddsPortal etc.

The regular expression string used in the video to get the match ID is:

id="([^"]*)

To form the match URL, the following string in the extracted ID is replaced:

g_1_

with

https://www.flashscore.com/match/
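The extract-then-replace step can be sketched in Python as follows. The `sample_html` markup, the example IDs and the `match_urls` helper are hypothetical; only the regular expression, the `g_1_` string and the URL base come from the text above. Skipping IDs that do not start with `g_1_` is an added assumption, since other elements on a page also carry `id` attributes.

```python
import re

# Pattern and replacement strings as given in the article.
MATCH_ID_PATTERN = re.compile(r'id="([^"]*)')
ID_PREFIX = "g_1_"
MATCH_URL_BASE = "https://www.flashscore.com/match/"

# Hypothetical listing markup (an assumption for illustration).
sample_html = (
    '<div id="g_1_AbCd1234" class="event__match"></div>'
    '<div id="g_1_XyZw5678" class="event__match"></div>'
)


def match_urls(html):
    """Build one match URL per element ID that carries the g_1_ prefix."""
    urls = []
    for element_id in MATCH_ID_PATTERN.findall(html):
        if element_id.startswith(ID_PREFIX):
            # Replace only the leading prefix with the URL base.
            urls.append(element_id.replace(ID_PREFIX, MATCH_URL_BASE, 1))
    return urls


for url in match_urls(sample_html):
    print(url)
```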

Know More

We recommend that you download and try the free evaluation version of WebHarvy available on our website. To get started, please visit the link below.

Getting started with web scraping using WebHarvy

Scraping latitude, longitude from Yellow Pages listings – GPS Coordinates Extraction

Yellow Pages business listings often display the location (Map Direction) of the business on a map interface, but the latitude and longitude values (GPS coordinates) are not displayed on the page. However, this information is present inside the HTML code behind the map interface.

Extracting latitude, longitude values

The Capture HTML feature along with Apply RegEx feature of WebHarvy can be used to extract the map coordinates from the HTML code of the page. The following video shows how this can be done. The Regular Expression strings used in the video are copied below.

data-lat="([^"]*)

data-lng="([^"]*)
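As a rough illustration of what these two expressions capture, the Python sketch below applies them to a hypothetical map element. The `sample_html` markup, the coordinate values and the `extract_coordinates` helper are assumptions made for the example; the actual markup depends on the Yellow Pages site being scraped.

```python
import re

# The two patterns from the article (with straight quotes).
LAT_PATTERN = re.compile(r'data-lat="([^"]*)')
LNG_PATTERN = re.compile(r'data-lng="([^"]*)')

# Hypothetical map container markup (an assumption for illustration).
sample_html = '<div class="map" data-lat="40.7128" data-lng="-74.0060"></div>'


def extract_coordinates(html):
    """Return (latitude, longitude) as floats, or None if either is missing."""
    lat = LAT_PATTERN.search(html)
    lng = LNG_PATTERN.search(html)
    if not (lat and lng):
        return None
    return float(lat.group(1)), float(lng.group(1))


print(extract_coordinates(sample_html))
```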

Try WebHarvy

We recommend that you download and try the free evaluation version of WebHarvy available on our website. To get started, please follow the link below.

Getting started with web scraping using WebHarvy

Scraping Yellow Pages for Email/Website, Phone numbers and Addresses

WebHarvy is a visual web scraping software which can be easily configured to extract data from any website, including Yellow Pages. There are various flavors of Yellow Pages websites. In this article we focus on the www.yellowpages.com (US) website.

YellowPages.com data extraction

The YP website is the go-to place for contact details of any business, which makes it one of the best sources of business/professional contact details. The following video shows how easy it is to use WebHarvy to extract details like phone number, website address, address etc. from Yellow Pages listings.

Keyword based scraping

The video also shows how you can automatically submit multiple search keywords to the Yellow Pages website and scrape the resulting data. This feature is called Keyword based Scraping and is explained in the following link.

WebHarvy Keyword based Scraping Explained

Multiple lists of keywords can be provided (e.g., one list for search terms and another for locations), and WebHarvy will automatically submit all combinations of the input keyword lists and scrape the resulting data.
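"All combinations" here means the Cartesian product of the lists. The short Python sketch below shows how many submissions two small lists produce. The keyword values and the `keyword_combinations` helper are made up for illustration and say nothing about how WebHarvy implements the feature.

```python
from itertools import product

# Hypothetical keyword lists (assumptions for illustration).
search_terms = ["plumber", "electrician"]
locations = ["Austin, TX", "Denver, CO"]


def keyword_combinations(*keyword_lists):
    """Return every combination, taking one keyword from each list."""
    return list(product(*keyword_lists))


for term, location in keyword_combinations(search_terms, locations):
    print(f"search={term!r} location={location!r}")
```

Two lists of two keywords each give four submissions; the count grows multiplicatively as lists get longer.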

We recommend that you download and try using the free evaluation version of WebHarvy to know more. Please follow the link below to get started.

Getting started with web scraping using WebHarvy

Scraping Amazon by submitting a list of ISBN numbers

The Keyword Scraping feature of WebHarvy lets you submit a list of keywords (search terms, ASINs, ISBNs etc.) at Amazon and extract the resulting data. WebHarvy supports submitting multiple lists of keywords to multiple search fields (e.g., search query + location) on a website and scraping the results for all combinations of the submitted keywords. To know more, please follow the link given below.

Keyword based Scraping Explained

The following video shows how this feature can be used to extract data from Amazon for a list of ISBN numbers. Details like book title, author, reviews, publisher, cover image etc. can be extracted. The same technique can be used to extract product data corresponding to a list of ASINs.

Know more about Amazon product data extraction using WebHarvy

We recommend that you download and try using the free evaluation version of WebHarvy. For more details please follow the link below.

Getting started with web scraping using WebHarvy

Scraping data from a list of URLs

WebHarvy can scrape data from a list of URLs, provided that they all belong to the same website/domain and share the same layout/page design. This technique is explained in the following link.

How to scrape a list of URLs using a single configuration?

To any WebHarvy configuration (built to extract data from a page / website), you can add additional URLs as explained here. This can be done while creating the configuration, or while editing it later.
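Since the URLs must all belong to the same website, it can help to check a list before adding it to a configuration. The Python sketch below is one hypothetical way to do that check; the `split_by_host` helper and the example URLs are assumptions, and the check cannot tell whether the pages also share the same layout.

```python
from urllib.parse import urlparse


def split_by_host(urls, expected_host):
    """Separate URLs on the expected host from those on any other host."""
    same_host, other_host = [], []
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host == expected_host:
            same_host.append(url)
        else:
            other_host.append(url)
    return same_host, other_host


# Hypothetical URL list (an assumption for illustration).
candidate_urls = [
    "https://www.example.com/product/1",
    "https://www.example.com/product/2",
    "https://shop.example.org/item/3",
]

usable, rejected = split_by_host(candidate_urls, "www.example.com")
print(usable)
print(rejected)
```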

The following video shows how a list of Amazon product page URLs can be scraped using WebHarvy.

To know more, please visit the link below.

Getting started with web scraping using WebHarvy

 

How to scrape multiple high res images from Amazon product listings?

WebHarvy can be used to easily extract high resolution images (multiple images) of products listed at Amazon. Apart from images, WebHarvy can also extract product details like price, rating/reviews, ASIN, BSR, specification, description, seller details etc. 

The following video shows the steps to configure WebHarvy for Amazon image extraction. It shows how to extract the default displayed image, the multiple 500×500 higher resolution images, and the highest resolution 1000×1000 images.

The Regular Expression strings used in the video are:

src="([^_]*)[^\.]*\.([^"]*)

hires="([^"]*)
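To see what the two expressions capture, here is a small Python sketch run against hypothetical markup. The image host, file names, attribute name `data-old-hires` and helper names are assumptions for illustration. Joining the two capture groups of the first pattern is also an assumption about how the result is assembled; the effect is to drop the size token that sits between the first underscore and the last dot before the extension.

```python
import re

# The two patterns from the article (with straight quotes).
THUMB_PATTERN = re.compile(r'src="([^_]*)[^\.]*\.([^"]*)')
HIRES_PATTERN = re.compile(r'hires="([^"]*)')

# Hypothetical markup (assumptions for illustration).
sample_thumb = '<img src="https://images.example.com/I/71abc._AC_US40_.jpg">'
sample_hires = '<img data-old-hires="https://images.example.com/I/71abc._AC_SL1000_.jpg">'


def url_without_size_token(img_tag):
    """Join both capture groups, dropping the size token between them."""
    match = THUMB_PATTERN.search(img_tag)
    if match is None:
        return None
    return match.group(1) + match.group(2)


def hires_url(img_tag):
    """Return the URL held in an attribute ending in hires, if present."""
    match = HIRES_PATTERN.search(img_tag)
    return match.group(1) if match else None


print(url_without_size_token(sample_thumb))
print(hires_url(sample_hires))
```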

Know more about image extraction using WebHarvy: https://www.webharvy.com/tour1.html#ScrapeImage

To know more: Getting started with web scraping using WebHarvy

How to scrape real estate data from redfin.com?

WebHarvy is a visual (point and click) data extraction software which can be easily configured to extract data from any website. This article explains how WebHarvy can be configured to extract property details from redfin.com which is a real estate website.

Apart from scraping data from redfin, WebHarvy can also be used to extract data from property listing sites like Zillow, Trulia, Realtor etc.

The following video shows how WebHarvy can be used to easily extract property details from redfin.com. Details like property price, address, area, built date, features, property history etc. are selected via the intuitive point and click interface. WebHarvy can follow each property link to extract additional data as well as automatically load and extract these data from multiple pages of listings.

As you can see, most of the details are selected by directly clicking on them. There is no complex configuration process or code/script to write. To know more and to familiarize yourself with how WebHarvy can be used to extract data from websites, please follow the link below.

Getting started with web scraping using WebHarvy

We have several demonstration videos related to real estate / property data extraction on our YouTube channel, which you can watch by following the link given below.

Real Estate Data Extraction Videos

If you have any questions you may contact our support at https://www.webharvy.com/support.html