Web Scraping Instagram

Learn how web scraping can be used to download images and other data from Instagram, one of the most popular photo and video sharing websites, with hundreds of millions of users.

Web Scraping is the process of automatically extracting data from websites using software tools called web scrapers. In this article we will be using WebHarvy to scrape images and other data from Instagram.

WebHarvy is visual web scraping software that lets you scrape data from any website. It is easy to use: you select the data to be scraped from a web page simply by clicking on it.

How to scrape Instagram images?

The following demonstration video shows how WebHarvy can be used to scrape images from Instagram. Along with downloading images, other details like profile name, image location, number of likes etc. can also be scraped.

The RegEx string and JavaScript code used in the above video can be found here.

Steps to follow to scrape images from Instagram

  1. Download and install WebHarvy on your computer.
  2. Open WebHarvy and load the Instagram page from which you need to scrape images.
     (Figure: Instagram loaded for scraping images)
  3. Remove the mouse overlay displayed over images (showing the number of likes and comments) using Dev Tools. Remove all mouse event handlers other than Mouse Up and Mouse Down.
     (Figure: Removing mouse event handlers for scraping)
  4. Start Configuration.
  5. Go to the Configuration menu tab and select the Disable pattern detection option.
     (Figure: Disable pattern detection)
  6. Click on the first image tile and select the Click option from the resulting Capture window.
  7. Once the image details page opens, click and select textual details such as profile name, image description and number of likes, using the Capture Text option in the Capture window.
     (Figure: Scraping Instagram textual data)
  8. To download the image, click on the image.
  9. Select the Capture More Content option twice.
  10. Select Capture HTML followed by the Apply Regular Expression option.
  11. Paste and apply the following RegEx string: src[^\=]*="([^\s"]*)
     (Figure: Scraping Instagram images)
  12. Click the Capture Image button, which is enabled once the above RegEx is applied.
  13. To scrape images from multiple pages, use the pagination via JavaScript method. Use the JavaScript code given here to configure pagination.
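
As a rough illustration of what the RegEx string in step 11 does, the following JavaScript sketch applies it to two made-up image tags outside WebHarvy. The function name extractImageUrl and the example.com URLs are assumptions for illustration only; they are not part of WebHarvy or Instagram. The pattern finds the first occurrence of src that is followed (after any characters other than =) by =" and captures the value up to the first whitespace or closing quote, so a srcset attribute that appears first will also match.

```javascript
// The RegEx string from step 11, written as a JavaScript regular expression.
const SRC_PATTERN = /src[^\=]*="([^\s"]*)/;

// Hypothetical helper: returns the first captured URL, or null if nothing matches.
function extractImageUrl(html) {
  const match = SRC_PATTERN.exec(html);
  return match ? match[1] : null;
}

// Made-up HTML fragments; real Instagram markup will differ.
const plainTag = '<img alt="Photo" src="https://example.com/photo.jpg" class="x">';
const srcsetTag =
  '<img srcset="https://example.com/small.jpg 640w,https://example.com/large.jpg 1080w" ' +
  'src="https://example.com/large.jpg">';

console.log(extractImageUrl(plainTag));  // → https://example.com/photo.jpg
console.log(extractImageUrl(srcsetTag)); // → https://example.com/small.jpg
```

In the second fragment the capture stops at the space before 640w, which is why only the first URL of the srcset value is returned.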

Scraping multiple images (per post) from Instagram

Some Instagram posts may contain multiple images. WebHarvy can be configured to scrape all images displayed within each Instagram post. The following demonstration video explains how multiple images can be automatically scraped using WebHarvy. This video also shows how details like image location, image URL and content/description can be scraped.

The regular expression strings used in the video along with the JavaScript code used for pagination can be found in the video description.
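
The regular expressions from the video are not reproduced in this article. Purely to illustrate the idea of collecting every image in a post, the sketch below reuses the single-image RegEx string from the steps above with the global flag. The function name extractAllImageUrls and the sample HTML are assumptions, not the configuration shown in the video.

```javascript
// Single-image RegEx string from the steps above, with the global flag added
// so that every match in the HTML is visited (an assumption for illustration).
const SRC_PATTERN_GLOBAL = /src[^\=]*="([^\s"]*)/g;

// Hypothetical helper: collects each captured URL once, in order of appearance.
function extractAllImageUrls(html) {
  const urls = new Set();
  for (const match of html.matchAll(SRC_PATTERN_GLOBAL)) {
    if (match[1]) {
      urls.add(match[1]);
    }
  }
  return Array.from(urls);
}

// Made-up markup for a post with three slides, one of them repeated.
const postHtml =
  '<ul>' +
  '<li><img src="https://example.com/1.jpg"></li>' +
  '<li><img src="https://example.com/2.jpg"></li>' +
  '<li><img src="https://example.com/1.jpg"></li>' +
  '</ul>';

console.log(extractAllImageUrls(postHtml));
// → [ 'https://example.com/1.jpg', 'https://example.com/2.jpg' ]
```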

Scrape follower details of any Instagram profile

The following demonstration video explains in detail how you can scrape the names and handles of all followers of any Instagram profile.

The JavaScript code used in the above video for pagination is copied below:

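// Locate the element that holds the follower entries.
var scrollEl = document.getElementsByClassName('pbNvD fPMEg')[0].children[1].children[0].children[0];
// Scroll its last entry into view so that more followers load.
scrollEl.children[scrollEl.children.length - 1].scrollIntoView();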
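
The pagination snippet above throws an error if any element along its path is missing, which can happen when Instagram changes its generated class names. The following is a minimal defensive sketch of the same logic, not the code used in the video. The function name scrollToLastFollower is an assumption, and the class names 'pbNvD fPMEg' are copied from the snippet above and may no longer match the live page.

```javascript
// Hypothetical defensive variant of the pagination snippet.
// Takes the document as a parameter and returns true if it scrolled,
// false if the expected elements were not found.
function scrollToLastFollower(doc) {
  // Class names copied from the original snippet; they may be out of date.
  const container = doc.getElementsByClassName('pbNvD fPMEg')[0];
  if (!container) return false;

  // Follow the same path as the original: children[1].children[0].children[0]
  let el = container;
  for (const index of [1, 0, 0]) {
    el = el.children && el.children[index];
    if (!el) return false;
  }

  const items = el.children;
  if (!items || items.length === 0) return false;

  // Scrolling the last entry into view prompts the page to load more entries.
  items[items.length - 1].scrollIntoView();
  return true;
}
```

In a page context it would be called as scrollToLastFollower(document); a return value of false suggests that the class names or the element structure need to be updated.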

Know More

If you are interested in using WebHarvy to scrape data from Instagram and other websites, we recommend that you download and try the free evaluation version of WebHarvy available on our website. To get started, please follow the link given below.

Getting started with WebHarvy

Need Support or Have Questions?

If you have any questions, please contact our support team (support@webharvy.com) with the details (URL of the web page and details of the data to be scraped). We are happy to help you get started with your first data extraction project using WebHarvy.