How to web scrape YouTube comments?

Half the fun of YouTube is in the comments section 😃 There is a wealth of user-submitted data in the form of comments under each YouTube video. This data can be scraped and used for purposes such as sentiment analysis. In this article we will learn how to easily scrape YouTube video comments using WebHarvy.

WebHarvy

WebHarvy is easy-to-use visual web scraping software that can be used to scrape data from any website. It can be installed locally on your computer, and the scraped data can be saved to a file or database.

Scraping YouTube Comments using WebHarvy

Download and install WebHarvy on your computer. WebHarvy has a browser-like user interface in which you can load and navigate web pages.

In WebHarvy’s browser, load the YouTube video page whose comments you wish to scrape. Scroll down the page so that the comments section is loaded and visible.

Then start the configuration. The configuration stage is where you teach WebHarvy which data to scrape. Click on the data item (text or image) you wish to scrape and WebHarvy will display a Capture window with various options. Select the Capture Text option to scrape the item’s text.

Click on the author name (handle) of the first comment and select the Capture Text option from the Capture window that appears. Following the same method, select the comment text, the number of likes and the number of replies.
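
For readers who think in terms of the page's DOM, the four fields selected here correspond roughly to one record per comment thread. The following is only a minimal sketch and is not part of WebHarvy: the function name `extractComment` is hypothetical, and the selectors `#author-text`, `#content-text`, `#vote-count-middle` and `#more-replies` are assumptions about YouTube's markup, which changes over time.

```javascript
// Hypothetical helper: reads the four fields from one comment thread element.
// The selectors are assumptions about YouTube's markup and may need updating.
function extractComment(thread) {
  // Return the trimmed text of the first match, or '' when nothing matches.
  function textOf(selector) {
    var node = thread.querySelector(selector);
    return node ? node.textContent.trim() : '';
  }
  return {
    handle: textOf('#author-text'),
    text: textOf('#content-text'),
    likes: textOf('#vote-count-middle'),
    replies: textOf('#more-replies')
  };
}
```

In a browser console, `Array.from(document.getElementsByTagName('ytd-comment-thread-renderer')).map(extractComment)` would build one such record for every comment loaded so far, assuming the selectors still match.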

Scraping comments by scrolling down the page

YouTube loads more comments as we scroll down the page. This is known as infinite scrolling, the same technique used by social feeds such as Facebook's: the list never seems to end, hence "infinite". We need to teach WebHarvy how to repeatedly scroll down the page and scrape comments.

WebHarvy supports the various pagination techniques that websites use to display large sets of data. For scraping YouTube comments, we use the Pagination via JavaScript method.

The following code is used for pagination (see video below).

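/* Bring the comments section into view so that YouTube starts loading comments. */
document.getElementById('comments').scrollIntoView();
/* Scroll to the last loaded comment thread to trigger loading of the next batch. */
var els = document.getElementsByTagName('ytd-comment-thread-renderer');
if (els.length > 0) els[els.length - 1].scrollIntoView();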
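
The snippet above performs a single scroll step. As a rough sketch of why repeating it eventually walks through all the comments, the step can be wrapped in a function that reports how many comment threads are loaded, so that a caller can stop once the count no longer grows. The names `scrollToLastComment` and `scrollUntilDone`, the `maxRounds` limit and the `waitMs` delay are illustrative assumptions and not WebHarvy settings; WebHarvy takes care of the repetition itself.

```javascript
// Hypothetical helper: performs one scroll step and returns the number of
// comment threads currently loaded in the given document.
function scrollToLastComment(doc) {
  var section = doc.getElementById('comments');
  if (section) section.scrollIntoView();
  var els = doc.getElementsByTagName('ytd-comment-thread-renderer');
  if (els.length > 0) els[els.length - 1].scrollIntoView();
  return els.length;
}

// Hypothetical loop: repeats the step until no new comments appear or
// maxRounds is reached. waitMs is an assumed delay that gives the page
// time to load the next batch of comments.
async function scrollUntilDone(doc, maxRounds, waitMs) {
  var previous = -1;
  for (var round = 0; round < maxRounds; round++) {
    var count = scrollToLastComment(doc);
    if (count === previous) return count; // nothing new was loaded
    previous = count;
    await new Promise(function (resolve) { setTimeout(resolve, waitMs); });
  }
  return previous;
}
```

In a browser console this could be tried as `scrollUntilDone(document, 50, 2000)`; the values 50 and 2000 are arbitrary and would need tuning for long comment sections or slow connections.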

Click the Stop Configuration button to stop the configuration stage and start mining data.

Video

The following video shows in detail how WebHarvy can be configured to scrape comments from YouTube videos.

Questions?

If you have any questions regarding WebHarvy, please feel free to contact our support.