Keyword Scraping

Watch Video Tutorial: How to scrape data by submitting multiple search keywords?

  • The Evaluation version of WebHarvy supports only 2 keywords while scraping data using this feature
  • The Keyword Scraping feature of WebHarvy allows you to scrape data by submitting a list of keywords to a web page. WebHarvy submits each keyword in the list in turn and scrapes the data that is displayed for it. Let us follow an example.


    Suppose we need to scrape the search results for the following keywords at a yellow pages website: Accountant, Lawyer, Plumber and Doctor. First, navigate to the search/home page of the website. Place the cursor in the search box and click the Input a list of keywords button from the Actions menu, as shown below.

    Scrape data by submitting multiple keywords
  • In the resulting window, enter the keywords. You may type them one per line or paste a keyword list in CSV format. You can also import keywords directly from a CSV file (or from a file with one keyword per line) by clicking the import button at the top right of the window.

    Input Keywords
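
If your keywords come from mixed sources, it can help to normalize them into a clean one-per-line list before pasting or importing them. The following is a minimal Python sketch, not part of WebHarvy: the function name parse_keywords is hypothetical, and dropping blank entries and exact duplicates is an assumption made for illustration.

```python
import csv
import io


def parse_keywords(text):
    """Return keywords from text that is one-per-line, comma-separated, or a mix.

    Assumption for this sketch: blank entries and exact duplicates are dropped,
    and the original order is preserved.
    """
    keywords = []
    # csv.reader also handles quoted entries that contain a comma.
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        for cell in row:
            cell = cell.strip()
            if cell and cell not in keywords:
                keywords.append(cell)
    return keywords


if __name__ == "__main__":
    raw = "Accountant, Lawyer\nPlumber\nDoctor\n"
    # Print one keyword per line, ready to paste into the keywords window.
    print("\n".join(parse_keywords(raw)))
```

The printed list can be pasted into the keywords window or saved to a text file and imported.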
  • Click the OK button and WebHarvy will automatically fill the selected input (search) box with the first keyword in the list. Do not change this value. If WebHarvy does not automatically fill the input (search) box with the first keyword, enter the first keyword manually, exactly as it appears in the list, including letter case.

    You may fill multiple input fields with keywords by following the above method. For example, separate keyword lists can be provided for the search term and the location.

    Once all keyword lists are configured, you may manually fill additional search / form parameters and click the 'Search'/Form submit button.

    Perform Search
  • Once the page displaying search results is loaded, click the Configuration - Start button in the Home menu and start selecting the data to be scraped. Click the Configuration - Stop button in the Home menu when you have finished selecting data. Click the Start-Mine button to start mining data. While mining, the configuration will be repeated for all specified keywords.

  • Adding keywords after starting configuration

    If for some reason you are unable to configure Keywords as explained above, you can add them after starting configuration. For this, after starting configuration, click on the 'Keywords' button in the 'Configuration' tab.

    Add Keywords during Configuration

    In the resulting Keywords window, you can type in the keywords one per line or comma separated. However, the first keyword which you enter must appear 'as it is' in the start URL or Post Data of the configuration, or in an Input Text field used in the configuration to perform the search.
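
The 'as it is' requirement can be checked before mining. The sketch below is only a conceptual illustration in Python and does not describe WebHarvy's internals: it assumes the keyword is substituted by plain, case-sensitive text replacement, and the function names and the example.com URL are hypothetical.

```python
def first_keyword_matches(first_keyword, start_url="", post_data=""):
    """Return True if the first keyword appears verbatim (case sensitive)
    in the start URL or in the Post Data string."""
    if not first_keyword:
        return False
    return first_keyword in start_url or first_keyword in post_data


def expand_urls(start_url, keywords):
    """Illustration only: build one URL per keyword by replacing the first
    keyword in the start URL with each keyword in the list."""
    first = keywords[0]
    if not first_keyword_matches(first, start_url=start_url):
        raise ValueError("first keyword %r not found in start URL" % first)
    return [start_url.replace(first, keyword) for keyword in keywords]


if __name__ == "__main__":
    # Hypothetical start URL captured while searching for the first keyword.
    url = "https://example.com/search?q=Accountant&loc=NY"
    for expanded in expand_urls(url, ["Accountant", "Lawyer", "Plumber", "Doctor"]):
        print(expanded)
```

Under this assumption, a first keyword typed as "accountant" would not be found in a URL that contains "Accountant", which is why the first keyword has to match exactly.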

  • Adding Keywords for Input Text action

    For some websites, the search keyword is not passed in the URL or Post Data. In such cases, you can use the Input Text action to enter the keyword in the search box. Follow the steps given below.

    1. Load the search page in WebHarvy's configuration browser.
    2. Start Configuration.
    3. Click over the search box and select More Options > Input Text option from the Capture window.
    4. In the resulting window, paste the first keyword from your list of keywords and apply. The first keyword will appear in the search box.
    5. Use the page interaction functions to fill other inputs in the search form and click on the search button to load the search results page.
    6. From the Configuration menu tab, click the Keywords item under Edit pane.
    7. Paste the full keyword list (including the first one) in the resulting window and Apply.
    8. Continue with data selection.
  • Related Links

    1. How to edit keywords in the configuration?
    2. How to add an additional data column which displays the Keyword for each row data?