{"id":800,"date":"2019-09-24T10:47:10","date_gmt":"2019-09-24T10:47:10","guid":{"rendered":"http:\/\/webharvy.com\/whblog\/?p=800"},"modified":"2024-11-01T05:47:14","modified_gmt":"2024-11-01T05:47:14","slug":"webharvy-5-5-custom-user-agent-string-handles-frames-better-form-submission-navigation","status":"publish","type":"post","link":"https:\/\/www.webharvy.com\/blog\/webharvy-5-5-custom-user-agent-string-handles-frames-better-form-submission-navigation\/","title":{"rendered":"WebHarvy 5.5 (Custom User Agent String, Handles frames, better form submission\/navigation)"},"content":{"rendered":"<p>The following are the main changes (features\/improvements) of WebHarvy 5.5<\/p>\n<h2>1. Custom User Agent String<\/h2>\n<p>If you go to <a href=\"https:\/\/www.webharvy.com\/tour81.html\">WebHarvy Settings<\/a> &gt; Browser tab, you can enable custom user agent string as shown below.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-802 size-full\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2019\/09\/Custom-User-Agent-String.png\" alt=\"\" width=\"401\" height=\"518\" \/><\/p>\n<p>The\u00a0<span style=\"color: darkblue;\">&#8216;Enable custom user agent string&#8217;<\/span>\u00a0option allows you to specify a\u00a0<a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/HTTP\/Headers\/User-Agent\" target=\"_blank\" rel=\"noopener noreferrer\">user agent string<\/a>\u00a0which WebHarvy configuration and mining browsers will use. 
This option can be used to make WebHarvy&#8217;s browser appear as another specific browser (e.g.\u00a0<a href=\"http:\/\/www.useragentstring.com\/pages\/useragentstring.php?name=Edge\" target=\"_blank\" rel=\"noopener noreferrer\">Microsoft Edge<\/a>,\u00a0<a href=\"http:\/\/useragentstring.com\/pages\/useragentstring.php?name=Firefox\" target=\"_blank\" rel=\"noopener noreferrer\">Mozilla Firefox<\/a>,\u00a0<a href=\"http:\/\/useragentstring.com\/pages\/useragentstring.php?name=Chrome\" target=\"_blank\" rel=\"noopener noreferrer\">Google Chrome<\/a>\u00a0or\u00a0<a href=\"http:\/\/useragentstring.com\/pages\/useragentstring.php?name=safari\" target=\"_blank\" rel=\"noopener noreferrer\">Apple Safari<\/a>) to the websites from which you are extracting data.<\/p>\n<h2>2. Better form submission, initial navigation<\/h2>\n<p>Suppose the configuration needs to enter values in a search form (like the one shown below) and then click the &#8216;Search&#8217; button to run the search and display the results. The results contain the data which you need to extract.<br \/>\n<img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-807 size-full\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2019\/09\/form-1.png\" alt=\"\" width=\"776\" height=\"426\" \/><\/p>\n<p>Earlier, you needed to <a href=\"https:\/\/www.webharvy.com\/tour41.html#DisablePatterns\">disable pattern detection<\/a> before filling the form fields. After clicking the search button, once the data you needed to extract was displayed, you had to re-enable pattern detection before selecting the required data.<\/p>\n<p>With the latest version, you no longer need to adjust the pattern detection state manually. WebHarvy handles this automatically.<\/p>\n<h2>3. 
Open frames and select data<\/h2>\n<p>Earlier, if the data you needed to select for extraction occurred within a <a href=\"https:\/\/www.w3schools.com\/tags\/tag_iframe.asp\">frame<\/a> inside the page, <a href=\"https:\/\/www.webharvy.com\/articles\/troubleshoot.html#DataInsideFrame\">you needed to find the frame URL<\/a>, load it independently within WebHarvy and then start configuration.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-824\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2019\/09\/44a.png\" alt=\"\" width=\"620\" height=\"467\" \/><\/p>\n<p>With this version, we have added a new Capture window option to open frames. Whenever you click on an item within a frame, the resulting Capture window will have an &#8216;Open Frame&#8217; option. Clicking it makes WebHarvy load the frame contents within the browser view, so that you can proceed with data selection.<\/p>\n<h2>4. Browser Search<\/h2>\n<p>You can press CTRL + F in the configuration browser (while not in configuration mode) to bring up the search window, which lets you search for text on the currently loaded page.<\/p>\n<h2>5. Capture full page HTML<\/h2>\n<p>Sometimes you will need to capture the full page HTML to extract some data within it by applying <a href=\"https:\/\/www.webharvy.com\/tour1.html#ScrapeByRegEx\">regular expressions<\/a>. Earlier, you needed to click anywhere on the page and select the <a href=\"https:\/\/www.webharvy.com\/tour1.html#ScrapeMore\">Capture More Content<\/a>\u00a0option multiple times until the whole page content was selected, and then select the <a href=\"https:\/\/www.webharvy.com\/tour1.html#ScrapeHTML\">Capture HTML<\/a>\u00a0option to get the full page HTML. 
With the latest version, you can double-click the &#8216;Capture HTML&#8217; toolbar button to capture the full page HTML directly.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-811\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2019\/09\/capture-full-page-html.png\" alt=\"\" width=\"375\" height=\"354\" \/><\/p>\n<h2>6. Reset settings to default<\/h2>\n<p>You no longer need to remember what the default settings were. Just click the &#8216;Reset settings to default&#8217; link in the Settings window.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-812\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2019\/09\/reset-settings.png\" alt=\"\" width=\"401\" height=\"518\" \/><\/p>\n<h2>7. Lower repetition intervals for scheduled tasks<\/h2>\n<p>The <a href=\"https:\/\/www.webharvy.com\/tour82.html\">Mining tasks scheduler<\/a> now allows you to repeat mining tasks at 5, 10, 15 and 30 minute intervals.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-813\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2019\/09\/scheduler.png\" alt=\"\" width=\"358\" height=\"472\" \/><\/p>\n<h2>Minor Changes<\/h2>\n<ol>\n<li>&#8216;Enable Web Security&#8217; option in Browser Settings is ON by default<\/li>\n<li>Browser handles &#8216;Need Client Certificate&#8217; request from Web Servers<\/li>\n<li>Updated internal browser to latest possible version of Chromium<\/li>\n<li>HTTP2 support enabled<\/li>\n<li>Bug fixes and overall improvements\n<ol>\n<li>Fixed issue where some selected data items were not extracted correctly during mining<\/li>\n<li>Preview generation is stopped when configuration is stopped<\/li>\n<li>Deleting data fields not allowed while preview generation is in progress<\/li>\n<li>Fixed issue with &#8216;pattern detection enabled for a while&#8217; soon after <a 
href=\"https:\/\/www.webharvy.com\/tour1.html#OpenPopup\">opening popup<\/a><\/li>\n<li>Fixed issue with editing the start page URL in a configuration with multiple URLs<\/li>\n<li>Single-term search supported from configuration browser address bar<\/li>\n<li>Fixed issue where pagination controls were enabled in the Miner window for single-page configurations after mining was stopped<\/li>\n<li>Fixed bug in Keyword Scraping caused by case-sensitive keyword replacement in start URL \/ Post-data<\/li>\n<li>Fixed issue where the initial page (quick start guide) sometimes took forever to load while starting WebHarvy<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>As always, you can download and install the latest version from\u00a0<a href=\"https:\/\/www.webharvy.com\/download.html\">https:\/\/www.webharvy.com\/download.html<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The following are the main changes (features\/improvements) of WebHarvy 5.5 1. Custom User Agent String If you go to WebHarvy Settings &gt; Browser tab, you can enable custom user agent string as shown below. The\u00a0&#8216;Enable custom user agent string&#8217;\u00a0option allows you to specify a\u00a0user agent string\u00a0which WebHarvy configuration and mining browsers will use. 
This option &#8230; <a title=\"WebHarvy 5.5 (Custom User Agent String, Handles frames, better form submission\/navigation)\" class=\"read-more\" href=\"https:\/\/www.webharvy.com\/blog\/webharvy-5-5-custom-user-agent-string-handles-frames-better-form-submission-navigation\/\" aria-label=\"Read more about WebHarvy 5.5 (Custom User Agent String, Handles frames, better form submission\/navigation)\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,8],"tags":[74],"class_list":["post-800","post","type-post","status-publish","format-standard","hentry","category-release-update","category-webharvy","tag-new-release"],"_links":{"self":[{"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/posts\/800","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/comments?post=800"}],"version-history":[{"count":1,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/posts\/800\/revisions"}],"predecessor-version":[{"id":1651,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/posts\/800\/revisions\/1651"}],"wp:attachment":[{"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/media?parent=800"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/categories?post=800"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/tags?post=800"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}