{"id":979,"date":"2020-08-17T12:43:24","date_gmt":"2020-08-17T12:43:24","guid":{"rendered":"http:\/\/webharvy.com\/whblog\/?p=979"},"modified":"2020-08-17T12:43:24","modified_gmt":"2020-08-17T12:43:24","slug":"webharvy-6-1-internal-proxies-database-file-update-new-capture-window-options","status":"publish","type":"post","link":"https:\/\/www.webharvy.com\/blog\/webharvy-6-1-internal-proxies-database-file-update-new-capture-window-options\/","title":{"rendered":"WebHarvy 6.1 &#8211; Internal Proxies, Database\/File Update, New Capture window options"},"content":{"rendered":"<p>The following are the main changes in this version.<\/p>\n<h2>Option to leave a blank row when data is unavailable for a keyword\/category\/URL<\/h2>\n<p>In WebHarvy&#8217;s <a href=\"https:\/\/www.webharvy.com\/tour81.html#CategoryKWSettings\">Keyword\/Category settings page<\/a> a new option has been added to leave a blank row filled with corresponding keyword\/category\/URL when data is unavailable for that item. This option is available only when &#8216;Tag with Category\/URL\/Keyword&#8217; option is enabled.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-980 size-full\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2020\/08\/addblankrow.png\" alt=\"\" width=\"401\" height=\"518\" \/><\/p>\n<p>For mining data using a list of keywords, categories or URLs, enabling this option helps in identifying the items for which WebHarvy failed to fetch data, as shown below.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-981\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2020\/08\/blankrows.png\" alt=\"\" width=\"957\" height=\"270\" \/><\/p>\n<h2>Proxies are used internally by WebHarvy, not system-wide<\/h2>\n<p>In earlier versions, <a href=\"https:\/\/www.webharvy.com\/tour8.html\">proxies set in WebHarvy Settings<\/a> were applied system wide during mining. 
This caused side effects for other applications, especially when proxies required login with a user name and password or when a list of proxies was cycled. Starting with this version, WebHarvy uses proxies internally so that other applications are not affected during mining. You can still apply proxies directly in Windows settings (system-wide), and WebHarvy will use them automatically.<\/p>\n<p>Also, the configuration browser now uses the proxies which are set in WebHarvy settings. In earlier versions, proxies were used only during mining.<\/p>\n<h2>Database, Excel File Export : Update option (Upsert)<\/h2>\n<p>While <a href=\"https:\/\/www.webharvy.com\/tour6.html\">saving\/exporting mined data<\/a> to a database or Excel file which already contains data (from a previous mining session), WebHarvy now allows you to update the rows which have the same first column value as rows in the newly mined data, instead of creating duplicate rows.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-983\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2020\/08\/dbupdate.png\" alt=\"\" width=\"304\" height=\"464\" \/><\/p>\n<p>For file export this option is currently available only for Excel files.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-984\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2020\/08\/fileupdate.png\" alt=\"\" width=\"393\" height=\"264\" \/><\/p>\n<h2>New Capture window options : Page reload and Go back<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-985\" src=\"http:\/\/webharvy.com\/whblog\/wp-content\/uploads\/2020\/08\/newpageoptions-1024x733.png\" alt=\"\" width=\"640\" height=\"458\" \/><\/p>\n<p>Two new capture window options for page interaction have been added &#8211; Reload &amp; Go back. The Reload option is helpful in cases where a page does not load correctly the first time a 
link is followed. The &#8216;Go back&#8217; option navigates the browser back to the previously loaded page.<\/p>\n<h2>Keywords can be added even after starting configuration<\/h2>\n<p>Just like URLs, keywords can now be added after starting configuration. This method is useful in cases where the <a href=\"https:\/\/www.webharvy.com\/tour71.html\">normal method of Keyword Scraping<\/a> cannot be applied. The only condition for <a href=\"https:\/\/www.webharvy.com\/tour71.html#addkeywordslater\">adding keywords this way<\/a> is that the first keyword entered must be present in the <a href=\"https:\/\/www.webharvy.com\/tour41.html#EditStartURL\">Start URL or Post Data<\/a> of the configuration.<\/p>\n<h2>Other minor changes<\/h2>\n<ol>\n<li>During configuration, in pages reached by following links from the starting page, links (URLs) selected by applying Regular Expressions on HTML can be followed using the &#8216;Follow this link&#8217; option. Earlier, only the Click option was available for this scenario.<\/li>\n<li>Encoded URLs selected from HTML are handled automatically, for example URLs containing &#8216;&amp;amp;&#8217;. This works for following links as well as for image URLs.<\/li>\n<li>The &#8216;Enable JavaScript&#8217;, &#8216;Share Location&#8217; and &#8216;Enable plugins&#8217; options have been removed from Browser settings.<\/li>\n<li>Fixed a bug in scraping a list of URLs that occurred when one of the URLs failed to load.<\/li>\n<li>While scraping a list of URLs, URLs which do not start with the HTTP scheme part (http:\/\/ or https:\/\/) are now handled.<\/li>\n<\/ol>\n<h2>Download the latest version<\/h2>\n<p>The latest version of WebHarvy is available <a href=\"https:\/\/www.webharvy.com\/download.html\">here<\/a>. 
If you are new to WebHarvy, we recommend viewing our &#8216;<a href=\"https:\/\/www.webharvy.com\/articles\/getting-started.html\">Getting started<\/a>&#8217; guide.<\/p>\n<p>&nbsp;<\/p>\n<p><iframe loading=\"lazy\" title=\"What is new in WebHarvy 6.1 ?\" width=\"1200\" height=\"675\" src=\"https:\/\/www.youtube.com\/embed\/B6zAhJASOz8?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The following are the main changes in this version. Option to leave a blank row when data is unavailable for a keyword\/category\/URL In WebHarvy&#8217;s Keyword\/Category settings page a new option has been added to leave a blank row filled with corresponding keyword\/category\/URL when data is unavailable for that item. This option is available only when &#8230; <a title=\"WebHarvy 6.1 &#8211; Internal Proxies, Database\/File Update, New Capture window options\" class=\"read-more\" href=\"https:\/\/www.webharvy.com\/blog\/webharvy-6-1-internal-proxies-database-file-update-new-capture-window-options\/\" aria-label=\"Read more about WebHarvy 6.1 &#8211; Internal Proxies, Database\/File Update, New Capture window options\">Read 
more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,8],"tags":[],"class_list":["post-979","post","type-post","status-publish","format-standard","hentry","category-release-update","category-webharvy"],"_links":{"self":[{"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/posts\/979","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/comments?post=979"}],"version-history":[{"count":0,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/posts\/979\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/media?parent=979"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/categories?post=979"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.webharvy.com\/blog\/wp-json\/wp\/v2\/tags?post=979"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}