Cyotek WebCopy Revision History - Copy websites locally for offline browsing • Cyotek

Version 1.9.1.872 04 September 2023

Due to changes in how WebCopy determines whether or not to process a given URL there could be differences with how WebCopy 1.9 works against previous versions. Please report any inconsistencies to us!

Added

Added the ability to read cookies from an external file (#461) (User Manual)
Added the ability to read cookies from an external file (#462) (User Manual)
Test URL dialogue now allows configuring cookies
Added cookie, cookie-jar and discard-session-cookies command line parameters (User Manual)
Added support for the legacy compress (#472) and non-standard BZip2 (#473) content encodings (User Manual)

Changed

Documentation improvements (#400, #443, #461, #462, #482, untracked)
Test URL dialogue now uses load on demand for settings pages
401 challenges no longer display credential dialogues unless the authentication type is Basic or Digest, as no other types have been tested due to a lack of resources
Updated mime database

Fixed

Posting a form did not set an appropriate content type (#437)
Custom headers were not applied when posting forms (#436)
If a URL was previously skipped but then included in future scans, the original skip reason could be retained
A blank error message was displayed for Brotli decompression errors (#447)
One-time project validation checks were ignoring the content encoding settings of the project (which by default is Gzip and Deflate) and were requesting content with Brotli compression (#446)
Brotli decompression could fail with streams larger than 65535 bytes (#448)
The URI transformation service incorrectly attempted to add prefixes to email addresses, which in turn caused a crash if the mailto: reference was malformed (#450)
A crash could occur if a content type header was malformed and was either utf or utf- (#450)
Fixed an issue where command line arguments sometimes didn't correctly process ambiguous relative arguments that could be a file name or an unqualified URI (#441). As a result of this fix, all URIs provided to command lines must be fully qualified, e.g. https://example.com rather than example.com
Fixed a crash that could occur when switching between empty virtual list views during a crawl and items were then subsequently added (#460)
A crash which could occur when loading localised text is no longer fatal (#465). Note that we haven't been able to reproduce this, so if you previously received a crash after setting a language other than English, please email support@cyotek.com
Speed and estimated download time calculations were incorrect and could cause a crash when downloading large files (#466)
A crash would occur when editing a file that didn't have a mime type (#468)
Speculative fix for a crash that could occur when finishing the New Project Wizard (#467)
Fixed a crash that occurred if a 401 challenge was received and the www-authenticate header was a bare type (#469)
If a website returns a non-standard Content-Encoding value (or one currently not supported by WebCopy), no attempt will be made to decompress the file and it will be downloaded as-is. A new setting has been added to disable this behaviour, but is currently not exposed (#471)
Crashes that occurred when applying project validation corrections (for example if the base URL redirects, WebCopy will prompt to use the redirect version) were fatal (#474)
Trying to save a CSV export with a relative filename crashed (#475)
The quick scan diagram view could crash if invalid host names were detected (#476). This is another bug reported without context; if you have previously experienced this, please email support@cyotek.com
The "Limit distance from base URL" setting now only applies to URLs that have a content type of text/html, e.g. it will prevent deep scanning whilst still allowing retrieval of all linked resources (#464)
URLs that had exclusion rules would still get requested depending on the combination of project settings (#481)
The CLI would crash if the recursive and output parameters were defined, and the specified output directory did not exist (#483)
Client is no longer marked as dpi-aware, which should resolve pretty much all the problems with the application not displaying correctly on high DPI screens. This is an interim fix until dpi-awareness can be properly introduced.
Fixed a crash that could occur when trying to query if the scan above root setting should be enabled and an invalid URI was present (#493)
Fixed a crash that could occur when the scan/download progress dialog was closed (#454)
The Export CSV dialog wasn't localised correctly, resulting in seemingly two Cancel buttons (#495)
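
The fully qualified URI requirement from #441 can be illustrated with a short sketch. This is not WebCopy's code: the helper name is_fully_qualified is hypothetical, and the check simply uses Python's standard urllib.parse to show why a bare example.com is ambiguous. It parses as a relative path with no scheme, exactly as a file name would.

```python
from urllib.parse import urlsplit


def is_fully_qualified(argument: str) -> bool:
    """Return True if the argument has an http(s) scheme and a host.

    Hypothetical helper for illustration only, not part of WebCopy.
    """
    parts = urlsplit(argument)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_fully_qualified("https://example.com"))  # scheme and host present
print(is_fully_qualified("example.com"))          # no scheme: could be a file name
```

A caller could use a check like this to reject ambiguous arguments early instead of guessing between a file and a URI.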

Removed

The PDF meta data provider has been removed

Version 1.9.0.822 20 January 2022

Due to changes in how WebCopy determines whether or not to process a given URL there could be differences with how WebCopy 1.9 works against previous versions. Please report any inconsistencies to us!

Added

It is now possible to read additional URLs to scan from a text file [#281] (User Manual)
Added no-directories, max-redirect and header arguments (User Manual)
Added proxy, proxy-user and proxy-password arguments [#337] (User Manual)
Added input-file argument [#282] (User Manual)
Added Redirects To column to the results list
Added Local File column to the files list
Added Local File, Redirects To, Depth and Distance columns to links lists
List views now display a configuration menu when context clicking a column header
The GUI client now supports many of the same command line arguments as the CLI [#403] (User Manual)
Added a new extension remap mode, Only HTML [#365]. This new option will change the extensions of downloaded files only if the content type is text/html; all other files will be left as-is (User Manual). This setting is now also the default for new WebCopy projects
Added a new validator to try and detect unsupported websites [#407]
Added new URL normalisation options for forcing HTTPS [#383] (User Manual)
Added new URL normalisation option for ignoring case [#202] (User Manual)
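
As a rough illustration of the Only HTML remap mode [#365] described in this list, the sketch below changes a file name's extension only when the content type is text/html. The function name is hypothetical, and the choice of .html as the target extension is an assumption; WebCopy's actual rules may differ.

```python
from pathlib import PurePosixPath


def remap_extension_only_html(file_name: str, content_type: str) -> str:
    """Sketch of an "Only HTML" style remap (hypothetical helper).

    Assumption: HTML content is given a .html extension.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/html":
        return file_name  # every other content type is left as-is
    return str(PurePosixPath(file_name).with_suffix(".html"))


print(remap_extension_only_html("page.php", "text/html; charset=utf-8"))
print(remap_extension_only_html("photo.jpg", "image/jpeg"))
```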

Changed

Setup will now install Microsoft .NET 4.8 if not present
Adding multiple URLs to scan is now easier using a free text field [#282]
Command line tools now report unknown parameters
Major reworking of internal decision making logic [#242]
New WebCopy projects will default to saving headers
The sitemap tree now limits the number of child elements to a maximum of 100 by default [#402]. This setting can be changed in the application options
Documentation updates
Rule Tester dialogue now includes rule components
Reworked setting validation
Rule expressions are now validated before crawling a site
WebCopy no longer treats URLs as case-insensitive for new projects
The project URL can now be set via the Quick Scan dialogue

Removed

The Link Checker (GUI and CLI), URI Tester and XPath Tester tools have been removed from distribution due to lack of use

Fixed

Only the last argument error was displayed when running command line tools
WebCopy will now retry URLs that fail with "The server committed a protocol violation" exceptions
If using the default user agent, WebCopy will now try a default browser agent if a 401 response is returned when validating the URL [#382]
When issuing a 401 challenge dialogue, WebCopy could include additional header information in the description
The Move Down button was incorrectly enabled when adding a new password entry, causing a crash if clicked [#394]
Fixed a pair of conditions that could cause site map generation to nest the same tree until it crashes [#391]. This should also resolve a different crash that could occur generating a site diagram [#397]
Cookie editor now does a better job of validating entered values
Invalid cookies should no longer cause a crash [#396]
WebCopy would sometimes remove file extensions that weren't really extensions [#327]
Various performance improvements, both major and minor [#399, #404]
Last modified date is now read from meta tags if available [#405]
Cancelling a crawl should now abort any in-progress downloads
Fixed an issue where reading from a hybrid stream returned null bytes up to the stream capacity after exhausting existing data
The rule editor could allow you to select conflicting options
A crash occurred accessing the Quick Scan dialog if the project URL wasn't set (regression) [#430]
A crash occurred using the Quick Scan dialog if there was a problem with the crawl and the Use Browser Option was enabled [#431]
A crash occurred accessing the Quick Scan dialog with an invalid project URL [#434]
Fixed a crash that could occur when trying to use an invalid output path [#435]

Version 1.8.3.768 01 April 2021

Warning! WebCopy projects saved using 1.8 are not backwards-compatible with older versions of WebCopy

Added

Separate 32bit and 64bit setup files are now available

Changed

Reorganised columns in results list view to hopefully reduce confusion when URLs are skipped
Non-crawlable URLs that are skipped during an analyse are now recorded in the results list

Fixed

WebCopy would not copy sites using an IP address rather than a DNS name
URL validation checks were running on the base domain and ignoring any deep linking [#385]
WebCopy wouldn't always strip empty path segments from URLs [#358]
PDF and RSS will no longer be downloaded when performing a site analysis
Improvements to several windows affected when running under custom DPI scaling modes [#241]
The minimum / maximum file size editor required values to be entered as bytes, despite the labelling requesting kibibytes [#387]
Sometimes WebCopy didn't shrink a path correctly to fit within file system limitations [#393]
Default documents should no longer be named index.htm.html

Version 1.8.2.744 03 January 2021

Fixed

Initial redirect check wasn't applying a user agent which caused some sites to reject the request
New Project Wizard wasn't displaying correctly

Version 1.8.2.740 13 December 2020

Fixed

It should now be easier to manually type https addresses into the URL field, without WebCopy trying to inject http after each character press (regression from 1.8.2)

Version 1.8.2.739 06 December 2020

Changed

When crawling for the first time, if the user-entered URL redirects (for example http to https or to www), WebCopy will now prompt to use the final URL [#368]

Removed

Aero glass effects used by Windows 7 have been removed [#366]

Fixed

Changing the URL of an existing project caused further crawling to instantly fail if the domain of the new URL did not match the old [#367] (regression in 1.8.1)
Added a speculative fix for a hang when crawling a website with the experimental "use web browser" option enabled
The GUI client now correctly allows unescaped URLs to be entered
When using the "Use query string in local file names" option, WebCopy now correctly sanitises the query string
Fixed a crash that occurred when trying to empty the destination folder using the 64bit version of WebCopy

Version 1.8.1.725 17 October 2020

Added

Link Checker GUI client now allows the checking of external links to be enabled or disabled
Link Checker GUI client now allows configuring whether URLs belonging to parent, sibling or sub domains should be checked
Added auto scroll option to Link Checker GUI client
Added progress indicator to Link Checker GUI client
Added new Use Recycle Bin option to project settings. If set and the Empty website folder before copy option is also set, any deleted files will be moved to the Recycle Bin instead
The View Links dialog now allows the display of excluded URLs to be toggled
Added proper editor for defining web page language settings at the project level
Added application level setting for defining web page language settings
List exports now present a configuration dialogue for choosing which columns to include in the export [#275]

Changed

WebCopy will now prompt to continue if the Empty website folder before copy option is set and files are present in the destination
The Sitemap Extension will now start from the base domain if the project URL is deep and the Crawl Above Root flag is set
Updated mime-db to 1.44.0
The GUI now displays a proper progress indicator and status information when remapping local files
The CLI client now displays status information when remapping files
The Origin Report option for new projects now defaults to Single File rather than Embedded
WebCopy will now always send the Accept-Language header. If not defined at the project level, it will use the application level setting. If this is not provided, then the current OS culture information will be used
Documentation has had a good overhaul and is in the best state it has ever been in. All help links from option dialogue boxes point where they should, and missing documentation has been added
Expanded default contentfilters.json used by the New Project Wizard to cover other common types
The Accepted Content Types field has been moved from the Advanced category into a category of its own, expanded to use the same type of editor as for the web site language
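
The Accept-Language precedence described in this list (project setting, then application setting, then the current OS culture) can be sketched as below. The function and parameter names are hypothetical and only illustrate the fallback order; this is not WebCopy's implementation.

```python
from typing import Optional


def resolve_accept_language(project: Optional[str],
                            application: Optional[str],
                            os_culture: str) -> str:
    """Pick the Accept-Language value using the documented precedence."""
    for candidate in (project, application):
        # Treat missing or blank settings as "not defined".
        if candidate and candidate.strip():
            return candidate.strip()
    return os_culture


print(resolve_accept_language(None, "en-GB,en;q=0.8", "fr-FR"))
```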

Removed

The Report Problem Site extension is no longer bundled with Setup
Removed global statistics

Fixed

WebCopy was treating any attribute value that started with javascript as unsupported
The sitemap tree could display duplicate URLs
The sitemap tree could incorrectly display children of pages that matched a standard document pattern
Link Checker didn't follow internal redirects
WebCopy could incorrectly parse the URL from an @import at-rule if the CSS was minified and another rule contained an empty content declaration
The Project Diagnostics extension now ignores data URLs when performing length checks
Cut, Copy and Paste commands didn't work for the filter fields in list views
Fixed a crash that could occur when ordering the sitemap
Reworked HEAD support detection to be more robust
401 challenges were only processed during HEAD requests
Fixed a performance issue running XPath queries
Per-URL origin reports could be overwritten if URLs differed only by extension
The New Project Wizard no longer creates duplicate rules if content types are present in multiple pre-defined groups
Fixed a crash that could occur when closing the options dialog after switching views
Windows that save their position and size should no longer keep increasing in size each time the window is opened and a custom font is being used with a point size above 8
Options dialogues are now slightly more usable when using custom fonts with a point size above 8
The Quick Scan dialogue now correctly disables the Scan button when busy, preventing a crash trying to perform multiple scans
Setting the URL in the main window now correctly defaults to http if a scheme is not explicitly set, preventing a crash when using secondary actions such as trying to capture a form
The New Project Wizard dialogue now ensures that user entered URLs have a default scheme applied if omitted by the user
WebCopy could incorrectly parse blank url CSS functions
Fixed inconsistencies in when the Download All Resources option would be enabled or disabled
Fixed a crash posting a blank form definition

Version 1.8.0.652 12 April 2020

Fixed

Fixed a crash which occurred if contentfilters.json was not present in %appdata%\Cyotek\WebCopy\1.0

Version 1.8.0.651 07 April 2020

Added

Added additional options to proxy server configuration, allowing the use of system proxies and user defined bypass lists
The poster attribute of video elements is now detected

Fixed

Fixed a crash that could occur when verifying the initial path
The Internet Explorer DOM provider failed to process some pages if an attribute request failed with a DISP_E_TYPEMISMATCH error
The Limit Crawl Depth, Limit Distance from Root URL, Maximum Files, Maximum File Size and Minimum File Size options would be processed if they had previously been set even if subsequently disabled
Proxy settings dialog now does a better job of validating the address
Fixed a crash when clicking help links in stand alone tools
Fixed a crash which could occur when using the Select URI dialog
Speculative fix for a crash when painting list views
Speculative fix for a crash trying to capture a form
Speculative fix for a crash setting the same folder

Version 1.8.0.638 Beta 28 February 2020

Added

Added new setting for controlling the HTTP download buffer size
Added new setting for controlling the size of the memory cache when downloading transient content
The Rule Tester dialog now supports rules that use content types
Added new rule options to allow download priorities to be set
The Browse button in the Rule Editor now allows the selection of either URLs or Content Types depending on the rule setting

Changed

Documentation improvements
The Custom Headers editor is now consistent with other similar editors
If a given URL is a redirect, the new location is given higher priority in the crawl queue than existing non-redirect entries
Various graphics and glyphs have been replaced or made consistent with the other styles

Removed

The rule option Do not allow children to inherit this rule has been removed

Fixed

WebCopy should now correctly handle non-ASCII domain names
HTML documents above the root would not be scanned if both the Download All Resources and Crawl Above Root settings were enabled
Certificate checking always used the HEAD method, even if head checking was explicitly disabled for the project
Per-URI memory cache is now correctly cleared after fully processing a given URI
All transient downloads now make use of a memory buffer if possible, instead of only during an analyse
The Status column in URI lists no longer displays Skipped for redirects
WebCopy now uses buffer pools instead of constantly creating and destroying buffers

Version 1.8.0.627 Beta 22 January 2020

Added

CSV export now allows selecting which columns to include
Cookies can now be set from the project properties dialog
Double clicking a cookie from a list view will display the URL decoded value in an information window
Cookie list context menu now includes a Copy Name/Value Pair option
Links list now displays the relationship of entries to the project URL
Added the ability to control the scan depth
Content type inclusions/exclusions now support regular expressions
Added the ability to specify how many items are displayed on the MRU list
Added support for HTML5 character entities
Added experimental support for crawling JavaScript enabled sites
Added New Project wizard
Rules now allow you to select which components the expression will be applied to
Rules can now be applied to content types

Changed

Setup now uses InnoSetup 6
Cookies extension now uses the same cookies list as other parts of WebCopy
The Feedback dialog no longer has the screenshot option set by default
The Feedback dialog now remembers the email address, if provided
The Feedback dialog now allows you to toggle which displays you want included in the screenshot
Sitemaps are now always sorted alphabetically

Removed

Due to a rewrite of how the site browser tree works, the Simplify Sitemap option no longer has meaning and has been removed
The Sorted option has been removed from projects

Fixed

Non-modal link property dialogues are now positioned more appropriately
FirstRun setting was not being cleared after the initial startup
Creating or opening a project now closes any open property windows
Cookies extension didn't always show all detected cookies
Attempting to open a file that wasn't a WebCopy project would create a blank project, but would incorrectly assign it the filename of the non-project
Fixed various serialization and clone omissions
Site browser tree, sitemap extension and website diagram extension no longer build an entire sitemap upfront
Sitemap extension now correctly encodes text
HTML entity encoding and decoding should now better handle surrogate pairs
Site browser tree no longer renders any page with sub pages as a folder
WebCopy wouldn't detect encoding correctly if it was specified only via the <meta charset=""> element
WebCopy wouldn't write remapped links for URLs that led to a redirect
Checks to see if a given URL was an ancestor of the root URL were not being skipped if the domains didn't match
The sitemap tree view wouldn't display some URLs if the crawl above root option was set
The Download All Resources option wasn't working the way it should
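
The encoding fix above concerns pages that declare their character set only through a meta element. A minimal sketch of that kind of detection follows. It is an assumption-laden illustration rather than WebCopy's parser: the function name, the regular expression and the 1024 byte scan limit are all choices made for this example.

```python
import re
from typing import Optional

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
_META_CHARSET = re.compile(
    rb"<meta\s+[^>]*?charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE
)


def sniff_meta_charset(html: bytes, scan_limit: int = 1024) -> Optional[str]:
    """Return the declared charset in lower case, or None if absent."""
    match = _META_CHARSET.search(html[:scan_limit])
    return match.group(1).decode("ascii").lower() if match else None


print(sniff_meta_charset(b'<html><head><meta charset="ISO-8859-1"></head>'))
```

A regular expression is enough for a sketch, but a real crawler would normally rely on its HTML parser and on the Content-Type response header first.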

Version 1.7.0.600 29 April 2019

Changed

URI properties dialogs are now opened as non-modal windows if possible

Fixed

Quick Scan dialog temporarily restricts the maximum number of displayed pages to 200, resolving a crash that occurred on sites with thousands of detected pages
Fixed an issue where WebCopy would always fully download files above the copy root if the Download All Resources option was enabled
Fixed an issue where WebCopy wouldn't correctly exclude entries above the root if the Download All Resources option was enabled
Files over 2GB in size wouldn't be downloaded
After 2147483647 files had been downloaded, no further downloads would occur
Setup programs were only signed with SHA256, meaning Windows Vista couldn't read the signatures
Setup tried to install .NET 4.6.2, causing an installation failure on Windows Vista which only supports 4.6.0

Version 1.7.0.583 Beta 17 January 2019

Added

It is now possible to authenticate with a website using an embedded web browser prior to copying, allowing WebCopy to work with sites that have complex login procedures or multi-factor authentication [#333]
When copying a website with an SSL certificate, if the Ignore certificate errors option is not set, WebCopy will now display a dialog asking what to do [#329]
Added a new option to include the original extension when remapping files [#324]
Added a new option to include the query string in local filenames [#267]
Added new options for limiting downloads based on file size (minimum and maximum) and on the number of files downloaded
Added Last Modified column to various URL list views
Added URL browser dialogs to various selection fields

Changed

Reinstated URI editor for URI Transforms
URL browser dialogs now keep the original selection

Fixed

HTML entities in attribute values were not decoded when scanning for links
Local files no longer have their extensions changed if the URI extension doesn't match the first extension in the content type database, e.g. .jpg files no longer get renamed to .jpeg
Sitemap generation now correctly ignores redirected URLs
WebCopy could incorrectly abort a download with an insufficient disk space error even if free space was available

Version 1.6.0.559 02 November 2018

Fixed

No progress dialog was displayed when clicking the Copy or Scan buttons to the right of the URL field

Version 1.6.0.555 18 October 2018

Added

Added new nocrashreport switch to command line clients
Command line clients can now display solution information when reporting crashes [#201]

Fixed

Partial output is no longer printed by CLI tools when using the quiet switch
Statistics are now printed when using the statistics switch even if quiet is also specified
All output is now correctly written to log files when the log switch is used, irrespective of the quiet switch setting
Pressing Enter or Escape in the Capture Form dialog no longer closes the dialog if the embedded web browser has focus

Version 1.6.0.551 Beta 06 October 2018

Fixed

The progress dialog was no longer being displayed [regression]

Version 1.6.0.549 Beta 23 September 2018

Fixed

CLI automatically assumed it was installed into a protected folder and refused to download files if the working directory was the application directory
CLI couldn't read meta data from PDFs
Exception reports no longer include the user name of the current user
Exception reports no longer include the raw host name
Clicking the Open in Browser button in the Test URL dialog crashed the application if no URI had been entered
Crawl only exclusion rules were no longer working as expected (regression)
Backup files had the wrong file extension (regression)

Version 1.6.0.543 Beta 08 September 2018

Added

The Basic Authentication dialog now allows the prompting of future passwords to be disabled
Preview functionality of the Test URI dialog now supports a subset of images
Added proxy settings to Test URI dialog
Added new options for specifying custom headers [#219]
The Test URI dialog now allows the configuration of content encoding, custom headers and URI transforms [#296]
Added stand-alone version of the test URI tool
The Capture Form dialog will now try and find the best match if multiple forms are detected on a page [#230]

Changed

The layout of the Test URI dialog has been reworked [#296]
Setup has a new option to determine if icons should be created for stand-alone tools
Setup has a new option to determine if experimental 64bit versions of tools should be installed
Minor improvements to External Tools dialog
Minor start-up improvements
The option to save headers with the project file is now enabled by default for new projects
Some context menu items which disappeared from virtualised lists have now been re-instated
If the character set for an HTML document isn't explicitly specified, WebCopy will now try to autodetect an appropriate value [#303]

Removed

Removed the Content tab from the Link Properties dialog
Removed the unused Modified URI field from the Link Properties dialog
Removed Find more user agents online link from the Edit User Agents dialog
Removed the Allow Editing checkbox from the Link Properties dialog
Removed the Disable Updates flag from link information

Fixed

The Basic Authentication dialog truncated long realm text [#312]
WebCopy no longer tries to unpack custom settings belonging to unloaded extensions [#278]
WebCopy no longer stores downloaded content against the link information when a 400 or 500 series response is returned
Output editors in the Test URI dialog now honour the Fixed Font setting
Project files were no longer being compressed when saved
Fixed a crash that could occur when running the Empty Meta Data report
Backup files were not being created when saving projects
Default external tools configurations were not added when starting WebCopy for the first time
Some files were still downloaded even if they had been excluded via a rule (regression from 1.4)
Editing a local file using the built-in text editor always used UTF-8 and would corrupt files using a different encoding
The default user agent was using the file version of the WebCopy client instead of the product version
The Quick Scan window is now resizeable and remembers its position
Corrected some settings that weren't being cached

Version 1.5.0.516 23 July 2018

Added

Added a new diagnosis extension to help investigate certain project errors which are not reproducible in current test data
Added new exclusion options to more finely control the remap extension mode
The Content-Disposition header is now supported and if set will help define the local filename
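
To illustrate how a Content-Disposition header can help define a local filename, here is a minimal sketch using only the Python standard library. The function name, the path stripping step and the "index" fallback are assumptions made for this example and do not describe WebCopy's actual behaviour.

```python
import re
from email.message import Message
from typing import Optional
from urllib.parse import urlsplit


def local_file_name(url: str, content_disposition: Optional[str]) -> str:
    """Prefer the header's filename, else use the last URL path segment."""
    if content_disposition:
        message = Message()
        message["Content-Disposition"] = content_disposition
        suggested = message.get_filename()
        if suggested:
            # Keep only the final component so a header cannot inject a path.
            name = re.split(r"[\\/]", suggested)[-1]
            if name:
                return name
    # Assumption: fall back to "index" when the URL path has no file part.
    return urlsplit(url).path.rsplit("/", 1)[-1] or "index"


print(local_file_name("https://example.com/download?id=7",
                      'attachment; filename="report.pdf"'))
```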

Changed

Tabbed or tree based option/property dialogs now include a search field
Split the Copy options page into Folder and Local Files pages

Fixed

Uninstall should no longer prompt for feedback when running Setup to upgrade an existing installation
Fixed an issue where the Download all resources setting was switched off when opening the options dialog (regression)
Fixed a crash which could occur when clicking the Test URI button in a form editor for a project with no base URI set
Speculative fix for a crash which could occur when deleting an empty rule or form
Speculative fix for a crash which could occur when displaying the Select URI dialog
Setup was installing the Problem Site Report extension into the wrong folder, overwriting the RSS extension manifest
Strings over 32767 bytes in size are now supported in WebCopy projects
Pressing Enter in multi-line edit fields in the Inclusions / Exclusions option page closed the dialog
Fixed a number of cases where modifying a collection might not mark the project as changed
Fixed a crash that could occur if WebCopy couldn't get an encoding [#304]

Version 1.5.0.501 Beta 09 June 2018

Added

WebCopy now includes a database of all registered mime types, instead of relying solely on what is registered on the local computer [#271]
Added a new remap mode setting, all except arbitrary binary data, which will correctly remap content types to the appropriate extension regardless of the source URL, unless the content type is application/octet-stream [#182]
Rule, Password and Domain Alias editors now highlight items containing invalid expressions

Changed

The default remap mode for all new projects is now all except arbitrary binary data (previously it was only if no extension present, probably the worst default possible) [#182]
The Select Mime Types dialog can now load in all registered types rather than only what was detected in a given website
The Capture Form dialog has had form detection rewritten so that it now reflects the current state of the web browser, allowing you to enter details into the form and have those captured
The Capture Form dialog now automatically selects non-hidden fields for inclusion in the generated form definition
URI Transforms can now be disabled without having to remove them completely
URI Transforms didn't work correctly if the URI field was specified
Temporarily disabled the JavaScript detected warning as it appears to be causing more confusion with users than it resolves
Added new controls to the main window for ease of use
Documentation updates

Fixed

The Capture Form dialog only detected input elements within forms. Now correctly detects input, output, select, textarea, object and button elements, while excluding reset, button and image input types [#279, #285]
The Capture Form dialog always listed (and processed) form parameters in reverse order [#284]
The Capture Form tool didn't generate form definitions correctly for forms embedded in an iframe
The URI Transforms and Domain Aliases editors now behave the same as other list editors
The Test URL dialog didn't merge form parameters properly unless the Merge from field was fully specified in addition to the core Uri field
Fixed a crash that could occur when posting forms and a merged parameter was null
If header checking is enabled but the request doesn't support the HEAD method (via the 405 response code), the source URI will be downloaded normally and header checking disabled for that host [#194]
Fixed a crash which could occur when deleting rules or forms from their respective lists [#289]
Right clicking blank areas of URI, rule or form lists now displays the context menu [#276]
Fixed a crash which could occur when right clicking an empty URI list
Fixed an issue where CSS remapping could crash due to a blank source value [#269]
Fixed an issue where WebCopy could hang with a certain combination of HTML and XPath [#291]
Fixed the layout of the main application window slightly overlapping the status bar (regression)
Fixed an issue where it was possible that text tokens weren't replaced
Viewing a diagram could display an error if WebCopy had been started with command line arguments
Link origin wasn't persisted
Clearing the link map didn't clear the sitemap tree [#257]
Setup would display an error stating Unknown custom message name "lcid" if an appropriate version of .NET Framework was not installed and was required to be downloaded by Setup
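
The HEAD fallback described for [#194] in this list can be sketched as follows: when a host answers a HEAD request with 405, the URI is downloaded normally and header checking is switched off for that host. The class name, the injected transport callable and its (status, body) return shape are all hypothetical and exist only to keep the example self-contained.

```python
from typing import Callable, Set, Tuple
from urllib.parse import urlsplit

# A transport takes (method, url) and returns (status_code, body).
Transport = Callable[[str, str], Tuple[int, bytes]]


class HeaderCheckingCrawler:
    """Illustrative sketch only; names and structure are assumptions."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._head_disabled_hosts: Set[str] = set()

    def fetch(self, url: str) -> Tuple[int, bytes]:
        host = urlsplit(url).netloc.lower()
        if host not in self._head_disabled_hosts:
            status, _ = self._transport("HEAD", url)
            if status == 405:
                # Server rejects HEAD: stop header checking for this host.
                self._head_disabled_hosts.add(host)
            # A real crawler would inspect the HEAD response headers here
            # to decide whether the download is wanted at all.
        return self._transport("GET", url)
```

Remembering the decision per host avoids repeating a request that is known to fail for every later URI on the same server.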

Version 1.4.0.477 Beta 13 May 2018

Fixed

A crash occurred trying to sort the results view via the Source column (regression from 1.3)

Version 1.4.0.469 03 May 2018

Fixed

The improved Quick Scan dialog crashed if the Visual Link Map extension wasn't installed
If the URI to crawl redirected to an external URI, no feedback would be provided to the user regarding the redirect and the crawl would just appear to halt with an empty response
A crash no longer occurs if a website returns a Content-Encoding header that either isn't a standard value or one that is not supported by WebCopy
List views now remove line breaks from displayed content

Version 1.4.0.465 Beta 14 April 2018

This version of WebCopy changes how rules are executed. Existing rules using the Do not allow children to inherit this rule flag may not function correctly; this flag will be removed in a subsequent update.

Added

The Quick Scan dialog has had a major overhaul to make it usable, if not useful. While currently a work in progress, it now offers the following features [#261]
The ability to set a limit on pages per domain during the quick scan
A diagram of the scan results is displayed in the dialog using colour coding to show which URI's will be included in a copy, and which will not
You can change how you expect the website to be crawled and it will automatically update the diagram to reflect the new setting
Using the diagram you can exclude URI's from being crawled, or add excluded domains to be crawled
Confirming the dialog no longer resets many settings in your project back to defaults
A Rule Checker tool has been added, which takes a given URI and passes it through all rules, allowing you to see which rules are matched and which aren't
Added new Stop processing more rules flag. This flag is automatically applied to projects created using older versions of WebCopy
List filters previously removed as part of [#65] have now been reinstated
List filters now support empty / not empty options

Changed

The Rules, Forms and Password list editors now share a common base and are now consistent in how to add and edit items
You can now re-order rules, forms and passwords in their respective editors by dragging items in the list
Rules no longer stop executing after the first match is found, but continue through all rules, allowing for more complex scenarios
Rule lists are no longer sorted by default making it easy to see the execution order
When calling the CLI to download a single file and the /o argument points to an existing directory, the CLI will generate a filename based on the URI to download [#250]
When trying to copy a website, custom expressions are now validated and the copy will not commence if any are invalid
The Crawl Content rule flag can now be set independently of the Exclude flag. This finally allows you to create a copy job that will scan an entire website, but only keep files such as images
Documentation updates
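
The rule execution change in this version (rules continue past the first match, unless a rule carries the Stop processing more rules flag) can be pictured roughly as follows. This is a minimal Python sketch, not WebCopy's implementation; the `Rule` class, its fields and `matching_rules` are hypothetical names, and treating patterns as regular expressions is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: a regex pattern plus a stop flag (names are illustrative)."""
    pattern: str
    stop_processing: bool = False


def matching_rules(uri, rules):
    """Return every rule that matches, in list order, honouring the stop flag."""
    matched = []
    for rule in rules:
        if re.search(rule.pattern, uri):
            matched.append(rule)
            if rule.stop_processing:
                break  # assumed meaning of "Stop processing more rules"
    return matched


rules = [
    Rule(r"\.png$"),
    Rule(r"/images/", stop_processing=True),
    Rule(r"example"),
]
hits = matching_rules("https://example.com/images/a.png", rules)
print([rule.pattern for rule in hits])
```

Because the list is evaluated in order, an unsorted rule list makes the execution order visible at a glance.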

Deprecated

The Reverse and Do not allow children to inherit this rule rule flags are deprecated and will be removed in a future version of WebCopy

Fixed

Redirects with a relative Location header could be incorrectly combined into absolute URI's
Empty analytics sessions are no longer transmitted
Failure to obtain shell icons should no longer crash the application
Loading a diagram didn't update UI state correctly
Changing some diagram properties didn't cause the diagram to be updated
URI's which had a blank charset attribute in the Content-Type header weren't processed properly
Fixed a crash which could occur using the CLI trying to open a file that wasn't a WebCopy project [#253]
Reordering rules and forms didn't reflect properly in the user interface
Application no longer crashes if there is an issue exporting or copying large diagram images [#262]
CLI will no longer attempt to download if the output folder is protected [#249]
When using custom XPath expressions, multiple expressions would be incorrectly created if the same attribute was listed multiple times
Several more list views have been virtualized [#64, #65]
The rule editor no longer tries to convert patterns into URI's
Cloning a WebCopy project skipped numerous values
The keep alive setting wasn't persisted correctly
Fixed an issue where the Quick Scan dialog could crash with a duplicate key error [#251]
Failure to generate the website diagram is no longer fatal [#247]
Website diagrams are now generated directly from link information, rather than building a sitemap and generating from that - this should reduce memory requirements of creating the diagram [#247]
Form and rule lists should now correctly update if their respective contents change
The main results list view is now virtual which should resolve all memory issues relating to working with URI lists [#64, #65]
The Capture Form tool will no longer crash if there is a problem creating the embedded browser
Pressing Escape in the Capture Form tool no longer closes the window

Version 1.3.0.405 21 January 2018

Added

Added new advanced options to configure which security protocols are supported [0000228]

Changed

Images now open in the registered application for the file type
WebCopy will now always try and find an encoding defined in HTML content before falling back to the Content-Type header
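
The encoding lookup order described above (a declaration inside the HTML first, then the Content-Type header) might look something like the sketch below. It is written in Python under stated assumptions: the regular expressions, the 4096 byte scan window and the `detect_encoding` name are illustrative choices, not WebCopy's actual logic.

```python
import re

# Matches <meta charset="..."> and the charset=... part of an http-equiv meta tag.
META_CHARSET = re.compile(rb"<meta[^>]+charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE)
HEADER_CHARSET = re.compile(r"charset\s*=\s*\"?([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def detect_encoding(body, content_type, default="utf-8"):
    """Prefer a charset declared in the HTML, then the header, then a default."""
    match = META_CHARSET.search(body[:4096])  # only scan the start of the document
    if match:
        return match.group(1).decode("ascii").lower()
    if content_type:
        match = HEADER_CHARSET.search(content_type)
        if match:
            return match.group(1).lower()
    return default  # also reached when the charset attribute is blank


page = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head>'
print(detect_encoding(page, "text/html; charset=utf-8"))
```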

Removed

URI lists virtualized as part of [0000064] temporarily no longer support filtering

Fixed

WebCopy could no longer access websites that only supported newer versions of TLS [0000228]
A cross thread exception crash could occur when accessing the Quick Scan dialog
A crash could occur if WebCopy had difficulties processing an URI
Attempting to access help from the project properties dialog had no effect [0000229]
A crash could occur when clicking slices in the Website Size chart [0000208]
Unexpected errors processing URI's will no longer abort the entire crawl
Massive memory and performance improvements for some URI lists [0000064]
URL parser didn't correctly handle URI's which included two : characters at any point before the / or \ characters
Known values weren't picked up from meta tags if the contents of the name attribute weren't using lower case
Sometimes the Capture Form tool crashed when trying to navigate to a new URL
A third party library could cause the entire application to crash with certain HTML
The crawl mode was reset from Sibling to Sub when re-opening the project properties dialog
Various performance improvements

Version 1.2.2.368 15 December 2017

Added

Added a new optional extension for providing feedback/smiles/frowns or support requests from within the application
Error diagnosis dialogs now include a reference to the original report
The Test URI dialog now includes a new tab which lists all links that were detected on the source page

Fixed

Errors loading the cached RSS feed prevented the RSS extension from functioning [0000204]
Fixed potential exit crash when updating statistics [0000175]
Fixed a potential issue where the last character in a directory path could be removed [0000213]
Fixed a crash which could occur when setting file timestamps [000216]
Fixed a crash that could occur when changing the URI of a project containing forms if the form URI hadn't been set
Fixed a crash that could occur when right clicking some list views
CLI tool didn't handle invalid command line arguments as well as it could [0000223]
The Website Size dialog could crash with a divide by zero exception [0000222]
The WebCopy CLI now creates missing output directories when requesting to download a single URL into a file [0000224]

Version 1.2.1.348 03 November 2017

Added

The Origin Report setting now has a new option to embed the original URL as a comment in the body of the HTML

Changed

The Download All Resources option is now automatically set for new projects [0000193]
The Directory Character option is now automatically set to / for new projects
The Update Local Timestamps option is now automatically set for new projects

Fixed

A crash no longer occurs when opening the Options dialog if the languages folder doesn't exist [0000187]
A crash no longer occurs when opening the Options dialog if duplicate languages are present [0000186]
A crash could occur when loading the sitemap and shell icons were enabled [0000190]
Fixed a number of issues that could occur after opening or saving a project and the MRU was updated (0000161, 0000176, 0000177)
WebCopy no longer aborts the crawl after trying to download a URL with the same name as a file system reserved word [0000195]
WebCopy wasn't detecting flash movies in an object tag [0000173]
Data URI's with padded data weren't processed correctly [0000197]
Fixed an issue when running the CLI and specifying a project file that did not exist [0000200]
Added speculative fix for a crash generating sitemaps - this is a common issue yet we've been completely unable to reproduce it in any of our test scripts or saved projects. If anyone can supply information on how to trigger this crash it would be gratefully received! [0000160]
Fixed a regression where it was possible for redirects to get stuck in an infinite loop
Minor improvements to URI exception reporting
RSS entries would duplicate themselves depending on whether the feed was accessed via HTTP or HTTPS. Note that a side effect of this fix is that all entries will be marked as unread
Fixed a possible crash that could occur when trying to load a themed font [0000203]
Fixed a crash that could occur if a rule had an empty pattern [0000188]
WebCopy will now try to remove base tags after completing a crawl [0000191]
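
For the data URI fix [0000197] above, the changelog does not say exactly what "padded data" refers to. Assuming it concerns base64 payloads whose padding or whitespace varies, a tolerant decoder could be sketched in Python like this. `decode_data_uri` is a hypothetical helper, not part of WebCopy.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri):
    """Split a data: URI into (media type, decoded bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, _, payload = uri[5:].partition(",")
    if header.endswith(";base64"):
        media_type = header[:-len(";base64")]
        # Undo percent-encoding, then drop any whitespace inside the payload.
        compact = "".join(unquote_to_bytes(payload).decode("ascii").split())
        compact += "=" * (-len(compact) % 4)  # restore any missing padding
        return media_type or "text/plain", base64.b64decode(compact)
    return header or "text/plain", unquote_to_bytes(payload)


print(decode_data_uri("data:text/plain;base64,SGVsbG8="))
```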

Version 1.2.0.326 09 October 2017

Added

Added a new extension which allows users to easily submit websites that WebCopy isn't copying correctly [0000015]
WebCopy now warns if JavaScript is detected as being in use by the website being copied [0000066]
WebCopy now reports links it detected but couldn't process [0000031]
Added open output folder action to Crawl Complete dialog

Changed

Due to some curious feedback, the checks to validate digital signatures on WebCopy binaries have been reinstated
Sitemap ordering has been changed to a simple sort order, as the natural sort took an extremely long time to run on large websites, with little benefit for the performance hit [0000004, 0000150]
External status is now stored with a link entry instead of being calculated each time it is requested [0000004]

Fixed

Fixed an issue where WebCopy wouldn't display content properly (for example in the Test URI dialog) if the web server returned compressed content regardless of the value of the request's Accept-Encoding header
Fixed an issue where pages weren't processed correctly (e.g. corrupt titles in the UI) if the page wasn't using UTF8, didn't specify a charset in the Content-Type header but did specify the correct content type via a http-equiv meta tag in the document HTML [0000144]
Fixed an issue where remapped files were always read or written as UTF8 regardless of the original encoding [0000144]
The Test URI dialog will now automatically try and add http:// if the user just types/pastes a schema-less value
Fixed an issue where CSV export could fail
Creating an automatic rule for an external URI now creates a valid rule [0000026]
The sitemap treeview shouldn't reload itself quite as often as it previously did [0000003, 0000004]
WebCopy now correctly processes URI's above the crawl root if the Download all resources option is set [0000154]

Version 1.1.3.304 10 August 2017

Added

Added a new Keep Alive setting. Setting this to false can help prevent the "The server committed a protocol violation. Section=ResponseStatusLine" crawl failure [0000002]
Added a new Prefix Mode setting. This setting allows you to force URI's to either have or remove the www prefix, useful for avoiding duplicated files when copying a website which uses a mix of prefixed and non-prefixed URI's
Added the ability to replace sections of a URI when crawling documents
Added a new report to view non-HTTP links
(Experimental) Added new Extract Data URIs setting. Enabling this option will extract inlined images using the data: protocol into separate files.
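
The idea behind the Prefix Mode setting above can be illustrated with a short Python sketch. The mode values `"add"` and `"remove"` and the `apply_prefix_mode` function are hypothetical names chosen for illustration; the actual setting values in WebCopy may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def apply_prefix_mode(uri, mode):
    """Force or strip the www prefix; any other mode leaves the URI untouched.

    Simplification: user information (user:password@) is not preserved.
    """
    parts = urlsplit(uri)
    host = parts.hostname or ""
    if mode == "add" and not host.startswith("www."):
        host = "www." + host
    elif mode == "remove" and host.startswith("www."):
        host = host[4:]
    else:
        return uri
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(apply_prefix_mode("http://example.com/a.png", "add"))
print(apply_prefix_mode("https://www.example.com:8080/a?x=1", "remove"))
```

Normalising every URI this way before it is queued means `http://example.com/a.png` and `http://www.example.com/a.png` map to the same local file.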

Changed

Setup should now automatically uninstall previous versions
Numerous changes to how plugins are discovered, loaded and configured. Due to no longer storing plugin details in the Windows Registry, this will cause any disabled plugins to be re-enabled
WebCopy will now correctly report non-HTTP links such as mailto: or ftp: as skipped rather than silently ignoring them
Internal engine changes [0000062]

Removed

The project scan and repair tool is no longer included in setup

Fixed

WebCopy could incorrectly exclude some URL's believing them to be mailto: links
Fixed several occurrences when a crash could occur when invalid path characters were present in URL segments
Some HTML tags appeared as "Unknown" in list views
URI's would be incorrectly combined if the relative URI was just query string and the source URI already had a query string
Download percentages were calculated incorrectly
Fixed a crash that could occur after copying a website if the Update local time stamps option was set
Report viewer didn't show external URI's
Fixed case-sensitivity issues in some built-in reports
Fixed a crash that could occur if non-NBT files were present in report folders with the rpt extension
Fixed a startup crash if the addins folder didn't exist [0000112]
Fixed a crash that could occur when trying to calculate the depth of a URI [0000116]
Fixed a crash that occurred if a project with a blank URI was opened and the user then attempted to browse to the blank URI [0000120]
Fixed an issue where Setup sometimes wouldn't replace files
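
The query string fix above concerns relative references that consist only of a query. Under the standard URI resolution rules (RFC 3986), such a reference keeps the base path and replaces the base query. Python's `urljoin` follows those rules, so it can serve as a quick illustration; the URLs below are made up.

```python
from urllib.parse import urljoin

base = "http://example.com/list.php?page=1"

# A query-only reference replaces the existing query rather than being appended to it.
combined = urljoin(base, "?page=2")
print(combined)
```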

Version 1.1.2.203 Beta 06 August 2016

Added

The original Include Subdomains option has been replaced with a new set of more comprehensive options, allowing for copying of sibling domains, linked resources, or everything
Added a new Download all resources option. When set, WebCopy will download any non-HTML linked resource, regardless of source domain. If the file would normally be excluded, it will only be downloaded, not crawled
Although the UI editor for additional hosts stated regular expressions could be used, this was never implemented. Regular expressions can now be used with additional hosts
Added a new Sitemap plugin that will generate a simple HTML sitemap of all downloaded files
Added a new Cookie Viewer plugin that allows a global view of cookies created during a crawl
Added native support for the picture element
The Capture Form window now remembers its size and position
Added an address bar to the Capture Form tool to allow access to hidden login URI's, or for any other type of manual navigation
Added a Scan button to the Capture Form tool allowing a manual scan for forms in the page if WebCopy failed to detect them initially
The maximum redirect chain length setting can now be configured from the Advanced options group
Added a new option to control if external redirects should also be followed
Added support for the brotli compression algorithm
Added a new option to control if the results list should automatically scroll to show the active item while performing a website scan

Changed

Now requires Microsoft .NET 4.6
When following redirects after posting form data, all built in skip rules are ignored, so if a post to one site directs to another to complete the post, WebCopy will now always follow the redirect
The Create Desktop Icon option in setup is no longer checked by default
The Test URL dialog now uses the proxy settings of the currently open project
The Website Links dialog has been slightly redesigned to prevent a crash when working with projects containing many thousands of links
Information dialogs accessed from list views now display the selected context in an easier to read format than plain CSV
The default maximum redirect chain length has been increased from 5 to 25
HTTP Compression options have been removed from the Advanced options group into their own dedicated group
Options for processing redirects have been removed from the Advanced options group into their own dedicated group
Minor performance improvements
Minor optimizations to reduce memory load
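
The maximum redirect chain length mentioned above (default raised from 5 to 25) amounts to a bounded loop. The Python sketch below is illustrative only: `follow_redirects` and the `lookup` callable, which stands in for a real HTTP request and returns the Location value or `None`, are hypothetical.

```python
from urllib.parse import urljoin


def follow_redirects(start, lookup, max_chain=25):
    """Follow at most max_chain redirects and return the final URI.

    lookup(uri) returns the Location header value, or None if uri is not a redirect.
    """
    uri = start
    for _ in range(max_chain + 1):
        location = lookup(uri)
        if location is None:
            return uri  # not a redirect: this is the final URI
        uri = urljoin(uri, location)  # the Location value may be relative
    raise RuntimeError(f"more than {max_chain} consecutive redirects from {start}")


table = {"http://a.example/": "/b", "http://a.example/b": "http://c.example/"}
print(follow_redirects("http://a.example/", table.get))
```

A redirect loop simply exhausts the limit and is reported as an error instead of running forever.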

Removed

Removed the setup option for creating a Quick Launch icon
Removed support for opening legacy XML based WebCopy projects
Because offline help was always outdated, owing to the general weakness of the product manuals, the offline help files are no longer included in the setup, and requesting help will always display the online version

Fixed

Problems loading or saving the user agent store should no longer be fatal
Some dialogs only supported local help requests and were unable to show help if the local help file was not available
Fixed a crash that occurred downloading a file if the Content-Encoding header of the HTTP response was set to identity
Fixed a memory leak in the sitemap component
A number of HTML 5 specific tags were listed as Unknown in crawl result list views
The UI now correctly reports if part of a crawl was aborted due to too many redirects
Crawling the same project multiple times in succession reused the cookies from the previous crawl
Trying to call wcopy.exe with just the file name of an existing WebCopy project always displayed a message about unsupported protocols and refused to continue
Fixed an unauthorized access crash that could occur when using the Capture Form tool (regression)
Fixed an issue where external links could appear in some lists even when filtering options were set to exclude them
Fixed a number of issues which prevented automatically logging into websites where the post URI was different to the get URI and value merging was required, or the post returned 302 and the new location must be read to complete the login
Fixed an issue where occasionally the Capture Form tool didn't refresh available forms after navigating to a page
Fixed an issue where refreshing a page in the Capture Form tool didn't
The CLI tool no longer incorrectly reports failures to download a single file as an application exception
The CLI tool now correctly outputs the reason why a given URI failed when performing recursive downloading
The Capture Form tool now correctly detects forms that are contained within frames
Export to CSV option featured when context clicking some list views didn't correctly escape the CSV
Fixed an issue where the entire application was terminated if CSV export failed
Fixed an issue where URI's that were both invalid and very long could crash WebCopy
Fixed a crash that could occur when attempting to remap a CSS file
Fixed an issue where URI segments containing spaces (or other encoded characters) weren't correctly decoded when generating local folder names

Version 1.1.1.4 04 February 2016

Fixed

Fixed an issue where downloaded files would ignore the save folder and start from the root directory if the URL was malformed and included a double slash after the domain, e.g. http://example.com//image1.png
Fixed a crash that would occur when trying to process a data: URI greater than 65519 characters
The Capture Form tool was incorrectly using the id attribute of form elements instead of the name
Setup was incorrectly downloading .NET Framework 4.5.2 setup if .NET Framework 4.6 was installed
Speculative fix for loading date times from project files
Speculative fix for odd crashes when opening the Capture Form dialog
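
The double slash problem above is easy to reproduce in most path libraries: a segment that starts with a slash is treated as rooted and discards the save folder. The changelog does not say how WebCopy fixed it; one plausible approach, sketched here in Python with a hypothetical `local_path` helper, is to drop empty path segments before building the local path.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path(save_folder, url):
    """Map a URL path to a file beneath save_folder, ignoring empty segments.

    Simplification: ".." segments and illegal characters are not handled here.
    """
    segments = [segment for segment in urlsplit(url).path.split("/") if segment]
    return str(PurePosixPath(save_folder, *segments))


print(local_path("/downloads/site", "http://example.com//image1.png"))
```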

Added

Double clicking an entry in the Cookies list view of the Test URL dialog now displays the details of the selected item

Version 1.1.1.3 27 December 2015

Fixed

WebCopy wasn't scanning the contents of style elements correctly
@import CSS rules were not being remapped if they did not use url() notation
Fixed a crash which could occur when a request made via the Test URL dialog failed, and no response was available
Fixed an issue where the Capture Form dialog sometimes did not list forms for a page when it should have

Version 1.1.1.2 Beta 13 December 2015

Fixed

Added missing Cyotek.Windows.Forms.FontDialog.dll file to setup
Errors loading statistic values are now ignored

Version 1.1.1.1 Beta 12 December 2015

Fixed

Speculative fix for a crash which could occur after copying a website
Fixed an issue where WebCopy could fail to open files in their default application if the full path included a space

Version 1.1.1.0 Beta 12 September 2015

Added

Added support for the srcset attribute
You can now specify custom attributes to include in link scanning
When logging an exception, diagnosis actions such as new version downloads or links to workarounds are now displayed, if applicable
Now supports finding links via the 300 "Multiple Choices" HTTP status code
Slight improvements to scan performance

Fixed

Fixed a crash that occurred if you entered an invalid path into the Save Folder field then attempted to copy a website
Fixed a problem where projects using a sub path and the Crawl above root URI option could save duplicate URI's into the project, causing a crash when attempting to reload the project
Fixed an issue where sitemaps belonging to projects using a sub path and the Crawl above root URI option were corrupt
When changing settings via the main Options dialog, some settings would not be applied as the old versions were cached
Fixed a start up crash that occurred if the externaltools.xml file was present, but invalid
The XPath expression for <meta http-equiv='refresh'> support wasn't strict enough and was picking up more elements than it should
HTML attribute scan rules that used regular expressions to transform only part of the value of the attribute incorrectly merged the transformed value
The link checker tools would not report URI's that weren't found if the URI was also external
The samples default tool link was incorrect
Demo project corrections

Version 1.1.0.2 Beta 25 August 2015

Added

Added a new option to control whether or not new pre-release (beta) versions are included in update checks
64bit versions of WebCopy (GUI / CLI) and Link Checker (GUI / CLI) are now available
You can now choose to display all errors, or only errors detected during the current scan in the Errors tab
Activating list items in the different result tabs now opens the appropriate properties dialog
Added useragent, prehead and no-prehead command line options to wcopy.exe
Uses alpha version of new exception logging library

Removed

Disabled glass effects unless using Windows Vista or Windows 7

Fixed

Build was deploying the .NET 3.5 version of Luminitix
If posting a form failed, the copy was automatically cancelled, but the reason why the post failed was not available
Pressing enter in the sitemap tree view could cause the link properties dialog to be displayed twice
The Link Checker GUI / CLI clients and the WebCopy CLI client no longer require the source URI to be qualified with the scheme, and will automatically add http if no scheme is present
CLI tools now correctly report errors
Default user agents of CLI tools were malformed
In certain circumstances, command line arguments would not be parsed correctly
401 challenge dialogs were not displaying correctly, instead a "Cross-thread operation not valid" message would be displayed in the log
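
The automatic scheme handling described above (adding http when no scheme is present) could be sketched as follows. This Python example is an illustration, not the CLI's actual parsing; `ensure_scheme` is a hypothetical name and the check only recognises schemes written with `://`.

```python
import re

# A scheme is a letter followed by letters, digits, "+", "." or "-", then "://".
SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def ensure_scheme(value, default="http"):
    """Prefix value with the default scheme unless it already has one."""
    value = value.strip()
    if SCHEME.match(value):
        return value
    return f"{default}://{value}"


print(ensure_scheme("example.com"))
print(ensure_scheme("https://example.com"))
```

Requiring `://` means a value such as `example.com:8080` is not mistaken for a URI with the scheme `example.com`.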

Version 1.1.0.1 25 May 2015

Fixed

Fixed a crash that occurred in the External Tools dialog when the Environment Variables sub menu was clicked
Fixed an issue where token menus (for example those in the External Tools dialog) containing environment variables could be excessively wide
Fixed an issue where the exit code of the CLI tools could be incorrect
If a view crashes when updating, it is now disabled for the remainder of the session without crashing the entire application

Version 1.1.0.0 Beta 09 May 2015

Added

Added a new command line version of WebCopy, allowing you to download single files or entire websites via the command line - perfect for use in scripting and maintenance tasks!
Added a new Dead Link Checker tool, which you can use to scan a website and detect dead links. This tool is available as both a GUI client and a command line interface
(Experimental) When analysing a site, WebCopy will now attempt to keep content in memory where possible, and only write it to disk if the content is above the default capacity
CSV exports of link maps now include an integer column for the HTTP status in addition to the textual description
Added a new option to disable the automatic URI remapping of downloaded files

Changed

WebCopy now requires Microsoft .NET Framework 4.5
The Errors tab no longer lists redirects, instead you can use the Redirects report

Removed

Windows XP is no longer supported
Removed the prefix with the website name / prefix with the website url option. This setting was confusing, served no real purpose and the default value was wrong

Fixed

Fixed a crash that could occur when loading a WebCopy project if the link map included a link to a resource without a default document, and a link to the same resource with the default document
WebCopy was processing some failed URI's despite the fact they had failed (regression from previous version)
WebCopy wasn't processing some response headers correctly if they weren't cased as expected (regression from previous version)
WebCopy no longer remaps links in local content that has not changed during the current session
Fixed an issue where WebCopy would send the if-modified-since or etag headers even though the local content was no longer available
Fixed a rare crash that could occur when remapping document URI's at the end of a download
Fixed a number of issues with <base> tag processing
Fixed an issue where WebCopy could incorrectly lose custom port information when combining two URI's in certain circumstances
WebCopy was incorrectly shortening file names with multiple periods, e.g. jquery.min.js to jquery.js (regression from previous version)
Sometimes WebCopy would try to map a document URI to a file name that was actually a directory, causing a crash
URI's path segments which contain illegal characters are now sanitized when converting them into file paths
In-line CSS is now correctly crawled
Crawling will no longer follow redirect chains beyond the 5th consecutive redirect
Fixed an issue where meta data could be read incorrectly based on encoding type
Problems that occur reading meta data for a downloaded file no longer block the crawl with a modal error dialog, instead the error is presented in-line at the end of the crawl the same as other errors
Time stamps displayed on completion dialogs are no longer displayed in UTC
Some columns in the results list view were not updated correctly unless the action was successful
Fixed several occurrences where link information wasn't being updated correctly

Version 1.0.10.1 25 March 2015

Fixed

The Quick Scan dialog can now have the scan cancelled
Fixed an issue where progress percentages for file downloads using gzip or deflate content encoding would be incorrect (regression from 1.0.9.1)
Fixed a crash that occurred in the Test URI dialog when using the POST verb on a page without any forms (regression from 1.0.9.1)
Fixed a crash that could occur when using non-HTTP/S URI's from the Test URI dialog
Fixed a crash that could occur when using the Test URI dialog and the current request failed (regression from 1.0.9.1)
Fixed a threading crash that could sometimes occur when trying to access the Quick Scan dialog
Fixed a crash that could sometimes occur when closing the Quick Scan dialog while a scan was in progress

Version 1.0.10.0 22 March 2015

Added

Reinstated digital signatures
When posting a form, existing values will be automatically merged with the user defined custom values
Added a new tool for capturing a form, making it much easier to extract the basic tokens for posting a form
Cookies are now supported by the Test URL dialog when making multiple requests from the same domain, including their own tab for viewing
All standard HTTP verbs are now supported by the Test URL dialog
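
The form value merging added in this version can be thought of as a dictionary merge. The Python sketch below assumes that user defined values take precedence over values already present in the page; the changelog does not state the precedence, and `merge_form_values` is a hypothetical name.

```python
from urllib.parse import urlencode


def merge_form_values(existing, custom):
    """Start from the values found in the form, then apply user defined values.

    Assumption: a user defined value wins when both supply the same field name.
    """
    merged = dict(existing)
    merged.update(custom)
    return merged


page_values = {"token": "abc123", "user": ""}
user_values = {"user": "alice", "password": "secret"}
merged = merge_form_values(page_values, user_values)
print(urlencode(merged))
```

Merging this way lets hidden fields such as anti-forgery tokens be posted without the user having to capture them by hand.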

Changed

The Test URL dialog has been split in two, so that the result content is always visible
The Rule Editor, Form Editor and Test URL dialogs are now all resizeable

Fixed

Fixed an issue where some form values would not be encoded correctly
GZip and deflate compressed data is now decompressed during the download, rather than after the entire content has been downloaded
The HTML view in the Test URL dialog now correctly updates each time a new request is made
WebCopy would often give file names a numeric suffix even if there was no reason to
If WebCopy tried to shrink a file name to fit within path limits, it incorrectly started by trimming the extension, instead of the name
WebCopy failed to shrink file names where the base path was above 248 characters and promptly crashed
Some files were missing from the setup that prevented exception reports from being submitted (regression from previous version)
Fixed a duplicated shortcut between Rules and Test URI
Exiting WebCopy while the RSS extension was updating caused a crash
Fixed an issue where files could be loaded with the wrong encoding when remapping documents, causing subtle corruption with the final output
The Scan Project repair tool crashed on start up (regression from previous version)
Opening a project always marked it as changed, causing the UI to prompt to save changes unnecessarily
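
The file name shrinking fixes above boil down to trimming the base name while leaving the extension alone. A minimal Python sketch follows; `shrink_file_name` is a hypothetical helper and the 260 character default is the classic Windows path limit, used here as an assumption rather than the value WebCopy applies.

```python
import os


def shrink_file_name(folder, file_name, max_path=260):
    """Trim the base name (never the extension) until the full path fits max_path."""
    stem, ext = os.path.splitext(file_name)
    room = max_path - len(folder) - 1 - len(ext)  # 1 for the path separator
    if room < 1:
        # The folder alone is too long; report it instead of crashing later.
        raise ValueError("folder path leaves no room for a file name")
    return stem[:room] + ext


shortened = shrink_file_name("C:/out", "a" * 300 + ".html", max_path=50)
print(shortened)
```

Since `os.path.splitext` only splits at the last period, a name such as `jquery.min.js` keeps its `.min` part.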

Version 1.0.9.1 13 December 2014

Changes and new features

Temporarily removed digital signatures, these will be reinstated shortly
Added Windows 10 to application manifests
Added Requests Per Minute limit mode
Added a new Enforce Limit Checks option. When set, limit requests will be enforced for all URI's that involved an HTTP request. If not set (default), limit requests will be enforced only for URI's that were successfully processed

Fixed

Fixed an issue where a WebCopy project could become corrupt
Limit checks are no longer applied to URI's that were skipped due to being external or by a rule
Changing the window font is now correctly applied to the main window when the settings are applied, rather than requiring the application to be restarted
Fixed a crash that could occur when attempting to obtain the display string for an enum value
Fixed a crash that could occur if there was a connection error when trying to post a form
Fixed an issue where the RSS feed wouldn't update when the Update Now option was used, unless a daily update was already pending
Fixed a crash that could occur displaying the rules editor

Version 1.0.9.0 01 June 2014

Changes and new features

(Deprecated) The prefix with the website url / prefix with the website domain name option of a crawl project has been deprecated and will be removed in a future update.
(Experimental) Added the ability to specify additional hosts. This allows you to include multiple domains per project, for example a CDN
(Experimental) Added proxy server support
Activating an item in either the Request Headers or Response Headers tabs of the Test URI dialog now displays the header information in a dialog for easy viewing/copying
The contents of the Select Mime Types dialog are now sorted
Items in the Title Replacements and Forms editors can now be reordered via drag and drop
Added a helper tool for backing up and restoring settings, or for resetting settings to default values
Added a stand-alone update check tool

Fixed

The Status Code column in the Results list is now no longer cleared when an action is performed that didn't involve an HTTP request, such as remapping the local file
The value of the Play Sounds setting wasn't being honoured by the Crawl Complete dialog
The prefix with the website url / prefix with the website domain name option of a crawl project now defaults to prefix with the website domain name for new projects
Pressing enter in the Post Values field of the Test URI dialog no longer activates the default button on the dialog
Fixed an issue where only the end of a host was inspected when checking if a given URI was a sub domain of another. For example, it would incorrectly return that static.oneexample.com was a subdomain of example.com
Fixed an issue where it was possible that WebCopy wouldn't prompt to save changes when exiting
An error is no longer displayed if you open a project saved using a newer version of WebCopy. The project will now be opened where possible, but a warning will now be displayed
Repeatedly clicking column headers in sortable lists now correctly cycles between Ascending, Descending and None, instead of only Ascending and Descending.
Fixed a problem where clicking the Add button in the Form Editor would clone the active form, including the internal ID of the form which should be unique, leading to crashes
Fixed an issue where settings were both loaded and saved using thread specific culture data, which could cause a crash if the computer culture information was subsequently changed. All settings are now saved and loaded using an invariant culture.
A crash no longer occurs if font information cannot be read correctly from stored settings
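
The subdomain fix above can be illustrated with a short Python sketch. This is not WebCopy's code: the function names is_subdomain and naive_is_subdomain are hypothetical, and the example assumes plain ASCII host names. It shows why comparing only the end of the host is not enough.

```python
def is_subdomain(host: str, parent: str) -> bool:
    """Return True if host is a strict subdomain of parent.

    Hypothetical helper for illustration only; not WebCopy's implementation.
    """
    host = host.lower().rstrip(".")
    parent = parent.lower().rstrip(".")
    if not host or not parent:
        return False
    # Requiring the "." before the parent avoids the suffix-only bug:
    # "static.oneexample.com" ends with "example.com" but not ".example.com".
    return host.endswith("." + parent)


def naive_is_subdomain(host: str, parent: str) -> bool:
    # The kind of suffix-only comparison the changelog entry describes as faulty.
    return host.lower().endswith(parent.lower())
```

With these definitions, static.example.com is accepted as a subdomain of example.com, while static.oneexample.com is rejected by is_subdomain but wrongly accepted by the naive version.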

Version 1.0.8.0 03 May 2014

Changes and new features

Simplified the highlighting and displaying of matches in the Regular Expression editor
Added support for ETag's and the If-None-Match header when reading headers to determine if a resource should be downloaded
The duration of each URI crawled is now recorded
The duration of the entire crawl is now recorded
Added the ability to limit crawling to a number of requests per second. Options to configure crawl limits can be found beneath the Advanced node in the Project Properties dialog
Added Slow Pages report
Reports are now loaded from disk (and on demand) instead of being pre-defined; user reports are now supported
Setup now allows you to customize which components are installed
Added product RSS notifications add-in
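
As a rough illustration of what limiting a crawl to a number of requests per second involves, the following Python sketch spaces requests out evenly. It is an assumption about one possible approach, not a description of how WebCopy implements its crawl limits; the class name RequestLimiter and its parameters are invented for this example.

```python
import time


class RequestLimiter:
    """Spaces out calls so at most max_per_second requests start per second.

    Illustrative only. The clock and sleep parameters are injectable so the
    behaviour can be checked without real waiting.
    """

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed to start."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval
```

A crawler loop would call wait() immediately before issuing each request.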

Fixed

The Regular Expression editor now correctly displays line breaks in the Replace tab
WebCopy now tries to be more intelligent when generating paths and file names for local files that reach maximum path lengths, by shortening file names and sub folders to try to make files fit. This should reduce occurrences of PathTooLongException exceptions being raised
CSS @import directives are now correctly processed if the url keyword is missing. Previously only @import url('<file>'); would work; now @import '<file>'; also works
Fixed an issue where the update check could cause the main window to be unresponsive
The Quick Scan dialog no longer crawls either sub-domains or above the root URI
Fixed an issue where the Quick Scan dialog wasn't cleaning up correctly when closed
A URI that returns an error status code is no longer flagged as "skipped" with the reason "Invalid Content Type".
If a crawl was cancelled due to an HTTP status code, the results list no longer flags any such URI as "skipped", but retains the original, correct, value
After editing a rule or form, the enabled state of the item could no longer be correctly toggled via the check boxes in their respective lists
In certain circumstances, creating a backup of a file could take a substantial amount of time
Fixed an occasional The path is not of a legal form exception when using the External Tools dialog
Fixed an issue where colour settings were sometimes not loaded correctly
Fixed an issue where font settings were sometimes not saved correctly
Font sizes are now displayed as whole numbers
Fixed an occasional crash resizing the application window with a collapsed panel
Corrected baseline positioning of editors and labels in dynamic user interfaces
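
The @import fix listed above concerns two spellings of the same directive. A rough Python sketch of a matcher that recognises both forms follows. It is not WebCopy's parser: the name find_css_imports is made up for illustration, and the pattern deliberately ignores CSS comments and media queries.

```python
import re

# Matches the target of an @import rule in either form:
#   @import url('file.css');   @import url(file.css);   @import "file.css";
_IMPORT_RE = re.compile(
    r"""@import\s+
        (?:
            url\(\s*(?P<q1>['"]?)(?P<url1>[^'")]+)(?P=q1)\s*\)   # url form
          |
            (?P<q2>['"])(?P<url2>[^'"]+)(?P=q2)                  # bare string form
        )""",
    re.IGNORECASE | re.VERBOSE,
)


def find_css_imports(css: str) -> list[str]:
    """Return the targets of @import rules found in css, in document order."""
    return [
        (m.group("url1") or m.group("url2")).strip()
        for m in _IMPORT_RE.finditer(css)
    ]
```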

Version 1.0.7.6 02 February 2014

Changes and new features

Panels in Option dialogs now load on demand
Option pages are now only initialized when requested by the appropriate dialog
Removed status code 520 (origin error) from the list of supported codes for automated error reporting during a crawl
The Image Viewer window no longer defaults to Fit when displaying an image, but now defaults to Actual size
Added additional themes for configuring the appearance of the GUI client window

Fixed

Fixed an issue cloning LinkInfo objects which hopefully is responsible for a rare Cannot access a disposed object crash using the Quick Scan dialog
Fixed an issue where the sitemap tree view could be populated up to 3 times rather than the expected once when opening a project
Dynamic options in the Options dialog are now positioned more sensibly in relation to the options label and editor, and other options in the same group
Fixed a problem where tool tips did not display under certain conditions, or could display the wrong (or blank) text
Extension mapping for dropped files was case sensitive
Reworked tool bar layout code to prevent overflowed buttons
Removed a number of integration hacks

Version 1.0.7.5 30 December 2013

Changes and new features

Token menus now include environment variables
Setup now offers to install the Microsoft .NET Framework 3.5 if not already present

Fixed

The maximized or minimized state of a window was no longer being restored when reopening the window
The Find and Replace dialogs in text editing windows now correctly default to the selected text as appropriate
The token button displayed when prompting the user for arguments for external tool execution now displays a menu with available tokens.
Attempting to open a folder whose full path contains a period no longer displays an Invalid Path message.
When restoring window position and size, the restored bounds are automatically recalculated to fit the monitor, for example when connecting via Remote Desktop with a smaller display resolution, or after the removal/repositioning of a monitor in a multi-monitor set up.
The main application window could no longer be sized smaller than its original startup size.
Fixed a problem introduced in the last update which caused the crash reporter to no longer submit crash reports

Version 1.0.7.4 24 November 2013

Fixed

Fixed a crash introduced in 1.0.7.0 when running rules if the URI being processed was shorter than the base project URI
Fixed a crash that would occur when clearing the link map and the project did not have a valid URI
Fixed an issue introduced in 1.0.7.0 when deleting items with the popup Rules and Forms editor, where either the wrong item would be removed visually, or the software would crash.
Fixed an occasional crash introduced in 1.0.7.0 moving items with the popup Rules and Forms editor. Note that the moving of rules and forms has no functional use and will be removed in a future version of the product. Also note that moving is performed on the underlying collection, not the visual display sort
Fixed a problem with the Rules, Forms and Password editors where it was often quite difficult to add new items via the popup editors as they kept trying to update a previous selection
Fixed an occasional crash after using the Quick Scan dialog

Version 1.0.7.3 16 November 2013

Fixed

Added manifest so that when running under Windows 8.1 / Server 2012 R2 the OS version is correctly reported.

Version 1.0.7.2 12 November 2013

Fixed

Fixed a crash that occurred when building a sitemap for a project that had root level URI's with a query string, but without a document name.

Version 1.0.7.1 11 November 2013

Fixed

Speculative fix for a Parameter is invalid exception that randomly occurs when painting windows
Fixed a crash that occurred when modifying a rule, and the project URI had been cleared or was invalid
Fixed a crash that could occur when inserting a new rule or a new form
Fixed a crash that could occur when attempting to process root level URI's in the sitemap
Fixed the status bar not updating correctly during a crawl action

Version 1.0.7.0 10 November 2013

Changes and new features

Experimental: Added a new option to simplify the sitemap treeview. When this option is set, folder containers are no longer displayed if the folder only has a single page
Experimental: Modifying a rule now reapplies rules to the sitemap, allowing easier sitemap manipulation without having to rescan the site. Note this feature only works on the current contents of the link map; if the link map is incomplete due to existing rules, a rescan will be required regardless.
Sorting of the sitemap now uses natural sorting, so names appear in a logical order, e.g. 1, 2, 10 rather than 1, 10, 2
The Rule and Forms lists now default to sorted
List views that support sorted columns now use natural sorting
When building a sitemap, folders are no longer generated for URI's that match except for differing query strings
New API to allow plugin authors to add additional functionality to application windows when they are created
Sitemap treeview now displays URI's relative to the base URI
Rules that do not use the Use Full Uri flag now also strip out the leading path of the base URI. For example, if the base URI of the project is http://demo.cyotek.com/staticwebsite/ and the current URI being crawled is http://demo.cyotek.com/staticwebsite/blog/page1.html, the text used by the rule engine will be /blog/page1.html
The Differences tab now lists all URI's which are new to the last scan, in addition to existing checks of modification dates. Due to the introduction of this setting, all URI's will be marked as new for existing projects, until that project is rescanned and saved.
Removed the Use Modified Uri rule flag
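
The natural sorting mentioned in the changes above (1, 2, 10 rather than 1, 10, 2) can be sketched in a few lines of Python. This is only an illustration of the general technique, not WebCopy's implementation; natural_key is a hypothetical name.

```python
import re


def natural_key(text: str) -> list:
    """Sort key that compares runs of digits by numeric value."""
    parts = re.split(r"(\d+)", text)
    # re.split with a capturing group alternates text, digits, text, ...
    # so odd positions always hold the digit runs.
    return [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]


names = ["page10.html", "page2.html", "page1.html"]
ordered = sorted(names, key=natural_key)
```

A plain string sort would place page10.html before page2.html; the key above compares the numeric parts as numbers instead.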

Fixed

Fixed a problem where clicking OK on the Edit Rule dialog saved changes even if there was a validation error and the user subsequently clicked Cancel
Fixed a problem where the Quick Scan dialog failed without finding any URL's if the Inclusion / Exclusions options were set
Fixed a problem where page titles and descriptions containing HTML entities were not decoded
Fixed a problem where the sitemap could include URI's containing query strings, even if the strip query string segments option was set
Disabled Glass effects on dialogs when running under terminal services connections
Fixed a problem where source redirect URI's were not excluded, and appeared in the sitemap
Outgoing links for an existing link are no longer cleared if the link is excluded for any reason
Fixed an issue where the skipped status of a URI wasn't reset correctly

Version 1.0.6.1 31 October 2013

Fixed

Fixed a crash which occurred when opening the Website Size dialog and the linkmap was empty.
Fixed a potential crash in 1.0.6.0 due to some left behind debug code

Version 1.0.6.0 31 October 2013

Changes and new features

The Additional URL's section of the Project Properties dialog now allows the entering of relative URI's.
The sitemap tree view now only loads children on demand, improving performance for large projects
The different results tabs are also now loaded on demand, again improving performance for large projects
Added a new setting which determines if the Sitemap tab is activated when opening existing projects
Added new options to the Website Size dialog for either using total size or link count for content types, and for limiting the number of slices displayed
Various minor UI tweaks

Fixed

Fixed a problem where duplicate URI's could be present in the linkmap in rare circumstances, causing a crash when trying to reopen the project. A repair tool is also available for projects affected by this bug.
Fixed a potential crash that could occur attempting to retrieve shell icons.
Fixed a problem where commands linked to URI's that contained spaces in their respective query strings caused the command to fail with an Invalid URI message.
Fixed a rare problem where it was possible WebCopy would place the same URI twice in the processing queue, and immediately cancel the copy as soon as the second occurrence was hit.
Fixed a problem where WebCopy did not check the internal document version to ensure it was supported
Fixed an issue where toolbars were initialized before the window was resized to whatever the user had defined, meaning some toolbars were unnecessarily placed on new rows
Corrected some invalid message window and dialog titles
Fixed a crash which occurred when clicking pie slices in the Website Size dialog and filtering was enabled

Version 1.0.5.5 20 October 2013

Changes and new features

Added a new Ignore SSL Errors option. If this option is set, attempting to scan a website that contains an invalid SSL certificate will be allowed. The default for this setting is false, meaning that WebCopy will not scan websites with invalid certificates.
Double clicking (or pressing enter) on a file node in the sitemap tree view now displays the appropriate properties dialog
The Print and Print Preview options are now correctly enabled in the Image Viewer.

Fixed

Fixed a problem where CSS was remapped incorrectly if the website scanned wasn't the domain root
Fixed a crash that occurred calculating disk space estimates if a website was being copied to a UNC path
Fixed a problem where a website that was supposed to be copied to a UNC path was instead copied to the drive where WebCopy was installed using the UNC server name as a sub folder
Speculative fix for a CSS remapping crash possibly due to malformed CSS
Fixed a problem where attempting to open an Explorer window to a UNC path displayed an invalid folder message
The Page Setup option in the Print Preview dialog didn't do anything when activated

Version 1.0.5.4 11 September 2013

Changes and new features

Simplified the User Agent editor

Fixed

Fixed a problem where CSS files did not have their URL attributes remapped. This was only evident when using the flatten website folder option on a website where CSS files referenced images from other folders.

Version 1.0.5.3 21 August 2013

Fixed

If a problem occurs decompressing data compressed using the deflate or gzip algorithms, the download will be automatically retried with these options disabled.
WebCopy no longer attempts to decompress files that returned none as the content encoding.
Fixed a crash that occurred if the Use Shell Icons option was enabled but the user did not have access to parts of the Registry when searching for mime types.
Fixed a crash that occurred when shutting down multiple instances of WebCopy at the same time
Fixed a crash that occurred when trying to copy a website if the path entered to download the website into contained invalid characters
Fixed a crash that occurred trying to browse to a path that contained invalid characters

Version 1.0.5.2 03 August 2013

Changes and new features

Added a new use alternate directory character option. By default, when WebCopy remaps URI's to local files, the / character is replaced with \. Setting this option reverses this default behaviour, so that the \ character is replaced with /. This option can be found in the Copy Settings tab of a project's properties.

Fixed

Fixed a problem where remapped anchors had fragment information stripped
Fixed a problem where attempting to open a folder where the drive letter was in lower case caused WebCopy to display an "invalid path" message.

Version 1.0.5.1 23 June 2013

Fixed

Added an additional check for servers that return a malformed UTF charset header.

Version 1.0.5.0 09 June 2013

Changes and new features

Added a new Origin Report setting. This setting allows the generation of either single or multiple origin files which are saved alongside downloaded content and include the source URI. This new setting can be found in the Advanced section of a project's properties.
Request headers are now stored with each URI the same as response headers and will be saved in the project for later retrieval if the Save Headers option is set.
The Headers tab in the Link Properties dialog now displays request headers
Added new options for setting the Accept and Accept-Language request headers.
Removed status code 406 (not acceptable) from the list of supported codes for automated error reporting during a crawl
Temporary files are now created in the folder where the website is being downloaded to, speeding up the final moving of files after a successful download, and avoiding potential problems if the disk where the temp folder is located doesn't have sufficient space to store the downloaded file.
Quick Scan dialog now displays progress state while performing the scan

Fixed

Referring URI's were being incorrectly set since the last update
Total download size is now incremented correctly even if the content length was reported as zero by the server
Filtering a grouped list didn't preserve groups when previously filtered items were restored
URI's that have a status code of 406 (not acceptable) now have the correct skip reason associated with them
The Content tab of the Test Link dialog didn't always correctly display returned content
Fixed a crash that could occur when attempting to sort a list
Fixed a crash if the Content-Type response header contained a space before the encoding name, for example text/css; charset= UTF-8.
Fixed a crash if the Content-Type response header specified utf8 instead of utf-8.
Fixed a crash that could occur if the source URI couldn't be decoded correctly
Deflate encoding now once again correctly works after being broken in a previous build
Fixed an occasional crash attempting to get the short form URI pattern when creating rules from an existing URI
Fixed a problem where tool bars didn't wrap correctly if a new tool bar had to be placed on a new row
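
Several of the fixes above deal with unusual charset values in the Content-Type header (a space before the encoding name, or utf8 instead of utf-8). The following Python sketch shows one tolerant way to read such a header. It is an assumption for illustration, not WebCopy's code: the function name is hypothetical, and falling back to utf-8 when the value is unusable is a choice made for this example.

```python
import codecs


def charset_from_content_type(header: str, default: str = "utf-8") -> str:
    """Extract a usable codec name from a Content-Type header value."""
    for param in header.split(";")[1:]:
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "charset":
            # Tolerate surrounding spaces and quotes around the value
            candidate = value.strip().strip("\"'").strip()
            try:
                # codecs.lookup normalises aliases such as "utf8" to "utf-8"
                return codecs.lookup(candidate).name
            except (LookupError, ValueError):
                return default
    return default
```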

Version 1.0.4.0 02 June 2013

Changes and new features

Experimental: Added the basis of a "quick scan" feature. This scans the top level of the website for unique absolute URI's (removing bookmarks and query strings) and is useful for getting a quick overview of the top level structure of the website, making it easier to detect and exclude pages that have no benefit to copy (such as new thread / reply thread pages in a forum). As with other experimental features, this will be expanded over future updates.
By default, new projects will now remap local file extensions based on their file type if no existing extension is present
Removed status code 502 (bad gateway), 503 (service unavailable) and 504 (gateway timeout) from the list of supported codes for automated error reporting during a crawl

Fixed

Fixed a problem where when using the Excluded and Add Rule commands, the generated URI was invalid if there was a mix of www prefixed and non prefixed URI's
Fixed a crash that occurred when clicking the Test URI button in the Form Editor and the URI of the project is invalid
Fixed a problem where occasionally it was possible to execute two crawls at once, causing the second crawl to crash
Fixed a crash that occurred when WebCopy tried to map the folder aspect of a URI and a file already existed with the same name
Fixed a crash that occurred when submitting the remove missing links dialog for a project without a valid URI

Version 1.0.3.3 19 May 2013

Changes and new features

Status bar now shows pending crawl requests.
The progress bar now attempts to show current progress based on total requests. It's not hugely accurate as it doesn't take into account the size of each request, but is better than a marquee! Windows 7 and 8 users will see the same behaviour on the taskbar progress.
Added support for the data attribute of the object tag.
Removed status code 500 (internal server error) from the list of supported codes for automated error reporting during a crawl
Removed downloaded file hash calculation as the hashes currently aren't used by WebCopy

Fixed

Fixed a problem where GZIP compressed content was downloaded incorrectly if the response headers didn't include a content length
Fixed a problem where some users experienced a startup crash when initializing fonts
Fixed a build problem that meant some exception reports were missing information
Fixed a problem where buffers were incorrectly being processed when downloading which could lead to a potential crash or corrupt file if the response header didn't include a content length, and otherwise just did extra repeated work if a length was available
Fixed a crash that could occur when crawling websites that had many nested branches of links
Temporary files generated during the analysis of a website are now deleted as soon as they are no longer required, rather than only once the crawl has completed
The "is missing" check was ignoring HTTP status codes and only going from the scan index

Version 1.0.3.2 15 May 2013

Changes and new features

Minor improvements to crawl performance with websites that have a lot of cross page linking
Sitemap tree view now highlights missing URI's with a configurable color
Sitemap tree view now highlights folder nodes when all children match the same status
Added a new Remove Missing Links command. This allows you to selectively remove missing URI's from the sitemap, without having to clear the entire map.
Website Links dialog lists are now highlighted according to the status of the link
Added additional filter options to the Website Links dialog

Fixed

Fixed a crash that occurred when attempting to crawl a CSS file that contained unclosed comments
URI's which are skipped due to a 403 response code are now correctly flagged as Forbidden as the skip reason
Exception reports now include details of type load exception data
A crash no longer occurs if restoring a window's previous state fails
Fixed a crash which occurred when attempting to combine a URI with a partial URI that contained one of the reserved characters from RFC3986.
Fixed a problem where URI's were not combined correctly if the relative URI comprised solely of a query string
Start up errors when loading extensions can now be reported, and no longer prevent the application from starting
Removed invalid Set as active URL item that appeared on several context menus
Obsolete outgoing links were not being removed when crawling a source URI
Filter options in the Website Links dialog are now correctly available no matter which tag is active
Fixed a massive performance issue when populating lists under certain conditions
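
The fix above for relative URI's consisting solely of a query string corresponds to the reference resolution rules of RFC 3986: the base path is kept and only the query is replaced. As a hedged illustration (using Python's standard library rather than anything from WebCopy, with a hypothetical wrapper named combine and example.com URIs chosen for the example):

```python
from urllib.parse import urljoin


def combine(base_uri: str, relative_uri: str) -> str:
    """Resolve relative_uri against base_uri using the standard library."""
    return urljoin(base_uri, relative_uri)


# A query-only reference keeps the base path and swaps the query string
resolved = combine("http://example.com/blog/page1.html?page=1", "?page=2")
```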

Version 1.0.3.1 28 April 2013

Changes and new features

A new Test URI feature is available. This allows you to test a given URL, choose verbs, post information, or experiment with different user agents. Any returned output is viewable, allowing you to easily check if user agents have an impact on returned content, and methods such as HEAD are supported for crawling.

Fixed

When entering a URL without a scheme, a default scheme of http:// will be applied
Old version notices are no longer displayed after opening a project in the old XML format and then creating a new project
Fixed a problem where the XML generated by an exception was invalid
Fixed an issue where the content encoding of a page wasn't picked up if the value ended with ;
Fixed a crash when trying to open or copy the URI of an invalid tree view node

Version 1.0.3.0 14 April 2013

Changes and new features

Experimental: Added a new option to display excluded URI's in the sitemap tree view
Experimental: Added new "quick exclude" option to the context menu of the sitemap tree view. This new feature allows you to quickly include or exclude a given URI from being crawled.
Replaced 3rd party PDF library, this should allow for better compatibility with different types of PDF documents
Added a disk space check during the crawl. If WebCopy doesn't think there is sufficient space to download a file, it now automatically aborts the crawl.
Added a new option for automatically opening the last project when starting WebCopy without a command line
Content types prefixed with x- automatically fall back to the non-prefixed version where the prefixed version cannot be found
If the crawl of a given page fails, for example with a 500 error, WebCopy now attempts to get any response data which can then be viewed from the new Content tab of the Link Properties dialog. This response data is not saved with the project and is only available for the duration of the session from which it was populated.
The Addin Manager dialog now lists loaded meta providers
Improved the completion message for where a crawl was cancelled due to a failure trying to crawl the primary site URI, or any user defined additional crawl URI's
Completion dialog now includes statistics on the crawl, such as files downloaded and total size
Added Knowledge Base link to the Help menu
Help file updated

Fixed

The sitemap tree view is no longer sorted by an odd combination of last modified+page count and name, but now only by name
Failure to create the save folder no longer crashes the application
Improved folder validation for projects using the create folder for domain option.
Fixed a problem where the crawl map wasn't correctly generated if the base URI of the project included a document name
Fixed a problem where the Add Rule command didn't create a sensible default if the main project URI included a starting document
Fixed a problem where the selected entry in some list controls was invisible when the control did not have focus
The Errors tab is now once again correctly made the active tab after a crawl has completed with errors
Fixed a problem where the pre-crawl validation didn't correctly handle trying to download to a disk that didn't exist
If a given URI is detected as "not modified", WebCopy will now still scan the links of the local file rather than stopping processing for that URI. This now allows you to correctly download a website once, then only download changed files during future crawls.

Version 1.0.2.2 08 April 2013

Fixed

Fixes a startup crash that could occur if either of the splitters on the main window had been moved to a certain position and the window was then resized to a smaller size

Version 1.0.2.1 30 March 2013

Changes and new features

Crawl exceptions can now be reported to help improve WebCopy

Fixed

Validation errors that occur during HTTP parsing are now ignored. This allows crawling of websites such as snapfiles.com which do not follow the HTTP specification
Exceptions that occur when trying to read website headers (such as the previously mentioned protocol violation) are now correctly reported instead of being silently ignored
If an error is encountered during a crawl, the message now states that the crawl was cancelled rather than completed successfully
When saving a WebCopy project, it is now correctly written to a temporary file first and then moved to the save file if successful
Fixed a rare crash where the exception handler crashed trying to generate the XML report
Only the first occurrence of each unique mime type was given a shell icon in the sitemap tree view and other supported views
The sitemap tree view context menu was enabled even if the tree view was empty, causing crashes if any of the menu items were clicked
Fixed a crash if you attempted to browse for a folder and had manually entered a path containing invalid characters
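
The save behaviour described above (write to a temporary file first, then move it over the real file only on success) is a common pattern for avoiding half-written files. A minimal Python sketch of the idea follows. It is an assumption about the general technique, not WebCopy's code; save_atomically and the file name used in the usage note are hypothetical.

```python
import os
import tempfile


def save_atomically(path: str, data: bytes) -> None:
    """Write data to a temporary file, then move it over path on success."""
    folder = os.path.dirname(os.path.abspath(path))
    # Create the temporary file beside the target so the move stays on one volume
    fd, temp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace overwrites an existing destination in a single step
        os.replace(temp_path, path)
    except BaseException:
        # Leave any existing file at path untouched and remove the partial file
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

For example, calling save_atomically("project.bin", payload) either leaves the previous project.bin intact or replaces it with the complete new contents.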

Version 1.0.2.0 23 March 2013

Changes and new features

WebCopy projects are now saved as binary files, and it is no longer possible to save in the old XML format (however you can continue to open them). As a consequence of this change, project files are now smaller (for example a 20MB XML project is now 4MB) and are quicker to load and save. When saving an XML based project, a one-time backup will be created of the original XML before the new binary file is written. Note that versions of WebCopy prior to 1.0.2.0 cannot open these binary files.
Predefined default documents list now includes index.cfm and index.jsp
Default documents editor now displays the predefined default documents used if the field is left blank
UI access to "hide" URI's in the link map has been disabled. This functionality will be removed in a future update
Experimental Sitemap treeview now highlights new or changed links. The color can be configured via the Appearance tab in the Options dialog
Experimental The Missing tab now also includes new or changed links in addition to links present in a previous scan but now not found
The last downloaded attribute of URI's is no longer updated if the download was the result of an analyse operation rather than a full copy
The default save folder option now has an appropriate default

Fixed

When creating a new project, any existing sitemap wasn't removed from the treeview
Fixed a crash which occurred when switching back to the Sitemap tab and the current project's URI was blank or invalid
Fixed a crash which occurred when attempting to view the website diagram and although a crawl map was available, the current project's URI was blank or invalid
Failure to save a project no longer results in a crash
Exceptions that occur when opening a project can now be reported
Pressing return in the form editor data field and default documents editor no longer attempts to submit the editing dialog
Fixed the client component name for metrics
If there is a problem setting the progress bar overlay on the taskbar, a crash no longer occurs
Fixed the wait cursor not always appearing when a blocking action was occurring
Fixed a rare clean up crash when removing temporary files
Fixed UI controls being enabled in the URI properties dialog when they should have been disabled
Pressing enter on a focused link label now correctly activates the link, instead of attempting to activate the default button for the window
Default images are now correctly used if shell icons are enabled and no appropriate icon is available

Version 1.0.1.4 16 March 2013

Changes and new features

The Last-Modified HTTP header is now supported. Last modified values specified elsewhere (for example in the meta tag of a HTML document) will override this value.
Added new option to set the timestamps of local files to mirror the Last-Modified timestamp where available

Fixed

The "Missing" tab was displaying the wrong content for the Title and Description columns
Fixed a problem where UI settings sometimes were positioned incorrectly in the Options dialog
Fixed a problem where drop down based UI settings didn't render the selected item correctly when displaying the drop down component
Fixed a problem where file dialogs pre-populated with a full path didn't set the initial folder of the dialog
Addins which had additional dependencies located in the addins or views folders failed to work
Fixed issues with toolbar initialization

Version 1.0.1.3 15 December 2012

Fixed

Removed some debug code that could cause a crash
Exceptions which occur during the automated update check (e.g. no internet connection) are now silently ignored

Version 1.0.1.2 13 December 2012

Changes and new features

Backup settings for project files are now available on the Options dialog
Update check is now enabled by default
You can now drag and drop projects from Windows Explorer onto the main application window to open them
Added an option to disable shell icons in the sitemap tree view

Fixed

Added an additional path validation check so the application no longer crashes if you set the download folder to be something invalid, such as ftp://.
Fixed a problem where byte order marks were not being saved into data files, resulting in corrupt text data, such as page titles in the link map. This was only apparent when downloading text documents containing non-ANSI characters. Binary files were not affected.
Fixed a problem where response encoding was not properly processed
Fixed a problem where URI processing exceptions that didn't involve an error-state HTTP code weren't reported correctly in the UI
Fixed a problem where a URI which returned a quoted character set (for example text/html; charset="utf-8") caused processing of that URI to fail. When combined with the above bug, this led to URI's being skipped with no way to tell why they were skipped or what any problems were.
Fixed an issue where the default user agent was always being used when crawling a website.
Fixed a crash saving untitled projects introduced in the last build
Fixed a crash when creating a new tool introduced in the last build
Fixed a problem where commands with hot keys could be activated while the user interface was disabled, leading to either a crash or an invalid application state
Fixed a crash if the user attempted to copy a URI and the clipboard was in use

Version 1.0.1.1 10 November 2012

Changes and new features

External Tools dialog now includes a preview of the command line and allows tools to be executed from within the editing dialog
External tools now support using environment variables
User Agent editor now shows the default user agent
Added a status bar indicator showing how long the current operation is taking
Added new External URI's and Images reports
Added automatic update check which can be enabled/disabled in the Options dialog. When enabled, once a day a check is made, and if an update is found a notification is displayed in the status bar.
The remap extensions mode is no longer a simple on/off switch, but now allows you to select if extensions should always be remapped, never be remapped, or remapped only if no existing extension is present.
Added the ability to create content viewers
Added content preview support to report viewer
The always download latest version option is now enabled by default for all new WebCopy projects
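
The three remap extension modes described above can be sketched as follows. The names `RemapMode`, `EXTENSIONS` and `remap_extension` are assumptions made for this illustration, and the content type table is a tiny stand-in rather than WebCopy's mime database.

```python
from enum import Enum
import posixpath


class RemapMode(Enum):
    ALWAYS = "always"          # always replace the extension
    NEVER = "never"            # keep the file name untouched
    IF_MISSING = "if_missing"  # only add one when none is present


# Hypothetical content type to extension table, for illustration only.
EXTENSIONS = {"text/html": ".html", "image/png": ".png"}


def remap_extension(filename, content_type, mode):
    root, ext = posixpath.splitext(filename)
    mapped = EXTENSIONS.get(content_type)
    if mapped is None or mode is RemapMode.NEVER:
        return filename
    if mode is RemapMode.IF_MISSING and ext:
        return filename
    return root + mapped


print(remap_extension("page.php", "text/html", RemapMode.ALWAYS))
print(remap_extension("page", "text/html", RemapMode.IF_MISSING))
```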

Fixed

Fixed an issue where an unexpected exception that occurs during URI preprocessing crashed the application rather than just aborting the active URI
Exception reports were missing data information
Fixed the settings dialog always reapplying all settings regardless of whether the settings page had tracked any changes
Fixed a crash which occurred if the About dialog was displayed after viewing a report.
Fixed a problem where a settings page that did not save changes correctly crashed the entire application
Fixed a crash that occurred if the root Theme menu was clicked
Fixed an issue where duplicate entries with the same URI and internal ID could be added to the link map when opening a project
Default user agent was missing platform name
Single tabbed options dialogs are no longer quite as narrow
Fixed a number of duplicate accelerators in command menus
Fixed a crash that occurred if the Courier New font was not available even if an alternate font had been specified in the Options dialog.
Malformed reports no longer crash the application
Application should no longer crash if there is a problem rendering button components. Could not reproduce original bug and error report did not include contact details so it's possible issues still exist. This is a workaround, not a true fix.
Fixed some button tooltips incorrectly including the ampersand and ellipsis characters
Fixed a problem where modifying a value in the Options dialog partially applied the value even if the dialog was subsequently cancelled
Fixed a problem where the "show folder paths" option was read via one name but written with another, preventing the value from being usable
Fixed a problem where the initial state of the Rules Editor was incorrect when displaying the editor in a project with no rules

Version 1.0.1.0 21 October 2012

Changes and new features

Sitemap tree view now displays file icons rather than a generic document icon
Added experimental Reports feature for performing dynamic querying of data. At present three read-only reports (Redirects, Not Found and Empty Meta Data) are provided; future updates will expand upon these and add the ability to create your own.
Added the ability to allow editing for link titles and descriptions. This allows you to customize the title and description of a link, and also prevent future updates from resetting the values. Note: If the option to clear the link map when analyzing a project is set, customizations will be automatically lost.
Added the ability to specify inclusion/exclusion criteria for mime types. This allows you to exclude certain file types from being downloaded, for example you may wish to ignore all EXE files.
Added new feature to cancel a crawl if a given HTTP status code is met. These options can be configured on the Advanced page of a project's properties.
Added new Export Site URI's option. This will export all URI's and their statuses to a CSV file for external processing
The link properties dialog is now resizable
It is now possible to install additional readers for document meta information
Added the ability to view a diagram of a website's structure via the Website Diagram addin.
Double clicking an item in the Forms list now automatically opens the form editor
Double clicking an item in the Rules list now automatically opens the rule editor
Browse folder dialogs now allow entering the path on Windows Vista and above
Font settings can now be set through the Options dialog
Substantial API changes to make it easier to use.
User interface should now remain usable whilst analyzing a website or building sitemaps
Added the ability to extract the titles from PDF files during crawling
Added the ability to load meta data from RSS files during crawling
Added Reverse option to rules. When this option is set, the rule is processed if the regular expression is not matched.
The link properties dialog now displays incoming and outgoing links for the source URI
Removed the single instance limit
HTTP redirect responses are no longer classed as errors
URL's in link map dialog selection combo box are now ordered
A new indicator has been added showing the total size of content downloaded during the active crawl operation
Includes customer experience improvement program
Various minor user interface enhancements
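
As a rough illustration of the Reverse option for rules listed above, the sketch below models a rule as a regular expression plus a flag that inverts the match. The `Rule` class and its `applies_to` method are hypothetical names, and Python's `re` module stands in for the .NET regular expressions WebCopy uses.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str
    reverse: bool = False  # when set, the rule applies to URIs that do NOT match

    def applies_to(self, uri):
        matched = re.search(self.pattern, uri) is not None
        # A normal rule applies on a match; a reversed rule applies otherwise.
        return matched != self.reverse


only_html = Rule(r"\.html?$", reverse=True)
print(only_html.applies_to("http://example.com/setup.exe"))
print(only_html.applies_to("http://example.com/index.html"))
```

A reversed rule is convenient for "everything except" cases, where writing a negative pattern directly would be awkward.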

Fixed

Fixed an issue where analyzing a website would incorrectly download content files that could not be crawled
Errors list no longer displays -1 instead of the appropriate error code
Fixed an issue where it was not possible to open certain project files
Fixed a problem where it was possible that the wrong item could be edited from the Forms list
Fixed a problem where it was possible that the wrong item could be edited from the Rules list
Fixed a problem where an addin that could not be initialized left the application in an unstable state
Fixed a problem where some application settings were not immediately applied when changed by the user
Default user agent now follows RFC 2616
Fixed an issue where URI preprocessing wasn't immediately applied to URI's detected for crawling, which could cause additional unwanted entries to appear in the link and crawl maps.
Fixed a problem where modifying custom project settings exposed via an addin didn't mark the project as changed
Fixed a problem where the Disable Links rule condition wasn't working as expected
Fixed a number of issues with the error reporting tool
Fixed an issue where creating a new document when changes had been made to the current document did not prompt to save said changes, causing them to be lost
Fixed an issue where clicking an empty MRU could prompt to remove a blank filename
Fixed a crash which occurred when hovering over an overflow toolbar button
Items in the toolbars menu now appear in the correct order
Regular Expression editor now correctly updates when modifying the Replacement field.
If a link was modified, it did not mark the project as changed
The error list now includes the description of HTTP response code errors

Version 1.0.0.9 18 February 2012

Changes and new features

Added a Replace section to the Regular Expression dialog to make it easier to test replacement expressions
Various performance enhancements
The Errors tab no longer lists "Unknown Response" for non-200 HTTP codes, but instead includes the code description
Added the ability to run user defined custom tools from within the application
Attempting to open a recent file which no longer exists now prompts to remove the missing file from the recent files list

Fixed

Fixed a crash when crawling if a rule was created with an invalid regular expression
Reworked application mutex to avoid silent startup and shutdown exceptions
Fixed regular expression cache not being thread safe
Status bar wasn't correctly cleared if there was a problem populating a view which required a valid crawlmap
Fixed status bar messages occasionally not appearing

Version 1.0.0.8 04 December 2011

Changes and new features

Product help is now available and the product is now out of beta
Added the ability to enable the "multi line" option in the Regular Expression editor to more easily test patterns using ^ and $ on lists of URL's
Added a Test URL option for Forms, allowing you to test that your forms can be successfully POSTed prior to running a full crawl
Changed settings dialogs to use a tabbed interface
Holding down Shift when clicking the Copy Website or Analyze buttons forces the download of all resources, skipping last modified checks
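
To show why a multi-line option matters when testing ^ and $ against a list of URL's, here is a small sketch using Python's `re.MULTILINE` flag. Python's engine is only a stand-in for the .NET engine WebCopy uses, and the URLs are made up, but the anchor behaviour illustrated is the same idea.

```python
import re

# A list of URLs, one per line, as might be pasted into a test box.
urls = "\n".join([
    "http://example.com/a.html",
    "http://example.com/b.pdf",
    "http://example.com/c.html",
])

pattern = r"^http://example\.com/.*\.html$"

# Without the flag, ^ and $ only anchor to the start and end of the whole text.
single = re.findall(pattern, urls)

# With the flag, ^ and $ anchor to the start and end of every line.
multi = re.findall(pattern, urls, re.MULTILINE)

print(single)
print(multi)
```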

Fixed

Fixed a large number of issues with the application services libraries and components
Fixed an issue where attributes of posted URL's were not correctly loaded if encountered at a later point during the crawl
Fixed a crash which could occur when using the title replacement options and a page had a null title
Fixed a crash which could occur when scanning a HTML tag containing a malformed URL
Fixed an issue where email addresses were stripped if they contained the # character and the "strip fragments" option was enabled

Version 1.0.0.7 Beta 24 August 2011

Changes and new features

The Link Map window now remembers its size and position
The URI control for selecting the website to analyze is now tied to the system URI history
Removed the confirmation prompt when rebuilding a crawlmap from saved history information
The link scanner now supports the use of the base tag. If present, the URI value will be combined with links on the page.
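
The base tag behaviour described above can be sketched with Python's `urllib.parse.urljoin`. The `resolve_link` function is a hypothetical helper written for this example, not WebCopy's link scanner: when a page declares a base URI, links are resolved against it instead of against the page's own address.

```python
from urllib.parse import urljoin


def resolve_link(page_url, link, base_href=None):
    """Resolve a link found on a page, honouring an optional base href."""
    # The base href may itself be relative to the page URL.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, link)


page = "http://example.com/a/index.html"
print(resolve_link(page, "img/logo.png"))
print(resolve_link(page, "img/logo.png", base_href="http://example.com/assets/"))
```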

Fixed

Fixed various problems which could occur when trying to crawl a site with malformed links containing double slashes after the domain
If the copy process crashes the application will continue to run after dismissal of the exception reporting dialog
Fixed a crash which would occur if a generated file name was the same as an existing directory name
Fixed several crashes which occurred if a valid content type was downloaded as an empty file
The list of incoming URI's for any given URI was being incorrectly populated
Fixed an issue where if a URI was referred to in multiple locations, after the first time it was encountered the outgoing and incoming URI links would not be updated correctly for future encounters
When reloading a project, the link map is no longer crawled looking for pages directly matching the root element. Instead, all non-excluded internal URI's are formed into the map, resolving a problem where the crawl map generated from reloading a project might not match the crawl map generated from analyzing a website
Fixed the & character not appearing correctly in the status bar
Fixed issue with application window being sent behind other top level windows when cancelling a crawl
Fixed tab order on main window
Fixed one occurrence where links were not combined correctly causing an infinite cascade (or at least until you hit the path limit for your OS). Additional causes of this bug may still be present, investigations are continuing.

Version 1.0.0.6 Beta 03 July 2011

Fixed

If the root URL for a project included a document file name, no files were copied unless the Crawl above Root option was enabled

Version 1.0.0.5 Beta 29 May 2011

Changes and new features

A new rule option has been added that can be used to prevent a rule from matching a child URI
If-Modified-Since header and the NotModified HTTP status code are now supported
Added a new option to allow the latest version of a file to always be downloaded, skipping the If-Modified-Since checks
A new "Missing" tab has been added that shows URL matches in a previous scan that were not matched in the latest scan
Redirect processing now honors 303 and 307 response codes
Report lists now display tooltips
If a link redirects to another, the destination is now stored with the original link
Content length is now stored with link information, regardless of whether headers are stored
Link properties dialog now shows redirect information and content length
Added the ability to view the size of a website by content type

Fixed

Exception reports were using the file version instead of the product version
Fixed a rare XML crash when saving a project
Fixed a crash which would sometimes occur when editing a rule or a form
When downloading a file, the Last Downloaded timestamp is now stored as UTC
Fixed an error where the content type was not set correctly if HEAD checking was disabled
Fixed a problem where the local file for a URL would be continuously regenerated if the "Empty Save Folder" option was not set
Fixed a problem where it was possible for a URL to be crawled even though pre processing had rejected the URL
Empty directories are no longer generated for URL's which fail pre processing, such as redirects or unsupported content types
Fixed a crash which would occur if the referring URL was not available
Fixed a crash which would occur if the "content-type" header wasn't present when pre-processing a URL
URI's which end with / but point to a valid text/html document no longer have their final segment stripped when generating the local filename and the flatten directories option is disabled
Link properties dialog now correctly includes the time when a file was last downloaded
Buttons in the main window now correctly follow the colors of the main theme

Version 1.0.0.4 Beta 08 March 2011

Changes and Updates

Meta refresh redirects are now crawled and remapped
Changed how redirects are handled, these will now appear in the main report lists
Files list now displays the content type of entries
Skipped list now displays the content type of entries
Added new Not Found and Redirect exclusion reasons. Redirects and missing files will no longer appear as "None" in skip lists.

Fixed

Two URL's with the same host bar the www prefix (e.g. http://cyotek.com/ and http://www.cyotek.com/) are now treated the same when determining if a URL is external.
URI's were not correctly combined on pages being crawled as a result of a redirect.
Reloading a sitemap which contained redirects did not display a map for any content discovered after the redirect
No longer attempts to download content for redirected responses
Projects weren't always being correctly marked as changed
Application wouldn't start on 64bit Windows (regression from 1.0.0.3).
Lists are correctly cleared before an analyze or copy action (regression from 1.0.0.3).
When creating or opening a project, the contents of the Files tab were not being cleared (regression from 1.0.0.3).
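
The www prefix fix above amounts to normalising host names before comparing them. A minimal sketch follows, assuming hypothetical helper names `normalized_host` and `is_external`; WebCopy's actual comparison may consider more than this.

```python
from urllib.parse import urlsplit


def normalized_host(url):
    """Lower-case the host and drop a leading 'www.' prefix."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_external(url, root_url):
    # Two URLs are treated as the same site if their hosts differ
    # only by the www prefix.
    return normalized_host(url) != normalized_host(root_url)


print(is_external("http://www.cyotek.com/about", "http://cyotek.com/"))
print(is_external("http://example.com/", "http://cyotek.com/"))
```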

Version 1.0.0.3 Beta 21 November 2010

Changes and Updates

Substantial performance improvements have been made when loading large projects containing many links.
Updated to use Html Agility Pack 1.4
A new option to control if headers should be saved in the project file has been added. This option is disabled by default.
Cut, copy and paste commands are now available from the main window. However, lists and trees currently only support copy.

Fixed

The application attempted to obtain titles and descriptions from all files, causing a rare crash.
The Accept GZip Compression option was never correctly read from the project file.
Toolbar visibility was not preserved between sessions

Version 1.0.0.2 Beta 02 October 2010

Changes and Updates

Add-ins can now be enabled and disabled.
Appearance themes are now enabled.
The views Skipped and Files now have a context menu.
The Speed, Time Elapsed and Time Remaining columns have been removed as they aren't working.

Fixed

Relative paths weren't being saved in project files correctly
The application wasn't correctly attached to the error handling system
Command line arguments are now correctly processed.
Filenames were not being regenerated when opening a project.
Completion messages now correctly warn when errors were detected during copying.
Fixed a problem where running on XP either didn't display disabled images or crashed.

Version 1.0.0.1 Beta 17 July 2010

Changes and Updates

A new options page for controlling the local copy options has been added.
The project properties dialog now displays several of the common editors to provide access to properties which could not be changed in the alpha build.
The context menu for various lists now has an Edit Local File option.
Added a new option to control if extensions are remapped based on their content type.
Results list now shows elapsed time and estimated time of downloads.
401 authentication requests are now supported, either via predefined credentials or during the crawl via a password dialog.
The default buffer size has been increased to a larger value, allowing for faster downloads. In addition, the buffer size is now configurable.
Gzip compression is now supported.
Deflate compression is now supported.
~~Crawling is now performed on a separate thread, resolving sluggish behaviour with the user interface.~~ Disabled for this build
The Link Map Viewer now has a tab for displaying all links found. All lists in this dialog have had new columns added with more details on the links.
The project properties dialog now provides access to properties which could not be changed in previous builds.
Object model simplified, some confusing class inheritance has been removed.
Added the ability for additional content type handlers to be used.
Added the ability to specify multiple seed URI's.
A new configuration section has been added allowing you to store authentication credentials in a project file and to disable the password dialog when crawling.
Added new viewer extensibility options allowing new tabs to be added to the interface.
Major refactoring of the base IApplication implementation.
Response headers are now stored in the link map. The Link Properties dialog now displays these headers.
The Link Properties dialog now displays local path information and the ability to open, open the containing folder, or edit the local file.
Scanning of subdomains is now supported.
You can now select from a common list of user agents.
Crawling will no longer occur above the root level by default. A new option has been added to toggle this behaviour.
Exclusions have been renamed to Rules to reflect their changing nature in this build and future planned enhancements.
When using the Add Rule context menu item from a result list, the editing dialog is now displayed, allowing the entire rule to be configured.
The Add Rule command now includes any applicable query string in the URL for the rule.
A basic Regular Expression Editor is available and can be accessed via the Function button displayed next to supported fields.
Error text associated with a page error is now stored in the link map.
The page errors list will now be regenerated on loading a project with a saved link map.
The Link Map Viewer now displays link titles and error text.

Fixed

Redirects were not followed for 301 or 307 status codes.
The error list wasn't properly recording all errors which occurred during a crawl.
The failure to download a file due to a non-HTTP related error should no longer crash the application.
The prompt to create a missing save folder now includes the folder name instead of a formatting placeholder.
Fixed an issue where local file names contained escaped HTML entities.
Fixed an issue where it was possible for local file names to contain illegal characters.
Analyzing a website now only downloads files supported for crawling.
CSS contained within comment blocks is no longer crawled.
Page links found in an IFrame or Frameset were not scanned.
Cancelling a crawl now also correctly aborts the current transfer instead of waiting for it to complete.
If a list was scrolled horizontally, the context menu displayed from the filter bar wasn't positioned correctly.
Fixed a bug where response headers were not available if the request was not an expected response code.
The result expression editor no longer displays results for a blank expression.
Duplicate keyboard accelerators have been fixed.
The Sorted property of a crawl map now correctly defaults to false.
Fixed a problem where it was possible for the CommandManager to try and load classes it had no business loading, causing error messages to be displayed on startup.
Fixed a problem where command interface elements were not always given a name, leading to a problem where items could not be accessed unless the full text was known.
The failure to load an image resource for a command interface element will no longer cause the application to fail to initialize.
The Add Rule and Add Form dialogs caused a crash when being used to create rather than modify items.
If a link to a child of a page which has been matched to a rule with the DisableCrawl option is detected, the entire link will now be excluded.
Fixed some selection inconsistencies in rules and forms editors.
The Add Rule command now automatically escapes regular expression elements within the URL, such as the ? of a query string.
Fixed some layout problems in Windows XP.

Version 1.0.0.0 Alpha 15 June 2010

Initial Release

Minimum Requirements

Windows 10, 8.1, 8, 7, Vista SP2
Microsoft .NET Framework 4.6
20MB of available hard disk space
