Cyotek Sitemap Creator Revision History
Create sitemaps for use with Google, Bing, ASP.NET and more with ease
Added
- The original Include Subdomains option has been replaced with a new set of more comprehensive options, allowing for copying of sibling domains, linked resources, or everything
- Although the UI editor for additional hosts stated regular expressions could be used, this was never implemented. Regular expressions can now be used with additional hosts
- Added an address bar to the Capture Form tool to allow access to hidden login URI's, or for any other type of manual navigation
- Added a Scan button to the Capture Form tool allowing a manual scan for forms in the page if Sitemap Creator failed to detect them initially
- The maximum redirect chain length setting can now be configured from the Advanced options group
- Added a new option to control if external redirects should also be followed
- Added support for the brotli compression algorithm
- Added a new option to control if the results list should automatically scroll to show the active item while performing a website scan
- Added a new Keep Alive setting. Setting this to false can help prevent the "The server committed a protocol violation. Section=ResponseStatusLine" crawl failure [#002]
- Added a new Prefix Mode setting. This setting allows you to force URI's to either have or remove the www prefix, useful for avoiding duplicated files when copying a website which uses a mix of prefixed and non-prefixed URI's
- Added the ability to replace sections of a URI when crawling documents
- Added a new report to view non-HTTP links
- Added a new optional extension for providing feedback/smiles/frowns or support requests from within the application
- Error diagnosis dialogs now include a reference to the original report
- The Test URI dialog now includes a new tab which lists all links that were detected on the source page
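The Prefix Mode entry above describes forcing URI's to either carry or drop the www prefix. The following is a minimal Python sketch of that kind of normalisation, not Sitemap Creator's own code: the function name `apply_prefix_mode` and the mode values `"add"` and `"remove"` are hypothetical, and user info in the URI is not preserved.

```python
from urllib.parse import urlsplit, urlunsplit


def apply_prefix_mode(uri: str, mode: str) -> str:
    """Force or strip a leading 'www.' on the host of an absolute URI.

    mode is a hypothetical setting: "add", "remove", or any other
    value to leave the URI unchanged.
    """
    parts = urlsplit(uri)
    host = parts.hostname or ""
    if not host:
        return uri
    has_prefix = host.startswith("www.")
    if mode == "add" and not has_prefix:
        host = "www." + host
    elif mode == "remove" and has_prefix:
        host = host[len("www."):]
    else:
        return uri  # nothing to change
    # Rebuild the network location, keeping any explicit port.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Applying the same mode to every URI before it is queued means `http://example.com/a` and `http://www.example.com/a` collapse to a single entry, which is the duplication the setting is meant to avoid.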
Changed
- Now requires Microsoft .NET 4.6
- When following redirects after posting form data, all built-in skip rules are ignored, so if a post to one site redirects to another to complete the post, Sitemap Creator will now always follow the redirect
- The Create Desktop Icon option in setup is no longer checked by default
- The Test URL dialog now uses the proxy settings of the currently open project
- The Website Links dialog has been slightly redesigned to prevent a crash when working with projects containing many thousands of links
- Form editors now include a second URI field, allowing you to set the URI where field values will be read, if the POST URI is different from the form URI
- The Capture Form window now remembers its size and position
- Information dialogs accessed from list views now display the selected context in an easier to read format than plain CSV
- The default maximum redirect chain length has been increased from 5 to 25
- HTTP Compression options have been removed from the Advanced options group into their own dedicated group
- Options for processing redirects have been removed from the Advanced options group into their own dedicated group
- Minor performance improvements
- Minor optimisations to reduce memory load
- Setup should now automatically uninstall previous versions
- Numerous changes to how plugins are discovered, loaded and configured. Due to no longer storing plugin details in the Windows Registry, this will cause any disabled plugins to be re-enabled
- Sitemap Creator will now correctly report non-HTTP links such as mailto: or ftp: as skipped rather than silently ignoring them
- Internal engine changes [#062]
Removed
- Removed the setup option for creating a Quick Launch icon
- Removed support for opening legacy XML based Sitemap Creator projects
- Because the offline help was always outdated, owing to the general weakness of the product manuals, the offline help files are no longer included in the setup, and requesting help will always display the online version
Fixed
- External domains that would normally be crawled were being excluded if the Download all resources option was set (nightly regression)
- Sibling domains were not being crawled correctly (nightly regression)
- Problems loading or saving the user agent store should no longer be fatal
- Some dialogs only supported local help requests and were unable to show help if the local help file was not available
- Fixed a crash that occurred downloading a file if the Content-Encoding header of the HTTP response was set to identity
- Fixed a memory leak in the sitemap component
- A number of HTML 5 specific tags were listed as Unknown in crawl result list views
- The UI now correctly reports if part of a crawl was aborted due to too many redirects
- Crawling the same project multiple times in succession reused the cookies from the previous crawl
- Fixed an unauthorised access crash that could occur when using the Capture Form tool (regression)
- Fixed a crash that could occur if the website download folder, or any downloaded URI, included the { or } characters
- Fixed an issue where external links could appear in some lists even when filtering options were set to exclude them
- Fixed a number of issues which prevented automatically logging into websites where the post URI was different to the get URI and value merging was required, or the post returned 302 and the new location must be read to complete the login
- Fixed an issue where occasionally the Capture Form tool didn't refresh available forms after navigating to a page
- The Capture Form tool now correctly detects forms that are contained within frames
- The Export to CSV option available when context clicking some list views didn't correctly escape the CSV
- Fixed an issue where the entire application was terminated if CSV export failed
- Fixed an issue where the application could fail to open files in their default application if the full path included a space
- Fixed a crash which could occur when a request made via the Test URL dialog failed, and no response was available
- The Capture Form tool was incorrectly using the id attribute of form elements instead of the name
- Double clicking an entry in the Cookies list view of the Test URL dialog didn't display the selected item details
- URI's would be incorrectly combined if the relative URI was just a query string and the source URI already had a query string
- Download percentages were calculated incorrectly
- Report viewer didn't show external URI's
- Fixed case-sensitivity issues in some built-in reports
- Fixed a crash that could occur if non-NBT files were present in report folders with the rpt extension
- Fixed a startup crash if the addins folder didn't exist [#112]
- Fixed a crash that could occur when trying to calculate the depth of a URI [#116]
- Fixed a crash that occurred if a project with a blank URI was opened and the user then attempted to browse to the blank URI [#120]
- Fixed an issue where Setup sometimes wouldn't replace files
- Fixed a number of issues that could occur after opening or saving a project and the MRU was updated [#161, #176, #177]
- RSS entries would duplicate themselves depending on whether the feed was accessed via HTTP or HTTPS. Note that as a side effect of this fix, all entries will be marked as unread
- Fixed a possible crash that could occur when trying to load a themed font [#203]
- Errors loading the cached RSS feed prevented the RSS extension from functioning [#204]
- Fixed potential exit crash when updating statistics [#175]
- Fixed a potential issue where the last character in a directory path could be removed [#213]
- The Website Size dialog could crash with a divide by zero exception [#222]
- Empty analytics sessions are no longer transmitted
- Failure to obtain shell icons should no longer crash the application
- Diagram viewer crashed saving a diagram [regression]
- Loading a diagram didn't update UI state correctly
- Changing some diagram properties didn't cause the diagram to be updated
- URI's which had a blank charset attribute in the Content-Type header weren't processed properly
- Fixed an issue where it was possible that text tokens weren't replaced
- Link origin wasn't persisted
- Clearing the link map didn't clear the sitemap tree [#257]
- Trying to access Quick Scan hung the application [regression]
- Internet Explorer emulation mode wasn't being set, causing HTML previews to always use the IE7 rendering engine
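One of the fixes above concerns combining a relative URI that consists only of a query string with a source URI that already has one. Under RFC 3986 reference resolution, the relative query replaces the base query while the base path is kept. The sketch below shows that behaviour using Python's standard `urljoin`; the example URIs are illustrative only and say nothing about how Sitemap Creator implements the combination internally.

```python
from urllib.parse import urljoin

base = "http://example.com/products/list.html?page=1"

# A query-only reference keeps the base path and replaces the base query.
print(urljoin(base, "?page=2"))

# A relative path reference replaces the last path segment and drops the base query.
print(urljoin(base, "detail.html"))
```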
Added
- Now supports finding links via the 300 "Multiple Choices" HTTP status code
- When logging an exception, diagnosis actions such as new version downloads or links to workarounds are now displayed, if applicable
- Slight improvements to scan performance
- You can now choose to display all errors, or only errors detected during the current scan in the Errors tab
Removed
- Disabled glass effects unless using Windows Vista or Windows 7
Fixed
- In certain circumstances, command line arguments would not be parsed correctly
- Fixed a problem where projects using a sub path and the Crawl above root URI option could save duplicate URI's into the project, causing a crash when attempting to reload the project
- Fixed an issue where sitemaps belonging to projects using a sub path and the Crawl above root URI option were corrupt
- When changing settings via the main Options dialog, some settings would not be applied as the old versions were cached
- The XPath expression for <meta http-equiv='refresh' support wasn't strict enough and was picking up more elements than it should
- HTML attribute scan rules that used regular expressions to transform only part of the value of the attribute incorrectly merged the transformed value
Added
- Added support for canonical URL's via rel="canonical" link elements
- Added a new option to control whether or not new pre-release (beta) versions are included in update checks
- Reinstated digital signatures
- Experimental: When analysing a site, Sitemap Creator will now attempt to keep content in memory where possible, and only write it to disk if the content is above the default capacity
- CSV exports of link maps now include an integer column for the HTTP status in addition to the textual description
- When posting a form, existing values will be automatically merged with the user defined custom values
- Added a new tool for capturing a form, making it much easier to extract the basic tokens for posting a form
- The Test URL dialog has been split in two, so that the result content is always visible
- Cookies are now supported by the Test URL dialog when making multiple requests from the same domain, including their own tab for viewing
- Minor UI improvements here and there
- All standard HTTP verbs are now supported by the Test URL dialog
Changed
- Sitemap Creator now requires Microsoft .NET Framework 4.5
- The Errors tab no longer lists redirects, instead you can use the Redirects report
Removed
- Removed the prefix with the website name / prefix with the website url option. This setting was confusing, served no real purpose and the default value was wrong
Fixed
- The XML plugin no longer includes images that were included in a page via a <link> tag
- The XML plugin now correctly excludes images that have been excluded via rules
- If a view crashes when updating, it is now disabled for the remainder of the session without crashing the entire application
- Fixed an issue where token menus (for example those in the External Tools dialog) containing environment variables could be excessively wide
- Fixed a crash that occurred when the Environment Variables sub menu was clicked in the External Tools dialog
- Fixed an issue where Sitemap Creator could incorrectly lose the custom port when combining two URI's in certain circumstances
- In-line CSS is now correctly crawled
- Crawling will no longer follow redirect chains beyond the 5th consecutive redirect
- Fixed an issue where meta data could be read incorrectly based on encoding type
- Problems that occur reading meta data for a downloaded file no longer block the crawl with a modal error dialog, instead the error is presented in-line at the end of the crawl the same as other errors
- Some columns in the results list view were not updated correctly unless the action was successful
- Fixed several occurrences where link information wasn't being updated correctly
- Fixed an issue where some form values would not be encoded correctly
- GZip and deflate compressed data is now decompressed during the download, rather than after the entire content has been downloaded
- The HTML view in the Test URL dialog now correctly updates each time a new request is made
- Some files were missing from the setup that prevented exception reports from being submitted (regression from previous version)
- Fixed a duplicated shortcut between Rules and Test URI
- Exiting Sitemap Creator while the RSS extension was updating caused a crash
- Fixed a threading crash that could sometimes occur when trying to access the Quick Scan dialog
- Fixed a crash that could sometimes occur when closing the Quick Scan dialog while a scan was in progress
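The GZip and deflate entry above describes decompressing data while it downloads instead of after the whole body has arrived. The sketch below illustrates that idea with Python's `zlib` module. The helper name `decompress_stream` is hypothetical and the network stream is simulated in memory. It handles gzip and zlib-wrapped deflate data only; servers that send raw deflate without a zlib header would need a different `wbits` value.

```python
import gzip
import zlib
from typing import Iterable, Iterator


def decompress_stream(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield decompressed data as each compressed chunk arrives."""
    # MAX_WBITS | 32 tells zlib to auto-detect a gzip or zlib header.
    decoder = zlib.decompressobj(zlib.MAX_WBITS | 32)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:
            yield data
    tail = decoder.flush()
    if tail:
        yield tail


payload = b"<html>" + b"sitemap " * 2_000 + b"</html>"
compressed = gzip.compress(payload)

# Simulate a response body delivered 16 bytes at a time.
stream = (compressed[i:i + 16] for i in range(0, len(compressed), 16))
restored = b"".join(decompress_stream(stream))
print(len(compressed), len(restored))
```

Because each chunk is decoded as it arrives, memory use is bounded by the chunk size rather than by the size of the compressed body.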
Changes and new features
- Deprecated: The prefix with the website url / prefix with the website domain name option of a crawl project has been deprecated and will be removed in a future update.
- Experimental: Added proxy server support
- Activating an item in either the Request Headers or Response Headers tabs of the Test URI dialog now displays the header information in a dialog for easy viewing/copying
- The contents of the Select Mime Types dialog are now sorted
- Items in the Title Replacements and Forms editors can now be reordered via drag and drop
- Added a helper tool for backing up and restoring settings, or for resetting settings to default values
- Added a stand-alone update check tool
- Added Requests Per Minute limit mode
- Added a new Enforce Limit Checks option. When set, limit requests will be enforced for all URI's that involved a HTTP request. If not set (default) limit requests will be enforced only for URI's that were successfully processed
Fixed
- Pressing enter in the Post Values field of the Test URI dialog no longer activates the default button on the dialog
- Fixed an issue where only the end of a host was inspected when checking if a given URI was a sub domain of another. For example, it would incorrectly return that static.oneexample.com was a subdomain of example.com
- An error is no longer displayed if you open a project saved using a newer version of Sitemap Creator. The project will now be opened where possible, but a warning will now be displayed
- Repeatedly clicking column headers in sortable lists now correctly cycles between Ascending, Descending and None, instead of only Ascending and Descending.
- Fixed a problem where clicking the Add button in the Form Editor would clone the active form, including the internal ID of the form which should be unique, leading to crashes
- Fixed an issue where using the Move Up and Move Down buttons in the Title Replacements editor moved the wrong item if the button was subsequently clicked without selecting another item first
- Fixed an issue where settings were both loaded and saved using thread specific culture data, which could cause a crash if the computer culture information was subsequently changed. All settings are now saved and loaded using an invariant culture.
- A crash no longer occurs if font information cannot be read correctly from stored settings
- Limit checks are no longer applied to URI's that were skipped due to being external or by a rule
- Changing the window font is now correctly applied to the main window when the settings are applied, rather than requiring the application to be restarted
- Fixed a crash that could occur when attempting to obtain the display string for an enum value
- Fixed a crash that could occur if there was a connection error when trying to post a form
- Fixed an issue where the RSS feed wouldn't update when the Update Now option was used, unless a daily update was already pending
- Fixed a crash that could occur displaying the rules editor
- Changing the Host Replacement property now correctly rebuilds the sitemap using the updated host
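The subdomain fix above comes down to matching whole host labels rather than a bare string suffix. The sketch below shows one way to write such a check in Python. The function name `is_subdomain` is hypothetical, and treating a host as a match for itself is an assumption rather than documented Sitemap Creator behaviour.

```python
def is_subdomain(host: str, parent: str) -> bool:
    """Return True if host equals parent or is a subdomain of it."""
    host = host.lower().rstrip(".")
    parent = parent.lower().rstrip(".")
    # A bare suffix test is the bug described above:
    # "static.oneexample.com".endswith("example.com") is True.
    # Requiring a "." before the parent restricts matches to whole labels.
    return host == parent or host.endswith("." + parent)


print(is_subdomain("static.example.com", "example.com"))
print(is_subdomain("static.oneexample.com", "example.com"))
```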
Changes and new features
- Sitemap Creator now supports the 304 "Not Modified" status code in the same way as WebCopy. Previously Sitemap Creator forced a download of all content, regardless of whether it had changed or not. To revert to the old behaviour, set the Always download latest version option in the Advanced node of the Project Properties dialog
- Title replacements are no longer applied while crawling the project, as this irreversibly changed the link map meta data. Instead, they are dynamically applied when creating the sitemap, allowing much easier editing and adjustments without losing the original data
- Added support for ETag's and the If-None-Match header when reading headers to determine if a resource should be downloaded
- The regular expression editor for the Title Replacements configuration settings now allows testing against the titles of any pages in the link map
- Simplified the highlighting and displaying of matches in the Regular Expression editor
- Replaced the embedded IE web browser with a custom solution that is faster and doesn't cause havoc with some hot-keys
- The duration of each URI crawled is now recorded
- The duration of the entire crawl is now recorded
- Added the ability to limit crawling to a number of requests per second. Options to configure crawl limits can be found beneath the Advanced node in the Project Properties dialog
- Added Slow Pages report
- Reports are now loaded from disk (and on demand) instead of being pre-defined; user reports are now supported
- Added Status Code and Skip Reason columns to results list
- Setup now allows you to customize which components are installed
- Added product RSS notifications add-in
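The 304 "Not Modified" and ETag entries above describe conditional requests: the crawler sends the ETag it saved earlier in an If-None-Match header, and a 304 response means the stored copy is still current. The sketch below models only the decision logic in Python, with no network access. The function names and the `always_download` parameter (standing in for the Always download latest version option) are hypothetical.

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str], always_download: bool = False) -> Dict[str, str]:
    """Build request headers from the ETag saved by a previous crawl."""
    if always_download or not etag:
        # No validator is sent, so the server returns the full content.
        return {}
    return {"If-None-Match": etag}


def is_not_modified(status_code: int) -> bool:
    """A 304 response carries no body; the previously stored copy is reused."""
    return status_code == 304


print(conditional_headers('"5d8c72a5edda8"'))
print(conditional_headers('"5d8c72a5edda8"', always_download=True))
```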
Fixed
- Fixed version comments being incorrect
- The Regular Expression editor now correctly displays line breaks in the Replace tab
- Fixed an issue where the update check could cause the main window to be unresponsive
- The Quick Scan dialog no longer crawls either sub-domains or above the root URI
- Backups were not being taken of project files when saving even if the option was enabled
- Fixed an issue where the Quick Scan dialog wasn't cleaning up correctly when closed
- The Sitemap Preview add-in now respects the user's Fixed Font setting
- A URI that returns an error status code is no longer flagged as "skipped" with the reason "Invalid Content Type".
- If a crawl was cancelled due to a HTTP status code, the results list no longer flags any such URI as "skipped", but retains the original, correct, value
- Fixed an issue where colour settings were sometimes not loaded correctly
- Fixed an issue where font settings were sometimes not saved correctly
- Font sizes are now displayed as whole numbers
- Fixed an issue where only a single backup was created, and did not cascade according to the maximum number of backups defined
- Fixed an occasional crash resizing the application window with a collapsed panel
- Corrected baseline positioning of editors and labels in dynamic user interfaces
Changes and new features
- Panels in Option dialogs now load on demand
- Option pages are now only initialized when requested by the appropriate dialog
- Removed status code 520 (origin error) from the list of supported codes for automated error reporting during a crawl
- The Image Viewer window no longer defaults to Fit when displaying an image, but now defaults to Actual size
- Added additional themes for configuring the appearance of the GUI client window
Fixed
- After editing a rule or form, the enabled state of the item could no longer be correctly toggled via the check boxes in their respective lists
- Dynamic options in the Options dialog are now positioned more sensibly in relation to the options label and editor, and other options in the same group
- Fixed a problem where tool tips did not display under certain conditions, or could display the wrong (or blank) text
- Extension mapping for dropped files was case sensitive
- Reworked tool bar layout code to prevent overflowed buttons
- In certain circumstances, creating a backup of a file could take a substantial amount of time
- Fixed an occasional "The path is not of a legal form" exception when using the External Tools dialog
Changes and new features
- Token menus now include environment variables
- Setup now offers to install the Microsoft .NET Framework 3.5 if not already present
Fixed
- The maximized or minimized state of a window was no longer being restored when reopening the window
- The Find and Replace dialogs in text editing windows now correctly default to the selected text as appropriate
- The token button displayed when prompting the user for arguments for external tool execution now displays a menu with available tokens.
- Attempting to open a folder whose full path contains a period no longer displays an Invalid Path message.
- When restoring window position and size, the restored bounds are automatically recalculated to fit the monitor, for example when connecting via Remote Desktop with a smaller display resolution, or after the removal/repositioning of a monitor in a multi-monitor set up.
- The main application window could no longer be sized smaller than its original startup size.
- Fixed a problem introduced in the last update which caused the crash reporter to no longer submit crash reports
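The window restoration fix above recalculates saved bounds so that they fit the current monitor. The sketch below shows a simple clamping approach in Python, assuming a single rectangular working area. The `Rect` type and `fit_to_screen` function are hypothetical names, and the application's actual calculation may differ.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def fit_to_screen(saved: Rect, screen: Rect) -> Rect:
    """Shrink and move saved window bounds so they lie inside the screen."""
    # Never restore a window larger than the available area.
    width = min(saved.width, screen.width)
    height = min(saved.height, screen.height)
    # Clamp the top-left corner so the whole window stays visible.
    x = min(max(saved.x, screen.x), screen.x + screen.width - width)
    y = min(max(saved.y, screen.y), screen.y + screen.height - height)
    return Rect(x, y, width, height)


# Bounds saved on a second monitor, restored on a single 1280x720 display.
print(fit_to_screen(Rect(2000, 100, 800, 600), Rect(0, 0, 1280, 720)))
```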
Fixed
- Fixed a crash introduced in 1.0.7.0 when running rules if the URI being processed was shorter than the base project URI
- Fixed a crash that would occur when clearing the link map and the project did not have a valid URI
- Fixed an issue introduced in 1.0.7.0 when deleting items with the popup Rules and Forms editor, where either the wrong item would be removed visually, or the software would crash.
- Fixed an occasional crash introduced in 1.0.7.0 moving items with the popup Rules and Forms editor. Note that the moving of rules and forms has no functional use and will be removed in a future version of the product. Also note that moving is performed on the underlying collection, not the visual display sort
- Fixed a problem with the Rules, Forms and Password editors where it was often quite difficult to add new items via the popup editors as they kept trying to update a previous selection
- Fixed an occasional crash after using the Quick Scan dialog
Fixed
- Added manifest so that when running under Windows 8.1 / Server 2012 R2 the OS version is correctly reported.
Fixed
- Fixed a crash that occurred when building a sitemap for a project that had root level URI's with a query string, but without a document name.
- Speculative fix for a Parameter is invalid exception that randomly occurs when painting windows
- Fixed a crash that occurred when modifying a rule, and the project URI had been cleared or was invalid
- Fixed a crash that could occur when inserting a new rule or a new form
- Fixed a crash that could occur when attempting to process root level URI's in the sitemap
Changes and new features
- Experimental: Added a new option to simplify the sitemap treeview. When this option is set, folder containers are no longer displayed if the folder only has a single page
- Experimental: Modifying a rule now reapplies rules to the sitemap, allowing easier sitemap manipulation without having to rescan the site. Note this feature only works on the current contents of the link map; if the link map is incomplete due to existing rules, a rescan will be required regardless.
- Sorting of the sitemap now uses natural sorting, so names appear in a logical order, e.g. 1, 2, 10 rather than 1, 10, 2
- The Rule and Forms lists now default to sorted
- List views that support sorted columns now use natural sorting
- When building a sitemap, folders are no longer generated for URI's that match except for differing query strings
- New API to allow plugin authors to add additional functionality to application windows when they are created
- Sitemap treeview now displays URI's relative to the base URI
- Rules that do not use the Use Full Uri flag now also strip out the leading path of the base URI. For example, if the base URI of the project is http://demo.cyotek.com/staticwebsite/ and the current URI being crawled is http://demo.cyotek.com/staticwebsite/blog/page1.html, the text used by the rule engine will be /blog/page1.html
- The Differences tab now lists all URI's which are new to the last scan, in addition to existing checks of modification dates. Due to the introduction of this setting, all URI's will be marked as new for existing projects, until that project is rescanned and saved.
- Removed the Use Modified Uri rule flag
- Removed all HTML crawl restrictions; Sitemap Creator now uses the same full set of HTML crawling rules as WebCopy
- robots.txt export now includes a preview of the file
- The Additional URL's section of the Project Properties dialog now allows the entering of relative URI's.
- The sitemap tree view now only loads children on demand, improving performance for large projects
- The different results tabs are also now loaded on demand, again improving performance for large projects
- Added a new setting which determines if the Sitemap tab is activated when opening existing projects
- Added new options to the Website Size dialog for either using total size or link count for content types, and for limiting the number of slices displayed
- Added a new Ignore SSL Errors option. If this option is set, attempting to scan a website that contains an invalid SSL certificate will be allowed. The default for this setting is false, meaning that Sitemap Creator will not scan websites with invalid certificates.
- Double clicking (or pressing enter) on a file node in the sitemap tree view now displays the appropriate properties dialog
- The Print and Print Preview options are now correctly enabled in the Image Viewer.
- Simplified the User Agent editor
- Various minor UI tweaks
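The natural sorting entries above order names by numeric value where they contain digits, so 2 sorts before 10. A common way to do this is to split each name into text and digit runs and compare the digit runs as integers. The Python sketch below illustrates that technique; `natural_key` is a hypothetical helper, not the routine the application uses.

```python
import re
from typing import List, Union


def natural_key(name: str) -> List[Union[int, str]]:
    """Split a name into text and digit runs so digits compare numerically."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capture group alternates text (even index) and digits (odd index),
    # so items at the same position in two keys always have the same type.
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]


names = ["1", "10", "2"]
print(sorted(names))                   # plain string sort
print(sorted(names, key=natural_key))  # natural sort
```

The same key works for mixed names, so `page2.html` sorts before `page10.html`.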
Fixed
- Fixed a rare startup crash
- Fixed a problem where clicking OK on the Edit Rule dialog saved changes even if there was a validation error and the user subsequently clicked Cancel
- Fixed a problem where the Quick Scan dialog failed without finding any URL's if the Inclusion / Exclusions options were set
- Fixed a problem where page titles and descriptions containing HTML entities were not decoded
- Fixed a problem where the sitemap could include URI's containing query strings, even if the strip query string segments option was set
- Disabled Glass effects on dialogs when running under terminal services connections
- Fixed a problem where source redirect URI's were not excluded, and appeared in the sitemap
- Outgoing links for an existing link are no longer cleared if the link is excluded for any reason
- Fixed an issue where the skipped status of a URI wasn't reset correctly
- Fixed a problem where the HTML sitemap provider would place links under the wrong parent item
- Fixed a problem where, if multiple sitemaps were generated, the header signature in the sitemap output could be incorrect
- Corrected formatting of HTML sitemap output
- Fixed a problem where the sitemap was incorrectly generated if the base URI for the project included a document name
- Robots.txt export ignored rules that started with ^/
- Fixed a crash that could sometimes occur saving a Sitemap Creator project with blank values in provider settings
- Fixed a problem where Sitemap Creator would prompt to use an empty sitemap rather than analysing the source site as expected
- Fixed a rare problem where Sitemap Creator could subtly corrupt a URI when writing sitemap files
- Fixed an issue with the XML sitemap where empty changefreq elements would be written
- Fixed an issue which meant that using the Include Images flag occasionally didn't work as expected
- Fixed an issue where sitemap providers ignored the excluded status of items when generating output files
- Fixed a problem where result views would be reset when building sitemaps
- Fixed a problem where duplicate URI's could be present in the linkmap in rare circumstances, causing a crash when trying to reopen the project.
- Fixed a potential crash that could occur attempting to retrieve shell icons.
- Fixed a problem where commands linked to URI's that contained spaces in their respective query strings caused the command to fail with an Invalid URI message.
- Fixed a rare problem where it was possible Sitemap Creator would place the same URI twice in the processing queue, and immediately cancel the copy as soon as the second occurrence was hit.
- Fixed a problem where Sitemap Creator did not check the internal document version to ensure it was supported
- Fixed an issue where toolbars were initialized before the window was resized to whatever the user had defined, meaning some toolbars were unnecessarily placed on new rows
- Corrected some invalid message window and dialog titles
- Fixed a crash which occurred when clicking pie slices in the Website Size dialog and filtering was enabled
- Fixed a problem where attempting to open an Explorer window to a UNC path displayed an invalid folder message
- The Page Setup option in the Print Preview dialog didn't do anything when activated
- If a problem occurs decompressing data compressed using the deflate or gzip algorithms, the download will be automatically retried with these options disabled.
- WebCopy no longer attempts to decompress files that returned none as the content encoding.
- Fixed a crash that occurred trying to browse to a path that contained invalid characters
- Fixed a problem where attempting to open a folder where the drive letter was in lower case caused Sitemap Creator to display an "invalid path" message.
- Added an additional check for servers that return a malformed UTF charset header.
Changes and new features
- The Headers tab in the Link Properties dialog now displays request headers
- Added new options for setting the Accept and Accept-Language request headers.
- Removed status code 406 (not acceptable) from the list of supported codes for automated error reporting during a crawl
- Experimental: Added the basis of a "quick scan" feature. This scans the top level of the website for unique absolute URI's (removing bookmarks and query strings) and is useful for getting a quick overview of the top level structure of the website, making it easier to detect and exclude pages that have no benefit to copy (such as new thread / reply thread pages in a forum). As with other experimental features, this will be expanded over future updates.
- By default, new projects will now remap local file extensions based on their file type if no existing extension is present
- Status bar now shows pending crawl requests.
- The progress bar now attempts to show current progress based on total requests. It's not hugely accurate as it doesn't take into account the size of each request, but is better than a marquee! Windows 7 and 8 users will see the same behaviour on the taskbar progress.
- Added support for the data attribute of the object tag.
- Removed downloaded file hash calculation as the hashes currently aren't used by Sitemap Creator
Fixed
- Total download size is now incremented correctly even if the content length was reported as zero by the server
- Filtering a grouped list didn't preserve groups when previously filtered items were restored
- URI's that have a status code of 406 (not acceptable) now have the correct skip reason associated with them
- The Content tab of the Test Link dialog didn't always correctly display returned content
- Fixed a crash that could occur when attempting to sort a list
- Fixed a crash if the Content-Type response header contained a space before the encoding name, for example text/css; charset= UTF-8.
- Fixed a crash if the Content-Type response header specified utf8 instead of utf-8.
- Fixed a crash that could occur if the source URI couldn't be decoded correctly
- Fixed an occasional crash attempting to get the short form URI pattern when creating rules from an existing URI
- Fixed a problem where tool bars didn't wrap correctly if a new tool bar had to be placed on a new row
- Fixed a problem where when using the Excluded and Add Rule commands, the generated URI was invalid if there was a mix of www prefixed and non prefixed URI's
- Fixed a crash that occurred when clicking the Test URI button in the Form Editor and the URI of the project is invalid
- Fixed a crash that occurred when submitting the remove missing links dialog for a project without a valid URI
- Fixed a problem where GZIP compressed content was downloaded incorrectly if the response headers didn't include a content length
- Fixed a problem where some users experienced a startup crash when initializing fonts
- Fixed a build problem that meant some exception reports were missing information
- Fixed a problem where buffers were incorrectly being processed when downloading which could lead to a potential crash or corrupt file if the response header didn't include a content length, and otherwise just did extra repeated work if a length was available
- Fixed a crash that could occur when crawling websites that had many nested branches of links
- Temporary files generated during the analysis of a website are now deleted as soon as they are no longer required, rather than only once the crawl has completed
- The "is missing" check was ignoring HTTP status codes and only going from the scan index
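Several fixes in this list concern unusual Content-Type values: a space before the encoding name, or utf8 written without the hyphen. The sketch below shows one tolerant way to read the charset parameter in Python. It is an illustration only; the function name and the default encoding are assumptions, not the application's code.

```python
import codecs


def charset_from_content_type(value: str, default: str = "utf-8") -> str:
    """Extract a normalised charset name from a Content-Type header value."""
    for param in value.split(";")[1:]:
        name, _, raw = param.partition("=")
        if name.strip().lower() != "charset":
            continue
        # Tolerate stray whitespace and optional quotes around the name.
        candidate = raw.strip().strip("\"'").strip()
        try:
            # codecs.lookup resolves aliases such as "utf8" and "UTF-8".
            return codecs.lookup(candidate).name
        except LookupError:
            return default
    return default
```

Resolving the name through the codec registry means aliases and casing differences end up as one canonical name, and an unknown or empty name falls back to the default rather than raising.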
Changes and new features
- Added the Differences tab and related functionality previously only available in WebCopy.
- Build Results tab now displays the timestamp of the action, making it much easier to determine the last time you built a sitemap!
- Minor improvements to crawl performance with websites that have a lot of cross page linking
- Sitemap tree view now highlights missing URI's with a configurable color
- Sitemap tree view now highlights folder nodes when all children match the same status
- Added a new Remove Missing Links command. This allows you to selectively remove missing URI's from the sitemap, without having to clear the entire map.
- Website Links dialog lists are now highlighted according to the status of the link
- Added additional filter options to the Website Links dialog
Fixed
- Building a sitemap was no longer flagging the project as changed
- URI's which are skipped due to a 403 response code are now correctly flagged as Forbidden as the skip reason
- Exception reports now include details of type load exception data
- A crash no longer occurs if restoring a window's previous state fails
- Fixed a crash which occurred when attempting to combine a URI with a partial URI that contained one of the reserved characters from RFC3986.
- Fixed a problem where URI's were not combined correctly if the relative URI comprised solely of a query string
- Start up errors when loading extensions can now be reported, and no longer prevent the application from starting
- Removed invalid Set as active URL item that appeared on several context menus
- Obsolete outgoing links were not being removed when crawling a source URI
- Filter options in the Website Links dialog are now correctly available no matter which tab is active
- Fixed a massive performance issue when populating lists under certain conditions
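One of the fixes above concerns combining a base URI with a relative URI that consists solely of a query string. Under RFC 3986 reference resolution, such a reference keeps the base path and replaces only the query. The Python sketch below shows the expected result using the standard library; the combine wrapper is a hypothetical name used purely for illustration.

```python
from urllib.parse import urljoin


def combine(base: str, relative: str) -> str:
    """Resolve a relative reference against a base URI."""
    return urljoin(base, relative)


# A query-only reference keeps the base path and swaps the query string.
next_page = combine("http://example.com/forum/list.php?page=1", "?page=2")
```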
Changes and new features
- A new Test URI feature is available. This allows you to test a given URL, choose verbs, post information, or experiment with different user agents. Any returned output is viewable, allowing you to easily check if user agents have an impact on returned content, and methods such as HEAD are supported for crawling.
Fixed
- Fixed a crash when trying to open or copy the URI of an invalid tree view node
- When entering a URL without a scheme, a default scheme of http:// will be applied
- Old version notices are no longer displayed after opening a project in the old XML format and then creating a new project
Changes and new features
- Sitemap Creator projects are now saved as binary files, and it is no longer possible to save in the old XML format (however, you can continue to open XML projects). As a consequence of this change, project files are now smaller and are quicker to load and save. When saving an XML-based project, a one-time backup of the original XML will be created before the new binary file is written. Note that versions of Sitemap Creator prior to 1.0.2.0 cannot open these binary files.
- Experimental: Added a new option to display excluded URI's in the sitemap tree view
- Experimental: Added a new "quick exclude" option to the context menu of the sitemap tree view. This new feature allows you to quickly include or exclude a given URI from being crawled.
- Predefined default documents list now includes index.cfm and index.jsp
- Default documents editor now displays the predefined default documents used if the field is left blank
- UI access to "hide" URI's in the link map has been disabled. This functionality will be removed in a future update
- The last downloaded attribute of a URI is no longer updated if the download was the result of an analyse operation rather than a full copy
- Crawl exceptions can now be reported to help improve Sitemap Creator
- Replaced 3rd party PDF library, this should allow for better compatibility with different types of PDF documents
- Added a disk space check during the crawl. If Sitemap Creator doesn't think there is sufficient space to download a file, it now automatically aborts the crawl.
- Added a new option for automatically opening the last project when starting Sitemap Creator without a command line
- Content types prefixed with x- automatically fall back to the non-prefixed version where the prefixed version cannot be found
- If the crawl of a given page fails, for example with a 500 error, Sitemap Creator now attempts to get any response data which can then be viewed from the new Content tab of the Link Properties dialog. This response data is not saved with the project and is only available for the duration of the session from which it was populated.
- The Test URL dialog has been updated to support the same "raw/WYSIWYG" content preview as the Link Properties dialog.
- The Addin Manager dialog now lists loaded meta providers
- Improved the completion message for when a crawl was cancelled due to a failure trying to crawl the primary site URI, or any user defined additional crawl URI's
- Completion dialog now includes statistics on the crawl, such as files downloaded and total size
- Added Knowledge Base link to the Help menu
- Help file updated
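The x- content type fallback mentioned above can be pictured as a two-step lookup: try the exact type first, then retry without the x- prefix. The Python sketch below assumes a simple dictionary of handlers. The find_handler name and the handler table are hypothetical and are not part of Sitemap Creator's API.

```python
def find_handler(content_type: str, handlers: dict):
    """Look up a handler, falling back from an x- prefixed subtype."""
    key = content_type.strip().lower()
    if key in handlers:
        return handlers[key]
    main, _, sub = key.partition("/")
    if sub.startswith("x-"):
        # Retry with the non-prefixed form, e.g. application/javascript.
        return handlers.get(f"{main}/{sub[2:]}")
    return None
```

An exact match always wins, so a handler registered specifically for the prefixed type is still used when one exists.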
Fixed
- Fixed a problem where the XML generated by an exception was invalid
- Fixed an issue where the content encoding of a page wasn't picked up if the value ended with ;
- Fixed an issue where sometimes the Build Results tab wasn't populated correctly
- Validation errors that occur during HTTP parsing are now ignored. This allows crawling of websites such as snapfiles.com which do not follow the HTTP specification
- Exceptions that occur when trying to read website headers (such as the previously mentioned protocol violation) are now correctly reported instead of being silently ignored
- If an error is encountered during a crawl, the message now states that the crawl was cancelled rather than completed successfully
- Fixed a rare crash where the exception handler crashed trying to generate the XML report
- Only the first occurrence of each unique mime type was given a shell icon in the sitemap tree view and other supported views
- The sitemap tree view context menu was enabled even if the tree view was empty, causing crashes if any of the menu items were clicked
- Fixed a crash if you attempted to browse for a folder and had manually entered a path containing invalid characters
- When creating a new project, any existing sitemap wasn't removed from the tree view
- Fixed a crash which occurred when switching back to the Sitemap tab and the current project's URI was blank or invalid
- Fixed a crash which occurred when attempting to view the website diagram and although a crawl map was available, the current project's URI was blank or invalid
- Failure to save a project no longer results in a crash
- Exceptions that occur when opening a project can now be reported
- Pressing return in the form editor data field and default documents editor no longer attempts to submit the editing dialog
- Fixed the client component name for metrics
- If there is a problem setting the progress bar overlay on the taskbar, a crash no longer occurs
- Fixed the wait cursor not always appearing when a blocking action was occurring
- Fixed a rare clean up crash when removing temporary files
- Fixed UI controls being enabled in the URI properties dialog when they should have been disabled
- Pressing enter on a focused link label now correctly activates the link, instead of attempting to activate the default button for the window
- Default images are now correctly used if shell icons are enabled and no appropriate icon is available
- Addins which had additional dependencies located in the addins or views folders failed to work
Fixed
- Fixed a crash when building a single sitemap and no result information is available
- Added an additional path validation check so the application no longer crashes if you set the download folder to be something invalid, such as ftp://.
- Fixed a problem where byte order marks were not being saved into data files, resulting in corrupt text data, such as page titles in the link map. This was only apparent when downloading text documents containing non-ANSI characters. Binary files were not affected.
- Fixed a problem where response encoding was not properly processed
- Fixed a problem where URI processing exceptions that didn't involve an error-state HTTP code weren't reported correctly in the UI
- Fixed a problem where URI's which returned quoted character sets (for example text/html; charset="utf-8") caused processing of that URI to fail. When combined with the above bug, this led to URI's being skipped with no way to tell why they were skipped or what any problems were.
- Fixed an issue where the default user agent was always being used when crawling a website.
- Fixed a crash saving untitled projects introduced in the last build
- Fixed a crash when creating a new tool introduced in the last build
- Fixed a problem where commands with hot keys could be activated while the user interface was disabled, leading to either a crash or an invalid application state
- Fixed a crash if the user attempted to copy a URI and the clipboard was in use
Changes and new features
- External Tools dialog now includes a preview of the command line and allows tools to be executed from within the editing dialog
- External tools now support using environment variables
- User Agent editor now shows the default user agent
- Added a status bar indicator showing how long the current operation is taking
- Added new External URI's report
- Added automatic update check which can be enabled/disabled in the Options dialog. When enabled, once a day a check is made, and if an update is found a notification is displayed in the status bar.
- Backup settings for project files are now available on the Options dialog
- You can now drag and drop projects from Windows Explorer onto the main application window to open them
- A warning message is now displayed when trying to ping search engines without configuring sitemap URL's
- Added an option to disable shell icons in the sitemap tree view
Fixed
- Fixed an issue where an unexpected exception that occurs during URI preprocessing crashed the application rather than just aborting the active URI
- Exception reports were missing data information
- Fixed settings dialog always reapplying all settings regardless of if the setting page had tracked any changes
- Fixed a crash which occurred if the About dialog was displayed after viewing a report.
- Fixed a problem where a settings page that did not save changes correctly crashed the entire application
- Fixed a crash that occurred if the root Theme menu was clicked
- Fixed an issue where duplicate entries with the same URI and internal ID could be added to the link map when opening a project
- Default user agent was missing platform name
- Single tabbed options dialogs are no longer quite as narrow
- Fixed a number of duplicate accelerators in command menus
- Fixed a crash that occurred if the Courier New font was not available even if an alternate font had been specified in the Options dialog.
- Malformed reports no longer crash the application
- Application should no longer crash if there is a problem rendering button components. Could not reproduce original bug and error report did not include contact details so it's possible issues still exist. This is a workaround, not a true fix.
- Fixed some button tooltips incorrectly including the ampersand and ellipsis characters
- Fixed a problem where modifying a value in the Options dialog partially applied the value even if the dialog was subsequently cancelled
- Fixed a problem where the "show folder paths" option was read via one name but written with another, preventing the value from being usable
- Fixed a crash that occurred clicking the chart area of the Website Size report with an empty link map
Changes and new features
- Sitemap tree view now displays file icons rather than a generic document icon
- Added experimental Reports feature for performing dynamic querying of data. At present three read-only reports (Redirects, Not Found and Empty Meta Data) are provided; future updates will expand upon these and add the ability to create your own.
Fixed
- CEIP transport now correctly targets .NET 3.5, preventing errors when exiting Sitemap Creator and .NET 4.0 was not installed
- CEIP will no longer prompt to be enabled each time the main Sitemap Creator executable version changes
- Fixed a problem where modifying the inclusion/exclusion status of a rule via the rule editor didn't commit the change unless some other change was also made to the rule
- Fixed a potential crash reading settings
- Fixed a problem where some button text was hidden after changing font
- Fixed an occasional crash when attempting to save items via popup collection editor, such as Rules or Forms.
- Exception handler now tries to send exception reports using the en-US locale
Fixed
- Fixed a problem where setup program didn't correctly update files where the version number was unchanged
- Fixed skipped pages not being added to the skipped pages view during a crawl as they are detected (they were still added after the crawl had completed)
- Fixed a problem introduced in build 1.0.1.2 where the Open command reloaded the current document instead of prompting for a new document
Changes and new features
- Added new feature to cancel a crawl if a given HTTP status code is met. These options can be configured on the Advanced page of a project's properties.
- Added new Export Site URI's option. This will export all URI's and their statuses to a CSV file for external processing
- Added Website Size report. This view displays a graphic of the content types which make up a website.
- A new indicator has been added showing the total size of content downloaded during the active crawl operation
- Includes customer experience improvement program
- Help Updated
Fixed
- Fixed a problem where the Pages list wasn't being updated if you analyzed and built sitemaps as a single action
- Fixed a problem where the Disable Links rule condition wasn't working as expected
- Fixed a number of issues with the error reporting tool
- Fixed an issue where creating a new document when changes had been made to the current document did not prompt to save said changes, causing them to be lost
- Fixed an issue where clicking an empty MRU could prompt to remove a blank filename
- Fixed a crash which occurred when hovering over an overflow toolbar button
- Items in the toolbars menu now appear in the correct order
Changes and new features
- Added the ability to allow editing for link titles and descriptions. This allows you to customize the title and description of a link, and also prevent future updates from resetting the values. Note: If the option to clear the link map when analyzing a project is set, customizations will be automatically lost.
- The link properties dialog is now resizable
- It is now possible to install additional readers for document meta information
Fixed
- Titles could not be read from some PDF documents
- If a link was modified, it did not mark the project as changed
Changes and new features
- Mime types to include / exclude from the generated site map can now be specified. This allows much greater control over what appears in the site map. NOTE: This change defaults to "all documents" which differs from existing behaviour.
- Added the ability to view a diagram of a website's structure via the Website Diagram addin.
- Font settings can now be set through the Options dialog
- Double clicking an item in the Forms list now automatically opens the form editor
- Double clicking an item in the Rules list now automatically opens the rule editor
- Browse folder dialogs now allow entering the path on Windows Vista and above
- Substantial API changes to make it easier to use.
- User interface should now remain usable whilst analyzing a website or building sitemaps
- Added the ability to preview sitemaps prior to generating them via the new Preview addin
- Added the ability to ping search engines with the URL of your sitemaps via the new Ping addin
- Added the ability to extract the titles from PDF files during crawling
- Added the ability to load meta data from RSS files during crawling
- Added Reverse option to rules. When this option is set, the rule is processed if the regular expression is not matched.
- The link properties dialog now displays incoming and outgoing links for the source URI
- Removed the single instance limit
- HTTP redirect responses are no longer classed as errors
- Added simple sample project demonstrating some Sitemap Creator features
- URL's in link map dialog selection combo box are now ordered
- Various minor user interface enhancements
- Product documentation updated to include outstanding updates and features
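The Reverse rule option described above inverts the usual match test, so that the rule is processed when the regular expression does not match. A minimal Python sketch of that logic follows. The rule_applies function and its parameters are hypothetical names chosen for illustration.

```python
import re


def rule_applies(pattern: str, uri: str, reverse: bool = False) -> bool:
    """Return True if a rule should be processed for the given URI."""
    matched = re.search(pattern, uri) is not None
    # With reverse set, the rule is processed only when there is no match.
    return matched != reverse
```

For example, a reversed rule with the pattern \.pdf$ would apply to every URI except those ending in .pdf.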
Fixed
- Errors list no longer displays -1 instead of the appropriate error code
- Fixes an issue where it was not possible to open certain project files
- Fixes a problem where it was possible that the wrong item could be edited from the Forms list
- Fixes a problem where it was possible that the wrong item could be edited from the Rules list
- Fixes a problem where an addin that could not be initialized left the application in an unstable state
- Fixes a problem where some application settings were not immediately applied when changed by the user
- Default user agent now follows RFC 2616
- Fixed an issue where URI preprocessing wasn't immediately applied to URI's detected for crawling, which could cause additional unwanted entries appearing in the link and crawl maps.
- Fixed an issue where an unexpected exception during some stages of a web crawl would terminate the application
- Resolved a problem where some event handlers weren't correctly cleaned up when creating or opening projects
- Fixed an issue where certain events weren't being raised correctly for plugins
- Fixed an issue where the HTML provider could create non-compliant HTML
- Fixed issues preventing images from being included in XML sitemaps
- Fixed an issue where title replacement wouldn't work for some URL's
- Fixed a problem where modifying custom project settings exposed via an addin didn't mark the project as changed
- Fixed a crash which occurred when hovering over an overflow toolbar button
- Regular Expression editor now correctly updates when modifying the Replacement field.
- The error list now includes the description of HTTP response code errors
Changes and new features
- Added a Replace section to the Regular Expression dialog to make it easier to test replacement expressions
- Various performance enhancements
- The Errors tab no longer lists "Unknown Response" for non-200 HTTP codes, but instead includes the code description
- Added the ability to run user defined custom tools from within the application
- Attempting to open a recent file which no longer exists now prompts to remove the missing file from the recent files list
- Removed status bar animations
Fixed
- Fixed a crash when crawling if a rule was created with an invalid regular expression
- Reworked application mutex to avoid silent startup and shutdown exceptions
- Fixed regular expression cache not being thread safe
- Fixed status bar messages occasionally not appearing
- Fixed rare object disposed crash when generating site maps using the FTP plugin
Changes and new features
- Added the ability to enable the "multi line" option in the Regular Expression editor to more easily test patterns using ^ and $ on lists of URL's
- Added a Test URL option for Forms, allowing you to test that your forms can be successfully POSTed prior to running a full crawl
- Changed settings dialogs to use a tabbed interface
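The "multi line" option noted above matters when a pattern anchored with ^ and $ is tested against a list of URL's, one per line. The Python sketch below shows the general regular expression behaviour, using made-up example URL's. Python's re module is used only for illustration, and the editor's own regular expression engine may differ in other details.

```python
import re

urls = "\n".join([
    "http://example.com/",
    "http://example.com/blog/",
    "http://example.com/blog/feed.xml",
])
pattern = r"^http://example\.com/blog/$"

# Without the multi-line flag, ^ and $ anchor to the whole text only.
single_line_matches = re.findall(pattern, urls)

# With it, ^ and $ also match at the start and end of each line.
multi_line_matches = re.findall(pattern, urls, re.MULTILINE)
```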
Fixed
- Fixed a large number of issues with the application services libraries and components
- Fixed an issue where attributes of posted URL's were not correctly loaded if encountered at a later point during the crawl
- Fixed a crash which could occur when using the title replacement options and a page had a null title
- Fixed a crash which could occur when scanning a HTML tag containing a malformed URL
- Fixed an issue where email addresses were stripped if they contained the # character and the "strip fragments" option was enabled
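The last fix above involves the "strip fragments" option removing part of an email address that contains the # character. One way to avoid that, sketched here in Python, is to leave mailto: links untouched when stripping fragments. This is an assumption about a reasonable approach, not a description of how Sitemap Creator resolves it, and strip_fragment is a hypothetical name.

```python
def strip_fragment(uri: str) -> str:
    """Remove the #fragment from a URI, leaving mailto: links untouched."""
    if uri.lower().startswith("mailto:"):
        # A '#' here is part of the address, not a page bookmark.
        return uri
    return uri.split("#", 1)[0]
```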
Changes and new features
- The Regular Expression editor can now list all URL's in the current link map to help with testing expressions
- Text editor dialog now supports find and replace functionality and other minor tweaks
- Reworked the Rule Editor to work around problems with rules which just manipulated content as opposed to basic inclusion or exclusion
Fixed
- Fixed a crash which occurred when choosing "View Link Map" from list view context menus
- Fixed a crash which occurred when saving a project and no files were present in any build results
- Fixed an issue where sitemaps would fail to build if the active project was untitled
- Corrected typos in flag enumerations
- Fixed an issue where font based window scaling wasn't working correctly
- Fixed an issue where the first item in a list view would be hidden when enabling the filter bar if it was previously hidden
- Fixed an issue where attempting to build sitemaps after cancelling a crawl would always return that the build was cancelled
- Text editor now correctly sets default folders when selecting "Save As"
- Fixed tab character not working in the text editor
Changes and new features
- The URI control for selecting the website to analyze is now tied to the system URI history
Fixed
- Fixed a crash introduced in build 1.0.0.11 when trying to view link properties from any list control
- The "Analyze Successful" message is no longer displayed when the analyze has been performed during the building of sitemaps
- Fixed a problem where tooltips weren't large enough to accommodate their contents
- Opening or creating a project didn't clear the contents of the Build Results tab
- Fixed issues building a sitemap that had only an output filename specified without including a path
- Fixed the & character not appearing correctly in the status bar
- Fixed issue with application window being sent behind other top level windows when cancelling a crawl
- Corrected typos in Robots.txt add-in text
Changes and new features
- The last-modified meta tag is now supported. If found on matching documents, it will be preserved in the crawl map and used to sort the sitemap.
- The Link Map window now remembers its size and position
Fixed
- Generating the site map tree no longer marks the project as having changed
- Dates are now saved as UTC
- The list of incoming URI's for any given URI was being incorrectly populated
- When using the host replacement setting, commands now use the original URI as appropriate
- Fixed an issue where if a URI was referred to in multiple locations, after the first time it was encountered the outgoing and incoming URI links would not be updated correctly for future encounters.
- When reloading a project, the link map is no longer crawled looking for pages directly matching the root element; instead, all non-excluded URI's are formed into the map. This resolves a problem where the crawl map generated from reloading a project might not match the crawl map generated from analyzing a website.
Changes and new features
- A new addin is available which allows you to automatically upload sitemaps to an FTP server after successful generation
- A new rule option has been added that disables a sitemap link being created for a given URL, but still allows the contents of the URL to be crawled
- A new rule option has been added that can be used to prevent a rule from matching a child URI
- A new rule option has been added that allows a URL which would normally be skipped to be added to the sitemap without actively crawling it, for example an RSS feed.
- Redirects are no longer listed in the Errors tab and do not trigger the "page errors found" message
- Rules which enable the inclusion of content now appear with green icons in list views
- Redirect processing now honors 303 and 307 response codes
- Report lists now display tooltips
- If a link redirects to another, the destination is now stored with the original link
- Link properties dialog now shows redirect information
Fixed
- Exception reports were using the file version instead of the product version
- Fixed a rare XML crash when saving a project
- Fixed a crash which would sometimes occur when editing a rule or a form
- Fixed a crash opening a project if build result information was present for an unavailable sitemap provider
- External URI processor no longer transforms the URL if the "Host Replacement" option has been specified
- HTML Site Map plugin now correctly includes all items in the crawl map
- Fixed a crash which would occur if the referring URL was not available
- Fixed a crash which would occur if the "content-type" header wasn't present when pre-processing a URL
- Buttons in the main window now correctly follow the colors of the main theme
Changes and Updates
- Meta refresh redirects are now crawled
- Changed how redirects are handled, these will now appear in the main report lists
- Skipped list now displays the content type of entries
- Added new Not Found and Redirect exclusion reasons; redirects and missing files will no longer appear as "None" in skip lists.
Fixed
- When using URL replacement, under certain conditions the replaced URL's would be garbled.
- Two URL's with the same host bar the www prefix (e.g. http://cyotek.com/ and http://www.cyotek.com/) are now treated the same when determining if a URL is external.
- URI's were not correctly combined on pages being crawled as a result of a redirect.
- Reloading a sitemap which contained redirects did not display a map for any content discovered after the redirect
- No longer attempts to download content for redirected responses
- Projects weren't always being correctly marked as changed
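The www prefix fix above treats hosts that differ only by the www prefix as the same site when deciding whether a URL is external. The Python sketch below shows one way such a comparison could work. The helper names are hypothetical and the real comparison may consider more than the host name.

```python
from urllib.parse import urlsplit


def host_key(uri: str) -> str:
    """Return the lower-cased host with any leading www. removed."""
    host = (urlsplit(uri).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_external(uri: str, root: str) -> bool:
    """Treat hosts that differ only by the www prefix as the same site."""
    return host_key(uri) != host_key(root)
```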
Changes and Updates
- Substantial performance improvements have been made when loading large projects containing many links.
- Updated to use Html Agility Pack 1.4
- A new option to control if headers should be saved in the project file has been added. This option is disabled by default.
- The log tab has been removed.
- Cut, copy and paste commands are now available from the main window. However, lists and trees currently only support copy.
Fixed
- An attempt was made to read titles and descriptions from all files, causing a rare crash.
- The Accept GZip Compression option was never correctly read from the project file.
- The Sitemaps section in the Project Properties dialog was broken, preventing providers from being selected or unselected.
- The Results view wasn't being cleared when a new site analysis was started if previous information was present.
- Title Replacements with a blank pattern could behave very oddly
Changes and additions
- 401 authentication requests are now supported, either via predefined credentials or during the crawl via a password dialog.
- The default buffer size has been increased to a larger value, allowing for faster downloads. In addition, the buffer size is now configurable.
- Gzip compression is now supported.
- Deflate compression is now supported.
- Crawling is now performed on a separate thread, resolving sluggish behaviour with the user interface. (Disabled for this build)
- The Link Map Viewer now has a tab for displaying all links found. All lists in this dialog have had new columns added with more details on the links.
- The project properties dialog now provides access to properties which could not be changed in previous builds.
- Object model simplified, some confusing class inheritance has been removed.
- Added the ability for additional content type handlers to be used.
- Added the ability to specify multiple seed URI's.
- A new configuration section has been added allowing you to store authentication credentials in a project file and to disable the password dialog when crawling.
- Added new viewer extensibility options allowing new tabs to be added to the interface.
- Major refactoring of the base IApplication implementation.
- Response headers are now stored in the link map. The Link Properties dialog now displays these headers.
- The Link Properties dialog now displays local path information and the ability to open, open the containing folder, or edit the local file. This change is more for Cyotek WebCopy than for the Sitemap Creator.
- Scanning of subdomains is now supported.
- You can now select from a common list of user agents.
- Crawling will no longer occur above the root level by default. A new option has been added to toggle this behaviour.
Fixed
- Redirects were not followed for 301 or 307 status codes.
- Page links found in an IFrame or Frameset were not scanned.
- Image links saved into a Google sitemap weren't updated if the project was using host replacement.
- Cancelling a crawl now also correctly aborts the current transfer instead of waiting for it to complete.
- If a list was scrolled horizontally, the context menu displayed from the filter bar wasn't positioned correctly.
- Fixed a bug where response headers were not available if the request was not an expected response code.
- The results list in the regular expression editor didn't resize with the window.
- The regular expression editor no longer displays results for a blank expression.
- Duplicate keyboard accelerators have been fixed.
- The Sorted property of a crawl map now correctly defaults to false.
- Fixed a problem where it was possible for the CommandManager to try and load classes it had no business loading, causing error messages to be displayed on startup.
- Fixed a problem where command interface elements were not always given a name, leading to a problem where items could not be accessed unless the full text was known.
- The failure to load an image resource for a command interface element will no longer cause the application to fail to initialize.
- Fixed some layout problems in Windows XP.
Changes and additions
- Added the ability to include a URL pattern with a title replacement option. This allows you to include titles for non-HTML pages, such as an RSS feed, or to replace titles on a single page where more than one page has the same title.
- Exclusions have been renamed to Rules to reflect their changing nature in this build and future planned enhancements.
- Added a new option for Rules which allows you to include links to images in the sitemap. Any matched images will be saved into the Google sitemap.
- Titles are now stored with the link map for non-HTML documents if the title element is specified for the link. The new Google image option will use these titles for generated image sitemap entries.
- When using the Add Rule context menu item from a result list, the editing dialog is now displayed, allowing the entire rule to be configured.
- The Add Rule command now includes any applicable query string in the URL for the rule.
- The Build Results view now contains additional text if page errors were detected.
- Build results are now saved with the project
- The Build Results tab is now automatically repopulated when opening a project if previous build information is present.
- A new Edit option has been added to files listed in the Build Results tab. This allows you to view, edit, save and print plain text files within the Sitemap Creator.
- Added an option to sort pages and folders rather than listing them in the order they were found.
- A basic Regular Expression Editor is available and can be accessed via the Function button displayed next to supported fields.
- Error text associated with a page error is now stored in the link map.
- The page errors list will now be regenerated on loading a project with a saved link map.
- The Link Map Viewer now displays link titles and error text.
Fixed
- Fixed a problem where URLs containing spaces were incorrectly encoded in the ASP.NET sitemap.
- The Add Rule and Add Form dialogs caused a crash when being used to create rather than modify items.
- If a link to a child of a page which has been matched to a rule with the DisableCrawl option is detected, the entire link will now be excluded.
- Fixed some selection inconsistencies in rules and forms editors.
- The Add Rule command now automatically escapes regular expression elements within the URL, such as the ? of a query string.
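
The escaping fix in the last item can be illustrated with a short sketch. It is written in Python rather than the application's own code, and the function name `url_to_rule_pattern` is hypothetical; it only shows why an unescaped `?` in a URL breaks a regular expression rule.

```python
import re

def url_to_rule_pattern(url: str) -> str:
    """Return a regular expression that matches the given URL text literally.

    Hypothetical helper for illustration; not part of Sitemap Creator.
    """
    # re.escape backslash-escapes regex metacharacters such as '?' and '.'
    return re.escape(url)

url = "/search.php?q=test"

# Escaped: the pattern matches the URL it was created from.
escaped_match = re.search(url_to_rule_pattern(url), url)

# Unescaped: '?' makes the preceding 'p' optional instead of matching
# a literal question mark, so the pattern fails to match its own URL.
unescaped_match = re.search(url, url)
```

Here `escaped_match` is a match object, while `unescaped_match` is `None`.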
Changes and additions
- Command line arguments added.
- The Exclusion Type functionality has been removed and replaced with a series of flags to allow more control over exclusions.
- A new "Use Full URI" option has been added for exclusions. If this flag is set, the entire URI including the domain is used for exclusion matching; otherwise only the path and query are used. This change makes it easier to match URLs such as "/sitemap" but not "subfolder/sitemap".
- A new "Use Modified URI" option has been added for exclusions. If this flag is set, the modified URL as used by the sitemap is used; otherwise the original URL is used. This flag is mainly used when you are using domain aliases to remap URLs.
- The Exclusions and Forms lists and editors have been recoded to work in a less hacky fashion.
- Added the ability to turn on filtering for the remaining lists.
- Additional options added to the context menu for filter columns.
- The link map now stores the skip reason, if applicable.
- The Skipped Pages list is now automatically repopulated when opening a project.
- The Pages list is now automatically repopulated when opening a project.
- Some optimisation made to the crawl process to make it a little faster.
- Redirects are now followed.
- The popup progress dialog no longer appears when an operation is being performed from the UI.
- Added the ability to reorder exclusions.
- The last used folder is now remembered when displaying the Open File dialog.
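
As a rough illustration of the "Use Full URI" flag described in the list above, the following Python sketch chooses the text an exclusion pattern is tested against. The function names and the `example.com` URLs are assumptions made for the example; the application's actual matching code is not shown in these notes.

```python
import re
from urllib.parse import urlsplit

def match_target(uri: str, use_full_uri: bool) -> str:
    """Return the text an exclusion pattern would be tested against."""
    if use_full_uri:
        # Entire URI, including scheme and domain
        return uri
    parts = urlsplit(uri)
    # Path and query only, e.g. "/sitemap?page=2"
    return parts.path + ("?" + parts.query if parts.query else "")

def is_excluded(uri: str, pattern: str, use_full_uri: bool = False) -> bool:
    """Hypothetical helper: True if the pattern matches the chosen text."""
    return re.search(pattern, match_target(uri, use_full_uri)) is not None

# With the flag off, a pattern anchored at the start of the path
# matches "/sitemap" but not "/subfolder/sitemap".
root_hit = is_excluded("http://example.com/sitemap", r"^/sitemap")
nested_hit = is_excluded("http://example.com/subfolder/sitemap", r"^/sitemap")

# With the flag on, the domain is part of the matched text.
full_hit = is_excluded("http://example.com/subfolder/sitemap",
                       r"example\.com/subfolder", use_full_uri=True)
```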
Fixed
- Fixed a crash when reordering forms.
- The skipped pages list is now correctly populated during the analyze process after being broken in build 1.0.0.3.
- Fixed a bug in ObservableCollection which resulted in duplicate entries appearing in lists such as Exclusions when editing an existing item.
- Title replacement regular expression patterns are now case insensitive to be consistent with other pattern matching used in the software.
- The Source column has been removed from the Analyze Results as it's not applicable for Sitemap Creator projects.
- A status of "Skipped" instead of "Failed" could be returned in some circumstances.
Changes and additions
- The prompt which appears if you attempt to generate a sitemap using an old link map now only appears once per project session.
- Holding down Shift when clicking the Build Sitemap or Build All Sitemaps commands will automatically analyze the website first. This will also suppress the old link map prompt.
- Filter functionality has now been built into list views, removing the clunky interface from the View Link Map dialog.
- Added a new option to control how URLs are combined if a link starts with a forward slash but doesn't match the absolute path of the website being crawled.
- The Site and DefaultDocuments properties have been moved to the core CrawlerSettings object along with all related functionality. The core object is now fully responsible for creating the map based on the configured content types. This change won't affect the Sitemap Creator directly but makes it easier for other products to display a sitemap.
- Changed how temporary file names are generated for crawling with a specified cache folder.
- Added Title and Description properties to the LinkInfo object
- The sitemap is no longer saved in a project file. Instead, it will be automatically regenerated from the link map information.
- Added the ability to view the properties of link map entries.
- Various user interface tweaks
Fixed
- The wrong status was shown when attempting to post a form and the attempt failed.
- The last crawled date wasn't being read from a project.
- The wrong icons would get displayed in the column header when sorting a list
- Fixed a crash which could occur when right clicking an empty or filtered list in the Link Map dialog.
- Fixed an issue where certain link info properties were not getting persisted
- The progress displayed in the dialog didn't always reflect the progress of the results view.
- Temporary files created when saving a project file were not being removed after the save was complete
- A warning would be displayed for any sitemaps that could not be generated even if that sitemap was not selected for use with the Create All command.
- Fixed a crash which would occur if exclusions were defined and a URI did not have a content type
- Setting some LinkInfo properties did not mark the object as changed
Changes and additions
- Added file association to setup
- Projects are now added to the shell Recent Documents list and, in Windows 7, the jumplist.
- Document crawling has been refactored and separated to allow CSS, script, images, anchors etc to be independently scanned.
Fixed
- Fixed an issue where a header check on a page which did not exist caused a crash
- Fixed a crash which would occur when trying to determine the default extension of an unspecified content type.
- Fixed an issue where a URI which should have been skipped was still processed.
- Fixed an issue when downloading a content type included due to a global inclusion setting but which did not support crawling.
- Fixed various layout glitches.
- Fixed an issue where the wrong status icon was displayed for a "failed" operation
Changes and additions
- The sitemap is now located in the Project object rather than the SitemapCreator object. This means it can be persisted in a project file and reloaded without having to re-analyze a site.
- When opening a project containing a previously saved sitemap, if you try to build a sitemap without analyzing the site first, you are asked whether you want to use the existing sitemap or analyze the site from scratch.
- The Page object was being used to represent both forms and sitemap pages. This has now been separated into Page and FormPage.
- Added a new Html Page sitemap type.
- Build results are now displayed as a tab in the main window rather than via the separate popup dialog.
- Added the ability to disable exclusions and posted forms without having to remove them.
- The "Posted pages" functionality has been renamed to "Forms"
- Lists can now be sorted on any column
- The Sitemap Creator now utilizes Cyotek's Application Services, allowing easy integration of additional commands and functionality and making it much simpler to write add-ons. Existing support for the old plugin framework has been dropped.
- Addins are now operational.
- Domain Aliases, Exclusions, Forms and Title Replacements editors all now allow in-line editing of items. The original double click to remove, edit and add behaviour is still present.
- Moved some options around in the Project Properties dialog, and additional options have been added.
- Added additional URI checks
- You can now add exclusions by context clicking a URL in the Results pane.
- Added the ability to disable the removal of fragment information (bookmarks) from URLs.
- Added status icons to all lists and replaced several button glyphs.
- The crawler now generates a map of all URLs in a site. This can optionally be saved into a Sitemap Project file.
- Added the ability to view the link map from within the Sitemap Creator. If you have saved the link-map into your project, you can reopen the project and view the link-map without having to re-analyze your website.
- Added the "Open in browser" option to the context menu for Page Error entries.
- The DontLeaveSite property of the crawler object has been removed; this behaviour is now implicit and cannot be disabled on a global scale.
- Added progress support to the crawler and front end when downloading files.
- Added additional display options for the sitemap tree
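
The fragment option mentioned in the list above can be pictured with a small Python sketch. The function name `normalise_link`, its `remove_fragment` parameter, and the `example.com` URLs are hypothetical and chosen for illustration; they do not reflect the crawler's real settings or API.

```python
from urllib.parse import urldefrag

def normalise_link(url: str, remove_fragment: bool = True) -> str:
    """Optionally strip the fragment (bookmark) part of a URL."""
    if not remove_fragment:
        return url
    # urldefrag splits "page.html#section" into ("page.html", "section")
    stripped, _fragment = urldefrag(url)
    return stripped

links = [
    "http://example.com/faq.html#billing",
    "http://example.com/faq.html#shipping",
]

# With removal enabled, both links collapse to a single page.
unique_pages = {normalise_link(link) for link in links}

# With removal disabled, each bookmark is kept as a distinct URL.
unique_with_fragments = {normalise_link(link, remove_fragment=False) for link in links}
```

Removing fragments keeps a crawl from recording the same page several times; keeping them may be useful when bookmarks identify distinct content.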
Fixed
- Editors which displayed a relative URL caused a crash if the base URI for the project was invalid.
- Page events always returned "Bad Request" for the HttpStatus property when no status was available.
- Corrected a problem where the crawler downloaded binary files incorrectly.
- Corrected a problem where URLs containing the hash character could be incorrectly parsed
- The page errors tab wasn't always being activated if one or more page errors were found.
- Page errors are now correctly created if the response code of a request is not a success code
- Corrected a problem where query string values were not encoded.
- Fixed a crash which could occur when modifying exclusions in the main window after opening the popup editor.
Minimum Requirements
- Windows 10, 8.1, 8, 7, Vista SP2
- Microsoft .NET Framework 4.6
- 20MB of available hard disk space
Donate
This software may be used free of charge, but as with all free software there are costs involved to develop and maintain.
If this site or its services have saved you time, please consider a donation to help with running costs and timely updates.