Getting Started Guide
Click on the video below for a short tutorial on how to get up and running with Compare quickly
Working with Compare Output in Excel
The Compare Output File
Compare outputs results to an Excel .xlsx file. Depending on the options chosen, there will be two or more sheets in the file. These are:
- Compare – the main analysis output containing all records or only detected differences, depending on output option selected.
- Warnings – summary of warnings and errors detected from the analysis (only displays if errors were detected in the analysis).
- Stats – file names and statistics from the analysis.
- First File – the content of the first file analyzed (if output original data sheet was selected).
- Second File – the content of the second file analyzed (if output original data sheet was selected).
The Compare Sheet
This sheet contains the details of the comparison between the two files. This sheet will either contain all records or just the records that are different between the files, depending on the option chosen.
The first column contains the key field(s) used as the basis for the comparison. If you selected a single key, the value of that key will be shown for each record. If you selected multiple keys, each key will be included in the column separated by a “#” symbol.
The second column indicates whether the record matches between the files (with the word MATCH) or has differences (with the word DIFF). This can be handy especially when working with wide files with many fields.
Subsequent columns (fields) are presented in three-column structures:
[Column Heading]: Field Name. Flagged as a MATCH if the data is the same in this field for each record, or DIFF if the data is different
[1-Column Heading]: The field data in the first file for each record.
[2-Column Heading]: The field data in the second file for each record.
Note: you may need to expand the column widths to see the entire field names.
If you have selected all records as an output option, you can still filter the excel sheet to show only the differences by clicking the dropdown under Column B and selecting only DIFF’s.
If the data matches in a row, the Full Record column indicates MATCH.
If the data is different, the fields are highlighted and the first column indicates DIFF.
Flagging differences to be accepted or fixed
When you find differences in records, you can indicate whether to accept them or fix them (in the source system) by selecting the DIFF cell next to the field to be corrected and changing the flag to ACCEPT or FIX.
This is a handy way to create a worklist of records to correct as an outcome of this analysis.
Note: there may be more than one field that is different in a record. If you accept all the differences, the entire record will be flagged as ACCEPT in column B. If there are one or more fields to be fixed, column B will show FIX.
At the end of your analysis, you may want to filter Column B to just show the records to be Fixed, which will create the worklist of records to update in the source system.
The Warnings Sheet
The Warnings Sheet contains any errors or warnings detected during the analysis. If there were no warnings detected, this tab will not be included in the output sheet.
The most common warning is if duplicate records are detected in a sheet. This can happen if the key(s) selected are insufficient to define unique records. If duplicates are detected, the second and subsequent records are logged in the Warnings sheet and the record is not used in the analysis.
If duplicate records are detected, you may wish to re-run the Compare using additional or new keys.
If you believe the keys are correct, you can use the output in the Warnings tab to inspect the source system to cleanse duplicate records.
The Stats Sheet
The Stats sheet contains a summary of the statistics of the analysis. This includes a record of the file names and paths compared, a summary of the record counts for both files, the total number of unique records, total different records, and any warnings generated.
The First File and Second File Sheets
If selected as an output option, Compare will write the source data of each file to separate sheets in the output file.
You may chose to do this if you would like a single excel file to contain the source and output of the analysis.
Note: writing these files to Excel can lengthen the duration of the analysis substantially depending on the size and structure of the files. We recommend to not use this option if the files are large.
Frequently Asked Questions
What does Compare do?
Compare analyzes two data files containing similar lists of unique records and generates an excel sheet summarizing the differences between the two files. For an overview video on how to use Compare, click here.
How do I install Compare?
If you haven’t already downloaded the installation file, click on this link to be taken to a site to initiate the download.
Once you have filled out the form and accepted the Terms & Conditions, you will be taken to a secure download page. On this page you can download the Compare installation program, some sample data files to get you started, and some installation instructions.
When the application has finished downloading, run the installation program.
To try out Compare for the first time, download the two practice files located on the application download site:
Compare Practice Data File 1.csv
Compare Practice Data File 2.csv
You can use these files to try the application before using your own data files.
If you have any issues with installation, please contact us at email@example.com
What type of files can be compared?
Compare works with a common data file type called “comma delimited format”. These are plain text files with each row representing one record, and each record consisting of one or more fields, separated by commas. They most often have a .csv file extension, although this is not mandatory.
For more information on these kinds of files, see here.
Do my files need to have a .csv extension?
Compare works with comma delimited files, but does not require a .csv extension. Many systems output in comma delimited format but use their own proprietary file extensions. These files will work with Compare.
What kinds of differences can Compare detect?
Compare detects non-matching cell values such as:
- Simple differences (e.g. “LOA” vs. “Leave of Absence”)
- Capitalization (e.g. “atlantic” vs. “Atlantic”)
- Leading or trailing spaces (e.g. ” Weekly” vs. “Weekly” vs. “Weekly “)
- Special characters (e.g. “Cafe” vs. “Café”)
Any records that only exist on one of the files will be identified, as well as any records where the unique identifier is repeated.
The records in the files are matched based on unique identifiers (one or more fields that uniquely identify a record), so it does not matter if the records in the two files are in a different order.
Can I use files without headers?
We recommend you always use files with defined header rows. Compare always assumes the first row in the files you select is the header row.
If the first row just contains data, it may be difficult to select key columns and the output data may be difficult to interpret. In some cases, Compare may not be able to interpret files without headers at all, in which case you will see an error similar to this:
If you see this error, please open the source files, add the same headers to each file, and re-save as csv files.
How does Compare handle extended character sets?
Compare works with standard ASCII encoded files and extended character set UTF-8 encoded files. UTF-8 encoded files preserve non-English language characters and other special formats. If these files are saved as ASCII encoded CSV files, the extended character sets will be lost.
Note: Compare requires both files to have the same encoding – either ASCII or UTF-8.
Please note: there are multiple CSV file encoding formats and no easy way to tell how a file is encoded. Take caution when opening and saving these kinds of files to ensure appropriate encoding is preserved.
What happens if I have duplicate column header names?
You can compare files that have duplicate column headers, but for the comparison to work properly, these duplicate columns should not be selected as “key fields”.
If you want to include one or both of the duplicate column headers as key fields, we recommend renaming the headers to unique names before you load the files.
Can I use files that have a semicolon or pipe as delimiter?
At this time Compare only works with files which have records delimited by commas.
Sometimes other delimiters are used in data files such as pipes (“|”) and semicolons (“;”). If you can convert these files to comma delimited, they can be processed by Compare.
When converting these files, it is important to ensure the records themselves don’t have commas in them, as these commas would be interpreted as delimiters. This may cause Compare to fail, or make the results invalid. Therefore, we recommend the following steps for conversion:
- Open each file to be compared in a text editor
- Search for commas and replace those commas with another character (such as a dash). You may want to choose a character that isn’t used elsewhere in the file.
- Search for the delimiter (most often | or ; and replace them with commas.
- Save a copy of each file. Be sure the file is saved as plain text.
- Run Compare using these modified files.
Why can't I sort columns in the output file numerically?
Compare formats all data as text. In order to sort a column on the output file numerically, simply select all of the values in that column (control-shift), click on the exclamation mark next to the top cell, and select “convert to number”. Now a filter on that column can sort numerically.
What happens if I often analyze similar types of files?
The first time you analyze files using Compare, the analytics engine creates a fingerprint of the file structure and the key columns you select.
Compare saves that fingerprint so that the next time you analyze the same type of file, the key(s) are automatically selected. This allows you to skip this step and start the process of comparing right away.
Why is the output in .xlsx format?
Excel remains the most popular general purpose analytics tool for business users. By outputting in .xlsx, Compare can add some additional functionality such as the ability to use color to denote records with differences, and dropdowns to select next actions.
How long does it take to process a file?
Compare uses a powerful data processing engine that can complete large files with millions of records in minutes.
Processing speed is a function of file size, file configuration, options selected and the speed and memory of your computer.
If you are comparing large files with many columns (fields), we recommend selecting the option to output only differences and to exclude original data sheets.
Writing output is the most intensive part of the Compare process, so if you select the options to output all records and include original sheets, the process can take 2 or more hours.
Is there a maximum file size?
The maximum file size Compare can handle is as follows:
Each record (row) cannot exceed 255 fields (columns).
Each record (row) cannot be longer than 8,192 characters.
The maximum number of rows that Compare can process is limited by memory on your machine. Note that Excel cannot load files larger than 1 million rows.
The amount of available RAM on your computer may further limit file size that can be processed by Compare.
Is my data secure?
Yes. Compare is a desktop only application. No data is ever moved off of your machine, and all processing occurs on your computer.
You can safely process files containing confidential data, personally identifiable information (PII) or private health information (PHI).
Note: the output files will contain data from the input files, so these should also be treated with the same level of security and privacy controls as the input files.
What happens if input files or output files are open in Excel when running Compare?
Excel “locks” files when in use, which prevents files from being changed by other applications while opened in Excel.
This means that you must ensure Excel does not have input files open when selecting them. You must also ensure the output file is not opened in Excel when starting the compare. If you see errors related to this, just close the files in Excel and re-start the compare.
How does licensing work with Compare?
Compare is available as an annual subscription. Each subscription license entitles the user to install the application on a single computer. If you need to change computers during the year, you can update your license details at any time.
How does the free trial work?
You can download an unlimited use two-month free trial version of Compare for evaluation. There are no limitations on file size or number of compares done during the trial. At the end of the trial, the application will be locked until you purchase a subscription license.
How do I purchase more than one license?
If you’re a manager or team leader who would like to purchase license subscriptions for your team members, you can select this option during checkout.
You can manage and assign licenses to team members and individual computers. If team members change roles, you can reassign licenses to different people.
What are the system requirements to run Compare?
Compare works on Windows 7 Professional or greater and macOS Catalina (or newer). Screen resolution of at least 1290 x 800 is required.
How do I know whether I'm running the up-to-date version of Compare?
In Compare, click on the “Help” menu, and “Check for Updates”. If there is a newer version available, you will have the option to download and update Compare automatically
How do I report a bug, or provide other feedback?
Click on “Help”, and “Support”. This will open your web browser and take you to our support portal where you can send us a support ticket. You can also email us at firstname.lastname@example.org.
I am getting the following error message: Unable to complete processing: Index was outside the bounds of the array.
This error occurs if an input file has rows with more records than the header. In this case, the software cannot determine what column the extra record belongs to.
If you get this error, check the input files to make sure all rows have the same number of columns as the header row.
If you already have an account and would like to change your password, manage your licenses, or change or cancel your subscription, please click the link below.
If you have difficulties with Compare, want to report a bug, have ideas for a feature or have general feedback, click here to go to our support portal. We value all your feedback and will get back to you as soon as possible.
Download a free trial
If you have not yet tried Compare, click here to download a free no obligation unlimited one month trial.
© 2019 Dispatch Integration Ltd.