Data Records Comparison System
Most enterprises have dozens or hundreds of datasets that need to be analyzed and validated whenever systems, processes, or populations change. Dataset analysis can be a tedious and repetitive job, even when using advanced Excel functions.
Dispatch Integration’s Compare is a fully automated, secure, and easy-to-use desktop application that rapidly compares .csv and other comma-delimited files to find differences between datasets. It is a smart, simple and fast data comparison solution that eliminates hours of work normally required to compare data within large files.
Compare Works In 4 Easy Steps
Step 1: Select Files to Compare
Select two comma-delimited files with the same column headers. The ID number in each record is used to match records between the two files.
Step 2: Select Key Fields
After selecting the files that need to be compared, select a key field from the available options (the column headers from the files). More than one field can be selected.
Step 3: Select Output Options
The output file will contain more columns than the original files being compared. These extra columns allow the user to filter for records that differ between the datasets and decide how to deal with the identified discrepancies.
Step 4: Compare Output In Excel
The output highlights the discrepancies between files and provides users with information such as the number of records in each file, the number of unique records between the two files, the number of differences detected, and the total number of warnings (e.g. records with identical ID numbers).
Compare has greatly reduced the amount of time I need to analyse “before” and “after” files for our semi-annual upgrade testing. It allows me to quickly focus on where the differences in data are and easily code them as “explainable” or “unexplainable”.
Senior Business Analyst – Retail
Compare Saves Valuable Software Development Time and Resources – and Eliminates Problems Caused by Bad Data
Features And Benefits
Easy To Use
Compare is a powerful data comparison tool designed to be simple and easy to use.
Compare can be used by business analysts, data scientists, integration specialists, and other software professionals to quickly and accurately find the differences between datasets.
Large files consisting of hundreds of columns and hundreds of thousands of rows can be processed in a matter of minutes.
Compare is secure. All data processing occurs on your local machine. Nothing is transmitted or processed in the cloud.
Compare can process millions of records in minutes and outputs the results to Excel for further analysis.
This powerful integration is available as a free download. Compare is designed and built with care by Dispatch, a leading enterprise data integration solution provider.
Compare Is For You!
Developers and Application Managers use Compare to ensure their code is delivering the output they expect. Use Compare during unit testing to validate that the output of new enterprise systems matches legacy code and business requirements.
Engineers, Researchers, and many more professions
Professionals use Compare as a simple and fast tool to help them analyze data quickly, often in conjunction with more sophisticated data analytics systems. Compare saves time and eliminates tedium, regardless of data type.
IT professionals use Compare for regression testing and data validation during system upgrades. When applications are modified, reconfigured, upgraded, or replaced, Compare is an invaluable tool to confirm system output is consistent with any application change.
Simply export a “known-good” output file from the current system and compare it with a file generated by the new system. Compare will flag any differences between the files in minutes.
Business Analysts use Compare to quickly analyze datasets from business systems to detect anomalies. Data quality often deteriorates across integrated business systems, and cleansing data can be tedious. Compare can detect data anomalies between systems in minutes, which can be tagged for further review, or accepted as is.
The output is in Excel format and can include both the original data plus a rich comparison sheet to facilitate more analysis if required.
Quality Assurance Professionals
Compare can help QA professionals quickly analyze datasets to pinpoint quality issues or anomalies. Whether you’re looking for temporal changes, piece-to-piece changes, within-piece changes, or population-to-population changes, Compare will help you quickly discover potential quality problems.
Data Migration Specialists
Differences can easily be flagged in the output file for further investigation or acceptance.
Frequently Asked Questions
What does Compare do?
Compare analyzes two data files containing similar lists of unique records and generates an Excel sheet summarizing the differences between the two files. For an overview video on how to use Compare, click here.
How do I install Compare?
If you haven’t already downloaded the installation file, click on this link to be taken to a site to initiate the download.
Once you have filled out the form and accepted the Terms & Conditions, you will be taken to a secure download page. On this page you can download the Compare installation program, some sample data files to get you started, and some installation instructions.
When the application has finished downloading, run the installation program.
To try out Compare for the first time, download the two practice files located on the application download site:
Compare Practice Data File 1.csv
Compare Practice Data File 2.csv
You can use these files to try the application before using your own data files.
If you have any issues with installation, please submit a ticket here.
What type of files can be compared?
For more information on these kinds of files, see here
Do my files need to have a .csv extension?
What kinds of differences can Compare detect?
- Simple differences (e.g. “LOA” vs. “Leave of Absence”)
- Capitalization (e.g. “atlantic” vs. “Atlantic”)
- Leading or trailing spaces (e.g. “ Weekly” vs. “Weekly” vs. “Weekly ”)
- Special characters (e.g. “Cafe” vs. “Café”)
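The categories in this list can be illustrated with a short Python sketch. This is not Compare's actual logic, which is not described here; the function names and the category labels are hypothetical and chosen only to mirror the list above.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks so that 'Café' becomes 'Cafe'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_difference(a: str, b: str) -> str:
    """Label how two field values differ (labels are illustrative only)."""
    if a == b:
        return "identical"
    if a.strip() == b.strip():
        return "leading/trailing spaces"
    if a.lower() == b.lower():
        return "capitalization"
    if strip_accents(a) == strip_accents(b):
        return "special characters"
    return "simple difference"
```

The checks run from the narrowest kind of difference to the broadest, so a pair of values gets the first label that explains the mismatch.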
Any records that only exist on one of the files will be identified, as well as any records where the unique identifier is repeated.
The records in the files are matched based on unique identifiers (one or more fields that uniquely identify a record), so it does not matter if the records in the two files are in a different order.
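To show why record order does not matter, here is a minimal Python sketch of key-based matching. It is an assumption about the general technique, not Compare's implementation; `load_by_key`, `compare`, and the result field names are hypothetical.

```python
import csv
import io
from collections import Counter


def load_by_key(text, key_fields):
    """Index rows by their key tuple and report keys that repeat."""
    rows = list(csv.DictReader(io.StringIO(text)))
    keys = [tuple(row[field] for field in key_fields) for row in rows]
    duplicates = {key for key, count in Counter(keys).items() if count > 1}
    # For a repeated key the last row wins; a real tool needs a clearer policy.
    return dict(zip(keys, rows)), duplicates


def compare(text_a, text_b, key_fields):
    """Match records by key, then report missing, changed, and duplicate keys."""
    a, dup_a = load_by_key(text_a, key_fields)
    b, dup_b = load_by_key(text_b, key_fields)
    changed = {}
    for key in a.keys() & b.keys():
        diffs = {
            col: (a[key][col], b[key].get(col))
            for col in a[key]
            if a[key][col] != b[key].get(col)
        }
        if diffs:
            changed[key] = diffs
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "changed": changed,
        "duplicate_keys": sorted(dup_a | dup_b),
    }
```

Because each file is indexed by its key before any values are compared, a record is paired with its counterpart wherever it appears in either file.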
Can I use files without headers?
If the first row just contains data, it may be difficult to select key columns and the output data may be difficult to interpret. In some cases, Compare may not be able to interpret files without headers at all, in which case you will see an error similar to this:
If you see this error, please open the source files, add the same headers to each file, and re-save as csv files.
How does Compare handle extended character sets?
Note: Compare requires both files to have the same encoding – either ASCII or UTF-8.
Please note: there are multiple CSV file encoding formats and no easy way to tell how a file is encoded. Use caution when opening and saving these kinds of files so that the original encoding is preserved.
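A rough pre-check is possible in Python: test whether a file's bytes are valid ASCII or valid UTF-8. This is only a sketch. It cannot prove how a file was encoded, and the function name `guess_encoding` is hypothetical. How Compare treats an ASCII file paired with a UTF-8 file is not stated here, so the sketch only reports what each file looks like.

```python
def guess_encoding(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' based on what the bytes decode as."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so likely a legacy encoding such as Latin-1.
        return "other"
```

To use it, read each file in binary mode, for example `guess_encoding(open(path, "rb").read())`, and compare the two results before running Compare.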
What happens if I have duplicate column header names?
If you want to include one or both of the duplicate column headers as key fields, we recommend renaming the headers to unique names before you load the files.
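To find duplicate header names before loading a file, a short Python check along these lines may help. It is a sketch only, and the function name `duplicate_headers` is hypothetical.

```python
import csv
import io
from collections import Counter


def duplicate_headers(text: str) -> list:
    """Return the header names that appear more than once in the first row."""
    header = next(csv.reader(io.StringIO(text)), [])
    return sorted(name for name, count in Counter(header).items() if count > 1)
```

An empty list means every header is unique and no renaming is needed.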
Can I use files that have a semicolon or pipe as delimiter?
Sometimes other delimiters are used in data files such as pipes (“|”) and semicolons (“;”). If you can convert these files to comma delimited, they can be processed by Compare.
When converting these files, it is important to ensure the records themselves don’t have commas in them, as these commas would be interpreted as delimiters. This may cause Compare to fail, or make the results invalid. Therefore, we recommend the following steps for conversion:
- Open each file to be compared in a text editor
- Search for commas and replace those commas with another character (such as a dash). You may want to choose a character that isn’t used elsewhere in the file.
- Search for the delimiter (most often | or ;) and replace each occurrence with a comma.
- Save a copy of each file. Be sure the file is saved as plain text.
- Run Compare using these modified files.
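For many files, or files too large for a text editor, the same conversion can be scripted. The Python sketch below follows the manual steps above. The function names, the default dash substitute, and the UTF-8 encoding are assumptions made for illustration.

```python
from pathlib import Path


def to_comma_delimited(text: str, delimiter: str = "|", comma_substitute: str = "-") -> str:
    """Apply the manual conversion steps to the full text of a file."""
    # Replace commas inside the data so they are not read as delimiters.
    text = text.replace(",", comma_substitute)
    # Turn the original delimiter into commas.
    return text.replace(delimiter, ",")


def convert_file(source: str, target: str, delimiter: str = "|") -> None:
    """Write a converted plain-text copy, leaving the source file untouched."""
    original = Path(source).read_text(encoding="utf-8")
    Path(target).write_text(to_comma_delimited(original, delimiter), encoding="utf-8")
```

For example, `convert_file("data_pipe.txt", "data_comma.csv")` writes a comma-delimited copy, where both file names are placeholders. The replacement order matters: existing commas must be substituted before the delimiter is turned into commas.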
Why can't I sort columns in the output file numerically?
What happens if I often analyze similar types of files?
Compare saves that fingerprint so that the next time you analyze the same type of file, the key(s) are automatically selected. This allows you to skip this step and start the process of comparing right away.
Why is the output in .xlsx format?
How long does it take to process a file?
Processing speed is a function of file size, file configuration, options selected and the speed and memory of your computer.
If you are comparing large files with many columns (fields), we recommend selecting the option to output only differences and to exclude original data sheets.
Writing output is the most intensive part of the Compare process, so if you select the options to output all records and include original sheets, the process can take 2 or more hours.
Is there a maximum file size?
Each record (row) cannot exceed 255 fields (columns).
Each record (row) cannot be longer than 8,192 characters.
The maximum number of rows that Compare can process is limited by the memory on your machine. Note that an Excel worksheet cannot hold more than 1,048,576 rows.
The amount of available RAM on your computer may further limit file size that can be processed by Compare.
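The field and row-length limits above can be checked before running Compare. The Python sketch below is illustrative only. It assumes that fields contain no embedded commas and that line terminators do not count toward the 8,192 characters, neither of which is confirmed here. The function name is hypothetical.

```python
MAX_FIELDS = 255        # fields per record, from the limits above
MAX_ROW_CHARS = 8192    # characters per record, from the limits above


def rows_over_limits(lines):
    """Return (line_number, problem) pairs for rows that exceed either limit."""
    problems = []
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\r\n")
        # Counts delimiters only; assumes no commas inside field values.
        if row.count(",") + 1 > MAX_FIELDS:
            problems.append((number, "too many fields"))
        if len(row) > MAX_ROW_CHARS:
            problems.append((number, "row too long"))
    return problems
```

Passing an open file object, for example `rows_over_limits(open(path, encoding="utf-8"))`, checks one line at a time without loading the whole file into memory.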
Is my data secure?
You can safely process files containing confidential data, personally identifiable information (PII) or private health information (PHI).
Note: the output files will contain data from the input files, so these should also be treated with the same level of security and privacy controls as the input files.
What happens if input files or output files are open in Excel when running Compare?
This means that you must ensure Excel does not have input files open when selecting them. You must also ensure the output file is not opened in Excel when starting the compare. If you see errors related to this, just close the files in Excel and re-start the compare.
How does licensing work with Compare?
Compare is free to download and use with no limitations on time or number of computers. The use of Compare is subject to licensing terms and conditions that you can read here.
What are the system requirements to run Compare?
Compare works on Windows 10 or greater. Screen resolution of at least 1290 x 800 is required. Minimum of 8GB of RAM.
How do I report a bug, or provide other feedback?
Click “Help”, then “Support”. This opens your web browser and takes you to our support portal, where you can send us a support ticket.
I am getting the following error message: Unable to complete processing: Index was outside the bounds of the array.
This error occurs if an input file has rows with more fields than the header row. In this case, the software cannot determine which column the extra field belongs to.
If you get this error, check the input files to make sure all rows have the same number of columns as the header row.
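In a large file, the offending rows can be hard to find by eye. The Python sketch below lists every row whose field count differs from the header row. The function name is hypothetical, and the parsing uses Python's standard `csv` module, which may not match Compare's own parsing in every detail.

```python
import csv
import io


def rows_with_wrong_width(text: str):
    """Return (line_number, field_count) for rows that do not match the header width."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to check
    mismatches = []
    for row in reader:
        # Blank lines come back as empty lists and are skipped.
        if row and len(row) != len(header):
            mismatches.append((reader.line_num, len(row)))
    return mismatches
```

To check a file on disk, pass its contents, for example `rows_with_wrong_width(open(path, encoding="utf-8", newline="").read())`.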
Get Compare Today
Get Started Today
- Compare two comma-delimited files automatically
- Output to Excel
- Selection of single or multiple keys
- Choose to show all records or just differences
- Secure – all data remains on your computer
- Powerful – file size only limited to memory on your computer
We’re Here to Help
Contact us if you need support or have questions regarding Compare