Data Exports are a useful tool for comparing your Magento BI data to your source data in order to pinpoint data discrepancies in your reports, especially if the data discrepancy diagnostic checklist didn't help you pinpoint the problem. This article will walk you through a real-life example of how data discrepancies can be pinpointed using Data Exports. This will allow you to troubleshoot your own discrepancies quickly and efficiently.
Take this analysis, for example:
There’s a suspicious dip in November 2014. $500,780.94 in revenue? That doesn't sound right. You've confirmed that there’s more revenue showing for the month of November 2014 in your source database, and you've double-checked that the Revenue metric used in this report is correctly defined. It seems that the data in the Magento BI data warehouse is incomplete which can be confirmed using a Data Export.
To get started, click the gear in top right corner of the chart and then the Raw Export option in the dropdown menu. This will give you a raw export of the data behind the chart.
In the Raw Data Export menu, you can select the table to export from along with the columns to include in the export. Filters can also be applied to the result set.
In our example, the Revenue metric used on this report uses the order_total field defined on the orders table, using the date as its timestamp. In our export, we want to include all order_id values for November 2014 and their order_total. The Revenue metric doesn’t use any filters, but we'll add a filter to the export to limit the result set to just November 2014.
Here’s what the Raw Data Export menu looks like for this example:
Click Export Data to begin the export. A window with the details of the export, including the status, will display. Prepping the export takes a few minutes, which makes now a good time to perform an analogous extract of our source data for November 2014, including date, order_id, and the order_total. We'll open this file in Excel and leave it up, as we’ll come back to it shortly.
When the Download button appears on the Raw Data Exports window, click it to download the zip file containing the CSV file.
At this point, we need to get all the data into one sheet to find the problem. We'll import the CSV file (the export from Magento BI) into a different sheet of the Excel file containing our source data.
Now that all the data is in one place, we can look for the source of the discrepancy. Comparing the number of rows in each sheet will help us pinpoint the problem. Let's take a closer look at each situation.
If both systems have the same row count and the Revenue metric isn’t matching the source data, then the order_total must be off somewhere. It’s possible that the order_total field has been updated in your source database and Magento BI isn’t picking up these changes.
To confirm this, take a look at whether or not the order_total column is being rechecked. Head to the Data Warehouse Manager and click the orders table. You’ll see the recheck frequency listed in the ‘Changes?’ column. The order_total field should be set to recheck as often as it is expected to change; if it’s not, go ahead and set it to your desired recheck frequency.
If the recheck frequency is already set correctly, then something else is wrong. Refer to the Contacting Support section at the end of this article for next steps.
If the source database has more rows than Magento BI and the gap is greater than the number of orders that you can expect to come in during the length of an update cycle, there may be a connection issue. This means that Magento BI isn’t able to pull in new data from the source database, which can happen for several reasons.
Navigate to the Connections page and take a look at the status of the data source containing the order table:
If the source database has fewer rows than Magento BI, then it’s possible that rows are being deleted from the source database and Magento BI isn’t picking up these deletions. Deleting data can lead to discrepancies, lengthier update times, and a slew of logistical headaches, so we strongly recommend you don’t ever delete data unless it’s really necessary.
If, however, rows are deleted from the table, take a look at the recheck frequency on the primary key. Rechecking the primary key means that the table will be checked for deleted rows.
In the Data Warehouse Manager, primary key columns are marked with a key symbol. In our example, the primary key is the order_id column:
If the primary key is already set to be rechecked or rows are never deleted from this table, then you’ll need RJ support to pinpoint the problem. Refer to the following section for next steps.
If you aren't able to pinpoint the source of the problem, you'll need to loop in RJ Support. Before you submit a ticket, please do the following: