

And before we can answer that question, let’s take a quick look at what we mean when we say data preparation. Most of us know reporting is hard, but many aren’t sure why. Usually it comes down to the simple fact that data in source systems is not what we’ll call “reporting ready,” and this can be for a number of reasons.

One of them is that OLTP systems use a normalized data structure, but the types of queries that are needed for reports are very complex and require you to join a lot of data together. Reporting queries therefore work better on a denormalized data structure.

Another reason is that you usually want to report on things that don’t exist in the source system, so you have to create new data elements or calculated fields before you can produce the report.

Another example is the need to filter out old or invalid data. As systems change over time, data may no longer be meaningful, and you want to strip it out before it ends up in a report.

Lastly, you often need to join data not just from different tables within the same source system, but from entirely different source systems, in order to get a complete picture of the subject matter that you’re trying to report on.

If you’re going to decentralize, using only a reporting tool to do your data preparation, there are certain advantages that come with that decision. For one, you don’t need to get organizational buy-in or executive sponsorship; you just point the tool at some data and start working. You also don’t need to invest in other software or hardware. A reporting tool is pretty much all you need. There’s also no need to gather other departments and get everyone on the same page about naming conventions or anything else. Lastly, this is definitely the fastest path to get to initial reports.

There are some disadvantages, however. Whatever work you do, it’s going to stay in that reporting tool. In other words, you’re not going to be able to reuse that data preparation in other reporting tools very easily. You’re also going to be duplicating effort across departments, because different departments may be using the same data and trying to do the same types of transformations. You’re also likely to experience, in the long run, much longer development times than you would with a centralized solution where all of that work is automated once and then reused. Another challenge may be lower usability, depending on the expertise that you have in-house to do data preparation in the first place. This work is definitely possible in the reporting tools, but that’s not necessarily where you’re going to get the best end result. Lastly, and perhaps most concerning, if different people are doing data preparation in different ways, then it’s very possible that you will end up with different answers to the same questions.
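To make the idea of “reporting ready” data a little more concrete, here is a minimal sketch of three of the preparation steps described in this section: joining normalized tables, adding a calculated field, and filtering out old or invalid records. It uses Python’s built-in sqlite3 module with an in-memory database. The table names, column names, sample rows, and the build_reporting_rows helper are all hypothetical, made up purely for illustration, and a real setup would typically also pull from more than one source system.

```python
import sqlite3


def build_reporting_rows(conn, cutoff_date):
    """Return denormalized, filtered rows with a calculated revenue field.

    Hypothetical helper: the schema it queries is an assumption for this sketch.
    """
    query = """
        SELECT c.name AS customer,
               o.order_date,
               o.quantity * o.unit_price AS revenue  -- calculated field
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id  -- denormalize via join
        WHERE o.order_date >= ?                      -- drop old records
          AND o.status <> 'invalid'                  -- drop invalid records
        ORDER BY o.order_date
    """
    return conn.execute(query, (cutoff_date,)).fetchall()


# Stand-in for a normalized OLTP source system (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        order_date TEXT,
        quantity INTEGER,
        unit_price REAL,
        status TEXT
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES
        (1, 1, '2015-03-01', 2, 10.0, 'ok'),
        (2, 1, '2023-06-15', 3, 20.0, 'ok'),
        (3, 2, '2023-07-01', 1, 50.0, 'invalid'),
        (4, 2, '2023-08-20', 4, 5.0, 'ok');
""")

rows = build_reporting_rows(conn, "2020-01-01")
for row in rows:
    print(row)
```

Whether logic like this lives inside each reporting tool or in one shared place is exactly the decentralized-versus-centralized trade-off: if two departments each write their own version with a slightly different cutoff date or definition of “invalid,” they can end up with different revenue figures from the same source data.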
