Three mistakes to avoid when building a business intelligence solution
Article by Denodo.
Business intelligence (BI) projects are a high priority for companies seeking to drive better, faster, data-driven decisions and actions based on high-quality, high-value reports.
However, new BI implementations are complex and inherently risky, and BI team leads must identify all associated data-quality risks upfront while engaging all stakeholders from the top down.
They must ensure that executive-level buy-in filters through every department involved in the project, that the infrastructure can support the many data sources available to a modern enterprise, and that there is sufficient support in place for real-time analytics.
Taking the time to carefully plan a BI project and create a clear roadmap will lead to a better, more functional BI deployment.
Here, we address three common mistakes that companies should avoid so that the implementation process is smooth and the project is ultimately successful.
The first mistake is trying to modernise a BI solution while holding onto core components that are dated and may no longer be fit for the task. The challenge is that companies often have different tolerance levels for retiring ageing solutions. Deciding when to phase out or replace a solution can depend on critical infrastructure, data sources, how tightly it is integrated with legacy systems, and so on.
Also, certain solutions stay viable longer than others. BI implementations designed to be reporting-centric, or built around batch-oriented extract, transform, and load (ETL) processes feeding data warehouses, tend to age fast, as they do not support many modern data types.
ETL-based data integration is also resource- and time-intensive, limiting data ingestion and delivery to scheduled batches. This approach cannot support many modern use cases, such as mobile dashboards or live web applications.
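The staleness inherent in batch ETL is easy to see in miniature. The sketch below is purely illustrative, assuming a hypothetical "orders" source table and a warehouse copy, both in SQLite; it is not any vendor's API, only the general batch pattern:

```python
# Minimal sketch of a batch-oriented ETL job. The table names and the
# transform are hypothetical; the point is that the warehouse only
# reflects the source as of the last scheduled batch run.
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE warehouse_orders (id INTEGER, amount REAL)")

def run_batch():
    # Extract: read the whole source table.
    rows = src.execute("SELECT id, amount FROM orders").fetchall()
    # Transform: e.g. round amounts to whole currency units.
    rows = [(i, round(a)) for i, a in rows]
    # Load: rebuild the warehouse copy from scratch.
    wh.execute("DELETE FROM warehouse_orders")
    wh.executemany("INSERT INTO warehouse_orders VALUES (?, ?)", rows)
    wh.commit()

run_batch()                                         # e.g. a nightly job
src.execute("INSERT INTO orders VALUES (3, 5.0)")   # new data arrives...
# ...but it stays invisible to reports until the next scheduled batch:
stale = wh.execute("SELECT COUNT(*) FROM warehouse_orders").fetchone()[0]
print(stale)  # 2, not 3
```

A dashboard reading `warehouse_orders` between runs is always reporting on yesterday's data, which is exactly why this pattern struggles with live web applications.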
Traditional BI solutions that are embedded in ERP systems, as well as some simpler, disparate reporting tools that support limited uses, also have shorter lifespans.
Some companies still rely on a spreadsheet tool for basic data analysis, performing rudimentary data integration by manually cutting and pasting between files stored on different computers or emailed back and forth. In such cases, companies should look at augmenting or replacing these tools with integrated, modern analytics technologies.
A modern data architecture enables interoperability across tools through shared data models and streamlined data integration. This speeds up business processes and reporting, which increases efficiency.
For instance, data created or prepared in one product can be extended to support other functions using data virtualisation, enabling the organisation to share a virtual view of the data without actually moving the source data. Once this virtual data source is created, it can be shared across the analytics workflow, including with emerging augmented analytics tools.
Such tools leverage machine learning (ML) and natural language processing (NLP) technologies to generate business-friendly, intuitive insights. Advanced data architecture provides the foundation for business users to leverage real-time information for timely decision making.
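The "virtual view without moving the data" idea has a familiar single-database analogue: a SQL view. Real data-virtualisation platforms federate views across many systems, but the sketch below, using only Python's standard `sqlite3` module and invented table names, illustrates the core property that consumers query a derived shape of the data while no rows are copied:

```python
# A SQL view as a stand-in for a virtual data source: the "prepared"
# dataset is defined once and never materialised, so source changes are
# visible to every consumer immediately. Table and view names are
# illustrative only.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?)",
               [("EMEA", 100.0), ("EMEA", 50.0), ("APAC", 70.0)])

# Define the shared, virtual dataset; no data is moved or duplicated.
db.execute("""CREATE VIEW sales_by_region AS
              SELECT region, SUM(amount) AS total
              FROM sales GROUP BY region""")

print(db.execute("SELECT * FROM sales_by_region ORDER BY region").fetchall())
# -> [('APAC', 70.0), ('EMEA', 150.0)]

# New source rows appear in the virtual view with no reload step:
db.execute("INSERT INTO sales VALUES ('APAC', 30.0)")
print(db.execute(
    "SELECT total FROM sales_by_region WHERE region='APAC'").fetchone())
# -> (100.0,)
```

Downstream tools, including augmented analytics tools, can then consume `sales_by_region` as if it were an ordinary table.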
Businesses today ingest data from many varied sources to gain deeper insights into customer behaviour, market opportunities, and competition. A BI infrastructure can incorporate a wide variety of data sources and ingestion points.
These include databases, social media feeds, and streaming video, among other sources, in both structured and unstructured formats. Companies must account for all of these, and plan for new possible sources, in their BI strategy.
The second mistake is attempting to collect all of this data into a single repository. An ETL-based system, even a modernised one, is built around collecting all available data and loading it into a data warehouse. Such an attempt will fail because traditional data warehouses cannot support modern data types, such as unstructured data.
Companies may attempt to work around such issues with data adapters or connectors, which establish point-to-point integrations between one source and one or more targets. However, these solutions expose other shortcomings. Like ETL processes, point-to-point integrations are challenging to manage, and they add more complexity to the data-integration problem than they remove: with N sources and M consuming targets, the number of connections to build and maintain approaches N × M, whereas a single intermediate layer requires only N + M.
For a modern BI platform to work, companies need to ensure connectivity with a diverse range of data sources: structured and unstructured, relational and non-relational, on-premises and in the cloud.
The rapid growth of data creates challenges beyond sheer volume. Before companies can harness their data, they must deal with its variety, the speed at which it is generated, and the time it takes to move around the organisation.
The current growth is mainly in unstructured data: data often characterised as 'human-generated information', such as high-definition images and videos, social media posts, and phone and chat logs.
This increase in data ingestion points highlights the need for a more agile way to integrate data in real-time so that analysis can be performed at the speed of origination. To make this possible, companies will need modern data architectures that allow for agility and real-time access to a broad base of data sources.
The third mistake is staying stuck with older IT infrastructures, including backend systems that are not set up for real-time data access. Companies with such systems must first replicate data from all their disparate sources into yet another repository, such as a data lake.
This means that companies take more time than necessary to integrate and process data, making it impossible for them to perform real-time analytics. This approach can also lead to data duplication, loss of data context, and added latency. Worse, the replicated data is always at least slightly out of sync with the original sources, because data is continuously created and collected as the business runs.
The best solution is to enable each BI reporting solution to interface directly with each applicable data source, so there is no latency in data access, analysis, and reporting. With data virtualisation, companies can establish an intermediate data access layer that can source a wide variety of data while abstracting away all data access complexities from the consuming applications, enabling real-time analytics. Data virtualisation provides a holistic, real-time view of the integrated data without replicating any source data.
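To make the intermediate-layer idea concrete, here is a hedged sketch in Python. The names (`SqlAdapter`, `ApiAdapter`, `AccessLayer`) and sources are invented for illustration, not a real product's API; the point is that each adapter hides how its source is reached, and every query fetches live data on demand, so nothing is replicated and results cannot go stale:

```python
# Illustrative intermediate data-access layer: one entry point for the
# BI tool, many heterogeneous sources behind it, all read on demand.
import sqlite3

class SqlAdapter:
    """Wraps a relational source behind a fetch-on-demand interface."""
    def __init__(self, conn, query):
        self.conn, self.query = conn, query
    def fetch(self):
        return [dict(zip(("id", "name"), row))
                for row in self.conn.execute(self.query)]

class ApiAdapter:
    """Stands in for a REST/JSON source; here backed by any callable."""
    def __init__(self, call):
        self.call = call
    def fetch(self):
        return self.call()

class AccessLayer:
    """Single interface the consuming application talks to,
    regardless of where or how the data is stored."""
    def __init__(self):
        self.sources = {}
    def register(self, name, adapter):
        self.sources[name] = adapter
    def query(self, name):
        # Live read against the source; no intermediate copy is kept.
        return self.sources[name].fetch()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
db.execute("INSERT INTO customers VALUES (1, 'Acme')")

layer = AccessLayer()
layer.register("customers", SqlAdapter(db, "SELECT id, name FROM customers"))
layer.register("tickets", ApiAdapter(lambda: [{"id": 7, "name": "Login bug"}]))

print(layer.query("customers") + layer.query("tickets"))
```

Because every `query` call reads the source directly, a report built on this layer reflects the state of the business at the moment it runs, which is the property batch replication cannot provide.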
BI solutions empower decision-makers with data-driven insights. In today’s highly competitive landscapes, making informed decisions promptly can make a critical difference between success and failure.
However, BI projects can be costly and hard to predict, so organisations must get it right with BI early in the process. By considering data virtualisation at the beginning of a BI project, companies will be well on their way to doing so.