Data Analytics Trends in 2019
In recent years, data volumes have increased dramatically. This has created major challenges for traditional analytics platforms in terms of storage, management and cost. The trend will continue and accelerate, requiring companies to find better and more cost-effective ways to manage their data loads.
Data Lakes vs Data Hubs
The answer to the massive increase in data volumes was the Data Lake. The main aim was to use cheap storage to hold the raw data and only extract, transform and load (ETL) data for immediate needs. This reduces the cost of storage as only the data that is required is now moved to the centralised data warehouse or higher aggregation levels within the lake.
The problem is that there is no clear consensus on what a Data Lake is and how it should be architected. Most of the focus is on the storage of raw data rather than on usage, servicing, security and privacy. As a result, many are questioning the value of a Data Lake.
The Data Lake concept will therefore become less popular and is likely to be replaced by well-architected Data Hubs. The Data Hub is the natural progression from the Data Lake: it not only focuses on storage but also provides layers of fast, structured data in a cloud environment and a servicing layer that includes self-service as well as API access.
The focus for large-scale data users will shift to how the hub can deliver the various data sources and insights effectively to the business and customers in near real time. This will create the much-needed value proposition that was envisaged for the Lake.
De-Centralisation of Data
Cloud storage has made it easy to store vast amounts of data at relatively low cost. As individual teams take advantage of this, many pockets of raw data are forming across the organisation.
The question every organisation will have to ask itself is whether it wants to hold on to a centralised analytics platform in the form of a data warehouse, or whether a decentralised approach is better.
The advantage of the decentralised approach is that the data is stored and maintained by the owners of this data. They can best manage quality, retention, privacy and security. However, to allow synergies across the organisation a centralised framework for governance, data discovery and data servicing must be in place.
The centrepiece of this framework will be an Information Catalogue that integrates the data on a semantic level and provides tools that allow Data Scientists and business people to access the data across the organisation. Analytics sandboxes will be required that can provide masked data for analytics modelling and pattern development.
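As a rough illustration of what "masked data" for a sandbox could look like, the sketch below pseudonymises sensitive fields with a salted hash before records are handed to analysts. The field names, the salt and the choice of truncated SHA-256 are assumptions made for this example, not a prescribed method; a production setup would follow the organisation's own privacy rules.

```python
import hashlib

# Assumption: which fields count as sensitive is decided by the data owner.
SENSITIVE_FIELDS = {"name", "email"}


def mask_value(value: str, salt: str) -> str:
    """Replace a value with a salted SHA-256 digest, truncated to 12 hex characters."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def mask_record(record: dict, salt: str) -> dict:
    """Return a copy of the record with the sensitive fields pseudonymised."""
    return {
        key: mask_value(str(value), salt) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


# Hypothetical source rows, invented for the example.
customers = [
    {"customer_id": 1, "name": "Alice Example", "email": "alice@example.com", "spend": 120.5},
    {"customer_id": 2, "name": "Bob Example", "email": "bob@example.com", "spend": 80.0},
]

# The sandbox receives only the masked copies.
sandbox_rows = [mask_record(row, salt="demo-salt") for row in customers]
```

Because the same input and salt always produce the same token, masked records can still be joined and counted in the sandbox without exposing the original values.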
The requirement for well-designed Data Governance frameworks will continue to grow, with decentralised Data Hubs and huge data volumes on one side and increasingly demanding privacy and security regulations on the other. As a result, it will be critical for organisations to invest in new organisational structures with clearly defined accountabilities for data as an asset.
Spearheading the changes in Data Governance are roles like Chief Data Officers (CDOs), who will oversee a number of Data Stewards (also known as Data Curators) in their domain. The stewards ensure the quality, management and discovery of decentralised data within the various business domains.
In a “best in class” scenario, data is stored and managed in a decentralised framework across the organisation while Governance is centralised. In that scenario it will be critical that the data sources can be embedded into a centralised catalogue. New tools are coming onto the market that help to discover data across the organisation and identify synergies automatically.
Tools and processes must be in place that allow staff to create productionised data pipelines that can feed from different decentralised data sources to provide business and customer insights.
The trend of moving analytics platforms from on-premise to cloud will continue. Cloud offerings provide more flexible and often more cost-effective storage solutions, and they have a further advantage: it is more effective to bring the processing to the data than to bring the data to the processing. Serverless compute within a cloud environment, and the ability to spin up massive clustered analytics platforms on demand for short periods of time, allow users to decentralise the analytics workload in a cost-effective way.
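To make the idea of a centralised catalogue over decentralised sources more concrete, here is a minimal in-memory sketch. The class names, the tag-based notion of "semantics" and the example sources are all hypothetical; commercial catalogue tools do far more, so this only illustrates the principle of registering sources centrally while the data stays with its owners.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Metadata about one decentralised data source; the data itself stays with its owner."""
    name: str
    owner: str
    location: str
    tags: set = field(default_factory=set)


class InformationCatalogue:
    """Central registry that holds metadata only, never the underlying data."""

    def __init__(self):
        self._entries = {}

    def register(self, entry: CatalogueEntry) -> None:
        self._entries[entry.name] = entry

    def find_by_tag(self, tag: str) -> list:
        """Names of all sources that carry the given semantic tag, sorted alphabetically."""
        return sorted(e.name for e in self._entries.values() if tag in e.tags)

    def shared_tags(self, first: str, second: str) -> set:
        """Tags two sources have in common: a crude hint at possible synergies."""
        return self._entries[first].tags & self._entries[second].tags


# Hypothetical sources owned by different teams.
catalogue = InformationCatalogue()
catalogue.register(CatalogueEntry("web_orders", "e-commerce", "bucket-a/orders", {"customer", "order"}))
catalogue.register(CatalogueEntry("crm_contacts", "sales", "bucket-b/crm", {"customer", "contact"}))
catalogue.register(CatalogueEntry("fleet_telemetry", "logistics", "bucket-c/fleet", {"vehicle"}))

customer_sources = catalogue.find_by_tag("customer")
overlap = catalogue.shared_tags("web_orders", "crm_contacts")
```

Here a search for the tag "customer" surfaces sources owned by two different teams, which is the kind of cross-domain link a catalogue is meant to expose.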
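A pipeline of this kind can be sketched in a few lines. In the example below, two loader functions stand in for data sources owned by different teams, and a final step combines them into one customer view. The function names, fields and figures are invented for illustration, and a real pipeline would add scheduling, error handling and monitoring.

```python
def load_sales():
    """Stands in for a source maintained by a sales team (hypothetical data)."""
    return [
        {"customer_id": 1, "amount": 100.0},
        {"customer_id": 1, "amount": 50.0},
        {"customer_id": 2, "amount": 75.0},
    ]


def load_support():
    """Stands in for a source maintained by a support team (hypothetical data)."""
    return [
        {"customer_id": 1, "tickets": 3},
        {"customer_id": 2, "tickets": 0},
    ]


def total_spend(sales):
    """Aggregate sales rows into total spend per customer."""
    totals = {}
    for row in sales:
        totals[row["customer_id"]] = totals.get(row["customer_id"], 0.0) + row["amount"]
    return totals


def build_insight(sales, support):
    """Combine both sources into one row per customer found in the support data."""
    totals = total_spend(sales)
    return [
        {
            "customer_id": row["customer_id"],
            "total_spend": totals.get(row["customer_id"], 0.0),
            "tickets": row["tickets"],
        }
        for row in support
    ]


insight = build_insight(load_sales(), load_support())
```

Each source keeps its own loader, so the owning team can change how its data is stored without the combining step needing to know.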
Organisations have to be careful not to fall into the trap of approaching their cloud strategy with the monolithic mindset of the last 20 years. A successful strategy is to develop a layered data architecture that hooks decentralised data aggregation levels into a centralised data delivery framework allowing all parts of the organisation to access data appropriate to their clearance and data requirements.
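One way to picture "access appropriate to their clearance" is a delivery layer that checks a user's clearance against the level each dataset requires. The sketch below assumes three clearance levels and three datasets at different aggregation levels; all names and levels are hypothetical and would be defined by the organisation's own governance framework.

```python
from enum import IntEnum


class Clearance(IntEnum):
    """Assumed clearance levels; a higher value grants access to more sensitive data."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


# Hypothetical datasets, from highly aggregated to raw, with the clearance each requires.
DATASETS = {
    "sales_summary": Clearance.PUBLIC,
    "customer_aggregates": Clearance.INTERNAL,
    "raw_transactions": Clearance.RESTRICTED,
}


def accessible_datasets(user_clearance: Clearance) -> list:
    """Names of the datasets a user may access, sorted alphabetically."""
    return sorted(
        name for name, required in DATASETS.items() if required <= user_clearance
    )


analyst_view = accessible_datasets(Clearance.INTERNAL)
```

In this sketch an analyst with INTERNAL clearance sees the summary and aggregate layers but not the raw transactions.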
Wider Experimentation with Machine Learning and Artificial Intelligence
Machine Learning (ML) and Artificial Intelligence (AI) have been buzzwords for a long time now. Many R&D focused organisations have productionised ML and AI implementations, but the adoption of this technology has been slow.
In the coming years, many more organisations will start experimenting with ML and will find new use cases in which the technology adds value. In data analytics, this will require new skill sets in BI departments. Data Engineers will need advanced knowledge of modern analytics technologies such as Hadoop and Spark, and of a range of Machine Learning algorithms.
The spectrum of the different ML training models is quite diverse and the innovation rate is still quite high in this field. As a result, any investment in this space needs to be tightly embedded in the long term data strategy in order to make sure that the value added is clearly identified before starting a new project.