Big data is a large, rapidly growing collection of structured, semi-structured, and unstructured data that most traditional data management routines, such as extract, transform, load (ETL), cannot process in a way that yields insight.
In this post, we briefly describe what big data is, walk through its reference architecture, and discuss the opportunities that big data methods open up for telematics.
Big Data: definition
Put simply, big data refers to a dataset that is huge and grows rapidly over time. However, the term “big data” covers more than enormous datasets. It also includes the types and structure of the data, its characteristics, the tools required to collect it, and the data sources themselves.
To extract the needed information from big data, appropriate tools are required for data collection, storage, and processing. An example of a typical process flow for Big Data Analytics is shown in the figure below [Prasad, Agarwal 2016].
Generally, a big data workflow includes the following steps [Furht, Villanustre 2016]:
- Collection – structured, semi-structured, and unstructured data from multiple sources.
- Ingestion – loading most of the data into a single data store.
- Discovery and cleansing – understanding, cleaning up, and formatting the content.
- Integration – linking, entity extraction, entity resolution, indexing, and data fusion.
- Analysis – intelligence, statistics, predictive and text analytics, machine learning.
- Delivery – querying, visualization, real-time delivery on enterprise-class availability.
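The workflow steps listed above can be sketched as a chain of small functions. This is only a toy illustration, not how any particular platform implements them: the sample records, field names, and function names are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical raw records from three sources; field names are illustrative only.
RAW_SOURCES = {
    "tracker": [{"device": "A1", "speed": "62"}, {"device": "a1 ", "speed": "bad"}],
    "crm": [{"device": "A1", "owner": "Acme"}],
    "logs": ["A1 harsh_brake", "B2 idle"],
}

def collect(sources):
    """Collection: pull records from every source, tagging each with its origin."""
    for name, records in sources.items():
        for record in records:
            yield name, record

def ingest(tagged):
    """Ingestion: load everything into a single store (here, a plain list)."""
    return list(tagged)

def cleanse(store):
    """Discovery and cleansing: normalize formats and drop unparseable rows."""
    clean = []
    for origin, record in store:
        if isinstance(record, str):  # unstructured log line: "<device> <event>"
            device, event = record.split(maxsplit=1)
            record = {"device": device, "event": event}
        record = dict(record, device=record["device"].strip().upper())
        if "speed" in record:
            if not record["speed"].isdigit():
                continue  # discard rows whose speed cannot be parsed
            record["speed"] = int(record["speed"])
        clean.append((origin, record))
    return clean

def integrate(clean):
    """Linking and entity resolution: merge records that share a device id."""
    entities = {}
    for _origin, record in clean:
        entities.setdefault(record["device"], {}).update(record)
    return entities

def analyze(entities):
    """Analysis: a trivial statistic, the number of devices per event type."""
    return Counter(e["event"] for e in entities.values() if "event" in e)

def deliver(result):
    """Delivery: answer a query as a sorted list of (event, count) pairs."""
    return sorted(result.items())

entities = integrate(cleanse(ingest(collect(RAW_SOURCES))))
report = deliver(analyze(entities))
print(report)
```

In a real deployment each stage would run on separate, scalable infrastructure. The point here is only the order of the stages and what each one hands to the next.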
Big data technologies form a new generation of architectures and tools designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and analysis. They include:
- Cloud computing platforms.
- Distributed file systems and databases.
- Scalable storage systems.
- Data mining (tools and techniques).
- Massively Parallel Processing (MPP).
Methods for mining and querying big data differ substantially from the conventional methods of statistical analysis applied to relatively small data sets.
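One way to see the difference: a conventional script loads the whole data set into memory and computes a statistic in one pass, whereas a parallel system summarizes each partition independently and then merges the summaries. The sketch below imitates that pattern on a single machine with a thread pool. The function names and the sample partitions are assumptions for illustration, and a real MPP engine would distribute the partitions across nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def partial_stats(partition):
    """Map step: summarize one partition as (count, sum, sum of squares)."""
    n = len(partition)
    total = sum(partition)
    squares = sum(x * x for x in partition)
    return n, total, squares

def merge(a, b):
    """Reduce step: two summaries combine by element-wise addition."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]

def distributed_mean_variance(partitions, workers=4):
    """Compute mean and population variance without concatenating partitions."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        summaries = list(pool.map(partial_stats, partitions))
    n, total, squares = reduce(merge, summaries, (0, 0.0, 0.0))
    if n == 0:
        raise ValueError("no data in any partition")
    mean = total / n
    variance = squares / n - mean * mean  # population variance
    return mean, variance

# Hypothetical partitions, as if the data lived on three different nodes.
partitions = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0]]
mean, variance = distributed_mean_variance(partitions)
print(mean, variance)
```

Only the small per-partition summaries travel between workers, which is what makes the approach scale. Note that the sum-of-squares formula can lose precision on large values, so production systems usually rely on numerically stabler merge formulas.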
Big Data: architecture
The Big Data Reference Architecture proposed by NIST, shown in the figure below, is vendor-neutral and can be used by any organization aiming to develop a big data architecture.
The architecture is organized around five major roles (Data Provider, System Orchestrator, Big Data Application Provider, Big Data Framework Provider, and Data Consumer) and multiple sub-roles, aligned along two axes that represent the two Big Data value chains: Information Value and Information Technology.
System Orchestration is the automated arrangement, coordination, and management of computer systems, middleware, and services. Orchestration allows the different applications, data, and infrastructure components of the Big Data environment to work together efficiently. The Data Provider role introduces new data or information into the Big Data system for discovery, access, and transformation. The data can originate from a wide variety of sources and is transferred from the Data Provider to the Big Data Application Provider.
The Big Data Application Provider contains the business logic and functionality necessary to transform the data into the required results. The main goal of this component is to extract value from the input data. In turn, the Big Data Framework Provider offers the resources and services that the Big Data Application Provider uses and supplies the core infrastructure of the Big Data Architecture.
Depending on the case, the Data Consumer can be either an end user or another system. In principle, this role can be considered the mirror image of the Data Provider. The Data Consumer uses the interfaces and services provided by the Big Data Application Provider to access the information. These interfaces can include data retrieval, reporting, and data rendering.
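To make the division of responsibilities concrete, here is a minimal sketch that models the five roles as small Python classes and one function. It is an illustration only: NIST does not prescribe any code-level interface, so every class, method, and field name below, as well as the 90 km/h threshold, is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

@dataclass
class DataProvider:
    """Introduces new data into the system."""
    records: List[dict]

    def provide(self) -> Iterable[dict]:
        return iter(self.records)

@dataclass
class FrameworkProvider:
    """Supplies infrastructure; here it is just an in-memory store."""
    storage: List[dict] = field(default_factory=list)

    def store(self, rows: Iterable[dict]) -> None:
        self.storage.extend(rows)

@dataclass
class ApplicationProvider:
    """Holds the business logic that turns input data into results."""
    framework: FrameworkProvider
    transform: Callable[[dict], dict]

    def load(self, provider: DataProvider) -> None:
        self.framework.store(self.transform(r) for r in provider.provide())

    def retrieve(self, predicate: Callable[[dict], bool]) -> List[dict]:
        return [r for r in self.framework.storage if predicate(r)]

class DataConsumer:
    """End user or downstream system that reads results via the application."""

    def report(self, app: ApplicationProvider,
               predicate: Callable[[dict], bool]) -> List[dict]:
        return app.retrieve(predicate)

def orchestrate(provider, app, consumer, predicate):
    """System Orchestrator: wires the roles together and runs the flow."""
    app.load(provider)
    return consumer.report(app, predicate)

# Hypothetical telematics records; the speeding threshold is an assumption.
provider = DataProvider([{"vehicle": "A1", "kmh": 95}, {"vehicle": "B2", "kmh": 48}])
app = ApplicationProvider(FrameworkProvider(),
                          lambda r: {**r, "speeding": r["kmh"] > 90})
result = orchestrate(provider, app, DataConsumer(), lambda r: r["speeding"])
print(result)
```

The Data Consumer never touches the storage directly. It only goes through the retrieval interface of the Application Provider, which mirrors the way the Data Provider only hands data in.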
Big Data: benefits for Telematics
Big data has already spread into a wide range of industries, improving their workflows and extending their functionality.
Telematics is no exception: telematics platforms automate fleet management, connected devices monitor vehicle and driver performance, and fleet owners and managers can see in real time how their fleets are performing.
In telematics, Big Data approaches significantly extend platform functionality, for instance with smart route optimization, eco-driving (driver behavior monitoring), and various types of alerts, to name a few. Let's briefly describe a couple of these features, using the Navixy telematics platform as an example.
Imagine that you need to deliver packages to several addresses in different parts of the city, and each address has its own delivery time. How do you quickly find the best and most convenient order in which to visit each client? This is what the smart route optimization feature is for. The system automatically determines the order of the points in a route task, taking into account the address, time, and start point of the task.
This feature helps reduce fuel costs, complete tasks faster, and improve productivity. The system also takes the task start time into account. For example, if one of the points has a later time interval than the others, that point will always be last in the queue. Navixy can optimize up to 15 points within one route task.
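The ordering behavior described above can be imitated with a simple heuristic. The sketch below is not Navixy's algorithm, which is not documented here. It assumes planar coordinates, a single start time per delivery interval, and a greedy rule: serve the earliest interval first and, among points with the same interval, go to the nearest one. All names are hypothetical.

```python
from math import hypot
from typing import List, NamedTuple, Tuple

MAX_POINTS = 15  # the per-task limit mentioned in the text

class Stop(NamedTuple):
    name: str
    x: float           # planar coordinates, a simplifying assumption
    y: float
    window_start: int  # start of the delivery interval, minutes after midnight

def order_stops(start: Tuple[float, float], stops: List[Stop]) -> List[Stop]:
    """Greedy ordering: earliest interval first, nearest stop breaks ties."""
    if len(stops) > MAX_POINTS:
        raise ValueError(f"at most {MAX_POINTS} points per route task")
    remaining = list(stops)
    position = start
    route = []
    while remaining:
        earliest = min(s.window_start for s in remaining)
        candidates = [s for s in remaining if s.window_start == earliest]
        nxt = min(candidates,
                  key=lambda s: hypot(s.x - position[0], s.y - position[1]))
        route.append(nxt)
        remaining.remove(nxt)
        position = (nxt.x, nxt.y)
    return route

# Hypothetical task: D is closest to the start but has the latest interval.
stops = [
    Stop("A", 5.0, 0.0, 540),
    Stop("B", 1.0, 0.0, 540),
    Stop("C", 2.0, 0.0, 600),
    Stop("D", 0.5, 0.0, 720),
]
route = order_stops((0.0, 0.0), stops)
print([s.name for s in route])
```

A greedy rule like this is fast but not guaranteed to find the shortest route. With at most 15 points, a production optimizer could afford a more thorough search that also models travel times and interval end times.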
Eco-driving functionality provides data-driven monitoring and analysis of driver behavior. There are three main types of violations: speeding, harsh driving, and idling. Navixy receives data on each committed violation from the vehicle's GPS tracker and visualizes it for analysis. Each type of violation carries its own number of penalty points, which the system uses to compile the driver's rating.
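A penalty-point rating of this kind can be sketched in a few lines. The point values, the normalization per 100 km, and the 100-point scale below are assumptions chosen for illustration, not Navixy's actual settings or formula.

```python
from collections import Counter

# Hypothetical penalty points per violation type (not Navixy's real values).
PENALTY_POINTS = {"speeding": 5, "harsh_driving": 3, "idling": 1}

def driver_rating(violations, distance_km, max_score=100.0):
    """Return max_score minus penalty points per 100 km, floored at zero."""
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    counts = Counter(violations)
    unknown = set(counts) - set(PENALTY_POINTS)
    if unknown:
        raise ValueError(f"unknown violation types: {sorted(unknown)}")
    penalty = sum(PENALTY_POINTS[kind] * n for kind, n in counts.items())
    penalty_per_100km = penalty * 100 / distance_km
    return max(0.0, max_score - penalty_per_100km)

# Hypothetical trip: four violations reported by the tracker over 200 km.
events = ["speeding", "speeding", "idling", "harsh_driving"]
rating = driver_rating(events, distance_km=200)
print(rating)
```

Normalizing by distance keeps ratings comparable between drivers who cover very different mileage. Whether a given platform normalizes this way is an assumption of the sketch.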
Overall, there is little doubt that, as the technology advances, Big Data methods will spread even deeper into telematics solutions and further improve functionality, quality of service, and business efficiency.
Bakshi R. P., Agarwal S. Comparative Study of Big Data Computing and Storage Tools: A Review. 2016.
Furht B., Villanustre F. Big Data Technologies and Applications. 2016.
Wu Y. et al. Big Data and Computational Intelligence in Networking. 2018.