Real-time analytics system increases speed by up to 100 times

3rd August 2016
Daisy Stapley-Bunten

Hitachi has announced the development of a database management system optimised for the high-speed embedded memory in field-programmable gate array (FPGA) hardware, together with technology for high-performance parallel data processing in FPGAs. Using these technologies, data analytics speed was increased by up to 100 times compared with processing without them.

Further, the two technologies were combined with 'Pentaho Business Analytics', business analytics software developed by Pentaho Corporation (a Hitachi Group company), to visualise analytics results, and with flash storage for data storage, to create a prototype real-time data analytics system. The prototype will contribute to realising self-service data analytics, enabling employees in the field to run analytics on massive volumes of business data easily and quickly.

In recent years, self-service analytics, which allows employees in the field to easily conduct big data analytics that is usually carried out by experts such as data scientists, has been gaining attention. One example is a financial advisor who listens to a customer's requirements, enters the information into the analytics system on the spot, and is able to suggest a financial product that matches the customer's needs. As this example suggests, a system for self-service data analytics needs to produce results quickly, and so must have high processing capability for both reading and analysing data.

Using flash storage instead of a hard disk drive to store data has increased data read performance by 10 to 100 times. Data analysis performance, however, has been unable to keep up with data read performance, creating a bottleneck in the analytics.

To overcome this issue, Hitachi developed a database management system optimised for the high-speed embedded memory in the FPGA, along with technology for high-speed parallel data processing in FPGAs, and succeeded in increasing data analytics speed by up to 100 times. A prototype real-time data analytics system was then built by combining these two technologies with Pentaho Business Analytics for visualisation of results and flash storage for data storage.

The two technologies are outlined below.

1. Database management system optimised for high-speed memory in hardware (FPGA)

An FPGA is equipped with small but high-speed internal memory (a few MB) and is connected to large but low-speed external memory (a few GB). In the data format used in column-oriented (columnar) databases, the management information that shows the location of data is larger than the internal memory and so had to be stored in the external memory. This management information, however, is needed to locate the data and is referred to every time the data is accessed, so keeping it in the large but low-speed external memory slows down processing. In this research, a database management system was developed in which the database is subdivided into multiple data segments, so that the management information of each segment can be handled in the FPGA internal memory. The segments are stored in the flash storage and processed within the FPGA one segment at a time. This database management system enables high-speed processing. (13 patent applications have been filed.)
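
The segmentation idea can be illustrated in software with a minimal Python sketch. It is not Hitachi's implementation: the article does not say what the management information contains, so the sketch assumes one fixed-size location entry per value, and the names `INTERNAL_MEMORY_BYTES`, `BYTES_PER_INDEX_ENTRY`, `Segment` and `split_into_segments` are all hypothetical. It only shows how a columnar table might be cut into segments whose management information stays within a small memory budget.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed figures for illustration only; the article says just "a few MB".
INTERNAL_MEMORY_BYTES = 4 * 1024 * 1024
# Hypothetical cost of one location entry in the management information.
BYTES_PER_INDEX_ENTRY = 16


@dataclass
class Segment:
    """A horizontal slice of a columnar table (hypothetical structure)."""
    first_row: int
    columns: Dict[str, List[int]]

    @property
    def row_count(self) -> int:
        return len(next(iter(self.columns.values())))

    def management_info_bytes(self) -> int:
        # Assumption: one location entry per value in every column.
        return self.row_count * len(self.columns) * BYTES_PER_INDEX_ENTRY


def split_into_segments(table: Dict[str, List[int]],
                        budget: int = INTERNAL_MEMORY_BYTES) -> List[Segment]:
    """Cut a non-empty columnar table into segments whose management
    information fits within `budget` bytes."""
    n_rows = len(next(iter(table.values())))
    rows_per_segment = max(1, budget // (len(table) * BYTES_PER_INDEX_ENTRY))
    return [
        Segment(start, {name: values[start:start + rows_per_segment]
                        for name, values in table.items()})
        for start in range(0, n_rows, rows_per_segment)
    ]
```

Each segment can then be loaded and processed on its own, so lookups of data locations never have to leave the fast memory in this simplified model.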

2. Technology for high-speed parallel data processing by the hardware (FPGA)

Parallel data processing is widely adopted to achieve high-speed processing. In a column-oriented (columnar) database, however, this is difficult because the processing of one column must finish before the next column can be processed. To overcome this, a column processing method was developed that enables a set number of columns to be processed in turn. Parallel data processing was realised by using this method together with a data filter circuit, which selects the data for analytics, and an aggregation circuit, which groups the data and calculates values such as totals or averages.
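
As a rough software analogue of the filter and aggregation stages, the following Python sketch selects rows, groups them by a key column and computes totals and averages, handling a set number of value columns per pass. It is an illustration under stated assumptions, not the FPGA circuit design: the function names and the `batch_size` parameter are hypothetical, and the code runs sequentially, so the batching only mimics the idea of processing a fixed number of columns in turn.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def filter_rows(column: List[float],
                predicate: Callable[[float], bool]) -> List[int]:
    """Filter stage: return the indices of rows that pass the predicate."""
    return [i for i, value in enumerate(column) if predicate(value)]


def aggregate(keys: List[str], values: List[float],
              selected: List[int]) -> Dict[str, Dict[str, float]]:
    """Aggregation stage: group selected rows by key, then total and average."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for i in selected:
        totals[keys[i]] += values[i]
        counts[keys[i]] += 1
    return {k: {"total": totals[k], "average": totals[k] / counts[k]}
            for k in totals}


def process_in_batches(table: Dict[str, list], key_col: str,
                       value_cols: List[str], selected: List[int],
                       batch_size: int = 2) -> Dict[str, dict]:
    """Handle a set number of value columns per pass (hypothetical batching)."""
    results = {}
    for start in range(0, len(value_cols), batch_size):
        for name in value_cols[start:start + batch_size]:
            results[name] = aggregate(table[key_col], table[name], selected)
    return results
```

For example, filtering a `sales` column with `lambda v: v >= 20` and passing the resulting indices to `process_in_batches` yields per-region totals and averages for each value column.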

Hitachi plans to exhibit these technologies at the Flash Memory Summit 2016, to be held 9th-11th August, 2016 in Santa Clara, California, USA.

© Copyright 2024 Electronic Specifier