In today's data-driven world, multimodal data has become an important business asset. As data grows in volume and diversity, organizations need not only to store and process it efficiently but also to extract valuable insights from it. In industrial settings, processing massive volumes of equipment time series data goes hand in hand with analyzing relational and graph data such as alarm information, equipment relationships, and organizational information. In finance, alongside the usual market and order flow time series, firms also draw on geographic information, real-time news, meteorological data, and other data types to support decision-making. Traditional single-model time series databases can no longer meet these complex needs in modern multimodal application scenarios.
Transwarp TimeLyre is an enterprise-grade distributed time series database developed in-house by Transwarp. It offers high-throughput real-time writes, accurate time series queries, and very high data compression ratios. It supports the storage, query, and analysis of massive time series data and serves time series workloads in the energy, manufacturing, and financial sectors.
TimeLyre recently released V9.2. In addition to handling massive time series data, this version offers native hybrid storage of multimodal data, so different data types can be integrated and processed together for multi-dimensional analysis. It also adds high-performance analysis, tiered storage of hot, warm, and cold data, and ultra-fast time series replay analysis. These features support new scenarios such as large-scale time series data lakes, integrated investment research platforms, and time series data middle platforms, helping enterprises meet their multimodal storage and analysis needs and realize the deeper value of their data.
The native multimodal architecture supports hybrid storage of time series data and relational data models
TimeLyre V9.2 adopts a native multimodal architecture. Time series and relational data from multiple sources are written, in batches or in real time, through a unified interface into a unified storage engine, and are then read and analyzed by Quark, the unified high-performance computing engine. This supports upper-layer workloads such as model processing, batch processing, online analysis, and high-performance reads, and helps enterprises analyze their data more comprehensively and from more dimensions.
Traditional solutions deploy each data type separately on a different database product. TimeLyre instead uses a native multimodal architecture to convert, move, and jointly analyze data across multiple data models, which lowers system complexity, reduces development and operations costs, and improves data processing efficiency.
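As a rough illustration of what correlation analysis across time series and relational data looks like when both live in one engine, the sketch below joins sensor samples to equipment metadata in a single SQL query. It uses Python's built-in SQLite purely as a stand-in; the table names, columns, and sample values are hypothetical, and the SQL is not TimeLyre's dialect.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one time series table and one relational table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE readings (      -- time series data: one row per sample
            device_id TEXT,
            ts        INTEGER,       -- Unix timestamp in seconds
            value     REAL
        );
        CREATE TABLE devices (       -- relational data: equipment metadata
            device_id TEXT PRIMARY KEY,
            site      TEXT
        );
        """
    )
    # Hypothetical sample data, for illustration only.
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [("d1", 0, 10.0), ("d1", 60, 14.0), ("d2", 0, 20.0), ("d3", 0, 7.0)],
    )
    conn.executemany(
        "INSERT INTO devices VALUES (?, ?)",
        [("d1", "north"), ("d2", "north"), ("d3", "south")],
    )
    return conn


def average_by_site(conn: sqlite3.Connection) -> list:
    """Join samples to device metadata and average the sampled values per site."""
    return conn.execute(
        """
        SELECT d.site, AVG(r.value)
        FROM readings AS r
        JOIN devices AS d ON d.device_id = r.device_id
        GROUP BY d.site
        ORDER BY d.site
        """
    ).fetchall()
```

With separate database products, the same question would typically require exporting one data set and joining it in application code; here the join stays inside the query.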
The high-performance C++ computing engine and vectorized computing significantly improve data analysis performance
Building on Transwarp's unified multi-model data management platform architecture, TimeLyre integrates a high-performance C++ engine into its computing layer. Vectorized computing makes full use of the SIMD instruction sets of modern CPUs, and columnar scans reduce I/O overhead. A high-performance data transfer format enables zero-copy data exchange and cuts serialization and deserialization overhead, while columnar storage and high compression ratios reduce the volume of data sent over the network, so data reaches the C++ analysis engine faster. As a result, users can process data faster and more efficiently, obtain analysis results and make decisions sooner, and lower energy consumption and hardware costs.
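The benefit of a columnar scan can be sketched in a few lines: when each column is stored contiguously, an aggregate over one column never reads the others. The example below only illustrates the data layout; pure Python does not issue SIMD instructions, and the column names and helper functions are hypothetical rather than part of TimeLyre.

```python
from array import array


def to_columns(rows):
    """Convert row tuples (ts, temperature, pressure) into one typed array per column."""
    return {
        "ts": array("q", (r[0] for r in rows)),           # 64-bit integers
        "temperature": array("d", (r[1] for r in rows)),  # 64-bit floats
        "pressure": array("d", (r[2] for r in rows)),
    }


def mean_temperature(columns):
    """Aggregate a single column; the other columns are never read."""
    temps = columns["temperature"]
    return sum(temps) / len(temps)
```

A native engine can run the same kind of loop over a contiguous column buffer with SIMD instructions, which is where vectorized execution gets its speed-up.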
Hot, warm and cold data is automatically tiered to reduce storage costs and optimize resource allocation
TimeLyre provides a new tiered storage solution for hot, warm, and cold data. It presents a single entry point for writes and queries from external applications and automatically moves data between the hot, warm, and cold tiers internally, based on time or on specified conditions. Hot data is served with millisecond-level query latency at a compression ratio of more than 5x. Warm data is served with query latency on the order of 100 milliseconds at a compression ratio of more than 15x. Cold data is compressed at a ratio of more than 30x and is suited to batch processing. Tiering is declared in the DDL when a table is created; after that, data is re-tiered automatically in the background on a regular schedule, with no further operations work. Specified data can also be placed on different storage media, which further reduces overall storage costs and optimizes resource allocation.
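To make the tiering idea concrete, here is a minimal sketch of time-based tier assignment and the resulting storage footprint. The compression ratios are the lower bounds quoted above; the age thresholds (7 and 90 days) and the function names are assumptions chosen for illustration, not TimeLyre defaults or APIs.

```python
from datetime import datetime, timedelta

# Lower-bound compression ratios quoted in the text for each tier.
COMPRESSION_RATIO = {"hot": 5, "warm": 15, "cold": 30}

# Hypothetical age thresholds; in TimeLyre the real rules are declared in table DDL.
HOT_MAX_AGE = timedelta(days=7)
WARM_MAX_AGE = timedelta(days=90)


def tier_for(ts: datetime, now: datetime) -> str:
    """Pick a storage tier from the age of a data point."""
    age = now - ts
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"


def compressed_size_gb(raw_gb_by_tier: dict) -> float:
    """Estimate on-disk size from the raw data volume held in each tier."""
    return sum(raw / COMPRESSION_RATIO[tier] for tier, raw in raw_gb_by_tier.items())
```

Under these assumptions, 100 GB of hot, 300 GB of warm, and 600 GB of cold raw data (1,000 GB in total) would occupy about 20 + 20 + 20 = 60 GB on disk.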
Support for TransMatrix, a distributed ultra-fast time series replay analysis engine, simplifies time series data replay analysis
TransMatrix is a distributed investment research system developed by Transwarp. It lets users replay data of different structures and frequencies (high, medium, and low) in chronological order. Beyond time series and relational data, it also supports text, graph, and image data, and users can process and analyze multimodal data with the open-source Python ecosystem. It ships with a rich library of time series operators and supports developing and sharing custom operators. It adopts an event-driven programming paradigm and provides a generative operator development interface, along with operator composition interfaces, a rich set of built-in expressions, and support for user-defined expressions. Distributed tasks provide multi-tenant load balancing, and distributed task configuration interfaces support large-scale jobs such as task splitting, batch runs, and large-scale sampling.
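The core idea of chronological, event-driven replay across sources of different frequencies can be sketched with the Python standard library. This is not the TransMatrix API: the `replay` function, the `Dispatcher` class, and the event tuple layout are all hypothetical, and each input stream is assumed to be already sorted by timestamp.

```python
import heapq
from collections import defaultdict


def replay(*streams):
    """Lazily merge time-sorted streams of (timestamp, source, payload) events."""
    return heapq.merge(*streams, key=lambda event: event[0])


class Dispatcher:
    """Route each replayed event to the handlers registered for its source."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, source, handler):
        """Register handler(timestamp, payload) for events from the given source."""
        self._handlers[source].append(handler)

    def run(self, events):
        """Deliver events one by one, in the order they are yielded."""
        for ts, source, payload in events:
            for handler in self._handlers[source]:
                handler(ts, payload)
```

A strategy would register one handler per data source, for example `dispatcher.on("tick", on_tick)`, and then call `dispatcher.run(replay(ticks, bars, news))`, so that every handler sees events in the same time order it would have seen them live.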
New scenario: Large-scale time series data lake engine helps enterprises cope with massive time series data
Users can build a large-scale time series data lake on TimeLyre. Data enters the lake through the streaming engine, is loaded and processed by TimeLyre at the ODS, DW, and dimension layers, and is exposed through rich APIs and open-source ecosystem interfaces to upper-layer scenarios such as application development, risk identification, model training, real-time display, data intelligence, and time series analysis. With TimeLyre at its core, the data lake takes full advantage of the product's ability to store, query, and analyze massive real-time data, with real-time write performance of up to 10,000 measurement points per node and real-time query performance of 10,000 QPS per node. Combined with stream and batch computing engines, it meets business requirements for end-to-end latency at the level of seconds. It also supports efficient correlation analysis across time series and relational data, complete SQL support, and flexible schema definition, giving users a comprehensive, efficient, and flexible data management and analysis platform.
New scenario: A technical foundation for integrated investment research platforms and a distributed investment research framework
In financial investment research scenarios, TimeLyre can serve as the technical foundation of an integrated investment research platform and help enterprises build a distributed investment research framework. At the bottom layer, TimeLyre and its built-in distributed investment research computing engine, TransMatrix, form the platform's core technology base. Standard data interfaces, factor development interfaces, strategy development interfaces, and distributed task development interfaces connect this base to upper-layer business models, so that enterprises can carry out data exploration, factor research, strategy backtesting, and other application development within a single platform.
New scenario: A data foundation for the investment research data middle platform, with layered management of multi-source data
An investment research data middle platform built on TimeLyre can follow a standard data layering scheme and divide data into a data source layer, a basic data layer, an investment research standard layer, a business model layer, and an income report layer. The data source layer synchronizes data from external sources such as data vendors, exchanges, and user factor data, so that raw market and factor data enter the platform in exactly their original form. The basic data layer verifies, cleans, and processes the data to produce clean basic investment research data. The investment research standard layer unifies this data into standard table models and data structure models for the investment research process, hiding differences in field names and field types across sources from users. The business model layer generates the factors and data needed for a specific research process. The income report layer generates the factors and data used to evaluate investment research returns, and can store strategy research results and investment research results as reports in either the time series model or the relational model.
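The job of the investment research standard layer, hiding per-source differences in field names and types, can be sketched as a simple mapping step. Everything named below (the vendors, their field names, and the standard schema) is hypothetical and only illustrates the idea.

```python
# Hypothetical mapping from each source's field names to the standard names.
FIELD_MAP = {
    "vendor_a": {"code": "symbol", "px_last": "close", "trade_dt": "trade_date"},
    "vendor_b": {"ticker": "symbol", "ClosePrice": "close", "Date": "trade_date"},
}

# Hypothetical standard schema: field name -> type used to coerce values.
FIELD_TYPES = {"symbol": str, "close": float, "trade_date": str}


def to_standard(source: str, record: dict) -> dict:
    """Rename and type-coerce one raw record into the standard table model."""
    mapping = FIELD_MAP[source]
    standard = {}
    for raw_name, value in record.items():
        std_name = mapping.get(raw_name)
        if std_name is None:
            continue  # field is not part of the standard model; skip it
        standard[std_name] = FIELD_TYPES[std_name](value)
    return standard
```

Records from both hypothetical vendors come out with the same keys and types, so downstream factor code can be written once against the standard model.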
An investment research data middle platform built on TimeLyre can connect to a wide range of external data sources and ingest external data at multiple levels. Supported access methods include loading files in formats common in investment research, synchronizing data from mainstream databases (MySQL, Oracle, etc.), receiving real-time exchange quotes through SQL-like APIs, and consuming data from Kafka. To handle both real-time and batch updates, professional ETL tools are provided. They support one-click re-ingestion of data, which automatically triggers multi-level data processing, keeps investment research data up to date, and makes it available to business users through a unified API.
Empowering business: TimeLyre helped a PV company build a batch-stream integrated time series data lake solution
To eliminate data silos, a photovoltaic company used Transwarp's distributed time series database TimeLyre to build a time series data lake that integrates batch and stream processing. Data acquisition devices first collect raw data from the source systems, and the Kafka messaging system delivers it to the data access area of the data warehouse platform. Slipstream, the stream processing engine that ships with TimeLyre, then loads the data into the internal source layer, where multimodal data such as time series and relational data is stored in a unified way. The unified computing engine and SQL engine process the data into successive layers: the DWD layer of standard tables, the MID layer of intermediate tables, the DWS layer of model tables, the DIM layer of dimension tables, and the ADS layer of business wide tables. These layers support upper-layer business reports, BI reports, data intelligence, real-time analysis and comparison, and 3D display. Notably, the data warehouse platform is built around TimeLyre, and this single database handles the processing, analysis, and querying of time series and relational data all the way from the source layer to the application layer.
Built on Transwarp's big data technology, the project provides unified access to photovoltaic data, including real-time ingestion of equipment measurement point data and full ingestion of management data, inspection images, operation logs, and other data. Data from more than 3,300 devices and nearly 300,000 measurement points at the base is stored with second-level timeliness. The solution also scales horizontally: hardware resources can be added later, and data from new stations can be brought in smoothly.
In addition, using Transwarp's time series database, batch processing engine, and analytical database, the photovoltaic data base stores all types of data and processes data warehouse models. Through a unified computing engine and unified data interface, it supports the construction of various visual data applications, making it easy for photovoltaic experimental and empirical analysts to use big data technology in daily work such as data comparison and analysis, equipment performance queries, and operating curve review.
Finally, Transwarp's integrated platform and data asset management provide unified authorization, development, governance, sharing, and auditing of data across the whole platform, so developers in every department can quickly and easily obtain the data they need and analyze it on the high-performance time series data lake platform.