Research and Implementation of Compression Technology in Column-Oriented Data Warehouse

Tutor: LuoZuoZuo
School: Donghua University
Course: Computer System Architecture
Keywords: data warehouse, column store, data compression
CLC: TP311.13
Type: Master's thesis
Year: 2013

As information has become a key factor in enterprise survival and development, extracting and analyzing information from huge amounts of data to support decision-making has become increasingly important. The data warehouse, as an important analysis tool for massive data, is attracting growing attention.

Traditional row-oriented database management systems can no longer serve analytic queries efficiently, and column-oriented storage architectures are receiving more attention. In application environments such as analytical queries in data warehouses or business intelligence, a column-oriented storage architecture avoids reading irrelevant columns during query execution, which gives it an advantage over a row-oriented database.

Disk I/O is the main bottleneck for queries in a data warehouse and carries a high time cost, so reducing the amount of I/O can significantly improve query efficiency. Column-store technology stores values of the same data type together, which increases the similarity between adjacent data. A data warehouse using column-store technology therefore compresses data more effectively than one using a traditional row store, and data compression is one of the most important topics in column-oriented data warehouse management systems.

Based on the characteristics of the column-oriented data warehouse management system, this thesis presents the design and implementation of a compression model, together with the design and implementation of decompression and of a scheme for executing queries on compressed data. It then proposes an improved version of a classic data compression algorithm: simple dictionary encoding based on a dynamic dictionary. The method combines a column-level dictionary with sector-level dictionaries and counts the probability of occurrence of every data value in each sector, which supports building a streamlined, lightweight column-level dictionary. As a result, both the compression ratio and the query performance are improved. Finally, experimental results on the data warehouse benchmark data set SSB verify the effectiveness of the proposed method.
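
The abstract does not give the algorithm in detail, so the following Python sketch only illustrates the general idea of combining a column-level dictionary with sector-level dictionaries. The promotion rule (a value enters the column-level dictionary when its mean per-sector probability reaches `min_prob`), the `(level, index)` code layout, and all function names are assumptions made for illustration and are not taken from the thesis.

```python
from collections import Counter


def build_dictionaries(sectors, min_prob=0.5):
    """Build one column-level dictionary and one dictionary per sector.

    Assumed rule (not from the thesis): a value is promoted to the
    column-level dictionary if its probability of occurrence, averaged
    over all sectors, is at least min_prob.
    """
    mean_prob = Counter()
    for sector in sectors:
        if not sector:
            continue
        for value, count in Counter(sector).items():
            # Per-sector probability of this value, averaged over sectors.
            mean_prob[value] += count / len(sector) / len(sectors)

    frequent = sorted(v for v, p in mean_prob.items() if p >= min_prob)
    column_dict = {v: i for i, v in enumerate(frequent)}

    sector_dicts = []
    for sector in sectors:
        # Values not covered by the column-level dictionary stay local.
        local = sorted(set(sector) - column_dict.keys())
        sector_dicts.append({v: i for i, v in enumerate(local)})
    return column_dict, sector_dicts


def encode_sector(values, column_dict, sector_dict):
    """Encode values as (level, index): level 0 = column, 1 = sector."""
    return [
        (0, column_dict[v]) if v in column_dict else (1, sector_dict[v])
        for v in values
    ]


def decode_sector(codes, column_dict, sector_dict):
    """Invert encode_sector using the same two dictionaries."""
    column_rev = {i: v for v, i in column_dict.items()}
    sector_rev = {i: v for v, i in sector_dict.items()}
    return [column_rev[i] if level == 0 else sector_rev[i] for level, i in codes]


def count_equal(encoded_sectors, column_dict, sector_dicts, target):
    """Count rows equal to target by comparing codes, without decoding."""
    total = 0
    for codes, sector_dict in zip(encoded_sectors, sector_dicts):
        if target in column_dict:
            code = (0, column_dict[target])
        elif target in sector_dict:
            code = (1, sector_dict[target])
        else:
            continue  # target cannot occur in this sector
        total += sum(1 for c in codes if c == code)
    return total


sectors = [
    ["ASIA", "ASIA", "EUROPE", "ASIA"],
    ["ASIA", "AFRICA", "ASIA", "ASIA"],
]
column_dict, sector_dicts = build_dictionaries(sectors)
encoded = [
    encode_sector(s, column_dict, d) for s, d in zip(sectors, sector_dicts)
]
```

In this toy column, only "ASIA" is frequent enough to be promoted, so the column-level dictionary stays small while each sector keeps a short local dictionary for its rare values. The `count_equal` helper shows, in the same hedged spirit, how an equality predicate might be evaluated directly on the codes by translating the constant once per sector instead of decompressing every row.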