Similarity Data Item Set Approach: An Encoded Temporal Data Base Technique

February 23, 2026

Reading time: 6 minutes

...

📝 Original Info

Title: Similarity Data Item Set Approach: An Encoded Temporal Data Base Technique
ArXiv ID: 1003.4076
Date: 2010-03-23
Authors: M. S. Danessh, C. Balasubramanian, K. Duraiswamy

📝 Abstract

Data mining has been widely recognized as a powerful tool for extracting added value from large-scale databases. Finding frequent item sets in databases is a crucial step in the data mining process of extracting association rules. Many algorithms have been developed to find frequent item sets. This paper presents a summary and a comparative study of the available FP-growth algorithm variations for mining frequent item sets, showing their capabilities and efficiency in terms of time and memory consumption on association rule mining, taking application-specific information into account. It proposes an FP-tree growth algorithm based on the pattern-growth mining paradigm, which employs a tree structure to compress the database. The performance study shows that the anti-FP-growth method is efficient and scalable for mining both long and short frequent patterns, is about an order of magnitude faster than the Apriori algorithm, and is also faster than some recently reported frequent-pattern mining methods.

💡 Deep Analysis

Deep Dive into Similarity Data Item Set Approach: An Encoded Temporal Data Base Technique.

📄 Full Content

JOURNAL OF COMPUTING, VOLUME 2, ISSUE 3, MARCH 2010, ISSN 2151-9617 HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/

Similarity Data Item Set Approach: An Encoded Temporal Data Base Technique
M. S. Danessh, C. Balasubramanian and K. Duraiswamy

Abstract - Data mining has been widely recognized as a powerful tool for extracting added value from large-scale databases. Finding frequent item sets in databases is a crucial step in the data mining process of extracting association rules. Many algorithms have been developed to find frequent item sets. This paper presents a summary and a comparative study of the available FP-growth algorithm variations for mining frequent item sets, showing their capabilities and efficiency in terms of time and memory consumption on association rule mining, taking application-specific information into account. It proposes an FP-tree growth algorithm based on the pattern-growth mining paradigm, which employs a tree structure to compress the database. The performance study shows that the anti-FP-growth method is efficient and scalable for mining both long and short frequent patterns, is about an order of magnitude faster than the Apriori algorithm, and is also faster than some recently reported frequent-pattern mining methods.

Keywords: Encoding method, frequent pattern mining, FP growth, FP tax, anti FP growth algorithm

1 INTRODUCTION

One of the currently fastest and most popular algorithms for frequent item set mining is the FP-growth algorithm. It is based on a prefix tree representation of the given database of transactions (called an FP-tree), which can save considerable amounts of memory when storing the transactions. The basic idea of the FP-growth algorithm can be described as a recursive elimination scheme. In a preprocessing step, delete from the transactions all items that are not frequent individually, i.e., that do not appear in a user-specified minimum number of transactions. Then select all transactions that contain the least frequent of the remaining items and delete this item from them. Recurse to process the resulting reduced (also known as projected) database, remembering that the item sets found in the recursion share the deleted item as a prefix. On return, remove the processed item from the database of all transactions as well and start over, i.e., process the second least frequent item, and so on. In these processing steps the prefix tree, which is enhanced by links between the branches, is exploited to quickly find the transactions containing a given item and also to remove this item from the transactions after it has been processed [4][7].
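The recursive elimination scheme described above can be illustrated with a minimal Python sketch. It deliberately works on plain sets of items rather than an FP-tree, so it shows only the recursion and not the memory savings of the prefix tree. The function names `frequent_itemsets` and `_recurse` and the sample data are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(item set): support} for all frequent item sets."""
    results = {}
    _recurse([frozenset(t) for t in transactions], min_support, frozenset(), results)
    return results


def _recurse(database, min_support, prefix, results):
    # Preprocessing: count every item and delete the individually infrequent ones.
    counts = Counter(item for t in database for item in t)
    frequent = {item for item, n in counts.items() if n >= min_support}
    database = [t & frequent for t in database]

    # Process items from least to most frequent (ties broken by item name).
    for item in sorted(frequent, key=lambda i: (counts[i], i)):
        itemset = prefix | {item}
        results[itemset] = counts[item]

        # Projected database: transactions containing the item, with the item deleted.
        projected = [t - {item} for t in database if item in t]
        # Everything found in the recursion shares `itemset` as its prefix.
        _recurse(projected, min_support, itemset, results)

        # On return, remove the processed item from all transactions and start over.
        database = [t - {item} for t in database]


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
    for itemset, support in frequent_itemsets(db, min_support=2).items():
        print(sorted(itemset), support)
```

Because each processed item is removed before the next one is handled, every frequent item set is reported exactly once, under the item of the set that is processed first.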
The Apriori heuristic achieves good performance by (possibly significantly) reducing the size of candidate sets [3]. However, in situations with a large number of frequent patterns, long patterns, or quite low minimum support thresholds, an Apriori-like algorithm can become costly. To address this, a compact data structure called the frequent-pattern tree, or FP-tree for short, is constructed: an extended prefix-tree structure that stores crucial, quantitative information about frequent patterns. To ensure that the tree structure is compact and informative, only frequent length-1 items have nodes in the tree, and the tree nodes are arranged so that more frequently occurring nodes have a better chance of being shared than less frequently occurring ones. Experiments show that such a tree is compact and is sometimes orders of magnitude smaller than the original database [7]. Subsequent frequent-pattern mining then only needs to work on the FP-tree instead of the whole data set.
The properties of the FP-tree have been thoroughly studied [10]. It has also been pointed out that, although it is often compact, an FP-tree may not always be minimal. Some optimizations have been proposed to speed up FP-growth; for example, a technique for handling single-path FP-trees has been further developed for performance improvements. A database projection method is developed in Section 2 to cope with the situation in which an FP-tree cannot be held in main memory, which may happen with a very large database. Extensive experimental results have been reported, examining the size of the FP-tree as well as the turning point of FP-growth from data projection to building an FP-tree [9]. The main step is described in Section 3, namely how an FP-tree is projected in order

1. PG student, Dept. of CSE, K.S.R. College of Technology, Tiruchengode, Tamilnadu, India.
2. Asst. Professor, Dept. of CSE, K.S.R. College of Technology, Tiruchengode, Tamilnadu, India.
3. Dean (Academic), Dept. of CSE, K.S.R. College of Technology, Tiruchengode, Tamilnadu, India.


to obtain an FP-tree of the (sub) database containing the transactions with a specific item (though with

…(Full text truncated)…
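The FP-tree construction outlined in the introduction (only frequent length-1 items get nodes, and items are ordered so that more frequent ones are more likely to share nodes) can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the names `FPNode`, `build_fp_tree`, `count_nodes` and `item_support` are hypothetical, and each item's chain of linked nodes is kept in a plain dictionary.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree; `link` chains all nodes that carry the same item."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}
        self.link = None


def build_fp_tree(transactions, min_support):
    """Return (root, header), where header maps each item to the head of its node chain."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: n for item, n in counts.items() if n >= min_support}

    root = FPNode(None, None)
    header = {}
    for t in transactions:
        # Keep only frequent items, most frequent first, so common prefixes are shared.
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                # Prepend the new node to the chain of nodes for this item.
                child.link = header.get(item)
                header[item] = child
            child.count += 1
            node = child
    return root, header


def count_nodes(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def item_support(header, item):
    """Follow the node links to add up the support of one item."""
    total, node = 0, header.get(item)
    while node is not None:
        total += node.count
        node = node.link
    return total


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
    root, header = build_fp_tree(db, min_support=2)
    print(count_nodes(root), "nodes for", sum(len(t) for t in db), "item occurrences")
```

In this small example the five transactions contain ten item occurrences but the tree needs fewer nodes, which illustrates the compression through shared prefixes. The node links are what allow all transactions containing a given item to be found quickly.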


Reference

This content is AI-processed based on ArXiv data.
