…a data-mining tool for selected Data Mining Methods visualization


Lukáš Civín, Miroslav Pich, Jaroslav Tykal, Petra Vaníčková, Jiří Vitinger


             RNDr. František Mráz, CSc, RNDr. Iveta Mrázová, CSc

1 Introduction

The main purpose of this document is to introduce an application called Knocker. Knocker was developed as an open data mining tool with a focus on the visualization of selected data mining methods.

Data mining is the process of analyzing data in order to discover relationships within it.

The amount of available data grows every day, and understanding its meaning correctly becomes an ever bigger problem. Data mining utilities such as Knocker are a suitable solution to this problem.

Comparison of Knocker with other existing data mining applications:

“Data mining is the future” is a claim heard throughout the database world. It is therefore not surprising that many data mining solutions, systems, and applications are available. Most of them, including the best ones, are commercial and very expensive. Probably the best data mining tools are SAS Enterprise Miner, SPSS with Clementine, and the progressive Statistica Data Miner, along with others (e.g. special mining modules for Oracle, IBM DB2, and Microsoft SQL Server). None of these solutions is free. Free tools do exist, but they usually have some disadvantages:

We consider the following features of Knocker the most interesting:

Knocker's two most important functions are:

2 Documentation

This part serves as a portal to the other documentation files.

2.1 User documentation

There is one base application:

Users can add data mining methods to the main application via modules (.dll libraries). Some modules have already been created:

2.2 Programmer documentation

The Knocker programmer documentation takes two approaches:

The philosophical documentation supplements the exported comments and helps the reader find their way around the large number of classes and methods.

Both kinds of documentation are compiled together into one HTML file (index.html). The diagrams describe the main features of the program parts, while the commented classes provide the low-level detail.

3 Conclusion

The Knocker solution fulfilled its main goals:

  1. Modularity
  2. Possibility of simple application expansion
  3. Universal access to the different data types
  4. Implementation and presentation of basic data mining methods

Some advanced features could still be added to Knocker, e.g. more method variations, efficiency optimizations, and a wider range of supported data inputs. The core parts, however, have already been implemented.

Knocker is designed to serve as a base for future projects. Many universities around the world have their own data mining tools, which have been improved over many years. Knocker could be the first step toward an MFF UK data mining tool.

Coding the data mining methods was interesting, but working with them and mining knowledge from data was even more enjoyable. We tested Knocker on larger data sets (hundreds of thousands of rows), and it handled them without problems. We are satisfied with our work.

Knocker can be used successfully both as a mining tool and as a teaching aid.