We begin where others give up.

(C) Seewald Solutions, 1180 Wien, Austria. Commercial use prohibited.
One of my major research topics is the Knowledge Discovery in Databases (KDD) process, which is sometimes referred to as Data Mining. KDD is best described as
"mining for nuggets of knowledge in mountains of data".
KDD aims to find understandable, valid, novel, useful, non-trivial and interesting patterns in large databases - for example sales transactions, active insurance policies, or web server access statistics.

KDD does not refer to a specific tool or application but rather to a creative process of discovery encompassing different fields and a variety of techniques guided by domain and technology experts acquainted with the KDD process and related fields - such as Machine Learning, classical Data Analysis, Large and multidimensional Databases, Human-Computer Interaction, Statistical Analysis and Neural Networks.
KDD provides not just solutions but meta-solutions that allow the creation of new solutions (e.g. predicting the probability that customers will accept insurance offers) for new data, and thus continuous improvement. To measure performance, KDD offers a variety of techniques to estimate how effective, efficient and reliable a model will be on unseen data.
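One widely used way to estimate performance on unseen data is k-fold cross-validation: the data is split into k parts, and each part in turn is held out for testing while a model is trained on the rest. The following is a minimal sketch in Python using only the standard library. The toy threshold classifier, the synthetic data and all function names are assumptions made for illustration; they are not taken from any of the projects described on this page.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_threshold(xs, ys):
    """Toy model (an assumption): threshold halfway between the class means.

    Assumes both classes occur in the training data and that class 1
    tends to have larger values than class 0.
    """
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2


def cross_val_accuracy(xs, ys, k=5, seed=0):
    """Estimate accuracy on unseen data by k-fold cross-validation."""
    correct = 0
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        # Train only on examples outside the current test fold.
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        threshold = train_threshold(train_x, train_y)
        # Score the held-out examples, which the model has never seen.
        correct += sum((xs[i] > threshold) == (ys[i] == 1) for i in test_idx)
    return correct / len(xs)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic one-dimensional data: two overlapping Gaussian classes.
    xs = [rng.gauss(0, 1) for _ in range(100)] + [rng.gauss(2, 1) for _ in range(100)]
    ys = [0] * 100 + [1] * 100
    print("Estimated accuracy:", cross_val_accuracy(xs, ys, k=5))
```

Because every example is used for testing exactly once and never by a model that saw it during training, the resulting figure is a less optimistic estimate than accuracy measured on the training data itself.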
My major projects in this field are: an ongoing project on training methods for spam filtering, together with a tentative approach that fights spam back to its source, which I call proactive spam filtering (see the current project on this topic); the recognition of handwritten digits contributed by my students, for which I implemented the whole pipeline from scanning through segmentation, downsampling and preprocessing to image classification; an ongoing project in biological image processing in collaboration with a prestigious US university; and another locally funded project on a similar topic.
I have been lecturing on Machine Learning and Data Mining as well as Artificial Intelligence Methods for Data Analysis at the Medical University in Vienna, and have held several one-day crash courses on Artificial Intelligence in a Business Context for the Danube University in Krems. I have worked in several locally and EU-funded research projects analyzing data from a variety of sources, and hold a PhD in ensemble learning. I am a long-time user, developer and researcher of the open-source Data Mining workbench WEKA (my most recent contribution was a String Kernel SVM, the first classifier which can process text strings directly), and maintain the popular WEKA Command-line Primer. WEKA is the best open source data mining suite there is, and even beats commercial suites in popularity; see e.g. my comment on a 2007 Data Mining Tools poll.