04.08.2010   carmen

Visualization of classifier decisions in multi-D spaces

Visualizing classifier decisions in a feature space helps us understand the classifier's behavior. This is straightforward when our data has only two features. But what about multi-dimensional problems?

PRSD Studio provides visualization of classifier decisions in feature spaces with more than two dimensions.

04.08.2010   pavel

Tutorial example on protecting classifier from outliers

Often, we need to protect a trained classifier from accepting outlier examples that appear in production. This tutorial shows how to achieve this by adding a rejection option to a trained discriminant with the sdreject command. Construction of an interactive reject curve is also illustrated.

A detailed example on adding a reject option is available in the Knowledge Base.
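
The idea of a reject option can be sketched in plain Python, outside PRSD Studio. This is not the sdreject implementation: it assumes a toy nearest-mean discriminant and rejects any sample whose distance to the closest class mean exceeds a threshold, chosen so that roughly a given fraction of the training samples would be rejected. All function names here are hypothetical.

```python
import math

def train_means(samples, labels):
    """Return {label: mean vector} for a toy nearest-mean discriminant."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def nearest(means, x):
    """Return (label, distance) of the closest class mean."""
    return min(((y, math.dist(m, x)) for y, m in means.items()),
               key=lambda pair: pair[1])

def reject_threshold(means, samples, reject_fraction):
    """Distance threshold that rejects roughly this fraction of training samples."""
    dists = sorted(nearest(means, x)[1] for x in samples)
    keep = max(1, math.ceil(len(dists) * (1.0 - reject_fraction)))
    return dists[keep - 1]

def classify_with_reject(means, threshold, x):
    """Return the class label, or "reject" when x is too far from every class."""
    label, d = nearest(means, x)
    return label if d <= threshold else "reject"

# Hypothetical two-class training set.
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]
labels = ["a", "a", "a", "b", "b", "b"]
means = train_means(samples, labels)
threshold = reject_threshold(means, samples, reject_fraction=0.0)
```

Sweeping `reject_fraction` from 0 to 1 and recording the error on the accepted samples would give the points of a reject curve.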

15.06.2010   carmen

ASCI course 2010

Last week, we lectured in the Advanced Pattern Recognition Course organized by TU Delft within the Advanced School for Computing and Imaging (ASCI). The course is offered to PhD students who are interested in the field of pattern recognition, or who are already experienced and would like to deepen and widen their understanding of the field. This course was more than fully booked, with 30 participants from all corners of the Netherlands. Our lectures focused on evaluation, ROC analysis and classifier optimization, leading to an integrated approach to system design. It has been a pleasure for us to meet so many bright students and learn about their interesting projects. We wish them success and fun in their research!

10.06.2010   pavel

Tutorial example on optimizing three-class classifiers with ROC analysis

Sometimes, one of the classes in a multi-class problem is much larger than the remaining classes. Classifiers trained on such imbalanced problems usually perform very poorly with the default decision function. The reason is that the model output of the large class dominates the solution. The default decision procedure assumes that all classes are equally important, which results in high misclassification of the small classes.

PRSD Studio allows you to quickly optimize multi-class classifiers in imbalanced problems. Watch the video inside! 
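
To illustrate why the default decision function fails here, the following Python sketch treats an operating point as a vector of per-class weights applied to the model outputs before taking the maximum. This is one common way to represent a multi-class operating point and is an assumption of the sketch, not a description of PRSD Studio internals. The class names, outputs and weights are made up for illustration.

```python
def decide(outputs, weights):
    """Pick the class with the largest weighted model output."""
    return max(outputs, key=lambda c: weights[c] * outputs[c])

# Hypothetical per-class outputs for one sample that truly belongs to a
# small class; the large "background" class dominates the raw outputs.
outputs = {"background": 0.50, "defect_a": 0.30, "defect_b": 0.20}

# Default decision: all classes weighted equally.
default_weights = {"background": 1.0, "defect_a": 1.0, "defect_b": 1.0}
# A different operating point that down-weights the large class.
tuned_weights = {"background": 0.5, "defect_a": 1.0, "defect_b": 1.0}

default_decision = decide(outputs, default_weights)
tuned_decision = decide(outputs, tuned_weights)
```

ROC analysis amounts to searching over such weight vectors and picking the one whose per-class errors best match the application requirements.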

28.05.2010   pavel

How to quickly rename classes or define meta-classes?

When designing classifiers, we often need to define new classes by renaming existing ones. The sdrelab command allows us to do just that quickly and easily.
Class relabeling helps us to define meta-classes, to compare data sets before and after normalization, or to understand where the data of a specific sub-class, patient or cluster fits with respect to the others.
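
Conceptually, relabeling is a mapping from old class names to new ones. The Python sketch below shows the idea on a plain list of labels; it does not reproduce the sdrelab interface, and the function name and class names are hypothetical.

```python
def relabel(labels, mapping):
    """Map each label through `mapping`; labels not listed keep their name."""
    return [mapping.get(y, y) for y in labels]

labels = ["apple", "pear", "stone", "leaf", "apple", "unknown"]

# Merge four original classes into two meta-classes.
meta = relabel(labels, {"apple": "fruit", "pear": "fruit",
                        "stone": "debris", "leaf": "debris"})
```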

26.05.2010   pavel

Presenting our research on ROC hierarchical classifiers at the NVPHBV meeting

We have presented our research on optimization of hierarchical classifiers at the spring meeting of NVPHBV (Dutch society for pattern recognition and image processing).

Complex problems are often easier to handle if decomposed into sub-problems and tackled independently. Hierarchical classifiers offer a great tool for such decomposition but are difficult to optimize according to application requirements. This is a serious problem we encounter daily in our industrial projects. In our talk, we described our approach, which allows the designer to perform cost-sensitive ROC optimization for a priori defined hierarchical classifiers.

17.05.2010   pavel

How to set up leave-one-patient-out cross-validation

In many applications, we need to make sure our classifier generalizes to unseen patients, objects, events, etc. Therefore, we need to take these entities into account when cross-validating our algorithm. PRSD Studio provides leave-one-object-out cross-validation through the sdcrossval routine. In this example, however, we show how to build a very simple leave-one-object-out scheme in two lines of code, where every step is open to direct inspection.
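
The splitting logic itself is language-independent. As a minimal sketch in Python (not the Matlab code from the full post, and not sdcrossval), each fold holds out all samples of one patient and trains on the rest. The function name and the patient identifiers are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) for each distinct group id."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx

# One patient id per sample; samples of a patient never straddle a split.
patients = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_group_out(patients))
```

Splitting by patient rather than by sample avoids the optimistic bias that arises when samples of the same patient appear in both the training and the test set.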

06.05.2010   pavel

Selecting a random subset based on a specific set of labels

Often, we need to generate a random subset of samples using a specific set of labels. For example, in a medical problem we may want to sample not from the top-level disease/no-disease classes but from each patient or from each tissue type. randsubset helps you do just that in a simple way.
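
The underlying operation is per-label sampling. A small Python sketch under the assumption that each sample carries one label from the chosen label set (here, a patient id) could look as follows. It illustrates the concept only; the function name and arguments are hypothetical and do not mirror the randsubset interface.

```python
import random

def random_subset_per_label(labels, n_per_label, seed=0):
    """Return sorted indices of up to n_per_label random samples from each label."""
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    by_label = {}
    for i, y in enumerate(labels):
        by_label.setdefault(y, []).append(i)
    chosen = []
    for y in sorted(by_label):
        idx = by_label[y]
        # Labels with fewer samples than requested contribute all they have.
        chosen.extend(rng.sample(idx, min(n_per_label, len(idx))))
    return sorted(chosen)

# Sample two examples per patient instead of per disease class.
patient_of_sample = ["p1"] * 5 + ["p2"] * 3 + ["p3"] * 1
subset = random_subset_per_label(patient_of_sample, n_per_label=2)
```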

04.05.2010   pavel

Live feature distributions in scatter plots

The 2.2.1 release brings a new interactive tool to sdscatter: the feature distribution plot. It shows the per-class histograms of the features currently selected on the horizontal and vertical axes of the sdscatter. This gives us a better understanding of the true nature of class overlap, especially in large data sets where a traditional scatter plot is very cluttered.

The feature plots are updated live with scatter operations (showing class subsets, hiding classes, painting labels).

22.04.2010   pavel

Protecting clusters from outliers using reject option

Often, when trying to understand our data with cluster analysis, we wonder where new examples will map. Many cluster analysis techniques, such as k-means or mixtures of Gaussians, allow us to apply the trained clustering to new data because they, in fact, train a classifier.

But these trained clustering models act as discriminants. That means that they assign every new data sample to one of the found clusters. This includes samples very distinct from anything encountered when performing the cluster analysis.
Wouldn't it be great if we could identify samples sticking out from our clustering?

That’s exactly what you may now do using the sdreject command introduced in PRSD Studio 2.1. We may simply add a reject option to a trained clustering model and so protect it from outliers.
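
A rough Python sketch of the same idea, assuming cluster centers already found by a k-means-like method: each new sample is assigned to its nearest center, unless it lies well beyond the spread observed for that cluster during training. The radius-based rule and the `slack` factor are assumptions made for this illustration and are not how sdreject is defined.

```python
import math

def assign(centers, x):
    """Index and distance of the nearest cluster center."""
    return min(((k, math.dist(c, x)) for k, c in enumerate(centers)),
               key=lambda pair: pair[1])

def cluster_radii(centers, data):
    """Largest distance from each center to the training points assigned to it."""
    radii = [0.0] * len(centers)
    for x in data:
        k, d = assign(centers, x)
        radii[k] = max(radii[k], d)
    return radii

def assign_with_reject(centers, radii, x, slack=1.5):
    """Return the cluster index, or None when x lies beyond slack * radius."""
    k, d = assign(centers, x)
    return k if d <= slack * radii[k] else None

# Hypothetical result of a cluster analysis on four training points.
centers = [(0.0, 0.0), (5.0, 5.0)]
data = [(0.2, -0.1), (-0.3, 0.1), (5.1, 4.8), (4.7, 5.2)]
radii = cluster_radii(centers, data)
```

Without the reject rule, a sample halfway between the two clusters would still be forced into one of them.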

05.10.2009   carmen

Advanced Statistical Pattern Recognition course in Poland

Last spring we accepted the invitation of Prof Michal Wozniak to teach an Advanced Pattern Recognition course at Wroclaw University of Technology in Poland. The course took place from 22 to 26 June 2009. I lectured on classification and model building, evaluation techniques and ROC analysis, clustering and image segmentation.
The 16 participants, mostly graduate students, responded enthusiastically, asked many questions during the lectures, and spent long hours in the laboratory room tackling the practical exercises with PRTools and PRSD Studio. It has been a pleasure to work with them and with my colleagues Dr Ela Pekalska and Dr David Tax from the universities of Manchester and Delft.

I’d like to thank Dr Konrad Jackowski and Prof Michal Wozniak for having been such wonderful hosts. Not only were they very helpful with all the necessary practicalities at the university, but they also took the time to show us the beautiful city of Wroclaw.

13.05.2009   pavel

Automatic estimation of number of components in Gaussian mixtures

We’re happy to share a new addition to PRSD Studio that greatly simplifies the construction of more sophisticated classifiers.
Gaussian mixture models are often used to design classifiers in multi-modal problems and for clustering. Given enough training examples, mixture models may describe an arbitrary data distribution. Furthermore, they usually execute orders of magnitude faster than non-parametric approaches such as the k-NN or Parzen classifiers, which is very important in on-line industrial applications.
To use mixture models effectively in practice, one needs to provide the number of mixture components as an input parameter. sdmixture can now estimate the number of components robustly using a non-parametric density estimation approach.

06.02.2009   pavel

Embedding classifier in pose recognition demo

We've just embedded classifier execution into the pose estimation application developed by Feifei Huo in the ICT Group at TU Delft. As a result, Feifei's robust model-based torso detector now also gets a pose classifier based on the torso model parameters. The video shows the single-camera setup, which is more challenging than using stereo because the torso parameters exhibit high overlap for the nine poses of interest.

The wire structures represent torso models fitted to the video frames; the crosses show the detected points of interest such as palms (based on a skin-color detector). The numbers on the chest show the classifier decision. In this example we use a 10-NN classifier based on 1500 training poses of several people (different from the people in this test). The decisions are made at the equal-prior operating point. The complete pose recognition application is written in C++ using OpenCV and libPRSD. Note that libPRSD allows you to change the classifier or alter its operating point without recompiling the entire application. Simply export the classifier from Matlab/PRSD Toolbox and load it into the application on the fly!

12.12.2008   pavel

ICPR 2008 poster

We have presented our research on variance estimation in ROC analysis at the International Conference on Pattern Recognition in Tampa, Florida. It was very nice to discuss many pattern recognition issues with hundreds of researchers from the pattern recognition and computer vision communities. There were excellent invited talks on progress in brain-computer interfaces, manifold learning and classifier fusion!

18.11.2008   pavel

Fast approximated k-NN classifier

The k-nearest neighbor (k-NN) rule is a robust data-driven classifier. However, the more training samples it uses, the slower it executes. This is because the distances from each new observation to all stored training examples (prototypes) need to be computed.
We have developed an approximated k-NN that computes distances only to potential nearest neighbors, significantly speeding up k-NN execution. Although our strategy for localizing the nearest neighbor search is similar to the well-known kd-tree approach, it does not employ per-feature splitting but works directly on distances. Therefore, unlike the kd-tree, which becomes impractical for feature spaces of more than 20 dimensions, it scales well to higher dimensionalities.
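
For reference, the exact (brute-force) k-NN rule that the approximation speeds up can be sketched in a few lines of Python. This is the baseline only, not the approximated algorithm described above; the function name and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_classify(prototypes, labels, x, k):
    """Brute-force k-NN: one distance per stored prototype, then a majority vote."""
    # The cost of this line grows linearly with the number of prototypes.
    dists = sorted((math.dist(p, x), y) for p, y in zip(prototypes, labels))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

prototypes = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_classify(prototypes, labels, (0.5, 0.5), k=3)
far_corner = knn_classify(prototypes, labels, (5.5, 5.5), k=3)
```

Every call computes a distance to every prototype, which is exactly the cost that restricting the search to potential nearest neighbors avoids.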

10.11.2008   carmen

Advanced Pattern Recognition Course October 2008

At the end of October the Advanced Pattern Recognition course took place at Delft Technical University. The APR course is organized by TU Delft and PR Sys Design and is specifically tailored for people from industry in need of state-of-the-art pattern recognition solutions. There were 12 participants from all over the world (Singapore, Canada and several European countries). This time we invited the participants to briefly introduce themselves and the type of problems they are interested in or dealing with in their work. This gave everyone the opportunity to learn about each other's interests and background, and stimulated more interaction with the teaching staff and the other participants. A cheerful and cooperative atmosphere developed during the week, boosted also by common dinners and pub celebrations. We have really enjoyed the week and would like to thank all participants for their contribution. It has been a pleasure to meet and work with you all! Thanks to our “knee man” Shameem we have a group picture of almost the complete team.

23.08.2008   pavel

LabView interface

We’re happy to announce the LabView interface example graciously donated by Anton Voigt from DeBeers Consolidated Mines. LabView is an industrial development platform based on a graphical language paradigm. It enables fast system integration using a multitude of hardware components. The example, which will be available in the coming PRSD Studio release, shows how to call classifiers trained using PRSD Studio directly from the LabView environment running on PC hardware.

17.06.2008   pavel

Support for decision trees


We’re happy to announce that PRSD Studio supports execution of decision trees trained in PRTools. A decision tree is a classifier trained feature by feature, splitting the feature space into rectangular subspaces. The two key advantages of decision trees are interpretability (why was the decision made?) and speed. It is the speed of execution that makes the decision tree classifier particularly interesting for industrial practitioners!
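
Both advantages are visible in a minimal Python sketch of tree execution. The nested-dictionary tree format, the thresholds and the class names below are assumptions made for this illustration; they are not the PRTools or PRSD Studio representation.

```python
def tree_decide(node, x):
    """Walk a tree of nested dicts until a leaf (a plain label) is reached."""
    while isinstance(node, dict):
        # One feature comparison per level: this is why execution is fast.
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node

# Hypothetical tree: each inner node thresholds a single feature, so the
# leaves correspond to axis-aligned rectangles in the feature space.
tree = {"feature": 0, "threshold": 2.0,
        "left": "small",
        "right": {"feature": 1, "threshold": 1.0,
                  "left": "flat", "right": "tall"}}

decision = tree_decide(tree, (3.0, 0.5))
```

The sequence of comparisons taken on the way to a leaf is itself the explanation of why the decision was made.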

04.06.2008   pavel

libPRSD: executing Support Vector classifiers trained in LIBSVM


LIBSVM is a powerful C library implementing Support Vector training and execution, with interfaces to numerous scripting environments including Matlab. We’ve posted a Knowledge Base article showing how to bring a classifier trained in LIBSVM into PRSD Studio and, through it, into custom applications. Executing the SVC in libPRSD also brings a significant speedup!

