Classifier Class Reference

The Classifier class is the main interface to Classer's functionality.

#include <Classifier.h>


Public Member Functions

 Classifier (string &modelName, DimList *dl, int nv, NameList &cn)
 Creates a Classifier object.
 ~Classifier ()
 Destructor - just invokes deleteClassifiers().
void setNumVoters (int numVoters)
 Sets the number of voters, i.e., the number of classifier model instances.
void setNumTrainEpochs (int nte)
 Sets the number of training epochs, that is, the number of times the training set will be presented.
void setDimList (DimList *dl)
 Sets the DimList, that is, the features to use for training/testing.
void setClassCap (int cc)
 Sets the class-cap: this value imposes an upper limit on the number of samples of a given class that will be used in training.
void setLabelChoice (labelChoiceType lc)
 Setting the labelChoiceType defines what to do during training when multiple labels are available, and class capping is turned on.
void registerParams ()
 Registers model parameters with all model instances; this has to be done before a new parameter setting takes effect.
void save (string &fNameTemplate)
 Saves each of the classifier model instances to a file.
void load (string &fNameTemplate, string &specialRequest)
 Performs the inverse operation from Classifier::save(), loading the persistent state for the classifier model instances from a set of files.
void train (ViewSet *vs)
 Trains the classifier using the samples in the ViewSet object.
void test (ViewSet *vs)
 Tests the classifier on the samples in the ViewSet, that is, generates a prediction of the class label of each sample.
void classify (View *v, vector< vector< float > > &)
 Similar to test(), but doesn't care about evaluating the performance of the classifier with respect to the ground truth labels.
void setCvMode (bool cvMode)
 Accessor method.
bool getCvMode ()
 Accessor method.
void CvTrain (ViewSet *vs)
 Performs cross-validated training: creates the required number of classifier model instances, sets CvMode to true, and then invokes train() to do the work.
void CvTest (ViewSet *vs)
 Performs cross-validated testing: assumes CvMode is already set to true, and invokes test() to do the work.
void CvSave (string &fNameTemplate)
 Saves the classifier model instances generated by a cross-validating training session.
void CvLoad (string &fNameTemplate, int numTeSets)
 Does the inverse operation from that done by CvSave(), loading N x M classifier model instances from file.
valarray< int > & getClassStats (int clIdx)
 Returns a reference to an array that tracks samples per class.
void predSquare (int n, float *buf)
 A more general version of predPpm: it iterates over a square of size 'resolution', where the dimensions of the square represent the two feature dimensions with which the classifier was trained.
void thematicPpm (ofstream &ost, Image &img, ofstream &confOst)
 Generates a thematic map, that is, one where a pixel's color indicates the predicted class for that pixel.
void reset ()
 Resets the timers, 'trained' flag, cvMode flag, and the entry counters.
void deleteClassifiers ()
 Frees the memory dedicated to the classifier model instances and the class histogram counters.
void newClassifiers ()
 Reallocates memory for the classifier model instances and the class histogram counters, then creates new classifiers.
void setOutputVectorStream (ofstream &ost)
 Set up a file destination for storing distributed prediction vectors, one per sample.
void setDataLabelsStream (ofstream &ost)
 Set up a file destination for storing sample labels.
void setLabelVectorStream (ofstream &ost)
 Set up a file destination for storing sample label vectors.
void setOutputClassStream (ofstream &ost)
 Set up a file destination for storing output predictions.
void setCmatPgmStream (ofstream &ost)
 Set up a file destination for storing graphical confusion matrices at the conclusion of a test.
void setPredPpmStream (ofstream &ost)
 Set up a file destination for storing graphical decision surfaces, as generated by the predPpm() method.
void clearOutputStreams ()
 Clear all output streams; in other words, turn off file-based outputs.
void requestModelOutput (string &requestName, string &fNameTemplate)
 Provides a mechanism for requesting file-based outputs from classifier model instances (see ClassifierModel::requestOutput()).
void clearModelOutputStreams ()
 Invokes ClassifierModel::closeStreams() for each classifier model instance.
NameList & getClassNames ()
 Accessor method.
PropertyList * getModelParams ()
 Accessor method.
DimList * getDimList ()
 Accessor method.
void boostVerbosity (int verbosityLevel, int debuggedEntry)
 Sometimes an odd behavior can only be reproduced after a certain number of entries have been submitted for training, which can generate too much logged output. To simplify debugging in these situations, this function lets you boost the verbosity of logging to level 4, 5, or 6 for a particular data entry.
int getNumClassifiers ()
 Accessor method.
int getNumVoters ()
 Accessor method.
int getNumTrainEpochs ()
 Accessor method.
ClassifierModel * getClassifier (int i)
 Accessor method.
int getLtmRequired ()
 Sums the storage requirements for each of the classifier model instances.
float getTrainTime ()
 Accessor method.
float getTestTime ()
 Accessor method.
float getClassifyTime ()
 Accessor method.
float getPercentCorrect ()
 Accessor method.
ConfusionMatrix * getConfusionMatrix ()
 Accessor method.
UnitPlot * plot2D (NameList &dimNames, NameList &classNames)
TxtPad * getClassStats ()
 Returns a TxtPad object summarizing the class statistics in the form of a histogram specifying samples per class.
void setHeartbeat (bool h)
 Accessor method.

Detailed Description

The Classifier class is the main interface to Classer's functionality.

Classer makes a distinction between classifiers and models:


Constructor & Destructor Documentation

Classifier::Classifier (string &modelName, DimList *dl, int nv, NameList &cn)
 

Creates a Classifier object.

Parameters:
modelName - The type of ARTMAP classifier: 'Fuzzy', 'Default', 'IC' or 'Distrib'
dl - A DimList object describing the features to use for training/testing (can be changed later)
nv - The number of voters - this determines the number of model copies, along with whether or not cross-validation is being used
cn - A NameList object describing the output classes. This is fixed once the classifier is created.
Exceptions:
MsgException If dl is NULL.
MsgException If nv <= 0.

Classifier::~Classifier ()
 

Destructor - just invokes deleteClassifiers().


Member Function Documentation

void Classifier::boostVerbosity (int verbosityLevel, int debuggedEntry)
 

Sometimes an odd behavior can only be reproduced after a certain number of entries have been submitted for training, which can generate too much logged output. To simplify debugging in these situations, this function lets you boost the verbosity of logging to level 4, 5, or 6 for a particular data entry.

Parameters:
verbosityLevel The level (4, 5, or 6) to which to boost logging.
debuggedEntry The index into the data set of the sample for which more detail is desired.
Exceptions:
MsgException If the verbosityLevel is in the wrong range.

void Classifier::classify (View *v, vector< vector< float > > &out)
 

Similar to test(), but doesn't care about evaluating the performance of the classifier with respect to the ground truth labels.

In other words, generates a set of predictions for the samples in the passed View object, storing the distributed predictions in the passed 2D vector out. As does test(), this function can also write labels and label vectors to a passed output stream.

Parameters:
v The View object holding the data to be classified.
out A 2D vector, pre-allocated to hold the predictions, with dimensions N x L (test set entries x output classes).
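Since out must be pre-allocated by the caller, a minimal sketch of the allocation might look like the following. The helper name makeOutput is hypothetical and not part of Classer; only the N x L shape comes from the parameter description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Classer): allocates an N x L table of
// zeros, one row per test set entry and one column per output class.
std::vector< std::vector< float > > makeOutput(std::size_t numEntries,
                                               std::size_t numClasses)
{
    assert(numEntries > 0 && numClasses > 0);
    return std::vector< std::vector< float > >(
        numEntries, std::vector< float >(numClasses, 0.0f));
}
```

The resulting vector would then be passed as the out argument of classify().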

void Classifier::clearModelOutputStreams ()
 

Invokes ClassifierModel::closeStreams() for each classifier model instance.

void Classifier::clearOutputStreams () [inline]
 

Clear all output streams; in other words, turn off file-based outputs.

This is different from and doesn't affect the logging mechanism.

void Classifier::CvLoad (string &fNameTemplate, int numTeSets)
 

Does the inverse operation from that done by CvSave(), loading N x M classifier model instances from file.

Parameters:
fNameTemplate The template on which to base the input file names.
numTeSets The number of partitions used for cross-validation.

void Classifier::CvSave (string &fNameTemplate)
 

Saves the classifier model instances generated by a cross-validating training session.

This will generate N x M files, where N is the number of partitions in the ViewSet used for cross-validation, and M is the number of voters being used (may be 1). The file names generated will append to the template fNameTemplate the string "tXvY.wgt", where X ranges from 0 to N-1, and Y ranges from 0 to M-1. The actual saving of the classifier model instance is delegated to the model implementation.

Parameters:
fNameTemplate The template on which to base the output file names.
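The naming scheme described above can be sketched as follows. This only illustrates the "tXvY.wgt" string format stated in the text; the helper name cvFileNames is hypothetical, and the partition-major ordering of the returned list is an assumption.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Classer): lists the file names that the
// documented "tXvY.wgt" scheme yields for N partitions and M voters.
std::vector<std::string> cvFileNames(const std::string &fNameTemplate,
                                     int numPartitions, int numVoters)
{
    assert(numPartitions > 0 && numVoters > 0);
    std::vector<std::string> names;
    for (int x = 0; x < numPartitions; ++x) {   // X ranges from 0 to N-1
        for (int y = 0; y < numVoters; ++y) {   // Y ranges from 0 to M-1
            std::ostringstream os;
            os << fNameTemplate << "t" << x << "v" << y << ".wgt";
            names.push_back(os.str());
        }
    }
    return names;
}
```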

void Classifier::CvTest (ViewSet *vs)
 

Performs cross-validated testing: assumes CvMode is already set to true, and invokes test() to do the work.

Parameters:
vs The test data, divided into partitions for cross-validation.

void Classifier::CvTrain (ViewSet *vs)
 

Performs cross-validated training: creates the required number of classifier model instances, sets CvMode to true, and then invokes train() to do the work.

Parameters:
vs The training data, divided into partitions for cross-validation.

void Classifier::deleteClassifiers ()
 

Frees the memory dedicated to the classifier model instances and the class histogram counters.

ClassifierModel * Classifier::getClassifier (int i) [inline]
 

Accessor method.

Parameters:
i Index of the classifier model instance to retrieve.
Returns:
A pointer to the classifier model instance with the given index.

float Classifier::getClassifyTime () [inline]
 

Accessor method.

Returns:
The number of seconds required for classification.

NameList & Classifier::getClassNames () [inline]
 

Accessor method.

Returns:
The NameList object holding the list of class label names.

TxtPad * Classifier::getClassStats ()
 

Returns a TxtPad object summarizing the class statistics in the form of a histogram specifying samples per class.

Returns:
A TxtPad containing a table of samples per class.

valarray< int > & Classifier::getClassStats (int clIdx)
 

Returns a reference to an array that tracks samples per class.

One array is maintained for each classifier model instance, so this method lets you look the array up by classifier model index.

Parameters:
clIdx Index of the classifier for which to retrieve class sample statistics.
Exceptions:
If the classifier index is out of range

ConfusionMatrix * Classifier::getConfusionMatrix () [inline]
 

Accessor method.

Returns:
The confusion matrix object accumulated during the last test() operation.

bool Classifier::getCvMode () [inline]
 

Accessor method.

Returns:
The cross-validation mode - if true, cross-validation is on.

DimList * Classifier::getDimList () [inline]
 

Accessor method.

Returns:
The DimList object currently set for the classifier.

int Classifier::getLtmRequired ()
 

Sums the storage requirements for each of the classifier model instances.

Returns:
The long-term-memory (LTM) required to store each of the model instances, in bytes.

PropertyList * Classifier::getModelParams () [inline]
 

Accessor method.

Returns:
The PropertyList object holding the model properties as a set of name/value pairs.

int Classifier::getNumClassifiers () [inline]
 

Accessor method.

Returns:
The number of classifier model instances, which is the same as the number of voters unless cross-validating.

int Classifier::getNumTrainEpochs () [inline]
 

Accessor method.

Returns:
The number of training epochs requested

int Classifier::getNumVoters () [inline]
 

Accessor method.

Returns:
The number of voters

float Classifier::getPercentCorrect () [inline]
 

Accessor method.

Returns:
The percent correct for the last test operation.

float Classifier::getTestTime () [inline]
 

Accessor method.

Returns:
The number of seconds required for testing.

float Classifier::getTrainTime () [inline]
 

Accessor method.

Returns:
The number of seconds required for training.

void Classifier::load (string &fNameTemplate, string &specialRequest)
 

Performs the inverse operation from Classifier::save(), loading the persistent state for the classifier model instances from a set of files.

See artmap::fread() for details on the specialRequest parameter.

Parameters:
fNameTemplate The template on which file names are based.
specialRequest Set to the empty string.

void Classifier::newClassifiers ()
 

Reallocates memory for the classifier model instances and the class histogram counters, then creates new classifiers.

Note that the new classifiers have default settings, i.e., model parameters must be applied to them anew.

UnitPlot * Classifier::plot2D (NameList &dimNames, NameList &classNames)
 

void Classifier::predSquare (int resolution, float *buf)
 

A more general version of predPpm: it iterates over a square of size 'resolution', where the dimensions of the square represent the two feature dimensions with which the classifier was trained.

However, instead of generating a graphic, it fills a floating-point array with predictions for the class of each 'pixel' in the square.

Parameters:
resolution The size of the side of the square over which predictions are iterated.
buf The floating-point array to be filled with predictions. Allocated by the caller, it should have size resolution * resolution * numClasses. The numClasses predictions for a given pixel are grouped together.
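The buffer size and grouping described above can be sketched as follows. Both helper names are hypothetical and not part of Classer, and the row-major ordering of pixels is an assumption, since the text only states that the numClasses predictions for a pixel are stored together.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: allocates a caller-owned buffer of the documented
// size, resolution * resolution * numClasses.
std::vector<float> makePredBuffer(int resolution, int numClasses)
{
    assert(resolution > 0 && numClasses > 0);
    return std::vector<float>(
        static_cast<std::size_t>(resolution) * resolution * numClasses, 0.0f);
}

// Hypothetical helper: flat index of the prediction for class cls at pixel
// (row, col). Row-major pixel order is an assumption; the per-pixel grouping
// of class predictions follows the parameter description.
std::size_t predIndex(int resolution, int numClasses, int row, int col, int cls)
{
    assert(row >= 0 && row < resolution);
    assert(col >= 0 && col < resolution);
    assert(cls >= 0 && cls < numClasses);
    return (static_cast<std::size_t>(row) * resolution + col) * numClasses + cls;
}
```

A pointer to the first element of such a buffer would then be passed as buf.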

void Classifier::registerParams ()
 

Registers model parameters with all model instances; this has to be done before a new parameter setting takes effect.

Note - to set model parameters, use getModelParams() to get a pointer to the PropertyList that stores the parameter settings, and modify the PropertyList.

void Classifier::requestModelOutput (string &requestName, string &fNameTemplate)
 

Provides a mechanism for requesting file-based outputs from classifier model instances (see ClassifierModel::requestOutput()).

Filenames are generated in a similar manner to the way it's done in Classifier::save(), with the extension '.txt' substituted for '.wgt'.

void Classifier::reset ()
 

Resets the timers, 'trained' flag, cvMode flag, and the entry counters.

void Classifier::save (string &fNameTemplate)
 

Saves each of the classifier model instances to a file.

If there is only a single classifier model instance, the filename is fNameTemplate.wgt; if there are several instances, then file names are based on the passed template, as follows:

  • fNameTemplatev0.wgt
  • fNameTemplatev1.wgt
  • fNameTemplatev2.wgt
  • ...
Parameters:
fNameTemplate The template on which file names are based.
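As a rough illustration of the naming rule above, the following sketch builds the list of file names for a given number of model instances. The helper name saveFileNames is hypothetical and not part of Classer; only the ".wgt" and "vN.wgt" patterns come from the description.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Classer): a single instance maps to
// "<template>.wgt"; several instances map to "<template>v0.wgt",
// "<template>v1.wgt", and so on.
std::vector<std::string> saveFileNames(const std::string &fNameTemplate,
                                       int numInstances)
{
    assert(numInstances > 0);
    std::vector<std::string> names;
    if (numInstances == 1) {
        names.push_back(fNameTemplate + ".wgt");
        return names;
    }
    for (int i = 0; i < numInstances; ++i) {
        std::ostringstream os;
        os << fNameTemplate << "v" << i << ".wgt";
        names.push_back(os.str());
    }
    return names;
}
```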

void Classifier::setClassCap (int cc)
 

Sets the class-cap: this value imposes an upper limit on the number of samples of a given class that will be used in training.

Use a value of 0 to impose no limit. No limit is the default, but this function can be called with 0 to unset a previous setting, for example to cancel the class cap for testing after using it in training.

Parameters:
cc The class-cap value.

void Classifier::setCmatPgmStream (ofstream &ost) [inline]
 

Set up a file destination for storing graphical confusion matrices at the conclusion of a test.

This output is available during test operations.

Parameters:
ost Output file stream to which to write.

void Classifier::setCvMode (bool cvMode) [inline]
 

Accessor method.

Parameters:
cvMode The cross-validation mode - if true, cross-validation is on.

void Classifier::setDataLabelsStream (ofstream &ost) [inline]
 

Set up a file destination for storing sample labels.

As samples are routinely submitted in randomized order, this is the only way of knowing after the fact which labels go with which predictions. A single label is written per line, each corresponding to a test sample. This output is available during test operations.

Parameters:
ost Output file stream to which to write.

void Classifier::setDimList (DimList *dl)
 

Sets the DimList, that is, the features to use for training/testing.

The classifier model instances are recreated, as the input features are a basic aspect of the model architecture. This is done via calls to deleteClassifiers() and newClassifiers().

Parameters:
dl The new list of features.

void Classifier::setHeartbeat (bool h) [inline]
 

Accessor method.

Parameters:
h If false, heartbeat printout is turned off (it's on by default)

void Classifier::setLabelChoice (labelChoiceType lc)
 

Setting the labelChoiceType defines what to do during training when multiple labels are available, and class capping is turned on.

If class capping is not active, then all training protocols are the same. In general, if multiple labels are available for a sample, one or more of them are chosen for training according to the training protocol. Since the runtime capping of classes can bias the sampling towards the first samples encountered, the data should always be presented in random order when using class capping. More specifically:

  • ALL_MIN: If all labels available for a sample are below the class cap, train with all labels. On one hand, all labels for a point are used in training, and all classes are subject to the class cap, including composite classes. On the other hand, the requirement that all labels for a sample must be below the cap results in a considerable under-sampling of the base classes, as the cap for the composite classes is met quickly, and it’s then difficult to find eligible base class samples. This method minimizes the amount of training data.
  • ALL_MAX: If any label available for a sample is below the class cap, train with all labels. On one hand, all labels for a point are used in training, and adequate amounts of the base classes are sampled. On the other hand, the composite classes are allowed to exceed the class cap, so the benefits of capping class membership are partially negated.
  • RANDOM: If any labels are below the class cap, train with one of them, selected at random. This comes close to implementing the class cap exactly, though under-sampling of the base classes is still seen in some cases.
  • ANY: Train with any available labels that are below the class cap. This comes closest to exactly implementing the class cap. It is a hybrid between the ALL_MAX and RANDOM approaches, as it relaxes the constraint that all labels for a point are used in training, but allows more than one label to be used for a given sample.
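The four protocols above can be sketched as a single selection function. This is an illustration of the rules as described, not Classer's implementation: the enum LabelChoice, the function chooseLabels, and the counts bookkeeping are all hypothetical, and the sketch assumes that with no cap (cap of 0) every protocol trains with all labels. The caller is assumed to increment counts for each label actually used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for labelChoiceType; the real type may differ.
enum LabelChoice { ALL_MIN, ALL_MAX, RANDOM, ANY };

// Returns the labels to train with for one sample (empty means: skip it).
// counts[c] is the number of samples of class c used so far; cap == 0
// means no limit.
std::vector<int> chooseLabels(const std::vector<int> &labels,
                              const std::vector<int> &counts,
                              int cap, LabelChoice policy)
{
    if (cap == 0) return labels;  // assumption: no capping, use all labels

    std::vector<int> below;       // labels whose class is still under the cap
    for (std::size_t i = 0; i < labels.size(); ++i) {
        assert(labels[i] >= 0 &&
               static_cast<std::size_t>(labels[i]) < counts.size());
        if (counts[labels[i]] < cap) below.push_back(labels[i]);
    }

    switch (policy) {
    case ALL_MIN:  // every label must be under the cap
        return below.size() == labels.size() ? labels : std::vector<int>();
    case ALL_MAX:  // one label under the cap admits all labels
        return below.empty() ? std::vector<int>() : labels;
    case RANDOM:   // one label under the cap, picked at random
        if (below.empty()) return std::vector<int>();
        return std::vector<int>(1, below[std::rand() % below.size()]);
    case ANY:      // all labels that are under the cap
        return below;
    }
    return std::vector<int>();
}
```

For a sample labeled with classes 0 and 2, where class 0 has already reached the cap, ALL_MIN skips the sample, ALL_MAX trains with both labels, and RANDOM and ANY train with class 2 only.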

void Classifier::setLabelVectorStream (ofstream &ost) [inline]
 

Set up a file destination for storing sample label vectors.

This is useful in multiple-label situations. One vector of floating point values is written per line, each corresponding to a test sample. For example, the vector ( 0.5 0.5 0 0 0 0 ) indicates that a sample is labeled as belonging to the first and second classes. This output is available during test operations.

Parameters:
ost Output file stream to which to write.
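The label vector from the example above can be reproduced with a short sketch. The helper name labelVector is hypothetical and not part of Classer, and the equal split of weight among a sample's labels is an assumption generalized from the single ( 0.5 0.5 0 0 0 0 ) example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a label vector of length numClasses in which
// each of the sample's labels receives an equal share of the total weight.
std::vector<float> labelVector(const std::vector<int> &labels, int numClasses)
{
    assert(!labels.empty() && numClasses > 0);
    std::vector<float> v(static_cast<std::size_t>(numClasses), 0.0f);
    const float w = 1.0f / static_cast<float>(labels.size());
    for (std::size_t i = 0; i < labels.size(); ++i) {
        assert(labels[i] >= 0 && labels[i] < numClasses);
        v[labels[i]] = w;  // zero-based class index
    }
    return v;
}
```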

void Classifier::setNumTrainEpochs (int nte)
 

Sets the number of training epochs, that is, the number of times the training set will be presented.

Parameters:
nte The number of training epochs.

void Classifier::setNumVoters (int numVoters)
 

Sets the number of voters, i.e., the number of classifier model instances.

If CV mode is in effect, the number of voters is multiplied by the number of views in the ViewSet. This adjustment is preceded by a call to deleteClassifiers() and followed by a call to newClassifiers().

Parameters:
numVoters The number of voters.
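The relationship between voters, views, and model instances can be written out as a small sketch. The function name numClassifierInstances is hypothetical; it only restates the multiplication rule described above.

```cpp
#include <cassert>

// Hypothetical helper: number of classifier model instances implied by the
// voter count, the cross-validation mode, and the number of views.
int numClassifierInstances(int numVoters, bool cvMode, int numViews)
{
    assert(numVoters > 0 && numViews > 0);
    return cvMode ? numVoters * numViews : numVoters;
}
```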

void Classifier::setOutputClassStream (ofstream &ost) [inline]
 

Set up a file destination for storing output predictions.

These single-class predictions are contrast-enhanced derivations of the distributed prediction vector (winner-take-all). If more than one class label is tied to be the winner, then the label -1 is written. This output is available during test and classify operations.

Parameters:
ost Output file stream to which to write.
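The winner-take-all rule with its tie handling can be sketched as follows. The function name winnerTakeAll is hypothetical and not part of Classer, and the use of exact floating-point equality to detect ties is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the single largest entry of a
// distributed prediction vector, or -1 if the maximum is shared by more
// than one class.
int winnerTakeAll(const std::vector<float> &pred)
{
    assert(!pred.empty());
    std::size_t best = 0;
    bool tied = false;
    for (std::size_t i = 1; i < pred.size(); ++i) {
        if (pred[i] > pred[best]) {
            best = i;        // new strict maximum clears any earlier tie
            tied = false;
        } else if (pred[i] == pred[best]) {
            tied = true;     // current maximum is shared
        }
    }
    return tied ? -1 : static_cast<int>(best);
}
```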

void Classifier::setOutputVectorStream (ofstream &ost) [inline]
 

Set up a file destination for storing distributed prediction vectors, one per sample.

This output is available during test and classify operations.

Parameters:
ost Output file stream to which to write.

void Classifier::setPredPpmStream (ofstream &ost) [inline]
 

Set up a file destination for storing graphical decision surfaces, as generated by the predPpm() method.

This output is available during train and test operations. More details are available in predPpm().

Parameters:
ost Output file stream to which to write.

void Classifier::test (ViewSet *vs)
 

Tests the classifier on the samples in the ViewSet, that is, generates a prediction of the class label of each sample.

If CvMode is set, then each View object is submitted to a separate classifier model instance for cross-validated testing. If not cross-validating, then a single instance generates the predictions for all the samples. Starts by resetting all entry count and class statistics counters, and generates any requested stream outputs as part of the process of running the test.

Parameters:
vs The test data.

void Classifier::thematicPpm (ofstream &ost, Image &img, ofstream &confOst)
 

Generates a thematic map, that is, one where a pixel's color indicates the predicted class for that pixel.

To tell what the colors mean, a legend can be generated via Palette::writeLegend().

Parameters:
ost Output file stream to send graphic to (should have extension '.ppm')
img Image object that is processed into a thematic map.

void Classifier::train (ViewSet *vs)
 

Trains the classifier using the samples in the ViewSet object.

If CvMode is set, then training is cross-validated, creating one set of classifier model instances per View in the ViewSet. If not cross-validating, then each of the Views in the ViewSet is used to train the classifier in turn. If using class-capping, then training alternates between views, so that not all samples are drawn from the first View. When done, invokes predPpm() if this was requested via setPredPpmStream(). Also, starts out by resetting entry counts and class statistics counters.

Parameters:
vs The training data.


Generated on Tue Dec 13 11:00:26 2005 for Classer by  doxygen 1.4.3