Classifier Class Reference

The Classifier class is the main interface to Classer's functionality.

#include <Classifier.h>

Public Member Functions
	Classifier (string &modelName, DimList *dl, int nv, NameList &cn)
	Creates a Classifier object.
	~Classifier ()
	Destructor - just invokes deleteClassifiers().
void	setNumVoters (int numVoters)
	Sets the number of voters, i.e., the number of classifier model instances.
void	setNumTrainEpochs (int nte)
	Sets the number of training epochs, that is, the number of times the training set will be presented.
void	setDimList (DimList *dl)
	Sets the DimList, that is, the features to use for training/testing.
void	setClassCap (int cc)
	Sets the class-cap: this value imposes an upper limit on the number of samples of a given class that will be used in training.
void	setLabelChoice (labelChoiceType lc)
	Setting the labelChoiceType defines what to do during training when multiple labels are available, and class capping is turned on.
void	registerParams ()
	Registers model parameters with all model instances; this has to be done before a new parameter setting takes effect.
void	save (string &fNameTemplate)
	Saves each of the classifier model instances to a file.
void	load (string &fNameTemplate, string &specialRequest)
	Performs the inverse operation from Classifier::save(), loading the persistent state for the classifier model instances from a set of files.
void	train (ViewSet *vs)
	Trains the classifier using the samples in the ViewSet object.
void	test (ViewSet *vs)
	Tests the classifier on the samples in the ViewSet, that is, generates a prediction of the class label of each sample.
void	classify (View *v, vector< vector< float > > &)
	Similar to test(), but doesn't care about evaluating the performance of the classifier with respect to the ground truth labels.
void	setCvMode (bool cvMode)
	Accessor method.
bool	getCvMode ()
	Accessor method.
void	CvTrain (ViewSet *vs)
	Performs cross-validated training: creates the required number of classifier model instances, sets CvMode to true, and then invokes train() to do the work.
void	CvTest (ViewSet *vs)
	Performs cross-validated testing: assumes CvMode is already set to true, and invokes test() to do the work.
void	CvSave (string &fNameTemplate)
	Saves the classifier model instances generated by a cross-validating training session.
void	CvLoad (string &fNameTemplate, int numTeSets)
	Does the inverse operation from that done by CvSave(), loading N x M classifier model instances from file.
valarray< int > &	getClassStats (int clIdx)
	Returns a reference to an array that tracks samples per class.
void	predSquare (int n, float *buf)
	A more general version of predPpm: it iterates over a square of size 'resolution', where the dimensions of the square represent the two feature dimensions with which the classifier was trained.
void	thematicPpm (ofstream &ost, Image &img, ofstream &confOst)
	Generates a thematic map, that is, one where a pixel's color indicates the predicted class for that pixel.
void	reset ()
	Resets the timers, 'trained' flag, cvMode flag, and the entry counters.
void	deleteClassifiers ()
	Frees the memory dedicated to the classifier model instances and the class histogram counters.
void	newClassifiers ()
	Reallocates memory for the classifier model instances and the class histogram counters, then creates new classifiers.
void	setOutputVectorStream (ofstream &ost)
	Set up a file destination for storing distributed prediction vectors, one per sample.
void	setDataLabelsStream (ofstream &ost)
	Set up a file destination for storing sample labels.
void	setLabelVectorStream (ofstream &ost)
	Set up a file destination for storing sample label vectors.
void	setOutputClassStream (ofstream &ost)
	Set up a file destination for storing output predictions.
void	setCmatPgmStream (ofstream &ost)
	Set up a file destination for storing graphical confusion matrices at the conclusion of a test.
void	setPredPpmStream (ofstream &ost)
	Set up a file destination for storing graphical decision surfaces, as generated by the predPpm() method.
void	clearOutputStreams ()
	Clear all output streams; in other words, turn off file-based outputs.
void	requestModelOutput (string &requestName, string &fNameTemplate)
	Provides a mechanism for requesting file-based outputs from classifier model instances (see ClassifierModel::requestOutput()).
void	clearModelOutputStreams ()
	Invokes ClassifierModel::closeStreams() for each classifier model instance.
NameList &	getClassNames ()
	Accessor method.
PropertyList *	getModelParams ()
	Accessor method.
DimList *	getDimList ()
	Accessor method.
void	boostVerbosity (int verbosityLevel, int debuggedEntry)
	Sometimes an odd behavior can only be reproduced after a certain number of entries have been submitted for training, which can generate too much logged output. To simplify debugging in these situations, this function lets you boost the verbosity of logging to level 4, 5, or 6 for a particular data entry.
int	getNumClassifiers ()
	Accessor method.
int	getNumVoters ()
	Accessor method.
int	getNumTrainEpochs ()
	Accessor method.
ClassifierModel *	getClassifier (int i)
	Accessor method.
int	getLtmRequired ()
	Sums the storage requirements for each of the classifier model instances.
float	getTrainTime ()
	Accessor method.
float	getTestTime ()
	Accessor method.
float	getClassifyTime ()
	Accessor method.
float	getPercentCorrect ()
	Accessor method.
ConfusionMatrix *	getConfusionMatrix ()
	Accessor method.
UnitPlot *	plot2D (NameList &dimNames, NameList &classNames)
TxtPad *	getClassStats ()
	Returns a TxtPad object summarizing the class statistics in the form of a histogram specifying samples per class.
void	setHeartbeat (bool h)
	Accessor method.

Detailed Description

The Classifier class is the main interface to Classer's functionality.

Classer makes a distinction between classifiers and models:

A model is a single instance, or copy, of a model implementation. It is the lowest-level object encapsulated by Classer, and deals with samples one at a time: during training, it learns to associate a set of input features with a class label; during testing or classification, it generates a prediction in response to a set of input features. The prediction is optionally distributed, i.e., a probability distribution over the set of labels. A model also accepts parameter settings and requests for output metrics.
A classifier is a collection of one or more model copies maintained as an aggregate. Within the classifier, when voting, each voter has its own copy. Also, when doing cross-validated training and testing, a model copy is assigned to test each view. In other words, if a system is doing four-fold cross-validation with 5 voters, the classifier holds 4 x 5 = 20 model copies in all.
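
The copy count in the example above can be worked out directly. The following is a minimal sketch, assuming the count is simply the number of voters multiplied by the number of cross-validation folds; the function modelCopyCount is hypothetical and is not part of Classer.

```cpp
#include <stdexcept>

// Hypothetical helper (not part of Classer): number of model copies a
// classifier would hold, assuming one copy per voter per cross-validation fold.
int modelCopyCount(int numVoters, int numFolds, bool cvMode) {
    if (numVoters <= 0 || numFolds <= 0) {
        throw std::invalid_argument("numVoters and numFolds must be positive");
    }
    // Without cross-validation there is one copy per voter.
    return cvMode ? numVoters * numFolds : numVoters;
}
```

Under this assumption, modelCopyCount(5, 4, true) gives the 20 copies mentioned above, and the same call with cvMode set to false gives 5.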

Constructor & Destructor Documentation

Classifier::Classifier ( string & modelName,

DimList * dl,

int nv,

NameList & cn

)

Creates a Classifier object.

Parameters:

modelName - The type of ARTMAP classifier: 'Fuzzy', 'Default', 'IC' or 'Distrib'

dl - A DimList object describing the features to use for training/testing (can be changed later)

nv - The number of voters - this determines the number of model copies, along with whether or not cross-validation is being used

cn - A NameList object describing the output classes. This is fixed once the classifier is created.

Exceptions:

MsgException If dl is NULL.

MsgException If nv <= 0.

Classifier::~Classifier ( )

Destructor - just invokes deleteClassifiers().

Member Function Documentation

void Classifier::boostVerbosity ( int verbosityLevel,

int debuggedEntry

)

Sometimes an odd behavior can only be reproduced after a certain number of entries have been submitted for training, which can generate too much logged output. To simplify debugging in these situations, this function lets you boost the verbosity of logging to level 4, 5, or 6 for a particular data entry.

Parameters:

verbosityLevel The level (4, 5, or 6) to which to boost logging.

debuggedEntry The index into the data set of the sample for which more detail is desired.

Exceptions:

MsgException If the verbosityLevel is in the wrong range.

void Classifier::classify ( View * v,

vector< vector< float > > & out

)

Similar to test(), but doesn't care about evaluating the performance of the classifier with respect to the ground truth labels.
In other words, generates a set of predictions for the samples in the passed View object, storing the distributed predictions in the passed 2D vector out. As does test(), this function can also write labels and label vectors to a passed output stream.
Parameters:

v The View object holding the data to be classified.

out A 2D vector, pre-allocated to hold the predictions, with dimensions N x L (test set entries x output classes).
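
Since out must be pre-allocated by the caller, a small helper can build it. This is a sketch only: makePredictionBuffer is a hypothetical name, and zero-filling the buffer is an assumption rather than a documented requirement.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Classer): allocate an N x L buffer,
// where N is the number of entries to classify and L the number of
// output classes. Zero-initialization is an assumption.
std::vector<std::vector<float>> makePredictionBuffer(std::size_t numEntries,
                                                     std::size_t numClasses) {
    return std::vector<std::vector<float>>(
        numEntries, std::vector<float>(numClasses, 0.0f));
}
```

The resulting vector has one row per sample and one column per class, matching the N x L shape described for out.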

void Classifier::clearModelOutputStreams ( )

Invokes ClassifierModel::closeStreams() for each classifier model instance.

void Classifier::clearOutputStreams ( ) [inline]

Clear all output streams; in other words, turn off file-based outputs.
This is different from and doesn't affect the logging mechanism.

void Classifier::CvLoad ( string & fNameTemplate,

int numTeSets

)

Does the inverse operation from that done by CvSave(), loading N x M classifier model instances from file.

Parameters:

fNameTemplate The template on which to base the input file names.

numTeSets The number of partitions used for cross-validation.

void Classifier::CvSave ( string & fNameTemplate )

Saves the classifier model instances generated by a cross-validating training session.
This will generate N x M files, where N is the number of partitions in the ViewSet used for cross-validation, and M is the number of voters being used (may be 1). The file names generated will append to the template fNameTemplate the string "tXvY.wgt", where X ranges from 0 to N-1, and Y ranges from 0 to M-1. The actual saving of the classifier model instance is delegated to the model implementation.
Parameters:

fNameTemplate The template on which to base the output file names.
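
The naming scheme can be illustrated with a short sketch that lists the N x M file names for a given template. The function cvSaveFileNames is hypothetical, and the order in which names are produced (partitions in the outer loop) is an assumption; only the "tXvY.wgt" suffix comes from the description above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Classer): enumerate the file names
// described for CvSave(), i.e. fNameTemplate + "tXvY.wgt" with
// X in [0, N-1] and Y in [0, M-1].
std::vector<std::string> cvSaveFileNames(const std::string& fNameTemplate,
                                         int numPartitions, int numVoters) {
    std::vector<std::string> names;
    for (int x = 0; x < numPartitions; ++x) {
        for (int y = 0; y < numVoters; ++y) {
            names.push_back(fNameTemplate + "t" + std::to_string(x) +
                            "v" + std::to_string(y) + ".wgt");
        }
    }
    return names;
}
```

For a template "run" with 2 partitions and 3 voters, this yields six names, from runt0v0.wgt through runt1v2.wgt.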

void Classifier::CvTest ( ViewSet * vs )

Performs cross-validated testing: assumes CvMode is already set to true, and invokes test() to do the work.

Parameters:

vs The test data, divided into partitions for cross-validation.

void Classifier::CvTrain ( ViewSet * vs )

Performs cross-validated training: creates the required number of classifier model instances, sets CvMode to true, and then invokes train() to do the work.

Parameters:

vs The training data, divided into partitions for cross-validation.

void Classifier::deleteClassifiers ( )

Frees the memory dedicated to the classifier model instances and the class histogram counters.

ClassifierModel* Classifier::getClassifier ( int i ) [inline]

Accessor method.

Parameters:

i Index of the classifier model instance to retrieve.

Returns:
A pointer to the classifier model instance with the given index.

float Classifier::getClassifyTime ( ) [inline]

Accessor method.

Returns:
The number of seconds required for classification.

NameList& Classifier::getClassNames ( ) [inline]

Accessor method.

Returns:
The NameList object holding the list of class label names.

TxtPad * Classifier::getClassStats ( )

Returns a TxtPad object summarizing the class statistics in the form of a histogram specifying samples per class.

Returns:
A TxtPad containing a table of samples per class.

valarray< int > & Classifier::getClassStats ( int clIdx )

Returns a reference to an array that tracks samples per class.
One array is maintained for each classifier model instance, so this method lets you look the array up by classifier model index.
Parameters:

clIdx Index of the classifier for which to retrieve class sample statistics.

Exceptions:

If the classifier index is out of range

ConfusionMatrix* Classifier::getConfusionMatrix ( ) [inline]

Accessor method.

Returns:
The confusion matrix object accumulated during the last test() operation.

bool Classifier::getCvMode ( ) [inline]

Accessor method.

Returns:
The cross-validation mode - if true, cross-validation is on.

DimList* Classifier::getDimList ( ) [inline]

Accessor method.

Returns:
The DimList object currently set for the classifier.

int Classifier::getLtmRequired ( )

Sums the storage requirements for each of the classifier model instances.

Returns:
The long-term-memory (LTM) required to store each of the model instances, in bytes.

PropertyList* Classifier::getModelParams ( ) [inline]

Accessor method.

Returns:
The PropertyList object holding the model properties as a set of name/value pairs.

int Classifier::getNumClassifiers ( ) [inline]

Accessor method.

Returns:
The number of classifier model instances, which is the same as the number of voters unless cross-validating.

int Classifier::getNumTrainEpochs ( ) [inline]

Accessor method.

Returns:
The number of training epochs requested

int Classifier::getNumVoters ( ) [inline]

Accessor method.

Returns:
The number of voters

float Classifier::getPercentCorrect ( ) [inline]

Accessor method.

Returns:
The percent correct for the last test operation.

float Classifier::getTestTime ( ) [inline]

Accessor method.

Returns:
The number of seconds required for testing.

float Classifier::getTrainTime ( ) [inline]

Accessor method.

Returns:
The number of seconds required for training.

void Classifier::load ( string & fNameTemplate,

string & specialRequest

)

Performs the inverse operation from Classifier::save(), loading the persistent state for the classifier model instances from a set of files.
See artmap::fread() for details on the specialRequest parameter.
Parameters:

fNameTemplate The template on which file names are based.

specialRequest Set to the empty string.

void Classifier::newClassifiers ( )

Reallocates memory for the classifier model instances and the class histogram counters, then creates new classifiers.
Note that the new classifiers have default settings, i.e., model parameters must be applied to them anew.

UnitPlot * Classifier::plot2D ( NameList & dimNames,

NameList & classNames

)

void Classifier::predSquare ( int resolution,

float * buf

)

A more general version of predPpm: it iterates over a square of size 'resolution', where the dimensions of the square represent the two feature dimensions with which the classifier was trained.
However, instead of generating a graphic, it fills a floating-point array with predictions for the class of each 'pixel' in the square.
Parameters:

resolution The size of the side of the square over which predictions are iterated.

buf The floating-point array to be filled with predictions. Allocated by the caller, it should have size resolution * resolution * numClasses. The numClasses predictions for a given pixel are grouped together.
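
The buffer layout can be sketched as an index calculation. The grouping of the numClasses predictions per pixel is taken from the description above; the row-major ordering of pixels is an assumption, and the names predBufferSize and predIndex are hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: total number of floats the caller must allocate.
std::size_t predBufferSize(std::size_t resolution, std::size_t numClasses) {
    return resolution * resolution * numClasses;
}

// Hypothetical helper: offset of the prediction for class `cls` at pixel
// (row, col). The per-pixel grouping follows the documentation; row-major
// pixel order is an assumption.
std::size_t predIndex(std::size_t row, std::size_t col, std::size_t cls,
                      std::size_t resolution, std::size_t numClasses) {
    return (row * resolution + col) * numClasses + cls;
}
```

With a resolution of 10 and 3 classes, the caller would allocate 300 floats, and the predictions for the second pixel of the first row would start at offset 3 under the row-major assumption.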

void Classifier::registerParams ( )

Registers model parameters with all model instances; this has to be done before a new parameter setting takes effect.
Note - to set model parameters, use getModelParams() to get a pointer to the PropertyList that stores the parameter settings, and modify the PropertyList.

void Classifier::requestModelOutput ( string & requestName,

string & fNameTemplate

)

Provides a mechanism for requesting file-based outputs from classifier model instances (see ClassifierModel::requestOutput()).
Filenames are generated in a similar manner to the way it's done in Classifier::save(), with the extension '.txt' substituted for '.wgt'.

void Classifier::reset ( )

Resets the timers, 'trained' flag, cvMode flag, and the entry counters.

void Classifier::save ( string & fNameTemplate )

Saves each of the classifier model instances to a file.
If there is only a single classifier model instance, the filename is fNameTemplate.wgt; if there are several instances, then file names are based on the passed template, as follows:

fNameTemplatev0.wgt
fNameTemplatev1.wgt
fNameTemplatev2.wgt
...

Parameters:

fNameTemplate The template on which file names are based.
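
The naming rule can be written out as a short sketch. The function saveFileNames is hypothetical and only mirrors the pattern described above; it does not perform any file I/O.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Classer): list the file names
// described for save(). A single instance uses fNameTemplate + ".wgt";
// several instances use fNameTemplate + "vK.wgt" for K = 0, 1, 2, ...
std::vector<std::string> saveFileNames(const std::string& fNameTemplate,
                                       int numInstances) {
    std::vector<std::string> names;
    if (numInstances == 1) {
        names.push_back(fNameTemplate + ".wgt");
        return names;
    }
    for (int k = 0; k < numInstances; ++k) {
        names.push_back(fNameTemplate + "v" + std::to_string(k) + ".wgt");
    }
    return names;
}
```

For example, a template "model" with one instance gives model.wgt, while three instances give modelv0.wgt, modelv1.wgt and modelv2.wgt.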

void Classifier::setClassCap ( int cc )

Sets the class-cap: this value imposes an upper limit on the number of samples of a given class that will be used in training.
Use a value of 0 to impose no limit. No limit is the default, but this function can be called with 0 to unset a previous setting, for example to cancel the class cap for testing after using it in training.
Parameters:

cc The class-cap value.

void Classifier::setCmatPgmStream ( ofstream & ost ) [inline]

Set up a file destination for storing graphical confusion matrices at the conclusion of a test.
This output is available during test operations.
Parameters:

ost Output file stream to which to write.

void Classifier::setCvMode ( bool cvMode ) [inline]

Accessor method.

Parameters:

cvMode The cross-validation mode - if true, cross-validation is on.

void Classifier::setDataLabelsStream ( ofstream & ost ) [inline]

Set up a file destination for storing sample labels.
As samples are routinely submitted in randomized order, this is the only way of knowing after the fact which labels go with which predictions. A single label is written per line, each corresponding to a test sample. This output is available during test operations.
Parameters:

ost Output file stream to which to write.

void Classifier::setDimList ( DimList * dl )

Sets the DimList, that is, the features to use for training/testing.
The classifier model instances are recreated, as the input features are a basic aspect of the model architecture. This is done via calls to deleteClassifiers() and newClassifiers().
Parameters:

dl The new list of features.

void Classifier::setHeartbeat ( bool h ) [inline]

Accessor method.

Parameters:

h If false, heartbeat printout is turned off (it's on by default)

void Classifier::setLabelChoice ( labelChoiceType lc )

Setting the labelChoiceType defines what to do during training when multiple labels are available, and class capping is turned on.
If class capping is not active, then all training protocols are the same. In general, if multiple labels are available for a sample, one or more of them are chosen for training according to the training protocol. Since the runtime capping of classes can bias the sampling towards the first samples encountered, the data should always be presented in random order when using class capping. More specifically:

ALL_MIN: If all labels available for a sample are below the class cap, train with all labels. On one hand, all labels for a point are used in training, and all classes are subject to the class cap, including composite classes. On the other hand, the requirement that all labels for a sample must be below the cap results in a considerable under-sampling of the base classes, as the cap for the composite classes is met quickly, and it's then difficult to find eligible base class samples. This method minimizes the amount of training data.
ALL_MAX: If any label available for a sample is below the class cap, train with all labels. On one hand, all labels for a point are used in training, and adequate amounts of the base classes are sampled. On the other hand, the composite classes are allowed to exceed the class cap, so the benefits of capping class membership are partially negated.
RANDOM: If any labels are below the class cap, train with one of them, selected at random. This comes close to implementing the class cap exactly, though under-sampling of the base classes is still seen in some cases.
ANY: Train with any available labels that are below the class cap. This comes closest to exactly implementing the class cap. It is a hybrid between the ALL_MAX and RANDOM approaches, as it relaxes the constraint that all labels for a point are used in training, but allows more than one label to be used for a given sample.
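
The four protocols above can be sketched as a single selection function. This is an illustration under stated assumptions, not Classer's implementation: LabelPolicy is a hypothetical stand-in for labelChoiceType, "below the cap" is read as count < cap, and with no cap (0) all labels are assumed to be used.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for the library's labelChoiceType.
enum class LabelPolicy { AllMin, AllMax, Random, Any };

// Returns the labels to train with for one sample.
//   labels: class indices attached to the sample
//   counts: number of samples already used per class
//   cap:    class cap; 0 means no limit
std::vector<int> chooseLabels(LabelPolicy policy,
                              const std::vector<int>& labels,
                              const std::vector<int>& counts,
                              int cap, std::mt19937& rng) {
    if (cap == 0) return labels;  // assumption: without a cap, use all labels

    // Collect the labels whose class is still below the cap.
    std::vector<int> below;
    for (int label : labels) {
        if (counts[static_cast<std::size_t>(label)] < cap) below.push_back(label);
    }
    if (below.empty()) return {};  // no eligible label: skip the sample

    switch (policy) {
        case LabelPolicy::AllMin:  // all labels must be below the cap
            return below.size() == labels.size() ? labels : std::vector<int>{};
        case LabelPolicy::AllMax:  // one eligible label admits all labels
            return labels;
        case LabelPolicy::Random: {  // one eligible label, chosen at random
            std::uniform_int_distribution<std::size_t> pick(0, below.size() - 1);
            return {below[pick(rng)]};
        }
        case LabelPolicy::Any:  // every eligible label
            return below;
    }
    return {};
}
```

For a sample labeled with classes 0 and 1, where class 0 has already reached the cap and class 1 has not, this sketch skips the sample under AllMin, trains with both labels under AllMax, and trains with class 1 only under Random and Any.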

void Classifier::setLabelVectorStream ( ofstream & ost ) [inline]

Set up a file destination for storing sample label vectors.
This is useful in multiple-label situations. One vector of floating point values is written per line, each corresponding to a test sample. For example, the vector ( 0.5 0.5 0 0 0 0 ) indicates that a sample is labeled as belonging to the first and second classes. This output is available during test operations.
Parameters:

ost Output file stream to which to write.
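
The example vector ( 0.5 0.5 0 0 0 0 ) suggests that a sample's labels share equal weight summing to 1. The following sketch builds such a vector under that assumption; makeLabelVector is a hypothetical name, and the equal-weight rule is inferred from the example rather than stated by the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Classer): build a label vector in which
// each of the sample's labels gets weight 1 / (number of labels).
// Assumes the labels are distinct and each is less than numClasses.
std::vector<float> makeLabelVector(const std::vector<int>& labels,
                                   std::size_t numClasses) {
    std::vector<float> v(numClasses, 0.0f);
    if (labels.empty()) return v;
    const float weight = 1.0f / static_cast<float>(labels.size());
    for (int label : labels) {
        v[static_cast<std::size_t>(label)] = weight;
    }
    return v;
}
```

Calling makeLabelVector with labels 0 and 1 and six classes reproduces the vector from the example above.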

void Classifier::setNumTrainEpochs ( int nte )

Sets the number of training epochs, that is, the number of times the training set will be presented.

Parameters:

nte The number of training epochs.

void Classifier::setNumVoters ( int numVoters )

Sets the number of voters, i.e., the number of classifier model instances.
If CV mode is in effect, the number of voters is multiplied by the number of views in the ViewSet. This adjustment is preceded by a call to deleteClassifiers() and followed by a call to newClassifiers().
Parameters:

numVoters The number of voters.

void Classifier::setOutputClassStream ( ofstream & ost ) [inline]

Set up a file destination for storing output predictions.
These single-class predictions are contrast-enhanced derivations of the distributed prediction vector (winner-take-all). If more than one class label is tied to be the winner, then the label -1 is written. This output is available during test and classify operations.
Parameters:

ost Output file stream to which to write.
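
The winner-take-all rule with its tie handling can be sketched as follows. The function winnerTakeAll is hypothetical; returning -1 for an empty vector is an assumption added to keep the sketch well-defined.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Classer): reduce a distributed
// prediction vector to a single class index. Returns -1 when two or more
// classes tie for the highest value, as described for the output stream.
int winnerTakeAll(const std::vector<float>& prediction) {
    if (prediction.empty()) return -1;  // assumption: no classes, no winner
    std::size_t best = 0;
    bool tied = false;
    for (std::size_t i = 1; i < prediction.size(); ++i) {
        if (prediction[i] > prediction[best]) {
            best = i;      // new strict maximum clears any earlier tie
            tied = false;
        } else if (prediction[i] == prediction[best]) {
            tied = true;   // another class matches the current maximum
        }
    }
    return tied ? -1 : static_cast<int>(best);
}
```

A prediction of ( 0.2 0.7 0.1 ) maps to class 1, while ( 0.5 0.5 0 ) has two tied winners and maps to -1.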

void Classifier::setOutputVectorStream ( ofstream & ost ) [inline]

Set up a file destination for storing distributed prediction vectors, one per sample.
This output is available during test and classify operations.
Parameters:

ost Output file stream to which to write.

void Classifier::setPredPpmStream ( ofstream & ost ) [inline]

Set up a file destination for storing graphical decision surfaces, as generated by the predPpm() method.
This output is available during train and test operations. More details are available in predPpm().
Parameters:

ost Output file stream to which to write.

void Classifier::test ( ViewSet * vs )

Tests the classifier on the samples in the ViewSet, that is, generates a prediction of the class label of each sample.
If CvMode is set, then each View object is submitted to a separate classifier model instance for cross-validated testing. If not cross-validating, then a single instance generates the predictions for all the samples. Starts by resetting all entry count and class statistics counters, and generates any requested stream outputs as part of the process of running the test.
Parameters:

vs The test data.

void Classifier::thematicPpm ( ofstream & ost,

Image & img,

ofstream & confOst

)

Generates a thematic map, that is, one where a pixel's color indicates the predicted class for that pixel.
To tell what the colors mean, a legend can be generated via Palette::writeLegend().
Parameters:

ost Output file stream to send graphic to (should have extension '.ppm')

img Image object that is processed into a thematic map.

void Classifier::train ( ViewSet * vs )

Trains the classifier using the samples in the ViewSet object.
If CvMode is set, then training is cross-validated, creating one set of classifier model instances per View in the ViewSet. If not cross-validating, then each of the Views in the ViewSet is used to train the classifier in turn. If using class-capping, then training alternates between views, so that not all samples are drawn from the first View. When done, invokes predPpm() if this was requested via setPredPpmStream(). Also, starts out by resetting entry counts and class statistics counters.
Parameters:

vs The training data.

Generated on Tue Dec 13 11:00:26 2005 for Classer by Doxygen 1.4.3
