Adaptive Feature Classifier Properties window
This classifier is used throughout your project wherever content classification is enabled.
You can configure the Adaptive Feature Classifier properties with this window.
You must retrain the project before any changes to these settings take effect.
- Text Filtering
-
This group has the following settings:
- Use digits
-
This setting controls whether the classifier uses digits as features or ignores them during text filtering. (Default: Cleared)
- Min. word length
-
All words that are shorter than this value are ignored during text filtering. Regardless of word length, features with a very low or very high frequency are also ignored. (Default: 3)
- Training
-
This group has the following settings:
- Max. number of features
-
Limits the maximum number of internally generated features per class. (Default: 5000)
- Min. feature length
-
Specifies the minimum number of characters that should be used for a feature. This value cannot be smaller than the Min. word length. (Default: 3)
- Max. feature length
-
Specifies the maximum number of characters that are used for a feature. This value should not exceed 64 characters. (Default: 50)
- Automatic selection of Min. feature frequency
-
Enables the Min. feature frequency to be set automatically. If this setting is selected, you cannot manually assign a Min. feature frequency value. (Default: Cleared)
- Min. feature frequency
-
Specifies how often a substring must occur inside the training set of a class to be used as a feature for content classification. (Default: 2)
- Start features at beginning of words
-
Specifies that a feature substring needs to start at the beginning of a word. If this setting is cleared, the substring can start anywhere. (Default: Selected)
- Max. words per feature (0-n)
-
Limits the number of words per feature. A value of zero means unlimited words, although the total number of characters of the words per feature cannot exceed the "Max. feature length" property. (Default: 2)
- Use fuzzy string match
-
Enables fuzzy string matching, at the cost of slower classification performance. (Default: Cleared)
- Fuzzy length (5-10)
-
Configures the fuzzy string comparison. (Default: 5)
- Automatic selection of Min. class entropy
-
Enables the Min. class entropy to be set automatically. If this setting is selected, you cannot manually assign a Min. class entropy value. (Default: Cleared)
- Min. class entropy (0.0 - 1.0)
-
Controls the importance of a feature, depending on the number of classes in which it occurs. A value of 1.0 requires that a feature occurs only inside the sample documents of a single class; otherwise, it is not used for classification. The lower the value, the more classes can contain the feature inside the training set. (Default: 0.600)
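To make the interaction of these settings easier to picture, the following Python sketch imitates the text filtering and feature selection steps described above. It is an illustration only and does not reproduce the classifier's internal algorithm. All function names are hypothetical, features are simplified to sequences of whole words (as if Start features at beginning of words were selected), frequency is counted as the number of sample documents of a class that contain a feature, and `class_specificity` is an assumed stand-in for the class entropy measure, chosen only because it equals 1.0 when a feature occurs in a single class. Max. number of features and the fuzzy match settings are not modelled.

```python
import math
import re
from collections import Counter


def filter_words(text, use_digits=False, min_word_length=3):
    """Tokenize text, optionally keep digits, drop words that are too short."""
    pattern = r"[A-Za-z0-9]+" if use_digits else r"[A-Za-z]+"
    return [w.lower() for w in re.findall(pattern, text)
            if len(w) >= min_word_length]


def word_features(words, max_words=2, min_len=3, max_len=50):
    """Build features from runs of consecutive words.

    max_words=0 means no limit on the word count; the character limit
    max_len still applies in that case.
    """
    limit = max_words if max_words > 0 else len(words)
    features = set()
    for start in range(len(words)):
        for count in range(1, limit + 1):
            if start + count > len(words):
                break
            feature = " ".join(words[start:start + count])
            if min_len <= len(feature) <= max_len:
                features.add(feature)
    return features


def class_specificity(classes_with_feature, total_classes):
    """Assumed score: 1.0 for a single class, 0.0 if every class has it."""
    if total_classes < 2:
        return 1.0
    return 1.0 - math.log(classes_with_feature) / math.log(total_classes)


def select_features(training_set, min_frequency=2, min_class_entropy=0.6):
    """training_set maps a class name to a list of sample document texts."""
    per_class = {}
    for cls, docs in training_set.items():
        counts = Counter()
        for doc in docs:
            # Each document contributes a feature at most once.
            counts.update(word_features(filter_words(doc)))
        per_class[cls] = {f for f, n in counts.items() if n >= min_frequency}

    # Count in how many classes each surviving feature occurs.
    spread = Counter(f for feats in per_class.values() for f in feats)
    total = len(per_class)
    return {
        cls: {f for f in feats
              if class_specificity(spread[f], total) >= min_class_entropy}
        for cls, feats in per_class.items()
    }
```

In this sketch, a feature such as "total" that occurs often enough in two out of two classes scores 0.0 and is discarded at the default threshold of 0.600, whereas a feature found in only one class scores 1.0 and is kept.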
The following buttons are available at the bottom of this window:
Button | Description |
---|---|
OK | Closes the window and saves your changes. |
Cancel | Closes the window without saving your changes. |
Apply | Applies your changes without closing the window. |