Package weka.core
Class Utils
java.lang.Object
weka.core.Utils
- All Implemented Interfaces:
RevisionHandler
Class implementing some simple utility methods.
- Version:
- $Revision: 10570 $
- Author:
- Eibe Frank, Yong Wang, Len Trigg, Julien Prados
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static String | arrayToString(Object array) | Returns the given Array in a string representation.
static String | backQuoteChars(String string) | Converts carriage returns and new lines in a string into \r and \n.
static void | checkForRemainingOptions(String[] options) | Checks if the given array contains any non-empty options.
static String | convertNewLines(String string) | Converts carriage returns and new lines in a string into \r and \n.
static File | convertToRelativePath(File absolute) | Converts a File's absolute path to a path relative to the user (i.e. start) directory.
static final double | correlation(double[] y1, double[] y2, int n) | Returns the correlation coefficient of two double vectors.
static String | doubleToString(double value, int afterDecimalPoint) | Rounds a double and converts it into String.
static String | doubleToString(double value, int width, int afterDecimalPoint) | Rounds a double and converts it into a formatted decimal-justified String.
static boolean | eq(double a, double b) | Tests if a is equal to b.
static Object | forName | Creates a new instance of an object given its class name and (optional) arguments to pass to its setOptions method.
static Class | getArrayClass | Returns the basic class of an array class (handles multi-dimensional arrays).
static int | getArrayDimensions(Class array) | Returns the dimensions of the given array.
static int | getArrayDimensions(Object array) | Returns the dimensions of the given array.
static boolean | getFlag | Checks if the given array contains the flag "-Char".
static boolean | getFlag | Checks if the given array contains the flag "-String".
static String | getGlobalInfo(Object object, boolean addCapabilities) | Utility method for grabbing the global info help (if it exists) from an arbitrary object.
static String | getOption | Gets an option indicated by a flag "-Char" from the given array of strings.
static String | getOption | Gets an option indicated by a flag "-String" from the given array of strings.
static int | getOptionPos(char flag, String[] options) | Gets the index of an option or flag indicated by a flag "-Char" from the given array of strings.
static int | getOptionPos(String flag, String[] options) | Gets the index of an option or flag indicated by a flag "-String" from the given array of strings.
String | getRevision | Returns the revision string.
static boolean | gr(double a, double b) | Tests if a is greater than b.
static boolean | grOrEq(double a, double b) | Tests if a is greater or equal to b.
static double | info(int[] counts) | Computes entropy for an array of integers.
static String | joinOptions(String[] optionArray) | Joins all the options in an option array into a single string, as might be used on the command line.
static double | kthSmallestValue(double[] array, int k) | Returns the kth-smallest value in the array.
static int | kthSmallestValue(int[] array, int k) | Returns the kth-smallest value in the array.
static String | lineWrap | Implements simple line breaking.
static double | log2(double a) | Returns the logarithm of a for base 2.
static double[] | logs2probs(double[] a) | Converts an array containing the natural logarithms of probabilities stored in a vector back into probabilities.
static void | main | Main method for testing this class.
static int | maxIndex(double[] doubles) | Returns index of maximum element in a given array of doubles.
static int | maxIndex(int[] ints) | Returns index of maximum element in a given array of integers.
static double | mean(double[] vector) | Computes the mean for an array of doubles.
static int | minIndex(double[] doubles) | Returns index of minimum element in a given array of doubles.
static int | minIndex(int[] ints) | Returns index of minimum element in a given array of integers.
static void | normalize(double[] doubles) | Normalizes the doubles in the array by their sum.
static void | normalize(double[] doubles, double sum) | Normalizes the doubles in the array using the given value.
static String | padLeft | Pads a string to a specified length, inserting spaces on the left as required.
static String | padRight | Pads a string to a specified length, inserting spaces on the right as required.
static String[] | partitionOptions(String[] options) | Returns the secondary set of options (if any) contained in the supplied options array.
static int | probRound | Rounds a double to the next nearest integer value in a probabilistic fashion.
static double | probToLogOdds(double prob) | Returns the log-odds for a given probability.
static String | quote | Quotes a string if it contains special characters.
static Properties | readProperties(String resourceName) | Reads properties that inherit from three locations.
static String | removeSubstring(String inString, String substring) | Removes all occurrences of a string from another string.
static void | replaceMissingWithMAX_VALUE(double[] array) | Replaces all "missing values" in the given array of double values with MAX_VALUE.
static String | replaceSubstring(String inString, String subString, String replaceString) | Replaces all occurrences of a substring within a string with a new string.
static String | revertNewLines(String string) | Reverts \r and \n in a string into carriage returns and new lines.
static int | round(double value) | Rounds a double to the next nearest integer value.
static double | roundDouble(double value, int afterDecimalPoint) | Rounds a double to the given number of decimal places.
static boolean | sm(double a, double b) | Tests if a is smaller than b.
static boolean | smOrEq(double a, double b) | Tests if a is smaller or equal to b.
static int[] | sort(double[] array) | Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array.
static int[] | sort(int[] array) | Sorts a given array of integers in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array.
static int[] | sortWithNoMissingValues(double[] array) | Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array.
static String[] | splitOptions(String quotedOptionString) | Split up a string containing options into an array of strings, one for each option.
static int[] | stableSort(double[] array) | Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array.
static double | sum(double[] doubles) | Computes the sum of the elements of an array of doubles.
static int | sum(int[] ints) | Computes the sum of the elements of an array of integers.
static String | unbackQuoteChars(String string) | The inverse operation of backQuoteChars().
static String | unquote | Unquotes a previously quoted string (but only if necessary), i.e., it removes the single quotes around it.
static double | variance(double[] vector) | Computes the variance for an array of doubles.
static double | xlogx(int c) | Returns c*log2(c) for a given integer value c.
-
Field Details
-
log2
public static double log2
The natural logarithm of 2.
-
SMALL
public static double SMALL
The small deviation allowed in double comparisons.
-
-
Constructor Details
-
Utils
public Utils()
-
-
Method Details
-
readProperties
Reads properties that inherit from three locations. Properties are first defined in the system resource location (i.e. in the CLASSPATH). These default properties must exist. Properties defined in the user's home directory (optional) override default settings. Properties defined in the current directory (optional) override all these settings.
- Parameters:
resourceName - the location of the resource that should be loaded, e.g. "weka/core/Utils.props". (The use of hardcoded forward slashes here is OK; see jdk1.1/docs/guide/misc/resources.html.) This routine will also look for the file (in this case) "Utils.props" in the user's home directory and the current directory.
- Returns:
- the Properties
- Throws:
Exception - if no default properties are defined, or if an error occurs reading the properties files.
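The override order described above can be illustrated with the defaults chaining of java.util.Properties. This is only a sketch of the layering idea, not the Weka implementation: the class name LayeredPropertiesSketch and the method layer are hypothetical, and three in-memory property sets stand in for the three file locations.

```java
import java.util.Properties;

final class LayeredPropertiesSketch {
    /**
     * Builds a lookup chain in which currentDir overrides home, and home
     * overrides the mandatory defaults. Each argument stands in for one of
     * the three locations; no files are read in this sketch.
     */
    static Properties layer(Properties defaults, Properties home, Properties currentDir) {
        Properties homeLayer = new Properties(defaults); // falls back to defaults
        homeLayer.putAll(home);
        Properties result = new Properties(homeLayer);   // falls back to homeLayer
        result.putAll(currentDir);
        return result;
    }
}
```

Looking up a key with getProperty on the result consults the current-directory values first, then the home-directory values, then the defaults.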
-
correlation
public static final double correlation(double[] y1, double[] y2, int n)
Returns the correlation coefficient of two double vectors.
- Parameters:
y1 - double vector 1
y2 - double vector 2
n - the length of the two double vectors
- Returns:
- the correlation coefficient
-
removeSubstring
Removes all occurrences of a string from another string.
- Parameters:
inString - the string to remove substrings from.
substring - the substring to remove.
- Returns:
- the input string with occurrences of substring removed.
-
replaceSubstring
Replaces all occurrences of a substring within a string with a new string.
- Parameters:
inString - the string to replace substrings in.
subString - the substring to replace.
replaceString - the replacement substring
- Returns:
- the input string with occurrences of substring replaced.
-
padLeft
Pads a string to a specified length, inserting spaces on the left as required. If the string is too long, characters are removed (from the right).
- Parameters:
inString - the input string
length - the desired length of the output string
- Returns:
- the output string
-
padRight
Pads a string to a specified length, inserting spaces on the right as required. If the string is too long, characters are removed (from the right).
- Parameters:
inString - the input string
length - the desired length of the output string
- Returns:
- the output string
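The padding and truncation behaviour described for padLeft and padRight can be sketched as follows. This is an illustrative re-implementation based only on the descriptions above; the class name PadSketch is hypothetical and the Weka source may differ in detail.

```java
final class PadSketch {
    /** Right-aligns s in a field of the given length; overlong input is cut on the right. */
    static String padLeft(String s, int length) {
        if (s.length() >= length) {
            return s.substring(0, length);
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = s.length(); i < length; i++) {
            sb.append(' ');
        }
        return sb.append(s).toString();
    }

    /** Left-aligns s in a field of the given length; overlong input is cut on the right. */
    static String padRight(String s, int length) {
        if (s.length() >= length) {
            return s.substring(0, length);
        }
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < length) {
            sb.append(' ');
        }
        return sb.toString();
    }
}
```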
-
doubleToString
Rounds a double and converts it into String.
- Parameters:
value - the double value
afterDecimalPoint - the (maximum) number of digits permitted after the decimal point
- Returns:
- the double as a formatted string
-
doubleToString
Rounds a double and converts it into a formatted decimal-justified String. Trailing 0's are replaced with spaces.
- Parameters:
value - the double value
width - the width of the string
afterDecimalPoint - the number of digits after the decimal point
- Returns:
- the double as a formatted string
-
getArrayClass
Returns the basic class of an array class (handles multi-dimensional arrays).
- Parameters:
c - the array to inspect
- Returns:
- the class of the innermost elements
-
getArrayDimensions
Returns the dimensions of the given array. Even though the parameter is of type "Object" one can hand over primitive arrays, e.g. int[3] or double[2][4].
- Parameters:
array - the array to determine the dimensions for
- Returns:
- the dimensions of the array
-
getArrayDimensions
Returns the dimensions of the given array. Even though the parameter is of type "Object" one can hand over primitive arrays, e.g. int[3] or double[2][4].
- Parameters:
array - the array to determine the dimensions for
- Returns:
- the dimensions of the array
-
arrayToString
Returns the given Array in a string representation. Even though the parameter is of type "Object" one can hand over primitive arrays, e.g. int[3] or double[2][4].
- Parameters:
array - the array to return in a string representation
- Returns:
- the array as string
-
eq
public static boolean eq(double a, double b)
Tests if a is equal to b.
- Parameters:
a - a double
b - a double
-
checkForRemainingOptions
Checks if the given array contains any non-empty options.
- Parameters:
options - an array of strings
- Throws:
Exception - if there are any non-empty options
-
getFlag
Checks if the given array contains the flag "-Char". Stops searching at the first marker "--". If the flag is found, it is replaced with the empty string.
- Parameters:
flag - the character indicating the flag.
options - the array of strings containing all the options.
- Returns:
- true if the flag was found
- Throws:
Exception - if an illegal option was found
-
getFlag
Checks if the given array contains the flag "-String". Stops searching at the first marker "--". If the flag is found, it is replaced with the empty string.
- Parameters:
flag - the String indicating the flag.
options - the array of strings containing all the options.
- Returns:
- true if the flag was found
- Throws:
Exception - if an illegal option was found
-
getOption
Gets an option indicated by a flag "-Char" from the given array of strings. Stops searching at the first marker "--". Replaces flag and option with empty strings.
- Parameters:
flag - the character indicating the option.
options - the array of strings containing all the options.
- Returns:
- the indicated option or an empty string
- Throws:
Exception - if the option indicated by the flag can't be found
-
getOption
Gets an option indicated by a flag "-String" from the given array of strings. Stops searching at the first marker "--". Replaces flag and option with empty strings.
- Parameters:
flag - the String indicating the option.
options - the array of strings containing all the options.
- Returns:
- the indicated option or an empty string
- Throws:
Exception - if the option indicated by the flag can't be found
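The consume-and-blank behaviour of getFlag and getOption can be sketched as below. This is a simplified illustration under stated assumptions, not the Weka code: the class name OptionSketch is hypothetical, array entries are assumed non-null, and a flag with no following value raises an unchecked IllegalArgumentException instead of the checked Exception declared above.

```java
final class OptionSketch {
    /** Returns true and blanks the entry if "-flag" occurs before the first "--". */
    static boolean getFlag(String flag, String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                break; // everything after the marker belongs to the secondary set
            }
            if (options[i].equals("-" + flag)) {
                options[i] = "";
                return true;
            }
        }
        return false;
    }

    /** Returns the value following "-flag" (blanking both entries), or "" if absent. */
    static String getOption(String flag, String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                break;
            }
            if (options[i].equals("-" + flag)) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("No value given for -" + flag + " option.");
                }
                String value = options[i + 1];
                options[i] = "";
                options[i + 1] = "";
                return value;
            }
        }
        return "";
    }
}
```

Because consumed entries are blanked in place, a later check in the spirit of checkForRemainingOptions only has to look for non-empty strings.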
-
getOptionPos
Gets the index of an option or flag indicated by a flag "-Char" from the given array of strings. Stops searching at the first marker "--".
- Parameters:
flag - the character indicating the option.
options - the array of strings containing all the options.
- Returns:
- the position if found, or -1 otherwise
-
getOptionPos
Gets the index of an option or flag indicated by a flag "-String" from the given array of strings. Stops searching at the first marker "--".
- Parameters:
flag - the String indicating the option.
options - the array of strings containing all the options.
- Returns:
- the position if found, or -1 otherwise
-
quote
Quotes a string if it contains special characters. The following rules are applied:
A character is replaced by its backquoted version if it is one of " ' % \ \n \r \t.
A string is enclosed within single quotes if a character has been backquoted using the previous rule, or if it contains { } or is exactly equal to one of the strings , ? space or "" (empty string).
A quoted question mark distinguishes it from the missing value, which is represented as an unquoted question mark in ARFF files.
- Parameters:
string - the string to be quoted
- Returns:
- the string (possibly quoted)
-
unquote
Unquotes a previously quoted string (but only if necessary), i.e., it removes the single quotes around it. Inverse to quote(String).
- Parameters:
string - the string to process
- Returns:
- the unquoted string
-
backQuoteChars
Converts carriage returns and new lines in a string into \r and \n. Backquotes the following characters: ` " \ \t and %.
- Parameters:
string - the string
- Returns:
- the converted string
-
convertNewLines
Converts carriage returns and new lines in a string into \r and \n.
- Parameters:
string - the string
- Returns:
- the converted string
-
revertNewLines
Reverts \r and \n in a string into carriage returns and new lines.
- Parameters:
string - the string
- Returns:
- the converted string
-
partitionOptions
Returns the secondary set of options (if any) contained in the supplied options array. The secondary set is defined to be any options after the first "--". These options are removed from the original options array.
- Parameters:
options - the input array of options
- Returns:
- the array of secondary options
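A minimal sketch of the partitioning step is shown below. It assumes that "removed from the original options array" means the entries are overwritten with empty strings, in line with how getFlag and getOption are described; the class name PartitionSketch is hypothetical and entries are assumed non-null.

```java
import java.util.Arrays;

final class PartitionSketch {
    /** Splits off everything after the first "--" and blanks it in the original array. */
    static String[] partitionOptions(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                // Copy everything after the marker into the secondary set.
                String[] secondary = Arrays.copyOfRange(options, i + 1, options.length);
                // Blank out the marker and the moved entries in the original array.
                Arrays.fill(options, i, options.length, "");
                return secondary;
            }
        }
        return new String[0]; // no marker, so no secondary options
    }
}
```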
-
unbackQuoteChars
The inverse operation of backQuoteChars(). Converts back-quoted carriage returns and new lines in a string to the corresponding character ('\r' and '\n'). Also "un"-back-quotes the following characters: ` " \ \t and %.
- Parameters:
string - the string
- Returns:
- the converted string
-
splitOptions
Split up a string containing options into an array of strings, one for each option.
- Parameters:
quotedOptionString - the string containing the options
- Returns:
- the array of options
- Throws:
Exception - in case of an unterminated string, unknown character or a parse error
-
joinOptions
Joins all the options in an option array into a single string, as might be used on the command line.
- Parameters:
optionArray - the array of options
- Returns:
- the string containing all options.
-
forName
Creates a new instance of an object given its class name and (optional) arguments to pass to its setOptions method. If the object implements OptionHandler and the options parameter is non-null, the object will have its options set. Example use:
    String classifierName = Utils.getOption('W', options);
    Classifier c = (Classifier) Utils.forName(Classifier.class, classifierName, options);
    setClassifier(c);
- Parameters:
classType - the class that the instantiated object should be assignable to; an exception is thrown if this is not the case
className - the fully qualified class name of the object
options - an array of options suitable for passing to setOptions. May be null. Any options accepted by the object will be removed from the array.
- Returns:
- the newly created object, ready for use.
- Throws:
Exception - if the class name is invalid, or if the class is not assignable to the desired class type, or the options supplied are not acceptable to the object
-
info
public static double info(int[] counts)
Computes entropy for an array of integers.
- Parameters:
counts - array of counts
- Returns:
- - a log2 a - b log2 b - c log2 c + (a+b+c) log2 (a+b+c) when given array [a b c]
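The return expression above can be written directly in terms of xlogx. The sketch below follows that formula and the documented behaviour of xlogx for c = 0; the class name EntropySketch is hypothetical and this is not the Weka source.

```java
final class EntropySketch {
    /** Returns c*log2(c), with the convention that the result is 0 when c is 0. */
    static double xlogx(int c) {
        if (c == 0) {
            return 0.0;
        }
        return c * (Math.log(c) / Math.log(2));
    }

    /** For counts [a b c], returns -a log2 a - b log2 b - c log2 c + (a+b+c) log2 (a+b+c). */
    static double info(int[] counts) {
        int total = 0;
        double x = 0.0;
        for (int count : counts) {
            x -= xlogx(count);
            total += count;
        }
        return x + xlogx(total);
    }
}
```

Note that this quantity is the entropy multiplied by the total count: for counts [2 2] it evaluates to 4, i.e. 4 instances at 1 bit each.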
-
smOrEq
public static boolean smOrEq(double a, double b)
Tests if a is smaller or equal to b.
- Parameters:
a - a double
b - a double
-
grOrEq
public static boolean grOrEq(double a, double b)
Tests if a is greater or equal to b.
- Parameters:
a - a double
b - a double
-
sm
public static boolean sm(double a, double b)
Tests if a is smaller than b.
- Parameters:
a - a double
b - a double
-
gr
public static boolean gr(double a, double b)
Tests if a is greater than b.
- Parameters:
a - a double
b - a double
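The comparison methods eq, sm, gr, smOrEq and grOrEq are tied to the SMALL field, described as the small deviation allowed in double comparisons. A hedged sketch of tolerance-based comparison follows. The tolerance value 1e-6 is an assumption (this page does not state the value of SMALL), the class name ToleranceCompare is hypothetical, and the exact boundary handling in Weka may differ.

```java
final class ToleranceCompare {
    // Assumed tolerance; the documentation above does not give the value of Utils.SMALL.
    static final double SMALL = 1e-6;

    /** True if a and b are identical or differ by less than the tolerance. */
    static boolean eq(double a, double b) {
        return a == b || Math.abs(a - b) < SMALL;
    }

    /** True if a exceeds b by more than the tolerance. */
    static boolean gr(double a, double b) {
        return a - b > SMALL;
    }

    /** True if a is below b by more than the tolerance. */
    static boolean sm(double a, double b) {
        return b - a > SMALL;
    }

    static boolean grOrEq(double a, double b) {
        return gr(a, b) || eq(a, b);
    }

    static boolean smOrEq(double a, double b) {
        return sm(a, b) || eq(a, b);
    }
}
```

With such a tolerance, values that differ only by floating-point rounding (for example 0.1 + 0.2 and 0.3) compare as equal.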
-
kthSmallestValue
public static int kthSmallestValue(int[] array, int k)
Returns the kth-smallest value in the array.
- Parameters:
array - the array of integers
k - the value of k
- Returns:
- the kth-smallest value
-
kthSmallestValue
public static double kthSmallestValue(double[] array, int k)
Returns the kth-smallest value in the array.
- Parameters:
array - the array of doubles
k - the value of k
- Returns:
- the kth-smallest value
-
log2
public static double log2(double a)
Returns the logarithm of a for base 2.
- Parameters:
a - a double
- Returns:
- the logarithm for base 2
-
maxIndex
public static int maxIndex(double[] doubles)
Returns index of maximum element in a given array of doubles. First maximum is returned.
- Parameters:
doubles - the array of doubles
- Returns:
- the index of the maximum element
-
maxIndex
public static int maxIndex(int[] ints)
Returns index of maximum element in a given array of integers. First maximum is returned.
- Parameters:
ints - the array of integers
- Returns:
- the index of the maximum element
-
mean
public static double mean(double[] vector)
Computes the mean for an array of doubles.
- Parameters:
vector - the array
- Returns:
- the mean
-
minIndex
public static int minIndex(int[] ints)
Returns index of minimum element in a given array of integers. First minimum is returned.
- Parameters:
ints - the array of integers
- Returns:
- the index of the minimum element
-
minIndex
public static int minIndex(double[] doubles)
Returns index of minimum element in a given array of doubles. First minimum is returned.
- Parameters:
doubles - the array of doubles
- Returns:
- the index of the minimum element
-
normalize
public static void normalize(double[] doubles)
Normalizes the doubles in the array by their sum.
- Parameters:
doubles - the array of doubles
- Throws:
IllegalArgumentException - if sum is zero or NaN
-
normalize
public static void normalize(double[] doubles, double sum)
Normalizes the doubles in the array using the given value.
- Parameters:
doubles - the array of doubles
sum - the value by which the doubles are to be normalized
- Throws:
IllegalArgumentException - if sum is zero or NaN
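Both normalize variants can be sketched as an in-place division with the documented guard against a zero or NaN sum. This is an illustration based on the descriptions above; the class name NormalizeSketch and the exception messages are hypothetical.

```java
final class NormalizeSketch {
    /** Divides every element by the given sum, in place. */
    static void normalize(double[] doubles, double sum) {
        if (Double.isNaN(sum)) {
            throw new IllegalArgumentException("Can't normalize array. Sum is NaN.");
        }
        if (sum == 0) {
            throw new IllegalArgumentException("Can't normalize array. Sum is zero.");
        }
        for (int i = 0; i < doubles.length; i++) {
            doubles[i] /= sum;
        }
    }

    /** Divides every element by the sum of all elements, in place. */
    static void normalize(double[] doubles) {
        double sum = 0;
        for (double d : doubles) {
            sum += d;
        }
        normalize(doubles, sum);
    }
}
```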
-
logs2probs
public static double[] logs2probs(double[] a)
Converts an array containing the natural logarithms of probabilities stored in a vector back into probabilities. The probabilities are assumed to sum to one.
- Parameters:
a - an array holding the natural logarithms of the probabilities
- Returns:
- the converted array
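One common way to perform this conversion is to subtract the largest log value before exponentiating, then renormalize so the results sum to one. The sketch below uses that approach as an assumption (the page does not describe the method used internally); the class name LogProbSketch is hypothetical and the input is assumed non-empty.

```java
final class LogProbSketch {
    /** Converts natural-log probabilities into probabilities that sum to one. */
    static double[] logs2probs(double[] a) {
        // Shift by the maximum so that exp() cannot underflow for every entry at once.
        double max = a[0];
        for (double v : a) {
            max = Math.max(max, v);
        }
        double[] result = new double[a.length];
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            result[i] = Math.exp(a[i] - max);
            sum += result[i];
        }
        // Renormalize; the shift cancels out in this division.
        for (int i = 0; i < result.length; i++) {
            result[i] /= sum;
        }
        return result;
    }
}
```

The shift matters for very negative inputs: two log values of -1000 would both underflow to 0 if exponentiated directly, but yield 0.5 each with the shift.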
-
probToLogOdds
public static double probToLogOdds(double prob)
Returns the log-odds for a given probability.
- Parameters:
prob - the probability
- Returns:
- the log-odds after the probability has been mapped to [Utils.SMALL, 1-Utils.SMALL]
-
round
public static int round(double value)
Rounds a double to the next nearest integer value. The JDK version of it doesn't work properly.
- Parameters:
value - the double value
- Returns:
- the resulting integer value
-
probRound
Rounds a double to the next nearest integer value in a probabilistic fashion (e.g. 0.8 has a 20% chance of being rounded down to 0 and a 80% chance of being rounded up to 1). In the limit, the average of the rounded numbers generated by this procedure should converge to the original double.
- Parameters:
value - the double value
rand - the random number generator
- Returns:
- the resulting integer value
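Probabilistic rounding as described above can be sketched by rounding up with probability equal to the fractional part. This is an illustration only: the class name ProbRoundSketch is hypothetical, and the handling of negative values (via floor) is an assumption not stated on this page.

```java
import java.util.Random;

final class ProbRoundSketch {
    /** Rounds up with probability equal to the fractional part, down otherwise. */
    static int probRound(double value, Random rand) {
        double lower = Math.floor(value);
        double fraction = value - lower; // in [0, 1)
        return (int) lower + (rand.nextDouble() < fraction ? 1 : 0);
    }
}
```

Averaging many calls for the same input approaches the input itself, which is the convergence property mentioned in the description.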
-
replaceMissingWithMAX_VALUE
public static void replaceMissingWithMAX_VALUE(double[] array)
Replaces all "missing values" in the given array of double values with MAX_VALUE.
- Parameters:
array - the array to be modified.
-
roundDouble
public static double roundDouble(double value, int afterDecimalPoint)
Rounds a double to the given number of decimal places.
- Parameters:
value - the double value
afterDecimalPoint - the number of digits after the decimal point
- Returns:
- the double rounded to the given precision
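A common way to round to a fixed number of decimal places is to scale by a power of ten, round to a whole number, and scale back. The sketch below uses that technique as an assumption about how such a method can be written; the class name RoundSketch is hypothetical.

```java
final class RoundSketch {
    /** Rounds value to the given number of digits after the decimal point. */
    static double roundDouble(double value, int afterDecimalPoint) {
        double scale = Math.pow(10.0, afterDecimalPoint);
        // Math.round returns a long; dividing by the double scale converts back.
        return Math.round(value * scale) / scale;
    }
}
```

Because the result is still a binary double, it is the nearest representable value to the decimal result rather than an exact decimal.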
-
sort
public static int[] sort(int[] array)
Sorts a given array of integers in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array. The sort is stable. (Equal elements remain in their original order.)
- Parameters:
array - this array is not changed by the method!
- Returns:
- an array of integers with the positions in the sorted array.
-
sort
public static int[] sort(double[] array)
Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array. Note these changes: the sort is no longer stable and it no longer uses safe floating-point comparisons. Occurrences of Double.NaN are treated as Double.MAX_VALUE.
- Parameters:
array - this array is not changed by the method!
- Returns:
- an array of integers with the positions in the sorted array.
-
sortWithNoMissingValues
public static int[] sortWithNoMissingValues(double[] array)
Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array. Missing values in the given array are replaced by Double.MAX_VALUE, so the array is modified in that case!
- Parameters:
array - the array to be sorted, which is modified if it has missing values
- Returns:
- an array of integers with the positions in the sorted array.
-
stableSort
public static int[] stableSort(double[] array)
Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array. The sort is stable. (Equal elements remain in their original order.) Occurrences of Double.NaN are treated as Double.MAX_VALUE.
- Parameters:
array - this array is not changed by the method!
- Returns:
- an array of integers with the positions in the sorted array.
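The idea of an index-returning stable sort can be sketched with the standard library, which guarantees stability when sorting object arrays. The sketch assumes that entry i of the result holds the original index of the i-th smallest element; the class name IndexSortSketch is hypothetical and this is not the algorithm used in Weka.

```java
import java.util.Arrays;
import java.util.Comparator;

final class IndexSortSketch {
    /** Sort key: NaN is treated as Double.MAX_VALUE, as described above. */
    private static double key(double v) {
        return Double.isNaN(v) ? Double.MAX_VALUE : v;
    }

    /** Returns the indices of array in ascending order of value; array is not modified. */
    static int[] stableSort(double[] array) {
        Integer[] idx = new Integer[array.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Arrays.sort on object arrays is stable, so ties keep their original order.
        Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> key(array[i])));
        int[] result = new int[idx.length];
        for (int i = 0; i < idx.length; i++) {
            result[i] = idx[i];
        }
        return result;
    }
}
```

For the input [3.0, 1.0, NaN, 1.0] this yields the indices [1, 3, 0, 2]: the two equal values keep their original order and NaN sorts last.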
-
variance
public static double variance(double[] vector)
Computes the variance for an array of doubles.
- Parameters:
vector - the array
- Returns:
- the variance
-
sum
public static double sum(double[] doubles)
Computes the sum of the elements of an array of doubles.
- Parameters:
doubles - the array of doubles
- Returns:
- the sum of the elements
-
sum
public static int sum(int[] ints)
Computes the sum of the elements of an array of integers.
- Parameters:
ints - the array of integers
- Returns:
- the sum of the elements
-
xlogx
public static double xlogx(int c)
Returns c*log2(c) for a given integer value c.
- Parameters:
c - an integer value
- Returns:
- c*log2(c) (but is careful to return 0 if c is 0)
-
convertToRelativePath
Converts a File's absolute path to a path relative to the user (i.e. start) directory. Includes an additional workaround for Cygwin, which doesn't like upper case drive letters.
- Parameters:
absolute - the File to convert to relative path
- Returns:
- a File with a path that is relative to the user's directory
- Throws:
Exception - if the path cannot be constructed
-
getGlobalInfo
Utility method for grabbing the global info help (if it exists) from an arbitrary object. Can also append capabilities information if the object is a CapabilitiesHandler.
- Parameters:
object - the object to grab global info from
addCapabilities - true if capabilities information is to be added to the result
- Returns:
- the global help info or null if global info does not exist
-
lineWrap
Implements simple line breaking. Reformats the given string by introducing line breaks so that, ideally, no line exceeds the given number of characters. Line breaks are assumed to be indicated by newline characters. Existing line breaks are left in the input text.
- Parameters:
input - the string to line wrap
maxLineWidth - the maximum permitted number of characters in a line
- Returns:
- the processed string
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
ops - some dummy options
-