Package weka.core.converters
Class CSVLoader
java.lang.Object
weka.core.converters.AbstractLoader
weka.core.converters.AbstractFileLoader
weka.core.converters.CSVLoader
- All Implemented Interfaces:
Serializable, BatchConverter, FileSourcedConverter, Loader, EnvironmentHandler, OptionHandler, RevisionHandler
Reads a source that is in comma-separated or tab-separated
format. Assumes that the first row in the file determines the
number and names of the attributes.
Valid options are:
-N <range> The range of attributes to force type to be NOMINAL. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-S <range> The range of attributes to force type to be STRING. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-D <range> The range of attributes to force type to be DATE. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-format <date format> The date formatting string to use to parse date values. (default: "yyyy-MM-dd'T'HH:mm:ss")
-M <str> The string representing a missing value. (default: ?)
-E <enclosures> The enclosure character(s) to use for strings. Specify as a comma-separated list (e.g. ",') (default: '"')
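As a rough illustration of how these options combine, the sketch below passes -N and -M to setOptions and then loads a small in-memory CSV. The class name CsvLoaderOptionsSketch, the sample data, and the use of "NA" as the missing-value marker are assumptions made for this example only; it also assumes Weka is on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import weka.core.Instances;
import weka.core.converters.CSVLoader;

// Hypothetical example class; not part of Weka.
public class CsvLoaderOptionsSketch {

    /** Loads a tiny CSV, forcing the first column to NOMINAL and reading "NA" as missing. */
    public static Instances load() throws Exception {
        // Made-up sample data: header row followed by two data rows.
        String csv = "id,score,label\n"
                   + "1,3.5,yes\n"
                   + "2,NA,no\n";

        CSVLoader loader = new CSVLoader();
        // Same flags as documented above: -N takes a range, -M a marker string.
        loader.setOptions(new String[] {"-N", "first", "-M", "NA"});
        loader.setSource(new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8)));
        return loader.getDataSet();
    }

    public static void main(String[] args) throws Exception {
        Instances data = load();
        System.out.println("id is nominal: " + data.attribute(0).isNominal());
        System.out.println("row 2 score missing: " + data.instance(1).isMissing(1));
    }
}
```

Without -N, a column holding only values such as 1 and 2 would normally be read as numeric, which is the usual reason for forcing a type.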
- Version:
- $Revision: 10372 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
Field Summary
Fields
FILE_EXTENSION the file extension.
Fields inherited from class weka.core.converters.AbstractFileLoader
FILE_EXTENSION_COMPRESSED
Fields inherited from interface weka.core.converters.Loader
BATCH, INCREMENTAL, NONE
-
Constructor Summary
Constructors
CSVLoader() default constructor.
-
Method Summary
Modifier and Type, Method, Description
dateAttributesTipText() Returns the tip text for this property.
dateFormatTipText() Returns the tip text for this property.
enclosureCharactersTipText() Returns the tip text for this property.
getDataSet() Returns the full data set.
getDateAttributes() Returns the current attribute range to be forced to type date.
getDateFormat() Get the format to use for parsing date values.
getEnclosureCharacters() Get the character(s) to use/recognize as string enclosures.
getFileDescription() Returns a description of the file type.
getFileExtension() Get the file extension used for CSV files.
String[] getFileExtensions() Gets all the file extensions used for this type of file.
getMissingValue() Returns the current placeholder for missing values.
getNextInstance(Instances structure) CSVLoader is unable to process a data set incrementally.
getNominalAttributes() Returns the current attribute range to be forced to type nominal.
String[] getOptions() Gets the current settings of the loader.
getRevision() Returns the revision string.
getStringAttributes() Returns the current attribute range to be forced to type string.
getStructure() Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances.
globalInfo() Returns a string describing this loader.
listOptions() Returns an enumeration describing the available options.
static void main(String[] args) Main method.
missingValueTipText() Returns the tip text for this property.
nominalAttributesTipText() Returns the tip text for this property.
void reset() Resets the Loader ready to read a new data set or the same data set again.
void setDateAttributes(String value) Set the attribute range to be forced to type date.
void setDateFormat(String value) Set the format to use for parsing date values.
void setEnclosureCharacters(String enclosure) Set the character(s) to use/recognize as string enclosures.
void setMissingValue(String value) Sets the placeholder for missing values.
void setNominalAttributes(String value) Sets the attribute range to be forced to type nominal.
void setOptions(String[] options) Parses a given list of options.
void setSource(File file) Resets the Loader object and sets the source of the data set to be the supplied File object.
void setSource(InputStream input) Resets the Loader object and sets the source of the data set to be the supplied Stream object.
void setStringAttributes(String value) Sets the attribute range to be forced to type string.
stringAttributesTipText() Returns the tip text for this property.
Methods inherited from class weka.core.converters.AbstractFileLoader
getUseRelativePath, retrieveFile, runFileLoader, setEnvironment, setFile, setUseRelativePath, useRelativePathTipText
Methods inherited from class weka.core.converters.AbstractLoader
setRetrieval
-
Field Details
-
FILE_EXTENSION
the file extension.
-
-
Constructor Details
-
CSVLoader
public CSVLoader()
Default constructor.
-
-
Method Details
-
getFileExtension
Get the file extension used for CSV files.
- Specified by:
- getFileExtension in interface FileSourcedConverter
- Returns:
- the file extension
-
getFileDescription
Returns a description of the file type.
- Specified by:
- getFileDescription in interface FileSourcedConverter
- Returns:
- a short file description
-
getFileExtensions
Gets all the file extensions used for this type of file.
- Specified by:
- getFileExtensions in interface FileSourcedConverter
- Returns:
- the file extensions
-
globalInfo
Returns a string describing this loader.
- Returns:
- a description of the loader suitable for displaying in the explorer/experimenter GUI
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Returns:
- an enumeration of all the available options.
-
setOptions
Parses a given list of options. Valid options are:
-N <range> The range of attributes to force type to be NOMINAL. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-S <range> The range of attributes to force type to be STRING. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-D <range> The range of attributes to force type to be DATE. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-format <date format> The date formatting string to use to parse date values. (default: "yyyy-MM-dd'T'HH:mm:ss")
-M <str> The string representing a missing value. (default: ?)
-E <enclosures> The enclosure character(s) to use for strings. Specify as a comma-separated list (e.g. ",') (default: '"')
- Specified by:
- setOptions in interface OptionHandler
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
-
getOptions
Gets the current settings of the loader.
- Specified by:
- getOptions in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions
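Because the array returned by getOptions is described as suitable for passing to setOptions, one plausible use is copying a configuration from one loader to another. The sketch below assumes Weka is on the classpath; the class name CsvLoaderOptionRoundTrip and the helper copyConfiguration are hypothetical names introduced for the example.

```java
import weka.core.converters.CSVLoader;

// Hypothetical example class; not part of Weka.
public class CsvLoaderOptionRoundTrip {

    /** Creates a new loader configured with the option settings of the given one. */
    public static CSVLoader copyConfiguration(CSVLoader source) throws Exception {
        CSVLoader copy = new CSVLoader();
        // Option parsing may modify the array it is handed, so pass a clone
        // in case the caller wants to keep the original array intact.
        copy.setOptions(source.getOptions().clone());
        return copy;
    }

    public static void main(String[] args) throws Exception {
        CSVLoader original = new CSVLoader();
        original.setMissingValue("NA");          // illustrative marker
        original.setDateFormat("yyyy-MM-dd");    // illustrative format

        CSVLoader copy = copyConfiguration(original);
        System.out.println("missing value: " + copy.getMissingValue());
        System.out.println("date format:   " + copy.getDateFormat());
    }
}
```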
-
setNominalAttributes
Sets the attribute range to be forced to type nominal.
- Parameters:
- value - the range
-
getNominalAttributes
Returns the current attribute range to be forced to type nominal.
- Returns:
- the range
-
nominalAttributesTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setStringAttributes
Sets the attribute range to be forced to type string.
- Parameters:
- value - the range
-
getStringAttributes
Returns the current attribute range to be forced to type string.
- Returns:
- the range
-
stringAttributesTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setDateAttributes
Set the attribute range to be forced to type date.
- Parameters:
- value - the range
-
getDateAttributes
Returns the current attribute range to be forced to type date.
- Returns:
- the range.
-
dateAttributesTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setDateFormat
Set the format to use for parsing date values.
- Parameters:
- value - the format to use.
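The date range and the date format are typically set together. The following is a minimal sketch, assuming Weka is on the classpath and that the second column of some made-up data holds dates in yyyy-MM-dd form; the class name CsvLoaderDateSketch and the sample rows are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import weka.core.Instances;
import weka.core.converters.CSVLoader;

// Hypothetical example class; not part of Weka.
public class CsvLoaderDateSketch {

    /** Loads a tiny CSV whose second column is forced to type DATE. */
    public static Instances load() throws Exception {
        // Made-up sample data.
        String csv = "id,measured\n"
                   + "1,2020-01-15\n"
                   + "2,2021-06-30\n";

        CSVLoader loader = new CSVLoader();
        loader.setDateAttributes("2");          // ranges are 1-based, so "2" is the second column
        loader.setDateFormat("yyyy-MM-dd");     // pattern used to parse the date strings
        loader.setSource(new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8)));
        return loader.getDataSet();
    }

    public static void main(String[] args) throws Exception {
        Instances data = load();
        System.out.println("is date: " + data.attribute(1).isDate());
        // stringValue renders a date value using the attribute's own format.
        System.out.println("first value: " + data.instance(0).stringValue(1));
    }
}
```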
-
getDateFormat
Get the format to use for parsing date values.
- Returns:
- the format to use for parsing date values.
-
dateFormatTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
enclosureCharactersTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setEnclosureCharacters
Set the character(s) to use/recognize as string enclosures.
- Parameters:
- enclosure - the characters to use as string enclosures
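Enclosure characters matter mostly when a field itself contains the separator. The sketch below is an illustration under assumptions: it sets the double quote as the only enclosure character and loads made-up rows whose first field contains a comma. The class name CsvLoaderEnclosureSketch and the data are hypothetical, and Weka is assumed to be on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import weka.core.Instances;
import weka.core.converters.CSVLoader;

// Hypothetical example class; not part of Weka.
public class CsvLoaderEnclosureSketch {

    /** Loads rows whose first field is wrapped in double quotes and contains a comma. */
    public static Instances load() throws Exception {
        // Made-up sample data; the quoted names contain the field separator.
        String csv = "name,age\n"
                   + "\"Smith, John\",42\n"
                   + "\"Doe, Jane\",37\n";

        CSVLoader loader = new CSVLoader();
        loader.setEnclosureCharacters("\"");    // treat only the double quote as an enclosure
        loader.setSource(new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8)));
        return loader.getDataSet();
    }

    public static void main(String[] args) throws Exception {
        Instances data = load();
        // The comma inside the enclosure should not split the field.
        System.out.println("attributes: " + data.numAttributes());
        System.out.println("first name field: " + data.instance(0).stringValue(0));
    }
}
```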
-
getEnclosureCharacters
Get the character(s) to use/recognize as string enclosures.
- Returns:
- the characters to use as string enclosures
-
setMissingValue
Sets the placeholder for missing values.
- Parameters:
- value - the placeholder
-
getMissingValue
Returns the current placeholder for missing values.
- Returns:
- the placeholder
-
missingValueTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setSource
Resets the Loader object and sets the source of the data set to be the supplied Stream object.
- Specified by:
- setSource in interface Loader
- Overrides:
- setSource in class AbstractLoader
- Parameters:
- input - the input stream
- Throws:
- IOException - if an error occurs
-
setSource
Resets the Loader object and sets the source of the data set to be the supplied File object.
- Specified by:
- setSource in interface Loader
- Overrides:
- setSource in class AbstractFileLoader
- Parameters:
- file - the source file.
- Throws:
- IOException - if an error occurs
-
getStructure
Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances.
- Specified by:
- getStructure in interface Loader
- Specified by:
- getStructure in class AbstractLoader
- Returns:
- the structure of the data set as an empty set of Instances
- Throws:
- IOException - if an error occurs
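To make the "empty set of instances" concrete, the sketch below asks only for the structure and lists the attribute names taken from the header row. It assumes Weka is on the classpath; the class name CsvLoaderStructureSketch and the sample data are hypothetical. Attribute types are deliberately not inspected here, since how much of the data is examined before types are fixed is not stated in this page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import weka.core.Instances;
import weka.core.converters.CSVLoader;

// Hypothetical example class; not part of Weka.
public class CsvLoaderStructureSketch {

    /** Returns only the header (attribute information) of a tiny in-memory CSV. */
    public static Instances header() throws Exception {
        // Made-up sample data.
        String csv = "a,b,c\n"
                   + "1,2,3\n"
                   + "4,5,6\n";

        CSVLoader loader = new CSVLoader();
        loader.setSource(new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8)));
        return loader.getStructure();
    }

    public static void main(String[] args) throws Exception {
        Instances structure = header();
        System.out.println("rows in structure: " + structure.numInstances());
        for (int i = 0; i < structure.numAttributes(); i++) {
            System.out.println("attribute " + (i + 1) + ": " + structure.attribute(i).name());
        }
    }
}
```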
-
getDataSet
Returns the full data set. If the structure has not yet been determined by a call to getStructure, this method does so before processing the rest of the data set.
- Specified by:
- getDataSet in interface Loader
- Specified by:
- getDataSet in class AbstractLoader
- Returns:
- the full data set as a set of Instances
- Throws:
- IOException - if there is no source or parsing fails
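A typical batch load from a file could look like the sketch below, which calls getDataSet directly and relies on it to determine the structure first. It assumes Weka is on the classpath; the class name CsvLoaderFileSketch, the helper loadFile, and the temporary file contents are hypothetical.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import weka.core.Instances;
import weka.core.converters.CSVLoader;

// Hypothetical example class; not part of Weka.
public class CsvLoaderFileSketch {

    /** Loads the whole CSV file in one batch. */
    public static Instances loadFile(File csvFile) throws Exception {
        CSVLoader loader = new CSVLoader();
        loader.setSource(csvFile);      // throws IOException if the source cannot be set
        return loader.getDataSet();     // determines the structure itself if needed
    }

    public static void main(String[] args) throws Exception {
        // Write made-up sample data to a temporary file so the example is self-contained.
        File tmp = File.createTempFile("csvloader-sketch", ".csv");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "x,y\n1,2\n3,4\n".getBytes(StandardCharsets.UTF_8));

        Instances data = loadFile(tmp);
        System.out.println(data.numInstances() + " rows, " + data.numAttributes() + " attributes");
    }
}
```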
-
getNextInstance
CSVLoader is unable to process a data set incrementally.
- Specified by:
- getNextInstance in interface Loader
- Specified by:
- getNextInstance in class AbstractLoader
- Parameters:
- structure - ignored
- Returns:
- never returns without throwing an exception
- Throws:
- IOException - always. CSVLoader is unable to process a data set incrementally.
-
reset
Resets the Loader ready to read a new data set or the same data set again.
- Specified by:
- reset in interface Loader
- Overrides:
- reset in class AbstractFileLoader
- Throws:
- IOException - if something goes wrong
-
getRevision
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Returns:
- the revision
-
main
Main method.
- Parameters:
- args - should contain the name of an input file.
-