Once you have defined your columns, you have to give each of them its proper data type.
The current version of nbt supports 4 types: string, number, boolean and date.
(Hint: if you have already provided some rows, nbt can extract information from the data, so some types may already be configured. We still encourage you to review them.)
Data typing is very important for the analysis because the algorithms can treat a column differently depending on its type.
Some rules of thumb to follow:
- Dates are almost never good predictors. When working with dates, you should convert them to numbers, for instance by calculating the number of years or months elapsed.
- Numbers should be used when something has a logical ordering (a higher number means a higher value of the property). An example where it is correct to use a number is Salary. An example where it is not correct to use a number is Zip Code (though Zip Codes are usually written as numbers, it is better to use them as strings so they can be grouped).
- Booleans should be used for true/false values.
- Strings should be used for everything else.
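The rules above can be applied to a raw row before it is handed to nbt. The following is a minimal Python sketch, not part of nbt itself: the column names, the sample values, the reference date and the `months_between` helper are all hypothetical and only illustrate the idea. It turns a date into a number of elapsed months, keeps the salary as a number, keeps the zip code as a string and parses a true/false flag into a boolean.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end (hypothetical helper)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # The last month only counts once the day of the month has been reached.
    if end.day < start.day:
        months -= 1
    return months


# Hypothetical raw row, with every value still stored as text.
raw = {
    "signup_date": "2020-03-15",
    "salary": "52000",
    "zip_code": "02134",
    "is_active": "true",
}

# Assumed reference date for the elapsed-time calculation.
reference = date(2024, 1, 1)

typed = {
    # Date converted to a number: months elapsed since signup.
    "months_since_signup": months_between(
        date.fromisoformat(raw["signup_date"]), reference
    ),
    # Salary has a logical ordering, so it is a number.
    "salary": float(raw["salary"]),
    # Zip code has no meaningful ordering, so it stays a string.
    "zip_code": raw["zip_code"],
    # True/false value parsed into a boolean.
    "is_active": raw["is_active"].lower() == "true",
}

print(typed)
```

Keeping the zip code as a string also preserves its leading zero, which would be lost if the value were converted to a number.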