How can I change a string variable into a numeric variable?

Sometimes a data set contains a variable that looks numeric because its values are digits, but is actually stored as a string variable.  Because you cannot perform most statistical operations on a string variable, you may want to turn the string variable into a numeric variable.  Consider the following data set.

data list list / id * name (A5) score (A5) gender (A2).
begin data
1 "Beth" "57" "f"
2 "Bob" "65" "m"
3 "Barb" "70" "f"
4 "Andy" "45" "m"
5 "Al" "80" "m"
6 "Ann" "81" "f"
7 "Pete" "66" "m"
8 "Pam" "60" "f"
9 "Phil" "70" "m"
end data.

Because the variable score is a string variable, we cannot calculate a mean or other descriptive statistics for it.  There are several ways to change a string variable into a numeric variable.  One way is to use the number function with the compute command, which creates a new numeric variable.  The number function takes two arguments in parentheses: the name of the string variable and the numeric format used to read its values.  Here, F2 reads each value as a two-character whole number, which is enough because every score has two digits.

compute score1 = number(score, F2).

Now that we have the scores in a numeric variable, we can calculate some descriptive statistics.

desc var = score1.
Descriptive Statistics

                     N   Minimum   Maximum      Mean   Std. Deviation
SCORE1               9     45.00     81.00   66.0000         11.24722
Valid N (listwise)   9
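
After a conversion it can be worth checking whether any non-blank string failed to convert, because the number function returns system-missing for values it cannot read.  The following is a minimal sketch of such a check, assuming the score and score1 variables created above; with this data set every score converts, so no cases should be listed.

* List cases where a non-blank score did not convert.
temporary.
select if (missing(score1) and score ne ' ').
list variables = id score score1.

The temporary command limits the selection to the list command, so no cases are dropped from the data set.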

Another way to convert string representations of numeric values into a numeric variable is to use the recode command with the convert option.

recode score (convert) into score2.

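In some cases, your string variable may contain non-numeric symbols that stand for numeric values.  You can recode those symbols into numbers within the same command, as shown below.  Note that the specification for blanks is placed before (convert); if (convert) came first, blank strings would be set to system-missing before the blank specification could be applied.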

recode score ('  ' = -9) (convert) ('-' = 11) ('&' = 12) into newvar1.

If you have only a few values in your string variable, you could use the recode command and create a new numeric variable.  Let's convert the string variable gender into a numeric variable.

recode gender ('m' = 1) ('f' = 2) into ngender.
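
Once the categories are stored as numbers, the codes alone no longer say what they mean.  One possible follow-up is to attach value labels to the new variable.  The label text below is an assumption about what 'm' and 'f' stand for in this data set.

* Attach labels to the numeric codes (label text is assumed).
value labels ngender 1 'male' 2 'female'.
freq var = ngender.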

Another way to recode gender, or any string variable for which you want a numeric representation of the categories, is to use the autorecode command.

autorecode gender /into ngen.
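
The autorecode command assigns consecutive integers to the string values in ascending order and keeps the original strings as value labels, so here 'f' should become 1 and 'm' should become 2, the reverse of the manual recode above.  To see the mapping that was applied, you can add the print subcommand.  In the sketch below, ngen2 is a hypothetical variable name chosen so that it does not clash with ngen.

* Show the correspondence between old string values and new codes.
autorecode gender /into ngen2 /print.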

We recommend that you convert or recode your variables into a new variable, such that the original variable is not overwritten or modified in any way.  If the converting or recoding does not work as you intend, you still have the original variable intact, and you can try the converting or recoding again.

If you are using SPSS version 16 or higher, you can also use the alter type command.  For example,

alter type score (f8.2).

Please note that the alter type command changes the variable in place.  It does not create a new variable, so the original string version is overwritten.
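
If you want to keep the original values, one option is to copy the string variable before running alter type, instead of running the command above directly.  This is only a sketch; score_orig is a hypothetical name, and its width (A5) is chosen to match the width of score in the example data.

* Keep a copy of the original string, then convert score in place.
string score_orig (A5).
compute score_orig = score.
execute.
alter type score (f8.2).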

For information regarding the types of variables available in SPSS, please see the SPSS Command Syntax Reference (which can be accessed by clicking on Help and then Command Syntax Reference).  From there, click on "Universals" and then "Variable Types and Formats".  In the example above, the "f" in (f8.2) specifies the standard numeric format, which means that the variable will be numeric.  The "8" means that the variable will have a total width of 8 characters, and the ".2" means that the variable will have two digits after the decimal point.  (To be completely explicit, this means that there can be up to five digits before the decimal point, one character for the decimal point itself, and two digits after it, for a total of eight.)

If you wanted to specify a string variable, you might use something like (a10).  The "a" means alphanumeric, which means that a string (AKA character) variable will be created, and "10" means that the string variable will have a length of 10.

