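Factors are data objects used to categorize data and store it as levels. They can store both strings and integers. They are useful for columns that have a limited number of unique values, for example "Male", "Female" or TRUE, FALSE. They are very useful in statistical modeling.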
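To make the idea of "levels" concrete, the following minimal sketch inspects what a factor stores. The `responses` vector and its values are made up for illustration. It uses the factor() function, which the next paragraph introduces.

```r
# Hypothetical survey responses; the values are invented for illustration.
responses <- c("yes", "no", "yes", "maybe", "no", "yes")
f <- factor(responses)

# The distinct categories, sorted alphabetically by default.
print(levels(f))      # "maybe" "no" "yes"

# The integer codes stored internally, one per element,
# each pointing at a position in levels(f).
print(as.integer(f))  # 3 2 3 1 2 3

# Number of levels, and number of observations per level.
print(nlevels(f))     # 3
print(table(f))
```

Each element is kept as a small integer code that indexes into the vector of levels, which is why a factor suits columns with few distinct values.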
Factors are created with the factor() function, which takes a vector as input.
Example

    # Create a vector as input.
    data <- c("East","West","East","North","North","East","West","West","West","East","North")

    print(data)
    print(is.factor(data))

    # Apply the factor function.
    factor_data <- factor(data)

    print(factor_data)
    print(is.factor(factor_data))

When we execute the above code, it produces the following result:

    [1] "East" "West" "East" "North" "North" "East" "West" "West" "West" "East" "North"
    [1] FALSE
    [1] East West East North North East West West West East North
    Levels: East North West
    [1] TRUE

Factors in Data Frame

When a data frame is created with a column of text data, R versions before 4.0.0 treat the text column as categorical data and create a factor from it by default. Since R 4.0.0 the default is stringsAsFactors = FALSE, so the example below passes stringsAsFactors = TRUE explicitly to get the same behavior.

    # Create the vectors for data frame.
    height <- c(132,151,162,139,166,147,122)
    weight <- c(48,49,66,53,67,52,40)
    gender <- c("male","male","female","female","male","female","male")

    # Create the data frame, converting text columns to factors.
    input_data <- data.frame(height, weight, gender, stringsAsFactors = TRUE)
    print(input_data)

    # Test if the gender column is a factor.
    print(is.factor(input_data$gender))

    # Print the gender column to see the levels.
    print(input_data$gender)

When we execute the above code, it produces the following result:

      height weight gender
    1    132     48   male
    2    151     49   male
    3    162     66 female
    4    139     53 female
    5    166     67   male
    6    147     52 female
    7    122     40   male
    [1] TRUE
    [1] male male female female male female male
    Levels: female male

Changing the Order of Levels

The order of the levels in a factor can be changed by applying the factor function again with the new order of the levels.

    data <- c("East","West","East","North","North","East","West","West","West","East","North")

    # Create the factors.
    factor_data <- factor(data)
    print(factor_data)

    # Apply the factor function with the required order of the levels.
    new_order_data <- factor(factor_data, levels = c("East","West","North"))
    print(new_order_data)

When we execute the above code, it produces the following result:

    [1] East West East North North East West West West East North
    Levels: East North West
    [1] East West East North North East West West West East North
    Levels: East West North

Generating Factor Levels

We can generate factor levels by using the gl() function. It takes two integers as input, indicating how many levels there are and how many times each level is repeated.

Syntax

    gl(n, k, labels)

Following is the description of the parameters used:

n is an integer giving the number of levels.
k is an integer giving the number of replications.
labels is a vector of labels for the resulting factor levels.

Example

    v <- gl(3, 4, labels = c("Tampa", "Seattle","Boston"))
    print(v)

When we execute the above code, it produces the following result:

     [1] Tampa Tampa Tampa Tampa Seattle Seattle Seattle Seattle Boston
    [10] Boston Boston Boston
    Levels: Tampa Seattle Boston
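As a rough cross-check of what gl() produces, the sketch below builds the same factor by hand with rep() and factor(), and then shows the optional length argument of gl(). The variable names (`cities`, `by_gl`, `by_hand`, `alternating`) and the "Low"/"High" labels are assumptions made for illustration.

```r
# City labels reused from the example above; all other names are illustrative.
cities <- c("Tampa", "Seattle", "Boston")

# gl(3, 4, ...) yields 3 levels, each repeated 4 times in a block.
by_gl <- gl(3, 4, labels = cities)

# The same values built by hand: repeat each label 4 times and
# fix the level order explicitly instead of relying on sorting.
by_hand <- factor(rep(cities, each = 4), levels = cities)

print(all(as.character(by_gl) == as.character(by_hand)))  # TRUE
print(identical(levels(by_gl), levels(by_hand)))          # TRUE

# With the optional length argument, the pattern is recycled
# until the result has the requested number of elements.
alternating <- gl(2, 1, length = 6, labels = c("Low", "High"))
print(alternating)
```

Passing levels explicitly in the hand-built version matters: without it, factor() would sort the levels alphabetically (Boston, Seattle, Tampa) and the level order would no longer match the one from gl().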