"pivot on two attributes?"
An example dataset:
year, int
sex, nominal: male, female
age_group, nominal: 0-20, 20-40, 40+
mortality, real
How can I convert this dataset into:
year,
mortality_male_0-20,
mortality_male_20-40,
mortality_male_40+,
mortality_female_0-20,
mortality_female_20-40,
mortality_female_40+
I tried:
Example2AttributePivoting
group_attribute = year
index_attribute = age_group|sex
But index_attribute seems to accept only a single attribute?
Answers
eh?
Aggregation loses information!
The transformation I have in mind should be lossless.
Example2AttributePivoting works fine when you only have a single index_attribute.
There might be some way in RapidMiner to use two index attributes, but I do not know of one.
Example dataset:
att1 : nominal : {A, B}
att2 : nominal : {T, F}
att3 : real
att4 : real
Transformed dataset:
att3_A_T : real
att3_B_T : real
att3_A_F : real
att3_B_F : real
att4_A_T : real
att4_B_T : real
att4_A_F : real
att4_B_F : real
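To make the intended transformation concrete, here is a minimal sketch in plain Python, outside RapidMiner. The function name pivot_two_indexes and the numeric values are made up for illustration. It assumes, as in the example above, that there is no group attribute, so the whole dataset collapses into a single wide row.

```python
def pivot_two_indexes(records, index_attrs, value_attrs):
    """Flatten long records into one wide row named <value>_<index1>_<index2>."""
    wide = {}
    for rec in records:
        # Combine the values of all index attributes into one suffix, e.g. "A_T".
        suffix = "_".join(str(rec[a]) for a in index_attrs)
        for value_attr in value_attrs:
            name = f"{value_attr}_{suffix}"
            # A repeated name would overwrite a value, i.e. lose information.
            if name in wide:
                raise ValueError(f"duplicate index combination for {name}")
            wide[name] = rec[value_attr]
    return wide


# Placeholder values, one record per combination of att1 and att2.
records = [
    {"att1": "A", "att2": "T", "att3": 1.0, "att4": 10.0},
    {"att1": "B", "att2": "T", "att3": 2.0, "att4": 20.0},
    {"att1": "A", "att2": "F", "att3": 3.0, "att4": 30.0},
    {"att1": "B", "att2": "F", "att3": 4.0, "att4": 40.0},
]

flat = pivot_two_indexes(records, ["att1", "att2"], ["att3", "att4"])
```

Every input value ends up in exactly one output attribute, so nothing is aggregated away; the duplicate check is what guards the lossless property.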
There are two possible ways to solve this problem. I will begin with the dirtier one:
Before applying the pivoting, you could create a new attribute using the AttributeConstruction operator. This nominal attribute would store the combination of the values of the attributes sex and age_group. You could then use this single new attribute as the index attribute for Example2AttributePivoting.
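The logic of this first approach can be sketched in plain Python. This is not RapidMiner code, only an illustration of the two steps: construct the combined attribute, then pivot on it with year as the group attribute. The function name pivot_on_combined_key and the mortality values are placeholders, not real data.

```python
from collections import defaultdict

# Made-up rows (year, sex, age_group, mortality); the values are placeholders.
rows = [
    (2000, "male", "0-20", 0.01),
    (2000, "male", "20-40", 0.02),
    (2000, "male", "40+", 0.03),
    (2000, "female", "0-20", 0.04),
    (2000, "female", "20-40", 0.05),
    (2000, "female", "40+", 0.06),
    (2001, "male", "0-20", 0.07),
    (2001, "female", "40+", 0.08),
]


def pivot_on_combined_key(rows):
    """Return {year: {"mortality_<sex>_<age_group>": value}} for long-format rows."""
    wide = defaultdict(dict)
    for year, sex, age_group, mortality in rows:
        # Step 1: the constructed attribute, combining sex and age_group.
        combined = f"{sex}_{age_group}"
        column = f"mortality_{combined}"
        # Refuse to overwrite: a duplicate would make the pivot lossy.
        if column in wide[year]:
            raise ValueError(f"duplicate entry for year {year}, {combined}")
        # Step 2: pivot, with year as the group and the combined value as the index.
        wide[year][column] = mortality
    return dict(wide)


wide = pivot_on_combined_key(rows)
```

Combinations that do not occur for a given year (most of 2001 here) are simply absent from that year's row, which corresponds to missing values in the pivoted table.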
The second approach would be to ask us for a quote for extending the pivoting operator. This would solve the problem once and for all, but it is somewhat more expensive.
Greetings,
Sebastian