Derotation
What is derotation
Derotation of data is necessary when brands, products or similar items
(levels) are assessed and each respondent (case) rates a different
selection of those levels, so that each case holds several responses.
Derotating the data means restructuring it from case level (one row per
case) to response level (one row per response).
Example: q1_1/q1_2: On a scale from 1 to 10, how much do you like the
following drinks?
``data``
id | drink_1 | drink_2 | q1_1 | q1_2 | gender |
case1 | 1 | 3 | 2 | 8 | 1 |
case2 | 1 | 4 | 9 | 5 | 2 |
case3 | 2 | 4 | 6 | 10 | 1 |
derotated ``data``
id | drink | drink_levelled | q1 | gender |
case1 | 1 | 1 | 2 | 1 |
case1 | 2 | 3 | 8 | 1 |
case2 | 1 | 1 | 9 | 2 |
case2 | 2 | 4 | 5 | 2 |
case3 | 1 | 2 | 6 | 1 |
case3 | 2 | 4 | 10 | 1 |
To identify which case rates which levels, key/level variables are
included in the data, in this example drink_1 and drink_2.
Variables that are not part of the loop (for example gender) can also
be carried over to the derotated data.
How to use DataSet.derotate()
The DataSet method takes a few parameters:

levels : dict of list
    Contains all key/level variables and the name for the new levelled variable. All key/level variables must have the same value_map.
    >>> levels = {'drink': ['drink_1', 'drink_2']}
mapper : list of dicts of list
    Contains the looped questions and the new column name into which the looped questions will be combined.
    >>> mapper = [{'q1': ['q1_1', 'q1_2']}]
other : str or list of str
    Contains all variables that should be carried over to the derotated data but are not included in the loop.
    >>> other = 'gender'
unique_key : str
    Name of the variable that identifies cases in the initial data.
    >>> unique_key = 'id'
dropna : bool, default True
    If a case rates fewer levels than the possible number of levels, the empty responses are dropped.
>>> ds = dataset.derotate(levels = {'drink': ['drink_1', 'drink_2']},
... mapper = [{'q1': ['q1_1', 'q1_2']}],
... other = 'gender',
... unique_key = 'id',
... dropna = True)
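The effect of dropna can be pictured with a small plain-Python sketch. It assumes that an unrated level shows up as a missing value (None here) in the levelled column; the helper `drop_empty_responses` is hypothetical and only mimics the documented behaviour on toy data.

```python
# Illustrative only: mimics the documented effect of dropna on toy data.
responses = [
    {"id": "case1", "drink": 1, "drink_levelled": 1, "q1": 2},
    {"id": "case1", "drink": 2, "drink_levelled": 3, "q1": 8},
    {"id": "case2", "drink": 1, "drink_levelled": 4, "q1": 5},
    # Assumption: case2 rated only one drink, so its second response is empty.
    {"id": "case2", "drink": 2, "drink_levelled": None, "q1": None},
]


def drop_empty_responses(responses, level_col):
    """Keep only responses whose levelled variable is filled (hypothetical helper)."""
    return [r for r in responses if r[level_col] is not None]


kept = drop_empty_responses(responses, "drink_levelled")
print(len(responses), "->", len(kept))
```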
What about arrays?
Arrays can also be looped. In this case a mapper can look
like this:
>>> mapper = [{'q12_1': ['q12a[{q12a_1}].q12a_grid', 'q12b[{q12b_1}].q12b_grid',
... 'q12c[{q12c_1}].q12c_grid', 'q12d[{q12d_1}].q12d_grid']},
... {'q12_2': ['q12a[{q12a_2}].q12a_grid', 'q12b[{q12b_2}].q12b_grid',
... 'q12c[{q12c_2}].q12c_grid', 'q12d[{q12d_2}].q12d_grid']},
... {'q12_3': ['q12a[{q12a_3}].q12a_grid', 'q12b[{q12b_3}].q12b_grid',
... 'q12c[{q12c_3}].q12c_grid', 'q12d[{q12d_3}].q12d_grid']},
... {'q12_4': ['q12a[{q12a_4}].q12a_grid', 'q12b[{q12b_4}].q12b_grid',
... 'q12c[{q12c_4}].q12c_grid', 'q12d[{q12d_4}].q12d_grid']},
... {'q12_5': ['q12a[{q12a_5}].q12a_grid', 'q12b[{q12b_5}].q12b_grid',
... 'q12c[{q12c_5}].q12c_grid', 'q12d[{q12d_5}].q12d_grid']},
... {'q12_6': ['q12a[{q12a_6}].q12a_grid', 'q12b[{q12b_6}].q12b_grid',
... 'q12c[{q12c_6}].q12c_grid', 'q12d[{q12d_6}].q12d_grid']},
... {'q12_7': ['q12a[{q12a_7}].q12a_grid', 'q12b[{q12b_7}].q12b_grid',
... 'q12c[{q12c_7}].q12c_grid', 'q12d[{q12d_7}].q12d_grid']},
... {'q12_8': ['q12a[{q12a_8}].q12a_grid', 'q12b[{q12b_8}].q12b_grid',
... 'q12c[{q12c_8}].q12c_grid', 'q12d[{q12d_8}].q12d_grid']},
... {'q12_9': ['q12a[{q12a_9}].q12a_grid', 'q12b[{q12b_9}].q12b_grid',
... 'q12c[{q12c_9}].q12c_grid', 'q12d[{q12d_9}].q12d_grid']},
... {'q12_10': ['q12a[{q12a_10}].q12a_grid', 'q12b[{q12b_10}].q12b_grid',
... 'q12c[{q12c_10}].q12c_grid', 'q12d[{q12d_10}].q12d_grid']},
... {'q12_11': ['q12a[{q12a_11}].q12a_grid', 'q12b[{q12b_11}].q12b_grid',
... 'q12c[{q12c_11}].q12c_grid', 'q12d[{q12d_11}].q12d_grid']},
... {'q12_12': ['q12a[{q12a_12}].q12a_grid', 'q12b[{q12b_12}].q12b_grid',
... 'q12c[{q12c_12}].q12c_grid', 'q12d[{q12d_12}].q12d_grid']},
... {'q12_13': ['q12a[{q12a_13}].q12a_grid', 'q12b[{q12b_13}].q12b_grid',
... 'q12c[{q12c_13}].q12c_grid', 'q12d[{q12d_13}].q12d_grid']}]
This can also be written like this:
>>> from quantipy.core.tools.dp.prep import frange
>>> mapper = []
>>> for y in frange('1-13'):
...     q_group = []
...     for x in ['a', 'b', 'c', 'd']:
...         var = 'q12{}'.format(x)
...         # Builds an item name such as 'q12a[{q12a_1}].q12a_grid'
...         var_grid = var + '[{' + var + '_{}'.format(y) + '}].' + var + '_grid'
...         q_group.append(var_grid)
...     mapper.append({'q12_{}'.format(y): q_group})
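The generated names can be checked without Quantipy installed. The sketch below is a stand-alone variant that uses the built-in range(1, 14) as a stand-in for frange('1-13') and a single format string instead of string concatenation; it is meant only to confirm that the loop reproduces the hand-written mapper.

```python
# Stand-alone check: range(1, 14) stands in for frange('1-13').
mapper = []
for y in range(1, 14):
    # Doubled braces produce literal '{' and '}' in the array item name.
    q_group = [
        'q12{x}[{{q12{x}_{y}}}].q12{x}_grid'.format(x=x, y=y)
        for x in ['a', 'b', 'c', 'd']
    ]
    mapper.append({'q12_{}'.format(y): q_group})

# The first entry matches the first entry of the hand-written mapper.
print(mapper[0])
```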
The derotated dataset therefore loses its meta information about the
mask; only the columns q12_1 to q12_13 are added. To restore the mask
structure, use the method dataset.to_array():
>>> variables = [{'q12_1': u'label 1'},
... {'q12_2': u'label 2'},
... {'q12_3': u'label 3'},
... {'q12_4': u'label 4'},
... {'q12_5': u'label 5'},
... {'q12_6': u'label 6'},
... {'q12_7': u'label 7'},
... {'q12_8': u'label 8'},
... {'q12_9': u'label 9'},
... {'q12_10': u'label 10'},
... {'q12_11': u'label 11'},
... {'q12_12': u'label 12'},
... {'q12_13': u'label 13'}]
>>> ds.to_array('qTP', variables, 'Var_name')
variables can also be a list of variable names; the mask items
are then named after their corresponding columns.
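Both forms of variables can be generated rather than typed out. The following is a small sketch in plain Python; the placeholder labels 'label 1' to 'label 13' are taken from the example above, and the name `variable_names` is only illustrative.

```python
# Form 1: list of {column name: item label} dicts, as in the example above.
variables = [
    {'q12_{}'.format(i): u'label {}'.format(i)}
    for i in range(1, 14)
]

# Form 2: plain list of column names (illustrative variable name).
variable_names = ['q12_{}'.format(i) for i in range(1, 14)]

print(variables[0], variable_names[0])
```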
arrays included in other keep their meta structure.