Different ways of creating categorical values

The DataSet methods add_meta(), extend_values() and derive() accept three alternative ways of specifying the categorical values of 'single' and 'delimited set' typed variables. The approaches differ in how numerical value codes are mapped to value text labels.

(1) Providing a list of text labels

If the category labels are provided only as a list of str, DataSet creates the numerical codes by simple enumeration, starting at 1:

>>> name, qtype, label = 'test_var', 'single', 'The test variable label'
>>> cats = ['test_cat_1', 'test_cat_2', 'test_cat_3']
>>> dataset.add_meta(name, qtype, label, cats)
>>> dataset.meta('test_var')
single                             codes       texts missing
test_var: The test variable label
1                                      1  test_cat_1    None
2                                      2  test_cat_2    None
3                                      3  test_cat_3    None
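
The enumeration can be pictured in plain Python. This is only a sketch of the behaviour visible in the output above (codes start at 1 and follow list order), not Quantipy's internal implementation; the helper name enumerate_labels is hypothetical.

```python
def enumerate_labels(labels):
    """Pair each label with a 1-based code, in list order.

    Hypothetical helper for illustration; not part of Quantipy.
    """
    return [(code, text) for code, text in enumerate(labels, start=1)]


cats = ['test_cat_1', 'test_cat_2', 'test_cat_3']
pairs = enumerate_labels(cats)
print(pairs)
```

Because the codes depend on list position, reordering the labels changes which code each category receives.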

(2) Providing a list of numerical codes

If only the desired numerical codes are provided, the labels of all categories are left blank. In that case a warning reminds the user to add the 'text' meta in a separate step:

>>> cats = [1, 2, 98]
>>> dataset.add_meta(name, qtype, label, cats)
...\\quantipy\core\dataset.py:1287: UserWarning: 'text' label information missing,
only numerical codes created for the values object. Remember to add value 'text' metadata manually!
>>> dataset.meta('test_var')
single                             codes texts missing
test_var: The test variable label
1                                      1          None
2                                      2          None
3                                     98          None
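
Before adding the labels in a separate step, it can help to check which codes still lack one. The following is a minimal sketch in plain Python, assuming the labels are kept in an ordinary dict keyed by code; the helper codes_without_text and the texts dict are hypothetical and not part of Quantipy.

```python
def codes_without_text(codes, texts):
    """Return the codes that have no non-empty label in `texts`.

    Hypothetical helper for illustration; not part of Quantipy.
    """
    return [code for code in codes if not texts.get(code)]


codes = [1, 2, 98]
# Assumed lookup of labels prepared by the user; code 2 is not labelled yet.
texts = {1: 'test_cat_1', 98: "Don't know"}
missing = codes_without_text(codes, texts)
print(missing)
```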

(3) Pairing numerical codes with text labels

To explicitly assign codes to corresponding labels, categories can also be defined as a list of tuples of codes and labels:

>>> cats = [(1, 'test_cat_1'), (2, 'test_cat_2'), (98, 'Don\'t know')]
>>> dataset.add_meta(name, qtype, label, cats)
>>> dataset.meta('test_var')
single                             codes       texts missing
test_var: The test variable label
1                                      1  test_cat_1    None
2                                      2  test_cat_2    None
3                                     98  Don't know    None
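
The three input forms can be compared side by side with a small normalizing function. This is a sketch under stated assumptions: normalize_categories is a hypothetical helper, and how Quantipy distinguishes the forms internally is not shown in this document. It only reproduces the code and label pairings seen in the three outputs above.

```python
def normalize_categories(cats):
    """Return a list of (code, text) pairs from any of the three input forms.

    Hypothetical helper for illustration; not part of Quantipy.
    - list of str        -> codes enumerated from 1
    - list of int        -> text left empty
    - list of (int, str) -> used as given
    """
    if all(isinstance(c, str) for c in cats):
        return list(enumerate(cats, start=1))
    if all(isinstance(c, int) and not isinstance(c, bool) for c in cats):
        return [(c, '') for c in cats]
    if all(isinstance(c, tuple) and len(c) == 2 for c in cats):
        return [(int(code), str(text)) for code, text in cats]
    raise TypeError('cats must be all labels, all codes, or all (code, label) tuples')


by_label = normalize_categories(['test_cat_1', 'test_cat_2', 'test_cat_3'])
by_code = normalize_categories([1, 2, 98])
by_pair = normalize_categories([(1, 'test_cat_1'), (2, 'test_cat_2'), (98, "Don't know")])
print(by_label)
print(by_code)
print(by_pair)
```

Mixed lists are rejected in this sketch so that an ambiguous definition fails early instead of producing partly labelled categories.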

Note

All three approaches are also valid for defining the items object for array-typed masks.