Step-by-Step Guide 5:
Filtering, Transforming, & Computing Variables
Overview
What you will learn here:
When working with data, we often need to manipulate existing variables or create new ones. This might entail changing variable levels, creating new variables that sum up a series of variables, or standardizing a variable. Other times we may want to filter out data that does not meet certain criteria. Here you will learn the following:
Transforming variables
Computing variables
Filtering your data
Transforming and computing variables can be complex. Below you will learn basic techniques. There will be links to more advanced learning materials, should you require them.
Dataset & Variables used
Skoczylis, Joshua (2021) "Extremism, Life Experiences and the Internet", https://doi.org/10.7910/DVN/ICTI8T, Harvard Dataverse, Version 3.
Video Guide: Transforming variables
Transforming variables
Variable Used: Age - this is a continuous variable.
Transformation: Below we will transform Age into a categorical variable with the following two levels:
Over 50
Under 50
1.
Select your variable and create a new one
Navigate to the Data tab > select variable Age > Transform
A new variable will appear to the right of the variable you want to transform.
Name your new variable. In this case, we will name it Over_50
Sometimes you may also want to add a description.
2.
Create a transformation
You can change the source variable in the Source Variable field, but you won't usually need to do this for a simple transformation.
Using transform > click None > Create new transform...
Jamovi gives each transformation a name - change it. This allows you to use it again later on other variables if you wish. You can also add a description.
If you plan to use this transformation on other variables you can also add a variable suffix. E.g. if you transform your variable using z-scores (more on this in the next tutorial) add _zScore. Now each new variable will have this suffix - Age_zScore, Height_zScore etc.
Now select + Add recode condition
Note: Rather than selecting Create new transform... you can also just select an existing transformation from the list. Jamovi will then apply this transformation to your variable, rather than you having to re-create the conditions.
You can also edit existing transformations by selecting Edit. To edit the transformation, just follow the steps below
3.
Add conditions to your transformation
Now you need to add conditions. Jamovi uses simple if ... else statements. Jamovi executes each statement starting from the top.
You can add as many conditions as you want by selecting + Add recode condition. This will add a new if $source line into the box. Remember, they will be executed in order.
You now need to populate the boxes with the information. The $source is the variable you want to transform - don't change this.
Now select one of the following:
== equal to
!= not equal to
< Less than
> Greater than
>= Greater or equal to
In the next box, you will put your condition e.g. the age you want as the cut-off point.
In the box after use, add your new label. If you are using text you must enclose it in '...'
Finally, in the box after else use, add the label for the data that does not meet any of the above conditions. You can leave this blank.
Follow the above step for each recode condition
Example:
In our case, we want to create one category for the under 50s and one for the over 50s.
Essentially, what we are saying is: if someone's age is below 50, label it 'Under 50'; otherwise label it 'Over 50'. Note that someone aged exactly 50 falls into the 'Over 50' group.
In Jamovi it will look like this:
if $source < 50 use 'Under 50'
else use 'Over 50'
Voila, you have transformed your first variable
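For readers who find code easier to follow, the same recode logic can be sketched in Python. This is only an illustration of the if ... else rule above, not how Jamovi runs it internally; the function name recode_age and the example ages are made up for this sketch.

```python
def recode_age(age):
    """Mimic: if $source < 50 use 'Under 50', else use 'Over 50'."""
    if age < 50:
        return "Under 50"
    # Everything that fails the condition lands here, including exactly 50.
    return "Over 50"


# Hypothetical ages, not taken from the dataset
ages = [23, 49, 50, 71]
over_50 = [recode_age(a) for a in ages]
print(over_50)
```

Each value is checked against the condition once, and anything that does not meet it receives the else label.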
4.
Multiple transform conditions
Essentially, you just follow the steps above. However, if you have multiple conditions it is important to keep in mind how Jamovi works. As stated in the previous step, Jamovi loops through your if statements in a logical order starting at the top and working its way through each condition.
Example: Let's say we want four categories for age: 25 and under, 26-45, 46-65, over 65.
The code in Jamovi should look like this and will run in the following order:
if $source < 26 use '25 and under'
if $source < 46 use '26-45'
if $source < 66 use '46-65'
else use 'Over 65'
The order is important. Let's look at what happens if the order is switched around:
if $source < 26 use '25 and under'
Here everything under 26 is labelled into the '25 and under' category - this is what we want.
if $source < 66 use '46-65'
As everything below 26 is already categorised, this line will label everyone aged 26 to 65 as '46-65'. This is not what we want.
if $source < 46 use '26-45'
The previous line already labelled these rows, so the '26-45' label will never appear.
else use 'Over 65'
This will correctly create a new category for the over 65s.
So, just remember, the order is crucial. Make sure you place your conditions in a logical order.
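The effect of the order can be sketched in Python. This is a minimal illustration of the "first matching condition wins" behaviour described above, assuming four categories (25 and under, 26-45, 46-65, over 65); the recode helper and the example ages are hypothetical.

```python
def recode(value, conditions, default):
    """Return the label of the first condition that matches, top to bottom."""
    for upper_limit, label in conditions:
        if value < upper_limit:
            return label
    # No condition matched: this plays the role of the final 'else use' line.
    return default


# Conditions in a logical order (smallest cut-off first)
good_order = [(26, "25 and under"), (46, "26-45"), (66, "46-65")]
# The same conditions with the second and third lines switched around
bad_order = [(26, "25 and under"), (66, "46-65"), (46, "26-45")]

ages = [20, 30, 50, 70]  # hypothetical ages
good = [recode(a, good_order, "Over 65") for a in ages]
bad = [recode(a, bad_order, "Over 65") for a in ages]
print(good)
print(bad)
```

With the switched order, the 30-year-old is caught by the < 66 line before the < 46 line is ever reached, so the '26-45' label never appears.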
Video Guide: Computing variables
Computing variables
There are two ways of computing variables - using either the Compute or Transform function. Below we will show you both. The upside of using Transform is that you can save your transformation and easily apply it to new variables.
The examples below use z-scores and ranks purely to demonstrate how computing variables works. You will be introduced to z-scores in a later tutorial.
Variable used: Age
Computing variables can be an extremely useful tool. Below we will show you the very basics. Jamovi supports some advanced functionality which is not covered below. For a more advanced introduction follow the link on the right.
Compute variable using Transform
1.
Select your variable > Transform
In this example, we will compute the z-scores for age.
Data > Transform > Select Age
Now give the variable a name and description.
2.
Compute/transform your variable
Using transform > select Create New Transform...
You can now give this transformation a name, description and suffix.
Click on Fx > select your function (Z) and double-click
As the $source variable (in this case Age) is already in the field you just have to move the brackets to surround $source.
Your code should look like this:
Z($source)
You have now computed your first variable. You can now use your Standardise transform on any variable. If you have added a suffix, your new variable will include this in the title, e.g. Age_zScore
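To see what a z-score transformation does to the numbers, here is a minimal Python sketch. It assumes the standard deviation is the sample standard deviation (dividing by n - 1); that is an assumption for illustration, so check the function description in Jamovi if the exact definition matters. The ages are made up.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise values: subtract the mean, then divide by the SD."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1); an assumption in this sketch
    return [(v - m) / s for v in values]


ages = [20, 30, 40, 50, 60]  # hypothetical ages
age_z = z_scores(ages)
print(age_z)
```

After standardising, the new variable has a mean of 0 and a standard deviation of 1, which is what makes variables on different scales comparable.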
Compute variable using Compute
In many cases, we might want to create a new variable based on more than one variable. In the example below we will use compute to sum up the responses to the following two variables:
Place_World
Understand_Personality
1.
Compute using a single function (e.g. RANK)
Here we will rank the Age of participants
Data > Compute
Give your new variable a name and description
Click Fx > select function (RANK) > select variable (Age) and double-click on them
If you want to know what a function does, click on it once; a description will appear below.
Your text should look like this:
RANK(Age)
You have now created another computed variable which Ranks your participants according to their age.
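As a rough illustration of what ranking means, the Python sketch below assigns rank 1 to the youngest participant and gives tied ages the average of the ranks they share. Both choices are assumptions made for this sketch; the function description in Jamovi states how RANK actually orders values and handles ties. The ages are hypothetical.

```python
def rank(values):
    """Rank from smallest (1) to largest; ties share the average rank."""
    # Row indices sorted by their value, smallest first
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block for as long as the next value is tied
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Average of the 1-based positions pos + 1 .. end + 1
        average = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = average
        pos = end + 1
    return ranks


ages = [34, 21, 58, 21]  # hypothetical ages, including one tie
age_rank = rank(ages)
print(age_rank)
```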
2.
Compute using multiple functions (e.g. SUM > Z)
Follow the step above, then choose the SUM function.
Click Fx > select function (SUM) > select your variables and double-click on them > separate each with a comma
If you want to know what a function does, click on it once; a description will appear below.
Your text should look like this:
SUM(Place_World, Understand_Personality)
You could also standardise the output by wrapping the sum in the Z function. Then your text would look as follows:
Z(SUM(Place_World, Understand_Personality))
You have now created another computed variable which gives you the sum of the two variables we have selected.
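The following Python sketch shows the idea behind the two formulas: the sum is taken within each row (one total per participant), and the totals are then standardised. The response values are invented for illustration, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Hypothetical responses for four participants
place_world = [1, 3, 5, 2]
understand_personality = [2, 4, 5, 1]

# SUM(Place_World, Understand_Personality): add the values within each row
total = [a + b for a, b in zip(place_world, understand_personality)]

# Z(SUM(...)): standardise the row totals (sample SD assumed)
m, s = mean(total), stdev(total)
total_z = [(t - m) / s for t in total]

print(total)
print(total_z)
```

Nesting one function inside another simply means the inner result (the sum) is computed first and then passed to the outer function (the z-score).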
Video Guide: Filtering your Dataset
Filtering your Data
Variable Used: Age & Household_Dependents
Filter: In this example, we want to filter out all participants who are aged 50 or over or who have 4 or more children.
Important Note: Once a filter is activated, the output of all your analyses will be updated accordingly. This can be a bit annoying. A way around this is to create a column filter (more below).
That said, you do not lose the data from the unfiltered analysis. To get this back, you just de-activate the filter. You can toggle this on and off whenever you need.
1.
Build your first Filter
Data > Filter > Select your variable (Age)
You can either select your variable by clicking on the fx symbol, then scroll to the variable and double-click on it, or you can type in the variable name. If you are typing in your variable, and the variable has a space, then use '..' around the variable names.
Now use one of the selectors below:
== equal to
!= not equal to
< Less than
> Greater than
>= Greater or equal to
Now you can add your condition, in this case < 50.
Your filter should look something like this:
Age < 50
Here you are telling Jamovi to filter out all rows where this condition is not met, i.e. everyone aged 50 or over is filtered out.
Voila, you have created your first filter.
2.
Build additional Filters
The easiest way to work with more than 1 condition is to create another filter. Similar to Transform, Jamovi works through each Filter, so your order is important.
Each new filter will already exclude the rows filtered out by previous filters.
To create an additional Filter click on the + to the right of the filter box
Now follow Step 1 for each additional filter.
In our example, we also want to filter out every participant who has 4 or more children.
Our second filter should look something like this:
Household_Dependents < 4
We have now created a second filter. Together, the two filters keep only participants who are under 50 and have fewer than 4 household dependents; everyone else is filtered out.
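The way the two filters stack can be sketched in Python. This is an illustration only: the rows are invented, and each list comprehension stands in for one Jamovi filter, with the second filter only seeing the rows that survived the first.

```python
# Hypothetical participants, not taken from the dataset
rows = [
    {"Age": 34, "Household_Dependents": 1},
    {"Age": 62, "Household_Dependents": 2},
    {"Age": 45, "Household_Dependents": 5},
    {"Age": 71, "Household_Dependents": 4},
]

# Filter 1: Age < 50 (keeps rows where the condition is met)
after_filter_1 = [r for r in rows if r["Age"] < 50]

# Filter 2: Household_Dependents < 4, applied to what Filter 1 kept
after_filter_2 = [r for r in after_filter_1 if r["Household_Dependents"] < 4]

print(len(after_filter_1), len(after_filter_2))
```

Only the 34-year-old with one dependent passes both filters; a row is dropped as soon as it fails either condition.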
3.
Deactivating, hiding & deleting filters
You can easily deactivate filters, hide them or delete them.
Deactivate a filter: Toggle active > inactive
Your filter will remain visible and can be toggled to active at any time.
Delete Filters: Click on the cross of the filter you want to delete
This will permanently delete your filter.
Make all Filters invisible: Click on the eye symbol on the right side of the filters.
All filters and the filtered data will remain active but will be invisible.
As mentioned above, when you apply a filter, all of your previous analyses will also be updated. A way around this is to create a new variable using the Compute function.
You can now apply a filter to this new variable. You can add this filter to any analysis and it will filter out the data you do not want. The advantage is that the filter only applies to the analysis you want it for.
For a more detailed description, follow the button below.
Supplementary Material: Computing the Mean and Z-scores
Computing the Mean
Sometimes you may want to create a new variable that is based on the mean of other variables. This is easy to do. Watch the video to the left.
Produced by datalabcc.
Computing Z-Scores
Sometimes it makes sense to use z-scores, as they put variables on a common scale and make them easy to compare. Again, this is easily done using the Compute function.