You may wish to clean the data in Excel before importing into SPSS/Jamovi or in SPSS/Jamovi directly.
Cleaning in Excel
Data exported in Excel from QuestionPro shows a single participant per row. If you have randomisation or skip logic in your QuestionPro survey, you can toggle on ‘Include Randomization Data’ which will add a new row showing the order participants were presented the questions so you can see the order presented in your data, if required.
Go through the data and delete unnecessary columns, duplicates and test responses. All QuestionPro questions will have their own column so you may wish to delete non-answer questions such as the Participant Information Sheet and debrief.
To clean raw data in Excel, you can identify and address issues like inconsistencies, errors, and extra characters either manually or using Excel functions like TRIM
, CLEAN
, and TEXTJOIN
, and then organize and format the data for analysis.
Here is a more detailed breakdown of common data cleaning techniques in Excel:
1. Identifying and Addressing Issues:
- Inconsistencies: Look for variations in data entry, such as different spellings of the same word (using the spellchecker can be useful in this case), inconsistent capitalisation, or different date formats.
- Errors: Identify and correct typing errors, incorrect values, or missing data.
- Extra Characters: Remove unwanted spaces and special characters manually or using functions like
TRIM
andSUBSTITUTE
- Duplicate Data: Identify and remove duplicate rows or values manually or using the “Remove Duplicates” feature which you can find under the Data tab.
- Missing Answers: Identify missing values by highlighting the column in question, the in the Home tab go to ‘Find & Select’ > ‘Go To Special…’ then select ‘Blanks’ > ‘OK’. You will need to decide how to action missing answers according to your study. You may wish to request advice from your supervisor.
2. Using Excel Functions for Data Cleaning:
CONCAT
: Concatonates a list or range of text strings.- Example:
=CONCAT(A1," ",B2)
(joins cells A1 to A2 with a space as the delimiter):
- Example:
LEFT
,RIGHT
,MID
: Extract specific parts of text strings.- Example:
=LEFT(A1,3)
(extracts the first 3 characters from cell A1)
- Example:
IF
: Perform conditional formatting or calculations based on data values.- Example:
=IF(A1="Yes", "True", "False")
(if A1 is “Yes”, return “True”, otherwise return “False”)
- Example:
AVERAGE
: Calculate the mean of a set of data values.- Example:
=AVERAGE(A1:A150)
(calculates mean from cells A1 to A150)
- Example:
- Navigating to ‘Automate’ > ‘Office Scripts’ allows you to automate data cleaning tasks by running scripts, for example, to remove duplicates and highlight blank cells. You can create your own scripts or use samples provided in the Code Editor.
3. Organizing and Formatting Data:
- Insert Table: Go to ‘Insert’ > ‘Table’ > ‘OK’ to make your data easier to view. This also enables ‘Table Design’ options that you can use to insert a ‘Total Row’ at the bottom of the table. Here you can click the dropdown icon on the bottom of a column for quick totals such as the mean, total count or sum.
- Sort and Filter: Use the “Sort & Filter” features to organize data by columns or rows.
- Freeze Panes: Freeze rows or columns to keep headers visible while scrolling.
- Insert Rows and Columns: Insert new rows or columns to organize data or add additional information.
- Merge Cells: Merge cells to create headers or combine data.
- Conditional Formatting: Highlight cells based on specific criteria to visually identify patterns or errors.
- Data Validation: Set rules for data entry to promote data integrity.
SPSS
Import your Excel data into SPSS.
Assigning Value Labels to Variables: In the ‘Variable View’ tab, click the ‘…’ on the variable value column and change the values as required.
Recode a variable into itself (meaning, overwriting the original values): Go to ‘Transform’ menu > ‘Recode into Same Variables’. Select the variable(s) you want to recode and move it to the ‘Numeric Variables’ box. If you select multiple variables, they must be of the same type (numeric or string). Click on ‘Old and New Values’ and in the dialog box, specify the old value and new value, and then click ‘Add’. Add all the values you want to overwrite and then click ‘Continue’ > ‘OK’.
Note: Recoding into the same variable permanently alters the original variable, so be mindful of this. You can also recode missing values (system-missing or user-defined) using the ‘Old and New Values’ dialog box. You can recode ranges of values into a single value, or vice versa. If you are recoding string variables, the new value must be the same length as the original string. If you want to create a new variable with the recoded values instead of overwriting the original, use ‘Recode into Different Variables’.
Outliers: Go to ‘graphs’ > ‘boxplot’ > select ‘simple’ and ‘summaries of separate variables’ > ‘define’. Drag the variable to investigate into the ‘boxes represent’ area and click ‘OK’. Outliers are then shown on the graph by their participant number. The outliers can then be deleted manually from the data on SPSS.
Jamovi
Import your Excel data into Jamovi.
Assigning Value Labels to Variables: Double-click the column header to open the variable editor, then enter a descriptive label in the ‘Label’ field.
Recode a variable into itself (meaning, overwriting the original values): Select the column(s) required then go ‘Data’ tab > ‘Transform’. This constructs a second ‘transformed variable’ for each column that was selected. Select the appropriate transform function (which includes reverse scoring) and click ‘OK’. Further details on creating transformations can be found on the Jamovi Blogsite.
Outliers: Go to ‘Analyses’ > ‘Exploration’ > ‘Descriptives’, select your variable and then under ‘Plots’, check the ‘boxplot’ option and ‘Label Outliers’.