Once the individual scores have been normalized, it is necessary to undo the randomised order in which the sounds were played to the subjects, so that the scores for each washing machine can be grouped together. In the example spreadsheet the order was deliberately not randomised, but this is not an example to follow. We also tested each washing machine twice, once using monaural and once using binaural reproduction.
In the worksheet "Mean and s.d." we have brought together the mean scores and standard deviations for each scale and each washing machine. In the worksheet "Data collection" we have then separated the monaural and binaural tests into two blocks, something you won't have to do. Once this has been done, the results can be analysed in detail.
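For readers working outside a spreadsheet, the regrouping and summary steps can be sketched as follows. This is a minimal illustration, not the spreadsheet's own method: the record layout, the function name `summarise` and all of the scores are made up for the example.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records for ONE scale, in the randomised order as played:
# (machine, reproduction, normalized score). The values are invented.
trials = [
    ("B", "monaural", 0.4),
    ("A", "binaural", 0.7),
    ("A", "monaural", 0.2),
    ("B", "binaural", 0.5),
    ("A", "monaural", 0.4),
    ("B", "monaural", 0.6),
    ("A", "binaural", 0.9),
    ("B", "binaural", 0.7),
]

def summarise(trials):
    """Group scores by (machine, reproduction); return mean and sample s.d."""
    groups = defaultdict(list)
    for machine, reproduction, score in trials:
        groups[(machine, reproduction)].append(score)
    # Sorting the keys undoes the presentation order.
    return {
        key: (mean(scores), stdev(scores))
        for key, scores in sorted(groups.items())
    }

for (machine, reproduction), (m, sd) in summarise(trials).items():
    print(f"{machine} {reproduction}: mean={m:.3f} s.d.={sd:.3f}")
```

Note that `stdev` is the sample standard deviation and needs at least two scores in each group.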
We can examine the interplay between the different questions to find out how people's judgements are formed. We do this using the Pearson correlation coefficient, calculated for each pair of scales. Lower down in the worksheet "Data collection" you can see a summary table. For simplicity, just consider the monaural case:
|What it does||-0.233||-0.395||-0.255||-0.112|
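The Pearson correlation coefficient lies between -1 and +1. A value of +1 means that the two scales rise and fall together in a perfect straight-line relationship, -1 means that one rises exactly as the other falls, and 0 means that there is no linear relationship between them.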
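As an illustration of the calculation, the coefficient for two scales can be computed as in the sketch below. The function name `pearson_r` and the example scores are ours, chosen only for illustration. In a spreadsheet the built-in correlation function does the same job.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of products and squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)

# Invented scores on two scales for the same five judgements.
pleasant = [1, 2, 3, 4, 5]
quiet = [2, 1, 4, 3, 5]
print(round(pearson_r(pleasant, quiet), 3))
```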
In most cases, the correlation coefficients do not lie at the extremes, in which case a significance test should be used to find out whether the scales are significantly related. You compare the magnitude of the correlation coefficient to the values in the Correlation coefficient significance table. If the magnitude of your correlation coefficient exceeds the value in the table, the correlation is significant.
In this case we have 20 judgements on each scale, which means there are 20 - 2 = 18 degrees of freedom. The two-tailed 5% significance level for 18 degrees of freedom is 0.444 (0.423 is the tabulated value for 20 degrees of freedom). So all the correlations whose magnitude is above this value are significant, and are marked in white in the table above. In fact, the ones marked are all significant at the 1% level. This means that, if the scales were really unrelated, a correlation this strong would arise by chance less than 1% of the time.
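If the significance table is not to hand, the threshold can be estimated numerically. The sketch below is one way to do it, not the method used in the spreadsheet. It assumes a two-tailed test and roughly normally distributed scores, in which case the coefficient for n unrelated judgements has a density proportional to (1 - r^2)^((n - 4)/2). The function names are ours.

```python
def _tail_area(r_abs, n, steps=2000):
    """Simpson's rule integral of (1 - x^2)^((n - 4)/2) from r_abs to 1."""
    k = (n - 4) / 2
    h = (1.0 - r_abs) / steps
    total = 0.0
    for i in range(steps + 1):
        x = r_abs + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        # max() guards against tiny negative values from rounding at x = 1.
        total += weight * max(0.0, 1.0 - x * x) ** k
    return total * h / 3

def p_value(r, n):
    """Two-tailed probability of a correlation at least as strong as r
    arising from n judgements when the scales are unrelated."""
    if n < 4:
        raise ValueError("this sketch needs at least 4 judgements")
    return _tail_area(abs(r), n) / _tail_area(0.0, n)

def critical_r(n, alpha):
    """Smallest |r| that is significant at level alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if p_value(mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 20  # judgements per scale, i.e. n - 2 = 18 degrees of freedom
print(f"5% threshold: {critical_r(n, 0.05):.3f}")
print(f"1% threshold: {critical_r(n, 0.01):.3f}")
print(f"p for r = -0.395: {p_value(-0.395, n):.3f}")
```

The printed thresholds can be checked against the row of the significance table for the same number of degrees of freedom.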
This means that a pleasant washing machine is one that is quiet, one that sounds robust (strong) and one that is of high quality. However, none of these attributes relates to what the sound tells you about the functionality of the washing machine.
Once you have established the inter-relationship between the scales, you would not usually bother to examine all the highly related scales in detail. Because they are all similar, you would choose one of them, at least in the first instance.
A more rigorous statistical method would be to apply factor analysis to determine a common scale that combines pleasantness, loudness, robustness and quality into one. But for now, we will proceed by looking at one question, the one concerning whether the sound indicates a high quality product. Incidentally, the scale that is different, the one asking whether the sound is informative and tells you what the machine does, did not prove to be a useful scale in these tests. Statistical tests show that all the washing machines were scored the same on this scale, and so this scale is not helpful for this product.
In the detailed spreadsheet, the worksheets "washing", "draining" and "spinning" show the interrelations for the scales for the different parts of the washing machine cycle, and "whole machine" summarises the results for the overall impression. The results show that for most parts of the sound there is a close relationship between the different scales apart from the question concerning whether the sound tells one about the functionality of the product.