The Gebesse Blog

Thoughts from the world of technology and business

Psychological testing

Scratch an employment consultant or someone in HR and the subject of psychological or aptitude testing often comes up. I spent some years at university studying psychology and the way it can be used so this has been a matter of interest to me for some time, especially as it is applied in the management of businesses. Here is something I wrote as part of an article about recruitment practices for the Sydney Business Review newspaper in October 1995.

Another shortcut in employee selection is psychological testing. This provides a whole new list of boxes to tick. Before I get attacked by psychologists claiming that I am defaming them, I would like to say that, firstly, I actually know something about testing and, secondly, my objection is not to testing per se but to the inappropriate use of testing. If a proper profile can be developed for the job, and an appropriate test can be found, and it is assumed that the person will never move into any other position within the employing organisation, then testing is justified. All too often, though, I have seen testing used as a crutch.

As an example, I have taken one particular set of tests at least three times. In all cases the comments made about me by the testers indicated that they were not even aware of principles and theories taught as part of any first year university psychology course. One supposed psychologist told me that this test could predict exactly how people would behave in any given set of circumstances. He had no answer when I asked him why it was not applied to all 10-year-olds to weed out those who were going to become murderers and rapists. By the way, the validation for this test (the proof that it works, if you like) was that it had, with hindsight, reasonably accurately predicted the promotion prospects of a group of 70 Los Angeles firemen, yet it was being used by high-priced consultancies to select candidates for sales, management and software development positions.

It’s probably just a coincidence that I revisited this matter exactly ten years later in this article in the October 2005 edition of Australasian Science magazine.

Australasian Science - October 2005

Every so often the matter of psychological testing comes up in discussion among skeptics, with opinions varying on where such tests fit on the spectrum of scientific activity. Usually the majority think that psychological testing is about as scientific as the study of alien abductions or the memory of water, but there are sometimes a couple of people prepared to defend the tests. I topped my class at university in the course about the design and interpretation of psychological tests, and my take on them is that they may be very useful if used appropriately, but they are also a very good way of illustrating the meaning of the terms “reliability” and “validity”. “Validity” is the relationship of the findings to the real world, and “reliability” is the reproducibility of the results. It is possible for something to be reliable but not valid, but it is impossible for the opposite to be true. You can print out this page and use the ruler below to measure things. It won’t matter whether you use it to measure feet, firkins, furlongs or femtometres; it should produce very close to the same measurement each time, and is therefore a reliable measuring instrument. It would, however, have no validity, because the printed scale is unlikely to match any real unit (unless you were a crook selling something by length to someone who had never seen a ruler).

A ruler
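The ruler example can be put in code. Here is a minimal sketch (all the numbers are invented for illustration): a printed ruler whose scale came out 10% too small gives readings with a tiny spread (reliable) but a large systematic error (not valid).

```python
import random
import statistics

random.seed(1)

TRUE_LENGTH = 100.0  # the object's real length, in millimetres

def printed_ruler_reading(true_length, scale_error=0.9, noise=0.2):
    # Every reading is inflated by the same bad scale, plus a little
    # ordinary measurement noise.
    return true_length / scale_error + random.gauss(0, noise)

readings = [printed_ruler_reading(TRUE_LENGTH) for _ in range(20)]

spread = statistics.pstdev(readings)            # small: the ruler is reliable
bias = statistics.mean(readings) - TRUE_LENGTH  # large: it is not valid

print(f"spread = {spread:.2f} mm, bias = {bias:.2f} mm")
```

The spread measures reliability and the bias measures (the lack of) validity: repeated measurements agree closely with each other while all being wrong in the same way.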

I remember being asked once by an employment agent if I had any objection to being asked to do a psych test for a potential employer. I told them that I had no objection at all, because unless they could tell me what the test was and how it predicted any aspect of job performance, the requirement for a test disqualified the potential employer and saved me the wasted time of interviews. I didn’t get the job. In one case where I did a test, it was simply a process of following some logical paths to reach conclusions based on information provided as part of the test. I was told that it would take about three hours to do the test. I finished in about 45 minutes, so I thought that I must have done something wrong. The only way to check the answers was to do the test again, which this time took three-quarters of an hour.

I was told that I was the first ever applicant to get all the answers correct, but even this wasn’t enough to get me the job. I didn’t care, really, because I didn’t want to work with people who were so dumb that they could get any of the test answers wrong. This appeared to be one of those tests which was highly reliable, but had no validity in the situation in which it was used. (I later found out that the person who would have been my boss was a misogynist creep who groped women at parties and all the programmers employed there really were brainless nincompoops. Lucky escape!)

In one of those discussions between skeptics recently, the matter of the Myers-Briggs test came up. This is a multiple-choice test which purports to place test subjects along several spectra or axes of personality traits. I went off and did a Myers-Briggs test and I am ENFJ:

moderately expressed extrovert (44%)
moderately expressed intuitive personality (50%)
moderately expressed feeling personality (38%)
moderately expressed judging personality (56%)

That sounds like me, especially all those “moderately” measurements. To get these results I answered the questions more-or-less honestly (and I do know something about self-serving bias in personality tests). The dangers in using the results of a test like this, however, are at least twofold. Firstly, it is only a single test and can be done in a short time. For it to have validity requires other tests to be taken at the same time which can be used to corroborate the results. I do know of people using just a single test for employment selection, and this makes the choices suspect. Secondly, it has an inherent reliability problem. Two, actually – results can vary from time to time just because people feel different on different occasions, and anyone who knows how the test works (and which questions have special significance in scoring) can adjust the results. This is another reason for using batteries of tests – if each has a different reliability, the overall reliability of the collection can be improved. I know that if I were to be interviewing next Tuesday for the position of Promotions Manager for the Anthony Robbins outfit, my Myers-Briggs results would look nothing like the list above. And on Thursday, when I was going for Nursing Manager at a palliative care hospice, there would be a different picture again.
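That point about batteries of tests can be put in numbers. The standard psychometric result here is the Spearman-Brown prediction formula: combining k parallel tests, each with reliability r, gives a predicted reliability of kr / (1 + (k - 1)r). A minimal sketch, with illustrative figures:

```python
def spearman_brown(r_single, k):
    """Predicted reliability of a battery of k parallel tests,
    each with individual reliability r_single."""
    return k * r_single / (1 + (k - 1) * r_single)

# One test of modest reliability versus a battery of four such tests:
one_test = spearman_brown(0.60, 1)    # stays at 0.60
four_tests = spearman_brown(0.60, 4)  # rises to about 0.86
```

This is why a battery of individually imperfect tests can be collectively more trustworthy than any single test on its own.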

As I said above, there can be no validity without reliability. That is why we demand that experimental results in all areas of science be reproducible. It is especially important if those results suggest that the world is not as we think it is. Carl Sagan said that extraordinary claims require extraordinary evidence. He forgot to add that the evidence needs to be found more than once.

