Sunday, June 20, 2010

Visual Search

Visual search tasks have a long tradition in the cognitive psychology of vision.  Comparing different search conditions reveals some fundamental aspects of our visual system.  Anne Treisman is usually given much of the credit for popularizing the tasks, and folks like John Palmer, Jeremy Wolfe, and Laurent Itti have developed reactions to, and implementations of, these ideas in mathematical and software models.  Basically, the task requires you to look for a target in a field of mostly distractors, like the screenshot below:

Find the X

I have argued that visual search is a fundamental building block of complex human cognition, and that it is useful as a test of human-level intelligence, because standard image processing systems probably produce fundamentally different results than humans do.  As part of the PEBL Cognitive Decathlon (which I will blog about here in a few weeks), I've implemented a version of the visual search test that enables representative human data to be collected.

The basics of visual search are simple.  The subject sees a screen filled with a bunch of items (in this case, letters).  A particular search letter may or may not be hidden within the display.  The participant searches the screen until he or she finds the target, or determines the target is not present.  Below is a sample screenshot from PEBL, in which the target is any one of the "X" characters, hidden among a bunch of round letters.

Comparing a number of conditions reveals important properties of the visual system.  For example, consider search for a target when there are fewer or more distractors.  If the participant is doing serial search--looking at each item individually--the time needed to find a present target should get longer as the number of distractors increases.  Then consider what happens if the target is not there.  In that case, every item has to be checked, whereas a present target is found, on average, about halfway through; so the time to determine that the target is not present should be about twice the time it takes to find a single target.  Much of the theory about visual search involves identifying the types of target/distractor combinations that require serial search, often because they require 'attention' to bind features together.
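
To make the serial-search prediction concrete, here is a minimal sketch in Python (not PEBL, just for illustration).  It assumes a self-terminating search that checks items one at a time in random order without revisiting any of them; the function name and the "items inspected" measure are my own illustrative choices, not part of the PEBL task.

```python
def expected_inspections(n_items, target_present):
    """Expected number of items inspected under serial, self-terminating search.

    Assumes items are checked one at a time in random order without
    revisiting, and that search stops as soon as the target is found.
    """
    if target_present:
        # A single target is equally likely to sit at any position 1..n_items.
        return (n_items + 1) / 2
    # With no target, every item must be checked before answering "absent".
    return float(n_items)


for n in (10, 20, 30):
    present = expected_inspections(n, True)
    absent = expected_inspections(n, False)
    # The absent/present ratio approaches 2 as the display gets larger.
    print(n, present, absent, round(absent / present, 2))
```

Under these assumptions both quantities grow with display size, and the target-absent case grows about twice as fast.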

One important way to identify whether serial search is required is to look for 'pop-out': target-foil conditions in which the time to find the target DOES NOT DEPEND on the number of distractors.  One great example is color pop-out.  Typically, if one is searching for a single color in a field that does not contain that color, the presence of a target can be determined immediately, regardless of how many distractors there are.  For example, consider the following screenshot:

The GREEN X jumps out immediately, and it doesn't really matter how many other things are on the screen.  Compare that with the following, in which you are supposed to find the white O:

Finding the white O is not as easy, and it gets harder as the number of distractors increases.  Theory tells us that even though the X is pretty easy to find, and the green is easy to find, if the target were a green X embedded within white Xs and green Os, it would be difficult.  That is a matter for another version of the test.  One thing I do want to test is the impact of putting multiple targets on the screen (as in the first screenshot above).  The search task should get easier (faster), but how will it affect the pop-out effect?  I don't know, so I incorporated it into the design.
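
One common way to put a number on "does not depend on the number of distractors" is the slope of response time against display size.  The sketch below (Python, for illustration only) fits a least-squares slope; the RT values are made up from simple rules, not measured data, and the function name is hypothetical.

```python
def search_slope(set_sizes, rts):
    """Least-squares slope of response time against set size (ms per item)."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(rts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, rts))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx


set_sizes = [10, 20, 30]
# Made-up RTs in ms, generated from simple rules purely for illustration.
flat_rts = [400 for _ in set_sizes]             # pop-out-like: no set-size cost
serial_rts = [400 + 25 * n for n in set_sizes]  # serial-like: 25 ms per item

print(search_slope(set_sizes, flat_rts))    # slope near zero suggests pop-out
print(search_slope(set_sizes, serial_rts))  # clearly positive slope suggests serial search
```

A slope near zero is the signature of pop-out; a slope that is clearly positive is what serial search would produce.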

In designing a visual search task (which will appear in vsearch/ of the PEBL Test Battery version 0.6), I wanted to incorporate the ability to specify the target, the distractors, the colors of the targets,  the number of targets, and the number of distractors.    The design is fairly simple, and consists of about a dozen lines in the .pbl file:

  targs <- ["O","X"]
  foils <- ["U","D","G","C","Q"]
  foilcolor <- MakeColor("white")
  searchsize <- [10,20,30]
  numtargs <- [0,1,5]
  targcols <- ["white","green"]
  reps <- 5
  ##These aggregate the conditions into a single design list.
  sizes <- DesignFullCounterbalance(numtargs,searchsize)
  bytargs <- DesignFullCounterbalance(targs,targcols)
  sizebycolor <- DesignFullCounterbalance(sizes,bytargs)
  design <- Shuffle(RepeatList(sizebycolor,reps))

The first few lines specify the mix of conditions we want to use.  The reps variable specifies how many times we go through the design (randomized of course).  Since I have four factors to include, I used the function DesignFullCounterbalance on each pair of factors, and then on the two paired designs.  Someday, I'll probably write a version of this function that takes as many design factors as you want and returns a complete counterbalance, but this version is not too onerous.  Finally, I repeat the list reps times, and shuffle the whole thing.  The design has 2 (targs) x 3 (searchsize) x 3 (numtargs) x 2 (targcols) = 36 conditions, which are repeated 5 times for a total of 180 trials; this takes about 8 minutes to run.  This provides a lot of hooks to control your own test, although it won't allow you to create the standard color-orientation conjunction search outright.  This could be remedied, and maybe I'll walk through that in another blog post.
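
As a check on the arithmetic, and as a sketch of what an any-number-of-factors counterbalance could look like, here is the same design built in Python (not PEBL).  The helper full_counterbalance is a hypothetical name of mine; it is not a PEBL function, and the nesting of each condition differs from what the paired DesignFullCounterbalance calls return.

```python
from itertools import product
import random


def full_counterbalance(*factors):
    """Return every combination of one level from each factor."""
    return [list(combo) for combo in product(*factors)]


# Same factor levels as in the .pbl snippet.
targs = ["O", "X"]
searchsize = [10, 20, 30]
numtargs = [0, 1, 5]
targcols = ["white", "green"]
reps = 5

conditions = full_counterbalance(targs, searchsize, numtargs, targcols)
design = conditions * reps   # repeat the whole design reps times
random.shuffle(design)       # randomize trial order

print(len(conditions), len(design))  # 36 180
```

Each trial here is a flat list such as ["X", 30, 5, "green"], which is arguably easier to unpack than nested pairs.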

Now, it is just a matter of looping through the design, and feeding design variables to a special-purpose trial function.  I won't go into the details too much, but I decided that the basic task would work as follows:

1. An image of the target is shown briefly
2. The target disappears, and a screen of letters is shown.  The letters are laid out using the routine I've described before, and have used in a number of tasks in the PEBL test battery.  It really is useful, and makes deploying PEBL for visual tasks such as this a breeze.
  The instructions are to click the mouse button when search is complete.  No mouse cursor is shown at this point.  This gives a good estimate of when search is done, without worrying about mouse movement times.  If you were using a touch screen, this step could probably be eliminated.
3. When the mouse click is made, all the letters disappear and are replaced by empty circles.  Also, at the top of the screen, a button labeled "NONE" appears.  The participant must click on the location of the identified target, or on NONE.

A screencast of a few trials is shown here:

What happened when I ran this on myself?  Well, first thing, 180 trials is probably not quite enough to get good time estimates, so I ran through it 3 times, and plotted the results below (time is actually log-mean, plotted on a log-seconds scale).
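
For anyone who wants to compute the same kind of summary, here is a minimal Python sketch.  It assumes "log-mean" means the mean of the log-transformed RTs, which is my reading rather than something the plotting code confirms; the function name and the example values are made up for illustration.

```python
import math


def log_mean_rt(rts_seconds):
    """Return (mean of log RTs, that mean converted back to seconds)."""
    logs = [math.log(rt) for rt in rts_seconds]
    mean_log = sum(logs) / len(logs)
    return mean_log, math.exp(mean_log)


# Made-up RTs in seconds: two fast trials and one slow one.
rts = [0.4, 0.4, 1.6]
mean_log, back_in_seconds = log_mean_rt(rts)
arithmetic_mean = sum(rts) / len(rts)

# The back-transformed log-mean is pulled less by the slow trial
# than the ordinary arithmetic mean is.
print(mean_log, back_in_seconds, arithmetic_mean)
```

Averaging on the log scale keeps an occasional very slow trial from dominating the estimate, which matters with this few trials per condition.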

Notice that the green targets (left column) produced flat RTs across the different numbers of distractors.  Also, it didn't really matter whether there were 1 or 5 targets, or whether the target was X or O--RT was about 400 ms no matter what.  But now look at the black targets.  For the X target, we essentially got a pop-out effect as well--X is easy to discriminate from a bunch of round figures.  But for the O, search took longer as the number of distractors increased.  Plus, in this case, having more targets made search faster.  Both of these are consistent with serial search.  But there is one curious thing--the time needed to declare that the X is not present seems to increase as the number of items in the display increases, even though it is flat when the X is present.  It is difficult to know whether this is consistent or reliable, but I think it is easy to detect the absence of a color, but difficult to detect the absence of a particular shape, even when the shape pops out when it is present.
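
The "more targets makes serial search faster" pattern falls out of a simple model.  The Python sketch below assumes a random-order, self-terminating serial search with no revisits, and treats n_items as the total number of items on screen, targets included; both are my assumptions, and the function name is hypothetical.  It is only meant to describe the O case, not the flat pop-out conditions.

```python
from fractions import Fraction


def expected_inspections_multi(n_items, n_targets):
    """Expected items inspected before the first target is found.

    Assumes random-order serial search without revisits. With no targets,
    every item is checked; with k targets among n items, the first target
    is expected at position (n + 1) / (k + 1).
    """
    if n_targets == 0:
        return Fraction(n_items)
    return Fraction(n_items + 1, n_targets + 1)


for n in (10, 20, 30):
    for k in (0, 1, 5):
        print(n, k, float(expected_inspections_multi(n, k)))
```

Going from 1 to 5 targets cuts the expected number of inspections by about a factor of three, and the cost of adding items shrinks along with it.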

So that's the new PEBL visual search task.  It can easily be made to handle a number of interesting visual search paradigms just by changing the first few lines.  More complex experiments could be managed too without much difficulty.  Enjoy!