Sunday, May 23, 2010

Design and randomization II: the ShuffleCondition function

A while back, someone emailed me about a particular randomization scheme they wanted.

> i am wondering whether it is possible to pseudo-randomize lists, i.e.,
> to produce "random" sequences with constraints such as that each element
> cannot appear more than three times in a row etc.
>
> i am going to need this in training a discrimination between two
> stimuli. the stimuli should appear to the subject to alternate in
> "random" order, but long runs of just one stimulus should be forbidden



There are a lot of ways to do trial randomization in PEBL, and I covered many of them in a previous post. None of these are exactly what is needed. The ones that are closest are ShuffleWithoutAdjacents() and with a little work, DesignBalancedSampling(). ShuffleWithoutAdjacents takes a nested list as an argument, say

x <- ShuffleWithoutAdjacents([1,2,3,4,5],[10,11,12],[21,22,23], [24,25,26])

Here are some sample outputs:
[2, 36, 3, 11, 22, 10, 4, 21, 35, 1, 12, 23, 5, 34]
[23, 34, 2, 22, 11, 4, 21, 5, 35, 12, 36, 10, 1, 3]
[36, 10, 4, 34, 21, 1, 23, 35, 5, 12, 2, 22, 11, 3]
[11, 36, 3, 10, 21, 5, 23, 35, 4, 34, 12, 22, 1, 2]
[21, 34, 23, 3, 35, 12, 4, 36, 5, 11, 22, 2, 10, 1]
[22, 4, 10, 34, 2, 36, 3, 23, 1, 35, 12, 21, 11, 5]
[34, 2, 22, 11, 35, 10, 36, 3, 21, 12, 5, 23, 4, 1]


Notice that the tokens within a sublist tend not to happen adjacently. However, this is not assured if it is really hard to do, because you could easily contrive a situation where it is nearly impossible, but it does an adequate job in most cases. Plus, when it fails, it fails at the end of the list, which could cause other problems because it creates a confound with order. Why doesn't it always work? Well, it can't always work. Consider:


x <- ShuffleWithoutAdjacents([1,2,3,4,5],[10,11]])


There is no way for the 1...5 numbers never occur beside one another. It might be easy to check if the list satisfies the criteria, and if it doesn't, repeat the process until you find one that does. But if you give it input like the one above, it may never happen, so you would be out of luck.  You could also do a recursive backtracking solution, but here failure would mean exhaustive search through permutations, and that could really screw up  an experiment session if the list was complex enough.  There may be a way to check a list prior to doing the randomization to prove it can be randomized, but the strategy I chose is to 1. trust the user that the lists will be easy to randomize, and 2. check exhaustively going forward through the list, but if you fail, just do your best.

ShuffleRepeat() is also very similar, but it won't assure you won't get runs.  If you do
ShuffleRepeat([1,2],6)

it will give you something like:
[1 2 2 1 1 2 1 2 2 1 2 1]
This will assure you don't have runs greater than 2, but will not assure there are no adjacent conditions, and it won't assure there are
a limited number of repetition (e.g., 3).

So, how can this be done? the simplest way I can think of is to basically not leave the ordering to chance. So, decide what your distribution of different runs should be, say 5 singles, 3 doubles, and 2 triples per condition. This will produce 5 +3*2 + 2*3 = 17 trials of each condition. For each condition, choose a different random order through these repetitions, and then explicitly alternate between A and B with those repetition numbers. Here is a way to do it, encapsulated in a special-purpose function:


define ShuffleRuns()
{
##this codes the distribution across run lengths:
reps <- [1,1,1,1,1,2,2,2,3,3]

#Form shuffled lists with the design for A and B
designa <- Shuffle(Transpose([Repeat(1,10),reps]))
designb <- Shuffle(Transpose([Repeat(2,10),reps]))

##Then, merge the two lists together so the tokens alternate.
design <- FlattenN(Transpose([designa,designb]),1)
##design is a list of lists with two parts, the condition
## and the number of repeats, and the condition alternates.
##check out what design looks like:
Print(design)

#If you want to unroll this to produce a trial sequence, do the following:
outcome <- []
loop(i,design)
{
outcome <- Merge(outcome, Repeat(First(i), Second(i)))
}
##Now, outcome is a list with each condition (1 or 2) in sequence,
return outcome
}

 Here, you are guaranteed a condition won't happen more than 3 times in a row, and you can control exactly how many singles, double, and triples (or higher) will happen. And both A and B will have the same distribution of repetitions.   This is nice, but is it useful for other experiments?  Is there a way to generalize it?  Lets see.

My first concern is that focusing on run lengths is somewhat limiting.  Really, those run length could be any set of conditions, say [a,b,c].  You might want to repeat these sets of conditions multiple times, shuffling each set, but ensuring you don't get repeats between sets.  If a,b, and c happen to map on to trial-level run lengths, that is fine (and we'll write a function to translate), or it might be another kind of trial type.  In fact, I believe I wrote a function like this for the PEBL Gambling task.  This is almost like ShuffleRepeat(), except it will check if there are any matches between subsets of trials, and you give it the two factors explicitly.

 So, to start I'll write a function that takes two arguments: 1. a list of condition1 labels, and 2. a set of cond2 labels.

define ShuffleCondition(cond1, cond2)
{

First, lets get the length of the two conditions:
   ncond1 <- Length(cond1)
   ncond2 <- Length(cond2)


Now, make a set of trials for the two conditions.  Cond1 should have each element repeated
for each element of cond2; cond2 should have each element shuffled in place
   cond1trials <- Flatten(Transpose(FoldList(RepeatList(cond1, ncond2),ncond1)))
   cond2trials <- Flatten(ShuffleRepeat(cond2,ncond1))

#Now, if we put these two together and 'fold' the list, we will get a set of epochs of length cond1 that will contain each of the levels of cond1, and a random assortment of the levels of cond2.

   design <- Transpose(FoldList(Transpose([cond1trials,cond2trials]),ncond2))

##Now, let's shuffle each epoch, so that we don't get adjacent runs of cond1 between epochs. 
## all we really need to do is shuffle the epoch, check if there is a match between its first
## level and the previous epoch's last level, and then rotate one item if there is.
## I keep track of the last item in its own variable.  Start out with a nominal value of -1;
## even if you have a condition labeled -1, it won't make a difference because of some
## templating I do below

   newdes <- []
   lastitem <- -1
   loop(i, design)
    {
     try <- Shuffle(i)
     if(First(First(try)) == lastitem)
      {
        try <- Rotate(try,1)
      }
     newdes <- Merge(newdes, try)
     lastitem <- First(Nth(try, Length(try)))
    }

   tmp <- Transpose(newdes)


##tmp should contain the proper sequence, and could be returned.  But what if we have cond1
## defined so that it has duplicates?  Say ShuffleCondition(["1","1","2"],[1,2,3,4,5])?  This would lead to trouble, because we would try to disallow matches between 1 and 1 between epochs, but we would fail, because its impossible.  Really what we want is to ignore identity matches within and between epochs.  So if you put do the thing above, it would be identical to doing
ShuffleCondition(["1a","1b","2"],[1,2,3,4,5])

Treating levels 1a and 1b distinctly, even though they might have the same name.  To do this, lets back up to the beginning and make a template.  Instead of using cond1 explicitly, lets make a template:

  cond1x <- Sequence(1,ncond1,1)  ##Use a template here.
  cond1trials <- Flatten(Transpose(FoldList(RepeatList(cond1x, ncond2),ncond1)))

Now, after we do the shuffling, we just need to replace the template with the original codes:

   template <- First(tmp)
   translation <- Transpose([cond1x,cond1])
   c1 <- Replace(template, translation)

   return Transpose([c1, Second(tmp)])
}
This completes the function. How does it work?

Let's try something akin the the original request:
Print(ShuffleCondition([1,2],[1,1,1,1,1,2,2,2,3,3]))

[[2, 3]
, [1, 1]
, [2, 1]
, [1, 1]
, [2, 1]
, [1, 2]
, [2, 1]
, [1, 3]
, [2, 2]
, [1, 1]
, [2, 1]
, [1, 1]
, [2, 3]
, [1, 3]
, [2, 2]
, [1, 1]
, [2, 1]
, [1, 2]
, [2, 2]
, [1, 2]
]

Perfect, repeating the macro condition 1 2 1 2 1 2, but shuffling the secondary condition across the trials.

What about if we have a duplicate condition label?
Print(ShuffleCondition([21,22,23,23],[10,11,12,13]))
[[23, 10]
, [22, 11]
, [23, 12]
, [21, 13]
, [23, 11]
, [23, 13]
, [21, 11]
, [22, 10]
, [21, 12]
, [22, 13]
, [23, 11]
, [23, 10]
, [21, 10]
, [23, 12]
, [22, 12]
, [23, 13]
]

It works as expected, allowing these duplicates

What if we do some degenerate case of a condition with just a single level?  Will that crash the program?
   Print(ShuffleCondition([1],[10,11,12,13]))

[[1, 12]
, [1, 11]
, [1, 10]
, [1, 13]
]

It is OK--it works just as expected!

Here is sort of a typical example, with multiple levels of both conditions:
Print(ShuffleCondition([1,2,3],[10,11]))

[[1, 11]
, [2, 10]
, [3, 10]
, [2, 11]
, [1, 10]
, [3, 11]
]

We should think about situations that could create havoc here.  What if an argument were not a list, or a list were empty?  that might not turn out so nice, so I'll add some checking for that case. Expect this function to appear in PEBL 0.11

Now, all that remains is to pull the logic from the original function to flatten this out into a sequence of trials, especially for use in the 'runs' situation above.  This is sort of a special-purpose function, so I probably won't put it in PEBL.

define UnrollDesign (design)
{
   #If you want to unroll this to produce a trial sequence, do the following:
  outcome <- []
  loop(i,design)
   {
    outcome   <- Merge(outcome, Repeat(First(i), Second(i)))
   }
   return outcome
}

So to design the trials, you do:

  trials <- UnrollDesign(ShuffleCondition([1,2],[1,1,1,1,1,2,2,2,3,3]))

The complete ShuffleCondition function is below:


define ShuffleCondition(cond1, cond2)
{
  if((not IsList(cond1)))
    {
      SignalFatalError("First argument of [ShuffleCondition(cond1,cond2)] must be a list.")
    }
  if((not IsList(cond2)))
    {
      SignalFatalError("Second argument of [ShuffleCondition(cond1,cond2)] must be a list.")
    }
   
   #Get lengths of condition list
   ncond1 <- Length(cond1)
   ncond2 <- Length(cond2)
   cond1x <- Sequence(1,ncond1,1)  ##Use a template here.

   ##Create initial condition trials
   cond1trials <- Flatten(Transpose(FoldList(RepeatList(cond1x, ncond2),ncond1)))
   cond2trials <- Flatten(ShuffleRepeat(cond2,ncond1))

   ##put them together and fold them into epochs to be shuffled
   design <- Transpose(FoldList(Transpose([cond1trials,cond2trials]),ncond2))

    ## Shuffle each epoch, ensuring the first cond1 of an epoch does not match the last
    ## cond1 of the previous epoch.
   newdes <- []
   lastitem <- -1
   loop(i, design)
    {
     try <- Shuffle(i)
     if(First(First(try)) == lastitem)
      {
        ##If it matches, only rotate once--we can be sure it won't match then.
        try <- Rotate(try,1)
      }
     newdes <- Merge(newdes, try)
     lastitem <- First(Nth(try, Length(try)))
    }

    #Put the original cond1 labels back into the template
   tmp <- Transpose(newdes)
   template <- First(tmp)
   translation <- Transpose([cond1x,cond1])
   c1 <- Replace(template, translation)

   #DONE!
   return Transpose([c1, Second(tmp)])
}


No comments: