Discrimination Reversal Learning

Discrimination reversal learning paradigms are used to study behavioural flexibility. In these tasks, subjects are required to adjust their behaviour when previously established reward-related contingencies are reversed. An animal is first trained to discriminate between two stimuli, one of which is rewarded while the other is not. Once this discrimination has been acquired, the reward contingency is reversed and the animal must learn to suppress the established response while implementing a new one. Deficits in switching away from the previously learned response are seen across various neuropsychiatric disorders, including substance abuse, obsessive-compulsive disorder, Parkinson’s disease, and schizophrenia.

Discrimination reversal learning task with zebrafish
Discrimination reversal learning task setup. A) Overhead view of the inserts in a testing tank. The five apertures can be seen at the bottom of the image and the food-reward delivery zone at the top. B) Still image captured from the Zantiks interface while the experiment is running. The tracked fish is indicated by the white cross. Because tracking uses infra-red light, the stimuli are not visible on the live tracking image, so their positions are denoted with coloured boxes.

Experimental setup

The Zantiks AD unit is used for discrimination and reversal learning in adult fish. The five-aperture and food hopper inserts are used to create the experimental environment (see image A above). The food hopper insert forms an area that the fish enters to collect the food reward, without the food escaping into the remainder of the tank, and also serves as an entry point for initiating trials. Two of the apertures in the five-aperture insert can be used to create two well-separated entry points at the opposite end of the tank from food delivery. These two entry points act in a similar fashion to nose-poke holes in a rodent operant task.

Colour stimuli are typically used for this experiment and are presented from the integrated screen below the testing tank. However, various other visual stimulus types can be presented, including shapes, stripes or images. Target zones are assigned to specific locations within the tank (e.g., the two stimulus light locations). A response is detected when a fish enters one of these zones (see image B above).
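The idea of zone-based response detection can be illustrated with a minimal Python sketch. This is not the Zantiks scripting language or API: the `Zone` class, the `first_zone_entered` function, the zone names and the coordinate values are all hypothetical, and the zones are assumed to be simple rectangles in arbitrary tank coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Hypothetical rectangular target zone in arbitrary tank coordinates."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        # A position counts as inside if it lies within the rectangle bounds.
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def first_zone_entered(track, zones):
    """Return the name of the first zone the tracked fish enters, or None."""
    for x, y in track:
        for zone in zones:
            if zone.contains(x, y):
                return zone.name
    return None  # no zone entered: would be scored as an omission


# Illustrative values only (assumed, not taken from a real experiment).
zones = [
    Zone("left_stimulus", 0, 0, 20, 20),
    Zone("right_stimulus", 80, 0, 100, 20),
]
track = [(50, 90), (55, 60), (70, 30), (85, 15)]
response = first_zone_entered(track, zones)
```

With these made-up positions the fish swims from the food-delivery end towards the right-hand aperture, so `response` is `"right_stimulus"`.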

Experimental procedure

Typical reversal learning procedures consist of two testing phases. In the acquisition phase, subjects are trained to discriminate between two visual stimuli (e.g., blue vs. green light), where responses to the CS+ (e.g., blue light) are reinforced and those to the CS- (e.g., green light) are not. Once the animal demonstrates successful discrimination learning by reaching the set learning criterion (e.g., 80% correct in a given number of choices or n consecutive correct choices), the reversal phase begins with the reward-related contingency reversed. Responses to the green light are now reinforced, and responses to the blue light are not.
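The two example criteria mentioned above can be sketched in Python as follows. This is an illustration only, under stated assumptions: the function names are hypothetical, the window of 10 choices with at least 8 correct (80%) and the run of 6 consecutive correct choices are example values, and omitted trials are assumed to have been excluded from the list of choices beforehand.

```python
from collections import deque


def trials_to_window_criterion(choices, window=10, min_correct=8):
    """Return the 1-based choice number at which at least `min_correct` of the
    last `window` choices were correct, or None if this never happens.

    `choices` is a sequence of booleans (True = response to the CS+).
    """
    recent = deque(maxlen=window)  # sliding window over the latest choices
    for i, correct in enumerate(choices, start=1):
        recent.append(correct)
        if len(recent) == window and sum(recent) >= min_correct:
            return i
    return None


def trials_to_consecutive_criterion(choices, n=6):
    """Return the 1-based choice number completing `n` correct choices in a row."""
    streak = 0
    for i, correct in enumerate(choices, start=1):
        streak = streak + 1 if correct else 0  # an error resets the streak
        if streak >= n:
            return i
    return None


# Made-up sequence of choices for one fish in the acquisition phase.
choices = [False, True, False, True, True, True,
           True, True, True, True, True, True]
window_ttc = trials_to_window_criterion(choices)        # 10
streak_ttc = trials_to_consecutive_criterion(choices)   # 9
```

Once either function returns a value for the acquisition phase, the reward contingency would be swapped and the same check applied again to the reversal-phase choices.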

Results/data output

The main measures of a subject’s ability to learn the discriminations in a reversal learning task are typically the number of trials to criterion (TTC) and the number of errors to criterion (ETC). Additional measures recorded for each trial can include response latency and the number of omissions. The Zantiks AD system can automatically measure and process behavioural endpoints in an easy-to-read format.

  • Trials to criterion (TTC): number of trials to reach the set learning criterion (CORRECT + INCORRECT) – the main measure of accuracy.
  • Errors to criterion (ETC): INCORRECT only – correlated measure of accuracy not affected by increased omission rates.
    • Perseverative errors – consecutive responses to the previously reinforced stimulus. These provide a measure of the ability to initially shift away from a previously relevant response.
    • Regressive errors – responses to the non-rewarded stimulus after making a presently correct response. These provide a measure of the ability to maintain a new response after the initial shift away from a previously learned response.
  • Omissions: trials where no response is made – a broad measure of the animal’s motivation level.
  • Response latencies – time elapsed between stimulus presentation and a response – a measure of motor function and/or speed of processing.
  • Performance during both test stages is frequently analysed with independent samples t-tests, one-way ANOVAs or two-way repeated measures ANOVAs, depending on the number of factors.
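The measures listed above can be derived from a per-trial outcome log, as in the following minimal Python sketch. The function name `summarise_phase` and the outcome labels are assumptions for illustration and do not reflect the actual Zantiks data format. The split between error types is one possible operationalisation: incorrect responses before the first correct response of the phase are counted as perseverative, and incorrect responses after it as regressive.

```python
def summarise_phase(outcomes):
    """Summarise one phase from a list of per-trial outcomes.

    `outcomes` holds the strings "CORRECT", "INCORRECT" or "OMISSION" for
    every trial up to and including the trial on which criterion was reached.
    """
    correct = outcomes.count("CORRECT")
    incorrect = outcomes.count("INCORRECT")
    omissions = outcomes.count("OMISSION")

    perseverative = 0
    regressive = 0
    seen_correct = False
    for outcome in outcomes:
        if outcome == "CORRECT":
            seen_correct = True
        elif outcome == "INCORRECT":
            # Assumed split: errors before the first correct response are
            # perseverative, errors after it are regressive.
            if seen_correct:
                regressive += 1
            else:
                perseverative += 1

    return {
        "TTC": correct + incorrect,  # omissions are not counted as trials
        "ETC": incorrect,
        "perseverative_errors": perseverative,
        "regressive_errors": regressive,
        "omissions": omissions,
    }


# Made-up reversal-phase log for a single fish.
log = ["INCORRECT", "INCORRECT", "OMISSION", "INCORRECT", "CORRECT",
       "INCORRECT", "CORRECT", "CORRECT", "OMISSION", "CORRECT"]
summary = summarise_phase(log)
```

For this made-up log, `summary` gives TTC = 8, ETC = 4, 3 perseverative errors, 1 regressive error and 2 omissions. Per-subject summaries of this kind are the values that would then be entered into the t-tests or ANOVAs.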

