Evaluating Mastery:

Measuring Instructional Outcomes for Children with Autism




Michael A. Fabrizio1, 2, 3

Alison L. Moors1

 

1Organization for Research and Learning

Seattle, Washington

 

2University of Washington

Area of Special Education

 

3University of North Texas

Department of Behavior Analysis

 


 

Citation for this article:

 

Fabrizio, M.A. & Moors, A.L. (2003). Evaluating mastery: Measuring instructional outcomes for children with autism. European Journal of Behavior Analysis, 4(1&2), 23-36.

 


     

     

    Introduction

     

                In colleges, universities, general education, and special education classrooms, professors, teachers, and others interested in helping children learn discuss issues related to instruction.  Unfortunately, the issues we discuss usually relate exclusively to the structure of instruction rather than the outcomes of instruction.  In general education, we commonly discuss how to motivate learners to participate actively in their education, how to incorporate technology into teaching and learning, how to encourage learners to “construct their own meaning” of important concepts, how to impart understanding of different cultures, how to best assess learning, and myriad other topics.  In special education, we commonly discuss similar issues: how to work closely with families, how to encourage relationships between children with special needs and their general education peers, how to develop portfolios of work samples for inclusion in state assessments, and how to teach students so that they learn the most.

                Across general and special education, we have become overly concerned with the structure of instruction.  Very few of us, by sad comparison, seem much concerned about whether learning actually takes place.  Rather than discuss how to ensure that students achieve results, we ruminate about process.  In doing so, we ignore the most important question: Did the learner master the skill?  This should not suggest that motivating students, incorporating new technology, assessing in valid ways, fostering relationships, and the rest are unimportant.  They are important, but only after we consider what should be the primary question in instructional evaluation—did the student really master what we taught? 

                When we fail to consider whether general education students really mastered a particular skill, the consequences of such failure on our part are fairly innocuous.  General education students learn relatively easily compared to special education students.  Regardless of whether a given piece of instruction actually produced meaningful outcomes for a general education student, that student will likely progress quite nicely.  The general education student will probably continue to further her education at least through high school.  She will probably continue to form significant relationships with other people, contribute to society, maintain meaningful employment, and participate in varied free time activities. 

                That special education students will achieve such outcomes is less certain if they do not master the critical skills we teach.  If special education students fail to master skills outlined for them within their Individualized Education Plans (IEPs), the consequences are much more dire.  If special education students do not learn to read or to perform basic computations, for example, they will likely not further their education.  If they do not learn to interact with peers, they will likely not form significant relationships with people throughout their lives—except those people paid to associate with them.  If they do not learn basic vocational skills, they will likely not find gainful employment that allows them to earn a living.  If they do not achieve basic competency in a host of leisure skills, they will likely not develop a menu of activities from which they may choose during their free time—at least not activities appreciated and valued by the larger culture within which the student lives. 

                Given the importance for special education students of receiving specialized instruction that leads to mastery of the skills we teach them, we should ensure that we evaluate such instruction not only by its structural features, but also by its ability to teach skills to mastery.  How are we to know how well a particular student should learn a particular skill before we say the student has mastered it?  How are we to know how many different vocabulary items we should teach a student before we determine he has learned to label items receptively?  Across how many different people should we measure a student’s use of a skill before we stop working on greeting skills?  Unfortunately, these are difficult (if not impossible) questions to answer because they are the wrong questions for instructional designers and teachers to ask.  They are the wrong questions because they are unanswerable a priori, that is, before we have taught.  How many different pictures should we teach a student to label receptively?  The answer to this question is that we should teach as many as required to allow the student to learn new pictures easily.

                Teaching should end only when we have taught skills to mastery.  Fortunately, the field of Precision Teaching within the discipline of Behavior Analysis contains a helpful, functionally defined metaphor for true skill mastery—fluency—that can help us determine when we might stop teaching a particular skill. 

     

     

    What are the specific outcomes of instruction for which we should strive?

     

                Haughton (1980) listed the outcomes associated with fluent performance using the acronym REAPS: Retention, Endurance, Application, and Performance Standards.  Under Haughton’s taxonomy of specific instructional outcomes, retention referred to maintenance of performance following periods without practice.  Endurance referred to performance across long durations and in the presence of distracting stimuli (such as the noise of a busy classroom, or the sound from a favorite song on the radio).  Application referred to use of the skill within more complex contexts (such as using the ability to fluently label objects in the room to discuss those objects within conversations).

                Johnson and Layng (1992) further refined Haughton’s (1980) definition of the outcomes of fluent performance when they parsed Haughton’s definition of skill endurance into two separate components: performance across extended periods (endurance) and performance in the presence of distracting stimuli (stability).  Thus, Johnson and Layng created a new acronym to describe the outcomes of fluent performance, RESA—Retention, Endurance, Stability, and Application.  With the outcomes of fluent performance thus separately named and considered, Johnson and Layng set the stage for clinicians and teachers to develop methods of individually evaluating each of those outcomes.  We could now evaluate students’ performances precisely and we could directly assess that performance in terms of its ability to be remembered (retention), performed for long durations without fatiguing (endurance), performed in the presence of distraction (stability), and extended to untaught examples (application). 

               

     

    General Procedures Involved in Directly Assessing the Outcomes of Instruction

    Stages of learning

     

                All learning proceeds through stages.  Various authors such as White and Haring (1980) and Wolery, Bailey, and Sugai (1988) have characterized and defined these stages slightly differently.  Most authors agree, however, that learning develops in at least two stages: accuracy building (also called acquisition) and frequency building.  In the accuracy building stage of skill development, learners’ performance progresses from highly inaccurate to highly accurate.  Students enter the accuracy building stage for a given skill and their performance is characterized by a high rate of errors and a low rate of corrects.  As their performance improves, they make more and more correct responses and fewer incorrect responses per unit of time until their performance reaches a high level of accuracy.  Even though their performance is highly accurate, though, it is usually far from fluent at the end of the accuracy building stage.  They may respond correctly most of the time, but their performance often shows long latencies and durations.  At this stage in their skill development, students need frequency building (the next stage of learning) to rectify these problems.

                In the frequency building stage of learning, students’ rate of incorrect responding often stays relatively constant and low, while their rate of correct responding accelerates: they get better and better at engaging in the skill.  Progressing through frequency building is essential to ensure that students really master the skills they learn.  Unfortunately, most instructional models, including most of those for children with autism, improve performance only to the end of the accuracy building phase of learning because of the statistic those models use to measure performance: percent correct.

                Performance that is accurate but slow and arduous certainly is not fluent.  Accordingly, clinicians and teachers should begin assessing for the outcomes of fluency (RESA) only after a skill has developed through the frequency building stage, but doing so requires a different metric—frequency. 
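                The contrast between percent correct and frequency can be sketched in a few lines of Python.  This is only an illustration: the function names and the two sample timings are hypothetical and do not come from the authors' data.  It shows that two performances with identical accuracy can differ widely in rate of correct responding.

```python
def percent_correct(corrects: int, errors: int) -> float:
    """Accuracy as a percentage of all responses (the conventional statistic)."""
    total = corrects + errors
    if total == 0:
        raise ValueError("no responses recorded")
    return 100.0 * corrects / total


def rate_per_minute(count: int, timing_seconds: float) -> float:
    """Frequency: count of responses scaled to a one-minute interval."""
    if timing_seconds <= 0:
        raise ValueError("timing length must be positive")
    return count * 60.0 / timing_seconds


# Two hypothetical 60-second timings (illustrative numbers only).
slow = {"corrects": 9, "errors": 1}
fast = {"corrects": 45, "errors": 5}

for label, timing in (("slow", slow), ("fast", fast)):
    accuracy = percent_correct(timing["corrects"], timing["errors"])
    corrects_rate = rate_per_minute(timing["corrects"], 60)
    print(f"{label}: {accuracy:.0f}% correct, {corrects_rate:.0f} corrects per minute")
```

                Both timings show the same percent correct, yet only the frequency measure separates the slow performance from the fast one.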

     


    Frequency Aims

     

                Teachers and clinicians should begin directly assessing the outcomes of instruction once a student’s performance has reached the suspected frequency aim for a skill and the student is practicing that skill across a full range of instructional items at an acceptable level of curricular complexity.  A frequency aim should state the level of performance (the rate of correct responding) that reliably predicts skill retention, endurance, stability, and application (RESA).  Through the fluency assessment procedures described below, we at Fabrizio/Moors Consulting have identified frequency aims for a host of skills that children with autism often need to learn.  Fabrizio/Moors Consulting developed the fluency assessment procedures that led to the frequency aims we include below across a five-year period with 43 children with varying levels of autism ranging in age from 18 months to 14 years old, and over 400 Standard Celeration Charts of students’ performance (Moors & Fabrizio, 2001; Fabrizio, Moors, Pahl, & King, 2002; Moors, Fabrizio, Pahl, King, & Schirmer, 2003).  We updated our frequency aims database by continually adding new data collected across our clients, with the data in Table 1 representing over 400 graphed examples of student performance.

     

    Table 1: Suggested Frequency Aim Ranges by Learning Channel

     

    Learning Channel    Suggested Frequency Aim Range    Example Skills
    ----------------    -----------------------------    --------------
    Hear/Do             35-50                            Hear/Do directions
    Hear/Say            40-60¹; 70-90²                   Hear/Say Sounds; Hear/Say Sentences
    Hear/Touch          35-40                            Hear/Touch animals by name; Hear/Touch colors
    See/Do              35-50                            See/Do gross motor imitation; See/Do oral motor imitation
    See/Say             55-70¹; 80-100²                  See/Say animals by name; See/Say size comparisons
    Free/Do             150-200                          Free/Do grasp-reach-release; Free/Do squeeze
    Free/Say            180-200²                         Free/Say steps in a process; Free/Say things you did in school

    ¹ Suggested frequency aim ranges when counting words as the movement cycle
    ² Suggested frequency aim ranges when counting syllables as the movement cycle

     

    Table 1: Learning channels commonly used in intervention with children with autism (left column), suggested frequency aims for each of those learning channels (middle column), and example skills which might be targeted for intervention within each learning channel.  Suggested frequency aims are given for both words and syllables depending on which movement cycle is counted.   
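                For readers who keep charted data electronically, the ranges in Table 1 could be held in a small lookup structure.  The sketch below is an assumption-laden illustration, not the authors' database: the dictionary layout, the "responses" label for rows without a footnote, the function name, and the reading of the ranges as correct responses per minute are all ours.

```python
# Suggested aim ranges copied from Table 1, keyed by (learning channel, movement cycle).
# "responses" is a hypothetical label for rows that carry no words/syllables footnote.
SUGGESTED_AIMS = {
    ("Hear/Do", "responses"): (35, 50),
    ("Hear/Say", "words"): (40, 60),
    ("Hear/Say", "syllables"): (70, 90),
    ("Hear/Touch", "responses"): (35, 40),
    ("See/Do", "responses"): (35, 50),
    ("See/Say", "words"): (55, 70),
    ("See/Say", "syllables"): (80, 100),
    ("Free/Do", "responses"): (150, 200),
    ("Free/Say", "syllables"): (180, 200),
}


def reached_suggested_aim(channel: str, movement_cycle: str, corrects_per_minute: float) -> bool:
    """Return True when the rate meets the lower bound of the suggested range.

    Meeting the range only signals that outcome checks can begin; it does not
    show that the skill is fluent for a particular child.
    """
    low, _high = SUGGESTED_AIMS[(channel, movement_cycle)]
    return corrects_per_minute >= low


print(reached_suggested_aim("See/Say", "words", 58))
print(reached_suggested_aim("Free/Say", "syllables", 150))
```

                A lookup of this kind only flags when a performance has reached a suspected aim; the aim itself still has to be validated for each student through the outcome checks.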

     

    Learning Channels

     

                These aims are organized by learning channel.  One of the earliest discoveries we made when we began collecting these data was that frequency aims seemed more a function of the learning channel than of the pinpoint.  How students practiced (that is, what learning channel we employed to teach the skill) appears more important than what the students practiced in determining what frequency of correct responding would likely predict RESA.  For example, we found that the performance of children who could See and then Say (See/Say) at rates of 50-55 correct responses per minute tended to predict the outcomes of fluency we describe later in this paper regardless of whether the children were Seeing and Saying the names of animals, the letters of the alphabet, locatives, or the names of people.  Prior to this discovery, we assumed that frequency aims depended largely on the nature of the pinpoint practiced.  As an example, we assumed the frequency aim for See/Say Animals would be different from that for See/Say Colors.  Our students’ data showed us otherwise. 

                We stress that the aims we present here are only suggested aims.  While we do have a substantial amount of data to support that these aims predict skill retention, endurance, stability, and application, we still evaluate each of these outcomes for every child on every skill we teach them and recommend that other clinicians and teachers empirically validate these aims for each of their own students. 

     

     

    The Order of the Outcomes Checks

     

                We recommend assessing each of the critical skill outcomes associated with fluent performance (retention, endurance, stability, and application) in no particular order, with one exception—skill retention.  We recommend that teachers and clinicians assess skill retention last.  When we began to evaluate systematically retention, endurance, stability, and application for each skill all our clients practiced, we assessed skill retention first and then proceeded to assess skill endurance, then skill stability, and, finally, skill application.  We quickly discovered a serious flaw with this approach: because assessing skill retention requires that we stop all instruction on the given task for some period, we found that if we assessed skill retention first, and the child’s performance failed this outcome check, then the child lost time!  If we assessed skill retention last, however, then we minimized the number of times that a month passed without instruction when the skill was not yet fluent.  One of our clients, Andrea, taught us this lesson.  We taught her to use locatives in her spoken language by describing the relative locations of two objects (See/Say prepositions).  For this task, she looked at pictures showing a line and a ball and labeled the location of the ball relative to the line in each picture.  We taught her a full range of relevant prepositions and had her practice until her frequency of correct responding reached what we suspected to be the frequency aim for the skill.  Once there, we paused all timed practice on the skill for one month to assess whether she would retain the skill.  When we re-presented her with the instructional task a month later, her performance was not close to what it had been—even after multiple practices.  She had not retained the skill.  This meant the frequency aim that we used did not predict skill retention and, much worse, she had lost a month’s time. 

                Because of the lesson Andrea’s data taught us, we recommend that clinicians and teachers assess skill endurance, stability, and application before they assess skill retention to minimize “lost months” of time.  Which of the other three outcomes of fluent performance (endurance, stability, or application) clinicians and teachers assess first does not appear to have much clinical consequence.  Once we have worked with a particular student for some time and observed which outcomes checks they seem to have difficulty with, we will generally assess that outcome first.  If we notice, for example, that a given student has little difficulty passing stability and application checks but tends to have difficulty passing endurance checks, we will begin by assessing the skill’s endurance.  Assessing first the outcome of instruction the student is least likely to pass allows us to modify instructional procedures and arrangements more quickly.
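                The ordering rule described above can be sketched as a short function.  This is a minimal illustration under stated assumptions: the function name and the idea of summarizing a student's history as a pass proportion per outcome are hypothetical conveniences, not part of the authors' procedure.

```python
from typing import Dict, List


def order_outcome_checks(pass_history: Dict[str, float]) -> List[str]:
    """Order the four outcome checks for one student.

    pass_history maps an outcome name to the proportion of past checks the
    student passed (a hypothetical summary).  Outcomes with no history are
    treated as 1.0, which keeps them in their default position.
    Retention always comes last because checking it requires pausing instruction.
    """
    others = ["endurance", "stability", "application"]
    # sorted() is stable, so ties keep the default order listed above.
    ordered = sorted(others, key=lambda outcome: pass_history.get(outcome, 1.0))
    return ordered + ["retention"]


# A student who tends to have difficulty with endurance checks.
print(order_outcome_checks({"endurance": 0.4, "stability": 0.9, "application": 0.8}))
```

                With no history for a student, the function falls back to an arbitrary default order for the first three checks, which matches the observation that their order has little clinical consequence.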

     

     

    Measuring outcomes separately

     

                We also recommend that teachers and clinicians measure each of the outcomes of fluent performance separately rather than in combination.  Johnson & Layng (1992) demonstrated that frequencies that predict one outcome of fluency might not predict all outcomes.  Performance on a given task of 70 correct responses per minute may reliably predict skill stability, but may not predict skill endurance or application.  Alternatively, 70 corrects per minute may predict skill retention but not skill stability.  Measuring each of the outcomes of fluency separately allows teachers and clinicians to ensure that students have mastered each of the critical outcomes of instruction.  Measuring separately also allows teachers and clinicians to design remedial instruction that precisely targets problems students may experience. 

                Let us consider, as an example, generalization as a desired outcome of instruction.  The larger behavior analytic literature often uses the word generalization to refer to use of a skill across instructional stimuli, people, and places.  One of the advantages of measuring the effectiveness of instruction according to the outcomes associated with fluent performance is that we are able to assess performance across people and places (skill stability) separately from performance across instructional stimuli (skill application).  This allows us to modify instruction accordingly.  If a child’s performance on a given skill does not satisfy our criteria for skill stability, we can then modify the nature of the frequency building arrangements they experience to include performance under distracting conditions.  If a child’s performance does not satisfy our criteria for skill application, we can broaden the range of instructional stimuli used to teach the targeted skill.  Assessing skill stability and skill application as separate outcomes of instruction allows us to modify the instruction easily and directly so that we may correct for any difficulty the student may be having on any given skill. 
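                Recording each outcome separately also makes it simple to see which remedial change applies.  The sketch below is illustrative only: the class, the dictionary, and the function name are hypothetical, and the two remedies listed are the ones described in the preceding paragraph.  The text does not specify remedies for failed endurance or retention checks, so the sketch leaves them unspecified.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class OutcomeChecks:
    """Pass/fail result of each separately measured outcome for one skill."""
    retention: bool
    endurance: bool
    stability: bool
    application: bool


# Remedies named in the text for two of the four outcomes.
REMEDIES = {
    "stability": "include performance under distracting conditions in frequency building",
    "application": "broaden the range of instructional stimuli used to teach the skill",
}


def remedial_targets(checks: OutcomeChecks) -> Dict[str, str]:
    """Map each failed outcome to the instructional change the text suggests."""
    plan = {}
    for outcome in ("endurance", "stability", "application", "retention"):
        if not getattr(checks, outcome):
            plan[outcome] = REMEDIES.get(outcome, "not specified in the text")
    return plan


print(remedial_targets(OutcomeChecks(retention=True, endurance=True, stability=False, application=True)))
```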

     

     

    Explanation of figures

     

                To illustrate the measurement and evaluation of fluency outcomes, we have included four Standard Celeration Charts (SCCs) as Figures 1 through 4 from four different children.  In selecting which charts to include here, we chose charts that demonstrated how such measurement and evaluation may be applied across a broad range of types of children with autism.  We selected charts both from younger and older children who have varying levels of autism (Asperger’s Syndrome to severe autism).  Throughout the remainder of this paper, we will refer back to these four SCCs as examples of how we evaluate whether learner performance displays the characteristics of fluency.

                Figure 1 shows Katherine’s performance sorting pictures of people into the categories of “boy” and “girl.”  At the start of this chart, Katherine was six years and one month old, with a diagnosis of severe autism.  This chart shows her performance sorting pictures of boys and girls into separate piles.  She practiced this task with flashcards of various pictures taken from magazines depicting a variety of critical and variable attributes that define gender (for example, clothing style, face structure, the presence or absence of facial hair).  With six weeks of practice, she passed retention, endurance, stability, and application checks.

                Figure 2 shows Jonah’s performance shaking an object.  Shaking is a fundamental motor skill that we commonly teach children so that they can play with a wider range of toys.  When Jonah began practicing shaking, he was 11 years and seven months old, with a diagnosis of severe autism.  This chart shows his rates of shaking an object with one hand from side to side separately with his right and left hands.

                Figure 3 shows Russell’s performance on a conversation component skill—saying as many different facts about various topics as he could (Free/Say facts about nouns).  Russell has a diagnosis of Asperger’s Disorder and was nine years and eight months old when practice started on this chart.  After fifteen weeks of practice, he passed all four outcomes checks at a rate of 180-200 syllables per minute. 

                Figures 4a and 4b show Chris’s performance on a different conversation component skill: Hear category name/Say items within the category.  When timed practice started on this skill, Chris was four years and three months old.  He has a diagnosis of Pervasive Developmental Disorder-Not Otherwise Specified (PDD-NOS).  After twelve weeks of practice, he passed all four outcomes checks at a rate of 40-50 items named correctly per minute.

     

     

    Measuring Skill Endurance

     

    Definition

     

                Skill endurance refers to the feature of fluent performance whereby learners may engage in a skill for prolonged periods without fatiguing (Binder, Haughton, & Van Eyk, 1990; Binder, 1984, 1996).  At Fabrizio/Moors Consulting, we measure fatigue (or, more precisely, the lack thereof) by comparing our students’ performance across a timing of triple the longest practiced interval to that of their previous best performance.  For us to determine that we have empirically demonstrated skill endurance, our students must meet or exceed their previous best performance across a timing of three times the longest previously practiced interval.  As an example, if a student practiced learning to Hear and then Say (Hear/Say) vowel-consonant sound combinations (e.g., “ag”, “ib”, “ope”) across 30-second timings, we would time them for 90 seconds as their endurance check.  If the student previously practiced a skill for one minute as the longest timing interval, the endurance check for that skill would be three minutes long.

                We assess skill endurance across timings that are triple the longest previously practiced interval to ensure we are requiring skills to be performed for significantly longer periods than used during frequency building.  During endurance checks, we use the same materials the students used during frequency building and we perform the endurance check in the same physical environment with the same level of distraction as they experienced during frequency building. 
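                The arithmetic of the endurance check can be written out as a short sketch.  The function and parameter names are hypothetical; the tripling rule and the "meet or exceed previous best" criterion come from the definition above, and rates are assumed to be expressed per minute.

```python
def endurance_check_seconds(longest_practiced_seconds: float, multiplier: int = 3) -> float:
    """Length of the endurance timing: triple the longest practiced interval."""
    return longest_practiced_seconds * multiplier


def passes_endurance(check_corrects: int, check_seconds: float, previous_best_per_minute: float) -> bool:
    """Pass when the rate across the longer timing meets or exceeds the previous best."""
    check_rate = check_corrects * 60.0 / check_seconds
    return check_rate >= previous_best_per_minute


# A skill practiced in 30-second timings gets a 90-second endurance check.
print(endurance_check_seconds(30))
# A skill practiced in one-minute timings gets a three-minute (180-second) check.
print(endurance_check_seconds(60))
# Hypothetical result: 45 corrects in 90 seconds against a previous best of 30 per minute.
print(passes_endurance(45, 90, 30))
```

                Expressing both performances as rates per minute is what allows a 90-second check to be compared directly with a best performance from a 30-second timing.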

     

    Importance for autism

     

                Children with autism should learn skills well enough that the skills are usable within functional contexts.  One parameter of any functional context is the time across which students need to employ skills.  Students need not only to be able to use skills across people, places, and instances, but also for task-appropriate lengths of time.  If a child learns to answer and ask conversational questions but cannot sustain that performance long enough to hold even a basic conversation, then the utility of the skill is greatly reduced.  If a student learns to read to some level of mastery, but cannot sustain that performance long enough to enjoy a book chapter, magazine article, or some other meaningful unit of text, then the usefulness of their reading is greatly reduced.  If children with autism can shift their attention between a teacher and their peers appropriately, but cannot sustain such shifting long enough to participate meaningfully in a classroom discussion, their ability to succeed in less structured educational contexts such as general education is diminished. 

                To enhance skill utility for children with developmental disabilities, we must ensure empirically that students can perform skills for functional lengths of time.  Because the precise duration of such functional lengths of time is often difficult or impossible for clinicians and teachers to specify, we must ensure that as we teach children with autism skills, we teach endurance as a general property of performance rather than teaching for specific units of time.  If, for example, we teach a child to engage in a conversation with a peer for five minutes, will that child be able to engage in conversations of longer lengths?  Will they be able to engage in conversations of lengths sufficient to converse with the full range of people with whom they will likely need to converse?  Clinicians and teachers cannot predict the future.  We very often cannot know the full range of times across which our students will need to use their skills.  Accordingly, we should teach in ways that promote flexibility in the time across which students perform.

                Beyond using skills for meaningful lengths of time, many children with autism need to learn to persevere in their behavior.  Behavior that perseveres is more likely to be reinforced by the naturally thin and variable schedules of reinforcement often available outside the context of specialized instruction.  When students engage in skills for long periods of time—when their performance endures—their behavior is much more likely to continue under the thinner schedules of reinforcement that often characterize the larger world.  These schedules often differ substantially from those present in the specialized instructional arrangements within which many children with autism learn many important skills. 

     

    Example charts with explanation

                Katherine had been practicing sorting pictures of people into categories within timings that were 30 seconds long.  She completed her 90-second endurance check on the skill on October 2, 2001, and showed that she could perform at or above the frequency aim for the skill (30 per minute) for this substantially longer length of time.  Jonah’s endurance check for his shaking occurred on October 12, 2002 for both his left and right hands.  For the endurance check, we doubled the length of his previous longest timing to 60 seconds.  Jonah shook at slightly above 100 shakes per minute across the longer timing.  Russell’s endurance check (Figure 3) took place on October 26, 2001, after fourteen weeks of frequency building on saying facts about topics.  He passed his 90-second endurance check at a rate of 200 syllables per minute.  Chris’s endurance check (Figure 4b) also lasted for 90 seconds because his previous longest timing used during frequency building was 30 seconds.  During his endurance check on February 26, 2002, Chris’s performance stayed within the frequency aim range of 40 to 60 items said correctly per minute from all taught categories.

     

     

    Measuring Skill Stability

     

    Definition

     

                Fabrizio/Moors Consulting defines skill stability in the same way as Johnson & Layng (1992)—performance in the face of significant distraction.  That a student can perform a given instructional task under the highly stable, and often very sterile, conditions we often arrange for specialized instruction matters much less than whether the student can perform the skill in the active, busy, noisy and distracting world at-large.  If a student cannot use a skill under such highly distracting conditions, then the skill is of relatively little use to the student.  As instructional programmers, teachers and clinicians should seek to develop skills within their students’ repertoires such that the skills are useful in the myriad environments within which students interact throughout the course of their daily life.  If we desire to ensure that our students can use skills we teach in the grocery store, with many people moving about them, music playing in the background, and loud voices intermittently ringing overhead, then it is incumbent upon us to measure skills we teach under conditions that at least approximate such distracting environments. 

                           

    Importance for autism

     

                Stability is a particularly important outcome for children with autism given their difficulty with skill generalization (e.g., Sundberg & Partington, 1998; Belifore & Mace, 1994).  Recommendations that instructional programs for persons with autism target skill generalization as an important outcome abound (cf. National Research Council, 2001).  To measure skill stability, we present our students with the same materials used during frequency building and time their performance across a timing length equal to that used during frequency building.  What distinguishes a stability check timing from a frequency building timing is the presence of significant distractors that we introduce.  During stability checks, we ensure that the environment contains multiple distractors that were not present during frequency building.  The nature of the specific distractors we may employ when assessing skill stability with our clients varies from child to child.  For a student who prefers a certain cartoon video, we may play that video and have the child complete a stability check while lying on their living room floor in front of the television.  When working with students for whom the presence of their mother or father is highly distracting, we might assess skill stability by arranging for one of the student’s parents to enter and leave the instructional area several times during the stability assessment timing.

     

    Example charts with explanation

     

                Referring again to Katherine’s sorting performance as shown in Figure 1, at the original timing interval of thirty seconds, and with the original flashcard pictures from timed practice, Katherine passed her Stability check on October 2, 2001 with a rate of 30 correct cards sorted per minute with only one error.  Katherine completed this stability check in the family room with a favorite video playing loudly in the background, rather than in a quiet therapy room within her house where she completed all of her daily timed practice.  The change in environment from where she typically practiced, the increased noise level of the family room, and the presence of a favorite video playing in the background during the stability check timing all constituted significant distractions for her. 

                Jonah passed his stability checks for Free/Shake using left and right hands on 10/16/02 (Figure 2).  He maintained his rate of 200 shakes per minute on a 30-second timing while a favorite video was playing loudly in the background and while he was sitting with his mother at the kitchen table.  The change in environment and the addition of highly reinforcing songs from the video each constituted a significant distraction for this learner.

                Figure 3 shows Russell’s performance on Free/Say facts about a noun.  His stability check (on 10/24/01) yielded a rate of 180 syllables per minute with zero errors, which matched the rates of this previously practiced skill.  Russell completed this stability check in a different location from where he typically worked and with a baseball game playing on the television set; baseball was Russell’s biggest passion at that point in his life.

                The stability check for Chris’s Hear category/Say items chart (Figure 4b) was completed on 2/21/02.  As with the previous students’ charts, significant distraction was introduced into the environment during Chris’s timed practice.  In the presence of this distraction, he maintained his frequency of 40 items labeled correctly per minute with zero errors.

     

     

    Measuring Skill Application

     

    Definition

     

                We define skill application as the extension of a skill to untaught examples.  Extending a skill to untaught examples supports a crucial goal of instruction—that the student’s behavior comes under appropriate stimulus control.  A major goal of any piece of instruction for children should be that what they learn they are able to apply to instances beyond those presented within instruction.  Determining, through direct measurement, that students have learned what we taught is certainly something we should celebrate.  We should temper such celebration, however, until we are sure that we have taught our students to perform across a range of instructional examples sufficient to produce generalized responding.  How do we know whether we have accomplished this task?  We measure the student’s ability to perform the skill in response to discriminative stimuli different from those used in instruction. 

                Let us consider a more difficult example: answering personal information questions.  If we teach a student to answer a basic set of questions fluently, then they should be able to answer those questions regardless of how the question is structured so long as the critical information is present.  If we teach a child to answer the question, “How old are you?” and we believe the student can perform fluently across a wide enough range of examples of the question, then they should be able to respond at the same (or higher) frequency when we ask them the question in a way they have not previously heard.  If we taught the student to reply with their age across the example questions, “How old are you?”, “What is your age?”, and, “You are how old?”, and this set of three question forms represents an adequate range of variable stimulus features to occasion generalized responding, then the child should answer quickly and easily when asked, “[Child’s name], your age is what?”  What represents an adequate range of examples and non-examples needed to occasion variable responding depends on a great many things: the presence or absence of component skills within the child’s repertoire at the time of instruction, the complexity of the discriminative stimulus, and the complexity of the variable features of the stimulus.  Because of this, clinicians and teachers often find themselves having to guess how wide a range of examples to use during instruction to facilitate generalized responding.  Question: “How many different cups do we teach the child to receptively label?”  Answer: “As many as are needed to produce generalized responding.”  How are we to know when we have produced generalized responding across examples?  When the student’s performance data across novel examples match or exceed those across taught examples. 

     

    Importance for autism

     

                As with skill stability, skill application is a very important issue for instructional programmers working with children with autism.  That students can respond to the instructional stimuli used in teaching is only a preliminary requirement of well-designed instruction.  If we wish our students to use their skills, we must also ensure that students can respond to the myriad untaught examples they may encounter throughout their lives.  Researchers and clinicians have long noted challenges presented by how readily the behavior of children with autism will come under overly narrow stimulus control.  Further, educators have levied criticism against some behavior analytic models of instruction because of their failure to correct for stimulus overselectivity and larger generalization issues.  If we teach a child to label elephants, kangaroos, dogs, cats, and mice as part of a piece of timed practice, we would then present the child with all new examples of these animals when we conducted the application check of a skill.  When conducting skill application assessments (which we call “application checks”), we present the student with novel examples of the items used during frequency building and have the child complete a timing equal in length to the last timing interval used.  We conduct application checks within the same physical environment as that of frequency building.
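The pass criterion for an application check can be expressed as a small calculation.  The following is only a minimal sketch, not the authors' procedure: the names `Timing`, `per_minute`, and `meets_previous_best` are hypothetical, the numbers are invented for illustration, and the sketch assumes that "match or exceed" means at least as many correct responses per minute and no more incorrect responses per minute than the previous best.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    correct: int      # correct responses counted during the timing
    incorrect: int    # incorrect responses counted during the timing
    seconds: float    # length of the timing interval


def per_minute(count: int, seconds: float) -> float:
    """Convert a raw count from one timing into a frequency per minute."""
    return count * 60.0 / seconds


def meets_previous_best(check: Timing, best_correct: float, best_incorrect: float) -> bool:
    """Assumed reading of 'match or exceed': corrects per minute at or above
    the previous best, and errors per minute at or below it."""
    return (per_minute(check.correct, check.seconds) >= best_correct
            and per_minute(check.incorrect, check.seconds) <= best_incorrect)


# Hypothetical numbers: 16 correct and 1 error in a 30-second timing on novel
# items, against a previous best of 30 correct and 2 errors per minute.
novel = Timing(correct=16, incorrect=1, seconds=30)
passed = meets_previous_best(novel, best_correct=30, best_incorrect=2)
```

Expressing both the check and the previous best as per-minute frequencies keeps the comparison valid when the timing interval is shorter than one minute.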

     

    Example charts with explanation

     

                Katherine’s sorting chart (Figure 1) shows that she passed her application check on 10/9/2001.  An original timing interval of thirty seconds was used and Katherine sorted all pictures (not previously practiced) by gender at a rate of 36 cards sorted correctly per minute with two errors.

                In Figure 2, the data for Jonah’s application check show a rate of 172 correct per minute on his left hand and a rate of 210 correct shakes per minute on his right hand.  The materials used for this application check were different from the materials used during timed practice.  For timed practice, Jonah used a tic-tac container, and for the application check he used maracas.

                For the application check in Figure 3, Russell was given a topic which he had never previously practiced (Halloween).  He performed this skill at a rate of 190 syllables said correctly about the topic during a one-minute timing interval on 10/18/01.  This rate matched the rates from the previously practiced topics—showing that he was able to extend his performance to unpracticed topics with the same high degree of competence he showed with skills he practiced daily.

                Referring to Figure 4b, Chris passed his application check on 2/22/02.  As with Russell, Chris was provided with novel categories that he had never previously practiced.  He maintained his rate of 40 items recalled correctly per minute with one error across the new categories. 

     

     

    Measuring Skill Retention

     

    Definition

     

                An operant definition of remembering used here is performance following a period without practice or opportunity for reinforcement.  This definition is helpful because it allows clinicians to develop specific procedures for assessing retention.  How long that period should be is a matter of individual student needs.  For example, it may be important that students remember some skills for a short period.  A student may need to remember other skills for longer periods without practice.  At Fabrizio/Moors Consulting, we measure skill retention after a period of at least one month without instruction.  Once a child’s performance has reached the suspected frequency aim, and they are practicing all of the parts of a curricular sequence we would like them to practice, then we stop all practice on the skill for a period of one month.  We put the materials away and go on to work on other things the child needs to learn.  After one month, we bring out the original materials again and have the child complete up to two timings.  If they match or exceed their previous best performance for both the frequency of correct and incorrect responses, then we have empirically demonstrated retention for that skill. 

                If students do not meet or exceed their previous best performance within the first timing following a full month without practice, it is possible that they performed less than optimally because of how much time had passed since they last practiced the skill in this format.  They may perform poorly not because of a skill retention problem, but because they “forgot” how to engage in the instructional task.  Because of this, we allow students within our private practice one “warm-up” timing.  This is the only assessment of fluency outcomes on which we allow more than one timing.  We require that our students’ performance meet or exceed their previous best performance on the first timing for our assessments of skill endurance, stability, and application.
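The difference between the retention check and the other outcome checks lies in how many timings count toward the decision.  The sketch below illustrates that rule under stated assumptions: `MAX_TIMINGS` and `passes_check` are hypothetical names, the data are invented, and "meet or exceed" is again assumed to mean correct responses at or above the previous best and incorrect responses at or below it.

```python
# Timings allowed per outcome check, as described in the text: retention
# permits one warm-up timing; the other checks must be passed on the first.
MAX_TIMINGS = {"retention": 2, "endurance": 1, "stability": 1, "application": 1}


def passes_check(kind, timings, best_correct, best_incorrect):
    """Decide whether an outcome check is passed.

    timings: list of (correct_per_min, incorrect_per_min) in the order taken.
    Only the first MAX_TIMINGS[kind] timings are considered.
    """
    allowed = timings[:MAX_TIMINGS[kind]]
    return any(c >= best_correct and e <= best_incorrect for c, e in allowed)


# Hypothetical data: a slow warm-up timing followed by a timing at aim.
timings = [(24, 2), (30, 0)]
retention_ok = passes_check("retention", timings, best_correct=30, best_incorrect=2)
stability_ok = passes_check("stability", timings, best_correct=30, best_incorrect=2)
```

With these invented numbers, the same pair of timings would pass as a retention check but not as a stability check, because only the first timing counts for stability.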

                           

    Importance for autism

     

                Any skill that we choose to spend our time teaching and which we ask students to spend their precious time learning should be remembered; if a skill is so unimportant that it does not matter whether students do or do not remember it, then perhaps it was not important enough to spend valuable time teaching in the first place.  Too often the same objectives appear over and over again on the Individualized Education Plans (IEPs) of children with autism because the skill, although previously taught, is mysteriously absent from the student’s repertoire at a later time.  Over the course of their school careers, more and more of these students’ time is spent relearning skills they learned previously.  It is essential that, as clinicians and teachers, we teach skills to sufficient strength that they will likely be remembered and that we evaluate our instruction partially on its ability to systematically and reliably produce such retention.

     

    Example chart with explanation

     

                For Katherine’s see person in a picture/sort by gender (Figure 1), the retention check was the first outcomes check she completed.  Although we do not recommend that clinicians and teachers complete retention checks before evaluating skill endurance, stability, and application for the reasons we outlined previously, Katherine’s team decided to evaluate skill retention first for two reasons.  First, Katherine was starting school and the team needed to reduce the workload required at home to accommodate her new daily school schedule.  Therefore, her team decided to place this program on retention check first to remove it from the schedule as soon as possible.  Second, beyond reducing the demands of Katherine’s home program to accommodate her school schedule, Katherine had passed RESA checks on three other skills that used the See/Match learning channel at a rate of 30 correct per minute.  Therefore, her team members were confident that Katherine’s sorting by gender would also pass the RESA checks at that same rate. 

                The retention period began on August 11, 2001 and was completed on September 28, 2001.  After the retention period, Katherine performed the identical task at a rate of 30 cards sorted correctly per minute with one error across a 30-second timing.

                Figure 2 shows that Jonah passed his four-week retention check for Free/Shake on both his right and left hands on November 15, 2002.  After these four weeks without practice, Jonah maintained his frequency of 200 shakes per minute on the original thirty-second timing interval with the original materials.

                For Russell’s Free/Say facts about a noun chart (Figure 3), the retention period began on October 27, 2001 and was completed on Novem