Common Errors Made by Students in Assembler Language: Can They be Avoided?

F. Layne Wallace
University of North Florida

Jeffrey N. Carruth
MIPS Computer Co.

Douglas R. Luecke
Teknekron Infoswitch

Abstract

While many studies have attempted to explain the underlying knowledge networks of computer programmers, most have done so with little or no supporting empirical evidence. One method of determining the structure of these networks is to look at errors and determine their origins, thus establishing some interrelated cognitive constructs. This study catalogued recurring errors in IBM 360/370 assembler language, attempted to emphasise these errors in the classroom and offered some possible explanations for the faulty nodes of the network.

Keywords: Programming errors, assembler language, programming instruction, cognitive models.

Recently, computer science research in human factors has focused on how knowledge networks are established by naive and novice programmers (Friedman, 1974; Dijkstra, 1976; Schwartz, 1976; Wirth, 1977; Soloway, Bonar, Woolf, Barth, Rubin and Ehrlich, 1981). One popular method of gaining information about these networks is to examine programming errors, which are the instances when the networks prove faulty (Gannon, 1978). Several attempts have been made to catalogue specific errors (Soloway, et al., 1981; Swigger and Wallace, 1982; Bayman and Mayer, 1983; Rulon, 1984).

Soloway, et al. (1981) caution that an examination of errors made by subjects learning a language in the same class with the same teacher may only reflect the teaching style of the instructor. Thus, conclusions drawn from such a study may not generalize beyond that one class.

Another area of concern is the study of programming errors in a high level language. The errors associated with a high level language may have many different origins in the breakdown of the students' knowledge network (Shneiderman, 1980).
This makes construction of an accurate cognitive model difficult. Past research has suggested that students using a high level language like Pascal, BASIC or PL/I did not really understand the proper use or implications of some of the languages' statements (Soloway, et al., 1981; Swigger and Wallace, 1982; Bayman and Mayer, 1983; Rulon, 1984). Assembler language might be a better place to start in the examination of student errors because most of its statements cause only one action to be taken by the computer. This would force the student to modify any existing network to allow for a more step-by-step implementation of cognitive constructs.

The present study examines recurring errors in IBM 360/370 assembler and divides these errors into two categories: misunderstood commands and semantic errors.

The subjects were graduate and undergraduate students attending North Texas State University and enrolled in an introductory IBM 360/370 assembler language course. Subjects were drawn from these courses over a period of three full-length semesters. The effect of an individual instructor's teaching style was reduced by counterbalancing across three different instructors. While the textbook for each class was the same, all instructors used a variety of sources for their lecture material. To reduce the possibility of recording errors specific to a particular programming assignment, all instructors gave different assignments during the semesters. To ensure complete independence of results, the instructors collected data from only their own classes for the three semesters and then collated the results at the end of the research period.

Data were collected by observing which errors occurred repeatedly during debugging sessions with the students. Errors that the students could correct on their own were not included in this study. Care was taken to make sure that a recurring error was not simply the same student making the same error time after time.
The errors were then divided into two categories: misunderstood commands and semantic errors.

It was found that the same programming errors did indeed appear across classes and across semesters. See Appendices A and B for a list of these errors. Students who made these errors did not seem to fall into groups divided by sex, age or educational level. One common factor, however, was the amount of hands-on exposure to the concepts used in assembler language, for example through writing programs. After a student was made aware of the faulty concept, the student usually did not commit that specific error again or, if the error did recur, was able to correct it with no assistance from the instructor.

It was also discovered that students, as a group and across semesters, continued to make these mistakes even after the instructor had repeatedly cautioned the class against them. This leads one to believe that students do not process some information at a practical level during lecture, but that when forced to assimilate the information while writing a program they tend to retain the concepts. Findings such as these suggest that courses that try to teach a programming language without the hands-on experience of writing and debugging programs leave the students with incorrect knowledge networks. Taking this idea a step further, programming courses that require the student to write programs may convey more information, and convey it at a level that allows prolonged retention of the material.

While some errors were categorized as misunderstood commands, the underlying cause may actually have been semantic. For example, one error found frequently was that students used a DS statement to initialize a variable instead of the DC statement. This seems to be purely syntactic on the surface, but it may be that the student has a faulty network node concerning the use of DS and DC and what each statement actually causes the computer to do.
Another example of a faulty network node is the error of trying to MVC data to an output file instead of using the PUT macro. This second example may be an indication that the networks for higher level languages are organized in a different fashion than the networks for assembler. An alternative explanation could be cognitive interference, where the concept of printing in assembler has become confused with the concept of printing in a high level language.

The errors found to be the most common by the authors were presented to other assembler teachers who had taught this course in the past using different textbooks, and all agreed that these errors seemed to be the most prevalent. This provides some continuity between the present instructors and the past instructors, and indicates that the errors are not textbook specific. The question of whether these findings can be generalized to other universities has yet to be answered.

After compiling the lists found in the Appendices, an attempt was made to determine whether the faulty nodes could be avoided by emphasising the errors in the classroom. For one class, this emphasis took the form of increased class discussion, homework for each of the common errors and an in-class discussion of the homework. Another class (taught by the same instructor) was not given the emphasised lectures or homework. The two classes were compared using errors found in programs. The class receiving the additional emphasis on errors had no fewer errors than the class that received no emphasised lectures or homework. However, on the final programming assignment, the group given the lecture material covering the common errors had fewer common errors than the control class (4 errors per program vs 9 errors per program). A Student's t test showed a significant difference (t=5.673, df=165, p<.01).
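As a rough illustration of how a statistic of this kind is computed, the following Python sketch implements the pooled-variance form of Student's t test for two independent groups. It assumes that this is the form of the test reported above, which the text does not state explicitly. The function name pooled_t, the standard deviations and the split of the observations between the two groups are hypothetical and serve only to show the arithmetic. Only the two means (9 and 4 errors per program) and the degrees of freedom (165, which under this form implies 167 observations in total) come from the study.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for a pooled-variance two-sample Student's t test."""
    # Degrees of freedom for two independent groups.
    df = n_a + n_b - 2
    # Pooled variance: each group's variance weighted by its own df.
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    # Standard error of the difference between the two group means.
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df


# Means are taken from the text. The standard deviations (6.0, 5.0) and
# the group sizes (84, 83) are HYPOTHETICAL; they were chosen only so
# that the degrees of freedom match the reported df=165.
t_value, dof = pooled_t(9.0, 6.0, 84, 4.0, 5.0, 83)
print("t =", round(t_value, 3), "df =", dof)
```

Because the spread of errors within each class is not reported, the t value this sketch prints should not be expected to reproduce the published figure; it only shows how the group means, variances and sizes combine into the statistic.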
The amount of experience does seem to be an integral part of forming an accurate and usable mental model, but experience by itself may not be enough. Swigger and Wallace (1982) found no differences between the performance of novice and expert Pascal and PL/I programmers in choosing the appropriate looping structures for a given set of problems. This suggests that undirected experience has little or no benefit in building cognitive models and that experience may need to be directed. Sondheimer (1979) found that people using text processors tend to use the same set of commands over a prolonged period of time. Therefore, if the mental model is built incorrectly, programmers may tend to "patch" faulty nodes in a knowledge network by making their solutions to a given problem fit their faulty models instead of changing their models to more appropriate ones.

Future research should be directed toward examination of the actual knowledge networks being used by beginning students as well as the manner in which the students build such networks. This study suggests that practical experience with a language will facilitate the establishment of valid networks. Research should be initiated to determine whether this experience is just prolonged exposure to a construct or whether the construct is learned through some type of internal model building. Other languages need to be examined to see if the same errors occur over and over, and to determine if there is any relationship between the errors catalogued here and those catalogued in other languages.

References

Bayman, P. and Mayer, R. E. A diagnosis of beginning programmers' misconceptions of BASIC programming statements. Communications of the ACM, vol. 26, no. 9, 1983.

Dijkstra, E. W. A Discipline of Programming. Prentice-Hall, New Jersey, 1976.

Friedman, D. The Little LISPer. Science Research Associates, California, 1974.

Gannon, J. D. Characteristic errors in programming languages. Proc.
of the 1978 Annual Conference of the ACM, 1978.

Rulon, S. R. Beginners' misconceptions of nine BASIC statements. Federation of North Texas Area Universities Eighth Annual Computer Science Conference, 1984.

Schwartz, J. T. What programmers should know. Computer Languages, vol. 2, 1976.

Shneiderman, B. Software Psychology: Human Factors in Computer and Information Systems. Winthrop, Mass., 1980.

Soloway, E., Bonar, J., Woolf, B., Barth, E., Rubin, and Ehrlich, K. Cognition and programming: Why your students write those crazy programs. Proc. of the NECC, 1981.

Sondheimer, N. On the fate of software enhancements. Proc. of the National Computer Conference, 1979.

Swigger, K. and Wallace, F. L. Use of appropriate looping structures: Expert vs Novice. Proc. of the Human Factors Society, 1982.

Wirth, N. The programming language Pascal. Acta Informatica, vol. 1, 1977.