The Army Officer Evaluation Report: Historical Background

Arthur Coumbe
Jul 3, 2022

Updated: Apr 30, 2024

Introduction

The principal purpose of the Officer Evaluation Report (OER), or officer efficiency report as it was known until 1973, has been to serve as a basis for personnel decisions. Matters of promotion, elimination, retention in grade, command selection, and school selection have all depended on the OER. Furnishing personnel managers with information crucial for the appropriate assignment and utilization of officers has been another aim of the report. More recently, the OER has been employed as a tool for professional development. Over the last several decades, evaluation reports have attempted to stimulate an active interchange between superiors and subordinates, giving the latter the opportunity to benefit from the former’s knowledge and experience and ensuring that the rated officer was fully aware of his superior’s expectations.

Unfortunately, the OER has not, in the main, lived up to the exalted hopes that the Army and its leaders have had for it. It has been bedeviled by a host of internal and seemingly intractable flaws that make it of marginal value both to the Department of the Army and to the individual officer. This paper will sketch the evolution of the OER and offer some thoughts about the reasons behind its inadequacy.

Historical Overview of the OER

Historically, efficiency reports for Army officers have varied greatly—from rather desultory and unstructured evaluations written by commanders to the complicated 24-page annual report introduced in 1914. The former often told virtually nothing about the rated officer while the latter provided an overabundance of detail. 1

The system of assessing officers had emerged as an issue of considerable concern in the 1880s, when growing professionalism within the officer corps pushed the Army to explore more systematic ways of evaluating performance. The Army first used a permanent evaluation reporting system for its officers in 1890. To be sure, there were earlier attempts to gauge officer effectiveness. Soon after assuming command of the Continental Army, George Washington directed battalion commanders to prepare written evaluations on all officers in their units. The reports were used to adjust the ranks of officers within their battalions. Throughout most of the nineteenth century, the Army relied principally upon two types of evaluations to gauge the effectiveness of its officers—letters to the Secretary of War or The Adjutant General (TAG) and written assessments provided by the Inspector General. Both types of reports aided in the selection of officers for permanent commissions and in the weeding out of less effective officers in the aftermath of conflicts. They proved to be very useful in this latter role following the War of 1812 when, faced with drastic budget cuts, the Army had to trim its bloated officer corps down to a size that it could afford. 2 The Army’s evaluation mechanism for effecting this drawdown sprang from a letter that TAG distributed in 1813. In this letter, TAG asked the Army’s line regiments to submit reports that assigned a relative rank by grade for all officers of the command. This forced ranking system may have been the Army’s initial experiment with a centralized assessment system. 3

Nevertheless, such attempts at assessment did not prove lasting. Before the 1880s, the need for a systematic officer evaluation was slight. The Army was tiny, and officers usually remained with the same regiment throughout most of their careers. Thus, the skills and abilities of officers were widely known. 4

The need for an officer evaluation system became clearer in the late-19th century Army with the reforms and proposals made by Major General Emory Upton. In the 1870s, Upton conducted an extensive examination of military organization and doctrine in the French, German, English, Persian, Chinese and Indian armies. From his analysis, he concluded that the Army’s principal challenge was fashioning and nurturing the need for a profession of arms in a society that had embraced the citizen soldier idea as the centerpiece of its national security policy. 5

Drawing from the Prussian model, Upton proposed the creation of a permanent general staff, a prescribed process for officer examination and promotion, and the formation of professional schools to teach military science. One issue that Upton proposed to address was the concept of lineal promotion. Since the Revolution, officers had been recruited, assessed, promoted, and retired within a single regiment. That practice provided senior regimental leaders with deep insight into the abilities of individual officers. But it also encouraged parochialism and led to wide differences in promotion opportunities among regiments. Upton believed that lineal promotion hurt the performance of Union Armies in the Civil War and prevented the creation of a professional army. Upton proposed changing the system by reassigning officers to different regiments, preferably in different parts of the country, at each grade in their careers. For his plan to work, he knew that the Army would have to develop a centralized promotion system to replace the regimental system in effect. His solution was a formal examination process for company grade officers. A board of officers would review an officer’s recent fitness reports as well as peer evaluations by other officers in his regiment. The board would then administer both an oral and written exam to the officer. In Upton’s model, officers would be evaluated based on three inputs—the standard rater efficiency report, a peer evaluation, and an objective examination by an outside board. All three inputs would be considered in the promotion process. 6

Upton, however, underestimated the power of the military bureaus, which had, for the past century, exercised near total control over the promotion and selection of officers within their specialties. Many of his other reforms eventually were adopted by Secretary of War Elihu Root, following the Army’s disappointing performance in the Spanish-American War. However, his recommendations regarding a rigorous promotion board based on both leader and peer evaluations were largely ignored. 7

Within the War Department, the evaluation of officers first became a matter of concern in the early 1880s. This concern, stimulated by a newfound professionalism growing out of rapid industrialization, led to the introduction of an experimental two-part report in 1890. In the first section of this report, the officer was required to write a self-evaluation. In the second section, the rater provided an assessment of the rated officer’s ability and proficiency. This report, in use Army-wide by 1895, formed the embryo around which the modern system of officer evaluation evolved. Until the eve of America’s entrance into World War I, it remained, like many other of the evaluation tools the Army would subsequently adopt, under almost constant scrutiny and revision. 8

While Secretary of War Root considered the weaknesses of the Army unveiled in the war with Spain and reflected on Upton’s reform agenda, an important transformation was occurring in American society. This was the progressive era of American politics, and the nation was consumed with the optimism that scientific progress could cure social ills and promote the common welfare. The high confidence in the power of science led to the application of scientific study to the fields of both human behavior and business. As the industrial revolution propelled the American workforce from a collection of cottage industry tradecrafts to a mass-production society, the field of Human Resource Management began to emerge as a formal science. One of the most influential thinkers in the field at this time was Frederick Taylor. Taylor concluded that workers would operate machinery at the slowest rate that went unpunished by management. In 1911, he published “The Principles of Scientific Management,” in which he argued that production methods could be optimized and standardized with worker compensation directly tied to the achievement of production goals. 9

Taylor’s scientific management theory transformed life in the American factory. Floor managers subdivided complex tasks into routine and standardized actions. Specialization allowed factories to replace skilled artisans with hourly workers who performed only one or two routine actions at a constant rate over the course of a workday. The result was the modern assembly line and the burst in industrial output that came with it. Taylor’s methods not only boosted productivity but transformed the nature and structure of the American workforce. Integrating the various functions performed on the factory floor required a new type of middle manager. With clear standards of performance, a manager could assess the performance of workers based on their capacity to meet production quotas. 10

Root drew upon scientific management theory as he implemented his reform agenda for the War Department. One of the clearest manifestations of this was the centralization of personnel functions at the Army level. This centralization crystallized with the passage of the National Defense Act of 1920. With this act, Congress created a General Staff with an office for personnel management and established a centralized promotion list for each grade. This diminished the ability of the bureaus to control the promotion of their officers and reduced the internecine squabbling among the bureaus competing for promotion quotas. The Act also aimed to preserve the expansibility of the Army by retaining an active-duty force of 12,000 commissioned officers, approximately two times the size of the pre-WWI officer corps. 11

To manage the performance evaluation records of this large mass of officers, the newly established General Staff created the Personnel Office. This office introduced a standardized method for compiling the promotion list based on time in service and medical fitness. Officers with equivalent time in service and shown to be medically fit for duty were judged deserving of promotion. 12

The centralized promotion board provided for in the Defense Act of 1920 was devised to find disqualifying attributes rather than identify the best qualified officer. Early boards normally consisted of three officers selected from different bureaus and two medical officers. While the bureau officers exercised judgment in their decisions, their primary objective was to represent their bureau’s interests. Thus, the bureau representatives combined a cultural antipathy to self-promotion and a respect for long service with the theories of scientific management appearing in the business community. This led to the centralized promotion board process still being used today. 13

The interwar Army used merit as a basis for retention and seniority as the principal basis for promotion to higher rank. This reflected Taylor’s theories of scientific management, which presumed that laborers worthy of promotion would meet the standards of their assigned tasks. Those that did were eventually promoted, while those that didn’t were eliminated. This methodology has caused great frustration among junior officers. The Army has attempted to introduce elements of Auftragstaktik into its warfighting doctrine, but its officer promotion system has obstructed this effort because it was founded on very different organizational assumptions. 14

Frederick Taylor’s scientific management theories and his vision of a commoditized labor pool still have utility in conflicts with near-peer adversaries in wars in which mass casualties produce a constant churn of small unit leaders. Recent US experience in low intensity conflicts, on the other hand, put a premium on specialized individuals capable of exercising sound judgment in unstructured and unpredictable situations. This operational ethos, founded on recent experience, differed from an institutional one founded on the practices of a managerial philosophy suited for a mass citizen army of the type that fought in the two world wars of the 20th Century. The result of this battle of cultures is a cognitive dissonance within the officer corps that pits the Army’s leader development strategy against the historical legacy of its manning and personnel structure. 15

In the early 1920s, the Army introduced an evaluation instrument that used a graphic rating scale to assess the skills and performance of officers. This form, with minor alterations, became Form 67 in 1936; it remained in use until 1947, when it was replaced with Form 67-1, which employed a forced choice rating scheme. This latter instrument sparked widespread protest. Many objected that it did not permit the rater to determine the numerical rating he gave to a subordinate. This was intentional. The Army thought it could significantly reduce inflation by masking the numerical scores. Instead, raters began gaming the system by trying to anticipate the values assigned by the Army to their evaluations. This led the Army to introduce a new version of the OER in 1950 (DA Form 67-2). Like its forerunner, though, 67-2 was short-lived. It was superseded in 1953 by DA Form 67-3, which was, in turn, replaced by DA Form 67-4 in 1956. The root cause of all these revisions was the inflation of ratings. 16

The Army in 1951 had adopted the Officer Efficiency Index (OEI) as a management tool for officers. Scores on the index ranged from 50 to 150 and were calibrated by the Army standard rating scale. The officer with the median score received an OEI rating of 100. In this scheme, about two-thirds of all ratings fell between 80 and 120, one-sixth above 120, and one-sixth below 80. Army leaders believed that the index was invaluable in a rating system based on prevailing management principles. The OEI furnished a convenient gauge of “quality” with which officers could be swiftly sorted and categorized regarding their role in a future mobilization. Personnelists bemoaned the Army’s decision in 1961 to discontinue the OEI, which had proved so useful and convenient for them. 17

In the same year that the OEI was abandoned, the Army implemented a new OER, DA Form 67-5. This form was used until 1968, when it was replaced by DA Form 67-6, which in turn was superseded by DA Form 67-7 in 1973. As in the past, inflation was a prime reason for the changes. DA Form 67-7 was a milestone because, with the adoption of this report, the Army started using the term “officer evaluation report” as opposed to “efficiency report”—a term that had been used for 50 years. 18

The 67-7 remained in effect until 1980, when it was replaced with DA Form 67-8. The old report was jettisoned because it became wildly inflated and, in addition, did not support the new Officer Personnel Management System (OPMS) or encourage the professional development of officers. Form 67-8 integrated several new features that were absent in its predecessor: namely, participation by the rated officer, an enhanced role for the reviewer, and a format that was ostensibly more conducive to board and personnel management use. 19

DA Form 67-9 succeeded 67-8 in 1997. The new OER was designed, inter alia, to make finer distinctions in officer quality, improve the process of senior leader selection, and emphasize junior officer leader development. The developers of the new form purposed to expedite the rapid and smooth assimilation of junior officers into the Army culture by stimulating greater superior/subordinate communication. An innovative feature of the new rating scheme was its masking of second lieutenant OERs. This feature was added to “level the playing field” since there were, among junior officers, great variations in assignments, experiences, and rate of assimilation into the Army culture during the early years of their career. 20

Problems with the OER

Over the years, there have been many problems with the OER from the perspective of both individual officers and personnel managers. There is not sufficient space in this short paper to list them, let alone discuss them. Consequently, only the most intractable and enduring shortcomings in the evaluation system will be touched upon here.

As indicated previously in this narrative, the most persistent and troublesome of these shortcomings has been inflation; all other deficiencies have paled in comparison. Periods of evaluative equanimity have been infrequent and short-lived. One such episode occurred in the immediate aftermath of World War I. In 1922, for example, three quarters of all captains received ratings of less than excellent; about one in twenty earned the top rating of superior; and slightly more than one in five attained an excellent rating. Subsequent years, however, witnessed a progressive inflation of the reports until by 1945, 99 percent of officers received one of the top two ratings. 21

Inflation, in fact, prevented General George C. Marshall, the Army’s Chief of Staff, from relying on efficiency reports to select general officers at the outbreak of World War II. The expansion of the Army that began in 1940 created a need for 150 additional general officers. Of the 4,000 officers eligible by grade and experience to be promoted to that august rank, 2,000 were, on the basis of their evaluation reports, found to be superior and suited for this distinction. The outstanding officer could not be distinguished from the good. As a result, Marshall and selection boards had to depend on their own judgment and personal knowledge of the officers being considered to make their decisions. 22

This trend of inordinately high ratings continued in subsequent decades. DA Forms 67-1 through 67-8 all experienced significant inflation within a short time of their introduction. In some cases, it was a matter of a few months. It took about 90 days, for example, for the Department of the Army to determine that raters and endorsers using DA Form 67-6 (adopted for Army-wide use in March 1968) were giving “higher than warranted” evaluations to subordinates. The new form soon became as useless as its predecessor in guiding promotion and selection boards in their choices. These boards, like the ones convened by Marshall at the beginning of World War II, found themselves relying principally on their own judgments for their selections. 23

One frustrated and cynical War College student summed up the history of OER inflation as follows: “The adoption of a new report may lower the inflationary trend for a short time, as happened in the past however, as has also happened with every report since [the early 1920s], inflation will take over, making the new report as useless for use by selection boards as the previous ones.” 24

Another common criticism of the OER system has been that it does not attach sufficient weight to potential or to long-term professional development. Traditionally, the evaluation report has focused on current performance and short-term results. Thus, the importance of outcomes that are long-term and qualitative in nature has tended to be minimized, while the significance of accomplishments that render immediate and easily measured results has been over-emphasized. This myopic approach to officer evaluation has had several consequences. First, it has stifled innovation by rewarding those who followed established paths and accepted conventional wisdom. Second, it has favored those who excelled at organizational and direct types of leadership while overlooking those with strategic leadership abilities. 25

A lack of comprehensiveness and specificity has been another long-standing complaint about the evaluation system. Reports have not recorded or identified the specific skills, knowledge, and talents developed or exhibited by officers while serving in particular positions. They have consequently been of limited value to personnel managers in finding officers with particular talents for particular jobs. Of course, the “company man” developmental model that has informed the Army’s officer management system has been responsible for this, or at least much of it. In this model, positions are usually not sufficiently defined to allow for precise evaluation. The Army has looked for people that can handle the mass of “tough, unstructured” jobs that predominate within operational units, not for specialists with specific talents. 26

Many observers have commented on the general lack of confidence displayed by officers toward the evaluation system. This lack of confidence is largely a function of the sharp and dramatic variances in rating behavior that flow from the many complex pressures and influences that make up the rating environment and which, many are convinced, have distorted the evaluation system. Over the years, many officers have felt that their professional fate has been too dependent on the writing ability of their superiors. As they saw it, it was not so much what they did but how effective their rater or reviewer was in describing what they did. Frequent changes in rating scales, procedures, and forms have also lessened the validity of the OER in the minds of countless officers. Not only has the basic form changed, on average, every seven years but there have been frequent changes to each form over its administrative lifetime. In its first ten months in use, for example, 67-6 had eight major modifications made to it. 27

The OER scoring system itself has been a target of almost constant criticism. As we have seen, because raters generally have seen the OER as unfair, they have resorted to “scheming” to protect their subordinates and register a subtle protest against the system. In the late 1940s, raters tried to “outguess” the values assigned by the Department of the Army to OERs, making the evaluation system into a type of game. Presently, reviewers parcel out their COMs and ACOMs in such a way as to ensure that all deserving officers have a “heartbeat.” In both cases, performance and potential were often secondary considerations. The scaling instruments that have provided the quantitative part of the OER have been denounced as “utterly inappropriate” and “manifestly unfair.” These instruments have been suitable for measuring comparable performances such as those measured on academic tests. When applied to OERs, however, where the duties and responsibilities of even ostensibly similar positions vary widely, they have very limited assessment value. 28

A series of reports describing senior officer misconduct has tarnished the image of the Army profession in the eyes of the American public. These reports caused Secretary of Defense Chuck Hagel to observe in a press conference on February 5, 2014, that the military may suffer from systemic problems in the way it selects and promotes leaders. Recognizing these systemic problems, the 2014 National Defense Authorization Act directed the Department of Defense (DoD) to assess the feasibility of fundamentally changing its performance evaluation system by including peer and subordinate evaluations in the promotion, assignment, and selection of its leaders. 29

Because of the Budget Control Act of 2011, the Army faced a 20 percent reduction in end strength. As a result, the Army had to make deep cuts while attempting to retain its best performers. It seemed that the time was right for the Army to reconsider its approach to talent management. 30

In June 2013, the Army Leader Development Task Force published a study of leadership attitudes across the Army. The study based its findings on detailed interviews with over 550 officers ranging in rank from lieutenant to colonel and over 12,000 responses to an Army-wide survey. One of the study’s most surprising findings was that only about half of Army leaders believed personnel evaluations and promotion decisions were accurate. Additionally, 19 percent of survey respondents claimed that they never received performance counseling, even though performance counseling was a mandatory part of the Officer Evaluation System and the centerpiece of the Army’s performance appraisal system. A separate survey of 250 West Point graduates, both inside and outside the military, found that only 30 percent believed that the Army did a good job promoting the right officers. Over three quarters believed that this failure had a negative impact on national security. Not only did the current evaluation system undermine confidence in the efficacy of Army promotion decisions, but it also engendered dysfunctional behaviors in officers as it encouraged them to game the system. A review of infantry battalion command selectees showed an average of 36 months of field grade key and developmental (KD) time. This was a significant departure from the 24 months typically expected of officers at this grade. Since performance in KD assignments was weighted more heavily than performance in other broadening assignments, officers expected that their reports would receive special attention from their senior raters while they occupied these choice assignments. 31

The key to gaming the system was to maximize the time spent in KD assignments. Officers with these assignments had a distinct advantage over those within the senior rater’s pool who were not in these positions. Moreover, not all senior raters are created equal. A senior rater with a broad profile of Army officers and a wide reputation across the Army is considered better than one working in a small niche organization, or worse, a joint officer from another service. The net effect has been to discourage talented officers from pursuing broadening assignments in the joint community or unique staff positions, where the population of peer Army officers was necessarily limited. For example, the most common broadening assignment for infantry battalion command selectees in 2012 was aide-de-camp to a general officer. Examined purely on promotion board results, the most valuable service that an officer can provide outside of KD or command duty is to serve as an aide to a senior leader. Evidence also suggested that the writing skill of the rater on an OER often carried nearly equal weight to the merit of the officer being rated. A 2013 study of over 4,000 Army officers revealed a correlation between rater and rated officer promotion rates. The study showed that company commanders stood a 29 percent greater likelihood of promotion below the zone to major if they served under a battalion commander who was likewise promoted below the zone to major. 32

In addition, surveys within and outside the Army repeatedly have suggested that the single-source approach to performance evaluation is a leading cause for talented junior officers to depart the military. A 2000 ARI Study on captain attrition interviewed 161 students of the Combined Arms Staff School and found that eight of the 20 factors most likely to cause officers to resign their commission were related directly to the structure of the performance evaluation system and its perceived effects. A similar study by the Army Training and Leader Development Panel conducted that same year concluded that junior officers observed, “diminishing direct contact between seniors and subordinates . . . evidenced by leaders that are focused up rather than down.” They also cited “the OER as a source of mistrust and anxiety.” In addition to these challenges, evidence continues to mount that senior leaders, no matter how capable, struggle to detect evidence of toxic leadership within their subordinate commands. 33

Conclusion

The officer evaluation system has had a tortuous and troubled history in the U.S. Army. Its tendency toward inflation, its inability to distinguish performance from potential, its inadequacy as a professional development tool, its lack of precision and specificity, its myopic focus, its scaling problems, and its failure to inspire confidence in those whose fate it regulates have prevented the OER from fulfilling the purposes for which it was allegedly designed. Already quite noticeable during the Industrial Age, these deficiencies and shortcomings have become even more pronounced and visible after the advent of the Information Age. To be sure, many officers with exceptional direct and organizational skills have emerged over the course of the last century despite the failings in the evaluation system. Whether or not this system will aid in the development of the kind of strategic thinkers that many observers are convinced will be necessary to deal with the multifarious challenges of the future is another question.

[1] James M. Moynahan, A Suggested Method of Improving Officer Efficiency Reporting (Carlisle, PA: USAWC, 26 March 1956), 1.

[2] Marlin Craig, History of the Officer Efficiency Report System, United States Army 1775-1917 (Washington, DC: Office of the Chief of Military History, 1953), II-2, II-6, and II-17/18.

[3] David P. Kite, The U.S. Army Officer Evaluation Report: Why Are We Writing to Someone Who Isn’t Reading?, Thesis, Maxwell AFB: Air Command and Staff College, 1998, 6.

[4] Ibid., 7.

[5] Curtis D. Taylor, Breaking the Bathsheba Syndrome: Building a Performance Evaluation System that Promotes Mission Command (US Army War College Press, 2015), 16.

[6] Ibid., 17.

[7] Ibid., 18.

[8] Craig, II-2 through III-27.

[9] Taylor, 18.

[10] Ibid., 19.

[11] Ibid., 20.

[12] Ibid.

[13] Ibid., 21.

[14] Ibid. It is worth noting that the Army maintained a formalized system of routine performance evaluation during this period with fitness reports that, at one point, stretched to 24 pages. Despite the exceptional detail of these reports, they appeared to play only a minor role in the promotion of officers.

[15] Ibid., 22.

[16] Paul S. Williams, An Evaluation of the US Army Officer Efficiency Reporting System, Student Thesis, USAWC, 3 March 1969, 4; Sanders A. Cortner, The Officer Efficiency Report Can Be an Effective Tool for Personnel Management, USAWC Research Paper, USAWC, 28 February 1972, 2-3.

[17] Cortner, 17.

[18] Ibid., 3; William R. Mattox, Management by Objective and the New Officer Efficiency Report: A Valid Concept for the Army Reserve, Student Essay, USAWC, 2 December 1975, 5-6; James M. Hardaway, Strategic Leader Development for a 21st Century Army (Fort Leavenworth, KS: School of Advanced Studies, C&GSC, AY 2008), 2.

[19] Charles R. Hamilton, The Effects of Multiple Constraints on the Army’s New Officer Evaluation Report, Master of Military Studies, Marine Corps Command and Staff College, Academic Year 2001-2002, 7.

[20] Department of the Army, Pamphlet No. 623-105, The Officer Evaluation Reporting System “In Brief,” Washington, DC: DA, October 1997, 4-7.

[21] Charles D. Herron, “Efficiency Reports,” Infantry Journal (April 1944): 30-32.

[22] Ibid.

[23] Williams, 22; Cortner, 10.

[24] Cortner, 11.

[25] Williams, 4; Hardaway, 28-30 and 34.

[26] Raymond H. Tiffany, The Officer Efficiency Report System, Student Individual Study, USAWC, 26 March 1956, ii; Mattox, Management by Objective and the New Officer Efficiency Report, 5-6; Hardaway, Strategic Leader Development for a 21st Century Army, 12.

[27] Mattox, Management by Objective and the New Officer Efficiency Report, 1-2; Williams, An Evaluation of the US Army Officer Efficiency Reporting System, 4 and 26.

[28] Tiffany, ii and 29; Hamilton, 10-14.

[29] Taylor, 1.

[30] Ibid., 2.

[31] Ibid., 23.

[32] Ibid., 24.

[33] Ibid., 25.