# SAT

- Type: Paper-based standardized test
- Developer / administrator: College Board, Educational Testing Service
- Skills tested: Writing, critical reading, mathematics
- Purpose: Admission to undergraduate programs of universities or colleges
- Year started: 1926
- Duration: 3 hours
- Score range: Test scored on a scale of 200–800 (in 10-point increments) on each of two sections (total 400–1600). Essay scored on a scale of 2–8, in 1-point increments, on each of three criteria.
- Offered: 7 times annually
- Regions: Worldwide
- Languages: English
- Annual number of test takers: Over 1.5 million high school graduates in the class of 2021
- Prerequisites: No official prerequisite. Intended for high school students. Fluency in English assumed.
- Fee: US$55.00 to US$108.00, depending on country
- Used by: Most universities and colleges offering undergraduate programs in the U.S.
- Website: sat.collegeboard.org

The SAT (/ˌɛsˌeɪˈtiː/ ess-ay-TEE) is a standardized test widely used for college admissions in the United States. Since its debut in 1926, its name and scoring have changed several times; originally called the Scholastic Aptitude Test, it was later called the Scholastic Assessment Test, then the SAT I: Reasoning Test, then the SAT Reasoning Test, then simply the SAT.

The SAT is wholly owned, developed, and published by the College Board, a private, not-for-profit organization in the United States. It is administered on behalf of the College Board by the Educational Testing Service, which until recently developed the SAT as well. The test is intended to assess students' readiness for college. The SAT was originally designed not to be aligned with high school curricula, but several adjustments were made for the version of the SAT introduced in 2016, and College Board president David Coleman has said that he also wanted to make the test reflect more closely what students learn in high school with the new Common Core standards.

Starting with the 2015–16 school year, the College Board began working with Khan Academy to provide free SAT preparation. On January 19, 2021, the College Board announced the discontinuation of the optional essay section, as well as its SAT Subject Tests, after June 2021.

While a considerable amount of research has been done on the SAT, many questions and misconceptions remain. Outside of college admissions, the SAT is also used by researchers studying human intelligence in general and intellectual precociousness in particular, and by some employers in the recruitment process.

## Function

U.S. states in blue had more seniors in the class of 2006 who took the SAT than the ACT while those in red had more seniors taking the ACT than the SAT.
U.S. states in blue had more seniors in the class of 2021 who took the SAT than the ACT while those in red had more seniors taking the ACT than the SAT.

The SAT is typically taken by high school juniors and seniors. The College Board states that the SAT is intended to measure literacy, numeracy and writing skills that are needed for academic success in college. They state that the SAT assesses how well the test-takers analyze and solve problems—skills they learned in school that they will need in college. However, the test is administered under a tight time limit (speeded) to help produce a range of scores.

The College Board also states that the SAT, in combination with high school grade point average (GPA), provides a better indicator of success in college than high school grades alone, as measured by college freshman GPA. Various studies conducted over the lifetime of the SAT show a statistically significant increase in correlation of high school grades and college freshman grades when the SAT is factored in. The predictive validity and powers of the SAT are topics of active research in psychometrics.

There are substantial differences in funding, curricula, grading, and difficulty among U.S. secondary schools due to U.S. federalism, local control, and the prevalence of private, distance, and home schooled students. SAT (and ACT) scores are intended to supplement the secondary school record and help admission officers put local data—such as course work, grades, and class rank—in a national perspective.

Historically, the SAT was more widely used by students living in coastal states and the ACT was more widely used by students in the Midwest and South; in recent years, however, an increasing number of students on the East and West coasts have been taking the ACT. Since 2007, all four-year colleges and universities in the United States that require a test as part of an application for admission will accept either the SAT or ACT, and as of Fall 2022, over 1400 four-year colleges and universities do not require any standardized test scores at all for admission, though some of them are applying this policy only temporarily due to the coronavirus pandemic.

### Accommodation for candidates with disabilities

Students with verifiable disabilities, including physical and learning disabilities, are eligible to take the SAT with accommodations. The standard time increase for students requiring additional time due to learning disabilities or physical handicaps is time + 50%; time + 100% is also offered.

## Scaled scores and percentiles

Students receive their online score reports approximately two to three weeks after test administration (longer for mailed, paper scores). Included in the report is the total score (the sum of the two section scores, with each section graded on a scale of 200–800) and three subscores (in reading, writing, and analysis, each on a scale of 2–8) for the optional essay. Students may also receive, for an additional fee, various score verification services, including (for select test administrations) the Question and Answer Service, which provides the test questions, the student's answers, the correct answers, and the type and difficulty of each question.
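As a minimal sketch of the scoring arithmetic described above, the following Python function sums two section scores into a total. The function and parameter names are assumptions made for illustration, and the 10-point increment check reflects the score scale as summarized at the top of this article; this is not the College Board's own scoring procedure.

```python
def total_score(section_one: int, section_two: int) -> int:
    """Return the total score as the sum of the two section scores.

    Parameter names are hypothetical; the article only says there are
    two sections, each reported on a 200-800 scale.
    """
    for name, score in (("section_one", section_one), ("section_two", section_two)):
        # Each section score must lie in 200-800 and be a multiple of 10.
        if not 200 <= score <= 800 or score % 10 != 0:
            raise ValueError(f"{name}={score} is not a valid section score")
    return section_one + section_two


print(total_score(650, 700))  # 1350
```

The lowest and highest possible totals, 400 and 1600, follow directly from the section limits.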

In addition, students receive two percentile scores, each of which is defined by the College Board as the percentage of students in a comparison group with equal or lower test scores. One of the percentiles, called the "Nationally Representative Sample Percentile", uses as a comparison group all 11th and 12th graders in the United States, regardless of whether or not they took the SAT. This percentile is theoretical and is derived using methods of statistical inference. The second percentile, called the "SAT User Percentile", uses actual scores from a comparison group of recent United States students that took the SAT. For example, for the school year 2019–2020, the SAT User Percentile was based on the test scores of students in the graduating classes of 2018 and 2019 who took the SAT (specifically, the 2016 revision) during high school. Students receive both types of percentiles for their total score as well as their section scores.
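The percentile definition above (the percentage of a comparison group with equal or lower scores) can be sketched in a few lines of Python. The function name and the sample scores are invented for illustration; the College Board's actual comparison groups and any weighting or inference it applies are not reproduced here.

```python
def percentile_rank(score: int, comparison_scores: list[int]) -> float:
    """Percentage of the comparison group scoring at or below `score`."""
    if not comparison_scores:
        raise ValueError("comparison group is empty")
    at_or_below = sum(1 for s in comparison_scores if s <= score)
    return 100.0 * at_or_below / len(comparison_scores)


# Hypothetical comparison group, made up purely for this example.
group = [900, 1000, 1050, 1100, 1200, 1200, 1300, 1400, 1500, 1600]
print(percentile_rank(1200, group))  # 60.0
```

Because the definition counts equal scores as well as lower ones, the same raw score yields different percentiles against different comparison groups, which is why the two reported percentiles can differ.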

### Percentiles for total scores (2019)

| Score (400–1600 scale) | SAT User | Nationally representative sample |
|---|---|---|
| 1600 | 99+ | 99+ |
| 1550 | 99+ | 99+ |
| 1500 | 98 | 99 |
| 1450 | 96 | 99 |
| 1400 | 94 | 97 |
| 1350 | 91 | 94 |
| 1300 | 86 | 91 |
| 1250 | 81 | 86 |
| 1200 | 74 | 81 |
| 1150 | 67 | 74 |
| 1100 | 58 | 67 |
| 1050 | 49 | 58 |
| 1000 | 40 | 48 |
| 950 | 31 | 38 |
| 900 | 23 | 29 |
| 850 | 16 | 21 |
| 800 | 10 | 14 |
| 750 | 5 | 8 |
| 700 | 2 | 4 |
| 650 | 1 | 1 |
| 640–400 | <1 | <1 |

### Percentiles for total scores (2006)

The following chart summarizes the original percentiles used for the version of the SAT administered in March 2005 through January 2016. These percentiles used students in the graduating class of 2006 as the comparison group.

| Percentile | Score, 400–1600 scale (official, 2006) | Score, 600–2400 scale (official, 2006) |
|---|---|---|
| 99.93/99.98* | 1600 | 2400 |
| 99.5 | ≥1540 | ≥2280 |
| 99 | ≥1480 | ≥2200 |
| 98 | ≥1450 | ≥2140 |
| 97 | ≥1420 | ≥2100 |
| 93 | ≥1340 | ≥1990 |
| 88 | ≥1280 | ≥1900 |
| 81 | ≥1220 | ≥1800 |
| 72 | ≥1150 | ≥1700 |
| 61 | ≥1090 | ≥1600 |
| 48 | ≥1010 | ≥1500 |
| 36 | ≥950 | ≥1400 |
| 24 | ≥870 | ≥1300 |
| 15 | ≥810 | ≥1200 |
| 8 | ≥730 | ≥1090 |
| 4 | ≥650 | ≥990 |
| 2 | ≥590 | ≥890 |

\* The percentile of the perfect score was 99.98 on the 2400 scale and 99.93 on the 1600 scale.

### Percentiles for total scores (1984)

| Score (1984) | Percentile |
|---|---|
| 1600 | 99.9995 |
| 1550 | 99.983 |
| 1500 | 99.89 |
| 1450 | 99.64 |
| 1400 | 99.10 |
| 1350 | 98.14 |
| 1300 | 96.55 |
| 1250 | 94.28 |
| 1200 | 91.05 |
| 1150 | 86.93 |
| 1100 | 81.62 |
| 1050 | 75.31 |
| 1000 | 67.81 |
| 950 | 59.64 |
| 900 | 50.88 |
| 850 | 41.98 |
| 800 | 33.34 |
| 750 | 25.35 |
| 700 | 18.26 |
| 650 | 12.37 |
| 600 | 7.58 |
| 550 | 3.97 |
| 500 | 1.53 |
| 450 | 0.29 |
| 400 | 0.002 |

The version of the SAT administered before April 1995 had a very high ceiling. For example, in the 1985–1986 school year, only 9 students out of 1.7 million test takers obtained a score of 1600.

The average score for the Class of 2015 was 1490 out of a maximum of 2400. That was down 7 points from the previous class's mark and was the lowest composite score of the past decade.

## SAT–ACT score comparisons

The College Board and ACT, Inc., conducted a joint study of students who took both the SAT and the ACT between September 2004 (for the ACT) or March 2005 (for the SAT) and June 2006. Tables were provided to concord scores for students taking the SAT after January 2005 and before March 2016. In May 2016, the College Board released concordance tables to concord scores on the SAT used from March 2005 through January 2016 to the SAT used since March 2016, as well as tables to concord scores on the SAT used since March 2016 to the ACT.

In 2018, the College Board, in partnership with the ACT, introduced a new concordance table to better compare how a student would fare on one test versus the other. This is now considered the official concordance to be used by college professionals and replaces the one from 2016. The new concordance no longer features the old SAT (out of 2,400), just the new SAT (out of 1,600) and the ACT (out of 36).

## Elucidation

### Preparation

Pioneered by Stanley Kaplan in 1946 with a 64-hour course, SAT preparation has become a highly lucrative field. Many companies and organizations offer test preparation in the form of books, classes, online courses, and tutoring. The test preparation industry began almost simultaneously with the introduction of university entrance exams in the U.S. and flourished from the start. Test-preparation scams are a genuine problem for parents and students. In general, East Asian Americans, especially Korean Americans, are the most likely to take private SAT preparation courses while African Americans prefer one-on-one tutoring for remedial learning.

Nevertheless, the College Board maintains that the SAT is essentially uncoachable, and research by the College Board and the National Association for College Admission Counseling suggests that tutoring courses result in an average increase of about 20 points on the math section and 10 points on the verbal section. Indeed, researchers have shown time and again that preparation courses tend to offer at best a modest boost to test scores. Like IQ scores, which are a strong correlate, SAT scores tend to be stable over time, meaning SAT preparation courses offer only a limited advantage. An early meta-analysis (from 1983) found similar results and noted "the size of the coaching effect estimated from the matched or randomized studies (10 points) seems too small to be practically important." Statisticians Ben Domingue and Derek C. Briggs examined data from the Education Longitudinal Study of 2002 and found that the effects of coaching were only statistically significant for mathematics; moreover, coaching had a greater effect on certain students than others, especially those who have taken rigorous courses and those of high socioeconomic status. A 2012 systematic literature review estimated a coaching effect of 23 and 32 points for the math and verbal tests, respectively. A 2016 meta-analysis estimated the effect size to be 0.09 and 0.16 for the verbal and math sections respectively, although there was a large degree of heterogeneity. Meanwhile, a 2011 study found the effects of one-on-one tutoring to be minimal among all ethnic groups. Public misunderstanding of how to prepare for the SAT continues to be exploited by the preparation industry.

While there is a link between family background and taking an SAT preparation course, not all students benefit equally from such an investment. In fact, any average gains in SAT scores due to such courses are primarily due to improvements among East Asian Americans. When this group is broken down even further, Korean Americans are more likely to take SAT prep courses than Chinese Americans, taking full advantage of their church communities and ethnic economy.

The College Board announced a partnership with the non-profit organization Khan Academy to offer free test-preparation materials starting in the 2015–16 academic year to help level the playing field for students from low-income families. Students may also bypass costly preparation programs by using the more affordable official guide from the College Board and developing solid study habits.

There is some evidence that taking the PSAT at least once can help students do better on the SAT; moreover, like the case for the SAT, top scorers on the PSAT could earn scholarships. According to cognitive scientist Sian Beilock, 'choking', or substandard performance on important occasions, such as taking the SAT, can be prevented by doing plenty of practice questions and proctored exams to improve procedural memory, making use of the booklet to write down intermediate steps to avoid overloading working memory, and writing a diary entry about one's anxieties on the day of the exam to enhance self-empathy and positive self-image.

### Predictive validity and powers

In 2009, education researchers Richard C. Atkinson and Saul Geiser from the University of California (UC) system argued that high school GPA is better than the SAT at predicting college grades regardless of high school type or quality. It is the hope of some UC officials to increase the number of African- and Latino-American students attending and they plan to do so by casting doubt on the SAT and by decreasing the number of Asian-American students, who are heavily represented in the UC student body (29.5%) relative to their share of the population of California (13.6%). However, their assertions on the predictive validity of the SAT have been contested by the UC academic senate. In its 2020 report, the UC academic senate found that the SAT was better than high school GPA at predicting first year GPA, and just as good as high school GPA at predicting undergraduate GPA, first year retention, and graduation. This predictive validity was found to hold across demographic groups. A series of College Board reports point to similar predictive validity across demographic groups.

The SAT is correlated with intelligence and as such estimates individual differences. It does not, however, have anything to say about "effective cognitive performance," or what intelligent people do. Nor does it measure non-cognitive traits associated with academic success such as positive attitudes or conscientiousness. Psychometricians Thomas R. Coyle and David R. Pillow showed in 2008 that the SAT predicts college GPA even after removing the general factor of intelligence (g), with which it is highly correlated. A 2009 study found that SAT or ACT scores and high-school GPAs are strong predictors of cumulative university GPAs. In particular, those with standardized test scores in the 50th percentile or better had a two-thirds chance of having a cumulative university GPA in the top half. A 2010 meta-analysis by researchers from the University of Minnesota offered evidence that standardized admissions tests such as the SAT predicted not only freshman GPA but also overall collegiate GPA. A 2012 study from the same university using a multi-institutional data set revealed that even after controlling for socioeconomic status and high-school GPA, SAT scores were still capable of predicting freshman GPA among university or college students. A 2019 study with a sample size of around a quarter of a million students suggests that together, SAT scores and high-school GPA offer an excellent predictor of freshman collegiate GPA and second-year retention. In 2018, psychologists Oren R. Shewach, Kyle D. McNeal, Nathan R. Kuncel, and Paul R. Sackett showed that both high-school GPA and SAT scores predict enrollment in advanced collegiate courses, even after controlling for Advanced Placement credits.

Education economist Jesse M. Rothstein indicated in 2005 that high-school average SAT scores were better at predicting freshman university GPAs compared to individual SAT scores. In other words, a student's SAT scores were not as informative with regards to future academic success as his or her high school's average. In contrast, individual high-school GPAs were a better predictor of collegiate success than average high-school GPAs. Furthermore, an admissions officer who failed to take average SAT scores into account would risk overestimating the future performance of a student from a low-scoring school and underestimating that of a student from a high-scoring school.

Like other standardized tests such as the ACT or the GRE, the SAT is a traditional method for assessing the academic aptitude of students who have had vastly different educational experiences, and it therefore focuses on the common materials that students could reasonably be expected to have encountered throughout their course of study. The mathematics section, for instance, contains no materials above the precalculus level. Psychologist Raymond Cattell referred to this as testing for "historical" rather than "current" crystallized intelligence. Psychologist Scott Barry Kaufman further noted that the SAT can only measure a snapshot of a person's performance at a particular moment in time. Educational psychologists Jonathan Wai, David Lubinski, and Camilla Benbow observed that one way to increase the predictive validity of the SAT is by assessing the student's spatial reasoning ability, as the SAT at present does not contain any questions to that effect. Spatial reasoning skills are important for success in STEM. A 2006 study led by psychometrician Robert Sternberg found that the ability of SAT scores and high-school GPAs to predict collegiate performance could further be enhanced by additional assessments of analytical, creative, and practical thinking.

Experimental psychologist Meredith Frey noted that while advances in education research and neuroscience can help improve the ability to predict scholastic achievement in the future, the SAT remains a valuable tool in the meantime. In a 2014 op-ed for The New York Times, psychologist John D. Mayer called the predictive powers of the SAT "an astonishing achievement" and cautioned against making it and other standardized tests optional. Research by psychometricians David Lubinski, Camilla Benbow, and their colleagues has shown that the SAT could even predict life outcomes beyond university.

### Difficulty and relative weight

The SAT rigorously assesses students' mental stamina, memory, speed, accuracy, and capacity for abstract and analytical reasoning. For American universities and colleges, standardized test scores are among the most important factors in admissions, second only to high-school GPAs. By international standards, however, the SAT is not that difficult. For example, South Korea's College Scholastic Ability Test (CSAT) and Finland's Matriculation Examination are both longer, tougher, and count for more towards the admissibility of a student to university. In many countries around the world, exams, including university entrance exams, are the sole deciding factor of admission; school grades are simply irrelevant. In China and India, doing well on the Gaokao or the IIT-JEE, respectively, enhances the social status of the students and their families.

In an article from 2012, educational psychologist Jonathan Wai argued that the SAT was too easy to be useful to the most competitive of colleges and universities, whose applicants typically had brilliant high-school GPAs and standardized test scores. Admissions officers therefore had the burden of differentiating the top scorers from one another, not knowing whether or not the students' perfect or near-perfect scores truly reflected their scholastic aptitudes. He suggested that the College Board make the SAT more difficult, which would raise the measurement ceiling of the test, allowing the top schools to identify the best and brightest among the applicants. At that time, the College Board was already working on making the SAT tougher. The changes were announced in 2014 and implemented in 2016.

After realizing the June 2018 test was easier than usual, the College Board made adjustments resulting in lower-than-expected scores, prompting complaints from the students, though some understood this was to ensure fairness. In its analysis of the incident, the Princeton Review supported the idea of curving grades, but pointed out that the test was incapable of distinguishing students in the 86th percentile (650 points) or higher in mathematics. The Princeton Review also noted that this particular curve was unusual in that it offered no cushion against careless or last-minute mistakes for high-achieving students. The Review posted a similar blog post for the SAT of August 2019, when a similar incident happened and the College Board responded in the same manner, noting, "A student who misses two questions on an easier test should not get as good a score as a student who misses two questions on a hard test. Equating takes care of that issue." It also cautioned students against retaking the SAT immediately, for they might be disappointed again, and recommended that instead, they give themselves some "leeway" before trying again.

### Recognition

Outside of the United States, the SAT is considered for university admissions in Canada, the United Kingdom, Australia, Singapore, and India, among dozens of other countries. About 4,000 institutions of higher learning worldwide accept the SAT, as of early 2022.

### Association with general cognitive ability

In a 2000 study, psychometrician Ann M. Gallagher and her colleagues found that only the top students made use of intuitive reasoning in solving problems encountered on the mathematics section of the SAT. Cognitive psychologists Brenda Hannon and Mary McNaughton-Cassill discovered that having a good working memory, the ability to integrate knowledge, and low levels of test anxiety predict high performance on the SAT.

Frey and Detterman (2004) investigated associations of SAT scores with intelligence test scores. Using an estimate of general mental ability, or g, based on the Armed Services Vocational Aptitude Battery, they found SAT scores to be highly correlated with g (r = .82, or .857 when adjusted for non-linearity) in a sample taken from a 1979 national probability survey. Additionally, they investigated the correlation between SAT results, using the revised and recentered form of the test, and scores on the Raven's Advanced Progressive Matrices, a test of fluid intelligence (reasoning), this time using a non-random sample. They found that the correlation of SAT results with scores on the Raven's Advanced Progressive Matrices was .483; they estimated that this correlation would have been about .72 were it not for the restriction of ability range in the sample. They also noted that there appeared to be a ceiling effect on the Raven's scores which may have suppressed the correlation. Beaujean and colleagues (2006) have reached similar conclusions to those reached by Frey and Detterman. Because the SAT is strongly correlated with general intelligence, it can be used as a proxy to measure intelligence, especially when the time-consuming traditional methods of assessment are unavailable.
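To illustrate what adjusting a correlation for restriction of range involves, here is a minimal Python sketch of one textbook correction (often called Thorndike's Case II). It is an assumption that this resembles the adjustment used in the study; the passage does not say which method or standard-deviation ratio the authors applied, and the function and parameter names are hypothetical.

```python
import math


def correct_for_range_restriction(r_restricted: float, sd_ratio: float) -> float:
    """Estimate the correlation in the unrestricted population.

    r_restricted: correlation observed in the range-restricted sample.
    sd_ratio: unrestricted SD divided by restricted SD of the selection
              variable (a value above 1 means the sample is restricted).
    """
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    u = sd_ratio
    # Thorndike Case II: r_c = r*u / sqrt(1 + r^2 * (u^2 - 1))
    return r_restricted * u / math.sqrt(1.0 + r_restricted**2 * (u**2 - 1.0))


# The SD ratio below is a made-up value for illustration only.
print(round(correct_for_range_restriction(0.483, 1.88), 2))  # 0.72
```

With a ratio of 1 the function returns the observed correlation unchanged, and larger ratios push the estimate upward, which is the direction of the adjustment described above.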

Psychometrician Linda Gottfredson noted that the SAT is effective at identifying intellectually gifted college-bound students.

For decades many critics have accused designers of the verbal SAT of cultural bias as an explanation for the disparity in scores between poorer and wealthier test-takers, with the biggest critics coming from the University of California system. A famous example of this perceived bias in the SAT I was the oarsman–regatta analogy question, which is no longer part of the exam. The object of the question was to find the pair of terms that had the relationship most similar to the relationship between "runner" and "marathon". The correct answer was "oarsman" and "regatta". The choice of the correct answer was thought to have presupposed students' familiarity with rowing, a sport popular with the wealthy. However, for psychometricians, analogy questions are a useful tool to gauge the mental abilities of students, for, even if the meanings of two words are unclear, a student with sufficiently strong analytical thinking skills should still be able to identify their relationship. Analogy questions were removed in 2005. In their place are questions that provide more contextual information should the students be ignorant of the relevant definition of a word, making it easier for them to guess the correct answer.

In 2015, educational psychologist Jonathan Wai of Duke University analyzed average test scores from the Army General Classification Test in 1946 (10,000 students), the Selective Service College Qualification Test in 1952 (38,420), Project Talent in the early 1970s (400,000), the Graduate Record Examination between 2002 and 2005 (over 1.2 million), and the SAT Math and Verbal in 2014 (1.6 million). Wai identified one consistent pattern: those with the highest test scores tended to pick the physical sciences and engineering as their majors while those with the lowest were more likely to choose education and agriculture. (See figure below.)

A 2020 paper by Laura H. Gunn and her colleagues examining data from 1389 institutions across the United States unveiled strong positive correlations between the average SAT percentiles of incoming students and the shares of graduates majoring in STEM and the social sciences. On the other hand, they found negative correlations between the former and the shares of graduates in psychology, theology, law enforcement, recreation and fitness.

Various researchers have established that average SAT or ACT scores and college ranking in the U.S. News & World Report are highly correlated, almost 0.9. Between the 1980s and the 2010s, the U.S. population grew while universities and colleges did not expand their capacities as substantially. As a result, admissions rates fell considerably, meaning it has become more difficult to get admitted to a school whose alumni include one's parents. On top of that, high-scoring students nowadays are much more likely to leave their hometowns in pursuit of higher education at prestigious institutions. Consequently, standardized tests, such as the SAT, are a more reliable measure of selectivity than admissions rates. Still, when Michael J. Petrilli and Pedro Enamorado analyzed the SAT composite scores (math and verbal) of incoming freshman classes of 1985 and 2016 of the top universities and liberal arts colleges in the United States, they found that the median scores of new students increased by 93 points for their sample, from 1216 to 1309. In particular, fourteen institutions saw an increase of at least 150 points, including the University of Notre Dame (from 1290 to 1440, or 150 points) and Elon College (from 952 to 1192, or 240 points).

### Association with types of schooling

While there seems to be evidence that private schools tend to produce students who do better on standardized tests such as the ACT or the SAT, Kevin Duncan and Jonathan Sandy showed, using data from the National Longitudinal Surveys of Youth, that when student characteristics, such as age, race, and sex (7%), family background (45%), school quality (26%), and other factors were taken into account, the advantage of private schools diminished by 78%. The researchers concluded that students attending private schools already had the attributes associated with high scores on their own.

### Association with educational and societal standings and outcomes

Research from the University of California system published in 2001, analyzing data on its undergraduates from Fall 1996 through Fall 1999, inclusive, found that the SAT II was the single best predictor of collegiate success in the sense of freshman GPA, followed by high-school GPA, and finally the SAT I. After controlling for family income and parental education, the already low ability of the SAT to measure aptitude and college readiness fell sharply while the more substantial aptitude and college readiness measuring abilities of high school GPA and the SAT II each remained undiminished (and even slightly increased). The University of California system required both the SAT I and the SAT II from applicants to the UC system during the four academic years of the study. This analysis is heavily publicized but is contradicted by many studies.

There is evidence that the SAT is correlated with societal and educational outcomes, including finishing a four-year university program. A 2012 paper from psychologists at the University of Minnesota analyzing multi-institutional data sets suggested that the SAT maintained its ability to predict collegiate performance even after controlling for socioeconomic status (as measured by the combination of parental educational attainment and income) and high-school GPA. This means that SAT scores were not merely a proxy for measuring socioeconomic status, the researchers concluded. This finding has been replicated and shown to hold across racial or ethnic groups and for both sexes. Moreover, the Minnesota researchers found that the socioeconomic status distributions of the student bodies of the schools examined reflected those of their respective applicant pools. Because of what it measures, a person's SAT scores cannot be separated from their socioeconomic background.

In 2007, Rebecca Zwick and Jennifer Greif Green observed that a typical analysis did not take into account the heterogeneity of the high schools attended by the students in terms of not just the socioeconomic statuses of the student bodies but also the standards of grading. Zwick and Greif Green proceeded to show that when these were accounted for, the correlation between family socioeconomic status and classroom grades and rank increased whereas that between socioeconomic status and SAT scores fell. They concluded that school grades and SAT scores were similarly associated with family income.

According to the College Board, in 2019, 56% of the test takers had parents with a university degree, 27% parents with no more than a high-school diploma, and about 9% who did not graduate from high school. (8% did not respond to the question.)

### Association with family structures

One of the proposed partial explanations for the gap between Asian- and European-American students in educational achievement, as measured for example by the SAT, is the general tendency of Asians to come from stable two-parent households. In their 2018 analysis of data from the National Longitudinal Surveys of the Bureau of Labor Statistics, economists Adam Blandin, Christopher Herrington, and Aaron Steelman concluded that family structure played an important role in determining educational outcomes in general and SAT scores in particular. Families with only one parent who has no degrees were designated 1L, with two parents but no degrees 2L, and two parents with at least one degree between them 2H. Children from 2H families held a significant advantage over those from 1L families, and this gap grew between 1990 and 2010. Because the median SAT composite scores (verbal and mathematics) for 2H families grew by 20 points while those of 1L families fell by one point, the gap between them increased by 21 points, or a fifth of one standard deviation.
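The 21-point figure follows from simple arithmetic on the two reported changes. Written out, with the Δ symbols being notation introduced here rather than the authors' own:

$$
\Delta_{\text{gap}} = \Delta_{2H} - \Delta_{1L} = (+20) - (-1) = 21 \text{ points}
$$

If 21 points is roughly a fifth of one standard deviation, the implied standard deviation of composite scores is on the order of $21 \times 5 = 105$ points. This is a back-of-the-envelope inference from the stated figures, not a value reported in the passage.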

Speaking to The Wall Street Journal, family sociologist W. Bradford Wilcox stated, "In the absence of SAT scores, which can pinpoint kids from difficult family backgrounds with great academic potential, family stability is likely to loom even larger in determining who makes it past the college finish line in California [whose public university system decided to stop requiring SAT and ACT scores for admissions in 2020]."

### Sex differences

#### In performance

In 2013, the American College Testing Board released a report stating that boys outperformed girls on the mathematics section of the test, a significant gap that has persisted for over 35 years. As of 2015, boys on average earned 32 points more than girls on the SAT mathematics section. Among those scoring in the 700-800 range, the male-to-female ratio was 1.6:1. In 2014, psychologist Stephen Ceci and his collaborators found boys did better than girls across the percentiles. For example, a girl scoring in the top 10% of her sex would only be in the top 20% among the boys. In 2010, psychologist Jonathan Wai and his colleagues showed, by analyzing data from three decades involving 1.6 million intellectually gifted seventh graders from the Duke University Talent Identification Program (TIP), that in the 1980s the gender gap in the mathematics section of the SAT among students scoring in the top 0.01% was 13.5:1 in favor of boys but dropped to 3.8:1 by the 1990s. The dramatic sex ratio from the 1980s replicates a different study using a sample from Johns Hopkins University. This ratio is similar to that observed for the ACT mathematics and science scores between the early 1990s and the late 2000s. It remained largely unaltered at the end of the 2000s. Sex differences in SAT mathematics scores began making themselves apparent at the level of 400 points and above.

Some researchers point to evidence in support of greater male variability in spatial ability and mathematics. Greater male variability has been found in body weight, height, and cognitive abilities across cultures, leading to a larger number of males in the lowest and highest distributions of testing. Consequently, a higher number of males are found in both the upper and lower extremes of the performance distributions of the mathematics sections of standardized tests such as the SAT, resulting in the observed gender discrepancy. Paradoxically, this is at odds with the tendency of girls to have higher classroom scores than boys.

On the other hand, Wai and his colleagues found that both sexes in the top 5% appeared to be more or less at parity when it comes to the verbal section of the SAT, though girls have gained a slight but noticeable edge over boys starting in the mid-1980s. Psychologist David Lubinski, who conducted longitudinal studies of seventh graders who scored exceptionally high on the SAT, found a similar result. Girls generally had better verbal reasoning skills and boys better mathematical skills. This reflects other research on the cognitive ability of the general population rather than just the 95th percentile and up.

Although aspects of testing such as stereotype threat are a concern, research on the predictive validity of the SAT has demonstrated that it tends to be a more accurate predictor of female GPA in university as compared to male GPA.

#### In strategizing

SAT mathematics questions can be answered intuitively or algorithmically.

Mathematical problems on the SAT can be broadly categorized into two groups: conventional and unconventional. Conventional problems can be handled routinely via familiar formulas or algorithms while unconventional ones require more creative thought in order to make unusual use of familiar methods of solution or to come up with the specific insights necessary for solving those problems. In 2000, ETS psychometrician Ann M. Gallagher and her colleagues analyzed how students handled disclosed SAT mathematics questions in self-reports. They found that for both sexes, the most favored approach was to use formulas or algorithms learned in class. When that failed, however, males were more likely than females to identify the suitable methods of solution. Previous research suggested that males were more likely to explore unusual paths to solution whereas females tended to stick to what they had learned in class and that females were more likely to identify the appropriate approaches if such required nothing more than mastery of classroom materials.

#### In confidence

Older versions of the SAT asked students how confident they were in their mathematical aptitude and verbal reasoning ability, specifically, whether or not they believed they were in the top 10%. Devin G. Pope analyzed data of over four million test takers from the late 1990s to the early 2000s and found that high scorers were more likely to be confident they were in the top 10%, with the top scorers reporting the highest levels of confidence. But there were some noticeable gaps between the sexes. Men tended to be much more confident in their mathematical aptitude than women. For example, among those who scored 700 on the mathematics section, 67% of men answered they believed they were in the top 10% whereas only 56% of women did the same. Women, on the other hand, were slightly more confident in their verbal reasoning ability than men.

#### In glucose metabolism

Cognitive neuroscientists Richard Haier and Camilla Persson Benbow employed positron emission tomography (PET) scans to investigate the rate of glucose metabolism among students who have taken the SAT. They found that among men, those with higher SAT mathematics scores exhibited higher rates of glucose metabolism in the temporal lobes than those with lower scores, contradicting the brain-efficiency hypothesis. This trend, however, was not found among women, for whom the researchers could not find any cortical regions associated with mathematical reasoning. Both sexes scored the same on average in their sample and had the same rates of cortical glucose metabolism overall. According to Haier and Benbow, this is evidence for the structural differences of the brain between the sexes.

### Association with race and ethnicity

SAT Verbal average scores by race or ethnicity from 1986-87 to 2004-05
SAT Math average scores by race or ethnicity from 1986-87 to 2004-05

A 2001 meta-analysis of the results of 6,246,729 participants tested for cognitive ability or aptitude found a difference in average scores between black and white students of around 1.0 standard deviation, with comparable results for the SAT (2.4 million test takers). Similarly, on average, Hispanic and Amerindian students perform on the order of one standard deviation lower on the SAT than white and Asian students. Mathematics appears to be the more difficult part of the exam. In 1996, the black-white gap in the mathematics section was 0.91 standard deviations, but by 2020, it fell to 0.79. In 2013, Asian Americans as a group scored 0.38 standard deviations higher than whites in the mathematics section.

Some researchers believe that the difference in scores is closely related to the overall achievement gap in American society between students of different racial groups. This gap may be explainable in part by the fact that students of disadvantaged racial groups tend to go to schools that provide lower educational quality. This view is supported by evidence that the black-white gap is higher in cities and neighborhoods that are more racially segregated. Other research cites poorer minority proficiency in key coursework relevant to the SAT (English and math), as well as peer pressure against students who try to focus on their schoolwork ("acting white"). Cultural issues are also evident among black students in wealthier households, with high achieving parents. John Ogbu, a Nigerian-American professor of anthropology, concluded that instead of looking to their parents as role models, black youth chose other models like rappers and did not make an effort to be good students.

One set of studies has reported differential item functioning, namely, that some test questions function differently based on the racial group of the test taker, reflecting differences in ability to understand certain test questions or to acquire the knowledge required to answer them between groups. In 2003, Freedle published data showing that black students have had a slight advantage on the verbal questions that are labeled as difficult on the SAT, whereas white and Asian students tended to have a slight advantage on questions labeled as easy. Freedle argued that these findings suggest that "easy" test items use vocabulary that is easier to understand for white middle class students than for minorities, who often use a different language in the home environment, whereas the difficult items use complex language learned only through lectures and textbooks, giving both student groups equal opportunities to acquire it. The study was severely criticized by the ETS board, but the findings were replicated in a subsequent study by Santelices and Wilson in 2010.

There is no evidence that SAT scores systematically underestimate future performance of minority students. However, the predictive validity of the SAT has been shown to depend on the dominant ethnic and racial composition of the college. Some studies have also shown that African-American students under-perform in college relative to their white peers with the same SAT scores; researchers have argued that this is likely because white students tend to benefit from social advantages outside of the educational environment (for example, high parental involvement in their education, inclusion in campus academic activities, positive bias from same-race teachers and peers) which result in better grades.

Christopher Jencks concludes that as a group, African Americans have been harmed by the introduction of standardized entrance exams such as the SAT. This, according to him, is not because the tests themselves are flawed, but because of labeling bias and selection bias; the tests measure the skills that African Americans are less likely to develop in their socialization, rather than the skills they are more likely to develop. Furthermore, standardized entrance exams are often labeled as tests of general ability, rather than of certain aspects of ability. Thus, a situation is produced in which African-American ability is consistently underestimated within the education and workplace environments, contributing in turn to selection bias against them which exacerbates underachievement.

2003 SAT scores by race and ethnicity

Among the major racial or ethnic groups of the United States, gaps in SAT mathematics scores are the greatest at the tails, with Hispanic and Latino Americans being the most likely to score at the lowest range and Asian Americans the highest. In addition, there is some evidence suggesting that if the test contains more questions of both the easy and difficult varieties, which would increase the variability of the scores, the gaps would be even wider. Given the distribution for Asians, for example, many could score higher than 800 if the test allowed them to. (See figure below.)

2020 was the year in which education worldwide was disrupted by the COVID-19 pandemic and indeed, the performance of students in the United States on standardized tests, such as the SAT, suffered. Yet the gaps persisted. According to the College Board, in 2020, while 83% of Asian students met the benchmark of college readiness in reading and writing and 80% in mathematics, only 44% and 21% of black students did so in those respective categories. Among whites, 79% met the benchmark for reading and writing and 59% for mathematics. For Hispanics and Latinos, the numbers were 53% and 30%, respectively. (See figure below.)

### Test-taking population

A U.S. Navy sailor taking the SAT aboard the USS Kitty Hawk in 2004.

By analyzing data from the National Center for Education Statistics, economists Ember Smith and Richard Reeves of the Brookings Institution deduced that the number of students taking the SAT increased at a rate faster than population and high-school graduation growth rates between 2000 and 2020. The increase was especially pronounced among Hispanics and Latinos. Even among whites, whose number of high-school graduates was shrinking, the number of SAT takers rose. In 2015, for example, 1.7 million students took the SAT, up from 1.6 million in 2013. But in 2019, a record-breaking 2.2 million students took the exam, compared to 2.1 million in 2018, another record-breaking year. The rise in the number of students taking the SAT was due in part to many school districts offering to administer the SAT during school days, often at no further cost to the students. However, in 2021, in the wake of the COVID-19 pandemic and the optional status of the SAT at many colleges and universities, only 1.5 million students took the test.

Psychologists Jean Twenge, W. Keith Campbell, and Ryne A. Sherman analyzed vocabulary test scores on the U.S. General Social Survey ($n = 29{,}912$) and found that after correcting for education, the use of sophisticated vocabulary declined between the mid-1970s and the mid-2010s across all levels of education, from below high school to graduate school. However, they cautioned against the use of SAT verbal scores to track the decline: while the College Board reported that SAT verbal scores had been decreasing, these scores were an imperfect measure of the vocabulary level of the nation as a whole because the test-taking demographic had changed and because more students took the SAT in the 2010s than in the 1970s, meaning that more test takers of limited ability took it.

### Use in non-collegiate contexts

#### By high-IQ societies

Certain high IQ societies, like Mensa, Intertel, the Prometheus Society and the Triple Nine Society, use scores from certain years as one of their admission tests. For instance, Intertel accepts scores (verbal and math combined) of at least 1300 on tests taken through January 1994; the Triple Nine Society accepts scores of 1450 or greater on SAT tests taken before April 1995, and scores of at least 1520 on tests taken between April 1995 and February 2005.

#### By researchers

Because it is strongly correlated with general intelligence, the SAT has often been used as a proxy to measure intelligence by researchers, especially since 2004. In particular, scientists studying mathematically gifted individuals have been using the mathematics section of the SAT to identify subjects for their research.

A growing body of research indicates that SAT scores can predict individual success decades into the future, for example in terms of income and occupational achievements. A longitudinal study published in 2005 by educational psychologists Jonathan Wai, David Lubinski, and Camilla Benbow suggests that among the intellectually precocious (the top 1%), those with higher scores in the mathematics section of the SAT at the age of 12 were more likely to earn a PhD in the STEM fields, to have a publication, to register a patent, or to secure university tenure. Wai further showed that an individual's academic ability, as measured by the average SAT or ACT scores of the institution attended, predicted individual differences in income, even among the richest people of all, and being a member of the 'American elite', namely Fortune 500 CEOs, billionaires, federal judges, and members of Congress. Wai concluded that the American elite was also the cognitive elite. Gregory Park, Lubinski, and Benbow gave statistical evidence that intellectually gifted adolescents, as identified by SAT scores, could be expected to accomplish great feats of creativity in the future, both in the arts and in STEM.

The SAT is sometimes given to students at age 12 or 13 by organizations such as the Study of Mathematically Precocious Youth (SMPY), Johns Hopkins Center for Talented Youth, and the Duke University Talent Identification Program (TIP) to select, study, and mentor students of exceptional ability, that is, those in the top one percent. Among SMPY participants, those within the top quartile, as indicated by the SAT composite score (mathematics and verbal), were markedly more likely to have a doctoral degree, to have at least one publication in STEM, to earn income in the 95th percentile, to have at least one literary publication, or to register at least one patent than those in the bottom quartile. Duke TIP participants generally picked career tracks in STEM if they were stronger in mathematics, as indicated by SAT mathematics scores, or in the humanities if they possessed greater verbal ability, as indicated by SAT verbal scores. For comparison, the bottom SMPY quartile is five times more likely than the average American to have a patent. Meanwhile, as of 2016, the share of doctorate holders was 44% among SMPY participants and 37% among Duke TIP participants, compared to two percent among the general U.S. population. Consequently, the notion that beyond a certain point, differences in cognitive ability as measured by standardized tests such as the SAT cease to matter is gainsaid by the evidence.

In the 2010 paper which showed that the sex gap in SAT mathematics scores had dropped dramatically between the early 1980s and the early 1990s but had persisted for the next two decades or so, Wai and his colleagues argued that "sex differences in abilities in the extreme right tail should not be dismissed as no longer part of the explanation for the dearth of women in math-intensive fields of science."

#### By employers

Cognitive ability is correlated with job training outcomes and job performance. As such, some employers rely on SAT scores to assess the suitability of a prospective recruit, especially if the person has limited work experience. There is nothing new about this practice. Major companies and corporations have spent princely sums on learning how to avoid hiring errors and have decided that standardized test scores are a valuable tool in deciding whether or not a person is fit for the job. In some cases, a company might need to hire someone to handle proprietary materials of its own making, such as computer software. But since the ability to work with such materials cannot be assessed via external certification, it makes sense for such a firm to rely on something that is a proxy for general intelligence. In other cases, a firm may not care about academic background but needs to assess a prospective recruit's quantitative reasoning ability, which is what makes standardized test scores necessary. Several companies, especially those considered to be the most prestigious in industries such as investment banking and management consulting, such as Goldman Sachs and McKinsey, have been reported to ask prospective job candidates about their SAT scores. According to the Wall Street Journal, the scores are used similarly to how they are in college admissions, in that companies claim they provide insight into the intellectual capabilities and problem-solving skills of an individual.

Nevertheless, some other top employers, such as Google, have eschewed the use of SAT or other standardized test scores unless the potential employee is a recent graduate because for their purposes, these scores "don't predict anything." Educational psychologist Jonathan Wai suggested this might be due to the inability of the SAT to differentiate the intellectual capacities of those at the extreme right end of the distribution of intelligence. Wai told The New York Times, "Today the SAT is actually too easy, and that's why Google doesn't see a correlation. Every single person they get through the door is a super-high scorer."

## Perception

### Math–verbal achievement gap

In 2002, New York Times columnist Richard Rothstein argued that the U.S. math averages on the SAT and ACT continued their decade-long rise over national verbal averages on the tests while the average scores on the verbal portions of the same tests were floundering.

### Optional SAT

In the 1960s and 1970s there was a movement to drop achievement scores. After a period of time, the countries, states and provinces that reintroduced them agreed that academic standards had dropped, students had studied less, and had taken their studying less seriously. They reintroduced the tests after studies and research concluded that the high-stakes tests produced benefits that outweighed the costs.

In a 2001 speech to the American Council on Education, Richard C. Atkinson, the president of the University of California, urged dropping admissions tests such as the SAT I, but not achievement tests such as the SAT II, as a college admissions requirement. Atkinson's critique of the predictive validity and powers of the SAT has been contested by the University of California academic senate. In April 2020, the academic senate, which consisted of faculty members, voted 51–0 to restore the requirement of standardized test scores. However, the governing board overruled the senate. Because of the size of the Californian population, this decision might have an impact on U.S. higher education at large; schools looking to admit Californian students could have a harder time.

Despite the fallout from Operation Varsity Blues, which found many wealthy parents illegally intervening to raise their children's standardized test scores, the SAT and the ACT remain popular among American parents and college-bound seniors, who are skeptical of the process of "holistic admissions" because they think it is rather vague and uncertain, as schools try to assess characteristics not easily discerned via a number. Hence the number of test takers attempting to make themselves more competitive has grown even as the number of schools declaring the tests optional has increased. Holistic admissions notwithstanding, when merit-based scholarships are considered, standardized test scores might be the tiebreakers, as these are highly competitive. Scholarships and financial aid could help students and their parents significantly cut the cost of higher education, especially in times of economic hardship. Moreover, the most selective of schools might have no better options than using standardized test scores in order to quickly prune the number of applications worth considering, for holistic admissions consume valuable time and other resources.

In the wake of the COVID-19 pandemic, around 1,600 institutions decided to waive the requirement of the SAT or the ACT for admissions because it was challenging both to administer and to take these tests, resulting in many cancellations. Some schools chose to make them optional on a temporary basis only, either for just one year, as in the case of Princeton University, or three, like the College of William & Mary. Others dropped the requirement completely. Some schools extended their moratorium on standardized entrance exams in 2021. This did not stop highly ambitious students from taking them, however, as many parents and teenagers were skeptical of the "optional" status of university entrance exams and wanted to make their applications more likely to catch the attention of admission officers. This led to complaints of registration sites crashing in the summer of 2020. On the other hand, the number of students applying to the more competitive of schools that had made SAT and ACT scores optional increased dramatically because the students thought they stood a chance. Ivy League institutions saw double-digit increases in the number of applications, as high as 51% in the case of Columbia University, while their admission rates, already in the single digits, fell, e.g. from 4.9% in 2020 to just 3.4% in 2021 at Harvard University. At the same time, interest in lower-status schools that did the same thing dropped precipitously. In all, 44% of students who used the Common Application—accepted by over 900 colleges and universities as of 2021—submitted SAT or ACT scores in 2020–21, down from 77% in 2019–20. Those who did submit their test scores tended to hail from high-income families, to have at least one university-educated parent, and to be white or Asian.

### Writing section

In 2005, MIT Writing Director Les Perelman plotted essay length versus essay score on the new SAT from released essays and found a high correlation between them. After studying over 50 graded essays, he found that longer essays consistently produced higher scores. In fact, he argues that by simply gauging the length of an essay without reading it, the given score of an essay could likely be determined correctly over 90% of the time. He also discovered that several of these essays were full of factual errors; the College Board does not claim to grade for factual accuracy.

Perelman, along with the National Council of Teachers of English, also criticized the 25-minute writing section of the test for damaging standards of writing teaching in the classroom. They say that writing teachers training their students for the SAT will not focus on revision, depth, or accuracy, but will instead encourage long, formulaic, and wordy pieces. "You're getting teachers to train students to be bad writers", concluded Perelman.

On January 19, 2021, the College Board announced that the SAT would no longer offer the optional essay section after the June 2021 administration.

## History

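| Year of exam | Reading/Verbal Score | Math Score |
|---|---|---|
| 1972 | 530 | 509 |
| 1973 | 523 | 506 |
| 1974 | 521 | 505 |
| 1975 | 512 | 498 |
| 1976 | 509 | 497 |
| 1977 | 507 | 496 |
| 1978 | 507 | 494 |
| 1979 | 505 | 493 |
| 1980 | 502 | 492 |
| 1981 | 502 | 492 |
| 1982 | 504 | 493 |
| 1983 | 503 | 494 |
| 1984 | 504 | 497 |
| 1985 | 509 | 500 |
| 1986 | 509 | 500 |
| 1987 | 507 | 501 |
| 1988 | 505 | 501 |
| 1989 | 504 | 502 |
| 1990 | 500 | 501 |
| 1991 | 499 | 500 |
| 1992 | 500 | 501 |
| 1993 | 500 | 503 |
| 1994 | 499 | 504 |
| 1995 | 504 | 506 |
| 1996 | 505 | 508 |
| 1997 | 505 | 511 |
| 1998 | 505 | 512 |
| 1999 | 505 | 511 |
| 2000 | 505 | 514 |
| 2001 | 506 | 514 |
| 2002 | 504 | 516 |
| 2003 | 507 | 519 |
| 2004 | 508 | 518 |
| 2005 | 508 | 520 |
| 2006 | 503 | 518 |
| 2007 | 502 | 515 |
| 2008 | 502 | 515 |
| 2009 | 501 | 515 |
| 2010 | 501 | 516 |
| 2011 | 497 | 514 |
| 2012 | 496 | 514 |
| 2013 | 496 | 514 |
| 2014 | 497 | 513 |
| 2015 | 495 | 511 |
| 2016 | 494 | 508 |
| 2017 | 533 | 527 |
| 2018 | 536 | 531 |
| 2019 | 531 | 528 |
| 2020 | 528 | 523 |
| 2021 | 533 | 528 |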

In the late nineteenth century, elite colleges and universities had their own entrance exams and they required candidates to travel to the school to take the tests. To better organize matters, the College Board, a consortium of colleges in the northeastern United States, was formed in 1900 to establish a nationally administered, uniform set of essay tests based on the curricula of the boarding schools that typically provided graduates to the colleges of the Ivy League and Seven Sisters, among others. The first College Board exam, covering mathematics, the physical sciences, history, languages, and other subjects, was administered in 1901 to no more than 1,000 candidates. In the same time period, Lewis Terman and others began to promote the use of tests such as Alfred Binet's in American schools. Terman in particular thought that such tests could identify an innate "intelligence quotient" (IQ) in a person. The results of an IQ test could then be used to find an elite group of students who would be given the chance to finish high school and go on to college. By the mid-1920s, the increasing use of IQ tests, such as the Army Alpha test administered to recruits in World War I, led the College Board to commission the development of the SAT. The commission, headed by eugenicist Carl Brigham, argued that the test predicted success in higher education by identifying candidates primarily on the basis of intellectual promise rather than on specific accomplishment in high school subjects.

Brigham created the test to uphold a racial caste system. He advanced this theory of standardized testing as a means of upholding racial purity in his book A Study of American Intelligence. The tests, he wrote, would prove the racial superiority of white Americans and prevent 'the continued propagation of defective strains in the present population', chiefly the 'infiltration of white blood into the Negro.'

By 1930, however, Brigham would repudiate his own conclusions, writing that "comparative studies of various national and racial groups may not be made with existing tests" and that SAT scores couldn't reflect some innate, genetically-based ability, but instead would be "a composite including schooling, family background, familiarity with English and everything else, relevant and irrelevant." In 1934, James Conant and Henry Chauncey used the SAT as a means to identify recipients for scholarships to Harvard University. Specifically, Conant wanted to find students, other than those from the traditional northeastern private schools, that could do well at Harvard. The success of the scholarship program and the advent of World War II led to the end of the College Board essay exams and to the SAT being used as the only admissions test for College Board member colleges.

The SAT rose in prominence after World War II due to several factors. Machine-based scoring of multiple-choice tests taken by pencil had made it possible to rapidly process the exams. At the time, elite colleges were admitting mainly students from elite private schools and wanted to take in students from other backgrounds. The G.I. Bill produced an influx of millions of veterans into higher education. The formation of the Educational Testing Service (ETS) also played a significant role in the expansion of the SAT beyond the roughly fifty colleges that made up the College Board at the time. The ETS was formed in 1947 by the College Board, Carnegie Foundation for the Advancement of Teaching, and the American Council on Education, to consolidate respectively the operations of the SAT, the GRE, and the achievement tests developed by Ben Wood for use with Conant's scholarship exams. The new organization was to be philosophically grounded in the concepts of open-minded, scientific research in testing with no doctrine to sell and with an eye toward public service. The ETS was chartered after the death of Brigham, who had opposed the creation of such an entity. Brigham felt that the interests of a consolidated testing agency would be more aligned with sales or marketing than with research into the science of testing. It has been argued that the interest of the ETS in expanding the SAT in order to support its operations aligned with the desire of public college and university faculties to have smaller, diversified, and more academic student bodies as a means to increase research activities. In 1951, about 80,000 SATs were taken; in 1961, about 800,000; and by 1971, about 1.5 million SATs were being taken each year. As more and more students from all over the U.S. tried to enter college, the SAT became more of a high-stakes exam; colleges needed something they could trust to fairly assess a prospective student's scholastic aptitude.

During the 2010s, there was concern over the continued decline of SAT scores, which might be due to the expansion of the test-taking population. (See graph below.)

In the wake of Operation Varsity Blues, it came to light that some wealthy parents obtained extra time on the SATs from doctors willing to sign off on false reports for the students. Already in the 2000s, concerns over parents obtaining fraudulent mental diagnoses to give their children an unfair advantage had been raised.

A timeline of notable events in the history of the SAT follows.

### 1901 essay exams

On June 17, 1901, the first exams of the College Board were administered to 973 students across 67 locations in the United States, and two in Europe. Although those taking the test came from a variety of backgrounds, approximately one third were from New York, New Jersey, or Pennsylvania. The majority of those taking the test were from private schools, academies, or endowed schools. About 60% of those taking the test applied to Columbia University. The test contained sections on English, French, German, Latin, Greek, history, geography, political science, biology, mathematics, chemistry, and physics. The test was not multiple choice, but instead was evaluated based on essay responses as "excellent", "good", "doubtful", "poor" or "very poor".

### 1926 test

The first administration of the SAT occurred on June 23, 1926, when it was known as the Scholastic Aptitude Test. This test, prepared by a committee headed by eugenicist and Princeton psychologist Carl Campbell Brigham, had sections of definitions, arithmetic, classification, artificial language, antonyms, number series, analogies, logical inference, and paragraph reading. It was administered to over 8,000 students at over 300 test centers. Men composed 60% of the test-takers. Slightly over a quarter of males and females applied to Yale University and Smith College. The test was paced rather quickly, test-takers being given only a little over 90 minutes to answer 315 questions. The raw score of each participating student was converted to a score scale with a mean of 500 and a standard deviation of 100. This scale was effectively equivalent to a 200 to 800 scale, although students could score more than 800 and less than 200.
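The conversion to a scale with a mean of 500 and a standard deviation of 100 can be illustrated with a minimal sketch. It assumes a simple linear (z-score) transformation within one cohort; the source does not state the exact procedure used in 1926, and the function name `scale_scores` and the raw scores below are hypothetical.

```python
from statistics import mean, pstdev


def scale_scores(raw_scores, target_mean=500.0, target_sd=100.0):
    """Linearly rescale a cohort's raw scores to the target mean and SD.

    Assumption: a plain z-score transformation; the historical
    procedure may have differed in detail.
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population standard deviation of the cohort
    if sigma == 0:
        raise ValueError("all raw scores are identical; cannot rescale")
    return [target_mean + target_sd * (x - mu) / sigma for x in raw_scores]


# Hypothetical cohort: 19 test-takers with a raw score of 100, one with 200.
raw = [100] * 19 + [200]
scaled = scale_scores(raw)

# The rescaled cohort has mean 500 and SD 100 by construction.
print(round(mean(scaled), 6), round(pstdev(scaled), 6))

# Nothing clips the result, so an outlier can land above 800.
print(max(scaled) > 800)
```

Under this assumption, 200 and 800 are simply the points three standard deviations below and above the mean, which is why scores outside that range were possible.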

### 1928 and 1929 tests

In 1928, the number of sections on the SAT was reduced to seven, and the time limit was increased to slightly under two hours. In 1929, the number of sections was again reduced, this time to six. These changes were designed in part to give test-takers more time per question. For these two years, all of the sections tested verbal ability: math was eliminated entirely from the SAT.

### 1930 test and 1936 changes

In 1930 the SAT was first split into the verbal and math sections, a structure that would continue through 2004. The verbal section of the 1930 test covered a narrower range of content than its predecessors, examining only antonyms, double definitions (somewhat similar to sentence completions), and paragraph reading. In 1936, analogies were re-added. Between 1936 and 1946, students had between 80 and 115 minutes to answer 250 verbal questions (over a third of which were on antonyms). The mathematics test introduced in 1930 contained 100 free response questions to be answered in 80 minutes and focused primarily on speed. From 1936 to 1941, as in 1928 and 1929, the mathematics section was eliminated entirely. When the mathematics portion of the test was re-added in 1942, it consisted of multiple-choice questions.

### 1941 and 1942 score scales

Until 1941, the scores on all SATs had been scaled to a mean of 500 with a standard deviation of 100. Although one test-taker could be compared to another for a given test date, comparisons from one year to another could not be made. For example, a score of 500 achieved on an SAT taken in one year could reflect a different ability level than a score of 500 achieved in another year. By 1940, it had become clear that setting the mean SAT score to 500 every year was unfair to those students who happened to take the SAT with a group of higher average ability.

In order to make cross-year score comparisons possible, in April 1941 the SAT verbal section was scaled to a mean of 500, and a standard deviation of 100, and the June 1941 SAT verbal section was equated (linked) to the April 1941 test. All SAT verbal sections after 1941 were equated to previous tests so that the same scores on different SATs would be comparable. Similarly, in June 1942 the SAT math section was equated to the April 1942 math section, which itself was linked to the 1942 SAT verbal section, and all SAT math sections after 1942 would be equated to previous tests. From this point forward, SAT mean scores could change over time, depending on the average ability of the group taking the test compared to the roughly 10,600 students taking the SAT in April 1941. The 1941 and 1942 score scales would remain in use until 1995.

### 1946 test and associated changes

Paragraph reading was eliminated from the verbal portion of the SAT in 1946, and replaced with reading comprehension, and "double definition" questions were replaced with sentence completions. Between 1946 and 1957, students were given 90 to 100 minutes to complete 107 to 170 verbal questions. Starting in 1958, time limits became more stable, and for 17 years, until 1975, students had 75 minutes to answer 90 questions. In 1959, questions on data sufficiency were introduced to the mathematics section and then replaced with quantitative comparisons in 1974. In 1974, both verbal and math sections were reduced from 75 minutes to 60 minutes each, with changes in test composition compensating for the decreased time.

### 1960s and 1970s score declines

From 1926 to 1941, scores on the SAT were scaled to make 500 the mean score on each section. In 1941 and 1942, SAT scores were standardized via test equating, and as a consequence, average verbal and math scores could vary from that time forward. In 1952, mean verbal and math scores were 476 and 494, respectively, and scores were generally stable in the 1950s and early 1960s. However, starting in the mid-1960s and continuing until the early 1980s, SAT scores declined: the average verbal score dropped by about 50 points, and the average math score fell by about 30 points. By the late 1970s, only the upper third of test takers were doing as well as the upper half of those taking the SAT in 1963. From 1961 to 1977, the number of SATs taken per year doubled, suggesting that the decline could be explained by demographic changes in the group of students taking the SAT. Commissioned by the College Board, an independent study of the decline found that most (up to about 75%) of the test decline in the 1960s could be explained by compositional changes in the group of students taking the test; however, only about 25 percent of the 1970s decrease in test scores could similarly be explained. Later analyses suggested that up to 40 percent of the 1970s decline in scores could be explained by demographic changes, leaving unknown at least some of the reasons for the decline.

### 1994 changes

In early 1994, substantial changes were made to the SAT. Antonyms were removed from the verbal section in order to make rote memorization of vocabulary less useful. Also, the fraction of verbal questions devoted to passage-based reading material was increased from about 30% to about 50%, and the passages were chosen to be more like typical college-level reading material, compared to previous SAT reading passages. The changes for increased emphasis on analytical reading were made in response to a 1990 report issued by a commission established by the College Board. The commission recommended that the SAT should, among other things, "approximate more closely the skills used in college and high school work". A mandatory essay had been considered as well for the new version of the SAT; however, criticism from minority groups, as well as a concomitant increase in the cost of the test necessary to grade the essay, led the College Board to drop it from the planned changes.

Major changes were also made to the SAT mathematics section at this time, due in part to the influence of suggestions made by the National Council of Teachers of Mathematics. Test-takers were now permitted to use calculators on the math sections of the SAT. Also, for the first time since 1935, the SAT would now include some math questions that were not multiple choice, and would require students to supply the answers for those questions. Additionally, some of these "student-produced response" questions could have more than one correct answer. The tested mathematics content on the SAT was expanded to include concepts of slope of a line, probability, elementary statistics including median and mode, and problems involving counting.

### 1995 recentering (raising mean score back to 500)

By the early 1990s, average combined SAT scores were around 900 (typically, 425 on the verbal and 475 on the math). The average scores on the 1994 modification of the SAT I were similar: 428 on the verbal and 482 on the math. SAT scores for admitted applicants to highly selective colleges in the United States were typically much higher. For example, the score ranges of the middle 50% of admitted applicants to Princeton University in 1985 were 600 to 720 (verbal) and 660 to 750 (math). Similarly, median scores on the modified 1994 SAT for freshmen entering Yale University in the fall of 1995 were 670 (verbal) and 720 (math). For the majority of SAT takers, however, verbal and math scores were below 500: In 1992, half of the college-bound seniors taking the SAT were scoring between 340 and 500 on the verbal section and between 380 and 560 on the math section, with corresponding median scores of 420 and 470, respectively.

The drop in SAT verbal scores, in particular, meant that the usefulness of the SAT score scale (200 to 800) had become degraded. At the top end of the verbal scale, significant gaps were occurring between raw scores and uncorrected scaled scores: a perfect raw score no longer corresponded to an 800, and a single omission out of 85 questions could lead to a drop of 30 or 40 points in the scaled score. Corrections to scores above 700 had been necessary to reduce the size of the gaps and to make a perfect raw score result in an 800. At the other end of the scale, about 1.5 percent of test-takers would have scored below 200 on the verbal section if that had not been the reported minimum score. Although the math score averages were closer to the center of the scale (500) than the verbal scores, the distribution of math scores was no longer well approximated by a normal distribution. These problems, among others, suggested that the original score scale and its reference group of about 10,000 students taking the SAT in 1941 needed to be replaced.

Beginning with the test administered in April 1995, the SAT score scale was recentered to return the average math and verbal scores close to 500. Although only 25 students had received perfect scores of 1600 in all of 1994, 137 students taking the April test scored 1600. The new scale used a reference group of about one million seniors in the class of 1990: the scale was designed so that the SAT scores of this cohort would have a mean of 500 and a standard deviation of 110. Because the new scale would not be directly comparable to the old scale, scores awarded in April 1995 and later were officially reported with an "R" (for example, "560R") to reflect the change in scale, a practice that was continued until 2001. Scores awarded before April 1995 may be compared to those on the recentered scale by using official College Board tables. For example, verbal and math scores of 500 received before 1995 correspond to scores of 580 and 520, respectively, on the 1995 scale.

### 1995 recentering controversy

Certain educational organizations viewed the SAT recentering initiative as an attempt to stave off international embarrassment over continuously declining test scores, even among top students. As evidence, they pointed out that the number of students who scored above 600 on the verbal portion of the test had fallen from a peak of 112,530 in 1972 to 73,080 in 1993, a decline of about 35%, even though the total number of test-takers had risen by over 500,000.

### 2002 changes – Score Choice

Since 1993, using a policy referred to as "Score Choice", students taking the SAT-II subject exams were able to choose whether or not to report the resulting scores to a college to which the student was applying. In October 2002, the College Board dropped the Score Choice option for SAT-II exams, matching the score policy for the traditional SAT tests that required students to release all scores to colleges. The College Board said that, under the old score policy, many students who waited to release scores would forget to do so and miss admissions deadlines. It was also suggested that the old policy of allowing students the option of which scores to report favored students who could afford to retake the tests.

### 2005 changes, including a new 2400-point score

In 2005, the test was changed again, largely in response to criticism by the University of California system. In order to have the SAT more closely reflect high school curricula, certain types of questions were eliminated, including analogies from the verbal section and quantitative comparison items from the math section. A new writing section, with an essay, based on the former SAT II Writing Subject Test, was added, in part to increase the chances of closing the widening gap between the highest and midrange scores. Another factor was the desire to test the writing ability of each student, hence the essay. The writing section reported a multiple-choice subscore that ranged from 20 to 80 points. The writing section added up to 800 points to the total, which increased the new maximum score to 2400. The "New SAT" was first offered on March 12, 2005, after the last administration of the "old" SAT in January 2005. The mathematics section was expanded to cover three years of high school mathematics. To emphasize the importance of reading, the verbal section's name was changed to the Critical Reading section.

### Scoring problems of October 2005 tests

In March 2006, it was announced that a small percentage of the SATs taken in October 2005 had been scored incorrectly because the test papers were moist and did not scan properly, and that some students had received erroneous scores. The College Board announced it would change the scores for the students who were given a lower score than they earned, but by this point many of those students had already applied to colleges using their original scores. The College Board decided not to change the scores for the students who were given a higher score than they earned. A lawsuit was filed in 2006 on behalf of the 4,411 students who received an incorrect score on the SAT. The class-action suit was settled in August 2007, when the College Board and Pearson Educational Measurement, the company that scored the SATs, announced they would pay $2.85 million into a settlement fund. Under the agreement, each student could either elect to receive $275 or submit a claim for more money if he or she felt the damage was greater. A similar scoring error occurred on a secondary school admission test in 2010–2011, when the ERB (Educational Records Bureau) announced, after the admission process was over, that an error had been made in the scoring of the tests of 2,010 students (17%) who had taken the Independent School Entrance Examination for admission to private secondary schools for 2011. Commenting in The New York Times on the effect of the error on students' school applications, David Clune, President of the ERB, stated, "It is a lesson we all learn at some point—that life isn't fair."

### 2008 changes

As part of an effort to "reduce student stress and improve the test-day experience", in late 2008 the College Board announced that the Score Choice option, recently dropped for SAT subject exams, would be available for both the SAT subject tests and the SAT starting in March 2009. At the time, some college admissions officials agreed that the new policy would help to alleviate student test anxiety, while others questioned whether the change was primarily an attempt to make the SAT more competitive with the ACT, which had long had a comparable score choice policy. Recognizing that some colleges would want to see the scores from all tests taken by a student, under this new policy, the College Board would encourage but not force students to follow the requirements of each college to which scores would be sent. A number of highly selective colleges and universities, including Yale, the University of Pennsylvania, Cornell, and Stanford, rejected the Score Choice option at the time. Since then, Cornell, the University of Pennsylvania, and Stanford have all adopted Score Choice, but Yale continues to require applicants to submit all scores. Others, such as MIT and Harvard, allow students to choose which scores they submit, and use only the highest score from each section when making admission decisions. Still others, such as Oregon State University and the University of Iowa, allow students to choose which scores they submit, considering only the test date with the highest combined score when making admission decisions.
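The two score-use policies described above can produce different results for the same student. The following is a minimal sketch of the difference; the function names and the scores are hypothetical and do not represent any college's actual procedure.

```python
def highest_per_section(sittings):
    """Sum of the highest score in each section, taken across all test dates."""
    sections = sittings[0].keys()
    return sum(max(s[sec] for s in sittings) for sec in sections)

def best_single_date(sittings):
    """Highest combined score achieved on any one test date."""
    return max(sum(s.values()) for s in sittings)

# Hypothetical scores from two test dates.
sittings = [
    {"reading": 650, "math": 700},
    {"reading": 700, "math": 660},
]

print(highest_per_section(sittings))  # 1400 (700 reading + 700 math)
print(best_single_date(sittings))     # 1360 (second test date)
```

A policy that takes the best score from each section can never yield a lower total than one that takes the best single test date, since it may combine results from different dates.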

### 2012 changes

Beginning in the fall of 2012, test takers were required to submit a current, recognizable photo during registration. In order to be admitted to their designated test center, students were required to present their photo admission ticket—or another acceptable form of photo ID—for comparison to the one submitted by the student at the time of registration. The changes were made in response to a series of cheating incidents, primarily at high schools in Long Island, New York, in which high-scoring test takers were using fake photo IDs to take the SAT for other students. In addition to the registration photo stipulation, test takers were required to identify their high school, to which their scores, as well as the submitted photos, would be sent. In the event of an investigation involving the validity of a student's test scores, their photo may be made available to institutions to which they have sent scores. Any college that is granted access to a student's photo is first required to certify that the student has been admitted to the college requesting the photo.

### 2016 changes

On March 5, 2014, the College Board announced its plan to redesign the SAT in order to link the exam more closely to the work high school students encounter in the classroom. The new exam was administered for the first time in March 2016. Some of the major changes were: an emphasis on the use of evidence to support answers, a shift away from obscure vocabulary to words that students are more likely to encounter in college and career, an optional essay, questions having four rather than five answer options, and the removal of the penalty for wrong answers (rights-only scoring). The Critical Reading section was replaced with the new Evidence-Based Reading and Writing section (the Reading Test and the Writing and Language Test). The scope of mathematics content was narrowed to include fewer topics, including linear equations, ratios, and other precalculus topics. The essay score was separated from the final score, and institutions could choose whether or not to consider it. As a result of these changes, the highest score was returned to 1600. These modifications were the first major redesign of the structure of the test since 2005. Because the test no longer deducts points for wrong answers, numerical scores and percentiles appeared to increase after the new SAT was unveiled in 2016. However, this does not necessarily mean students came better prepared.
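The effect of moving to rights-only scoring can be sketched with a small calculation. The example below assumes the pre-2016 deduction of a quarter point per wrong answer on five-choice questions, a detail not stated in the text above; the function names and the sample student are hypothetical.

```python
from fractions import Fraction

def formula_raw_score(correct, wrong, penalty=Fraction(1, 4)):
    """Raw score with a deduction per wrong answer; omitted questions score zero."""
    return correct - penalty * wrong

def rights_only_raw_score(correct, wrong):
    """Raw score that counts correct answers only; wrong answers cost nothing."""
    return correct

def expected_guess_value(n_choices, penalty):
    """Expected raw points from one blind guess among n_choices options."""
    return Fraction(1, n_choices) - Fraction(n_choices - 1, n_choices) * penalty

# Hypothetical student: 40 correct answers, 12 wrong answers, the rest omitted.
print(formula_raw_score(40, 12))      # 37
print(rights_only_raw_score(40, 12))  # 40

# Blind guessing: five choices with the assumed penalty vs. four choices without one.
print(expected_guess_value(5, Fraction(1, 4)))  # 0
print(expected_guess_value(4, 0))               # 1/4
```

Under these assumptions the same set of answers yields a higher raw score with rights-only scoring, and guessing has a positive expected value, which is consistent with scores appearing to rise without any change in preparation.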

To combat the perceived advantage of costly test preparation courses, the College Board announced a new partnership with Khan Academy to offer free online practice problems and instructional videos.

### 2019 introduction and abandonment of the 'Adversity Score' and launching of 'Landscape'

In May 2019, the College Board announced that it would calculate each SAT taker's "Adversity Score" using factors such as the proportion of students in a school district receiving free or subsidized lunch or the level of crime in that neighborhood. The higher the score, the more adversity the student faced. However, this triggered a strong backlash from the general public, as people were skeptical that such complex information could be conveyed with a single number and were concerned that it might be politically weaponized. The College Board thus abandoned the Adversity Score and instead created a new tool called 'Landscape' to provide the same sort of details to admissions officers using government information but without calculating a score.

### 2021 changes

In the wake of the COVID-19 pandemic, which made administering and taking the tests difficult, on January 19, 2021, the College Board announced plans to discontinue the optional SAT essay following the June 2021 administration. While some administrations were canceled, others continued with precautionary measures such as requirements of temperature checks, enhanced ventilation, higher ceilings, physical distancing, and face masks. The College Board also announced the immediate discontinuation of the SAT Subject Tests in the United States, and the same internationally after the June 2021 administration.

### 2023 changes

In January 2022, the College Board revealed its plan to administer the SAT digitally to students outside the United States in 2023 and to American students in 2024. The digital format of the test would simplify logistics and allow scores to be determined in a matter of days rather than weeks. Score reports would also carry information about two-year collegiate programs and vocational training. Students could bring their own laptop or tablet computers but would have to sit at a designated location. Those without a device would be provided with one. The College Board also announced that the SAT would be shortened from three hours to two, with a basic onscreen calculator provided for the math section.

## Name changes

The SAT has been renamed several times since its introduction in 1926. It was originally known as the Scholastic Aptitude Test. In 1990, a commission set up by the College Board to review the proposed changes to the SAT program recommended that the meaning of the initialism SAT be changed to "Scholastic Assessment Test" because a "test that integrates measures of achievement as well as developed ability can no longer be accurately described as a test of aptitude". In 1993, the College Board changed the name of the test to SAT I: Reasoning Test; at the same time, the name of the Achievement Tests was changed to SAT II: Subject Tests. The Reasoning Test and Subject Tests were to be collectively known as the Scholastic Assessment Tests. According to the president of the College Board at the time, the name change was meant "to correct the impression among some people that the SAT measures something that is innate and impervious to change regardless of effort or instruction." The new SAT debuted in March 1994, and was referred to as the Scholastic Assessment Test by major news organizations. However, in 1997, the College Board announced that the SAT could not properly be called the Scholastic Assessment Test, and that the letters SAT did not stand for anything. In 2004, the Roman numeral in SAT I: Reasoning Test was dropped, making SAT Reasoning Test the name of the SAT. The "Reasoning Test" portion of the name was eliminated following the exam's 2016 redesign; it is now simply called the SAT.

## Reuse of old SAT exams

The College Board has been accused of completely reusing old SAT papers previously given in the United States. According to college officials, the recycling of questions from previous exams has been exploited to allow for cheating and has impugned the validity of some students' test scores. Test preparation companies in Asia have been found to provide test questions to students within hours of a new SAT exam's administration.

On August 25, 2018, the SAT given in the United States was discovered to be a recycled version of the October 2017 international SAT given in China. A leaked PDF of the test had been available on the internet before the August 25, 2018 exam.