The United States' Threat Perception of China and the USSR, 1952-1975:
An Application of Automated Content Analysis Methods
A Working Paper
Namsu Kim
Department of Political Science and International Relations
Seoul National University
[email protected]
Abstract
This preliminary research focuses on quantifying ideological variables such as the United States’ threat
perception of China and the Soviet Union from 1952 to 1975 through an automated (computer-aided)
content analysis method. The aim is to build a foundation for quantitatively analyzing the impact of US threat
perception on US foreign and security policy, and to develop a logic for case selection. This research uses
Foreign Relations of the United States (FRUS) as the population for the content analysis, and the process
consists of four steps: 1) gathering FRUS articles from the web; 2) pre-processing the articles to be
suitable for automated content analysis; 3) selecting articles through systematic sampling and categorical
coding; and 4) the final content analysis. Python is used for the first two steps, and the last step is
conducted with the ReadMe package in R. The results correspond fairly well to actual historical events,
and thus this exercise can be seen as an effective estimation of the perception factors in US
foreign policy.
This draft has been prepared for the 2013 Midwest Political Science Association Conference.
Please do not cite without permission.
I. Introduction
1. Background
Recently, cooperation and conflict between the US and China (PRC, People's Republic of China) have
become one of the most important issues in East Asian security and foreign policy, and one of the
main factors shaping the East Asian security environment. An accurate assessment of the US-China
relationship is therefore of primary importance for South Korea, which has comparatively less military,
economic, and diplomatic power than the surrounding states, such as China, Japan, Russia, and the
US, and must still maximize its national interest. Among the many factors that shape foreign and security
policy, this paper focuses on 'threat perception' to understand US foreign policy toward China. Material
factors, such as military and economic power, remain decisive. However, this research presupposes that threat
perceptions are also an important foundation of foreign policy decision-making.
US foreign policy in East Asia, which in particular began with the occupation of Japan and South
Korea after World War II, developed with the Cold War and the Korean War, in which the US interacted
with the USSR and China as threatening enemies. In this aspect, US threat perceptions of the Soviet
Union and especially China have had an important role in US foreign policy in East Asia since 1945. It is
true that specific US foreign policies have changed over the past 70 years. However, threat perceptions
are still important variables in that America thinks China and Russia are important counterparts to be
considered in its foreign policy decision-making. In this regard, this paper raises the question of the
relationship between US threat perception and its East Asian security policy. More precisely, what were
the US threat perceptions of the Soviet Union and China over time? Also, how different were the impacts
of these perceptions on US policy decision-making in size and content?
To answer these comprehensive questions, the 70 years that have elapsed since 1945 should be
analyzed, or at least the period since 1949, when the PRC was established. Although the
'China threat' discourse has been prevalent since the 1990s, this research regards the Chinese threat in
US perception as having existed ever since the Cold War began, and the more recent American perception
of China as inevitably stemming from this history. Unfortunately, however, the time span that
this research could assess is, at least for now, limited to the years from 1952 to 1975. This is explained in
section III-2.
2. Hypothesis
With regard to the main questions of this project, some hypotheses could be proposed as follows.
H1. US threat perceptions of China and the USSR differ, quantitatively and qualitatively.
The primary goal of this paper is to assess the variation in the US threat perception of China and the
USSR. The size and the content of US threat perceptions of the two states will be measured and explained
respectively.
H2. The greater the threat perceived from a state, the higher the priority of the foreign policy decisions
taken regarding it.
Even in East Asia, the US threat perception of the USSR was much greater than that of China in the
early stages of the Cold War, so the US' East Asian policy was strongly influenced by the Soviet threat.
The Chinese threat, by contrast, became prominent during and after the Korean War, and it can be assumed
that from that point the US’ East Asian policy was more influenced by its threat perception of China.
H3. Variations in US threat perception of China and the USSR are related to US foreign policy in East
Asia.
The size of the threat perceptions of China and the Soviet Union is expected to be related,
independently or collectively, to their impact on US policy decisions in East Asia. Here
the threat perceptions are the independent variables, while US East Asian security policy
is the dependent variable, indicated by military deployment, budget outlay, commitments, and
security institutions such as alliances.
To review all these hypotheses properly, in-depth comparative case studies as well as a quantitative
analysis of the threat perception factor would be needed. This working paper, however, focuses on
preliminary research into methods that could be used to measure US threat perception, since
adequately answering all these questions is beyond the scope of a limited research paper.
So, only H1 will be examined in depth, while H2 and H3 will be dealt with partially.
3. Threat Perception
Threat perception is an important notion in the fields of both international relations and foreign policy.
Since Stephen Walt proposed his Balance of Threat theory, threat has become a key word in the study of
security and alliances. Walt explains that there are various sources of threat and a state's perception of the
'aggressive intentions' of other states can trigger balancing actions.1 In foreign policy studies, the
perceptions of decision-makers play an important role. As Carlsnaes claims, the dispositional dimension of
people in the policy-making process works as a perceptual filter of the structural dimension, that is, the
international environment.2 It is fair to say that the perception factor, especially threat perception,
has an important role in international security policy-making.
1 Walt, 1987. pp. 25-26.
However, few studies discuss how to measure threat perceptions. It is hard to measure
an ideational or intentional factor such as a perception. Raymond Cohen, in his book Threat
Perception in International Crisis, offers a set of indicators to measure the intensity of threat perceptions. They
are: 1) decision makers' comments and evaluations; 2) other colleagues' appreciation of the original
comments and evaluations; 3) decision-making processes to seek countermeasures against the threat; and
4) actual policy actions taken.3 One recent and prominent work on threat perceptions is Identifying
Threats and Threatening Identities by David Rousseau. He discusses the impact of identity factors on
threat perception and argues that two states have a lower threat perception of each other when they share
similar political and economic systems. Using US-Chinese relations in the 1980s as a case, his
research also offers important insights for investigating Chinese cases.4
These are undoubtedly valuable studies on threat perceptions. They give us theoretical guidance
regarding what kind of approaches, indicators, and words should be primarily considered in the study of
threat perceptions. However, they are not sufficient for quantitative measurements of perceptions over the
course of many years. So, this research tries to establish content analysis methods to quantitatively
measure US perceptions through foreign policy articles, year by year. In aggregate, this will build
continuous line graphs depicting the variations in US perceptions.
II. Method and Data
There are different ways to get a picture of the threat perception seen in US foreign policy.
One is an explanatory approach: reading and understanding historical materials such as
government documents, memoirs, oral history collections, etc. With this we are able to demonstrate
through narratives what the threat perception was like and whose idea it mainly was. Another is the
content analysis method. Here historical texts become datasets to be analyzed with such methods as
word-counting, categorization, etc. Basically, categorization makes it possible to measure the proportion
of a specific kind of document in a dataset.
2 Carlsnaes, 2002.
3 Cohen, 1978.
4 Rousseau, 2006.
This research project uses the content analysis method to understand the variation in the degree of
threat perception in US foreign policy. Because the number of entries in the dataset is very large,
since the time period is rather long and the quantity of government articles is substantial, a computer-aided
content analysis method is suitable for this project. Among several automated content analysis methods,
the ReadMe package in R is used. ReadMe, an automated content analysis tool based
on R and Python, offers a statistical estimate of the relative proportions of each categorized theme within
an extensive set of text articles. Simply put, researchers first make a dataset (test set) of articles to
be analyzed, with each article saved as a text file in a folder. Then, researchers systematically select
articles to make a control set and code the content of the articles in the control set. Here,
coding means classifying each article into a specific category that is conceptualized in advance according
to the research hypotheses. Finally, the coded control set and the test set are registered in the ReadMe
program, and ReadMe statistically estimates the proportion of articles falling in each category.5
ReadMe was originally developed by Daniel Hopkins and Gary King. King's website hosts
the ReadMe software together with the relevant theoretical papers and technical documentation.6 In this
age of big data, computer-aided content analysis methods like ReadMe are receiving more attention in
political science. These methods are used especially to analyze the political positions of
parties and legislatures in Europe and the US. By contrast, applications of automated content analysis
to foreign policy or international relations are rarely found. This pilot research could thus make
a contribution by applying diachronic automated content analysis to studies of foreign policy and/or
international relations.
For the dataset, Foreign Relations of the United States (FRUS), which is officially published by the
Department of State, is used. FRUS is a large set of volumes containing selected important
government documents organized by event and region. It covers the years from the 1860s to the late
1970s. Because it is a re-organized collection produced by US government officials and historians, who
exclude documents containing classified content and omit articles regarding events that seemed unimportant
at the time, FRUS could have certain selection or systematic biases. Nevertheless, one important advantage
of FRUS is that it contains most of the crucial documents regarding US foreign policy decision-making.
It is also a virtue of FRUS that it relieves researchers of the burden of
visiting government archives and identifying and collecting each document personally.
5 Gary King, “Extracting Systematic Social Science Meaning from Text.” 2007.
(http://gking.harvard.edu/gking/talks/wordstlk-high.pdf); Daniel J. Hopkins, Gary King, “A Method of Automated
Nonparametric Content Analysis for Social Science,” American Journal of Political Science 54(1), 2010.
6 http://gking.harvard.edu/readme
FRUS exists in at least three forms: printed volumes in libraries, online volumes at the digital
library of the University of Wisconsin, and, more recently, online articles at the Department of State (DoS)
website. To apply automated content analysis in this research project, using already digitized material is
indispensable to save collecting and pre-processing time. The digital collection of the University of
Wisconsin, although it holds all the volumes from 1881 to 1960, is organized by volume rather than by article
and formatted as PDF rather than plain text. Thus, it requires a laborious process to be made ready for
automated analysis. The DoS website, on the other hand, offers every FRUS document from 1945 to 1976
organized by article, and each can be downloaded as an HTML file. For the purposes and requirements of this
research project, the DoS version suits best and has thus been used as the data.7
III. Analysis Procedures
The procedure is divided into four steps: 1) collecting articles on the DoS website; 2) making test sets by
pre-processing the articles to be suitable for analysis; 3) making control sets by categorically coding the
articles, selected systematically; and 4) drawing a meaningful result by applying automated content
analysis with ReadMe.
1. Data collection
To extract FRUS articles from the DoS website, Python, a programming language, and
BeautifulSoup, an HTML parsing library for Python, are used together. In his book Visualize This, Nathan Yau
demonstrated a method for gathering data from websites with patterned internet addresses via Python.8
Since the addresses of FRUS articles follow a distinctive and repetitive pattern, as below, it is relatively
simple to build a comprehensive list of the article addresses.
http://history.state.gov/historicaldocuments/frus1945-50Intel/d1
︙
http://history.state.gov/historicaldocuments/frus1945-50Intel/d435
http://history.state.gov/historicaldocuments/frus1950-55Intel/d1
︙
http://history.state.gov/historicaldocuments/frus1950-55Intel/d259
︙
7 The digital collection of the University of Wisconsin is at http://uwdc.library.wisc.edu/collections/FRUS; and DoS
Historians website is at http://history.state.gov/departmenthistory.
8 Nathan Yau, 2012, ch. 2.
Volume titles and the last article number of each volume were needed to make a complete, patterned
list of addresses. To find the volume titles, such as 'frus1945-50Intel' or 'frus1950-55Intel', the Volume
Title Search page was examined first.9 Then all the volumes on the website were scanned with Python to
find the number of the final article of each volume. The results were saved in a spreadsheet. In total,
the DoS website held 172 FRUS volumes containing 64,047 articles. The volumes covered the years
from 1945 to 1976.
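The address pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the paper: the dictionary name LAST_DOC and the function article_urls are hypothetical, and only the two volumes quoted in the example addresses are listed (the full spreadsheet of 172 volumes is not reproduced here).

```python
# Hypothetical volume table: volume title -> number of its last article.
# Only the two volumes quoted in the text are included (assumption).
LAST_DOC = {
    "frus1945-50Intel": 435,
    "frus1950-55Intel": 259,
}

BASE = "http://history.state.gov/historicaldocuments"


def article_urls(last_doc):
    """Yield (volume, article_number, url) for articles d1..dN of each volume."""
    for volume, last in last_doc.items():
        for n in range(1, last + 1):
            yield volume, n, f"{BASE}/{volume}/d{n}"


urls = list(article_urls(LAST_DOC))
print(len(urls))       # number of addresses generated for the two volumes
print(urls[0][2])      # first address in the list
```

Keeping the volume title and article number alongside each address makes it easy to name the output folders and files after them later.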
Using the list of addresses, the entire HTML content of each article's web page could now be
downloaded with Python, and BeautifulSoup was used to parse the HTML and extract the actual content of the
articles (see Appendix 4 for some examples of the Python code). The content of each article was saved as a
text file in a folder named after its FRUS volume. The programming and
debugging took less time than the web mining itself: it took 3 or 4 seconds to retrieve,
extract and save one article, and thus eventually more than 60 hours to process the 64,047 articles.
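As a quick check on the reported running time, the arithmetic can be written out. The sketch assumes a strictly sequential run with no pauses or retries; the function name estimated_hours is hypothetical.

```python
def estimated_hours(n_articles, seconds_per_article):
    """Total time in hours for a strictly sequential run (assumption)."""
    return n_articles * seconds_per_article / 3600


N_ARTICLES = 64_047  # article count reported in the text

for seconds in (3, 4):
    hours = estimated_hours(N_ARTICLES, seconds)
    print(f"{seconds} s per article: about {hours:.1f} hours")
```

At 3 to 4 seconds per article the total falls between roughly 53 and 71 hours, which is consistent with the figure of more than 60 hours given above.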
2. Pre-processing of the data
The ReadMe package in R is an automated content analysis tool whose dataset should be in one
folder. The dataset consists of one control file (control.txt) and all the other text files, which are the
articles to be analyzed. The control file lists every file name in the folder, and each entry has two
comma-separated numbers indicating 1) which set the file belongs to (1 for the control set and 0 for the test
set); and 2) the categorical value of the file, if it is in the control set. The first value is zero for every
file in the test set, and the number of categories is decided by the schemes or hypotheses of the research.
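A control file of the kind described above could be generated along the following lines. This is a sketch under stated assumptions: the column order (file name, set flag, category) simply follows the description in this section, the placeholder category 0 for test-set files is an assumption, and the exact header and column order that ReadMe expects should be checked against the package documentation. The function name control_rows and the file names are hypothetical.

```python
import csv
import io


def control_rows(filenames, coded):
    """Build one row per file: (filename, set_flag, category).

    coded maps hand-coded file names to their category.
    set_flag is 1 for the control set and 0 for the test set.
    Test-set files get a placeholder category of 0 (assumption).
    """
    rows = []
    for name in sorted(filenames):
        if name in coded:
            rows.append((name, 1, coded[name]))
        else:
            rows.append((name, 0, 0))
    return rows


# Hypothetical file names and codes, for illustration only.
files = ["d1.txt", "d2.txt", "d3.txt", "d4.txt"]
hand_coded = {"d2.txt": 3, "d4.txt": 1}

# StringIO stands in for open("control.txt", "w", newline="").
buffer = io.StringIO()
csv.writer(buffer).writerows(control_rows(files, hand_coded))
print(buffer.getvalue())
```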
Since the purpose of this paper is to analyze the US perception of China and the USSR on a yearly
basis, it is necessary to rearrange the collected FRUS articles by year. For this, the content of
each article was examined with Python again to find the year it was written. The production year
was located in the upper part of each article and could be scanned and recognized. After finding the year,
each article was copied into a new yearly folder. The FRUS volume titles indicated that they cover the years
from 1945 to 1976. After this rearranging, however, the numbers of articles in the years from 1945 to
1951 and in the year 1976 were found to be too small to be analyzed with ReadMe, so these years had to
be excluded from the dataset.
9 FRUS Volume Title Search page. (http://history.state.gov/historicaldocuments/volume-title-search)
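One way to recognize the production year is to look for the first plausible four-digit year near the top of each article. The sketch below is an assumption about how this could be done, not the paper's actual code: the number of header lines scanned (head_lines) and the function name production_year are hypothetical.

```python
import re

# Candidate years 1940-1979; the range check below narrows this to 1945-1976.
YEAR = re.compile(r"\b(19[4-7]\d)\b")


def production_year(text, head_lines=15, first=1945, last=1976):
    """Return the first plausible year found in the top lines, else None."""
    head = "\n".join(text.splitlines()[:head_lines])
    for match in YEAR.finditer(head):
        year = int(match.group(1))
        if first <= year <= last:
            return year
    return None


# Made-up article header, for illustration only.
sample = (
    "123. Memorandum of Conversation\n"
    "Washington, March 4, 1952.\n"
    "SUBJECT\n"
    "Chinese offshore islands"
)
print(production_year(sample))
```

Articles for which no year is found return None and would need to be checked by hand or set aside.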
The next stage was to select the articles that are related to China and/or the Soviet Union. From the files in
each year's folder, those related to China were searched for and copied into a new folder. The query words
were 'china,' 'sino,' 'ccp,' and 'prc'. For example, if an article was from the year 1952 and had a
word such as PRC in its content, the file was copied into the newly created 'china1952' folder. The same
process was applied to articles relating to the USSR, for which the keywords were 'soviet,' 'ussr,' and
'russia'. As a result, the total number of articles related to China was 8,198 and those related to the USSR
numbered 18,914. The yearly proportions of the total articles are shown in Figure 1.
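The keyword selection can be sketched as a case-insensitive search. Whether the original search matched whole words or substrings is not stated, so substring matching is an assumption here, and the function name mentions is hypothetical. Note that plain substring matching also matches 'Indochina', which may be one reason articles on Indochina end up in the China group.

```python
CHINA_TERMS = ("china", "sino", "ccp", "prc")
USSR_TERMS = ("soviet", "ussr", "russia")


def mentions(text, terms):
    """True if any query term occurs anywhere in the text, ignoring case.

    Substring matching is an assumption; it also matches longer words
    that merely contain a query term.
    """
    lowered = text.lower()
    return any(term in lowered for term in terms)


print(mentions("The PRC delegation arrived.", CHINA_TERMS))
print(mentions("French forces in Indochina", CHINA_TERMS))
print(mentions("Talks in Moscow", USSR_TERMS))
```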
Figure 1
Figure 1 shows that the proportion of documents on China was high in the mid-1950s and again in
the early 1970s, while the overall proportion of documents on the USSR was much higher than that of
documents on China. The ratio of USSR articles rose in the early 1960s and peaked in 1974. This pattern might be
explained by the Cuban Missile Crisis in 1962 and Detente in the 1970s. For China, FRUS shows
elevated concern during the early 1950s and the early 1970s. In this case, the Korean War, which began in
1950, and the US-China reconciliation during the 1970s could be the reasons.
[Figure 1 plot: Proportion of Articles related with China and USSR (%); vertical axis 0 to 60; series: USSR, China]
The third and last part of pre-processing was to cut out the irrelevant content of each article to
improve the precision of the results. A pilot test of ReadMe without this step showed little
difference between China and the USSR in the crucial aspect of US threat perception. The reason seemed
to be that many articles contained irrelevant content and were only partially related to China or the USSR.
Documents such as the NSC Meetings Series, for example, cover multiple subjects and diverse countries of
interest, while discussion of China or the USSR takes up only a little space. To fix this problem, Python was
used again to extract only the relevant part of each article and make new, shortened text files. The search
words indicating China or the USSR (again, ‘china,’ ‘sino,’ ‘ccp,’ ‘prc,’ ‘soviet,’ ‘ussr,’ and ‘russia’) were
used to scan each article, and when one of those words was found, the 4 lines above and the 4 lines
below were extracted and saved in a new text file. This produced the final dataset of files ready to be
analyzed with ReadMe.
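The shortening step can be sketched as a window around each keyword hit. Two details are assumptions: the matching line itself is kept along with the 4 lines on either side, and overlapping windows are merged so that no line is repeated. The function name relevant_lines is hypothetical.

```python
TERMS = ("china", "sino", "ccp", "prc", "soviet", "ussr", "russia")


def relevant_lines(text, terms=TERMS, window=4):
    """Keep each line containing a query term plus `window` lines on each side."""
    lines = text.splitlines()
    keep = set()
    for idx, line in enumerate(lines):
        if any(term in line.lower() for term in terms):
            start = max(0, idx - window)
            stop = min(len(lines), idx + window + 1)
            keep.update(range(start, stop))   # overlapping windows merge here
    return "\n".join(lines[i] for i in sorted(keep))


# Made-up 12-line article with one keyword hit on line index 6.
demo = [f"line{i}" for i in range(12)]
demo[6] = "talks with the Soviet side"
shortened = relevant_lines("\n".join(demo))
print(shortened)
```

Passing only the China terms or only the USSR terms as `terms` would produce country-specific shortened files.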
3. Coding
In the coding stage, conceptual categories derived from the hypotheses should be developed and the
control set prepared. Hopkins & King suggested the size of the control set should be at least 100.10 This
research tentatively set the size at 150 and selected about that number of articles from each of the China
and USSR article groups, using a systematic sampling method. Consequently, the control set for China
had 151 articles out of a total of 8,189, and the control set for the Soviet Union had 152 articles out of a
total of 18,914.
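Systematic sampling takes every k-th item from an ordered list, starting from a random position. The sketch below shows the general technique only; the exact interval and starting point used for the paper are not stated, so it is not expected to reproduce the reported control-set sizes. The function name systematic_sample is hypothetical.

```python
import random


def systematic_sample(items, target, seed=None):
    """Pick every k-th item, with k = len(items) // target, from a random start."""
    k = max(1, len(items) // target)
    start = random.Random(seed).randrange(k)
    return items[start::k]


# Stand-in for a list of 8,189 article file names.
articles = list(range(8189))
sample = systematic_sample(articles, target=150, seed=0)
print(len(sample), sample[1] - sample[0])
```

Because the interval is rounded down to a whole number, the sample size lands near the target rather than exactly on it.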
Four categories were created for coding. Since the hypotheses focus on US threat perceptions,
three categories were assigned to measure the levels of threat perception. If an article in the control set
contained comments or expressions of threat such as 'threat,' 'enemy,' or even 'defense,' it was labeled
Category 3. If, on the other hand, the article was conciliatory or benign, it was labeled
Category 1. Articles with a practical or neutral sentiment, falling between the two, were labeled
Category 2. Finally, the dataset could contain irrelevant articles, and these were put into Category
4. For example, there are many articles regarding Indochina that were sorted into the China group. Those
articles belong to Category 4.11
10 Hopkins and King, 2010. pp. 241-242.
11 Although it is recommended to have at least two or three coders and to secure inter-coder reliability, this
preliminary research falls short of that ideal. The author hand-coded the control set on his own, and
this shortcoming will be corrected in further research projects.
4. ReadMe pilot tests
To figure out how ReadMe works with the FRUS articles, two tests were implemented. ReadMe
uses a sampling method to analyze the test sets, so each run of ReadMe, even on the same
dataset, shows slightly different results. To see how large the deviations are,
the ReadMe analysis of the China dataset was run ten times. The R code is shown below, and
the test results are in Figure 2.
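library(ReadMe)

# Remember the starting directory so that all results go to one file.
oldwd <- getwd()
result.file <- file.path(oldwd, "frus5", "result.txt")

for (j in 1:10) {           # ten repeated ReadMe runs
  for (i in 1952:1975) {    # one yearly China folder per run
    # Yearly folders are located inside the installed ReadMe package directory.
    DIR <- paste("frus5/china", i, sep = "")
    setwd(system.file(DIR, package = "ReadMe"))
    undergrad.results <- undergrad(control = "control.txt", sep = ",")
    undergrad.preprocess <- preprocess(undergrad.results)
    readme.results <- readme(undergrad.preprocess)
    print(readme.results$est.CSMF)
    # Append run number, year, and the estimated category proportions.
    cat(j, i, readme.results$est.CSMF, "\n", file = result.file, append = TRUE)
  }
}
setwd(oldwd)                # restore the original working directory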
In Figure 2, CM1 denotes benign perception of China, CM2 neutral, and CM3 threat.12 CM4 is
intentionally omitted since this category contains irrelevant articles. Figure 2 shows the 10
iterations of the test, whose outcomes vary slightly but are dependably similar. Since each
run gives a different output, it seemed more accurate to take the mean, or another statistical
estimate, of the multiple test results.
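Averaging the repeated runs can be sketched as follows. The sketch assumes each line of the results file holds the run number, the year, and then the estimated proportions for the categories, separated by spaces; the function name mean_by_year is hypothetical and the numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_by_year(lines):
    """Average the category proportions over runs, separately for each year."""
    runs = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue                      # skip blank or malformed lines
        year = int(fields[1])
        runs[year].append([float(x) for x in fields[2:]])
    return {
        year: [mean(column) for column in zip(*rows)]
        for year, rows in sorted(runs.items())
    }


# Made-up values in the assumed "run year p1 p2 p3 p4" layout.
sample_lines = [
    "1 1952 0.10 0.30 0.50 0.10",
    "2 1952 0.20 0.30 0.40 0.10",
    "1 1953 0.15 0.25 0.45 0.15",
]
averaged = mean_by_year(sample_lines)
print(averaged)
```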
The next test using ReadMe was regarding the threshold for word frequency. The initial value of the
threshold is automatically set at 0.01. This means that ReadMe only includes words in its internal
dictionary that appear in more than one percent of the whole control set’s content. Users can modify the
threshold if needed, so it should be examined to obtain better control of the results.
12 In CM1, for example, C stands for China, M for the mean of ten ReadMe runs, and 1 for Category 1.
Figure 2
To test the threshold, the China datasets for 1952 to 1962 were used. The test was again run
ten times, with the threshold increased by 0.01 in each run. Figure 3 shows the test results.
While changes in the threshold generally do not produce significant differences in the results, the
initial value of 0.01 resulted in the most distinctive disparity in the results.
All in all, it was decided to set the threshold at 0.01 and, following the previous test, to take the mean
values of ten iterations on the same dataset. The final results and their interpretation follow in the next
section.
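The idea behind a word-frequency threshold can be illustrated with a simple document-frequency filter. This is only an illustration of the concept as described above, reading the threshold as a share of documents (an assumption); it does not reproduce ReadMe's own preprocessing, such as stemming or any other filtering it applies. The function name vocabulary is hypothetical.

```python
def vocabulary(documents, threshold=0.01):
    """Keep words that occur in more than `threshold` of the documents."""
    n_docs = len(documents)
    doc_freq = {}
    for text in documents:
        for word in set(text.lower().split()):   # count each word once per document
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word for word, count in doc_freq.items() if count / n_docs > threshold}


# Made-up miniature corpus, for illustration only.
docs = ["threat enemy", "threat talks", "trade talks", "threat"]
print(sorted(vocabulary(docs, threshold=0.3)))
print(sorted(vocabulary(docs, threshold=0.01)))
```

Raising the threshold shrinks the dictionary to more common words, which is why the choice can affect the estimated proportions.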
[Figure 2 plot: U.S. Threat Perception of China, 1952-1975 (try=10, threshold=0.01); x-axis: Year; y-axis: Proportion of Articles (0.0 to 0.7); series: CM1 (benign), CM2 (neutral), CM3 (threat)]
Figure 3
IV. Results and Interpretations
1. US perception of China and Soviet Union
The final result of the ReadMe analysis of the China dataset is shown in Figure 4. It shows
the annual proportions of articles in each category. CM1 (benign) has the
lowest value of all, but increases slightly in the early 1970s. CM2 (neutral) fluctuates, showing a decrease
in the early 1960s, an increase from the late 1960s to 1972, and a decrease after 1973. CM3 (threat), most
importantly, increases in the early 1950s and thereafter tends to decrease until 1968, except for
abrupt peaks from 1958 to 1960 and in 1966. In 1969, CM3 rises to a relatively higher level,
but decreases from the early 1970s and crosses CM1 in 1975.
[Figure 3 plot: U.S. Threat Perception of China, 1952-1975 (try=10, threshold=0.01~0.1); x-axis: Year (1952 to 1962); y-axis: Proportion of Articles (0.0 to 0.7); series: CM1 (benign), CM3 (threat)]
Figure 4
Explaining what these variations really mean would require a close study of the historical
events and foreign policy-making cases contained within FRUS, which is not
the objective of this paper. A brief outline of US-Chinese relations nevertheless suggests that Figure 4
roughly corresponds with real history. The Korean War and the Geneva Conference took place in the early 1950s,
followed by the China-Taiwan conflict and, domestically, the Great Leap Forward (Dayuejin)
Movement in China. Figure 4 shows a small peak in CM3 and CM2 from 1965 to 1967,
which could be explained partly by Chinese nuclear weapons development during the mid-1960s and
partly by the Great Proletarian Cultural Revolution (Wenhua Dageming). Interestingly, a jump in CM3
and CM2 in 1969 coincides with the inauguration of President Nixon. The decrease in CM3 in the early 1970s
reflects the US-China reconciliation and the visits of Kissinger and Nixon to China during that time.
[Figure 4 plot: U.S. Threat Perception of China, 1952-1975 (Mean result, threshold=0.01); x-axis: Year; y-axis: Proportion of Articles (0.0 to 0.7); series: CM1 (benign), CM2 (neutral), CM3 (threat)]
Figure 5
Figure 5 shows the result of the ReadMe test on the Soviet dataset. UM1 (USSR, Mean,
Category 1 (benign)) has a tendency to decrease from 1952 to 1968, except for the peak from 1958-60.
From 1969, it leaps and maintains a relatively higher proportion. UM2 (neutral) generally increases from
21.9% in 1952 to 29.5% in 1975, while it decreases during the late 1960s. Finally, UM3 (threat), which
maintains the highest ratio among the three variables, has sudden decreases in 1955, from 1958 to 1960,
and in 1964. From 1965 on, it keeps increasing slightly until a decrease beginning in 1969.
Does this result correspond with reality? After Nikita Khrushchev became Premier in 1958,
Soviet rhetoric regarding Peaceful Coexistence began. The first summit meeting between the US and the
USSR (Eisenhower and Khrushchev) occurred in 1959, and a summit meeting between Kennedy and
Khrushchev followed in 1961. Figure 5 shows this sudden reconciliation between the two countries. But the
Cuban Missile Crisis happened in 1962 and relations became cold again. With the start of the
Nixon administration, Detente began and several negotiations such as SALT-II also took place. Those
negotiations continued until the early 1970s, but were suspended from 1976-79 during the Carter
Administration. Figure 5 seems to correspond well with the historical realities of the 1960s and 1970s,
which could be described in general as a search for practical negotiations and thus
reconciliation under prevailing tensions: UM1 takes a relatively higher position during the
early 1970s, while UM2 increases notably in the early 1960s and maintains a high position thereafter.
[Figure 5 plot: U.S. Threat Perception of USSR, 1952-1975 (mean of 10 tries, threshold=0.01); x-axis: Year; y-axis: Proportion of Articles (0.0 to 0.7); series: UM1 (benign), UM2 (neutral), UM3 (threat)]
2. Regression analysis of the variables
A further question could be raised as to the relationship among the result variables CM1, CM2, CM3,
UM1, UM2, and UM3. Specifically, what kind of causal relationships existed between the US perception
of China as a dependent variable and that of the USSR as an independent variable?
To answer this, every pairwise relation among the variables was first examined
using the pairs() command in R, which draws a scatterplot matrix. It is an intuitive way to inspect general
relations among variables when their correlations are unknown. As Figure 6 depicts, each panel contains the
plot and the LOESS line for one pair of variables, and the panels display notably strong and positive linear
correlations between CM1 and UM1, CM2 and UM2, and CM3 and UM3. To obtain exact numbers, a
regression analysis was conducted, as shown in Table 1.
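The quantities reported for each regression (slope b, its standard error, the t value, and adjusted R2) can be computed from first principles as sketched below. The sketch assumes a simple regression with an intercept, which may differ from the specification actually used, and the data are made up for illustration; they are not the paper's series. The function name simple_ols is hypothetical.

```python
from math import sqrt


def simple_ols(x, y):
    """Least-squares fit y = a + b*x; returns slope, its S.E., t, adjusted R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                         # slope
    a = my - b * mx                       # intercept
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - my) ** 2 for yi in y)
    se_b = sqrt(rss / (n - 2) / sxx)      # standard error of the slope
    adj_r2 = 1 - (rss / (n - 2)) / (tss / (n - 1))
    return {"b": b, "se": se_b, "t": b / se_b, "adj_r2": adj_r2}


# Made-up yearly proportions (x plays the role of a UM series, y of a CM series).
x_demo = [0.10, 0.15, 0.20, 0.25, 0.30]
y_demo = [0.11, 0.14, 0.21, 0.24, 0.31]
fit = simple_ols(x_demo, y_demo)
print(fit)
```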
Why are they so closely correlated? One explanation could be that the agents who were perceiving
China and the USSR were part of one homogeneous group, represented by the 'decision makers' in the
White House, DoS, Department of Defense, etc. It could also be because China and the USSR were both
communist states. US perceptions of and policies toward China and the Soviet Union may thus have been
synchronized because of the homogeneity of the perceptions and processes of US foreign policy
decision-making. In this case,
US officials might simply have been identifying China with their perception of the Soviet Union, which
was the most important or most threatening state.
Another explanation is also possible: an intentional or unintentional bias on the part of the editors of
FRUS. They might have a certain cognitive tendency in thinking about US policies or the policy-making
structure. If so, their selection bias could appear in the same proportion in each category. However,
this explanation needs further investigation.
Figure 6
[Figure 6 plot: scatterplot matrix of CM1, CM2, CM3, CM4, UM1, UM2, UM3, and UM4]
Table 1
y     x     b          S.E.(b)   t        adjR2
CM1   UM1   0.945***   0.017     56.979   0.993
CM2   UM2   1.042***   0.047     22.19    0.955
CM3   UM3   0.897***   0.068     13.123   0.882
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Is the first hypothesis, which expected a disparity between perceptions of China and the USSR, then
rejected? This can be assessed through two arguments. One is the existence of outliers. Figure 7 contains the
diagnostics of the regression analysis of CM3 and UM3 and points out that there are outliers. The outliers
are 1958, 1959, 1960 and 1965. In these years, the US perception of China worsened due to the
Sino-Taiwanese conflict and China's nuclear development, as well as the Great Leap Forward and the Cultural
Revolution. In contrast, the quantitative results for the years from 1958 to 1960 agree with the
US-Soviet reconciliation represented by a series of summit meetings. This difference reveals that the US
perception of China could have an independent or unsynchronized impact, even under the prevailing
influence of US perceptions of the Soviet Union. It is also worth noting that outliers, though usually
omitted, carry important meaning in this kind of research.
Figure 7
[Regression diagnostic plots for CM3 on UM3; panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage (with Cook's distance contours); labeled observations: 7, 8, 9, 14]
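The outlier check behind diagnostic plots of this kind can be sketched in Python as follows. For a simple regression, the sketch computes each observation's leverage, standardized residual, and Cook's distance, and flags years whose standardized residual exceeds a cutoff. The function names are hypothetical, the cutoff of 2 is a common rule of thumb assumed here rather than one stated in this paper, and the series are invented placeholders.

```python
import math


def regression_diagnostics(x, y):
    """Return (leverage, standardized residual, Cook's distance) per point
    for the simple regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))  # residual std. error
    if s == 0:
        raise ValueError("perfect fit: standardized residuals are undefined")
    rows = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - mean_x) ** 2 / sxx        # leverage
        r = e / (s * math.sqrt(1.0 - h))              # standardized residual
        d = r * r * h / (2.0 * (1.0 - h))             # Cook's distance, p = 2
        rows.append((h, r, d))
    return rows


def flag_outliers(years, x, y, cutoff=2.0):
    """Years whose |standardized residual| exceeds the (assumed) cutoff."""
    diag = regression_diagnostics(x, y)
    return [yr for yr, (_, r, _) in zip(years, diag) if abs(r) > cutoff]


# Invented placeholder series (NOT the paper's data).
years = list(range(1952, 1962))
um3 = [0.40, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49]
cm3 = [0.39, 0.40, 0.41, 0.42, 0.43, 0.44, 0.52, 0.46, 0.47, 0.48]
print(flag_outliers(years, um3, cm3))
```

Flagged years are candidates for closer historical reading rather than points to be discarded.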
The other argument is that the size or intensity of the perceptions was unequal, even if they were
synchronized. The first part of data pre-processing showed a large difference in the quantity
of articles discussing China and the USSR: there were 18,914 articles about the Soviet Union, compared
to 8,189 about China, so the former is about 2.3 times the latter. Thus, the size or intensity of US threat
perceptions of the two clearly differed. This factor should be taken seriously and
included as an indicator if an integrated Index of Threat Perception is
developed.
3. Relations with US military deployments
This part conducts a regression analysis of US overseas deployments on the threat
perceptions of China and the USSR. This is a rudimentary test of the third hypothesis, which posits
some relationship between the perceptions and actual foreign policies. Admittedly, the size of US
deployments does not represent all US foreign policy, and assuming a relationship between
deployment, a small part of foreign and security policy, and threat perceptions does not accurately depict
reality. Assuming a relation between a material factor and a conceptual or psychological factor
may also be far-fetched. However, it seems worthwhile to examine how closely they are related,
because doing so could offer ideas for further studies.
Since the Soviet presence is more prominent in Europe, US deployments to
Germany (GER), France (FRA), and Italy (ITA) were selected as dependent variables, and their sum was
taken to represent Europe (EUR). For China, US military forces in Japan (JAP), Korea (KOR), and
the Philippines (PHI), and their sum representing Asia (ASIA), were tested. The US military budget (Bcon) and
outlays (Ocon) are also included as dependent variables. The independent variables are the US threat
perceptions of China (CM3) and the USSR (UM3).
Again, PAIRS was used to review the correlations, but unfortunately there were no significant results
(see Appendix 2). Such a simple model was expected to give unsatisfactory results. At the least,
however, it can be said that simply matching ideational variables such as threat perceptions with
material variables such as military spending does not reveal significant relationships. Therefore,
to judge the size of the impact of threat perception on US security policy, more sophisticated models are
needed.
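The kind of pairwise correlation review described above can be sketched in Python as follows. The sketch computes the Pearson correlation between a perception series and each deployment series, together with the t statistic used to judge significance. The helper names are hypothetical and all numbers are invented placeholders, not the series used in this paper; turning t into a p-value additionally requires a t distribution with n - 2 degrees of freedom, which is left out here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def correlation_t(r, n):
    """t statistic for H0: no correlation (n - 2 degrees of freedom).
    Undefined when |r| = 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Invented placeholder series (NOT the paper's data).
um3 = [0.40, 0.44, 0.41, 0.47, 0.45, 0.43]
deployments = {
    "GER": [250, 248, 260, 255, 251, 262],   # hypothetical troop counts (thousands)
    "FRA": [45, 50, 48, 40, 42, 30],
}
for name, series in deployments.items():
    r = pearson_r(um3, series)
    print(name, round(r, 3), round(correlation_t(r, len(um3)), 3))
```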
V. Discussion
Among the many challenges in finishing this research project, the first was to collect FRUS articles from
the internet, specifically from the Office of the Historian website at the Department of State (DoS), with Python. Python is
intuitive enough that it could be used without serious trouble. The unexpected
hardship was collecting the articles themselves, which took a great deal of time. Pre-processing the articles
to be suitable for ReadMe analysis was also a tedious process. Even so, this was a relatively
easy process compared with using other types of sources, such as e-books in PDF format or original
government documents in archives. The PDF files of FRUS in the Digital Library at the University of
Wisconsin could be downloaded and used for this analysis, since they could perhaps more adequately
cover the years 1945 to 1951, which were omitted due to the small number of articles. Using
them is a real possibility, since they have already been digitized, and it would only require several more
steps to reformat them for analysis. The worst case would be using hard copies of government
documents. This would require visiting archives to photograph each document and then converting
each image to text with text recognition software, which would surely cost more time and labor than
processing the other types of internet materials.
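To give a sense of what the pre-processing step involves, the following Python sketch reduces a downloaded HTML page to plain text by dropping markup, scripts, and style sheets and collapsing whitespace. It is a minimal sketch using only the standard library, not the script actually used for this paper; the class and function names and the sample page are hypothetical, and any further cleaning needed for ReadMe is not shown.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Strip tags and collapse runs of whitespace into single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


# Hypothetical sample page (not an actual FRUS document).
sample = """<html><head><style>p {margin: 0}</style></head>
<body><h1>Memorandum of Conversation</h1>
<p>The Secretary   said that the situation
was serious.</p><script>var x = 1;</script></body></html>"""
print(html_to_text(sample))
```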
Such difficulties notwithstanding, automated content analysis using ReadMe produced meaningful
results worth the time and labor. It succeeded in transforming linguistic and conceptual text data
into plausible quantitative data. It is now possible to use the data for pattern finding, case selection, and
even statistical analysis, and to offset the disadvantages of narrative historical case explanation in
terms of time and parsimony. For example, as Figures 4 and 5 show, sudden changes in
perceptions occur when US administrations change. In particular, the Kennedy Administration shows a
much lower threat perception of China than the preceding Eisenhower Administration. Interestingly,
in 1961 CM3 (threat) falls below CM2 (neutral) for the first time. Also, with the start of
the Kennedy era, the threat perception of the USSR sharply increases, while the benign
perception drops.
The Nixon Administration also shows interesting patterns. In 1969 in particular, CM3 and CM2
increased sharply. This means that the Nixon Administration's perception of China differed considerably
from the Johnson Administration's. While CM2 increases until 1972 and then falls until 1975, CM3
drops until 1975 and crosses CM1 (benign) for the first time. This suggests that
Nixon's perception of China was relatively negative at first but that, through efforts to negotiate and communicate
with China, the threat perception decreased considerably. At this point, we can raise some questions
regarding the impact of Nixon's visit to China in 1972 on the US threat perception. Did Nixon's visit
lower the threat perception, or were there other causes, such as the efforts of important decision makers,
e.g. Kissinger, to initiate conversations with China? How was it possible to reconcile with China in spite
of the notably high US threat perception? In this way, content analysis of FRUS allows us to identify and
understand interesting patterns in history and to ask questions about the cases in which perception variables
may play a strong role.
This preliminary research suggests some recommendations for further study.
The first is to develop an index of threat perception. This index could be used by itself or imported into
other quantitative analyses. We now have the relative proportions of articles in FRUS on China and the USSR,
as in Figure 1, and the perception variables CM1, CM2, CM3, and so on. By combining these variables appropriately, it
may be possible to develop a unified index of threat perception.
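One purely hypothetical way to combine these variables is sketched below in Python: each year's threat proportion is weighted by that country's share of FRUS articles in the same year. This weighting is an assumption made only for illustration, not an index proposed or tested in this paper, and the function name and all numbers are invented placeholders.

```python
def threat_index(article_share, threat_proportion):
    """Hypothetical index: yearly article share times threat proportion.

    article_share     -- share of FRUS articles about the country, per year
    threat_proportion -- proportion coded as 'threat' (e.g. CM3), per year
    """
    if len(article_share) != len(threat_proportion):
        raise ValueError("series must cover the same years")
    return [s * t for s, t in zip(article_share, threat_proportion)]


# Invented placeholder values (NOT the paper's data).
china_share = [0.25, 0.30, 0.35]
cm3 = [0.45, 0.50, 0.40]
for year, value in zip([1958, 1959, 1960], threat_index(china_share, cm3)):
    print(year, round(value, 4))
```

Other combinations, such as using the gap between the threat and benign proportions, would be equally defensible starting points; the choice would need to be validated against known historical cases.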
Second, using government documents in the archives could produce more interesting outcomes.
In FRUS, articles from all government agencies and offices are mixed together in chronological order. In the archives, however, we
can find documents sorted by office, meeting, personnel, and theme. Utilizing
these categorizations would make it possible to analyze a specific group's perceptions. Comparing such groups'
perceptions and discourses could contribute to studies of foreign policy.
Finally, more in-depth linguistic analyses of the articles’ contents are needed. Developing a
corpus and finding subtle meanings and perceptions in personal comments and evaluations would
bring out important aspects of cases and events of interest. If the ReadMe analysis of FRUS is a macro
analysis of history, micro analysis is also needed to truly understand cases and events. In practice, this
micro-level analysis will also help us find critical evidence in a large quantity of documents
and articles.
References
Carlsnaes, Walter. 2002. “Foreign Policy,” in Walter Carlsnaes, Thomas Risse and Beth A. Simmons eds.,
Handbook of International Relations. Sage Publications.
Cohen, Raymond. 1978. “Threat Perception in International Crisis,” Political Science Quarterly 93(1).
Daggett, Stephen and Amy Belasco. 2002. “CRS Report for Congress: Defense Budget for FY2003: Data
Summary,” Congressional Research Service.
Hopkins, Daniel J. and Gary King. 2010. “A Method of Automated Nonparametric Content Analysis for
Social Science,” American Journal of Political Science 54(1).
King, Gary. 2007. “Extracting Systematic Social Science Meaning from Text.”
(http://gking.harvard.edu/gking/talks/wordstlk-high.pdf)
Matloff, Norman. 2011. The Art of R Programming: A Tour of Statistical Software Design. No Starch
Press.
Rousseau, David L. 2006. Identifying Threats and Threatening Identities: The Social Construction of
Realism and Liberalism. Stanford University Press.
Walt, Stephen M. 1987. The Origins of Alliances. Cornell University Press.
Yau, Nathan. 2011. Visualize This: The FlowingData Guide to Design, Visualization, and Statistics.
Wiley.
http://gking.harvard.edu (Gary King's website)
http://history.state.gov/departmenthistory (Office of the Historian website, Department of State)
http://uwdc.library.wisc.edu/collections/FRUS (Digital collections at the University of Wisconsin)
http://www.vetfriends.com/US-deployments-overseas/historical-military-troop-data.cfm
(VetFriends.com offers a database of US military deployments)