<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.0" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">1683-1470</journal-id>
<journal-title-group>
<journal-title>Data Science Journal</journal-title>
</journal-title-group>
<issn pub-type="epub">1683-1470</issn>
<publisher>
<publisher-name>Ubiquity Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5334/dsj-2017-037</article-id>
<article-categories>
<subj-group>
<subject>Research paper</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Statistical Inference in Missing Data by MCMC and Non-MCMC Multiple Imputation Algorithms: Assessing the Effects of Between-Imputation Iterations</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Takahashi</surname>
<given-names>Masayoshi</given-names>
</name>
<email>mtakahashi@tufs.ac.jp</email>
<xref ref-type="aff" rid="aff-1"/>
</contrib>
</contrib-group>
<aff id="aff-1">IR Office, Tokyo University of Foreign Studies, Tokyo, JP</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2017-07-28">
<day>28</day>
<month>07</month>
<year>2017</year>
</pub-date>
<volume>16</volume>
<elocation-id>37</elocation-id>
<history>
<date date-type="received" iso-8601-date="2016-11-30">
<day>30</day>
<month>11</month>
<year>2016</year>
</date>
<date date-type="accepted" iso-8601-date="2017-06-23">
<day>23</day>
<month>06</month>
<year>2017</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2017 The Author(s)</copyright-statement>
<copyright-year>2017</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://datascience.codata.org/articles/10.5334/dsj-2017-037/"/>
<abstract>
<p>Incomplete data are ubiquitous in the social sciences; as a consequence, analyses based on the available data are inefficient and often biased. In the literature, multiple imputation is known to be the standard method of handling missing data. While the theory of multiple imputation has been known for decades, implementation is difficult due to the complicated nature of random draws from the posterior distribution. Thus, several computational algorithms are available in software: Data Augmentation (DA), Fully Conditional Specification (FCS), and Expectation-Maximization with Bootstrapping (EMB). Although the literature is full of comparisons between joint modeling (DA, EMB) and conditional modeling (FCS), little is known about the relative superiority of the MCMC algorithms (DA, FCS) and the non-MCMC algorithm (EMB), where MCMC stands for Markov chain Monte Carlo. Based on simulation experiments, the current study contends that EMB is a confidence proper multiple imputation algorithm without between-imputation iterations; thus, EMB is more user-friendly than DA and FCS.</p>
</abstract>
<kwd-group>
<kwd>MCMC</kwd>
<kwd>Markov chain Monte Carlo</kwd>
<kwd>Incomplete data</kwd>
<kwd>Nonresponse</kwd>
<kwd>Joint modeling</kwd>
<kwd>Conditional modeling</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec>
<title>1 Introduction</title>
<p>Generally, it is quite difficult to obtain complete data in social surveys (<xref ref-type="bibr" rid="B32">King et al. 2001: 49</xref>). Consequently, available data are not only inefficient due to the reduced sample size, but also biased due to the difference between respondents and non-respondents, thus making statistical inference invalid. Since Rubin (<xref ref-type="bibr" rid="B42">1987</xref>), multiple imputation has been known to be the standard method of handling missing data (<xref ref-type="bibr" rid="B21">Graham 2009</xref>; <xref ref-type="bibr" rid="B4">Baraldi and Enders 2010</xref>; <xref ref-type="bibr" rid="B8">Carpenter and Kenward 2013</xref>; <xref ref-type="bibr" rid="B41">Raghunathan 2016</xref>).</p>
<p>While the theoretical concept of multiple imputation has been around for decades, the implementation is difficult because making a random draw from the posterior distribution is a complicated matter. Accordingly, there are several computational algorithms in software (<xref ref-type="bibr" rid="B43">Schafer 1997</xref>; <xref ref-type="bibr" rid="B25">Honaker and King 2010</xref>; <xref ref-type="bibr" rid="B56">van Buuren 2012</xref>). The most traditional algorithm is Data Augmentation (DA), followed by two newer algorithms, Fully Conditional Specification (FCS) and Expectation-Maximization with Bootstrapping (EMB). Although an abundant literature exists comparing joint modeling (DA, EMB) with conditional modeling (FCS), no comparisons have been made regarding the relative superiority of the MCMC algorithms (DA, FCS) versus the non-MCMC algorithm (EMB), where MCMC stands for Markov chain Monte Carlo. This study assesses the effects of between-imputation iterations on the performance of the three multiple imputation algorithms, using Monte Carlo experiments.</p>
<p>By way of organization, Section 2 introduces the notation used in this article. Section 3 gives a motivating example of missing data analysis in the social sciences. Section 4 presents the assumptions of imputation methods. Section 5 reviews the traditional methods of handling missing data. Section 6 introduces the three multiple imputation algorithms. Section 7 surveys the literature on multiple imputation. Section 8 gives the results of the Monte Carlo experiments, showing the impact of between-imputation iterations on multiple imputation. Section 9 concludes with the findings and the limitations of the current research.</p>
</sec>
<sec>
<title>2 Notations</title>
<p><italic>D</italic> is <italic>n</italic> &#215; <italic>p</italic> data, where <italic>n</italic> is the sample size and <italic>p</italic> is the number of variables. The distribution of <italic>D</italic> is multivariate-normal with mean vector &#956; and variance-covariance matrix &#931;, i.e., <italic>D</italic> ~ <italic>N</italic><sub>p</sub>(&#956;, &#931;), where all of the variables are continuous. Let <italic>i</italic> refer to an observation index (<italic>i</italic> = 1, &#8230;, <italic>n</italic>). Let <italic>j</italic> refer to a variable index (<italic>j</italic> = 1, &#8230;, <italic>p</italic>). Let <italic>D</italic> = {<italic>Y</italic><sub>1</sub>, &#8230;, <italic>Y<sub>p</sub></italic>}, where <italic>Y<sub>j</sub></italic> is the <italic>j</italic>-th column in <italic>D</italic> and <italic>Y<sub>&#8211;j</sub></italic> is the complement of <italic>Y<sub>j</sub></italic>, i.e., all columns in <italic>D</italic> except <italic>Y<sub>j</sub></italic>. Also, let <italic>Y<sub>obs</sub></italic> be observed data and <italic>Y<sub>mis</sub></italic> be missing data: <italic>D</italic> = {<italic>Y<sub>obs</sub>, Y<sub>mis</sub></italic>}.</p>
<p>At the imputation stage, there is no concept of the dependent and independent variables, because imputation is not a causal model, but a predictive model (<xref ref-type="bibr" rid="B32">King et al. 2001: 51</xref>). Therefore, all of the variables are denoted <italic>Y<sub>j</sub></italic> with the subscript <italic>j</italic> indexing a variable number. However, at the analysis stage, one of the <italic>Y<sub>j</sub></italic> variables is the dependent variable and the remaining <italic>Y<sub>&#8211;j</sub></italic> are the independent variables. If the dependent variable is the <italic>p</italic>-th column in <italic>D</italic>, then the dependent variable is simply denoted <italic>Y</italic> and the independent variables are denoted <italic>X</italic><sub>1</sub>, &#8230;, <italic>X</italic><sub><italic>p</italic>&#8211;1</sub>.</p>
<p>Let <italic>R</italic> be a response indicator matrix that has the same dimension as <italic>D</italic>. Whenever <italic>D</italic> is observed, <italic>R</italic> = 1; otherwise, <italic>R</italic> = 0. Note, however, that non-italicized R refers to the R statistical environment. In the multiple imputation context, <italic>M</italic> refers to the number of imputations and <italic>T</italic> refers to the number of between-imputation iterations. In general, <italic>&#952;</italic> is an unknown parameter vector.</p>
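<p>To make the notation concrete, the following Python sketch generates a data matrix <italic>D</italic> ~ <italic>N<sub>p</sub></italic>(&#956;, &#931;) and its response indicator matrix <italic>R</italic>; the sample size, mean vector, covariance matrix, and missing rate below are illustrative assumptions, not values from this study.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n observations, p continuous variables.
n, p = 1000, 3
mu = np.zeros(p)
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

# D ~ N_p(mu, Sigma): an n x p data matrix.
D = rng.multivariate_normal(mu, Sigma, size=n)

# Punch holes into column 0 and build the response indicator R:
# R[i, j] = 1 if D[i, j] is observed, 0 if it is missing.
D_incomplete = D.copy()
D_incomplete[rng.random(n) < 0.2, 0] = np.nan
R = (~np.isnan(D_incomplete)).astype(int)
```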
</sec>
<sec>
<title>3 Motivating Example: Missing Economic Data</title>
<p>Social scientists have long debated the determinants of economic development across countries (<xref ref-type="bibr" rid="B6">Barro 1997</xref>; <xref ref-type="bibr" rid="B18">Feng 2003</xref>; <xref ref-type="bibr" rid="B2">Acemoglu, Johnson, and Robinson 2005</xref>). Using the data from the Central Intelligence Agency (<xref ref-type="bibr" rid="B10">CIA 2016</xref>) and Freedom House (<xref ref-type="bibr" rid="B19">2016</xref>), we may estimate a multiple regression model, in which the dependent variable is GDP per capita and the independent variables include social, economic, and political variables. The problem is that the data are incomplete (Table <xref ref-type="table" rid="T1">1</xref>), where the median missing rate is 22.4% and the total missing rate is 62.3%.</p>
<table-wrap id="T1">
<label>Table 1</label>
<caption>
<p>Variables and Missing Rates.</p>
</caption>
<table>
<tr>
<th align="left">Variables</th>
<th align="center">Missing Rates</th>
</tr>
<tr>
<td colspan="2"><hr/></td>
</tr>
<tr>
<td align="left">GDP per capita (purchasing power parity)</td>
<td align="right">0.0%</td>
</tr>
<tr>
<td align="left">Freedom House index</td>
<td align="right">15.4%</td>
</tr>
<tr>
<td align="left">Central bank discount rate</td>
<td align="right">32.9%</td>
</tr>
<tr>
<td align="left">Life expectancy at birth</td>
<td align="right">2.6%</td>
</tr>
<tr>
<td align="left">Unemployment rate</td>
<td align="right">10.5%</td>
</tr>
<tr>
<td align="left">Distribution of family income: Gini index</td>
<td align="right">37.3%</td>
</tr>
<tr>
<td align="left">Public debt</td>
<td align="right">22.4%</td>
</tr>
<tr>
<td align="left">Education expenditures</td>
<td align="right">24.6%</td>
</tr>
<tr>
<td align="left">Taxes and other revenues</td>
<td align="right">6.1%</td>
</tr>
<tr>
<td align="left">Military expenditures</td>
<td align="right">43.0%</td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p>Data sources: CIA (<xref ref-type="bibr" rid="B10">2016</xref>) and Freedom House (<xref ref-type="bibr" rid="B19">2016</xref>).</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>Table <xref ref-type="table" rid="T2">2</xref> presents multiple regression models; however, the conclusions are susceptible to how we deal with missing data. The coefficients for central bank and public debt are statistically significant at the 5% error level using incomplete data, while they are not significant using multiply-imputed data. On the other hand, the coefficients for education and military are not significant using incomplete data, while they are significant using multiply-imputed data. Therefore, the issue of missing data is of grave concern in applied empirical research.</p>
<table-wrap id="T2">
<label>Table 2</label>
<caption>
<p>Multiple Regression Analyses on GDP Per Capita.</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th align="center" colspan="2">Incomplete Data</th>
<th align="center" colspan="2">Multiply-Imputed Data</th>
</tr>
<tr>
<th align="left"></th>
<th colspan="4"><hr/></th>
</tr>
<tr>
<th align="left">Variables</th>
<th align="center">Coef.</th>
<th align="center">Std. Err.</th>
<th align="center">Coef.</th>
<th align="center">Std. Err.</th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left">Intercept</td>
<td align="right">&#8211;7.323</td>
<td align="right">3.953</td>
<td align="right">&#8211;11.545*</td>
<td align="right">3.495</td>
</tr>
<tr>
<td align="left">Freedom</td>
<td align="right">&#8211;0.321*</td>
<td align="right">0.127</td>
<td align="right">&#8211;0.362*</td>
<td align="right">0.127</td>
</tr>
<tr>
<td align="left"><bold>Central Bank</bold></td>
<td align="right">&#8211;<bold>0.118*</bold></td>
<td align="right"><bold>0.041</bold></td>
<td align="right">&#8211;0.107</td>
<td align="right">0.049</td>
</tr>
<tr>
<td align="left">Life Expectancy</td>
<td align="right">3.922*</td>
<td align="right">0.794</td>
<td align="right">4.908*</td>
<td align="right">0.655</td>
</tr>
<tr>
<td align="left">Unemployment</td>
<td align="right">&#8211;0.205*</td>
<td align="right">0.087</td>
<td align="right">&#8211;0.214*</td>
<td align="right">0.070</td>
</tr>
<tr>
<td align="left">Gini</td>
<td align="right">0.114</td>
<td align="right">0.253</td>
<td align="right">&#8211;0.018</td>
<td align="right">0.363</td>
</tr>
<tr>
<td align="left"><bold>Public Debt</bold></td>
<td align="right">&#8211;<bold>0.198*</bold></td>
<td align="right"><bold>0.092</bold></td>
<td align="right">&#8211;0.002</td>
<td align="right">0.093</td>
</tr>
<tr>
<td align="left"><bold>Education</bold></td>
<td align="right">0.035</td>
<td align="right">0.164</td>
<td align="right">&#8211;<bold>0.488*</bold></td>
<td align="right"><bold>0.154</bold></td>
</tr>
<tr>
<td align="left">Tax</td>
<td align="right">0.357*</td>
<td align="right">0.174</td>
<td align="right">0.471*</td>
<td align="right">0.151</td>
</tr>
<tr>
<td align="left"><bold>Military</bold></td>
<td align="right">0.123</td>
<td align="right">0.085</td>
<td align="right"><bold>0.299*</bold></td>
<td align="right"><bold>0.109</bold></td>
</tr>
<tr>
<td align="left">Number of obs.</td>
<td align="right"></td>
<td align="right">86</td>
<td align="right"></td>
<td align="right">228</td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p><bold>Note</bold>: *significant at the 5% error level. Coef. stands for coefficient. Std. Err. stands for standard error. Since the distributions of these variables are skewed to the right (log-normal), the variables are log-transformed to normalize the distributions.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec sec-type="methods">
<title>4 Assumptions of Imputation Methods</title>
<p>Missing data analyses always involve assumptions (<xref ref-type="bibr" rid="B41">Raghunathan 2016: 12</xref>). In order to judge the appropriateness of missing data methods, it is vital to comprehend the assumptions for the methods. Imputation involves the following four assumptions. These assumptions will play important roles in simulation studies (Section 8).</p>
<sec>
<title>4.1 Assumptions of Missing Data Mechanisms</title>
<p>There are three common assumptions of missing data mechanisms in the literature (<xref ref-type="bibr" rid="B32">King et al. 2001: 50&#8211;51</xref>; <xref ref-type="bibr" rid="B38">Little and Rubin 2002</xref>; <xref ref-type="bibr" rid="B8">Carpenter and Kenward 2013: 10&#8211;21</xref>). The first assumption is Missing Completely At Random (MCAR), which is <italic>Pr</italic>(<italic>R</italic>&#124;<italic>D</italic>) = <italic>Pr</italic>(<italic>R</italic>). If respondents are selected to answer their income values by throwing dice, this is an example of MCAR. The second assumption is Missing At Random (MAR), which is <italic>Pr</italic>(<italic>R</italic>&#124;<italic>D</italic>) = <italic>Pr</italic>(<italic>R</italic>&#124;<italic>Y<sub>obs</sub></italic>). If older respondents are more likely to refuse to answer their income values and if the ages of the respondents are available in the data, this is an example of MAR. The third assumption is Not Missing At Random (NMAR), which is <italic>Pr</italic>(<italic>R</italic>&#124;<italic>D</italic>) &#8800; <italic>Pr</italic>(<italic>R&#124;Y<sub>obs</sub></italic>). If respondents with higher values of incomes are more likely to refuse to answer their income values and if the other variables in the data cannot be used to predict which respondents have high amounts of income, this is an example of NMAR.</p>
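<p>The MAR mechanism in the income example above can be mimicked in a short Python simulation; the variables, sample size, and logistic response model below are illustrative assumptions.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# y1 (e.g., age) is fully observed; y2 (e.g., income) is correlated with y1.
y1 = rng.normal(50, 10, n)
y2 = 0.5 * y1 + rng.normal(0, 5, n)

# MAR: the probability that y2 is missing depends only on the observed y1
# (older respondents refuse more often), not on y2 itself.
pr_missing = 1 / (1 + np.exp(-(y1 - 50) / 10))
r2 = rng.random(n) > pr_missing          # True -> observed, False -> missing
y2_obs = np.where(r2, y2, np.nan)

# Under this MAR mechanism, the complete cases are a biased sample of y2:
# respondents with low y1 (hence low y2) are over-represented.
```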
</sec>
<sec>
<title>4.2 Assumption of Ignorability</title>
<p>Strictly speaking, the missing data mechanism is ignorable if both of the following conditions are satisfied: (1) the MAR condition; and (2) the distinctness condition, which stipulates that the parameters in the missing data mechanism are independent of the parameters in the data model (<xref ref-type="bibr" rid="B43">Schafer 1997: 11</xref>).</p>
<p>However, the MAR condition is said to be more relevant in real data applications (<xref ref-type="bibr" rid="B3">Allison 2002: 5</xref>; <xref ref-type="bibr" rid="B56">van Buuren 2012: 33</xref>). Thus, for all practical purposes, NMAR is Non-Ignorable (NI). The current study assumes that the missing data mechanism is MAR and thus ignorable.</p>
</sec>
<sec>
<title>4.3 Assumption of Proper Imputation</title>
<p>Imputation is said to be Bayesianly proper if imputed values are independent realizations of <italic>Pr</italic>(<italic>Y<sub>mis</sub></italic>&#124;<italic>Y<sub>obs</sub></italic>), which means that successive iterates of <italic>Y<sub>mis</sub></italic> cannot be used because of the correlations between them (<xref ref-type="bibr" rid="B43">Schafer 1997: 105&#8211;106</xref>). Between-imputation convergence relies on a number of factors, but the fractions of missing information are one of the most influential factors (<xref ref-type="bibr" rid="B43">Schafer 1997: 84</xref>; <xref ref-type="bibr" rid="B56">van Buuren 2012: 113</xref>).</p>
<p>van Buuren (<xref ref-type="bibr" rid="B56">2012: 39</xref>) introduces a slightly simplified version of proper imputation, which he calls confidence proper. Let <inline-formula>
<alternatives>
<mml:math id="Eq001-mml">
<mml:mrow>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mtext>&#x00A0;</mml:mtext>
</mml:mrow>
</mml:math>
<tex-math id="M1">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\bar \theta {\rm{}}
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e1.gif"/>
</alternatives>
</inline-formula> be the multiple imputation estimate, <inline-formula>
<alternatives>
<mml:math id="Eq002-mml">
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
</mml:math>
<tex-math id="M2">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\hat \theta
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e2.gif"/>
</alternatives>
</inline-formula> be the estimate based on the hypothetically complete data, <inline-formula>
<alternatives>
<mml:math id="Eq003-mml">
<mml:mover accent='true'>
<mml:mi>V</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
</mml:math>
<tex-math id="M3">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\bar V
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e3.gif"/>
</alternatives>
</inline-formula> be the estimate of the sampling variance of the estimate based on the hypothetically complete data, and <inline-formula>
<alternatives>
<mml:math id="Eq004-mml">
<mml:mover accent='true'>
<mml:mi>V</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
</mml:math>
<tex-math id="M4">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\hat V
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e4.gif"/>
</alternatives>
</inline-formula> be the sampling variance estimate based on the hypothetically complete data. An imputation procedure is said to be confidence proper if all of the following three conditions are satisfied: (1) <inline-formula>
<alternatives>
<mml:math id="Eq005-mml">
<mml:mrow>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mtext>&#x00A0;</mml:mtext>
</mml:mrow>
</mml:math>
<tex-math id="M5">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\bar \theta {\rm{}}
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e1.gif"/>
</alternatives>
</inline-formula> is equal to <inline-formula>
<alternatives>
<mml:math id="Eq006-mml">
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
</mml:math>
<tex-math id="M6">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\hat \theta
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e2.gif"/>
</alternatives>
</inline-formula> when averaged over the response indicators sampled under the assumed response model; (2) <inline-formula>
<alternatives>
<mml:math id="Eq007-mml">
<mml:mover accent='true'>
<mml:mi>V</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
</mml:math>
<tex-math id="M7">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\bar V
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e3.gif"/>
</alternatives>
</inline-formula> is equal to <inline-formula>
<alternatives>
<mml:math id="Eq008-mml">
<mml:mover accent='true'>
<mml:mi>V</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
</mml:math>
<tex-math id="M8">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\hat V
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e4.gif"/>
</alternatives>
</inline-formula> when averaged over the response indicators sampled under the assumed response model; and (3) the extra inferential uncertainty due to missingness is correctly reflected. In order to check whether an imputation method is confidence proper, van Buuren (<xref ref-type="bibr" rid="B56">2012: 47</xref>) recommends using bias, coverage, and confidence interval length as the evaluation criteria (see Section 8.2).</p>
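<p>As an illustration of these three evaluation criteria, the following Python sketch computes bias, coverage, and average confidence interval length for a complete-data estimator of a normal mean over repeated samples; the true parameter, sample size, and number of replications are illustrative assumptions. A confidence proper imputation procedure should attain comparable coverage on imputed data.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
true_mu, n, reps = 10.0, 200, 2000

biases, lengths, covered = [], [], 0
for _ in range(reps):
    y = rng.normal(true_mu, 2.0, n)
    est = y.mean()
    se = y.std(ddof=1) / np.sqrt(n)
    lo, hi = est - 1.96 * se, est + 1.96 * se
    biases.append(est - true_mu)
    lengths.append(hi - lo)
    covered += (lo <= true_mu <= hi)

bias = np.mean(biases)           # should be near 0
coverage = covered / reps        # should be near the nominal 0.95
avg_length = np.mean(lengths)    # shorter is better at nominal coverage
```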
</sec>
<sec>
<title>4.4 Assumption of Congeniality</title>
<p>Congeniality means that the imputation model is equal to the substantive analysis model. It is widely known that the imputation model can be larger than the substantive analysis model, but the imputation model cannot be smaller than the substantive analysis model (<xref ref-type="bibr" rid="B17">Enders 2010: 227&#8211;229</xref>; <xref ref-type="bibr" rid="B8">Carpenter and Kenward 2013: 64&#8211;65</xref>; <xref ref-type="bibr" rid="B41">Raghunathan 2016: 175&#8211;177</xref>).</p>
</sec>
</sec>
<sec sec-type="methods">
<title>5 Traditional Methods of Handling Missing Data</title>
<p>This section introduces listwise deletion, deterministic single imputation, and stochastic single imputation, which are used as baseline methods for comparisons in Section 8.</p>
<p>Listwise deletion (LD), also known as complete-case analysis, throws away any rows that have at least one missing value (<xref ref-type="bibr" rid="B3">Allison 2002: 6&#8211;8</xref>; <xref ref-type="bibr" rid="B4">Baraldi and Enders 2010: 10</xref>). Although it is simple and convenient, LD is less efficient due to the reduced sample size and may be biased if the assumption of MCAR does not hold (<xref ref-type="bibr" rid="B43">Schafer 1997: 23</xref>).</p>
<p>Deterministic single imputation (D-SI) replaces a missing value with a reasonable guess. The most straightforward version calculates predicted scores for missing values based on a regression model (<xref ref-type="bibr" rid="B3">Allison 2002: 11</xref>; <xref ref-type="bibr" rid="B4">Baraldi and Enders 2010: 12</xref>). If the goal of analysis is to estimate the mean of an incomplete variable, D-SI produces an unbiased estimate under the assumptions of MCAR and MAR. However, D-SI tends to underestimate the variation in imputed data (<xref ref-type="bibr" rid="B14">de Waal, Pannekoek, and Scholtus 2011: 231</xref>). D-SI is available as R-function <monospace>norm.predict</monospace> in MICE (<xref ref-type="bibr" rid="B56">van Buuren 2012: 57</xref>), where MICE stands for Multivariate Imputation by Chained Equations.</p>
<p>Stochastic single imputation (S-SI) also utilizes a regression model to predict missing values, but it adds random components drawn from the residual distribution to the imputed values (<xref ref-type="bibr" rid="B4">Baraldi and Enders 2010: 13</xref>). S-SI is likely to recover the variation of an incomplete variable under the assumptions of MCAR and MAR, thus compensating for the disadvantage of D-SI (<xref ref-type="bibr" rid="B14">de Waal, Pannekoek, and Scholtus 2011: 231</xref>). S-SI is available as R-function <monospace>norm.nob</monospace> in MICE (<xref ref-type="bibr" rid="B56">van Buuren 2012: 57</xref>).</p>
<p>However, both D-SI and S-SI tend to underestimate the standard error in imputed data because imputed values are treated as if they were real (<xref ref-type="bibr" rid="B41">Raghunathan 2016: 77</xref>).</p>
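<p>The contrast between LD, D-SI, and S-SI can be sketched in Python; the bivariate regression setup and MAR mechanism below are illustrative assumptions, not the simulation design of Section 8.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000

# Bivariate data; y is MAR given the fully observed x.
x = rng.normal(0, 1, n)
y = 2.0 + 1.5 * x + rng.normal(0, 1, n)
miss = rng.random(n) < 1 / (1 + np.exp(-x))   # higher x -> more missing
y_incomplete = np.where(miss, np.nan, y)
obs = ~np.isnan(y_incomplete)

# Listwise deletion: keep complete rows only (biased here, since
# missingness depends on x, which is correlated with y).
ld_mean = y_incomplete[obs].mean()

# Regression of y on x fitted to the complete cases.
b1, b0 = np.polyfit(x[obs], y_incomplete[obs], 1)
resid_sd = np.std(y_incomplete[obs] - (b0 + b1 * x[obs]), ddof=2)

# D-SI: replace each missing y with its predicted score
# (mean roughly right, but the variation is understated).
y_dsi = np.where(obs, y_incomplete, b0 + b1 * x)

# S-SI: add a random residual draw to each predicted score,
# recovering the variation of the incomplete variable.
y_ssi = np.where(obs, y_incomplete,
                 b0 + b1 * x + rng.normal(0, resid_sd, n))
```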
</sec>
<sec>
<title>6 Competing Multiple Imputation Algorithms</title>
<p>Multiple imputation was made widely known by Rubin (<xref ref-type="bibr" rid="B42">1987</xref>), and a concise history can be found in Scheuren (<xref ref-type="bibr" rid="B48">2005</xref>). In theory, multiple imputation replaces a missing value with <italic>M</italic> simulated values (<italic>M</italic> &gt; 1) independently and randomly drawn from the distribution of missing data. The variation among the <italic>M</italic> simulated values reflects uncertainty about missing data, thus making the standard error valid. In practice, missing data are by definition unobserved; therefore, the distribution of missing data is also unobserved. Instead, under the assumption of MAR (or MCAR), multiple imputation constructs the posterior predictive distribution of missing data, conditional on observed data. Then, a random draw is independently made from this posterior distribution (<xref ref-type="bibr" rid="B42">Rubin 1987: 75</xref>; <xref ref-type="bibr" rid="B32">King et al. 2001: 53&#8211;54</xref>; <xref ref-type="bibr" rid="B8">Carpenter and Kenward 2013: 38&#8211;39</xref>).</p>
<p>However, using analytical methods, it is not easy to randomly draw sufficient statistics from the posterior distribution (<xref ref-type="bibr" rid="B3">Allison 2002: 33</xref>; <xref ref-type="bibr" rid="B25">Honaker and King 2010: 564</xref>). To solve this problem, three computational algorithms have been proposed in the literature.</p>
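<p>Whichever algorithm generates the <italic>M</italic> completed datasets, the resulting completed-data estimates are combined with Rubin's (1987) rules. A minimal Python sketch for a scalar parameter follows; the estimates and variances passed in are hypothetical values for illustration.</p>

```python
import numpy as np

def pool(estimates, variances):
    """Rubin's rules: combine M completed-data estimates of a scalar
    parameter and their sampling variances into one inference."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    qbar = q.mean()                      # pooled point estimate
    ubar = u.mean()                      # within-imputation variance
    b = q.var(ddof=1)                    # between-imputation variance
    t = ubar + (1 + 1 / m) * b           # total variance
    return qbar, t

# Hypothetical estimates from M = 5 completed datasets:
qbar, t = pool([1.02, 0.98, 1.05, 0.97, 1.01],
               [0.04, 0.05, 0.04, 0.05, 0.04])
```

The between-imputation component <italic>b</italic> is what distinguishes a valid multiple imputation standard error from a single imputation one.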
<sec>
<title>6.1 Data Augmentation</title>
<p>The traditional algorithm of multiple imputation is the Data Augmentation (DA) algorithm, which is a Markov chain Monte Carlo (MCMC) technique (<xref ref-type="bibr" rid="B54">Takahashi and Ito 2014: 46&#8211;48</xref>). DA improves parameter estimates by repeated substitution conditional on the preceding value, forming a stochastic process called a Markov chain (<xref ref-type="bibr" rid="B20">Gill 2008: 379</xref>).</p>
<p>The DA algorithm works as follows (<xref ref-type="bibr" rid="B43">Schafer 1997: 72</xref>). Equation (1) is the imputation step that generates imputed values from the predictive distribution of missing values, given the observed values and the parameter values at iteration <italic>t</italic>. Equation (2) is the posterior step that generates parameter values from the posterior distribution, given the observed values and the imputed values at iteration <italic>t</italic> + 1.</p>
<disp-formula id="FD1">
<label>1</label>
<alternatives>
<mml:math id="Eq009-mml">
<mml:mrow>
<mml:msubsup>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">mis</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x223C;</mml:mo><mml:mtext mathvariant="italic">Pr</mml:mtext><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">mis</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>&#x007C;</mml:mo><mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">obs</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo><mml:msup>
<mml:mi>&#x03B8;</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
<tex-math id="M9">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
Y_{mis}^{\left( {t + 1} \right)} \sim Pr\left( {{Y_{mis}}|{Y_{obs}},{\theta ^{(t)}}} \right)
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e5.gif"/>
</alternatives>
</disp-formula>
<disp-formula id="FD2">
<label>2</label>
<alternatives>
<mml:math id="Eq010-mml">
<mml:mrow>
<mml:msup>
<mml:mi>&#x03B8;</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x223C;</mml:mo><mml:mtext mathvariant="italic">Pr</mml:mtext><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>&#x007C;</mml:mo><mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">obs</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo><mml:msubsup>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">mis</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
<tex-math id="M10">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{\theta ^{(t + 1)}} \sim Pr\left( {\theta |{Y_{obs}},Y_{mis}^{(t + 1)}} \right)
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e6.gif"/>
</alternatives>
</disp-formula>
<p>These two steps are repeated <italic>T</italic> times until convergence is attained. The convergence of MCMC is stochastic because the chain converges to a probability distribution rather than to a point (<xref ref-type="bibr" rid="B43">Schafer 1997: 80</xref>). Therefore, it is hard to judge convergence in MCMC.</p>
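The two-step iteration can be sketched in code. The following is a minimal illustration for a univariate normal model, not the article's own software (the article uses R-Package NORM2); it is written in Python with illustrative variable names. The I-step draws the missing values from N(&#x03BC;, &#x03C3;&#xB2;) given the current parameters, and the P-step draws (&#x03BC;, &#x03C3;&#xB2;) from their posterior under the standard noninformative prior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: a univariate normal sample with roughly 30% MCAR missingness.
n = 500
y = rng.normal(10.0, 2.0, n)
miss = rng.random(n) < 0.3
y_obs = y[~miss]
n_mis = int(miss.sum())

# Starting values from the observed data.
mu, sigma2 = y_obs.mean(), y_obs.var()

T = 200
for t in range(T):
    # I-step: draw Y_mis from Pr(Y_mis | Y_obs, theta^(t)).
    y_mis = rng.normal(mu, np.sqrt(sigma2), n_mis)
    y_comp = np.concatenate([y_obs, y_mis])
    # P-step: draw theta^(t+1) from Pr(theta | Y_obs, Y_mis^(t+1))
    # under the noninformative prior: sigma2 ~ (n-1)s^2 / chi^2_{n-1},
    # then mu | sigma2 ~ N(ybar, sigma2 / n).
    df = n - 1
    sigma2 = df * y_comp.var(ddof=1) / rng.chisquare(df)
    mu = rng.normal(y_comp.mean(), np.sqrt(sigma2 / n))

print(round(mu, 2), round(sigma2, 2))
```

After convergence, the final draw of the completed data <italic>y_comp</italic> is one imputed data set; running <italic>M</italic> such chains yields <italic>M</italic> imputations.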
<p>There are two ways of generating multiple imputations by DA (<xref ref-type="bibr" rid="B43">Schafer 1997: 139</xref>; <xref ref-type="bibr" rid="B17">Enders 2010: 211&#8211;212</xref>). In the first method, a single chain is run for <italic>M</italic> &#215; <italic>T</italic> iterations, taking every <italic>T</italic>-th iteration of <italic>Y<sub>mis</sub></italic>. In the second method, <italic>M</italic> parallel chains of length <italic>T</italic> are run, and the final values of <italic>Y<sub>mis</sub></italic> from the <italic>M</italic> chains are taken as the imputations. The current study adopts the second method.</p>
<p>The software implementing this algorithm is R-Package NORM2, which was originally developed by Schafer (<xref ref-type="bibr" rid="B43">1997</xref>) and is currently maintained by Schafer (<xref ref-type="bibr" rid="B44">2016</xref>).</p>
</sec>
<sec>
<title>6.2 Fully Conditional Specification</title>
<p>An alternative algorithm to DA is the Fully Conditional Specification (FCS) algorithm, which specifies the multivariate distribution by way of a series of conditional densities, through which missing values are imputed given the other variables (<xref ref-type="bibr" rid="B54">Takahashi and Ito 2014: 50&#8211;53</xref>).</p>
<p>The FCS algorithm works as follows (<xref ref-type="bibr" rid="B57">van Buuren and Groothuis-Oudshoorn 2011: 6&#8211;7</xref>; <xref ref-type="bibr" rid="B56">van Buuren 2012: 110</xref>; <xref ref-type="bibr" rid="B60">Zhu and Raghunathan 2015</xref>). Equation (3) draws the unknown parameters of the imputation model, given the observed values and the <italic>t</italic>-th imputations, where <inline-formula>
<alternatives>
<mml:math id="Eq011-mml">
<mml:mrow>
<mml:msubsup>
<mml:mover accent='true'>
<mml:mi>Y</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mover accent='true'>
<mml:mi>Y</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup>
<mml:mover accent='true'>
<mml:mi>Y</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>j</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo><mml:msubsup>
<mml:mover accent='true'>
<mml:mi>Y</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>j</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup>
<mml:mover accent='true'>
<mml:mi>Y</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
<tex-math id="M11">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\tilde Y_{ - j}^{\left( t \right)} = \left( {\tilde Y_1^{\left( t \right)}, \ldots ,\tilde Y_{j - 1}^{\left( t \right)},\tilde Y_{j + 1}^{\left( {t - 1} \right)}, \ldots ,\tilde Y_p^{\left( {t - 1} \right)}} \right)
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e7.gif"/>
</alternatives>
</inline-formula>, where tilde denotes a random draw. Equation (4) draws imputations, given the observed values, the <italic>t</italic>-th imputations, and the <italic>t</italic>-th parameter estimates. These two steps are repeated for <italic>j</italic> = 1, &#8230;, <italic>p</italic>.</p>
<disp-formula id="FD3">
<label>3</label>
<alternatives>
<mml:math id="Eq012-mml">
<mml:mrow>
<mml:msubsup>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x223C;</mml:mo><mml:mtext mathvariant="italic">Pr</mml:mtext><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x03B8;</mml:mi>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x007C;</mml:mo><mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mtext mathvariant="italic">obs</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo><mml:msubsup>
<mml:mover accent='true'>
<mml:mi>Y</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
<tex-math id="M12">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\tilde \theta _j^{(t)} \sim Pr\left( {\theta _j^{(t)}|{Y_{j,obs}},\tilde Y_{ - j}^{\left( t \right)}} \right)
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e8.gif"/>
</alternatives>
</disp-formula>
<disp-formula id="FD4">
<label>4</label>
<alternatives>
<mml:math id="Eq013-mml">
<mml:mrow>
<mml:msubsup>
<mml:mover accent='true'>
<mml:mi>Y</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x223C;</mml:mo><mml:mtext mathvariant="italic">Pr</mml:mtext><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mtext mathvariant="italic">mis</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>&#x007C;</mml:mo><mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mtext mathvariant="italic">obs</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo><mml:msubsup>
<mml:mover accent='true'>
<mml:mi>Y</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo><mml:msubsup>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x02DC;</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
<tex-math id="M13">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\tilde Y_j^{\left( t \right)} \sim Pr\left( {{Y_{j,mis}}|{Y_{j,obs}},\tilde Y_{ - j}^{\left( t \right)},\tilde \theta _j^{(t)}} \right)
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e9.gif"/>
</alternatives>
</disp-formula>
<p>The entire process is repeated for <italic>t</italic> = 1,&#8230;, <italic>T</italic> until convergence is attained. FCS can be considered an MCMC method because, under compatible conditionals, FCS is a Gibbs sampler (<xref ref-type="bibr" rid="B57">van Buuren and Groothuis-Oudshoorn 2011: 6</xref>; <xref ref-type="bibr" rid="B56">van Buuren 2012: 109</xref>). This means that the convergence of FCS is stochastic, and it is therefore hard to judge convergence in FCS.</p>
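The variable-by-variable loop can be sketched as follows. This is a simplified Python illustration (the article's own software is R-Package MICE), with illustrative names and two variables; for brevity it fits each conditional regression by least squares and omits the posterior draw of the parameters in Equation (3), which a full FCS step would include.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two correlated variables, each with roughly 20% MCAR missingness.
n = 400
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(scale=0.6, size=n)
data = np.column_stack([x, y])
mask = rng.random((n, 2)) < 0.2            # True where a cell is missing

# Start from mean imputation of each variable.
imp = data.copy()
for j in range(2):
    imp[mask[:, j], j] = data[~mask[:, j], j].mean()

T = 20
for t in range(T):
    for j in range(2):                      # visit each variable in turn
        other = imp[:, 1 - j]               # the other, currently imputed, variable
        obs = ~mask[:, j]
        # Fit Y_j on the other variable using the observed rows of Y_j,
        # then draw the missing cells from the fitted line plus noise.
        X = np.column_stack([np.ones(n), other])
        beta, *_ = np.linalg.lstsq(X[obs], data[obs, j], rcond=None)
        sigma = (data[obs, j] - X[obs] @ beta).std(ddof=2)
        m = mask[:, j]
        imp[m, j] = X[m] @ beta + rng.normal(0, sigma, int(m.sum()))

corr = float(np.corrcoef(imp[:, 0], imp[:, 1])[0, 1])
print(round(corr, 2))
```

Each pass over <italic>j</italic> = 1, &#8230;, <italic>p</italic> corresponds to one between-imputation iteration <italic>t</italic>; the imputed correlation should stay close to the complete-data correlation of about 0.8.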
<p>The software implementing this algorithm is R-Package MICE (<xref ref-type="bibr" rid="B57">van Buuren and Groothuis-Oudshoorn 2011</xref>), which stands for Multivariate Imputation by Chained Equations and is currently maintained by van Buuren et al. (<xref ref-type="bibr" rid="B58">2015</xref>). The FCS algorithm is also known as Sequential Regression Multivariate Imputation (<xref ref-type="bibr" rid="B41">Raghunathan 2016: 76</xref>).</p>
</sec>
<sec>
<title>6.3 Expectation-Maximization with Bootstrapping</title>
<p>Another emerging algorithm is the Expectation-Maximization with Bootstrapping (EMB) algorithm, which combines the Expectation-Maximization (EM) algorithm with the nonparametric bootstrap to create multiple imputations (<xref ref-type="bibr" rid="B54">Takahashi and Ito 2014: 55&#8211;57</xref>).</p>
<p>The EMB algorithm works as follows (<xref ref-type="bibr" rid="B25">Honaker and King 2010: 565</xref>; <xref ref-type="bibr" rid="B26">Honaker, King, and Blackwell 2011: 4</xref>). Suppose that a random sample of size <italic>n</italic> is drawn from a population, where some values are missing in the sample. Bootstrap resamples of size <italic>n</italic> are randomly drawn from the sample data with replacement <italic>M</italic> times (<xref ref-type="bibr" rid="B28">Horowitz 2001: 3163&#8211;3165</xref>; <xref ref-type="bibr" rid="B9">Carsey and Harden 2014: 215</xref>). The variation among the <italic>M</italic> resamples represents estimation uncertainty. The EM algorithm is applied to each of the <italic>M</italic> bootstrap resamples to obtain <italic>M</italic> point estimates of the parameter <italic>&#952;</italic>. Equation (5) is the expectation step, which computes the Q-function by averaging the complete-data log-likelihood over the predictive distribution of the missing data. Equation (6) is the maximization step, which finds the parameter values at iteration <italic>t</italic> + 1 by maximizing the Q-function.</p>
<disp-formula id="FD5">
<label>5</label>
<alternatives>
<mml:math id="Eq014-mml">
<mml:mrow>
<mml:mi>Q</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>&#x007C;</mml:mo><mml:msup>
<mml:mi>&#x03B8;</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup>
<mml:mstyle mathsize='140%' displaystyle='true'><mml:mo>&#x222B;</mml:mo></mml:mstyle>
<mml:mtext>&#x200B;</mml:mtext>
</mml:msup>
<mml:mi>l</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>&#x007C;</mml:mo><mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mtext mathvariant="italic">Pr</mml:mtext><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">mis</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>&#x007C;</mml:mo><mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">obs</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo><mml:msup>
<mml:mi>&#x03B8;</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mi>d</mml:mi><mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">mis</mml:mtext>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
<tex-math id="M14">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
Q\left( {\theta |{\theta ^{(t)}}} \right) = \int {l\left( {\theta |Y} \right)Pr\left( {{Y_{mis}}|{Y_{obs}},{\theta ^{(t)}}} \right)d{Y_{mis}}}
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e10.gif"/>
</alternatives>
</disp-formula>
<disp-formula id="FD6">
<label>6</label>
<alternatives>
<mml:math id="Eq015-mml">
<mml:mrow>
<mml:msup>
<mml:mi>&#x03B8;</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo><mml:mtext>arg</mml:mtext><mml:munder>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
<mml:mi>&#x03B8;</mml:mi>
</mml:munder>
<mml:mi>Q</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>&#x007C;</mml:mo><mml:msup>
<mml:mi>&#x03B8;</mml:mi>
<mml:mrow>
<mml:mo stretchy='false'>(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
<tex-math id="M15">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{\theta ^{(t + 1)}} = {\rm{arg}}\mathop {\max }\limits_\theta Q\left( {\theta |{\theta ^{(t)}}} \right)
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e11.gif"/>
</alternatives>
</disp-formula>
<p>These two steps are repeated until convergence is attained, where the converged value is a Maximum Likelihood Estimate (MLE) under well-behaved conditions (<xref ref-type="bibr" rid="B43">Schafer 1997: 38&#8211;39</xref>; <xref ref-type="bibr" rid="B15">Do and Batzoglou 2008</xref>). The convergence of EM is deterministic because it converges to a point in the parameter space (<xref ref-type="bibr" rid="B43">Schafer 1997: 80</xref>). Therefore, it is straightforward to judge convergence in EM. Drawing MLEs from bootstrap resamples is asymptotically equivalent to drawing parameter values from the posterior distribution (<xref ref-type="bibr" rid="B38">Little and Rubin 2002: 216&#8211;217</xref>).</p>
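The bootstrap wrapper around EM can be sketched as follows. This is a minimal Python illustration (the article's own software is R-Package AMELIA II), using a univariate normal model in which the EM fixed point coincides with the observed-data MLE, so the sketch focuses on the bootstrap structure rather than the E- and M-steps themselves; names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Incomplete sample: univariate normal with roughly 25% MCAR missingness.
n = 300
y = rng.normal(5.0, 1.5, n)
y[rng.random(n) < 0.25] = np.nan

M = 20
imputations = []
for m in range(M):
    # Bootstrap resample of size n, drawn with replacement.
    boot = rng.choice(y, size=n, replace=True)
    obs = boot[~np.isnan(boot)]
    # EM step: for this simple model the EM fixed point is the
    # observed-data MLE of (mu, sigma) on the resample.
    mu_hat, sigma_hat = obs.mean(), obs.std()
    # Impute the original sample's missing cells from the fitted model;
    # parameter uncertainty comes from the variation across resamples.
    filled = y.copy()
    mis = np.isnan(y)
    filled[mis] = rng.normal(mu_hat, sigma_hat, int(mis.sum()))
    imputations.append(filled)

mi_mean = float(np.mean([imp.mean() for imp in imputations]))
print(round(mi_mean, 2))   # MI point estimate of the mean
```

Because each resample gets its own (&#x03BC;&#x0302;, &#x03C3;&#x0302;), the <italic>M</italic> completed data sets reflect estimation uncertainty without any between-imputation iterations.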
<p>The software implementing this algorithm is R-Package AMELIA II (<xref ref-type="bibr" rid="B26">Honaker, King, and Blackwell 2011</xref>), which was originally developed by King et al. (<xref ref-type="bibr" rid="B32">2001</xref>) and is currently maintained by Honaker, King, and Blackwell (<xref ref-type="bibr" rid="B27">2016</xref>).</p>
</sec>
<sec>
<title>6.4 Relationships among the Three Algorithms</title>
<p>The three algorithms share certain characteristics, but they are not identical, as summarized in Table <xref ref-type="table" rid="T3">3</xref>.</p>
<table-wrap id="T3">
<label>Table 3</label>
<caption>
<p>Relations among DA, EMB, and FCS.</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th align="left">Joint Modeling</th>
<th align="left">Conditional Modeling</th>
</tr>
<tr>
<td colspan="3"><hr/></td>
</tr>
<tr>
<td align="left">MCMC</td>
<td align="left">DA</td>
<td align="left">FCS</td>
</tr>
<tr>
<td align="left">Non-MCMC</td>
<td align="left">EMB</td>
<td align="left"></td>
</tr>
</table>
</table-wrap>
<p>DA and EMB are joint-modeling approaches, while FCS is a conditional-modeling approach (<xref ref-type="bibr" rid="B33">Kropko et al. 2014</xref>). Joint modeling specifies a multivariate distribution of the missing data, while conditional modeling specifies a univariate distribution on a variable-by-variable basis (<xref ref-type="bibr" rid="B56">van Buuren 2012: 105&#8211;108</xref>). Conditional modeling is more flexible, while joint modeling is computationally more efficient (<xref ref-type="bibr" rid="B56">van Buuren 2012: 117</xref>; <xref ref-type="bibr" rid="B33">Kropko et al. 2014</xref>).</p>
<p>DA and FCS are different versions of MCMC techniques, whereas EMB is not an MCMC technique. DA and FCS are said to require between-imputation iterations to be confidence proper (<xref ref-type="bibr" rid="B43">Schafer 1997: 106</xref>; <xref ref-type="bibr" rid="B56">van Buuren 2012: 113</xref>), while EMB needs no iterations to be confidence proper (<xref ref-type="bibr" rid="B25">Honaker and King 2010: 565</xref>). However, as Section 7 makes clear, whether EMB is confidence proper in situations where DA and FCS are improper is an open question that has not been tested in the literature.</p>
</sec>
</sec>
<sec>
<title>7 Comparative Studies on Multiple Imputation in the Literature</title>
<p>Table <xref ref-type="table" rid="T4">4</xref> presents the literature that compared imputation methods. Nine studies compared multiple imputation with other missing data methods, such as listwise deletion, single imputation, and maximum likelihood. Among these nine studies, four studies focused on DA (<xref ref-type="bibr" rid="B45">Schafer and Graham 2002</xref>; <xref ref-type="bibr" rid="B1">Abe and Iwasaki 2007</xref>; <xref ref-type="bibr" rid="B35">Lee and Carlin 2012</xref>; <xref ref-type="bibr" rid="B59">von Hippel 2016</xref>), four studies on FCS (<xref ref-type="bibr" rid="B16">Donders et al. 2006</xref>; <xref ref-type="bibr" rid="B50">Stuart et al. 2009</xref>; <xref ref-type="bibr" rid="B11">Cheema 2014</xref>; <xref ref-type="bibr" rid="B13">Deng et al. 2016</xref>), and one study on an unknown algorithm (<xref ref-type="bibr" rid="B49">Shara et al. 2015</xref>).</p>
<table-wrap id="T4">
<label>Table 4</label>
<caption>
<p>Summary of the 20 Studies on Multiple Imputation.</p>
</caption>
<table>
<tr>
<th valign="top" align="left">Authors</th>
<th valign="top" align="left">MI Algorithms</th>
<th valign="top" align="center">Sample Size</th>
<th valign="top" align="center">Number of Variables</th>
<th valign="top" align="center">Number of Imputations</th>
<th valign="top" align="center">Number of Iterations</th>
<th valign="top" align="center">Missing Rate</th>
</tr>
<tr>
<td colspan="7"><hr/></td>
</tr>
<tr>
<td align="left">Barnard and Rubin (<xref ref-type="bibr" rid="B5">1999</xref>)</td>
<td align="left">DA</td>
<td align="right">10, 20, 30</td>
<td align="right">2</td>
<td align="right">3, 5, 10</td>
<td align="right">Unknown</td>
<td align="right">10%, 20%, 30%</td>
</tr>
<tr>
<td align="left">Horton and Lipsitz (<xref ref-type="bibr" rid="B30">2001</xref>)</td>
<td align="left">DA, FCS</td>
<td align="right">10000</td>
<td align="right">3</td>
<td align="right">10</td>
<td align="right">200</td>
<td align="right">50%</td>
</tr>
<tr>
<td align="left">Schafer and Graham (<xref ref-type="bibr" rid="B45">2002</xref>)</td>
<td align="left">DA</td>
<td align="right">50</td>
<td align="right">2</td>
<td align="right">20</td>
<td align="right">Unknown</td>
<td align="right">73%</td>
</tr>
<tr>
<td align="left">Donders et al. (<xref ref-type="bibr" rid="B16">2006</xref>)</td>
<td align="left">FCS</td>
<td align="right">500</td>
<td align="right">2</td>
<td align="right">10</td>
<td align="right">Unknown</td>
<td align="right">40%</td>
</tr>
<tr>
<td align="left">Abe and Iwasaki (<xref ref-type="bibr" rid="B1">2007</xref>)</td>
<td align="left">DA</td>
<td align="right">100</td>
<td align="right">4</td>
<td align="right">5</td>
<td align="right">100</td>
<td align="right">20%, 30%</td>
</tr>
<tr>
<td align="left"><bold>Horton and Kleinman (<xref ref-type="bibr" rid="B29">2007</xref>)</bold></td>
<td align="left"><bold>DA, EMB, FCS</bold></td>
<td align="right">133774</td>
<td align="right">10</td>
<td align="right">10</td>
<td align="right">5</td>
<td align="right">41%</td>
</tr>
<tr>
<td align="left">Stuart et al. (<xref ref-type="bibr" rid="B50">2009</xref>)</td>
<td align="left">FCS</td>
<td align="right">9186</td>
<td align="right">400</td>
<td align="right">10</td>
<td align="right">10</td>
<td align="right">18%</td>
</tr>
<tr>
<td align="left">Lee and Carlin (<xref ref-type="bibr" rid="B34">2010</xref>)</td>
<td align="left">DA, FCS</td>
<td align="right">1000</td>
<td align="right">8</td>
<td align="right">20</td>
<td align="right">10</td>
<td align="right">33%</td>
</tr>
<tr>
<td align="left">Leite and Beretvas (<xref ref-type="bibr" rid="B36">2010</xref>)</td>
<td align="left">DA</td>
<td align="right">400</td>
<td align="right">10</td>
<td align="right">10</td>
<td align="right">Unknown</td>
<td align="right">10%, 30%, 50%</td>
</tr>
<tr>
<td align="left"><bold>Hardt, Herke, and Leonhart (<xref ref-type="bibr" rid="B24">2012</xref>)</bold></td>
<td align="left"><bold>DA, EMB, FCS</bold></td>
<td align="right">50, 100, 200</td>
<td align="right">3, 13, 23, 43, 83</td>
<td align="right">20</td>
<td align="right"><bold>Unknown</bold></td>
<td align="right">20%, 50%</td>
</tr>
<tr>
<td align="left">Lee and Carlin (<xref ref-type="bibr" rid="B35">2012</xref>)</td>
<td align="left">DA</td>
<td align="right">1000</td>
<td align="right">8</td>
<td align="right">20</td>
<td align="right">Unknown</td>
<td align="right">10%, 25%, 50%, 75%, 90%</td>
</tr>
<tr>
<td align="left">Cranmer and Gill (<xref ref-type="bibr" rid="B12">2013</xref>)</td>
<td align="left">EMB, MHD</td>
<td align="right">500</td>
<td align="right">5</td>
<td align="right">Unknown</td>
<td align="right">NA</td>
<td align="right">20%, 50%, 80%</td>
</tr>
<tr>
<td align="left">Cheema (<xref ref-type="bibr" rid="B11">2014</xref>)</td>
<td align="left">FCS</td>
<td align="right">10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000</td>
<td align="right">4</td>
<td align="right">Unknown</td>
<td align="right">Unknown</td>
<td align="right">1%, 2%, 5%, 10%, 20%</td>
</tr>
<tr>
<td align="left"><bold>Kropko et al. (<xref ref-type="bibr" rid="B33">2014</xref>)</bold></td>
<td align="left"><bold>DA, EMB, FCS</bold></td>
<td align="right">1000</td>
<td align="right">8</td>
<td align="right">5</td>
<td align="right"><bold>30</bold></td>
<td align="right">25%</td>
</tr>
<tr>
<td align="left">Shara et al. (<xref ref-type="bibr" rid="B49">2015</xref>)</td>
<td align="left">Unknown</td>
<td align="right">2246</td>
<td align="right">8</td>
<td align="right">Unknown</td>
<td align="right">Unknown</td>
<td align="right">20%, 30%, 40%</td>
</tr>
<tr>
<td align="left">Deng et al. (<xref ref-type="bibr" rid="B13">2016</xref>)</td>
<td align="left">FCS</td>
<td align="right">100</td>
<td align="right">200, 1000</td>
<td align="right">10</td>
<td align="right">20</td>
<td align="right">40%</td>
</tr>
<tr>
<td align="left">von Hippel (<xref ref-type="bibr" rid="B59">2016</xref>)</td>
<td align="left">DA</td>
<td align="right">25, 100</td>
<td align="right">2</td>
<td align="right">5</td>
<td align="right">Unknown</td>
<td align="right">50%</td>
</tr>
<tr>
<td align="left">Hughes, Sterne, and Tilling (<xref ref-type="bibr" rid="B31">2016</xref>)</td>
<td align="left">Unknown</td>
<td align="right">100, 1000</td>
<td align="right">5</td>
<td align="right">50</td>
<td align="right">Unknown</td>
<td align="right">40%, 60%</td>
</tr>
<tr>
<td align="left">McNeish (<xref ref-type="bibr" rid="B39">2017</xref>)</td>
<td align="left">DA, FCS</td>
<td align="right">20, 50, 100, 250</td>
<td align="right">4</td>
<td align="right">5, 25, 100</td>
<td align="right">Unknown</td>
<td align="right">10%, 20%, 30%, 50%</td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p><bold>Note</bold>: DA stands for Data Augmentation, EMis for Expectation-Maximization with Importance Sampling, FCS for Fully Conditional Specification, EMB for Expectation-Maximization with Bootstrapping, and MHD for Multiple Hot Deck. Unknown means that the information is unavailable. NA means not applicable.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>Four studies investigated specialized situations for multiple imputation, such as small-sample degrees of freedom in DA (<xref ref-type="bibr" rid="B5">Barnard and Rubin 1999</xref>), Likert-scale data in DA (<xref ref-type="bibr" rid="B36">Leite and Beretvas 2010</xref>), non-parametric multiple imputation (<xref ref-type="bibr" rid="B12">Cranmer and Gill 2013</xref>), and variance estimators (<xref ref-type="bibr" rid="B31">Hughes, Sterne, and Tilling 2016</xref>).</p>
<p>Seven studies compared different multiple imputation algorithms (<xref ref-type="bibr" rid="B32">King et al. 2001</xref>; <xref ref-type="bibr" rid="B30">Horton and Lipsitz 2001</xref>; <xref ref-type="bibr" rid="B29">Horton and Kleinman 2007</xref>; <xref ref-type="bibr" rid="B34">Lee and Carlin 2010</xref>; <xref ref-type="bibr" rid="B24">Hardt, Herke, and Leonhart 2012</xref>; <xref ref-type="bibr" rid="B33">Kropko et al. 2014</xref>; <xref ref-type="bibr" rid="B39">McNeish 2017</xref>). The comparative perspective in most of these studies, except King et al. (<xref ref-type="bibr" rid="B32">2001</xref>), is the difference between joint modeling and conditional modeling. Thus, the MCMC vs. non-MCMC perspective is generally lacking in the literature.</p>
<p>Ten studies did not explicitly state the number of iterations <italic>T</italic>. Furthermore, Horton and Kleinman (<xref ref-type="bibr" rid="B29">2007</xref>) used the software's default setting for <italic>T</italic>, and the information in Kropko et al. (<xref ref-type="bibr" rid="B33">2014</xref>) can only be found in their computer code, not in the article.</p>
<p>Thus, no studies in Table <xref ref-type="table" rid="T4">4</xref> have systematically investigated the effects of convergence on the three multiple imputation algorithms.</p>
</sec>
<sec>
<title>8 Monte Carlo Simulation</title>
<p>Section 4 introduced MAR, proper imputation, and congeniality as crucial assumptions. To make the assumptions of MAR and congeniality realistic, an inclusive analysis strategy is recommended in the literature (<xref ref-type="bibr" rid="B17">Enders 2010: 16&#8211;17</xref>; <xref ref-type="bibr" rid="B41">Raghunathan 2016: 73</xref>), which includes any auxiliary variables that can increase the predictive power of the imputation model or that may be related to the missing data mechanism. What complicates the matter, however, is that auxiliary variables themselves are often incomplete. This creates a dilemma in multiple imputation: including many auxiliary variables makes it more likely that MAR and congeniality are satisfied, but including many incomplete variables raises the total missing rate, which in turn makes MCMC convergence more difficult to attain.</p>
<p>When assumptions do not hold in statistical methods, analytical mathematics often does not provide answers about the properties of the methods (<xref ref-type="bibr" rid="B40">Mooney 1997: 1</xref>). Monte Carlo simulation turns the computer into an experimental laboratory in which the researcher can control various conditions and observe the outcomes (<xref ref-type="bibr" rid="B9">Carsey and Harden 2014: 4</xref>). Thus, Monte Carlo simulation is a powerful method of assessing the performance of statistical methods under various settings, especially when assumptions are violated.</p>
<sec>
<title>8.1 Monte Carlo Simulation Designs</title>
<p>The current study prepares two versions of simulation data, (1) theoretical and (2) realistic. Auxiliary variables <italic>X</italic> are generated by R-Function <monospace>mvrnorm</monospace>. All of the computations are done in R version 3.2.4. The computer used in the current study is an HP Z440 Workstation (Windows 7 Professional; processor: Intel Xeon CPU E5-1603 v3) with a processor speed of 2.80 GHz and 32.0 GB of memory (RAM) under a 64-bit operating system. The number of Monte Carlo simulation runs is set to 1000.</p>
<p>The first setting is theoretical. The number of observations is 1000, which is equivalent to the 75<sup>th</sup> percentile of the sample sizes found in the studies listed in Table <xref ref-type="table" rid="T4">4</xref>. The number of variables <italic>p</italic> is varied from 2 to 10, where 10 is equivalent to the 70<sup>th</sup> percentile of the number of variables found in the studies listed in Table <xref ref-type="table" rid="T4">4</xref>. Note that in another simulation run, not reported here, <italic>p</italic> was changed to 20, and the conclusions were similar. As was assumed in Section 2, auxiliary variables <italic>x<sub>j</sub></italic> are multivariate normal with mean 0 and standard deviation 1, i.e., <italic>X</italic> ~ <italic>N<sub>p</sub></italic><sub>&#8211;1</sub>(0, 1), where the number of auxiliary variables is <italic>p</italic> &#8211; 1. The correlation among <italic>x<sub>j</sub></italic> is randomly generated in R as follows: <monospace>r&lt;-matrix(runif(9^2,-1,1), ncol=9)</monospace> and <monospace>Cor&lt;-cov2cor(r%*%t(r))</monospace>. The generated correlation matrix is shown in equation (7). The <italic>p</italic>-th variable <italic>y<sub>i</sub></italic> is a linear combination of <italic>x<sub>j</sub></italic> such that <inline-formula>
<alternatives>
<mml:math id="Eq016-mml">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn><mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B5;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
<tex-math id="M16">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{y_i} = {\beta _0} + {\beta _1}{x_{1i}} + \ldots + {\beta _{p - 1}}{x_{p - 1i}} + {\varepsilon _i}
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e12.gif"/>
</alternatives>
</inline-formula>, where <italic>&#946;<sub>j</sub></italic> ~ <italic>U</italic>(&#8211;2.0, 2.0) and <italic>&#603;<sub>i</sub></italic> ~ <italic>N</italic>(0, <italic>&#963;</italic>). Note that <italic>&#946;<sub>j</sub></italic> includes <italic>&#946;</italic><sub>0</sub>, and that <italic>&#963;</italic> ~ <italic>U</italic>(0.5, 2.0).</p>
<disp-formula id="FD7">
<label>7</label>
<alternatives>
<mml:math id="Eq017-mml">
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="italic">Cor</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow>
<mml:mtable columnalign='right'>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.231</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mn>0.335</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.401</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.276</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.247</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.120</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.327</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.068</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.231</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.074</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.761</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.041</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.623</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.083</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.432</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.183</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mn>0.335</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.074</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.183</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.323</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.254</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.458</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.434</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.801</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.401</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.761</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.183</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.007</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.639</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.094</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.676</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.169</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.276</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.041</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.323</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.007</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.547</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.357</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.025</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.081</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.247</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.623</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.254</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.639</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.547</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.024</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.204</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.023</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.120</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.083</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.458</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.094</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.357</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.024</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.486</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.373</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mn>0.327</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.432</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.434</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.676</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.025</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.204</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.486</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.153</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.068</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.183</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.801</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.169</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.081</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.023</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.373</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.153</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow><mml:mo>]</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
<tex-math id="M17">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
Co{r_1} = \left[ {\begin{array}{*{20}{r}}
{1.000}&amp;{ - 0.231}&amp;{{\rm{}}0.335}&amp;{0.401}&amp;{ - 0.276}&amp;{0.247}&amp;{ - 0.120}&amp;{0.327}&amp;{ - 0.068}\\
{ - 0.231}&amp;{1.000}&amp;{0.074}&amp;{ - 0.761}&amp;{0.041}&amp;{ - 0.623}&amp;{ - 0.083}&amp;{ - 0.432}&amp;{ - 0.183}\\
{{\rm{}}0.335}&amp;{0.074}&amp;{1.000}&amp;{0.183}&amp;{ - 0.323}&amp;{0.254}&amp;{ - 0.458}&amp;{0.434}&amp;{ - 0.801}\\
{0.401}&amp;{ - 0.761}&amp;{0.183}&amp;{1.000}&amp;{0.007}&amp;{0.639}&amp;{ - 0.094}&amp;{0.676}&amp;{0.169}\\
{ - 0.276}&amp;{0.041}&amp;{ - 0.323}&amp;{0.007}&amp;{1.000}&amp;{ - 0.547}&amp;{0.357}&amp;{ - 0.025}&amp;{0.081}\\
{0.247}&amp;{ - 0.623}&amp;{0.254}&amp;{0.639}&amp;{ - 0.547}&amp;{1.000}&amp;{0.024}&amp;{0.204}&amp;{0.023}\\
{{\rm{}} - 0.120}&amp;{ - 0.083}&amp;{ - 0.458}&amp;{ - 0.094}&amp;{0.357}&amp;{0.024}&amp;{1.000}&amp;{ - 0.486}&amp;{0.373}\\
{{\rm{}}0.327}&amp;{ - 0.432}&amp;{0.434}&amp;{0.676}&amp;{{\rm{}} - 0.025}&amp;{0.204}&amp;{ - 0.486}&amp;{1.000}&amp;{ - 0.153}\\
{ - 0.068}&amp;{ - 0.183}&amp;{ - 0.801}&amp;{0.169}&amp;{0.081}&amp;{0.023}&amp;{0.373}&amp;{{\rm{}} - 0.153}&amp;{1.000}
\end{array}} \right]
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e13.gif"/>
</alternatives>
</disp-formula>
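<p>The theoretical data-generating process described above can be sketched as follows. This is an illustrative Python translation of the R procedure (<monospace>mvrnorm</monospace>, <monospace>cov2cor</monospace>), not the study's actual code; seeds and array shapes are assumptions for the illustration.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 10                 # observations; p - 1 = 9 auxiliary variables

# Random correlation matrix, mirroring the R code:
# r <- matrix(runif(9^2, -1, 1), ncol = 9); Cor <- cov2cor(r %*% t(r))
r = rng.uniform(-1, 1, size=(9, 9))
cov = r @ r.T                    # positive semi-definite by construction
d = np.sqrt(np.diag(cov))
Cor = cov / np.outer(d, d)       # cov2cor: rescale to unit diagonal

# Auxiliary variables X ~ N(0, Cor), analogous to mvrnorm
X = rng.multivariate_normal(np.zeros(9), Cor, size=n)

# y_i = beta_0 + beta_1 x_1i + ... + beta_{p-1} x_{p-1,i} + epsilon_i,
# with beta_j ~ U(-2, 2) (including beta_0) and epsilon_i ~ N(0, sigma),
# sigma ~ U(0.5, 2)
beta = rng.uniform(-2.0, 2.0, size=p)
sigma = rng.uniform(0.5, 2.0)
y = beta[0] + X @ beta[1:] + rng.normal(0.0, sigma, size=n)
```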
<p>The second setting is realistic. The number of observations is 228, which is the full sample size of the real data in Table <xref ref-type="table" rid="T2">2</xref>. The number of variables <italic>p</italic> is again varied from 2 to 10. Auxiliary variables <italic>x<sub>j</sub></italic> are multivariate normal with the means and standard deviations based on the empirical data (log-transformed), where <italic>x<sub>j</sub></italic> consist of the nine independent variables in Table <xref ref-type="table" rid="T2">2</xref> (<xref ref-type="bibr" rid="B10">CIA 2016</xref>; <xref ref-type="bibr" rid="B19">Freedom House 2016</xref>). Note that, as was explained in Table <xref ref-type="table" rid="T2">2</xref>, the raw empirical data are log-normal; therefore, the input data are log-transformed. Furthermore, the correlation matrix is based on the empirical data (log-transformed), as in equation (8). The <italic>p</italic>-th variable <italic>y<sub>i</sub></italic> is a linear combination of <italic>x<sub>j</sub></italic> such that <inline-formula>
<alternatives>
<mml:math id="Eq018-mml">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn><mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B5;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
<tex-math id="M18">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{y_i} = {\beta _0} + {\beta _1}{x_{1i}} + \ldots + {\beta _{p - 1}}{x_{p - 1i}} + {\varepsilon _i}
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e14.gif"/>
</alternatives>
</inline-formula>, where <italic>&#946;<sub>j</sub></italic> (including <italic>&#946;</italic><sub>0</sub>) reflects the coefficients in multiple regression models using the empirical data and <italic>&#603;<sub>i</sub></italic> ~ <italic>N</italic>(0, <italic>&#963;<sub>resid</sub></italic>), where <italic>&#963;<sub>resid</sub></italic> is the residual standard deviation from the empirical regression model.</p>
<disp-formula id="FD8">
<label>8</label>
<alternatives>
<mml:math id="Eq019-mml">
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="italic">Cor</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow>
<mml:mtable columnalign='right'>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.646</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.500</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.007</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.376</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.354</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.378</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.534</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.312</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.646</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.531</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mn>0.021</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.371</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.305</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.150</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.427</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.049</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.500</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.531</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.474</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.512</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.278</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.092</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.280</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.086</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.007</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.021</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.474</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.205</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.079</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.014</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.086</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.161</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.376</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.371</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.512</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.205</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.204</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.089</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.370</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.220</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.354</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.305</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.278</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.079</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.204</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.106</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.212</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.180</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.378</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.150</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.092</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.014</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.089</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.106</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.578</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.128</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.534</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.427</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.280</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.086</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.370</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.212</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.578</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.134</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign='right'>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.312</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.049</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.086</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.161</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>0.220</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.180</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>0.128</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mtext>&#x00A0;</mml:mtext><mml:mo>&#x2212;</mml:mo><mml:mn>0.134</mml:mn>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign='right'>
<mml:mrow>
<mml:mn>1.000</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow><mml:mo>]</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
<tex-math id="M19">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
Co{r_2} = \left[ {\begin{array}{*{20}{r}}
{1.000}&amp;{0.646}&amp;{{\rm{}} - 0.500}&amp;{ - 0.007}&amp;{0.376}&amp;{ - 0.354}&amp;{ - 0.378}&amp;{ - 0.534}&amp;{0.312}\\
{0.646}&amp;{1.000}&amp;{ - 0.531}&amp;{{\rm{}}0.021}&amp;{0.371}&amp;{ - 0.305}&amp;{ - 0.150}&amp;{{\rm{}} - 0.427}&amp;{0.049}\\
{{\rm{}} - 0.500}&amp;{ - 0.531}&amp;{1.000}&amp;{ - 0.474}&amp;{ - 0.512}&amp;{0.278}&amp;{0.092}&amp;{0.280}&amp;{ - 0.086}\\
{ - 0.007}&amp;{0.021}&amp;{ - 0.474}&amp;{1.000}&amp;{0.205}&amp;{0.079}&amp;{0.014}&amp;{0.086}&amp;{0.161}\\
{0.376}&amp;{0.371}&amp;{ - 0.512}&amp;{0.205}&amp;{1.000}&amp;{{\rm{}} - 0.204}&amp;{ - 0.089}&amp;{ - 0.370}&amp;{0.220}\\
{ - 0.354}&amp;{ - 0.305}&amp;{0.278}&amp;{0.079}&amp;{ - 0.204}&amp;{1.000}&amp;{0.106}&amp;{0.212}&amp;{ - 0.180}\\
{{\rm{}} - 0.378}&amp;{ - 0.150}&amp;{0.092}&amp;{0.014}&amp;{ - 0.089}&amp;{0.106}&amp;{1.000}&amp;{0.578}&amp;{ - 0.128}\\
{{\rm{}} - 0.534}&amp;{ - 0.427}&amp;{0.280}&amp;{0.086}&amp;{{\rm{}} - 0.370}&amp;{0.212}&amp;{0.578}&amp;{1.000}&amp;{ - 0.134}\\
{0.312}&amp;{0.049}&amp;{ - 0.086}&amp;{0.161}&amp;{0.220}&amp;{ - 0.180}&amp;{ - 0.128}&amp;{{\rm{}} - 0.134}&amp;{1.000}
\end{array}} \right]
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e15.gif"/>
</alternatives>
</disp-formula>
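<p>The realistic setting requires turning an empirical correlation matrix with empirical standard deviations into a covariance matrix before drawing multivariate normal samples. A minimal Python sketch of that rescaling step follows; the means and standard deviations here are placeholders (zeros and ones), and only the first three rows and columns of <italic>Cor</italic><sub>2</sub> from equation (8) are used, purely for illustration.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n = 228  # full sample size of the real data in Table 2

# Placeholder moments standing in for the (log-transformed) empirical values
mu = np.zeros(3)
sd = np.ones(3)

# Leading 3 x 3 block of Cor_2 from equation (8); the study uses the full 9 x 9
Cor2 = np.array([[ 1.000,  0.646, -0.500],
                 [ 0.646,  1.000, -0.531],
                 [-0.500, -0.531,  1.000]])

# Rescale correlation into covariance: Sigma = D * Cor * D, D = diag(sd)
D = np.diag(sd)
Sigma = D @ Cor2 @ D

# Draw the auxiliary variables, analogous to mvrnorm(n, mu, Sigma)
X = rng.multivariate_normal(mu, Sigma, size=n)
```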
<p>In both settings, <italic>x<sub>j</sub></italic> are incomplete variables subject to imputation, <italic>y<sub>i</sub></italic> is completely observed in all of the situations, and <italic>u<sub>ij</sub></italic> are a set of <italic>p</italic> &#8211; 1 continuous uniform random numbers ranging from 0 to 1 that drive the missing-data mechanism. As was introduced in Section 4.1, under the assumption of MAR, the missingness of <italic>x<sub>ji</sub></italic> depends on the values of <italic>y<sub>i</sub></italic> and <italic>u<sub>ij</sub></italic>: <italic>x<sub>ji</sub></italic> is missing if <italic>y<sub>i</sub></italic> &lt; median(<italic>y<sub>i</sub></italic>) and <italic>u<sub>ij</sub></italic> &lt; 0.5, and <italic>x<sub>ji</sub></italic> is missing if <italic>y<sub>i</sub></italic> &gt; median(<italic>y<sub>i</sub></italic>) and <italic>u<sub>ij</sub></italic> &gt; 0.9. This creates approximately 30% missing values in each <italic>x<sub>j</sub></italic>. This rate is realistic, because the average missing rates of income and earnings are 30% on a variable basis in the National Health Interview Survey (<xref ref-type="bibr" rid="B47">Schenker et al. 2006: 925</xref>) and the median missing rate is 30.0% in Table <xref ref-type="table" rid="T4">4</xref>. The above setting may be illustrated as follows. Suppose that <italic>y<sub>i</sub></italic> is age and <italic>x</italic><sub>1<italic>i</italic></sub> is income, so that the missingness of income depends on age and a random component: income is missing if age is below the median age and the uniform random number is less than 0.5, and income is missing if age is above the median age and the uniform random number is greater than 0.9.</p>
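<p>This MAR mechanism can be sketched in a few lines of Python. The sketch uses a single auxiliary variable and an assumed random seed for illustration; half the cases (below the median of <italic>y</italic>) lose values with probability 0.5 and the other half with probability 0.1, so the expected missing rate is 0.5 &#215; 0.5 + 0.5 &#215; 0.1 = 0.30.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
y = rng.normal(size=n)          # completely observed variable
x = rng.normal(size=n)          # one auxiliary variable, for illustration
u = rng.uniform(size=n)         # continuous uniform numbers driving missingness

# MAR mechanism: x_i is missing when
#   y_i < median(y) and u_i < 0.5, or
#   y_i > median(y) and u_i > 0.9
miss = ((y < np.median(y)) & (u < 0.5)) | ((y > np.median(y)) & (u > 0.9))
x_obs = np.where(miss, np.nan, x)

missing_rate = np.isnan(x_obs).mean()   # approximately 0.30
```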
<p>Although the literature (<xref ref-type="bibr" rid="B22">Graham, Olchowski, and Gilreath 2007</xref>; <xref ref-type="bibr" rid="B7">Bodner 2008</xref>; <xref ref-type="bibr" rid="B54">Takahashi and Ito 2014: 68&#8211;71</xref>) recommends using a relatively large <italic>M</italic>, the simulation studies in Table <xref ref-type="table" rid="T4">4</xref> use a relatively small <italic>M</italic>. This is due to the computational burden of Monte Carlo simulation for multiple imputation. Considering this practical issue, the current study sets <italic>M</italic> to 20, which is equivalent to the 75<sup>th</sup> percentile of the number of multiply-imputed datasets found in the studies listed in Table <xref ref-type="table" rid="T4">4</xref>.</p>
<p>As for <italic>T</italic>, there is no consensus in the literature (Table <xref ref-type="table" rid="T4">4</xref>). There are no clear-cut rules for determining whether MCMC algorithms have attained convergence (<xref ref-type="bibr" rid="B43">Schafer 1997: 119</xref>; <xref ref-type="bibr" rid="B32">King et al. 2001: 59</xref>; <xref ref-type="bibr" rid="B57">van Buuren and Groothuis-Oudshoorn 2011: 37</xref>). Though not perfect, doubling the number of EM iterations is a rule of thumb that gives a conservative estimate of the convergence speed of MCMC (<xref ref-type="bibr" rid="B46">Schafer and Olsen 1998</xref>; <xref ref-type="bibr" rid="B17">Enders 2010: 204</xref>). Since it is not possible to check convergence in each of the 1000 simulation runs, the current study relies on this rule of thumb to set <italic>T</italic>.</p>
</sec>
<sec>
<title>8.2 Criteria for Judging Simulation Results</title>
<p>The estimand in all of the simulation runs is <italic>&#946;</italic><sub>1</sub> in <inline-formula>
<alternatives>
<mml:math id="Eq020-mml">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn><mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo><mml:msub>
<mml:mi>&#x03B5;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
<tex-math id="M20">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{y_i} = {\beta _0} + {\beta _1}{x_{1i}} + \ldots + {\beta _{p - 1}}{x_{p - 1i}} + {\varepsilon _i}
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e16.gif"/>
</alternatives>
</inline-formula>. The purpose of multiple imputation is to find an unbiased estimate of the population parameter that is confidence valid (<xref ref-type="bibr" rid="B56">van Buuren 2012: 35&#8211;36</xref>).</p>
<p>Unbiasedness can be assessed by equation (9), because an estimator <inline-formula>
<alternatives>
<mml:math id="Eq021-mml">
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
</mml:math>
<tex-math id="M21">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\hat \theta
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e2.gif"/>
</alternatives>
</inline-formula> is an unbiased estimator of <italic>&#952;</italic> if the expected value of <inline-formula>
<alternatives>
<mml:math id="Eq022-mml">
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
</mml:math>
<tex-math id="M22">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\hat \theta
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e2.gif"/>
</alternatives>
</inline-formula> is equal to the true <italic>&#952;</italic> (<xref ref-type="bibr" rid="B40">Mooney 1997: 59</xref>; <xref ref-type="bibr" rid="B23">Gujarati 2003: 899</xref>).</p>
<disp-formula id="FD9">
<label>9</label>
<alternatives>
<mml:math id="Eq023-mml">
<mml:mrow>
<mml:mtext>Bias</mml:mtext><mml:mrow><mml:mo>(</mml:mo>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B8;</mml:mi>
</mml:mrow>
</mml:math>
<tex-math id="M23">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{\rm{Bias}}\left( {\hat \theta } \right) = E\left( {\hat \theta } \right) - \theta
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e17.gif"/>
</alternatives>
</disp-formula>
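Equation (9) translates directly into a Monte Carlo check: average the estimates over the simulation runs and subtract the true value. A minimal Python sketch, using the sample mean as an illustrative stand-in estimator with an assumed true value and run count:

```python
import numpy as np

def mc_bias(estimates, theta):
    # Bias(theta_hat) = E(theta_hat) - theta, with the expectation
    # approximated by the average over the simulation runs
    return float(np.mean(estimates) - theta)

rng = np.random.default_rng(1)
theta = 5.0
# 1,000 runs: each estimates theta by the mean of a sample of size 100
estimates = np.array([rng.normal(theta, 1.0, size=100).mean() for _ in range(1000)])
bias = mc_bias(estimates, theta)  # close to zero for this unbiased estimator
```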
<p>Unbiasedness and efficiency can be simultaneously assessed by the Root Mean Square Error (RMSE), defined as equation (10). The RMSE measures the spread around the true value of the parameter, placing slightly more emphasis on efficiency than bias (<xref ref-type="bibr" rid="B23">Gujarati 2003: 901</xref>; <xref ref-type="bibr" rid="B9">Carsey and Harden 2014: 88&#8211;89</xref>).</p>

<disp-formula id="FD10">
<label>10</label>
<alternatives>
<mml:math id="Eq024-mml">
<mml:mrow>
<mml:mtext>RMSE</mml:mtext><mml:mrow><mml:mo>(</mml:mo>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msqrt>
<mml:mrow>
<mml:mi>E</mml:mi><mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B8;</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mtext>&#x00A0;</mml:mtext>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<tex-math id="M24">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{\rm{RMSE}}\left( {\hat \theta } \right) = \sqrt {E{{\left( {\hat \theta - \theta } \right)}^2}{\rm{}}}
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e18.gif"/>
</alternatives>
</disp-formula>
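Equation (10) can be checked in the same way; for an unbiased estimator the RMSE reduces to its standard deviation (about 1/&#8730;100 = 0.1 for the illustrative sample-mean estimator used here):

```python
import numpy as np

def mc_rmse(estimates, theta):
    # RMSE(theta_hat) = sqrt(E[(theta_hat - theta)^2]), with the expectation
    # approximated by the average over the simulation runs
    return float(np.sqrt(np.mean((np.asarray(estimates) - theta) ** 2)))

rng = np.random.default_rng(2)
theta = 5.0
estimates = [rng.normal(theta, 1.0, size=100).mean() for _ in range(1000)]
rmse = mc_rmse(estimates, theta)  # approximately 0.1
```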
<p>Confidence validity can be assessed by the coverage probability of the nominal 95% confidence interval (CI), which &#8216;is the proportion of simulated samples for which the estimated confidence interval includes the true parameter&#8217; (<xref ref-type="bibr" rid="B9">Carsey and Harden 2014: 93</xref>). The standard error of a proportion is given by equation (11), where &#960; is the proportion and <italic>s</italic> is the number of simulation runs.</p>
<disp-formula id="FD11">
<label>11</label>
<alternatives>
<mml:math id="Eq025-mml">
<mml:mrow>
<mml:mtext>SE</mml:mtext><mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03C0;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>&#x03C0;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03C0;</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mi>s</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<tex-math id="M25">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{\rm{SE}}\left( \pi \right) = \sqrt {\frac{{\pi \left( {1 - \pi } \right)}}{s}}
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e19.gif"/>
</alternatives>
</disp-formula>
<p>The standard error of the 95% CI coverage over 1,000 simulation runs is <inline-formula>
<alternatives>
<mml:math id="Eq026-mml">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mn>0.95</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>0.05</mml:mn><mml:mo>/</mml:mo><mml:mn>1000</mml:mn>
</mml:mrow>
</mml:msqrt>
<mml:mo>&#x2248;</mml:mo><mml:mn>0.007</mml:mn>
</mml:mrow>
</mml:math>
<tex-math id="M26">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\sqrt {0.95 \times 0.05/1000} \approx 0.007
\]
\end{document}
</tex-math>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-690-e20.gif"/>
</alternatives>
</inline-formula>, which is about 0.7%. Therefore, with 95% confidence, the estimated coverage probability should be between 93.6% and 96.4% (<xref ref-type="bibr" rid="B1">Abe and Iwasaki 2007: 10</xref>; <xref ref-type="bibr" rid="B34">Lee and Carlin 2010: 627</xref>; <xref ref-type="bibr" rid="B9">Carsey and Harden 2014: 94&#8211;95</xref>; <xref ref-type="bibr" rid="B31">Hughes, Sterne, and Tilling 2016</xref>).</p>
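The 93.6%&#8211;96.4% interval quoted above follows from equation (11) with &#960; = 0.95 and <italic>s</italic> = 1000, taking the nominal coverage plus or minus 1.96 standard errors; a quick check in Python:

```python
import math

def coverage_se(pi, s):
    # SE(pi) = sqrt(pi * (1 - pi) / s): the standard error of a coverage
    # proportion pi estimated over s simulation runs, as in equation (11)
    return math.sqrt(pi * (1.0 - pi) / s)

se = coverage_se(0.95, 1000)   # about 0.0069, i.e. roughly 0.7%
lower = 0.95 - 1.96 * se       # about 0.936
upper = 0.95 + 1.96 * se       # about 0.964
```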
</sec>
<sec>
<title>8.3 Results of the Simulation</title>
<p>Abbreviations in this section are explained in Table <xref ref-type="table" rid="T5">5</xref>, where MI stands for multiple imputation and SI for single imputation.</p>
<table-wrap id="T5">
<label>Table 5</label>
<caption>
<p>Abbreviations and the Missing Data Methods.</p>
</caption>
<table>
<tr>
<th align="left">Abbreviations</th>
<th align="left">Missing Data Methods</th>
</tr>
<tr>
<td colspan="2"><hr/></td>
</tr>
<tr>
<td align="left">CD</td>
<td align="left">Complete data without missing values</td>
</tr>
<tr>
<td align="left">LD</td>
<td align="left">Listwise deletion</td>
</tr>
<tr>
<td align="left">EMB</td>
<td align="left">MI by AMELIA II</td>
</tr>
<tr>
<td align="left">DA1</td>
<td align="left">MI by NORM2 with no iterations</td>
</tr>
<tr>
<td align="left">DA2</td>
<td align="left">MI by NORM2 with 2*EM iterations</td>
</tr>
<tr>
<td align="left">FCS1</td>
<td align="left">MI by MICE with no iterations</td>
</tr>
<tr>
<td align="left">FCS2</td>
<td align="left">MI by MICE with 2*EM iterations</td>
</tr>
<tr>
<td align="left">D-SI</td>
<td align="left">Deterministic SI by <monospace>norm.predict</monospace> in MICE</td>
</tr>
<tr>
<td align="left">S-SI</td>
<td align="left">Stochastic SI by <monospace>norm.nob</monospace> in MICE</td>
</tr>
</table>
</table-wrap>
<sec>
<title>8.3.1 Theoretical Case</title>
<p>This section presents the results of the Monte Carlo simulation for the theoretical case, where the correlation matrix and the regression coefficients are randomly generated.</p>
<p>Table <xref ref-type="table" rid="T6">6</xref> shows the Bias and RMSE values for the regression coefficient <italic>&#946;</italic><sub>1</sub>. The Bias and RMSE values for listwise deletion and the single imputation methods show that these methods cannot be recommended. The Bias and RMSE values from EMB, DA1, DA2, and FCS2 are almost identical, indicating that these methods are generally unbiased. However, FCS1 is noticeably biased, quite similar to S-SI. Therefore, when between-imputation iterations are omitted, there are no discernible effects on bias and efficiency for EMB and DA, but FCS may suffer from some bias.</p>
<table-wrap id="T6">
<label>Table 6</label>
<caption>
<p>Bias and RMSE (Theoretical Data).</p>
</caption>
<table>
<tr>
<th align="center" rowspan="2" colspan="2"></th>
<th align="center" colspan="9">Number of Variables<hr/></th>
</tr>
<tr>
<th align="center">2</th>
<th align="center">3</th>
<th align="center">4</th>
<th align="center">5</th>
<th align="center">6</th>
<th align="center">7</th>
<th align="center">8</th>
<th align="center">9</th>
<th align="center">10</th>
</tr>
<tr>
<td colspan="11"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">CD</td>
<td align="left">Bias</td>
<td align="right">0.001</td>
<td align="right">0.003</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
<td align="right">0.001</td>
<td align="right">0.001</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
<td align="right">0.001</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.040</td>
<td align="right">0.047</td>
<td align="right">0.038</td>
<td align="right">0.039</td>
<td align="right">0.058</td>
<td align="right">0.026</td>
<td align="right">0.046</td>
<td align="right">0.039</td>
<td align="right">0.047</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">LD</td>
<td align="left">Bias</td>
<td align="right"><bold>0.032</bold></td>
<td align="right"><bold>0.135</bold></td>
<td align="right"><bold>0.105</bold></td>
<td align="right"><bold>0.104</bold></td>
<td align="right"><bold>0.332</bold></td>
<td align="right"><bold>0.085</bold></td>
<td align="right"><bold>0.129</bold></td>
<td align="right"><bold>0.210</bold></td>
<td align="right"><bold>0.116</bold></td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.059</td>
<td align="right">0.153</td>
<td align="right">0.122</td>
<td align="right">0.121</td>
<td align="right">0.349</td>
<td align="right">0.103</td>
<td align="right">0.160</td>
<td align="right">0.228</td>
<td align="right">0.155</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">EMB</td>
<td align="left">Bias</td>
<td align="right">0.000</td>
<td align="right">0.004</td>
<td align="right">0.002</td>
<td align="right">0.000</td>
<td align="right">0.005</td>
<td align="right">0.001</td>
<td align="right">0.005</td>
<td align="right">0.005</td>
<td align="right">0.002</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.046</td>
<td align="right">0.053</td>
<td align="right">0.050</td>
<td align="right">0.051</td>
<td align="right">0.075</td>
<td align="right">0.041</td>
<td align="right">0.069</td>
<td align="right">0.059</td>
<td align="right">0.072</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">DA1</td>
<td align="left">Bias</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
<td align="right">0.003</td>
<td align="right">0.001</td>
<td align="right">0.001</td>
<td align="right">0.000</td>
<td align="right">0.003</td>
<td align="right">0.003</td>
<td align="right">0.002</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.046</td>
<td align="right">0.053</td>
<td align="right">0.050</td>
<td align="right">0.051</td>
<td align="right">0.074</td>
<td align="right">0.041</td>
<td align="right">0.069</td>
<td align="right">0.058</td>
<td align="right">0.072</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">DA2</td>
<td align="left">Bias</td>
<td align="right">0.002</td>
<td align="right">0.001</td>
<td align="right">0.005</td>
<td align="right">0.002</td>
<td align="right">0.001</td>
<td align="right">0.000</td>
<td align="right">0.001</td>
<td align="right">0.003</td>
<td align="right">0.000</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.046</td>
<td align="right">0.053</td>
<td align="right">0.050</td>
<td align="right">0.051</td>
<td align="right">0.074</td>
<td align="right">0.041</td>
<td align="right">0.069</td>
<td align="right">0.058</td>
<td align="right">0.072</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">FCS1</td>
<td align="left">Bias</td>
<td align="right">0.002</td>
<td align="right">0.001</td>
<td align="right"><bold>0.082</bold></td>
<td align="right"><bold>0.040</bold></td>
<td align="right"><bold>0.090</bold></td>
<td align="right"><bold>0.047</bold></td>
<td align="right"><bold>0.093</bold></td>
<td align="right"><bold>0.027</bold></td>
<td align="right"><bold>0.233</bold></td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.047</td>
<td align="right">0.053</td>
<td align="right">0.097</td>
<td align="right">0.062</td>
<td align="right">0.116</td>
<td align="right">0.065</td>
<td align="right">0.109</td>
<td align="right">0.052</td>
<td align="right">0.239</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">FCS2</td>
<td align="left">Bias</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
<td align="right">0.004</td>
<td align="right">0.002</td>
<td align="right">0.001</td>
<td align="right">0.000</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
<td align="right">0.001</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.046</td>
<td align="right">0.053</td>
<td align="right">0.050</td>
<td align="right">0.051</td>
<td align="right">0.075</td>
<td align="right">0.041</td>
<td align="right">0.069</td>
<td align="right">0.058</td>
<td align="right">0.071</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">D-SI</td>
<td align="left">Bias</td>
<td align="right"><bold>0.186</bold></td>
<td align="right"><bold>0.242</bold></td>
<td align="right"><bold>0.174</bold></td>
<td align="right"><bold>0.093</bold></td>
<td align="right"><bold>0.187</bold></td>
<td align="right"><bold>0.098</bold></td>
<td align="right"><bold>0.231</bold></td>
<td align="right"><bold>0.070</bold></td>
<td align="right"><bold>0.163</bold></td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.192</td>
<td align="right">0.248</td>
<td align="right">0.182</td>
<td align="right">0.110</td>
<td align="right">0.207</td>
<td align="right">0.109</td>
<td align="right">0.248</td>
<td align="right">0.099</td>
<td align="right">0.189</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">S-SI</td>
<td align="left">Bias</td>
<td align="right">0.002</td>
<td align="right">0.000</td>
<td align="right"><bold>0.081</bold></td>
<td align="right"><bold>0.038</bold></td>
<td align="right"><bold>0.090</bold></td>
<td align="right"><bold>0.047</bold></td>
<td align="right"><bold>0.091</bold></td>
<td align="right"><bold>0.029</bold></td>
<td align="right"><bold>0.230</bold></td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.050</td>
<td align="right">0.057</td>
<td align="right">0.102</td>
<td align="right">0.066</td>
<td align="right">0.124</td>
<td align="right">0.076</td>
<td align="right">0.119</td>
<td align="right">0.062</td>
<td align="right">0.241</td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p><bold>Note</bold>: Biased results are in boldface, i.e., Bias &gt; 0.010.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>Table <xref ref-type="table" rid="T7">7</xref> gives the coverage probability of the 95% CI for <italic>&#946;</italic><sub>1</sub>. The CIs for listwise deletion and the single imputation methods are not confidence valid. When the number of auxiliary variables is small (and hence the overall missing rate is small), the between-imputation iterations may be ignored: all of the multiple imputation CIs are confidence valid. However, as the number of auxiliary variables grows, DA1 and FCS1 drift away from confidence validity. EMB, DA2, and FCS2 are confidence valid regardless of the number of variables and the missing rate. This shows that EMB is confidence proper even though it does not iterate, which is an important finding of the current study.</p>
<table-wrap id="T7">
<label>Table 7</label>
<caption>
<p>Coverage of the 95% CI (Theoretical Data).</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th align="center" colspan="9">Number of Variables<hr/></th>
</tr>
<tr>
<th align="left"></th>
<th align="center">2</th>
<th align="center">3</th>
<th align="center">4</th>
<th align="center">5</th>
<th align="center">6</th>
<th align="center">7</th>
<th align="center">8</th>
<th align="center">9</th>
<th align="center">10</th>
</tr>
<tr>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td align="left">CD</td>
<td align="right">95.3</td>
<td align="right">94.9</td>
<td align="right">94.2</td>
<td align="right">94.0</td>
<td align="right">96.0</td>
<td align="right">96.0</td>
<td align="right">95.3</td>
<td align="right">94.9</td>
<td align="right">94.6</td>
</tr>
<tr>
<td align="left">LD</td>
<td align="right"><bold>88.5</bold></td>
<td align="right"><bold>47.9</bold></td>
<td align="right"><bold>54.6</bold></td>
<td align="right"><bold>56.7</bold></td>
<td align="right"><bold>10.8</bold></td>
<td align="right"><bold>65.1</bold></td>
<td align="right"><bold>69.2</bold></td>
<td align="right"><bold>32.1</bold></td>
<td align="right"><bold>78.1</bold></td>
</tr>
<tr>
<td align="left">EMB</td>
<td align="right">95.0</td>
<td align="right">95.1</td>
<td align="right">94.2</td>
<td align="right">95.5</td>
<td align="right">94.9</td>
<td align="right">94.4</td>
<td align="right">94.3</td>
<td align="right">94.1</td>
<td align="right">95.0</td>
</tr>
<tr>
<td align="left">DA1</td>
<td align="right">94.6</td>
<td align="right">94.9</td>
<td align="right"><bold>93.2</bold></td>
<td align="right"><bold>93.1</bold></td>
<td align="right">94.1</td>
<td align="right"><bold>91.8</bold></td>
<td align="right"><bold>92.9</bold></td>
<td align="right"><bold>92.4</bold></td>
<td align="right"><bold>92.9</bold></td>
</tr>
<tr>
<td align="left">DA2</td>
<td align="right">94.3</td>
<td align="right">95.8</td>
<td align="right">95.1</td>
<td align="right">94.1</td>
<td align="right">94.8</td>
<td align="right">94.3</td>
<td align="right">94.2</td>
<td align="right"><bold>93.2</bold></td>
<td align="right">94.9</td>
</tr>
<tr>
<td align="left">FCS1</td>
<td align="right">94.2</td>
<td align="right">95.0</td>
<td align="right"><bold>75.0</bold></td>
<td align="right"><bold>91.6</bold></td>
<td align="right"><bold>84.4</bold></td>
<td align="right">95.5</td>
<td align="right"><bold>84.5</bold></td>
<td align="right"><bold>96.8</bold></td>
<td align="right"><bold>6.8</bold></td>
</tr>
<tr>
<td align="left">FCS2</td>
<td align="right">94.7</td>
<td align="right">95.6</td>
<td align="right">94.4</td>
<td align="right">93.9</td>
<td align="right">95.4</td>
<td align="right">94.5</td>
<td align="right">94.2</td>
<td align="right">95.0</td>
<td align="right">95.0</td>
</tr>
<tr>
<td align="left">D-SI</td>
<td align="right"><bold>0.8</bold></td>
<td align="right"><bold>0.2</bold></td>
<td align="right"><bold>2.2</bold></td>
<td align="right"><bold>37.8</bold></td>
<td align="right"><bold>22.2</bold></td>
<td align="right"><bold>16.9</bold></td>
<td align="right"><bold>8.3</bold></td>
<td align="right"><bold>51.0</bold></td>
<td align="right"><bold>22.5</bold></td>
</tr>
<tr>
<td align="left">S-SI</td>
<td align="right"><bold>88.9</bold></td>
<td align="right"><bold>89.6</bold></td>
<td align="right"><bold>47.8</bold></td>
<td align="right"><bold>75.0</bold></td>
<td align="right"><bold>62.3</bold></td>
<td align="right"><bold>64.4</bold></td>
<td align="right"><bold>48.9</bold></td>
<td align="right"><bold>76.0</bold></td>
<td align="right"><bold>3.7</bold></td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p><bold>Note</bold>: Confidence invalid results are in boldface, i.e., outside of 93.6 and 96.4.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>Table <xref ref-type="table" rid="T8">8</xref> shows the CI lengths. The CI length by listwise deletion is generally too long, reflecting inefficiency due to the reduced sample size. The CI lengths by the single imputation methods are &#8216;correct&#8217; in the sense that they are quite similar to those of complete data analysis; however, this means that single imputation methods ignore the estimation uncertainty associated with imputation, which is the cause of their confidence invalidity in Table <xref ref-type="table" rid="T7">7</xref>. The CI length by DA1 is too short and the CI length by FCS1 is too long. The CI lengths by EMB, DA2, and FCS2 are essentially equal, reflecting the correct level of estimation uncertainty associated with imputation.</p>
<table-wrap id="T8">
<label>Table 8</label>
<caption>
<p>Lengths of the 95% CI (Theoretical Data).</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th colspan="9">Number of Variables<hr/></th>
</tr>
<tr>
<th align="left"></th>
<th align="center">2</th>
<th align="center">3</th>
<th align="center">4</th>
<th align="center">5</th>
<th align="center">6</th>
<th align="center">7</th>
<th align="center">8</th>
<th align="center">9</th>
<th align="center">10</th>
</tr>
<tr>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td align="left">CD</td>
<td align="right">0.157</td>
<td align="right">0.184</td>
<td align="right">0.144</td>
<td align="right">0.148</td>
<td align="right">0.236</td>
<td align="right">0.102</td>
<td align="right">0.184</td>
<td align="right">0.151</td>
<td align="right">0.180</td>
</tr>
<tr>
<td align="left">LD</td>
<td align="right">0.189</td>
<td align="right">0.259</td>
<td align="right">0.226</td>
<td align="right">0.235</td>
<td align="right">0.384</td>
<td align="right">0.213</td>
<td align="right">0.358</td>
<td align="right">0.339</td>
<td align="right">0.390</td>
</tr>
<tr>
<td align="left">EMB</td>
<td align="right">0.178</td>
<td align="right">0.209</td>
<td align="right">0.196</td>
<td align="right">0.200</td>
<td align="right">0.301</td>
<td align="right">0.160</td>
<td align="right">0.275</td>
<td align="right">0.229</td>
<td align="right">0.281</td>
</tr>
<tr>
<td align="left">DA1</td>
<td align="right">0.176</td>
<td align="right">0.207</td>
<td align="right">0.187</td>
<td align="right">0.192</td>
<td align="right">0.293</td>
<td align="right">0.145</td>
<td align="right">0.256</td>
<td align="right">0.208</td>
<td align="right">0.253</td>
</tr>
<tr>
<td align="left">DA2</td>
<td align="right">0.177</td>
<td align="right">0.208</td>
<td align="right">0.194</td>
<td align="right">0.198</td>
<td align="right">0.298</td>
<td align="right">0.158</td>
<td align="right">0.271</td>
<td align="right">0.223</td>
<td align="right">0.274</td>
</tr>
<tr>
<td align="left">FCS1</td>
<td align="right">0.178</td>
<td align="right">0.209</td>
<td align="right">0.237</td>
<td align="right">0.211</td>
<td align="right">0.324</td>
<td align="right">0.248</td>
<td align="right">0.306</td>
<td align="right">0.223</td>
<td align="right">0.299</td>
</tr>
<tr>
<td align="left">FCS2</td>
<td align="right">0.178</td>
<td align="right">0.209</td>
<td align="right">0.197</td>
<td align="right">0.201</td>
<td align="right">0.302</td>
<td align="right">0.161</td>
<td align="right">0.275</td>
<td align="right">0.228</td>
<td align="right">0.281</td>
</tr>
<tr>
<td align="left">D-SI</td>
<td align="right">0.143</td>
<td align="right">0.174</td>
<td align="right">0.133</td>
<td align="right">0.149</td>
<td align="right">0.244</td>
<td align="right">0.103</td>
<td align="right">0.205</td>
<td align="right">0.150</td>
<td align="right">0.188</td>
</tr>
<tr>
<td align="left">S-SI</td>
<td align="right">0.157</td>
<td align="right">0.184</td>
<td align="right">0.161</td>
<td align="right">0.155</td>
<td align="right">0.238</td>
<td align="right">0.145</td>
<td align="right">0.188</td>
<td align="right">0.149</td>
<td align="right">0.186</td>
</tr>
</table>
</table-wrap>
<p>Table <xref ref-type="table" rid="T9">9</xref> displays the computational time required to generate multiple imputations. When the number of auxiliary variables is small (and hence the overall missing rate is small), DA2 is the fastest among the three confidence proper multiple imputation algorithms. On the other hand, as the number of auxiliary variables becomes large, EMB becomes the fastest. As is known in the literature (<xref ref-type="bibr" rid="B56">van Buuren 2012: 117</xref>; <xref ref-type="bibr" rid="B33">Kropko et al. 2014</xref>), FCS2 is at least 5 times slower, and can be more than 50 times slower, than EMB and DA2. However, the difference in computational time is not of practical concern, given that all of the computations finish within a few minutes.</p>
<table-wrap id="T9">
<label>Table 9</label>
<caption>
<p>Computational Time (Theoretical Data).</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th colspan="9">Number of Variables<hr/></th>
</tr>
<tr>
<th align="left"></th>
<th align="center">2</th>
<th align="center">3</th>
<th align="center">4</th>
<th align="center">5</th>
<th align="center">6</th>
<th align="center">7</th>
<th align="center">8</th>
<th align="center">9</th>
<th align="center">10</th>
</tr>
<tr>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td align="left">EMB</td>
<td align="right">0.46</td>
<td align="right">0.53</td>
<td align="right">0.53</td>
<td align="right">0.59</td>
<td align="right">0.71</td>
<td align="right"><bold>0.78</bold></td>
<td align="right"><bold>0.97</bold></td>
<td align="right"><bold>1.27</bold></td>
<td align="right"><bold>1.69</bold></td>
</tr>
<tr>
<td align="left">DA2</td>
<td align="right"><bold>0.10</bold></td>
<td align="right"><bold>0.16</bold></td>
<td align="right"><bold>0.29</bold></td>
<td align="right"><bold>0.42</bold></td>
<td align="right"><bold>0.55</bold></td>
<td align="right">1.09</td>
<td align="right">1.39</td>
<td align="right">2.22</td>
<td align="right">3.63</td>
</tr>
<tr>
<td align="left">FCS2</td>
<td align="right">2.47</td>
<td align="right">5.98</td>
<td align="right">14.48</td>
<td align="right">21.33</td>
<td align="right">25.40</td>
<td align="right">54.71</td>
<td align="right">59.14</td>
<td align="right">85.69</td>
<td align="right">133.17</td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p><bold>Note</bold>: Reported values are the time in seconds to perform multiple imputation, which is averaged over 1,000 simulation runs. The fastest results are in boldface.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>8.3.2 Realistic Case</title>
<p>This section presents the results of the Monte Carlo simulation for the realistic case, where the correlation matrix and the regression coefficients are based on the real data (<xref ref-type="bibr" rid="B10">CIA 2016</xref>; <xref ref-type="bibr" rid="B19">Freedom House 2016</xref>). The results in this section reinforce the findings in Section 8.3.1.</p>
<p>Table <xref ref-type="table" rid="T10">10</xref> shows the Bias and RMSE values for the regression coefficient <italic>&#946;</italic><sub>1</sub>. The overall conclusions are similar to Table <xref ref-type="table" rid="T6">6</xref>. When between-imputation iterations are ignored, there are no discernible effects on bias and efficiency in EMB and DA, but FCS may occasionally suffer from small bias.</p>
<table-wrap id="T10">
<label>Table 10</label>
<caption>
<p>Bias and RMSE (Realistic Data).</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th align="left"></th>
<th colspan="9">Number of Variables<hr/></th>
</tr>
<tr>
<th align="left"></th>
<th align="left"></th>
<th align="center">2</th>
<th align="center">3</th>
<th align="center">4</th>
<th align="center">5</th>
<th align="center">6</th>
<th align="center">7</th>
<th align="center">8</th>
<th align="center">9</th>
<th align="center">10</th>
</tr>
<tr>
<td colspan="11"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">CD</td>
<td align="left">Bias</td>
<td align="right">0.003</td>
<td align="right">0.002</td>
<td align="right">0.002</td>
<td align="right">0.002</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
<td align="right">0.000</td>
<td align="right">0.002</td>
<td align="right">0.002</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.074</td>
<td align="right">0.086</td>
<td align="right">0.068</td>
<td align="right">0.067</td>
<td align="right">0.066</td>
<td align="right">0.065</td>
<td align="right">0.070</td>
<td align="right">0.069</td>
<td align="right">0.075</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">LD</td>
<td align="left">Bias</td>
<td align="right"><bold>0.034</bold></td>
<td align="right"><bold>0.047</bold></td>
<td align="right"><bold>0.037</bold></td>
<td align="right"><bold>0.054</bold></td>
<td align="right"><bold>0.082</bold></td>
<td align="right"><bold>0.099</bold></td>
<td align="right"><bold>0.083</bold></td>
<td align="right"><bold>0.072</bold></td>
<td align="right"><bold>0.085</bold></td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.095</td>
<td align="right">0.128</td>
<td align="right">0.104</td>
<td align="right">0.118</td>
<td align="right">0.141</td>
<td align="right">0.154</td>
<td align="right">0.157</td>
<td align="right">0.159</td>
<td align="right">0.188</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">EMB</td>
<td align="left">Bias</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
<td align="right">0.002</td>
<td align="right">0.005</td>
<td align="right">0.001</td>
<td align="right">0.000</td>
<td align="right">0.000</td>
<td align="right">0.002</td>
<td align="right">0.006</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.084</td>
<td align="right">0.113</td>
<td align="right">0.091</td>
<td align="right">0.090</td>
<td align="right">0.089</td>
<td align="right">0.092</td>
<td align="right">0.102</td>
<td align="right">0.099</td>
<td align="right">0.110</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">DA1</td>
<td align="left">Bias</td>
<td align="right">0.006</td>
<td align="right">0.001</td>
<td align="right">0.003</td>
<td align="right">0.003</td>
<td align="right">0.001</td>
<td align="right">0.001</td>
<td align="right">0.001</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.084</td>
<td align="right">0.112</td>
<td align="right">0.090</td>
<td align="right">0.089</td>
<td align="right">0.087</td>
<td align="right">0.091</td>
<td align="right">0.100</td>
<td align="right">0.096</td>
<td align="right">0.105</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">DA2</td>
<td align="left">Bias</td>
<td align="right">0.009</td>
<td align="right">0.000</td>
<td align="right">0.002</td>
<td align="right">0.004</td>
<td align="right">0.002</td>
<td align="right">0.004</td>
<td align="right">0.000</td>
<td align="right">0.001</td>
<td align="right">0.001</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.084</td>
<td align="right">0.111</td>
<td align="right">0.089</td>
<td align="right">0.088</td>
<td align="right">0.086</td>
<td align="right">0.090</td>
<td align="right">0.098</td>
<td align="right">0.094</td>
<td align="right">0.102</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">FCS1</td>
<td align="left">Bias</td>
<td align="right">0.007</td>
<td align="right"><bold>0.013</bold></td>
<td align="right">0.006</td>
<td align="right">0.005</td>
<td align="right">0.002</td>
<td align="right">0.008</td>
<td align="right">0.006</td>
<td align="right"><bold>0.012</bold></td>
<td align="right">0.000</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.084</td>
<td align="right">0.106</td>
<td align="right">0.081</td>
<td align="right">0.081</td>
<td align="right">0.080</td>
<td align="right">0.081</td>
<td align="right">0.086</td>
<td align="right">0.083</td>
<td align="right">0.088</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">FCS2</td>
<td align="left">Bias</td>
<td align="right">0.007</td>
<td align="right">0.001</td>
<td align="right">0.002</td>
<td align="right">0.002</td>
<td align="right">0.003</td>
<td align="right">0.005</td>
<td align="right">0.002</td>
<td align="right">0.003</td>
<td align="right">0.005</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.084</td>
<td align="right">0.112</td>
<td align="right">0.088</td>
<td align="right">0.088</td>
<td align="right">0.086</td>
<td align="right">0.090</td>
<td align="right">0.097</td>
<td align="right">0.093</td>
<td align="right">0.100</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">D-SI</td>
<td align="left">Bias</td>
<td align="right"><bold>0.188</bold></td>
<td align="right"><bold>0.075</bold></td>
<td align="right"><bold>0.011</bold></td>
<td align="right"><bold>0.035</bold></td>
<td align="right"><bold>0.037</bold></td>
<td align="right"><bold>0.047</bold></td>
<td align="right"><bold>0.023</bold></td>
<td align="right"><bold>0.034</bold></td>
<td align="right"><bold>0.059</bold></td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.207</td>
<td align="right">0.163</td>
<td align="right">0.115</td>
<td align="right">0.118</td>
<td align="right">0.118</td>
<td align="right">0.123</td>
<td align="right">0.130</td>
<td align="right">0.127</td>
<td align="right">0.151</td>
</tr>
<tr>
<td align="left"></td>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">S-SI</td>
<td align="left">Bias</td>
<td align="right">0.005</td>
<td align="right"><bold>0.014</bold></td>
<td align="right">0.007</td>
<td align="right">0.006</td>
<td align="right">0.002</td>
<td align="right">0.006</td>
<td align="right">0.005</td>
<td align="right">0.009</td>
<td align="right">0.006</td>
</tr>
<tr>
<td align="left">RMSE</td>
<td align="right">0.089</td>
<td align="right">0.116</td>
<td align="right">0.096</td>
<td align="right">0.095</td>
<td align="right">0.091</td>
<td align="right">0.094</td>
<td align="right">0.100</td>
<td align="right">0.102</td>
<td align="right">0.105</td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p><bold>Note</bold>: Biased results are in boldface, i.e., Bias &gt; 0.010.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>Table <xref ref-type="table" rid="T11">11</xref> gives the coverage probability of the 95% CI for <italic>&#946;</italic><sub>1</sub>. The overall conclusions are similar to those for Table <xref ref-type="table" rid="T7">7</xref>, except that DA1 is confidence invalid even when <italic>p</italic> = 3. This implies that between-imputation iterations cannot be ignored in MCMC-based approaches even when the number of variables is small. EMB, on the other hand, is confidence valid, so between-imputation iterations can be safely ignored in EMB. Again, this is an important finding of the current study.</p>
<table-wrap id="T11">
<label>Table 11</label>
<caption>
<p>Coverage of the 95% CI (Realistic Data).</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th colspan="9">Number of Variables<hr/></th>
</tr>
<tr>
<th align="left"></th>
<th align="center">2</th>
<th align="center">3</th>
<th align="center">4</th>
<th align="center">5</th>
<th align="center">6</th>
<th align="center">7</th>
<th align="center">8</th>
<th align="center">9</th>
<th align="center">10</th>
</tr>
<tr>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td align="left">CD</td>
<td align="right">94.6</td>
<td align="right">95.3</td>
<td align="right">95.8</td>
<td align="right">94.7</td>
<td align="right">95.2</td>
<td align="right">96.4</td>
<td align="right">94.6</td>
<td align="right">95.3</td>
<td align="right">94.8</td>
</tr>
<tr>
<td align="left">LD</td>
<td align="right"><bold>92.2</bold></td>
<td align="right"><bold>91.6</bold></td>
<td align="right"><bold>92.8</bold></td>
<td align="right"><bold>91.5</bold></td>
<td align="right"><bold>86.8</bold></td>
<td align="right"><bold>85.0</bold></td>
<td align="right"><bold>89.8</bold></td>
<td align="right"><bold>90.0</bold></td>
<td align="right"><bold>90.8</bold></td>
</tr>
<tr>
<td align="left">EMB</td>
<td align="right">94.3</td>
<td align="right">94.1</td>
<td align="right">94.7</td>
<td align="right">93.9</td>
<td align="right">96.1</td>
<td align="right">94.2</td>
<td align="right">94.0</td>
<td align="right">94.4</td>
<td align="right">94.7</td>
</tr>
<tr>
<td align="left">DA1</td>
<td align="right">94.1</td>
<td align="right"><bold>92.2</bold></td>
<td align="right">94.4</td>
<td align="right"><bold>93.4</bold></td>
<td align="right">95.7</td>
<td align="right"><bold>92.2</bold></td>
<td align="right"><bold>93.1</bold></td>
<td align="right"><bold>92.9</bold></td>
<td align="right"><bold>93.1</bold></td>
</tr>
<tr>
<td align="left">DA2</td>
<td align="right">94.0</td>
<td align="right">94.0</td>
<td align="right">94.8</td>
<td align="right">94.4</td>
<td align="right">95.9</td>
<td align="right">94.5</td>
<td align="right">93.8</td>
<td align="right">95.0</td>
<td align="right">95.0</td>
</tr>
<tr>
<td align="left">FCS1</td>
<td align="right">94.6</td>
<td align="right">94.7</td>
<td align="right">96.3</td>
<td align="right"><bold>96.7</bold></td>
<td align="right"><bold>97.0</bold></td>
<td align="right"><bold>97.0</bold></td>
<td align="right"><bold>96.7</bold></td>
<td align="right"><bold>96.9</bold></td>
<td align="right"><bold>97.7</bold></td>
</tr>
<tr>
<td align="left">FCS2</td>
<td align="right">94.7</td>
<td align="right">93.8</td>
<td align="right">95.5</td>
<td align="right">95.7</td>
<td align="right">96.4</td>
<td align="right">94.3</td>
<td align="right">94.8</td>
<td align="right">95.2</td>
<td align="right">96.1</td>
</tr>
<tr>
<td align="left">D-SI</td>
<td align="right"><bold>32.7</bold></td>
<td align="right"><bold>74.5</bold></td>
<td align="right"><bold>79.2</bold></td>
<td align="right"><bold>77.6</bold></td>
<td align="right"><bold>77.7</bold></td>
<td align="right"><bold>74.1</bold></td>
<td align="right"><bold>75.3</bold></td>
<td align="right"><bold>75.1</bold></td>
<td align="right"><bold>68.8</bold></td>
</tr>
<tr>
<td align="left">S-SI</td>
<td align="right"><bold>87.9</bold></td>
<td align="right"><bold>83.2</bold></td>
<td align="right"><bold>82.3</bold></td>
<td align="right"><bold>82.5</bold></td>
<td align="right"><bold>84.2</bold></td>
<td align="right"><bold>82.1</bold></td>
<td align="right"><bold>81.0</bold></td>
<td align="right"><bold>80.3</bold></td>
<td align="right"><bold>81.2</bold></td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p><bold>Note</bold>: Confidence invalid results are in boldface, i.e., outside of 93.6 and 96.4.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>Table <xref ref-type="table" rid="T12">12</xref> shows the CI lengths. The overall conclusions are similar to those for Table <xref ref-type="table" rid="T8">8</xref>. One difference is that the CI lengths by FCS1 are slightly short.</p>
<table-wrap id="T12">
<label>Table 12</label>
<caption>
<p>Lengths of the 95% CI (Realistic Data).</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th colspan="9">Number of Variables<hr/></th>
</tr>
<tr>
<th align="left"></th>
<th align="center">2</th>
<th align="center">3</th>
<th align="center">4</th>
<th align="center">5</th>
<th align="center">6</th>
<th align="center">7</th>
<th align="center">8</th>
<th align="center">9</th>
<th align="center">10</th>
</tr>
<tr>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td align="left">CD</td>
<td align="right">0.279</td>
<td align="right">0.334</td>
<td align="right">0.268</td>
<td align="right">0.266</td>
<td align="right">0.267</td>
<td align="right">0.261</td>
<td align="right">0.278</td>
<td align="right">0.274</td>
<td align="right">0.289</td>
</tr>
<tr>
<td align="left">LD</td>
<td align="right">0.333</td>
<td align="right">0.441</td>
<td align="right">0.389</td>
<td align="right">0.412</td>
<td align="right">0.436</td>
<td align="right">0.457</td>
<td align="right">0.516</td>
<td align="right">0.543</td>
<td align="right">0.631</td>
</tr>
<tr>
<td align="left">EMB</td>
<td align="right">0.314</td>
<td align="right">0.429</td>
<td align="right">0.364</td>
<td align="right">0.356</td>
<td align="right">0.362</td>
<td align="right">0.359</td>
<td align="right">0.397</td>
<td align="right">0.396</td>
<td align="right">0.432</td>
</tr>
<tr>
<td align="left">DA1</td>
<td align="right">0.313</td>
<td align="right">0.414</td>
<td align="right">0.348</td>
<td align="right">0.342</td>
<td align="right">0.343</td>
<td align="right">0.337</td>
<td align="right">0.370</td>
<td align="right">0.364</td>
<td align="right">0.390</td>
</tr>
<tr>
<td align="left">DA2</td>
<td align="right">0.315</td>
<td align="right">0.423</td>
<td align="right">0.356</td>
<td align="right">0.351</td>
<td align="right">0.353</td>
<td align="right">0.351</td>
<td align="right">0.383</td>
<td align="right">0.380</td>
<td align="right">0.410</td>
</tr>
<tr>
<td align="left">FCS1</td>
<td align="right">0.315</td>
<td align="right">0.416</td>
<td align="right">0.353</td>
<td align="right">0.348</td>
<td align="right">0.350</td>
<td align="right">0.350</td>
<td align="right">0.382</td>
<td align="right">0.380</td>
<td align="right">0.406</td>
</tr>
<tr>
<td align="left">FCS2</td>
<td align="right">0.316</td>
<td align="right">0.429</td>
<td align="right">0.359</td>
<td align="right">0.355</td>
<td align="right">0.358</td>
<td align="right">0.352</td>
<td align="right">0.389</td>
<td align="right">0.386</td>
<td align="right">0.413</td>
</tr>
<tr>
<td align="left">D-SI</td>
<td align="right">0.288</td>
<td align="right">0.380</td>
<td align="right">0.292</td>
<td align="right">0.289</td>
<td align="right">0.291</td>
<td align="right">0.278</td>
<td align="right">0.302</td>
<td align="right">0.294</td>
<td align="right">0.315</td>
</tr>
<tr>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td align="left">S-SI</td>
<td align="right">0.281</td>
<td align="right">0.325</td>
<td align="right">0.262</td>
<td align="right">0.257</td>
<td align="right">0.259</td>
<td align="right">0.255</td>
<td align="right">0.269</td>
<td align="right">0.267</td>
<td align="right">0.277</td>
</tr>
</table>
</table-wrap>
<p>Table <xref ref-type="table" rid="T13">13</xref> displays the computational time required to generate multiple imputations. The overall conclusions are similar to those for Table <xref ref-type="table" rid="T9">9</xref>.</p>
<table-wrap id="T13">
<label>Table 13</label>
<caption>
<p>Computational Time (Realistic Data).</p>
</caption>
<table>
<tr>
<th align="left"></th>
<th colspan="9">Number of Variables<hr/></th>
</tr>
<tr>
<th align="left"></th>
<th align="center">2</th>
<th align="center">3</th>
<th align="center">4</th>
<th align="center">5</th>
<th align="center">6</th>
<th align="center">7</th>
<th align="center">8</th>
<th align="center">9</th>
<th align="center">10</th>
</tr>
<tr>
<td colspan="10"><hr/></td>
</tr>
<tr>
<td align="left">EMB</td>
<td align="right">0.14</td>
<td align="right">0.15</td>
<td align="right">0.16</td>
<td align="right">0.20</td>
<td align="right">0.23</td>
<td align="right">0.28</td>
<td align="right">0.36</td>
<td align="right"><bold>0.44</bold></td>
<td align="right"><bold>0.53</bold></td>
</tr>
<tr>
<td align="left">DA2</td>
<td align="right"><bold>0.04</bold></td>
<td align="right"><bold>0.05</bold></td>
<td align="right"><bold>0.06</bold></td>
<td align="right"><bold>0.10</bold></td>
<td align="right"><bold>0.15</bold></td>
<td align="right"><bold>0.22</bold></td>
<td align="right"><bold>0.33</bold></td>
<td align="right">0.47</td>
<td align="right">0.67</td>
</tr>
<tr>
<td align="left">FCS2</td>
<td align="right">1.05</td>
<td align="right">2.55</td>
<td align="right">4.22</td>
<td align="right">8.92</td>
<td align="right">12.02</td>
<td align="right">15.59</td>
<td align="right">20.82</td>
<td align="right">26.78</td>
<td align="right">35.95</td>
</tr>
</table>
<table-wrap-foot>
<fn>
<p><bold>Note</bold>: Reported values are the time in seconds to perform multiple imputation, which is averaged over 1,000 simulation runs. The fastest results are in boldface.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
</sec>
</sec>
<sec>
<title>9 Conclusions</title>
<p>This article assessed the relative performance of three multiple imputation algorithms (DA, FCS, and EMB). In both the theoretical and realistic settings (Table <xref ref-type="table" rid="T7">7</xref> and Table <xref ref-type="table" rid="T11">11</xref>), if between-imputation iterations were ignored, the MCMC algorithms (DA and FCS) did not attain confidence validity. The nominal 95% CIs by DA and FCS without iterations deviated from 95% coverage beyond the margin of error in 1,000 simulation runs. This is because the CI lengths by DA without iterations were generally too short, while the CI lengths by FCS were generally too long (Table <xref ref-type="table" rid="T8">8</xref> and Table <xref ref-type="table" rid="T12">12</xref>). Based on Schafer (<xref ref-type="bibr" rid="B43">1997: 139</xref>), this can be explained by the choice of starting values. DA uses EM as a single starting value for the <italic>M</italic> chains, which understates missing data uncertainty (<xref ref-type="bibr" rid="B44">Schafer 2016: 22</xref>), whereas FCS uses random draws as <italic>M</italic> over-dispersed starting values, which overstate missing data uncertainty (<xref ref-type="bibr" rid="B57">van Buuren and Groothuis-Oudshoorn 2011: 6</xref>). Without iterations, the imputed values depend on the choice of starting values.</p>
<p>Both DA and FCS can be confidence valid given a large number of iterations; however, the assessment of convergence in MCMC is notoriously difficult. Furthermore, the convergence properties of FCS are currently under debate due to possible incompatibility (<xref ref-type="bibr" rid="B37">Li, Yu, and Rubin 2012</xref>; <xref ref-type="bibr" rid="B60">Zhu and Raghunathan 2015</xref>). The current study, on the other hand, found that EMB was confidence valid in every situation examined. Therefore, EMB is a confidence proper imputation algorithm without iterations, which allows us to avoid the difficult decision of how to judge convergence in order to generate confidence proper multiple imputations. This finding is useful in the missing data literature. For example, while ratio imputation is often used in official statistics (<xref ref-type="bibr" rid="B55">Takahashi, Iwasaki, and Tsubaki 2017</xref>), multiple ratio imputation did not previously exist in the literature; the EMB algorithm was applied to ratio imputation to create multiple ratio imputation (<xref ref-type="bibr" rid="B51">Takahashi 2017a</xref>; <xref ref-type="bibr" rid="B52">Takahashi 2017b</xref>).</p>
<p>No simulation study can include all the patterns of relevant data (<xref ref-type="bibr" rid="B33">Kropko et al. 2014: 511</xref>). Therefore, the current study focused on two types of data: (1) theoretical and (2) realistic. Although the author believes that these two data generation processes cover data types relevant to many social research situations, the results of any simulation study must be read with caution (<xref ref-type="bibr" rid="B24">Hardt, Herke, and Leonhart 2014: 11</xref>). Future research should delve into other data types, such as small-<italic>n</italic> data, large-<italic>p</italic> data, categorical data, and non-normal data, to name a few.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Additional File</title>
<p>The additional file for this article can be found as follows:</p>
<supplementary-material id="S1" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://doi.org/10.5334/dsj-2017-037.s1">
<label>Data for Tables 1 and 2</label>
<caption>
<p>Political and Economic Data from CIA (<xref ref-type="bibr" rid="B10">2016</xref>) and Freedom House (<xref ref-type="bibr" rid="B19">2016</xref>). DOI: <uri>https://doi.org/10.5334/dsj-2017-037.s1</uri></p>
</caption>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<title>Acknowledgements</title>
<p>The author wishes to thank Dr. Manabu Iwasaki (Seikei University), Dr. Michiko Watanabe (Keio University), and Dr. Takayuki Abe (Keio University) for their helpful comments. The author also wishes to thank the two anonymous reviewers for their comments, which improved the quality of the article. An early version of part of this article was presented at the 59<sup>th</sup> World Statistics Congress of the International Statistical Institute (<xref ref-type="bibr" rid="B53">Takahashi and Ito 2013</xref>).</p>
</ack>
<sec>
<title>Competing Interests</title>
<p>The author has no competing interests to declare.</p>
</sec>
<ref-list>
<ref id="B1">
<label>1</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abe</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Iwasaki</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Evaluation of statistical methods for analysis of small-sample longitudinal clinical trials with dropouts</article-title>
<source>Journal of the Japanese Society of Computational Statistics</source>
<year iso-8601-date="2007">2007</year>
<volume>20</volume>
<issue>1</issue>
<fpage>1</fpage>
<lpage>18</lpage>
<pub-id pub-id-type="doi">10.5183/jjscs1988.20.1</pub-id>
</element-citation>
</ref>
<ref id="B2">
<label>2</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Acemoglu</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Robinson</surname>
<given-names>J A</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Aghion</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Durlauf</surname>
<given-names>S</given-names>
</name>
</person-group>
<chapter-title>Institutions as the fundamental cause of long-run growth</chapter-title>
<source>Handbook of Economic Growth</source>
<year iso-8601-date="2005">2005</year>
<publisher-loc>North Holland</publisher-loc>
<publisher-name>Elsevier</publisher-name>
</element-citation>
</ref>
<ref id="B3">
<label>3</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Allison</surname>
<given-names>P D</given-names>
</name>
</person-group>
<source>Missing Data</source>
<year iso-8601-date="2002">2002</year>
<publisher-loc>Thousand Oaks, CA</publisher-loc>
<publisher-name>Sage Publications</publisher-name>
<pub-id pub-id-type="doi">10.4135/9781412985079</pub-id>
</element-citation>
</ref>
<ref id="B4">
<label>4</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baraldi</surname>
<given-names>A N</given-names>
</name>
<name>
<surname>Enders</surname>
<given-names>C K</given-names>
</name>
</person-group>
<article-title>An introduction to modern missing data analyses</article-title>
<source>Journal of School Psychology</source>
<year iso-8601-date="2010">2010</year>
<volume>48</volume>
<issue>1</issue>
<fpage>5</fpage>
<lpage>37</lpage>
<pub-id pub-id-type="doi">10.1016/j.jsp.2009.10.001</pub-id>
</element-citation>
</ref>
<ref id="B5">
<label>5</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barnard</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>D B</given-names>
</name>
</person-group>
<article-title>Small-sample degrees of freedom with multiple imputation</article-title>
<source>Biometrika</source>
<year iso-8601-date="1999">1999</year>
<volume>86</volume>
<issue>4</issue>
<fpage>948</fpage>
<lpage>955</lpage>
<pub-id pub-id-type="doi">10.1093/biomet/86.4.948</pub-id>
</element-citation>
</ref>
<ref id="B6">
<label>6</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Barro</surname>
<given-names>R J</given-names>
</name>
</person-group>
<source>Determinants of Economic Growth: A Cross-Country Empirical Study</source>
<year iso-8601-date="1997">1997</year>
<publisher-loc>Cambridge, MA</publisher-loc>
<publisher-name>MIT Press</publisher-name>
</element-citation>
</ref>
<ref id="B7">
<label>7</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bodner</surname>
<given-names>T E</given-names>
</name>
</person-group>
<article-title>What improves with increased missing data imputations?</article-title>
<source>Structural Equation Modeling</source>
<year iso-8601-date="2008">2008</year>
<volume>15</volume>
<fpage>651</fpage>
<lpage>675</lpage>
<pub-id pub-id-type="doi">10.1080/10705510802339072</pub-id>
</element-citation>
</ref>
<ref id="B8">
<label>8</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Carpenter</surname>
<given-names>J R</given-names>
</name>
<name>
<surname>Kenward</surname>
<given-names>M G</given-names>
</name>
</person-group>
<source>Multiple Imputation and its Application</source>
<year iso-8601-date="2013">2013</year>
<publisher-loc>Chichester, West Sussex</publisher-loc>
<publisher-name>John Wiley &amp; Sons</publisher-name>
</element-citation>
</ref>
<ref id="B9">
<label>9</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Carsey</surname>
<given-names>T M</given-names>
</name>
<name>
<surname>Harden</surname>
<given-names>J J</given-names>
</name>
</person-group>
<source>Monte Carlo Simulation and Resampling Methods for Social Science</source>
<year iso-8601-date="2014">2014</year>
<publisher-loc>Thousand Oaks, CA</publisher-loc>
<publisher-name>Sage Publications</publisher-name>
<pub-id pub-id-type="doi">10.4135/9781483319605</pub-id>
</element-citation>
</ref>
<ref id="B10">
<label>10</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<collab>Central Intelligence Agency</collab>
</person-group>
<source>The World Factbook</source>
<year iso-8601-date="2016">2016</year>
<comment>Available at: <uri>https://www.cia.gov/library/publications/the-world-factbook/index.html</uri> [Last accessed November 27, 2016]</comment>
</element-citation>
</ref>
<ref id="B11">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheema</surname>
<given-names>J R</given-names>
</name>
</person-group>
<article-title>Some general guidelines for choosing missing data handling methods in educational research</article-title>
<source>Journal of Modern Applied Statistical Methods</source>
<year iso-8601-date="2014">2014</year>
<volume>13</volume>
<issue>2</issue>
<fpage>53</fpage>
<lpage>75</lpage>
<pub-id pub-id-type="doi">10.22237/jmasm/1414814520</pub-id>
</element-citation>
</ref>
<ref id="B12">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cranmer</surname>
<given-names>S J</given-names>
</name>
<name>
<surname>Gill</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>We have to be discrete about this: A non-parametric imputation technique for missing categorical data</article-title>
<source>British Journal of Political Science</source>
<year iso-8601-date="2013">2013</year>
<volume>43</volume>
<issue>2</issue>
<fpage>425</fpage>
<lpage>449</lpage>
<pub-id pub-id-type="doi">10.1017/S0007123412000312</pub-id>
</element-citation>
</ref>
<ref id="B13">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deng</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ido</surname>
<given-names>M S</given-names>
</name>
<name>
<surname>Long</surname>
<given-names>Q</given-names>
</name>
</person-group>
<article-title>Multiple imputation for general missing data patterns in the presence of high-dimensional data</article-title>
<source>Scientific Reports</source>
<year iso-8601-date="2016">2016</year>
<volume>6</volume>
<issue>21689</issue>
<fpage>1</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="doi">10.1038/srep21689</pub-id>
</element-citation>
</ref>
<ref id="B14">
<label>14</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>de Waal</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Pannekoek</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Scholtus</surname>
<given-names>S</given-names>
</name>
</person-group>
<source>Handbook of Statistical Data Editing and Imputation</source>
<year iso-8601-date="2011">2011</year>
<publisher-loc>Hoboken, NJ</publisher-loc>
<publisher-name>John Wiley &amp; Sons</publisher-name>
<pub-id pub-id-type="doi">10.1002/9780470904848</pub-id>
</element-citation>
</ref>
<ref id="B15">
<label>15</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Do</surname>
<given-names>B C</given-names>
</name>
<name>
<surname>Batzoglou</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>What is the expectation maximization algorithm?</article-title>
<source>Nature Biotechnology</source>
<year iso-8601-date="2008">2008</year>
<volume>26</volume>
<issue>8</issue>
<fpage>897</fpage>
<lpage>899</lpage>
<pub-id pub-id-type="doi">10.1038/nbt1406</pub-id>
</element-citation>
</ref>
<ref id="B16">
<label>16</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Donders</surname>
<given-names>A R T</given-names>
</name>
<name>
<surname>van der Heijden</surname>
<given-names>G J M G</given-names>
</name>
<name>
<surname>Stijnen</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Moons</surname>
<given-names>K G M</given-names>
</name>
</person-group>
<article-title>Review: A gentle introduction to imputation of missing values</article-title>
<source>Journal of Clinical Epidemiology</source>
<year iso-8601-date="2006">2006</year>
<volume>59</volume>
<fpage>1087</fpage>
<lpage>1091</lpage>
<pub-id pub-id-type="doi">10.1016/j.jclinepi.2006.01.014</pub-id>
</element-citation>
</ref>
<ref id="B17">
<label>17</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Enders</surname>
<given-names>C K</given-names>
</name>
</person-group>
<source>Applied Missing Data Analysis</source>
<year iso-8601-date="2010">2010</year>
<publisher-loc>New York, NY</publisher-loc>
<publisher-name>The Guilford Press</publisher-name>
</element-citation>
</ref>
<ref id="B18">
<label>18</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Feng</surname>
<given-names>Y</given-names>
</name>
</person-group>
<source>Democracy, Governance, and Economic Performance: Theory and Evidence</source>
<year iso-8601-date="2003">2003</year>
<publisher-loc>Cambridge, MA</publisher-loc>
<publisher-name>The MIT Press</publisher-name>
</element-citation>
</ref>
<ref id="B19">
<label>19</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<collab>Freedom House</collab>
</person-group>
<source>Freedom in the World 2016</source>
<year iso-8601-date="2016">2016</year>
<comment>Available at: <uri>https://freedomhouse.org/report/freedom-world/freedom-world-2016</uri> [Last accessed November 30, 2016]</comment>
</element-citation>
</ref>
<ref id="B20">
<label>20</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Gill</surname>
<given-names>J</given-names>
</name>
</person-group>
<source>Bayesian Methods: A Social and Behavioral Sciences Approach</source>
<year iso-8601-date="2008">2008</year>
<edition>Second Edition</edition>
<publisher-loc>London</publisher-loc>
<publisher-name>Chapman &amp; Hall/CRC</publisher-name>
</element-citation>
</ref>
<ref id="B21">
<label>21</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Graham</surname>
<given-names>J W</given-names>
</name>
</person-group>
<article-title>Missing data analysis: Making it work in the real world</article-title>
<source>Annual Review of Psychology</source>
<year iso-8601-date="2009">2009</year>
<volume>60</volume>
<fpage>549</fpage>
<lpage>576</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.psych.58.110405.085530</pub-id>
</element-citation>
</ref>
<ref id="B22">
<label>22</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Graham</surname>
<given-names>J W</given-names>
</name>
<name>
<surname>Olchowski</surname>
<given-names>A E</given-names>
</name>
<name>
<surname>Gilreath</surname>
<given-names>T D</given-names>
</name>
</person-group>
<article-title>How many imputations are really needed? Some practical clarifications of multiple imputation theory</article-title>
<source>Prevention Science</source>
<year iso-8601-date="2007">2007</year>
<volume>8</volume>
<issue>3</issue>
<fpage>206</fpage>
<lpage>213</lpage>
<pub-id pub-id-type="doi">10.1007/s11121-007-0070-9</pub-id>
</element-citation>
</ref>
<ref id="B23">
<label>23</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Gujarati</surname>
<given-names>D N</given-names>
</name>
</person-group>
<source>Basic Econometrics</source>
<year iso-8601-date="2003">2003</year>
<edition>Fourth Edition</edition>
<publisher-loc>Boston, MA</publisher-loc>
<publisher-name>McGraw-Hill</publisher-name>
</element-citation>
</ref>
<ref id="B24">
<label>24</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hardt</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Herke</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Leonhart</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Auxiliary variables in multiple imputation in regression with missing X: A warning against including too many in small sample research</article-title>
<source>BMC Medical Research Methodology</source>
<year iso-8601-date="2012">2012</year>
<volume>12</volume>
<issue>184</issue>
<fpage>1</fpage>
<lpage>13</lpage>
<pub-id pub-id-type="doi">10.1186/1471-2288-12-184</pub-id>
</element-citation>
</ref>
<ref id="B25">
<label>25</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Honaker</surname>
<given-names>J</given-names>
</name>
<name>
<surname>King</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>What to do about missing values in time series cross-section data</article-title>
<source>American Journal of Political Science</source>
<year iso-8601-date="2010">2010</year>
<volume>54</volume>
<issue>2</issue>
<fpage>561</fpage>
<lpage>581</lpage>
<pub-id pub-id-type="doi">10.1111/j.1540-5907.2010.00447.x</pub-id>
</element-citation>
</ref>
<ref id="B26">
<label>26</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Honaker</surname>
<given-names>J</given-names>
</name>
<name>
<surname>King</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Blackwell</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Amelia II: A program for missing data</article-title>
<source>Journal of Statistical Software</source>
<year iso-8601-date="2011">2011</year>
<volume>45</volume>
<issue>7</issue>
<fpage>1</fpage>
<lpage>47</lpage>
<pub-id pub-id-type="doi">10.18637/jss.v045.i07</pub-id>
</element-citation>
</ref>
<ref id="B27">
<label>27</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<name>
<surname>Honaker</surname>
<given-names>J</given-names>
</name>
<name>
<surname>King</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Blackwell</surname>
<given-names>M</given-names>
</name>
</person-group>
<source>Package &#8216;Amelia&#8217;</source>
<year iso-8601-date="2016">2016</year>
<comment>Available at: <uri>http://cran.r-project.org/web/packages/Amelia/Amelia.pdf</uri> [Last accessed November 30, 2016]</comment>
</element-citation>
</ref>
<ref id="B28">
<label>28</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Horowitz</surname>
<given-names>J L</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Heckman</surname>
<given-names>J J</given-names>
</name>
<name>
<surname>Leamer</surname>
<given-names>E</given-names>
</name>
</person-group>
<chapter-title>The bootstrap</chapter-title>
<source>Handbook of Econometrics</source>
<year iso-8601-date="2001">2001</year>
<publisher-loc>North Holland</publisher-loc>
<publisher-name>Elsevier</publisher-name>
<volume>5</volume>
<pub-id pub-id-type="doi">10.1016/s1573-4412(01)05005-x</pub-id>
</element-citation>
</ref>
<ref id="B29">
<label>29</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Horton</surname>
<given-names>N J</given-names>
</name>
<name>
<surname>Kleinman</surname>
<given-names>K P</given-names>
</name>
</person-group>
<article-title>Much ado about nothing: A comparison of missing data methods and software to fit incomplete data regression models</article-title>
<source>The American Statistician</source>
<year iso-8601-date="2007">2007</year>
<volume>61</volume>
<issue>1</issue>
<fpage>79</fpage>
<lpage>90</lpage>
<pub-id pub-id-type="doi">10.1198/000313007X172556</pub-id>
</element-citation>
</ref>
<ref id="B30">
<label>30</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Horton</surname>
<given-names>N J</given-names>
</name>
<name>
<surname>Lipsitz</surname>
<given-names>S R</given-names>
</name>
</person-group>
<article-title>Multiple imputation in practice: Comparison of software packages for regression models with missing variables</article-title>
<source>The American Statistician</source>
<year iso-8601-date="2001">2001</year>
<volume>55</volume>
<issue>3</issue>
<fpage>244</fpage>
<lpage>254</lpage>
<pub-id pub-id-type="doi">10.1198/000313001317098266</pub-id>
</element-citation>
</ref>
<ref id="B31">
<label>31</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hughes</surname>
<given-names>R A</given-names>
</name>
<name>
<surname>Sterne</surname>
<given-names>J A C</given-names>
</name>
<name>
<surname>Tilling</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Comparison of imputation variance estimators</article-title>
<source>Statistical Methods in Medical Research</source>
<year iso-8601-date="2016">2016</year>
<volume>25</volume>
<issue>6</issue>
<fpage>2541</fpage>
<lpage>2557</lpage>
<pub-id pub-id-type="doi">10.1177/0962280214526216</pub-id>
</element-citation>
</ref>
<ref id="B32">
<label>32</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>King</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Honaker</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Joseph</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Scheve</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Analyzing incomplete political science data: An alternative algorithm for multiple imputation</article-title>
<source>American Political Science Review</source>
<year iso-8601-date="2001">2001</year>
<volume>95</volume>
<issue>1</issue>
<fpage>49</fpage>
<lpage>69</lpage>
</element-citation>
</ref>
<ref id="B33">
<label>33</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kropko</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Goodrich</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Gelman</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hill</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Multiple imputation for continuous and categorical data: Comparing joint multivariate normal and conditional approaches</article-title>
<source>Political Analysis</source>
<year iso-8601-date="2014">2014</year>
<volume>22</volume>
<issue>4</issue>
<fpage>497</fpage>
<lpage>519</lpage>
<pub-id pub-id-type="doi">10.1093/pan/mpu007</pub-id>
</element-citation>
</ref>
<ref id="B34">
<label>34</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>K J</given-names>
</name>
<name>
<surname>Carlin</surname>
<given-names>J B</given-names>
</name>
</person-group>
<article-title>Multiple imputation for missing data: Fully conditional specification versus multivariate normal imputation</article-title>
<source>American Journal of Epidemiology</source>
<year iso-8601-date="2010">2010</year>
<volume>171</volume>
<issue>5</issue>
<fpage>624</fpage>
<lpage>632</lpage>
<pub-id pub-id-type="doi">10.1093/aje/kwp425</pub-id>
</element-citation>
</ref>
<ref id="B35">
<label>35</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>K J</given-names>
</name>
<name>
<surname>Carlin</surname>
<given-names>J B</given-names>
</name>
</person-group>
<article-title>Recovery of information from multiple imputation: A simulation study</article-title>
<source>Emerging Themes in Epidemiology</source>
<year iso-8601-date="2012">2012</year>
<volume>9</volume>
<issue>3</issue>
<fpage>1</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="doi">10.1186/1742-7622-9-3</pub-id>
</element-citation>
</ref>
<ref id="B36">
<label>36</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leite</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Beretvas</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>The performance of multiple imputation for Likert-type items with missing data</article-title>
<source>Journal of Modern Applied Statistical Methods</source>
<year iso-8601-date="2010">2010</year>
<volume>9</volume>
<issue>1</issue>
<fpage>64</fpage>
<lpage>74</lpage>
<pub-id pub-id-type="doi">10.22237/jmasm/1272686820</pub-id>
</element-citation>
</ref>
<ref id="B37">
<label>37</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>D B</given-names>
</name>
</person-group>
<article-title>Imputing missing data by fully conditional models: Some cautionary examples and guidelines</article-title>
<source>Duke University Department of Statistical Science Discussion Paper</source>
<year iso-8601-date="2012">2012</year>
<volume>11</volume>
<issue>14</issue>
<fpage>1</fpage>
<lpage>35</lpage>
</element-citation>
</ref>
<ref id="B38">
<label>38</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Little</surname>
<given-names>R J A</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>D B</given-names>
</name>
</person-group>
<source>Statistical Analysis with Missing Data</source>
<year iso-8601-date="2002">2002</year>
<edition>Second Edition</edition>
<publisher-loc>Hoboken, NJ</publisher-loc>
<publisher-name>John Wiley &amp; Sons</publisher-name>
<pub-id pub-id-type="doi">10.1002/9781119013563</pub-id>
</element-citation>
</ref>
<ref id="B39">
<label>39</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McNeish</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Missing data methods for arbitrary missingness with small samples</article-title>
<source>Journal of Applied Statistics</source>
<year iso-8601-date="2017">2017</year>
<volume>44</volume>
<issue>1</issue>
<fpage>24</fpage>
<lpage>39</lpage>
<pub-id pub-id-type="doi">10.1080/02664763.2016.1158246</pub-id>
</element-citation>
</ref>
<ref id="B40">
<label>40</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Mooney</surname>
<given-names>C Z</given-names>
</name>
</person-group>
<source>Monte Carlo Simulation</source>
<year iso-8601-date="1997">1997</year>
<publisher-loc>Thousand Oaks, CA</publisher-loc>
<publisher-name>Sage Publications</publisher-name>
<pub-id pub-id-type="doi">10.4135/9781412985116</pub-id>
</element-citation>
</ref>
<ref id="B41">
<label>41</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Raghunathan</surname>
<given-names>T</given-names>
</name>
</person-group>
<source>Missing Data Analysis in Practice</source>
<year iso-8601-date="2016">2016</year>
<publisher-loc>Boca Raton, FL</publisher-loc>
<publisher-name>CRC Press</publisher-name>
</element-citation>
</ref>
<ref id="B42">
<label>42</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Rubin</surname>
<given-names>D B</given-names>
</name>
</person-group>
<source>Multiple Imputation for Nonresponse in Surveys</source>
<year iso-8601-date="1987">1987</year>
<publisher-loc>New York, NY</publisher-loc>
<publisher-name>John Wiley &amp; Sons</publisher-name>
<pub-id pub-id-type="doi">10.1002/9780470316696</pub-id>
</element-citation>
</ref>
<ref id="B43">
<label>43</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Schafer</surname>
<given-names>J L</given-names>
</name>
</person-group>
<source>Analysis of Incomplete Multivariate Data</source>
<year iso-8601-date="1997">1997</year>
<publisher-loc>Boca Raton, FL</publisher-loc>
<publisher-name>Chapman &amp; Hall/CRC</publisher-name>
<pub-id pub-id-type="doi">10.1201/9781439821862</pub-id>
</element-citation>
</ref>
<ref id="B44">
<label>44</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<name>
<surname>Schafer</surname>
<given-names>J L</given-names>
</name>
</person-group>
<source>Package &#8216;norm2&#8217;</source>
<year iso-8601-date="2016">2016</year>
<comment>Available at: <uri>https://cran.r-project.org/web/packages/norm2/norm2.pdf</uri> [Last accessed November 30, 2016]</comment>
</element-citation>
</ref>
<ref id="B45">
<label>45</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schafer</surname>
<given-names>J L</given-names>
</name>
<name>
<surname>Graham</surname>
<given-names>J W</given-names>
</name>
</person-group>
<article-title>Missing data: Our view of the state of the art</article-title>
<source>Psychological Methods</source>
<year iso-8601-date="2002">2002</year>
<volume>7</volume>
<issue>2</issue>
<fpage>147</fpage>
<lpage>177</lpage>
<pub-id pub-id-type="doi">10.1037//1082-989X.7.2.147</pub-id>
</element-citation>
</ref>
<ref id="B46">
<label>46</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schafer</surname>
<given-names>J L</given-names>
</name>
<name>
<surname>Olsen</surname>
<given-names>M K</given-names>
</name>
</person-group>
<article-title>Multiple imputation for multivariate missing-data problems: A data analyst&#8217;s perspective</article-title>
<source>Multivariate Behavioral Research</source>
<year iso-8601-date="1998">1998</year>
<volume>33</volume>
<fpage>545</fpage>
<lpage>571</lpage>
<pub-id pub-id-type="doi">10.1207/s15327906mbr3304_5</pub-id>
</element-citation>
</ref>
<ref id="B47">
<label>47</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schenker</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Raghunathan</surname>
<given-names>T E</given-names>
</name>
<name>
<surname>Chiu</surname>
<given-names>P-L</given-names>
</name>
<name>
<surname>Makuc</surname>
<given-names>D M</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Cohen</surname>
<given-names>A J</given-names>
</name>
</person-group>
<article-title>Multiple imputation of missing income data in the National Health Interview Survey</article-title>
<source>Journal of the American Statistical Association</source>
<year iso-8601-date="2006">2006</year>
<volume>101</volume>
<issue>475</issue>
<fpage>924</fpage>
<lpage>933</lpage>
<pub-id pub-id-type="doi">10.1198/016214505000001375</pub-id>
</element-citation>
</ref>
<ref id="B48">
<label>48</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scheuren</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Multiple imputation: How it began and continues</article-title>
<source>The American Statistician</source>
<year iso-8601-date="2005">2005</year>
<volume>59</volume>
<issue>4</issue>
<fpage>315</fpage>
<lpage>319</lpage>
<pub-id pub-id-type="doi">10.1198/000313005X74016</pub-id>
</element-citation>
</ref>
<ref id="B49">
<label>49</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shara</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Yassin</surname>
<given-names>S A</given-names>
</name>
<name>
<surname>Valaitis</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Howard</surname>
<given-names>B V</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>E T</given-names>
</name>
<name>
<surname>Umans</surname>
<given-names>J G</given-names>
</name>
</person-group>
<article-title>Randomly and non-randomly missing renal function data in the Strong Heart Study: A comparison of imputation methods</article-title>
<source>PLOS ONE</source>
<year iso-8601-date="2015">2015</year>
<volume>10</volume>
<issue>9</issue>
<fpage>1</fpage>
<lpage>11</lpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0138923</pub-id>
</element-citation>
</ref>
<ref id="B50">
<label>50</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stuart</surname>
<given-names>E A</given-names>
</name>
<name>
<surname>Azur</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Frangakis</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Leaf</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Multiple imputation with large data sets: A case study of the Children&#8217;s Mental Health Initiative</article-title>
<source>American Journal of Epidemiology</source>
<year iso-8601-date="2009">2009</year>
<volume>169</volume>
<issue>9</issue>
<fpage>1133</fpage>
<lpage>1139</lpage>
<pub-id pub-id-type="doi">10.1093/aje/kwp026</pub-id>
</element-citation>
</ref>
<ref id="B51">
<label>51</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Takahashi</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Multiple ratio imputation by the EMB algorithm: Theory and simulation</article-title>
<source>Journal of Modern Applied Statistical Methods</source>
<year iso-8601-date="2017">2017a</year>
<volume>16</volume>
<issue>1</issue>
<fpage>630</fpage>
<lpage>656</lpage>
<pub-id pub-id-type="doi">10.22237/jmasm/1493598840</pub-id>
</element-citation>
</ref>
<ref id="B52">
<label>52</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Takahashi</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Implementing multiple ratio imputation by the EMB algorithm (R)</article-title>
<source>Journal of Modern Applied Statistical Methods</source>
<year iso-8601-date="2017">2017b</year>
<volume>16</volume>
<issue>1</issue>
<fpage>657</fpage>
<lpage>673</lpage>
<pub-id pub-id-type="doi">10.22237/jmasm/1493598900</pub-id>
</element-citation>
</ref>
<ref id="B53">
<label>53</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Takahashi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ito</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Multiple imputation of missing values in economic surveys: Comparison of competing algorithms</article-title>
<conf-name>Proceedings of the 59th World Statistics Congress of the International Statistical Institute (ISI)</conf-name>
<year iso-8601-date="2013">2013</year>
<conf-loc>Hong Kong, China</conf-loc>
<fpage>3240</fpage>
<lpage>3245</lpage>
</element-citation>
</ref>
<ref id="B54">
<label>54</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Takahashi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ito</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Comparison of competing algorithms of multiple imputation: Analysis using large-scale economic data</article-title>
<source>Research Memoir of Official Statistics</source>
<year iso-8601-date="2014">2014</year>
<issue>71</issue>
<fpage>39</fpage>
<lpage>82</lpage>
</element-citation>
</ref>
<ref id="B55">
<label>55</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Takahashi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Iwasaki</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tsubaki</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Imputing the mean of a heteroskedastic log-normal missing variable: A unified approach to ratio imputation</article-title>
<source>Statistical Journal of the IAOS</source>
<year iso-8601-date="2017">2017</year>
<volume>33</volume>
<issue>3</issue>
<pub-id pub-id-type="doi">10.3233/SJI-160306</pub-id>
<comment>in press</comment>
</element-citation>
</ref>
<ref id="B56">
<label>56</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>van Buuren</surname>
<given-names>S</given-names>
</name>
</person-group>
<source>Flexible Imputation of Missing Data</source>
<year iso-8601-date="2012">2012</year>
<publisher-loc>Boca Raton, FL</publisher-loc>
<publisher-name>Chapman &amp; Hall/CRC</publisher-name>
<pub-id pub-id-type="doi">10.1201/b11826</pub-id>
</element-citation>
</ref>
<ref id="B57">
<label>57</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Buuren</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Groothuis-Oudshoorn</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>mice: Multivariate imputation by chained equations in R</article-title>
<source>Journal of Statistical Software</source>
<year iso-8601-date="2011">2011</year>
<volume>45</volume>
<issue>3</issue>
<fpage>1</fpage>
<lpage>67</lpage>
<pub-id pub-id-type="doi">10.18637/jss.v045.i03</pub-id>
</element-citation>
</ref>
<ref id="B58">
<label>58</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<name>
<surname>van Buuren</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Groothuis-Oudshoorn</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Robitzsch</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Vink</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Doove</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Jolani</surname>
<given-names>S</given-names>
</name>
</person-group>
<source>Package &#8216;mice&#8217;</source>
<year iso-8601-date="2015">2015</year>
<comment>Available at: <uri>https://cran.r-project.org/web/packages/mice/mice.pdf</uri> [Last accessed November 30, 2016]</comment>
</element-citation>
</ref>
<ref id="B59">
<label>59</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>von Hippel</surname>
<given-names>P T</given-names>
</name>
</person-group>
<article-title>New confidence intervals and bias comparisons show that maximum likelihood can beat multiple imputation in small samples</article-title>
<source>Structural Equation Modeling</source>
<year iso-8601-date="2016">2016</year>
<volume>23</volume>
<issue>3</issue>
<fpage>422</fpage>
<lpage>437</lpage>
<pub-id pub-id-type="doi">10.1080/10705511.2015.1047931</pub-id>
</element-citation>
</ref>
<ref id="B60">
<label>60</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Raghunathan</surname>
<given-names>T E</given-names>
</name>
</person-group>
<article-title>Convergence properties of a sequential regression multiple imputation algorithm</article-title>
<source>Journal of the American Statistical Association</source>
<year iso-8601-date="2015">2015</year>
<volume>110</volume>
<issue>511</issue>
<fpage>1112</fpage>
<lpage>1124</lpage>
<pub-id pub-id-type="doi">10.1080/01621459.2014.948117</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</article>