
# Modal additive models with data-driven structure identification

Corresponding author: Hong Chen
Additive models, owing to their flexibility, have received a great deal of attention in high-dimensional regression analysis. Much effort has gone into capturing interactions between predictive variables within additive models. However, typical approaches are built on conditional mean assumptions, which may fail to reveal the structure when the data are contaminated by heavy-tailed noise. In this paper, we propose a penalized modal regression method, Modal Additive Models (MAM), based on a conditional mode assumption for simultaneous function estimation and structure identification. MAM approximates the non-parametric function through forward neural networks and maximizes the modal risk with constraints on the function space and group structure. The proposed approach can be implemented by the half-quadratic (HQ) optimization technique, and its asymptotic estimation and selection consistency are established. It turns out that MAM can achieve a satisfactory learning rate and identify the target group structure with high probability. The effectiveness of MAM is also supported by simulated examples.

Mathematics Subject Classification: Primary: 68T05; Secondary: 62J02.

Figure 1.  Estimated transformation function for selected groups. Top-left: group $(1, 6)$, top-right: group $(8, 12)$, bottom-left: group $(3, 7)$, bottom-right: group $(10, 13)$

Algorithm 1.  Half-quadratic Optimization for MAM

1. Require: Input data $({\bf x}_i, y_i)_{i=1}^n$, kernel-induced representing function $\phi$, activation function $\psi$, weight parameter ${\bf w}$ and bias term ${\bf b}$.
2. Ensure: ${\bf a}_{\bf z}$;
3. Define function $f$ such that $f({\bf x}^2) = \phi({\bf x})$;
4. Initialize $\sigma$, coefficient ${\bf a}$;
5. while not converged do
6. Update $e_i$ by $e_i = f^\prime \Big( \big(\frac{y_i - f({\bf x}_i)}{\sigma} \big)^2 \Big)$;
7. Update ${\bf a}$ by ${\bf a} = \arg \max_{{\bf a} \in \mathbb{R}^h} \frac{1}{n \sigma}\sum_{i=1}^{n} \Big( e_i \big(\frac{y_i - f({\bf x}_i)}{\sigma} \big)^2 - g(e_i) \Big) - \lambda \|{\bf a}\|_2^2$;
8. Update $\sigma$;
9. end while
10. Output: ${\bf a}_{\bf z} = {\bf a}$.
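The loop in Algorithm 1 can be illustrated with a small NumPy sketch. It rests on several assumptions that are not stated in the listing: a Gaussian kernel $\phi(u) = \exp(-u^2/2)$, so that the profile function is $\exp(-t/2)$ and $e_i = -\tfrac{1}{2}\exp(-r_i^2/(2\sigma^2))$; a predictor that is linear in ${\bf a}$ through a fixed hidden layer with $\tanh$ activation; and a bandwidth $\sigma$ held fixed instead of updated. Under these assumptions the ${\bf a}$-update reduces to a weighted ridge problem. The names `hidden_features` and `hq_modal_fit` are hypothetical.

```python
import numpy as np

def hidden_features(X, W, b):
    """Fixed hidden layer psi(X W + b); psi = tanh is an assumption."""
    return np.tanh(X @ W + b)

def hq_modal_fit(H, y, sigma=1.0, lam=1e-3, n_iter=50, tol=1e-8):
    """Half-quadratic iterations for a Gaussian-kernel modal objective.

    H is the (n, h) matrix of hidden features, y the (n,) responses.
    sigma is kept fixed here (an assumption; Algorithm 1 updates it).
    """
    n, h = H.shape
    # Initialise the coefficients with an ordinary ridge fit.
    a = np.linalg.solve(H.T @ H + lam * np.eye(h), H.T @ y)
    for _ in range(n_iter):
        r = y - H @ a
        # q_i = -2 e_i: points with large residuals get weights near zero.
        q = np.exp(-0.5 * (r / sigma) ** 2)
        HtQ = H.T * q  # equals H^T diag(q)
        # Stationarity of the step-7 objective gives the linear system
        # (H^T Q H + 2 lam n sigma^3 I) a = H^T Q y.
        a_new = np.linalg.solve(
            HtQ @ H + 2.0 * lam * n * sigma ** 3 * np.eye(h), HtQ @ y
        )
        converged = np.linalg.norm(a_new - a) <= tol * (1.0 + np.linalg.norm(a))
        a = a_new
        if converged:
            break
    return a
```

Because each iteration down-weights observations far from the current fit, the estimate follows the bulk of the data and is largely unaffected by a minority of points shifted by heavy-tailed or skewed noise.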

Algorithm 2.  Backward Stepwise Selection for MAM

1. Start with the variable pool $G = \{(1,2,\cdots, d)\}$;
2. Solve (13) to obtain the maximum value $\mathscr{R}_{\lambda, G}$;
3. for each variable $j$ in $G$ do
4. $\hat{G} \longleftarrow$ either divide $j$ into subgroups or add to an existing group;
5. Solve (13) to obtain the maximum value $\mathscr{R}_{\lambda, \hat{G}}$;
6. if $\mathscr{R}_{\lambda, \hat{G}} > \mathscr{R}_{\lambda, G}$ then
7. Preserve $\hat{G}$ as the new group structure;
8. end if
9. end for
10. Return $\hat{G}$.
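One possible reading of the search in Algorithm 2 is sketched below in plain Python. The listing does not say how candidates for a variable are enumerated or compared, so this sketch assumes that each variable is either split off into its own group or moved into one other existing group, and that the best-scoring candidate is kept only if it improves the current value. The `score` argument is a hypothetical stand-in for solving (13) and returning $\mathscr{R}_{\lambda, G}$; all function names are illustrative.

```python
def canonical(groups):
    """Sorted tuple of sorted, non-empty groups, so equal partitions compare equal."""
    return tuple(sorted(tuple(sorted(g)) for g in groups if g))

def candidate_structures(groups, j):
    """Partitions obtained by isolating variable j or moving it to another group."""
    src = next(g for g in groups if j in g)
    others = [g for g in groups if g != src]
    remainder = tuple(v for v in src if v != j)
    cands = set()
    if remainder:  # j is not alone yet, so it can be split off
        cands.add(canonical(others + [remainder, (j,)]))
    for k, g in enumerate(others):  # move j into each other existing group
        moved = others[:k] + [g + (j,)] + others[k + 1:]
        cands.add(canonical(moved + [remainder]))
    cands.discard(canonical(groups))
    return sorted(cands)

def backward_stepwise(d, score):
    """Single greedy pass over variables 1..d; higher score is better."""
    groups = canonical([tuple(range(1, d + 1))])
    best = score(groups)
    for j in range(1, d + 1):
        cands = candidate_structures(list(groups), j)
        if not cands:
            continue
        top = max(cands, key=score)
        top_score = score(top)
        if top_score > best:  # keep the new structure only if it improves
            groups, best = top, top_score
    return groups, best
```

In practice `score` would refit the model for every candidate structure, so the number of calls (at most one per existing group per variable, plus one for the split) dominates the cost of the search.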

Table 1.  Selected models for simulation study and the corresponding intrinsic group structures

| ID | Model | Intrinsic group structure |
| --- | --- | --- |
| M1 | $y = x_1 + x_2^2 + \frac{1}{1+ x_3^2} + \sin(\pi x_4) +\log(x_5+5) + \sqrt{\lvert x_6 \rvert} + \epsilon$ | $\{(1),(2),(3),(4),(5),(6)\}$ |
| M2 | $y = \frac{\sin(x_1)}{x_1} + \cos((x_2 +x_3)\cdot \pi ) + \arctan((x_4 + x_5 + x_6)^2)+ \epsilon$ | $\{(1),(2, 3),(4, 5, 6)\}$ |
| M3 | $y = \sin(x_1 + x_2) + 2\log(x_3 + 5) +x_4 + x_5\cdot x_6 + \epsilon$ | $\{(1, 2), (3), (4), (5, 6)\}$ |
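For readers who want to experiment with the three models in Table 1, the regression functions can be written down directly in NumPy. The sketch below is illustrative only: the input distribution (uniform on $[-1, 1]^6$), the sample size and the noise sampler are assumptions, since this excerpt does not specify them, and the names `m1`, `m2`, `m3` and `simulate` are hypothetical.

```python
import numpy as np

def m1(X):
    """Noise-free part of M1; every variable forms its own group."""
    return (X[:, 0] + X[:, 1] ** 2 + 1.0 / (1.0 + X[:, 2] ** 2)
            + np.sin(np.pi * X[:, 3]) + np.log(X[:, 4] + 5.0)
            + np.sqrt(np.abs(X[:, 5])))

def m2(X):
    """Noise-free part of M2; groups (1), (2, 3), (4, 5, 6)."""
    # np.sinc(t) = sin(pi t) / (pi t), so sin(x)/x = np.sinc(x / pi),
    # which is also well defined at x = 0.
    return (np.sinc(X[:, 0] / np.pi)
            + np.cos((X[:, 1] + X[:, 2]) * np.pi)
            + np.arctan((X[:, 3] + X[:, 4] + X[:, 5]) ** 2))

def m3(X):
    """Noise-free part of M3; groups (1, 2), (3), (4), (5, 6)."""
    return (np.sin(X[:, 0] + X[:, 1]) + 2.0 * np.log(X[:, 2] + 5.0)
            + X[:, 3] + X[:, 4] * X[:, 5])

def simulate(model, n, noise, seed=0):
    """Draw inputs uniformly on [-1, 1]^6 (an assumption) and add noise.

    `noise` is a callable (rng, n) -> array of n noise draws.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 6))
    return X, model(X) + noise(rng, n)
```

A call such as `simulate(m2, 200, lambda rng, n: rng.normal(0.0, 1.0, n))` gives a Gaussian-noise sample; passing `lambda rng, n: rng.gamma(2.0, 1.0, n)` instead gives a skewed alternative, with the shape and scale values chosen here purely for illustration.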

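Table 3.  Average performance that intrinsic group structures are identified for $(\mu, \beta)$ pair (Gaussian noise)

| $\mu$ | $\beta$ | M1 MF | M1 Size | M1 TP | M1 U | M1 O | M2 MF | M2 Size | M2 TP | M2 U | M2 O | M3 MF | M3 Size | M3 TP | M3 U | M3 O |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $10^{-6}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.66 | 1 | 0 | 0 | 2 | 1 | 0 | 1 |
| $10^{-5}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.84 | 1 | 0 | 0 | 2 | 1 | 0 | 1 |
| $10^{-4}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.68 | 1 | 0 | 0 | 2 | 0.1 | 1 | 0 |
| $10^{-3}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.46 | 0.46 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-2}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.62 | 0.62 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-1}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.78 | 0.78 | 0 | 0 | 2 | 1 | 0 | 0 |
| $10^{-6}$ | $3$ | 0 | 3 | 2 | 1 | 0 | 0 | 2 | 0.42 | 0.42 | 0 | 0 | 2 | 0.66 | 0.66 | 0 |
| $10^{-5}$ | $3$ | 0 | 2.84 | 1.78 | 0.94 | 0 | 0 | 2 | 0.54 | 0.54 | 0 | 0 | 2 | 0 | 1 | 0 |
| $10^{-4}$ | $3$ | 0 | 3.36 | 2.32 | 1 | 0 | 0 | 2 | 0.58 | 0.58 | 0 | 0 | 2.2 | 1.6 | 1 | 0 |
| $10^{-3}$ | $3$ | 0 | 4.9 | 3.9 | 1 | 0 | 0 | 2 | 0.78 | 0.78 | 0 | 50 | 4 | 4 | 0 | 0 |
| $10^{-2}$ | $3$ | 50 | 6 | 6 | 0 | 0 | 29 | 3.62 | 1.9 | 0 | 0.22 | 50 | 4 | 4 | 0 | 0 |
| $10^{-1}$ | $3$ | 50 | 6 | 6 | 0 | 0 | 0 | 5.38 | 1.62 | 0 | 1 | 0 | 6 | 2 | 0 | 1 |
| $10^{-6}$ | $5$ | 0 | 2.72 | 1.64 | 0.92 | 0 | 0 | 2 | 0.5 | 0.5 | 0 | 0 | 2.3 | 0.6 | 1 | 0 |
| $10^{-5}$ | $5$ | 0 | 3.4 | 1.6 | 0.8 | 0 | 0 | 2 | 0.58 | 0.58 | 0 | 0 | 3 | 2 | 1 | 0 |
| $10^{-4}$ | $5$ | 0 | 4.82 | 3.82 | 1 | 0 | 0 | 2.01 | 0.38 | 0.38 | 0 | 50 | 4 | 4 | 0 | 0 |
| $10^{-3}$ | $5$ | 27 | 5.54 | 5.08 | 0.46 | 0 | 28 | 3.44 | 1.76 | 0 | 0 | 50 | 4 | 4 | 0 | 0 |
| $10^{-2}$ | $5$ | 50 | 6 | 6 | 0 | 0 | 0 | 5 | 2 | 0 | 1 | 0 | 6 | 2 | 0 | 1 |
| $10^{-1}$ | $5$ | 50 | 6 | 6 | 0 | 0 | 0 | 6 | 1 | 0 | 1 | 0 | 6 | 2 | 0 | 1 |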

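Table 4.  Average performance that intrinsic group structures are identified for $(\mu, \beta)$ pair (Gamma noise)

| $\mu$ | $\beta$ | M1 MF | M1 Size | M1 TP | M1 U | M1 O | M2 MF | M2 Size | M2 TP | M2 U | M2 O | M3 MF | M3 Size | M3 TP | M3 U | M3 O |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $10^{-6}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.6 | 0.6 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-5}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.7 | 0.7 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-4}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-3}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.92 | 0.92 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-2}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.58 | 0.58 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-1}$ | $1$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.76 | 0.76 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-6}$ | $3$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 0.52 | 0.52 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-5}$ | $3$ | 0 | 2 | 1 | 1 | 0 | 0 | 2 | 1 | 1 | 0 | 0 | 2.42 | 0.66 | 1 | 0 |
| $10^{-4}$ | $3$ | 0 | 3.8 | 2.6 | 1 | 0 | 0 | 2 | 0.8 | 0.8 | 0 | 0 | 2 | 1 | 1 | 0 |
| $10^{-3}$ | $3$ | 0 | 4 | 3 | 1 | 0 | 5 | 2.26 | 0.92 | 0.62 | 0 | 50 | 4 | 4 | 0 | 0 |
| $10^{-2}$ | $3$ | 42 | 5.84 | 5.88 | 0.16 | 0 | 27 | 3.66 | 1.82 | 0 | 0.2 | 50 | 4 | 4 | 0 | 0 |
| $10^{-1}$ | $3$ | 50 | 6 | 6 | 0 | 0 | 0 | 6 | 1 | 0 | 1 | 0 | 6 | 2 | 0 | 1 |
| $10^{-6}$ | $5$ | 0 | 2.56 | 1.48 | 1 | 0 | 0 | 2 | 0.62 | 0.62 | 0 | 0 | 2 | 0.92 | 0.92 | 0 |
| $10^{-5}$ | $5$ | 0 | 3.5 | 2.5 | 1 | 0 | 0 | 2 | 0.66 | 0.66 | 0 | 0 | 3 | 2 | 1 | 0 |
| $10^{-4}$ | $5$ | 7 | 4.88 | 3.76 | 0.86 | 0 | 24 | 3.08 | 1.8 | 0 | 0.08 | 0 | 2.2 | 0.52 | 1 | 0 |
| $10^{-3}$ | $5$ | 8 | 4.94 | 3.84 | 0.84 | 0 | 27 | 3.4 | 1.6 | 0 | 0 | 50 | 4 | 4 | 0 | 0 |
| $10^{-2}$ | $5$ | 50 | 6 | 6 | 0 | 0 | 0 | 5 | 2 | 0 | 1 | 0 | 5.14 | 2.86 | 0 | 1 |
| $10^{-1}$ | $5$ | 50 | 6 | 6 | 0 | 0 | 0 | 6 | 1 | 0 | 1 | 0 | 6 | 2 | 0 | 1 |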

Table 2.  Mean absolute error comparisons (Mean$\pm$std.) for Gaussian and Gamma noise

| Model | GASI (Gaussian) | GASI (Gamma) | MAM (Gaussian) | MAM (Gamma) |
| --- | --- | --- | --- | --- |
| M1 | $186.3.92. \pm 437.8$ | $458.8 \pm 988.8$ | $\mathbf{109.92 \pm 257.2}$ | $\mathbf{272.8 \pm 536.2}$ |
| M2 | $1.088 \pm 0.025$ | $0.774 \pm 0.032$ | $\mathbf{0.839 \pm 0.023}$ | $\mathbf{0.751 \pm 0.028}$ |
| M3 | $\mathbf{0.857 \pm 0.025}$ | $\mathbf{0.873 \pm 0.019}$ | $0.901 \pm 0.028$ | $0.917 \pm 0.021$ |