Rotation Forest: Model, Equations, and Method Workflow

Primary source: Rodríguez, J. J., Kuncheva, L. I., & Alonso, C. J. (2006). Rotation Forest: A New Classifier Ensemble Method. IEEE TPAMI.


Introduction

Rotation Forest is an ensemble classification method that builds multiple decision trees, where each tree is trained on a PCA-rotated version of the data. The goal is simple: produce base learners that remain individually strong (high accuracy) while differing from one another (high diversity).

Quick intuition: Random Forest randomizes feature selection during splitting, while Rotation Forest randomizes the orientation of the feature space before each tree is trained.


1) Basic Notation

Let the training data be:

  • Number of samples: $N$
  • Number of features: $n$
  • Number of classes: $c$
  • Feature matrix: $X \in \mathbb{R}^{N \times n}$
  • Class labels: $y \in \{\omega_1, \dots, \omega_c\}^N$
  • Number of ensemble members: $L$

Each classifier $D_i$ has its own rotation matrix, denoted $R_i^a$.


2) The Core Model

For classifier $i$, the training data is rotated as

$$X^{(i)} = X R_i^{a}$$

and decision tree $D_i$ is trained on $X^{(i)}$ rather than on the original $X$.
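
As a minimal runnable sketch of this step (assuming NumPy and a scikit-learn decision tree, with a random orthogonal matrix standing in for $R_i^a$ until Section 5 constructs it properly):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))            # N=100 samples, n=6 features
y = rng.integers(0, 3, size=100)         # c=3 classes

# Stand-in orthogonal rotation; Section 5 shows how R_i^a is actually built.
R_a = np.linalg.qr(rng.normal(size=(6, 6)))[0]

X_rot = X @ R_a                          # X^(i) = X R_i^a
tree = DecisionTreeClassifier(random_state=0).fit(X_rot, y)
```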


3) Splitting Features into K Subsets

For each classifier $i$, the feature set $F = \{1, 2, \dots, n\}$ is randomly split into $K$ disjoint subsets:

$$F = F_{i,1} \cup F_{i,2} \cup \dots \cup F_{i,K}, \quad F_{i,p} \cap F_{i,q} = \emptyset \ (p \neq q)$$

If $n$ is divisible by $K$, each subset has size

$$M = \frac{n}{K}$$
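
A small sketch of the random disjoint split; `split_features` is a hypothetical helper name, and `np.array_split` simply lets subset sizes differ by one when $K$ does not divide $n$:

```python
import numpy as np

def split_features(n, K, rng):
    """Randomly partition the feature indices {0, ..., n-1} into K disjoint subsets."""
    perm = rng.permutation(n)          # shuffle the feature indices
    return np.array_split(perm, K)     # F_{i,1}, ..., F_{i,K}: disjoint by construction

rng = np.random.default_rng(0)
print(split_features(9, K=3, rng=rng))  # n=9, K=3 -> three subsets of M=3 features each
```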


4) PCA per Subset (No Component Dropping)

For each feature subset $F_{i,j}$, a bootstrap sample (e.g., 75% of the data) is drawn to form the subset data matrix $X_{i,j}$. PCA is then computed via the covariance matrix

$$\Sigma_{i,j} = \frac{1}{N_{i,j} - 1} \left( X_{i,j} - \bar{X}_{i,j} \right)^\top \left( X_{i,j} - \bar{X}_{i,j} \right)$$

followed by its eigen-decomposition

$$\Sigma_{i,j} \, v_{i,j}^{(m)} = \lambda_{i,j}^{(m)} v_{i,j}^{(m)}, \quad m = 1, \dots, M_j$$

The eigenvectors are collected as the columns of

$$P_{i,j} = \left[ v_{i,j}^{(1)} \ v_{i,j}^{(2)} \ \dots \ v_{i,j}^{(M_j)} \right]$$

Key note: Rotation Forest keeps all PCA components; there is no dimensionality reduction. The rotation changes the orientation of each subspace without discarding information.
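
A sketch of this step in NumPy, assuming a plain 75% bootstrap sample (the paper additionally draws a random subset of classes per block, which is omitted here for brevity):

```python
import numpy as np

def pca_all_components(X_sub):
    """All principal axes of one feature subset via the covariance eigen-decomposition."""
    Xc = X_sub - X_sub.mean(axis=0)                 # center the subset data
    cov = Xc.T @ Xc / (X_sub.shape[0] - 1)          # Sigma_{i,j}
    eigvals, eigvecs = np.linalg.eigh(cov)          # symmetric matrix -> eigh
    order = np.argsort(eigvals)[::-1]               # sort by decreasing variance
    return eigvecs[:, order]                        # P_{i,j}: all M_j components kept

rng = np.random.default_rng(0)
X_sub = rng.normal(size=(100, 3))                   # data restricted to one subset F_{i,j}
boot = rng.choice(100, size=75, replace=True)       # 75% bootstrap sample
P = pca_all_components(X_sub[boot])                 # shape (3, 3): nothing is dropped
```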


5) Constructing the Rotation Matrix (Canonical Form)

The rotation matrix $R_i$ is block-diagonal, assembled from the per-subset PCA matrices (this is the tidiest way to present it, and it matches the structure in the paper):

$$R_i = \begin{bmatrix} P_{i,1} & 0 & \cdots & 0 \\ 0 & P_{i,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & P_{i,K} \end{bmatrix}$$

The rows of $R_i$ are then rearranged so that they line up with the original feature order, yielding $R_i^a$. The training data for tree $D_i$ is then

$$X^{(i)} = X R_i^{a}$$
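
A sketch of the assembly and rearrangement, with `rearranged_rotation` as a hypothetical helper name; `scipy.linalg.block_diag` builds $R_i$, and row indexing performs the rearrangement into $R_i^a$:

```python
import numpy as np
from scipy.linalg import block_diag

def rearranged_rotation(subsets, P_blocks, n):
    """Build block-diagonal R_i from the P_{i,j}, then rearrange its rows so that
    R_i^a lines up with the original feature order of X."""
    R = block_diag(*P_blocks)            # R_i: rows follow the permuted subset order
    order = np.concatenate(subsets)      # feature indices, block by block
    R_a = np.zeros((n, n))
    R_a[order, :] = R                    # send each row back to its original feature
    return R_a

rng = np.random.default_rng(0)
subsets = [np.array([2, 0]), np.array([1, 3])]      # K=2 subsets of an n=4 feature space
P_blocks = [np.linalg.qr(rng.normal(size=(2, 2)))[0] for _ in subsets]
R_a = rearranged_rotation(subsets, P_blocks, n=4)
X_rot = rng.normal(size=(10, 4)) @ R_a              # X^(i) = X R_i^a
```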


6) Ensemble Prediction Aggregation

For a test instance $x$, each classifier outputs class probabilities $d_{i,j}(\cdot)$. The ensemble averages them:

$$\mu_j(x) = \frac{1}{L} \sum_{i=1}^{L} d_{i,j}\left( x R_i^{a} \right), \quad j = 1, \dots, c$$

The final prediction is the class with the largest average support:

$$\hat{y}(x) = \arg\max_{j \in \{1, \dots, c\}} \mu_j(x)$$
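
A runnable sketch of the averaging rule, again using random orthogonal stand-ins for the $R_i^a$; since all trees are fit on the same $y$, their probability columns share one class order, so elementwise averaging is valid:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# L=3 stand-in rotations; in the real method each comes from Section 5.
rotations = [np.linalg.qr(rng.normal(size=(4, 4)))[0] for _ in range(3)]
trees = [DecisionTreeClassifier(random_state=i).fit(X @ R_a, y)
         for i, R_a in enumerate(rotations)]

def predict(X_test):
    # mu_j(x) = (1/L) sum_i d_{i,j}(x R_i^a), then argmax over classes j
    mu = np.mean([t.predict_proba(X_test @ R_a)
                  for t, R_a in zip(trees, rotations)], axis=0)
    return mu.argmax(axis=1)

print(predict(X[:5]))
```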


7) (Optional) Kappa–Error for Measuring Diversity

For two classifiers $D_i$ and $D_j$, kappa is computed from the normalized coincidence matrix $M = [m_{k,s}]$, where $m_{k,s}$ is the proportion of instances assigned to class $\omega_k$ by $D_i$ and to class $\omega_s$ by $D_j$. Kappa is defined as

$$\kappa_{i,j} = \frac{\sum_{k} m_{k,k} - \mathrm{ABC}}{1 - \mathrm{ABC}}$$

where the agreement-by-chance term is

$$\mathrm{ABC} = \sum_{k} \left( \sum_{s} m_{k,s} \right) \left( \sum_{s} m_{s,k} \right)$$

Low kappa between a pair of classifiers indicates high diversity.
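
A small sketch of the pairwise kappa computation, assuming labels are encoded as integers $0, \dots, c-1$; `pairwise_kappa` is a hypothetical helper name:

```python
import numpy as np

def pairwise_kappa(pred_i, pred_j, c):
    """Kappa between two classifiers, from their normalized coincidence matrix."""
    M = np.zeros((c, c))
    for a, b in zip(pred_i, pred_j):
        M[a, b] += 1
    M /= len(pred_i)                                # m_{k,s} as proportions
    agree = np.trace(M)                             # sum_k m_{k,k}
    abc = np.sum(M.sum(axis=1) * M.sum(axis=0))     # agreement-by-chance term ABC
    return (agree - abc) / (1 - abc)

# Two classifiers' predictions on six test points with c=2 classes.
print(pairwise_kappa([0, 1, 1, 0, 1, 0], [0, 1, 0, 0, 1, 1], c=2))
```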


Conclusion

Rotation Forest can be summarized in three ideas:

  1. split the features into K subsets,
  2. rotate each subset with PCA (keeping all components),
  3. train each tree on the rotated data, then average the class probabilities.

The result is an ensemble that often achieves a strong combination of accuracy and diversity, as the end-to-end sketch below illustrates.
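
To tie the pieces together, here is a minimal end-to-end sketch under the same simplifying assumptions as the earlier snippets (plain 75% bootstrap per block, no per-block class selection, scikit-learn CART trees; the class name and defaults are illustrative, not from the paper):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class RotationForestSketch:
    """Minimal Rotation Forest sketch: split features, rotate each subset with
    full PCA, train one tree per rotation, average the class probabilities."""

    def __init__(self, L=10, K=3, seed=0):
        self.L, self.K = L, K
        self.rng = np.random.default_rng(seed)

    def _rotation(self, X):
        n = X.shape[1]
        subsets = np.array_split(self.rng.permutation(n), self.K)      # idea 1
        R_a = np.zeros((n, n))
        col = 0
        for idx in subsets:                                            # idea 2
            boot = self.rng.choice(len(X), size=int(0.75 * len(X)), replace=True)
            Xc = X[boot][:, idx]
            Xc = Xc - Xc.mean(axis=0)
            _, vecs = np.linalg.eigh(Xc.T @ Xc / (len(Xc) - 1))        # all components
            R_a[idx, col:col + len(idx)] = vecs    # block placed on the original rows
            col += len(idx)
        return R_a

    def fit(self, X, y):
        self.members_ = []
        for i in range(self.L):                                        # idea 3
            R_a = self._rotation(X)
            tree = DecisionTreeClassifier(random_state=i).fit(X @ R_a, y)
            self.members_.append((tree, R_a))
        return self

    def predict(self, X):
        mu = np.mean([t.predict_proba(X @ R) for t, R in self.members_], axis=0)
        return mu.argmax(axis=1)   # argmax index equals the label for classes 0..c-1

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))
y = (X[:, 0] * X[:, 1] > 0).astype(int)
clf = RotationForestSketch(L=10, K=3).fit(X, y)
print("train accuracy:", (clf.predict(X) == y).mean())
```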