LG Aimers 4th Cohort and Ensemble Learning

2024. 1. 14. 09:03ㆍCoding Tools/LG Aimers

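LG Aimers: AI Expert Course, 4th Cohort

Module 4. 『Supervised Learning (Classification/Regression)』

ㅇ Instructor: Prof. Je-Won Kang, Ewha Womans University
ㅇ Learning objectives
Understand the basic concepts of supervised learning, a branch of machine learning, and the purposes of and differences between regression and classification. Through a range of models and methods (linear and nonlinear regression, classification, ensemble methods, kernel methods, etc.), learn when and why to use which model, and how to improve model performance.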

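-Ensemble Learning
A simple extension of algorithms you already use or have developed.
(A way to improve performance on supervised learning tasks.)

-I ran into the confusion matrix and the ROC curve again for the first time since my algorithms class. I felt the need to study them again, and it reminded me that I should study hard for exams in my university classes.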

-Ensemble Methods
• Predict the class label of unseen data by aggregating a set of
predictions from different classifiers (experts) learned from the training data
• Make the decision by voting
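
The voting step can be sketched in a few lines of Python. This is only a minimal illustration, not the course's implementation: the three threshold "experts" and the names `majority_vote` and `ensemble_predict` are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    `predictions` holds one predicted label per classifier for a single input.
    Ties go to the label that appears first in the list.
    """
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical experts, each a simple threshold rule on a number.
experts = [
    lambda x: "pos" if x > 0.3 else "neg",
    lambda x: "pos" if x > 0.5 else "neg",
    lambda x: "pos" if x > 0.7 else "neg",
]

def ensemble_predict(x):
    """Ask every expert, then let them vote."""
    return majority_vote([expert(x) for expert in experts])
```

For an input of 0.6, two of the three experts say "pos", so the ensemble answers "pos" even though the strictest expert disagrees.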

-Build Ensemble Classifiers
Basic idea: Build different experts, and let them vote.
• Bagging and boosting

Advantages:
• Improve predictive performance
• Other types of classifiers can be directly included
• Easy to implement
• Not much parameter tuning needed

Disadvantage
• Not a compact representation

-Bagging
• Bootstrapping + aggregating (for more robust performance;
lower variance)

• Train several models in parallel
  A classifier 𝐶𝑖 is learned for each 𝑆𝑖 in sample set 𝑆

• Bagging works because it reduces variance by voting/averaging (robust to overfitting)

  Bagging helps most when the learning algorithm is unstable, i.e., when small changes to the training set cause large changes in the learned classifier.
  Usually, the more classifiers the better
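
One way to see why averaging lowers variance, under assumptions that are not in the lecture notes: suppose the output of each of the 𝑀 classifiers has the same variance σ², and any two outputs have correlation ρ. The variance of their average is then:

```latex
\operatorname{Var}\!\left(\frac{1}{M}\sum_{i=1}^{M} C_i(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2
```

The second term shrinks as 𝑀 grows, which fits "the more classifiers the better". The first term does not shrink, so the gain depends on how decorrelated the bootstrapped classifiers are.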

-Bootstrapping
  Generate multiple datasets 𝑆𝑖 from a dataset 𝑆
• Each 𝑆𝑖 has 𝑛 samples drawn at random from 𝑆 with replacement; 𝑛 may be
smaller than the size of the original set
• Repeat 𝑀 times
→ generate 𝑀 datasets, each of size 𝑛.
→ Train 𝑀 models
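
The sampling step above can be sketched as follows. The function name `bootstrap_datasets` and the `seed` parameter are assumptions made for this example; only the standard library is used.

```python
import random

def bootstrap_datasets(S, n, M, seed=0):
    """Draw M datasets of size n from S, sampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    return [rng.choices(S, k=n) for _ in range(M)]

S = list(range(10))                         # toy dataset
datasets = bootstrap_datasets(S, n=8, M=5)  # 5 datasets with 8 samples each
```

Because sampling is with replacement, a single 𝑆𝑖 can contain the same sample more than once and leave other samples out entirely.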

-Aggregating
• Committee prediction
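
As a sketch in the notation used above for bagging (classifiers 𝐶𝑖, 𝑖 = 1, …, 𝑀), the committee prediction is commonly written as an average for regression and as a majority vote for classification. The symbols ŷ and 𝑐 are introduced here for illustration only:

```latex
% Regression: average the M model outputs
\hat{y}(x) = \frac{1}{M}\sum_{i=1}^{M} C_i(x)

% Classification: pick the class c that receives the most votes
\hat{y}(x) = \arg\max_{c} \sum_{i=1}^{M} \mathbb{1}\left[\,C_i(x) = c\,\right]
```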

-Boosting
Cascading of weak classifiers
• Train multiple models in sequence
• When training the next classifier in the sequence, assign larger weights to the points misclassified by the earlier base classifiers (aims to lower bias)
• Adaboost

Advantage
• Simple and easy to implement
• Flexible : can combine with any learning algorithm
• No prior knowledge needed about weak learner
• Versatile : can be applied on a wide variety of problems
• Non-parametric

-Adaboost
AdaBoost, short for Adaptive Boosting, by Y. Freund and R. Schapire (1996)

• 𝑀 sequential base classifiers :
ℎ1, … , ℎ𝑚, … , ℎ𝑀
• Trained on weighted form of the training set
• Weight depends on the performance of the previous classifier
• Combined to give the final classifier
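
The four points above can be sketched as a small AdaBoost for binary labels (+1/-1), assuming 1-D inputs and decision stumps as the weak learners. The stump form, the function names, and the toy data are choices made for this example and do not come from the lecture.

```python
import math

def stump_predict(x, thr, polarity):
    """Decision stump on a 1-D input: `polarity` if x >= thr, else -polarity."""
    return polarity if x >= thr else -polarity

def fit_stump(xs, ys, w):
    """Return (weighted_error, thr, polarity) of the best stump under weights w."""
    best = None
    for thr in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w)
                      if stump_predict(x, thr, polarity) != y)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best

def adaboost(xs, ys, M):
    """Train M stumps in sequence; labels in ys must be +1 or -1."""
    n = len(xs)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(M):
        err, thr, pol = fit_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this classifier
        ensemble.append((alpha, thr, pol))
        # Misclassified points (y * h(x) = -1) are scaled up, the rest down.
        w = [wi * math.exp(-alpha * y * stump_predict(x, thr, pol))
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]  # renormalize so the weights sum to 1
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data: no single stump separates these labels.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, M=3)
```

On this toy set the best single stump still misclassifies three points, while the weighted combination of three stumps classifies every training point correctly.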

- Bagging and Boosting

Improving decision tree
• By bagging -> random forest (inherently bagging)
• By boosting -> gradient boosting machine (GBM) as
generalized Adaboost
    • Very popular machine learning algorithm
    • One of the leading methods for winning Kaggle competitions
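
To give a feel for the boosting branch, here is a minimal gradient boosting sketch for regression. It assumes squared-error loss, where the negative gradient is simply the residual, and uses one-split regression stumps on 1-D inputs. The function names, the learning rate `lr`, and the toy data are hypothetical; real GBM libraries differ in many details.

```python
def fit_regression_stump(xs, targets):
    """Best single split on x by squared error: (thr, left_mean, right_mean)."""
    best = None
    for thr in sorted(set(xs))[1:]:  # skip the smallest x so both sides are non-empty
        left = [t for x, t in zip(xs, targets) if x < thr]
        right = [t for x, t in zip(xs, targets) if x >= thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]

def gradient_boost(xs, ys, M, lr=0.5):
    """Fit M stumps in sequence, each one to the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    preds = [base] * len(xs)
    stumps = []
    for _ in range(M):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient
        thr, lm, rm = fit_regression_stump(xs, residuals)
        stumps.append((thr, lm, rm))
        preds = [p + lr * (lm if x < thr else rm) for x, p in zip(xs, preds)]
    return base, stumps, lr

def gb_predict(model, x):
    """Base prediction plus the shrunken sum of all stump outputs."""
    base, stumps, lr = model
    return base + lr * sum(lm if x < thr else rm for thr, lm, rm in stumps)

def mse(model, xs, ys):
    """Mean squared error of the model on (xs, ys)."""
    return sum((gb_predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = list(range(10))
ys = [x * x for x in xs]  # toy regression target
model = gradient_boost(xs, ys, M=20)
```

Each round corrects what the previous rounds got wrong, so the training error on this toy data keeps falling as more stumps are added.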

-Supervised learning (SL)
Limitations and future research topics

• SL is the baseline for many recent AI tasks, owing to large-scale labeled datasets

• Nevertheless, it relies on the size of the datasets; what if we
do not have enough data samples?
    • Data augmentation (computer-synthesized data, generated data by
unsupervised learning, etc.)
    • Learning from insufficient labels (weak supervision, etc.)
• Furthermore, what if the data properties are different
between datasets?
    • Domain adaptation, transfer learning, etc.

Quiz

What answers are correct? Select all that apply.

A. Adaboost algorithm considers the failure of previous classifiers

Correct.
Adaboost algorithm considers the failure of previous classifiers, when choosing (or weighting) samples in the data set

B. In many computer vision and language processing methods applying deep learning, supervised learning does not really play an important role

False.
In fact, supervised learning is the baseline for such recent state-of-the-art studies

Attribution-NonCommercial-NoDerivatives
