
IEEE Transactions on Parallel and Distributed Systems

Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training



Abstract

In this paper, we present a novel parallel implementation for training Gradient Boosting Decision Trees (GBDTs) on Graphics Processing Units (GPUs). Thanks to their excellent results on classification/regression tasks and to open-source libraries such as XGBoost, GBDTs have become very popular in recent years and have won many awards in machine learning and data mining competitions. Although GPUs have demonstrated their success in accelerating many machine learning applications, it is challenging to develop an efficient GPU-based GBDT algorithm. The key challenges include irregular memory accesses, many sorting operations with small inputs, and varying data-parallel granularities in tree construction. To tackle these challenges on GPUs, we propose several novel techniques, including (i) Run-Length Encoding compression and dynamic thread/block workload allocation, (ii) data partitioning based on stable sort, together with fast, memory-efficient attribute ID lookup in node splitting, (iii) finding approximate split points using two-stage histogram building, (iv) sparsity-aware histogram building and histogram subtraction to reduce the histogram-building workload, (v) reusing intermediate training results for efficient gradient computation, and (vi) exploiting multiple GPUs to handle larger data sets efficiently. Our experimental results show that our algorithm, named ThunderGBM, can be 10 times faster than the state-of-the-art libraries (i.e., XGBoost, LightGBM and CatBoost) running on a relatively high-end workstation with 20 CPU cores. Compared with these libraries on GPUs, ThunderGBM can handle higher-dimensional problems on which they become extremely slow or simply fail. For the data sets the existing GPU libraries can handle, ThunderGBM achieves up to 10 times speedup on the same hardware, which demonstrates the significance of our GPU optimizations. Moreover, the models trained by ThunderGBM are identical to those trained by XGBoost, and have similar quality to those trained by LightGBM and CatBoost.
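
The abstract only names the techniques; for concreteness, below is a minimal CUDA sketch of two of the histogram ideas it mentions: sparsity-aware histogram building (only non-zero feature values, stored in compressed form, are accumulated) and histogram subtraction (one child node's histogram is derived as parent minus sibling instead of being rebuilt from the data). This is an illustrative reconstruction under assumed data layouts, not ThunderGBM's actual kernels; all names (buildHistogram, subtractHistogram, NUM_BINS) are hypothetical.

// Illustrative sketch, not ThunderGBM's implementation.
#include <cstdio>
#include <cuda_runtime.h>

constexpr int NUM_BINS = 64;  // assumed number of histogram bins per feature

// Each non-zero feature value carries a precomputed bin id and the gradient
// of the instance it belongs to; zeros are skipped entirely (sparsity-aware).
__global__ void buildHistogram(const int *binIds, const float *gradients,
                               int numNonZeros, float *histogram) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < numNonZeros) {
        atomicAdd(&histogram[binIds[i]], gradients[i]);
    }
}

// The smaller child's histogram is built from data; the larger child's is
// obtained by subtraction, reducing the histogram-building workload.
__global__ void subtractHistogram(const float *parent, const float *smallChild,
                                  float *largeChild, int numBins) {
    int b = blockIdx.x * blockDim.x + threadIdx.x;
    if (b < numBins) {
        largeChild[b] = parent[b] - smallChild[b];
    }
}

int main() {
    // Toy data: 8 non-zero values of one feature, each with a bin id
    // and the gradient of its instance.
    const int n = 8;
    int hBins[n] = {0, 1, 1, 2, 5, 5, 5, 63};
    float hGrads[n] = {0.5f, -0.2f, 0.1f, 0.3f, 0.4f, 0.2f, 0.2f, 1.0f};

    int *dBins; float *dGrads, *dParent, *dSmall, *dLarge;
    cudaMalloc(&dBins, n * sizeof(int));
    cudaMalloc(&dGrads, n * sizeof(float));
    cudaMalloc(&dParent, NUM_BINS * sizeof(float));
    cudaMalloc(&dSmall, NUM_BINS * sizeof(float));
    cudaMalloc(&dLarge, NUM_BINS * sizeof(float));
    cudaMemcpy(dBins, hBins, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dGrads, hGrads, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dParent, 0, NUM_BINS * sizeof(float));
    cudaMemset(dSmall, 0, NUM_BINS * sizeof(float));

    // Parent histogram over all non-zeros; "small child" over the first half.
    buildHistogram<<<(n + 255) / 256, 256>>>(dBins, dGrads, n, dParent);
    buildHistogram<<<(n + 255) / 256, 256>>>(dBins, dGrads, n / 2, dSmall);
    subtractHistogram<<<1, NUM_BINS>>>(dParent, dSmall, dLarge, NUM_BINS);
    cudaDeviceSynchronize();

    float hLarge[NUM_BINS];
    cudaMemcpy(hLarge, dLarge, NUM_BINS * sizeof(float), cudaMemcpyDeviceToHost);
    // Rows 4..6 fall into bin 5, so the expected sum is 0.4 + 0.2 + 0.2 = 0.80.
    printf("bin 5 of larger child: %.2f\n", hLarge[5]);

    cudaFree(dBins); cudaFree(dGrads);
    cudaFree(dParent); cudaFree(dSmall); cudaFree(dLarge);
    return 0;
}

A common design choice in histogram-based GBDT systems is to build the histogram of the child with fewer instances from the data and obtain the sibling by subtraction, roughly halving the accumulation work at each tree level.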

