
Browse/Search Results: 1-3 of 3

Planck: Optimizing LLM Inference Performance in Pipeline Parallelism with Fine-Grained SLO Constraint Journal article
Lin, Yanying, Peng, Shijie, Wu, Shuaipeng, Li, Yanbo, Lu, Chengzhi, Xu, Chengzhong, Ye, Kejiang. Planck: Optimizing LLM Inference Performance in Pipeline Parallelism with Fine-Grained SLO Constraint[J]. Proceedings of the IEEE International Conference on Web Services, ICWS, 2024, 1306-1313.
Authors:  Lin, Yanying;  Peng, Shijie;  Wu, Shuaipeng;  Li, Yanbo;  Lu, Chengzhi; et al.
TC[WOS]: 0  TC[Scopus]: 1 | Submit date: 2024/12/26
LLM Serving  Pipeline Bubble  Pipeline Parallelism  SLO Constraint  
Planck: Optimizing LLM Inference Performance in Pipeline Parallelism with Fine-Grained SLO Constraint Conference paper
Lin, Yanying, Peng, Shijie, Wu, Shuaipeng, Li, Yanbo, Lu, Chengzhi, Xu, Chengzhong, Ye, Kejiang. Planck: Optimizing LLM Inference Performance in Pipeline Parallelism with Fine-Grained SLO Constraint[C]. Institute of Electrical and Electronics Engineers Inc., 2024, 1306-1313.
Authors:  Lin, Yanying;  Peng, Shijie;  Wu, Shuaipeng;  Li, Yanbo;  Lu, Chengzhi; et al.
TC[WOS]: 0  TC[Scopus]: 1 | Submit date: 2024/12/05
LLM Serving  Pipeline Bubble  Pipeline Parallelism  SLO Constraint  
Deep Learning for Web Services Classification Conference paper
Yang, Yilong, Ke, Wei, Wang, Weiru, Zhao, Yongxin. Deep Learning for Web Services Classification[C]. Bertino E., Chang C.K., Chen P., Damiani E., Goul M., Oyama K., IEEE, 2019, 440-442.
Authors:  Yang, Yilong;  Ke, Wei;  Wang, Weiru;  Zhao, Yongxin
TC[WOS]: 33  TC[Scopus]: 47 | Submit date: 2022/05/17
Deep Learning  Service  Service Classification  Service Discovery  Web Service