TY - GEN
T1 - RECL
T2 - 20th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2023
AU - Khani, Mehrdad
AU - Ananthanarayanan, Ganesh
AU - Hsieh, Kevin
AU - Jiang, Junchen
AU - Netravali, Ravi
AU - Shu, Yuanchao
AU - Alizadeh, Mohammad
AU - Bahl, Victor
N1 - Publisher Copyright:
© NSDI 2023. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Continuous learning has recently shown promising results for video analytics by adapting a lightweight “expert” DNN model for each specific video scene to cope with data drift in real time. However, current adaptation approaches either rely on periodic retraining and suffer from its delay and significant compute costs, or rely on selecting historical models and incur accuracy loss by not fully leveraging the potential of persistent retraining. Without dynamically optimizing the resource sharing between model selection and retraining, both approaches have diminishing returns at scale. RECL is a new video-analytics framework that carefully integrates model reuse and online model retraining, allowing it to quickly adapt the expert model given any video frame samples. To do this, RECL (i) shares across edge devices a (potentially growing) “model zoo” that comprises expert models previously trained for all edge devices, enabling historical model reuse across video sessions, (ii) uses a fast procedure to select, online, a highly accurate expert model from this shared model zoo, and (iii) dynamically optimizes GPU allocation among model retraining, model selection, and timely updates of the model zoo. Our evaluation of RECL over 70 hours of real-world videos across two vision tasks (object detection and classification) shows substantial performance gains compared to prior work, with the gains growing further over the system lifetime.
UR - http://www.scopus.com/inward/record.url?scp=85153753793&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85153753793&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85153753793
T3 - Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2023
SP - 917
EP - 932
BT - Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2023
PB - USENIX Association
Y2 - 17 April 2023 through 19 April 2023
ER -