TY - JOUR
T1 - Exploring software partitions for fast security processing on a multiprocessor mobile SoC
AU - Arora, Divya
AU - Raghunathan, Anand
AU - Ravi, Srivaths
AU - Sankaradass, Murugan
AU - Jha, Niraj K.
AU - Chakradhar, Srimat T.
N1 - Funding Information:
Manuscript received December 21, 2006. This work was supported by the National Science Foundation under Grant CCR-0326372. D. Arora and N. K. Jha are with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA. A. Raghunathan, M. Sankaradass, and S. T. Chakradhar are with NEC Laboratories, Princeton, NJ 08540 USA. S. Ravi is with the Research and Development Center, Texas Instruments, Bangalore 560093, India. Digital Object Identifier 10.1109/TVLSI.2007.898740
PY - 2007/6
Y1 - 2007/6
AB - The functionality of mobile devices, such as cell phones and personal digital assistants (PDAs), has evolved to include various applications where security is a critical concern (secure web transactions, mobile commerce, download and playback of protected audio/video content, connection to corporate private networks, etc.). Security mechanisms (e.g., secure communication protocols) involve cryptographic algorithms, and are often quite computationally intensive, challenging the constrained processing and battery resources of mobile devices. Extensive design effort and aggressive hardware and software optimizations are required to address this challenge. Previous work has addressed the design of hardware architectures (custom accelerators, domain-specific processors, etc.) to accelerate security processing, and many emerging systems-on-chip (SoCs) feature some form of hardware support for security. In this paper, we address the complementary problem of mapping a complex security software library to an SoC platform with security hardware enhancements. We present a systematic methodology for exploring the software architecture for security processing for a commercial heterogeneous multiprocessor SoC for mobile devices. The SoC contains multiple host processors executing applications and a dedicated programmable security processing engine. We developed an exploration methodology to map the code and data of security software libraries onto the platform, with the objective of maximizing the overall application-visible performance. The salient features of the methodology include: 1) the use of real performance measurements from a prototyping board, which contains the target platform, to drive the exploration; 2) a new data structure access profiling framework that allows us to accurately model the communication overheads involved in offloading a given set of functions to the security processor; and 3) an exact branch-and-bound-based design space exploration algorithm that determines the best mapping of security library functions and data structures to the host and security processors. We used the proposed framework to map a commercial security library to the target mobile application SoC. The resulting optimized software architecture outperformed several manually designed software architectures, resulting in up to 12.5× speed-up for individual cryptographic operations (encryption, hashing) and 2.2-6.2× speed-up for applications such as a digital rights management (DRM) agent and secure sockets layer (SSL) client. We also demonstrate the applicability of our framework to software architecture exploration in other multiprocessor scenarios.
KW - Embedded processors
KW - Performance
KW - Security and protection
KW - Software partitioning
UR - http://www.scopus.com/inward/record.url?scp=34250166167&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34250166167&partnerID=8YFLogxK
U2 - 10.1109/TVLSI.2007.898740
DO - 10.1109/TVLSI.2007.898740
M3 - Article
AN - SCOPUS:34250166167
SN - 1063-8210
VL - 15
SP - 699
EP - 710
JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IS - 6
ER -