SECURING MACHINE LEARNING ALGORITHMS
December 2021

ABOUT ENISA
The European Union Agency for Cybersecurity, ENISA, is the Union's agency dedicated to achieving a high common level of cybersecurity across Europe. Established in 2004 and strengthened by the EU Cybersecurity Act, the European Union Agency for Cybersecurity contributes to EU cyber policy, enhances the trustworthiness of ICT products, services and processes with cybersecurity certification schemes, cooperates with Member States and EU bodies, and helps Europe prepare for the cyber challenges of tomorrow. Through knowledge sharing, capacity building and awareness raising, the Agency works together with its key stakeholders to strengthen trust in the connected economy, to boost the resilience of the Union's infrastructure and, ultimately, to keep Europe's society and citizens digitally secure. More information about ENISA and its work can be found at www.enisa.europa.eu.

CONTACT
To contact the authors, please use info@enisa.europa.eu. For media enquiries about this paper, please use press@enisa.europa.eu.

EDITORS
Apostolos Malatras, Ioannis Agrafiotis, Monika Adamczyk, ENISA

ACKNOWLEDGEMENTS
We would like to thank the Members and Observers of the ENISA ad hoc Working Group on Artificial Intelligence for their valuable input and feedback.

LEGAL NOTICE
Notice must be taken that this publication represents the views and interpretations of ENISA, unless stated otherwise.
This publication should not be construed to be a legal action of ENISA or the ENISA bodies unless adopted pursuant to Regulation (EU) No 2019/881. This publication does not necessarily represent the state of the art and ENISA may update it from time to time. Third-party sources are quoted as appropriate. ENISA is not responsible for the content of the external sources, including external websites, referenced in this publication. This publication is intended for information purposes only. It must be accessible free of charge. Neither ENISA nor any person acting on its behalf is responsible for the use that might be made of the information contained in this publication.

COPYRIGHT NOTICE
© European Union Agency for Cybersecurity (ENISA), 2021
Reproduction is authorised provided the source is acknowledged. Copyright for the image on the cover: Shutterstock. For any use or reproduction of photos or other material that is not under the ENISA copyright, permission must be sought directly from the copyright holders.

ISBN: 978-92-9204-543-2 - DOI: 10.2824/874249 - Catalogue Nr.: TP-06-21-153-EN-N

TABLE OF CONTENTS
EXECUTIVE SUMMARY 3
1. INTRODUCTION 4
1.1 OBJECTIVES 4
1.2 METHODOLOGY 4
1.3 TARGET AUDIENCE 5
1.4 STRUCTURE 6
2. MACHINE LEARNING ALGORITHMS TAXONOMY 7
2.1 MAIN DOMAIN AND DATA TYPES 8
2.2 LEARNING PARADIGMS 9
2.3 NAVIGATING THE TAXONOMY 10
2.4 EXPLAINABILITY AND ACCURACY 10
2.5 AN OVERVIEW OF AN END-TO-END MACHINE LEARNING LIFECYCLE 11
3. ML THREATS AND VULNERABILITIES 13
3.1 IDENTIFICATION OF THREATS 13
3.2 VULNERABILITIES MAPPED TO THREATS 16
4. SECURITY CONTROLS 18
4.1 SECURITY CONTROLS RESULTS 18
5. CONCLUSION 26
A ANNEX: TAXONOMY OF ALGORITHMS 28
B ANNEX: MAPPING SECURITY CONTROLS TO THREATS 34
C ANNEX: IMPLEMENTING SECURITY CONTROLS 38
D ANNEX: REFERENCES 43

EXECUTIVE SUMMARY
The vast developments in digital technology influence every aspect of our daily lives. Emerging technologies such as Artificial Intelligence (AI), which are at the epicentre of the digital evolution, have accelerated the digital transformation, contributing to social and economic prosperity. However, the application of emerging technologies, and of AI in particular, entails perils that need to be addressed if we are to ensure a secure and trustworthy environment. In this report, we focus on the most essential element of an AI system: machine learning algorithms. We review related technological developments and security practices to identify emerging threats, highlight gaps in security controls and recommend pathways to enhance the cybersecurity posture of machine learning
systems. Based on a systematic review of relevant literature on machine learning, we provide a taxonomy for machine learning algorithms, highlighting core functionalities and critical stages. The taxonomy sheds light on the main data types used by algorithms, the type of training these algorithms entail (supervised, unsupervised) and how output is shared with users. Particular emphasis is given to the explainability and accuracy of these algorithms. Next, the report presents a detailed analysis of threats targeting machine learning systems. Identified threats include, inter alia, data poisoning, adversarial attacks and data exfiltration. Through detailed tables, all threats are associated with the particular functionalities of the taxonomy that they exploit. Finally, we examine mainstream security controls described in widely adopted standards, such as ISO 27001 and the NIST Cybersecurity Framework, to understand how these controls can effectively detect, deter and mitigate harm from the identified threats. To perform our analysis, we map all the controls to the core functionalities of machine learning systems that they protect and to the vulnerabilities that threats exploit in these systems. Our analysis indicates that conventional security controls, albeit very effective for information systems, need to be complemented by security controls tailored to machine learning functionalities. To identify these machine-learning controls, we conduct a systematic review of relevant literature,
where academia and research institutes propose ways to avoid and mitigate threats targeting machine learning algorithms. Our report provides an extensive list of security controls that are applicable only to machine learning systems, such as "include adversarial examples to training datasets". For all controls, we map the core functionality of machine learning algorithms that they intend to protect to the vulnerabilities that threats exploit. Our findings indicate that there is no single strategy for applying a specific set of security controls to protect machine learning algorithms. The overall cybersecurity posture of organisations that use machine learning algorithms can be enhanced by carefully choosing controls designed for these algorithms. As these controls are neither validated in depth nor standardised in how they should be implemented, further research should focus on creating benchmarks for their effectiveness. We further identified cases where the deployment of security controls may lead to trade-offs between security and performance. Therefore, the context in which controls are applied is crucial, and next steps should focus on considering specific use cases and conducting targeted risk assessments to better understand these trade-offs. Finally, given the complexity of securing machine learning systems, governments and related institutions have new responsibilities in raising awareness regarding the impact of threats on machine learning. It is important to educate data scientists on the perils of threats and on the design of security controls before machine learning algorithms are used in organisations' environments. By engaging machine learning experts in cybersecurity issues, we may create the opportunity to design innovative security solutions and mitigate
the emerging threats on machine learning systems.

This report provides a taxonomy for machine learning algorithms, a detailed analysis of threats, and security controls in widely adopted standards.

1. INTRODUCTION
Artificial Intelligence (AI) has grown significantly in recent years and, driven by computational advancements, has found wide applicability. By providing new opportunities to solve decision-making problems intelligently and automatically, AI is being applied to more and more use cases in a growing number of sectors. The benefits of AI are significant and undeniable. However, the development of AI is also accompanied by new threats and challenges, which relevant professionals will have to face.

In 2020, ENISA published a threat landscape report on AI [1]. This report, published with the support of the Ad-Hoc Working Group on Artificial Intelligence Cybersecurity [2], presents the Agency's active mapping of the AI cybersecurity ecosystem and its threat landscape. This threat landscape not only lays the foundation for upcoming cybersecurity policy initiatives and technical guidelines, but also stresses relevant challenges.

Machine
learning (ML), which can be defined as the ability of machines to learn from data to solve a task without being explicitly programmed to do so, is currently the most developed and promising subfield of AI for industrial and government infrastructures. It is also the most commonly used subfield of AI in our daily lives. ML algorithms and their specificities, such as the fact that they need large amounts of data to learn, make them the subject of very specific cyber threats that project teams must consider. The aim of this study is to help project teams identify the specific threats that can target ML algorithms, the associated vulnerabilities, and the security controls for addressing these vulnerabilities. Building on the ENISA AI threat landscape mapping, this study focuses on cybersecurity threats specific to ML algorithms. Furthermore, vulnerabilities related to the aforementioned threats and, importantly, security controls and mitigation measures are proposed. The adopted description of AI is a deliberate simplification of the state of the art regarding that vast and complex discipline, with the intent not to define it precisely or comprehensively but rather to pragmatically contextualise the specific technique of machine learning.

1.1 OBJECTIVES
The objectives of this publication are:
• To produce a taxonomy of ML techniques and core functionalities to establish a logical link between threats and security controls.
• To identify the threats targeting ML techniques and the vulnerabilities of ML algorithms, as well as the relevant security controls and how these are currently being used in the field to ensure the minimisation of security risks.
• To propose recommendations on future steps to enhance cybersecurity in systems that rely on ML techniques.

1.2 METHODOLOGY
To produce this report,
the work was divided into three stages. At the core of the methodology was an extensive literature review (the full list of references may be found in Annex D). The aim was to consult documents that are more specific to ML algorithms in general in order to build the taxonomy, and to consult documents more specific to security to identify threats, vulnerabilities and security controls. At the end of the systematic review, more than 200 different documents (of which a hundred are related to security) on various ML algorithms had been collected and analysed.

[1] https://www.enisa.europa.eu/publications/artificial-intelligence-cybersecurity-challenges
[2] See https://www.enisa.europa.eu/topics/iot-and-smart-infrastructures/artificial_intelligence/ad-hoc-working-group/adhoc_wg_calls

First, we introduced a high-level ML taxonomy. To understand the vulnerabilities of different ML algorithms, and how they can be threatened and protected, it is crucial to have an overview of their core functionalities and lifecycle. To do so, a first version of the desk research on ML-focussed sources was compiled and the ML lifecycle presented in ENISA's work on AI cybersecurity challenges was consulted [3]. We then analysed and synthesised all references to produce a first draft of the taxonomy. The draft was submitted and interviews were held with the ENISA Ad-Hoc Working Group on Artificial Intelligence Cybersecurity. After considering their feedback, the ML
taxonomy and lifecycle were validated.

The second step was to identify the cybersecurity threats that could target ML algorithms and the potential vulnerabilities. For this task, the threat landscape from ENISA's report on AI cybersecurity challenges was the starting point, which was then enriched through desk research with sources related to the security of ML algorithms. Additionally, the expertise of the ENISA Ad-Hoc Working Group on Artificial Intelligence Cybersecurity was sought. This work allowed us to select threats and identify associated vulnerabilities. Subsequently, they were linked to the previously established ML taxonomy.

The last step of this work was the identification of the security controls addressing the vulnerabilities. To do this, we utilised the desk research and enriched it with the most relevant standard security controls from ISO 27001/2 and the NIST 800-53 framework. The output was reviewed with the experts of the ENISA Ad-Hoc Working Group on Artificial Intelligence Cybersecurity. This work allowed us to identify security controls that were then linked to the ML taxonomy. It is important to note that we opted to enrich the ML-targeted security controls with
more conventional ones to highlight that applications using ML must also comply with more classic controls in order to be sufficiently protected. Considering only measures that are specific to ML would give a partial picture of the security work needed on these applications.

1.3 TARGET AUDIENCE
The target audience of this report can be divided into the following categories:
• Public/governmental sector (EU institutions and agencies, Member States' regulatory bodies, supervisory authorities in the field of data protection, military and intelligence agencies, the law enforcement community, international organisations, and national cybersecurity authorities): to help them with their risk analysis, identify threats and understand how to secure ML algorithms.
• Industry (including Small and Medium Enterprises (SMEs)) that makes use of AI solutions and/or is engaged in cybersecurity, including operators of essential services: to help them with their risk analysis, identify threats and understand how to secure ML algorithms.
• AI technical community, AI cybersecurity experts and AI experts (designers, developers, ML experts, data scientists, etc.) with an interest in developing secure solutions and in integrating security and privacy by design in their solutions.
• Cybersecurity community: to identify threats and security controls that can apply to ML algorithms.
• Academia and research community: to obtain knowledge on the topic of securing ML algorithms and identify existing work in the field.
• Standardisation bodies: to help identify key aspects to consider regarding securing ML algorithms.

[3] https://www.enisa.europa.eu/publications/artificial-intelligence-cybersecurity-challenges

1.4 STRUCTURE
The report aims to help the target audience t