Privacy protection in machine learning: The state-of-the-art for a private decision tree

Yee Jian Chew, Kok Seng Wong, Shih Yin Ooi

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

The explosive growth and widespread accessibility of digital data have led to a surge of research activity in the machine learning field. Typically, massive data collection is required to improve the quality of machine learning results. Often, these data contain highly sensitive information such as medical histories or financial records. Yet privacy concerns are often overshadowed by other factors in today's machine learning systems. A fundamental problem in privacy-preserving machine learning (PPML) is how to make the right tradeoff between privacy and utility. On the one hand, the PPML solution must not allow the original data records (e.g., training data) to be recovered (i.e., privacy loss). On the other, it must allow the system to learn a model that closely approximates the model trained on the original data (i.e., utility gain). In this chapter, we discuss several emerging technologies that can be used to protect privacy in machine learning systems. In addition, we provide a state-of-the-art review of the adoption of privacy-preserving schemes in decision tree algorithms.
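The privacy–utility tradeoff mentioned in the abstract can be illustrated with the Laplace mechanism from differential privacy, one of the schemes commonly applied to decision-tree learning (e.g., to perturb the class counts that ID3 uses when computing information gain). The sketch below is illustrative only and is not the chapter's own construction; `dp_count` is a hypothetical helper name.

```python
import math
import random

def dp_count(records, predicate, epsilon):
    """Return a count perturbed with Laplace noise.

    A counting query has sensitivity 1, so Laplace noise with
    scale b = 1/epsilon yields epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    # Inverse-CDF sampling of Laplace(0, 1/epsilon) noise.
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

random.seed(0)
ages = [23, 35, 41, 58, 62, 19, 47, 33]  # true count of ages >= 40 is 4
# Strong privacy (small epsilon): large noise, low utility.
print(dp_count(ages, lambda a: a >= 40, epsilon=0.1))
# Weak privacy (large epsilon): answer stays close to the true count.
print(dp_count(ages, lambda a: a >= 40, epsilon=10.0))
```

Smaller `epsilon` means the released count reveals less about any single record but drifts further from the true value, which is exactly the privacy-loss versus utility-gain tension the abstract describes.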

Original language: English
Title of host publication: Security and Authentication
Subtitle of host publication: Perspectives, Management and Challenges
Publisher: Nova Science Publishers, Inc.
Pages: 13-39
Number of pages: 27
ISBN (Electronic): 9781536129434
ISBN (Print): 9781536129427
Publication status: Published - Jan 1 2017


Keywords

  • C4.5
  • Classification
  • ID3
  • Machine learning
  • Privacy-preserving

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Chew, Y. J., Wong, K. S., & Ooi, S. Y. (2017). Privacy protection in machine learning: The state-of-the-art for a private decision tree. In Security and Authentication: Perspectives, Management and Challenges (pp. 13-39). Nova Science Publishers, Inc..

@inbook{101f731466df4662a9656b5666bab323,
title = "Privacy protection in machine learning: The state-of-the-art for a private decision tree",
abstract = "The explosive growth and widespread accessibility of digital data have led to a surge of research activity in the machine learning field. Typically, massive data collection is required to improve the quality of machine learning results. Often, these data contain highly sensitive information such as medical histories or financial records. Yet privacy concerns are often overshadowed by other factors in today's machine learning systems. A fundamental problem in privacy-preserving machine learning (PPML) is how to make the right tradeoff between privacy and utility. On the one hand, the PPML solution must not allow the original data records (e.g., training data) to be recovered (i.e., privacy loss). On the other, it must allow the system to learn a model that closely approximates the model trained on the original data (i.e., utility gain). In this chapter, we discuss several emerging technologies that can be used to protect privacy in machine learning systems. In addition, we provide a state-of-the-art review of the adoption of privacy-preserving schemes in decision tree algorithms.",
keywords = "C4.5, Classification, ID3, Machine learning, Privacy-preserving",
author = "Chew, {Yee Jian} and Wong, {Kok Seng} and Ooi, {Shih Yin}",
year = "2017",
month = "1",
day = "1",
language = "English",
isbn = "9781536129427",
pages = "13--39",
booktitle = "Security and Authentication",
publisher = "Nova Science Publishers, Inc.",
address = "United States",

}

TY - CHAP

T1 - Privacy protection in machine learning

T2 - The state-of-the-art for a private decision tree

AU - Chew, Yee Jian

AU - Wong, Kok Seng

AU - Ooi, Shih Yin

PY - 2017/1/1

Y1 - 2017/1/1

N2 - The explosive growth and widespread accessibility of digital data have led to a surge of research activity in the machine learning field. Typically, massive data collection is required to improve the quality of machine learning results. Often, these data contain highly sensitive information such as medical histories or financial records. Yet privacy concerns are often overshadowed by other factors in today's machine learning systems. A fundamental problem in privacy-preserving machine learning (PPML) is how to make the right tradeoff between privacy and utility. On the one hand, the PPML solution must not allow the original data records (e.g., training data) to be recovered (i.e., privacy loss). On the other, it must allow the system to learn a model that closely approximates the model trained on the original data (i.e., utility gain). In this chapter, we discuss several emerging technologies that can be used to protect privacy in machine learning systems. In addition, we provide a state-of-the-art review of the adoption of privacy-preserving schemes in decision tree algorithms.

AB - The explosive growth and widespread accessibility of digital data have led to a surge of research activity in the machine learning field. Typically, massive data collection is required to improve the quality of machine learning results. Often, these data contain highly sensitive information such as medical histories or financial records. Yet privacy concerns are often overshadowed by other factors in today's machine learning systems. A fundamental problem in privacy-preserving machine learning (PPML) is how to make the right tradeoff between privacy and utility. On the one hand, the PPML solution must not allow the original data records (e.g., training data) to be recovered (i.e., privacy loss). On the other, it must allow the system to learn a model that closely approximates the model trained on the original data (i.e., utility gain). In this chapter, we discuss several emerging technologies that can be used to protect privacy in machine learning systems. In addition, we provide a state-of-the-art review of the adoption of privacy-preserving schemes in decision tree algorithms.

KW - C4.5

KW - Classification

KW - ID3

KW - Machine learning

KW - Privacy-preserving

UR - http://www.scopus.com/inward/record.url?scp=85044637604&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044637604&partnerID=8YFLogxK

M3 - Chapter

AN - SCOPUS:85044637604

SN - 9781536129427

SP - 13

EP - 39

BT - Security and Authentication

PB - Nova Science Publishers, Inc.

ER -