Privacy Preserving Collaborative Model Document Clustering Using TF-IDF Approach

Authors

  • T. G. Babu, Assistant Professor, PG & Research Department of Computer Science and Applications, Arignar Anna Govt Arts College, Arcot Road, Cheyyar, Vellore, Tamil Nadu, India
  • E. Anitha, M.Phil (CS) Research Scholar, PG & Research Department of Computer Science and Applications, Arignar Anna Govt Arts College, Arcot Road, Cheyyar, Vellore, Tamil Nadu, India

Keywords:

Continuous Bag of Words, Skip-gram model, Google DeepMind, AlphaGo, Healthcare provider

Abstract

With the growing popularity of public computing infrastructures (e.g., cloud platforms), it has become easier than ever for distributed users across the Internet to perform collaborative learning through shared infrastructure. While the potential benefits of collaborative machine learning can be enormous, the large-scale training data may pose substantial privacy risks. In other words, centralized collection of data from different participants may raise serious concerns about data confidentiality and privacy. For instance, in application scenarios such as healthcare, individuals or patients may not want to reveal their sensitive information (e.g., protected health data) to anyone else, and disclosure of such information is prohibited by laws and regulations such as HIPAA. To address such privacy issues, a straightforward approach is to encrypt sensitive information before sharing it. However, data encryption hinders data utilization and computation, making it hard to perform collaborative machine learning efficiently compared with the plaintext domain.

Published

2018-08-30

Section

Research Articles

How to Cite

T. G. Babu and E. Anitha, "Privacy Preserving Collaborative Model Document Clustering Using TF-IDF Approach," International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 4, Issue 9, pp. 615-627, July-August 2018.