Privacy Preserving Collaborative Model Document Clustering Using TF-IDF Approach

Authors: T. G. Babu, E. Anitha

With the growing popularity of public computing infrastructures (e.g., cloud platforms), it has become more convenient than ever for distributed users across the Internet to perform collaborative learning through shared infrastructure. While the potential benefits of collaborative machine learning can be enormous, large-scale training data may pose substantial privacy risks. In other words, centralized collection of data from different participants may raise serious concerns about data confidentiality and privacy. For instance, in certain application scenarios such as healthcare, individuals or patients may be unwilling to reveal their sensitive information (e.g., protected health data) to anyone else, and disclosure of such private information is prohibited by laws and regulations such as HIPAA. To address such privacy issues, a straightforward approach is to encrypt sensitive information before sharing it. However, data encryption hinders data utilization and computation, making it hard to perform collaborative machine learning as efficiently as in the plaintext domain.

Authors and Affiliations

T. G. Babu
Assistant Professor, PG & Research Department of Computer Science and Applications, Arignar Anna Govt Arts College, Arcot Road, Cheyyar, Vellore, Tamil Nadu, India
E. Anitha
M.Phil (CS) Research Scholar, PG & Research Department of Computer Science and Applications, Arignar Anna Govt Arts College, Arcot Road, Cheyyar, Vellore, Tamil Nadu, India

Keywords: Continuous Bag of Words, Skip-gram Model, Google DeepMind, AlphaGo, Healthcare Provider

Publication Details

Published in : Volume 4 | Issue 9 | July-August 2018
Date of Publication : 2018-08-30
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 615-627
Manuscript Number : IJSRSET1849134
Publisher : Technoscience Academy

Print ISSN : 2395-1990, Online ISSN : 2394-4099

Cite This Article :

T. G. Babu, E. Anitha, "Privacy Preserving Collaborative Model Document Clustering Using TF-IDF Approach," International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN : 2395-1990, Online ISSN : 2394-4099, Volume 4, Issue 9, pp. 615-627, July-August 2018.