A Survey on Anomalous Topic Discovery in High Dimensional Data

Authors

  • Chaitali M. Mohod  PG Scholar, Department of Computer Science & Engineering, Guru Nanak Institute of Engineering & Technology, Nagpur, Maharashtra, India
  • Prof. Kalpana Malpe  Assistant Professor, Department of Computer Science & Engineering, Guru Nanak Institute of Engineering & Technology, Nagpur, Maharashtra, India

DOI:

https://doi.org/10.32628/IJSRSET196148

Keywords:

Anomaly Detection, Pattern Detection, Topic Models, Topic Discovery

Abstract

Finding unusual data, i.e. anomalies, in discrete data leads to a better understanding of atypical behavior and helps identify the source of the anomalies. Anomalies can be defined as patterns that do not conform to normal behavior, and the task of finding them is called anomaly detection. Anomaly detection techniques are mostly used for fraud detection in credit cards and banking, network intrusion detection, and similar applications. Anomalies are also referred to as outliers, deviations, peculiarities, or exceptions. Such patterns may not fit the usual definition of an outlier as a rare object until the data have been aggregated appropriately. Cluster analysis can be used to detect the micro-clusters formed by these anomalies. In this paper, we survey existing techniques for detecting anomalies in datasets, most of which detect only individual anomalies. Individual anomaly detection methods, and methods that detect anomalies using the entire feature set, typically fail to detect anomalies that show up on only a small subset of features. We also describe a method that detects clusters of anomalous data which jointly exhibit atypical patterns on a small subset of features. This method uses a null model for the typical topics and then applies multiple tests to identify all clusters of anomalous patterns.
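The last two sentences of the abstract can be illustrated with a toy sketch. It is not the surveyed method: it assumes a smoothed unigram distribution as the null model instead of a topic model, and a simple empirical p-value as the test. All function names and the tiny corpus are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def fit_null_model(docs, vocab):
    """Estimate an add-one-smoothed unigram distribution from documents assumed typical."""
    counts = Counter(w for d in docs for w in d)
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def avg_log_likelihood(doc, null):
    """Average per-word log-likelihood of one document under the null model."""
    return sum(math.log(null[w]) for w in doc) / len(doc)

def empirical_p_value(score, reference_scores):
    """Fraction of reference scores that are at least as low as `score`."""
    return sum(s <= score for s in reference_scores) / len(reference_scores)

# Hypothetical toy corpus: "typical" documents and a candidate group
# that shares an unusual word ("refund") on a small part of the vocabulary.
typical = [
    ["bank", "card", "payment", "bank"],
    ["card", "payment", "account", "bank"],
    ["account", "bank", "card", "card"],
    ["payment", "account", "bank", "payment"],
]
candidates = [
    ["bank", "card", "refund", "refund"],
    ["refund", "refund", "account", "payment"],
]

vocab = sorted({w for d in typical + candidates for w in d})
null = fit_null_model(typical, vocab)

# Simplification: the reference scores are computed on the same documents
# used to fit the null model; a held-out or bootstrap sample would be cleaner.
ref = [avg_log_likelihood(d, null) for d in typical]

# Score the candidate group as a whole, then compare it with the reference scores.
group_score = sum(avg_log_likelihood(d, null) for d in candidates) / len(candidates)
p = empirical_p_value(group_score, ref)
print(f"group score = {group_score:.3f}, empirical p-value = {p:.2f}")
```

A small p-value here only says that the group is poorly explained by the null model of typical content; a topic-model based approach would additionally learn which words are salient for the anomalous cluster.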

Published

2019-01-30

Section

Research Articles

How to Cite

[1]
Chaitali M. Mohod, Prof. Kalpana Malpe, "A Survey on Anomalous Topic Discovery in High Dimensional Data," International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 6, Issue 1, pp. 188-194, January-February 2019. Available at doi: https://doi.org/10.32628/IJSRSET196148