Query Aware Determinization of Uncertain Objects by Indexing

Authors

  • Sakthi Priyadharshan V  Department of Computer Science and Engineering, Dhanalakshmi College of Engineering, Chennai, India
  • Dr. Sivasubramaniam S  Department of Computer Science and Engineering, Dhanalakshmi College of Engineering, Chennai, India

Keywords:

Optimization, Image Identification, Indexing, Categorization

Abstract

This project considers the problem of determinizing uncertain data to facilitate its storage in legacy systems that accept only deterministic input. Probabilistic data may be generated by automated data analysis and enrichment techniques such as entity resolution, information extraction, and speech processing. The legacy systems may be existing web applications such as Flickr and Picasa. The goal is to create a deterministic representation of probabilistic data that improves the quality of the end-user application built on the deterministic data. We study and solve this determinization problem in the context of two data processing tasks: triggers and selection queries. Methods traditionally used for determinization, such as thresholding or top-1 selection, are known to lead to suboptimal performance for such applications. Instead, we develop a query-aware strategy and show its advantages over existing solutions through a comprehensive empirical evaluation on real and synthetic datasets.
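To make the two baseline methods named in the abstract concrete, the following minimal Python sketch applies top-1 selection and thresholding to a single uncertain object. The tag names, probabilities, threshold value, and function names are hypothetical and chosen only for illustration. The sketch does not reproduce the query-aware strategy developed in the paper.

```python
from typing import Dict, Set


def top1(tags: Dict[str, float]) -> Set[str]:
    """Top-1 selection: keep only the single most probable tag."""
    if not tags:
        return set()
    return {max(tags, key=tags.get)}


def threshold(tags: Dict[str, float], tau: float = 0.5) -> Set[str]:
    """Thresholding: keep every tag whose probability is at least tau."""
    return {tag for tag, prob in tags.items() if prob >= tau}


def matches(query_tag: str, stored_tags: Set[str]) -> bool:
    """A selection query on a deterministic store: simple tag membership."""
    return query_tag in stored_tags


# Hypothetical probabilistic annotation of one photo (illustrative values).
photo = {"beach": 0.6, "sunset": 0.55, "dog": 0.1}

stored_top1 = top1(photo)
stored_thr = threshold(photo, tau=0.5)

# The same query behaves differently depending on how the data was determinized.
print(matches("sunset", stored_top1))
print(matches("sunset", stored_thr))
```

In this toy case a selection query for "sunset" misses the photo under top-1 selection but finds it under thresholding. This shows why the choice of determinization method affects the queries that run on the stored data.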

Published

2017-12-31

Section

Research Articles

How to Cite

[1]
Sakthi Priyadharshan V, Dr. Sivasubramaniam S, "Query Aware Determinization of Uncertain Objects by Indexing," International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 2, Issue 2, pp. 332-336, March-April 2016.