Attribute Based Document De-Duplication Using the Metadata based Framework

Authors

  • Ravikanth M  Associate Professor of CSE in CMRTC, Hyderabad, Telangana, India
  • Bhuvaneshwari  Professor of CSE in CU, Kalapet, Pondicherry, Tamil Nadu, India

Keywords:

Document, de-duplication, adaptive forms, collaborative platforms

Abstract

In the modern age of information technology, approaches to content-based document de-duplication keep changing. Both structured and unstructured data carry significant information, but structured content is the more useful form for processing. In this paper, we give an overview of how metadata-based information can be captured through the user interface (UI). Technology facilitates this process, but it cannot ensure that all of the submitted data is searchable. To overcome this limitation, we need a user-interface protocol that, before the data is submitted, puts it into a format suited to query-based structured or unstructured approaches. We use a UI-based framework that draws on the content of the document to facilitate the QTP process, while the metadata supplies the category used by the protocol.
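As a rough illustration of attribute-based de-duplication over metadata, the sketch below compares documents by a key built from a few metadata attributes. It is not the framework described in the paper: the attribute names (`title`, `author`, `category`), the normalization rule, and the helper names `metadata_key` and `deduplicate` are all assumptions made for the example.

```python
# Hypothetical attribute names; the abstract does not list the metadata fields used.
KEY_ATTRIBUTES = ("title", "author", "category")


def normalize(value):
    """Lower-case and collapse whitespace so trivially different values compare equal."""
    return " ".join(str(value).lower().split())


def metadata_key(doc, attributes=KEY_ATTRIBUTES):
    """Build a comparison key from the chosen metadata attributes of one document."""
    return tuple(normalize(doc.get(name, "")) for name in attributes)


def deduplicate(docs, attributes=KEY_ATTRIBUTES):
    """Keep the first document seen for each metadata key.

    Returns (unique, duplicates), where each duplicate is paired with
    the earlier document it matched.
    """
    seen = {}
    unique, duplicates = [], []
    for doc in docs:
        key = metadata_key(doc, attributes)
        if key in seen:
            duplicates.append((doc, seen[key]))
        else:
            seen[key] = doc
            unique.append(doc)
    return unique, duplicates
```

Exact matching on normalized attributes is the simplest possible choice; a real system would likely need fuzzy matching or content comparison on top of it.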

Published

2015-02-25

Section

Research Articles

How to Cite

[1]
Ravikanth M, Bhuvaneshwari, "Attribute Based Document De-Duplication Using the Metadata based Framework," International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 1, Issue 1, pp. 396-400, January-February 2015.