A Simple Study on Search Engine Text Classification for Retails Store

Renien John Joseph, Samith Sandanayake, Thanuja Perera


Textual content on the World Wide Web (WWW) continues to grow rapidly. Combining sophisticated text processing with classification techniques can produce highly accurate search results. A large body of research has addressed these problems, but each study adopts its own theories and approaches according to its data collection. The task remains challenging, and this paper brings together existing solutions and compares the different approaches comprehensively, which should help in implementing a robust search engine. The research shows that probabilistic text classification models classify documents robustly. To improve search results that involve short texts, however, a hybrid approach that includes rules and statistical neural network models is needed. Pre-processing and post-processing modules should be adapted as pruning components. Because the data is dynamic, the processing pipeline should also be updated frequently.


Search queries; Text Classification; Rules; Machine Learning; and Information Retrieval




