Browse by author "Ozcan, Rifat"
Showing 1 - 11 of 11

A Financial Cost Metric for Result Caching (Assoc Computing Machinery, 2013)
Sazoglu, Fethi Burak; Cambazoglu, B. Barla; Ozcan, Rifat; Altingovde, Ismail Sengor; Ulusoy, Ozgur
Web search engines cache results of frequent and/or recent queries. Result caching strategies can be evaluated using different metrics, hit rate being the most well-known. Recent works take the processing overhead of queries into account when evaluating the performance of result caching strategies and propose cost-aware caching strategies. In this paper, we propose a financial cost metric that goes one step beyond and takes also the hourly electricity prices into account when computing the cost. We evaluate the most well-known static, dynamic, and hybrid result caching strategies under this new metric. Moreover, we propose a financial-cost-aware version of the well-known LRU strategy and show that it outperforms the original LRU strategy in terms of the financial cost metric.

A Suggested Picture of Web Search in Turkish (Assoc Computing Machinery, 2016)
Sarigil, Erdem; Yilmaz, Oguz; Altingovde, Ismail Sengor; Ozcan, Rifat; Ulusoy, Ozgur
Although query log analysis provides crucial insights about Web users' search interests, conducting such analyses is almost impossible for some languages, as large-scale and public query logs are quite scarce. In this study, we first survey the existing query collections in Turkish and discuss their limitations. Next, we adopt a novel strategy to obtain a set of Turkish queries using the query autocompletion services from the four major search engines and provide the first large-scale analysis of Web queries and their results in Turkish.

Analyzing Readability Level of Educational Content in Turkish Language (IEEE, 2015)
Torer, Mustafa Anil; Ozcan, Rifat
In today's environment, in which digital data is continuously increasing, it is of prime importance for students to find data appropriate for their readability level.
In this study, our aim is to classify educational data in Turkish based on their readability level. Three readability formulas and new syllable and word level features are used in this study. K12 level Turkish Language course textbooks published by Turkish Ministry of National Education are used as training data. Classifier models are created with Naive Bayes, Decision Tree, Random Forest and Multilayer Perceptron classification algorithms, from these books. Our test data includes educational web pages obtained from 14 different web sites. As a result of the study, the readability formulas with the suggested word level features achieved more successful readability level detection than the readability formulas without them.

Classification of news-related tweets (Sage Publications Ltd, 2017)
Demirsoz, Orhan; Ozcan, Rifat
It is important to obtain public opinion about a news article. Microblogs such as Twitter are popular and an important medium for people to share ideas. An important portion of tweets are related to news or events. Our aim is to find tweets about newspaper reports and measure the popularity of these reports on Twitter. However, it is a challenging task to match informal and very short tweets with formal news reports. In this study, we formulate this problem as a supervised classification task. We propose to form a training set using tweets containing a link to the news and the content of the same news article. We preprocess tweets by removing unnecessary words and symbols and apply stemming by means of morphological analysers. We apply binary classifiers and anomaly detection to this task. We also propose a textual similarity-based approach. We observed that preprocessing of tweets increases accuracy. The textual similarity method obtains results with the highest recognition rate. Success increases in some cases when report text is used with tweets containing a link to the news report within the training set of classification studies.
We propose that this study, which is made directly in consideration of tweet texts that measure the trends of national newspaper reports on social media, has a higher significance when compared to Twitter analyses made by using a hashtag. Given the limited number of scientific studies on Turkish tweets, this study makes a contribution to the literature.

Comparing classification methods for link context based focused crawlers (IEEE Computer Society, 2013)
Caliskan, Kamil; Ozcan, Rifat
Focused crawlers aim to fetch pages only related to a specific subject area from millions of web pages on the Internet. The essential task in a focused crawler is to predict whether a page is related to the target subject area or not without actually fetching the page content itself. Link context based focused crawlers focus on the surrounding text around each link to classify the page pointed by the URL. In this paper, we aim to compare three different classification methods (naive Bayes, decision tree, and support vector machines) for the task of link context based focused crawling. © 2013 IEEE. © 2014 Elsevier B.V., All rights reserved.

COMPARING CLASSIFICATION METHODS FOR LINK CONTEXT BASED FOCUSED CRAWLERS (IEEE, 2013)
Caliskan, Kamil; Ozcan, Rifat
Focused crawlers aim to fetch pages only related to a specific subject area from millions of web pages on the Internet. The essential task in a focused crawler is to predict whether a page is related to the target subject area or not without actually fetching the page content itself. Link context based focused crawlers focus on the surrounding text around each link to classify the page pointed by the URL. In this paper, we aim to compare three different classification methods (naive Bayes, decision tree, and support vector machines) for the task of link context based focused crawling.

How K-12 Students Search For Learning? Analysis of an Educational Search Engine Log (Assoc Computing Machinery, 2014)
Usta, Arif; Altingovde, Ismail Sengor; Vidinli, Ibrahim Bahattin; Ozcan, Rifat; Ulusoy, Ozgur
In this study, we analyze an educational search engine log to shed light on K-12 students' search behavior in a learning environment. We specifically focus on query, session, user and click characteristics and compare the trends to the findings in the literature for general web search engines. Our analysis helps in understanding how students search with the purpose of learning in an educational vertical, and reveals new directions to improve the search performance in the education domain.

New query suggestion framework and algorithms: A case study for an educational search engine (Elsevier Sci Ltd, 2016)
Vidinli, I. Bahattin; Ozcan, Rifat
Query suggestion is generally an integrated part of web search engines. In this study, we first redefine and reduce the query suggestion problem to a comparison of queries. We then propose a general modular framework for query suggestion algorithm development. We also develop new query suggestion algorithms which are used in our proposed framework, exploiting query, session and user features. As a case study, we use query logs of a real educational search engine that targets K-12 students in Turkey. We also exploit educational features (course, grade) in our query suggestion algorithms. We test our framework and algorithms over a set of queries by an experiment and demonstrate a 66-90% statistically significant increase in relevance of query suggestions compared to a baseline method. © 2016 Elsevier Ltd. All rights reserved.

Propagating Expiration Decisions in a Search Engine Result Cache (Assoc Computing Machinery, 2015)
Sazoglu, Fethi Burak; Altingovde, Ismail Sengor; Ozcan, Rifat; Barla Cambazoglu, B.; Ulusoy, Ozgur
Detecting stale queries in a search engine result cache is an important problem.
In this work, we propose a mechanism that propagates the expiration decision for a query to similar queries in the cache to re-adjust their time-to-live values.

Second Chance: A Hybrid Approach for Dynamic Result Caching and Prefetching in Search Engines (Assoc Computing Machinery, 2013)
Ozcan, Rifat; Altingovde, Ismail Sengor; Barla Cambazoglu, B.; Ulusoy, Ozgur
Web search engines are known to cache the results of previously issued queries. The stored results typically contain the document summaries and some data that is used to construct the final search result page returned to the user. An alternative strategy is to store in the cache only the result document IDs, which take much less space, allowing results of more queries to be cached. These two strategies lead to an interesting trade-off between the hit rate and the average query response latency. In this work, in order to exploit this trade-off, we propose a hybrid result caching strategy where a dynamic result cache is split into two sections: an HTML cache and a docID cache. Moreover, using a realistic cost model, we evaluate the performance of different result prefetching strategies for the proposed hybrid cache and the baseline HTML-only cache. Finally, we propose a machine learning approach to predict singleton queries, which occur only once in the query stream. We show that when the proposed hybrid result caching strategy is coupled with the singleton query predictor, the hit rate is further improved.

Strategies for Setting Time-to-Live Values in Result Caches (Assoc Computing Machinery, 2013)
Sazoglu, Fethi Burak; Cambazoglu, B. Barla; Ozcan, Rifat; Altingovde, Ismail Sengor; Ulusoy, Ozgur
In web query result caching, the staleness of queries is often bounded via a time-to-live (TTL) mechanism, which expires the validity of cached query results at some point in time. In this work, we evaluate the performance of three alternative TTL mechanisms: time-based TTL, frequency-based TTL, and click-based TTL.
Moreover, we propose hybrid approaches obtained by pair-wise combination of these mechanisms. Our results indicate that combining time-based TTL with frequency-based TTL yields better performance (i.e., lower stale query traffic and less redundant computation) than using either mechanism in isolation.
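
As a rough illustration of the last abstract above, the sketch below combines a time-based TTL with a frequency-based TTL in a small result cache. The abstract does not specify the expiration rules, so everything here is an assumption made for illustration: the class name `HybridTTLCache`, the parameters `max_age` and `max_hits`, the reading of frequency-based TTL as "expire after a fixed number of hits", and the choice to expire an entry as soon as either limit is reached. It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CachedResult:
    results: List[str]
    cached_at: float   # time the results were computed (assumed: seconds)
    hits: int = 0      # number of times this entry was served from the cache


class HybridTTLCache:
    """Hypothetical result cache: an entry expires on age OR on hit count."""

    def __init__(self, max_age: float, max_hits: int) -> None:
        self.max_age = max_age      # time-based TTL limit (assumption)
        self.max_hits = max_hits    # frequency-based TTL limit (assumption)
        self._entries: Dict[str, CachedResult] = {}

    def put(self, query: str, results: List[str], now: float) -> None:
        # Storing fresh results resets both the age and the hit counter.
        self._entries[query] = CachedResult(list(results), now)

    def get(self, query: str, now: float) -> Optional[List[str]]:
        entry = self._entries.get(query)
        if entry is None:
            return None  # plain cache miss
        too_old = now - entry.cached_at >= self.max_age
        served_too_often = entry.hits >= self.max_hits
        if too_old or served_too_often:
            # Treat the entry as expired; the caller should recompute it.
            del self._entries[query]
            return None
        entry.hits += 1
        return entry.results
```

A caller would pass the current time explicitly (for example `time.time()`), recompute the results at the backend whenever `get` returns `None`, and store them again with `put`. Passing `now` as an argument keeps the sketch deterministic and easy to test.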












