Web Exploration and Search Technology Lab
Department of Computer and Information Science
Polytechnic University
Brooklyn, NY 11201

Faculty: Torsten Suel (suel@poly.edu)
PhD Students: Qingqing Gan, Hao Yan, Jiangong Zhang
PhD Graduates:
- Yen-Yu Chen (2006) -> Yahoo
- Utku Irmak (2006) -> Yahoo
- Xiaohui Long (2006) -> MSN Search
Looking for additional PhD students …

WHERE?
Polytechnic University:
• “Brooklyn Poly”, founded in 1854
• in downtown Brooklyn
• Engineering, CS, Management
• 1500 undergrads, 1400 grad students
• CS: 16 tenured/tenure-track faculty, 40 PhD students
• Algorithms, Networks, Security, Software Eng., Image/Vision/Graphics

WHAT?
• Databases ?
• Information Retrieval ?
• Web Search !!
  - core web search
  - related work in algorithms, systems, databases
  - emerging applications: social networks, blogs, local search, …

WHAT EXACTLY?
[Diagram labels: core search; image & video; blogs; mobile; desktop; low level stuff: “search engine guts”]
• Systems/Architectures/Scalability:
  - efficient crawling, data distribution, indexing, query execution, link analysis
• Emerging Applications:
  - geographic/mobile search, deep web search, blog/RSS search, P2P search
• Web Spam

Some Research Projects
• Scalability of Large Search Engines
  - automatic
  - interactive
  - can we do with less?
  - scale to larger data?
  - storage/indexing/mining of web archives
• Future Search Architectures
  - peer-to-peer as Google killer?
  - desktop/client based search
  - blogs/social networks/new media
• Geo / Local Search Engines
[Figures: Search Engine Research Cluster at Poly; Example: Google Local Search; Geo Search Research at Poly; ODISSEA System Architecture]

Some Recent Group Publications:

Search Engine Query Processing:
• Three-Level Caching for Efficient Query Processing in Large Web Search Engines. X. Long, T. Suel. 14th WWW Conf., 2005.
• Optimized Query Execution in Large Search Engines with Global Page Ordering. X. Long, T. Suel. VLDB, 2003.

Geographic Web Search:
• Efficient Query Processing in Geographic Web Search Engines. Y. Chen, T. Suel, A. Markowetz. ACM SIGMOD, 2006.
• Design and Implementation of a Geographic Search Engine. A. Markowetz, Y. Chen, et al. WebDB, 2005.

Miscellaneous:
• Efficient Query Subscription Processing for Prospective Search Engines. U. Irmak, S. Mihaylov et al. USENIX, 2006.
• Interactive Wrapper Generation with Minimal User Effort. U. Irmak, T. Suel. 15th WWW Conf., 2006.
• Efficient Query Evaluation on Large Textual Collections in a P2P Environment. J. Zhang, T. Suel. IEEE Conf. on P2P, 2005.
• Improved Single-Round Protocols for Remote File Synchron. U. Irmak, S. Mihaylov, T. Suel. IEEE Infocom, 2005.
• Hierarchical Substring Caching for Efficient Content Distr. to Low-Bandwidth Clients. U. Irmak, T. Suel. 14th WWW Conf., 2005.