GOOGLE OPINIONS
PROJECT PROPOSAL

Prepared for
Professor Betsy Schlobohm

Prepared by
David Urbina
Khairun-nisa Hassanali
Michael Fashola
Scott Larson

April 28, 2009

Memorandum

DATE: March 25, 2009
TO: Scott Larson – Project Manager, Google Think Team
FROM: Marissa Mayer – VP Search Products and User Experience
SUBJECT: Update to Google Search Engine

Recent advances in information and communication technology mean that our search engine, developed in the early days of the World Wide Web, needs a major review and update. We have seen an increase in blogs, twittering, social networking sites and other opinion-themed sites such as yelp.com, jdpower.com, and rottentomatoes.com, which signals a paradigm shift in content and distribution [1]. No longer are opinions two-dimensional images that exist only in the mind of the conceiver; through the instrumentality of the web, they are now potent agents of change [2].

Google, as a leader in launching feature-rich applications, has set up a new group called Google Think to enhance the Google Search Engine so that it accommodates this paradigm shift, and to develop a contemporary search engine that provides decision-making solutions to consumers.

I am happy to invite your team to undertake this project, with the responsibility of designing and developing a comprehensive search engine that will mine blogs and opinion editorials and provide themed search that will ultimately appeal to users.

We wish you the best in this assignment.

Cheers,
Marissa Mayer

Memorandum

DATE: March 26, 2009
TO: Marissa Mayer – Vice President, Search Products and User Experience
FROM: Scott Larson – Project Manager, Google Think Team
SUBJECT: Acceptance of Offer

Your memo empowering our group to create an update to the Google search engine was received with enthusiasm.
Our appointment to undertake this project reflects the high level of confidence you repose in us to develop a ground-breaking application that will re-launch Google as a leader in search engine development. Please rest assured that we will approach this assignment with vigor.

My team will be conducting surveys and may request information that will aid the successful completion of this project. The project will involve the use of products and technologies developed by other teams at Google, so we will need the cooperation of all these teams. Kindly inform us if there is to be any change in the scope of this project.

Cheers,
Scott Larson

Memorandum

DATE: April 10, 2009
TO: Marissa Mayer – Vice President, Search Products and User Experience
FROM: Scott Larson – Project Manager, Google Think Team
SUBJECT: Updates to Google Search Engine

We have conducted an assessment of the current search engine in line with your request and propose viable options for enhancing the Google search engine as we know it today.

We agree with the need to update the Google search engine capabilities. While Google’s share of searches has increased year-on-year [3], it is still unable to meet one of the primary needs of today’s internet users: searching for opinions. We propose Google Opinions, an extension of the Google Search Engine, which provides an end-to-end solution for users searching for opinions. Google Opinions will also provide advanced search capabilities that enable users to pin down the source of the opinions, and it will display statistics on the opinions retrieved by the system.

The general objective of the proposed plan is to make Google more responsive to the evolving internet culture [1], [2] and to launch new capabilities that will keep Google at the forefront for years to come. A critical success factor is for Google to have cross-over appeal, so that it can be of use to all strata of society.
This better positioning will lead to improved revenue streams.

We gratefully acknowledge the kind assistance of Vinton Cerf and his team in providing market intelligence reports and new-market product themes. These came in handy while analyzing the costs and benefits of the proposed option.

Working on the project proposal has broadened our insights. We envisage that this is just one step in a series of concerted efforts to make Google a leader in her field, and we will be glad to play a role in helping realize Google’s short- and long-term goals.

Cheers,
Scott Larson

Table of Contents

Memorandum
Memorandum
Memorandum
Table of Contents
List of Illustrations
List of Tables
Executive Summary
Google Opinions
    Introduction
    Current Situation
Project Plan
    The solution
    Objectives
    Major and Minor Steps
    Deliverables and outcomes
Qualifications
    The Google Think Group
    The People
Costs and Benefits
Conclusion and Recommendations
Appendix A. Project Plan
    Google Opinions System
    Google Opinions Project Timeline
    Google Opinions Main Page
Appendix B. Costs and Revenue
    Costs
    Revenue
Appendix C. Google Opinions Glossary
Appendix D. Financial Statements
    Google Inc.: Income Statement
    Google Inc.: Balance Sheet
    Google Inc.: Cash Flow Statement
Appendix E. Resumes
References

List of Illustrations

Figure A-1: Google Opinions System Diagram
Figure A-2: Google Opinions Project Time Line
Figure A-3: Google Opinions User Interface
Figure B-1: Cost Profile
Figure B-2: Phase-wise Cost
Figure B-3: Cost per Module
Figure B-4: Cost Benefit Analysis

List of Tables

Table 1: Major and Minor Steps
Table 2: Project Deliverables and Outcomes
Table 3: Software Cost
Table 4: Phase-wise Cost
Table 5: Cost per Module
Table 6: Income Statement
Table 7: Balance Sheet
Table 8: Cash Flow Statement

Executive Summary

Google needs to once again re-position herself as a leader in search engine development and offer greater real value to users. Opinion-themed searches will enable us to seize the initiative and open new vistas in previously uncharted territory.

We propose Google Opinions, an extension of the Google Search Engine, to provide opinion-themed searches and appeal to a broad spectrum of users with varied needs, such as consumers, employers, students and businessmen [1] – [3]. We envisage Google Opinions being used to provide opinions in a recruitment decision, product ratings in a purchasing decision, and support for other individual and corporate decisions that are made harder by today’s vast array of products and information overload.

The Google Opinions project is proposed to start on June 1, 2009 and end on May 31, 2010. The project will involve a team of thirty highly qualified personnel with extensive experience in information retrieval, sentiment mining and software development.

The financial and non-financial benefits of the Google Opinions project far outweigh the cost. The project will cost $10,909,357. The increase in revenue from Google Opinions is envisaged to be above $4,000,000 per annum.
Implementing the Google Opinions project will lead to the publication of path-breaking papers on sentiment mining. Further, competitors such as Microsoft [4] – [6] are working on similar technology, so it is imperative that we get Google Opinions to market before they do.

Google Opinions

Introduction

A widely valued but rarely provided service is that of opinions. Customers are always in need of advice about their interests, and frequently refer to reviews written by professionals and other customers [1, 2]. However, while these opinions may be useful, they are far from exhaustive and do not show the wide variety of views available. Google Opinions attempts to solve this problem by providing a search engine designed specifically for retrieving opinions.

Current Situation

As of April 2009, search engines do not have the capability of performing opinion-based searches. The World Wide Web abounds with opinions on blogs, newspapers, review sites and social network sites [1] – [3]. Given time and effort, a user can use a standard search engine to research these websites and analyze the few opinions they find. However, collecting information in this manner is time-consuming, and the quality of the opinions retrieved is insufficient for most consumer and business needs.

The cause of this problem is the lack of an efficient means to collect and retrieve opinions from websites [1] – [3]. Collecting and analyzing the large quantity and variety of opinions available on the Web is beyond what a single user can practically, or would willingly, do. The Google Search Engine was not designed with a view to retrieving opinions from websites.

If this problem is not solved, both users and businesses will continue to suffer from a lack of online consensus and views on particular products, positions, and ideals [1] – [3], [7], [8].
Google’s major competitor Microsoft has filed patents on similar technology [4] – [6]. Google needs to bring out a system that solves this problem before its competitors do, giving it an edge in the market for opinion-based retrieval. Not doing so would lead to a loss of revenue as people move to competitors’ search engines. Further, Google would miss out on other benefits, such as publishing the path-breaking research papers that would result from this project.

Project Plan

The solution

In order to meet the demand for opinion-based searches [1] – [3], we propose to develop Google Opinions [7], [8], an extension of the Google Search Engine. Google Opinions will enable a user to perform an opinion-based search that retrieves both positive and negative opinions on a particular subject. Google Opinions will also provide advanced search features, such as specifying the source of opinions and how negative or positive the retrieved opinions should be. Further, using Thought Stats, users can view statistics and graphs on the opinions retrieved for a search subject. Thought Stats also includes i-Util, which gives a measure of satisfaction derived from opinions on products and services.

Please refer to Appendix A for a detailed explanation of the Google Opinions system along with the system diagram and proposed user interface. Please refer to Appendix C for an explanation of the terms used in the Google Opinions project.

Objectives

Google Opinions must meet the following objectives:

1. Provide an end-to-end solution that allows users to:
    a. Search for opinions on a subject [7], [8].
    b. View links to articles that contain opinions on these subjects.
    c. View representative sentences containing these opinions [7], [8].
2. Provide a user interface that allows users to:
    a. Enter the search words on which they want an opinion.
    b. Use advanced search options to:
        i. Specify the polarity (degree of positivity or negativity) of the opinions retrieved by Google Opinions [7], [8].
        ii. Specify the sources Google Opinions should use for retrieving opinions.
        iii. Specify the time frame within which the opinions are expressed.
    c. Get help on using Google Opinions.
3. Provide the user with statistics and graphs on the opinions retrieved by Google Opinions.
4. Display advertisements related to the search words.
5. Allow for easy integration with other Google components such as Google AdWords [9] and Google AdSense [10].

Major and Minor Steps

Table 1 gives the major and minor steps of the Google Opinions project along with the timelines. The project is expected to start on June 1, 2009 and end on May 31, 2010. Please refer to Figure A-2 in Appendix A for the Google Opinions project timeline.

Major Step/Component | Minor Step | Expected Start Date | Expected Completion Date
Pre-Requirements | Testing and installing software | June 1, 2009 | June 30, 2009
User Interface | Requirements | July 1, 2009 | July 31, 2009
User Interface | Design | August 1, 2009 | August 31, 2009
User Interface | Coding | September 1, 2009 | September 30, 2009
User Interface | Unit Testing | October 1, 2009 | October 31, 2009
Go-Op-Crawler | Requirements | July 1, 2009 | July 31, 2009
Go-Op-Crawler | Design | August 1, 2009 | August 31, 2009
Go-Op-Crawler | Coding | September 1, 2009 | October 31, 2009
Go-Op-Crawler | Unit Testing | November 1, 2009 | November 30, 2009
Pre-Processor | Requirements | July 1, 2009 | July 31, 2009
Pre-Processor | Design | August 1, 2009 | August 31, 2009
Pre-Processor | Coding | September 1, 2009 | September 30, 2009
Pre-Processor | Unit Testing | October 1, 2009 | October 31, 2009
Go-Top-Generator | Requirements | July 1, 2009 | July 31, 2009
Go-Top-Generator | Design | August 1, 2009 | August 31, 2009
Go-Top-Generator | Coding | September 1, 2009 | October 31, 2009
Go-Top-Generator | Unit Testing | November 1, 2009 | November 30, 2009
Go-Op-Selector | Requirements | July 1, 2009 | July 31, 2009
Go-Op-Selector | Design | August 1, 2009 | August 31, 2009
Go-Op-Selector | Coding | September 1, 2009 | October 31, 2009
Go-Op-Selector | Unit Testing | November 1, 2009 | November 30, 2009
Go-Summarizer | Requirements | July 1, 2009 | July 31, 2009
Go-Summarizer | Design | August 1, 2009 | August 31, 2009
Go-Summarizer | Coding | September 1, 2009 | October 31, 2009
Go-Summarizer | Unit Testing | November 1, 2009 | November 30, 2009
Thought Stats | Requirements | July 1, 2009 | July 31, 2009
Thought Stats | Design | August 1, 2009 | August 31, 2009
Thought Stats | Coding | September 1, 2009 | October 31, 2009
Thought Stats | Unit Testing | November 1, 2009 | November 30, 2009
Integration | All components | December 1, 2009 | December 31, 2009
System Integration Testing [11] | Testing and bug fixing | January 1, 2010 | March 31, 2010
Product Quality Testing [12] | Testing and bug fixing | April 1, 2010 | May 31, 2010

Table 1: Major and Minor Steps

Deliverables and outcomes

Table 2 gives the deliverables and outcomes for the Google Opinions project. Each deliverable is due once its component is completed. Please refer to Table 1 for the completion dates of each component.

Component | Deliverables | Outcome
User Interface | Source Code [13]; Executable [14]; Deployment Manual [15]; User Manual; Software Requirements Specification [16] | A user interface which enables users to search for opinions. The user interface will provide advanced search options and a Help section.
Go-Op-Crawler | Source Code [13]; Executable [14]; Deployment Manual [15]; Software Requirements Specification [16]; Unit Testing Set Kit | A crawler that collects opinion-laden web pages based on the search words.
Preprocessor | Source Code [13]; Executable [14]; Deployment Manual [15]; Software Requirements Specification [16]; Unit Testing Set Kit | A program that takes web pages collected by the Go-Op-Crawler and produces plain text.
Go-Top-Generator | Source Code [13]; Executable [14]; Deployment Manual [15]; Software Requirements Specification [16]; Unit Testing Set Kit; Topic Aspect Database | A module that takes text as input and gives as output the topics discussed in the text.
Go-Op-Selector | Source Code; Executable; Deployment Manual; Software Requirements Specification; Unit Testing Set Kit | A module that takes as input a set of topics and text and gives as output opinion-laden sentences within the text on the given topics.
Go-Summarizer | Taxonomic Sentiment Database; Source Code [13]; Executable [14]; Deployment Manual [15]; Software Requirements Specification [16]; Unit Testing Set Kit | A module that takes a set of sentences as input and selects sentences that represent a summary of the input sentences.
Thought Stats | Source Code [13]; Executable [14]; Deployment Manual [15]; Software Requirements Specification [16]; Unit Testing Set Kit | A module that generates statistics and graphs based on the data collected by other modules.

Table 2: Project Deliverables and Outcomes

Other general deliverables this project will produce are:

• System Architecture Document [17]
• Integration Testing Kit

Google Opinions will be the final deliverable, providing an end-to-end solution that enables users to search for opinions. Google Opinions will be ready for deployment on May 31, 2010.

Qualifications

The Google Think Group

The Google Think group was founded three months ago to harness the potential market in opinion extraction and mining. The Google Think team consists of software engineers who are experienced in software development, sentiment mining, information retrieval and web-based applications. Please refer to Appendix E for the resumes of the Google Think team members.

The People

Scott Larson – Manager
Scott is close to completing his Master’s Degree in Software Engineering at the University of Texas at Dallas. He is experienced in leading groups, including time management, resource management, work distribution, and project integration. Scott will be managing the Google Opinions project.

Khairun-nisa Hassanali – Lead Researcher and Developer
Khairun-nisa is a highly educated researcher in Natural Language Processing at the University of Texas at Dallas. She is currently pursuing a PhD in Computer Science with a research focus on Sentiment Mining. Khairun-nisa will be leading all research and development on the Google Opinions project.

David Urbina – Lead Architect
David has extensive experience as a Solutions Architect and Developer. He is a Graduate Student at the University of Texas at Dallas with a 4.0 GPA, majoring in Software Engineering. David will be leading the requirements engineering, architecture, and design of the project.

Michael Fashola – Lead Tester and Manager of Marketing and Finance
Michael is a Graduate Student at the University of Texas at Dallas. He is experienced in software testing and financial auditing. Michael will be leading the Testing and Validation phase of the Google Opinions project, as well as managing its finances and marketing.

Costs and Benefits

Our market research indicated that while Google has maintained her market share despite growing competition [18] – [21], opportunities for growth have been limited. This is supported by the financial reports for the preceding four quarters, as shown in Appendix D. A key ingredient to jumpstart another era of prolific growth is being able to offer quick and efficient search for opinions [1] – [3]. Google Opinions will fulfill this need. None of our competitors, such as Microsoft [22] and Yahoo [23], have this capability in their search engines.

As shown in Appendix B, implementing the Google Opinions project will cost an estimated $10,909,357. We believe that this investment will enable Google to take over a niche part of the market. We expect annual revenue increases of over $4,000,000 from Google Opinions and expect to break even within three years of implementing this product. This increase in revenue will come from Google AdWords [9] and AdSense [10], as businesses will want to purchase Google Opinions keywords related to their activities. A person looking for a positive or negative review of a product or service may also be looking to purchase it. Therefore, based on the reviews, they are likely to click on the business that sells this product or service.

Further, Thought Stats is a useful tool for businesses to track people’s opinions on their products and services. We intend to commercialize Thought Stats four years from now as a way of giving businesses access to otherwise copyrighted materials; the expected revenues will further boost our growth prospects.

Google Opinions can be offered with Google Site Search [22] at no extra cost. Businesses such as newspaper sites and review sites would be interested in opinion-based searches in addition to a general search.

But the advantages of our plan go beyond simple costs:

First, this will be a major research breakthrough and will lead to the publication of path-breaking papers by Google in the field of sentiment mining.

Second, we will lose a portion of our market share if we do not get Google Opinions into the market. Our major competitor Microsoft is also working on similar technology and has filed a patent for it [4] – [6]. If we do not get our product into the market first, we will lose a portion of our customers to Microsoft, which will lead to a loss of revenue.

Third, this technology can be used in future in the Google suite of products, such as Orkut, at little cost.

Conclusion and Recommendations

Google Opinions will capture the market as the first search engine to provide opinion-based searches, thereby providing Google with increased revenues. As a result of its fast performance and reliability, we expect this project to do better than similar future products from Google’s competitors.
If our proposal is approved, we anticipate a commencement date of June 1, 2009 and completion no later than May 31, 2010. In order to proceed, we suggest recruiting 26 employees before May 15, 2009, and purchasing the required hardware and software no later than May 10, 2009. The software should be installed on all machines no later than May 15, 2009. This will allow the Google Opinions project to proceed as planned and give the product the maximum opportunity to be tested before it is deployed.

Appendix A. Project Plan

Google Opinions System

The Google Opinions system uses the Go-Compare algorithm and the Thoughts Discovery paradigm. Figure A-1 gives the system diagram for the Google Opinions project.

[Figure A-1: Google Opinions System Diagram (node A0, title "Google Opinions"). Stages shown: A1 Web Crawling (Go-Op-Crawler), A2 Preprocessing (Preprocessor), A3 Topic Generation (Go-Top-Generator), A4 Polarity Scanning (Go-Op-Selector), A5 Summarizing (Go-Summarizer). Data labels shown: Newspapers, Web Portals, Blogs, Raw Data, Cleaned up Data, List of Aspects, Input Search Text, Articles related to Search Text, Polarity Thesaurus/Dictionary, Opinion-Laden Statements, Polarity Rate, Summary.]

The Google Opinions system consists of the following major steps:

Web Crawling: In order to perform an opinion-based search, Google Opinions will require access to opinion-laden data. The Go-Op-Crawler [7], [8] component will crawl the web for articles related to the search words given by the user. This data will be stored on servers, updated on a daily basis, and grouped according to the time it was published on the web.

Preprocessing: The data collected by the Go-Op-Crawler will be in HTML or XML format. The algorithms used by the Google Opinions system require plain text along with statistics on the occurrences of words within the input text [7], [8].
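As a rough illustration of this preprocessing requirement, the sketch below turns one HTML page into plain text plus word-occurrence counts. It is only a minimal sketch using the Python standard library; the names `preprocess` and `_TextExtractor`, the tokenization rule, and the sample page are assumptions made for illustration and are not part of the Pre-Processor design.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def preprocess(raw_html):
    """Return (plain_text, word_counts) for one crawled page (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    # Collapse the whitespace left behind once the markup is removed.
    plain_text = re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()
    # Assumed tokenization: lower-case runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", plain_text.lower())
    return plain_text, Counter(words)


# Made-up sample page, used only to exercise the sketch.
page = (
    "<html><body><h1>Pizza review</h1>"
    "<p>The pizza was delicious. Service was awful.</p>"
    "<script>var x=1;</script></body></html>"
)
text, counts = preprocess(page)
print(text)
print(counts.most_common(3))
```

A production preprocessor would also need to handle XML input, character encodings and boilerplate such as navigation menus, none of which this sketch attempts.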
The Pre-Processor component will take in raw data in HTML/XML form and process it to give plain text along with statistics on the occurrences of the words in the text.

Topic Generation: The Go-Top-Generator will find the main topics in a given text. The text will then be tagged with the topics generated by the Go-Top-Generator. The Go-Op-Selector component will be given as input those texts whose topics are related to the search words.

Polarity Scanning: The Google Opinions system is interested only in retrieving opinions on the search words entered by the user. The Go-Op-Selector searches for sentences that contain opinions [7], [8]. This is done by looking for words (adjectives, adverbs or verbs) that have a positive or negative connotation. Examples of such words are “delicious” and “awful”.

Summarizing: The Go-Summarizer component will select representative sentences from the set of sentences containing opinions [7], [8]. These sentences, along with a link to the document, will be shown to the user by Google Opinions.

Google Opinions Project Timeline

Figure A-2 gives the timeline on which each task will be completed. The Google Opinions project is expected to start on June 1, 2009 and end on May 31, 2010.

ID | Task Name | Start | Finish | Duration
1 | User Interface | 6/1/2009 | 10/30/2009 | 110d
2 | Go-Op-Crawler | 6/1/2009 | 11/30/2009 | 131d
3 | Preprocessor | 6/1/2009 | 10/30/2009 | 110d
4 | Go-Top-Generator | 6/1/2009 | 11/30/2009 | 131d
5 | Go-Op-Selector | 6/1/2009 | 11/30/2009 | 131d
6 | Go-Summarizer | 6/1/2009 | 11/30/2009 | 131d
7 | Integration | 12/1/2009 | 12/31/2009 | 23d
8 | Integration Testing | 1/1/2010 | 4/1/2010 | 65d
9 | Quality Testing | 4/2/2010 | 6/1/2010 | 43d

Figure A-2: Google Opinions Project Time Line

Google Opinions Main Page

Figure A-3 gives the proposed user interface for the Google Opinions project [25].

Figure A-3: Google Opinions User Interface

Appendix B. Costs and Revenue

Costs

The Google Opinions project will cost a total of $10,909,357. These costs are divided as follows:

• Personnel costs - $3,000,000 [26]
    o The Google Think team consists of 30 members who will all be working on the Google Opinions project: 16 software engineers with 1-2 years of experience, 10 software engineers with 4-6 years of experience, 1 project manager, 1 lead researcher, 1 testing manager and 1 lead architect. These personnel will take part in all phases of the project, including requirements collection, development and testing.
• Hardware costs - $7,000,000
    o The hardware requirements for the Google Opinions project consist of a total of 2000 Intel Xeon processors [27]. The project will require 1220 computing processors, 30 workstations and 750 data servers. The hardware will be written off over a 5-year period in line with best accounting practices.
• Software costs - $909,357
    o Table 3 gives a breakdown of the software costs of the Google Opinions project:

Software | Number of Licenses/Copies | Total Cost in US Dollars
Microsoft Office Ultimate 2007 [28] | 30 | 15,000
Microsoft Office Project Ultimate 2007 [29] | 5 | 8,000
Microsoft Vista Business Operating System [30] | 100 | 150,000
Oracle Data Mining Database [31] | 5 | 115,000
OriginPro 8 Data Analysis and Graphing Software [32] | 10 | 18,750
Red Hat Enterprise Linux Advanced Platform [33] | 700 | 105,000
Rational Rose [34 - 37] | 10 | 50,000
Rational PurifyPlus [38] | 10 | 25,000
X-Manager Enterprise 3 [39] | 30 | 10,857

Table 3: Software Cost

    o We will require Python, Java, Perl, C, C++, Postgres and MySQL to be installed on all servers and workstations. This software is available free of cost.
    o The following software has been developed by other groups in Google and will be enhanced for use in the Google Opinions project:
        • Go-Op-Crawler
        • Pre-processing software
        • Go-Summarizer

Additionally, a team of 10 software engineers will be required to maintain the Google Opinions product. We therefore envision a cost of $100,000 per annum in maintenance.

Figure B-1 shows the proportion of the costs:

Figure B-1: Cost Profile

The phase-wise cost is given in Table 4 and Figure B-2.

Phase | Cost in US Dollars
Pre-Requirements | 909,113
Requirements | 909,113
Design | 909,113
Development | 1,818,227
Unit Testing | 909,113
Integration | 909,113
System Integration | 909,113
System Integration Testing | 1,818,226
Product Quality Testing | 1,818,226

Table 4: Phase-wise Cost

Figure B-2: Phase-wise Cost

The cost per module is given in Table 5:

Module | Cost in US Dollars
Go-Op-Crawler | 1,090,936
Pre-processor | 1,090,936
Go-Top-Generator | 2,181,871
Go-Op-Selector | 2,181,871
Go-Summarizer | 2,181,871
Thought Stats | 1,090,936
User Interface | 1,090,936

Table 5: Cost per Module

Figure B-3 shows the division of the costs per module:

Figure B-3: Cost per Module

Revenue

The majority of Google’s revenue comes from advertising on Google’s website and her partners’ websites. We surveyed 20 people to see if they would like to use Google Opinions. All responded that they would use Google Opinions for product reviews, movie reviews, service reviews, school projects and background checks. This survey indicates that Google Opinions would be well received in the market. Our revenue from posting advertisements on Google Opinions would rise, since stores, service providers, product manufacturers and education providers would want their product, store or service listed when a user searches for an opinion on them.
We therefore predict an increase in revenues from Google AdWords and AdSense as a result of Google Opinions, as follows [32]:

AdWords – 1,000 new opinion-themed words for auction, at an average cost-per-click of $0.38 and 10,000 visits per word per annum, translates to $3,800,000 (1,000 * 0.38 * 10,000) in Year 1. All things being equal, we expect traffic to grow by 5% year-on-year, yielding the following figures for subsequent years:
Year 2: ($3,800,000 * 1.05) = $3,990,000
Year 3: ($3,990,000 * 1.05) = $4,189,500
Year 4: ($4,189,500 * 1.05) = $4,398,975
Year 5: ($4,398,975 * 1.05) = $4,618,924

AdSense – The domino effect of opinion-themed search will see partner websites receiving increased traffic from clients curious about the advisability of a particular product or service relative to close substitutes. We estimate an increase in net revenue of $300,000 [33] from this medium in Year 1, with year-on-year increases of 10% as the web-space opens up and more and more people gain access to the Internet. The figures below show the projected increase in net revenue from AdSense over a 5-year period:
Year 1: $300,000
Year 2: ($300,000 * 1.10) = $330,000
Year 3: ($330,000 * 1.10) = $363,000
Year 4: ($363,000 * 1.10) = $399,300
Year 5: ($399,300 * 1.10) = $439,230

Thought Stats – A survey on Thought Stats will be conducted while it runs as a free pilot in the first and second years, to enable organizations to gauge its effect on their bottom line. Survey responses will be used to revamp the service before commercialization begins 2 years after deployment. Assuming all Fortune 100 companies pay for license keys at a rate of $5,000 per annum, we expect a substantial increase in revenue from this channel, since the fee compares favorably with the amount paid for such professional services when procured individually.

Google Opinions can be offered as an extension to Google Site Search at no extra cost.
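The AdWords and AdSense projections earlier in this section are compound-growth calculations. As a minimal sketch in Python, assuming only the base figures and growth rates stated in this section (the function name `project_revenue` is illustrative and not part of any Google product), the yearly figures can be recomputed as follows:

```python
def project_revenue(base, growth_rate, years=5):
    """Return the projected revenue for each year, compounding annually from `base`."""
    figures = []
    amount = base
    for _ in range(years):
        figures.append(round(amount))  # report whole dollars, as in the proposal
        amount *= 1 + growth_rate      # apply the year-on-year growth
    return figures


# AdWords: 1,000 words * $0.38 per click * 10,000 visits, growing 5% per year
adwords = project_revenue(1_000 * 0.38 * 10_000, 0.05)

# AdSense: $300,000 in Year 1, growing 10% per year
adsense = project_revenue(300_000, 0.10)

for year, (aw, ads) in enumerate(zip(adwords, adsense), start=1):
    print(f"Year {year}: AdWords ${aw:,}  AdSense ${ads:,}")
```

Changing the base figure or the growth rate in the two calls shows how sensitive the five-year totals are to those assumptions.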
Websites such as product websites, newspaper sites and review sites would be interested in having an opinion-based search capability, since most of their users would be interested in opinions. Based on the market survey we conducted, we believe our estimates are conservative and that substantially higher returns could be expected. This translates to the chart below, which shows the break-even point towards the end of the second year.

Figure B-4: Cost Benefit Analysis

Appendix C. Google Opinions Glossary

Term: Go-Compare
Definition: Go-Compare is an algorithm in the project that will aid consumers in decision-making by offering an analysis of all products of interest and providing alternatives. No longer will decisions be made with inadequate information.
Word Origin: Go-Compare originates from the words “Go” for “Google” and “compare”. The word “compare” is a verb meaning to relate two or more items and draw out the key differences and similarities between them.
Word History: Comparing is a key issue in decision-making. To everything, there are alternatives. Man is a rational being; given a set of competing options, he always chooses what suits his interests at any given time. Those choices are an expression of his likes and dislikes. Selecting from two or more alternatives requires that adequate information about the alternatives and their consequences be made available. Go-Compare strives to make this information available to aid decision-making.
Negation: Go-Compare is not a technology or application but an algorithm that feeds on the divergent opinions existing on the web for a particular product or service and its close substitutes. It assumes consumers are rational. Go-Compare is not an information store. Different products of the same type may be measured using different indices. Go-Compare attempts to synthesize these indices and offer a total picture that will enable consumers to make the right choice.
Division into Parts: Go-Compare will have to relate like with like.
Product genre will be mined, and the listed prices of the products will also be compared. To make this possible, a geno-crawler will consider the characteristics and attributes of products in order to categorize them appropriately according to strict terms. Then a comparator will generate the products’ features and i-Util ratings. If no ratings are available, it generates one.

Term: Go-Op-Crawler
Definition: This term refers to a web crawler applied specifically in a Thoughts Discovery process. It denotes a computer program that browses relevant and specific Internet sources in a methodical and automated manner. These sources provide a window to access opinions about general or particular topics [1][40].
Word Origin: The word Go-Op-Crawler originates from the words “Go” for “Google”, “Op” for “opinions” and “crawler”.
Division into Parts: The Go-Op-Crawler consists of two principal parts: the web crawler itself, which Google already uses to browse the Internet and which is part of other Google products such as Google Scholar, Google Images and Google Code; and a classifier, which categorizes Internet sources by relevance, type of site and topics of interest. To categorize Internet sources by relevance, this classifier uses the Google PageRank technology and Hypertext-Matching Analysis [41], [42].
Similarities and Differences: Like other web crawlers, a Go-Op-Crawler browses web pages by following links. It has a parallel multi-threaded architecture, which facilitates the processing of huge volumes of data [41]. A Go-Op-Crawler differs from other web crawlers in that it selectively follows only links that have been categorized by relevance, type of site and topics of interest using PageRank and Hypertext-Matching Analysis.

Term: Go-Op-Selector
Definition: Go-Op-Selector is the opinion selector component in the Google Opinions project.
It is responsible for selecting the opinion-laden statements on the topics generated by the Go-Top-Generator component.
Word Origin: Go-Op-Selector originates from the words “Go” for “Google”, “Op” for “opinion” and “selector”.
Division into Parts: The Go-Op-Selector consists of an input module, a processor module and an output module [42], [43]. The input module takes as input the topics generated by the Go-Top-Generator component and the text from which opinions are to be extracted. The processor module finds sentences that contain either the topic or synonyms of the topic. It looks for sentences that have adjectives or adverbs in the vicinity of the topic word (or its synonym). These words generally denote polarity (a negative or positive sentiment). The output module passes the selected opinion-laden statements to the Go-Summarizer.
Analogy: The Go-Op-Selector is like a search engine that searches for a topic in a collection of opinion-laden text [43].
Examples: If a user types “George Bush” into the Google Opinions search bar, the Go-Top-Generator would generate “George Bush” as a topic. The statement “I hate George Bush” would be selected by the Go-Op-Selector because it contains “hate”, an opinion-laden word, in the vicinity of “George Bush”.

Term: Go-Summarizer
Definition: The Go-Summarizer is the summarizer component in the Google Opinions project. It is responsible for generating a summary of the opinions contained in the statements selected by the Go-Op-Selector component.
Word Origin: Go-Summarizer originates from the words “Go” for “Google” and “summarizer”.
Division into Parts: The Go-Summarizer [44] consists of an input component, a processor component and an output component. The input component takes as input the opinion-laden sentences selected by the Go-Op-Selector. It passes these sentences to the processor component.
The processor component is responsible for selecting sentences that accurately represent all the opinions in the input sentences. This component ensures that no two sentences expressing the same opinion are selected. The output component displays the sentences selected for the summary.
Analogy: The Go-Summarizer component is similar to the automatic summarizers available in the market [45]. The Go-Summarizer is an extractive summary generator (it generates a summary by selecting representative sentences), not an abstractive one (which rewrites the sentences).
Examples: If the Go-Op-Selector gives the statements “I hate George Bush”, “I think George Bush has done great work” and “Everyone hates George Bush” as input to the Go-Summarizer, the Go-Summarizer will select “Everyone hates George Bush” and “I think George Bush has done great work” as representative summary sentences.

Term: Go-Top-Generator
Definition: The Go-Top-Generator is the topic generator component in the Google Opinions project. It is responsible for selecting the main topics discussed in an article.
Word Origin: Go-Top-Generator originates from the words “Go” for “Google”, “Top” for “topic” and “generator”.
Division into Parts: The Go-Top-Generator consists of a Syntactic Analysis component, a Statistical Analyzer component and a Domain component [43]. The Syntactic Analysis component looks for syntactic patterns in the input text. These patterns are commonly found in sentences that talk about a topic. The Statistical Analyzer component calculates the number of occurrences of each word in the input text. Words that are not stop words (commonly occurring words such as prepositions) and that occur frequently are likely to be topics. The Domain component is responsible for looking for topics that are specific to the domain of the text. For example, a text related to restaurant reviews would talk about customer service.
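The Statistical Analyzer step described above can be illustrated with a short Python sketch. This is only an assumption about how such a component might work: the stop-word list is a deliberately tiny, hypothetical one, and the function name `candidate_topics` is invented for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list for illustration only;
# a real analyzer would use a much fuller list.
STOP_WORDS = {
    "a", "an", "the", "i", "is", "are", "was", "were", "of", "in", "on",
    "to", "and", "but", "my", "it", "for", "with", "us",
}


def candidate_topics(text, top_n=3):
    """Return the `top_n` most frequent non-stop words in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


review = ("The service was slow. The service staff ignored us, "
          "but the food was great. Service matters.")
print(candidate_topics(review, top_n=1))
```

In this restaurant-review sample, "service" occurs three times and every other non-stop word once, so it is ranked first. The Syntactic Analysis and Domain components would then refine such frequency-based candidates.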
Examples: Given the sentence “I love HP laptops”, the Go-Top-Generator will select “HP laptops” as the topic.

Term: Google Opinions
Definition: Google Opinions is an extension of the Google Search Engine [25] which searches for opinions on a particular subject.
Word Origin: Originates from the company name “Google” and the word “opinions”, following the other Google categories such as “Google Images”, “Google News” and “Google Maps”.
Division into Parts: Google Opinions consists of the following parts: Go-Op-Crawler (web crawler), Go-Summarizer (summarization component), a search engine, and a statistics section. The search engine works similarly to Google’s other search categories in that it reports sites from the web related to the opinion typed into the search bar, and provides options to limit or enhance one’s search. The statistics section is explained in the Thought Stats definition.
Analogy: Google Opinions can be thought of as a more technical variation of the Google Search available on the Google main page, with the web sites limited to opinions instead of pure keyword findings.
Similarities and Differences: The similarities between a standard Google search and a Google Opinions search include the report of web pages containing either positive or negative opinions of the search criteria and a page allowing refinement of search options. Differences include a marked rating next to each web site indicating whether its opinions are positive or negative, and an extra section for statistics.
Examples: A user who types “George Bush” into the Google Opinions search bar will have access to articles that talk about George Bush in a positive or negative manner.

Term: i-Util Meter
Definition: An i-Util Meter is a computer program that collates and measures the satisfaction users or consumers derive from a product. It uses the ratings awarded to the product to arrive at a weighted average.
Word Origin: i-Util Meter originates from the words “i” for “internet”, “Util” for “utility” and “meter”.
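The definition above says only that ratings are combined into a weighted average; it does not say how the weights are chosen. As a minimal sketch in Python, assuming each rating arrives with a hypothetical weight (for example, the number of reviewers behind it), the aggregation could look like this. The function name `i_util_score` is illustrative.

```python
def i_util_score(ratings):
    """Return the weighted average of (rating, weight) pairs.

    The weights are an assumption made for this sketch; the proposal
    does not define how ratings are weighted.
    """
    ratings = list(ratings)
    total_weight = sum(weight for _, weight in ratings)
    if total_weight == 0:
        raise ValueError("at least one rating with a non-zero weight is required")
    return sum(rating * weight for rating, weight in ratings) / total_weight


# A rating of 4.0 backed by three reviewers and a rating of 2.0 backed by one.
score = i_util_score([(4.0, 3), (2.0, 1)])
print(score)
```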
Word History: Measuring satisfaction across a broad spectrum has always been a challenge. People have different needs and expectations of a given item. Satisfaction has therefore become a social question, and we look to the social sciences for answers. A util is a hypothetical unit of measurement of utility that is commonly used by economists to present hypothetical information about utility and consumer demand theory. The util was developed as a convenient way to illustrate and discuss concepts such as total utility, marginal utility and the law of diminishing marginal utility. However, because utility is not a measurable characteristic, the util does not represent an actual unit of measurement, such as inches or pounds.
Division into Parts: An i-Util Meter is made up of two principal parts: the i-Util itself, which is a numerical representation of the benefits derived by an individual user, and the meter, which aggregates scores and generates an average.
Similarities and Differences: The i-Util Meter is a virtual measurement of the perceived satisfaction end-users derive from a product. It is aggregated across multiple users and makes allowances for special events, such as promotions, that can suddenly spike consumer appreciation. It is similar to a utility meter [46], which measures consumption levels of a utility. The difference stems from their levels of alignment with the customer. The i-Util Meter measures satisfaction with a product or service, whereas utility meters measure consumption patterns and trends, with particular emphasis on charging a fee for consumption.

Term: Thought Stats
Definition: Thought Stats is a component of the Google Opinions project which collects statistics on opinions on a topic and displays them as tables, graphs and charts.
Word Origin: Originates from the group name “Google Think” and the abbreviation “stats” for “statistics”.
Division into Parts: Thought Stats consists of multiple portions for gathering either opinion-specific or comparative-opinion statistics on various products, ideas, views, positions, and any other physical or abstract article about which an opinion can be formed. Both numerical and graphical content can be obtained for summarizing a wide variety of opinions.
Analogy: Thought Stats is similar to Google Analytics [47], with the data being the number of opinions instead of the number of website hits.
Examples:
o A pie chart showing the division of positive and negative opinions about a product
o A bar graph displaying the comparative percentage of positive and negative opinions about 5 different objects
o A line graph tracing the percentage of positive or negative opinions towards a particular view over a period of time
o A table showing opinion statistics about a set of products, such as number of opinions, percentage of opinions, increase or decrease in number and quality of opinions, rate of increase of number and quality of opinions, and product information referenced as influencing these opinions

Term: Thoughts Discovery
Definition: Thoughts Discovery is the process of automatically searching large volumes of data for patterns that can be considered opinions or particular thoughts about general or specific topics.
Word Origin: Thoughts Discovery originates from the specialization of a branch of Data Mining called Knowledge Discovery [48], [49].
Division into Parts: Thoughts Discovery is based on the following computer science disciplines: Artificial Intelligence, Data Mining, Statistics, Machine Learning, and Pattern Matching and Recognition. The application of the first three, as a set, has been the focus of all Data Mining investigations.
Now, with state-of-the-art advances in Pattern Matching and Recognition applied to existing data mining techniques, the new Thoughts Discovery area of investigation has arisen.
Negation: Thoughts Discovery is not a technology but an amalgamation of disciplines. It cannot be applied as a physical or logical tool, but can be applied as a methodological paradigm to guide the development of tools.

Appendix D. Financial Statements

Google Inc.: Income Statement

The table below gives the income statement for the previous five quarters [16].

In Millions of USD (except for per share items)
3 months ending                                    2009-03-31  2008-12-31  2008-09-30  2008-06-30  2008-03-31
Revenue                                              5,508.99    5,700.90    5,541.39    5,367.21    5,186.04
Other Revenue, Total                                        -           -           -           -           -
Total Revenue                                        5,508.99    5,700.90    5,541.39    5,367.21    5,186.04
Cost of Revenue, Total                               2,101.50    2,190.01    2,173.39    2,147.57    2,110.54
Gross Profit                                         3,407.49    3,510.90    3,368.00    3,219.64    3,075.51
Selling/General/Admin. Expenses, Total                 882.25      917.35    1,015.87      959.46      856.20
Research & Development                                 641.64      733.34      704.57      682.21      673.07
Depreciation/Amortization                                   -           -           -           -           -
Interest Expense(Income) - Net Operating                    -           -           -           -           -
Unusual Expense (Income)                                    -    1,094.76           -           -           -
Other Operating Expenses, Total                             -           -           -           -           -
Total Operating Expense                              3,625.40    4,935.46    3,893.83    3,789.25    3,639.81
Operating Income                                     1,883.59      765.45    1,647.57    1,577.96    1,546.23
Interest Income(Expense), Net Non-Operating              6.21           -           -           -           -
Gain (Loss) on Sale of Assets                               -           -           -           -           -
Other, Net                                                  -        1.41        0.75        5.32       -2.96
Income Before Tax                                    1,889.80      835.35    1,668.78    1,635.89    1,713.58
Income After Tax                                     1,422.83      382.44    1,289.94    1,247.39    1,307.09
Minority Interest                                           -           -           -           -           -
Equity In Affiliates                                        -           -           -           -           -
Net Income Before Extra. Items                       1,422.83      382.44    1,289.94    1,247.39    1,307.09
Accounting Change                                           -           -           -           -           -
Discontinued Operations                                     -           -           -           -           -
Extraordinary Item                                          -           -           -           -           -
Net Income                                           1,422.83      382.44    1,289.94    1,247.39    1,307.09
Preferred Dividends                                         -           -           -           -           -
Income Available to Common Excl. Extra Items         1,422.83      382.44    1,289.94    1,247.39    1,307.09
Income Available to Common Incl. Extra Items         1,422.83      382.44    1,289.94    1,247.39    1,307.09
Basic Weighted Average Shares                               -           -           -           -           -
Basic EPS Excluding Extraordinary Items                     -           -           -           -           -
Basic EPS Including Extraordinary Items                     -           -           -           -           -
Dilution Adjustment                                         -        0.00        0.00        0.00        0.00
Diluted Weighted Average Shares                        317.22      316.86      317.78      318.02      317.39
Diluted EPS Excluding Extraordinary Items                4.49        1.21        4.06        3.92        4.12
Diluted EPS Including Extraordinary Items                   -           -           -           -           -
Dividends per Share Common Stock Primary Issue              -        0.00        0.00        0.00        0.00
Gross Dividends - Common Stock                              -           -           -           -           -
Net Income after Stock Based Comp. Expense                  -           -           -           -           -
Basic EPS after Stock Based Comp. Expense                   -           -           -           -           -
Diluted EPS after Stock Based Comp. Expense                 -           -           -           -           -
Depreciation, Supplemental                                  -           -           -           -           -
Total Special Items                                         -           -           -           -           -
Normalized Income Before Taxes                              -           -           -           -           -
Effect of Special Items on Income Taxes                     -           -           -           -           -
Income Taxes Ex. Impact of Special Items                    -           -           -           -           -
Normalized Income After Taxes                               -           -           -           -           -
Normalized Income Avail to Common                           -           -           -           -           -
Basic Normalized EPS                                        -           -           -           -           -
Diluted Normalized EPS                                   4.49        2.79        4.06        3.92        7.31

Table 6: Income Statement

Google Inc.: Balance Sheet

The table below gives the balance sheet for the previous five quarters [16].

In Millions of USD (except for per share items)
As of                                              2009-03-31  2008-12-31  2008-09-30  2008-06-30  2008-03-31
Cash & Equivalents                                  10,426.29    8,656.67    8,370.47    7,363.54    6,519.75
Short Term Investments                               7,358.64    7,189.10    6,042.14    5,370.13    5,614.76
Cash and Short Term Investments                     17,784.93   15,845.77   14,412.61   12,733.67   12,134.51
Accounts Receivable Trade, Net                       2,543.11    2,642.19    2,541.49    2,641.90    2,560.91
Receivables - Other                                         -           -           -           -           -
Total Receivables, Net                               2,543.11    2,642.19    2,541.49    2,641.90    2,560.91
Total Inventory                                             -           -           -           -           -
Prepaid Expenses                                     1,317.86    1,404.11      897.35      846.87      697.79
Other Current Assets, Total                            434.90      286.11      111.40       94.40       71.72
Total Current Assets                                22,080.80   20,178.18   17,962.85   16,316.84   15,464.93
Property/Plant/Equipment, Total - Gross                     -    7,576.34    7,325.79    7,013.00    6,430.85
Goodwill, Net                                        4,830.31    4,839.85    4,821.65    4,853.81    4,791.40
Intangibles, Net                                       910.34      996.69    1,047.72    1,138.99    1,203.97
Long Term Investments                                  101.00       85.16    1,100.90    1,067.52    1,056.97
Other Long Term Assets, Total                          468.46      433.85      660.70      664.92      345.99
Total Assets                                        33,513.03   31,767.58   30,806.97   29,179.79   27,604.98
Accounts Payable                                       196.22      178.00      240.71      439.28      358.12
Accrued Expenses                                     1,452.90    1,824.45    1,683.23    1,565.69    1,730.86
Notes Payable/Short Term Debt                               -        0.00        0.00        0.00        0.00
Current Port. of LT Debt/Capital Leases                     -           -           -           -           -
Other Current liabilities, Total                       534.74      299.63      300.34      340.55      371.50
Total Current Liabilities                            2,183.85    2,302.09    2,224.28    2,345.51    2,460.47
Long Term Debt                                              -           -           -           -           -
Capital Lease Obligations                                   -           -           -           -           -
Total Long Term Debt                                        -        0.00        0.00        0.00        0.00
Total Debt                                                  -        0.00        0.00        0.00        0.00
Deferred Income Tax                                      0.00       12.52       20.42       22.20           -
Minority Interest                                           -           -           -           -           -
Other Liabilities, Total                             1,481.08    1,214.11    1,087.41      899.06      806.97
Total Liabilities                                    3,664.93    3,528.71    3,332.11    3,266.77    3,267.45
Redeemable Preferred Stock, Total                           -           -           -           -           -
Preferred Stock - Non Redeemable, Net                       -           -           -           -           -
Common Stock, Total                                      0.32        0.32        0.32        0.31        0.31
Additional Paid-In Capital                          14,694.50   14,450.34   14,194.20   13,904.27   13,561.95
Retained Earnings (Accumulated Deficit)             14,984.46   13,561.63   13,179.19   11,889.25   10,641.86
Treasury Stock - Common                                     -           -           -           -           -
Other Equity, Total                                    168.82      226.58      101.16      119.18      133.41
Total Equity                                        29,848.10   28,238.86   27,474.86   25,913.01   24,337.53
Total Liabilities & Shareholders' Equity            33,513.03   31,767.58   30,806.97   29,179.79   27,604.98
Shares Outs - Common Stock Primary Issue                    -           -           -           -           -
Total Common Shares Outstanding                        315.70      312.92      314.59      314.25      313.50

Table 7: Balance Sheet

Google Inc.: Cash Flow Statement

The table below gives the cash flow statement for the previous five quarters [16].

In Millions of USD (except for per share items)
Period                                               3 months   12 months    9 months    6 months    3 months
Ending                                             2009-03-31  2008-12-31  2008-09-30  2008-06-30  2008-03-31
Net Income/Starting Line                             1,422.83    4,226.86    3,844.42    2,554.48    1,307.09
Depreciation/Depletion                                 321.13    1,212.24      898.76      589.28      280.56
Amortization                                            82.09      287.65      215.62      138.85       55.96
Deferred Taxes                                         -12.85     -224.65     -124.60     -105.89      -38.21
Non-Cash Items                                         224.23    2,023.53      704.34      433.92      184.78
Changes in Working Capital                             212.08      327.23      192.03      -65.04      -10.72
Cash from Operating Activities                       2,249.51    7,852.86    5,730.56    3,545.60    1,779.45
Capital Expenditures                                  -262.75   -2,358.46   -1,990.62   -1,539.11     -841.60
Other Investing Cash Flow Items, Total                -156.08   -2,960.96   -1,511.97     -826.51     -565.40
Cash from Investing Activities                        -418.83   -5,319.42   -3,502.58   -2,365.63   -1,406.99
Financing Cash Flow Items                               31.84      159.09      114.77       94.98       51.10
Total Cash Dividends Paid                                   -           -           -           -           -
Issuance (Retirement) of Stock, Net                    -36.74      -71.52      -38.25      -22.75      -22.45
Issuance (Retirement) of Debt, Net                          -           -           -           -           -
Cash from Financing Activities                          -4.89       87.57       76.52       72.23       28.66
Foreign Exchange Effects                               -56.17      -45.92      -15.62       29.74       37.05
Net Change in Cash                                   1,769.62    2,575.08    2,288.88    1,281.94      438.16
Cash Interest Paid, Supplemental                            -        1.56        1.29        0.95        0.39
Cash Taxes Paid, Supplemental                               -    1,223.98      743.44      378.55       12.09

Table 8: Cash Flow Statement

Appendix E. Resumes

This appendix contains the resumes of Scott Larson, Khairun-nisa Hassanali, David Urbina and Michael Fashola. Scott Larson will serve as the Project Manager for the Google Opinions project. Khairun-nisa Hassanali will serve as the Lead Researcher for the Google Opinions project. David Urbina will serve as the Lead Architect for the Google Opinions project.
Michael Fashola will serve as the Testing Manager and Manager of Finance and Marketing for the Google Opinions project. 42 Khairun-nisa Hassanali Home: 214-281-8888 Email: khairunnisa.hassanali@gmail.com www.utdallas.edu/~khassanali 2400 Waterview Parkway, #418 Richardson, TX 75080 OBJECTIVE Secure a Software Engineering position in an innovative team with a passion for quality EDUCATION The University of Texas – Dallas, PhD in Computer Science, anticipated 2011 G.P.A 3.888/4.0 Major: Computer Science Research Focus: Natural Language Processing, Opinion Mining Bangalore University - Bangalore, India, MCA, Dec 2001 Percentage 87.78/100 Major: Computer Applications Bangalore University, - Bangalore, India, Bachelors of Science (B.Sc.), May 1998 Percentage 67.38/100 Major: Computer Science, Mathematics, Statistics PROGRAMMING LANGUAGES C C++ Java Python Visual Basic SKILLS Dynamic, self-motivated, customer-oriented, team player and a quick learner 6 years experience in developing SIP Servers and User Agent Toolkits and application software MEMBERSHIPS IEEE, Student Member, January 2009 – Current AWARDS AND HONORS Recipient of the Graduate Student Scholarship, University of Texas at Dallas Team award, Flextronics Software Systems for improving the performance of SIP Server PROFESSIONAL EXPERIENCE Teaching Assistant The University of Texas at Dallas Sep. 2008 – Present Assist professors in tutoring students and grading assignments of courses including Natural Language Processing. Research Assistant The University of Texas at Dallas Sep. 2007 – May 2008 Conducted research on automatic classification of political blogs, named entity recognition, opinion mining and detection of sarcasm in written text. Implemented prototypes using Python, Java and C on UNIX environment. Technical Leader Flextronics Software Systems, Bangalore, India May 2003 – Feb. 
2007 Led a team of 4 software engineers in developing SIP (Session Initiation Protocol) Server Frameworks and User Agent Toolkits. Responsibilities included design, testing, reviewing and customer support. Received a team award for improving the performance of the SIP Server Frameworks 5.9. Used C++ and UNIX. Trainee Software Engineer Icope Technologies Pvt. Ltd, Bangalore, India Feb. 2003 – May 2003 Developed J-Theseus, a Customer Relationship Management package using Java, JSP, JDBC and MS SQL Server. Software Developer Lakhani General Suppliers, Mombasa, Kenya Feb. 2002 – Nov. 2002 Single handedly developed and deployed an Inventory Control system using Visual Basic and MS SQL Server. REFERENCES AVAILABLE UPON REQUEST 43 David Urbina 2400 Waterview Parkway, Apt. 524 Richardson, Texas 75080 Home: 972-233-1659 Email: david.urbina@acm.org OBJECTIVES Work in an innovative company thereby orienting my professional career in the Software Architecture and Software Requirements knowledge areas. 
EDUCATION The University of Texas – Dallas Master in Computer Science, anticipated December 2010 G.P.A.: 4.0/4.0 Major: Software Engineering Courses: Advance Requirements Engineering, Object-Oriented Analysis and Design Simón Bolívar University – Venezuela Computer Engineering, 2006 Average: 4.25/5.0 Courses: Database Systems I, II, III; Operative Systems I, II, III; Information Systems I, II, III PROGRAMMING LANGUAGES J2SE C# TL-SQL SOFTWARE Visual Studio 2008, .NET Framework 3.5, Spring Framework 2.5, Eclipse, Tomcat, WebSphere Community Edition, Oracle Database Server 10g, Microsoft SQL Server 2005, Windows XP, Linux CERTIFICATIONS J2SE Sun Certified Programmer SKILLS 5 years of experience working in team environments 4 years of experience in object-oriented analysis/design/development 3 years of experience using UML modeling language 1 year of experience assisting students in computer science laboratory LANGUAGES Written and Oral fluency in Spanish and English MEMBERSHIPS ACM, student membership, 2005 - current AWARDS AND HONORS Honorable Mention for the research “Security and Portals for SUMA grid” PROFESSIONAL EXPERIENCE Solutions Architect DBAccess, Inc. January 2008 – December 2008 Principal responsible for the decisions of design in two large-scale projects. Demonstrated skills for designing software-intensive system architectures, Communicating ideas and training co-workers. Promoter and co-organizer of the Quality Architecture Evaluation Service in this company and first Quality Architecture Evaluator of the SCADA system of PDVSA, S.A., one of the largest oil companies in the world. Solutions Developer DBAccess, Inc. January 2006 – December 2007 Worked in 3 projects for across-the-globe clients. Demonstrated skills for accomplishing goals on schedule and learning new technologies and methodologies. Continually collaborate with the design team, giving ideas to improve the architecture design. 
Quick assignation to the highly respected Software Architecture Unit of the company. INTERESTS Technical Reading, Working in Open Source projects REFERENCES AVAILABLE UPON REQUEST 44 Michael Fashola Home: 214-575-0104 Cell: 469-693-8687 mof081000@utdallas.edu 9637 Forest Lane Dallas, Tx 75243 OBJECTIVE Multi-disciplinary Software architect with an eye for value adding and impacting positively on the bottom-line through sales strategy EDUCATION The University of Texas at Dallas, M.S. in Software Engineering anticipated May 2010 Hours Completed: 9 out of 24 for degree EdExcel, UK B-TEC/HND Software Engineering 2008 University of Ilorin, Ilorin, NIGERIA B.Sc in FINANCE 1999 PROGRAMMING LANGUAGES Java C C++ C# Python PERL SKILLS 8years experience working in Teams environment 5years experience developing and testing enterprise applications 2years experience in optimization and synthesis of high performance systems LANGUAGES English MEMBERSHIPS IEEE, student membership, 2009 – current ACM, student membership, 2009 – current Institute of Chartered Accountants on Nigeria, 1997 – current Chartered Institute of Bankers of Nigeria, 2001 – current PROFESSIONAL EXPERIENCE Head, Asset & Liability TEAM OCEANIC BANK INTERNATIONAL PLC January 2001 – present Asset and Liability monitoring and management Testing, recommendation and deployment of new software applications for financial system governance Strategically building the Bank's assets and liabilities in line with Central Bank regulations Audit Officer OFFICE OF AUDITOR GENERAL FOR LOCAL GOVERNMENT Aug 1999 – Feb. 2000 Public Sector Accounting Value for money audit Audit Officer ARUNA BAWA & Co. (Chartered Accountants) Consultancy services March 1999 – July 1999 REFERENCES AVAILABLE ON REQUEST 45 Scott Larson Home: (972) 495-6171 Cell: (214) 738-7986 E-mail: s_larson323@yahoo.com 2713 Chariot Ln. 
Garland, Tx 75044 Objective To contribute to the Software Development Industry with knowledge of Programming and Software Engineering, and increase my knowledge of corporate practice Education The University of Texas at Dallas M.S. – Software Engineering Anticipated Fall 2009 GPA: 3.46 / 4.00 Software Engineering Courses: Requirements Engineering Software Architecture and Design Software Testing and Verification Database Design Texas Christian University B.A. – Music Spring 2007 GPA: 3.29 / 4.00 Music Composition Programming Languages Java JavaScript C++ ASP C Visual Basic Software MS Office MS Visio Adobe Photoshop SQL Server Skills 6 years experience programming Java 5 years Organizational experience 4 years experience Problem Solving 4 years Analytical experience 3 years Communication [academic environment] 1 year Leadership [academic environment] Languages English (Native) German (some) Memberships IEEE - current SQL HTML Visual C++ Codewarrior Studio Factory Academic Projects at The University of Texas – Dallas Title Software Testing and documentation (leader) Proposal Architecture and Design Brief Description 6-person group project black-box testing Semester Spring 2009 4-person group project developing a business-level project proposal (leader) 3-person group project designing and Implementing an online search engine (leader) Spring 2009 REFERENCES AVAILABLE UPON REQUEST Fall 2008 46 References [1] R. Kumar, J. Novak, P. Raghavan, and A. Tomkins, "On the bursty evolution of blogspace," in WWW '03: Proceedings of the twelfth international conference on World Wide Web. ACM Press, 2003, pp. 568-576. [Online]. Available: http://dx.doi.org/10.1145/775152.775233 [2] K. Dave, S. Lawrence, and D. M. Pennock, "Mining the peanut gallery: opinion extraction and semantic classification of product reviews," in WWW '03: Proceedings of the twelfth international conference on World Wide Web. ACM Press, 2003, pp. 519-528. [Online]. 
Available: http://dx.doi.org/10.1145/775152.775226
[3] A. N. Langville and C. D. Meyer, Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press, July 2006.
[4] B. Williams and J. Jacobs, "Exploring the use of blogs as learning spaces in the higher education sector," Australasian Journal of Educational Technology, vol. 20(2), pp. 232-247, 2004. [Online]. Available: http://www.jeremybwilliams.net/AJETpaper.pdf
[5] Ian A. McAllister, Christoph R. Ponath, Ling Bao, Steven J. Hanks, Microsoft Corporation. “Extraction and summarization of information.” US 2007/0282867 A1, May 30, 2006
[6] Simon H. Corston-Oliver, Anthony Aue, Eric K. Ringger, Michael Gamon, Microsoft Corporation. “System for processing sentiment-bearing text.” US 2006/0200342 A1, Apr. 14, 2005
[7] I. Titov and R. McDonald, "A joint model of text and aspect ratings for sentiment summarization," in Proceedings of ACL-08: HLT. Columbus, Ohio: Association for Computational Linguistics, June 2008, pp. 308-316. [Online]. Available: http://www.aclweb.org/anthology-new/P/P08/P08-1036.bib
[8] E. Spertus, M. Sahami, and O. Buyukkokten, "Evaluating similarity measures: a large-scale study in the orkut social network," in KDD '05: Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. New York, NY, USA: ACM Press, 2005, pp. 678-684. [Online]. Available: http://dx.doi.org/10.1145/1081870.1081956
[9] “Google AdWords: Promote Your Business with Google.” Internet: https://www.google.com/accounts/ServiceLogin?service=adwords&cd=null&hl=enUS&ltmpl=regionala&passive=true&ifr=false&alwf=true&continue=https%3A%2F%2Fadwords.google.com%2Fselect%2Fgaiaauth%3Fapt%3DNone%26ugl%3Dtrue&sourceid=awo&subid=ww-en-et-ads-newawhptest2, [Apr.
22, 2009]
[10] “Google AdSense: Maximize revenue from your online content.” Internet: https://www.google.com/adsense/login/en_US/?sourceid=aso&subid=na-en-habk&utm_medium=ha&utm_term=google%20adsense&gsessionid=RH1NF7ML6p-FQWsMkHEYg, [Apr. 27, 2009]
[11] Wikipedia, The Free Encyclopedia. “Integration testing.” Internet: http://en.wikipedia.org/wiki/Integration_testing, [Apr. 24, 2009]
[12] Wikipedia, The Free Encyclopedia. “System testing.” Internet: http://en.wikipedia.org/wiki/System_testing, [Apr. 24, 2009]
[13] Wikipedia, The Free Encyclopedia. “Source code.” Internet: http://en.wikipedia.org/wiki/Source_code, [Apr. 24, 2009]
[14] Wikipedia, The Free Encyclopedia. “Executable.” Internet: http://en.wikipedia.org/wiki/Executable, [Apr. 24, 2009]
[15] Wikipedia, The Free Encyclopedia. “Software deployment.” Internet: http://en.wikipedia.org/wiki/Software_deployment, [Apr. 24, 2009]
[16] Wikipedia, The Free Encyclopedia. “Software Requirements Specification.” Internet: http://en.wikipedia.org/wiki/Software_Requirements_Specification, [Apr. 24, 2009]
[17] Kruchten, P., “Architectural Blueprints: The 4+1 View Model of Software Architecture,” IEEE Software, pp. 42-50, November 1995.
[18] “Which Google Products Make Money?” Internet: http://blogoscoped.com/archive/2009-01-07-n84.html, [Apr. 27, 2009]
[19] “Google Financial Statements.” Internet: http://www.google.com/finance?fstype=bi&cid=694653
[20] Google. “Financial Release.” Internet: http://investor.google.com/releases/2008Q4_google_earnings.html, [Apr. 22, 2009]
[21] Google. “Financial Release.” Internet: http://investor.google.com/releases/2009Q1_google_earnings.html, [Apr. 22, 2009]
[22] “Live Search.” Internet: http://www.live.com/, [Apr. 22, 2009]
[23] “Yahoo Search.” Internet: http://tools.search.yahoo.com/about/forsearchers.html?p, [Apr. 22, 2009]
[24] “Google Site Search: Power your Website Search with Google.” Internet: http://www.google.com/sitesearch/, [Apr.
22, 2009]
[25] “Google Search Engine.” Internet: http://www.google.com/, [Apr. 14, 2009]
[26] “Google Salaries.” Internet: http://www.glassdoor.com/Salaries/Google-SalariesE9079.htm, [Apr. 27, 2009]
[27] Amazon.com. “Intel BX80582X7460 6-Core Xeon X7460 Processor.” Internet: http://www.amazon.com/exec/obidos/ASIN/B001CH9B9Q/ref=nosim/6684177-20, [Apr. 22, 2009]
[28] “Microsoft Office Professional 2007 - Full Version.” Internet: http://www.qvc.com/qic/qvcapp.aspx/view.2/app.detail/params.aol_refer.false.tpl.detail.msn_refer.false.item.E176568.ref.GBA?cm_ven=GOOGLEBASE&cm_cat=Electronics&cm_pla=Software&cm_ite=E176568, [Apr. 22, 2009]
[29] “Microsoft Office Project Professional 2007 - PC - CD-ROM – English.” Internet: http://www.google.com/products/catalog?q=MS+Project&hl=en&cid=8675170509842985112&sa=title#ps-sellers, [Apr. 22, 2009]
[30] “Microsoft Windows Vista Business w/SP1.” Internet: http://www.google.com/products/catalog?hl=en&rls=com.microsoft:*:IESearchBox&q=cost+of+microsoft+vista+business&um=1&ie=UTF8&cid=982345806236338989&ei=tFL2SZLBJeKrtgej943tDw&sa=X&oi=product_catalog_result&resnum=1&ct=result#ps-sellers, [Apr. 27, 2009]
[31] Oracle. “Data Mining.” Internet: http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=11223&sitex=10021:22372:US, [Apr. 22, 2009]
[32] OriginLab Data Analysis and Graphing Software. “Products.” Internet: http://www.originlab.com/index.aspx?s=8&lm=88&pid=941, [Apr. 22, 2009]
[33] “Red Hat Store.” Internet: https://www.redhat.com/wapps/store/catalog.html;jsessionid=83T9Bh5ijdD6C2s95AHYgQ**.4b748952, [Apr. 27, 2009]
[34] “IBM Rational Purify for Linux and UNIX – Unix.” Internet: http://www.google.com/products?q=rational+purify+price&hl=en, [Apr. 22, 2009]
[35] “IBM Rational Rose Developer for UNIX – Unix.” Internet: http://www.google.com/products/catalog?q=rational+rose+price&hl=en&cid=6305737877623743092&sa=title#ps-sellers, [Apr.
22, 2009]
[36] “IBM Rational Rose Technical Developer - PC, Unix.” Internet: http://www.google.com/products/catalog?q=rational+rose+price&hl=en&cid=15467682535907615653&sa=title#ps-sellers, [Apr. 22, 2009]
[37] “IBM Rational Rose Data Modeler – PC.” Internet: http://www.google.com/products/catalog?q=rational+rose+price&hl=en&cid=17416741335320358674&sa=title#ps-sellers, [Apr. 22, 2009]
[38] “IBM Rational PurifyPlus Enterprise Edition - PC, Unix.” Internet: http://www.google.com/products/catalog?hl=en&q=rational+purifyplus+price&cid=224150831539729924&sa=title#ps-sellers, [Apr. 22, 2009]
[39] NetSarang Computer. “XManager Enterprise.” Internet: http://sales.netsarang.com/e_sales/online_store.html?open=xme#xme, [Apr. 22, 2009]
[40] Kobayashi, M. and Takeda, K., “Information retrieval on the web,” ACM Computing Surveys, New York, NY, June 2000.
[41] Brin, S. and Page, L., "The Anatomy of a Large-Scale Hypertextual Web Search Engine," Seventh International World-Wide Web Conference, Brisbane, Australia, April 1998.
[42] Wikipedia, The Free Encyclopedia. “PageRank.” Internet: http://en.wikipedia.org/wiki/PageRank, [Apr. 9, 2009]
[43] G. Mishne, "Multiple ranking strategies for opinion retrieval in blogs," 2006 TREC Blog Track, 2006. [Online]. Available: http://staff.science.uva.nl/~gilad/pubs/trec06-blogret.pdf
[44] K. Lerman, S. Blair-Goldensohn, and R. McDonald, "Sentiment summarization: Evaluating and learning user preferences," in 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09), 2009.
[45] Wikipedia, The Free Encyclopedia. “Automatic Summarization.” Internet: http://en.wikipedia.org/wiki/Automatic_summarization, [Apr. 9, 2009]
[46] “Amos WebGLOSS arama.” Internet: http://www.amosweb.com/cgi-bin/awb_nav.pl?s=gls&c=dsp&k=util, [Apr. 14, 2009]
[47] “Google Analytics.” Internet: http://www.google.com/analytics/, [Apr. 14, 2009]
[48] Wright, P., “Knowledge Discovery In Databases: Tools and Techniques,” ACM Crossroads, Winter 1998.
[49] Fayyad, U., Shapiro, G. and Smyth, P., From Data Mining to Knowledge Discovery in Databases.
[50] Wikipedia, The Free Encyclopedia. “Web Crawler.” Internet: http://en.wikipedia.org/wiki/Web_crawler, [Apr. 9, 2009]
[51] M. Steyvers and T. Griffiths, Probabilistic Topic Models. Lawrence Erlbaum Associates, 2007.
[52] Wikipedia, The Free Encyclopedia. “IBM WebFountain.” Internet: http://en.wikipedia.org/wiki/WebFountain, [Apr. 20, 2009]
[53] Roland Piquepaille's Technology Trends. "IBM's WebFountain of Knowledge." Internet: http://radio.weblogs.com/0105910/2004/03/01.html, [Apr. 20, 2009]
[54] Qingliang Miao, Qiudan Li and Ruwei Dai. “AMAZING: A sentiment mining and retrieval system.” In Expert Systems with Applications: An International Journal, vol. 36, no. 3, pp. 7192-7198, April 2009