Shutterstock

Four positions. If interested, send resume and statement of intent/cover letter to

Eliot Brenner
Data Scientist, Algorithms Team
[email protected]
350 Fifth Avenue, 21st Floor
New York, NY 10118

-----------------------

1. Machine Translation Research Internship

One of our challenges at Shutterstock is the translation of keywords and search queries. Keywords are a bag of terms of varying word lengths. Queries, on the other hand, are an ordered set of terms, typically one to four terms in length. Translations are performed between English and up to 20 other languages. We are looking for innovative means of improving our translations. In this project, you will tackle such problems as word sense disambiguation using language modeling and other probabilistic techniques.

The ideal candidate will have experience with language modeling, word sense disambiguation, statistical machine translation, and other NLP techniques. Advanced understanding of the morphology of multiple languages is a plus. Understanding of Unicode is a plus.

2. Personalization Research Internship

The purpose of the personalization project is to devise and implement methods to improve search results and download recommendations by incorporating a subtler understanding of user intent than can be obtained from looking at the most recent query text alone. We are looking for ways to leverage the recent search activity of the customer, the longer-term behavioral history, and other data about the customer to better understand the context of the search. Part of this project will be answering questions about which types of data sources can contribute to an understanding of the search context. Another part of the project will be building and evaluating models of user intent based on these data sources. The goals and behavioral patterns of stock photo/footage customers present many challenges that are not encountered in the more traditional domains of recommender systems.
This project presents the intern with the opportunity to do original research at the frontiers of the field of personalization.

The ideal candidate will have experience applying machine learning techniques such as clustering, probabilistic graphical models, logistic regression, and Bayesian decision theory. Experience analyzing large data sets, such as search and event logs, using tools such as Hadoop, is a plus. The coding will likely be done in Java and/or Python.

3. Image Content Research Internship

Looking to apply your image analysis skills to a collection of millions of high-quality images? Do you dream in kernels, pixels, and color spaces? If so, come join us for a summer of exploration and prototyping where you will put your skills to the test researching and building important customer-facing features on top of image analysis. You will work with us on extracting value from millions of pixels of data, and work with our data scientists to develop and extract new features from our vast image collection. The internship will hopefully lead to novel product features seen by thousands of customers and contributors.

The ideal candidate will have a demonstrable background in image analysis. Familiarity with OpenCV, scikit-image, ndimage or similar is a plus. Coding ability in Python, Java or C++ is a plus.

4. Deep Learning Project Research Internship

We are seeking a motivated intern with relevant research and coding experience to come help us implement ground-breaking services built around deep learning. Shutterstock has an unusually large (tens of millions) high-quality collection of multimedia assets (images, videos, vector drawings, etc.). This project will extract value from these media files and help us launch industry-leading features. This internship will offer a unique opportunity to apply academic research to a rich and procured data set not often found in academia.
Ideally you will have experience with implementing and tweaking large neural networks, convolutional neural networks, auto-encoders and/or deep belief networks. GPU programming experience is also a plus, as is experience with Python, Java, or CUDA.

----------------------------------------------------------------

About Shutterstock
-----------------

At Shutterstock we are committed to using our vast amount of data to solve challenging problems. We have over 40 million labelled images and logs of over 50 million events every day, including 5 million searches in over 20 languages. We are constantly looking for innovative approaches to leveraging our data to improve all aspects of our customers’ experience, including recommendations, personalization, clustering, and translations.

Shutterstock’s internship program:
---------------------------------

Interns will come work within the search and data science team in midtown Manhattan (NY, NY). We are currently seeking interns for the summer of 2014 but will also consider applicants seeking internships at other times.