Four positions.
If interested, send resume and statement of intent/cover letter to
Eliot Brenner
Data Scientist, Algorithms Team
[email protected]
350 Fifth Avenue, 21st Floor
New York, NY 10118
-----------------------
1. Machine Translation Research Internship
One of our challenges at Shutterstock is the translation of keywords and search queries. Keywords are a
bag of terms of varying word lengths. Queries, on the other hand, are an ordered set of terms, typically
one to four terms in length. Translations are performed between English and up to 20 other languages.
We are looking for innovative means of improving our translations. In this project, you will tackle such
problems as word sense disambiguation using language modeling and other probabilistic techniques.
The ideal candidate will have experience with language modeling, word sense disambiguation, statistical
machine translation, and other NLP techniques. Advanced understanding of the morphology of multiple
languages is a plus. Understanding of Unicode is a plus.
2. Personalization Research Internship
The purpose of the personalization project is to devise and implement methods to improve search
results and download recommendations by incorporating a subtler understanding of user intent than
can be obtained from looking at the most recent query text alone. We are looking for ways to leverage
the recent search activity of the customer, the longer term behavioral history, and other data about the
customer to better understand the context of the search.
Part of this project will be answering questions around which types of data sources can contribute to an
understanding of the search context. Another part of the project will be building and evaluating models
of user intent based on these data sources.
The goals and behavioral patterns of stock photo/footage customers present many challenges which are
not encountered in the more traditional domains of recommender systems. This project presents the
intern with the opportunity to do original research at the frontiers of the field of personalization.
The ideal candidate will have experience applying machine learning techniques such as clustering,
probabilistic graphical models, logistic regression, and Bayesian decision theory. Experience analyzing
large data sets, such as search and event logs, using tools such as Hadoop, is a plus. The coding will likely
be done in Java and/or Python.
3. Image Content Research Internship
Looking to apply your image analysis skills to a collection of millions of high-quality images? Do you
dream in kernels, pixels, and color spaces? If so, come join us for a summer of exploration and
prototyping where you will put your skills to the test researching and building important
customer-facing features built on top of image analysis. You will work with us on extracting value from millions of
pixels of data, and work with our data scientists to develop and extract new features from our vast
image collection. The internship will hopefully lead to novel product features seen by thousands of
customers and contributors.
The ideal candidate will have a demonstrable background in image analysis. Familiarity with OpenCV,
scikit-image, ndimage, or similar is a plus. Coding ability in Python, Java, or C++ is a plus.
4. Deep Learning Project Research Internship
We are seeking a motivated intern with relevant research and coding experience to come help us
implement ground-breaking services built around deep learning. Shutterstock has an unusually large
(tens of millions) high-quality collection of multimedia assets (images, videos, vector drawings, etc.). This
project will extract value from these media files and help us launch industry-leading features. This
internship will offer a unique opportunity to apply academic research to a rich and procured data set not
often found in academia.
Ideally you will have experience with implementing and tweaking large neural networks, convolutional
neural networks, auto-encoders and/or deep belief networks. GPU programming experience is also a
plus, as is experience with Python, Java, or CUDA.
----------------------------------------------------------------
About Shutterstock
-----------------
At Shutterstock we are committed to using our vast amount of data to solve challenging problems. We
have over 40 million labelled images and logs of over 50 million events every day, including 5 million
searches in over 20 languages. We are constantly looking for innovative approaches to leveraging our
data to improve all aspects of our customers’ experience, including recommendations, personalization,
clustering, and translations.
Shutterstock’s internship program:
---------------------------------
Interns will come work within the search and data science team in midtown Manhattan (NY, NY). We are
currently seeking interns for the summer of 2014 but will also consider applicants seeking internships at
other times.