ChatGPT, LLMs, and Indeed: AI Ethics' Guidelines
Authors: @Thom Lake (Principal Data Scientist & Deep Learning Lead), @Lewis Baker (Staff Product Scientist, AI Ethics), and @Trey Causey (Head of AI Ethics and Director, Data Science). All opinions are our own and are offered in our capacity as scientists who think daily about the potential risks and rewards of AI and ML. Any errors are the fault of @Trey Causey. Please don't hesitate to provide feedback, or get in touch if you'd like to arrange a more in-depth discussion with your team.
As we write this in early February 2023, artificial intelligence (AI) has captured the world's imagination and businesses are
scrambling to figure out how AI will affect them. AI has seemingly made incredible progress in a short period of time. Like
many others, we are very excited about the prospects AI has for helping people get jobs and transforming our work and our
world. We are also cognizant that these advances bring both old risks and a host of new ones. Our goal is to demystify some
of the recent developments in AI, provide some introductory guidance for how to think about their use at Indeed, and make
sure that this pivotal moment doesn't pass us by unprepared.
Indeed's next Hackathon theme is "artificial intelligence", Indeedians are pouring ideas into shared docs and Slack channels
such as #indeedgpt, and questions are already appearing in Q&As. Let's help all people get jobs!
Executive Summary
LLMs such as ChatGPT are exciting, but not new – the recent explosion in hype is mostly a function of UX and
availability.
Bias, toxicity, and hallucinations are all major risks for these models. Big companies such as Indeed have to
use caution when deploying.
ChatGPT is exciting, but the business use cases are ill-defined. We propose some ways of thinking about
defensibility, differentiation, and scalability.
Risk doesn't trump reward – let's invest at this pivotal moment and cement Indeed's place as a leader in AI-driven HR technology!
What are LLMs and ChatGPT? Why am I hearing about them all of a sudden?
Large language models (LLMs, also called 'foundation models') such as ChatGPT aren't particularly new – they've been
around for a few years. "Attention is All You Need", the paper from Google that introduced the transformer architecture on
which these models are based, came out in 2017. OpenAI's ChatGPT is assumed to be based mostly on GPT-3.5, which is
itself an update to GPT-3 – released in 2020. In fact, OpenAI's GPT-3 API is already in use at Indeed, Indeed has its own
custom LLM (Indeed-BERT), and smaller language models have been successfully deployed to improve matching, resume
parsing, taxonomy, and more.
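For intuition about what's under the hood, here is a minimal NumPy sketch of scaled dot-product attention, the core operation of the transformer architecture. This is an illustrative toy, not Indeed or OpenAI code – real models stack many attention heads and layers on top of this:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each output row is a weighted average
    of the value vectors V, weighted by how well that row's query matches
    each key. Stacks of this operation are what "Attention is All You
    Need" introduced."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8                                 # 4 "tokens", 8-dim vectors
Q, K, V = (rng.normal(size=(seq_len, d_k)) for _ in range(3))
print(attention(Q, K, V).shape)                     # -> (4, 8)
```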
So, why did ChatGPT explode into the public consciousness in late 2022?
Three reasons, only one of which is related to developments in the underlying models:
1. UX improvements. The chat interface provided a novel, approachable way for anyone to interact with these models.
Yann LeCun (Chief Scientist at Meta and one of the "Big 3" names in modern AI) has caught a lot of flak for saying
there isn't much "new" here, but fundamentally he's correct. OpenAI turned GPT-3.5 into a product that resonated.
2. Widespread (free) availability. The models that power ChatGPT have been available in OpenAI's playground and via
API for a while, but you had to pay for access. ChatGPT has effectively productized its users, whose queries are fed
back into the underlying model to improve its performance.
3. Reinforcement learning from human feedback (RLHF). OpenAI invested significant resources into "aligning" its LLM
with human intent so that responses are more likely to be helpful, accurate, unbiased, and free of toxicity. Anyone
who has played with ChatGPT for a little while knows this was only somewhat successful. (A toy sketch of the
reward-modeling step at the heart of RLHF appears just after this list.) If you're interested in learning more about
how transformers and RLHF work, this is a good and approachable read, though the definitive guide is Jay Alammar's
"The Illustrated Transformer".
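To make the RLHF idea concrete, here is the promised toy sketch of its reward-modeling step: given pairs of responses where humans preferred one over the other, fit a reward function that scores preferred responses higher. Pure NumPy and purely illustrative – in real RLHF the reward model is itself a large neural network, and the LLM is then tuned against it with RL (e.g., PPO):

```python
import numpy as np

rng = np.random.default_rng(0)

# Each "response" is a fixed feature vector here; in real RLHF these would
# be learned representations of actual model outputs.
n_pairs, dim = 500, 8
preferred = rng.normal(1.0, 1.0, size=(n_pairs, dim))  # human-preferred
rejected = rng.normal(0.0, 1.0, size=(n_pairs, dim))   # dispreferred

theta = np.zeros(dim)  # linear reward model: r(x) = theta @ x
lr = 0.1
for _ in range(200):
    # Bradley-Terry preference loss:
    # maximize log sigmoid(r(preferred) - r(rejected))
    margin = (preferred - rejected) @ theta
    grad = ((1 / (1 + np.exp(-margin)) - 1)[:, None]
            * (preferred - rejected)).mean(axis=0)
    theta -= lr * grad

acc = ((preferred @ theta) > (rejected @ theta)).mean()
print(f"reward model ranks the preferred response higher {acc:.0%} of the time")
```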
Sounds great! Let's get JobbyGPT deployed today... right?
Not (quite) so fast. The risks to job seekers are plentiful, as are the reputational (and potentially legal) risks to Indeed. As the
#1 job site in the world, we are stewards of job seekers and must proceed responsibly.
One of the main reasons OpenAI has made such a splash in this space is that they are a startup with little to lose. GPT-3.5
(the underlying model) is available for paid use via OpenAI's API right now! However, this model does not have the same
guardrails as ChatGPT. Google has notably been much slower to deploy its own LLM (LaMDA), citing concerns about
toxicity, bias, and hallucinations, though it has promised a release soon. Microsoft, in partnership with OpenAI, sees
this as an opportunity to win back market share in search, and is therefore betting big on integrating LLMs into its
products and moving more quickly.
Notably, both Microsoft and Google have placed responsible and ethical AI at the center of their statements on these
models. Google has been accused of shipping too slowly, but scale matters: if Google's model is hallucinating or toxic even
0.1% of the time, that's still thousands of bad answers a day on any feature serving a few million queries. These kinds of
errors generate Congressional inquiries and new laws.
Bias and toxicity. The risk of AI models producing biased, racist, sexist, ableist, transphobic, homophobic, and otherwise
toxic outputs is well documented and requires sustained effort to manage. LLMs introduce an entirely new set of risks in this
domain: they are generative models, meaning they produce new outputs in a (mostly) non-deterministic manner, and thus
can't be guaranteed to behave as intended. OpenAI is thought to have spent roughly $10MM USD to train the model, plus an
undisclosed additional amount (including employing hundreds of contractors) to try to debias it. Some Reddit users have
taken this as a challenge to "jailbreak" the model into producing very toxic outputs – successfully. Discussing the risks here
could consume many pages.
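A toy example of why generative outputs can't simply be unit-tested into safety: the same prompt can yield different tokens on every call, because decoding samples from a probability distribution. Illustrative NumPy only – a real LLM does this over a ~50K-token vocabulary at every generation step:

```python
import numpy as np

# Toy next-token distribution for a single prompt.
vocab = ["great", "terrible", "fine"]
logits = np.array([2.0, 0.5, 1.0])

def sample_token(temperature, rng):
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    return rng.choice(vocab, p=probs)

rng = np.random.default_rng()
# Same input, different outputs from run to run: you can lower the odds of
# a bad generation, but you cannot test your way to a guarantee.
print([sample_token(0.9, rng) for _ in range(5)])
```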
Privacy and security. LLMs can "memorize" their training data and reproduce verbatim copies of that data when producing
outputs. Imagine a job seeker interacting with Indeed's GPT implementation only to have someone's exact resume data (at
best) or PII (at worst) produced, screenshotted, and shared. Similarly, a niche community has popped up around "prompt
injection", where users try to get LLMs to reveal sensitive information about their inner workings.
Got it, there are risks. Let's manage those and transform our business!
We agree! However, we think there is a lot of work to do to figure out how to do so. We can't just "rub some AI on it" and
watch our profits soar. As discussed above, it's taken nearly 3 years for GPT-3 to make a big splash with the broader public.
We need to identify use cases that are a) helpful to job seekers, employers, or Indeedians, b) defensible and/or differentiating,
and c) scalable.
Helpfulness. Right now, it's not clear what the really transformative use cases are here. There is, of course, a "WOW" factor to
using ChatGPT – but will it help job seekers get a job faster? Will it improve matching or get us closer to the hire? Does it
reduce friction? Can we do it in a way that minimizes the risks above? Importantly, even if we can do this, we will often be
unable to explain to job seekers or employers why the model produced a given set of outputs, which can raise regulatory risks
as well as #brokenexperiences.
Defensible and differentiating. The technology underlying these models is mostly "open" (even if the weights of a specific
model are not available, the methodology is generally understood) and is commoditizing quickly. What seems amazing right
now will be available as a commodity API in six months. Many companies are rushing to add "chat" or "AI" to their offerings,
mostly in a bolt-on fashion. As it turns out, even though chat is a really fun interface modality, use cases that genuinely
benefit from it are few and far between (for now).
Counterintuitively, we find that the most promising near-term applications for LLMs are usually not the obvious chat
interfaces, but the ones that power experiences invisibly for end users – think improving search results, or serving as a
backend or "middleware" behind existing UX (though beware prompt injection, discussed above). While we don't claim to be
product development experts across the business, we do think the lion's share of current initial offerings are unremarkable
and unlikely to generate significant moats.
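As a concrete illustration of the "invisible middleware" pattern, here is a sketch using OpenAI's 2023-era completions API to enrich a terse search query behind the scenes. The prompt, model choice, and `expand_query` helper are all hypothetical illustrations, not a proposed product:

```python
import openai  # pip install "openai<1.0"; the 2023-era completions API

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

def expand_query(raw_query: str) -> str:
    """Hypothetical search 'middleware': use an LLM behind the scenes to
    enrich a terse job-search query. The job seeker never sees a chatbot."""
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=(
            "Rewrite this job search query as a comma-separated list of "
            f"related job titles and synonyms:\n\n{raw_query}\n"
        ),
        max_tokens=64,
        temperature=0.2,  # low temperature: stable, boring rewrites on purpose
    )
    return resp.choices[0].text.strip()

# e.g. expand_query("rn nights dallas") might come back as something like
# "registered nurse, night shift, Dallas TX, staff nurse, travel nurse"
```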
Let's identify the things that Indeed does uniquely well, and the use cases that are uniquely suited to our vast data, and invest
quickly and substantially there to drive positive feedback loops. Investing in areas that are unlikely to be market-movers for us,
or that are easily copied by competitors, is likely a misallocation of resources.
Scalability. It's estimated that each ChatGPT query is processed by multiple A100s (arguably the most powerful GPU on
the market) and costs at least a cent. While that seems small, multiple Google engineers have pointed out that
Google could not survive with that kind of cost per query. This stuff is expensive – and that's before we even talk about
training models, which runs into the millions of dollars. On top of this, we would need to reorient significant portions of our
infrastructure to increase GPU availability – a significant cost increase over CPU-based inference.
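A back-of-the-envelope illustration of why per-query cost dominates this conversation, using the rough ~$0.01-per-query public estimate cited above (an assumption, not a measured figure):

```python
# Annualized serving cost at an assumed ~$0.01 per query.
cost_per_query = 0.01  # USD, rough public estimate
for daily_queries in (1_000_000, 10_000_000, 100_000_000):
    annual = daily_queries * cost_per_query * 365
    print(f"{daily_queries:>11,} queries/day -> ${annual:>13,.0f}/year")
```

Even a modest one million queries per day works out to roughly $3.65MM per year in inference alone under that assumption.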
So is this all a dead-end? NO!
Absolutely not – we meant what we said above about this being a pivotal, epoch-defining moment for AI. We are about to
witness truly significant shifts in how products are developed, in the kinds of products that are developed, and in many
weirder things we're not even anticipating yet. We should take advantage of this and cement our place as the #1 job site
in the world for the next era of AI-driven HR tech. If anyone can do this in a way that helps all people get jobs, Indeed
can!
Luckily, Indeed already possesses the talent, infrastructure, and energy to put us on that path. @Thom Lake has been working
in this space for years and has thought carefully about integrating deep learning and AI into our products. The AI Ethics team
has been waiting for this moment since its formation. We are not advocating for prioritizing risk over reward, and we are
excited about helping reorient parts of our business. Doing so will require broad efforts:
Ensuring our data platforms and model-training infrastructure have sufficient access to the kinds of unstructured data
that power these models, and to the GPUs needed to train and serve them – this gets expensive quickly and requires
expertise
Investing in the Data Science, Product Science, and SWE talent necessary to drive innovation and develop new
models (this is already happening, but we can do much more!)
Adjusting our development portfolio to incorporate more exploratory (read: risky) investments in (L)LMs and other
generative models – these won't be iterative improvements on existing models and will take experimentation to get
right
Exploring open-source options such as T0, BLOOM, and more – OpenAI isn't the only game in town!
We'll need to do this responsibly and inclusively – AI Ethics is here to help. While we're not quite ready to publish a set of
guidelines that guarantees safe deployment of LLMs or generative models, we're happy to work through these
issues with you. In the meantime:
1. Always ask yourself about the risk of PII being sent to or from these models.
2. Ask yourself about the possibility of non-deterministic models being "gamed" into embarrassing or toxic outputs.
Big players such as Indeed are targets for hackers, trolls, and regulators.
3. Think about "rich get richer" scenarios – a small group of people in the world right now understand how to interact with
LLMs to get optimal results. Let's build products for all job seekers.
4. Remember, these models were trained (for all intents and purposes) on the free text of the internet. You've been on the
internet, haven't you? You know what's out there, lurking in latent space.
Want to partner with AI Ethics on your next AI project? Reach out in #ai-ethics on Slack, email ai-ethics@indeed.com for
help, or ping @Trey Causey.