Uploaded by Sonia L.

Couderay Guidelines-converted

NH & Campus Integrity Labeling Guidelines v.2 [DRAFT]
use this data to build a taxonomy to inform future product policies, understand the most
frequently occurring ‘borderline harmful content’, implement product interventions, and inform
future ML development so that we can minimize the impact of harmful or unwanted content in
the community to keep users safe and have positive experiences within each product.
B. Sensitive Categories Taxonomy (DEFINITIONS AND POLICIES)
1. Violating: This post content violates our implementation standards (IS) policy. See IS policy here
2. Sensitive: The content in this post could be harmful, offensive, controversial, divisive, or unsafe; could be sensitive for some users (because of age, cultural background, etc.); could cause a negative user experience or reaction; or is borderline under our IS policies but is not captured within the IS policy.
3. Completely Benign: The content in this post carries no degree of integrity risk, and it would be OK for any user to see or interact with it.
Sensitive Topics
Sensitive Topic Category | Topic Definition | Examples
Hateful Speech | Content that may be hateful but does not meet the threshold to violate community standards | E.g. saying something about another group, race, or ethnicity that classifies them as a specific category or class with negative references
Borderline bullying/harassment/unkind behaviors | Content that is bullying/harassment but doesn't meet the threshold to violate community standards, or is unkind towards others | E.g. name calling, making fun of others in a mean-spirited way
Discrimination/Racial Profiling | Content that discriminates against certain protected characteristics (e.g. race, gender, age, nationality, sexuality, religious beliefs) or racially profiles others | E.g. describing crime in a neighborhood by only referring to a person's race
Financial Manipulation | Pyramid schemes, multi-level marketing, and other ways to scam users | E.g. transferring funds to other accounts by using illegal means
Promotional | Content that seeks to publicize a product, org, or venture | E.g. sharing personal social media pages, advertising a local/personal business, selling goods
Potential misinfo | Any content that contains potentially false or misleading information | E.g. conspiracy theories, links to unverified/false news
Political Content | Any content related to politics, government, political figures, or instigation of highly debated topics | E.g. discussing election results, political candidates, policy and propaganda, BLM, Defunding the police
Religious Content | Content discussing/sharing religion or religious beliefs | E.g. discussing religious persecution or specific beliefs
Low Quality health | Content promoting things like miracle cures, quick weight loss methods, unverified supplements, etc. | E.g. discussing or promoting a product that makes claims to alleviate illness with no supporting evidence
Sexual Content | Content that is sexual in nature or sexually suggestive, that may be perceived as offensive by some, and that does not hit the threshold for nudity/pornography in implementation standards | E.g. discussing, promoting, or displaying body parts such as cleavage or other suggestive body parts
Dating | Content that is about dating or is seeking a romantic relationship or sexual solicitation | E.g. "anyone looking for a boyfriend?"
Profanity | Content that contains any amount of profanity | E.g. words 18-r, slang/offensive terms
Potential Drug/Weapon/Alcohol | Content referring to drugs, weapons, or alcohol. It doesn't have to be related to a sale; merely discussing them qualifies | E.g. discussing or promoting the sale or use of drugs, weapons, or alcohol
Animal Sales | Content that sells animals or gives them up for adoption | E.g. discussing putting puppies up for sale or looking for a new home for stray dogs or horses
Eating Disorder | Content promoting or discussing topics surrounding eating disorders. Exclude content that provides resources/educational info to receive help | E.g. ways to lose weight/restrict calories in an unhealthy and extreme way, and promoting anorexic lifestyles
Spam | Any content that may be considered spammy but doesn't hit the threshold to violate IS | E.g. posts of just links, shares to external pages
Harm to self/others | Content promoting or discussing topics that may cause harm to self or others. Exclude content that provides resources/educational info to receive help | E.g. people talking about ways to cut yourself or commit suicide
Ranting | Any post that is going on a rant: highly opinionated/strongly worded, emotional, and potentially divisive | E.g. someone expressing their personal opinion on a controversial or public matter in a heated/aggressive manner
Fundraising | Posts asking for GoFundMe donations and other fundraising asks | (none given)
Other | Any topic not covered by the above categories that you think may be perceived as controversial, offensive, sensitive, or inappropriate for any user, or that seems low quality and unrelated to the group | E.g. discussing store prices of items in a group focused on politics or education
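For any tooling that consumes these labels, the category names in the table above could be held in a single enumeration so that selections are checked against the taxonomy. The sketch below is illustrative only: `SensitiveTopic` and `parse_topics` are hypothetical names, not part of any labeling tool described in these guidelines, and it assumes selections arrive as free-text category names.

```python
from enum import Enum


class SensitiveTopic(Enum):
    """Sensitive topic categories, named as in the taxonomy table (hypothetical type)."""
    HATEFUL_SPEECH = "Hateful Speech"
    BORDERLINE_BULLYING = "Borderline bullying/harassment/unkind behaviors"
    DISCRIMINATION = "Discrimination/Racial Profiling"
    FINANCIAL_MANIPULATION = "Financial Manipulation"
    PROMOTIONAL = "Promotional"
    POTENTIAL_MISINFO = "Potential misinfo"
    POLITICAL_CONTENT = "Political Content"
    RELIGIOUS_CONTENT = "Religious Content"
    LOW_QUALITY_HEALTH = "Low Quality health"
    SEXUAL_CONTENT = "Sexual Content"
    DATING = "Dating"
    PROFANITY = "Profanity"
    DRUG_WEAPON_ALCOHOL = "Potential Drug/Weapon/Alcohol"
    ANIMAL_SALES = "Animal Sales"
    EATING_DISORDER = "Eating Disorder"
    SPAM = "Spam"
    HARM_TO_SELF_OTHERS = "Harm to self/others"
    RANTING = "Ranting"
    FUNDRAISING = "Fundraising"
    OTHER = "Other"


def parse_topics(names):
    """Map category names to SensitiveTopic members.

    Matching ignores case and extra whitespace; duplicates are dropped and
    an unknown name raises ValueError. A post may carry several topics.
    """
    by_name = {topic.value.lower(): topic for topic in SensitiveTopic}
    topics = []
    for name in names:
        key = " ".join(name.split()).lower()
        if key not in by_name:
            raise ValueError(f"unknown sensitive topic: {name!r}")
        if by_name[key] not in topics:
            topics.append(by_name[key])
    return topics
```

Because a post can fall under several categories at once, `parse_topics` returns a list rather than a single value.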
C. Review Protocol
1. Answer question 1: Does this post contain potentially harmful, offensive, or sensitive
content?
2. If it violates IS
a. Select all the IS harms that apply (you can select multiple if multiple apply), click Done (hotkey is d), and click Submit to complete the labeling.
3. If it is Potentially Sensitive content
a. If the content meets any of the 'sensitive' topic category definitions in the table above, select all the potentially sensitive topics that apply (you can select multiple if multiple apply) and click Done (hotkey is d).
b. Answer the next question, "explain your rationale". Explain your rationale for every answer you select, giving the context and reasoning behind each selection. Hit Done [enter] and click Submit to complete the labeling.
c. Answer the next question
4. If it is Not Rendering
a. Only select this if there is a tool/render issue in the job and you cannot see the post. Click Not Rendering and Submit to complete the labeling.
5. If it is Completely Benign
a. Only select this option if the post has no violating, harmful, offensive, divisive, or sensitive content. Click Submit to complete the labeling.
6. If it is Foreign Language
a. Only select this option if the post is in a non-English language and you are unable to understand it using Google Translate.
7. Answer question 2: Does this post contain potentially sensitive or problematic comments?
a. Select Yes if the comments violate IS or are potentially sensitive under the definitions above.
i. Answer the next question and explain your rationale.
b. Select No if there are no comments, or if they are benign or don't render.
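The completeness rules in the protocol above can be sketched as a small check that runs before a job is submitted. This is a minimal sketch under stated assumptions, not the labeling tool's actual logic: `PostOutcome`, `PostLabel`, and `validate_label` are hypothetical names, and the sketch assumes that "select all that apply" means at least one selection is expected.

```python
from dataclasses import dataclass, field
from enum import Enum


class PostOutcome(Enum):
    """Possible answers to question 1 (hypothetical names)."""
    VIOLATES_IS = "violates_is"
    POTENTIALLY_SENSITIVE = "potentially_sensitive"
    NOT_RENDERING = "not_rendering"
    COMPLETELY_BENIGN = "completely_benign"
    FOREIGN_LANGUAGE = "foreign_language"


@dataclass
class PostLabel:
    outcome: PostOutcome
    selections: list = field(default_factory=list)  # IS harms or sensitive topics
    rationale: str = ""                             # required for sensitive posts
    comments_flagged: bool = False                  # answer to question 2
    comments_rationale: str = ""


def validate_label(label):
    """Return a list of protocol problems; an empty list means the label is complete."""
    problems = []
    if label.outcome is PostOutcome.VIOLATES_IS and not label.selections:
        problems.append("select at least one IS harm")
    if label.outcome is PostOutcome.POTENTIALLY_SENSITIVE:
        if not label.selections:
            problems.append("select at least one sensitive topic")
        if not label.rationale.strip():
            problems.append("explain the rationale for the selected topics")
    no_selection_outcomes = (
        PostOutcome.NOT_RENDERING,
        PostOutcome.COMPLETELY_BENIGN,
        PostOutcome.FOREIGN_LANGUAGE,
    )
    if label.outcome in no_selection_outcomes and label.selections:
        problems.append("this outcome takes no category selections")
    if label.comments_flagged and not label.comments_rationale.strip():
        problems.append("explain the rationale for flagging the comments")
    return problems
```

For example, a post labeled Potentially Sensitive with the topic "Spam" but no rationale would come back with one problem, prompting the labeler to explain the selection before submitting.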
Examples - Neighborhoods
1. Promotional
a. Advertising for a personal or local business is promotional
i. "Hi My name is <redacted_person_name>, I run a private dayhome in Millrise Drive sw. I have a spot available full time, part time, drop in or Before/after school care (Our Lady of Peace school) if you want more information please contact me at: <redacted_email>"
ii. Online finance classes for kids offered by a mortgage agent to advertise for his business
b. Trying to gain a social media following is promotional
i. "Lunching (and working) at Fresh Restaurant on Front St, today.
The Cobb salad is ahhhmazing!
I'm a local Vintage Furniture Dealer, Interior Designer & Blogger. Follow me on Instagram for all things Beautiful, Bold & some fun Entrepreneurship Stories too!
Instagram @_sundaycreative
Cheers
XX
-<redacted person name>"
c. Selling goods is promotional
i. "Hello neighbours, if you want area rugs, please contact me: thanks"
d. In search of goods (to buy/sell/trade/get for free) is NOT promotional
i. "Starting a new job on Monday in need of size XL black polo shirt or button down black shirt. Please help"
e. Garage, moving, or estate sale is NOT considered promotional
i. Moving sale
2. Spam
a. Posting of external link without providing context is spam
i. Link to an external article, "Things to Do in Pittsburgh this Weekend"
ii. A YouTube video link of someone playing the guitar
b. Posting of gibberish text w/ no discernible meaning is spam
i. "Sdfsss"
c. Facebook-hosted photos, videos, or links without text that are reasonable in the neighborhood context are NOT spam
i. A photo taken inside a store without text explanation - we can assume the photo was taken locally with the intent to share the (new) business w/ neighbors
ii. A video of a cat - it's very common for neighbors to share photos or videos of pets
iii. TikTok video (on Facebook) about a treasure hunting game in Montreal - following the link, we can see that the game video was originally shared by a Geo-Game company founded by McGill University
iv. A photo of a freshly painted wall - home remodeling is a topic we'd expect neighbors to share and discuss
v. A link to a FB post about a church event for pets - sharing of local events is expected on the platform
d. Incomplete sentences that are reasonable in the neighborhood context are NOT spam
i. "Welcome" - welcoming new neighbors
ii. "Yoooo" - greeting the community
iii. "Hello)" - greeting the community
iv. "May God bless everyone with health and money" - good wishes to the NH
e. Edge cases
i. "Golden king Pokémon sword"
ii. "I like walking died" - likely a misspelling of Walking Dead, a popular TV series
iii. "[redacted] the gofastmasteL"
iv. "set. it. up"
3. Animal Sales
a. Selling, rehoming, or allowing adoption of animals is animal sales
i. Puppies for sale
ii. A cat up for adoption
b. ISO (in search of) animals to buy is NOT animal sales
i. "Hello Neighbors I'm looking for a kitten for my brother!!"
4. Political
a. Content pertaining to any political figure, public policy, or political party is political
i. Photo with a man pointing to an election campaign sign of a political figure
ii. A reminder to vote by an election candidate
b. Content related to highly debated topics such as covid and vaccination is political
i. Video of a woman flipping through papers w/ messages in support of mask wearing and vaccination
c. Content pertaining to non-political strikes or human rights protests is not
political. This includes protests for LGBT rights, etc.
5. Religious Content
a. Prayers are religious
i. "MY [redacted] [redacted] TEACHER OF THE WORD REVEREND/DR. [redacted] AND THE [redacted] FAMILY WE SEEK A WORD FROM THE LORD LET ALL GOD'S [redacted] AND SAY AMEN HALLELUJAH BLESSINGS AND LOVE. . MY [redacted] FAMILY SEE YOU AS WE ARE WELCOME INTO THE SANCTUARY ONE MORE TIME. . WONT YOU COME. THANK YOU."
ii. Holiday related prayer and wishes
b. Content that preaches a specific belief or advocates for a specific religious view is religious
i. "It is truly a time for you to pray. God is calling for more people than usual but not by covid. Be prepared"
c. Events hosted by churches or other religious organizations are NOT religious
i. A link to a FB post about a church-run charity event
ii. "Special Christmas mass this Sunday 9am at my church on Main and 2nd St. Everyone's welcome!"
Examples - Campus
1. Promotional
a. Personal social media link
b. Promoting IG Account
i. "I'm trying to put a team together check out my Instagram [redacted]"
c. Personal business
i. "I am currently starting up a dog walking service through rover and I figured the best way to promote myself is through here"
2. Spam
a. Resharing/reposting videos or posts with no caption
i. Reshared video post of animal rescue
b. Resharing news post with no relevant text
c. Meme posts w/ clear context are NOT spam
i. Covid related meme
3. Animal Sales
a. Rehoming pet
i. "I am looking to re-home a beautiful anatolian shepherd/ labrador retriever mix"
4. Religious Content
a. Content that promotes/preaches a specific religion is considered religious
i. Graphic preaching religion
D. REVIEW DECISION TREE
[Figure: review decision tree (image not recoverable). Legible labels: Question 1; Violating IS flow; Sensitive category flow; Potential drug/weapon/alcohol; Cruel/insensitive.]