I attended a lecture from the Tau Beta Pi lecture series on March 28. In this talk, two
speakers shared their thoughts on how diversity of data can improve engineering and
scientific outcomes. I chose this lecture because it was engineering-based and the
speakers seemed to have interesting backgrounds.
The first speaker was a biomedical engineer who spent much of her career studying
adipose tissue in the body. She spoke about how differences in genetic ancestry can
affect how the body stores and uses fat. Patients can have several different conditions
relating to adipose tissue, each with several possible treatments. Ignoring genetic
ancestry can mean missing the best treatment for a given patient. Some ancestries
warrant different treatments, so the pool of participants used in studies needs to be
expanded to ensure all genetic backgrounds are represented. For example, American
vaccines sent to Africa sometimes perform worse in African populations, so they have to
be adapted. If more Africans had been included in the trials, this shortcoming could have
been caught much sooner.
The second speaker spoke about facial recognition technology and using
intersectionality as a research methodology. For example, when studying facial
recognition efficacy, both women and African Americans showed slightly lower success
rates. However, for African American women specifically, the success rate was
significantly lower. The speaker argued that this highlights the need to consider
intersecting identities in research. After investigation, the gap was traced to the
dataset used for training. While the makers of the dataset made sure to include all
races and genders, they did not account for intersecting identities, leaving some
combinations underrepresented.
These speakers showed me the importance of diversity of data, not solely
for ethical reasons but also for the effectiveness of the end product. I will continue to
consider this in my engineering career and try to evaluate how diverse my data
is, even beyond the surface level. I had not previously considered that data can seem
diverse at first glance, with discrepancies becoming clear only when it is broken down
further. I am glad I attended the talk and learned to consider data in this way.