Notes by Taichi Nakatani (tnakatani3@gatech.edu)
Big Data: Process of applying computing power to aggregate large, complex sets of information.
Example: Criminal detection with headshots
Example: AI guessing sexual orientation
References and Links:
Rose, A. (2010). Are face-detection cameras racist? Time, January 22. http://content.time.com/time/business/article/0,8599,1954643,00.html
Griffiths, J. (2016). New Zealand passport robot thinks this Asian man's eyes are closed. CNN, December 9. www.cnn.com/2016/12/07/asia/new-zealand-passport-robot-asian-trnd/
Pulliam-Moore, C. (2015). Google photos identified black people as ‘gorillas,’ but racist software isn’t new. Fusion. fusion.net/story/159736/google-photos-identified-black-people-as-gorillas-but-racist-software-isnt-new/
Simonite, T. (2018). When It Comes to Gorillas, Google Photos Remains Blind. Wired. https://www.wired.com/story/when-it-comes-to-gorillas-google-photos-remains-blind
Harwell, D. (2018). The Accent Gap. The Washington Post. https://www.washingtonpost.com/graphics/2018/business/alexa-does-not-understand-your-accent/?noredirect=on
Metz, R. (2018). Microsoft’s neo-Nazi sexbot was a great lesson for makers of AI assistants. MIT Technology Review. https://www.technologyreview.com/s/610634/microsofts-neo-nazi-sexbot-was-a-great-lesson-for-makers-of-ai-assistants/
Tatman, R. (2016). Google’s speech recognition has a gender bias. Making Noise & Hearing Things, July 12. makingnoiseandhearingthings.com/2016/07/12/googles-speech-recognition-has-a-gender-bias/
Hornigold, T. (2019). This Chatbot has Over 660 Million Users—and It Wants to Be Their Best Friend. Singularity Hub. https://singularityhub.com/2019/07/14/this-chatbot-has-over-660-million-users-and-it-wants-to-be-their-best-friend/
Vincent, J. (2018). Google removes gendered pronouns from Gmail’s Smart Compose to avoid AI bias. The Verge. https://www.theverge.com/2018/11/27/18114127/google-gmail-smart-compose-ai-gender-bias-prounouns-removed
Kay, M., Matuszek, C., & Munson, S. A. (2015). Unequal representation and gender stereotypes in image search results for occupations. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15). ACM, New York, NY, USA, 3819-3828. https://www.csee.umbc.edu/~cmat/Pubs/KayMatuszekMunsonCHI2015GenderImageSearch.pdf
Dastin, J. (2018). Amazon scraps secret AI recruiting tool that showed bias against women. Reuters. www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
Guarino, B. (2016). Google faulted for racial bias in image search results for black teenagers. The Washington Post. https://www.washingtonpost.com/news/morning-mix/wp/2016/06/10/google-faulted-for-racial-bias-in-image-search-results-for-black-teenagers/
Angwin, J., Larson, J., Mattu S. and Kirchner, L. (2016). There’s software used across the country to predict future criminals. And it’s biased against blacks. ProPublica. www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
Ethics: Principles that distinguish what is morally right/wrong. No governing authority to sanction it.
Law: System of rules established by government to maintain stability and justice. Defines legal rights and provides means of enforcing them.
Algorithmic fairness: how can we ensure that our algorithms act in ways that are fair?
Bank loan problem: If sensitive attribute (e.g. postal code) is correlated with other attributes, AI will find those correlations.
Group Fairness: Assessing fairness by using statistical parity. Require same percentage of group A and B to receive loans (in bank loan context).
P(loan|no repay, A) == P(loan|no repay, B)
P(no loan|would repay, A) == P(no loan|would repay, B)
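A minimal sketch (hypothetical data: the group labels, repayment outcomes, and loan decisions are invented) of checking these two group-level rates:

```python
import numpy as np

# Hypothetical loan data: group membership, whether the applicant would repay, model decision
group       = np.array(["A", "A", "A", "B", "B", "B"])
would_repay = np.array([1, 1, 0, 1, 0, 0])
got_loan    = np.array([1, 0, 1, 1, 1, 0])

for g in ("A", "B"):
    in_g = group == g
    # P(loan | no repay, group): loans granted to applicants who would not repay
    p_loan_no_repay = got_loan[in_g & (would_repay == 0)].mean()
    # P(no loan | would repay, group): denials to applicants who would repay
    p_no_loan_would_repay = 1 - got_loan[in_g & (would_repay == 1)].mean()
    print(g, round(p_loan_no_repay, 2), round(p_no_loan_would_repay, 2))
```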
Individual Fairness: Assess fairness by whether similar people (background) experience similar outcomes.
What is bias?
Statistics: The science of collecting, organizing, presenting, analyzing and interpreting data to assist in making effective decisions.
Goal of module:
Brief history of stats: Graunt's "Natural and Political Observations Made upon the Bills of Mortality"
13,200 * 88 / 3 ≈ 387,200: Graunt's estimate of London's population, assuming roughly 13,200 burials per year, 3 deaths per 11 families per year, and 8 people per family (i.e. 88 people per 3 deaths).
Definitions:
Example: Analyzing change in high school students' interest in computing.
Definitions:
Unemployment Rate = # unemployed / # labor force
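A quick worked example with made-up figures:

```python
# Hypothetical figures, in millions of people
unemployed = 6.4
labor_force = 160.0
unemployment_rate = unemployed / labor_force
print(f"{unemployment_rate:.1%}")  # 4.0%
```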
Household Survey vs Establishment Survey
tl;dr - don't trust graphs
Reference: https://www.callingbullshit.org/tools/tools_misleading_axes.html
Diff between big data & data analytics
AI / ML / DL
Data:
Descriptive stats: Methods of organizing, summarizing, and presenting data in an informative way (freq table, histogram, mean, variance)
Inferential Analytics: Methods used to determine something about a population on the basis of a sample (ML/AI for big data)
Sampling Error: Discrepancy between a sample statistic and its population parameter. Systematic sampling error is the result of sampling bias.
When to use median vs mean:
Headline should be the one on bottom if math was done correctly.
Example: Manipulating average income of a neighborhood
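A small sketch (made-up incomes) of why the mean can be manipulated by a single outlier while the median barely moves:

```python
import statistics

# Hypothetical neighborhood incomes, in thousands of dollars
incomes = [40, 45, 50, 55, 60]
print(statistics.mean(incomes), statistics.median(incomes))  # 50 50

# One extreme earner moves in: the mean jumps, the median barely moves
incomes_with_outlier = incomes + [1_000_000]
print(statistics.mean(incomes_with_outlier), statistics.median(incomes_with_outlier))
```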
Definition: A frequency distribution tallies the number of times each data value occurs.
Cumulative frequency distribution: "Running total" of frequencies.
Ref: https://www150.statcan.gc.ca/n1/edu/power-pouvoir/ch10/5214862-eng.htm
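A minimal sketch of a frequency table and its running total, using made-up scores:

```python
from collections import Counter
from itertools import accumulate

# Hypothetical exam scores
scores = [70, 80, 80, 90, 70, 100, 80]
freq = dict(sorted(Counter(scores).items()))           # frequency distribution
cum_freq = dict(zip(freq, accumulate(freq.values())))  # running total of frequencies
print(freq)      # {70: 2, 80: 3, 90: 1, 100: 1}
print(cum_freq)  # {70: 2, 80: 5, 90: 6, 100: 7}
```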
Definition: Measures the amount of "scatter" in a dataset. Shows how well the avg characterizes data as a whole.
import statistics
# Both have the same mean (50) but different sample stdev (20 vs 10)
a = [30, 50, 70]
b = [40, 50, 60]
print(statistics.mean(a), statistics.stdev(a))  # 50 20.0
print(statistics.mean(b), statistics.stdev(b))  # 50 10.0
Examples: range, variance, stdev, interquartile range, coefficient of variation.
Ref: https://junkcharts.typepad.com/junk_charts/boxplot/
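A short sketch computing the dispersion measures listed above on the toy data from the earlier snippet (numpy assumed; sample variance/stdev with ddof=1):

```python
import numpy as np

a = np.array([30, 50, 70])  # toy data from above
print("range:", a.max() - a.min())                        # 40
print("variance:", a.var(ddof=1))                         # 400.0 (sample variance)
print("stdev:", a.std(ddof=1))                            # 20.0
print("IQR:", np.percentile(a, 75) - np.percentile(a, 25))
print("coeff. of variation:", a.std(ddof=1) / a.mean())   # stdev relative to the mean
```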
Definition: Drawing inferences about an individual based on data drawn from a larger group of similar individuals.
Examples: Credit card / loans, hiring.
Example: The Institute decides to get rid of the Chick-fil-A Express in the student center. After a survey of all the faculty, it is overwhelmingly decided that Chick-fil-A will be replaced with a To-Go Fogo de Chão Brazilian Steakhouse (a biased sample: only faculty were surveyed, not students).
Definition: Trend appears in several different groups of data but disappears or reverses when these groups are combined.
Example (ref: https://blog.revolutionanalytics.com/2013/07/a-great-example-of-simpsons-paradox.html)
Example (How statistics can be misleading - Mark Liddell): https://www.youtube.com/watch?v=sxYrzzy3cq8&t=26s
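A tiny made-up illustration of the reversal: treatment B has the higher success rate within each group, but treatment A wins once the groups are pooled.

```python
# Hypothetical (successes, trials) per group and treatment
data = {
    "group_1": {"A": (80, 100), "B": (9, 10)},    # within group: A 0.80 < B 0.90
    "group_2": {"A": (2, 10),   "B": (30, 100)},  # within group: A 0.20 < B 0.30
}

totals = {"A": [0, 0], "B": [0, 0]}
for group, treatments in data.items():
    for t, (s, n) in treatments.items():
        totals[t][0] += s
        totals[t][1] += n
        print(group, t, round(s / n, 2))

# Pooled rates reverse the ordering: A is about 0.75, B is about 0.35
for t, (s, n) in totals.items():
    print("combined", t, round(s / n, 2))
```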
Simple random sampling: randomly sample from population
Systematic Sampling
Stratified random sampling: Data is divided into subgroups (strata)
Pros and Cons of each:
Cluster random sampling: Split the population into similar parts, or clusters.
Non-probability Sampling: Participants are chosen/choose themselves so that chance of being selected is not known.
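A rough sketch of two of the methods above, simple random vs. stratified sampling, on a made-up population:

```python
import random

random.seed(0)
# Hypothetical population: 90 undergrads and 10 grad students
population = [("undergrad", i) for i in range(90)] + [("grad", i) for i in range(10)]

# Simple random sampling: every member has the same chance of selection
simple = random.sample(population, 10)

# Stratified random sampling: sample within each stratum (10% of each)
strata = {"undergrad": [p for p in population if p[0] == "undergrad"],
          "grad":      [p for p in population if p[0] == "grad"]}
stratified = [x for name, members in strata.items()
              for x in random.sample(members, max(1, len(members) // 10))]

print(len(simple), len(stratified))  # 10 and 10; stratified guarantees grads are represented
```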
Correlation tells us two variables are related.
Types of relationship reflected in correlation:
Important: Correlation doesn't imply causation.
Correlation coefficient summarizes the association between 2 variables.
"Correlation between worker's education levels and wages is strongly positive"
Issues:
Examples of "spurious correlations": www.tylervigen.com
Relationships between two variables are often influenced by other unknown variables.
Linear correlation coefficient: a measure of the strength and direction of a linear association between two random variables (also called the Pearson product-moment correlation coefficient)
from scipy import stats
X, Y = [1, 2, 3, 4], [2, 4, 5, 9]  # toy data
r, p_value = stats.pearsonr(X, Y)  # returns (correlation coefficient, p-value)
print(round(r, 2))                 # close to +1: strong positive linear association
Definition (empirical rule): For a normal distribution, roughly 68%, 95%, and 99.7% of the data fall within 1, 2, and 3 standard deviations of the mean, so almost all of the data fall within 3 standard deviations. Assumes that the data follow a Gaussian distribution.
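A quick simulation of the 68-95-99.7 pattern (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0, scale=1, size=100_000)  # draws from a standard normal

for k in (1, 2, 3):
    within = np.mean(np.abs(x) <= k)
    print(f"within {k} stdev: {within:.3f}")  # roughly 0.683, 0.954, 0.997
```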
Margin of Error:
# Approximate margin of error at a 95% confidence level, for sample size n
import math

def margin_of_error(n):
    return 1 / math.sqrt(n)

# Example: n = 1000 gives MoE ≈ 0.032, i.e. roughly ±3%
print(round(margin_of_error(1000), 3))
Example: Surveys
Company X surveys customers and finds that 50% of the respondents say its customer service is "very good". The confidence level is cited as 95% ± 3% MoE.
Example:
MoE Table:

| Sample Size | % MoE |
|:------------|:------|
| 25 | ±20% |
| 64 | ±12.5% |
| 100 | ±10% |
| 256 | ±6.25% |
| 400 | ±5% |
| 625 | ±4% |
| 1111 | ±3% |
| 1600 | ±2.5% |
| 2500 | ±2% |
| 10000 | ±1% |
Example 1:
Answer:
125/625 = 0.2 (20%)
1/math.sqrt(625) = ±0.04 (±4%)
Example 2:
import math

# Derive sample size n from a target MoE of 0.05: moe = 1/sqrt(n)  =>  n = 1/moe**2
moe = 0.05
n = round(1 / moe**2)
print(n)  # 400
Example 3:
Answer:
Goal: Understand and apply basic AI/ML techniques to data scenarios, with a focus on instituting "fair" practices when designing decision-making systems based on big data.
Bias in word embeddings
A note on word vs semantic similarity:
2 prevailing uses of similarity:
Document Occurrence: Assign identifiers corresponding to the count of words in each document (from a cluster of docs) in which the word occurs.
Word Context: Quantify co-occurrence of terms in a corpus by constructing a co-occurrence matrix to capture the number of times a term appears in the context of another term.
Example: Create a word co-occurrence table between "chocolate is the best dessert in the world", "GT is the best university in the world", and "The world runs on chocolate".
Example: Comparing a tiny sports corpus. Document occurrence finds that "losangeles" + "dodgers" and "atlanta" + "falcons" co-occur. Word context gives a different viewpoint.
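A minimal sketch of the document-occurrence idea on the three example sentences above (plain Python plus numpy, no NLP library):

```python
import numpy as np

docs = [
    "chocolate is the best dessert in the world",
    "gt is the best university in the world",
    "the world runs on chocolate",
]
vocab = sorted({w for d in docs for w in d.split()})
# Term-document matrix: 1 if the term appears in the document, else 0
td = np.array([[int(w in d.split()) for w in vocab] for d in docs])

# Term-term co-occurrence: number of documents in which both terms appear
# (the diagonal is each term's document frequency)
cooc = td.T @ td
print(vocab)
print(cooc)
```

The columns of `td` are the same kind of toy occurrence vectors that the cosine-similarity code below compares.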
FORMULA:
# Similarity = (A.B) / (||A||.||B||)
import numpy as np
from numpy.linalg import norm
from itertools import permutations
# toy vectors using atlanta, falcons, los angeles and dodgers
atlanta = ('atlanta', np.array([1, 1, 0, 0]))
falcons = ('falcons', np.array([1, 1, 0, 0]))
los_angeles = ('los angeles', np.array([0, 0, 1, 1]))
dodgers = ('dodgers', np.array([0, 0, 1, 1]))
# compute cosine similarity
def cos_sim(x, y):
    return np.dot(x, y) / (norm(x) * norm(y))

# compute cosine similarities among toy vectors
for p1, p2 in list(permutations([atlanta, falcons, los_angeles, dodgers], 2)):
    cosine = cos_sim(p1[1], p2[1])
    print(f"Similarity({p1[0]}, {p2[0]}): {round(cosine, 2)}")
Results: Cosine similarity is 1.0 between atlanta and falcons and between los angeles and dodgers, and 0.0 between terms from different pairs.
Similarity(atlanta, falcons): 1.0
Similarity(atlanta, los angeles): 0.0
Similarity(atlanta, dodgers): 0.0
Similarity(falcons, atlanta): 1.0
Similarity(falcons, los angeles): 0.0
Similarity(falcons, dodgers): 0.0
Similarity(los angeles, atlanta): 0.0
Similarity(los angeles, falcons): 0.0
Similarity(los angeles, dodgers): 1.0
Similarity(dodgers, atlanta): 0.0
Similarity(dodgers, falcons): 0.0
Similarity(dodgers, los angeles): 1.0
vec_c + vec_b - vec_a
Examples: From http://bionlp-www.utu.fi/wv_demo/ - English GoogleNews Model
[1, 0, 1, 1, 0, 2]
Context prediction models (Skipgram, W2V): Predict the context of a given word by learning probabilities of co-occurrence from a corpus.
2 Types of Word2Vec:
"passed"
, context words = ["the", "student", "the", "exam"]
.Important parameters
Steps:
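The notes don't name a training library; as a hedged illustration only, here is a minimal skip-gram sketch assuming gensim 4.x (my assumption), with the commonly tuned parameters spelled out in comments:

```python
from gensim.models import Word2Vec  # assumption: gensim 4.x

# Toy corpus: each sentence is a list of tokens
sentences = [["the", "student", "passed", "the", "exam"],
             ["the", "student", "failed", "the", "quiz"],
             ["the", "student", "studied", "for", "the", "exam"]]

# sg=1 selects skip-gram (predict context words from the target word); sg=0 is CBOW.
# vector_size = embedding dimension, window = context size, min_count = vocabulary cutoff
model = Word2Vec(sentences, vector_size=50, window=2, sg=1, min_count=1, epochs=50)

# vec_c + vec_b - vec_a style analogy query (results are meaningless on a corpus this
# small; this only shows the API shape)
print(model.wv.most_similar(positive=["student", "exam"], negative=["quiz"], topn=3))
```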
All 3 were taken down in 2019. All were Creative Commons licensed, but all were used by foreign surveillance and defence organizations.
Other datasets: MegaFace Dataset - face recognition training set 4.7MM faces. Sourced from Flickr.
Facial recognition algorithms are also used to gauge a person's emotion, e.g. monitoring driver attention, gauging movie audience reactions, and healthcare applications.
AIs originally built upon Ekman's studies (emotion expressions are universal)
Procedure:
Case Study: TSA's Screening of Passengers by Observation Techniques (SPOT) Program
In the wild, facial identification becomes problematic because:
Results in:
Error rates for face recognition:
Why Bias Occurs in the Data
Training sets are hard to get: you need to buy, scrape, or otherwise obtain more samples from underrepresented classes. Scraping is a legal and ethical grey area.
Evaluation Metrics
Algorithmic Fairness - mitigate effects of unwarranted bias/discrimination from AI/ML algorithms. Focus on mathematical formalism / algorithmic approaches to fairness.
Examples of algo bias:
Problem: Bias in the data is carried by protected attributes. Solution: Remove protected-class attributes. But other features that correlate with the protected class may redundantly encode it.
Problem: There are issues with error-rate imbalances such that different groups have different outcomes. Solution: Only outcomes matter; make sure groups are in line with predetermined "fairness" metrics.
Issues:
Principles for quantifying fairness
Two basic frameworks for measuring fairness:
Max Profit Model - Setting different thresholds for disadvantaged groups in order to maximize profit. Split into privileged vs unprivileged group.
Profits computed on 4 components:
Set different thresholds for the two groups; give the most loans to those with the highest probability of paying back. This at least gives some loans to the unprivileged group, rather than denying them altogether.
Blinding Model - Class features and all "proxy" information removed.
Demographic Parity - All groups have same percentage approved.
Equal Opportunity - Same percentage of "credit-worthy" candidates, ie. true positives, in both groups.
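A rough sketch (hypothetical decisions and labels) of checking demographic parity and equal opportunity:

```python
import numpy as np

# Hypothetical model outputs: group, true credit-worthiness, and approval decision
group    = np.array(["priv", "priv", "priv", "priv", "unpriv", "unpriv", "unpriv", "unpriv"])
worthy   = np.array([1, 1, 0, 0, 1, 1, 0, 0])
approved = np.array([1, 1, 1, 0, 1, 0, 0, 0])

for g in ("priv", "unpriv"):
    m = group == g
    # Demographic parity: overall approval rate should be equal across groups
    approval_rate = approved[m].mean()
    # Equal opportunity: approval rate among the truly credit-worthy (true positive rate)
    tpr = approved[m & (worthy == 1)].mean()
    print(g, "approval rate:", round(approval_rate, 2), "TPR:", round(tpr, 2))
```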
Other Group Fairness Metrics
3 phases of bias mitigation steps
Preprocessing
https://pair-code.github.io/what-if-tool/
Race/Sex Discrimination on different algorithms
Determining thresholds for accuracy vs fairness must take into consideration legal requirements, ethics, and gaining user trust.
When false positives are better than false negatives.
When false negatives are better than false positives.
Bias consideration with regards to task: Example with gender.