Commit 6c082e7

1 parent e1051dd commit 6c082e7

185 files changed

Lines changed: 16888 additions & 0 deletions

.DS_Store

8 KB
Binary file not shown.

BingSiteAuth.xml

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
<?xml version="1.0"?>
2+
<users>
3+
<user>D79DAC6EC190664ECB9445823D3F7C07</user>
4+
</users>

_mithra.htm

Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
1+
<!doctype html>
2+
<html lang="en">
3+
<head>
4+
<!-- Required meta tags -->
5+
<meta charset="utf-8">
6+
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
7+
<link rel="icon" href="imgs/InDeXLab.gif"/>
8+
<!-- Bootstrap CSS -->
9+
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css" integrity="sha384-GJzZqFGwb1QTTN6wy59ffF1BuGJpLSa9DkKMp0DgiMDm4iYMj70gZWKYbI706tWS" crossorigin="anonymous">
10+
11+
<title>InDeX Lab. Mithra: Responsible Data Science and Algorithmic Fairness</title>
12+
<link rel="stylesheet" type="text/css" href="ppl.css" />
13+
<link rel="stylesheet" type="text/css" href="style.css" />
14+
<script type="text/javascript" src="js/myobjects.js"></script>
15+
<script type="text/javascript" src="js/MyFunctions.js"></script>
16+
<script type="text/javascript" src="js/myobjects.js"></script>
17+
<script type="text/javascript" src="js/indexstart.js"></script>
18+
<script type="text/javascript">
19+
function init() {
20+
header();
21+
//header2();
22+
fillContent("content/FairRanking.txt","FairR");
23+
fillContent("content/StableRanking.txt","StableR");
24+
fillContent("content/MithraRanking.txt","MithraR");
25+
fillContent("content/coverage.txt","coverage");
26+
fillContent("content/coverage2.txt","coverage2");
27+
fillContent("content/nl1.txt","nl1");
28+
fillContent("content/nl2.txt","nl2");
29+
fillContent("content/cherrypicking.txt","cherrypicking");
30+
}
31+
</script>
32+
</head>
33+
34+
<body onload='init()'>
35+
<div id="headerDiv"></div>
36+
<!-- Your code starts here -->
37+
<div class="container">
38+
<div class="jumbotron">
39+
<p style="width: 100%; text-align: center;"><img class="roundphoto" src="imgs/Equity.png"/></p>
40+
<h3 class="display-4">Data Equity</h3>
41+
<p class="lead">Projects related to Responsible Data Science and Algorithmic Fairness</p>
42+
<hr class="my-4">
43+
<p> Big data technologies have affected every corner of human life and society. These technologies have made our lives unimaginably more shared, connected, convenient, and cost-effective. Using data-driven technologies gives us the ability to make wiser decisions, and can help make society safer, more equitable and just, and more prosperous.</p>
44+
<p>
45+
On the other hand, even if it looks promising, data-driven decision making can cause harm.
46+
Probably the main reason is that real-life social data is almost always "biased". No one can miss the extensive recent discussion about race in the context of policing and criminal justice. But similar questions arise in many other domains as well.
47+
Take college admission, for example. It has been shown that GPA values have gender bias. This is due to grading policies that, for instance, may reduce grades for students with late homework, disruptive behavior, or inattention. As a result, using GPA as one of the features for generating scores and ranking students, without considering the inherent bias in the data, can lead to gender bias.
48+
Evidence of bias has also been reported in recommendation systems, advertisement, job interviewing, hiring, and promotion, among others.
49+
</p>
50+
<p style="text-align: center;"><img width="30%" src="imgs/dda.jpg"/></p>
51+
<p>
52+
In order to minimize societal harms of data-driven technologies, and to ensure that objectives such as fairness, equity, diversity, robustness, accountability, and transparency are satisfied, we aim to develop proper <i> algorithms, tools, strategies, and metrics</i>.
53+
In particular, we divide our effort into three categories:
54+
<dl>
55+
<dt>Data Preparation & Investigation:</dt>
56+
<dd>The focus of this category is on <i>bias in data</i> (colored in purple in the figure).
57+
Social data is almost always biased as it inherently reflects historical biases and stereotypes. Data collection and representation methods often introduce additional bias.
58+
Using biased data without paying attention to societal impacts can create a <i>feedback loop</i>, and even increase discrimination in society.
59+
Projects in this category aim to investigate the data used for building models and algorithms in order to (i) identify bias, (ii) mitigate bias, and (iii) annotate data with information that shows its fitness for use.
60+
</dd>
61+
<dt>Algorithm & Model Design:</dt>
62+
<dd>In particular, our focus is on score-based evaluation (colored in pink in the figure).
63+
The scores are often derived by combining multiple criteria (aka features or attributes).
64+
For instance, a lender may combine attributes such as payment history, salary, education, and age to develop a creditworthiness score for each customer.
65+
The scores can be generated with different methods, linearly or using a complex function, and be used for different purposes.
66+
In classification, scores are used to draw a decision boundary to specify, for example, if a woman is at risk of developing invasive breast cancer over the next 5 years.
67+
In ranking, the scores are used to sort the entities and, for example, select the top-8 soccer teams for seeding pot 1 in the world cup tournament.
68+
The scores are usually assigned either through (i) a process learned by machine learning models using some labeled training data, or (ii) using a weight vector or a procedure designed by human experts.
69+
Our objective in this set of projects is to mitigate the bias in scoring algorithms to generate <i>fair</i> and <i>stable</i> outcomes.
70+
</dd>
71+
<dt>Data Presentation & Output Investigation:</dt>
72+
<dd>
73+
The projects in this category (i) provide tools for investigating and mitigating bias in the outcome of algorithms, and (ii) study how data presentation can introduce bias.
74+
</dd>
75+
</dl>
76+
Below, you can find more details about each of the projects. You can also refer to the following publications for more details.
77+
<dl>
78+
<dt>(Blog Post) Abolfazl Asudeh. <a target="_blank" href="http://wp.sigmod.org/?p=3174">Enabling Responsible Data Science in Practice</a>. <i>ACM SIGMOD Blog</i>, Jan. 2021.</dt>
79+
<dt>(Tutorial) Abolfazl Asudeh, HV Jagadish. <a target="_blank" href="http://www.vldb.org/pvldb/vol13/p3445-asudeh.pdf">Fairly Evaluating and Scoring Items in a Data Set</a>. <i>PVLDB</i>, 2020, VLDB Endowment.</dt>
80+
<dd>&nbsp;&nbsp;&nbsp;&nbsp;<b>Presentation videos and slides <a href="tutorial20.htm">here</a> (strongly recommended to watch)</b></dd>
81+
<dt>(Invited Paper) Abolfazl Asudeh, HV Jagadish, Julia Stoyanovich. <a target="_blank" href="http://sites.computer.org/debull/A19sept/issue1.htm">Towards Responsible Data-driven Decision Making in Score-Based Systems</a>. <i>Data Engineering Bulletin</i>, Vol. 42(3), pages 76--87, 2019, Special Issue on Fairness, Diversity, and Transparency in Data Systems.</dt>
82+
</dl>
83+
</p>
84+
</div>
85+
<div class="text-left">
86+
<ul class="list-group text-left" style="display:inline-block">
87+
<li class="list-group-item"><a href="#BiasinData">Data Preparation & Investigation</a>
88+
<ul>
89+
<li><a href="#coverage">Coverage over non-ordinal Categorical Attributes</a></li>
90+
<li><a href="#coverage2">Coverage over ordinal Continuous-Valued Attributes</a></li>
91+
<li><a href="#nl1">MithraLabel: Flexible Dataset Nutritional Labels (Demo)</a></li>
92+
</ul>
93+
</li>
94+
<li class="list-group-item"><a href="#ranking">Algorithm & Model Design</a>
95+
<ul>
96+
<li><a href="#FairR">Fair Ranking Schemes</a></li>
97+
<li><a href="#StableR">Stable Rankings</a></li>
98+
<li><a href="#MithraR">MithraRanking (Demo)</a></li>
99+
</ul>
100+
</li>
101+
<li class="list-group-item"><a href="#op">Data Presentation & Output Investigation</a>
102+
<ul>
103+
<li><a href="#cherrypicking">Cherry picking Trendlines</a></li>
104+
<li><a href="#nl2">Nutritional Labels for Rankings (Demo)</a></li>
105+
</ul>
106+
</li>
107+
</ul>
108+
</div>
109+
110+
<div>&nbsp;</div>
111+
<h3 id="BiasinData" class="alert alert-info" role="alert">Data Preparation & Investigation (Bias in Data)</h3>
112+
<div id="coverage"></div>
113+
<div>&nbsp;</div>
114+
<div id="coverage2"></div>
115+
<div>&nbsp;</div>
116+
<div id="nl1"></div>
117+
118+
<div>&nbsp;</div>
119+
<h3 id="ranking" class="alert alert-info" role="alert">Algorithm & Model Design</h3>
120+
<div id="FairR"></div>
121+
<div>&nbsp;</div><div id="StableR"></div>
122+
<div>&nbsp;</div><div id="MithraR"></div>
123+
124+
<div>&nbsp;</div>
125+
<h3 id="op" class="alert alert-info" role="alert">Data Presentation & Output Investigation</h3>
126+
<div id="cherrypicking"></div>
127+
<div>&nbsp;</div><div id="nl2"></div>
128+
</div>
129+
<!-- Your code ends here -->
130+
<!-- Optional JavaScript -->
131+
<!-- jQuery first, then Popper.js, then Bootstrap JS -->
132+
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
133+
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.6/umd/popper.min.js" integrity="sha384-wHAiFfRlMFy6i5SRaxvfOCifBUQy1xHdJ/yoi7FRNXMRBu5WHdZYu1hA6ZOblgut" crossorigin="anonymous"></script>
134+
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/js/bootstrap.min.js" integrity="sha384-B0UglyR+jN6CkvvICOB2joaf5I4l3gm9GU6Hc1og6Ls7i6U/mkkaduKaBhlAXv9k" crossorigin="anonymous"></script>
135+
</body>
136+
</html>

_projects.htm

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
1+
<!doctype html>
2+
<html lang="en">
3+
<head>
4+
<!-- Required meta tags -->
5+
<meta charset="utf-8">
6+
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
7+
8+
<!-- Bootstrap CSS -->
9+
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css" integrity="sha384-GJzZqFGwb1QTTN6wy59ffF1BuGJpLSa9DkKMp0DgiMDm4iYMj70gZWKYbI706tWS" crossorigin="anonymous">
10+
<link rel="icon" href="imgs/InDeXLab.gif"/>
11+
<title>InDeX Lab. Projects Summary</title>
12+
<link rel="stylesheet" type="text/css" href="ppl.css" />
13+
<link rel="stylesheet" type="text/css" href="style.css" />
14+
<script type="text/javascript" src="js/myobjects.js"></script>
15+
<script type="text/javascript" src="js/indexstart.js"></script>
16+
<script type="text/javascript">
17+
function init() {
18+
header();
19+
}
20+
</script>
21+
</head>
22+
23+
<body onload='init()'>
24+
<div id="headerDiv"></div>
25+
<!-- Your code starts here -->
26+
<div class="container">
27+
<div class="jumbotron jumbotron-fluid text-center" style="padding-block-start: 10pt;">
28+
<img class="roundphoto" src="imgs/Equity.png"/>
29+
<h3 class="display-4"><a href="mithra.htm">Data Equity</a></h3>
30+
<p class="lead">Projects related to Responsible Data Science and Algorithmic Fairness</p>
31+
<hr class="my-4">
32+
<div class="text-left" style="padding: 1%;">
33+
<p>More details in <a href="mithra.htm">Project Page</a>.</p>
34+
<p>Course Page: <a href="https://www.cs.uic.edu/~asudeh/teaching/archive/cs594fall20/index.html">CS 594: Responsible Data Science and Algorithmic Fairness</a></p>
35+
36+
<p>More Resources:
37+
<ul>
38+
<li>(Blog Post) Abolfazl Asudeh. <a target="_blank" href="http://wp.sigmod.org/?p=3174">Enabling Responsible Data Science in Practice</a>. <i>ACM SIGMOD Blog</i>, Jan. 2021.</li>
39+
<li>(Tutorial) Abolfazl Asudeh, HV Jagadish. <a target="_blank" href="http://www.vldb.org/pvldb/vol13/p3445-asudeh.pdf">Fairly Evaluating and Scoring Items in a Data Set</a>. <i>PVLDB</i>, 2020, VLDB Endowment.<br>&nbsp;&nbsp;&nbsp;&nbsp;-- Presentation videos and slides <a href="tutorial20.htm">here</a> (strongly recommended to watch)</li>
40+
<li>(Invited Paper) Abolfazl Asudeh, HV Jagadish, Julia Stoyanovich. <a target="_blank" href="http://sites.computer.org/debull/A19sept/issue1.htm">Towards Responsible Data-driven Decision Making in Score-Based Systems</a>. <i>Data Engineering Bulletin</i>, Vol. 42(3), pages 76--87, 2019, Special Issue on Fairness, Diversity, and Transparency in Data Systems.</li>
41+
</ul>
42+
</p>
43+
</div>
44+
</div>
45+
46+
<div class="jumbotron jumbotron-fluid text-center" style="padding-block-start: 10pt;">
47+
<img class="roundphoto" src="imgs/factke.jpg"/>
48+
<h3 class="display-4"><a href="FactChecking.htm">Fact Checking</a></h3>
49+
<p class="lead">Projects related to Computational Fact Checking</p>
50+
<hr class="my-4">
51+
<div class="text-left" style="padding: 1%;">
52+
<p>
53+
In this project, we are particularly interested in detecting misleading information and statements made by <i>cherry-picking data</i>.
54+
This practice is popular among politicians, since they would rather not be caught blatantly lying.
55+
That is why "<i>A lie which is half a truth is ever the blackest of lies</i>" (A.Tennyson).<br>
56+
But how can one computationally detect or measure "half a truth"?
57+
We aim to address this question in this project.
58+
</p>
59+
<p>This project has been supported by the Google Scholar Award.</p>
60+
<p>More details in <a href="FactChecking.htm">Project Page</a>.</p>
61+
</div>
62+
</div>
63+
64+
<div class="jumbotron jumbotron-fluid text-center" style="padding-block-start: 10pt;">
65+
<img class="roundphoto" src="imgs/orca.JPG"/>
66+
<h3 class="display-4"><a href="">Orca</a></h3>
67+
<p class="lead">Projects related to Computation at Scale</p>
68+
<hr class="my-4">
69+
<div class="text-left" style="padding: 1%;">
70+
<p> Scalability has always been a challenge in CS.<br/>
71+
Just like Orcas who find smart ways to hunt the giants (watch these: <a href="https://www.youtube.com/watch?v=9DG8zCIBNh0" target="_blank">[1]</a>, <a href="https://www.youtube.com/watch?v=ZPjkHFD3bpA" target="_blank">[2]</a>),
72+
we aim to design efficient and accurate algorithms that can solve problems at scale.
73+
</p>
74+
<p>More details in <a href="FactChecking.htm" >Project Page</a>.</p>
75+
</div>
76+
</div>
77+
78+
<div class="jumbotron jumbotron-fluid text-center" style="padding-block-start: 10pt;">
79+
<img class="roundphoto" src="imgs/CH.jpg"/>
80+
<h3 class="display-4"><a href="">Ranking & Representatives</a></h3>
81+
<p class="lead">Projects related to Ranking and Top-k query processing; Skyline and Regret-Minimizing sets</p>
82+
<hr class="my-4">
83+
<div class="text-left" style="padding: 1%;">
84+
<p>
85+
Evaluating objects, ranking them, and selecting the <i>"best"</i> is the key to decision making and a fundamental CS problem.
86+
However, as critical as it is, ranking has always been controversial.
87+
Probably the main reason is that the concept of "best" lies in the eyes of the beholder and there can be many criteria involved.
88+
Examples range from simple daily tasks such as booking a hotel or choosing a photo to post, to evaluating candidates for college admission, ranking universities, you name it.
89+
</p>
90+
<p>
91+
Finding novel, efficient, and accurate technical solutions for addressing the challenging problems in this area is our focus here.
92+
</p>
93+
<p>More details in <a href="ranking.htm">Project Page</a>.</p>
94+
</div>
95+
</div>
96+
97+
<div class="jumbotron jumbotron-fluid text-center" style="padding-block-start: 10pt;">
98+
<img class="roundphoto" src="imgs/ML.png"/>
99+
<h3 class="display-4"><a href="">AI&ML</a></h3>
100+
<p class="lead">Projects related to AI and machine learning</p>
101+
<hr class="my-4">
102+
<div class="text-left" style="padding: 1%;">
103+
<p>We specifically target data preparation and leveraging data management techniques for machine learning.
104+
Some of our projects in this area include creating nutritional labels for datasets, efficient construction of ad-hoc ML models, and fairness in ML.</p>
105+
<p>More details in <a href="ml.htm" >Project Page</a>.</p>
106+
</div>
107+
</div>
108+
109+
<div class="jumbotron jumbotron-fluid text-center" style="padding-block-start: 10pt;">
110+
<img class="roundphoto" src="imgs/HDB.jpg"/>
111+
<h3 class="display-4"><a href="">Web DataX</a></h3>
112+
<p class="lead">Projects related to Web Data Exploration </p>
113+
<hr class="my-4">
114+
<div class="text-left" style="padding: 1%;">
115+
<p>Web databases, also known as deep web or hidden web databases, cover a large portion of the web.
116+
This includes shopping websites such as Amazon and eBay,
117+
service search sites such as Expedia and Google Flights,
118+
recommendation websites, and P2P marketplaces such as Airbnb and Craigslist.
119+
With minor differences, such systems have a typical structure that enforces a ranked retrieval interface.<br>
120+
Our projects in this category enable efficient data exploration and flexible query answering and recommendation in this environment.
121+
</p>
122+
<p>More details in <a href="webdb.htm" >Project Page</a>.</p>
123+
</div>
124+
</div>
125+
126+
</div>
127+
<!-- Your code ends here -->
128+
<!-- Optional JavaScript -->
129+
<!-- jQuery first, then Popper.js, then Bootstrap JS -->
130+
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
131+
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.6/umd/popper.min.js" integrity="sha384-wHAiFfRlMFy6i5SRaxvfOCifBUQy1xHdJ/yoi7FRNXMRBu5WHdZYu1hA6ZOblgut" crossorigin="anonymous"></script>
132+
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/js/bootstrap.min.js" integrity="sha384-B0UglyR+jN6CkvvICOB2joaf5I4l3gm9GU6Hc1og6Ls7i6U/mkkaduKaBhlAXv9k" crossorigin="anonymous"></script>
133+
</body>
134+
</html>
2.05 MB
Binary file not shown.

assets/cov21.pdf

1.37 MB
Binary file not shown.

assets/fair.jpg

33.7 KB

assets/fairRQ.pdf

649 KB
Binary file not shown.

assets/googlegorilla.png

41.2 KB

assets/googlegorilla2.png

401 KB
