This project contains all the scripts used to conduct the experiments presented in the TSE paper titled "An Empirical Study of Type-Related Defects in Python Projects".

The paper conducts an empirical evaluation of 400 defects in 211 unique Python projects. The following table lists, for each project, the GitHub repository link, the number of contributors, the number of stars, and the size of the project in lines of code (LOC).

Unique Projects

repo_links contributors stars loc
0 https://github.com/secdev/scapy.git 344 5742 113440
1 https://github.com/ansible/ansible.git 5846 45667 279233
2 https://github.com/pypa/warehouse.git 286 2534 247367
3 https://github.com/WellDone/pymomo.git 3 2 12536
4 https://github.com/virtool/virtool.git 12 27 289103
5 https://github.com/dhlab-basel/Knora.git 18 55 243959
6 https://github.com/PaddlePaddle/Paddle.git 545 13286 644506
7 https://github.com/rlee287/pyautoupdate.git 3 3 2591
8 https://github.com/cprogrammer1994/ModernGL.git 36 858 38548
9 https://github.com/saltstack/salt.git 3167 11371 815600
10 https://github.com/bokeh/bokeh.git 549 14248 151358
11 https://github.com/NixOS/nixops.git 150 773 10862
12 https://github.com/numenta/nupic.git 116 6187 143450
13 https://github.com/liqd/adhocracy3.mercator.git 28 51 105327
14 https://github.com/solvebio/solvebio-python.git 22 15 12154
15 https://github.com/wradlib/wradlib.git 40 136 19085
16 https://github.com/kwikteam/phy.git 16 140 28067
17 https://github.com/swcarpentry/amy.git 31 73 130863
18 https://github.com/numba/numba.git 249 5787 194064
19 https://github.com/stianjensen/wikipendium.no.git 8 35 8456
20 https://github.com/s4w3d0ff/python-poloniex.git 19 552 1154
21 https://github.com/Cypresslin/UbunTuTuMKII.git 7 0 2707
22 https://github.com/BitBotFactory/MikaLendingBot.git 55 1047 4663
23 https://github.com/SethMMorton/natsort.git 17 411 5854
24 https://github.com/MCOfficer/EndlessSky-Discord-Bot.git 17 5 2617
25 https://github.com/rarewin/AtomSeeker.git 2 0 544
26 https://github.com/divio/django-cms.git 565 7823 179997
27 https://github.com/ContinuumIO/blaze.git 68 2902 47077
28 https://github.com/esikachev/my-dev-server.git 4 0 553
29 https://github.com/trailofbits/manticore.git 88 2132 233249
30 https://github.com/androguard/androguard.git 84 3270 126324
31 https://github.com/rtfd/readthedocs.org.git 434 6170 225723
32 https://github.com/EUDAT-B2SHARE/b2share.git 208 26 40956
33 https://github.com/streamlink/streamlink.git 260 5442 42217
34 https://github.com/zyga/guacamole.git 1 7 2696
35 https://github.com/tum-ens/urbs.git 29 112 8584
36 https://github.com/ganga-devs/ganga.git 107 65 125182
37 https://github.com/airbnb/streamalert.git 29 2406 52628
38 https://github.com/TechEmpower/FrameworkBenchmarks.git 754 5473 156136
39 https://github.com/mypaint/mypaint.git 337 1752 402313
40 https://github.com/numpy/numpy.git 1263 15419 375456
41 https://github.com/awesto/django-shop.git 94 2348 39863
42 https://github.com/pandas-dev/pandas.git 2424 27279 390377
43 https://github.com/pimutils/khal.git 71 1517 12722
44 https://github.com/ansible/ansible-modules-core.git 1473 1207 71523
45 https://github.com/ansible/ansible-modules-extras.git 1365 927 79664
46 https://github.com/Tribler/tribler.git 143 3415 52737
47 https://github.com/Alir3z4/html2text.git 72 1016 3660
48 https://github.com/mozilla/build-relengapi.git 28 14 35203
49 https://github.com/beetbox/beets.git 460 9737 62225
50 https://github.com/ArkaneMoose/BotBot.git 9 7 1138
51 https://github.com/pywinauto/pywinauto.git 80 2513 34553
52 https://github.com/Cadasta/cadasta-platform.git 39 52 114297
53 https://github.com/kabirbaidhya/pglistend.git 3 22 529
54 https://github.com/cogniteev/docido-python-sdk.git 4 0 5501
55 https://github.com/missionpinball/mpf.git 44 125 78739
56 https://github.com/astropy/astropy.git 483 2551 462832
57 https://github.com/micahflee/onionshare.git 160 4199 34903
58 https://github.com/MongoEngine/mongoengine.git 366 3319 28879
59 https://github.com/KhronosGroup/SPIRV-Cross.git 86 1075 170127
60 https://github.com/projectatomic/commissaire-http.git 7 3 3529
61 https://github.com/frappe/erpnext.git 480 7106 530111
62 https://github.com/simphony/simphony-mayavi.git 7 0 8099
63 https://github.com/GameServerManagers/LinuxGSM.git 163 2507 18247
64 https://github.com/jupyter/nbgrader.git 84 961 105231
65 https://github.com/pytest-dev/pytest.git 694 6713 59500
66 https://github.com/ciudadanointeligente/write-it.git 23 37 35923
67 https://github.com/andycasey/ads.git 17 122 2265
68 https://github.com/DMSC-Instrument-Data/lewis.git 8 15 7756
69 https://github.com/ultrabug/py3status.git 200 774 20234
70 https://github.com/ensime/ensime-vim.git 38 195 2665
71 https://github.com/CenterForOpenScience/osf.io.git 210 548 335405
72 https://github.com/pytroll/satpy.git 96 638 81410
73 https://github.com/choderalab/openpathsampling.git 12 57 62756
74 https://github.com/chainer/chainer.git 352 5483 187028
75 https://github.com/pymedusa/Medusa.git 328 1195 444714
76 https://github.com/strands-project/aaf_deployment.git 26 0 78324
77 https://github.com/radremedy/radremedy.git 11 12 10890
78 https://github.com/LCAS/zoidbot.git 5 1 7080
79 https://github.com/kumoru/torment.git 7 5 1640
80 https://github.com/ansible/ansible-container.git 78 2262 38613
81 https://github.com/biocore/qiita.git 43 94 61575
82 https://github.com/openstenoproject/plover.git 50 997 167258
83 https://github.com/andreas-h/emiprep.git 1 1 2747
84 https://github.com/wal-e/wal-e.git 73 3176 7832
85 https://github.com/krmaxwell/maltrieve.git 14 554 800
86 https://github.com/pybuilder/pybuilder.git 38 1258 24780
87 https://github.com/inveniosoftware/invenio-accounts.git 89 4 5649
88 https://github.com/home-assistant/home-assistant.git 2363 37384 875329
89 https://github.com/capitalone/cloud-custodian.git 302 3272 483360
90 https://github.com/KanoComputing/kano-burners.git 9 12 3428
91 https://github.com/golemfactory/golem.git 82 2919 88108
92 https://github.com/pantsbuild/pants.git 249 1435 103058
93 https://github.com/inveniosoftware/invenio.git 219 438 4790
94 https://github.com/cherrypy/cherrypy.git 156 1302 20256
95 https://github.com/google/mobly.git 39 409 14694
96 https://github.com/ubc/compair.git 22 20 82037
97 https://github.com/voc/voctomix.git 40 446 16071
98 https://github.com/geopython/OWSLib.git 126 223 161317
99 https://github.com/TESScience/SPyFFI.git 7 6 10954
100 https://github.com/desihub/desitarget.git 41 9 26028
101 https://github.com/letsencrypt/letsencrypt.git 469 27371 57122
102 https://github.com/planetlabs/planet-client-python.git 30 154 6291
103 https://github.com/kartoza/geonode.git 253 6 425486
104 https://github.com/google/blockly.git 157 8214 196194
105 https://github.com/madisonhicks/blob.git 2 0 2224
106 https://github.com/mozilla-services/autopush.git 23 184 19538
107 https://github.com/GetStream/stream-django.git 19 406 1016
108 https://github.com/tryolabs/luminoth.git 18 2323 14314
109 https://github.com/pydata/xarray.git 281 1853 72030
110 https://github.com/pfnet/chainer.git 352 5483 187028
111 https://github.com/LLNL/spack.git 766 1822 298250
112 https://github.com/miguelgrinberg/Flask-SocketIO.git 57 3979 3242
113 https://github.com/catkin/catkin_tools.git 62 103 13188
114 https://github.com/INM-6/h5py_wrapper.git 3 1 843
115 https://github.com/tomv564/LSP.git 94 944 14188
116 https://github.com/shopkeep/shpkpr.git 12 13 3456
117 https://github.com/mitodl/micromasters.git 44 24 112841
118 https://github.com/poldracklab/mriqc.git 31 136 23790
119 https://github.com/nansencenter/nansat.git 34 149 15411
120 https://github.com/renisac/pdnssdk-py.git 5 3 1140
121 https://github.com/SUSE/azurectl.git 7 9 12038
122 https://github.com/librosa/librosa.git 78 4050 16806
123 https://github.com/coala-analyzer/coala.git 494 3091 34688
124 https://github.com/Sixdsn/terra-terminal.git 8 7 5690
125 https://github.com/tomchristie/django-rest-framework.git 1133 19264 71500
126 https://github.com/ilogue/niprov.git 6 9 8195
127 https://github.com/RIOT-OS/RIOT.git 407 3600 1422950
128 https://github.com/Alignak-monitoring/alignak.git 186 85 76380
129 https://github.com/mitodl/lore.git 18 18 285226
130 https://github.com/mozilla/normandy.git 49 45 29465
131 https://github.com/christabor/flask_jsondash.git 6 3121 156515
132 https://github.com/mantidproject/mantid.git 251 148 2697357
133 https://github.com/neutrons/FastGR.git 16 0 67665
134 https://github.com/FX31337/FX-BT-Scripts.git 19 19 2899
135 https://github.com/ament/ament_tools.git 24 8 2
136 https://github.com/liqd/a4-meinberlin.git 24 19 51459
137 https://github.com/unt-libraries/pyuntl.git 7 2 4894
138 https://github.com/cloudant/python-cloudant.git 30 155 12843
139 https://github.com/gem/oq-engine.git 93 192 607525
140 https://github.com/torchbox/wagtail.git 530 9653 294330
141 https://github.com/faneshion/MatchZoo.git 47 3262 25628
142 https://github.com/gitcoinco/web.git 264 1029 1490351
143 https://github.com/mlsecproject/combine.git 11 602 1160
144 https://github.com/CentOS-PaaS-SIG/linchpin.git 59 106 36938
145 https://github.com/neuropoly/spinalcordtoolbox.git 76 108 74967
146 https://github.com/Jumpscale/go-raml.git 22 125 1028299
147 https://github.com/ManageIQ/integration_tests.git 182 64 166710
148 https://github.com/BD2KGenomics/toil-scripts.git 16 31 4695
149 https://github.com/GoogleCloudPlatform/gcloud-python.git 311 3405 481
150 https://github.com/codeforamerica/pittsburgh-purchasing-suite.git 11 16 19997
151 https://github.com/QISKit/qiskit-sdk-py.git 255 2884 105193
152 https://github.com/mkdocs/mkdocs.git 179 11055 24740
153 https://github.com/buildbot/buildbot.git 757 4485 181971
154 https://github.com/tildaslash/RatticWeb.git 27 480 26647
155 https://github.com/asciidisco/plugin.video.netflix.git 47 1216 10103
156 https://github.com/inspirehep/inspire-next.git 79 43 657591
157 https://github.com/project-rig/rig.git 5 1 25577
158 https://github.com/wong2/pick.git 11 356 331
159 https://github.com/hydroshare/hydroshare.git 77 113 310733
160 https://github.com/apache/incubator-superset.git 530 31241 277160
161 https://github.com/Microsoft/PTVS.git 84 2301 943647
162 https://github.com/ufo-kit/concert.git 15 8 9818
163 https://github.com/apache/incubator-mxnet.git 962 19103 405628
164 https://github.com/nvbn/thefuck.git 167 57170 10583
165 https://github.com/dseuss/mpnum.git 2 40 5866
166 https://github.com/miki725/django-rest-framework-bulk.git 8 450 989
167 https://github.com/zenodo/zenodo.git 57 480 48547
168 https://github.com/PyAr/fades.git 32 178 16834
169 https://github.com/phoebe-project/phoebe2.git 28 33 55627
170 https://github.com/metomi/rose.git 44 43 65687
171 https://github.com/ContinuumIO/odo.git 43 925 12051
172 https://github.com/Cog-Creators/Red-DiscordBot.git 111 1876 289177
173 https://github.com/QuantConnect/Lean.git 170 4118 398981
174 https://github.com/dbcli/vcli.git 42 74 5786
175 https://github.com/SoCo/SoCo.git 74 1170 14564
176 https://github.com/geometalab/osmaxx.git 17 21 92165
177 https://github.com/ONSdigital/eq-survey-runner.git 78 17 183776
178 https://github.com/pixelated/pixelated-user-agent.git 94 158 31232
179 https://github.com/flask-restful/flask-restful.git 157 5807 5967
180 https://github.com/SanaMobile/sana.protocol_builder.git 12 6 14087
181 https://github.com/jupyter/jupyterlab.git 414 10436 248330
182 https://github.com/postmarketOS/pmbootstrap.git 109 1003 35430
183 https://github.com/raiden-network/raiden.git 103 1774 78455
184 https://github.com/pr-omethe-us/PyKED.git 5 10 11648
185 https://github.com/freedomofpress/securedrop.git 213 2950 61996
186 https://github.com/poldracklab/fmriprep.git 77 352 114547
187 https://github.com/cgstudiomap/cgstudiomap.git 3 2 5735321
188 https://github.com/boundlessgeo/qgis-webappbuilder-plugin.git 10 18 293882
189 https://github.com/clab/dynet.git 153 3143 64192
190 https://github.com/mesonbuild/meson.git 601 2988 100764
191 https://github.com/basho/riak-python-client.git 93 323 19179
192 https://github.com/dirk-thomas/vcstool.git 21 133 3572
193 https://github.com/Ex-Mente/auxi.0.git 8 7 19832
194 https://github.com/associatedpress/geomancer.git 6 58 3324
195 https://github.com/Difrex/surok.git 6 3 3366
196 https://github.com/CartoDB/cartoframes.git 49 199 18867
197 https://github.com/SciTools/cartopy.git 96 809 16826
198 https://github.com/SonarOpenCommunity/sonar-cxx.git 67 613 182560
199 https://github.com/CSC-IT-Center-for-Science/pouta-blueprints.git 17 8 34974
200 https://github.com/lbryio/lbry.git 100 4864 67986
201 https://github.com/influxdata/telegraf.git 842 9395 282824
202 https://github.com/conan-io/conan.git 251 4472 85161
203 https://github.com/ray-project/ray.git 410 13812 305278
204 https://github.com/frictionlessdata/tabulator-py.git 25 196 6807
205 https://github.com/ImageEngine/gaffer.git 48 645 325819
206 https://github.com/pgmpy/pgmpy.git 94 1634 24904
207 https://github.com/mitmproxy/mitmproxy.git 385 20644 86786
208 https://github.com/canihavesomecoffee/sample-platform.git 30 18 41424
209 https://github.com/biocore/qiime.git 86 269 94132
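
The rows above follow a simple whitespace-separated layout (index, repository link, contributors, stars, LOC), so they can be loaded with a few lines of Python. The following is only a minimal sketch and not part of the project's scripts: the names `ProjectRow`, `parse_row`, and `owner_repo` are hypothetical, and it assumes every row has exactly five fields as shown in the table.

```python
from typing import NamedTuple


class ProjectRow(NamedTuple):
    """One row of the table above (hypothetical helper type)."""
    index: int
    repo_link: str
    contributors: int
    stars: int
    loc: int


def parse_row(line: str) -> ProjectRow:
    """Parse a row of the form '<index> <repo_link> <contributors> <stars> <loc>'."""
    index, repo_link, contributors, stars, loc = line.split()
    return ProjectRow(int(index), repo_link, int(contributors), int(stars), int(loc))


def owner_repo(repo_link: str) -> str:
    """Reduce a clone URL such as https://github.com/owner/repo.git to 'owner/repo'."""
    prefix = "https://github.com/"
    suffix = ".git"
    path = repo_link[len(prefix):] if repo_link.startswith(prefix) else repo_link
    if path.endswith(suffix):
        path = path[:-len(suffix)]
    return path


# The first three rows of the table, copied verbatim.
sample_lines = [
    "0 https://github.com/secdev/scapy.git 344 5742 113440",
    "1 https://github.com/ansible/ansible.git 5846 45667 279233",
    "2 https://github.com/pypa/warehouse.git 286 2534 247367",
]

rows = [parse_row(line) for line in sample_lines]
total_loc = sum(row.loc for row in rows)
names = [owner_repo(row.repo_link) for row in rows]
```

With the rows parsed this way, per-project figures can be aggregated or joined against other data by the `owner/repo` name.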

Repository Structure

  • The sql_queries folder contains the SQL queries used to fetch the data from the GitHub Archive dataset hosted on Google's BigQuery platform.
  • The experience folder contains all the code used to calculate the experience counts of the authors.
  • The project_statistics folder contains the code used to extract the table above.
  • The entire_corpus_python_patched folder contains all of the GitHub issues with at least one Python file in the pull request. This is the corpus from which we select our sample.

Other links