
Commit 4f6e9bb: Update publications
1 parent 0ca8057 commit 4f6e9bb
24 files changed: 500 additions & 3 deletions

_people/samuel-bell.md (1 addition & 0 deletions)

@@ -10,6 +10,7 @@ scholar: yfgSAi8AAAAJ
 github: samueljamesbell
 linkedin: samueljamesbell
 start: 2018-10-01
+end: 2023-04-19
 crsid: sjb326
 supervisor: ndl21
 position: PhD Student

_people/vidhi-lalchand.md (1 addition & 0 deletions)

@@ -5,6 +5,7 @@ family: Lalchand
 crsid: vr308
 student: true
 start: 2022-02-07
+end: 2023-10-01
 website: https://www.vidhilalchand.co.uk
 orcid: null
 position: Research Associate

_projects/autoai.md (6 additions & 3 deletions)

@@ -9,13 +9,16 @@ featured_image: alina-grubnyak-ziqkhi7417a-unsplash.jpg
 people:
 - andrei-paleyes
 - christian-cabrera
-- eric-meissner
 - jessica-montgomery
-- mala-virdee
-- markus-kaiser
 - morine-amutorine
 - neil-d-lawrence
+- diana-robinson
+alumni:
+- eric-meissner
+- mala-virdee
+- markus-kaiser
 - pierre-thodoroff
+
 publications:
 - an-empirical-evaluation-of-flow-based-programming-in-the-machine-learning-deployment-context
 - benchmarking-real-time-reinforcement-learning
New file (26 additions & 0 deletions)

@@ -0,0 +1,26 @@
+---
+layout: techreport
+title: "Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment"
+date: 2021-09-20
+published: 2021-09-20
+abstract: |
+  In this paper we revisit the 2014 NeurIPS experiment that examined inconsistency in conference peer review. We determine that 50% of the variation in reviewer quality scores was subjective in origin. Further, with seven years passing since the experiment we find that for *accepted* papers, there is no correlation between quality scores and impact of the paper as measured as a function of citation count. We trace the fate of rejected papers, recovering where these papers were eventually published. For these papers we find a correlation between quality scores and impact. We conclude that the reviewing process for the 2014 conference was good for identifying poor papers, but poor for identifying good papers. We give some suggestions for improving the reviewing process but also warn against removing the subjective element. Finally, we suggest that the real conclusion of the experiment is that the community should place less onus on the notion of 'top-tier conference publications' when assessing the quality of individual researchers. For NeurIPS 2021, the PCs are repeating the experiment, as well as conducting new ones.
+author:
+- given: Corinna
+  family: Cortes
+- given: Neil D.
+  family: Lawrence
+arxiv: 2109.09774
+website: https://arxiv.org/abs/2109.09774
+doi: 10.48550/arXiv.2109.09774
+subjects:
+- Digital Libraries (cs.DL)
+- Machine Learning (cs.LG)
+submission_history:
+- version: v1
+  date: 2021-09-20
+  size: 2,012 KB
+  submitter: Neil Lawrence
+  email: view email
+software: https://github.com/lawrennd/neurips2014/
+---
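The headline analysis in the abstract above, testing whether review quality scores predict later citation impact, reduces to a rank correlation between two lists of numbers. A minimal pure-Python sketch, using invented placeholder numbers rather than the paper's data:

```python
# Sketch of the score-vs-impact analysis described above.
# Scores and citation counts are invented placeholders, NOT NeurIPS data.

def rank(values):
    """Average 1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean rank of the tie group
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical quality scores and later citation counts for one set of papers.
scores = [4.5, 6.0, 3.0, 7.5, 5.0, 2.0]
citations = [12, 40, 3, 90, 25, 1]
print(spearman(scores, citations))
```

A real analysis would use `scipy.stats.spearmanr` on the recovered publication and citation records rather than hand-rolled ranking.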

_publications/2021-10-08-differentially-private-regression-and-classificaiton-with-sparse-gaussian-processes.md (1 addition & 0 deletions)

@@ -18,6 +18,7 @@ firstpage: 1
 lastpage: 41
 key: Smith-differentially21
 doi:
+sscholar: 202712625
 html: https://jmlr.org/papers/v22/19-017.html
 pdf: https://jmlr.org/papers/volume22/19-017/19-017.pdf
 software: https://github.com/lionfish0/dp4gp
New file (31 additions & 0 deletions)

@@ -0,0 +1,31 @@
+---
+layout: article
+title: "Challenges in Machine Learning Deployment: A Survey of Case Studies"
+abstract: In recent years, machine learning has transitioned from a field of
+  academic research interest to a field capable of solving real-world business
+  problems. However, the deployment of machine learning models in production
+  systems can present a number of issues and concerns. This survey reviews
+  published reports of deploying machine learning solutions in a variety of use
+  cases, industries and applications and extracts practical considerations
+  corresponding to stages of the machine learning deployment workflow. By
+  mapping found challenges to the steps of the machine learning deployment
+  workflow we show that practitioners face issues at each stage of the
+  deployment process. The goal of this paper is to lay out a research agenda to
+  explore approaches addressing these challenges.
+published: 2022-04-30
+author:
+- given: Andrei
+  family: Paleyes
+  mlatcl_title: andrei-paleyes
+  person_page: andrei-paleyes
+- given: Raoul-Gabriel
+  family: Urma
+- given: Neil D.
+  family: Lawrence
+  person_page: neil-d-lawrence
+journal: ACM Comput. Surv.
+publisher: Association for Computing Machinery
+address: New York, NY, USA
+website: https://doi.org/10.1145/3533378
+arxiv: 2011.09926
+---
New file (45 additions & 0 deletions)

@@ -0,0 +1,45 @@
+---
+layout: techreport
+title: "Modelling Technical and Biological Effects in scRNA-seq Data with Scalable GPLVMs"
+date: 2022-09-14
+published: 2022-11-06
+abstract: |
+  Single-cell RNA-seq datasets are growing in size and complexity, enabling the study of cellular composition changes in various biological/clinical contexts. Scalable dimensionality reduction techniques are in need to disentangle biological variation in them, while accounting for technical and biological confounders. In this work, we extend a popular approach for probabilistic non-linear dimensionality reduction, the Gaussian process latent variable model, to scale to massive single-cell datasets while explicitly accounting for technical and biological confounders. The key idea is to use an augmented kernel which preserves the factorisability of the lower bound allowing for fast stochastic variational inference. We demonstrate its ability to reconstruct latent signatures of innate immunity recovered in Kumasaka et al. (2021) with 9x lower training time. We further analyze a COVID dataset and demonstrate across a cohort of 130 individuals, that this framework enables data integration while capturing interpretable signatures of infection. Specifically, we explore COVID severity as a latent dimension to refine patient stratification and capture disease-specific gene expression.
+author:
+- given: Vidhi
+  family: Lalchand
+- given: Aditya
+  family: Ravuri
+- given: Emma
+  family: Dann
+- given: Natsuhiko
+  family: Kumasaka
+- given: Dinithi
+  family: Sumanaweera
+- given: Rik G.H.
+  family: Lindeboom
+- given: Shaista
+  family: Madad
+- given: Sarah A.
+  family: Teichmann
+- given: Neil D.
+  family: Lawrence
+arxiv: 2209.06716
+website: https://arxiv.org/abs/2209.06716
+doi: 10.48550/arXiv.2209.06716
+subjects:
+- Machine Learning (cs.LG)
+- Genomics (q-bio.GN)
+- Applications (stat.AP)
+- Machine Learning (stat.ML)
+submission_history:
+- version: v1
+  date: 2022-09-14
+  size: 12,682 KB
+  submitter: Vidhi Lalchand Miss
+  email: view email
+- version: v2
+  date: 2022-11-06
+  size: 12,682 KB
+  comments: Machine Learning and Computational Biology Symposium (Oral), 2022
+---
New file (22 additions & 0 deletions)

@@ -0,0 +1,22 @@
+---
+layout: techreport
+title: "AI for Science: an emerging agenda"
+date: 2023-03-08
+published: 2023-03-08
+abstract: |
+  This report documents the programme and the outcomes of Dagstuhl Seminar 22382 "Machine Learning for Science: Bridging Data-Driven and Mechanistic Modelling". Today's scientific challenges are characterised by complexity. Interconnected natural, technological, and human systems are influenced by forces acting across time- and spatial-scales, resulting in complex interactions and emergent behaviours. Understanding these phenomena -- and leveraging scientific advances to deliver innovative solutions to improve society's health, wealth, and well-being -- requires new ways of analysing complex systems. The transformative potential of AI stems from its widespread applicability across disciplines, and will only be achieved through integration across research domains. AI for science is a rendezvous point. It brings together expertise from AI and application domains; combines modelling knowledge with engineering know-how; and relies on collaboration across disciplines and between humans and machines. Alongside technical advances, the next wave of progress in the field will come from building a community of machine learning researchers, domain experts, citizen scientists, and engineers working together to design and deploy effective AI tools. This report summarises the discussions from the seminar and provides a roadmap to suggest how different communities can collaborate to deliver a new wave of progress in AI and its application for scientific discovery.
+author:
+- given: Philipp
+  family: Berens
+- given: Kyle
+  family: Cranmer
+- given: Neil D.
+  family: Lawrence
+- given: Ulrike
+  family: von Luxburg
+- given: Jessica
+  family: Montgomery
+arxiv: 2303.04217
+website: https://arxiv.org/abs/2303.04217
+doi: 10.48550/arXiv.2303.04217
+---
New file (19 additions & 0 deletions)

@@ -0,0 +1,19 @@
+---
+layout: inproceedings
+title: "Dataflow graphs as complete causal graphs"
+published: 2023
+author:
+- given: Andrei
+  family: Paleyes
+- given: Siyuan
+  family: Guo
+- given: Bernhard
+  family: Schölkopf
+- given: Neil D.
+  family: Lawrence
+abstract: |
+  Component-based development is one of the core principles behind modern software engineering practices. Understanding of causal relationships between components of a software system can yield significant benefits to developers. Yet modern software design approaches make it difficult to track and discover such relationships at system scale, which leads to growing intellectual debt. In this paper we consider an alternative approach to software design, flow-based programming (FBP), and draw the attention of the community to the connection between dataflow graphs produced by FBP and structural causal models. With expository examples we show how this connection can be leveraged to improve day-to-day tasks in software projects, including fault localisation, business analysis and experimentation.
+booktitle: 2023 IEEE/ACM 2nd International Conference on AI Engineering–Software Engineering Approaches
+website: https://arxiv.org/abs/2303.09552
+arxiv: 2303.09552
+---
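The connection the abstract above draws between dataflow graphs and structural causal models can be illustrated with a toy flow-based program. The component names, payloads and `run` scheduler below are invented for illustration; the point is that intervening on a component (the causal do-operator) only changes what lies downstream of it in the graph:

```python
# Toy flow-based program: components are functions, edges carry their outputs.
# Component names, payloads and the scheduler are invented for illustration.
# Because every influence flows along an edge, the dataflow graph is also a
# causal graph: intervening on a component affects only its descendants.

components = {
    "load":      lambda inputs: {"orders": [100, 250, 40]},
    "validate":  lambda inputs: {"orders": [o for o in inputs["orders"] if o > 50]},
    "aggregate": lambda inputs: {"total": sum(inputs["orders"])},
}
edges = [("load", "validate"), ("validate", "aggregate")]  # the dataflow graph

def run(components, edges, interventions=None):
    """Execute components in dataflow order; `interventions` maps a
    component name to a fixed output, mimicking the causal do-operator."""
    interventions = interventions or {}
    outputs, pending = {}, dict(components)
    while pending:
        ready = [n for n in pending
                 if all(src in outputs for src, dst in edges if dst == n)]
        for name in ready:
            fn = pending.pop(name)
            if name in interventions:   # do(name := value): bypass the component
                outputs[name] = interventions[name]
            else:
                inputs = {}
                for src, dst in edges:
                    if dst == name:
                        inputs.update(outputs[src])
                outputs[name] = fn(inputs)
    return outputs

print(run(components, edges)["aggregate"])                                # observational
print(run(components, edges, {"validate": {"orders": [5]}})["aggregate"]) # interventional
```

Fault localisation follows the same pattern: if `aggregate` misbehaves, only its graph ancestors (`validate`, `load`) are candidate causes.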
New file (18 additions & 0 deletions)

@@ -0,0 +1,18 @@
+---
+layout: techreport
+title: "Dimensionality Reduction as Probabilistic Inference"
+published: 2023-04-15
+abstract: |
+  Dimensionality reduction (DR) algorithms compress high-dimensional data into a lower dimensional representation while preserving important features of the data. DR is a critical step in many analysis pipelines as it enables visualisation, noise reduction and efficient downstream processing of the data. In this work, we introduce the ProbDR variational framework, which interprets a wide range of classical DR algorithms as probabilistic inference algorithms in this framework. ProbDR encompasses PCA, CMDS, LLE, LE, MVU, diffusion maps, kPCA, Isomap, (t-)SNE, and UMAP. In our framework, a low-dimensional latent variable is used to construct a covariance, precision, or a graph Laplacian matrix, which can be used as part of a generative model for the data. Inference is done by optimizing an evidence lower bound. We demonstrate the internal consistency of our framework and show that it enables the use of probabilistic programming languages (PPLs) for DR. Additionally, we illustrate that the framework facilitates reasoning about unseen data and argue that our generative models approximate Gaussian processes (GPs) on manifolds. By providing a unified view of DR, our framework facilitates communication, reasoning about uncertainties, model composition, and extensions, particularly when domain knowledge is present.
+author:
+- given: Aditya
+  family: Ravuri
+- given: Francisco
+  family: Vargas
+- given: Vidhi
+  family: Lalchand
+- given: Neil D.
+  family: Lawrence
+website: https://arxiv.org/abs/2304.07658
+arxiv: 2304.07658
+---
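The core idea in the abstract above, a low-dimensional latent variable used to construct a covariance for a generative model of the data, can be sketched in its simplest (dual-PPCA-style) instance. The `probdr_loglik` function and toy data below are illustrative assumptions, not the paper's implementation; a PCA-aligned embedding should score higher than a random one under this objective:

```python
# Minimal sketch of "DR as probabilistic inference" in a dual-PPCA style:
# a latent embedding Z builds an n x n covariance C = Z Z^T + noise * I, and
# data Y are modelled as d iid zero-mean Gaussian columns with covariance C.
# This function and the toy data are illustrative, not the paper's code.
import numpy as np

def probdr_loglik(Y, Z, noise=0.1):
    """Log marginal likelihood of centred data Y (n x d) given embedding Z (n x q)."""
    n, d = Y.shape
    C = Z @ Z.T + noise * np.eye(n)           # covariance built from the latents
    _, logdet = np.linalg.slogdet(C)
    quad = np.trace(Y.T @ np.linalg.solve(C, Y))
    return -0.5 * (d * logdet + quad + n * d * np.log(2 * np.pi))

rng = np.random.default_rng(0)
Y = rng.normal(size=(20, 5))
Y -= Y.mean(axis=0)

# Candidate embeddings: PCA scores (top-2 singular directions) vs pure noise.
U, S, _ = np.linalg.svd(Y, full_matrices=False)
Z_pca = U[:, :2] * S[:2]
Z_rand = rng.normal(size=(20, 2))
print(probdr_loglik(Y, Z_pca) > probdr_loglik(Y, Z_rand))
```

In the full framework this comparison becomes an optimisation: the embedding itself is inferred by maximising an evidence lower bound rather than picked from candidates.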
