Commit 9a32734

Added simple linear regression sample analytic for Python 3 support
1 parent 981b4e5 commit 9a32734

10 files changed

Lines changed: 270 additions & 0 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ These are sample analytics written for use with Predix Analytics:
 - **[demo-timeseries-adder-with-model](analytics/demo-timeseries-adder-with-model):** Takes 2 arrays of timeseries data and returns a timeseries array that contains the sums at each timestamp, adjusted by a threshold value provided in a trained model. Currently available in Java and Python.
 - **[demo-RTM-loco](analytics/demo-RTM-loco):** A reference analytic that is used to calculate locomotive efficiency using a linear regression model. It has been implemented in Java, Matlab (r2011b), and Python.
 - **[miners-rule](analytics/miners-rule):** A sample analytic that performs a Miner's Rule operation on 2 timeseries arrays and returns a timeseries array. Currently only available in Java.
+- **[sample-linear-regression](analytics/sample-linear-regression):**:new: A sample analytic that performs a simple linear regression with two arrays as input and returns the p-value, r-value, slope, intercept and standard error. Currently only available in Python 3.

 For more information on developing analytics for use with Predix Analytics, see [Analytic Development](https://docs.predix.io/en-US/content/service/analytics_services/analytics_framework/analytic-development) on Predix IO.


analytics/README.md

Lines changed: 1 addition & 0 deletions
@@ -7,5 +7,6 @@ These are sample analytics written for use with [Predix Analytics](https://docs.
 - **[demo-timeseries-adder-with-model](demo-timeseries-adder-with-model):** Takes 2 arrays of timeseries data and returns a timeseries array that contains the sums at each timestamp, adjusted by a threshold value provided in a trained model. Currently available in Java and Python.
 - **[demo-RTM-loco](demo-RTM-loco):** A reference analytic that is used to calculate locomotive efficiency using a linear regression model. It has been implemented in Java, Matlab (r2011b), and Python.
 - **[miners-rule](miners-rule):** A sample analytic that performs a Miner's Rule operation on 2 timeseries arrays and returns a timeseries array. Currently only available in Java.
+- **[sample-linear-regression](sample-linear-regression):**:new: A sample analytic that performs a simple linear regression with two arrays as input and returns the p-value, r-value, slope, intercept and standard error. Currently only available in Python 3.

 For more information on developing analytics for use with Predix Analytics, see [Analytic Development](https://docs.predix.io/en-US/content/service/analytics_services/analytics_framework/analytic-development) on Predix IO.
Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
# Simple Linear Regression

A Python 3 sample analytic that performs a simple linear regression and can be deployed to the Predix Analytics platform.

## Pre-requisites

To run this analytic locally, you need the following:

- Python 3.6+
- scipy 0.19.1

## Running unit tests

```bash
$ cd <location>/sample-linear-regression/analytics
$ python -m unittest test_simple_linear_regression.py

```

## Building, deploying and running the analytic

1. Zip the contents of this directory or, if you have Maven 3 installed and configured, run `mvn clean install` to generate `py3-simple-linear-regression-0.1-bin.zip` in the `target` folder.
2. Create an analytic in Analytics Catalog with the name "simple-linregress", the version "v1" and the supported language set to "PYTHON_3".
3. Upload the zip file and attach it to the created analytic.
4. Deploy and test the analytic on the Predix Analytics platform.

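For the manual "zip the contents" route in step 1, the following is a minimal sketch in Python rather than the Maven build itself. The function name `package_analytic` and the exclusion lists are assumptions that loosely mirror the excludes in `assembly.xml`; adjust them to your project.

```python
import fnmatch
import zipfile
from pathlib import Path

# Assumed exclusions, loosely mirroring assembly.xml in this commit.
EXCLUDED_NAMES = ["pom.xml", "assembly.xml", "*.iml", "*.pyc", "test_*.py", "README.*"]
EXCLUDED_DIRS = {"target", ".idea", "__pycache__"}


def package_analytic(project_dir, zip_path):
    """Zip project_dir into zip_path, skipping build and test artifacts.

    Returns the archive member names that were written, in sorted order.
    """
    project_dir = Path(project_dir)
    zip_path = Path(zip_path).resolve()
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(project_dir.rglob("*")):
            # Skip directories and the archive itself if it lives in the tree.
            if not path.is_file() or path.resolve() == zip_path:
                continue
            relative = path.relative_to(project_dir)
            if EXCLUDED_DIRS.intersection(relative.parts):
                continue
            if any(fnmatch.fnmatch(path.name, pattern) for pattern in EXCLUDED_NAMES):
                continue
            archive.write(path, relative.as_posix())
            added.append(relative.as_posix())
    return added
```

Calling `package_analytic(".", "target/analytic.zip")` from the project root would write `config.json` and the `analytics` package at the top level of the archive, which matches `includeBaseDirectory` being `false` in the assembly descriptor. The `target` directory must already exist.
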
## Analytic template

This analytic takes in 2 arrays and returns the p-value, r-value, slope, intercept and standard error. This structure is outlined in this [analytic template](simple_linear_regression_template.json).

### Input format

The expected JSON input data format is as follows:

```json
{
  "y": [19.0, 20.0, 20.5, 21.5, 22.0, 23.0, 23.0, 25.5, 24.0],
  "x": [1, 2, 3, 4, 5, 6, 7, 8, 9]
}
```

### Output format

The JSON output format from the analytic is as follows:

```json
{
  "slope": 0.7166666666666667,
  "intercept": 18.472222222222225,
  "r_value": 0.9559490311973318,
  "p_value": 5.6576124923241295e-05,
  "std_err": 0.08317445171439007
}
```

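The output values above can be cross-checked without scipy. The sketch below is not the analytic's implementation (the analytic calls `scipy.stats.linregress`); it is a hand-written least-squares fit, with the hypothetical name `simple_linear_regression`, applied to the input data from the unit test in this commit. It omits the p-value, which needs a Student's t distribution, and it assumes neither input array is constant.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit of y = slope * x + intercept (p-value omitted)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-deviations.
    ss_xx = sum((xi - mean_x) ** 2 for xi in x)
    ss_yy = sum((yi - mean_y) ** 2 for yi in y)
    ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = ss_xy / ss_xx
    intercept = mean_y - slope * mean_x
    r_value = ss_xy / math.sqrt(ss_xx * ss_yy)
    # Standard error of the slope, with n - 2 degrees of freedom.
    std_err = math.sqrt((1 - r_value ** 2) * ss_yy / ss_xx / (n - 2))
    return {"slope": slope, "intercept": intercept,
            "r_value": r_value, "std_err": std_err}


result = simple_linear_regression(
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24],
)
print(result)
```

For this data the slope works out to 43/60, which agrees with the `slope` value shown in the output format up to floating-point rounding.
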
## Developing a Python-based analytic

1. Implement the analytic (and test functions) according to your development guidelines.
2. Create an entry method in your analytic class. The entry method signature must be in one of the following two formats:
   * For analytics that do not use trained models, use the following signature for your entry method:
     `def entry_method(self, inputJson):`
   * For analytics that use trained models, use the following signature for your entry method:
     `def entry_method(self, inputJson, inputModels):`
   * In either case, `entry_method` can be any method name. `inputJson` is the JSON string input that will be passed to the analytic. The output of this method must also be a JSON string.
   * `inputModels` contains a `dict` of trained models as defined in the port-to-field map. The entry method should properly handle the case of an empty dict.
3. Create a `config.json` file in the top level of the project directory. Specify the entry method in the format `<subdirectory>.<className>.<methodName>`, along with `conda-libs` and `non-conda-libs`.
4. Package all the analytic files and the `config.json` file into a ZIP file.

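The two entry-method signatures from step 2 can be sketched as follows. Everything here is illustrative: the class name `ExampleAnalytic`, the method names, the model key `offset_model`, and the assumption that a model arrives as UTF-8 bytes holding a number are all hypothetical and not taken from the Predix documentation.

```python
import json


class ExampleAnalytic:
    """Hypothetical analytic class showing both entry-method shapes."""

    def run(self, inputJson):
        # Entry method for an analytic that does not use trained models.
        data = json.loads(inputJson)
        return json.dumps({"count": len(data.get("x", []))})

    def run_with_models(self, inputJson, inputModels):
        # Entry method for an analytic that uses trained models.
        # An empty (or missing) dict falls back to an offset of 0.0.
        data = json.loads(inputJson)
        raw_model = (inputModels or {}).get("offset_model")
        offset = float(raw_model.decode("utf-8")) if raw_model is not None else 0.0
        return json.dumps({"values": [value + offset for value in data.get("x", [])]})
```

Both methods take a JSON string and return a JSON string, as the list above requires. The fallback in `run_with_models` is one way to handle an empty `inputModels` dict; raising a descriptive error would be an equally reasonable choice.
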
For more information on developing analytics for use with the Predix Analytics platform, please visit the **[Analytic Development](https://docs.predix.io/en-US/content/service/analytics_services/analytics_framework/analytic-development)** section of the Predix Analytics Services documentation on predix.io.
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
from analytics.simple_linear_regression import SimpleLinearRegression
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
import json

from scipy import stats


class SimpleLinearRegression:

    def compute(self, data):
        # Parse the JSON string input and pull out the "x" and "y" arrays.
        data_json = json.loads(data)
        x = data_json.get("x")
        y = data_json.get("y")
        # Least-squares fit of y against x.
        slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

        output = {
            "slope": slope,
            "intercept": intercept,
            "r_value": r_value,
            "p_value": p_value,
            "std_err": std_err}

        # The entry method must return a JSON string.
        return json.dumps(output)
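
The `compute` method above passes whatever `data_json.get` returns straight to `stats.linregress`, so a missing field surfaces as a scipy error. A minimal sketch of up-front validation follows; the helper name `parse_regression_input` and its error messages are hypothetical and not part of this commit.

```python
import json


def parse_regression_input(data):
    """Return (x, y) from a JSON string, raising ValueError on malformed input."""
    payload = json.loads(data)
    # Report every missing field at once rather than failing on the first.
    missing = [key for key in ("x", "y") if key not in payload]
    if missing:
        raise ValueError("missing input field(s): " + ", ".join(missing))
    x, y = payload["x"], payload["y"]
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two points are required")
    return x, y
```

A caller could run this at the top of `compute` and pass the returned arrays on to the regression.
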
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
import json
from unittest import TestCase

from simple_linear_regression import SimpleLinearRegression


class SimpleLinearRegressionTest(TestCase):

    def setUp(self):
        pass

    def test_compute(self):
        input_data_dict = {"y": [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24],
                           "x": [1, 2, 3, 4, 5, 6, 7, 8, 9]}

        simple_linear_regression = SimpleLinearRegression()
        output_data = simple_linear_regression.compute(json.dumps(input_data_dict))

        analytic_output_dict = json.loads(output_data)

        print(analytic_output_dict)

        # Compare floats with a tolerance; exact equality can break across
        # scipy versions and platforms.
        self.assertAlmostEqual(0.7166666666666667, analytic_output_dict["slope"])
        self.assertAlmostEqual(0.9559490311973318, analytic_output_dict["r_value"])
        self.assertAlmostEqual(5.6576124923241295e-05, analytic_output_dict["p_value"])
        self.assertAlmostEqual(18.472222222222225, analytic_output_dict["intercept"])
        self.assertAlmostEqual(0.08317445171439007, analytic_output_dict["std_err"])
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
<?xml version="1.0" encoding="UTF-8"?>

<!--
  ~ Copyright (c) 2015 - 2016 General Electric Company. All rights reserved.
  ~
  ~ The copyright to the computer software herein is the property of
  ~ General Electric Company. The software may be used and/or copied only
  ~ with the written permission of General Electric Company or in accordance
  ~ with the terms and conditions stipulated in the agreement/contract
  ~ under which the software has been supplied.
  -->

<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/2.5.5">
    <id>bin</id>
    <formats>
        <format>zip</format>
    </formats>
    <includeBaseDirectory>false</includeBaseDirectory>
    <fileSets>
        <fileSet>
            <directory>${project.basedir}</directory>
            <excludes>
                <exclude>pom.xml</exclude>
                <exclude>assembly.xml</exclude>
                <exclude>target/</exclude>
                <exclude>.idea/</exclude>
                <exclude>*.iml</exclude>
                <exclude>**/**.pyc</exclude>
                <exclude>**/**/test_*.py</exclude>
                <exclude>**/**/*_pycache_*</exclude>
                <exclude>README.MD</exclude>
            </excludes>
        </fileSet>
    </fileSets>
</assembly>
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
{
  "entry-method": "analytics.SimpleLinearRegression.compute",
  "conda-libs": [
    "scipy==0.19.1"
  ]
}
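
The `entry-method` value follows the `<subdirectory>.<className>.<methodName>` format described in the README. As a rough illustration of how that string breaks down, the sketch below splits it into its three parts. The helper `split_entry_method` is hypothetical, and the assumption that the value always has exactly three dot-separated parts comes from that format description only; it says nothing about how the Predix platform actually loads the analytic.

```python
import json

# The config.json content from this commit, inlined so the sketch is self-contained.
CONFIG_TEXT = """
{
  "entry-method": "analytics.SimpleLinearRegression.compute",
  "conda-libs": ["scipy==0.19.1"]
}
"""


def split_entry_method(config_text):
    """Split "<subdirectory>.<className>.<methodName>" into its three parts."""
    config = json.loads(config_text)
    parts = config["entry-method"].split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "entry-method must look like <subdirectory>.<className>.<methodName>")
    subdirectory, class_name, method_name = parts
    return subdirectory, class_name, method_name


print(split_entry_method(CONFIG_TEXT))
```

Here the subdirectory is `analytics`, whose `__init__.py` re-exports `SimpleLinearRegression`, and the method is `compute`.
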
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Copyright (c) 2015 - 2016 General Electric Company. All rights reserved.
  ~
  ~ The copyright to the computer software herein is the property of
  ~ General Electric Company. The software may be used and/or copied only
  ~ with the written permission of General Electric Company or in accordance
  ~ with the terms and conditions stipulated in the agreement/contract
  ~ under which the software has been supplied.
  -->

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

    <modelVersion>4.0.0</modelVersion>
    <groupId>com.ge.predix.insight</groupId>
    <artifactId>py3-simple-linear-regression</artifactId>
    <name>Sample Simple Linear regression in Python 3</name>
    <packaging>pom</packaging>
    <description>Implementation of Simple Linear regression analytic service in Python 3</description>
    <version>0.1</version>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.5.5</version>
                <executions>
                    <execution>
                        <id>create-distribution</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                        <configuration>
                            <descriptors>
                                <descriptor>assembly.xml</descriptor>
                            </descriptors>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

</project>
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
{
  "analyticName": "SimpleLinearRegression",
  "analyticVersion": "V1.0",
  "inputPortDefinitions": [
    {
      "portType": "FIELD",
      "portName": "x",
      "variable": false,
      "dataType": "DOUBLE_ARRAY",
      "required": true
    },
    {
      "portType": "FIELD",
      "portName": "y",
      "variable": false,
      "dataType": "DOUBLE_ARRAY",
      "required": true
    }
  ],
  "outputPortDefinitions": [
    {
      "portType": "FIELD",
      "portName": "slope",
      "variable": false,
      "dataType": "DOUBLE",
      "required": true
    },
    {
      "portType": "FIELD",
      "portName": "intercept",
      "variable": false,
      "dataType": "DOUBLE",
      "required": true
    },
    {
      "portType": "FIELD",
      "portName": "r_value",
      "variable": false,
      "dataType": "DOUBLE",
      "required": true
    },
    {
      "portType": "FIELD",
      "portName": "p_value",
      "variable": false,
      "dataType": "DOUBLE",
      "required": true
    },
    {
      "portType": "FIELD",
      "portName": "std_err",
      "variable": false,
      "dataType": "DOUBLE",
      "required": true
    }
  ]
}
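
The template marks both input ports as required. A minimal sketch of checking an input payload against those definitions is shown below. The helper `missing_required_ports` is hypothetical, the template is trimmed to the two input ports (with `dataType` omitted), and this is only a local sanity check, not a description of how Predix Analytics validates inputs.

```python
import json

# Input port definitions from the template above, trimmed to the fields used here.
TEMPLATE = json.loads("""
{
  "inputPortDefinitions": [
    {"portType": "FIELD", "portName": "x", "variable": false, "required": true},
    {"portType": "FIELD", "portName": "y", "variable": false, "required": true}
  ]
}
""")


def missing_required_ports(template, payload):
    """Return the names of required input ports that are absent from payload."""
    return [
        port["portName"]
        for port in template["inputPortDefinitions"]
        if port.get("required") and port["portName"] not in payload
    ]


print(missing_required_ports(TEMPLATE, {"x": [1, 2, 3]}))
```

With only `x` supplied, the helper reports `y` as missing; an empty list means every required port is present.
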
