
Commit 3030af2

Analysis module preparation #80 (#144)
1 parent aa3ed10 commit 3030af2

38 files changed

Lines changed: 773 additions & 279 deletions

jacodb-analysis/README.md

Lines changed: 67 additions & 5 deletions
@@ -1,11 +1,73 @@
11
# Module jacodb-analysis
22

3-
Module for custom analysis
3+
Analysis module allows launching dataflow analyses of applications.
4+
It contains an API for writing custom analyses, along with several ready-to-use analyses.
45

5-
## IFDS
6+
## Concept of units
67

7-
TODO
8+
The [IFDS](https://dx.doi.org/10.1145/199448.199462) framework is used as the basis for this module.
9+
However, in order to be scalable, the analyzed code is split into so-called units, so that the framework
10+
can analyze them concurrently.
11+
Information is shared between the units via summaries, but the lifecycle of each unit is controlled
12+
separately.
813

9-
## Points To
14+
## Get started
1015

11-
TODO
16+
The entry point of the analysis is the [runAnalysis] method. In order to call it, you have to provide:
17+
* `graph` — an application graph that is used for analysis. To obtain this graph, one should call the [newApplicationGraphForAnalysis] method.
18+
* `unitResolver` — an object that groups methods into units. Choose one from `UnitResolversLibrary`.
19+
Note that in general, larger units mean more precise but also more resource-consuming analysis.
20+
* `ifdsUnitRunner` — an [IfdsUnitRunner] instance, which is used to analyze each unit. This is what defines concrete analysis.
21+
Ready-to-use runners are located in `RunnersLibrary`.
22+
* `methods` — a list of methods to analyze.
23+
24+
For example, to detect unused variables in the given `analyzedClass` methods, you may run the following code
25+
(assuming that `classpath` is an instance of [JcClasspath]):
26+
27+
```kotlin
28+
val applicationGraph = runBlocking {
29+
classpath.newApplicationGraphForAnalysis()
30+
}
31+
32+
val methodsToAnalyze = analyzedClass.declaredMethods
33+
val unitResolver = MethodUnitResolver
34+
val runner = UnusedVariableRunner
35+
36+
runAnalysis(applicationGraph, unitResolver, runner, methodsToAnalyze)
37+
```
38+
39+
## Implemented runners
40+
41+
Currently, the following runners are implemented:
42+
* `UnusedVariableRunner` that can detect issues like unused variable declaration, unused return value, etc.
43+
* `NpeRunner` that can find instructions with possible null-value dereference.
44+
* Generic `TaintRunner` that can perform taint analysis.
45+
* `SqlInjectionRunner` that finds places vulnerable to SQL injection, thus performing a specific kind of taint analysis.
46+
47+
## Implementing your own analysis
48+
49+
To implement a simple one-pass analysis, use [IfdsBaseUnitRunner].
50+
To instantiate it, you need an [AnalyzerFactory] instance, which is an object that can create an [Analyzer] from a
51+
[JcApplicationGraph].
52+
53+
To implement the [Analyzer] interface, you have to specify the following:
54+
55+
* `flowFunctions`, which describe the dataflow facts and how they propagate during the analysis.
56+
57+
* How vulnerabilities are produced by these facts, i.e. you have to implement `getSummaryFacts` and `getSummaryFactsPostIfds` methods.
58+
59+
To implement a bidirectional analysis, you may use the composite runners [SequentialBidiIfdsUnitRunner] and [ParallelBidiIfdsUnitRunner].
60+
61+
<!--- MODULE jacodb-analysis -->
62+
<!--- INDEX org.jacodb.analysis -->
63+
64+
[runAnalysis]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis/run-analysis.html
65+
[newApplicationGraphForAnalysis]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis/new-application-graph-for-analysis.html
66+
[IfdsUnitRunner]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-ifds-unit-runner/index.html
67+
[JcClasspath]: https://jacodb.org/docs/jacodb-api/org.jacodb.api/-jc-classpath/index.html
68+
[IfdsBaseUnitRunner]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-ifds-base-unit-runner/index.html
69+
[AnalyzerFactory]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-analyzer-factory/index.html
70+
[Analyzer]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-analyzer/index.html
71+
[JcApplicationGraph]: https://jacodb.org/docs/jacodb-api/org.jacodb.api.analysis/-jc-application-graph/index.html
72+
[SequentialBidiIfdsUnitRunner]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-sequential-bidi-ifds-base-unit-runner/index.html
73+
[ParallelBidiIfdsUnitRunner]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-parallel-bidi-ifds-base-unit-runner/index.html
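
The unit concept from the README can be pictured with a small standalone sketch. It does not use the jacodb API: `Method`, `Resolver`, `methodResolver`, `classResolver`, and `splitIntoUnits` are hypothetical stand-ins, introduced only to show how a resolver groups methods into units and how the choice of resolver changes the unit size.

```kotlin
// Standalone sketch: these types are illustrative stand-ins, not the jacodb API.
data class Method(val className: String, val name: String)

// A resolver maps each method to the unit it belongs to.
fun interface Resolver<U> {
    fun resolve(method: Method): U
}

// One unit per method: the smallest possible units.
val methodResolver = Resolver<Method> { it }

// One unit per class: larger units, so methods of one class are analyzed together.
val classResolver = Resolver<String> { it.className }

// Groups the given methods into units according to the resolver.
fun <U> splitIntoUnits(methods: List<Method>, resolver: Resolver<U>): Map<U, List<Method>> =
    methods.groupBy { resolver.resolve(it) }

fun main() {
    val methods = listOf(
        Method("A", "foo"),
        Method("A", "bar"),
        Method("B", "baz"),
    )
    println(splitIntoUnits(methods, methodResolver).size) // 3
    println(splitIntoUnits(methods, classResolver).size)  // 2
}
```

With the per-class resolver, `foo` and `bar` land in the same unit, which matches the README's note that larger units trade resources for precision.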

jacodb-analysis/src/main/kotlin/org/jacodb/analysis/AnalysisMain.kt

Lines changed: 46 additions & 88 deletions
@@ -14,103 +14,61 @@
1414
* limitations under the License.
1515
*/
1616

17+
@file:JvmName("AnalysisMain")
1718
package org.jacodb.analysis
19+
1820
import kotlinx.serialization.Serializable
1921
import mu.KLogging
20-
import org.jacodb.analysis.analyzers.AliasAnalyzerFactory
21-
import org.jacodb.analysis.analyzers.NpeAnalyzerFactory
22-
import org.jacodb.analysis.analyzers.NpePrecalcBackwardAnalyzerFactory
23-
import org.jacodb.analysis.analyzers.SqlInjectionAnalyzerFactory
24-
import org.jacodb.analysis.analyzers.SqlInjectionBackwardAnalyzerFactory
25-
import org.jacodb.analysis.analyzers.TaintAnalysisNode
26-
import org.jacodb.analysis.analyzers.TaintAnalyzerFactory
27-
import org.jacodb.analysis.analyzers.TaintBackwardAnalyzerFactory
28-
import org.jacodb.analysis.analyzers.TaintNode
29-
import org.jacodb.analysis.analyzers.UnusedVariableAnalyzerFactory
30-
import org.jacodb.analysis.engine.IfdsBaseUnitRunner
31-
import org.jacodb.analysis.engine.SequentialBidiIfdsUnitRunner
32-
import org.jacodb.analysis.engine.TraceGraph
22+
import org.jacodb.analysis.engine.IfdsUnitManager
23+
import org.jacodb.analysis.engine.IfdsUnitRunner
24+
import org.jacodb.analysis.engine.Summary
25+
import org.jacodb.analysis.engine.UnitResolver
26+
import org.jacodb.analysis.engine.VulnerabilityInstance
27+
import org.jacodb.analysis.graph.newApplicationGraphForAnalysis
3328
import org.jacodb.api.JcMethod
34-
import org.jacodb.api.cfg.JcExpr
35-
import org.jacodb.api.cfg.JcInst
36-
37-
@Serializable
38-
data class DumpableVulnerabilityInstance(
39-
val vulnerabilityType: String,
40-
val sources: List<String>,
41-
val sink: String,
42-
val traces: List<List<String>>
43-
)
44-
45-
@Serializable
46-
data class DumpableAnalysisResult(val foundVulnerabilities: List<DumpableVulnerabilityInstance>)
47-
48-
data class VulnerabilityInstance(
49-
val vulnerabilityType: String,
50-
val traceGraph: TraceGraph
51-
) {
52-
private fun JcInst.prettyPrint(): String {
53-
return "${toString()} (${location.method}:${location.lineNumber})"
54-
}
55-
56-
fun toDumpable(maxPathsCount: Int): DumpableVulnerabilityInstance {
57-
return DumpableVulnerabilityInstance(
58-
vulnerabilityType,
59-
traceGraph.sources.map { it.statement.prettyPrint() },
60-
traceGraph.sink.statement.prettyPrint(),
61-
traceGraph.getAllTraces().take(maxPathsCount).map { intermediatePoints ->
62-
intermediatePoints.map { it.statement.prettyPrint() }
63-
}.toList()
64-
)
65-
}
66-
}
29+
import org.jacodb.api.analysis.JcApplicationGraph
6730

68-
fun List<VulnerabilityInstance>.toDumpable(maxPathsCount: Int = 3): DumpableAnalysisResult {
69-
return DumpableAnalysisResult(map { it.toDumpable(maxPathsCount) })
70-
}
31+
internal val logger = object : KLogging() {}.logger
7132

7233
typealias AnalysesOptions = Map<String, String>
7334

7435
@Serializable
7536
data class AnalysisConfig(val analyses: Map<String, AnalysesOptions>)
7637

77-
val UnusedVariableRunner = IfdsBaseUnitRunner(UnusedVariableAnalyzerFactory)
7838

79-
fun newSqlInjectionRunner(maxPathLength: Int = 5) = SequentialBidiIfdsUnitRunner(
80-
IfdsBaseUnitRunner(SqlInjectionAnalyzerFactory(maxPathLength)),
81-
IfdsBaseUnitRunner(SqlInjectionBackwardAnalyzerFactory(maxPathLength)),
82-
)
83-
84-
fun newNpeRunner(maxPathLength: Int = 5) = SequentialBidiIfdsUnitRunner(
85-
IfdsBaseUnitRunner(NpeAnalyzerFactory(maxPathLength)),
86-
IfdsBaseUnitRunner(NpePrecalcBackwardAnalyzerFactory(maxPathLength)),
87-
)
88-
89-
fun newAliasRunner(
90-
generates: (JcInst) -> List<TaintAnalysisNode>,
91-
sanitizes: (JcExpr, TaintNode) -> Boolean,
92-
sinks: (JcInst) -> List<TaintAnalysisNode>,
93-
maxPathLength: Int = 5
94-
) = IfdsBaseUnitRunner(AliasAnalyzerFactory(generates, sanitizes, sinks, maxPathLength))
95-
96-
fun newTaintRunner(
97-
isSourceMethod: (JcMethod) -> Boolean,
98-
isSanitizeMethod: (JcMethod) -> Boolean,
99-
isSinkMethod: (JcMethod) -> Boolean,
100-
maxPathLength: Int = 5
101-
) = SequentialBidiIfdsUnitRunner(
102-
IfdsBaseUnitRunner(TaintAnalyzerFactory(isSourceMethod, isSanitizeMethod, isSinkMethod, maxPathLength)),
103-
IfdsBaseUnitRunner(TaintBackwardAnalyzerFactory(isSourceMethod, isSinkMethod, maxPathLength))
104-
)
105-
106-
fun newTaintRunner(
107-
sourceMethodMatchers: List<String>,
108-
sanitizeMethodMatchers: List<String>,
109-
sinkMethodMatchers: List<String>,
110-
maxPathLength: Int = 5
111-
) = SequentialBidiIfdsUnitRunner(
112-
IfdsBaseUnitRunner(TaintAnalyzerFactory(sourceMethodMatchers, sanitizeMethodMatchers, sinkMethodMatchers, maxPathLength)),
113-
IfdsBaseUnitRunner(TaintBackwardAnalyzerFactory(sourceMethodMatchers, sinkMethodMatchers, maxPathLength))
114-
)
115-
116-
internal val logger = object : KLogging() {}.logger
39+
/**
40+
* This is the entry point for every analysis.
41+
* Calling this function will find all vulnerabilities reachable from [methods].
42+
*
43+
* @param graph instance of [JcApplicationGraph] that provides a mixture of CFG and call graph
44+
* (called supergraph in RHS95).
45+
* Usually built by [newApplicationGraphForAnalysis].
46+
*
47+
* @param unitResolver instance of [UnitResolver] which splits all methods into groups of methods, called units.
48+
* Units are analyzed concurrently; each unit is analyzed with one call to the [IfdsUnitRunner.run] method.
49+
* In general, larger units mean more precise, but also more resource-consuming analysis, so [unitResolver] lets you
50+
* choose a trade-off between the two.
51+
* It is guaranteed that [Summary] passed to all units is the same, so they can share information through it.
52+
* However, the order of launching and terminating analysis for units is an implementation detail and may vary even for
53+
* consecutive calls of this method with the same arguments.
54+
*
55+
* @param ifdsUnitRunner an [IfdsUnitRunner] instance that will be launched for each unit.
56+
* This is the main argument that defines the analysis.
57+
*
58+
* @param methods the list of methods to analyze.
59+
* Each vulnerability will only be reported if it is reachable from one of these.
60+
*
61+
* @param timeoutMillis the maximum time for analysis, in milliseconds.
62+
* Note that this does not include time for precalculations
63+
* (like searching for reachable methods and splitting them into units) and postcalculations (like restoring traces), so
64+
* the actual running time of this method may be longer.
65+
*/
66+
fun runAnalysis(
67+
graph: JcApplicationGraph,
68+
unitResolver: UnitResolver<*>,
69+
ifdsUnitRunner: IfdsUnitRunner,
70+
methods: List<JcMethod>,
71+
timeoutMillis: Long = Long.MAX_VALUE
72+
): List<VulnerabilityInstance> {
73+
return IfdsUnitManager(graph, unitResolver, ifdsUnitRunner, methods, timeoutMillis).analyze()
74+
}
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
1+
/*
2+
* Copyright 2022 UnitTestBot contributors (utbot.org)
3+
* <p>
4+
* Licensed under the Apache License, Version 2.0 (the "License");
5+
* you may not use this file except in compliance with the License.
6+
* You may obtain a copy of the License at
7+
* <p>
8+
* http://www.apache.org/licenses/LICENSE-2.0
9+
* <p>
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS,
12+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
* See the License for the specific language governing permissions and
14+
* limitations under the License.
15+
*/
16+
17+
package org.jacodb.analysis
18+
19+
import kotlinx.serialization.Serializable
20+
import org.jacodb.analysis.engine.VulnerabilityInstance
21+
22+
/**
23+
* Simplified version of [VulnerabilityInstance] that contains only serializable data.
24+
*/
25+
@Serializable
26+
data class DumpableVulnerabilityInstance(
27+
val vulnerabilityType: String,
28+
val sources: List<String>,
29+
val sink: String,
30+
val traces: List<List<String>>
31+
)
32+
33+
@Serializable
34+
data class DumpableAnalysisResult(val foundVulnerabilities: List<DumpableVulnerabilityInstance>)
35+
36+
fun List<VulnerabilityInstance>.toDumpable(maxPathsCount: Int = 3): DumpableAnalysisResult {
37+
return DumpableAnalysisResult(map { it.toDumpable(maxPathsCount) })
38+
}
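
The `maxPathsCount` parameter bounds how many traces end up in the dumped result. The sketch below is a standalone illustration of that idea, assuming traces arrive as a lazy sequence; `DumpedVulnerability` and `dump` are hypothetical names and are not part of the module.

```kotlin
// Hypothetical, simplified stand-in for the dumpable data class in the diff.
data class DumpedVulnerability(
    val vulnerabilityType: String,
    val sink: String,
    val traces: List<List<String>>,
)

// Keeps at most maxPathsCount traces; the sequence is consumed lazily,
// so traces beyond the limit are never materialized.
fun dump(
    vulnerabilityType: String,
    sink: String,
    allTraces: Sequence<List<String>>,
    maxPathsCount: Int = 3,
): DumpedVulnerability =
    DumpedVulnerability(vulnerabilityType, sink, allTraces.take(maxPathsCount).toList())

fun main() {
    // An unbounded stream of traces, to show that the limit is what stops enumeration.
    val traces = generateSequence(1) { it + 1 }
        .map { n -> listOf("source", "step$n", "sink") }
    val dumped = dump("demo", "sink", traces)
    println(dumped.traces.size) // 3
}
```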

jacodb-analysis/src/main/kotlin/org/jacodb/analysis/engine/FlowFunctions.kt

Lines changed: 53 additions & 0 deletions
@@ -20,35 +20,88 @@ import org.jacodb.api.JcMethod
2020
import org.jacodb.api.analysis.JcApplicationGraph
2121
import org.jacodb.api.cfg.JcInst
2222

23+
/**
24+
* Interface for flow functions -- mappings of kind DomainFact -> Collection of DomainFacts
25+
*/
2326
fun interface FlowFunctionInstance {
2427
fun compute(fact: DomainFact): Collection<DomainFact>
2528
}
2629

30+
/**
31+
* A marker interface for facts that appear in analyses
32+
*/
2733
interface DomainFact
2834

35+
/**
36+
* A special [DomainFact] that always holds
37+
*/
2938
object ZEROFact : DomainFact {
3039
override fun toString() = "[ZERO fact]"
3140
}
3241

42+
/**
43+
* Implementations of the interface should provide all four kinds of flow functions mentioned in RHS95,
44+
* thus fully describing how the facts are propagated through the supergraph.
45+
*/
3346
interface FlowFunctionsSpace {
47+
/**
48+
* @return facts that may hold when analysis is started from [startStatement]
49+
* (these are needed to initialize the worklist in the IFDS analysis)
50+
*/
3451
fun obtainPossibleStartFacts(startStatement: JcInst): Collection<DomainFact>
3552
fun obtainSequentFlowFunction(current: JcInst, next: JcInst): FlowFunctionInstance
3653
fun obtainCallToStartFlowFunction(callStatement: JcInst, callee: JcMethod): FlowFunctionInstance
3754
fun obtainCallToReturnFlowFunction(callStatement: JcInst, returnSite: JcInst): FlowFunctionInstance
3855
fun obtainExitToReturnSiteFlowFunction(callStatement: JcInst, returnSite: JcInst, exitStatement: JcInst): FlowFunctionInstance
3956
}
4057

58+
/**
59+
* [Analyzer] interface describes how facts are propagated and how vulnerabilities are produced by these facts during
60+
* the run of the tabulation algorithm by [IfdsBaseUnitRunner].
61+
*
62+
* There are two methods that can turn facts into vulnerabilities or other [SummaryFact]s: [getSummaryFacts] and
63+
* [getSummaryFactsPostIfds]. The first is called during the analysis, each time a new path edge is found, and the second
64+
* is called only after all path edges were found.
65+
* While some analyses really need the full set of facts to find vulnerabilities, most analyses can report [SummaryFact]s
66+
* right after some fact is reached, so [getSummaryFacts] is a recommended way to report vulnerabilities when possible.
67+
*
68+
* Note that methods and properties of this interface may be accessed concurrently from different threads,
69+
* so the implementations should be thread-safe.
70+
*
71+
* @property flowFunctions a [FlowFunctionsSpace] instance that describes how facts are generated and propagated
72+
* during the run of the tabulation algorithm.
73+
*
74+
* @property saveSummaryEdgesAndCrossUnitCalls when true, summary edges and cross-unit calls will be automatically
75+
* saved to summary (usually this property is true for forward analyzers and false for backward analyzers).
76+
*/
4177
interface Analyzer {
4278
val flowFunctions: FlowFunctionsSpace
4379

4480
val saveSummaryEdgesAndCrossUnitCalls: Boolean
4581
get() = true
4682

83+
/**
84+
* This method is called by [IfdsBaseUnitRunner] each time a new path edge is found.
85+
*
86+
* @return [SummaryFact]s that are produced by this edge, that need to be saved to summary.
87+
*/
4788
fun getSummaryFacts(edge: IfdsEdge): List<SummaryFact> = emptyList()
4889

90+
/**
91+
* This method is called once by [IfdsBaseUnitRunner] when the propagation of facts is finished
92+
* (normally or due to cancellation).
93+
*
94+
* @return [SummaryFact]s that can be obtained after the facts propagation was completed.
95+
*/
4996
fun getSummaryFactsPostIfds(ifdsResult: IfdsResult): List<SummaryFact> = emptyList()
5097
}
5198

99+
/**
100+
* A functional interface that produces an [Analyzer] from a [JcApplicationGraph].
101+
*
102+
* It simplifies instantiation of [IfdsUnitRunner]s because this way you don't have to pass graph and reversed
103+
* graph to [Analyzer]s directly, relying on the runner to do it itself.
104+
*/
52105
fun interface AnalyzerFactory {
53106
fun newAnalyzer(graph: JcApplicationGraph): Analyzer
54107
}
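
To make the role of `FlowFunctionInstance` and `ZEROFact` concrete, here is a minimal standalone sketch of two flow functions for a toy taint analysis. The first three declarations are local copies of those in the diff so that the snippet compiles on its own; `Tainted`, `genTaint`, `copyTaint`, and `propagate` are hypothetical names, and statements are modeled by plain variable names rather than `JcInst`.

```kotlin
// Local copies of the declarations shown in the diff, so the sketch compiles on its own.
interface DomainFact

object ZEROFact : DomainFact {
    override fun toString() = "[ZERO fact]"
}

fun interface FlowFunctionInstance {
    fun compute(fact: DomainFact): Collection<DomainFact>
}

// Hypothetical fact: "variable `name` holds tainted data".
data class Tainted(val name: String) : DomainFact

// Flow function for `target = source()`:
// ZERO generates Tainted(target); an older fact about target is killed by the overwrite.
fun genTaint(target: String) = FlowFunctionInstance { fact ->
    when (fact) {
        ZEROFact -> listOf(ZEROFact, Tainted(target))
        Tainted(target) -> emptyList()
        else -> listOf(fact)
    }
}

// Flow function for `to = from`:
// taint on `from` is copied to `to`; an older fact about `to` is killed.
fun copyTaint(from: String, to: String) = FlowFunctionInstance { fact ->
    when (fact) {
        Tainted(from) -> listOf(fact, Tainted(to))
        Tainted(to) -> emptyList()
        else -> listOf(fact)
    }
}

// Applies one flow function to every fact in the current set.
fun propagate(facts: Set<DomainFact>, f: FlowFunctionInstance): Set<DomainFact> =
    facts.flatMap { f.compute(it) }.toSet()

fun main() {
    var facts: Set<DomainFact> = setOf(ZEROFact)
    facts = propagate(facts, genTaint("x"))       // x = source()
    facts = propagate(facts, copyTaint("x", "y")) // y = x
    println(facts)
}
```

Because `ZEROFact` always holds, it is the only fact that can introduce new facts from nothing; every other fact is either passed through, transformed, or dropped.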

jacodb-analysis/src/main/kotlin/org/jacodb/analysis/engine/IdLikeFlowFunction.kt

Lines changed: 0 additions & 34 deletions
This file was deleted.
