
Commit 126d5cc

Rangeet Pan authored and committed
updating notebooks
1 parent a45c3a8 commit 126d5cc

2 files changed

Lines changed: 228 additions & 3 deletions


docs/examples/java/code_summarization.ipynb

Lines changed: 227 additions & 2 deletions
@@ -15,6 +15,231 @@
1515
{
1616
"cell_type": "markdown",
1717
"source": [
18+
"# Using CLDK to explain Java methods\n",
19+
"\n",
20+
"In this tutorial, we will use CLDK to explain, or generate a code summary for, every method in a Java application.\n",
21+
"\n",
22+
"By the end of this tutorial, you will have a code summary for every method in a Java application. Along the way, you will see how CLDK makes program analysis fast and easy, and how to build an LLM-based code summary generator on top of it.\n",
23+
"\n",
24+
"You will learn how to do the following:\n",
25+
"\n",
26+
"<ol>\n",
27+
"<li> Create a new instance of the CLDK class.\n",
28+
"<li> Create an analysis object over the Java application.\n",
29+
"<li> Iterate over all the files in the project.\n",
30+
"<li> Iterate over all the classes in the file.\n",
31+
"<li> Iterate over all the methods in the class.\n",
32+
"<li> Get the code body of the method.\n",
33+
"<li> Initialize the treesitter utils for the class file content.\n",
34+
"<li> Sanitize the class for analysis.\n",
35+
"</ol>\n",
36+
"Next, we will write a couple of helper methods to:\n",
37+
"\n",
38+
"<ol>\n",
39+
"<li> Format the instruction for the given focal method and class.\n",
40+
"<li> Prompt the local model on Ollama.\n",
41+
"<li> Print the instruction and LLM output.\n",
42+
"</ol>"
43+
],
44+
"metadata": {
45+
"collapsed": false
46+
},
47+
"id": "59d05bbe28e62687"
48+
},
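The three helper methods listed in the cell above could look roughly like the sketch below. The prompt wording, the function names (`format_inst`, `prompt_ollama`, `explain_method`), and the injectable `generate` parameter are assumptions made for illustration and are not taken from the notebook. Only `ollama.generate` from the Ollama Python SDK and the `granite-code:8b-instruct` model tag come from the SDK and this tutorial.

```python
def format_inst(code, focal_method, focal_class):
    """Build a summarization instruction for one method (hypothetical wording)."""
    return (
        f"Question: Can you write a brief summary for the method "
        f"`{focal_method}` in the class `{focal_class}` below?\n\n"
        f"{code}\n"
    )


def prompt_ollama(message, model_id="granite-code:8b-instruct", generate=None):
    """Send the prompt to a local Ollama model and return the generated text.

    `generate` is an assumption added for this sketch so the function can be
    exercised without a running server; by default it uses `ollama.generate`.
    """
    if generate is None:
        import ollama  # needs `pip install ollama` and a running Ollama server
        generate = ollama.generate
    response = generate(model=model_id, prompt=message)
    return response["response"]


def explain_method(code, focal_method, focal_class, generate=None):
    """Format the instruction, prompt the model, and print both sides."""
    instruction = format_inst(code, focal_method, focal_class)
    print(f"Instruction:\n{instruction}")
    llm_output = prompt_ollama(instruction, generate=generate)
    print(f"LLM Output:\n{llm_output}")
    return llm_output
```

Keeping the prompt formatting separate from the model call makes it easy to inspect or adjust the instruction text before spending time on generation.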
49+
{
50+
"cell_type": "markdown",
51+
"source": [
52+
"## Prerequisites\n",
53+
"\n",
54+
"Before we get started, let's make sure you have the following installed:\n",
55+
"\n",
56+
"<ol>\n",
57+
"<li> Python 3.11 or later\n",
58+
"<li> Ollama 0.3.4 or later\n",
59+
"</ol>\n",
60+
"We will use Ollama to spin up a local Granite model that will act as our LLM for this tutorial."
61+
],
62+
"metadata": {
63+
"collapsed": false
64+
},
65+
"id": "92896c8ce12b0e9e"
66+
},
67+
{
68+
"cell_type": "markdown",
69+
"source": [
70+
"### Prerequisite 1: Install ollama\n",
71+
"\n",
72+
"If you don't have Ollama installed, download and install it from [Ollama](https://ollama.com/download).\n",
73+
"Once Ollama is installed, start the server and make sure it is running.\n",
74+
"On a systemd-based Linux system (including WSL with systemd enabled), you can check that the server is running with the following command. Note that `systemctl` is not available on macOS."
75+
],
76+
"metadata": {
77+
"collapsed": false
78+
},
79+
"id": "bfeb1e1227191e3b"
80+
},
81+
{
82+
"cell_type": "code",
83+
"execution_count": null,
84+
"outputs": [],
85+
"source": [
86+
"!systemctl status ollama"
87+
],
88+
"metadata": {
89+
"collapsed": false
90+
},
91+
"id": "c53214c8106642ce"
92+
},
93+
{
94+
"cell_type": "markdown",
95+
"source": [
96+
"If not, you may have to start the server manually. You can do this by running the following command:"
97+
],
98+
"metadata": {
99+
"collapsed": false
100+
},
101+
"id": "34a7b1802be15a3f"
102+
},
103+
{
104+
"cell_type": "code",
105+
"execution_count": null,
106+
"outputs": [],
107+
"source": [
108+
"!systemctl start ollama"
109+
],
110+
"metadata": {
111+
"collapsed": false
112+
},
113+
"id": "f60e2d9ec12f0bf6"
114+
},
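On systems without `systemctl`, one alternative is to probe the server over HTTP from Python. The sketch below assumes the server listens on Ollama's default address, `http://localhost:11434`; the function name `ollama_is_running` is hypothetical, and the check only confirms that something answers with HTTP 200 at that address.

```python
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, DNS failures, HTTP errors, and timeouts
        # (urllib's URLError and HTTPError are subclasses of OSError).
        return False
```

Calling `ollama_is_running()` before prompting the model gives a clearer failure than a connection error raised in the middle of the tutorial.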
115+
{
116+
"cell_type": "markdown",
117+
"source": [
118+
"Once Ollama is up and running, you can download the latest version of the Granite 8b Instruct model by running the command in the next cell.\n",
119+
"\n",
120+
"Other Granite versions are available, but this tutorial uses the Granite 8b Instruct model. If you prefer a different version, replace `8b-instruct` with any of the other [versions](https://ollama.com/library/granite-code/tags)."
121+
],
122+
"metadata": {
123+
"collapsed": false
124+
},
125+
"id": "f629a10841aca9e2"
126+
},
127+
{
128+
"cell_type": "code",
129+
"execution_count": null,
130+
"outputs": [],
131+
"source": [
132+
"!ollama pull granite-code:8b-instruct"
133+
],
134+
"metadata": {
135+
"collapsed": false
136+
},
137+
"id": "6ff900382e86a18e"
138+
},
139+
{
140+
"cell_type": "markdown",
141+
"source": [
142+
"Let's make sure the model is downloaded by running the following command:"
143+
],
144+
"metadata": {
145+
"collapsed": false
146+
},
147+
"id": "d076e98c390591b5"
148+
},
149+
{
150+
"cell_type": "code",
151+
"execution_count": null,
152+
"outputs": [],
153+
"source": [
154+
"!ollama run granite-code:8b-instruct \"Write a python function to print 'Hello, World!'\""
155+
],
156+
"metadata": {
157+
"collapsed": false
158+
},
159+
"id": "7aff854a031589f0"
160+
},
161+
{
162+
"cell_type": "markdown",
163+
"source": [
164+
"### Prerequisite 3: Install ollama Python SDK"
165+
],
166+
"metadata": {
167+
"collapsed": false
168+
},
169+
"id": "531205b489bbec73"
170+
},
171+
{
172+
"cell_type": "code",
173+
"execution_count": null,
174+
"outputs": [],
175+
"source": [
176+
"pip install ollama"
177+
],
178+
"metadata": {
179+
"collapsed": false
180+
},
181+
"id": "e2a749932a800c9d"
182+
},
183+
{
184+
"cell_type": "markdown",
185+
"source": [
186+
"### Prerequisite 4: Install CLDK\n",
187+
"CLDK is available on GitHub at github.com/IBM/codellm-devkit.git. You can install it by running the following command:"
188+
],
189+
"metadata": {
190+
"collapsed": false
191+
},
192+
"id": "6f42dbd286b3f7a6"
193+
},
194+
{
195+
"cell_type": "code",
196+
"execution_count": null,
197+
"outputs": [],
198+
"source": [
199+
"pip install git+https://github.com/IBM/codellm-devkit.git"
200+
],
201+
"metadata": {
202+
"collapsed": false
203+
},
204+
"id": "327e212f20a489d6"
205+
},
206+
{
207+
"cell_type": "markdown",
208+
"source": [
209+
"### Step 1: Get the sample Java application\n",
210+
"For this tutorial, we will use Apache Commons CLI. You can download the source code to a temporary directory by running the following command:"
211+
],
212+
"metadata": {
213+
"collapsed": false
214+
},
215+
"id": "dd8ec5b9c837898f"
216+
},
217+
{
218+
"cell_type": "code",
219+
"execution_count": null,
220+
"outputs": [],
221+
"source": [
222+
"!wget https://github.com/apache/commons-cli/archive/refs/tags/rel/commons-cli-1.7.0.zip -O /tmp/commons-cli-1.7.0.zip && unzip -o /tmp/commons-cli-1.7.0.zip -d /tmp"
223+
],
224+
"metadata": {
225+
"collapsed": false
226+
},
227+
"id": "c196e58b3ce90c34"
228+
},
229+
{
230+
"cell_type": "markdown",
231+
"source": [
232+
"The project is now extracted to `/tmp/commons-cli-rel-commons-cli-1.7.0`. We'll remove these files later, so don't worry about the location."
233+
],
234+
"metadata": {
235+
"collapsed": false
236+
},
237+
"id": "44e875e7ce6db504"
238+
},
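Where `wget` and `unzip` are not available, the same download-and-extract step can be sketched with the Python standard library. The function names `extract_zip` and `download_and_extract` are hypothetical. The default destination is the system temporary directory, which is an assumption and is not `/tmp` on every platform.

```python
import tempfile
import urllib.request
import zipfile
from pathlib import Path

# Archive URL taken from the tutorial's wget command.
COMMONS_CLI_URL = "https://github.com/apache/commons-cli/archive/refs/tags/rel/commons-cli-1.7.0.zip"


def extract_zip(zip_path, dest_dir=tempfile.gettempdir()):
    """Extract zip_path into dest_dir and return the sorted top-level entry names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted({name.split("/", 1)[0] for name in archive.namelist()})


def download_and_extract(url=COMMONS_CLI_URL, dest_dir=tempfile.gettempdir()):
    """Download the archive at url into dest_dir, then extract it there."""
    zip_path = Path(dest_dir) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, zip_path)
    return extract_zip(zip_path, dest_dir)
```

Calling `download_and_extract(dest_dir="/tmp")` mirrors the shell command and should leave the project under the `/tmp` path used in the rest of the tutorial.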
239+
{
240+
"cell_type": "markdown",
241+
"source": [
242+
"### Generate code summary\n",
18243
"Code summarization, or code explanation, is the task of translating code written in a programming language into natural language. This task has several\n",
19244
"benefits, such as understanding code without reading every detail and documenting code for easier maintenance. To do that, one needs to\n",
20245
"understand the basics of how code is structured, and use that knowledge to generate the summary using various AI-based approaches. In this particular\n",
@@ -147,7 +372,7 @@
147372
"Step 5: CLDK supports several analysis engines: Codeanalyzer (built using WALA and JavaParser), Treesitter, and CodeQL (future). By default, Codeanalyzer is\n",
148373
"the analysis engine. CLDK also supports different analysis levels: (a) symbol table, (b) call graph, (c) program dependency graph, and\n",
149374
"(d) system dependency graph. The analysis level can be selected using the ```AnalysisLevel``` enum. In this example, we will generate summaries of all the methods\n",
150-
"of an application. To select the application location, you can set the environment variable ```JAVA_APP_PATH```. "
375+
"of an application. "
151376
],
152377
"metadata": {
153378
"collapsed": false
@@ -160,7 +385,7 @@
160385
"outputs": [],
161386
"source": [
162387
"# Create an analysis object over the java application\n",
163-
"analysis = cldk.analysis(project_path=\"JAVA_APP_PATH\", analysis_level=AnalysisLevel.symbol_table)"
388+
"analysis = cldk.analysis(project_path=\"/tmp/commons-cli-rel-commons-cli-1.7.0\", analysis_level=AnalysisLevel.symbol_table)"
164389
],
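The removed line passed the literal string `"JAVA_APP_PATH"` as the project path, not the value of that environment variable. If an environment override is still wanted, a minimal sketch is shown below. The helper name `resolve_project_path` is hypothetical, and the fallback is the hard-coded path this commit introduces.

```python
import os

# Path hard-coded by this commit; used when the environment variable is unset.
DEFAULT_APP_PATH = "/tmp/commons-cli-rel-commons-cli-1.7.0"


def resolve_project_path(env_var="JAVA_APP_PATH", default=DEFAULT_APP_PATH):
    """Return the path stored in env_var if it is set and non-empty, else default."""
    return os.path.expanduser(os.environ.get(env_var) or default)
```

The result could then be passed as `project_path=resolve_project_path()` in the `cldk.analysis(...)` call shown in the cell above.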
165390
"metadata": {
166391
"collapsed": false

docs/examples/java/generate_unit_tests.ipynb

Lines changed: 1 addition & 1 deletion
@@ -239,7 +239,7 @@
239239
{
240240
"cell_type": "markdown",
241241
"source": [
242-
"### Building a JUnit test generator using CLDK and Granite Code Instruct Model\\n\n",
242+
"### Building a JUnit test generator using CLDK and Granite Code Instruct Model\n",
243243
"Now that we have all the prerequisites installed, let's start building a JUnit test generator using CLDK and the Granite Code Instruct Model."
244244
],
245245
"metadata": {
