Commit 0efc6c2

Rangeet Pan authored and committed
cookbook examples
1 parent 0a889da commit 0efc6c2

4 files changed

Lines changed: 91 additions & 3 deletions

cldk/core.py

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ def treesitter_parser(self):
         else:
             raise NotImplementedError(f"Treesitter parser for {self.language} is not implemented yet.")

-    def tree_sitter_utils(self, source_code: str):
+    def tree_sitter_utils(self, source_code: str) -> [TreesitterSanitizer| NotImplementedError]:
         """
         Parse the project using treesitter.

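
A note on the new annotation: in Python a return annotation describes only the value that is returned, so an exception that may be raised is usually documented in the docstring rather than listed in the type. The sketch below shows one conventional way to express the same intent. The classes `TreesitterSanitizer` and `Toolkit` here are hypothetical stand-ins, and the method body is an assumption, since the hunk above does not show the real implementation in `cldk/core.py`.

```python
class TreesitterSanitizer:
    """Hypothetical stand-in for the sanitizer type named in the diff."""

    def __init__(self, source_code: str) -> None:
        self.source_code = source_code


class Toolkit:
    """Hypothetical stand-in for the class that owns tree_sitter_utils."""

    def __init__(self, language: str) -> None:
        self.language = language

    def tree_sitter_utils(self, source_code: str) -> TreesitterSanitizer:
        """
        Parse the project using treesitter.

        Raises:
            NotImplementedError: if the language is not supported.
        """
        # Assumption: only Java is supported in this sketch.
        if self.language == "java":
            return TreesitterSanitizer(source_code=source_code)
        raise NotImplementedError(
            f"Treesitter parser for {self.language} is not implemented yet."
        )
```

The annotation names only the returned type; the `Raises:` section carries the information that the list-style annotation tried to encode.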

docs/examples/java/code_summarization.ipynb

Lines changed: 12 additions & 0 deletions
@@ -1,5 +1,17 @@
 {
  "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "!pip install ollama"
+   ],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "eebee2515df69b96"
+  },
   {
    "cell_type": "markdown",
    "source": [

docs/examples/java/generate_unit_tests.ipynb

Lines changed: 13 additions & 1 deletion
@@ -1,5 +1,17 @@
 {
  "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "!pip install ollama"
+   ],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "b8da254b236c8a4b"
+  },
   {
    "cell_type": "markdown",
    "source": [
@@ -80,7 +92,7 @@
    "execution_count": null,
    "outputs": [],
    "source": [
-    "def prompt_ollama(message: str, model_id: str = \"granite-code:8b-instruct\") -> str:\n",
+    "def prompt_ollama(message: str, model_id: str = \"granite-code:20b-instruct\") -> str:\n",
     "    \"\"\"Prompt local model on Ollama\"\"\"\n",
     "    response_object = ollama.generate(model=model_id, prompt=message)\n",
     "    return response_object[\"response\"]"

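Since this hunk changes only the hard-coded default model, one possible variation is to read the default from the environment so that switching between the 8b and 20b models does not require editing the notebook. This is a minimal sketch, not the notebook's code: the `OLLAMA_MODEL_ID` variable and the `generate` parameter are hypothetical additions, and the import of `ollama` is deferred so the function can be exercised without a local Ollama server.

```python
import os
from typing import Any, Callable, Mapping, Optional

# Hypothetical environment variable; falls back to the default used in the diff.
DEFAULT_MODEL_ID = os.environ.get("OLLAMA_MODEL_ID", "granite-code:20b-instruct")


def prompt_ollama(
    message: str,
    model_id: str = DEFAULT_MODEL_ID,
    generate: Optional[Callable[..., Mapping[str, Any]]] = None,
) -> str:
    """Prompt a local model on Ollama and return the generated text."""
    if generate is None:
        # Imported lazily so a stub can be injected when ollama is not installed.
        import ollama

        generate = ollama.generate
    response_object = generate(model=model_id, prompt=message)
    return response_object["response"]
```

Passing a stub as `generate` makes it possible to check the prompt plumbing in isolation, for example `prompt_ollama("hi", generate=lambda model, prompt: {"response": prompt})`.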
docs/examples/java/validating_code_translation.ipynb

Lines changed: 65 additions & 1 deletion
@@ -1,9 +1,22 @@
 {
  "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "from cldk.utils.treesitter.tree_sitter_utils import TreeSitterUtils\n",
+    "!pip install ollama"
+   ],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "3195a8c0612cb428"
+  },
   {
    "cell_type": "markdown",
    "source": [
-    "Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code translation. In our recent paper [https://dl.acm.org/doi/10.1145/3597503.3639226] published at ICSE'24, we found that LLM-based code translation is very promising. In this example, we will walk through the steps of translating each Java class to Python and checking various properties of translated code, such as the number of methods, number of fields, formal arguments, etc.\n",
+    "Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code translation. In our recent paper [https://dl.acm.org/doi/10.1145/3597503.3639226] published at ICSE'24, we found that LLM-based code translation is very promising. In this example, we will walk through the steps of translating each Java class to Python and checking various properties of translated code, such as the number of methods, number of fields, formal arguments, etc.\n",
     "\n",
     "(Step 1) First, we will import all the necessary libraries"
    ],
@@ -83,6 +96,57 @@
     "collapsed": false
    },
    "id": "1c86224032a6eb70"
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "(Step 4) Translate each class in the application (provide the application path as an environment variable, ```JAVA_APP_PATH```) and check certain properties of the translated code, such as (a) number of translated method, and (b) number of translated fields. "
+   ],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "518efea0d8c4d307"
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "from cldk.analysis.python.treesitter import PythonSitter\n",
+    "from cldk.analysis.java.treesitter import JavaSitter\n",
+    "\n",
+    "# Create a new instance of the CLDK class\n",
+    "cldk = CLDK(language=\"java\")\n",
+    "# Create an analysis object over the java application. Provide the application path using JAVA_APP_PATH\n",
+    "analysis = cldk.analysis(project_path=\"JAVA_APP_PATH\", analysis_level=AnalysisLevel.symbol_table)\n",
+    "# Go through all the classes in the application\n",
+    "for class_name in analysis.get_classes():\n",
+    "    # Get the location of the Java class\n",
+    "    class_path = analysis.get_java_file(qualified_class_name=class_name)\n",
+    "    # Read the file content\n",
+    "    if not class_path:\n",
+    "        class_body = ''\n",
+    "    with open(class_path, 'r', encoding='utf-8', errors='ignore') as f:\n",
+    "        class_body = f.read()\n",
+    "    # Sanitize the file content by removing comments.\n",
+    "    tree_sitter_utils = cldk.tree_sitter_utils(source_code=class_body)\n",
+    "    sanitized_class = JavaSitter.remove_all_comments(source_code=class_body)\n",
+    "    translated_code = prompt_ollama(\n",
+    "        message=sanitized_class,\n",
+    "        model_id=\"granite-code:20b-instruct\")\n",
+    "    py_cldk = PythonSitter()\n",
+    "    all_methods = py_cldk.get_all_methods(module=translated_code)\n",
+    "    all_functions = py_cldk.get_all_functions(module=translated_code)\n",
+    "    all_fields = py_cldk.get_all_fields(module=translated_code)\n",
+    "    if len(all_methods) + len(all_functions) != len(analysis.get_methods_in_class(qualified_class_name=class_name)):\n",
+    "        print(f'Number of translated method not matching in class {class_name}')\n",
+    "    if len(all_fields) != len(analysis.get_class(qualified_class_name=class_name).field_declarations):\n",
+    "        print(f'Number of translated field not matching in class {class_name}') "
+   ],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "fe3be3de6790f7b3"
   }
  ],
  "metadata": {

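The Step 4 cell compares the number of methods and fields in the translated Python against the Java symbol table. The sketch below illustrates the same kind of count check using only the standard-library `ast` module as a stand-in for `PythonSitter`; it is not the CLDK API, and the names `count_python_members` and `report_mismatches` are hypothetical. It also treats unparsable model output as a reportable mismatch, which the cell above does not do. Note also that in the cell above an empty `class_path` still reaches `open(class_path, ...)`, so a guard that skips such classes may be needed.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass
class TranslationCounts:
    methods: int    # functions defined directly inside a class body
    functions: int  # functions defined at module level
    fields: int     # distinct class-level names and self.<name> assignments


def count_python_members(source: str) -> TranslationCounts:
    """Count methods, module-level functions, and fields in Python source."""
    tree = ast.parse(source)
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    methods = functions = 0
    fields = set()
    for node in tree.body:
        if isinstance(node, func_types):
            functions += 1
        elif isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, func_types):
                    methods += 1
                    # Collect instance fields assigned as self.<name> = ...
                    for sub in ast.walk(item):
                        if isinstance(sub, ast.Assign):
                            targets = sub.targets
                        elif isinstance(sub, ast.AnnAssign):
                            targets = [sub.target]
                        else:
                            continue
                        for t in targets:
                            if (isinstance(t, ast.Attribute)
                                    and isinstance(t.value, ast.Name)
                                    and t.value.id == "self"):
                                fields.add(t.attr)
                elif isinstance(item, ast.Assign):
                    fields.update(t.id for t in item.targets if isinstance(t, ast.Name))
                elif isinstance(item, ast.AnnAssign) and isinstance(item.target, ast.Name):
                    fields.add(item.target.id)
    return TranslationCounts(methods, functions, len(fields))


def report_mismatches(class_name: str, expected_methods: int,
                      expected_fields: int, translated_code: str) -> List[str]:
    """Return human-readable problems found in one translated class."""
    try:
        counts = count_python_members(translated_code)
    except SyntaxError:
        return [f"Translated code for class {class_name} is not valid Python"]
    problems = []
    if counts.methods + counts.functions != expected_methods:
        problems.append(f"Number of translated methods not matching in class {class_name}")
    if counts.fields != expected_fields:
        problems.append(f"Number of translated fields not matching in class {class_name}")
    return problems
```

Field names are pooled across the whole module in this sketch, so it is only meaningful when each translated file holds a single class, as in the per-class loop above.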