For the final stage of ETL is to commit changes to remote (Google Sheets), no method was found to work.
There are no exceptions, just a function call to remote
and no data state update.
For the save/commit to the remote (i..e the stateful data source): 3 attempts at three approaches to get the local dataset to either|
- INTEGRATE Overwrite using the specialist gspread_dataframe library as per https://gspread-dataframe.readthedocs.io/en/latest/#gspread_dataframe.set_with_dataframe & PerplexityAI
https://www.perplexity.ai/search/2a256f73-47dc-4117-bbab-87e4c0a7cbe1?s=c
2 UPDATE: Overwrite using the basic gspread sheet.update as per
https://docs.gspread.org/en/latest/user-guide.html#using-gspread-with-pandas
- INJECTION: Inject a series data, which is the main data object for ETL, AND with support from Perplexity
https://www.perplexity.ai/search/33fb1a34-54aa-49d4-84aa-45b5d846eba8?s=c
[pycriteria] controller.py (Lines 2031-2032)
def save(self, saved: pd.DataFrame, series: pd.Series, index: int, action: str, debug: bool = False) -> None:
""" Saves the dataframe and commits it to the remote source"""
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2033-2041)
# 1. Prompt the user to save the updated DataFrame
if click.confirm("Are you ready to commit changes?"):
sheet: gspread.Worksheet = Controller.load_wsheet()
# 2. Check for Validation Client and Worksheet ID presence
if sheet.client is not None and \
isinstance(sheet.client, gspread.Client):
if sheet.id is not None and \
sheet.get_all_records():
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2042-2064)
# 3. Convert sheet to a target DataFrame
target: pd.DataFrame = \
DataController.load_dataframe_wsheet(sheet)
# 4. SAVE ATTEMPT 1: INTEGRATE a single record into the target
integratedframe: pd.DataFrame = self.integrate(
single=saved,
source=target,
index=index)
# 4. SAVE ATTEMPT 1b: switch based on TEST Saving mode
# s1 = Use integratedframe &
# gspread_dataframe & set_with_dataframe => set_remote
# This is not commment out code, it is annotation
if action == 's1' and debug is False:
saving = integratedframe.astype(str)
set_remote(worksheet=sheet,
dataframe=saving,
row=1,
col=1,
include_index=False,
include_column_header=True,
resize=False,
allow_formulas=False,
string_escaping='default')
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2065-2075)
# 5. SAVE ATTEMPT 2: Commit the updated DataFrame to remote
# Use the integratedframe as a overwrit: sheet.update
# Using https://docs.gspread.org/en/latest/user-guide.html#using-gspread-with-pandas
elif action == 's2' and debug is False:
sheet.update([saving.columns.values.tolist()] \
+ saving.values.tolist())
# Apparently, this is not working as documented.
# Expected type 'list' (matched generic type 'list[_T]')
# , got 'object' instead
pass
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2077-2084)
# 6. SAVE ATTEMPT 3: Inject the updated series into the remote
# source, via the series row and index parameters.
# BY matching on the Position column, primary key
elif action == 's3' and debug is True:
self.injection(series=series,
sheet=sheet,
row=index,
debug=debug)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2110-2128)
@staticmethod
def integrate(single: pd.DataFrame,
source: pd.DataFrame,
index: int,
reset=True,
debug: bool = False) -> pd.DataFrame:
""" Merges the source and target DataFrames"""
# merge the single-row DataFrame into
# the source DataFrame at the same index
# https://www.perplexity.ai/search/2a256f73-47dc-4117-bbab-87e4c0a7cbe1?s=c
# drop any NaN values from the single-row DataFrame
single = single.dropna()
if debug is True:
rich.inspect(single)
return pd.concat([source.iloc[:index],
single,
source.iloc[index:]]).reset_index(drop=reset)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2130-2136)
@staticmethod
def convertto(lt):
""" Converts the list of lists to a list of strings without nan vals
"""
return [[str(element) for element in sublist if str(element) != 'nan'] for sublist in lt]
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2089-2108)
@staticmethod
def injection(series: pd.Series, sheet, row: int, debug: bool = False) -> None:
""" Injects the updated series into the remote source, via the row"""
records = sheet.get_all_records()
row_id = None
for record in records:
if record['Position'] == series['Position']:
row_id = int(record[row])
break
rich.inspect(row_id)
# Update the row with the values from the series
if row_id is not None:
values = [series[key] for key in series.keys()]
if debug is True:
rich.inspect(values)
if debug is False:
sheet.update(str(row_id), values)
Open in IDE · Open on GitHub
Created from JetBrains using CodeStream
For the final stage of ETL is to commit changes to remote (Google Sheets), no method was found to work.
There are no exceptions, just a function call to remote
and no data state update.
For the save/commit to the remote (i..e the stateful data source): 3 attempts at three approaches to get the local dataset to either|
https://www.perplexity.ai/search/2a256f73-47dc-4117-bbab-87e4c0a7cbe1?s=c
2 UPDATE: Overwrite using the basic gspread sheet.update as per
https://docs.gspread.org/en/latest/user-guide.html#using-gspread-with-pandas
https://www.perplexity.ai/search/33fb1a34-54aa-49d4-84aa-45b5d846eba8?s=c
[pycriteria] controller.py (Lines 2031-2032)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2033-2041)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2042-2064)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2065-2075)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2077-2084)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2110-2128)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2130-2136)
Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2089-2108)
Open in IDE · Open on GitHub
Created from JetBrains using CodeStream