Using pandas to clean and prepare Loblaw's data for analysis
What I learned
- Pandas
- Exporting dataframes to Excel sheets
What do the following files do?
- SKUpriceswithDates(timeseries) : Reads the transaction records for the most valuable SKUs from the separate sheets of the MostValuableSKU.xlsx file and merges them. It then converts the dates to a daily frequency and exports the combined data to an Excel sheet.
- timeseriesdiffdates.py : Produces a plot comparing the prices of one SKU with those of another SKU over time.
- append_df_to_excel.py : Provides a function that appends pandas dataframes to an existing Excel file or sheet. If the file does not exist, a new one is created.
- CalculateMeanPriceperSKU.py : Prints the mean price per SKU using the most valuable SKU dataframe.
- Count non zeroes in col.py : Prints the number of non-zero entries in a given column of a dataframe.
- PandasBasicBerrieCherries.py : Identifies the most valuable SKUs in the data provided by Loblaws and appends those dataframes to a new Excel file.
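
The core pandas steps described above (merging per-SKU records, converting to a daily frequency, taking the mean price per SKU, and counting non-zero entries) can be sketched roughly as follows. This is not the code from the scripts themselves: the SKU names, the `date` and `price` column names, the sample values, and the choice to forward-fill missing days are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical transaction records, one dataframe per SKU (stand-ins for the
# per-SKU sheets of the Excel file). Column names are assumptions.
sheets = {
    "SKU_A": pd.DataFrame({
        "date": pd.to_datetime(["2020-01-01", "2020-01-04"]),
        "price": [2.0, 3.0],
    }),
    "SKU_B": pd.DataFrame({
        "date": pd.to_datetime(["2020-01-02", "2020-01-03"]),
        "price": [5.0, 0.0],
    }),
}


def to_daily(df):
    """Convert one SKU's records to a daily price series.

    Days without a transaction carry the last known price forward
    (an assumption; the original scripts may fill gaps differently).
    """
    return df.set_index("date")["price"].resample("D").last().ffill()


# One column per SKU, aligned on a shared daily date index.
daily = pd.concat(
    {sku: to_daily(df) for sku, df in sheets.items()}, axis=1
).sort_index()

# Mean price per SKU; days outside an SKU's date range (NaN) are skipped.
mean_price_per_sku = daily.mean()

# Number of non-zero entries per column; missing values count as zero here.
nonzero_counts = (daily.fillna(0) != 0).sum()

print(daily)
print(mean_price_per_sku)
print(nonzero_counts)
```

The combined frame could then be written out with `daily.to_excel(...)`, which requires an Excel engine such as openpyxl to be installed. Keeping each SKU as its own column makes the later steps (means, non-zero counts, plotting one SKU against another) simple column-wise operations.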